ontolearn.data_struct

Data structures.

Classes

PrepareBatchOfPrediction

An abstract class representing a Dataset.

PrepareBatchOfTraining

An abstract class representing a Dataset.

Experience

A class to model experiences for Replay Memory.

TriplesData

CLIPDataset

An abstract class representing a Dataset.

CLIPDatasetInference

An abstract class representing a Dataset.

NCESBaseDataset

NCESDataset

An abstract class representing a Dataset.

NCESDatasetInference

An abstract class representing a Dataset.

ROCESDataset

An abstract class representing a Dataset.

ROCESDatasetInference

An abstract class representing a Dataset.

TriplesDataset

An abstract class representing a Dataset.

Module Contents

class ontolearn.data_struct.PrepareBatchOfPrediction(current_state: torch.FloatTensor, next_state_batch: torch.FloatTensor, p: torch.FloatTensor, n: torch.FloatTensor)

Bases: torch.utils.data.Dataset

An abstract class representing a Dataset.

All datasets that represent a map from keys to data samples should subclass it. All subclasses should override __getitem__() to support fetching a data sample for a given key. Subclasses may also override __len__(), which many Sampler implementations and the default options of DataLoader expect to return the size of the dataset. Subclasses may also implement __getitems__() to speed up loading of batched samples; this method accepts a list of sample indices for a batch and returns a list of samples.

Note

DataLoader by default constructs an index sampler that yields integral indices. To make it work with a map-style dataset with non-integral indices/keys, a custom sampler must be provided.
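The map-style protocol described above can be sketched in plain Python, so the example runs without PyTorch installed. The names KeyedDataset and key_batches are hypothetical and are not part of ontolearn; in real code the class would subclass torch.utils.data.Dataset and the key batches would come from a custom sampler passed to DataLoader.

```python
class KeyedDataset:
    """Plain-Python stand-in for a map-style dataset (hypothetical name).

    In real code this would subclass torch.utils.data.Dataset.
    """

    def __init__(self, samples):
        # samples: dict mapping an arbitrary (here: string) key to a sample
        self._samples = dict(samples)

    def __len__(self):
        # Size of the dataset, as samplers and loaders expect.
        return len(self._samples)

    def __getitem__(self, key):
        # Fetch one sample for a given key.
        return self._samples[key]

    def __getitems__(self, keys):
        # Optional batched fetch: one call per batch instead of one per sample.
        return [self._samples[k] for k in keys]


def key_batches(dataset_keys, batch_size):
    """Yield lists of keys; plays the role of a custom sampler (assumption)."""
    keys = list(dataset_keys)
    for start in range(0, len(keys), batch_size):
        yield keys[start:start + batch_size]


dataset = KeyedDataset({"a": 1.0, "b": 2.0, "c": 3.0})
batches = [dataset.__getitems__(keys) for keys in key_batches(["a", "b", "c"], 2)]
print(batches)  # [[1.0, 2.0], [3.0]]
```

Because the keys are strings rather than integers, an integer index sampler would fail on this dataset, which is why the keys are supplied explicitly.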

X
__len__()
__getitem__(idx)
get_all()
class ontolearn.data_struct.PrepareBatchOfTraining(current_state_batch: torch.Tensor, next_state_batch: torch.Tensor, p: torch.Tensor, n: torch.Tensor, q: torch.Tensor)

Bases: torch.utils.data.Dataset

S
S_Prime
y
Negatives
X
__len__()
__getitem__(idx)
class ontolearn.data_struct.Experience(maxlen: int)

A class to model experiences for Replay Memory.

current_states
next_states
rewards
__len__()
append(e)

Append an experience.

Parameters: e – A tuple (s_i, s_j, reward), where s_j is the state reached by refining s_i.

retrieve()
clear()
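A replay memory with this interface can be sketched as follows. This is a minimal illustration and not the library's code: the class name ReplayBuffer is hypothetical, and both the use of collections.deque and the shape of the value returned by retrieve() are assumptions. Only the attribute and method names are taken from the listing above.

```python
from collections import deque


class ReplayBuffer:
    """Illustrative stand-in for Experience (hypothetical name)."""

    def __init__(self, maxlen: int):
        # deque(maxlen=...) drops the oldest entry once the buffer is full.
        self.current_states = deque(maxlen=maxlen)
        self.next_states = deque(maxlen=maxlen)
        self.rewards = deque(maxlen=maxlen)

    def __len__(self):
        return len(self.rewards)

    def append(self, e):
        # e is a tuple (s_i, s_j, reward): refining s_i reached s_j.
        s_i, s_j, reward = e
        self.current_states.append(s_i)
        self.next_states.append(s_j)
        self.rewards.append(reward)

    def retrieve(self):
        # Return the three aligned sequences as plain lists (assumed format).
        return list(self.current_states), list(self.next_states), list(self.rewards)

    def clear(self):
        self.current_states.clear()
        self.next_states.clear()
        self.rewards.clear()


memory = ReplayBuffer(maxlen=2)
for experience in [("A", "B", 0.1), ("B", "C", 0.5), ("C", "D", 0.9)]:
    memory.append(experience)
print(memory.retrieve())  # (['B', 'C'], ['C', 'D'], [0.5, 0.9])
```

With maxlen=2 the first experience is evicted when the third arrives, so the memory always holds the most recent transitions.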
class ontolearn.data_struct.TriplesData(knowledge_base_path)
Graph
triples = []
entities
relations
entity2idx
relation2idx
load_data()
static get_relations(data)
static get_entities(data)
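The relationship between triples, entities, relations, entity2idx and relation2idx might look like the sketch below. It assumes that the data is a list of (head, relation, tail) tuples and that indices follow sorted order; both are assumptions, and the real class loads its triples from knowledge_base_path instead of a literal list.

```python
def get_entities(data):
    # Entities are everything that occurs in head or tail position.
    return sorted({t[0] for t in data} | {t[2] for t in data})


def get_relations(data):
    # Relations are everything that occurs in the middle position.
    return sorted({t[1] for t in data})


# Hypothetical toy triples standing in for a loaded knowledge base.
triples = [
    ("anna", "hasChild", "bob"),
    ("bob", "hasSibling", "carl"),
    ("anna", "hasChild", "carl"),
]

entities = get_entities(triples)
relations = get_relations(triples)
entity2idx = {e: i for i, e in enumerate(entities)}
relation2idx = {r: i for i, r in enumerate(relations)}

print(entity2idx)    # {'anna': 0, 'bob': 1, 'carl': 2}
print(relation2idx)  # {'hasChild': 0, 'hasSibling': 1}
```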
class ontolearn.data_struct.CLIPDataset(data, embeddings, num_examples, shuffle_examples, example_sizes=None, k=5, sorted_examples=True)

Bases: torch.utils.data.Dataset

data
embeddings
num_examples
shuffle_examples
example_sizes = None
k = 5
sorted_examples = True
__len__()
__getitem__(idx)
class ontolearn.data_struct.CLIPDatasetInference(data: list, embeddings, num_examples, shuffle_examples, sorted_examples=True)

Bases: torch.utils.data.Dataset

data
embeddings
num_examples
shuffle_examples
sorted_examples = True
__len__()
__getitem__(idx)
class ontolearn.data_struct.NCESBaseDataset(vocab, inv_vocab, max_length)
vocab
inv_vocab
max_length
static decompose(concept_name: str) → list

Decomposes a class expression into a sequence of tokens (atoms).
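To illustrate what such a decomposition could produce, here is a minimal regex-based sketch. It is not the library's implementation: the set of symbols treated as stand-alone tokens and the choice to drop whitespace are assumptions made for this example.

```python
import re

# Assumed token rule: each description-logic symbol is its own token,
# and any other run of non-space characters is a name.
_TOKEN = re.compile(r"[()⊔⊓∃∀¬.]|[^\s()⊔⊓∃∀¬.]+")


def decompose(concept_name: str) -> list:
    # findall scans left to right and skips whitespace,
    # which matches neither alternative.
    return _TOKEN.findall(concept_name)


tokens = decompose("∃ hasChild.(Male ⊓ Parent)")
print(tokens)  # ['∃', 'hasChild', '.', '(', 'Male', '⊓', 'Parent', ')']
```

Under this rule a name that itself contains a dot would be split at the dot, so the sketch only suits short names without that character.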

get_labels(target)
class ontolearn.data_struct.NCESDataset(data, embeddings, num_examples, vocab, inv_vocab, shuffle_examples, max_length, example_sizes=None, sorted_examples=True)

Bases: NCESBaseDataset, torch.utils.data.Dataset

data
embeddings
num_examples
shuffle_examples
example_sizes = None
sorted_examples = True
__len__()
__getitem__(idx)
class ontolearn.data_struct.NCESDatasetInference(data, embeddings, num_examples, vocab, inv_vocab, shuffle_examples, max_length=48, sorted_examples=True)

Bases: NCESBaseDataset, torch.utils.data.Dataset

data
embeddings
num_examples
shuffle_examples
sorted_examples = True
__len__()
__getitem__(idx)
class ontolearn.data_struct.ROCESDataset(data, triples_data, num_examples, k, vocab, inv_vocab, max_length, sampling_strategy='p')

Bases: NCESBaseDataset, torch.utils.data.Dataset

data
triples_data
num_examples
k
sampling_strategy = 'p'
load_embeddings(embedding_model)
set_k(k)
__len__()
__getitem__(idx)
class ontolearn.data_struct.ROCESDatasetInference(data, triples_data, num_examples, k, vocab, inv_vocab, max_length, sampling_strategy='p', num_pred_per_lp=1)

Bases: NCESBaseDataset, torch.utils.data.Dataset

data
triples_data
k
sampling_strategy = 'p'
num_examples
num_pred_per_lp = 1
load_embeddings(embedding_model)
set_k(k)
__len__()
__getitem__(idx)
class ontolearn.data_struct.TriplesDataset(er_vocab, num_e)

Bases: torch.utils.data.Dataset

num_e
head_idx
rel_idx
tail_idx
__len__()
__getitem__(idx)
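One plausible reading of the parameters and attributes above is a multi-label setup in which each (head, relation) pair is paired with a 0/1 vector over all entities. The sketch below follows that reading using plain lists so it runs without PyTorch. The class name TriplesSketch is hypothetical, and the assumed structure of er_vocab (a dict from (head index, relation index) to a list of tail indices) is a guess from the parameter names, not something the listing confirms.

```python
class TriplesSketch:
    """Illustrative stand-in for TriplesDataset (hypothetical name)."""

    def __init__(self, er_vocab, num_e):
        # Assumption: er_vocab maps (head_idx, rel_idx) -> list of tail indices.
        self.num_e = num_e
        pairs = list(er_vocab.keys())
        self.head_idx = [h for h, _ in pairs]
        self.rel_idx = [r for _, r in pairs]
        self.tail_idx = list(er_vocab.values())

    def __len__(self):
        return len(self.tail_idx)

    def __getitem__(self, idx):
        # Build a 0/1 label vector with a 1 at every known tail entity.
        y_vec = [0.0] * self.num_e
        for t in self.tail_idx[idx]:
            y_vec[t] = 1.0
        return self.head_idx[idx], self.rel_idx[idx], y_vec


er_vocab = {(0, 0): [1, 2], (1, 1): [2]}
triples_dataset = TriplesSketch(er_vocab, num_e=3)
print(triples_dataset[0])  # (0, 0, [0.0, 1.0, 1.0])
```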