# Examples

In this guide we will show some non-trival examples of typical use-cases of Ontolearn
which you can also find in the
[examples](https://github.com/dice-group/Ontolearn/tree/master/examples) folder.


## Ex. 1: Learning Over a Local Ontology

The first example is about using EvoLearner to learn class expressions about the 
following target concepts: "_Aunt_", "_Brother_", "_Cousin_", "_Granddaughter_", 
"_Uncle_" and "_Grandgrandfather_". 

```python
import json
from ontolearn.knowledge_base import KnowledgeBase
from ontolearn.concept_learner import EvoLearner
from ontolearn.learning_problem import PosNegLPStandard
from owlapy.owl_individual import OWLNamedIndividual, IRI
from owlapy.class_expression import OWLClass
from ontolearn.utils import setup_logging

setup_logging()

# Access the learning problem and the ontology file path
with open('synthetic_problems.json') as json_file:
    settings = json.load(json_file)

# Define the Knowledge Base from a local ontology file
kb = KnowledgeBase(path=settings['data_path'])

# For each target concept
for str_target_concept, examples in settings['problems'].items():
    p = set(examples['positive_examples'])
    n = set(examples['negative_examples'])
    print('Target concept: ', str_target_concept)

    # Lets inject more background info where we ignore some trivial concepts
    if str_target_concept in ['Granddaughter', 'Aunt', 'Sister', 'Brother']:
        NS = 'http://www.benchmark.org/family#'
        concepts_to_ignore = {
            OWLClass(IRI(NS, 'Brother')),
            OWLClass(IRI(NS, 'Sister')),
            OWLClass(IRI(NS, 'Daughter')),
            OWLClass(IRI(NS, 'Mother')),
            OWLClass(IRI(NS, 'Grandmother')),
            OWLClass(IRI(NS, 'Father')),
            OWLClass(IRI(NS, 'Grandparent')),
            OWLClass(IRI(NS, 'PersonWithASibling')),
            OWLClass(IRI(NS, 'Granddaughter')),
            OWLClass(IRI(NS, 'Son')),
            OWLClass(IRI(NS, 'Child')),
            OWLClass(IRI(NS, 'Grandson')),
            OWLClass(IRI(NS, 'Grandfather')),
            OWLClass(IRI(NS, 'Grandchild')),
            OWLClass(IRI(NS, 'Parent')),
        }
        target_kb = kb.ignore_and_copy(ignored_classes=concepts_to_ignore)
    else:
        target_kb = kb
    
    # Define learning problem
    typed_pos = set(map(OWLNamedIndividual, map(IRI.create, p)))
    typed_neg = set(map(OWLNamedIndividual, map(IRI.create, n)))
    lp = PosNegLPStandard(pos=typed_pos, neg=typed_neg)
    
    # Define learning model
    model = EvoLearner(knowledge_base=target_kb, max_runtime=600)
    
    # Fit the learning problem to the model
    model.fit(lp, verbose=False)
    
    # Save top n hypotheses
    model.save_best_hypothesis(n=3, path='Predictions_{0}'.format(str_target_concept))
    
    # Get top n hypotheses
    hypotheses = list(model.best_hypotheses(n=3))
    
    # Use hypotheses as binary function to label individuals.
    predictions = model.predict(individuals=list(typed_pos | typed_neg),
                                hypotheses=hypotheses)
    # Print hypothesis
    [print(_) for _ in hypotheses]
```

In `synthetic_problems.json` we store the learning problem individuals as string
of IRIs, as well as the path to the ontology. 
You can check that file [here](https://github.com/dice-group/Ontolearn/blob/develop/examples/synthetic_problems.json).

Instead of EvoLearner, you can use the other models. More details about this example
are given in the [_Concept Learning_](06_concept_learners.md) guide.


----------------------------------------------------------


## Ex. 2: Learning Over a Triplestore Ontology

In our next example we will see how to use another model, specifically 
the _Tree-based Description Logic Concept Learner_ or **TDL** for short in
a dataset which is hosted on a triplestore.

```python
from owlapy.owl_individual import OWLNamedIndividual, IRI
from ontolearn.learners import TDL
from ontolearn.learning_problem import PosNegLPStandard
from ontolearn.triple_store import TripleStore
from ontolearn.utils.static_funcs import save_owl_class_expressions
from owlapy.render import DLSyntaxObjectRenderer

# (1) Initialize Triplestore
kb = TripleStore(url="http://dice-dbpedia.cs.upb.de:9080/sparql")

# (2) Initialize a DL renderer.
render = DLSyntaxObjectRenderer()

# (3) Initialize a learner.
model = TDL(knowledge_base=kb)

# (4) Define a description logic concept learning problem.
lp = PosNegLPStandard(pos={OWLNamedIndividual(IRI.create("http://dbpedia.org/resource/Angela_Merkel"))},
                      neg={OWLNamedIndividual(IRI.create("http://dbpedia.org/resource/Barack_Obama"))})
# (5) Learn description logic concepts best fitting (4).

h = model.fit(learning_problem=lp).best_hypotheses()
str_concept = render.render(h)
print("Concept:", str_concept)  # e.g.  ∃ predecessor.WikicatPeopleFromBerlin

# (6) Save ∃ predecessor.WikicatPeopleFromBerlin into disk
save_owl_class_expressions(expressions=h, path="owl_prediction")
```

Here we have used the triplestore endpoint as you see in step _(1)_ which is
available only on a private network. However, you can host your own triplestore
server following [this guide](06_concept_learners.md#loading-and-launching-a-triplestore)
and run TDL using you own local endpoint. We have a [script](https://github.com/dice-group/Ontolearn/blob/master/examples/concept_learning_via_triplestore_example.py) for that also.

--------------------------------------------------------------

## Ex. 3: CMD Friendly Execution

Now let's see how you can create a script which you can execute via cmd and 
pass arguments dynamically. For this example we will use the learning model: 
**Drill**.

```python
import json
from argparse import ArgumentParser
import numpy as np
from sklearn.model_selection import StratifiedKFold
from ontolearn.utils.static_funcs import compute_f1_score
from ontolearn.knowledge_base import KnowledgeBase
from ontolearn.learning_problem import PosNegLPStandard
from ontolearn.refinement_operators import LengthBasedRefinement
from ontolearn.learners import Drill
from ontolearn.metrics import F1
from ontolearn.heuristics import CeloeBasedReward
from owlapy.owl_individual import OWLNamedIndividual, IRI
from owlapy.render import DLSyntaxObjectRenderer


def start(args):
    # Define knowledge base
    kb = KnowledgeBase(path=args.path_knowledge_base)
    
    # Define Learning model: Drill
    drill = Drill(knowledge_base=kb,
                  path_embeddings=args.path_embeddings,
                  refinement_operator=LengthBasedRefinement(knowledge_base=kb),
                  quality_func=F1(),
                  reward_func=CeloeBasedReward(),
                  epsilon_decay=args.epsilon_decay,
                  learning_rate=args.learning_rate,
                  num_of_sequential_actions=args.num_of_sequential_actions,
                  num_episode=args.num_episode,
                  iter_bound=args.iter_bound,
                  max_runtime=args.max_runtime)

    if args.path_pretrained_dir:
        # load pretrained data
        drill.load(directory=args.path_pretrained_dir)
    else:
        # train the model (Drill needs training)
        drill.train(num_of_target_concepts=args.num_of_target_concepts,
                    num_learning_problems=args.num_of_training_learning_problems)
        drill.save(directory="pretrained_drill")

        
    # Define the larning problem and split them into "train" and "test"
    with open(args.path_learning_problem) as json_file:
        examples = json.load(json_file)
    p = examples['positive_examples']
    n = examples['negative_examples']
    kf = StratifiedKFold(n_splits=args.folds, shuffle=True, random_state=args.random_seed)
    X = np.array(p + n)
    Y = np.array([1.0 for _ in p] + [0.0 for _ in n])
    dl_render = DLSyntaxObjectRenderer()
    for (ith, (train_index, test_index)) in enumerate(kf.split(X, Y)):
        train_pos = {pos_individual for pos_individual in X[train_index][Y[train_index] == 1]}
        train_neg = {neg_individual for neg_individual in X[train_index][Y[train_index] == 0]}
        test_pos = {pos_individual for pos_individual in X[test_index][Y[test_index] == 1]}
        test_neg = {neg_individual for neg_individual in X[test_index][Y[test_index] == 0]}
        train_lp = PosNegLPStandard(pos=set(map(OWLNamedIndividual, map(IRI.create, train_pos))),
                                    neg=set(map(OWLNamedIndividual, map(IRI.create, train_neg))))

        test_lp = PosNegLPStandard(pos=set(map(OWLNamedIndividual, map(IRI.create, test_pos))),
                                   neg=set(map(OWLNamedIndividual, map(IRI.create, test_neg))))

        pred_drill = drill.fit(train_lp).best_hypotheses()
        # Quality on train data
        train_f1_drill = compute_f1_score(individuals=frozenset({i for i in kb.individuals(pred_drill)}),
                                          pos=train_lp.pos,
                                          neg=train_lp.neg)
        # Quality on test data
        test_f1_drill = compute_f1_score(individuals=frozenset({i for i in kb.individuals(pred_drill)}),
                                         pos=test_lp.pos,
                                         neg=test_lp.neg)
        
        # Print predictions
        print(
            f"Prediction: {dl_render.render(pred_drill)} | Train Quality: {train_f1_drill:.3f} | Test Quality: {test_f1_drill:.3f} \n")


# Define the cmd arguments
if __name__ == '__main__':
    parser = ArgumentParser()
    # General
    parser.add_argument("--path_knowledge_base", type=str,
                        default='../KGs/Family/family-benchmark_rich_background.owl')
    parser.add_argument("--path_embeddings", type=str,
                        default='../embeddings/Keci_entity_embeddings.csv')
    parser.add_argument("--num_of_target_concepts",
                        type=int,
                        default=1)
    parser.add_argument("--num_of_training_learning_problems",
                        type=int,
                        default=1)
    parser.add_argument("--path_pretrained_dir", type=str, default=None)

    parser.add_argument("--path_learning_problem", type=str, default='uncle_lp2.json',
                        help="Path to a .json file that contains 2 properties 'positive_examples' and "
                             "'negative_examples'. Each of this properties should contain the IRIs of the respective"
                             "instances. e.g. 'some/path/lp.json'")
    parser.add_argument("--max_runtime", type=int, default=10, help="Max runtime")
    parser.add_argument("--folds", type=int, default=10, help="Number of folds of cross validation.")
    parser.add_argument("--random_seed", type=int, default=1)
    parser.add_argument("--iter_bound", type=int, default=10_000, help='iter_bound during testing.')
    # DQL related
    parser.add_argument("--num_episode", type=int, default=1, help='Number of trajectories created for a given lp.')

    parser.add_argument("--epsilon_decay", type=float, default=.01, help='Epsilon greedy trade off per epoch')
    parser.add_argument("--max_len_replay_memory", type=int, default=1024,
                        help='Maximum size of the experience replay')
    parser.add_argument("--num_epochs_per_replay", type=int, default=2,
                        help='Number of epochs on experience replay memory')
    parser.add_argument('--num_of_sequential_actions', type=int, default=1, help='Length of the trajectory.')

    # NN related
    parser.add_argument("--learning_rate", type=int, default=.01)

    start(parser.parse_args())
```

-----------------------------------------------------------

In the next guide we will explore the [KnowledgeBase](ontolearn.knowledge_base.KnowledgeBase) class which is needed to 
run a concept learner.