
Ontology Verbalisation

Verbalising an ontology into natural language texts is a challenging task. \(\textsf{DeepOnto}\) provides some basic building blocks for achieving this goal. The implemented OntologyVerbaliser is essentially a recursive concept verbaliser that first splits a complex concept \(C\) into a sub-formula tree, then verbalises the leaf nodes (atomic concepts or object properties) by their names, and finally merges the verbalised child nodes according to the logical pattern at their parent node.

Please cite the following paper if you consider using our verbaliser.

Paper

The recursive concept verbaliser is proposed in the paper: Language Model Analysis for Ontology Subsumption Inference (Findings of ACL 2023).

@inproceedings{he-etal-2023-language,
    title = "Language Model Analysis for Ontology Subsumption Inference",
    author = "He, Yuan  and
    Chen, Jiaoyan  and
    Jimenez-Ruiz, Ernesto  and
    Dong, Hang  and
    Horrocks, Ian",
    booktitle = "Findings of the Association for Computational Linguistics: ACL 2023",
    month = jul,
    year = "2023",
    address = "Toronto, Canada",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2023.findings-acl.213",
    doi = "10.18653/v1/2023.findings-acl.213",
    pages = "3439--3453"
}

OntologyVerbaliser(onto, apply_lowercasing=False, keep_iri=False, apply_auto_correction=False, add_quantifier_word=False)

A recursive natural language verbaliser for the OWL logical expressions, e.g., OWLAxiom and OWLClassExpression.

The concept patterns supported by this verbaliser are shown below:

Pattern Verbalisation (\(\mathcal{V}\))
\(A\) (atomic) the name (\(\texttt{rdfs:label}\)) of \(A\) (auto-correction is optional)
\(r\) (property) the name (\(\texttt{rdfs:label}\)) of \(r\) (auto-correction is optional)
\(\neg C\) "not \(\mathcal{V}(C)\)"
\(\exists r.C\) "something that \(\mathcal{V}(r)\) some \(\mathcal{V}(C)\)" (the quantifier word "some" is optional)
\(\forall r.C\) "something that \(\mathcal{V}(r)\) only \(\mathcal{V}(C)\)" (the quantifier word "only" is optional)
\(C_1 \sqcap ... \sqcap C_n\) if \(C_i = \exists/\forall r.D_i\) and \(C_j = \exists/\forall r.D_j\), they will be re-written into \(\exists/\forall r.(D_i \sqcap D_j)\) before verbalisation; suppose after re-writing the new expression is \(C_1 \sqcap ... \sqcap C_{n'}\)

(a) if all \(C_i\)s (for \(i = 1, ..., n'\)) are restrictions, in the form of \(\exists/\forall r_i.D_i\):
"something that \(\mathcal{V}(r_1)\) some/only \(\mathcal{V}(D_1)\) and ... and \(\mathcal{V}(r_{n'})\) some/only \(\mathcal{V}(D_{n'})\)"
(b) if some \(C_i\)s (for \(i = m+1, ..., n'\)) are restrictions, in the form of \(\exists/\forall r_i.D_i\):
"\(\mathcal{V}(C_{1})\) and ... and \(\mathcal{V}(C_{m})\) that \(\mathcal{V}(r_{m+1})\) some/only \(\mathcal{V}(D_{m+1})\) and ... and \(\mathcal{V}(r_{n'})\) some/only \(\mathcal{V}(D_{n'})\)"
(c) if no \(C_i\) is a restriction:
"\(\mathcal{V}(C_{1})\) and ... and \(\mathcal{V}(C_{n'})\)"

\(C_1 \sqcup ... \sqcup C_n\) similar to verbalising \(C_1 \sqcap ... \sqcap C_n\) except that "and" is replaced by "or" and case (b) uses the same verbalisation as case (c)
\(r_1 \cdot r_2\) (property chain) \(\mathcal{V}(r_1)\) something that \(\mathcal{V}(r_2)\)
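The patterns in the table can be illustrated with a minimal, self-contained sketch. It is not the library's implementation: the tuple encoding of concepts, the function names, and the entity labels are all assumptions made for illustration, and merging the fillers of a disjunction with "or" is likewise an assumption.

```python
RESTRICTIONS = ("some", "only")  # existential / universal quantifier words


def verbalise(expr, add_quantifier_word=False):
    """Verbalise a tuple-encoded concept (toy encoding, not the library's)."""
    kind = expr[0]
    if kind == "atom":  # ("atom", name)
        return expr[1]
    if kind == "not":  # ("not", C)
        return "not " + verbalise(expr[1], add_quantifier_word)
    if kind in RESTRICTIONS:  # ("some" | "only", r, C)
        return "something that " + _body(expr, add_quantifier_word)
    if kind in ("and", "or"):  # ("and" | "or", [C_1, ..., C_n])
        return _junction(kind, expr[1], add_quantifier_word)
    raise ValueError(f"unsupported pattern: {kind!r}")


def _body(restriction, add_quantifier_word):
    """Return 'V(r) [some|only] V(C)' without the leading 'something that'."""
    quantifier, prop, filler = restriction
    words = [prop, quantifier] if add_quantifier_word else [prop]
    return " ".join(words + [verbalise(filler, add_quantifier_word)])


def _junction(kind, operands, add_quantifier_word):
    plain = [op for op in operands if op[0] not in RESTRICTIONS]
    # re-writing step: group fillers that share quantifier and property
    groups = {}
    for op in operands:
        if op[0] in RESTRICTIONS:
            groups.setdefault((op[0], op[1]), []).append(op[2])
    merged = [
        (q, r, fillers[0] if len(fillers) == 1 else (kind, fillers))
        for (q, r), fillers in groups.items()
    ]
    joiner = f" {kind} "
    heads = joiner.join(verbalise(op, add_quantifier_word) for op in plain)
    bodies = joiner.join(_body(res, add_quantifier_word) for res in merged)
    if not plain:  # case (a): restrictions only
        return "something that " + bodies
    if not merged:  # case (c): no restriction
        return heads
    if kind == "and":  # case (b): "... that ..."
        return f"{heads} that {bodies}"
    # disjunction: case (b) is verbalised like case (c)
    full = [verbalise(res, add_quantifier_word) for res in merged]
    return joiner.join([heads] + full)


concept = ("and", [("atom", "mollusk food product"),
                   ("some", "derives from", ("atom", "cephalopod"))])
print(verbalise(concept))
print(verbalise(concept, add_quantifier_word=True))
```

The two calls differ only in whether the quantifier word "some" appears before the filler.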

With this concept verbaliser, a range of OWL axioms are supported:

  • Class axioms for subsumption, equivalence, assertion.
  • Object property axioms for subsumption, assertion, domain, range.

The verbaliser operates at the concept level, and an additional template is needed to integrate the verbalised components of an axiom.
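As a rough sketch of such a template, the snippet below joins already verbalised components into one sentence. The template wording, the `TEMPLATES` dictionary, and `fill_template` are hypothetical and not part of the library; the placeholder order follows the order of the tuples returned by the axiom-level methods documented on this page.

```python
# Hypothetical templates; the placeholders follow the order of the returned tuples.
TEMPLATES = {
    "SubClassOf": "{0} is a kind of {1}.",           # (sub-concept, super-concept)
    "EquivalentClasses": "{0} is the same as {1}.",  # (concept, equivalent concept)
    "ClassAssertion": "{1} is an instance of {0}.",  # (class, individual)
}


def fill_template(axiom_type: str, *verbalised: str) -> str:
    """Insert the verbalised components of an axiom into its template."""
    return TEMPLATES[axiom_type].format(*verbalised)


print(fill_template("SubClassOf", "cephalopod food product", "mollusk food product"))
```

In practice, the strings passed in would be the `.verbal` values of the returned nodes.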

Warning

This verbaliser uses spaCy for the POS tagging needed in the auto-correction of property names. The init function attempts to download the spaCy model en_core_web_sm automatically. If the model still cannot be found, please download it manually using python -m spacy download en_core_web_sm.

Attributes:

Name Type Description
onto Ontology

An ontology whose entities and axioms are to be verbalised.

parser OntologySyntaxParser

A syntax parser for the string representation of an OWLObject.

vocab dict[str, str]

A dictionary with (entity_iri, entity_name) pairs, keeping one name per entity. By default the names are retrieved from \(\texttt{rdfs:label}\); an entity without any label keeps its IRI as its name.

apply_lowercasing bool

Whether to apply lowercasing to the entity names. Defaults to False.

keep_iri bool

Whether to keep the IRIs of entities without verbalising them using self.vocab. Defaults to False.

apply_auto_correction bool

Whether to automatically apply rule-based auto-correction to entity names. Defaults to False.

add_quantifier_word bool

Whether to add quantifier words ("some"/"only") as in the Manchester syntax. Defaults to False.

Parameters:

Name Type Description Default
onto Ontology

An ontology whose entities and axioms are to be verbalised.

required
apply_lowercasing bool

Whether to apply lowercasing to the entity names. Defaults to False.

False
keep_iri bool

Whether to keep the IRIs of entities without verbalising them using self.vocab. Defaults to False.

False
apply_auto_correction bool

Whether to automatically apply rule-based auto-correction to entity names. Defaults to False.

False
add_quantifier_word bool

Whether to add quantifier words ("some"/"only") as in the Manchester syntax. Defaults to False.

False
Source code in src/deeponto/onto/verbalisation.py
def __init__(
    self,
    onto: Ontology,
    apply_lowercasing: bool = False,
    keep_iri: bool = False,
    apply_auto_correction: bool = False,
    add_quantifier_word: bool = False,
):
    """Initialise an ontology verbaliser.

    Args:
        onto (Ontology): An ontology whose entities and axioms are to be verbalised.
        apply_lowercasing (bool, optional): Whether to apply lowercasing to the entity names. Defaults to `False`.
        keep_iri (bool, optional): Whether to keep the IRIs of entities without verbalising them using `self.vocab`. Defaults to `False`.
        apply_auto_correction (bool, optional): Whether to automatically apply rule-based auto-correction to entity names. Defaults to `False`.
        add_quantifier_word (bool, optional): Whether to add quantifier words ("some"/"only") as in the Manchester syntax. Defaults to `False`.
    """
    self.onto = onto
    self.parser = OntologySyntaxParser()

    # download en_core_web_sm for object property
    try:
        spacy.load("en_core_web_sm")
    except:
        print("Download `en_core_web_sm` for pos tagger.")
        os.system("python -m spacy download en_core_web_sm")

    self.nlp = spacy.load("en_core_web_sm")

    # build the default vocabulary for entities
    self.apply_lowercasing_to_vocab = apply_lowercasing
    self.vocab = dict()
    for entity_type in ["Classes", "ObjectProperties", "DataProperties", "Individuals"]:
        entity_annotations, _ = self.onto.build_annotation_index(
            entity_type=entity_type, apply_lowercasing=self.apply_lowercasing_to_vocab
        )
        self.vocab.update(**entity_annotations)
    literal_or_iri = lambda k, v: list(v)[0] if v else k  # set vocab to IRI if no string available
    self.vocab = {k: literal_or_iri(k, v) for k, v in self.vocab.items()}  # only set one name for each entity

    self.keep_iri = keep_iri
    self.apply_auto_correction = apply_auto_correction
    self.add_quantifier_word = add_quantifier_word

update_entity_name(entity_iri, entity_name)

Update the name of an entity in self.vocab.

If you want to change the name of a specific entity, you should call this function before applying verbalisation.

Source code in src/deeponto/onto/verbalisation.py
def update_entity_name(self, entity_iri: str, entity_name: str):
    """Update the name of an entity in `self.vocab`.

    If you want to change the name of a specific entity, you should call this
    function before applying verbalisation.
    """
    self.vocab[entity_iri] = entity_name

verbalise_class_expression(class_expression)

Verbalise a class expression (OWLClassExpression) or its parsed form (in RangeNode).

See the currently supported types of class (or concept) expressions in the pattern table of OntologyVerbaliser above.

Parameters:

Name Type Description Default
class_expression Union[OWLClassExpression, str, RangeNode]

A class expression to be verbalised.

required

Raises:

Type Description
RuntimeError

Occurs when the class expression is not in one of the supported types.

Returns:

Type Description
CfgNode

A nested dictionary that presents the recursive results of verbalisation. The verbalised string can be accessed with the key ["verbal"] or with the attribute .verbal.
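The shape of the returned value can be pictured with a small stand-in. `AttrDict` below is an illustrative substitute for the returned node type, not the class the library uses, and the leaf node is assumed to carry only a `verbal` field; the negation node mirrors the `NEG` branch of the method's source.

```python
class AttrDict(dict):
    """Illustrative stand-in for the returned node: a dict whose keys are also attributes."""

    def __getattr__(self, key):
        try:
            return self[key]
        except KeyError as err:
            raise AttributeError(key) from err


# leaf result; only the "verbal" field is assumed here
atom = AttrDict({"verbal": "cephalopod"})
# negation wraps its single child, mirroring the NEG branch of the method
negated = AttrDict({"verbal": "not " + atom.verbal, "class": atom, "type": "NEG"})

print(negated.verbal)              # attribute access
print(negated["class"]["verbal"])  # key access into the nested child
```

Because `class` is a reserved word in Python, the nested child is reached with key access rather than attribute access.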

Source code in src/deeponto/onto/verbalisation.py
def verbalise_class_expression(self, class_expression: Union[OWLClassExpression, str, RangeNode]):
    r"""Verbalise a class expression (`OWLClassExpression`) or its parsed form (in `RangeNode`).

    See currently supported types of class (or concept) expressions [here][deeponto.onto.verbalisation.OntologyVerbaliser].


    Args:
        class_expression (Union[OWLClassExpression, str, RangeNode]): A class expression to be verbalised.

    Raises:
        RuntimeError: Occurs when the class expression is not in one of the supported types.

    Returns:
        (CfgNode): A nested dictionary that presents the recursive results of verbalisation. The verbalised string
            can be accessed with the key `["verbal"]` or with the attribute `.verbal`.
    """

    if not isinstance(class_expression, RangeNode):
        parsed_class_expression = self.parser.parse(class_expression).children[0]  # skip the root node
    else:
        parsed_class_expression = class_expression

    # for a singleton IRI
    if parsed_class_expression.is_iri:
        return self._verbalise_iri(parsed_class_expression)

    if parsed_class_expression.name.startswith("NEG"):
        # negation only has one child
        cl = self.verbalise_class_expression(parsed_class_expression.children[0])
        return CfgNode({"verbal": "not " + cl.verbal, "class": cl, "type": "NEG"})

    # for existential and universal restrictions
    if parsed_class_expression.name.startswith("EX.") or parsed_class_expression.name.startswith("ALL"):
        return self._verbalise_restriction(parsed_class_expression)

    # for conjunction and disjunction
    if parsed_class_expression.name.startswith("AND") or parsed_class_expression.name.startswith("OR"):
        return self._verbalise_junction(parsed_class_expression)

    # for a property chain
    if parsed_class_expression.name.startswith("OPC"):
        return self._verbalise_property(parsed_class_expression)

    raise RuntimeError(f"Input class expression `{str(class_expression)}` is not in one of the supported types.")

verbalise_class_subsumption_axiom(class_subsumption_axiom)

Verbalise a class subsumption axiom.

The subsumption axiom can have two forms:

  • \(C_{sub} \sqsubseteq C_{super}\), the SubClassOf axiom;
  • \(C_{super} \sqsupseteq C_{sub}\), the SuperClassOf axiom.

Parameters:

Name Type Description Default
class_subsumption_axiom OWLAxiom

The class subsumption axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode]

The verbalised sub-concept \(\mathcal{V}(C_{sub})\) and super-concept \(\mathcal{V}(C_{super})\) (order matters).
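The note "order matters" can be made concrete with a short sketch of how the two operands are assigned depending on the axiom form. `split_subsumption` is a hypothetical helper written for this illustration, and raising an error for other axiom types is an assumption of the sketch.

```python
def split_subsumption(axiom_type: str, first, second):
    """Return (sub, super) regardless of which way the axiom is written."""
    if axiom_type == "SubClassOf":  # SubClassOf(sub super)
        return first, second
    if axiom_type == "SuperClassOf":  # SuperClassOf(super sub)
        return second, first
    raise ValueError(f"not a class subsumption axiom: {axiom_type}")


print(split_subsumption("SuperClassOf", "food product", "cephalopod food product"))
```

Either form yields the sub-concept first and the super-concept second.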

Source code in src/deeponto/onto/verbalisation.py
def verbalise_class_subsumption_axiom(self, class_subsumption_axiom: OWLAxiom):
    r"""Verbalise a class subsumption axiom.

    The subsumption axiom can have two forms:

    - $C_{sub} \sqsubseteq C_{super}$, the `SubClassOf` axiom;
    - $C_{super} \sqsupseteq C_{sub}$, the `SuperClassOf` axiom.

    Args:
        class_subsumption_axiom (OWLAxiom): The class subsumption axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode]): The verbalised sub-concept $\mathcal{V}(C_{sub})$ and super-concept $\mathcal{V}(C_{super})$ (order matters).
    """

    # input check
    self._axiom_input_check(class_subsumption_axiom, "SubClassOf", "SuperClassOf")

    parsed_subsumption_axiom = self.parser.parse(class_subsumption_axiom).children[0]  # skip the root node
    if str(class_subsumption_axiom).startswith("SubClassOf"):
        parsed_sub_class, parsed_super_class = parsed_subsumption_axiom.children
    elif str(class_subsumption_axiom).startswith("SuperClassOf"):
        parsed_super_class, parsed_sub_class = parsed_subsumption_axiom.children

    verbalised_sub_class = self.verbalise_class_expression(parsed_sub_class)
    verbalised_super_class = self.verbalise_class_expression(parsed_super_class)
    return verbalised_sub_class, verbalised_super_class

verbalise_class_equivalence_axiom(class_equivalence_axiom)

Verbalise a class equivalence axiom.

The equivalence axiom has the form \(C \equiv D\).

Parameters:

Name Type Description Default
class_equivalence_axiom OWLAxiom

The class equivalence axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode]

The verbalised concept \(\mathcal{V}(C)\) and its equivalent concept \(\mathcal{V}(D)\) (order matters).

Source code in src/deeponto/onto/verbalisation.py
def verbalise_class_equivalence_axiom(self, class_equivalence_axiom: OWLAxiom):
    r"""Verbalise a class equivalence axiom.

    The equivalence axiom has the form $C \equiv D$.

    Args:
        class_equivalence_axiom (OWLAxiom): The class equivalence axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode]): The verbalised concept $\mathcal{V}(C)$ and its equivalent concept $\mathcal{V}(D)$ (order matters).
    """

    # input check
    self._axiom_input_check(class_equivalence_axiom, "EquivalentClasses")

    parsed_equivalence_axiom = self.parser.parse(class_equivalence_axiom).children[0]  # skip the root node
    parsed_class_left, parsed_class_right = parsed_equivalence_axiom.children

    verbalised_left_class = self.verbalise_class_expression(parsed_class_left)
    verbalised_right_class = self.verbalise_class_expression(parsed_class_right)
    return verbalised_left_class, verbalised_right_class

verbalise_class_assertion_axiom(class_assertion_axiom)

Verbalise a class assertion axiom.

The class assertion axiom has the form \(C(x)\).

Parameters:

Name Type Description Default
class_assertion_axiom OWLAxiom

The class assertion axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode]

The verbalised class \(\mathcal{V}(C)\) and individual \(\mathcal{V}(x)\) (order matters).

Source code in src/deeponto/onto/verbalisation.py
def verbalise_class_assertion_axiom(self, class_assertion_axiom: OWLAxiom):
    r"""Verbalise a class assertion axiom.

    The class assertion axiom has the form $C(x)$.

    Args:
        class_assertion_axiom (OWLAxiom): The class assertion axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode]): The verbalised class $\mathcal{V}(C)$ and individual $\mathcal{V}(x)$ (order matters).
    """

    # input check
    self._axiom_input_check(class_assertion_axiom, "ClassAssertion")

    parsed_equivalence_axiom = self.parser.parse(class_assertion_axiom).children[0]  # skip the root node
    parsed_class, parsed_individual = parsed_equivalence_axiom.children

    verbalised_class = self.verbalise_class_expression(parsed_class)
    verbalised_individual = self._verbalise_iri(parsed_individual)
    return verbalised_class, verbalised_individual

verbalise_object_property_subsumption_axiom(object_property_subsumption_axiom)

Verbalise an object property subsumption axiom.

The subsumption axiom can have two forms:

  • \(r_{sub} \sqsubseteq r_{super}\), the SubObjectPropertyOf axiom;
  • \(r_{super} \sqsupseteq r_{sub}\), the SuperObjectPropertyOf axiom.

Parameters:

Name Type Description Default
object_property_subsumption_axiom OWLAxiom

The object property subsumption axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode]

The verbalised sub-property \(\mathcal{V}(r_{sub})\) and super-property \(\mathcal{V}(r_{super})\) (order matters).

Source code in src/deeponto/onto/verbalisation.py
def verbalise_object_property_subsumption_axiom(self, object_property_subsumption_axiom: OWLAxiom):
    r"""Verbalise an object property subsumption axiom.

    The subsumption axiom can have two forms:

    - $r_{sub} \sqsubseteq r_{super}$, the `SubObjectPropertyOf` axiom;
    - $r_{super} \sqsupseteq r_{sub}$, the `SuperObjectPropertyOf` axiom.

    Args:
        object_property_subsumption_axiom (OWLAxiom): The object property subsumption axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode]): The verbalised sub-property $\mathcal{V}(r_{sub})$ and super-property $\mathcal{V}(r_{super})$ (order matters).
    """

    # input check
    self._axiom_input_check(
        object_property_subsumption_axiom,
        "SubObjectPropertyOf",
        "SuperObjectPropertyOf",
        "SubPropertyChainOf",
        "SuperPropertyChainOf",
    )

    parsed_subsumption_axiom = self.parser.parse(object_property_subsumption_axiom).children[
        0
    ]  # skip the root node
    if str(object_property_subsumption_axiom).startswith("SubObjectPropertyOf"):
        parsed_sub_property, parsed_super_property = parsed_subsumption_axiom.children
    elif str(object_property_subsumption_axiom).startswith("SuperObjectPropertyOf"):
        parsed_super_property, parsed_sub_property = parsed_subsumption_axiom.children

    verbalised_sub_property = self._verbalise_property(parsed_sub_property)
    verbalised_super_property = self._verbalise_property(parsed_super_property)
    return verbalised_sub_property, verbalised_super_property

verbalise_object_property_assertion_axiom(object_property_assertion_axiom)

Verbalise an object property assertion axiom.

The object property assertion axiom has the form \(r(x, y)\).

Parameters:

Name Type Description Default
object_property_assertion_axiom OWLAxiom

The object property assertion axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode, CfgNode]

The verbalised object property \(\mathcal{V}(r)\) and two individuals \(\mathcal{V}(x)\) and \(\mathcal{V}(y)\) (order matters).

Source code in src/deeponto/onto/verbalisation.py
def verbalise_object_property_assertion_axiom(self, object_property_assertion_axiom: OWLAxiom):
    r"""Verbalise an object property assertion axiom.

    The object property assertion axiom has the form $r(x, y)$.

    Args:
        object_property_assertion_axiom (OWLAxiom): The object property assertion axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode, CfgNode]): The verbalised object property $\mathcal{V}(r)$ and two individuals $\mathcal{V}(x)$ and $\mathcal{V}(y)$ (order matters).
    """

    # input check
    self._axiom_input_check(object_property_assertion_axiom, "ObjectPropertyAssertion")

    # skip the root node
    parsed_object_property_assertion_axiom = self.parser.parse(object_property_assertion_axiom).children[0]
    parsed_obj_prop, parsed_indiv_x, parsed_indiv_y = parsed_object_property_assertion_axiom.children

    verbalised_object_property = self._verbalise_iri(parsed_obj_prop, is_property=True)
    verbalised_individual_x = self._verbalise_iri(parsed_indiv_x)
    verbalised_individual_y = self._verbalise_iri(parsed_indiv_y)
    return verbalised_object_property, verbalised_individual_x, verbalised_individual_y

verbalise_object_property_domain_axiom(object_property_domain_axiom)

Verbalise an object property domain axiom.

The domain of a property \(r: X \rightarrow Y\) specifies the concept expression \(X\) of its subject.

Parameters:

Name Type Description Default
object_property_domain_axiom OWLAxiom

The object property domain axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode]

The verbalised object property \(\mathcal{V}(r)\) and its domain \(\mathcal{V}(X)\) (order matters).

Source code in src/deeponto/onto/verbalisation.py
def verbalise_object_property_domain_axiom(self, object_property_domain_axiom: OWLAxiom):
    r"""Verbalise an object property domain axiom.

    The domain of a property $r: X \rightarrow Y$ specifies the concept expression $X$ of its subject.

    Args:
        object_property_domain_axiom (OWLAxiom): The object property domain axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode]): The verbalised object property $\mathcal{V}(r)$ and its domain $\mathcal{V}(X)$ (order matters).
    """

    # input check
    self._axiom_input_check(object_property_domain_axiom, "ObjectPropertyDomain")

    # skip the root node
    parsed_object_property_domain_axiom = self.parser.parse(object_property_domain_axiom).children[0]
    parsed_obj_prop, parsed_obj_prop_domain = parsed_object_property_domain_axiom.children

    verbalised_object_property = self._verbalise_iri(parsed_obj_prop, is_property=True)
    verbalised_object_property_domain = self.verbalise_class_expression(parsed_obj_prop_domain)

    return verbalised_object_property, verbalised_object_property_domain

verbalise_object_property_range_axiom(object_property_range_axiom)

Verbalise an object property range axiom.

The range of a property \(r: X \rightarrow Y\) specifies the concept expression \(Y\) of its object.

Parameters:

Name Type Description Default
object_property_range_axiom OWLAxiom

The object property range axiom to be verbalised.

required

Returns:

Type Description
Tuple[CfgNode, CfgNode]

The verbalised object property \(\mathcal{V}(r)\) and its range \(\mathcal{V}(Y)\) (order matters).

Source code in src/deeponto/onto/verbalisation.py
def verbalise_object_property_range_axiom(self, object_property_range_axiom: OWLAxiom):
    r"""Verbalise an object property range axiom.

    The range of a property $r: X \rightarrow Y$ specifies the concept expression $Y$ of its object.

    Args:
        object_property_range_axiom (OWLAxiom): The object property range axiom to be verbalised.

    Returns:
        (Tuple[CfgNode, CfgNode]): The verbalised object property $\mathcal{V}(r)$ and its range $\mathcal{V}(Y)$ (order matters).
    """

    # input check
    self._axiom_input_check(object_property_range_axiom, "ObjectPropertyRange")

    # skip the root node
    parsed_object_property_range_axiom = self.parser.parse(object_property_range_axiom).children[0]
    parsed_obj_prop, parsed_obj_prop_range = parsed_object_property_range_axiom.children

    verbalised_object_property = self._verbalise_iri(parsed_obj_prop, is_property=True)
    verbalised_object_property_range = self.verbalise_class_expression(parsed_obj_prop_range)

    return verbalised_object_property, verbalised_object_property_range

OntologySyntaxParser()

A syntax parser for the OWL logical expressions, e.g., OWLAxiom and OWLClassExpression.

It makes use of the string representation (based on Manchester Syntax) defined in the OWLAPI. In Python, such a string can be accessed by simply using str(some_owl_object).

To keep the Java import in the main Ontology class, this parser does not deal with OWLAxiom directly but instead its string representation.

Due to the OWLObject syntax, this parser relies on two components:

  1. Parentheses matching;
  2. Tree construction (RangeNode).

As a result, it will return a RangeNode that specifies the sub-formulas (and their respective positions in the string representation) in a tree structure.

Examples:

Suppose the input is an OWLAxiom that has the string representation:

>>> str(owl_axiom)
'EquivalentClasses(<http://purl.obolibrary.org/obo/FOODON_00001707> ObjectIntersectionOf(<http://purl.obolibrary.org/obo/FOODON_00002044> ObjectSomeValuesFrom(<http://purl.obolibrary.org/obo/RO_0001000> <http://purl.obolibrary.org/obo/FOODON_03412116>)) )'

This corresponds to the following logical expression:

\[ CephalopodFoodProduct \equiv MolluskFoodProduct \sqcap \exists derivesFrom.Cephalopod \]

After applying the parser, a RangeNode is returned, which can be rendered as:

axiom_parser = OntologySyntaxParser()
print(axiom_parser.parse(str(owl_axiom)).render_tree())
Output:
Root@[0:inf]
└── EQV@[0:212]
    ├── FOODON_00001707@[6:54]
    └── AND@[55:210]
        ├── FOODON_00002044@[61:109]
        └── EX.@[110:209]
            ├── RO_0001000@[116:159]
            └── FOODON_03412116@[160:208]

Or, if graphviz (installed by e.g., sudo apt install graphviz) is available, you can visualise the tree as an image by:

axiom_parser.parse(str(owl_axiom)).render_image()

Output:

[Image: the RangeNode tree rendered with graphviz]

The name for each node has the form {node_type}@[{start}:{end}], which means a node of the type {node_type} is located at the range [{start}:{end}] in the abbreviated expression (see abbreviate_owl_expression below).

The leaf nodes are IRIs and they are represented by the last segment (split by "/") of the whole IRI.
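The abbreviation of leaf names can be reproduced in isolation. The string operations below are the ones used in `parse_by_parentheses`; the wrapper function `abbreviate_iri` is a name introduced here for illustration only.

```python
def abbreviate_iri(iri_text: str) -> str:
    """Keep the last "/"-separated segment of an IRI written as <...>."""
    return iri_text.split("/")[-1].rstrip(">")


print(abbreviate_iri("<http://purl.obolibrary.org/obo/FOODON_00001707>"))
print(abbreviate_iri("<http://www.w3.org/2002/07/owl#Thing>"))
```

Note that only "/" is used as a separator, so an IRI with a "#" fragment keeps the whole last path segment, as the second call shows.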

Child nodes can be accessed by .children, the string representation of the sub-formula in this node can be accessed by .text. For example:

axiom_parser.parse(str(owl_axiom)).children[0].children[1].text
Output:
'[AND](<http://purl.obolibrary.org/obo/FOODON_00002044> [EX.](<http://purl.obolibrary.org/obo/RO_0001000> <http://purl.obolibrary.org/obo/FOODON_03412116>))'
Source code in src/deeponto/onto/verbalisation.py
def __init__(self):
    pass

abbreviate_owl_expression(owl_expression)

Abbreviate the string representations of logical operators to a fixed length (easier for parsing).

The abbreviations are specified at deeponto.onto.verbalisation.ABBREVIATION_DICT.

Parameters:

Name Type Description Default
owl_expression str

The string representation of an OWLObject.

required

Returns:

Type Description
str

The modified string representation of this OWLObject where the logical operators are abbreviated.
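A self-contained sketch of this replacement step is given below. The dictionary is only an assumed subset of `ABBREVIATION_DICT`, limited to the three operators whose abbreviations appear in the parser example on this page; the names `ASSUMED_ABBREVIATIONS` and `abbreviate` are illustrative.

```python
# Assumed subset of the abbreviation table, taken from the parser example on this page.
ASSUMED_ABBREVIATIONS = {
    "EquivalentClasses": "[EQV]",
    "ObjectIntersectionOf": "[AND]",
    "ObjectSomeValuesFrom": "[EX.]",
}


def abbreviate(owl_expression: str, table=ASSUMED_ABBREVIATIONS) -> str:
    """Replace each operator name with its fixed-length abbreviation."""
    for full_name, short_name in table.items():
        owl_expression = owl_expression.replace(full_name, short_name)
    return owl_expression


OBO = "http://purl.obolibrary.org/obo/"
text = (
    f"ObjectIntersectionOf(<{OBO}FOODON_00002044> "
    f"ObjectSomeValuesFrom(<{OBO}RO_0001000> <{OBO}FOODON_03412116>))"
)
print(abbreviate(text))
```

Every abbreviation is five characters long, which lets the parser locate the operator name at a fixed offset before each opening parenthesis.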

Source code in src/deeponto/onto/verbalisation.py
def abbreviate_owl_expression(self, owl_expression: str):
    r"""Abbreviate the string representations of logical operators to a
    fixed length (easier for parsing).

    The abbreviations are specified at `deeponto.onto.verbalisation.ABBREVIATION_DICT`.

    Args:
        owl_expression (str): The string representation of an `OWLObject`.

    Returns:
        (str): The modified string representation of this `OWLObject` where the logical operators are abbreviated.
    """
    for k, v in ABBREVIATION_DICT.items():
        owl_expression = owl_expression.replace(k, v)
    return owl_expression

parse(owl_expression)

Parse an OWLAxiom into a RangeNode.

This is the main entry for using the parser, which relies on the parse_by_parentheses method below.

Parameters:

Name Type Description Default
owl_expression Union[str, OWLObject]

The string representation of an OWLObject or the OWLObject itself.

required

Returns:

Type Description
RangeNode

A syntactic tree parsed according to the type of parentheses being matched.

Source code in src/deeponto/onto/verbalisation.py
def parse(self, owl_expression: Union[str, OWLObject]) -> RangeNode:
    r"""Parse an `OWLAxiom` into a [`RangeNode`][deeponto.onto.verbalisation.RangeNode].

    This is the main entry for using the parser, which relies on the [`parse_by_parentheses`][deeponto.onto.verbalisation.OntologySyntaxParser.parse_by_parentheses]
    method below.

    Args:
        owl_expression (Union[str, OWLObject]): The string representation of an `OWLObject` or the `OWLObject` itself.

    Returns:
        (RangeNode): A parsed syntactic tree given what parentheses to be matched.
    """
    if not isinstance(owl_expression, str):
        owl_expression = str(owl_expression)
    owl_expression = self.abbreviate_owl_expression(owl_expression)
    # print("To parse the following (transformed) axiom text:\n", owl_expression)
    # parse complex patterns first
    cur_parsed = self.parse_by_parentheses(owl_expression)
    # parse the IRI patterns later
    return self.parse_by_parentheses(owl_expression, cur_parsed, for_iri=True)

parse_by_parentheses(owl_expression, already_parsed=None, for_iri=False) classmethod

Parse an OWLAxiom based on parentheses matching into a RangeNode.

This function needs to be applied twice to get a fully parsed RangeNode because IRIs have a different parenthesis pattern.
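The two passes can be sketched with a plain stack, without building a tree. `match_spans` is a hypothetical simplification that returns flat `(name, start, end)` tuples instead of inserting RangeNode children, and it raises on an unmatched closing parenthesis where the method prints a message. The input is the abbreviated form of the axiom from the parser example.

```python
def match_spans(expression: str, for_iri: bool = False):
    """Collect (name, start, end) spans by matching "()" or, for IRIs, "<>"."""
    left, right = ("<", ">") if for_iri else ("(", ")")
    stack, spans = [], []
    for i, char in enumerate(expression):
        if char == left:
            stack.append(i)
        elif char == right:
            if not stack:
                raise ValueError("too many closing parentheses")
            start = stack.pop()
            if for_iri:
                name = expression[start : i + 1].split("/")[-1].rstrip(">")
                spans.append((name, start, i + 1))
            else:
                # the 5-character operator tag, e.g. "[AND]", sits right before "("
                spans.append((expression[start - 4 : start - 1], start - 5, i + 1))
    if stack:
        raise ValueError("too many opening parentheses")
    return spans


OBO = "http://purl.obolibrary.org/obo/"
expression = (
    f"[EQV](<{OBO}FOODON_00001707> "
    f"[AND](<{OBO}FOODON_00002044> "
    f"[EX.](<{OBO}RO_0001000> <{OBO}FOODON_03412116>)) )"
)
operators = match_spans(expression)          # first pass: "(" and ")"
iris = match_spans(expression, for_iri=True)  # second pass: "<" and ">"
print(operators)
print(iris)
```

The spans agree with the ranges in the rendered tree of the parser example, e.g. `EX.@[110:209]` and `FOODON_00001707@[6:54]`.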

Parameters:

Name Type Description Default
owl_expression str

The string representation of an OWLObject.

required
already_parsed RangeNode

A partially parsed RangeNode to continue with. Defaults to None.

None
for_iri bool

Parentheses are by default () but will be changed to <> for IRIs. Defaults to False.

False

Raises:

Type Description
RuntimeError

Raised when the input axiom text is not properly formatted.

Returns:

Type Description
RangeNode

A syntactic tree parsed according to the type of parentheses being matched.

Source code in src/deeponto/onto/verbalisation.py
@classmethod
def parse_by_parentheses(
    cls, owl_expression: str, already_parsed: RangeNode = None, for_iri: bool = False
) -> RangeNode:
    r"""Parse an `OWLAxiom` based on parentheses matching into a [`RangeNode`][deeponto.onto.verbalisation.RangeNode].

    This function needs to be applied twice to get a fully parsed [`RangeNode`][deeponto.onto.verbalisation.RangeNode] because IRIs have
    a different parenthesis pattern.

    Args:
        owl_expression (str): The string representation of an `OWLObject`.
        already_parsed (RangeNode, optional): A partially parsed [`RangeNode`][deeponto.onto.verbalisation.RangeNode] to continue with. Defaults to `None`.
        for_iri (bool, optional): Parentheses are by default `()` but will be changed to `<>` for IRIs. Defaults to `False`.

    Raises:
        RuntimeError: Raised when the input axiom text is not properly formatted.

    Returns:
        (RangeNode): A parsed syntactic tree given what parentheses to be matched.
    """
    if not already_parsed:
        # a root node that covers the entire sentence
        parsed = RangeNode(0, math.inf, name=f"Root", text=owl_expression, is_iri=False)
    else:
        parsed = already_parsed
    stack = []
    left_par = "("
    right_par = ")"
    if for_iri:
        left_par = "<"
        right_par = ">"

    for i, c in enumerate(owl_expression):
        if c == left_par:
            stack.append(i)
        if c == right_par:
            try:
                start = stack.pop()
                end = i
                if not for_iri:
                    # the first character is actually "["
                    real_start = start - 5
                    axiom_type = owl_expression[real_start + 1 : start - 1]
                    node = RangeNode(
                        real_start,
                        end + 1,
                        name=f"{axiom_type}",
                        text=owl_expression[real_start : end + 1],
                        is_iri=False,
                    )
                    parsed.insert_child(node)
                else:
                    # no preceding characters for just atomic class (IRI)
                    abbr_iri = owl_expression[start : end + 1].split("/")[-1].rstrip(">")
                    node = RangeNode(
                        start, end + 1, name=abbr_iri, text=owl_expression[start : end + 1], is_iri=True
                    )
                    parsed.insert_child(node)
            except IndexError:
                print("Too many closing parentheses")

    if stack:  # check if stack is empty afterwards
        raise RuntimeError("Too many opening parentheses")

    return parsed
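
The two-pass idea (first `()` for logical operators, then `<>` for IRIs) can be sketched without any dependencies. This is a simplified illustration, not the library's implementation: `match_spans` is a hypothetical helper that returns plain `(start, end, name)` tuples instead of a `RangeNode` tree. The example string assumes that each operator is abbreviated to a three-character tag in square brackets (such as `[AND]`), which is what the `start - 5` offset in the listing above suggests. Unlike that listing, the sketch raises on an unmatched closing parenthesis rather than printing a message.

```python
def match_spans(expression, for_iri=False):
    """Return (start, end, name) spans found by matching one bracket type."""
    left, right = ("<", ">") if for_iri else ("(", ")")
    stack, spans = [], []
    for i, c in enumerate(expression):
        if c == left:
            stack.append(i)  # remember where this bracket opened
        elif c == right:
            if not stack:
                raise RuntimeError("Too many closing parentheses")
            start = stack.pop()
            if for_iri:
                # an IRI has no preceding tag; abbreviate it to its last path segment
                text = expression[start : i + 1]
                spans.append((start, i + 1, text.split("/")[-1].rstrip(">")))
            else:
                # assumed layout: "[XXX](", so the span starts 5 characters earlier
                real_start = start - 5
                spans.append((real_start, i + 1, expression[real_start + 1 : start - 1]))
    if stack:
        raise RuntimeError("Too many opening parentheses")
    return spans


# hypothetical abbreviated expression, used for illustration only
expression = "[AND](<http://ex.org/A> [EX.](<http://ex.org/r> <http://ex.org/B>))"
operator_spans = match_spans(expression)          # first pass: ()
iri_spans = match_spans(expression, for_iri=True)  # second pass: <>
```

Inner spans are emitted before the spans that enclose them, because a span is only completed when its closing bracket is reached.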

RangeNode(start, end, name=None, **kwargs)

Bases: NodeMixin

A tree implementation for ranges (without partial overlap).

  • Parent node's range fully covers child node's range, e.g., [1, 10] is a parent of [2, 5].
  • Partial overlap between ranges is not allowed, e.g., [2, 4] and [3, 5] cannot appear in the same RangeNode tree.
  • Non-overlapping ranges are on different branches (they are "irrelevant" to each other).
  • Child nodes are ordered according to their relative positions.
Source code in src/deeponto/onto/verbalisation.py
def __init__(self, start, end, name=None, **kwargs):
    if start >= end:
        raise RuntimeError("invalid start and end positions ...")
    self.start = start
    self.end = end
    self.name = "Root" if not name else name
    self.name = f"{self.name}@[{self.start}:{self.end}]"  # add start and end to the name
    for k, v in kwargs.items():
        setattr(self, k, v)
    super().__init__()

__gt__(other)

Compare two ranges if they have a different start and/or a different end.

  • \(R_1 \lt R_2\): if range \(R_1\) is completely contained in range \(R_2\), and \(R_1 \neq R_2\).
  • \(R_1 \gt R_2\): if range \(R_2\) is completely contained in range \(R_1\), and \(R_1 \neq R_2\).
  • "irrelevant": if range \(R_1\) and range \(R_2\) have no overlap.

Warning

Partial overlap is not allowed.

Source code in src/deeponto/onto/verbalisation.py
def __gt__(self, other: RangeNode):
    r"""Compare two ranges if they have a different `start` and/or a different `end`.

    - $R_1 \lt R_2$: if range $R_1$ is completely contained in range $R_2$, and $R_1 \neq R_2$.
    - $R_1 \gt R_2$: if range $R_2$ is completely contained in range $R_1$,  and $R_1 \neq R_2$.
    - `"irrelevant"`: if range $R_1$ and range $R_2$ have no overlap.

    !!! warning

        Partial overlap is not allowed.
    """
    # ranges inside
    if self.start <= other.start and other.end <= self.end:
        return True

    # ranges outside
    if other.start <= self.start and self.end <= other.end:
        return False

    if other.end < self.start or self.end < other.start:
        return "irrelevant"

    raise RuntimeError("Compared ranges have a partial overlap.")
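
To make the four outcomes concrete, the sketch below replays the same branch order on plain `(start, end)` tuples. `compare` is a hypothetical stand-alone function introduced only for illustration; it is not part of the library.

```python
def compare(r1, r2):
    """Mirror the branch order of the comparison above for (start, end) tuples."""
    s1, e1 = r1
    s2, e2 = r2
    if s1 <= s2 and e2 <= e1:  # r2 lies inside r1
        return True
    if s2 <= s1 and e1 <= e2:  # r1 lies inside r2
        return False
    if e2 < s1 or e1 < s2:     # no overlap at all
        return "irrelevant"
    raise RuntimeError("Compared ranges have a partial overlap.")


print(compare((1, 10), (2, 5)))  # containment: first range is the larger one
print(compare((2, 5), (1, 10)))  # containment: first range is the smaller one
print(compare((1, 3), (5, 8)))   # disjoint ranges
```

Two details follow from the branch order. Identical ranges satisfy the first condition, so they compare as `True`. And because `"irrelevant"` is a non-empty string and therefore truthy, callers that need a strict answer should test the result with `is True`, as `insert_child` does.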

sort_by_start(nodes) staticmethod

A sorting function that sorts the nodes by their starting positions.

Source code in src/deeponto/onto/verbalisation.py
@staticmethod
def sort_by_start(nodes: List[RangeNode]):
    """A sorting function that sorts the nodes by their starting positions."""
    temp = {sib: sib.start for sib in nodes}
    return list(dict(sorted(temp.items(), key=lambda item: item[1])).keys())
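
For distinct nodes, the dictionary round trip above yields the same order as a direct `sorted` call keyed on `start`. The sketch below checks this on a hypothetical `Span` stand-in (a frozen dataclass, so that instances are hashable like the real nodes); neither `Span` nor `sort_by_start_simple` exists in the library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical stand-in for a node that only carries its range."""
    start: int
    end: int


def sort_by_start(nodes):
    # same approach as the listing above: map node -> start, sort the items, keep the keys
    temp = {n: n.start for n in nodes}
    return list(dict(sorted(temp.items(), key=lambda item: item[1])).keys())


def sort_by_start_simple(nodes):
    # equivalent result for distinct nodes, without the intermediate dictionaries
    return sorted(nodes, key=lambda n: n.start)


spans = [Span(10, 24), Span(0, 25), Span(6, 9)]
assert sort_by_start(spans) == sort_by_start_simple(spans)
```

One difference is that the dictionary version collapses repeated entries of the same node into one, whereas `sorted` keeps every entry.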

insert_child(node)

Insert a child RangeNode.

Child nodes have a smaller (inclusive) range, e.g., [2, 5] is a child of [1, 6].

Source code in src/deeponto/onto/verbalisation.py
def insert_child(self, node: RangeNode):
    r"""Insert a child [`RangeNode`][deeponto.onto.verbalisation.RangeNode].

    Child nodes have a smaller (inclusive) range, e.g., `[2, 5]` is a child of `[1, 6]`.
    """
    if node > self:
        raise RuntimeError("invalid child node")
    if node.start == self.start and node.end == self.end:
        # duplicated node
        return
    # print(self.children)
    if self.children:
        inserted = False
        for ch in self.children:
            if (node < ch) is True:
                # print("further down")
                ch.insert_child(node)
                inserted = True
                break
            elif (node > ch) is True:
                # print("insert in between")
                ch.parent = node
                # NOTE: do not break here, as the new node could be the parent of multiple children
                # break
            # NOTE: the equal case is when two nodes are exactly the same, no operation needed
        if not inserted:
            self.children = list(self.children) + [node]
            self.children = self.sort_by_start(self.children)
    else:
        node.parent = self
        self.children = [node]
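
The insertion logic has three cases: descend into a child that contains the new node, let the new node adopt any children it contains, or attach it as a new sibling. The sketch below is a simplified, dependency-free re-implementation of that idea, not the library's code: `Span` and its methods are hypothetical, it uses plain lists instead of `NodeMixin`, and it silently ignores a node whose range equals the current one.

```python
import math


class Span:
    """Hypothetical minimal range-tree node (no partial overlaps expected)."""

    def __init__(self, start, end, name):
        self.start, self.end, self.name = start, end, name
        self.children = []

    def contains(self, other):
        return self.start <= other.start and other.end <= self.end

    def insert(self, node):
        if not self.contains(node):
            raise ValueError("node is not inside this range")
        if (node.start, node.end) == (self.start, self.end):
            return  # duplicate range, nothing to do
        for ch in self.children:
            if ch.contains(node):
                ch.insert(node)  # case 1: belongs further down
                return
        # case 2: the new node may sit between this node and several children
        adopted = [ch for ch in self.children if node.contains(ch)]
        kept = [ch for ch in self.children if ch not in adopted]
        node.children = sorted(node.children + adopted, key=lambda n: n.start)
        # case 3 (and the tail of case 2): attach here, ordered by start
        self.children = sorted(kept + [node], key=lambda n: n.start)

    def outline(self, depth=0):
        lines = ["  " * depth + f"{self.name}@[{self.start}:{self.end}]"]
        for ch in self.children:
            lines.extend(ch.outline(depth + 1))
        return lines


root = Span(0, math.inf, "Root")
# spans given innermost-first, as a parenthesis matcher would emit them
for span in [(10, 24, "EX."), (0, 25, "AND"), (6, 9, "A"), (16, 19, "r"), (20, 23, "B")]:
    root.insert(Span(*span))
print("\n".join(root.outline()))
```

Inserting `AND` after `EX.` exercises the adoption case: `EX.` is moved from the root to become a child of `AND`, which is why the loop in the listing above must not stop at the first adopted child.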

render_tree()

Render the whole tree.

Source code in src/deeponto/onto/verbalisation.py
def render_tree(self):
    """Render the whole tree."""
    return RenderTree(self)

render_image()

Generate a temporary range_node.png file and display it.

To make this visualisation work, install graphviz, e.g.,

sudo apt install graphviz
Source code in src/deeponto/onto/verbalisation.py
def render_image(self):
    """Generate a temporary `range_node.png` file and display it.

    To make this visualisation work, install `graphviz`, e.g.,

    ```bash
    sudo apt install graphviz
    ```
    """
    RenderTreeGraph(self).to_picture("range_node.png")
    return Image("range_node.png")

Last update: August 7, 2023
Created: January 24, 2023
GitHub: @Lawhy   Personal Page: yuanhe.wiki