notes-cog-ai-reasoning-descriptionLogics

my summary: for applications in which polytime inference is desired, use https://www.w3.org/TR/owl2-profiles/#OWL_2_EL

DL intros: On beyond Gruber: “Ontologies” in today’s biomedical information systems and the limits of OWL (2019) https://en.wikipedia.org/wiki/Description_logic http://www.matrix.umcs.lublin.pl/~akrajka/Ontologia/Materials/IntroDL.pdf https://www.uio.no/studier/emner/matnat/ifi/INF3170/h15/undervisningsmateriale/dl1.pdf https://corescholar.libraries.wright.edu/cgi/viewcontent.cgi?article=1184&context=cse

DL link: http://dl.kr.org/

DL book (looks more detailed than i want, but not sure): https://www.cambridge.org/core/books/an-introduction-to-description-logic/6D329698AFC2E6C6C5C15801ED9B6D07 $40 on amazon, 2017 edition (most recent, i think): https://www.amazon.com/Introduction-Description-Logic-Franz-Baader-dp-0521695422/dp/0521695422/ref=mt_other?_encoding=UTF8&me=&qid= contents:http://assets.cambridge.org/97805218/73611/toc/9780521873611_toc.pdf

Popular languages/logics to learn about

about ALC: https://en.wikipedia.org/wiki/Description_logic#Formal_description https://www.lesliesikos.com/alc-description-logic/

https://en.wikipedia.org/wiki/Description_logic mentions 3 families (or at least 'base logics') of DL: AL, FL, EL. It suggests that a variant ALC of AL is the most popular base to work with (a "centrally important description logic from which comparisons with other varieties can be made"; [1] says "The prototypical DL Attributive Concept Language with Complements ( A L C {\displaystyle {\mathcal {ALC}}} {\mathcal {ALC}}) was introduced by Manfred Schmidt-Schauß and Gert Smolka in 1991, and is the basis of many more expressive DLs"). In the Examples section it gives a lot of examples of ALC (note: the S prefix is just an abbreviation for ALC with transitive roles; so eg SHOIN is an ALC variant), and a few of EL ("Three major biomedical informatics terminology bases, SNOMED CT, GALEN, and GO, are expressible in EL (with additional role properties). "). Both Protege and OWL-DL are SHOIN(D).

So maybe we can omit studying FL for now. Should i learn about the EL family too?

https://www.google.com/search?q=shoin(d)+vs+"el"+Description+logics https://www.lesliesikos.com/description-logics/ https://www.uio.no/studier/emner/matnat/ifi/INF3170/h15/undervisningsmateriale/dl1.pdf https://arxiv.org/pdf/1401.5850.pdf http://dai.fmph.uniba.sk/~sefranek/kri/handbook/chapter03.pdf or other version http://www.cs.ox.ac.uk/people/ian.horrocks/Publications/download/2007/BaHS07a.pdf https://corescholar.libraries.wright.edu/cgi/viewcontent.cgi?article=1184&context=cse https://www.researchgate.net/publication/271976273_Extending_description_logic_SHOIN_based_on_cloud_model https://aifb.kit.edu/images/7/72/TR3005-OWL-EL-Reasoning.pdf http://www.matrix.umcs.lublin.pl/~akrajka/Ontologia/Materials/IntroDL.pdf

"Some DL languages have overlapping constructs, as, for example, union and full existential quantification can be expressed using negation, therefore U and E are never used together in DL names, and C is used instead." [2]

" By extending ALC and transitivity roles (i.e., S) with role hierarchies (H), inverse roles (I), functional properties (F), and datatypes (D), we get the SHIF(D) description logic, which roughly corresponds to OWL Lite. By adding nominals (O) and cardinality restrictions (N) to SHIF(D), we get SHOIN(D), the description logic underlying OWL DL. After industrial applications highlighted several key features missing from SHOIN(D) to model complex knowledge domains, SHOIN(D) has been extended with complex role inclusion axioms, reflexive and irreflexive roles, asymmetric roles, disjoint roles, the universal role, self-constructs, negated role assertions, and qualified number restrictions, leading to SROIQ(D), one of the most expressive description logic whose decidability is proven, and which largely corresponds to OWL 2 DL. Furthermore, SROIQ(D) supports not only TBox and ABox axioms, but also so-called Role Boxes (RBox) to collect all statements related to roles and the interdependencies between roles. Each RBox consists of a role hierarchy (including generalized role inclusion axioms) and a set of role assertions. " [3]

"Notable Description Logic Subfamilies

    General Description Logics
        Basic Description Logics and Their Derivatives: AL, ALC, ALCHIF, ALCIQ, etc.
        The EL Family of Description Logics: EL, ELRO (EL++), etc.
        The DL-Lite Family of Description Logics: DL-Litecore, DL-Litebool, DL-Litekrom, DL-Litehorn, etc.
        Frame-Based Description Logics (FL)
        The SH Family of Description Logics: SHOIN, SROIQ, etc." [4]

https://www.uio.no/studier/emner/matnat/ifi/INF3170/h15/undervisningsmateriale/dl1.pdf speaks of EL in the "Common restricted languages" section. It also mentions DL-Lite and RL. It notes that OWL has various profiles corresponding to different DLs. Main ones are OWL Lite: SHIF(D), OWL DL: SHION(D), OWL 2 DL: SROIQ(D). But "(Other) profiles are tailored for specific ends, e.g.,

"OWL 2 DL: corresponds to SROIQ(D) and is the “normal” OWL 2 (sublanguage): “maximum” expressivity while keeping reasoning problems decidable—but still very expensive;" [5]

OWL also has another, not DL profile: " – OWL Full (not a proper DL): Anything goes: classes, relations, individuals, highly expressive, not decidable. But we want OWL’s reasoning capabilities, so stay away if you can—and you almost always can." " [6]

FL and EL are 'structural approaches'; intro, context, history: http://dai.fmph.uniba.sk/~sefranek/kri/handbook/chapter03.pdf " Structural Approaches

As mentioned in the introduction, early DL systems were based on so-called structural subsumption algorithms, which first normalize the concepts to be tested for subsump- tion, and then compare the syntactic structure of the normalized concepts. The claim was that these algorithms can decide subsumption in polynomial time. However, the first complexity results for DLs, also mentioned in the introduction, showed that these algorithms were neither polynomial nor decision procedures for subsumption. For ex- ample, all early systems used unfolding of concept definitions, which can cause an exponential blow-up of the size of concepts. Nebel’s coNP-hardness result [127] for subsumption with respect to definitorial TBoxes showed that this blow-up cannot be avoided whenever the constructors conjunction and value restriction are available. In addition, the early structural subsumption algorithms were not complete, i.e., they were not able to detect all valid subsumption relationships. These negative results for structural subsumption algorithms together with the advent of tableau based al- gorithms for expressive DLs, which behaved well in practice, was probably the main reason why structural approaches—and with them the quest for DLs with a polynomial subsumption problem—were largely abandoned during the 1990s. More recent results [11, 42, 6] on the complexity of reasoning in DLs with existential restrictions, rather than value restrictions, have led to a partial rehabilitation of structural approaches. "

http://dai.fmph.uniba.sk/~sefranek/kri/handbook/chapter03.pdf compares FL and EL: " When trying to find a DL with a polynomial subsumption problem...This leads to the following two minimal candidate DLs:

the DL F L0, which offers the concept constructors conjunction, value restriction (\forall r.C), and the top concept; • the DL EL, which offers the concept constructors conjunction, existential re- striction (\exists r.C), and the top concept. In the following, we will look at the subsumption problem9 in these two DLs in some detail. Whereas subsumption without a TBox turns out to be polynomial in both cases, we will also see that EL exhibits a more robust behavior with respect to the complexity of the subsumption problem in the presence of TBoxes...In contrast to the negative complexity results for subsumption with respect to TBoxes in F L0, subsumption in EL remains polynomial even in the presence of general TBoxes [42] "

" subsumption in EL remains polynomial even in the presence of gen- eral TBoxes [S. Brandt. Polynomial time reasoning in a description logic with existential re- strictions, GCI axioms, and—what else? In R. López de Mántaras and L. Saitta, editors, Proc. of the 16th Eur. Conf. on Artificial Intelligence (ECAI 2004), pages 298–302, 2004.] ... In [6] this result is extended to the DL EL++, which extends EL with the bottom concept, nominals, a restricted form of concrete domains, and a restricted form of so- called role-value maps. In addition, it is shown in [6] that basically all other additions of typical DL constructors to EL make subsumption with respect to general TBoxes EXPTIME-complete.

It should be noted that these results are not only of theoretical interest. In fact, large bio-medical ontologies such as the Gene Ontology [55] and SNOMED [153] can be expressed in EL, and a first implementation of the subsumption algorithm for EL sketched above behaves very well on these very large knowledge bases [23]. " -- [7]

are there other things like decision logics to learn about (Horn clauses is one):

" Other Logics Up to now, we have considered theorem proving in general first-order logic. However, there are many more specialized logics for which more efficient methods exist. Such logics fix the domain of the interpretation, such as to the reals or integers, and also the interpretations of some of the symbols, such as “+” and “∗”. Examples of theories con- sidered include Presburger arithmetic, the first-order theory of natural numbers with addition [200], Euclidean and non-Euclidean geometry [272, 55], inequalities involv- ing real polynomials (for which Tarski first gave a decision procedure) [52], ground equalities and inequalities, for which congruence closure [191] is an efficient decision procedure, modal logic, temporal logic, and many more specialized logics. Theorem proving for ground formulas of first-order logic is also known as satisfiability mod- ulo theories (SMT) in the literature. Description logics [8], discussed in Chapter 3 of this Handbook, are sublanguages of first-order logic, with extensions, that often have efficient decision procedures and have applications to the semantic web. Specialized logics are often built into provers or logic programming systems using constraints [33]. The idea of using constraints in theorem proving has been around for some time [143]. Another specialized area is that of computing polynomial ideals, for which efficient methods have been developed [44]. An approach to combining decision pro- cedures was given in [190] and there has been continued interest in the combination of decision procedures since that time. " -- [8]

"...Deductive reasoning is too expensive...special-purpose reasoning techniques have been developed, e.g., by the description logic community [11] as well as by Cyc (see Section 1.4.2) to determine subsumption and disjointness of classes; and logic programming techniques (for both Prolog (see Section 1.4.4) and answer set programming (see Chapter 7)) have been developed so that relatively ef- ficient inferences can be carried out under certain restricted assumption" -- [9]

" Some Prominent DLs • ALC [Baader and Nutt 2007]: simplest Boolean-closed DL that admits top and bottom concept; concept intersection, union, and complement; value and existential restrictions; TBox axioms; and ABox axioms. Reasoning is ExpTime?-complete. • FL0: simple DL that admits top concept, concept intersection, value restriction, TBox axioms and ABox axioms. It is notable since reasoning is ExpTime?-complete, but polynomial if done with an empty KB [Donini 2007, Baader et al 2005]. • EL: simple DL that allows top concept, concept intersection, existential restriction, TBox axioms and ABox axioms. It is notable since reasoning is polynomial. In fact, SROEL (also known as EL++ [Baader et al 2005]), obtained by adding bottom concept, role hierarchy (thus role equivalence), and general role inclusion to EL is still polynomial. EL++ is adopted for the OWL 2 EL profile of OWL 2 DL standard [Motik et al 2009]. Its sublanguages also found extensive use in biomedical applications. • SHIF: an expressive DL obtained from ALC by adding role transitivity, role hierarchy, inverse roles, functionality (concepts of the form ≤1R and axioms of the form Fun(R)), role symmetry. This DL underlies the OWL 1 Lite standard and has an ExpTime?-complete reasoning [Horrocks and Patel-Schneider 2004, Hayes et al 2004]. • SHOIN : very expressive DL, obtained from SHIF by adding unqualified number restrictions and nominals. It underlies the OWL 1 DL standard and has an NExpTime?-complete reasoning [Horrocks and Patel-Schneider 2004, Hayes et al 2004]. • SROIQ: very expressive DL, obtained from SHOIN by adding general role inclusion, qualified number restriction, role asymmetry, role reflexivity and role irreflexivity. It underlies OWL 2 DL and has an N2ExpTime?-complete reasoning [Horrocks et al 2006, Kazakov 2008, W3C OWL Working Group 2009 " -- [10]

OK... so if we want polynomial reasoning, maybe stick with EL only/EL++/OWL 2 EL! Even OWL 1 Lite and ALC have ExpTime?-complete reasoning. But if ALC is exptime-complete, and EL++ is SROEL, and S is ALC, then how can EL++ be polytime? another source (https://www.lesliesikos.com/description-logics/]) says EL++ is ELRO, not SROEL. That makes more sense.

recall that [11] says that after EL++, "basically all other additions of typical DL constructors to EL make subsumption with respect to general TBoxes EXPTIME-complete"

that paper seems to question the polynomiality of OWL EL/equivalence to EL++: https://aifb.kit.edu/images/7/72/TR3005-OWL-EL-Reasoning.pdf Efficient Inferencing for the Description Logic Underlying OWL EL Markus Krötzsch. Efficient Inferencing for OWL EL. In Tomi Janhunen, Ilkka Niemelä (eds.): Proceedings of the 12th European Confernce on Logics in Artificial Intelligence (JELIA’10), LNAI. Springer, 2010. To appear

"It is widely assumed that inferencing in OWL EL is possible in polynomial time, but it is not obvious how to extend existing reasoning procedures for EL++ accordingly" or maybe it's just that it is, and they prove it, or something close to it: "this paper is – to the best of our knowledge – the first to present a sound and complete polynomial time calculus for inferencing in a DL that is so closely related to the OWL EL ontology language"

"Indeed, our findings agree with practical experiences that especially nominals and role chains are harder to implement efficiently than basic EL features.5 Computational complexity has not been able to provide an explanation for such discrepancies, since all reasoning problems we consider are P-complete. In addition, our study also shows that various other features are not harder to implement than some of the most basic ones, thus pro- viding guidance for deciding which features to implement or to use in an application"

" This revealed the arity of IDB predicates as an interesting mea- sure for the worst-case space requirements of materialisation-based algorithms. While SROEL(⊓, ×) fragments without role chains and nominals admit classification calculi based on binary IDB predicates, the inclusion of either feature increases the required arity by one. Having both features, SROEL(⊓, ×) thus does not admit any sound and complete classification calculus of arity below four "

http://www.cs.man.ac.uk/~ezolin/dl/ Complexity of reasoning in Description Logics tool! it defaults to ALC, and says that ALC is P-SPACE complete. It doesn't seem to have an entries for things less powerful than ALC.

"The PSPACE-complete problems are widely suspected to be outside the more famous complexity classes P (polynomial time) and NP (non-deterministic polynomial time), but that is not known."

" In practice, most biomedical applications of OWL use restricted but more computationally efficient dialects equivalent to the description logic EL++ [51] that underpins OWL 2 EL [27] or SNOMED’s compositional grammar [52]. These restricted dialects lack negation,17 so classes can (almost) never be proved unsatisfiable.18" On beyond Gruber: “Ontologies” in today’s biomedical information systems and the limits of OWL

So the answer to my question is... yes, i should study EL also, and EL++, because they are polytime (and possibly i should ignore the more popular ALC and the SH family).

some citations of papers on EL and EL++: "EL: simple DL that allows top concept, concept intersection, existential restriction, TBox axioms and ABox axioms. It is notable since reasoning is polynomial. In fact, SROEL (also known as EL++ [Baader F, Brandt S, Lutz C (2005) Pushing the EL envelope. In: the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI-05), Morgan-Kaufmann Publishers, Edinburgh, UK]), obtained by adding bottom concept, role hierarchy (thus role equivalence), and general role inclusion to EL is still polynomial. EL++ is adopted for the OWL 2 EL profile of OWL 2 DL standard [Motik B, Fokoue A, Horrocks I, Wu Z, Lutz C, Grau BC (2009) OWL 2 web ontology language profiles. W3C recommendation, W3C, http://www.w3.org/TR/owl2-profiles/. Accessed on Oct 18, 2012.]. Its sublanguages also found extensive use in biomedical applications"

In ". Baader, S. Brandt, and C. Lutz. Pushing the EL envelope. In Proc. of the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI 2005), pages 364–369, 2005." this result is extended to the DL EL++, which extends EL with the bottom concept, nominals, a restricted form of concrete domains, and a restricted form of so- called role-value maps. In addition, it is shown in (ibid) that basically all other additions of typical DL constructors to EL make subsumption with respect to general TBoxes EXPTIME-complete. It should be noted that these results are not only of theoretical interest. In fact, large bio-medical ontologies such as the Gene Ontology [The Gene ontology Consortium. Gene ontology: Tool for the unification of bi- ology. Nature Genetics, 25:25–29, 200] and SNOMED [K.A. Spackman, K.E. Campbell, and R.A. Cote. SNOMED RT: A reference terminology for health care. J. of the American Medical Informatics Associa- tion:640–644, 1997 (Fall Symposium Supplement).] can be expressed in EL, and a first implementation of the subsumption algorithm for EL sketched above behaves very well on these very large knowledge bases [F. Baader, C. Lutz, and B. Suntisrivaraporn. CEL—a polynomial-time reasoner for life science ontologies. In U. Furbach and N. Shankar, editors. Proc. of the Int. Joint Conf. on Automated Reasoning (IJCAR 2006), Lecture Notes in Artifi- cial Intelligence, vol. 4130, pages 287–291. Springer-Verlag, 2006] "

"...known admissibility requirements for range restrictions in EL++ [Baader, F., Brandt, S., Lutz, C.: Pushing the EL envelope further. In: Clark, K.G., Patel- Schneider, P.F. (eds.) Proc. OWLED 2008 DC Workshop on OWL: Experiences and Direc- tions. CEUR Workshop Proceedings, vol. 496. CEUR-WS.org (2008)] " [12]

selected papers from those citations:

https://iccl.inf.tu-dresden.de/w/images/8/81/BaaderBrandtLutz-IJCAI-05.pdf Baader F, Brandt S, Lutz C (2005) Pushing the EL envelope. In: the 19th Int. Joint Conf. on Artificial Intelligence (IJCAI-05) appears to be the paper that introduced EL++

http://webont.org/owled/2008dc/papers/owled2008dc_paper_3.pdf "We extend the description logic EL++ with reflexive roles and range restrictions, and show that subsumption remains tractable if a certain syntactic restriction is adopted. We also show that subsumption becomes PSpace-hard (resp. undecidable) if this restriction is weakened (resp. dropped). Additionally, we prove that tractability is lost when symmetric roles are added: in this case, subsumption becomes ExpTime?- hard"

https://www.w3.org/TR/owl2-profiles/#OWL_2_EL

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.120.44&rep=rep1&type=pdf CEL—A? Polynomial-time Reasoner for Life Science Ontologies Franz Baader, Carsten Lutz, Boontawee Suntisrivaraporn "CEL (Classifier for EL) is a reasoner for the small description logic EL+ which can be used to compute the subsumption hierarchy in- duced by EL+ ontologies. The most distinguishing feature of CEL is that, unlike all other modern DL reasoners, it is based on a polynomial-time subsumption algorithm, which allows it to process very large ontologies in reasonable time. In spite of its restricted expressive power, EL+ is well-suited for formulating life science ontologies"

Wikipedia says "Three major biomedical informatics terminology bases, SNOMED CT, GALEN (mb https://en.wikipedia.org/wiki/OpenGALEN ?), and GO (Gene Ontology; https://en.wikipedia.org/wiki/Gene_Ontology), are expressible in E L {\displaystyle {\mathcal {EL}}} {\mathcal {EL}} (with additional role properties). "

https://www.researchgate.net/publication/314132489_GEL_A_Platform-Independent_Reasoner_for_Parallel_Classification_with_OWL_EL_Ontologies_Using_Graph_Representation GEL: A Platform-Independent Reasoner for Parallel Classification with OWL EL Ontologies Using Graph Representation (2017)

https://www.uni-ulm.de/fileadmin/website_uni_ulm/iui.inst.090/Publikationen/2014/KazKroSim13ELK_JAR.pdf

The Incredible ELK From Polynomial Procedures to Efficient Reasoning with EL Ontologies EL is a simple tractable Description Logic that features conjunctions and existen- tial restrictions. Due to its favorable computational properties and relevance to existing on- tologies, EL has become the language of choice for terminological reasoning in biomedical applications, and has formed the basis of the OWL EL profile of the Web ontology language OWL. This paper describes ELK—a high performance reasoner for OWL EL ontologies— and details various aspects from theory to implementation that make ELK one of the most competitive reasoning systems for EL ontologies available today (2014) also http://ceur-ws.org/Vol-858/ore2012_paper10.pdf 2012

https://www.sciencedirect.com/science/article/pii/S2590177X19300010 On beyond Gruber: “Ontologies” in today’s biomedical information systems and the limits of OWL☆

Notes

"There are two features of description logic that are not shared by most other data description formalisms: DL does not make the unique name assumption (UNA) or the closed-world assumption (CWA). Not having UNA means that two concepts with different names may be allowed by some inference to be shown to be equivalent. Not having CWA, or rather having the open world assumption (OWA) means that lack of knowledge of a fact does not immediately imply knowledge of the negation of a fact. " [13] (more on the Open World Assumption: "DL-based ontologies do not specify a particular interpretation based on a default assumption or a state of the world. Instead, they consider all possible cases in which the axioms are satisfied. This characteristic makes it possible to handle incomplete information, such as statements that have not been explicitly made yet, by keeping unspecified information open, which is called the open world assumption (OWA)." [14])

"Many description logics are more expressive than propositional logic, which deals with declarative propositions and does not use quantifiers, and more efficient in decision problems than first-order predicate logic, which uses predicates and quantified variables over nonlogical objects" [15]

"What cannot be expressed in DLs" eg Brothers ("Given terms {hasSibling, Male}, a brother is defined to be a sibling who is male"), Diamond Properties, Connecting Properties ("Given terms {Person, hasChild, hasBirthday}, A twin parent is defined to be a person who has two children with the same birthday."), general reasoning about numbers (last few slides of [16])

"Details about relationship of OWL 2 to DLs are found in ... Hitzler, P., Krötzsch, M., Rudolph, S.: Foundations of Semantic Web Technologies. Chapman & Hall/CRC (2009)" [17]