Harness the Power of Ontologies for NLP
By definition, an ontology is a data model that formally represents the knowledge people share about a particular domain. Ontologies are used in many different industries and disciplines, from more abstract fields, such as philosophy, to more precise and tangible ones, such as biology and computer science. What makes an ontology special is the way it stores and organizes information: Natural Language Processing engineers use ontologies in tasks such as semantic reasoning and machine translation to give algorithms a fuller picture of the language data being processed. The most common applications of ontologies involve Artificial Intelligence.
Ontologies themselves don’t actually do anything, per se. Rather, an ontology is a formal, single source of domain knowledge, which is essential for streamlining tasks like onboarding or learning. It allows users to define a set of concepts and describe how those concepts relate to each other, which in turn leads to inferences about what is logical, true, and consistent.
In an ontology, a piece of information is stored in one of four forms: as an individual, a concept, an attribute, or a relationship. Concepts represent high-level abstractions of individuals, while individuals represent “real-world” elements. Attributes define properties of particular concepts and individuals, while relationships define the way in which two concepts or two individuals are connected. Concepts, attributes, and relationships are all structured in a hierarchical manner. This hierarchical nature allows for data inference, which is one of the primary reasons ontologies are so well-suited for Natural Language Processing tasks.
For example, consider an ontology that defines the music domain. One concept in the ontology could be “rock”, which has a subconcept, or subclass, “classic_rock”. In addition, there could be another concept, “band”, with the individual “Queen” assigned to it. Lastly, there could be a “plays” relationship between “Queen” and “classic_rock”. With this information, we can then infer that, because classic rock is a subclass of rock, “Queen plays rock” is also a true statement.
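To make that inference concrete, here is a minimal sketch in Python of how the example might be represented and queried. The concept, individual, and relationship names (“rock”, “classic_rock”, “band”, “Queen”, “plays”) come straight from the example above; the dictionary-based hierarchy and the holds function are illustrative assumptions, not a standard ontology language such as OWL or Lymba’s own tooling.

```python
# Concept hierarchy: each concept maps to its parent concept (None = top level).
subclass_of = {
    "classic_rock": "rock",
    "rock": None,
    "band": None,
}

# Individuals and the concepts they are assigned to.
instance_of = {
    "Queen": "band",
}

# Asserted relationships as (subject, relation, object) triples.
relations = [
    ("Queen", "plays", "classic_rock"),
]

def ancestors(concept):
    """Yield a concept and every concept above it in the subclass hierarchy."""
    while concept is not None:
        yield concept
        concept = subclass_of.get(concept)

def holds(subject, relation, obj):
    """Check whether a statement is asserted directly or follows from the hierarchy."""
    for s, r, o in relations:
        if s == subject and r == relation and obj in ancestors(o):
            return True
    return False

print(holds("Queen", "plays", "classic_rock"))  # True: asserted directly
print(holds("Queen", "plays", "rock"))          # True: inferred, since classic_rock is a subclass of rock
print(holds("Queen", "plays", "jazz"))          # False: nothing in the ontology supports it
```

A real ontology would express the same idea with richer machinery (axioms, attribute constraints, a reasoner), but the principle is the same: facts asserted at a specific level of the hierarchy automatically hold at the more general levels above it.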
An ontology can be compared to a truth table populated with specific, real-world concepts. This is essential in artificial intelligence: if you want a machine to answer questions for you, it must know about the real world and what conditions apply in it. Listing out every single fact about the universe would be impossible, so we define these facts in terms of concepts, individuals, relations, and axioms. Without an ontology, humans and computer systems alike would not even be able to agree on the meaning of a word, much less evaluate the logic of statements. With their help, we are able to build machines that can not only understand what we say, but also hold a conversation, draw conclusions about intentions, answer questions, ask for more information, and even write content that is indistinguishable from a human’s writing.
Ontologies are powerful in their practical applications, but they’re not without limitations. First of all, mainstream ontologies are created by people and need to be thorough and extensive enough to cover all of the concepts and conditions in their domain. It is easy for a person creating an ontology to overlook an essential concept, and the omission may not be noticed until a problem becomes apparent in a downstream application. Moreover, ontologies take time to build, which makes them costly to create.
At Lymba, we’re able to overcome these limitations with our semi-automatic Jaguar and Jager applications. Jaguar helps to speed up the process of brainstorming by identifying and organizing the most important concepts in a set of documents, with or without a seed concept. Then, using Jager, the user is able to quickly and easily edit the initial schema and export it as an ontology that can be used alone or in other applications. This way, the process of creating an ontology is reduced from months to days, leaving more time and resources for review and testing in the downstream application.
To learn more about Lymba’s research and publications, click here.