SOEL: Supporting Ontology Engineering with Large Language Models

Grant PID2023-152703NA-I00 funded by MCIN/AEI/10.13039/501100011033 and by “ERDF/EU”

About

Large Language Models (LLMs) have not only propelled but also reshaped the current computer science landscape. Given their strong natural language understanding capabilities, their impact has been most noticeable in the field of Natural Language Processing. However, other areas of Artificial Intelligence may also benefit from these technologies. In this project, we explore the use of LLMs for Ontology Engineering.

SOEL (Supporting Ontology Engineering with Large Language Models) is designed to integrate LLMs into Ontology Engineering by adapting state-of-the-art LLMs and evaluating them through the creation of open reference datasets, evaluation tasks and benchmarks. Although the SOEL approach is domain-independent, the results of the project will be assessed in three domains: IoT, Linguistics and Scientific Research.

Because the research team participates in several initiatives, SOEL is intended to support new and existing internationalisation activities. The project's work can be summarised in three specific objectives:

  1. Characterising LLM Challenge Tasks (CTs) for Ontology Engineering, which focuses on identifying the tasks in which LLMs would help the ontology engineering process. This objective also involves creating datasets for evaluating the identified tasks.
  2. Challenge Task evaluation framework and benchmarks, which focuses on developing benchmarks to evaluate the performance of open-source LLMs applied to the above-mentioned tasks. This objective also involves adapting LLMs to the reference datasets and checking their performance, as well as proposing a metadata model for describing the developed LLMs and datasets. As a result, LLM-powered assistants will be developed with the best-performing models.
  3. Application use cases, which focuses on applying LLM-powered assistants in the three application domains: Open Science, IoT and Linguistics. This objective also involves compiling the results of the use cases and the outcomes of the previous steps to evolve existing Ontology Engineering methodologies by including LLMs in the loop.

Given the research team's prior expertise in knowledge representation and interdisciplinary projects, we believe that the outcomes of SOEL will lay the foundation for accelerating Ontology Engineering through human-machine collaboration.