schema-miner pro: Agentic AI for Ontology Grounding Over LLM-Discovered Scientific Schemas in a Human-in-the-Loop Workflow Academic Article

abstract

  • Scientific processes are often described in free text, making it difficult to represent and reason over them computationally. We present schema-miner pro, a human-in-the-loop framework that automatically extracts and grounds structured schemas from scientific literature. Our approach combines large language models for schema extraction with an agent-based system that aligns extracted elements to external ontologies through interpretable, multi-step reasoning. The agent leverages lexical heuristics, semantic similarity, and expert feedback to ensure accurate grounding. We demonstrate the framework on two semiconductor manufacturing workflows (atomic layer deposition and atomic layer etching), mapping process parameters and outputs to the QUDT (Quantities, Units, Dimensions, and Types) ontology. By producing ontology-aligned, semantically precise schemas, schema-miner pro lays the groundwork for machine-actionable scientific knowledge and automated reasoning across disciplines.

authors

  • Sadruddin, Sameer
  • D’Souza, Jennifer
  • Poupaki, Eleni
  • Watkins, Alex
  • Karasulu, Bora
  • Auer, Sören
  • Mackus, Adrie
  • Kessels, Erwin

publication date

  • 2026

start page

  • 22104968261431521

volume

  • 17

issue

  • 3