Overview
- Head of the emerging junior research group for Scientific Knowledge Engineering (ontology and thesaurus engineering and alignment, domain vocabulary design for technical, industrial, and library-specific use cases).
Motivation and context description
The SKE research and development group focuses on knowledge engineering services in and for information infrastructure institutions. These include libraries, but also specialized research institutes and industry stakeholders dealing with data and information resources of all kinds.
For many domains, even technical disciplines, advanced knowledge organization systems (KOS) are still lacking, and many institutions use in-house, offline solutions. For smaller specialized institutions in particular, transferring their KOS into a standard format to make them interoperable is a considerable challenge because of shortages in staff and technological competence.
Services pertaining to knowledge organization can range from training and consultation through structural analyses to concrete tools and the hosting of software solutions, and help clients to:
- analyze textual and other (possibly heterogeneous) data in order to extract terms, entities, relations and other semantic content
- conceptualize/model and build their domain-specific or interdisciplinary vocabulary
- maintain their vocabulary (distributed, collaborative editing; using for example a VoCol instance)
- discover structural flaws in their KOS (e.g., cyclic structures) and correct them
- transform their vocabulary into a standard format, publish it, and handle updates
- align their vocabulary with other KOS (linking, merging, concordances); multilingual enrichment and (virtual) clustering; interaction with systems such as DBpedia, Wikidata
- create visualizations for various purposes (structural analyses, exploration)
- integrate their KOS into a knowledge organization, resource discovery, or information retrieval system (for example for query expansion, or for query answering applications)
- apply knowledge engineering solutions to their own specific use cases.
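To illustrate one of the services listed above, the detection of cyclic structures in a KOS, here is a minimal sketch. It assumes the hierarchy is available as a plain mapping from each concept to its broader concepts. The function name `find_broader_cycle` and the toy vocabulary are hypothetical and chosen only for illustration. They do not represent an existing TIB or VoCol tool.

```python
from typing import Dict, List, Optional


def find_broader_cycle(broader: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle in the broader-term graph, or None if it is acyclic.

    `broader` maps a concept label to the labels of its broader concepts
    (an assumed, simplified input format). The search is a depth-first
    traversal; it is recursive, so very deep hierarchies may need an
    iterative variant.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state: Dict[str, int] = {}
    path: List[str] = []  # concepts on the current depth-first path

    def visit(concept: str) -> Optional[List[str]]:
        state[concept] = IN_PROGRESS
        path.append(concept)
        for parent in broader.get(concept, []):
            parent_state = state.get(parent, UNSEEN)
            if parent_state == IN_PROGRESS:
                # The parent is already on the current path: a cycle is closed.
                return path[path.index(parent):] + [parent]
            if parent_state == UNSEEN:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[concept] = DONE
        return None

    for concept in list(broader):
        if state.get(concept, UNSEEN) == UNSEEN:
            cycle = visit(concept)
            if cycle:
                return cycle
    return None


# Hypothetical toy thesaurus containing one faulty broader-term link.
toy_kos = {
    "gear pump": ["pump"],
    "pump": ["fluid machinery"],
    "fluid machinery": ["gear pump"],  # faulty link that closes the cycle
    "valve": ["fluid machinery"],
}

cycle = find_broader_cycle(toy_kos)
if cycle:
    print(" -> ".join(cycle))
```

In practice the mapping would be derived from the vocabulary's hierarchical relation, and the reported path tells an editor which link to inspect before correcting the structure.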
Corresponding service platforms will be assembled by evaluating and applying existing software solutions that are or will be available at the TIB (e.g., vocabulary curation: VoCol; question answering: QAnary) and combining them with appropriate in-house developments. All developed services will be designed with the intention of establishing them as standard parts of the TIB portfolio for customers from science and industry, and of providing them for open and constructive reuse by libraries and other information infrastructure institutions via suitable cloud-based development repositories.
The TIB is, and has been, involved in a number of knowledge engineering projects and collaborations. These include the development of the thesaurus “Technik und Management” with WTI-Frankfurt eG, several work packages within the Specialized Information Service (“Fachinformationsdienst”; FID) for Mathematics with SUB Göttingen and for mobility and transportation (MOVE) with SLUB Dresden, as well as the German and the international VIVO ontology improvement task forces. Ideas for future activities include a transdisciplinary, ontology-based knowledge network for chemistry, pharmacy, and bioinformatics, and other domain-specific or interdisciplinary vocabulary service packages tailored to the specific needs of the respective target group.
Research themes
Smart Factory (“Industrie 4.0”)
Although the idea of the Smart Factory has been around for almost a decade and there have been a number of initiatives to lay the groundwork for semantic layers in industrial production (see, for example, the “Reference Architectural Model RAMI 4.0 and the Industrie 4.0 Component”), the concrete use of knowledge engineering methods in industry is only just emerging. The aim of the SKE research group is to demonstrate, in scientific case studies with industrial partners, how ontology-based handling of metadata in this sector can optimize workflows and production output.
BOOST 4.0: As a member of the L3S, TIB also participates in the BOOST 4.0 project (2017-), which aims to demonstrate an open, highly standardized, shared data-driven Factory 4.0 model through ten lighthouse factories. With respect to knowledge engineering, TIB will be responsible for a notable contribution to Task 3.2 “Semantic Models, Vocabularies & Registry”, which comprises the design of the models that will ensure semantic interoperability at each stage of the factory lifecycle. The main component will be a network of vocabularies covering the core concepts of manufacturing. These vocabularies will be maintained in the VoCol environment, which will be enhanced and adapted accordingly by the University of Bonn and L3S/TIB.
TIB is in contact with the AG IV “Semantic Web, Ontologies and Web Chain” of ZVEI and has built an example ontology for an industrial application (MAT label, for product package traceability). The SKE research group aims to establish many more contacts with stakeholders and working groups in the Factory 4.0 area and with enterprises wishing to extend their use of semantic data and knowledge management methods, including small and medium-sized enterprises.
Library-related knowledge organization systems
The second focus of this research group is on library-related KOS, in particular the German Authority File (GND) and various classifications, and on tools that facilitate subject indexing and the enrichment of the relevant KOS for subject librarians in their everyday work, for example via term extraction methods.
It is vital that libraries keep pace with new developments in semantic technologies. The exchange should be mutual, however: modern semantic technologies can in turn benefit from the cultivated domain expertise contained in the KOS used in libraries and other specialized information infrastructure institutions. Transferring the raw material represented by those vocabularies and classifications into standard formats, cleaning it up, and making it interoperable, in combination with statistical, neural, and other methods of automation, can add depth and sharpness to semantic retrieval systems that have so far tended to focus on maximal coverage at a more superficial level. This may, for example, result in better-performing scientific question answering systems. The benefits of such an exchange should be explored by examining whether the virtues of such concise kernels of domain expertise can be amplified by modern methods of automation, and thus preserved in big data scenarios and interdisciplinary contexts, and by locating the ideal trade-off point for such a symbiosis.
TIB collaborates with the German "GND-Kooperative" (organized by the German National Library) on advancing the GND towards a version that is compatible with modern semantic technologies. A project proposal with the objective of developing an intelligent visualization and structure validation tool for the GND is being planned.