Formal tree languages and their algorithmic learnability Thesis uri icon

abstract

  • This thesis centers on formal tree languages and on their learnability by algorithmic methods in abstractions of several learning settings. After a general introduction, we present a survey of relevant definitions for the formal tree concept as well as special cases (strings) and refinements (multi-dimensional trees) thereof. In Chapter 3 we discuss the theoretical foundations of algorithmic learning in a specific type of setting of particular interest in the area of Grammatical Inference where the task consists in deriving a correct formal description for an unknown target language from various information sources (queries and/or finite samples) in a polynomial number of steps. We develop a parameterized meta-algorithm that incorporates several prominent learning algorithms from the literature in order to highlight the basic routines which regardless of the nature of the information sources have to be run through by all those algorithms alike. In this framework, the intended target descriptions are deterministic finite-state tree automata. We discuss the limited transferability of this approach to another class of descriptions, residual finite-state tree automata, for which we propose several learning algorithms as well. The learnable class by these techniques corresponds to the class of regular tree languages. In Chapter 4 we outline a recent range of attempts in Grammatical Inference to extend the learnable language classes beyond regularity and even beyond context-freeness by techniques based on syntactic observations which can be subsumed under the term 'distributional learning', and we describe learning algorithms in several settings for the tree case taking this approach. We conclude with some general reflections on the notion of learning from structural information.

publication date

  • 2012