grammar::fa::op(n) | Finite automaton operations and usage | grammar::fa::op(n) |
grammar::fa::op - Operations on finite automatons
package require Tcl 8.4
package require snit
package require struct::list
package require struct::set
package require grammar::fa::op ?0.4.1?
::grammar::fa::op::constructor cmd
::grammar::fa::op::reverse fa
::grammar::fa::op::complete fa ?sink?
::grammar::fa::op::remove_eps fa
::grammar::fa::op::trim fa ?what?
::grammar::fa::op::determinize fa ?mapvar?
::grammar::fa::op::minimize fa ?mapvar?
::grammar::fa::op::complement fa
::grammar::fa::op::kleene fa
::grammar::fa::op::optional fa
::grammar::fa::op::union fa fb ?mapvar?
::grammar::fa::op::intersect fa fb ?mapvar?
::grammar::fa::op::difference fa fb ?mapvar?
::grammar::fa::op::concatenate fa fb ?mapvar?
::grammar::fa::op::fromRegex fa regex ?over?
::grammar::fa::op::toRegexp fa
::grammar::fa::op::toRegexp2 fa
::grammar::fa::op::toTclRegexp regexp symdict
::grammar::fa::op::simplifyRegexp regexp
This package provides a number of complex operations on finite automatons (FA for short), as provided by the package grammar::fa. It does not provide the ability to create or manipulate such FAs, nor the ability to execute an FA on a stream of symbols; use the packages grammar::fa and grammar::fa::interpreter for that. A related package, grammar::fa::compiler, turns an FA into an executor class that has the definition of the FA hardwired into it.
For more information about what a finite automaton is see section FINITE AUTOMATONS in package grammar::fa.
The package exports the API described here. All commands modify their first argument, i.e. whatever FA they compute is stored back into it. Some of the operations construct an automaton whose states are all new, but related to the states in the source automaton(s). These operations take variable names as optional arguments, in which they store mappings describing the relationship(s). The operations can be loosely partitioned into structural and language operations. The latter are defined in terms of the language the automaton(s) accept, whereas the former are defined in terms of the structural properties of the involved automaton(s). Some operations are both.
Structure operations
Any container class using this package for complex operations should set its own class command as the constructor. See package grammar::fa for an example.
The language of fa is unchanged by this operation.
This is done by adding a single new state, the sink, and transitions from all other states to that sink for all symbols they have no transitions for. The sink itself is made complete by adding loop transitions for all symbols.
Note: When an FA has epsilon-transitions, transitions over a symbol for a state S can be indirect, i.e. not attached directly to S, but to a state in the epsilon-closure of S. The symbols for such indirect transitions count when computing completeness of a state. In other words, these indirectly reached symbols are not missing.
If specified, the argument sink provides the name for the new state and must not already be present in the fa. If the name is not specified the command will name the state "sinkn", where n is set so that there are no collisions with existing states.
Note that the sink state is not useful by definition. In other words, while the FA becomes complete, it is also not useful in the strict sense as it has a state from which no final state can be reached.
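The completion step described above can be sketched in a few lines. This is a minimal illustration in Python, not the package's Tcl implementation. It assumes an epsilon-free FA (so the indirect transitions mentioned in the note do not arise), a transition table stored as a dictionary, and a naming scheme that starts at "sink0"; all of these are assumptions made for the example.

```python
def complete(states, symbols, delta, sink=None):
    """Return (states, delta) of a completed copy of an epsilon-free FA.

    delta maps (state, symbol) -> set of successor states.
    """
    missing = [(s, a) for s in sorted(states) for a in sorted(symbols)
               if not delta.get((s, a))]
    new_delta = {k: set(v) for k, v in delta.items()}
    if not missing:
        # Already complete: this sketch adds no sink in that case.
        return set(states), new_delta
    if sink is None:
        # Assumed naming scheme: the first "sink<n>" that is not taken.
        n = 0
        while f"sink{n}" in states:
            n += 1
        sink = f"sink{n}"
    elif sink in states:
        raise ValueError(f"sink state {sink!r} already exists")
    for s, a in missing:
        new_delta[(s, a)] = {sink}      # route every missing symbol to the sink
    for a in symbols:
        new_delta[(sink, a)] = {sink}   # the sink loops on all symbols
    return set(states) | {sink}, new_delta
```

Because the sink only loops back to itself, no final state can be reached from it, which is exactly why the completed FA is no longer useful in the strict sense.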
Note: This operation may cause states to become unreachable or not useful. These states are not removed by this operation. Use ::grammar::fa::op::trim for that instead.
The command will store a dictionary describing the relationship between the new states of the resulting dfa and the states of the input nfa in mapvar, if it has been specified. Keys of the dictionary are the handles for the states of the resulting dfa, values are sets of states from the input nfa.
Note: An empty dictionary signals that the command was able to make the fa deterministic without performing a full subset construction, just by removing states and shuffling transitions around (as part of making the FA epsilon-free).
Note: The algorithm fails to make the FA deterministic in the technical sense if the FA has no start state(s), because determinism requires the FA to have exactly one start state. In that situation we make a best effort, and the missing start state will be the only condition preventing the generated result from being deterministic. It should also be noted that in this case the possibilities for trimming states from the FA are also severely reduced as we cannot declare states unreachable.
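To make the shape of the mapvar dictionary concrete, here is a minimal Python sketch of the textbook subset construction. It is an illustration under stated assumptions, not the package's code: the input is taken to be epsilon-free with at least one start state, and the integer handles 0, 1, 2, ... are hypothetical stand-ins for whatever state handles the package generates.

```python
from collections import deque

def determinize(symbols, delta, start, final):
    """Subset construction for an epsilon-free NFA.

    delta maps (state, symbol) -> set of successor states; start and final
    are sets of NFA states.
    Returns (dfa_delta, dfa_start, dfa_final, mapping).
    """
    handle = {}    # frozenset of NFA states -> DFA handle
    mapping = {}   # DFA handle -> frozenset of NFA states (the "mapvar" shape)

    def name(subset):
        if subset not in handle:
            handle[subset] = len(handle)   # hypothetical handles 0, 1, 2, ...
            mapping[handle[subset]] = subset
        return handle[subset]

    first = frozenset(start)
    dfa_start = name(first)
    dfa_delta = {}
    queue = deque([first])
    while queue:
        subset = queue.popleft()
        for a in sorted(symbols):
            target = frozenset(t for s in subset for t in delta.get((s, a), ()))
            if not target:
                continue   # no transition; the DFA stays incomplete here
            if target not in handle:
                queue.append(target)
            dfa_delta[(name(subset), a)] = name(target)
    # A subset is final as soon as it contains one final NFA state.
    dfa_final = {h for h, sub in mapping.items() if sub & set(final)}
    return dfa_delta, dfa_start, dfa_final, mapping
```

Each key of the returned mapping is a state of the deterministic result and each value is the set of input states it stands for, mirroring the description of mapvar above.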
The command will store a dictionary describing the relationship between the new states of the resulting minimal fa and the states of the input fa in mapvar, if it has been specified. Keys of the dictionary are the handles for the states of the resulting minimal fa, values are sets of states from the input fa.
Note: An empty dictionary signals that the command was able to minimize the fa without having to compute new states. This should happen if and only if the input FA was already minimal.
Note: If the algorithm has no start or final states to work with then the result might be technically minimal, but have a very unexpected structure. It should also be noted that in this case the possibilities for trimming states from the FA are also severely reduced as we cannot declare states unreachable.
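The relationship between the states of a minimal FA and the states they replace can be illustrated with Moore-style partition refinement. The Python sketch below is one standard way to compute that partition and is not claimed to be the algorithm the package uses. It assumes a complete, deterministic FA in which every state is reachable, and the integer block handles are hypothetical.

```python
def minimize(states, symbols, delta, final):
    """Moore-style partition refinement for a complete DFA.

    delta maps (state, symbol) -> state. Returns a mapping from each new
    state handle to the set of input states merged into it.
    """
    # Start with the split into final and non-final states.
    block_of = {s: int(s in final) for s in states}
    while True:
        # Signature: own block plus the blocks reached for each symbol.
        sig = {s: (block_of[s],) + tuple(block_of[delta[(s, a)]]
                                         for a in sorted(symbols))
               for s in states}
        ids = {}
        refined = {s: ids.setdefault(sig[s], len(ids)) for s in sorted(states)}
        if len(ids) == len(set(block_of.values())):
            break   # no block was split: the partition is stable
        block_of = refined
    mapping = {}
    for s, b in block_of.items():
        mapping.setdefault(b, set()).add(s)
    return mapping
```

If every block in the result holds exactly one state, the input was already minimal. Unlike the command described above, this sketch still returns a mapping in that case rather than an empty dictionary.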
Language operations
All operations in this section require that all input FAs have at least one start and at least one final state. Otherwise the language of the FAs will not be defined, making the operation senseless (as it operates on the languages of the FAs in a defined manner).
The result will have all states and transitions of the input, and different final states.
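A result that keeps all states and transitions and changes only the final states matches the textbook way to complement an automaton. The Python sketch below shows that construction. It assumes the FA is complete and deterministic, which is the condition under which swapping final and non-final states yields the complementary language; the helper names are hypothetical.

```python
def complement_finals(states, final):
    """Final states of the complement of a complete, deterministic FA."""
    return set(states) - set(final)

def accepts(delta, start, final, word):
    """Run a complete DFA; delta maps (state, symbol) -> state."""
    state = start
    for symbol in word:
        state = delta[(state, symbol)]
    return state in final

# Example automaton: accepts words with an even number of "a".
states = {"even", "odd"}
delta = {("even", "a"): "odd", ("odd", "a"): "even"}
final = {"even"}
co_final = complement_finals(states, final)   # only the final set changes
```

Every word accepted with the original final states is rejected with the swapped set, and vice versa, because a complete deterministic FA ends in exactly one state for each input word.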
The result will have all states and transitions of the input, and new start and final states.
The result will have all states and transitions of the input, and new start and final states.
The result will have all states and transitions of the two input FAs, and new start and final states. All states of fb which also exist in fa will be renamed and, if mapvar is specified, it will contain a mapping from the old states of fb to the new ones.
It should be noted that the result will be non-deterministic, even if the inputs are deterministic.
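The renaming and the source of the non-determinism can be seen in a small Python sketch. It is an illustration only: the dictionary layout, the generated names ("s_0", "start0", "final0") and the use of epsilon-transitions to wire in the new start and final states are assumptions of this example, not documented behaviour of the package.

```python
EPS = ""  # stand-in label for epsilon-transitions (assumption of this sketch)

def fresh(base, taken):
    """Return a name based on `base` that is not in `taken`, and reserve it."""
    n = 0
    while f"{base}{n}" in taken:
        n += 1
    taken.add(f"{base}{n}")
    return f"{base}{n}"

def union(fa, fb):
    """Each FA is a dict with the sets 'states', 'start', 'final' and a
    'delta' dict mapping (state, symbol) -> set of states."""
    taken = set(fa["states"]) | set(fb["states"])
    # Rename only those states of fb that also exist in fa.
    rename = {s: fresh(s + "_", taken)
              for s in sorted(fb["states"] & fa["states"])}
    r = lambda s: rename.get(s, s)

    delta = {k: set(v) for k, v in fa["delta"].items()}
    for (s, a), targets in fb["delta"].items():
        delta.setdefault((r(s), a), set()).update(r(t) for t in targets)

    start, final = fresh("start", taken), fresh("final", taken)
    old_starts = fa["start"] | {r(s) for s in fb["start"]}
    old_finals = fa["final"] | {r(s) for s in fb["final"]}
    delta[(start, EPS)] = set(old_starts)   # new start branches into both FAs
    for f in old_finals:
        delta.setdefault((f, EPS), set()).add(final)

    states = fa["states"] | {r(s) for s in fb["states"]} | {start, final}
    result = {"states": states, "start": {start}, "final": {final},
              "delta": delta}
    return result, rename
```

The new start state has more than one epsilon successor, so the result is non-deterministic even when both inputs are deterministic. The returned rename dictionary plays the role of the mapping from old fb states to new ones.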
The command will store a dictionary describing the relationship between the new states of the resulting fa and the pairs of states of the input FAs in mapvar, if it has been specified. Keys of the dictionary are the handles for the states of the resulting fa, values are pairs of states from the input FAs. Pairs are represented by lists. The first element in each pair will be a state in fa, the second element will be drawn from fb.
The command will store a dictionary describing the relationship between the new states of the resulting fa and the pairs of states of the input FAs in mapvar, if it has been specified. Keys of the dictionary are the handles for the states of the resulting fa, values are pairs of states from the input FAs. Pairs are represented by lists. The first element in each pair will be a state in fa, the second element will be drawn from fb.
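A mapping from result states to pairs of input states is what the classic product construction produces. The Python sketch below shows that construction for both intersection and difference. It assumes two complete, deterministic FAs over the same symbols; the `mode` parameter and the integer handles are hypothetical, and the package may organise the computation differently.

```python
from collections import deque

def product(symbols, da, sa, finals_a, db, sb, finals_b, mode="intersect"):
    """Product of two complete DFAs.

    da/db map (state, symbol) -> state; sa/sb are the start states.
    mode is "intersect" or "difference" (hypothetical parameter).
    Returns (delta, start, final, mapping).
    """
    handle, mapping = {}, {}

    def name(pair):
        if pair not in handle:
            handle[pair] = len(handle)           # hypothetical handles 0, 1, ...
            mapping[handle[pair]] = list(pair)   # [state in fa, state in fb]
        return handle[pair]

    start = name((sa, sb))
    delta, final = {}, set()
    queue = deque([(sa, sb)])
    while queue:
        p, q = queue.popleft()
        in_a, in_b = p in finals_a, q in finals_b
        if in_a and (in_b if mode == "intersect" else not in_b):
            final.add(name((p, q)))
        for a in sorted(symbols):
            nxt = (da[(p, a)], db[(q, a)])
            if nxt not in handle:
                queue.append(nxt)   # first visit: explore this pair later
            delta[(name((p, q)), a)] = name(nxt)
    return delta, start, final, mapping
```

The only difference between the two operations is the acceptance condition on a pair: both components final for the intersection, first final and second not final for the difference.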
The result FA will be non-deterministic.
The regular expression is represented by a nested list, which forms a syntax tree. The following structures are legal:
Note that this operator accepts zero or more arguments. With zero arguments the represented language is epsilon, the empty word.
Note that this operator accepts zero or more arguments. With zero arguments the represented language is the empty language, the language without words.
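The two zero-argument cases follow from the usual identities: an empty concatenation yields the empty word, and an empty alternation yields no word at all. The Python sketch below evaluates a small nested-list syntax tree to the finite set of words it denotes. The tag names "sym", "concat" and "alt" are placeholders chosen for this example and are not the operator names used by the package.

```python
from itertools import product

def language(tree):
    """Finite language of a star-free expression tree, as a set of words.

    A tree is ["sym", x], ["concat", t1, ...] or ["alt", t1, ...]; these
    tag names are placeholders for this sketch only. Words are tuples of
    symbols, so the empty word is the empty tuple.
    """
    tag, *args = tree
    if tag == "sym":
        return {(args[0],)}
    parts = [language(t) for t in args]
    if tag == "concat":
        # The product over zero factors yields one empty tuple: epsilon.
        return {sum(combo, ()) for combo in product(*parts)}
    if tag == "alt":
        # The union over zero sets is the empty set: no words at all.
        return set().union(*parts)
    raise ValueError(f"unknown operator {tag!r}")
```

Concatenating anything with an empty alternation therefore also gives the empty language, while concatenating with an empty concatenation leaves the language unchanged.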
The command will fail and throw an error if regexp contains complementation or intersection operations.
The argument symdict is a dictionary mapping symbol names to pairs of syntactic type and Tcl-regexp. If a symbol occurring in the regexp is not listed in this dictionary then single-character symbols are considered to designate themselves whereas multiple-character symbols are considered to be a character class name.
This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report such in the category grammar_fa of the Tcllib SF Trackers [http://sourceforge.net/tracker/?group_id=12883]. Please also report any ideas for enhancements you may have for either package and/or documentation.
automaton, finite automaton, grammar, parsing, regular expression, regular grammar, regular languages, state, transducer
Grammars and finite automata
Copyright (c) 2004-2008 Andreas Kupries <andreas_kupries@users.sourceforge.net>
0.4 | grammar_fa |