Introduction
Chaperon is a project that helps convert structured text to XML. It includes a
strong LALR(1) parser to parse the text and a tree builder that creates an
XML document.
What is structured text?
Examples of structured text are TeX files, Java source files, configuration files, etc.
Function
The Chaperon parser consists of the following two components:
- a parser table generator, and
- a parser
The parser table generator builds a parser table from a grammar. It works much like a compiler
that generates byte code to improve execution speed: the table is computed ahead of time so
that parsing is as fast as possible.
The parser uses the parser table to parse text and then generates an XML document from it.
The generator/transformer generates the parser table once, as a first step, and
stores it in the persistent store.
If the grammar has changed, the parser creates a new parser table.
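The store-once, rebuild-on-change behaviour described above can be sketched as follows.
This is only an illustration, not Chaperon's implementation: the function name
load_or_build_table, the build_table callback, the JSON file layout, and the use of a
content hash to detect grammar changes are all assumptions made for this example.

```python
import hashlib
import json
import os


def load_or_build_table(grammar_text, cache_dir, build_table):
    """Return the parser table for grammar_text, rebuilding it only when needed.

    build_table is a hypothetical callback standing in for the parser table
    generator; it must return a JSON-serialisable table.
    """
    # A changed grammar yields a different digest, so it never hits a stale entry.
    digest = hashlib.sha256(grammar_text.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, f"table-{digest}.json")

    # Fast path: the table for exactly this grammar was stored earlier.
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            return json.load(handle)

    # Slow path: generate the table once and keep it in the persistent store.
    table = build_table(grammar_text)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(table, handle)
    return table
```

Calling the function twice with the same grammar text runs the generator only once; editing
the grammar text triggers exactly one further generation.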
Grammar
The parser can be used much like an XML parser. Unlike an XML parser, however, the
Chaperon parser needs a grammar file. This grammar file is itself specified
in XML.
The XML grammar format is not very convenient to write by hand, so the Chaperon project also
provides a grammar for a text grammar format similar to yacc/bison, along with a stylesheet that
converts this text format to the XML grammar format.
It is therefore easier to write a grammar in the text format than directly in the XML format.
Both grammar formats, XML and text, consist of two parts. The first part contains
the token definitions and special instruction declarations. The second part contains the productions.
The token declarations are needed to build a lexer, which feeds the parser with tokens. The
parser arranges the tokens into larger aggregations with the help of the production definitions.
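To make the division of labour between token definitions and productions concrete, here is a
small sketch for a "key = value" configuration format. It does not use Chaperon's grammar
syntax or its LALR(1) tables: the token names, the three productions, and the element names
in the output are invented for this example, and the productions are hand-coded instead of
being driven by a generated parser table.

```python
import re
import xml.etree.ElementTree as ET

# Token definitions: (name, regular expression). All names are hypothetical.
TOKENS = [
    ("word", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("number", r"[0-9]+"),
    ("equals", r"="),
    ("newline", r"\n"),
    ("skip", r"[ \t]+"),  # matched by the lexer but never passed to the parser
]
TOKEN_RE = re.compile("|".join(f"(?P<{name}>{rx})" for name, rx in TOKENS))


def lex(text):
    """Lexer: yield (token_name, matched_text) pairs for the parser."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "skip":
            yield match.lastgroup, match.group()
        pos = match.end()


def parse(text):
    """Parser and tree builder for these hand-coded productions:

        config : (entry | newline)*
        entry  : word equals value newline
        value  : word | number
    """
    tokens = list(lex(text))
    root = ET.Element("config")
    i = 0
    while i < len(tokens):
        if tokens[i][0] == "newline":  # blank line between entries
            i += 1
            continue
        names = [name for name, _ in tokens[i:i + 4]]
        if (len(names) < 4 or names[0] != "word" or names[1] != "equals"
                or names[2] not in ("word", "number") or names[3] != "newline"):
            raise SyntaxError(f"malformed entry near token {i}")
        # Aggregate the tokens of one entry under a single element.
        entry = ET.SubElement(root, "entry")
        for name, value in tokens[i:i + 3]:
            ET.SubElement(entry, name).text = value
        i += 4
    return root


def to_xml(text):
    """Convert structured text to an XML string."""
    return ET.tostring(parse(text), encoding="unicode")
```

Here the lexer turns characters into named tokens, and the parser groups each run of tokens
that matches the entry production into one XML element, so the shape of the output document
follows the productions.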