
How old am I? And how this affects my choice of grammar

Saturday 02 February 2008 20:16

Having coded the parser does not mean you're done. It is just the beginning: now you need to create the rules that the parser uses.
You can't just download a set of syntax rules from some public American university site. Some sets are
available, but they're not complete. And if they were complete, you would be in trouble too, because the English language is so rich that
the full set of rules would have to be huge and very complex.

And there's not just one set of rules. There are several, because this is science, and grammars are theories about languages. The main
theory is Transformational Grammar (TG), by Chomsky. I would not have hesitated to use the most popular theory if I could, but I just changed my mind.

The first sentence I was able to parse was "you breathe". The second sentence was "i am 38 years old". (I also transformed this sentence into a semantic representation, "<me> <age> <38>", but more about that later.) The third sentence caused me to think about the type of grammar I should use. This sentence is: "How old am I?".

Now let me start by saying that I am a complete noob in the field of grammars, so I may utter complete nonsense here. While looking for the "phrase structure rules" to parse this sentence into a syntactic tree, I discovered that the sentence is subject to "wh-movement". That means that the word "how" (which belongs to the group of "why", "what", "where", etc., the so-called wh-words) has been moved away from the position it occupies in the deep structure of the syntactic tree. The sentence should be parsed as if it read something like "i am how old?". That way, the deep structure of "i am 38 years old" and "how old am i" would be the same. And you can see that "how" then acts as a simple variable for "38 years". Which is quite interesting, really. However, it is not very well suited to the Earley parser I am using. Actually, it is a big problem.
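To make the "how as a variable" observation a bit more concrete, here is a minimal sketch in Python. It is not my actual code: the triple layout simply follows the "<me> <age> <38>" representation mentioned above, and the names `fact`, `question` and `answer` are made up for this illustration. The assumption is that both sentences reduce to the same kind of triple, with the question leaving the value slot open.

```python
from typing import List, Optional, Tuple

# Hypothetical layout: (subject, attribute, value).
# A value of None marks the open slot that "how" stands for.
Triple = Tuple[str, str, Optional[str]]

fact: Triple = ("me", "age", "38")       # from "i am 38 years old"
question: Triple = ("me", "age", None)   # from "how old am i"


def answer(question: Triple, facts: List[Triple]) -> Optional[str]:
    """Fill the open slot of the question from the first matching fact."""
    subject, attribute, _ = question
    for fact_subject, fact_attribute, value in facts:
        if fact_subject == subject and fact_attribute == attribute:
            return value
    return None  # nothing known about this subject and attribute


print(answer(question, [fact]))
```

The point of the sketch is only that, once both sentences share one underlying form, answering the question amounts to filling in the variable.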

Interestingly enough, the "wh-movement" phenomenon is part of Transformational Grammar, but not of every other grammar. A grammar that is explicitly better suited for use by parsers is the so-called "Lexical Functional Grammar" (LFG). This grammar allows you to parse the sentence as it is written, without moving words around first, which is more natural for the Earley parser.
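As a rough illustration of what "more natural for the Earley parser" could mean, here is a small Earley recognizer in Python with a toy grammar that describes the question in its surface word order, so no movement is needed. This is a sketch under my own assumptions, not real LFG (there are no functional structures here, only plain phrase structure rules), and the rule names and lexicon are invented for the three example sentences.

```python
# Toy phrase structure rules (hypothetical). The second S rule describes
# "how old am i" directly in surface order, without any movement.
GRAMMAR = {
    "S": [["NP", "VP"], ["WhAP", "Aux", "NP"]],
    "NP": [["Pron"]],
    "VP": [["V"], ["Aux", "AP"]],
    "WhAP": [["Wh", "Adj"]],
    "AP": [["Num", "N", "Adj"]],
}

# Toy lexicon: one part of speech per word (lowercase, no punctuation).
LEXICON = {
    "you": "Pron", "i": "Pron", "breathe": "V", "am": "Aux",
    "how": "Wh", "old": "Adj", "38": "Num", "years": "N",
}


def recognize(words, start="S"):
    """Return True if the words can be derived from the start symbol."""
    # A state is (lhs, rhs, dot position, origin chart index).
    chart = [[] for _ in range(len(words) + 1)]

    def add(index, state):
        if state not in chart[index]:
            chart[index].append(state)

    for rhs in GRAMMAR[start]:
        add(0, (start, tuple(rhs), 0, 0))

    for i in range(len(words) + 1):
        for lhs, rhs, dot, origin in chart[i]:  # chart[i] may grow while looping
            if dot < len(rhs):
                symbol = rhs[dot]
                if symbol in GRAMMAR:
                    # Predict: expand the expected non-terminal.
                    for alternative in GRAMMAR[symbol]:
                        add(i, (symbol, tuple(alternative), 0, i))
                elif i < len(words) and LEXICON.get(words[i]) == symbol:
                    # Scan: the next word has the expected part of speech.
                    add(i + 1, (lhs, rhs, dot + 1, origin))
            else:
                # Complete: advance every state that was waiting for lhs.
                for p_lhs, p_rhs, p_dot, p_origin in list(chart[origin]):
                    if p_dot < len(p_rhs) and p_rhs[p_dot] == lhs:
                        add(i, (p_lhs, p_rhs, p_dot + 1, p_origin))

    return any(
        lhs == start and dot == len(rhs) and origin == 0
        for lhs, rhs, dot, origin in chart[len(words)]
    )


for sentence in ["you breathe", "i am 38 years old", "how old am i"]:
    print(sentence, "->", recognize(sentence.split()))
```

The recognizer only answers yes or no; building the tree and the semantic representation would need extra bookkeeping on top of this. It also assumes the grammar has no empty rules, which keeps the completer simple.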

While looking into this grammar I even found a great online demo of a functioning LFG parser.

So for the moment I am inclined to use LFG rather than TG as my grammar. However, this may all change later. And it doesn't mean I have all the rules now; it only means I can start collecting them.

