Skip links
Main content

In a bar, under the sea

donderdag 05 december 2013 20:36

Preventing infinite recursion in surface realization

Today I want to talk to you about a problem that always surfaces in natural language generation systems. What's more, in every system that exposes a set of production rules that allows recursion, there is always the problem that rules bite themselves in the tail. And each system will need to provide a simple and solid way to end it.

The problem

As always, I will provide an example. In this case it is a set of relations that need to be turned into a sentence.

isa(?event, Break), tense(?event, Past),
subject(?event, ?subject), object(?event, ?object),
isa(?object, Mirror),
name(?subject, "John"),
link(In, ?object, ?loc1), isa(?loc1, Bar),
link(Under, ?object, ?loc2), isa(?loc2, Sea)

The sentence is:

John broke the mirror in a bar under the sea.

This sentence is special because it contains two prepositional phrases below the same noun phrase:

  • in a bar
  • under the sea

A prepositional phrase is realized by this production rule:

NBar => NBar PP

When this rule first matches, it produces two goals: NBar and PP.

The PP will be realized as "in a bar".

The NBar will now be made the next goal. And this is where the problem occurs.

When nothing is done, NBar will be realized by NBar => NBar PP.

PP will again be realized as "in a bar", and NP will be made the next goal.

NBar will be realized by NBar => NBar PP.

PP will again be realized as "in a bar", and NP will be made the next goal.

NBar will be realized by NBar => NBar PP.

PP will again be realized as "in a bar", and NP will be made the next goal.

and so on. Until the stack overflows.

The solution

Multiple solutions exist for this problem. For example, it is possible to allow only consequents (the nodes after the =>) that do not occur as consequents (the node before the =>). Like this:

NBar => NBar1 PP
NBar1 => NBar2 PP
NBar2 => NP

or like this:

NBar => NP PP PP
NBar => NP PP

But these are not as expressive as the rule we used above, and we need a lot more rules.

And then there are solutions thinkable that remove relations from the input set once they have been used in a rule. But I quickly ran into their limits. In stead, the solution I used goes like this:

In every generation process, keep track of all rule/bindings combinations and make sure that each is used only once.

For example, the full rule NBar => NBar PP is like this:

        NBar => NBar PP,
        link(?link, NBar.entity, ?entity),
    bind: {
        NBar.entity = NBar.entity;
        PP.superObject = NBar.entity

It fires when the condition `link(?link, NBar.entity, ?entity)` binds to the input relations, using the property `NBar.entity` that was passed to it from the node above.

When it fires, the variable ?entity is bound the node that represents `a bar`, and the property `NBar.entity` is bound to the node that represents `the mirror`. `link(In, ?object, ?loc1)`. This combination of the rule `NBar => NBar PP` and these bindings is stored in a table.

The second time the rule is checked, the same bindings are tried, but since these are already stored in the table, they must not be used. They are skipped and the next relation is bound: link(Under, ?object, ?loc2).

Thus, the infinite recursion is avoided and the sentence is generated.


"In A Bar, Under The Sea" is a great album by the Belgian rock band dEUS. My favourite song on it is called "Little Arithmetics".

The photos came from here and here.

The paired prepositional phrases are both prepositions of place. Even when a system would discern between different productions for place, time, manner, and so on, this would still not solve the problem.

One problem with my solution is that it doesn't allow sentences that duplicate the same clause, like "John broke the mirror and then he fixed the mirror". When necessary, I will need to find something to fix this. Perhaps the table check only needs to be done for rules that recurse in themselves (that have de antecedent node as one of its own consequents). But this would allow infinite recursion within two or more levels.

Then there is the problem which preposition is produced first: `in a bar` or `under the sea`. In this system, the one with the first relation will be produced first.


« Terug