From andand@csd.uu.se Fri Sep 30 08:41:15 1994
Return-Path: <andand@csd.uu.se>
Date: Fri, 30 Sep 1994 12:46:48 +0100
From: Anders Andersson <andand@csd.uu.se>
To: aaron@vienna.njit.edu
Subject: Re:  SLR versus LR1 and LL1, is it worth the trouble?
X-Charset: ASCII
X-Char-Esc: 29

> You say C is not LR(1).

Well, actually the problem is easy to solve in practice. You simply dump the
complexity on the lexical analyzer. The problem is that identifiers are
parsed differently if they are typedef-declared. For example:
typedef int foo;
void main() {
  foo *bar;
}
In this example `foo *bar;' means ``declare bar a pointer to int'' and is 
parsed in one way, as a declaration. In contrast:
int foo, bar;
void main() {
  foo *bar;
}
Here `foo *bar;' means ``multiply the integers foo and bar and do nothing with
the result'' and it is parsed as a statement. 

This is nearly always solved by letting the lexical analyzer carry symbol
table information. From a practical point of view everything is fine, but
>from a theoretical point of view we have a non context-free lexical analyzer.

This may also cause some problems in generators for declarative languages,
parallell languages and non-deterministic parsing. In my parser generator
SAGA that runs under the Agents system beeing developed at SICS this turned
out to be a big problem. Speed suffered, semantics became more complex,
non-determinism will be forced to rescan the input etc. 

Even in the normal imperative generators like Yacc or Bison this technique
may cause some grief. The problem is that if the look-ahead token has been 
read prior to the symbol table information have been entered, the parse fails.
Sure, in C you can always manage by writing the grammar in the right way,
by exploiting the semicolon that is known to terminate the statement, but
the whole thing requires care and some understanding of the way the parse
is performed.

> Unfortunately, I found most other books I've seen to be even
> less elegant and more verbose.

Well, the Dragon is the clearest presentation. There are some detailed notes
on how to implement LALR efficiently in various articles, but they are
very near impossible to follow. Besides, if you do some transformation
of the algorithm in the Dragon book, you get a resonably efficient one.
It is quite possible to get by with Dragon book only, it's just a lot of
work. Some work can be saved if you have a library with graph operations,
in particular SCC and topological sort. These may be used instead of the
iterative propagation algorithm in the Dragon.

	/Anders


