[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

abstract syntax



Dave Bowen and Lojbab both asked me what I meant by ``abstract syntax''. The
notion is a common one in computing circles (at least the ones I move in); it
is an informal term meaning a ``stripped down'' syntax that shows the kinds of
constructs available, usually in terms appropriate to the language semantics.
(One might think of it as a concrete syntax with all the brackets and
precedence taken out.) The contrasting term is ``concrete syntax'', which
details how things are laid out so as to be unambiguously parseable.

For example, a simple grammar of arithmetic expressions might go:

    Expr ::= Factor | Expr AddOp Factor
    Factor ::= Term | Factor MulOp Term
    Term ::= Id | Number | "(" Expr ")"
    AddOp ::= "+" | "-"
    MulOp ::= "*" | "/"

and an abstract syntax would say

    Expr ::= Expr Op Expr | Id | Number
    Op ::= "+" | "-" | "*" | "/"

If the language has let-declarations and where-declarations for
binding, the concrete syntax may distinguish them

    Expr ::= "let" Dec "in" Expr | Expr "where" Dec
    Dec ::= Id "==" Expr

but the abstract syntax need not

    Expr ::= "letwhere" Dec Expr

(and note that my ``concrete'' syntax above is ambiguous!) Typically the
abstract syntax is shorter than the concrete one and is easier to use for
explaining meanings.

As a more appropriate example, I recall from my Loglan days (now more than ten
years behind me, so forgive me any lapses) that predicates could be conjoined
thus:

    ti preda e prede                        (this foos and bars)
or
    ti ke preda ki prede
or
    ti preda ce prede

and arguments could also be conjoined

    ti e ta preda                           (that and that both foo)
or
    ke ti ki ta preda

with lots of fun to be had in ensuring (by suitable restrictions on what could
appear where, and appropriate use of gi and gu) that the grammar was
unambiguous. This is all very well in arranging the suitable linear token
stream for parsing, but a right royal pain when trying to determine what kinds
of operations could be applied to what.

An abstract syntax would contain clauses like:

    Arg ::= Arg Connective Arg
    Pred ::= Pred Connective Pred

with Connective including the logical connectives (and perhaps the causative
ones too), and leave it to an unparsing scheme to choose an unambiguous
rendering into tokens.

A useful thing about an abstract syntax is that it makes it easy to ``see''
missing parts of the grammar: for example, suppose that if ``preda'' is a
predicate and ``ti'' is an argument, is ``preda ti'' a predicate? Why (or why
not?) If it is, can I conjoin it with another predicate - ie can I say

    ti (preda e (prede ta))

where the brackets are to show the abstract structure? If not, why not?

I hope I have made myself clearer.

Regards, Kers.  | "See the darkness all around   is coming down on you."
Renaissance.    | [Running Hard]