### LL(1) grammar restrictions

Before I discuss the statement syntax for SPTL, I want to look at grammar restrictions. As I mentioned in a previous post, I'm planning to build a recursive descent (top down) parser for SPTL. Specifically, it will be an LL(1) parser. That is, parsing from

This places restrictions on the grammar. The parser must be able to determine, when it finds a token, what grammar rule to apply. If a language had these rules:

then the parser wouldn't know when it encountered a var whether this was an incstatement or a decstatement.

For a grammar to qualify as LL(1), it must satisfy a set of rules.

Define:

The following symbols mean:

A grammar G is LL(1) if and only if:

A pair is any two elements that can appear in the same part of the sentence (in the BNF-like grammar above, these are denoted by the

So, going back to the incstatement/decstatement example above,

{

Therefore, the grammar above is not LL(1).

How can we change it into an LL(1) grammar? Simple:

Now the var is part of

Now the grammar (or at least this portion of it) is LL(1).

This is a very simple example. But how do we determine FIRST and FOLLOW for all sentences of a grammar? I'll explain in my next post.

**L**eft to right, with a**L**ookahead of**1**token.This places restrictions on the grammar. The parser must be able to determine, when it finds a token, what grammar rule to apply. If a language had these rules:

statement : incstatement | decstatement

incstatement : var '++'

decstatement : var '--'

then the parser wouldn't know when it encountered a var whether this was an incstatement or a decstatement.

For a grammar to qualify as LL(1), it must satisfy a set of rules.

Define:

**A**as a sentence (one or more tokens) in the grammar

**u**and

**v**as sentences derived from

**A**

**w**and

**x**as unspecified sentences

**FIRST**as the set of possible first tokens for the sentence

**FOLLOW**as the set of possible first tokens for the following sentence

The following symbols mean:

**∀**For all

**∃**There exists

**∅**Empty set

**∈**Empty sentence

**∩**Intersection (the elements common to both sets)

A grammar G is LL(1) if and only if:

- ∀ A, A ≠ ∈,

∀ pairs A → u and A → v,

FIRST(u) ∩ FIRST(v) = ∅ - If ∃ A, A → ∈,

FIRST(A) ∩ FOLLOW(A) = ∅

A pair is any two elements that can appear in the same part of the sentence (in the BNF-like grammar above, these are denoted by the

**|**symbol).So, going back to the incstatement/decstatement example above,

**statement**is**A**, the sentence we are looking at**incstatement**and**decstatement**are**u**and**v**respectively**FIRST(incstatement)**is {**var**}**FIRST(decstatement)**is {**var**}**FIRST(incstatement)**∩**FIRST(decstatement)**= {**var**}{

**var**} ≠ ∅Therefore, the grammar above is not LL(1).

How can we change it into an LL(1) grammar? Simple:

statement = var (incop | decop)

incop = '++'

decop = '--'

Now the var is part of

**statement**, and in place of**incstatement**/**decstatement**we have**incop**/**decop**.**FIRST(incop)**is {**'++'**}**FIRST(decop)**is {**'--'**}**FIRST(incop)**∩**FIRST(decop)**= ∅Now the grammar (or at least this portion of it) is LL(1).

This is a very simple example. But how do we determine FIRST and FOLLOW for all sentences of a grammar? I'll explain in my next post.

Labels: grammar rules, LL(1), recursive descent