**Lecture
9 (Parse Trees), 2/13/18 **

**Concepts,
definitions (in italics):**

Example of CFG: simple arithmetic expressions with +, - and *

3 derivations of x - x + x (2 equivalent left+rightmost incorrect, 1 different correct)

*Leftmost*derivations,*rightmost*derivations*Parse Trees: internal nodes =*variables*, leaves =*terminals,*root*= start variable; the*fringe*of the tree spells out the derived stringExample: 2 parse trees for arithmetic expression x – x + x

*For every derivation, there is a corresponding parse tree*Carries the same information, except for the order of substitutions

Relation to lexing (numbers turn into NUM token)

Connection between parse trees and expression evaluation

*Parsing*: given string*s*and grammar*G*, constructing parse tree for*s*in*G*This is

*bottom-up*

Flattening a parse tree into a derivation

A derivation corresponds to a unique parse tree

A parse tree may correspond to multiple derivations, but there is exactly one

*rightmost*, and one*leftmost*among them.

*Applications*of parse trees:compilers/interpreters

Natural Language Processing

A language

*L*_{0}is*context-free*(CFL) iff there is some*context-free grammar (CFG)**G*such that*L*_{0 }= L(*G*)Why are these grammars called context-free?

Two grammars are

*equivalent*iff they generate the same language.*Ambiguous*grammars – there are two different parse trees, or different leftmost derivations for the same stringThe simple CFG for the language of arithmetic expressions is ambiguous; programming languages would use a different CFG for it which is not ambiguous.

*Disambiguation*: given an ambiguous CFG, finding an equivalent grammar that's not ambiguousExample: Unambiguous grammar for arithmetic expressions with + and - (which only creates correct parse trees)

*Inherently ambiguous CFL:*all grammars for it are ambiguousExample: a

^{n }b^{n }c^{m}U a^{n }b^{m }c^{m}If two grammars share variables, we can rename the variables for one or both of the grammars so the variables are disjoint.

Do 3

*constructions*Given CFGs

*G*_{1}and*G*_{2}, we can construct a CFG*G*_{0}such that L(*G*_{0}) = L(*G*_{1})^{o}L(G_{2})Given CFGs

*G*_{1}and*G*_{2}, we can construct a CFG*G*_{0}such that L(*G*_{0}) = L(*G*_{1}) U L(G_{2})Given CFGs

*G*_{1}and*G*_{2}, we can construct a CFG*G*_{0}such that L(*G*_{0}) = L(*G*_{1})*

Each of the constructions involves making the variables disjoint, combining together existing grammars, then creating a new start state and adding one or two new rules involving this state and the original start states of

*G*_{1}and*G*_{2}.Theorem: The class of Regular Languages is a

*proper (or strict) subset*of the class of Context Free LanguagesPart 1 (every RL is CF): Proved by recursion (induction) that all regular languages are context free, using the above three constructions.

Part 2 (some CFLs are not regular): Proved by finding a CFG whose language is not regular (did in class last time)

**Reading:**

Chapter 2.1 up to end of p. 108 (“chomsky normal form”)

**Skills:**

Know new definitions, understand new concepts.

Given a CFG

*G*and a derivation in G for some string*w*, construct a parse tree that corresponds to this derivationGiven a CFG

*G*and a parse tree in G for some string*w*, construct a leftmost/rightmost deriv that corresponds to this parse treeGiven a string

*w*and a CFG*G,*create a derivation/parse tree for*w*in*G*Given a description of a language, show that it is context-free (i.e. create a CFG for it).

Show that a given grammar G is ambiguous (i.e. find a string in L(G) with more than one parse tree).

Given a RE, create an equivalent CFG (including homework problem)

**Notes:**