Grammar InputCFG

Examples

Productions

One rule per line · | for alternation
Uppercase = non-terminal · lowercase = terminal
Spaces are optional: S->aSb and S -> a S b are equivalent
eps or ε = empty string (epsilon)

ⓘ Start symbol must be S · First rule's LHS must be S

Conversion OutputPDA

⇢

Enter a CFG and click Convert

Formal Language Theory:
CFG & PDA

A concise reference covering definitions, the equivalence theorem, and both conversion algorithms — everything you need to understand what this tool computes and why it works.

Context-Free Grammars

Definition

A CFG is a 4-tuple G = (V, Σ, R, S)

V — finite set of non-terminal variables
Σ — finite set of terminal symbols (V ∩ Σ = ∅)
R — finite set of production rules A → α
S ∈ V — the start variable

Each rule has a single non-terminal on the left and any string over V ∪ Σ on the right (including ε for the empty string).

Derivation

How a CFG generates strings

Starting from S, repeatedly replace any non-terminal with the right-hand side of one of its rules. A string w ∈ Σ* is in L(G) iff S ⇒* w.

S ⇒ aSb ⇒ aaSbb ⇒ aaεbb ⇒ aabb
Grammar: S → aSb | ε Language: {aⁿbⁿ | n ≥ 0}

Input format for this tool

Writing productions

Write one rule per line in the form LHS -> RHS. Separate symbols in the RHS with spaces. Use ε or eps for the empty string. Use | for alternation.

S -> a S b | ε or equivalently on two lines:
S -> a S b
S -> ε

Pushdown Automata

Definition

A PDA is a 7-tuple M = (Q, Σ, Γ, δ, q₀, Z₀, F)

Q — finite set of states
Σ — input alphabet
Γ — stack alphabet
δ — transition function Q × Σε × Γε → 2^Q×Γ*
q₀ ∈ Q — initial state
Z₀ ∈ Γ — initial stack symbol
F ⊆ Q — accept states

Transitions

How transitions work

A transition δ(p, a, A) ∋ (q, γ) means: in state p, reading input a (or ε), with A on the stack top — pop A, push γ, move to state q.

δ(q0, a, Z0) = (q0, A Z0)
In q0, read 'a', pop Z0, push A Z0

δ(q0, b, A) = (q1, ε)
In q0, read 'b', pop A, push nothing

Two acceptance modes: final state (in F when input ends) or empty stack (stack empty when input ends). They are equivalent in power.

Key property

Why the stack matters

A finite automaton has no memory beyond its current state. A PDA adds a LIFO stack — this single addition is exactly enough to recognise all context-free languages. The stack can simulate a counter or match nested structures, but it cannot count two independent quantities simultaneously (which is why {aⁿbⁿcⁿ} is not context-free).

The Equivalence Theorem

Closure properties of CFLs

✓ Union L₁ ∪ L₂
✓ Concatenation L₁ ⋅ L₂
✓ Kleene star L*
✗ Intersection L₁ ∩ L₂
✗ Complement ∁L
✓ Intersection with regular

Pumping Lemma (CFLs)

For every CFL L, ∃ p such that every string w ∈ L with |w| ≥ p can be written w = uvxyz where:

|vy| ≥ 1
|vxy| ≤ p
uvℷxyℷz ∈ L for all i ≥ 0

Used to prove a language is not context-free.

Algorithm: CFG → PDA (via GNF)

Goal: Given CFG G = (V, Σ, R, S), construct a 3-state NPDA P that accepts L(G) by empty stack. This tool converts via Greibach Normal Form (GNF) — every rule has the form A → t α where t is a terminal and α is a (possibly empty) sequence of non-terminals. This makes the PDA transitions one-to-one with grammar rules.

Transform CFG to GNF (four sub-steps)

1a. Remove ε-productions: Find all nullable non-terminals (those deriving ε). For every rule A→α, add versions where each nullable symbol in α is optionally deleted. Retain S→ε only if S is nullable.
1b. Remove unit productions: A→B (single NT) is replaced by copying all B-rules into A, then discarding A→B. Repeat until none remain.
1c. Introduce auxiliary terminal variables: For any terminal t appearing at position ≥1 in a RHS, introduce a fresh variable A_i with rule A_i→t. Replace t with A_i in those positions. This ensures all non-first positions hold only non-terminals.
1d. Eliminate left recursion: Using the standard algorithm (substitute Aj rules into Ai for j<i, then eliminate direct left recursion A_i→A_iα by introducing A_i′). After all passes every rule starts with a terminal.

Build the 3-state NPDA from GNF

States: q0 (start), q1 (simulate), q2 (accept). Z₀ is the initial stack symbol.
δ(q0, ε, Z₀) = (q1, S) — discard Z₀, push start symbol S.

One PDA transition per GNF rule

For each GNF rule A → t B₁ B₂ … Bₖ:
δ(q1, t, A) = (q1, B₁ B₂ … Bₖ) — read terminal t, pop A, push the non-terminal suffix B₁…Bₖ (Bₖ pushed first so B₁ is on top).
For A → ε: δ(q1, ε, A) = (q1, ε) — just pop A.

Accept by empty stack

δ(q1, ε, Z₀) = (q2, ε) — when Z₀ is reached the stack is effectively empty and all derivation is done; move to the accept state q2. The string is accepted iff all input is consumed at this point.

Example: G: S → A C A → a A b | a A | ε C → c C | ε
After GNF conversion (aux vars A₁→b, A₂→c created):
S → a Aʼ C (example GNF rule from left-recursion elimination)
δ(q0, ε, Z₀) = (q1, S)
δ(q1, a, A) ∋ (q1, A A₁) ← rule A→a A b via A₁→b
δ(q1, a, A) ∋ (q1, A)    ← rule A→a A
δ(q1, ε, A) ∋ (q1, ε)     ← rule A→ε
δ(q1, c, C) ∋ (q1, C)    ← rule C→c C
δ(q1, ε, C) ∋ (q1, ε)     ← rule C→ε
δ(q1, ε, Z₀) = (q2, ε)

Algorithm: PDA → CFG

Goal: Given PDA P = (Q, Σ, Γ, δ, q₀, Z₀, F), construct a CFG G with L(G) = L(P). Assume P accepts by empty stack (convert from final-state acceptance if needed).

The key idea: CFG variables are triples [p, A, q], each representing the ability to "start in state p with A on the stack, consume input, and reach state q with A fully processed."

Start productions

Add S → [q₀, Z₀, q] for every state q ∈ Q. This covers all ways the PDA can fully consume the initial stack while starting from the initial configuration.

Pop transitions (γ = ε)

For δ(p, a, A) ∋ (r, ε): add rule [p, A, r] → a. One rule per transition; no intermediate states needed since nothing is pushed.

Push transitions (γ = B₁…B⁾)

For δ(p, a, A) ∋ (r, B₁…B⁾): add rules [p, A, q⁾] → a [r, B₁, q₁] [q₁, B₂, q₂] … [q⁾₋₁, B⁾, q⁾] for every combination q₁,…,q⁾ ∈ Q. This yields |Q|ⁿ rules per transition.

Clean up (optional)

Remove unreachable variables (not derivable from S) and useless variables (that never derive a terminal string). The language is unchanged; the grammar is simplified.

Quick Reference

Property	CFG	PDA
Representation	Production rules	States + stack + transitions
Memory model	Implicit (derivation tree)	Explicit LIFO stack
Acceptance	w derivable from S	Final state or empty stack
Deterministic?	Ambiguous vs unambiguous	DPDA ⊊ NPDA in power
Recognised class	Context-Free Languages	Context-Free Languages
Parsing complexity	CYK: O(n³) for CNF	Simulation: O(n³) in general
Non-determinism	Multiple rules per variable	Multiple transitions per config
Examples	aⁿbⁿ, balanced brackets	aⁿbⁿ, balanced brackets

Instruction

Formal Language Theory:CFG & PDA

A CFG is a 4-tuple G = (V, Σ, R, S)

How a CFG generates strings

Writing productions

A PDA is a 7-tuple M = (Q, Σ, Γ, δ, q₀, Z₀, F)

How transitions work

Why the stack matters

Formal Language Theory:
CFG & PDA