cursurile 2-4

Post on 12-Apr-2015

42 views 4 download

Transcript of cursurile 2-4

Limbaje Formale si Automate

Instructor Andrei PAUN

Semestrul II 2009-2010

Departmentul de InfomrmaticaUniversitatea BucurestiBucuresti, Romania

1

apaun
Rectangle

Structura cursului:

1. Cunostinte de bazamultimi, multiseturi, secvente, functii,relatii, metode si tehnici de demonstratie,alfabet, cuvinte, limbaje

2. Automate finiteautomate finite nedeterministe (NFA),automate finite deterministe (DFA),transformarea de la NFA la DFA,proprietati pentru limbajele acceptate de

automate,lema de pompare

3. Expresii regulateechivalenta dintre expresiile regulate si au-

tomatele finite(in definirea limbajelor)

4. Gramatici independente de context (CFG)simplificare si forme normale,algoritmi de parsare, lema de pompare sialte proprietati

2

5. Automate pushdown (PDA)nondeterministic pushdown automata (NPDA),deterministic pushdown automata (DPDA),PDA vs. CFGs

6. Masini Turing (TM)diverse tipuri de TM, TM universale,decidabilitate si nedecidabilitate,complexitate de timp si spatiu

3

Motivare pentru curs

Compiler

Programming

Database

AI

Graphics

VLSI

Theory of Computing

(1) Putere de calcul

(2) Complexitate

(3) Software design

(4) Circuit design

. . . . . . . . . . . . . . . . . . . . .

4

I. Cunostinte de baza

1. Multimi1) O multime este o colectie de elemente luate

dintr-un univers.

2) O multime este finita daca contine un nu-mar finit de elemente, si este infinita in cazcontrar.

3) Cardinalitatea unei multimi este numarul saude elemente si se noteaza cu #A sau ∣A∣ (pen-tru multimea A).

5

– Cum specificam o multime:

(1) Listam toate elementele sale:{} (or ∅),A = {2, 3, 5, 7},B = {3, 2, 2, 7, 5, 7}

(2) Precizam o proprietate: {x ∣ P (x)}C = {x ∣ x este numar prim},D = {x ∣ x este un cuvant in romana si x

este si palindrom},E = {x ∣ x este un intreg divizibil cu

3 si mai mic decat 10 }

(3) Definite recursiv:Forma generala:i) Anumite elemente initiale sau de baza

sunt in Aii) Daca a, b ∈ A, atunci a ∘ b ∈ Aiii) Nici un alt element nu apartine lui A

6

Exemplu Definirea lui A:i) 2 ∈ Aii) Daca a, b ∈ A, atunci a + b ∈ Aiii) Nici un alt element nu apartine lui A

(4) Inchiderea la operatii

∙ Spunem ca o multime A este inchisa la o op-eratie binara ∘ daca pentru toate a, b ∈ A,a ∘ b ∈ A

∙ Fie B ⊇ A si B este inchisa la ∘. Spunem caB este o inchidere a lui A la operatia ∘ dacanu exista B′ astfel incat A ⊆ B′ ⊂ B si B′ esteinchisa la ∘.

∙ Forma generala (pentru multimi definite prininchidere)A este cea mai mica multime care contine an-umite elemente initiale (de baza) si care esteinchisa la operatia ∘.

7

∙ ExempluB este cea mai mica multime care il continepe ¿ si este inchisa la “+”.

— Operatii cu multimiDaca se dau A si B doua multimi:

A ∪B = {x ∣ x este in A sau x este in B}

A ∩B = {x ∣ x este in A si x este si in B}

A−B = {x ∣ x este in A si x nu este in B}

8

∪ni=1Ai = {x ∣ x este in Ai, pentru un i, 1 ≤ i ≤

n}∩ni=1Ai = {x ∣ x este in Ai, 1 ≤ i ≤ n}

— Proprietati ale operatiilor cu multimi

DistributivitateA ∩ (B ∪ C) = (A ∩B) ∪ (A ∩ C)A ∪ (B ∩ C) = (A ∪B) ∩ (A ∪ C)

IndempotentaA ∪ A = A; A ∩ A = A

Involutie¬(¬A) = A

ComutativitateA ∩B = B ∩ A; A ∪B = B ∪ A

AsociativitateA ∪ (B ∪ C) = (A ∪B) ∪ CA ∩ (B ∩ C) = (A ∩B) ∩ C

9

Regulile lui De Morgan¬(A ∪B) = ¬A ∩ ¬B¬(A ∩B) = ¬A ∪ ¬B

— Multimea submultimilorMultimea tuturor submultimilor unei mul-timi A (“power set” of A) se noteaza cu 2A.

Exemplu:A = {2, 3, 5}2A = {∅, {2}, {3}, {5}, {2, 3}, {2, 5}, {3, 5}, {2, 3, 5}}

— Cateva multimi speciale:∅: multimea vida;Z: multimea intregilor;N : multimea intregilor pozitivi;N0: multimea intregilor care nu sunt negativi;ℛ: multimea numerelor reale.. . . . . . . . . . . . . . .

10

2. MultiseturiUn multiset este o colectie de elementedintr-un univers oarecare, colectie in carerepetitiile nu sunt ignorate.

3. SecventeO secventa de elemente dintr-un univers oare-care este o lista de elemente care suntordonate (fiecare element are o pozitie).

11

4. Functii

Definitie Daca se dau doua multimi A si B, ofunctie (partiala) de la A la B asociaza cufiecare a din A (cel mult) un element b din B.

Notatii:functie f : A → Basocierea f (a) = bnedefinita f (a) = undef

Functii speciale:

functia identitate:f : A → A sif (a) = a pentru toti a ∈ A

Functia caracteristica a lui A pentru B:B ⊆ Af : A → {true, false}f (a) = true, if a ∈ Bf (a) = false, if a ∈ A−B

12

Cateva proprietati ale functiilor f : A → B

totala:f (a) este definita pentru toate elementele a

din Asurjectiva:pentru toate elementele b din B exista un a

din A astfel incat f (a) = binjectiva:a, a′ ∈ A, a ∕= a′ implica faptul ca f (a) ∕= f (a′)

bijectiva:in acelasi timp si surjectiva si injectiva

Notatia big-O

f (n) = O(g(n)) daca exista c > 0 si intregul poz-itiv d astfel incat pentru toate n ≥ d

f (n) ≤ cg(n)

f (n) = Ω(g(n)) daca

cf (n) ≥ g(n)

f (n) = Θ(g(n)) daca

f (n) = O(g(n)) si f (n) = Ω(g(n))

13

functie (partiala) m,n

f : A1 × A2 × . . .× Am → B1 × . . .×Bn

f (a1, a2, . . . , am) = (b1, . . . , bn)

sau f (a1, a2, . . . , am) = undef

14

5. Relatii

O relatie n-ara R, n ≥ 1, pentru multimileA1, . . . , An este orice submultime R a produsu-lui cartezian A1 × . . .× An.

Exemplu

P = {p1, p2}, C = {c1, c2, c3, c4}T ⊆ P × CT = {(p1, c1), (p1, c2), (p1, c3), (p2, c1), (p2, c4)}

15

Compunerea relatiilor

Fie R ⊆ A×B si S ⊆ B×C doua relatii. Atuncicompunerea lui R cu S, notata prin R∘S, esterelatia R ∘ S ⊆ A× C, definita prin:R ∘ S = {(a, c) ∣ (a, b) este in R si (b, c) este in S

pentru un b din B}

Compunerea unei relatii R : A× A cu ea insasi

R0 : {(a, a) ∣ a este in A}R1 : RRi, i≥1 : R ∘Ri−1

R+ = ∪∞i=1R

i inchiderea tranzitiva a lui RR∗ = ∪∞

i=0Ri inchiderea reflexiva si tranzitiva a

lui R

16

Proprietatile relatiei R ⊆ A× A

∙ reflexiva: aRa pentru toti a din A

∙ simetrica: aRb implica bRa

∙ antisimetrica: aRb, bRa implica a = b

17

∙ tranzitiva: aRb, bRc implica aRc

△ O relatie binara R peste A este orelatie de echivalenta daca este

∙ reflexiva,∙ simetrica, si∙ tranzitiva.

△ O relatie de echivalenta R peste Adefineste submultimi disjuncte ale lui A,numite clase de echivalenta.

[a]R = {b ∣ b este in A si aRb}

18

∙ Fie R o relatie binara de echivalenta peste Asi B ⊆ A.Spunem ca B este R-inchisa daca pentru totia din B, aRb implica faptul ca b este in B.

Exemplu A : multimea intregilor;B = {9, 4, 3, 2};R :“este patratul lui” (“is the square

of”)Atunci B este R-inchisa.

∙ R este o relatie n-ara peste A. B ⊆ A esteR-inchisa daca pentru toate b1, . . . , bn−1 din B,avem ca daca (b1, . . . , bn−1, bn) este in R atuncibn este in B.

Exemplu A : multimea tuturor intregilor;R = {(a, b, c) ∣ a× b = c};B = {ia ∣ i ≥ 1} pentru un intreg a

Atunci B este R-inchisa.

19

Lema Fie R o relatie de echivalenta peste A.Atunci pentru toti a, b din A avem sau [a]R =[b]R sau [a]R ∩ [b]R = ∅

△ O relatie binara R peste A este o ordine par-tiala daca este∙ reflexiva,∙ tranzitiva, si∙ antisimetrica.

△ Inchidere

– R+ este inchiderea tranzitiva a lui R

– O reprezentare in digraf pentru o inchideretranzitiva:

20

6. Cardinalitate

△ Daca se dau doua multimi A si B, spunemca ele au cardinalitate egala daca exista obijectie f : A → B.

Exemplu #Z = #N0

f (i) = −2i− 1, i < 0f (i) = 2i, i ≥ 0

△ O multime este numarabila sau enumerabiladaca este ori finita ori infinita, caz in care areaceeasi cardinalitate cu N .

21

7. Metode de demonstratie

— demonstratie directa

— demonstratie prin contradictie

— demonstratie prin inductie

△ Demonstratie directaEnumeram toate cazurile pentru a arata ca

proprietatea este adevarate.

Exemplu Orice numar par dintre 4 si 232 estesuma a doua numere prime.

△ Demonstratie prin contradictieStabilim validitatea propozitiei presupunand

ca nu este adevarata si inferam o contradictie.

Exemplu Exista un numar infinit de numereimpare.

22

△ Demonstratia prin inductie

– baza

– ipoteza inductiva

– pasul inductiv

– concluzie

Two forms:(1) — Prove it holds for a basis value b,

— Hypothesize it holds for value n,— Prove it holds for value n + 1.

(2) — Prove it holds for basis values,— Hypothesize it holds for all values

less than n and greater than basisvalue,

— Prove it holds for value n.

23

Example (proof by induction)

(i) no two lines are parallel;

(ii) no three lines have a common intersec-tion point.

Prove that the number of regions in anarrangement of n lines is

1 + n(n + 1)/2

24

8. Proof Techniques

— pigeonhole principle

— counting

— diagonalisation

The Pigeonhole PrincipleLet n ≥ 1 be an integer, let there be npigeonholes, and there be n + 1 items of mailto be placed in the pigeonholes. Then, how-ever the the items are placed, at least onepigeonhole will contain at least two items.

The Pigeonhole Principle(Extended version)Let q1, . . . , qt ≥ 1. There exists a positive inte-ger N(q1, . . . , qt). If n ≥ N(q1, . . . , qt), then how-ever n items are placed into t pigeonholes,there exists i, 1 ≤ i ≤ t such that the i-thpigeonhole contains at least q1 + 1 items.

25

Example Let G = (V,E) be a digraph with#V = n ≥ 1. Then G has a cycle if and only ifG has a path of length at least n.

���� ����

��������

����6

R

:

k

¾

26

9. Alphabets, Words, and Languages

△ An alphabet is a finite, nonempty set ofelements.The elements of the alphabet are calledsymbols or letters.

△ A word over an alphabet Σ is a finite se-quence of symbols from Σ.

△ A language is a set of words.

Examples:(1) English alphabet, words and language.

(2) Alphabet: {a, b}words: ¸, a, b, ab, ba, . . . . . .language: {¸, ab, aabb, aaabbb, . . . . . .}

i.e. {aibi ∣ i ≥ 0}

(3) Alphabet : Fortran reserved words andidentifiers

words : a Fortran programLanguage : the set of all Fortran

programs.

27

△ Empty word is a word over every alphabet.

¸

△ The length of a word is the number of sym-bols in the given word.

Example: ∣¸∣ = 0

∣abbc∣ = 4

△ The a-length of a word x is the number oftimes a occurs in x.

Example: ∣01101∣0 = 2

∣cbcbc∣a = 0

28

△ The catenation of two words x and y, de-noted by xy, is the word obtained by append-ing the word y to x.

Example x = abcy = defgxy = abcdefgyx = defgabc

x¸ = ¸x = x

¸¸ = ¸

△ Catenation is associative:

(xy)z = x(yz)

△ ∣xy∣ = ∣x∣ + ∣y∣

△ x0 = ¸ (Power of a word)xi = xxi−1, i ≥ 1

Example (abc)0 = ¸(abc)3 = abcabcabc

29

△ Prefix: x is a prefix of y if ∃z such that

xz = y

Example ∙ not is a prefix of notice∙ ¸ is a prefix of any word∙ Any word is a prefix of itself

△ Suffix: x is a suffix of y if ∃z such that

zx = y

Example ∙ allow is a suffix of swallow∙ ¸ is a suffix of any word∙ 01011 is a suffix of 01011

△ Subword: x is a subword of y if ∃w, z s.t.

wxz = y

Example The subwords of beat are:

¸, b, e, a, t, be, ea, at, bea, eat, beat

△ Proper prefix (suffix, subword):x is a proper prefix (suffix, subword) of yif x is a prefix (suffix, subword) of yand x ∕= ¸ and x ∕= y.

30

△ Reversal:(i) xR = x, if x = ¸;(ii) xR = yRa, if x = ay for a ∈ Σ,y is a word

Lemma If x = a1 . . . an for some n ≥ 0, then

xR = an . . . a1.

Proof: By induction on ∣x∣.Basis: ∣x∣ = 0. ¸R = ¸ by (i).I.H.: Assume Lemma holds for all x

with ∣x∣ = n.I.S.: Consider a word x with ∣x∣ = n + 1.

Then x = a1 . . . an+1 = a1u, where

u = a2 . . . an+1.

Then xR = uRa1.But ∣u∣ = n, by I.H., uR = an+1 . . . a2,so we get

xR = an+1 . . . a1. qed

△ Universal Language over Σ: Σ∗

Σ = {0, 1}, Σ∗ = {¸, 0, 1, 00, 01, 10, 11, 000, . . .}

31

△ Mappings (Morphism and Substitution)

Definition Let Σ and Δ be two alphabets.Then a mapping ℎ : Σ∗ → Δ∗ is said to bea morphism if(i) ℎ(¸) = ¸;(ii) ℎ(xy) = ℎ(x)ℎ(y), for all x, y in Σ∗.

Note: (ii) implies that we need only definethe images of letters to define the images ofwords.

Example:Σ = {0, 1}, Δ = {a, b, c, d},ℎ(0) = baac, ℎ(1) = ¸.Then ℎ(1001) = ℎ(1)ℎ(0)ℎ(0)ℎ(1) = ¸baacbaac¸ =baacbaac

A morphism ℎ is said to be ¸-free if, for allx ∕= ¸, ℎ(x) ∕= ¸.

32

Definition Let Σ and Δ be two alphabets. Amapping ¾ : Σ∗ → 2Δ

∗is said to be a

substitution ifi) ¾(¸) = {¸};ii) ¾(xy) = ¾(x)¾(y), for all x, y ∈ Σ∗.

Example:Σ1 = {a, b}, Δ1 = {a, b, c},a substitution ¾1:¾1(a) = {ab, ac, b};¾1(b) = {b, ba}.

Then ¾1(ba) =

Example:Σ2 = {0, 1, 2}, Δ2 = {a, b},¾2(0) = {ai ∣ i ≥ 1}¾2(1) = {bi ∣ i ≥ 1}¾2(2) = {¸}

Then ¾2(1020) =

33

△ Given an alphabet Σ, Σ∗ is defined as fol-lows:(i) ¸ ∈ Σ∗;(ii) if x ∈ Σ∗, then ax ∈ Σ∗ for all a ∈ Σ.

△ A language L over an alphabet Σ is a subsetof Σ∗, i.e., L ⊆ Σ∗.

Example:Á is a language over every alphabet .{¸} is a language over every alphabet.{aibi ∣ i ≥ 0} is a language over {a, b}{aibici ∣ i ≥ 0} is a language over {a, b, c}

34

△ Catenation of Languages

L1 ∘ L2 = {x1 ∘ x2 ∣ x1 in L1, x2 in L2},usually written as L1L2.

Examples

L1 = {a, ab}; L2 = {bc, c}

L1L2 = {abc, ac, abbc}

(Note that L1×L2 = {(a, bc), (a, c), (ab, bc), (ab, c)})

ÁL1 =

{¸}L1 =

{¸}Á =

LA = {a, ¸}; LB = {ab, b}

LALB =

35

△ Powers of Languages:L0 = {¸}Ln+1 = Ln ∘ L, n ≥ 1

△ PlusL+ = ∪∞

i=1Li

△ StarL∗ = ∪∞

i=0Li L0 ∪ L1 ∪ L2 ∪ . . . . . .

ExamplesL = {ab, b}. Then

L0 = {¸}L1 = L = {ab, b}L2 = {abab, bab, abb, bb}

Á0 = {¸} ∕= Á{¸}0 = {¸}

L∗ = {¸, ab, b, abab, bab, abb, bb, . . . . . .}(Note abababab ∈ L4)

36

△ Reversal of LanguagesLR = {xR ∣ x is in L}

ExamplesLF = {aab, abb, abbb};LRF = {baa, bba, bbba}

LI = {ambn ∣ m > n > 0}LRI = {bnam ∣ m > n > 0}

— properties of reversal

(A ∪B)R = AR ∪BR

(A ∩B)R = AR ∩BR

(AB)R = BRAR

(A+)R = (AR)+

(A∗)R = (AR)∗

△ Complementation of LanguagesGiven a language L over Σ,L = Σ∗ − L

Σ∗ = Á Σ∗ ? {¸} (A)R = AR

L = {anbn ∣ n ≥ 0} over Σ = {a, b}L =

37

△ Alternative definition of L+ and L∗

L+ = {w in Σ∗ ∣ w = w1w2 . . . wn, forw1, w2, . . . , wn ∈ L and n ≥ 1}

L∗ = {w in Σ∗ ∣ w = w1w2 . . . wn, forw1, w2, . . . , wn ∈ L and n ≥ 0}

△ Given a word x in Σ∗, and a language L ⊆ Σ∗

to test if x is in L∗ we use L∗ = ∪∞i=0L

i and testif x is in L0, L1, . . . .

38

△ Property Summaries— properties of catenation (of languages)

Associativity — L1(L2L3) = (L1L2)L3

Identity — L{¸} = {¸}L = L

Zero — LÁ = ÁL = Á

Distributivity — L1(L2 ∪ L3) = L1L2 ∪ L1L3

w.r. ∪ — (L1 ∪ L2)L3 = L1L3 ∪ L2L3

Distributivity does not hold with respect to∩

— properties of Plus and Star

L∗ = L+ ∪ {¸} ( But L+ = L∗ − {¸}?)L+ = LL∗

(L+)+ = L+, (L∗)∗ = L∗

Á+ = Á, Á∗ = {¸}, Á0 = {¸}{¸}+ = {¸}, {¸}∗ = {¸}, Á2 = Á

L+L = LL+, L∗L = LL∗

A ⊆ B ⇒ A+ ⊆ B+ and A∗ ⊆ B∗

39

Chapter 2. FINITE AUTOMATA

Example Design a “sequential lock”. Thelock has 1-bit sequential input. Initially thelock is closed. If the lock is closed it will openwhen the last three input signals are “1”, “0”,“1”, and then remains open.

— state (transition) diagram

— state (or transition) table

40

state set : {0, 1, 2, 3}input alphabet : {0, 1}Transition function : ±(0, 0) = 0, ±(0, 1) = 1, . . .Start state : 0Final state set : {3}

Deterministic Finite Automata (DFA)M = (Q, Σ, ±, s, F ) where

Q is a finite nonempty set of statesΣ is the input alphabet± : Q× Σ → Q transition functions start stateF ⊆ Q final state set

41

A computer is a finite state system (i.e. FA)which has millions of states.

There are many examples of FINITESTATE SYSTEMS. A finite automaton is anABSTRACTION of them.

View a DFA as a machine

42

Specifying ±1)

State diagram(Transition diagram)

Start state

Final state

2)

43

Configurationsa word in QΣ∗

px

where p is the present state, andx is the remaining input

Example:

0aa . . . 1a . . . 2(start configuration) (final configuration)

Moves of a DFA0aa ⊢ 1a px ⊢ qy1a ⊢ 2 if x = ay and ±(p, a) = q

Configuration sequence0aa ⊢ 1a ⊢ 2

⊢+ and ⊢∗

⊢ is a binary relation over QΣ∗.⊢+ : transitive closure of ⊢.⊢∗ : reflexive transitive closure of ⊢.

44

0aa ⊢+ 2

0aa ⊢∗ 2

0aa ⊢∗ 0aa

0aa ⊢2 2

px ⊢k qy

if px ⊢ pi1xi1 ⊢ pi2xi2 ⊢ . . .︸ ︷︷ ︸

⊢ qy

k steps

Accepting Configuration Sequence0aa ⊢ 1a ⊢ 2

⊢ can also be viewed as a function

⊢ : QΣ∗ → QΣ∗,

since the next configuration is determineduniquely for a given configuration.

The DFA stops when:(i) we have no more input,or (ii) the next configuration is undefined.

45

A word x is said to be accepted by a DFA Mif sx ⊢∗ f, f ∈ F .

The language of a DFA M , L(M), is definedas:

L(M) = {x ∣ sx ⊢∗ f, for some f ∈ F}

Examples

L(M) =

L(N) =

46

DFA membership problem

DFA MEMBERSHIP

INSTANCE: A DFA, M = (Q,Σ, ±, s, F )and a word x ∈ Σ∗.

QUESTION: Is x in L(M) ?

Run the DFA M with input x.In at most ∣x∣ steps it accepts, rejects oraborts.

Examples

Checking for words thatcontain aba as subword.

Check: ababbaabbaabbbab

47

Let L denote any letter of English alphabetand D any decimal digit; the form ofPASCAL IDENTIFIERS can be specifiedby

Recognizing comments that may go overseveral lines.

/*................*/

Δ: symbols other than ” ∗ ” and ”/”

48

A DFA which has a total ± is saidto be complete; if ± is nontotal it isincomplete.

Theorem. Every incomplete DFA Mcan be ”completed” by adding onenew state (”sink”) to give DFA M ’such that L(M ′) = L(M).

Example:

L(M) is the set of all words thatdo not contain two consecutive a’s.

49

△ Two DFA M1 and M2 are equivalentif L(M1) = L(M2).

△ The collection of languages acceptedby DFA’s is denoted by

ℒDFA.

It is called the family of DFA languagesand it is defined as:

ℒDFA = {L ∣ L = L(M) for some DFA M }

△K = {aibi ∣ i ≥ 1} is not acceptedby any DFA.

50

Proof: Use contradiction andPigeonhole principle.

Assume K = L(M), for some DFA

M = (Q, {a, b}, ±, s, F ).

Let n = #Q. Consider the acceptingconfiguration sequence for anbn,

s0anbn ⊢ s1a

n−1bn ⊢ . . . ⊢ snbn ⊢ . . . ⊢ s2n

where s0 = s and s2n ∈ F . Now n + 1 statesappear during the reading of an, butthere are only n distinct states in Q.By Pigeonhole principle at least onestate must appear at least twice duringthe reading of a’s.

Assume si = sj, 0 ≤ i < j ≤ n.Then

s0an−(j−i)bn ⊢ . . . ⊢ sia

n−jbn

sjan−jbn ⊢ . . . ⊢ snb

n ⊢ . . . ⊢⊢ s2n

Therefore an−(j−i)bn ∈ K.

This is a contradiction.

51

△ Li = {aibi}, i ≥ 1.

For any i ≥ 1, is Li a DFA language?

△ Kj = {aibi : 0 ≤ i ≤ j}, j ≥ 1.

For any j ≥ 1, is Kj a DFA language ?

52

Nondeterministic Finite Automata (NFA)

M = (Q,Σ, ±, s, F )same as a DFA except± ⊆ Q× Σ×Q.

± is a finite transition relation.

In a DFA± is a transition function:± : Q× Σ → Q

It can be viewed as a relation± : Q× Σ×Q

In a NFA, ± can be be viewed as a function:± : Q× Σ → 2Q

Examples:NFA for words in {a, b}∗ that contain threeconsecutive a’s.

53

Both (0, a, 1) and (0, a, 3) are in ±.

We define acceptance by existenceof a computation that leads to a finalstate.

Conversely, we define rejection bythe nonexistence of any computationthat leads to a final state.

The language of an NFA M = (Q,Σ, ±, s, F )is defined by

L(M) = {x ∣ sx ⊢∗ f , for some f in F }.

54

The family of NFA languages ℒNFA

is defined by:

ℒNFA = {L ∣ L = L(M), for some NFA M }.

Two NFAs M1, and M2 are equivalent ifL(M1) = L(M2).

Why NFA?

(i) easy to construct;(ii) useful theoretically;(iii) are of same power as DFA.

Note:

configurations are defined in the same wayTransition (move)

px ⊢ qy

if x = ay, for some a ∈ Σ, and (p, a, q) ∈ ±.

55

Transforming NFA to DFA

Consider the NFA M1 again

There are only limited number of choices.For example:

0aab ⊢ 1ab ⊢ 1b ⊢ 20aab ⊢ 3ab ⊢ 3b{0}aab ⊢ {1, 3}ab ⊢ {1, 3}b ⊢ {2}Why limited number of choices?

The state set is finite.

We summarize the choices at each stepby combining all configuration sequencesinto one ”super-conf. sequence”.

{0}aab ⊢ {1, 3}ab ⊢ {1, 3}b ⊢ {2}.56

We now have a set of all possiblestates at each step. From this pointof view the computation of the NFAon an input word is deterministic.

A super-configuration has the form

Kx

where K ⊆ Q and x ∈ Σ∗.

Note that ∅x is a super-conf.,it means that the NFA cannot bein any state at that point, i.e.,an abort has occured.

We say that

Kx ⊢ Ny

if x = ay, for some a ∈ Σ, and

N = {q ∣ (p, a, q) ∈ ±, for some p ∈ K}

57

More examples on super-configurations

M : L(M) is the set of all wordsthat have “ba” as a subword.

The super-configuration sequenceon input word “abbaa” is as follows:

{0}abbaa ⊢ {0}bbaa ⊢ {0, 1}baa ⊢ {0, 1}aa⊢ {0, 2}a ⊢ {0, 2}

58

Notice that given a set K ⊆ Q andan input symbol a ∈ Σ, the set N ⊆ Qs.t. Ka ⊢ N is uniquely determined.

Lemma (2.3.1) (Determinism Lemma)Let M = (Q,Σ, ±, s, F ) be an NFA.Then for all words x in Σ∗ and forall K ⊆ Q.

Kx ⊢∗ N and Kx ⊢∗ P

implies

P = N .

Lemma (2.3.2) Let M = (Q,Σ, ±, s, F )be an NFA. Then for all words x in Σ∗

and for all q in Q,

qx ⊢∗ p

iff {q}x ⊢∗ P , for some P with p in P .

59

Example (Transformation of an NFA to a DFA)

M = (Q,Σ, ±, s, F ) where

Q = 0, 1, 2, Σ = a, bs = 0, F = {2}

M ′ = (Q′,Σ, ±′, s′, F ′) where

Q′ = 2Q = {∅, {0}, {1}, {2}, {0, 1}, {0, 2}, {1, 2}, {0, 1, 2}}

±′(P, a) = {q ∣ (p, a, q) ∈ ± and p ∈ P}60

s′ = {0}F ′ = {

Algorithm NFA to DFA—The Subset Construction

On entry: An NFA M = (Q,Σ, ±, s, F ).

On exit: A DFA M ′ = (Q′,Σ, ±′, s′, F ′)satisfying L(M) = L(M ′).

begin Let Q′ = 2Q, s′ = {s} andF ′ = {K ∣ K ∈ Q′, and K ∩ F ∕= ∅}We define ±′ : Q′ × Σ → Q′ byFor all K ∈ Q′ and for all a ∈ Σ,±′(K, a) = N , if Ka ⊢ N in M .

end of Algorithm

if N = {q ∣ (p, a, q) ∈ ± and p ∈ K}61

s′ = {0}

Algorithm NFA to DFA 2—The Iterative Subset Construction

62

Theorem Given an NFA M = (Q,Σ, ±, s, F ), thenthe DFA M ′ = (Q′,Σ′, ±′, s′, F ′) obtained by ei-ther subset construction satisfiesL(M ′) = L(M).

Proof:By Lemma 2.3.2, for all x ∈ Σ∗ in M

sx ⊢∗ p, iff {s}x ⊢∗ P for some P with p ∈ P

By the construction of M ′,{s}x ⊢∗ P in M iff{s}x ⊢∗ P in M ′.

x ∈ L(M) ⇔ sx ⊢∗ f, for some f ∈ F

⇔ {s}x ⊢∗ P, f ∈ P, in M

⇔ {s}x ⊢∗ P, in M ′ and P ∩ F ∕= ∅⇔ s′x ⊢∗ P, P ∈ F

⇔ x ∈ L(M ′)

63

TheoremEvery NFA Language is a DFA language andconversely.(ℒNFA = ℒDFA)

ExampleEvery finite language is accepted by a DFA.

64

¸–NFA

It is useful to loosen the definition of NFAeven more by allowing the read head to re-main over the same symbol of the input andread nothing.

ExampleL1 = {aibj ∣ i ≥ 0, j ≥ 0}M:

L(M) = L1

± : (0, a, 0)

(0, ¸, 1) ¸− transition

(1, b, 1)

0aab ⊢ 0ab ⊢ 0b ⊢ 1b ⊢ 1

0bb ⊢ 1bb ⊢ 1b ⊢ 1

0a ⊢ 0 ⊢ 1

65

Formally, a ¸-NFA M = (Q,Σ, ±, s, F )where Q, Σ, s, F are as before, but± is a finite transition relationfor which

± ⊆ Q× (Σ ∪ {¸})×Q

Configurations are as before.⊢ is defined by

px ⊢ qy

if either x = ay for a ∈ Σ and (p, a, q) ∈ ±or x = y and (p, ¸, q) ∈ ±

ExampleGiven FA M1 and M2, constructa FA M3 such thatL(M3) = L(M1) ∪ L(M2)

66

Transforming ¸-NFA to NFA

Two steps:Step I: ¸− completion

Step II: ¸− transition removal

(I). ¸-Completion

Given a ¸-NFA M = (Q,Σ, ±, s, F )perform the following process:

For all p, q, r ∈ Q:whenever (p, ¸, q), (q, ¸, r) are in ±add (p, ¸, r) to ±

until no new transitions are added to ±and let this be ±′.

Let the new ¸−NFA beM ′ = (Q,Σ, ±′, s, F ′)where F ′ = F ∪ {p ∣ (p, ¸, f ) ∈ ± and f ∈ F}and ±′ = ± ∪ {(p, ¸, q) ∣ p ⊢+ q}

Example:

67

Claim 1: For any p, q ∈ Q,

p ⊢+M q if and only if p ⊢M ′ q

Claim 2: For any p, q ∈ Q, x ∈ Σ∗,

px ⊢∗M q if and only if px ⊢∗

M ′ q

Theorem: L(M ′) = L(M)

Example:

68

(II) ¸–Transition Removal

Given a ¸-completed ¸-NFA

M = (Q, Σ, ±, s, F ),

perform the following process:

(0) ±′ = ±;

(i) For all p, q, r ∈ Q,

if (p, ¸, q) and (q, a, r) in ±

then add (p, a, r) to ±′;(ii) Delete all ¸-transitions from ±′.

Now we got M ′ = (Q, Σ, ±′, s, F )where ±′ = (± ∪ {(p, a, r) ∣ (p, ¸, q), (q, a, r) ∈ ±})−{(p, ¸, q) ∣ p, q ∈ Q}

Example

69

Claim Whenever

sx ⊢∗M f

for some f ∈ F , we have

sx ⊢∗M ′ f

and vice versa.

Claim L(M ′) = L(M)

Theoremℒ¸−NFA = ℒNFA

70

Properties of FA Languages

I. Closure Properties

Union: The union of two DFA languagesis a DFA language.Proof: Given L1 = L(M1) and L2 = L(M2),L1, L2 are generic DFA languages.

M1 = (Q1, Σ1, ±1, s1, F1) and

M2 = (Q2, Σ2, ±2, s2, F2).

Construct a ¸-NFA M = (Q, Σ, ±, s, F ) where

Q = Q1 ∪Q2 ∪ {s}Σ = Σ1 ∪ Σ2

± = ±1 ∪ ±2 ∪ {(s, ¸, s1), (s, ¸, s2)}F = F1 ∪ F2

71

Now we claim that L(M) = L1 ∪ L2

(1) Let x be a generic word in L(M).x ∈ L(M) impliessx ⊢+

M f for some f ∈ F1 or f ∈ F2.If f ∈ F1, then sx ⊢M s1x ⊢∗

M f and thens1x ⊢∗

M1f . We know that x ∈ L(M1).

If f ∈ F2, then sx ⊢M s2x ⊢∗M f . Since

s2x ⊢∗M2

f , we know that x ∈ L(M2),therefore x ∈ L(M1) ∪ L(M2), i.e., x ∈ L1 ∪ L2.So L(M) ⊆ L1 ∪ L2.

(2) Let x be a generic word in L1 ∪ L2.Without losing of generality we can assumethat x ∈ L1, i.e., x ∈ L(M1).Then s1x ⊢∗

M1f , for some f ∈ F1.

Then sx ⊢M s1x ⊢∗M f . Therefore x ∈ L(M).

So, L1 ∪ L2 ⊆ L(M).

(3) By the results of (1) and (2), we concludeL(M) = L1 ∪ L2.

Since every ¸-NFA language is a DFA language,L(M) is a DFA language.

72

2. Complementation:

The complement of a DFA language L ⊆ Σ∗

is a DFA language.Proof: There is a complete DFAM = (Q, Σ, ±, s, F ) with L = L(M). DefineM = (Q, Σ, ±, s, Q− F )

3. Intersection:

L1 and L2

L = L1 ∩ L2 = L1 ∪ L2

73

4. Catenation:

The catenation of two DFA languages is aDFA language:

M1 = (Q1, Σ1, ±1, s1, F1);M2 = (Q2, Σ2, ±2, s2, F2).

L(M) = L(M1)L(M2)

M = (Q1 ∪Q2, Σ1 ∪ Σ2, ±, s1, F2),± = ±1 ∪ ±2 ∪ {(f, ¸, s2) ∣ f ∈ F1}

M = (Q, Σ, ±, s, F ) where

Q = Q1 ∪Q2

Σ = Σ1 ∪ Σ2

± = ±1 ∪ ±2 ∪ {(f, ¸, s2) ∣ f ∈ F1}s = s1F = F2

74

5. Star:The star of a DFA language is a DFA language

L = L(M) where M = (Q, Σ, ±, s, F )

L∗ = L(M ′)M ′ = (Q′, Σ′, ±′, s′, F ′) where

Q′ = Q ∪ {s′}Σ′ = Σ

±′ = ± ∪ {(s′, ¸, s)} ∪ {(f, ¸, s′) ∣ f ∈ F}F ′ = F ∪ {s′} F ′ = {s′}

6. Plus:

75

II. Decidability Properties

A decision problem is a problem each in-stance of which is either false or true.

A decision algorithm is an algorithm whoseresult for each possible input is either falseor true.

A decision problem is decidable if there ex-ists a decision algorithm for it. Otherwise itis undecidable.

1. DFA membershipINSTANCE: A DFA M = (Q, Σ, ±, s, F )

and a word x ∈ Σ∗.QUESTION: Is x ∈ L(M) ?

Theorem: DFA membership is decidable.

76

Proof: Compute for the given M and x the ter-minating state q such that sx ⊢∗ q. If q ∈ Fthen answer “True” else answer “false”.

2. DFA Emptiness

INSTANCE: A DFA M = (Q, Σ, ±, s, F ).QUESTION: Is L(M) = ∅ ?

Theorem DFA emptiness is decidable.

Proof: L(M) = ∅ iff there is no path in thestate diagram of M from s to a final state. IfF = ∅, then this holds immediately.Otherwise, enumerate the states that can bereached from s. (Mark s. Now mark all statesreachable by one transition from one of themarked states. Repeat this until no newlymarked state is introduced.The marked states are the reachable states).

77

3. DFA Universality

INSTANCE: A DFA M = (Q, Σ, ±, s, F ).QUESTION: Is L(M) = Σ∗ ?

Theorem DFA universality is decidable.Proof:

4. DFA Containment

INSTANCE: Two DFAM1 = (Q1, Σ1, delta1, s1, F1) andM2 = (Q2, Σ2, delta2, s2, F2).

QUESTION: Is L(M1) ⊆ L(M2) ?

Theorem DFA containment is decidable.Proof:

L(M1) ∩ L(M2) = ∅ means that L(M1) ⊆ L(M2)

78

5. DFA Equivalence

INSTANCE: Two DFAM1 = (Q1, Σ1, delta1, s1, F1) andM2 = (Q2, Σ2, delta2, s2, F2).

QUESTION: Is L(M1) = L(M2) ?

Theorem DFA equivalence is decidable.

L(M1) ⊆ L(M2) ? yes

L(M2) ⊆ L(M1) ? yes

79

Pumping Lemma and Non-DFA Language.

The DFA Pumping LemmaLet M = (Q, Σ, ±, s, F ) be a DFA and letp = #Q. For all words x in L(M) such that∣x∣ ≥ p, x can be decomposed into uvw, forsome u, v, and w in Σ∗ such that

(i) ∣uv∣ ≤ p;

(ii) ∣v∣ ≥ 1; and

(iii) for all i ≥ 0, uviw is in L(M).

Example M = (Q,Σ, ±, s, F ) whereQ = {s, p, q}, Σ = {a, b},F = {q}. #Q = 3

Let x be an arbitrary word of length ≥ #Qin L(M), e.g. x = baaa. The accepting conf.sequence for x is:sbaaa ⊢ paaa ⊢ qaa ⊢ pa ⊢ q.So, u = b, v = aa, w = a and b(aa)ia ∈ L(M) forall i ≥ 0.

80

Proof of DFA Pumping Lemma:

Let x be in L(M) with ∣x∣ ≥ p.Then M has an accepting configuration se-quence

q0a1a2 . . . ar ⊢ q1a2 . . . ar ⊢ . . . ⊢ qr−1ar ⊢ qr

where q0 = s, qr ∈ F .Consider the first p transitions. s, q1, . . . , qpcannot all be distinct since there are only pdistinct states (by pigeonhole principle).This means that qi = qj for some 0 ≤ i < j ≤ p.Let

u = a1 . . . ai,

v = ai+1 . . . aj,

w = aj+1 . . . ar.

So, we have q0u ⊢∗ qiqiv ⊢∗ qj (qi = qj)qjw ⊢∗ qr.

Since qi = qj, qivk ⊢∗ qj for any k ≥ 0

81

(i) ∣uv∣ ≤ p, since j ≤ p

(ii) ∣v∣ ≥ 1, since i < j

(iii) uvkw ∈ L(M) for all k ≥ 0, since

q0uvkw ⊢∗ qiv

kw ⊢∗ qjw ⊢ qr ∈ F

qed.

Re-state Pumping LemmaIf L is a DFA Languagethen

(L = L(M) for some M of p states;)for every word x of length ≥ p in L,there exists one decomposition x = uvwwhich satisfies

(i) ∣uv∣ ≤ p,

(ii) ∣v∣ ≥ 1,

(iii) uvkw ∈ L(M) for all k ≥ 0.

82

Comments: The DFA pumping lemma showsa property of DFA languages. It can be usedpossitively; but it is mainly used negativelyto show that some languages are not in ℒDFA.The DFA pumping lemma gives a necessarycondition for DFA languages. The conditionis not sufficient.

Review of logic:if A then B ⇔ if ¬B then ¬Aif A then B ∕⇔ if B then Aif A then B ∕⇔ if ¬A then ¬B

Exampleif it is sugar then it is sweet.if it is not sweet then it is not sugar.if it is sweet then it is sugar.

So, pumping lemma can be used to showthat a language is not a DFA language, butcannot be used to show that a language is aDFA language.

83

Non-DFA Language

L = {aibi ∣ i ≥ 0} is not a DFA Language

Proof (I):

Assume L is a DFA language. Then L = L(M)for some DFA M = (Q, Σ, ±, s, F ) with ∣Q∣ = p.Consider x = atbt, ∣x∣ ≥ p.By DFA pumping lemma, there is a decom-position

x = uvw

which satisfies (i), (ii), and (iii).Consider all the possible decompositions

Case 1 : v = ak, k ≥ 1. k ≤ t

But uv0w = at−kbt ∕∈ L.

Case 2 : v = bk, k ≥ 1.

uv0w = atbt−k ∕∈ L.

Case 3 : v = akbl, k, l ≥ 1.

Then uv2w = atblakbt ∕∈ L.

All the possible decompositions fail.Therefore, L is not a DFA language.

84

Proof (II):. . . . . . . . . . . . . . . . . .∣Q∣ = p.Consider x = apbp in L. Obviously, ∣x∣ ≥ p.By DFA pumping lemma, there is a decom-

positionx = uvw

that satisfies (i), (ii), and (iii).The only decompositions that satisfy

(i) and (ii) are the following:

v = ak for p ≥ k ≥ 1.

But uv0w = ap−kbp ∕∈ L.So, there is no such decomposition.L is not a DFA language.

(i) ∣uv∣ ≤ p;(ii) ∣v∣ ≥ 1;

85

Notes on proving that a language is nota DFA language by pumping lemma

1) Assume L is a DFA language.

Then we have a constant p.

2) Choose a word in L of length ≥ p.

3) Consider all the possible decompositions

of x. If none of them satisfy

(i), (ii), and (iii) at the same time,

then conclude that such decomposition

does not exist.

So, L is not a DFA language.

86

L = {a2n ∣ n ≥ 0} is not a DFA language.

Proof:

(1) Assume L = L(M) for some M = (Q,Σ, ±, s, F )with ∣Q∣ = p.

(2) Consider x = a2p+1

, ∣x∣ ≥ p.(3) By DFA P.L., there is a decomposition

x = uvw

satisfying (i), (ii), and (iii).The only decompositions of x which satisfy(i) and (ii) are the following:

v = ak, p ≥ k ≥ 1.

Notice that uw = a2p+1−k.

2p+1 > 2p+1 − k ≥ 2p+1 − p > 2p+1 − 2p = 2p

i.e., 2p+1 > 2p+1 − k > 2p.

∣a2p+1∣ > ∣uw∣ > ∣a2p∣.

So, uw /∈ L. It is contrary to (iii), so L isnot a DFA language.

87

6. DFA Finiteness

INSTANCE: A DFA M = (Q,Σ, ±, s, F )QUESTION: Is L(M) finite?It is decidable.

Theorem L(M) is finite iffM accepts no words x that satisfy

∣Q∣ ≤ ∣x∣ < 2∣Q∣Proof: If: Assume M accepts no words x with

∣Q∣ ≤ ∣x∣ < 2∣Q∣ and L(M) is infinite. SinceL(M) is infinite, it must accept some wordx with ∣x∣ ≥ 2∣Q∣. By P.L., x = uvw with∣Q∣ ≥ ∣v∣ ≥ 1, and x′ = uw is in L(M). No-tice that ∣x∣ > ∣x′∣ ≥ ∣x∣ − ∣Q∣. If ∣x′∣ ≥ 2∣Q∣,then repeat this process.Otherwise 2∣Q∣ > ∣x′∣ ≥ ∣Q∣. (i.e., 2∣Q∣ − ∣Q∣).This is a contradiction.

Only if: By P.L., x = uvw, ∣v∣ ≥ 1.Since uviw ∈ L(M) for all i ≥ 0, L(M) is infi-nite.

88

Theorem The family of DFA languages isclosed under morphisms.

Proof:

Let L ⊆ Σ∗ be a DFA language. Then there isa DFA M = (Q,Σ, ±, s, F ) such that L(M) = L.Let f : Σ∗ → Δ∗ be a morphism.We are going to show thatf (L) = {f (x) ∣ x ∈ L} is a DFA language.Construct M ′ = (Q,Δ, ±′, s, F ) by defining(p, f (a), q) is in ±′ iff ±(p, a) = q.

M ′ is a lazy FA. So, L(M’) is a DFA language.It is not difficult to show that L(M ′) = f (L).Therefore, f (L) is a DFA language. qed

89

Finite Transducers and Finite Transductions(Translations)In automata and language theory, a machinewith input and output is called a transducer.

- -I in Σ∗ O in Δ∗

A transducer is single-valued if it producesat most one output word for each input word.It is multi-valued otherwise.

A transducer defines— a function f : Σ∗ → Δ∗, if it is single-valued,—a relation R ⊆ Σ∗ ×Δ∗, otherwise.

A function or a relation that is defined by atransducer is called a transduction (transla-tion).

90

Specifically, a finite automaton with outputis called a finite transducer.

FA(DFA, NFA, or ¸-NFA)+ output (output alphabet, output tape)= FT

Input tape

Finite control

Output tape

?

6

read-only head

write-only head

Formally, a finite transducer (FT) M is a six-tuple (Q,Σ,Δ, ±, s, F ) whereΔ is an output alphabet;± ⊆ Q×(Σ∪{¸})×Q×Δ∗ is a finite transitionrelation;the others are the same as in a FA.

91

A configuration vpu is a word in Δ∗QΣ∗.

v — output till now;p — current state;u — remaining input.

We writevpau ⊢ vzqu

if (p, a, q, z) is in ±.

⊢i, ⊢+, ⊢∗ are defined as usual.

Given x ∈ Σ∗, we say that y is an output forx w.r.t. M if

sx ⊢∗ yf

for some f ∈ F .

Let M(x) be the set of all outputs for x, thenM(x) = {y ∣ sx ⊢∗ yf for some f ∈ F}.The transduction or translation defined by Mis T (M) = {(x, y) ∣ x ∈ Σ∗ and y ∈ M(x)}

92

Example M :

0abbc ⊢ a0bbc ⊢ a1bbc ⊢ ab1bc⊢ abb1c ⊢ abb2c ⊢ abb2

i.e. 0abbc ⊢∗ abb2Since 2 ∈ F , we say abb ∈ M(abbc),and (abbc, abb) ∈ T (M).We can see that

T (M) = {(aibjck, aibj) ∣ i, j, k ≥ 0}Example

Let ℎ(a) = 101, ℎ(b) = 11.Encoding:

Decoding:

93

REGULAR EXPRESSIONS

— The Second Model for Defining Languages

Example:Consider the language of all the words thatconsist of a’s and b’s and have abb as a sub-word.We can formally define this language by thefollowing:(1) L = {x ∣ x ∈ {a, b}∗ and x has abb as a

subword};(2) L = L(M) where M is an NFA given bythe following diagram:

Both of the above definitions are lengthy.It can also be expressed by

L([a ∪ b]∗abb[a ∪ b]∗)

94

Definition Let Σ be an alphabet.A regular expression over Σ is defined recur-sively:Basis: (1) ∅,

(2) ¸,(3) a, where a ∈ Σ

are R.E.’s over ΣInduction Step:

If E1 and E2 are R.E.’s over Σ, then(4) [E1 ∪ E2],(5) [E1 ⋅ E2],(6) E∗

1

are R.E.’s over Σ

We usually omit ⋅.

The set of all regular expressions over Σ isdenoted by

ℛΣ

95

[[a + b]a]∗ is the same as [[a ∪ b]a]∗

Example:

[[E1︷︸︸︷a ∪

E2︷︸︸︷b ]

︸ ︷︷ ︸E3

a︸︷︷︸E4

︸ ︷︷ ︸E5

]∗ is a R.E. over {a, b}

We display the “parsing” by anexpression tree: ∗

⋅∪

a b a

What language does an R.E. define?

(0,75)

{a} {b}∪{a, b}

⋅{a}{aa, ba}

∗ {aa, ba}∗

96

Definition Given a regular expression E, thelanguage L(E) denoted by E is defined as fol-lows:Basis: (1) if E = ∅, then ∅;

(2) if E = ¸, then {¸};(3) if E = a, then {a};

Induction Step:(4) if E = [E1 ∪ E2], then L(E1) ∪ L(E2);(5) if E = [E1E2], then L(E1)L(E2);(6) if E = E∗

1, then (L(E1))∗.

Properties of R.E.’s∪: E1 ∪ E2 ≡ E2 ∪ E1;

[E1 ∪ E2] ∪ E3 ≡ E1 ∪ [E2 ∪ E3];So, [E1 ∪ E2] ∪ E3 ⇒ E1 ∪ E2 ∪ E3.

⋅: [E1E2]E3 ≡ E1[E2E3];So, [E1E2]E3 ⇒ E1E2E3.

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

97

Definition A language L is regular iff there isa regular expression E such that L = L(E).

The family of (all) regular languages isdenoted by ℒREG.

Example E = [b∗ab∗a]∗b∗.Leven = {x ∣ x in {a, b}∗ and x contains an even

number of a′s}.Claim: L(E) = Leven

Proof:(i) L(E) ⊆ Leven, since every word in L(E)

contains an even number of a’s.(ii)Let x ∈ Leven. Then x can be written as

bi0abi1a . . . . . . abi2n, i0, i1, . . . , i2n ≥ 0.So, x = (bi0abi1a)(bi2abi3a) . . . . . . (bi2n−2abi2n−1a)bi2n.x ∈ L([b∗ab∗a]∗)L(b∗) = L(E). q.e.d.

98

Examples Σ = {a, b}.L1 = {x ∣ x = au, u ∈ Σ∗}.

L2 = {x ∣ ∣x∣a ≡ 0 mod 3}.[a[aa]]∗

L3 = {x ∣ x has 2 or 3 a′s with the last twoappearances nonconsecutive }

L4 = {x ∣ x = anbn, n ≥ 1}

99

ExamplesWhat are the languages denoted by the fol-lowing R.E.’s ?

E1 = a∗ba∗

E2 = [a ∪ ab]∗

E3 = a[a ∪ b]∗a

E4 = [aa ∪ bb ∪ ba ∪ ab]∗

100

How many languages over Σ do R.E.’s define?Σ = {a, b}

(1) Infinitely many ?

(2) Countable ?

101

Regular Expression into Finite Automata

Let E be a regular expression over Σ. Thenwe can construct a ¸-NFA M such thatL(M) = L(E), using the following rules:(i) E = ∅.Construct M such that L(M) = ∅

����

-

(ii) E = ¸.Construct M such that L(M) = {¸}

��������-

(iii) E = a, a ∈ Σ.Construct M such that L(M) = {a}

��������"!#

-- a

102

(iv) E = [E1 ∪ E2].Construct M such that L(M) = L(M1)∪L(M2).

(v) E = [E1E2].Construct M such that L(M) = L(M1)L(M2).

(vi)E = E∗1.

Construct M such that L(M) = L(M1)∗.

103

ExampleE = [c∗[a ∪ [bc∗]]∗]Construct a FA M such that L(M) = L(E).

△ a, b, c by (iii)

△ c∗ by (vi)

△ [bc∗] by (v)

△ [a ∪ bc∗] by (iv)

△ [a ∪ bc∗]∗ by (vi)

△ [c∗[a ∪ bc∗]∗] by (v)

104

Theorem For E, an arbitrary regular expres-sion over Σ, the ¸-NFA, M , constructed asabove satisfies L(M) = L(E).

Proof: Let Op(E) be the total number of ∪, ⋅,and ∗ operations in E. We prove this theoremby induction on Op(E).

Basis: Op(E) = 0. Then E = ∅, ¸, or a ∈ Σ.Then clearly we have L(M) = L(E).

Induction Hypothesis:Assume the claim holds for all E with Op(E) ≤k, for some k ≥ 0.

Induction Step:Consider an arbitrary regular expression Ewith Op(E) = k+1. Since k+1 ≥ 1, E containsat least one operator ∪, ⋅, or ∗.

Case I: E = E1 ∪ E2. Then Op(E1) ≤ kand Op(E2) ≤ k. So, L(M1) = L(E1) and L(M2) =L(E2) by I.H.. We know the construction ofM = M1 ∪ M2 satisfies L(M) = L(M1) ∪ L(M2),and L(E) = L(E1) ∪ L(E2)

105

Therefore, L(M) = L(E).Case II: E = E1E2

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

Case III: E = E∗1

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

In each of the three cases, we have shownthat L(M) = L(E). Therefore this holds forall regular expressions by the principle of in-duction. q.e.d.

106

Finite Automata into Regular Expressions

To prove that every DFA language is regularwe introduce an extension of finite automata.

Definition An extended finite automaton(EFA), M , is a quintuple (Q, Σ, ±, s, f ) whereQ, Σ, s are as in ¸-NFA,f is the only final state, f ∕= s,± : Q×Q → RΣ is a total extended transition

function.

Example of an EFA:

±(p, s) = ∅±(s, f ) = ∅. . . . . .One final state f ∕= s

107

△ A configuration is in QΣ∗

△ Movepx ⊢ qy if

(i) x = wy, w ∈ Σ∗,(ii) ±(p, q) = E, and(iii) w ∈ L(E).

△ ⊢∗, ⊢+ are defined similarly as before.

Lemma If M is a DFA, Then there is an EFAM ′ with L(M ′) = L(M).

Example DFA into EFA.

108

Example:An extended finite Automaton (EFA).M:

Check if the following words are in L(M)

(1) bbabab

(2) aabbc

(3) acccc

(4) aaaaac

109

State Elimination TechniqueGoal of the technique:

i.e.:

(1) EFA has 2 states

Example

110

(2) EFA M has k + 1 states, k ≥ 2.Then eliminate a state from M to form M ′:M :

Note: q ∈ Q− {s, f}Consider all transitions (pi, Eiq, q)

and (q, Eqi, pi)M ′ :

±′(pi, pj) = ±(pi, pj) ∪ ±(pi, q)(±(q, q))∗±(q, pj)

111

Example

b∗a[b ∪ ab∗a]∗

112

Example

113

Summary of the State Elimination Technique(0) Change FA into EFA(1) Add a new start state if the original one

has incoming transitions.(2) Add a new final state if there are more

than one final states originally. Old finalstates become non-final states.

(3) Eliminate the states in Q− {s, f}one by one.

(4) Eliminate the transition ±(f, f ).

114