Operations on languages ·Concatenation: L L2={SiS2I S1E L1 and S2E L2} ·Union -L1UL2={s|s∈L1ors∈L2} ·Exponentiation: -L0={ε}L1=L L2=LL ·Kleene Closure -L-UL i=0 ·Positive Closure -=U2 CS308 Compiler Theory 6
Operations on Languages • Concatenation: – L L = { s s | s ∈ L and s ∈ L } 1L 2 { s1s2 | s1 ∈ L1 and s2 ∈ L2 } • Union – L L { | L L } 1 ∪ L 2 = { s| s ∈ L1 or s ∈ L2 } • Exponentiation: – L 0 = { ε} L1 = L L 2 = LL • Kleene Closure Kleene Closure – L* = U ∞ i = 0 i L • Positive Closure – L + = U ∞ i L 6 i =1 CS308 Compiler Theory
Example ·L1={a,b,c,d} L2={1,2} ·L1L2={al,a2,b1,b2,c1,c2,d1,d2} ·L1UL2={a,b,c,d,1,2} L3 all strings with length three (using a,b,c,d) L*=all strings using letters a,b,c,d and empty string L=doesn't include the empty string CS308 Compiler Theory 7
Example • L1 = {a,b,c,d} L2 = {1,2} • L1L2 = {a1,a2,b1,b2,c1,c2,d1,d2} • L1 ∪ L2 = {a,b,c,d,1,2} • L13 = all strings with length three (using a,b,c,d} • L1* = all strings using letters a,b,c,d and empty string • L1+ = doesn’t include the empty string 7 1 py g CS308 Compiler Theory
Regular Expressions We use regular expressions to describe tokens of a programming language. A regular expression is built up of simpler regular expressions(using defining rules) Each regular expression denotes a language. A language denoted by a regular expression is called as a regular set. CS308 Compiler Theory 8
Regular Expressions • We use regular expressions to describe tokens of a programming language. • A regular expression is built up of simpler regular expressions (using d fi i l ) d efi n ing ru les ) • Each regular expression denotes a language. • A l d t d b l i i ll d A language deno t e d by a regu lar express ion is call e d as a regul t ar se t. CS308 Compiler Theory 8
Regular Expressions (Rules) Regular expressions over alphabet Reg.Expr Language it denotes 8 {ε} a∈∑ {a} ()|(2) L()UL(2) (1)(2) L(r1)L(2) (r) (L(r) (r) Lr) ·(r)=(r)r)* ·(r)?=()|E CS308 Compiler Theory 9
Regular Expressions (Rules) Regular expressions over alphabet Σ Reg. Expr Language it denotes ε { ε } a∈ Σ {a} (r ) | (r ) L(r ) ∪ L(r ) 1) | (r 2 ) L(r1) ∪ L(r 2 ) (r1) (r 2) L(r1) L(r 2 ) (r) * (L(r)) * (r) L(r) • (r) + = (r)(r) * • ( r )? = ( r ) | ε 9 ( ) ()| CS308 Compiler Theory
Regular Expressions (cont.) We may remove parentheses by using precedence rules. 一米 highest -concatenation next -1 lowest ·ab*lc means (a(b))(c) 。EX: -∑={0,1} -01=>{0,1} -(011)011)=>{00,01,10,11} -0*=>{ε,0,00,000,0000…} -(01)*=>all strings with 0 and 1,including the empty string CS308 Compiler Theory 10
Regular Expressions (cont.) • We may remove parentheses by using precedence rules. – * highest – concatenation next – | lowest • ab *|c means (a(b) *)|(c) • Ex: – Σ = {0,1} – 0|1 => {0,1} – (0|1)(0|1) => {00,01,10,11} – 0* – 0 => { ε ,0 00 000 0000 } 0,00,000,0000,.... } – (0|1)* => all strings with 0 and 1, including the empty string CS308 Compiler Theory 10