Language Theory Foundations - Regular Expressions

From Dev Wiki
Revision as of 18:27, 25 September 2019 by Brodriguez (talk | contribs) (Correct identity list format)
Jump to navigation Jump to search

Regular Expressions (known as regex) are a means to specify possible strings within a language.

Regex Operations

Operation Description Symbol Example Regular Expression
Concatenation Combine Strings ab ab
[a-c][de] ad ∪ ae ∪ bd ∪ be ∪ cd ∪ ce
Kleene Star Closure Get Between Zero and Multiple * [ab]* (a ∪ b)*
Disjunction A (a ∪ b)* ∪ A
One or More Get at Least One + [ab]+ (a ∪ b)^+
Zero or One Get One Instance or Empty ? a? (a ∪ λ)
One Character Get One Character from Alphabet . a.a a(a ∪ b)a for Σ={a,b}
n Times Repeat n Times {n} aaaa = a^4
n or More Times Repeat n or More Times {n,} a{4,} aaaa^+
n to m Times Repeat between n and m Times {n, m} a{4, 6} aaaa ∪ aaaaa ∪ aaaaaa


Regex Identities

Recall that ∅ is the empty set. Which is different than an empty string.
Empty string implies that the set has a string, just the string has no symbols in it.
Empty set implies that the set has no elements at all, not even an empty string.

Given arbitrary strings u, v, and w, we can define the following identities:

1) ∅u = u∅ = ∅
2) λu = uλ = u
3) ∅ = λ
4) λ* = λ
5) (u ∪ v) = (v ∪ u)
6) (u ∪ ) = u
7) (u ∪ u) = u
8) u* = (u*)*
9) u(v ∪ w) = (uv ∪ uw)
10) (u ∪ v)w = (uw ∪ vw)
11) (uv)*u = u(vu)*
12.1) (u ∪ v)* = (u* ∪ v)*
12.2) = u*(u ∪ v)*
12.3) = (u ∪ vu*)*
12.4) = (u*v*)*
12.5) = u*(vu*)*
12.6) = (u*v)*u*