Language Theory Foundations - Grammars: Difference between revisions
Brodriguez (talk | contribs) (Create page) |
Brodriguez (talk | contribs) (Add sentential form definition) |
||
(One intermediate revision by the same user not shown) | |||
Line 13: | Line 13: | ||
* '''Production Rules''' ('''P''') - A finite set of rules to manipulate a given string into a final, valid string for the given Language and Grammar. | * '''Production Rules''' ('''P''') - A finite set of rules to manipulate a given string into a final, valid string for the given Language and Grammar. | ||
** In other words, a finite set of rules to substitute one valid symbol for one or more other valid symbols. | ** In other words, a finite set of rules to substitute one valid symbol for one or more other valid symbols. | ||
Note that Terminal Symbols and Non-Terminal Symbols are disjoint/mutually exclusive. That is, <code>V ∪ T = ∅</code> | |||
* '''Derivation''' - The steps taken in a particular instance of forming a string from a Grammar. | * '''Derivation''' - The steps taken in a particular instance of forming a string from a Grammar. | ||
== Formal Grammar Derivation == | == Formal Grammar Derivation == | ||
=== Example 1 === | |||
Let's take a basic Language L(G) = {a^2^n | n ≥ 0 }. | |||
L(G) = {a^2^n | n ≥ 0 } | L(G) = {a^2^n | n ≥ 0 } | ||
Which results in L(G) = {a, aa, aaaa, aaaaaaaa, ...} | Which results in L(G) = {a, aa, aaaa, aaaaaaaa, ...} | ||
Line 53: | Line 56: | ||
⇒ a^4 (Definition of Concatenation) | ⇒ a^4 (Definition of Concatenation) | ||
=== Example 2 === | |||
Let's take | |||
L(G) = { w ∈ {a, b, c}* | n_a(w) = n_b(w) = n_c(w) } | |||
V = {A, B, C, S} | |||
T = {a, b, c} | |||
P = { | |||
S → λ | ABCS (Rule #1) | |||
AB → BA (Rule #2) | |||
AC → CA (Rule #3) | |||
BA → AB (Rule #4) | |||
BC → CB (Rule #5) | |||
CA → AC (Rule #6) | |||
CB → BC (Rule #7) | |||
A → a (Rule #8) | |||
B → b (Rule #9) | |||
C → c (Rule #10) | |||
} | |||
Given this Language and Grammar, we can derive the example string "ccbaba": | |||
S ⇒ ABCS (Using Rule #1) | |||
⇒ ABCABCS (Using Rule #1) | |||
⇒ ABCABCλ (Using Rule #1) | |||
⇒ ABCABC (Definition of λ) | |||
⇒ ACBACB (Using Rule #5) | |||
⇒ CABCAB (Using Rule #3) | |||
⇒ CBACBA (Using Rule #2) | |||
⇒ CBCABA (Using Rule #3) | |||
⇒ CCBABA (Using Rule #5) | |||
⇒ ccBABA (Using Rule #10) | |||
⇒ ccbAbA (Using Rule #9) | |||
⇒ ccbaba (Using Rule #8) | |||
== More on Derivations == | |||
* '''Sentential Form''' - Any string of symbols that can be derived from a finite steps from the start symbol. | |||
* '''Sentence''' - Any string who's symbols are 100% comprised of Terminal Symbols. | |||
* '''Derivation Tree''' - A tree formed by showing all steps (from start to finish) of forming a string from a Grammar. | * '''Derivation Tree''' - A tree formed by showing all steps (from start to finish) of forming a string from a Grammar. | ||
* '''Ambiguous Grammar''' - A Grammar one or more strings, which have two or more associated Derivation Trees. | * '''Ambiguous Grammar''' - A Grammar one or more strings, which have two or more associated Derivation Trees. | ||
* '''Ambiguous Language''' - A language in which all Grammars for it are Ambiguous Grammars. | * '''Ambiguous Language''' - A language in which all Grammars for it are Ambiguous Grammars. | ||
* '''Left-Most Derivation''' - A Derivation in which only the left-most possible symbol changes for each step. | * '''Left-Most Derivation''' - A Derivation in which only the left-most possible symbol changes for each step. Excludes terminal symbols, as those cannot be changed. | ||
* '''Right-Most Derivation''' - A Derivation in which only the right-most possible symbol changes for each step. | * '''Right-Most Derivation''' - A Derivation in which only the right-most possible symbol changes for each step. Excludes terminal symbols, as those cannot be changed. |
Latest revision as of 04:06, 29 September 2019
Formal Grammars
A Formal Grammar is a set of rules for establishing strings in a given Language. Essentially, it's a set of rules that describe how to form valid strings, using the given language's alphabet. A grammar doesn't define the string's meaning, just the form a string can take.
Generally speaking, a Grammar will begin with a Start Symbol, have a separate sets of Terminal Symbols and Non-Terminal Symbols, and finally have a set of Production Rules which indicate what can be produced for any given symbol.
To put it more formally, we can define a grammar G (for given language L) as:
G = (V, T, S, P)
- Non-Terminal Symbols (V) - A non-empty, finite set of symbols for which production rules cannot create more symbols from.
- Terminal Symbols (T) - A non-empty, finite set of symbols for which production rules creates more symbols from.
- When all symbols in a string are "Terminal Symbols", then we have a fully valid string for the given Language and Grammar.
- Start Symbol (S) - A symbol indicating a start point for the "Production Rules" to begin doing work.
- Production Rules (P) - A finite set of rules to manipulate a given string into a final, valid string for the given Language and Grammar.
- In other words, a finite set of rules to substitute one valid symbol for one or more other valid symbols.
Note that Terminal Symbols and Non-Terminal Symbols are disjoint/mutually exclusive. That is, V ∪ T = ∅
- Derivation - The steps taken in a particular instance of forming a string from a Grammar.
Formal Grammar Derivation
Example 1
Let's take a basic Language L(G) = {a^2^n | n ≥ 0 }.
L(G) = {a^2^n | n ≥ 0 } Which results in L(G) = {a, aa, aaaa, aaaaaaaa, ...}
Now let's look at the associated grammar G = (V, T, S, P) such that
V = {S, [, ], A, D} T = {a}
Note that the start symbol, "S", is part of the non-terminal set. So are the characters "[" and "]".
Finally, we can define the rules for this Grammar as:
P = { S → [A] (Rule #1) [ → [D | λ (Rule #2) D] → ] (Rule #3) DA → AAD (Rule #4) ] → λ (Rule #5) A → a (Rule #6) }
Given all that, we can finally derive an example string for our language. In this instance, let's derive a^4.
We always start with the start symbol so:
Start with S S ⇒ [A] (Using Rule #1) ⇒ [DA] (Using Rule #2) ⇒ [AAD] (Using Rule #4) ⇒ [AA] (Using Rule #3) ⇒ [DAA] (Using Rule #2) ⇒ [AADA] (Using Rule #4) ⇒ [AAAAD] (Using Rule #4) ⇒ [AAAA] (Using Rule #3) ⇒ λAAAA] (Using Rule #2) ⇒ λAAAAλ (Using Rule #5) ⇒ AAAA (Definition of λ) ⇒ aaaa (Using Rule #6) ⇒ a^4 (Definition of Concatenation)
Example 2
Let's take
L(G) = { w ∈ {a, b, c}* | n_a(w) = n_b(w) = n_c(w) } V = {A, B, C, S} T = {a, b, c} P = { S → λ | ABCS (Rule #1) AB → BA (Rule #2) AC → CA (Rule #3) BA → AB (Rule #4) BC → CB (Rule #5) CA → AC (Rule #6) CB → BC (Rule #7) A → a (Rule #8) B → b (Rule #9) C → c (Rule #10) }
Given this Language and Grammar, we can derive the example string "ccbaba":
S ⇒ ABCS (Using Rule #1) ⇒ ABCABCS (Using Rule #1) ⇒ ABCABCλ (Using Rule #1) ⇒ ABCABC (Definition of λ) ⇒ ACBACB (Using Rule #5) ⇒ CABCAB (Using Rule #3) ⇒ CBACBA (Using Rule #2) ⇒ CBCABA (Using Rule #3) ⇒ CCBABA (Using Rule #5) ⇒ ccBABA (Using Rule #10) ⇒ ccbAbA (Using Rule #9) ⇒ ccbaba (Using Rule #8)
More on Derivations
- Sentential Form - Any string of symbols that can be derived from a finite steps from the start symbol.
- Sentence - Any string who's symbols are 100% comprised of Terminal Symbols.
- Derivation Tree - A tree formed by showing all steps (from start to finish) of forming a string from a Grammar.
- Ambiguous Grammar - A Grammar one or more strings, which have two or more associated Derivation Trees.
- Ambiguous Language - A language in which all Grammars for it are Ambiguous Grammars.
- Left-Most Derivation - A Derivation in which only the left-most possible symbol changes for each step. Excludes terminal symbols, as those cannot be changed.
- Right-Most Derivation - A Derivation in which only the right-most possible symbol changes for each step. Excludes terminal symbols, as those cannot be changed.