Automata Theory Archives
https://www.theoryofcomputation.co/category/finite-automata/
Science of Computer
Sat, 27 Jul 2019 08:51:42 +0000

Every regular expression describes regular language
https://www.theoryofcomputation.co/regular-expression-describes-regular-language/
Sat, 27 Jul 2019 08:51:39 +0000

Every regular expression describes a regular language. Let R be an arbitrary regular expression over the alphabet Σ. We will prove that the language described by R is a regular language. The proof is by induction on the structure of R.

The first base case of the induction: Assume that R = ε. Then R describes the language {ε}. In order to prove that this language is regular, it suffices, by Theorem 1 below, to construct an NFA that accepts it.

Theorem 1: Let A be a language. Then A is regular if and only if there exists a nondeterministic finite automaton that accepts A.

So we construct an NFA M = (Q, Σ, δ, q, F) that accepts this language. This NFA is obtained by defining Q = {q}, q is the start state, F = {q}, and δ(q, a) = ∅ for all a ∈ Σε, where Σε = Σ ∪ {ε}. The figure below gives the state diagram of M:

Show the start and final state of NFA

The second base case: Assume that R = ∅. Then R describes the empty language ∅. By Theorem 1, to prove that this language is regular it suffices to construct an NFA that accepts it.

So we construct an NFA M = (Q, Σ, δ, q, F) that accepts this language. This NFA is obtained by defining Q = {q}, q is the start state, F = ∅ (that is, there is no accept state), and δ(q, a) = ∅ for all a ∈ Σε. The figure below gives the state diagram of M:

Start state of Non Deterministic Finite Automata

The third base case: Let a ∈ Σ and assume that R = a. Then R describes the language {a}. By Theorem 1, to prove that this language is regular it suffices to construct an NFA that accepts it.

So we construct an NFA M = (Q, Σ, δ, q1, F) that accepts this language. This NFA is obtained by defining Q = {q1, q2}, q1 is the start state, F = {q2}, and

δ(q1, a) = {q2},

δ(q1, b) = ∅ for all b ∈ Σε \ {a},

δ(q2, b) = ∅ for all b ∈ Σε.

The figure below gives the state diagram of M:

NFA state diagram with input

The first case of the induction step: Assume that R = R1 ∪ R2, where R1 and R2 are regular expressions. Let L1 and L2 be the languages described by R1 and R2, respectively, and assume that L1 and L2 are regular. Then R describes the language L1 ∪ L2, which, by Theorem 2, is regular.

Theorem 2: The set of regular languages is closed under the union operation, i.e., if A1 and A2 are regular languages over the same alphabet Σ, then A1 ∪ A2 is also a regular language.

The second case of the induction step: Assume that R = R1R2, where R1 and R2 are regular expressions. Let L1 and L2 be the languages described by R1 and R2, respectively, and assume that L1 and L2 are regular. Then R describes the language L1L2, which, by Theorem 3, is regular.

Theorem 3: The set of regular languages is closed under the concatenation operation, i.e., if A1 and A2  are regular languages over the same alphabet Σ , then A1A2 is also a regular language.

The third case of the induction step: Assume that R = (R1)*, where R1 is a regular expression. Let L1 be the language described by R1 and assume that L1 is regular. Then R describes the language (L1)*, which, by Theorem 4, is regular.

Theorem 4: The set of regular languages is closed under the star (Kleene) operation, i.e., if A is a regular language, then A* is  also a regular language.

This concludes the proof of the claim that every regular expression describes a regular language.
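
The induction above can be read as a recipe for building an NFA from a regular expression. The following is a minimal Python sketch of that idea, not the construction used in the proofs of Theorems 2 to 4 (which are not shown here): it glues sub-automata together with ε-transitions. The tuple encoding of expressions, the names NFA, build and accepts, and the use of the empty string as the ε label are all assumptions made for illustration.

```python
from itertools import count

EPS = ""            # label for ε-transitions (a convention of this sketch)
_fresh = count()    # supplies state names that are never reused


class NFA:
    def __init__(self, start, accept):
        self.start, self.accept, self.delta = start, set(accept), {}

    def add(self, p, symbol, q):
        self.delta.setdefault((p, symbol), set()).add(q)

    def absorb(self, other):
        # Copy all transitions of a sub-automaton into this one.
        for (p, symbol), targets in other.delta.items():
            for q in targets:
                self.add(p, symbol, q)


def build(r):
    """Build an NFA for a regular expression given as a nested tuple."""
    kind = r[0]
    if kind == "eps":                      # R = ε: one state, accepting
        q = next(_fresh)
        return NFA(q, {q})
    if kind == "empty":                    # R = ∅: one state, F = ∅
        return NFA(next(_fresh), set())
    if kind == "sym":                      # R = a: q1 --a--> q2
        q1, q2 = next(_fresh), next(_fresh)
        m = NFA(q1, {q2})
        m.add(q1, r[1], q2)
        return m
    if kind == "union":                    # R = R1 ∪ R2
        m1, m2 = build(r[1]), build(r[2])
        m = NFA(next(_fresh), m1.accept | m2.accept)
        m.absorb(m1)
        m.absorb(m2)
        m.add(m.start, EPS, m1.start)
        m.add(m.start, EPS, m2.start)
        return m
    if kind == "cat":                      # R = R1 R2
        m1, m2 = build(r[1]), build(r[2])
        m = NFA(m1.start, m2.accept)
        m.absorb(m1)
        m.absorb(m2)
        for f in m1.accept:
            m.add(f, EPS, m2.start)
        return m
    if kind == "star":                     # R = (R1)*
        m1 = build(r[1])
        s = next(_fresh)
        m = NFA(s, m1.accept | {s})
        m.absorb(m1)
        m.add(s, EPS, m1.start)
        for f in m1.accept:
            m.add(f, EPS, m1.start)
        return m
    raise ValueError(f"unknown node {kind!r}")


def accepts(m, w):
    """Simulate the NFA on w by tracking the set of reachable states."""
    def closure(states):
        todo, seen = list(states), set(states)
        while todo:
            for q in m.delta.get((todo.pop(), EPS), ()):
                if q not in seen:
                    seen.add(q)
                    todo.append(q)
        return seen

    current = closure({m.start})
    for ch in w:
        nxt = set()
        for p in current:
            nxt |= m.delta.get((p, ch), set())
        current = closure(nxt)
    return bool(current & m.accept)


# (a ∪ b)* a : all strings over {a, b} that end in a
ends_in_a = build(("cat", ("star", ("union", ("sym", "a"), ("sym", "b"))), ("sym", "a")))
print(accepts(ends_in_a, "abba"), accepts(ends_in_a, "ab"))
```

Each case of build mirrors one case of the induction: three base cases and three induction steps.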

Read: Regular Language in Automata Theory

Turing Machine Definition
https://www.theoryofcomputation.co/turing-machine/
Sat, 22 Dec 2018 19:19:28 +0000

Definition of a Turing Machine

We start with an informal description of a Turing Machine. Such a machine consists of the following:

  1. There are k tapes, for some fixed k ≥ 1. Each tape is divided into cells, and is infinite both to the left and to the right. Each cell stores a symbol belonging to a finite set Γ, which is called the tape alphabet. The tape alphabet contains the blank symbol Δ. If a cell contains Δ, then this means that the cell is actually empty.
    [Figure: A Turing machine with k = 2 tapes]
  2. Each tape has a tape head which can move along the tape, one cell per move. It can also read the cell it currently scans and replace the symbol in this cell by another symbol.
  3. There is a state control, which can be in any one of a finite number of states. The finite set of states is denoted by Q. The set Q contains three special states: a start state, an accept state, and a reject state.

The Turing machine performs a sequence of computation steps. In one such step, it does the following:

  1. Immediately before the computation step, the Turing machine is in a state r of Q, and each of the k tape heads is on a certain cell.
  2. Depending on the current state r and the k symbols that are read by the tape heads, 
    1. the Turing machine switches to a state r’ of Q (which may be equal to r)
    2. each tape head writes a symbol of Γ in the cell it is currently scanning (this symbol may be equal to the symbol currently stored in the cell), and
    3. each tape head either moves one cell to the left, moves one cell to the right, or stays at the current cell.

We now give a formal definition of a deterministic Turing machine.

Definition: A deterministic Turing machine is a 7-tuple

M = (Σ, Γ, Q, δ, q, qaccept, qreject),

where

  1. Σ is a finite set, called the input alphabet; the blank symbol Δ is not contained in Σ, 
  2. Γ is a finite set, called the tape alphabet; this alphabet contains the blank symbol Δ, and Σ ⊆ Γ,
  3. Q is a finite set, whose elements are called states, 
  4. q is an element of Q; it is called the start state,
  5. qaccept is an element of Q; it is called the accept state,
  6. qreject is an element of Q; it is called the reject state,
  7. δ is called the transition function, which is a function

δ : Q × Γ^k → Q × Γ^k × {L, R, N}^k.

The transition function δ is basically the “program” of the Turing machine. This function tells us what the machine can do in “one computation step”: Let r ∈ Q, and let a1, a2, …, ak ∈ Γ. Furthermore, let r’ ∈ Q, a’1, a’2, …, a’k ∈ Γ, and σ1, σ2, …, σk ∈ {L, R, N} be such that

δ(r, a1, a2, …, ak) = (r’, a’1, a’2, …, a’k, σ1, σ2, …, σk).

This transition means that if

  • the Turing machine is in state r, and
  • the head of the i-th tape reads the symbol ai, 1 ≤ i ≤ k,

then

  • the Turing machine switches to state r’,
  • the head of the i-th tape replaces the scanned symbol ai by the symbol a’i, 1 ≤ i ≤ k, and 
  • the head of the i-th tape moves according to σi, 1 ≤ i ≤ k: if σi = L, then the tape head moves one cell to the left; if σi = R, then it moves one cell to the right; if σi = N, then the tape head does not move.

We will write the computation step in the form of the instruction 

r a1 a2 … ak → r’ a’1 a’2 … a’k σ1 σ2 … σk
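
To make the meaning of one instruction concrete, here is a small Python sketch of a single computation step for a k-tape machine, followed by a toy one-tape example. The dictionary representation of δ and of the tapes, the helper names step and run, and the convention that the input starts at cell 0 under the head of the first tape are assumptions of this sketch, not part of the definition above.

```python
BLANK = "Δ"
MOVES = {"L": -1, "R": +1, "N": 0}


def step(delta, state, tapes, heads):
    """Perform one computation step of a k-tape deterministic Turing machine.

    delta maps (r, a1, ..., ak) to (r', a'1, ..., a'k, σ1, ..., σk).
    tapes is a list of k dicts (cell index -> symbol); missing cells are blank.
    heads is a list of k head positions. Both lists are updated in place.
    """
    k = len(tapes)
    read = tuple(tapes[i].get(heads[i], BLANK) for i in range(k))
    out = delta[(state, *read)]
    new_state, written, moves = out[0], out[1:1 + k], out[1 + k:]
    for i in range(k):
        tapes[i][heads[i]] = written[i]      # replace the scanned symbol
        heads[i] += MOVES[moves[i]]          # move left, right, or stay
    return new_state


def run(delta, start, word, accept="qaccept", reject="qreject", max_steps=10_000):
    """Run a one-tape machine (k = 1) on word; return (final state, tape content)."""
    tapes, heads, state = [dict(enumerate(word))], [0], start
    for _ in range(max_steps):
        if state in (accept, reject):
            break
        state = step(delta, state, tapes, heads)
    cells = tapes[0]
    return state, "".join(cells[i] for i in sorted(cells)).strip(BLANK)


# Toy machine: overwrite every symbol with 1, then accept at the first blank.
delta = {
    ("q", "0"): ("q", "1", "R"),
    ("q", "1"): ("q", "1", "R"),
    ("q", BLANK): ("qaccept", BLANK, "N"),
}
print(run(delta, "q", "0010"))
```

The instruction q 0 → q 1 R in the notation above corresponds to the first entry of the dictionary.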

We now specify the computation of the Turing Machine

M = (Σ, Γ, Q, δ, q, qaccept, qreject).

Context Sensitive Grammar and Linear Bounded Automata
https://www.theoryofcomputation.co/context-sensitive-grammar-and-linear-bounded-automata/
Fri, 21 Sep 2018 18:15:41 +0000

A context sensitive grammar (CSG) is a grammar in which all productions are of the form αAβ → αγβ, where γ ≠ ε.

During a derivation, the non-terminal A is replaced by γ only when it appears in the context of α and β.

Note the constraint that the replacement string γ ≠ ε; as a consequence, α ⇒ β implies |α| ≤ |β|.

Hence a CSG is a noncontracting grammar.

Formal Definition of Context Sensitive Grammar

A context sensitive grammar G = ( N, Σ, P, S), where

  • N is a set of non-terminal symbols
  • Σ is a set of terminal symbols
  • S is the start symbol, and
  • P is a set of production rules of the form αAβ → αγβ, where A ∈ N, α, β ∈ (N ∪ Σ)* and γ ∈ (N ∪ Σ)+

The production S → ε is also allowed if S is the start symbol and it does not appear on the right side of any production.
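
As a quick illustration of the production format, the following Python sketch checks whether a single production has the shape αAβ → αγβ with γ ≠ ε. It assumes, purely for illustration, that every grammar symbol is one character and that the set of non-terminals is passed in explicitly; the function name is hypothetical, and the special rule S → ε is not handled.

```python
def is_context_sensitive(lhs, rhs, nonterminals):
    """True if lhs -> rhs can be read as αAβ -> αγβ with A a non-terminal and γ ≠ ε."""
    for i, symbol in enumerate(lhs):
        if symbol not in nonterminals:
            continue
        alpha, beta = lhs[:i], lhs[i + 1:]
        # γ is what remains of rhs after removing the prefix α and the suffix β,
        # so rhs must be strictly longer than α and β together.
        long_enough = len(rhs) >= len(alpha) + len(beta) + 1
        if long_enough and rhs.startswith(alpha) and rhs.endswith(beta):
            return True
    return False


N = {"A", "B", "C"}
print(is_context_sensitive("aAb", "aBCb", N))   # A is rewritten to BC between a and b
print(is_context_sensitive("AB", "BA", N))      # noncontracting, but not of the form αAβ -> αγβ
```

The second call shows that a rule can satisfy |lhs| ≤ |rhs| without having the context-sensitive shape.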

Linear Bounded Automata

A Linear Bounded Automaton (LBA) is a single-tape Turing Machine with two special tape symbols: a left marker < and a right marker >.

The transitions should satisfy these conditions:

  • It should not replace the marker symbols by any other symbol.
  • It should not write on cells beyond the marker symbols.

Thus the initial configuration will be: < q0 a1 a2 a3 a4 a5 … an >

Read Also: Definition of Pushdown Automata

Formal Definition

Formally, a Linear Bounded Automaton is a non-deterministic Turing Machine M = (Q, Σ, Γ, δ, ε, q0, <, >, t, r), where

  • Q is the set of all states
  • Σ is the set of all terminals (the input alphabet)
  • Γ is the set of all tape symbols, Σ ⊂ Γ
  • δ is the set of transitions
  • ε is the blank symbol
  • q0 is the start state
  • < is the left marker and > is the right marker
  • t is the accept state
  • r is the reject state

Regular Language in Automata Theory
https://www.theoryofcomputation.co/regular-language-in-automata-thoery/
Thu, 20 Sep 2018 17:22:27 +0000

Regular Languages: A language is regular if it can be expressed in terms of a regular expression.

Closure Properties of Regular Languages

Union: If L1 and L2 are two regular languages, their union L1 ∪ L2 will also be regular. For example,
L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1 ∪ L2 = {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} is also regular.
Intersection: If L1 and L2 are two regular languages, their intersection L1 ∩ L2 will also be regular. For example,
L1 = {a^m b^n | n ≥ 0 and m ≥ 0} and L2 = {a^m b^n | n ≥ 0 and m ≥ 0} ∪ {b^n a^m | n ≥ 0 and m ≥ 0}
L3 = L1 ∩ L2 = {a^m b^n | n ≥ 0 and m ≥ 0} is also regular.
Concatenation: If L1 and L2 are two regular languages, their concatenation L1.L2 will also be regular. For example,
L1 = {a^n | n ≥ 0} and L2 = {b^n | n ≥ 0}
L3 = L1.L2 = {a^m . b^n | m ≥ 0 and n ≥ 0} is also regular.
Kleene Closure: If L1 is a regular language, its Kleene closure L1* will also be regular. For example,
L1 = (a ∪ b)
L1* = (a ∪ b)*
Complement: If L(G) is a regular language, its complement L’(G) will also be regular. The complement of a language is found by removing the strings which are in L(G) from the set of all possible strings over the alphabet. For example, over the alphabet {a},
L(G) = {a^n | n > 3}
L’(G) = {a^n | n ≤ 3}

Note: Two regular expressions are equivalent if the languages generated by them are the same. For example, (a+b*)* and (a+b)* generate the same language: every string which is generated by (a+b*)* is also generated by (a+b)* and vice versa.

Read Also: The Pumping Lemma for Context-Free Languages

How to solve problems on regular expression and regular languages?

Question 1: Which one of the following languages over the alphabet {0,1} is described by the regular expression
(0+1)*0(0+1)*0(0+1)* ?
(A) The set of all strings containing the substring 00.
(B) The set of all strings containing at most two 0’s.
(C) The set of all strings containing at least two 0’s.
(D) The set of all strings that begin and end with either 0 or 1.

Solution:
Option (A) says that every string must contain the substring 00. But 10101 is in the language and does not contain 00 as a substring. So (A) is not correct.
Option (B) says that a string can have at most two 0’s, but 00000 is also in the language. So (B) is not correct.
Option (C) says that every string must contain at least two 0’s. The regular expression contains two explicit 0’s, and each (0+1)* around them matches any string, so it describes exactly the strings with at least two 0’s. So (C) is correct.
Option (D) describes every non-empty string, since every such string begins and ends with either 0 or 1. But the regular expression cannot generate strings such as 1 or 01, which contain fewer than two 0’s. So (D) is not correct.
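
The answer can be sanity-checked by brute force. The sketch below uses Python's re module, where union is written | rather than +, and compares the regular expression with the condition "at least two 0's" on every string up to an arbitrarily chosen length of 8. This is a finite check, not a proof; the helper name all_strings is made up for this example.

```python
import re
from itertools import product

# (0+1)*0(0+1)*0(0+1)* in Python syntax
PATTERN = re.compile(r"(0|1)*0(0|1)*0(0|1)*")


def all_strings(alphabet, max_len):
    """Yield every string over alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


# Strings on which the regex and option (C) disagree.
mismatches = [w for w in all_strings("01", 8)
              if bool(PATTERN.fullmatch(w)) != (w.count("0") >= 2)]
print(mismatches)
```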

Question 2: Which of the following languages is generated by the given grammar?
S -> aS | bS | ε
(A) {a^n b^m | n, m ≥ 0}
(B) {w ∈ {a,b}* | w has an equal number of a’s and b’s}
(C) {a^n | n ≥ 0} ∪ {b^n | n ≥ 0} ∪ {a^n b^n | n ≥ 0}
(D) {a,b}*

Solution: Option (A) says that every string has 0 or more a’s followed by 0 or more b’s. But S => bS => baS => ba is also in the language. So (A) is not correct.
Option (B) says that every string has an equal number of a’s and b’s. But S => bS => b is also in the language. So (B) is not correct.
Option (C) says that every string consists only of a’s, or only of b’s, or of a’s followed by the same number of b’s. But as shown for option (A), ba is also in the language. So (C) is not correct.
Option (D) says that a string can have any number of a’s and any number of b’s in any order. So (D) is correct.
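
A small enumeration makes option (D) plausible: the Python sketch below applies the three rules of the grammar in every possible way and collects all derived strings up to a chosen length. The function name generate and the length bound are assumptions for illustration; comparing against every string over {a, b} of the same length is only a finite check.

```python
from itertools import product


def generate(max_len):
    """Strings of length <= max_len derivable from S -> aS | bS | ε."""
    results = set()
    frontier = [""]                      # w stands for the sentential form wS
    while frontier:
        w = frontier.pop()
        results.add(w)                   # apply S -> ε
        if len(w) < max_len:
            frontier.append(w + "a")     # apply S -> aS
            frontier.append(w + "b")     # apply S -> bS
    return results


everything = {"".join(t) for n in range(6) for t in product("ab", repeat=n)}
print(generate(5) == everything, "ba" in generate(5))
```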

Question 3 : The regular expression 0*(10*)* denotes the same set as
(A) (1*0)*1*
(B) 0 + (0 + 10)*
(C) (0 + 1)* 10(0 + 1)*
(D) none of these

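Solution:
Two regular expressions are equivalent if the languages generated by them are the same.
The expression 0*(10*)* generates every string over {0,1}: any string can be split into a leading block of 0’s followed by blocks that each consist of a 1 and the 0’s after it. So it denotes the same set as (0 + 1)*.
Option (A) also generates every string over {0,1}: any string can be split into blocks of 1’s that each end with a single 0, followed by a trailing block of 1’s. So (A) is equivalent.
Option (B) cannot generate the string 1, but 0*(10*)* can. So they are not equivalent.
Option (C) generates only strings that have 10 as a substring, but 0*(10*)* also generates strings without it, such as ε and 01. So they are not equivalent.
So (A) is correct.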
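
One way to test such equivalences is to compare the languages on all short strings. The Python sketch below does this with the re module (with + written as |) for strings up to length 6; the helper name language and the length bound are arbitrary choices for this illustration. Agreement on short strings does not prove equivalence, but a single disagreement disproves it.

```python
import re
from itertools import product


def language(pattern, max_len=6):
    """Set of all strings over {0,1} of length <= max_len matched by pattern."""
    rx = re.compile(pattern)
    words = ("".join(t) for n in range(max_len + 1) for t in product("01", repeat=n))
    return {w for w in words if rx.fullmatch(w)}


target = language(r"0*(10*)*")
options = {
    "A": r"(1*0)*1*",
    "B": r"0|(0|10)*",
    "C": r"(0|1)*10(0|1)*",
}
same = {name: language(p) == target for name, p in options.items()}
print(same)
```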

What is Chomsky Hierarchy in Theory of Computation
https://www.theoryofcomputation.co/what-is-chomsky-hierarchy-in-theory-of-computation/
Wed, 19 Sep 2018 16:54:42 +0000

What is Chomsky Hierarchy?

Noam Chomsky categorised regular and other languages into a hierarchy, which is called the Chomsky Hierarchy.

Language Class    Grammar                   Automaton
3                 Regular                   NFA or DFA
2                 Context-Free              Push-Down Automaton
1                 Context-Sensitive         Linear-Bounded Automaton
0                 Unrestricted (or Free)    Turing Machine


This is a hierarchy, so every language of type 3 is also of types 2, 1 and 0; every language of type 2 is also of types 1 and 0 etc.

The distinction between languages can be seen by examining the structure of the production rules of their corresponding grammar, or the nature of the automata which can be used to identify them.

Type 3 – Regular Languages

A regular language is one which can be represented by a regular grammar, described using a regular expression, or accepted using an NFA or a DFA.

Type 2 – Context-Free Languages

A Context-Free Grammar (CFG) is one whose production rules are of the form: A -> α , where A is any single non-terminal, and α is any combination of terminals and non-terminals.

An NFA/DFA cannot, in general, recognise strings from this type of language, since we must be able to “remember” an unbounded amount of information somehow. Instead we use a Push-Down Automaton, which is like an NFA except that we are also allowed to use a stack.

Type 1 – Context-Sensitive Languages

Context-Sensitive grammars may have more than one symbol on the left-hand-side of their production rules (provided that at least one of them is a non-terminal). However, the production rules must now obey the following:

CS1
The number of symbols on the left-hand-side must not exceed the number of symbols on the right-hand-side
CS2
We do not allow rules of the form A → ε unless A is the start symbol and does not occur on the right-hand-side of any rule.

Since we allow more than one symbol on the left-hand-side, we refer to those symbols other than the one we are replacing as the context of the replacement.

The automaton which recognises a context-sensitive language is called a linear-bounded automaton: this is basically an NFA/DFA which can store symbols in a list whose length is bounded by the length of the input.

Conditions CS1 and CS2 above mean that the sentential form in any derivation never decreases in length when a production rule is applied. This basically means that the size of a sentential form is bounded by the length of the sentence (i.e. word) we are deriving.

Since the sentential form cannot thus grow infinitely large before deriving a sentence, a linear-bounded automaton always uses a finitely-long list as its store.

Type 0 – Unrestricted (Free) Languages

Free grammars have absolutely no restrictions on their grammar rules, (except, of course, that there must be at least one non-terminal on the left-hand-side).

The type of automaton which can recognise such a language is basically an NFA/DFA with an infinitely-long list at its disposal to use as a store; this is called a Turing machine.
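
Since the four types are distinguished by the shape of their production rules, the classification can be sketched as a small Python function. The conventions here are assumptions made for illustration only: every symbol is one character, upper-case letters are non-terminals, ε is the empty string, and "type 3" is taken to mean right-linear rules of the form A -> w or A -> wB. Condition CS2 on ε-rules is not checked.

```python
def production_type(lhs, rhs):
    """Most restrictive Chomsky type (3, 2, 1 or 0) whose rule shape lhs -> rhs fits."""
    if not any(c.isupper() for c in lhs):
        raise ValueError("left-hand side needs at least one non-terminal")
    single = len(lhs) == 1
    # Right-linear: at most one non-terminal, and only as the last symbol.
    body = rhs[:-1] if rhs[-1:].isupper() else rhs
    if single and not any(c.isupper() for c in body):
        return 3
    if single:                     # A -> α with a single non-terminal on the left
        return 2
    if len(lhs) <= len(rhs):       # CS1: the left side is not longer than the right
        return 1
    return 0


def grammar_type(productions):
    """A grammar is only as restricted as its least restricted production."""
    return min(production_type(lhs, rhs) for lhs, rhs in productions)


print(grammar_type([("S", "aS"), ("S", "b")]))                     # right-linear
print(grammar_type([("S", "aSb"), ("S", "")]))                     # context-free
print(grammar_type([("S", "aSBc"), ("cB", "Bc"), ("bB", "bb")]))   # context-sensitive
print(grammar_type([("AB", "a")]))                                 # unrestricted
```

Note that this classifies a grammar, not a language: the same language may also have a grammar of a more restricted type.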

 

The Pumping Lemma for Context-Free Languages
https://www.theoryofcomputation.co/the-pumping-lemma-for-context-free-languages/
Mon, 10 Sep 2018 18:57:45 +0000

The Pumping Lemma for Context-Free Languages (CFL)

Showing that a language is context-free can be done by finding a context-free grammar that describes it. Proving that a language is not context-free requires a different proof technique, and the one most commonly used is the Pumping Lemma for Context-Free Languages.

Theorem
The pumping lemma for context-free languages states that if a language L is
context-free, there exists some integer length p ≥ 1 such that every string s ∈ L
with a length of p or more symbols, |s| ≥ p, can be written as s = uvwxy where
u, v, w, x and y are substrings of s such that:
    • |vwx| ≤ p
    • |vx| ≥ 1
    • u v^n w x^n y ∈ L for all n ≥ 0

All context-free languages are “pumpable” meaning that the pumping lemma constraints hold true for all context-free languages. If a language is not pumpable, then it is not a context-free language. However, if a language is pumpable, it is not necessarily a context-free language. Because the set of regular languages is contained in the set of context-free languages, all regular languages must be pumpable too.

Essentially, the pumping lemma holds that every sufficiently long string in the language can be pumped without ever producing a new string that is not in the language.

To prove that a language L is not context-free, use proof by contradiction and the pumping lemma. Assume that L is context-free, choose a string s in L of length at least p, and show that every way of splitting s that satisfies the first two constraints listed above violates the third.

Basically, the idea behind the pumping lemma for context-free languages is that there are certain constraints a language must adhere to in order to be a context-free language. You can use the pumping lemma to test if all of these constraints hold for a particular language, and if they do not, you can prove with contradiction that the language is not context-free.

Example

Use the Pumping Lemma to prove that L = {a^n b^n c^n | n > 0} is not a context-free language.

Assume, for the sake of contradiction, that L = {a^n b^n c^n | n > 0} is a context-free
language. By the pumping lemma, there exists an integer pumping length p for L.
We need a string s whose length is at least p. Certainly
s = a^p b^p c^p has length 3p ≥ p, so we choose this as the string s. This s is in L since
it has p a's, p b's and p c's.

Now by the pumping lemma, |vwx| ≤ p, so vwx cannot contain a's, b's and c's at the same time. There are five possible places in the string that
we can assign to be vwx:
    • vwx = a^j for some j ≤ p. This means that vwx is contained purely in the a’s section.
    • vwx = a^j b^k for some j and k where j + k ≤ p. This means that the vwx segment is contained somewhere in the a’s and b’s section.
    • vwx = b^j for some j ≤ p. This means that vwx is contained purely in the b’s section.
    • vwx = b^j c^k for some j and k where j + k ≤ p. This means that the vwx segment is contained somewhere in the b’s and c’s section.
    • vwx = c^j for some j ≤ p. This means that vwx is contained purely in the c’s section.

In any of these five cases, we can easily verify that the third constraint of the pumping lemma, that u v^n w x^n y ∈ L for all n ≥ 0, does not hold. In other words, for any of these five choices of vwx, the string cannot be pumped in a way that results in a string that has an equal number of a’s, b’s and c’s (the definition of the language L).

Read Also: Context Free Languages

Let’s take a short example string described by a^5 b^5 c^5 = aaaaabbbbbccccc and p = 3.

In the first case, there will be more a’s than there are b’s and c’s, making the resulting pumped string not a member of L. For example, if v and x together contain three a’s and we repeat them once more, we get the string aaaaaaaabbbbbccccc: a string with 8 a’s, 5 b’s and 5 c’s. Clearly this is not in the language. A similar argument works for the third and fifth cases: just pump the b and c region, respectively, and the results will be symmetrical.

For the second and fourth cases, we do something similar. If we pump anywhere in the a and b region only, the resulting string has more a’s or b’s than c’s (the second case); if we pump in the b and c region only, it has more b’s or c’s than a’s (the fourth case). For the second case, if we take a^5 b^5 c^5 = aaaaabbbbbccccc and p = 3 and pump the last a in the a section and the first two b’s in the b section, we get this string: aaaaaabbbbbbbccccc, a string with six a’s, seven b’s, and five c’s. The fourth case has a symmetrical example.
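
The case analysis can be double-checked mechanically for small p. The Python sketch below tries every split s = uvwxy with |vwx| ≤ p and |vx| ≥ 1 and reports whether any of them survives pumping for n = 0, 1, 2, 3. The function names and the bound max_n are assumptions of this sketch; testing only a few values of n is enough to refute a split, but a reported success would not by itself prove that a split works for all n.

```python
def in_abc(s):
    """Membership test for L = { a^n b^n c^n | n > 0 }."""
    n = len(s) // 3
    return n > 0 and s == "a" * n + "b" * n + "c" * n


def in_ab(s):
    """Membership test for { a^n b^n | n > 0 }, used as a positive control."""
    n = len(s) // 2
    return n > 0 and s == "a" * n + "b" * n


def has_pumpable_split(s, p, member, max_n=3):
    """True if some split s = uvwxy with |vwx| <= p and |vx| >= 1
    keeps u v^n w x^n y in the language for every n in 0..max_n."""
    for i in range(len(s) + 1):                    # u = s[:i]
        end = min(i + p, len(s))                   # vwx must end by here
        for j in range(i, end + 1):                # v = s[i:j]
            for k in range(j, end + 1):            # w = s[j:k]
                for m in range(k, end + 1):        # x = s[k:m]
                    u, v, w, x, y = s[:i], s[i:j], s[j:k], s[k:m], s[m:]
                    if not v and not x:
                        continue                   # violates |vx| >= 1
                    if all(member(u + v * n + w + x * n + y)
                           for n in range(max_n + 1)):
                        return True
    return False


for p in range(1, 5):
    s = "a" * p + "b" * p + "c" * p
    print(p, has_pumpable_split(s, p, in_abc))
```

For comparison, has_pumpable_split("aabb", 2, in_ab) finds the split v = a, x = b around the middle of the string, as expected for a context-free language.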

Context Free Languages
https://www.theoryofcomputation.co/context-free-languages/
Thu, 06 Sep 2018 19:18:28 +0000

Context-free languages (CFLs) are generated by context-free grammars. The set of all context-free languages is identical to the set of languages accepted by pushdown automata, and the set of regular languages is a subset of context-free languages.

An input string is accepted by a computational model if the model, run on that string, ends in an accepting final state. All regular languages are context-free languages, but not all context-free languages are regular. Most arithmetic expressions are generated by context-free grammars, and the languages they form are therefore context-free.

Context-free languages and context-free grammars have applications in computer science and linguistics such as natural language processing and computer language design.


Context Free Languages Definition

In formal language theory, a language is defined as a set of strings of symbols that may be constrained by specific rules. Similarly, the written English language is made up of groups of letters (words) separated by spaces. A valid (accepted) sentence in the language must follow particular rules, the grammar.

A context-free language is a language generated by a context-free grammar. Context-free languages are more general than, and include, the regular languages. The same context-free language might be generated by multiple context-free grammars.

The set of all context-free languages is identical to the set of languages that are accepted by pushdown  automata (PDA).

Here is an example of a language that is not regular (proof here) but is context-free:

{a^n b^n | n ≥ 0}. This is the language of all strings that consist of some number of a’s followed by the same number of b’s.

In this notation, a^4 b^4 can be expanded out to aaaabbbb, where there are four a’s and then four b’s. (So this isn’t exponentiation, though the notation is similar.)

Read Also: Context Free Grammars

Closure Properties

Context-free languages have the following closure properties. A set is closed under an operation if applying the operation to members of the set always produces a member of the same set. This means that if one of these closed operations is applied to context-free languages, the result will also be a context-free language.

  • Union: Context-free languages are closed under the union operation. This means that if L and P are both context-free languages, then L ∪ P is also a context-free language.
Proof:

Here is a proof that context-free languages are closed under union.
  1. Let L and P be generated by the context-free grammars GL = (VL, ΣL, RL, SL) and GP = (VP, ΣP, RP, SP), respectively.
  2. Without loss of generality, subscript each nonterminal symbol in GL with an L, and each nonterminal of GP with a P, so that VL ∩ VP = ∅.
  3. Define the CFG G that generates L ∪ P as follows: G = (VL ∪ VP ∪ {S}, ΣL ∪ ΣP, RL ∪ RP ∪ {S -> SL | SP}, S).
  • Concatenation: If L and P are both context-free languages, then LP is also context-free. The concatenation of two languages is defined as follows: S1S2 = {vw : v ∈ S1 and w ∈ S2}.
Proof:

Here is a proof that context-free languages are closed under concatenation.
    1. Let L and P be generated by the context-free grammars GL = (VL, ΣL, RL, SL) and GP = (VP, ΣP, RP, SP), respectively.
    2. Without loss of generality, subscript each nonterminal symbol in GL with an L, and each nonterminal of GP with a P, so that VL ∩ VP = ∅.
    3. Define the CFG G that generates LP as follows: G = (VL ∪ VP ∪ {S}, ΣL ∪ ΣP, RL ∪ RP ∪ {S -> SL SP}, S).
Every word that G generates is a word of L followed by a word of P, which is the definition of concatenation.
  • Kleene Star: If L is a context-free language, then L∗ is also context-free. The Kleene star repeats the language it is attached to any number of times (including zero times); it basically performs a repeated concatenation of the language with itself. For example, {a,b}∗ = {ε, a, b, aa, ab, ba, bb, …} and so on. We’ve already proved that CFLs are closed under concatenation; adding a new start symbol S with the rules S -> SL S | ε to GL (where SL is the start symbol of GL) gives a grammar for L∗.
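
The union and concatenation constructions from the proofs can be tried out on small grammars. The Python sketch below represents a grammar as a dictionary from non-terminals to lists of right-hand sides, adds the new start rule S -> SL | SP or S -> SL SP, and enumerates short words. The representation, the function names and the two sample grammars are assumptions chosen for illustration; the enumeration is only guaranteed to terminate for grammars like these, where every recursive rule also adds a terminal.

```python
from collections import deque


def union(g1, s1, g2, s2, start="S"):
    """Grammar for L ∪ P: all old rules plus S -> s1 | s2."""
    rules = {**g1, **g2}
    rules[start] = [[s1], [s2]]
    return rules


def concat(g1, s1, g2, s2, start="S"):
    """Grammar for LP: all old rules plus S -> s1 s2."""
    rules = {**g1, **g2}
    rules[start] = [[s1, s2]]
    return rules


def words(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start."""
    found, seen = set(), set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Index of the leftmost non-terminal, if any.
        idx = next((i for i, sym in enumerate(form) if sym in rules), None)
        if idx is None:
            found.add("".join(form))
            continue
        for rhs in rules[form[idx]]:
            new = form[:idx] + tuple(rhs) + form[idx + 1:]
            terminals = sum(1 for sym in new if sym not in rules)
            # Terminals never disappear, so longer forms can be discarded.
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return found


g_l = {"S_L": [["a", "S_L", "b"], []]}      # L = { a^n b^n | n >= 0 }
g_p = {"S_P": [["c", "S_P"], ["c"]]}        # P = { c^m | m >= 1 }

print(sorted(words(union(g_l, "S_L", g_p, "S_P"), "S", 4)))
print(sorted(words(concat(g_l, "S_L", g_p, "S_P"), "S", 4)))
```

The non-terminals S_L and S_P are named differently on purpose, which plays the role of step 2 in the proofs (the two sets of non-terminals must be disjoint).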

Context-free languages are not closed under complement or intersection.

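If CFLs were closed under intersection, then there would be CFLs that violate the pumping lemma for context-free languages, which cannot be. For example, {a^n b^n c^m | n, m ≥ 0} and {a^m b^n c^n | n, m ≥ 0} are both context-free, but their intersection is {a^n b^n c^n | n ≥ 0}, which the pumping lemma shows is not context-free.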

Please wait for our next post on Pumping Lemma.


Also see: Definition of Pushdown Automata

Regular Expressions – (Regex) – Regular Expression
https://www.theoryofcomputation.co/regular-expressions-regex-regular-expression/
Sun, 02 Sep 2018 04:35:46 +0000

The term regular expression was originally borrowed from automata theory in theoretical computer science. Broadly, it refers to a pattern against which a string or sub-string is matched.

The comic should have already given you an idea of what regular expressions could be useful for. It should not be surprising that many programming languages, text processing tools, data validation tools and search engines make extensive use of them.

The key idea is that a regular expression is a pattern which matches a set of target strings.

\w+@\w+\.(com|org|net|in) is a regex that matches simple email addresses that end with .com, .org, .net or .in.

Regular Expressions Concepts

There are many forms of regex syntax, and they vary with the language. Here, we will be examining Perl regex, since most other regex flavours are a variation on it.

Before we dive into the syntax, these are the kinds of things that the patterns consist of:

  • Literals: They are the simplest things to match. When they are there, we just match them. It could be like an a or a 1.
  • Meta characters: They do not mean what they look like. They usually refer to something else. For example, \d could refer to any digit.
  • Vertical Bar: The | is a symbol of boolean OR. It gives an option to match any of the things it delimits.
  • Quantifiers: They specify how many of the concerned pattern needs to be matched.
  • Grouping and Capturing: Parentheses could be used to group parts of the regex or capturing parts for later use.

Regular Expression Syntax

Let’s look at what the meta characters do in a little more detail.

Meta character    Description
^                 Start of a string
$                 End of a string
\t                Tab
\n                Newline
\r                Carriage Return
\s                Any whitespace character
\S                Any non-whitespace character
\d                Any digit
\D                Any non-digit
\w                Any word character
\W                Any non-word character
\b                Any word boundary
\B                Any non-word-boundary
.                 Any single character, usually barring a newline

By the way, if you want to match a metacharacter literally, you need to use \ to escape it. For example, \. would just match the . character.

Now, let us look into more flexibility stuff.

Expression Meaning
[abc] Matches any of a,b, or c
[^abc] Matches anything other than ab, or c
[a-d] Matches any of the characters in the range a-d
a* Matches a zero or more times
a? Matches a zero or one time
a+ Matches a one or more times
a|b Matches either a or b
a{3} Matches exactly 3 of a
a{3,} Matches 3 or more of a
a{3,5} Matches 3, 4 or 5 of a (inclusive range)
( ) Captures everything inside the parentheses
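The table above can be checked with a few lines of code. This is a minimal sketch using Python's re module, which shares the Perl-style syntax for these constructs; the helper name whole_match and the sample strings are made up for illustration.

```python
import re

def whole_match(pattern, text):
    """Return True when the pattern matches the entire text."""
    return re.fullmatch(pattern, text) is not None

# Each entry: (pattern, text, should the whole text match?)
cases = [
    (r"[abc]", "b", True),         # any one of a, b, c
    (r"[^abc]", "d", True),        # anything except a, b, c
    (r"[^abc]", "a", False),
    (r"[a-d]", "c", True),         # a character in the range a-d
    (r"a*", "", True),             # zero or more
    (r"a?", "aa", False),          # zero or one
    (r"a+", "", False),            # one or more
    (r"a|b", "b", True),           # either a or b
    (r"a{3}", "aaa", True),        # exactly three
    (r"a{3,}", "aaaaa", True),     # three or more
    (r"a{3,5}", "aaaaaa", False),  # three to five, so six is too many
]

results = [whole_match(p, t) == expected for p, t, expected in cases]

# Parentheses capture the text matched inside them.
captured = re.fullmatch(r"(\d+)-(\d+)", "12-34").groups()
```

Here re.fullmatch is used so that each pattern has to cover the whole string; re.search would instead look for the pattern anywhere inside it.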
Example:

We are now ready to explain why \w+@\w+\.(com|org|net|in) does what it claims.

Firstly, what should an email look like? That's right, it should have a structure like user@domain.extension.

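The user and domain each consist of letters, digits or underscores, with at least one character. So, we use \w+ for both.

The @ is a literal and matches itself, while \. is an escaped dot that matches a literal . before the extension.

We restrict the extension to org, com, net or in by using the |.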
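To see the pattern in action, here is a small sketch in Python's re module, which accepts this pattern unchanged. The function name looks_like_email and the sample addresses are invented for the example.

```python
import re

# The pattern from the text, compiled once.
EMAIL = re.compile(r"\w+@\w+\.(com|org|net|in)")

def looks_like_email(text):
    """Return True when the whole text fits user@domain.extension."""
    return EMAIL.fullmatch(text) is not None

samples = {
    "alice_01@example.com": looks_like_email("alice_01@example.com"),
    "bob@site.in": looks_like_email("bob@site.in"),
    "bob@site.info": looks_like_email("bob@site.info"),
    "first.last@example.com": looks_like_email("first.last@example.com"),
}
```

The last sample is rejected because \w does not match a dot in the user part, which is why the pattern matches most, but not all, addresses with these extensions.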

Context Free Grammars https://www.theoryofcomputation.co/context-free-grammars/ Thu, 23 Aug 2018 18:21:25 +0000

Context-free grammars (CFGs) are used to describe context-free languages. A context-free grammar is a set of recursive rules used to generate patterns of strings. Context-free grammars can describe all regular languages and more, but they cannot describe all possible languages.

Context-free grammars are studied in theoretical computer science, compiler design, and linguistics. CFGs are used to describe programming languages, and parser programs in compilers can be generated automatically from context-free grammars.

Context Free Grammars
Two parse trees that describe CFGs that generate the string “x + y * z”. Source: Context-free grammar wikipedia page.

Context Free Grammars:

Context-free grammars can generate context-free languages. They do this by taking a set of variables which are defined recursively, in terms of one another, by a set of production rules. Context-free grammars are named as such because any of the production rules in the grammar can be applied regardless of context—it does not depend on any other symbols that may or may not be around a given symbol that is having a rule applied to it.

Context-free grammars have the following components:
    • A set of terminal symbols which are the characters that appear in the language/strings generated by the grammar. Terminal symbols never appear on the left-hand side of the production rule and are always on the right-hand side.
    • A set of nonterminal symbols (or variables), which are placeholders for patterns of terminal symbols that can be generated from them. These are the symbols that appear on the left-hand side of the production rules, though they can also appear on the right-hand side. The strings that a CFG produces contain only symbols from the set of terminal symbols.
    • A set of production rules, which are the rules for replacing nonterminal symbols. Production rules have the following form: variable → string of variables and terminals.
    • A start symbol which is a special nonterminal symbol that appears in the initial string generated by the grammar.
      
      

For comparison, a context-sensitive grammar can have production rules where both the left-hand and right-hand sides may be surrounded by a context of terminal and nonterminal symbols.

To create a string from a context-free grammar, follow these steps:
    • Begin the string with a start symbol.
    • Apply one of the production rules whose left-hand side is the start symbol, replacing the start symbol with the right-hand side of that production.
    • Repeat the process of selecting nonterminal symbols in the string, and replacing them with the right-hand side of some corresponding production, until all nonterminals have been replaced by terminal symbols. Note, it could be that not all production rules are used.
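The steps above can be sketched in a few lines of Python. The grammar used here (S -> () | SS | (S)) is an assumed example, and the function derive, which always rewrites the leftmost nonterminal, is a hypothetical helper rather than a standard API.

```python
# Assumed example grammar: one nonterminal S with three productions.
RULES = {"S": ["()", "SS", "(S)"]}
START = "S"

def derive(choices):
    """Apply one production per step to the leftmost nonterminal.

    choices[i] is the index of the production used at step i.
    Returns the list of intermediate strings, beginning with the start symbol.
    """
    current = START
    steps = [current]
    for choice in choices:
        # Locate the leftmost symbol that still has production rules.
        pos = next((i for i, sym in enumerate(current) if sym in RULES), None)
        if pos is None:
            raise ValueError("no nonterminal left to replace")
        rhs = RULES[current[pos]][choice]
        current = current[:pos] + rhs + current[pos + 1:]
        steps.append(current)
    return steps

# S => SS => (S)S => (())S => (())()
example = derive([1, 2, 0, 0])
```

The derivation stops being extendable once the string contains only terminals, which is exactly the point at which the grammar has produced a string of the language.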

 

Formal Definition

A context-free grammar can be described by a four-element tuple (V, Σ, R, S) , where

  • V is a finite set of variables (which are non-terminal)
  • Σ is a finite set (disjoint from V) of terminal symbols
  • R is a finite set of production rules, where each production rule maps a variable to a string s ∈ (V ∪ Σ)*
  • S is the start symbol, which is an element of V.
Example:
Come up with a grammar that will generate the context-free (but not regular) language that contains all strings with matched parentheses.

There are many grammars that can do this task. The solution below is one way to do it, and it should give you a good idea of whether your (possibly different) solution works too.

Start symbol: S
Nonterminal variables = {S}
Terminal symbols = {(, )}
Production rules:
    • S -> ( )
    • S -> SS
    • S -> (S)

 

A way to condense production rules is as follows:

We can take

S->()
S->SS
S->(S)

and translate them into a single line: S -> ( ) | SS | (S). Adding one more alternative, | ε (where ε is the empty string), would also let the grammar generate the empty string.

Derivations in a context-free grammar can be represented as parse trees. The nodes of the tree represent the symbols and the edges represent the use of production rules. The leaves of the tree are the end result (terminal symbols) that make up the string the grammar is generating with that particular sequence of symbols and production rules.

The parse trees below represent two ways to generate the string “a + a – a” with the grammar

 

Context Free Grammars
Example of an ambiguous grammar—one that can have multiple ways of generating the same string

Because this grammar admits more than one parse tree for the same string, it is said to be ambiguous.
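Ambiguity can be made concrete by counting parse trees. The sketch below assumes the grammar A -> A + A | A - A | a; this grammar is an assumption chosen for illustration, since the grammar itself is not reproduced in the text. The function count_parse_trees is a hypothetical helper.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees for tokens under the ASSUMED grammar
    A -> A + A | A - A | a.
    """
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of ways A can derive tokens[i:j].
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        # Try every operator position k as the root of the tree.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

trees = count_parse_trees("a + a - a".split())
```

A count greater than one for any string is enough to show that a grammar is ambiguous; under the assumed grammar, the two trees for this string correspond to grouping the addition first or the subtraction first.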

Relationship with other Computation Models

Context-free languages are recognized by pushdown automata, just as regular languages are recognized by finite state machines. Since all regular languages can be generated by CFGs, all regular languages can also be recognized by pushdown automata.

Any language that can be generated using regular expressions can be generated by a context-free grammar.

The way to do this is to take the regular language, build a finite state machine for it, and write production rules that follow its transition function: one rule per transition, plus an ε rule for each accepting state.
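As a rough sketch of this translation, the code below turns a deterministic finite state machine into grammar rules: a transition from state q to state p on symbol a becomes the rule q -> a p, and every accepting state q gets the rule q -> ε. The example machine (strings over {0, 1} that end in 1) and the names dfa_to_grammar and generates are assumptions made for the example.

```python
def dfa_to_grammar(transitions, accepting):
    """Build production rules from a DFA.

    transitions maps (state, symbol) -> next state.
    Each state becomes a nonterminal; "" stands for the empty string ε.
    """
    rules = {}
    for (state, symbol), target in transitions.items():
        rules.setdefault(state, []).append(symbol + target)
        rules.setdefault(target, [])
    for state in accepting:
        rules.setdefault(state, []).append("")
    return rules

def generates(rules, start, word):
    """Check membership by following the rules one input symbol at a time.

    Relies on the rules coming from a DFA, so at most one rule fits each symbol.
    """
    current = start
    for ch in word:
        targets = [rhs[1:] for rhs in rules[current] if rhs and rhs[0] == ch]
        if not targets:
            return False
        current = targets[0]
    return "" in rules[current]

# Assumed example DFA: state A is the start state, state B is accepting.
transitions = {("A", "0"): "A", ("A", "1"): "B",
               ("B", "0"): "A", ("B", "1"): "B"}
rules = dfa_to_grammar(transitions, accepting={"B"})
```

The resulting rules are A -> 0A | 1B and B -> 0A | 1B | ε, a grammar in which every right-hand side has at most one nonterminal, placed at the end.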

Translating Between Context-Free Grammars and Pushdown Automata https://www.theoryofcomputation.co/context-free-grammars-and-pushdown-automata/ Sun, 19 Aug 2018 10:29:46 +0000

Context-free Grammar to Pushdown Automata

Each derivation or sequence of production rules that results in a given string is made up of intermediate strings (which are made at each step of the derivation).

The pushdown automaton's nondeterminism lets it guess the sequence of steps in the derivation that will result in the desired string. At each step in the derivation, one of the production rules for a given variable is selected nondeterministically and substituted for the variable.

The pushdown automaton begins by pushing a symbol onto the stack and then goes through the series of intermediate strings until it arrives at a string that contains only terminal symbols (this will happen if the string is actually in the language generated by the grammar; otherwise it will reject).


Here’s what to do

  • Push the stack start symbol (the bottom-of-stack marker), $, onto the stack, followed by the grammar's start variable.

Then the following steps are repeated until the automaton finishes:

  • If there is a variable X on top of the stack, nondeterministically pick one of the production rules for X and replace X with the string on the right-hand side of that rule.
  • If there is a terminal symbol a on top of the stack, read the next symbol from the input and compare it to a. If they are the same, pop a and repeat; if they are not, reject on this branch of the nondeterminism.
  • If it is the end of the input and the top of the stack is the marker $, then accept.
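The procedure above can be simulated directly, with a search over configurations standing in for nondeterminism. This is a minimal sketch, not a definitive implementation: it follows the standard construction in which $ is pushed as a bottom-of-stack marker and the grammar's start variable is pushed above it, and it assumes a grammar without ε-productions in which $ is not a terminal. The function name pda_accepts and the example grammar are chosen for illustration.

```python
def pda_accepts(rules, start, word):
    """Simulate the PDA built from a grammar, exploring every branch.

    A configuration is (input position, stack); the stack top is the last item.
    Assumes no ε-productions and that "$" is not a terminal of the grammar.
    """
    todo = [(0, ("$", start))]
    seen = set()
    while todo:
        pos, stack = todo.pop()
        if (pos, stack) in seen:
            continue
        seen.add((pos, stack))
        top = stack[-1]
        remaining = len(word) - pos
        if top == "$":
            if remaining == 0:
                return True  # end of input with only the marker left
            continue
        # Pruning: without ε-productions, each stack symbol above "$"
        # needs at least one input symbol, so longer stacks cannot succeed.
        if len(stack) - 1 > remaining:
            continue
        if top in rules:
            # Variable on top: try each production, leftmost symbol on top.
            for rhs in rules[top]:
                todo.append((pos, stack[:-1] + tuple(reversed(rhs))))
        elif remaining > 0 and word[pos] == top:
            # Terminal on top: it must equal the next input symbol.
            todo.append((pos + 1, stack[:-1]))
    return False

# Assumed example grammar: S -> () | SS | (S)
PAREN_RULES = {"S": ["()", "SS", "(S)"]}
accepted = pda_accepts(PAREN_RULES, "S", "(())()")
```

The pruning step is a design choice that keeps the search finite in the presence of a rule like S -> SS; a real pushdown automaton has no such bound and simply explores all branches in parallel.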

Can you come up with a diagram and formal description of a pushdown automaton that recognizes strings containing only parentheses and accepts on strings that have matched parentheses? 

Σ = {(,)}

Γ = {$, X}, where X could be any symbol you want

Q = { A, B, C, D }

F = {D}

q0 = A

Z = $

δ = {(A, ε, ε, A, $), (A, (, $, B, X), (B, (, X, B, X), (B, ), X, C, ε), (C, ), X, C, ε), (C, ε, $, D, ε)}

Context-Free Grammar to Pushdown Automata
