Tuesday, December 2, 2008

More on PS5

I ve just seen the sample solution for PS5.

P(n): If language L has |L| = n (that is, it consists of n strings), then L has a regular expression that denotes it.

Base cases: If n = 0, then L is the empty language, and it is denoted by \emptyset.

If n = 1, then L consists of a single string. If the single string is the empty string, then the symbol \epsilon denotes L. Otherwise, we may argue that the sequence of symbols in L's single string constitutes a regular expression for L (perhaps by adding suitable parentheses). A really rigorous proof uses induction on the length of the single string.

Induction step: Assume n is a natural number, and that P(0)/\.../\P(n-1) is true. Suppose L is a language with |L|=n. If n is either 0 or 1, we're done, since the base cases show that L has a regular expression that denotes it. Otherwise, partition L into one set with n-1 strings, and another set with 1 string. By P(1) and P(n-1), these two sets have regular expressions that denote them, R1 and R2. Since L is the union of the languages denoted by these two sets, it is denoted by (R1+R2).

I think a simple induction do the job as well. The base case is the same.
IH: Assume P(n) is true. Suppose L is a language with |L| = n+1. Arbitrary pick an element s from L, then there is a subset of L that does not include s. This set L' is a language that has n string, so by IH there is a regex R' that denotes L'. By one more alternation operation, we have R = R'+s that denotes L. Therefore, any finite language can be denoted by a regex.

No comments: