
If English letters were truly independent, then it should be possible to generate “English-like” text using this table of probabilities. Here is a sample:

Od m shous t ad schthewe be amalllingod ongoutorend youne he Any bupecape tsooa wbeves p le t ke teml ley une weg rloknd

which does not look anything like English. How can the nonindependence of the text be modeled? One way is to consider the probabilities of successive pairs of letters instead of the probabilities of individual letters. For instance, the pair “th” is quite frequent, occurring 11,014 times in The Wizard of Oz, while “sh” occurs 861 times. Unlikely pairs such as “wd” occur in only five places (in the words “crowd” and “sawdust”), and “pk” does not occur at all. Collecting the counts for every pair gives a table of pair probabilities from which text can be generated. For example, suppose that “He” was chosen first. The next pair would be “e” followed by something, with the probability of the something dictated by the entries in the table. Following this procedure results in output like this:

Her gethe womfor if you the to had the sed th and the wention At th youg the yout byand a pow eve cank i as saing paill
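The pair-based procedure can be sketched in a few lines of Matlab. This is a minimal illustration, not the book's textsim.m: the short string src stands in for The Wizard of Oz, the names alphabet, T, out, and txt are hypothetical, and the fallback to a uniform draw when a letter has no recorded successor is an assumption.

```matlab
% Sketch: generate text from successive-pair statistics.
% src is a short stand-in source (assumption); the chapter uses The Wizard of Oz.
src = 'the quick brown fox jumps over the lazy dog and the cat sat on the mat';
alphabet = ['a':'z' ' '];                 % 26 letters plus space
K = numel(alphabet);
[~, idx] = ismember(src, alphabet);       % map each character to an index 1..K

T = zeros(K, K);                          % T(i,j) counts symbol j following symbol i
for i = 1:numel(idx)-1
  T(idx(i), idx(i+1)) = T(idx(i), idx(i+1)) + 1;
end

N = 60;                                   % number of symbols to generate
out = zeros(1, N);
out(1) = idx(1);                          % start from the first symbol of the source
for i = 2:N
  row = T(out(i-1), :);                   % successor counts for the previous symbol
  if sum(row) == 0
    row = ones(1, K);                     % no recorded successor: uniform draw (assumption)
  end
  cp = cumsum(row) / sum(row);            % cumulative distribution over successors
  out(i) = find(rand <= cp, 1);           % inverse-CDF draw of the next symbol
end
txt = alphabet(out);
disp(txt)
```

Every pair of neighbouring symbols in txt is a pair that occurs somewhere in src, which is why short words tend to come out right. Replacing each row of T with the column sums sum(T,1) would make every symbol independent of its predecessor, roughly reproducing the first experiment.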

Observe that most of the two-letter combinations are actual words, as are many of the three-letter combinations. Longer sets of symbols tend to wander improbably. While, in principle, it would be possible to continue gathering probabilities of all three-letter combinations, then four, etc., the table begins to get rather large (a matrix with 26^n elements would be needed to store all the n-letter probabilities). Shannon (“A Mathematical Theory of Communication,” The Bell System Technical Journal, Vol. 27, 1948) suggests another way:

...one opens a book at random and selects a letter on the page. This letter is recorded. The book is then opened to another page and one reads until this letter is encountered. The succeeding letter is then recorded. Turning to another page, this second letter is searched for, and the succeeding letter recorded, etc.
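Shannon's procedure needs no table at all, and a rough Matlab sketch makes that concrete. The code below is an illustration under stated assumptions rather than the book's textsim.m: src is a short stand-in for the book, "opening to another page" is modeled as jumping to a uniformly random position, and reading wraps around from the end of src to its beginning so that a successor always exists.

```matlab
% Sketch of Shannon's "open the book at random" procedure.
% src is a short stand-in source (assumption); names are hypothetical.
src = 'the quick brown fox jumps over the lazy dog and the cat sat on the mat';
L = numel(src);
N = 60;                                   % number of symbols to generate
out = blanks(N);
out(1) = src(randi(L));                   % open at random and record a letter
for i = 2:N
  start = randi(L);                       % open the book at another random place
  shifted = [src(start:end) src(1:start-1)];  % read on from there, wrapping around (assumption)
  pos = find(shifted == out(i-1), 1);     % read until the current letter is encountered
  out(i) = shifted(mod(pos, L) + 1);      % record the succeeding letter
end
disp(out)
```

Because each new letter is whatever follows an occurrence of the previous letter in src, the output approximately follows the pair statistics of the source without ever storing them. Searching for the last two letters instead of the last one would be one way to extend the idea to triplets.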

Of course, Shannon did not have access to Matlab when he was writing in 1948. If he had, he might have written a program like textsim.m, which allows specification of any text (with default being The Wizard of Oz) and any number of terms for the probabilities. For instance, with m=1, the letters are chosen completely independently; with m=2, the letters are chosen from successive pairs; and with m=3, they are chosen from successive triplets. Thus, the probabilities of clusters of letters are defined implicitly by the choice of the source text.

b=[0.5 1 -0.6];              % define channel
m=1000; s=sign(randn(1,m));  % binary source of length m
r=filter(b,1,s);             % output of channel
n=4; f=[0 1 0 0]';           % initialize equalizer
mu=.1;                       % stepsize
for i=n+1:m                  % iterate
  rr=r(i:-1:i-n+1)';         % vector of received signal
  e=sign(f'*rr)-f'*rr;       % calculate error
  f=f+mu*e*rr;               % update equalizer coefficients
end

DDequalizer.m: find a DD equalizer f for the channel b
The output of textsim.m depends heavily on the number of terms m used for the transition probabilities. With m=1 or m=2, the results appear much as above. When m=3,





Source:  OpenStax, Software receiver design. OpenStax CNX. Aug 13, 2013 Download for free at http://cnx.org/content/col11510/1.3
