This appendix explains how the Speak and Spell speech data was packed in the ROM code. It takes the coded data for a spoken word and shows the process of packing it into a set of data that would be found in the ROM code.

Introduction

A detail worth spending time on is how the speech data was packed into the ROM. An example of how it was done is shown in the following four figures (Figures 1 - 4). Figure 1 is an overview of the packing algorithm that I put together for a presentation on this topic. Figures 2 and 3 show the coded data for the word "cage". The coded data was taken from the information at the top of Figure 4. When the packing process was completed as shown in Figures 2 and 3, the resulting data matches the information in the bottom part of Figure 4. I am relatively certain that just using the four figures won't help much in understanding the process. So, I will pull excerpts from Figures 2 and 3 and use them to explain the process.

An overview of how the encoded speech data was stored in memory.

First page of encoded data for the word "cage" (frame number, energy, repeat, pitch, K1 - K6).

Page 2 of encoded data for the word "cage" (K7 - K10)

Computer printout showing the initial encoded data and the final packed data for the word "cage"

The top set of data in Figure 4 is the parametric data for the word "cage". The first column is the frame number, the second is the energy level, the third is the pitch period, and the remaining columns are the reflection coefficients K1 through K10, from left to right. The bottom set of data is the final packed data for the encoded word.

I have taken six frames of data (frames 8 through 13) from Figures 2 and 3 and put them in Table 1. It will be easier to see the data and explain the process using this table rather than attempting to work through the handwritten figures.

First 6 frames from Figures 2 and 3
Frame Energy Rpt Pitch K1 K2 K3 K4 K5 K6 K7 K8 K9 K10
8 1001 0 00000 10101 10110 0110 0110 - - - - - -
9 0110 1 00000 - - - - - - - - - -
10 0110 1 00000 - - - - - - - - - -
11 1101 0 01010 10010 10000 0101 0101 0110 1011 1010 101 011 010
12 1101 1 01011 - - - - - - - - - -
13 1101 0 01100 10110 10001 0111 0100 0000 1010 1011 110 100 011

Notice that frames 8 - 10 are unvoiced, with frames 9 and 10 being repeated copies of frame 8. The "1" in the repeat column of frames 9 and 10 indicates that they are repeated frames. Frames 11 - 13 are voiced frames, and frame 12 is a repeat frame. Referring back to Figure 1, you can see that an unvoiced frame (frame 8) has only the first four reflection coefficients (K1 - K4), whereas a voiced frame (frames 11 and 13) has all ten. In all cases a repeat frame has no coefficients and its repeat flag is set to "1".
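The three frame layouts can be summarized in a short Python sketch. The field widths are read off the columns of Table 1. The function names `frame_layout` and `frame_bits` are made up for this illustration, and the rule that a pitch code of 00000 marks an unvoiced frame is an assumption that Table 1 is consistent with, not something taken from the chip documentation.

```python
# Field widths in bits, read off the columns of Table 1.
HEADER = [("energy", 4), ("repeat", 1), ("pitch", 5)]
K_WIDTHS = [5, 5, 4, 4, 4, 4, 4, 3, 3, 3]  # K1 .. K10


def frame_layout(repeat, pitch):
    """Return the (name, width) fields stored for one frame.

    Assumption: a pitch code of 0 marks an unvoiced frame, which is
    what frames 8 - 10 in Table 1 suggest.
    """
    if repeat:
        n_coeffs = 0      # repeat frame: header only
    elif pitch == 0:
        n_coeffs = 4      # unvoiced frame: K1 - K4
    else:
        n_coeffs = 10     # voiced frame: K1 - K10
    ks = [(f"K{i + 1}", w) for i, w in enumerate(K_WIDTHS[:n_coeffs])]
    return HEADER + ks


def frame_bits(repeat, pitch):
    """Total number of bits the frame occupies in the packed stream."""
    return sum(width for _, width in frame_layout(repeat, pitch))


print(frame_bits(0, 0))        # unvoiced frame 8  -> 28
print(frame_bits(1, 0))        # repeat frame 9    -> 10
print(frame_bits(0, 0b01010))  # voiced frame 11   -> 49
```

With these widths the six frames of Table 1 occupy 28 + 10 + 10 + 49 + 10 + 49 = 156 bits, which is why frame boundaries do not line up with nibble or byte boundaries.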

The process consists of several steps:

  1. Encode the parameters into binary
  2. Repack the binary numbers into hexadecimal digits (nibbles)
  3. Bit reverse each hexadecimal digit
  4. Swap the two hexadecimal digits within each pair

If I take the binary sequence for Frames 8 through 13 I get this sequence of bits:

1001 0 00000 10101 10110 0110 0110 . 0110 1 00000 . 0110 1 00000 . 1101 0 01010 10010 10000 0101 0101 0110 1011 1010 101 011 010 . 1101 1 01011 . 1101 0 01100 10110 10001 0111 0100 0000 1010 1011 110 100 011

Notice that I have inserted a "." to separate each of the frame sequences and have used a blank to separate the parameters (up to 13) within each frame. The next task is to reformat the bits into hexadecimal. The bits for each hexadecimal digit are shown in parentheses below:

(1001) (0 000)(00 10)(101 1)(0110) (0110) (0110) . (0110) (1 000)(00 . 01)(10 1 0)(0000) . (1101) (0 010)(10 10)(010 1)(0000) (0101) (0101) (0110) (1011) (1010) (101 0)(11 01)(0 . 110)(1 1 01)(011 . 1)(101 0) (0110)(0 101)(10 10)(001 0)(111 0)(100 0)(000 1)(010 1)(011 1)(10 10)(0 011) [1101]

I have put brackets around the last nibble to indicate that it came from frame 14. It was necessary to create an even number of nibbles so that the process could be completed on this example. Now that the binary sequence has been organized into nibbles, I can use Table 2 to convert the nibbles into hexadecimal.

Hexadecimal table
Decimal Binary Hexadecimal Bit Reversed
0 0000 0 0
1 0001 1 8
2 0010 2 4
3 0011 3 C
4 0100 4 2
5 0101 5 A
6 0110 6 6
7 0111 7 E
8 1000 8 1
9 1001 9 9
10 1010 A 5
11 1011 B D
12 1100 C 3
13 1101 D B
14 1110 E 7
15 1111 F F
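The "Bit Reversed" column does not have to be looked up by hand. The following Python sketch regenerates all four columns of the table; the function name `reverse_nibble` is chosen here for illustration only.

```python
def reverse_nibble(n):
    """Reverse the order of the 4 bits in n (0 <= n <= 15)."""
    bits = format(n, "04b")       # e.g. 1 -> "0001"
    return int(bits[::-1], 2)     # "1000" -> 8


# Print the same four columns as the hexadecimal table.
for n in range(16):
    print(n, format(n, "04b"), format(n, "X"), format(reverse_nibble(n), "X"))
```

Note that 0, 6, 9 and F are their own bit reversals, so those digits pass through step 3 unchanged.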

In hexadecimal it would look like: 90 2B 66 66 81 A0 D2 A5 05 56 BA AD 6D 7A 65 A2 E8 15 7A 3D
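Steps 1 and 2 can be reproduced with a few lines of Python. This is only a sketch: the names `FRAMES`, `PAD` and `to_hex_bytes` are invented for the example, and the nibble borrowed from frame 14 is assumed to be binary 1101 (hex D) so that it agrees with the last digit of the hexadecimal sequence above.

```python
# Frames 8 - 13 from Table 1, one string of binary fields per frame.
FRAMES = [
    "1001 0 00000 10101 10110 0110 0110",
    "0110 1 00000",
    "0110 1 00000",
    "1101 0 01010 10010 10000 0101 0101 0110 1011 1010 101 011 010",
    "1101 1 01011",
    "1101 0 01100 10110 10001 0111 0100 0000 1010 1011 110 100 011",
]
# Assumption: the nibble borrowed from frame 14 is hex D (binary 1101),
# matching the last digit of the hexadecimal sequence in the text.
PAD = "1101"


def to_hex_bytes(frames, pad):
    # Drop the separators so field and frame boundaries disappear.
    bits = "".join(frame.replace(" ", "") for frame in frames) + pad
    if len(bits) % 8:
        raise ValueError("bit stream must fill a whole number of bytes")
    # Regroup into 4-bit nibbles and convert each to a hex digit.
    digits = [format(int(bits[i:i + 4], 2), "X") for i in range(0, len(bits), 4)]
    # Pair up the digits for display.
    return " ".join(a + b for a, b in zip(digits[0::2], digits[1::2]))


print(to_hex_bytes(FRAMES, PAD))
# 90 2B 66 66 81 A0 D2 A5 05 56 BA AD 6D 7A 65 A2 E8 15 7A 3D
```

The six frames contribute 156 bits, or 39 nibbles, which is why one more nibble is needed to fill the twentieth byte.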

Bit reversed would look like: 90 4D 66 66 18 50 B4 5A 0A A6 D5 5B 6B E5 6A 54 71 8A E5 CB

Finally, after a pairwise nibble switch, it would look like: 09 D4 66 66 81 05 4B A5 A0 6A 5D B5 B6 5E A6 45 17 A8 5E BC
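Steps 3 and 4 can be checked the same way. The Python sketch below starts from the hexadecimal sequence, bit reverses each digit, and swaps the digits in each pair. It also confirms a useful observation: the two steps together are the same as reversing all eight bits of each byte. The function names are illustrative, not taken from any original tool.

```python
HEX_BYTES = "90 2B 66 66 81 A0 D2 A5 05 56 BA AD 6D 7A 65 A2 E8 15 7A 3D"


def reverse_nibble(digit):
    """Bit reverse one hex digit, e.g. 'B' (1011) -> 'D' (1101)."""
    return format(int(format(int(digit, 16), "04b")[::-1], 2), "X")


def pack(hex_bytes):
    out = []
    for pair in hex_bytes.split():
        high, low = reverse_nibble(pair[0]), reverse_nibble(pair[1])  # step 3
        out.append(low + high)                                        # step 4
    return " ".join(out)


def reverse_byte(pair):
    """Reverse all 8 bits of a two-digit hex byte in one go."""
    return format(int(format(int(pair, 16), "08b")[::-1], 2), "02X")


packed = pack(HEX_BYTES)
print(packed)
# 09 D4 66 66 81 05 4B A5 A0 6A 5D B5 B6 5E A6 45 17 A8 5E BC

# Steps 3 and 4 together equal a single 8-bit reversal of each byte.
print(packed == " ".join(reverse_byte(p) for p in HEX_BYTES.split()))  # True
```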

If this sequence is compared to the bottom data set of Figure 4, it will be comforting to see that they are identical. Obviously we could have completed the whole word to verify that all of it works. But, then, that is what Figures 2 and 3 attempted to do.
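Another way to check the result is to run the process backwards. The Python sketch below takes the packed bytes, undoes steps 3 and 4, and splits the bit stream back into frames using the field widths from Table 1. It is a minimal illustration, not the synthesizer's actual decoder: the name `unpack` is hypothetical, only the three frame types shown in Table 1 are handled, and a pitch code of 00000 is assumed to mark an unvoiced frame.

```python
PACKED = "09 D4 66 66 81 05 4B A5 A0 6A 5D B5 B6 5E A6 45 17 A8 5E BC"
K_WIDTHS = [5, 5, 4, 4, 4, 4, 4, 3, 3, 3]  # K1 .. K10, from Table 1


def unpack(packed, n_frames):
    # Undo steps 3 and 4: reverse the 8 bits of every stored byte.
    bits = "".join(format(int(b, 16), "08b")[::-1] for b in packed.split())
    pos = 0

    def take(width):
        nonlocal pos
        field = bits[pos:pos + width]
        pos += width
        return field

    frames = []
    for _ in range(n_frames):
        energy, repeat, pitch = take(4), take(1), take(5)
        if repeat == "1":
            n_coeffs = 0          # repeat frame: no coefficients
        elif int(pitch, 2) == 0:
            n_coeffs = 4          # assumed: pitch code 0 means unvoiced
        else:
            n_coeffs = 10         # voiced frame
        frames.append([energy, repeat, pitch] + [take(w) for w in K_WIDTHS[:n_coeffs]])
    return frames, bits[pos:]     # leftover bits belong to the next frame


frames, leftover = unpack(PACKED, 6)
for number, fields in enumerate(frames, start=8):
    print(number, " ".join(fields))
print("leftover:", leftover)
```

Run on the 20 packed bytes, this recovers the six rows of Table 1 and leaves four bits over, the nibble that was borrowed from frame 14.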

You may notice that I have ignored the creation and use of the encode and decode tables. These tables were created based on a specific professional speaker. For each of the coefficients, a test data set was used to reduce all of the variations to a set of buckets. For example, K1 has five bits to define the value of the coefficient, so its data set was split into 32 buckets ranging from the largest to the smallest. Within each bucket, a median point was selected to be the value used for the decoder. As this was specific to each professional speaker, and therefore to each version of the TMS028x, it will not be presented. That part of the process is left to the student to figure out. And, yes, you may have noted that I didn't disclose how the spelling of the words was packed into the ROM along with the speech data. That is another aspect left to the student to figure out.

Source:  OpenStax, The speak n spell. OpenStax CNX. Jan 31, 2014 Download for free at http://cnx.org/content/col11501/1.5