
We note in passing that there is no need to spend a bit to encode leaves of depth D, because a node at the maximum depth cannot be internal. To see this, consider a procedure for encoding the structure of a tree.

Consider the tree source depicted in [link]. To encode the structure of this tree, we use the following procedure. (Such a procedure has appeared, for example, in [link].)

Tree used in Example 10 to demonstrate how the structure of the source is encoded.

Start from the root by calling procedure(root), where procedure(S) is:
1. If node S is of depth D (the maximum), then return.
2. If node S is an internal node, then {
     encode 0
     procedure(0S)
     procedure(1S)
} else encode 1.
3. Return.

Let us now simulate the procedure. It traverses the following states of the tree in [link] while outputting the corresponding bits.

Source   Encoded symbol
root     0
0        1
1        0
01       0
001      (none, depth D)
101      (none, depth D)
11       1
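
The procedure and its trace can be checked with a short Python sketch. The function name `encode_structure` and the representation of nodes as bit strings are assumptions made for illustration. The leaf set {0, 001, 101, 11} and the depth D = 3 are inferred from the trace above, since the figure itself is not reproduced here.

```python
def encode_structure(leaves, max_depth):
    """Return the bits that describe the tree whose leaves are `leaves`.

    Nodes are bit strings; the root is the empty string, and the children
    of node S are "0" + S and "1" + S, as in the procedure.
    """
    leaves = set(leaves)
    bits = []

    def procedure(node):
        # Step 1: a node of maximum depth must be a leaf, so no bit is spent.
        if len(node) == max_depth:
            return
        # Step 2: internal nodes emit 0 and recurse; leaves emit 1.
        if node not in leaves:
            bits.append(0)
            procedure("0" + node)  # child 0S
            procedure("1" + node)  # child 1S
        else:
            bits.append(1)

    procedure("")  # start from the root
    return bits


# Tree inferred from the trace: leaves 0, 001, 101, 11 with D = 3.
print(encode_structure({"0", "001", "101", "11"}, max_depth=3))
# prints [0, 1, 0, 0, 1]
```

The two leaves of depth 3 contribute no bits, which is why seven visited states produce only five encoded symbols.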

Returning to tree pruning, following [link] we see that we must initialize $\mathrm{MDL}(s) = \mathrm{KT}(n_x(s,0), n_x(s,1))$ for contexts $s$ of full depth $|s| = D$, without the extra bit.

At the end of the pruning procedure, $T^{*}_{\{\}}$, the maximizing tree for the root, will be the optimal tree for universal coding.

Burrows Wheeler transform

The Burrows Wheeler transform (BWT) was proposed by Burrows and Wheeler in 1994 [link] (see also the analysis by Effros et al. [link] and references therein). It is an invertible permutation that sorts symbols according to their contexts. That way, the symbols that were generated by the same state of the context tree are grouped together, which, as we will see, is advantageous.

To compute the BWT, we first compute all cyclic shifts of the input x. Next, we sort the cyclic shifts lexicographically. The output of the BWT consists of y, the last column of the matrix of sorted shifts, and i, the index of the row that contains the original input. We illustrate with an example.

Consider the input x = banana. First, we compute the cyclic shifts and sort them.

All shifts   Sorted shifts
banana abanan
abanan anaban
nabana ananab
anaban banana
nanaba nabana
ananab nanaba

The output of the BWT consists of y = nnbaaa, the last column of the matrix of sorted shifts (the right-hand column of the table), and the index i = 4 of the row containing the original input.
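
The forward transform can be sketched in a few lines of Python. The function name `bwt` is an assumption for illustration, the input is assumed to be a non-empty string, and the index is 1-based to match the example. This direct version builds all n shifts explicitly, so it is meant to mirror the description rather than to be efficient.

```python
def bwt(x):
    """Return (y, i): the last column of the sorted cyclic shifts of x,
    and the 1-based index of the row equal to x. Assumes x is non-empty."""
    n = len(x)
    # Row k is x rotated right by k positions, matching the table of shifts.
    shifts = [x[n - k:] + x[:n - k] for k in range(n)]
    ordered = sorted(shifts)  # lexicographic sort of the shifts
    y = "".join(row[-1] for row in ordered)  # last column
    i = ordered.index(x) + 1  # 1-based row index of the original input
    return y, i


print(bwt("banana"))  # prints ('nnbaaa', 4)
```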

Interestingly, we can recover x from y and i. Because y is structured and thus quite compressible, the BWT can be used as a building block of a compression system; a block diagram of such a system appears in [link].

Typical compression system using the Burrows Wheeler transform  [link] .

To see that the BWT is invertible, let us recover x from y and i by continuing our example.

In the matrix of sorted shifts, column 1 is a sorted version of column n, which we know because column n is the BWT output y.

Column 1 Column n
a n
a n
a b
b a
n a
n a

Now take column  n and put it before column 1:

Column n Column 1
n a
n a
b a
a b
a n
a n

Sorting these rows, each of which consists of 2 symbols, gives ab, an, an, ba, na, and na. These are the first two columns of the sorted shifts matrix, so we can fill in column 2 accordingly.

Columns 1–2 Column  n
ab n
an n
an b
ba a
na a
na a

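Repeating this step (put column n before the known columns, then sort the rows) recovers one more column each time, until the entire matrix is recovered. The row indexed by i then contains the original x.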
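
The column-by-column reconstruction can be sketched in Python as follows. The function name `inverse_bwt` is an assumption for illustration, and i is taken to be 1-based, as in the example. Each pass puts column n before the columns recovered so far and sorts the rows again, so this sketch favors clarity over speed.

```python
def inverse_bwt(y, i):
    """Recover x from the BWT output y and the 1-based row index i."""
    n = len(y)
    rows = [""] * n  # no columns of the sorted matrix are known yet
    for _ in range(n):
        # Put column n (that is, y) before the known columns, then sort.
        # After the j-th pass, rows holds columns 1..j of the sorted matrix.
        rows = sorted(y[k] + rows[k] for k in range(n))
    return rows[i - 1]


print(inverse_bwt("nnbaaa", 4))  # prints banana
```

After the second pass the rows are ab, an, an, ba, na, na, which matches columns 1 and 2 in the worked example.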

What is the BWT good for? The key property of the BWT is that symbols generated by the same state are grouped together in y. To see this, note how the last column n can be rotated to a position to the left of column 1, so that symbols that came before the same prefix appear together. (To bunch together symbols generated by the same suffix, we can reverse the order of symbols in x before running the BWT.) Therefore, y has the form of a piecewise i.i.d. sequence [link], where segments generated by the same state of the context tree are bunched together.

Source:  OpenStax, Universal algorithms in signal processing and communications. OpenStax CNX. May 16, 2013 Download for free at http://cnx.org/content/col11524/1.1