Published in:

2008 | OriginalPaper | Chapter

How the Burrows-Wheeler Transform works

Published in: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching

Publisher: Springer US

Activate our intelligent search to find suitable subject content or patents.

search-config

AI-assisted search

Off

This chapter will look in detail at how the Burrows-Wheeler Transform is implemented in practice. The examples given in Chapter 1 overlooked some important practical details — to transform a text of n characters the encoder was sorting an array of n strings, each n characters long, and the decoder performed n sorts to reverse the transform. This complexity is not necessary for the BWT, and in this chapter we will see how to perform the encoding and decoding in O(n) space, and O(n log n) time. In fact, using a few tricks, the time can be reduced to O(n).

We will also look at various auxiliary data structures that are used for decoding the Burrows-Wheeler Transform, as some of them, while not essential for decoding, are useful if the transformed text is to be searched. These extra structures can still be constructed in O(n) time so in principle they add little to the decoding cost.

This chapter considers only the transform; in the next chapter we will look at how a compression system can take advantage of the transformed text to reduce its size; we refer to this second phase as the “Local to Global Transform”, or LGT.

We will present the Burrows-Wheeler Transform for coding a string T of n characters, T[1 … n], over an alphabet Σ of |Σ| characters. Note that there is a summary of all the main notation in Appendix A on page 309.

Springer Professional

How the Burrows-Wheeler Transform works

Dont have a licence yet? Then find out more about our products and how to get one now:

Springer Professional "Wirtschaft+Technik"

Springer Professional "Technik"

Springer Professional "Wirtschaft"

Premium Partner