2008 | Original Paper | Book Chapter
How the Burrows-Wheeler Transform works
Published in: The Burrows-Wheeler Transform: Data Compression, Suffix Arrays, and Pattern Matching
Publisher: Springer US
This chapter will look in detail at how the Burrows-Wheeler Transform is implemented in practice. The examples given in Chapter 1 overlooked some important practical details — to transform a text of n characters the encoder was sorting an array of n strings, each n characters long, and the decoder performed n sorts to reverse the transform. This complexity is not necessary for the BWT, and in this chapter we will see how to perform the encoding and decoding in O(n) space, and O(n log n) time. In fact, using a few tricks, the time can be reduced to O(n).
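The naive approach from Chapter 1 can be sketched as follows. This is not the book's implementation, only a minimal illustration of the cost being described: forming all n rotations explicitly takes O(n²) space, and sorting them takes O(n² log n) character comparisons, which is exactly what the efficient methods in this chapter avoid. The sentinel character `$` is an assumption added here to make the transform uniquely invertible.

```python
def bwt_naive(t: str, sentinel: str = "$") -> str:
    """Naive BWT: sort all rotations of t and take the last column."""
    assert sentinel not in t
    t += sentinel              # unique end-of-string marker
    n = len(t)
    # Build and sort all n rotations, each n characters long.
    rotations = sorted(t[i:] + t[:i] for i in range(n))
    # The transform is the last character of each sorted rotation.
    return "".join(r[-1] for r in rotations)

print(bwt_naive("banana"))     # the standard worked example
```

The efficient encoders developed later sort rotation *indices* (effectively a suffix array) rather than materializing the rotations, which is how the O(n) space and O(n log n) time bounds are reached.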
We will also look at various auxiliary data structures that are used for decoding the Burrows-Wheeler Transform, as some of them, while not essential for decoding, are useful if the transformed text is to be searched. These extra structures can still be constructed in O(n) time so in principle they add little to the decoding cost.
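One such auxiliary structure is the mapping from each character in the last column back to its row in the first column (often called the LF-mapping), which permits decoding in a single backward pass. The sketch below assumes the encoder appended a unique sentinel `$`; it is an illustration of the O(n)-time idea, not the book's code.

```python
def bwt_inverse(l: str, sentinel: str = "$") -> str:
    """Invert the BWT of a sentinel-terminated string in O(n) time
    (assuming an alphabet small enough that sorting it is cheap)."""
    n = len(l)
    # ranks[i] = number of earlier occurrences in L of the character L[i].
    counts, ranks = {}, []
    for ch in l:
        ranks.append(counts.get(ch, 0))
        counts[ch] = counts.get(ch, 0) + 1
    # first[ch] = row of the first occurrence of ch in the first column F.
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]
    # Row 0 begins with the sentinel; walk backwards through the text,
    # following each character in L to its row in F.
    out, row = [], 0
    for _ in range(n - 1):
        ch = l[row]
        out.append(ch)
        row = first[ch] + ranks[row]
    return "".join(reversed(out))

print(bwt_inverse("annb$aa"))  # recovers the original text
```

Each step of the loop is O(1), so the whole decode is O(n) once the two counting tables have been built, which is the sense in which these structures "add little to the decoding cost".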
This chapter considers only the transform; in the next chapter we will look at how a compression system can take advantage of the transformed text to reduce its size; we refer to this second phase as the “Local to Global Transform”, or LGT.
We will present the Burrows-Wheeler Transform for coding a string T of n characters, T[1 … n], over an alphabet Σ of |Σ| characters. Note that there is a summary of all the main notation in Appendix A on page 309.