About this book

Compression and Coding Algorithms describes in detail the coding mechanisms that are available for use in data compression systems. The well known Huffman coding technique is one mechanism, but there have been many others developed over the past few decades, and this book describes, explains and assesses them. People undertaking research or software development in the areas of compression and coding algorithms will find this book an indispensable reference. In particular, the careful and detailed description of algorithms and their implementation, plus accompanying pseudo-code that can be readily implemented on computer, make this book a definitive reference in an area currently without one.

Table of contents

Frontmatter

Chapter 1. Data Compression Systems

Abstract
One of the paradoxes of modern computer systems is that despite the spiraling decrease in storage cost there is an ever increasing emphasis on data compression. We use compression daily, often without even being aware of it, when we use facsimile machines, communication networks, digital cellular telephones, world-wide web browsers, and DVD players. Indeed, on some computer systems, the moment we access a file from disk we make use of compression technology; and not too far in the future are computer architectures that store executable code in compressed form in main memory in lines of a few hundred bytes, decompressing it only when brought into cache.
Alistair Moffat, Andrew Turpin

Chapter 2. Fundamental Limits

Abstract
The previous chapter introduced the coding problem: that of assigning some codewords or bit-patterns C to a set of n symbols that have a probability distribution given by P = [p_1, …, p_n]. This chapter explores some lines in the sand which cannot be crossed when designing codes. The first is a lower bound on the expected length of a code: Shannon’s entropy limit. The second restriction applies to the lengths of codewords, and is generally referred to as the Kraft inequality. Both of these limits serve to keep us honest when devising new coding schemes. Both limits also provide clues on how to construct codes that come close to reaching them. We can also obtain experimental bounds on compressibility by using human models and experience, and this area is briefly considered in Section 2.3. The final section of this chapter then shows the application of these limits to some simple compression systems.
Alistair Moffat, Andrew Turpin
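
To make these two limits concrete, here is a minimal sketch (in Python; not from the book, with the distribution and codeword lengths invented for illustration) that computes the entropy of a distribution and the Kraft sum of a proposed set of codeword lengths:

    import math

    def entropy(P):
        # Shannon entropy: a lower bound on the expected codeword length.
        return -sum(p * math.log2(p) for p in P if p > 0)

    def kraft_sum(lengths):
        # A prefix-free binary code with these lengths exists iff the sum <= 1.
        return sum(2.0 ** -l for l in lengths)

    P = [0.5, 0.25, 0.125, 0.125]   # invented dyadic distribution
    lengths = [1, 2, 3, 3]          # matching codeword lengths
    print(entropy(P))                              # 1.75 bits per symbol
    print(kraft_sum(lengths))                      # 1.0: feasible, and complete
    print(sum(p * l for p, l in zip(P, lengths)))  # 1.75: meets the entropy bound

Because these probabilities are negative powers of two, integral codeword lengths meet the entropy bound exactly; for other distributions a gap remains, which is part of the motivation for the arithmetic coding of Chapter 5.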

Chapter 3. Static Codes

Abstract
The simplest coding methods are those that ignore or make only minimal use of the supplied probabilities. In doing so, their compression effectiveness may be relatively poor, but the simple and regular codewords that they assign can usually be encoded and decoded extremely quickly. Moreover, some compression applications are such that the source probabilities p_i have a distribution to which the regular nature of these non-parameterized codes is well suited.
Alistair Moffat, Andrew Turpin
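
One standard code of this kind is the Elias gamma code, sketched below (in Python; an illustration of the general idea rather than the book's own presentation). It is parameter-free and suits distributions in which small values dominate:

    def elias_gamma(n):
        # A unary prefix of b-1 zeros announces that n has b binary digits;
        # the b digits of n then follow.
        assert n >= 1
        b = n.bit_length()
        return "0" * (b - 1) + bin(n)[2:]

    for n in range(1, 6):
        print(n, elias_gamma(n))   # 1 -> 1, 2 -> 010, 3 -> 011, 4 -> 00100, 5 -> 00101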

Chapter 4. Minimum-Redundancy Coding

Abstract
We now turn to the more general case illustrated by the “Code 3” column in Table 1.1 on page 7. It is the best of the three listed codes because, somehow, its set of codeword lengths better matches the probability distribution than do the other two sets. Which forces the question: given a sorted list of symbol probabilities, how can a set of prefix-free codewords be assigned that is best for that data? And what is really meant by “best”?
Alistair Moffat, Andrew Turpin
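
Huffman's algorithm answers the first question by repeatedly merging the two least-probable symbols. The sketch below (in Python; the probabilities are invented, since Table 1.1 is not reproduced here) computes the codeword lengths of such a minimum-redundancy code:

    import heapq

    def huffman_lengths(P):
        # Build the code bottom-up: each merge adds one bit to the codeword
        # length of every symbol beneath the merge point.
        heap = [(p, [i]) for i, p in enumerate(P)]
        heapq.heapify(heap)
        lengths = [0] * len(P)
        while len(heap) > 1:
            p1, s1 = heapq.heappop(heap)
            p2, s2 = heapq.heappop(heap)
            for i in s1 + s2:
                lengths[i] += 1
            heapq.heappush(heap, (p1 + p2, s1 + s2))
        return lengths

    print(huffman_lengths([0.4, 0.2, 0.2, 0.1, 0.1]))   # [2, 2, 2, 3, 3] with this tie-breaking

As for "best": such a code minimizes the expected codeword length, the sum over symbols of p_i multiplied by the length assigned to symbol i, among all prefix-free codes.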

Chapter 5. Arithmetic Coding

Abstract
Given that the bit is the unit of stored data, it appears impossible for codewords to occupy fractional bits. And given that a minimum-redundancy code as described in Chapter 4 is the best that can be done using integral-length codewords, it would thus appear that a minimum-redundancy code obtains compression as close to the entropy as can be achieved.
Alistair Moffat, Andrew Turpin
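
The chapter shows that this appearance is misleading. As a toy illustration (in Python; the skewed source is invented, and a practical coder would use integer arithmetic with renormalization rather than raw floating point), narrowing an interval once per symbol leaves a final interval whose width is the product of the symbol probabilities, so any number inside it identifies the message in roughly -log2(width) bits, a fractional number of bits per symbol:

    import math

    def arithmetic_interval(message, P, cum):
        # Narrow [low, high) once per symbol; the final width equals the
        # product of the probabilities of the symbols coded.
        low, high = 0.0, 1.0
        for s in message:
            width = high - low
            high = low + width * (cum[s] + P[s])
            low = low + width * cum[s]
        return low, high

    P = {"a": 0.9, "b": 0.1}     # invented two-symbol source
    cum = {"a": 0.0, "b": 0.9}   # cumulative probabilities
    low, high = arithmetic_interval("aaaaaaab", P, cum)
    print(math.ceil(-math.log2(high - low)))   # about 5 bits for 8 symbols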

Chapter 6. Adaptive Coding

Abstract
In the three previous chapters it has been assumed that the probability distribution is fixed, and that both encoder and decoder share knowledge of either the actual symbol frequencies within the message, or of some underlying distribution that may be assumed to be representative of the message. While there was some discussion of alternative ways of representing the prelude in a semistatic system such as those of Algorithm 4.6 on page 83 and Algorithm 5.3 on page 102, we acted as if the only problem worth considering was that of assigning a set of codewords.
Alistair Moffat, Andrew Turpin
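
In an adaptive coder, encoder and decoder instead grow identical models as the message is processed, so no prelude need be transmitted at all. A toy sketch of the modeling side (in Python; the add-one smoothing and the example alphabet are choices made for this illustration) follows:

    from collections import Counter

    def adaptive_estimates(message, alphabet):
        # Estimate each symbol's probability from the counts of everything
        # seen so far (add-one smoothing keeps unseen symbols codable),
        # then update the counts. The decoder runs the identical loop, so
        # the two ends stay synchronized without any transmitted model.
        counts = Counter({s: 1 for s in alphabet})
        total = len(alphabet)
        for s in message:
            yield s, counts[s] / total
            counts[s] += 1
            total += 1

    for sym, p in adaptive_estimates("abracadabra", "abcdr"):
        print(sym, round(p, 3))   # the estimate for "a" rises as its count grows

Each estimate would then be handed to the coder proper, for instance an adaptive arithmetic coder.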

Chapter 7. Additional Constraints

Abstract
The coding algorithms presented so far have focussed on minimizing the expected length of a code, and on fast coding speed once a code has been devised. Furthermore, all codes have used a binary channel alphabet. This chapter examines other coding problems.
Alistair Moffat, Andrew Turpin
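
One such problem, to pick a single example, is coding over a channel alphabet of r > 2 symbols. The Kraft inequality of Chapter 2 generalizes directly, as this sketch shows (in Python; the codeword lengths are invented):

    def kraft_sum(lengths, r=2):
        # Generalized Kraft sum: a prefix-free code over an r-symbol channel
        # alphabet with these codeword lengths exists iff the sum is <= 1.
        return sum(r ** -l for l in lengths)

    lengths = [1, 2, 2, 2]
    print(kraft_sum(lengths, r=2))   # 1.25: infeasible over a binary channel
    print(kraft_sum(lengths, r=3))   # about 0.67: feasible over a ternary channel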

Chapter 8. Compression Systems

Abstract
This chapter resumes the discussion of compression systems that was started in Chapters 1 and 2, but then deferred while we focussed on coding. Three state-of-the-art compression systems are described in detail, and the modeling and coding mechanisms they incorporate are examined. Unfortunately, one chapter is not enough space to do justice to the wide range of compression models and applications that have been developed over the last twenty-five years, and our coverage is, of necessity, rather limited. For example, we have chosen as our main examples three mechanisms that are rather more appropriate for text than for, say, image or sound data. Nevertheless, the three mechanisms chosen — sliding window compression, the PPM method, and the Burrows-Wheeler transform — represent a broad cross section of current methods, and each provides interesting trade-offs between implementation complexity, execution-time resource cost, and compression effectiveness. And because they are general methods, they can still be used for non-text data, even if they do not perform as well as methods that are expressly designed for particular types of other data. Lossy modeling techniques for non-text data, such as gray-scale images, are touched upon briefly in Section 8.4; Pennebaker and Mitchell [1993], Salomon [2000], and Sayood [2000] give further details of such compression methods.
Alistair Moffat, Andrew Turpin
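
As a small taste of the third of those mechanisms, the sketch below (in Python; a naive quadratic construction, whereas practical implementations build the transform from a suffix array) computes the Burrows-Wheeler transform of a short string:

    def bwt(text, sentinel="\x00"):
        # Sort all rotations of the sentinel-terminated text and keep the
        # last column: a permutation of the input in which symbols with
        # similar following contexts cluster, leaving the output easy to
        # compress with simple locally-adaptive techniques.
        s = text + sentinel
        rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
        return "".join(row[-1] for row in rotations)

    print(repr(bwt("banana")))   # 'annb\x00aa': repeated characters cluster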

Chapter 9. What Next?

Abstract
It is much harder to write the last chapter of a book than to write the first one. At the beginning, the objective is clear — you must set the scene, be full of effusive optimism, and paint a rosy picture of what the book is intended to achieve. But by the time your reader reaches the final chapter there is no scope for lily-gilding, and for better or for worse, they have formed an opinion as to the validity and usefulness of your work.
Alistair Moffat, Andrew Turpin

Backmatter
