--------------------

The Boosting Library

--------------------

Welcome to the home page of the boosting library. Informally, compression boosting is a technique that takes a memoryless compressor (one that uses no context) and turns it into a compression algorithm that uses the best possible contexts. Interested? Then read all the details in the paper Boosting Textual Compression in Optimal Linear Time, by P. Ferragina, R. Giancarlo, G. Manzini, M. Sciortino, J. ACM, Vol. 52 (2005), 688-713 (local copy available here).

Although theoretically entralling, the practical implementation of compression boosting is a non trivial task. Indeed, the boosting technique builds upon three main ingredients: the Burrows-Wheeler Transform, the Suffix Tree data structure, and a greedy algorithm to process them. Managing these three ingredients within reasonable time and space bounds requires the use of carefully engineered algorithms and data structures. An implementation of the compression boosting is described in the paper The Engineering of a Compression Boosting Library: Theory vs Practice in BWT compression by P. Ferragina, R. Giancarlo, G. Manzini (ESA '06) and in this more complete technical report. The source code of this implementation is available under the GNU General Public License Version 2. It has been fully tested only under Linux, but it should work on any platform supporting GCC.

Currently, we have applied the compression booster to the classical order 0 compressors: Huffman coding, arithmetic coding, and range coding. However, our boosting library is highly modular and anyone can use it to create a powerful high order compressor even without any knowledge of the inner working of the boosting techniques. Anyone interested in developing and testing his/her own compression algorithms is encouraged to download also the collection of files used in our experiments and to use them for testing.


The material below this paragraph is made available for research purposes only. Copyright and all rights therein are maintained by the authors or by other copyright holders. It is understood that all persons accessing this information will adhere to the appropriate copyright rules.
The range coder used in the boosting Library is the Carryless Rangecoder by Dmitry Subbotin translated to C by Mikael Lundqvist. It is fast, efficient and very simple to use. You can download the source code from Mikael's home page or from this local copy.

We compared the performance of our compressors with the bzip2 tool by Julian Seward and with the PPMd tool by Dmitry Shkarin (which was the best of the lot). You can download the source code of Version J (the one used in our tests) of PPMd here, or brush up your Russian and search for the latest version on Dmitry's page. Of course, bzip2 source code is available from the bzip.org home page.


Page last modified on 30 August, 2006.

--------------------

Viewable With Any Browser Valid HTML 3.2!

--------------------