Go to the first, previous, next, last section, table of contents.


Programming with libbzip2

This chapter describes the programming interface to libbzip2.

For general background information, particularly about memory use and performance aspects, you'd be well advised to read Chapter 2 as well.

Top-level structure

libbzip2 is a flexible library for compressing and decompressing data in the bzip2 data format. Although packaged as a single entity, it helps to regard the library as three separate parts: the low level interface, and the high level interface, and some utility functions.

The structure of libbzip2's interfaces is similar to that of Jean-loup Gailly's and Mark Adler's excellent zlib library.

Low-level summary

This interface provides services for compressing and decompressing data in memory. There's no provision for dealing with files, streams or any other I/O mechanisms, just straight memory-to-memory work. In fact, this part of the library can be compiled without inclusion of stdio.h, which may be helpful for embedded applications.

The low-level part of the library has no global variables and is therefore thread-safe.

Six routines make up the low level interface: bzCompressInit, bzCompress, and
bzCompressEnd for compression, and a corresponding trio bzDecompressInit,
bzDecompress and bzDecompressEnd for decompression. The *Init functions allocate memory for compression/decompression and do other initialisations, whilst the *End functions close down operations and release memory.

The real work is done by bzCompress and bzDecompress. These compress/decompress data from a user-supplied input buffer to a user-supplied output buffer. These buffers can be any size; arbitrary quantities of data are handled by making repeated calls to these functions. This is a flexible mechanism allowing a consumer-pull style of activity, or producer-push, or a mixture of both.

High-level summary

This interface provides some handy wrappers around the low-level interface to facilitate reading and writing bzip2 format files (.bz2 files). The routines provide hooks to facilitate reading files in which the bzip2 data stream is embedded within some larger-scale file structure, or where there are multiple bzip2 data streams concatenated end-to-end.

For reading files, bzReadOpen, bzRead, bzReadClose and bzReadGetUnused are supplied. For writing files, bzWriteOpen, bzWrite and bzWriteFinish are available.

As with the low-level library, no global variables are used so the library is per se thread-safe. However, if I/O errors occur whilst reading or writing the underlying compressed files, you may have to consult errno to determine the cause of the error. In that case, you'd need a C library which correctly supports errno in a multithreaded environment.

To make the library a little simpler and more portable, bzReadOpen and bzWriteOpen require you to pass them file handles (FILE*s) which have previously been o