|Adaptive Prediction Tree Coding|
APT Version 1.0 (Released March 2004)John Robinson
What is APT?
APT stands for Adaptive Prediction Trees. It is a general-purpose image coding method for lossless and lossy compression. It codes monochrome and colour images with bit depths from 1 to 16 bits per component. Its performance on all image types is often close to that of the best alternative.
APT uses principal components analysis, a multicomponent image pyramid, shape-adaptive non-linear predictions, adaptive runlength and Huffman coding. It also supports coding with colour palettes where appropriate. No part of APT is patented (but see the note below on the palette ordering method in the provided source code). APT is inherently progressive, and a straightforward modification of the decoder to write directly to an on-screen picture buffer allows simple, progressive, image recovery.
APT supersedes Binary Tree Predictive Coding (BTPC). The final version of BTPC was version 5, released March 2003. APT's spatial predictor, inter-component predictor, quantizer and lossless coder are all different to those in BTPC. APT uses hex-trees rather than binary trees.
How does APT compare with the alternatives?
Here is a thumbnail of a photographic image (original size 512 x 512):
Here is a graph comparing the coding of the original photo by APT and standard alternatives.
Note that the x axis of the graph (bits per pixel) is logarithmic. The vertical axis (Peak Signal to Noise Ratio) is in dB. Higher and to the left is better so this graph shows JPEG 2000 marginally beating APT over a wide range of data rates. The lossless data rates for APT, PNG and JPEG-LS are shown as vertical lines (they correspond to infinite PSNR so using vertical lines is just a way to show them on the same graph). The leftmost line is best, so APT marginally beats JPEG-LS. (The actual values for lossless compression are 12.2 bits per pixel for APT, 12.4 bpp for JPEG-LS and 13.6 for PNG.)
This graphical image (original size 40 x 610):
is coded losslessly by APT at 1.40 bpp, by PNG at 1.92 bpp and by JPEG-LS at 5.46 bpp. The picture was originally a palettized GIF at 7.24 bpp. Lossy coding is shown here:
Again the lossless results for APT, PNG and JPEG-LS are included as vertical lines, showing that both JPEG and JPEG 2000 must code below 30 dB PSNR to achieve a similar data rate to APT's lossless performance.
Results comparing APT with competitors on many other images are given at the bottom of this page. In summary:
When applied losslessly, APT has good performance on both natural images and graphics. On average it matches JPEG-LS on photographs and does much better on graphics. It beats a typical one-shot PNG compressor almost always, but when many possible options for PNG are tried during encoding (e.g. by pngcrush) APT loses to PNG about 30% of the time. In tests APT is rarely more than 5% worse than the best of JPEG-LS, PNG and GIF but it is sometimes much better than all three. For example, the graph image distributed with the APT source code compresses to 0.096bpp with APT, 0.14bpp with PNG and 0.47bpp with both GIF and JPEG-LS.
As a lossy compressor for photographic images, APT is always superior to JPEG and sometimes superior to JPEG 2000. More often it is slightly inferior to JPEG 2000 for photos, but the subjective difference is very small at the compression ratios used in practice. At very low data rates two artifacts of APT become visible -- smudging and speckling. These may be less acceptable subjectively than the "blocky" artifact of DCT and wavelet compression. The two expanded versions of a small part of lenna below were both coded at 0.2 bpp (more compression than is usual). The JPEG 2000 case, on the left, used unequal colour weighting to maximize subjective quality. (In objective comparisons I test JPEG 2000 with equal colour weighting to maximize PSNR.) APT does not provide a choice of colour weightings so the image shown on the right is the same as used in objective comparisons.
For lossy compression of graphics, APT is much superior to both JPEG and JPEG 2000. For other imaging modalities (e.g. MRI, SAR) APT performs close to JPEG 2000.
APT's lossless coding of bilevel images is usually inferior to CCITT Group IV coding (i.e. READ coding). The worst measured performance for APT against standards has been for coding high-resolution document images (e.g. faxes) where it uses up to twice as many bits as Group IV. However, for detailed bilevel images (e.g. the contours of regions in two-component video coding, or a transparency bitmap) it outperforms Group IV. It far outperforms PNG, JPEG-LS, etc. for bilevel images.
APT compresses images with component depths greater than 8 bits. This makes it suitable for 12-bit and 16-bit medical and remote sensing images. I have not done a systematic evaluation of APT's relative performance on such images for lack of suitable comparison coders, but performance is better than independent coding of 8-bit planes by any competing method. However, at its core, APT is still an 8-bit method, and the released version does not include quantization tests to see whether a presented 16-bit image is really just a scaled 8-bit original. In future work I hope to improve APT's 16-bit performance by converting to 16-bit resolution throughout. This should also lead to better handling of large palettes and allow lossless coding with invertible colour transforms.
In the freely-available software implementation of APT I have made some effort to make decoding fast, but have left several slow functions in the encoder. As a result encoding can take up to ten times as long as decoding. APT averages about the same decode speed as Kakuda, a speed-optimized software implementation of JPEG 2000, for images of equivalent quality. APT is much faster than any known version of JPEG-LS.
Getting and using the code
C++ source code is available for APT and may be freely used.
Note: The palette ordering method implemented in the file palettize.h of the source code may be covered by a patent. APT itself does not define palette ordering as this is a matter for the encoder only. I anticipate that later versions of the APT source code will implement a patent-free palettizer in the encoder. This will not affect the APT format (i.e. the palette order can be changed and the decoder will work unaltered). Note that palettes are only used for lossless coding of images with a small number of colours.
The tarred and gzipped package apt1.0.tar.gz contains:
The executables are:
The makefile contains instructions to make all of these.
The optional quality parameter is a number between 0 and 100. Default is 80. Quality 100 is lossless.
capt reads PBM, PGM and PPM raw format files (magic number "P4", "P5" or "P6"). dapt writes PGM and PPM files only.
I have included two pictures in the distribution. To test your compiled versions of capt and dapt work correctly, you can do:
If you then view graph.ppm, you will see the performance curves for
of airplane.ppm. airplane.ppm is a photographic image, so the curve for
slightly below that of JPEG 2000. graph.ppm, however, is graphical,
and is an example of an image that compresses significantly better
APT than its competitors.
BTPC was described in J A Robinson, "Efficient General-Purpose Image Compression with Binary Tree Predictive Coding", IEEE Transactions on Image Processing, Vol 6, No 4, April 1997, pp 601-607. The web pages at http://www.elec.york.ac.uk/visual/jar11/btpc/btpc.html explain the principles of the method as well as provide results for BTPC version 4.1. I recently (March 2004) learned that BTPC 4.1 is still being used on mobile phones because of its very fast encode and decode speeds, so here is the Unix/Linux archive and here is a zipfile including Windows binaries. Some of the C++ in version 4.1 is obsolete, so you may have to tweak it to get it to compile these days. The BTPC version 5 webpages are still online in a similar format to these APT pages.
APT comparisons with standard algorithms