Image Compression
In the evenings he would feast on ancient books
to the lazy enchanting lap of wavelets ...
from Vladimir Nabokov's Invitation to a Beheading
Heavily updated in May and June of 1998.
Sections
Lossy
WaveVideo
Finished Sept. 1999, finally released October 2004!!
WaveVideo.zip (300k zip). Full source + exe + crblib so it stands alone.
Compile with MSVC6. Uses DirectDraw 4.
The highlight of WaveVideo is a very fast full-frame Wavelet image decoder. See also this article 
Wavelet Video notes
NOTEZ: the compiled version uses the D97/C22 transform; for speed, make sure you use the C22/Haar
transform set.
Also, I do silly things like ftoi() that are not needed and could be way faster with SSE2. The true
speed of the algorithm is probably well beyond the experimental numbers.
excerpt from the readme :
Basically WaveVideo works in-place on an image. It is a full-image wavelet transform;
it is not block-based at all. It is also a true Wavelet transform (CDF22 and Daubechies 9/7
are supported), not just a trivial Haar wavelet.
The decode works with a few circular row buffers. The size of the buffers needed depends on
the span of the transform; e.g. Haar needs 2 rows, CDF22 needs 3 rows, D97 needs 7 rows.
Very good speed can be achieved with a horizontal CDF22 transform and a vertical Haar transform.
I've got some nice hand-written diagrams of the memory flow, but nothing digital.
Basically you first decode the wavelet coefficients for the needed rows into the temp LH,HL,HH
bands. You have the previous LL band already. You never actually decode the full image, you
decode rows as you need them. Then you do the horizontal inverse transform on the LL+LH and
the HL+HH to make temp L and H vertical rows; you build these into temp circular row buffers
until you have enough to do inverse vertical transforms, and then you do that into the output
image.
Each level of wavelet inverse transform works like this :
[coded stream] -> Coefficient decoder -> [wavelet coefficients for LH,HL,HH bands in temp buffers]
[previously decoded LL band] + [wavelet coefficients for LH,HL,HH bands] ->
horizontal inverse transform -> [temp L and H vertical rows] ->
vertical inverse transform -> [decoded image rows]
the decoded rows are thus streamed out in order. So, we are streaming in the coded bits and
the previous LL band, and streaming out a decoded image.
On the final level, we also do a YUV->RGB transform (the coded stream is in YUV).
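To make the vertical row-buffer flow concrete, here is a minimal sketch (my own illustration, with made-up names, not the WaveVideo source) of the vertical inverse step in the Haar case. Because the Haar span is only 2 rows, one L row and one H row are enough to emit two finished output rows at a time:

```c
#include <stddef.h>

/* One streamed step of the vertical inverse Haar.  rowL/rowH are outputs
 * of the horizontal inverse transform; assuming the forward convention
 * L = (a+b)/2, H = (a-b)/2, two image rows pop out per call. */
static void inverse_haar_vertical(const float *rowL, const float *rowH,
                                  float *out0, float *out1, size_t width)
{
    for (size_t x = 0; x < width; x++) {
        out0[x] = rowL[x] + rowH[x]; /* a = L + H */
        out1[x] = rowL[x] - rowH[x]; /* b = L - H */
    }
}
```

In the full decoder this gets called once per emitted row pair, which is why the vertical Haar needs only a 2-row circular buffer while D97 needs 7.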
WaveCode
 Get the source code to WaveCode (Zip,150k).
This is wavelet, DCT, S+P (tdec), and the kitchen sink.
 For the merely curious (or users of bad compilers), you can
download a v1.7 executable for Windows NT/95 (Zip,141k)
 Added a VC6 project, April 2003
 Wavecode v1.0 released May 23, 1998
 Wavecode v1.1 released May 26, 1998
 Wavecode v1.2 released June 16, 1998
 Wavecode v1.3 got lost in the mail :^)
 Wavecode v1.4 released June 26, 1998
adds many new coders, a new "stop at bitrate" and a seek-best-quantizer for bitrate,
adds zerotrees to some coders, and tunes some coders for speed. My DPCM on the LL
band is quite poor, so performance may even be worse than v1.2. The coder "bp_bin"
uses many ideas stolen from Xiaolin's excellent ECECOW. The "bp bin fast" is an
almost-as-good coder that is much faster; try also the "bp bin fast zerotree" for
blazing speed and compression.
 Wavecode v1.7 released August 19, 1998
 Adds a slick new DPCM for the LL that makes a fast estimate of the Gaussian variance,
then uses a semi-Gaussian code for that variance.
 Adds many Lifting Transforms : cdf22, cdf24, bcw3, d4, daub97
 Adds Wavelet Packets ; we don't prune the full tree right now, we only go forward
as deep as we can figure is optimal
 adds more coders still, and god knows what else
 does color images in YUV and sorts bands on activity
 More information someday soon; some performance numbers in the table below.
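For reference, the cdf22 in the transform list above is the 5/3 (LeGall) filter; in lifting form it is just a predict step and an update step, each trivially invertible. A sketch of the standard integer lifting steps (my reconstruction, not the WaveCode source; >> is used as floor division by a power of two, assuming arithmetic shift):

```c
/* CDF(2,2) lifting on one row of even length n, in place: after the
 * forward pass the even slots hold the smooth signal, the odd slots
 * hold details.  Edges are mirrored. */
static void cdf22_forward(int *x, int n)
{
    /* predict: odd samples become detail coefficients */
    for (int i = 1; i < n; i += 2) {
        int l = x[i - 1];
        int r = (i + 1 < n) ? x[i + 1] : x[i - 1]; /* mirror at right edge */
        x[i] -= (l + r) >> 1;
    }
    /* update: even samples become the smooth signal */
    for (int i = 0; i < n; i += 2) {
        int l = (i > 0) ? x[i - 1] : x[i + 1];     /* mirror at left edge */
        int r = (i + 1 < n) ? x[i + 1] : x[i - 1];
        x[i] += (l + r + 2) >> 2;
    }
}

static void cdf22_inverse(int *x, int n)
{
    /* undo the steps in reverse order; reconstruction is exact */
    for (int i = 0; i < n; i += 2) {
        int l = (i > 0) ? x[i - 1] : x[i + 1];
        int r = (i + 1 < n) ? x[i + 1] : x[i - 1];
        x[i] -= (l + r + 2) >> 2;
    }
    for (int i = 1; i < n; i += 2) {
        int l = x[i - 1];
        int r = (i + 1 < n) ? x[i + 1] : x[i - 1];
        x[i] += (l + r) >> 1;
    }
}
```

Because the inverse re-reads exactly the neighbors the forward pass used, reconstruction is perfect regardless of the rounding in the predict/update formulas; that is the whole charm of lifting.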
 Take a look at Lena in Wavelet, Fourier, and Lifting (15k,gif) bases.
(no quantization, that is, quantizer=1.0)
 I've often complained about the simple histograms used to estimate the rate for
rate-distortion optimal quantization (see, for example, the full discussion of the standard
method by Dominik Schmidt, which I downloaded locally and can't find on the web anymore).
Anyhoo, here is Lena at 0.25 bpp (8k,gif) with various quantizer values; the goal of the optimal quantizer
is to move around on this curve and find the quantizer which minimizes the error. As you can
see, this curve is not simple, and not monotonic.
 More...
DCT Based
We've got a nice family of DCT coders for you. All use a DCT Transform
shamelessly stolen from the Independent JPEG Group (and simplified, and
consolidated). We beat JPEG in a measure of MSE (mean-squared-error) or
PSNR (= 20*log10(256/rmse)), but that's no surprise, since JPEG is meant for
*visual* quality, not MSE quality; in fact arith-coded JPEG with flat
quantizers is quite competitive with the state-of-the-art wavelet coders.

Read the Bloom Public License

You need crblib to compile this source.

DCT (Zip,14k) coder. Many advances over JPEG are incorporated
(rate-distortion optimal thresholding, order-1 context coding, and various
other state-of-the-art techniques), but they help very little. Equipped
with vast status reports, this is a nice toolbox for examining DCT.

EZ DCT (Zip,12k) coder. An idea shamelessly
stolen from Xiong et al. ("A DCT-Based Embedded Image Coder") is to apply
the "EZ" scheme to DCT data. We tree-structure the coefficients of each
block, rather than zigzag scan. We simply transmit bitplanes. There are
no "zero trees" or "significance maps"; we refuse to introduce jargon which
is irrelevant to the coding; all an EZ coder really does is code the bitplanes
instead of the values. This coder is *incredibly* simple (and fast), and is
quite competitive with the state-of-the-art.
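To show literally what "code the bitplanes instead of the values" means, here is a toy serializer (illustrative only, not the ezdct source, and with no entropy coder behind it): each pass emits one bit per coefficient from the top plane down, and a coefficient's sign is emitted right after its first 1 bit appears.

```c
#include <stdlib.h>

/* Encode n coefficients by bitplanes into out[] as the characters
 * '1','0','+','-' ; returns the number of symbols written.  out must
 * have room for n * (planes + 1) symbols. */
static int code_bitplanes(const int *coef, int n, int planes, char *out)
{
    int len = 0;
    for (int p = planes - 1; p >= 0; p--) {
        for (int i = 0; i < n; i++) {
            int mag = abs(coef[i]);
            int bit = (mag >> p) & 1;
            out[len++] = bit ? '1' : '0';
            /* first significant bit: send the sign right after it */
            if (bit && (mag >> (p + 1)) == 0)
                out[len++] = (coef[i] < 0) ? '-' : '+';
        }
    }
    return len;
}
```

Feed those symbols to an arithmetic coder with decent contexts and you have the whole scheme; everything else is just deciding the scan order.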


Report of Results on Lena (512x512 Grey)

coder                      bit rate (bpp)   MSE     PSNR (dB)

JPEG base, flatq=75        0.25             59.1    30.45
JPEG huff, flatq=58        0.25             44.5    31.68
JPEG arith, flatq=54       0.25             41.3    32.00
EZW                        0.25             31.4    33.2
SPIHT (improved EZW)       0.25             29.2    33.5
WaveCode v1.7 (bitplane)   0.25             24.9    34.2
SFQ                        0.25             24.3    34.3

baseline jpeg              0.50             22.32   34.7
jpegarith                  0.50             20.00   35.15
flatq jpeg, huffman        0.492            19.45   35.3
ezdct                      0.50             19.00   35.38
dct                        0.50             18.53   35.49
tuned jpeghuff             0.50             18.5    35.5
flatq jpeg, arith          0.503            17.92   35.6
EZW                        0.50             15.4    36.3
SPIHT                      0.50             13.5    36.9
WaveCode v1.5              0.50             12.9    37.0
SFQ                        0.50             11.83   37.4
Key:
 dct and ezdct are our DCT coders.
 WaveCode is our generic transform-coder package. Results are
for the Daubechies 9/7 wavelet transform (with uniform quantizers)
unless otherwise stated.
 baseline jpeg is the very stupid JPEG which is usually set up and knocked
down like a bowling pin: it uses precomputed entropy tables. The "optimized"
JPEG uses per-file computed Huffman tables, and is a much more reasonable comparator.
 jpegarith is the ISO-spec implementation of JPEG arithmetic coding.
Note how well the flatq jpeg arith compares with EZW.
Get it
 flatq jpeg is JPEG with flat quantizers (25 for the arith-coding version,
27 for the "optimized" huffman version). Be careful when comparing, because I couldn't
get these to hit 0.50 bpp exactly. Note that the flatq huffman beats the arith-coding
jpeg with visual quantizers, and at a lower bitrate, clearly indicating that
quantizers are the single most important thing in JPEG.
 tuned jpeghuff is baseline JPEG with optimized (computed) Huffman tables,
and rate-distortion optimized quantization tables. This process is *very* slow, and
helps a whole lot. It should be noted that this is only mildly better than using flat
quantizers; it is foolish to compare tuned quantizers vs. baseline JPEG using an MSE
or PSNR rating.
 EZW is the famous coder of Shapiro. It beats ezDCT presumably only
because of the superiority of the wavelet transform.
 SPIHT is the improved version of EZW by Said & Pearlman. It is
structurally identical to EZW, but modernized.
 SFQ is the R-D optimal EZ coder of Xiong et al. ("Space-Frequency Quantization").
They also do optimal scalar quantization and so-called zerotree prediction. It's amusing
that SFQ at a rate of 0.25 has the same quality as JPEG baseline at twice the rate!!!
A sample run of DCT
Q:\dct>dct c:\ccc\images\peppers.512.raw 512 512 1 83
dct v0.5, (c) 1998 by Charles Bloom
peppers.512.raw : 262144 > 29840 = 0.910 bpb = 8.784 to 1 , 169125 byps
DC dpcm : 4096 > 3370 = 6.582 bpb = 1.215 to 1
AC order01 : 258048 > 22103 = 0.685 bpb = 11.674 to 1
AC order(1) : 56 > 55 = 7.857 bpb = 1.018 to 1
signs : 4321 > 4312 = 7.983 bpb = 1.002 to 1
error: av= 2.85,max= 29,mse= 13.712,rmse= 3.70,psnr= 36.79,perc= 3.67
performance, MSE = 0.640665 , RMSE = 2.372391 , percep = 2.394120
n0 = 33.197403 % , n1 = 7.017136 %, zero_len = 136453 , av = 33.313721
0p0 = 83.736185 % , 0p1 = 12.543942 % , 0pbig = 3.719872 %
nsign = 14.6774 % , same = 33.9069 % , diff = 28.8621 % , noc = 37.2310 %
Some figures are nice. The average EOB length of zeros is 33.3;
the non-EOB'd values are in turn 33% zeros. We also get the delightful
figure that a zero predicts another zero 84% of the time, and a large value only
follows a zero 4% of the time.
15% of pixels needed signs; 34% of these signs were the same as their parent,
29% were different, and 37% had a zero parent (no sign). This slight
skewing saved a whopping 9 bytes of the total file length. :^)
The "percep" error measure gives less weight to high-frequency errors;
it is an rmse-style error, so comparing 3.67 percep with 3.70 rmse, you see not much
difference (this is because we use uniform quantizers).
The coders here are reasonably
practical compared to the state of the art (CALIC,
context-based image coders (UCM)). I conjecture that the best
image coder would be a PPMZ modified to work on 2d arrays
of data (2d contexts), but I don't have time to make that
modification. (Some other subtleties may be necessary,
like a preliminary transform of basis to YUV or delta-codes,
and/or using the other color planes to predict the current
pixel (R predicts G, G predicts B, etc.).)

Read the Bloom Public License

You need crblib to compile this source.

Get my DPCM (Zip,8k) (Differential Pulse Code Modulation) coder.
The name is an anachronism; the DPCM coder is the simplest sensical
image coder; we take advantage of the fact that images tend to be
smooth. I use ideas shamelessly stolen from CALIC and others, so this source
code is for academic experimentation only!! Some parts are patented by other people,
so do some research before using it in commerce.
Added 3-28-98 : "imppm" deterministic prepass with a PPM (with my SEE) helps a bit.
We now slightly beat the state-of-the-art, but I think with the techniques in here,
we should be doing even better still. There's some fine-tuning which is very delicate.
Noted 4-28-98 : there is a minor bug which causes stalls on rare files
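For flavor, here is the simplest predictor in this family of "stolen" ideas, the median/gradient rule (this one is MED from LOCO-I/JPEG-LS, a cousin of CALIC's GAP, not necessarily the predictor in the coder above):

```c
/* Predict a pixel from its West, North, and North-West neighbors.
 * Near an edge it falls back to the flatter of W/N; in a smooth
 * region it makes a planar prediction. */
static int med_predict(int w, int n, int nw)
{
    int mx = (w > n) ? w : n;
    int mn = (w < n) ? w : n;
    if (nw >= mx) return mn;  /* edge: take the smaller neighbor  */
    if (nw <= mn) return mx;  /* edge the other way: the larger   */
    return w + n - nw;        /* smooth region: planar fit        */
}
```

The DPCM residual (actual minus predicted) is what gets entropy coded; on smooth images it clusters tightly around zero, which is where the variance-estimation tricks above come in.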

A tree decomposition (zip,3k) coder. This is an extremely small
and fast little coder, with performance roughly on par with CALIC's 1-off mode (lossy, with a
largest error of +/- 1, and a square-average error of zero). The advantage over CALIC's
lossy mode is the extreme simplicity and speed of this coder, along with its progressive
transmission. Unfortunately, I see very little room for improvement here.

Its big brother is EZ treedec (zip,4k). This is
inspired by the "EZ" wavelet and DCT coders. The transform is the same simple one as
in the above tdec, but is now kept lossless by going to 10-bit pots. An EZ pyramid
scheme is used to code bitplanes of coefficients. We get about 0.4 bpb worse than
the state-of-the-art lossless coders, at the benefit of extreme simplicity. This is
a nice way to learn the EZ scheme on a simple coder.
Our performance is quite encouraging, considering that our transform is
basically a 2x2 DCT. (Apparently I rediscovered the
so-called "S-transform", the trivial Haar "wavelet".) I get 4.640 bpp on
Lena, which equals the best lossless JPEG (not saying much) and
makes a mockery of the 4.77 figure that Said & Pearlman quote for the S-transform.
As usual, authors do not give proper credit to the paper-doll algorithms they
knock down.
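Since the transform turned out to be the S-transform, here is its standard per-pair definition for reference (a sketch of the textbook form, not the tdec source; >> on a negative value is assumed to be an arithmetic shift, i.e. floor division):

```c
/* Lossless integer Haar (the S-transform) on one pair of samples.
 * Forward: h = a - b, l = floor((a + b) / 2).  Exactly invertible. */
static void s_forward(int a, int b, int *l, int *h)
{
    *h = a - b;
    *l = b + (*h >> 1);   /* == floor((a + b) / 2) */
}

static void s_inverse(int l, int h, int *a, int *b)
{
    *b = l - (h >> 1);
    *a = *b + h;
}
```

Keeping the full difference h (and the floor in l) is what makes it lossless; it is also why the coefficients need an extra bit of headroom over the 8-bit input.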
Now try EZ treedec IV (zip,7k). I got the basic
idea from Said & Pearlman's S+P coder, went off to play with it and tuned the params,
and lo, reproduced their coder exactly. Strangely, their compression beats us
by about 0.2 bpp. Also strangely, this coder seems to work best at levels==0,
that is, pure DPCM! This is a disappointment, and presented only for
instructional value (or maybe you can tell me what went wrong).
(BTW the EZ treedec coders are now available as a subset of WaveCode, and
the ones in WaveCode are better; the code here is self-contained and
embedded, so may still be valuable for instruction.)
 The "Tapestry" Image (512x512 x 8-bit grey). I created this as a test, and it's so
interesting that I post it in its own right. This is a natural image,
copied around to be extremely repetitive. This image is a challenge
to image compressors, which are generally very bad at this sort of thing
(or good at it, like PNG and Rkive, while being bad on standard tests
like Lena). For example, PPMZ v9.1 compresses Tapestry to 0.736 bpp
(download it, 24k), while EZTdec3 packs it to 7.289 bpp,
almost ten times worse!!! In fact this is one of the most (apparently) incompressible
non-random images I have ever found.
Charles Bloom / cb at my domain send me email