Here are two mail letters I wrote in response to responses to a new article I posted with the title "arithmetic encoding revisisted revisisted". I have since lost the original article in a hard drive crash. If anyone has this article, I would REALLY appreciate a copy!!
From bloom Thu Jun 22 09:34:18 1995
Date: Thu, 22 Jun 95 09:34:14 PDT
From: bloom (Charles Bloom)
To: noesis@ucscb.UCSC.EDU
Subject: Re: Huffman beats Arithmetic !!
Cc: bloom
Content-Length: 4291
Status: RO
X-Lines: 152
> liked the article. common for people to neglect the simple
> when given what looks like better / more complex. linked lists
> and bubble sort are good examples.
>
> but...that's not why i wrote :)
>
> i thought i new bunches about compression, but alas, i've never
> heard of either:
>
> >DCC95 code : 10k bps
> >Rissanen-mohiudin : 15k bps
>
> could you (briefly) describe these or send me off to an ftp
> site.
> thanks
> --kyle
>
response posted to comp.compression.research 6-22-95
Hi!
"DCC95 code" refers to the source code for "Arithmetic
Encoding Revisited" by Moffat, Neal & Witten. It is
available at munnari.oz.au : pub/arith_coder/arith_coder.tar.gz
"Rissanen-mohiudin" refers to the technique in
"A Multiplication-Free Multialphabet Arithmetic Code"
by J. Rissanen and K.M. Mohiuddin
I'm afraid there is no publicly available version of this
code... maybe I'll post mine...
the basic idea is:
normal arithcoding:
you have range defined my Low (L) and Range (R)
you want to shrink that range as set by SymbolTotal,
SymbolLow, and SymbolHigh (St,Sl,Sh) ( Sl < Sh < St )
(St = cumprobtotal, Sl and Sh are lower and upper
comprobs of a symbol)
L = L + (Sl/St)*R+
R = ((Sh/St) - (Sl/St))*R
rewrite as:
L = L + Sl*(R/St)
R = (Sh-Sl)*(R/St)
rissanen-mohiuddin coding:
the idea is to eliminate the multiply divide by
setting R/St = 1
thus L = L + Sl
R = Sh - Sl
tada! The key is: since the only thing which matters with Sl,Sh,St is their
ratios (i.e. you can scale them without changing their
meaning) (this is obvious in the first form of the
above formulas, in which only ratios of cumprobs appear)
you can therefore change them so that St = R
If you actually scaled St up to R, this would be
pointless (because it would require mult & div)
but you can approximately scale St to R by
just shifting St up to the neighborhood of R.
Now I will go into a more theoretical discussion
once St is within the region (R>>1) < St < R
you have a problem : there is extra code space
E = R - St
not using this extra code space would cause a
serious penalty in compression (about 0.5 bpb worse)
Thus, you must put this extra code space (E) onto
some symbol.
The fastest way to do this is to put it on the last
symbol of the alphabet:
if ( Sh == St )
R = R - Sl
L = L + Sl
(note the difference between this and the original)
A simple, pretty good way, is to put E on the MPS
thus:
if C > MPS
Sl += E
Sh += E
else if C == MPS
Sh += E
else
no change
Now, the interesting thing : the best way of all
to allocate E is to distribute it in the same
ratio as the counts. Thus, we wish to add
to each count some fraction of itself, until
St == E
To do this, we cannot just double each count,
because St > E , however, we may be able to
add the count/2 to itself:
also note that adding count/2 to each count
is the same as adding cumprob/2 to each cumprob.
thus:
if (St>>1) > E
{
Sl += Sl>>1
Sh += Sh>>1
E -= St>>1
}
if not, try it for adding count/4 to itself,
etc. until E == 0 (or thereabouts, E<16 or so
should be fine)
Note that this is not completely right because
of integer division rounding effectings (i.e.
St>>1 will be larger than the actual sum of all
shifted-one counts, because the odd numbered counts
will drop their least-sig-one )
Notez: this identical to the DCC95 code!
Thus : I show that the DCC95 code is just a slick
way to allocate the extra-code-space from the
Rissanen-Mohiuddin code !!!
Pretty cool, huh?
sidenote: I'm not really positive that this
result is the same as the DCC95 code, but it
looks a whole lot like it... definitely, the
viewpoint/perspective is different, but I
think the resultant algorithm is the same...
notez: here's an area of research for someone to
explore (tell me about it if you do) :
a compromise between the speed of
just adding E to the MPS and properly
distributing E among the symbols could
be found, such as properly distrubiting E
amound the top-four-most-probable symbols
etc. Perhaps, for high-order context
modelling, you could use a very small
alphabet of the most probable
symbols and so could very easily
distribute E...
Charles Bloom
From bloom Thu Jun 22 18:36:27 1995
Date: Thu, 22 Jun 95 18:36:25 PDT
From: bloom (Charles Bloom)
To: alistair@cs.mu.OZ.AU
Subject: Re: "Arithmetic Coding Revisited" Revisited
Cc: bloom
Content-Length: 3232
Status: RO
X-Lines: 94
Hi, Dr. Moffat,
Thanks for the response.
> > > Yes, the RM method is essentially a special case of the DCC'95 code,
> > > when b-f=2. This relationship was noted in the paper.
> >
> > b-f = 2 and they don't bother with spreading the extra code space:
> > they just drop it on the MPS which is the major difference in speed..
>
> Well, the same is true for the DCC code. It does b-f iterations of
> ``spreading'' the error, and then puts all residual on the last
> symbol. The code as distributed does not ensure that the last is the
> MPS, but the paper notes that for best efficiency the extra should be
> given to the MPS.
Exactly, except that the RM code doesn't even do those b-f iterations !
Once you have scaled St such that
(R>>1) < St < R
then there are no more shifts for the RM coder to do, all you do is:
L += Sl
R = Sh - Sl
( in the terminology of my original message; review :
L and R describe the current setting of the arithcoder
Sl,Sh,&St are symbol cumprobs supplied by the model
St is the total of all probs
Sl & Sh are the upper & lower cumprobs
for full precision you would do:
L += Sl*R/St
R = (Sh-Sl)*R/St )
Whereas the DCC95 coder does:
E = R - St
Slup = Sl>>1
Shup = Sh>>1
Stup = St>>1
for(i=0;i>=1;
Shup>>=1;
Stup>>=1;
}
L += Sl
R = Sh - Sl
note that the DCC95 coder looks a little different than this
because of the fact that it does the normalization of
(R>>1) < St < R
and the allocation of E in one single merged loop...
the RM code eliminates all of that b-f loop by just doing the last two
instructions. The RM code sticks a whole lot on the last symbol : E,
whereas the DCC95 code sticks much less than E on the last symbol...
> > BTW is the full paper version of "arithcoding revisited" available yet?
> > (I asked once before and was told that it was still in draft stage)..
>
> No, not yet. Too many other things have intervened.
> Sorry...
I'll just keep waiting ....
BTW the reason I have investigated this topic is because I developed
my own mult&div-free arithcoder, and discovered that it was the same
as the DCC95 & RM coders. I never had a really good understanding of
the RM & DCC95 coders until I rediscovered them :^> +
Essentially the problem comes down to this:
I find that despite many advances, arithcoders are still
too slow for practical use : Static (two-pass) Huffman is about 10 times
faster than the DCC95 coder, and is even 5 times faster than the
very crude RM coder - and static huffman often outperforms the RM coder
for compression ratio (on large, evenly-distributed order-0 alphabets) !
The DCC95 coder is further slowed by the complex model
maintenance required to maintain the partially-normalized counts, and
the RM coder is slowed by the same need to maintain
(R>>1) < St < R
i.e. partial normalization ...
Is there a solution to his debacle? The problem is that I
need to be able to do order-1 with the speed of Static Huffman (order-0)
and I had hoped that an approximate arithcoder would do it, but I
find that it does not...
Charles Bloom
Charles Bloom / cb at my domain