Now I get it : this would avoid the problem of high-order models that have not collected any statistics yet, and are initialized with a probability of 1/2. Thanks for the tip! Sorry if I seem a bit slow off the mark; I get a bit confused with the terminology sometimes.

What I propose (or better: what I have seen in LPAQ IIRC) is to have 5 mixers. Use the first one if in step 1 you collected only 1 prediction, the second one if you collected 2 predictions, and so on; finally, use the fifth one if you collected 5 predictions (i.e. all models had available predictions).
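The five-mixer idea could be sketched like this (a minimal C illustration, not taken from lpaq; all names are made up):

```c
#define NMODELS 5

/* One mixer (weight vector) per possible count of available
   predictions; pick the mixer whose index equals how many models
   produced a prediction this step. */
typedef struct {
    double w[NMODELS];          /* one weight per model input          */
} Mixer;

static Mixer mixers[NMODELS];   /* mixers[k-1] used when k inputs seen */

/* Mix k stretched predictions with the mixer dedicated to that count. */
double mix(const double *st_pred, int k)
{
    Mixer *m = &mixers[k - 1];
    double dot = 0.0;
    for (int i = 0; i < k; i++)
        dot += m->w[i] * st_pred[i];
    return dot;         /* still in the logistic (stretched) domain */
}
```

This way a weight vector trained on "all 5 models present" is never polluted by steps where high-order models had nothing to say.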
A match model could be a good solution to discard long matches. I have to study this.
Last edited by Jean-Marie Barone; 23rd May 2013 at 11:39. Reason: typo
OK, probably I mixed something up.

Both of those techniques are used in various paq8 versions (but I think not in lpaq, or possibly in zpaq).
Really? I got the impression that you don't update the contexts nor rebuild them, so in that case that could be true. But for simple order-n models it should be sufficient to track the last max(all model orders) bytes and then rebuild the contexts for all models on a mismatch.

One problem with disabling models during long matches is that those models will not see most of the match and will then predict the next byte badly, because they don't see the immediate context.
You mean the highest order index or the long match length? I think the highest order index (if the limit is relatively small, like a single digit) should be the best single context for selecting the mixer.

As for using the context match length as mixer context, it does help, but I'm not sure if it works any better than just using an order 0 or 1 context to select the mixer weight vector, as is normally done in zpaq. It might.
Yep, mainly. Additionally, it could also have other implications - data compression is so unpredictable that you can't reliably predict when something will help or damage compression :]

> Now I get it : this would avoid the problem of high-order models that have not collected any statistics yet, and are initialized with a probability of 1/2.
More precisely - it would be good for avoiding many redundant (well, not 100% redundant, but practically useless) computations.

> A match model could be a good solution to discard long matches.
In general, removing repetitions doesn't change the load on the hash table, as repetitions do not add any new entries - the entries are already there. On the other hand, removing repetitions could help avoid problems like overfitting, I think.
> You mean the highest order index or long match length? I think highest order index (if the limit is relatively small, like single digit) should be the best single context for selecting mixer.
Probably so. I think that's why an ICM-ISSE chain works better than mixing different order ICMs. It effectively does the same thing as using the context order as a mixing context because at each step in the chain, an empty bit history selects a pair of weights to modify the prediction as p := w1*p + w2 in the logistic domain. Initially, w1 = 1, w2 = 0. If the proper action is to emphasize the lower order prediction, then the ICM will learn this by increasing w1.
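A minimal sketch of that ISSE step (hypothetical names; a real ICM-ISSE is table-driven with quantized weights per bit-history state, this just shows the p := w1*p + w2 arithmetic and its gradient update):

```c
/* Each bit-history state selects a weight pair (w1, w2); the refined
   stretched prediction is w1*p + w2, and after the bit is known both
   weights move along the error gradient. */
typedef struct { double w1, w2; } WeightPair;

double isse_predict(const WeightPair *wp, double st_p)
{
    return wp->w1 * st_p + wp->w2;   /* logistic-domain combination */
}

void isse_update(WeightPair *wp, double st_p, double err, double rate)
{
    wp->w1 += rate * err * st_p;     /* err = bit - squash(prediction) */
    wp->w2 += rate * err;
}
```

With the initial (1, 0) pair the chain passes the lower-order prediction through unchanged, which is exactly the "empty bit history" behavior described above.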
> But for simple order-n models it should be sufficient to track the last max(all models orders) bytes and then rebuild the contexts for all models on mismatch.
That would probably work, but PAQ doesn't do that. There would be a speed penalty, of course.
● A context seen for the first time points to a Bit history value of 0. With version 1.16, the pair of Frequencies linked to this Bit history entry will not be updated, so its probability will remain p(0)=1/2. Therefore, because Stretch(p(0))=log2(0.5/0.5)=0, contexts seen for the first time will have no impact on the global prediction.
● The Match model is triggered on a byte boundary. It searches, on encoding, the Source buffer history (data read so far) to find the same pattern as the current Context (8 bytes). On decoding, it will search the Destination buffer (data written so far).
The bits following that pattern will be considered as next prediction bits until a mismatch occurs, with probabilities of p(0)=65535/65536 if bit 0 is predicted, and p(0)=1/65536 if bit 1 is predicted.
Eventually, that prediction will be included in the Logistic mixing process, along with Orders-6 to 1 predictions.
The Match model uses a hash-table of 512 Kilobytes (65536 entries of 8 bytes) to find in the buffer a pattern matching the current Context.
Each entry of the hash-table is designed like this :
The 64 bits of the Context are hashed to a 16-bit value. Because collisions are very likely, the low half of the current Context will be compared with the 2nd double-word of the Match hash-table entry, which contains the low half of the last Context having produced the same hash.
If both values differ, a collision is detected and the Match model will terminate with a Match Prediction value of 2, meaning that no prediction could be made, and a Match Offset value of 0, meaning the model is turned "off".
If both values are equal, Match Offset value will be filled with the first double-word of the Match hash-table entry, which contains the offset in the buffer of the next byte that follows the same context. Match Prediction will be filled with the most significant bit of that byte.
In all cases (whether a prediction could be carried out or not), the entry of the Match hash-table will be updated with, respectively, the offset of the byte following the current Context and the Context's low double-word value.
During the next bit encoding/decoding process, the procedure will know that a Match model is "on" by testing the Match Offset variable which will be different from 0, and will fetch the next bit of Match Prediction directly from that address. If that prediction turns out to be wrong, Match Offset will be set to 0, and a new prediction will be calculated on the next byte boundary.
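The lookup-and-update logic described above could be sketched like this (illustrative C, not SMAC's actual assembly; field and function names are invented):

```c
#include <stdint.h>

/* Each entry holds the buffer offset of the byte that followed a
   context, plus the low half of that context for collision detection. */
typedef struct {
    uint32_t offset;    /* position of byte following the context   */
    uint32_t ctx_low;   /* low 32 bits of the 64-bit context        */
} MatchEntry;

#define TABLE_SIZE 65536
static MatchEntry match_table[TABLE_SIZE];

static uint16_t hash16(uint64_t ctx)   /* any 64->16 bit mix works  */
{
    ctx ^= ctx >> 32;
    ctx ^= ctx >> 16;
    return (uint16_t)ctx;
}

/* Returns the stored offset on a verified match, 0 on a collision
   (model "off"); then refreshes the slot with the current context. */
uint32_t match_lookup_update(uint64_t ctx, uint32_t cur_offset)
{
    MatchEntry *e = &match_table[hash16(ctx)];
    uint32_t found = (e->ctx_low == (uint32_t)ctx) ? e->offset : 0;
    e->offset  = cur_offset;       /* always update, match or not   */
    e->ctx_low = (uint32_t)ctx;
    return found;
}
```

A non-zero return plays the role of the Match Offset variable: the predictor then streams bits from that buffer position until a mismatch.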
Results on enwik8 & enwik9 are 21831822 & 183551384.
Last edited by Jean-Marie Barone; 31st July 2013 at 04:20. Reason: typo
Updated LTCB and Silesia.
http://mattmahoney.net/dc/text.html#1835
http://mattmahoney.net/dc/silesia.html
Jean-Marie Barone (1st August 2013)
After looking over Intel & Agner Fog’s documentation regarding Core Duo processors & above, I endeavoured to modify the code in order to give a boost to compression & decompression times.
I have discarded many instructions decoding to more than 1 μop or with high latencies, reorganized them to favor a 4-1-1-1 decoding scheme and, more generally, tried to take advantage of the latest Intel processors' features like store-forwarding, the loopback buffer, zero-latency movs, macro-fusion, etc...
The FNV-1 hash for Contexts has also been overhauled, so that the Order-4 hash uses Order-3 hash as a partial result, and likewise for Order-6 hash.
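The incremental hashing trick can be sketched as follows (a plain 32-bit FNV-1 illustration, not SMAC's exact hash): extending the order-3 hash by one byte yields the order-4 hash, so the shared prefix is never rehashed.

```c
#include <stdint.h>

#define FNV_PRIME 16777619u
#define FNV_BASIS 2166136261u

/* One FNV-1 step: fold one more byte into a partial hash. */
uint32_t fnv1_extend(uint32_t h, uint8_t b)
{
    return (h * FNV_PRIME) ^ b;
}

/* Full FNV-1 over n bytes, built from the same step function. */
uint32_t fnv1(const uint8_t *p, int n)
{
    uint32_t h = FNV_BASIS;
    for (int i = 0; i < n; i++)
        h = fnv1_extend(h, p[i]);
    return h;
}
```

Because FNV-1 consumes bytes strictly left to right, hash(order-4) = fnv1_extend(hash(order-3), next byte), and likewise up the chain.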
There is no improvement in the compression engine itself, yet the results are a tad better due to a change in the rounding method of the Bit History counters. Also, the lookup table for 2^32/(n0+n1) now uses floating-point values instead of the integers used previously.
Results on enwik8 & enwik9 are 21816272 & 183459153.
Nice improvements. http://mattmahoney.net/dc/text.html#1834
Edit: also, some improvement in the Silesia benchmark: http://mattmahoney.net/dc/silesia.html
Last edited by Matt Mahoney; 4th November 2013 at 18:22.
Jean-Marie Barone (5th November 2013)
I managed to gain 20 more seconds on enwik8 with my Core 2 Duo Merom, essentially by getting rid of the slow FSCALE instruction in the Squash function, while calculating 2^-p.
I also replaced some FPU instructions with faster ones, plus a few more optimization tricks.
Compression results remain identical to the previous version.
Nice speed improvement. http://mattmahoney.net/dc/text.html#1834
Jean-Marie Barone (20th November 2013)
* The Squash function has been entirely rebuilt using a chain of polynomial approximations (3rd degree, 32 intervals, step=1).
I've traded a bit of compression for the benefit of speed : -40 seconds on enwik8 on my Core 2 Duo.
* Until recently, I was not aware that XMM registers were considered as volatile when calling an API.
On Windows 64-bit, it was possible (but not confirmed) that calls to ReadFile & WriteFile destroyed xmm0 & xmm2, which held the Context and a mask for a hashtable.
Ergo, I have replaced xmm0 & xmm2 with xmm6 & xmm7, which are preserved across calls on 64-bit Windows.
The Squash function now uses an SSE3 instruction (fisttp). Since SSE3 was introduced in 2004 (Intel) & 2005 (AMD), I've assumed that nowadays it is available on every PC (in other words, I've been too lazy to check for SSE3 compatibility).
The (minimax) polynomial tables for the Squash function are the work & idea of qWord : thanks to him, and all MASM32 forum members for their support!
Results on enwik8 & enwik9 are 21816285 & 183459860.
Last edited by Jean-Marie Barone; 9th December 2013 at 06:04. Reason: some bugs with font
Nice speed improvement. http://mattmahoney.net/dc/text.html#1834
Edit: updated http://mattmahoney.net/dc/silesia.html
Last edited by Matt Mahoney; 10th December 2013 at 16:40.
Jean-Marie Barone (10th December 2013)
- I have figured out that my Stretch function could be radically improved.
Indeed, if we substitute p0=(n0/(n0+n1))*2^32 in Stretch(p0)=log2(p0/(2^32-p0)), we end up with :
Stretch(p0)=log2(n0/n1)=log2 n0 – log2 n1
with n0=bit 0 count, n1=bit 1 count, p0=prediction that next bit will be 0, and ni ranges between [1;65535]
Thus, there was no need to calculate p0 & p1 at all. Compression of enwik8 is now less than 5 min. under my Core 2 Duo.
- The Context hashes are now calculated earlier in order to increase the Prefetch Scheduling Distance : prefetch instructions benefit from this additional leeway to complete successfully.
Results on enwik8 & enwik9 are 21816323 & 183459942.
Nania Francesco (19th December 2013)
Updated LTCB and Silesia.
http://mattmahoney.net/dc/text.html#1834
http://mattmahoney.net/dc/silesia.html
Jean-Marie Barone (20th December 2013)
I have replaced my prediction formula p0=n0/(n0+n1) with Laplace's rule of succession, which gives better results in estimating the probability of the next symbol.
From now on, p0=(n0+1)/(n0+n1+2), and Stretch(p0)=log2(n0+1)-log2(n1+1).
I made some experiments with the Krichevsky-Trofimov estimator, where p0=(n0+0.5)/(n0+n1+1), and Stretch(p0)=log2(2n0+1)-log2(2n1+1), but results were not as good.
One positive side-effect of the Laplace estimator is that it solves the zero-frequency problem : with the previous formula, I had to initialize the n0 and n1 counts to one to avoid a division by zero. Now, the frequencies can range over [0;65535].
The Match model has been improved. The bit following a matching pattern of 8 bytes is now predicted with a probability of p0=1-(1/x) if bit 0 is predicted, and p0=1/x if bit 1 is predicted, with x=64 and x=x+1 after each correct prediction.
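The growing match confidence might look like this in C (a sketch; the reset-on-miss behavior is my assumption, the post only specifies the start value and the increment):

```c
/* The predicted bit gets probability 1 - 1/x, starting at x = 64 and
   incrementing after each correct prediction. */
typedef struct { unsigned x; } MatchConf;

/* Returns p(0), the probability that the next bit is 0. */
double match_p(const MatchConf *m, int predicted_bit_is_0)
{
    double miss = 1.0 / m->x;
    return predicted_bit_is_0 ? 1.0 - miss : miss;
}

void match_feedback(MatchConf *m, int correct)
{
    if (correct) m->x++;   /* longer match -> higher confidence      */
    else         m->x = 64; /* assumed reset; the post only says x++ */
}
```

So a fresh match starts at 63/64 confidence instead of the fixed 65535/65536 of the earlier design, and earns trust as it keeps being right.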
Results on enwik8 & enwik9 are 21781544 & 183190888.
Results for Silesia corpus.
Edit: LTCB results. http://mattmahoney.net/dc/text.html#1831

Code:
Silesia   dicke mozil mr   nci  ooff osdb reym samba sao  webst x-ray xml Compressor -options
--------- ----- ----- ---- ---- ---- ---- ---- ----- ---- ----- ----- --- -------------------
48093438  2322  16009 2446 1787 2270 2725 1104 3596  4892 6458  4078  400 smac 1.20
48253662  2331  16034 2453 1819 2271 2733 1111 3608  4902 6489  4093  403 smac 1.17
48254125  2331  16035 2453 1819 2271 2733 1111 3608  4902 6489  4093  404 smac 1.19
48254226  2331  16035 2453 1819 2271 2733 1111 3608  4902 6489  4093  404 smac 1.18
48272013  2333  16035 2454 1823 2271 2733 1112 3609  4903 6495  4093  404 smac 1.16
Last edited by Matt Mahoney; 17th January 2014 at 16:41.
Jean-Marie Barone (16th January 2014)
● SSE improves the prediction by collecting further statistics in a 2D table, featuring an Order-2 context and a stretched prediction ranging from -16 to +16 (33 boundaries).
My method roughly follows the one described by Matt in the latest PAQ archivers.
Each entry of the table stores a pair of 16 bit counters reflecting the observed bits for a given context & prediction.
From these counters, a new stretched prediction will be calculated.
As in PAQ, both counters will be halved when one of them saturates, and the initial counters are set so that each prediction maps to itself.
The SSE prediction is calculated as a linear interpolation between above & below boundaries, and the final prediction is equal to 3/4 SSE prediction+1/4 Initial prediction.
The closest entry to the linear interpolation will be updated when the result is known.
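A sketch of just the interpolation-and-blend arithmetic in C (hypothetical; the real table stores counter pairs per Order-2 context, abstracted here into a single row t[] of stretched predictions):

```c
#define NB 33   /* boundaries at stretched values -16..+16, step 1 */

/* Interpolate between the two nearest boundaries, then blend
   3/4 SSE prediction + 1/4 initial prediction. */
double sse_interpolate(const double t[NB], double st)
{
    if (st < -16.0) st = -16.0;      /* clamp to table range        */
    if (st >  16.0) st =  16.0;
    double pos = st + 16.0;          /* map [-16,16] -> [0,32]      */
    int lo = (int)pos;
    if (lo >= NB - 1) lo = NB - 2;   /* keep lo+1 inside the table  */
    double frac = pos - lo;
    double sse = t[lo] * (1.0 - frac) + t[lo + 1] * frac;
    return 0.75 * sse + 0.25 * st;   /* blend with initial prediction */
}
```

With the identity initialization mentioned above (t[i] = i - 16), the function returns its input unchanged, so SSE only bends the prediction once its counters have seen data.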
● With hindsight, it appeared to me that my Match model was overly complicated.
Henceforth, each entry of the Match table is a simple doubleword containing the offset of a matching pattern, given current context (8 bytes).
Both contexts (current context & Matching pattern) are still compared to detect hash collisions.
Also, I have increased the memory allocated for the table.
● FPU instructions have been replaced by SIMD ones (SSE & SSE2), except in the creation of the Logarithms lookup table.
● Some optimizations in the code.
● Source code has been ported to UASM.
Results on enwik8 & enwik9 are 21047188 & 176607820.
Last edited by Jean-Marie Barone; 3rd June 2018 at 03:06. Reason: forgot results
Mike (3rd June 2018)
This version uses a 3D-table SSE which inputs:
- a stretched prediction from -16 to +16
- an Order-2 context
- the current bit rank (8 segments from 0 to 7)
Some good results can also be achieved using only 2 segments (high & low nibble), or 4 segments.
Bit rank is a useful piece of information; including it in the Modeling improves overall prediction.
Results on enwik8 & enwik9 are 20930754 & 175759034.
Results on Silesia Corpus :
Code:
          Total    dickens mozilla mr   nci  ooff osdb reym samba sao  webst x-ray xml
SMAC 1.21 47822648 2319    15829   2445 1787 2261 2725 1102 3543  4892 6436  4078  399
SMAC 1.22 46544819 2271    15421   2401 1593 2233 2608 1060 3479  4849 6242  4000  382
SMAC 1.23 45592332 2269    15002   2285 1589 2147 2467 1057 3485  4789 6234  3883  379
Thanks for sharing this, Jean-Marie! I see it fits the use case of text-like data well. It even outperforms ppmd on my test set (although at the cost of speed).
Now, could you elaborate a little on memory requirements? I aim to try it on a few big files but I don't want to risk halting the system. Thanks in advance!
It uses a lot of memory, but is still below its 2 GB limit of address space. Still, I don't recommend using it with files greater than 1 GB, as it would be too slow & impractical. You have better alternatives if you wish to perform serious backups.
Edit : Also, I forgot to mention SMAC was not designed to compress/decompress files longer than 4 GB, as it uses a 32 bit variable to store the length of the file.
Code:
LOCAL FileLength:DWORD

call GetFileSize,@@SourceHandle,0
mov FileLength,eax          ;store original size
Last edited by Jean-Marie Barone; 13th July 2018 at 22:53.
Thanks for the info. I was just giving it a try to know its limits, no serious backup involved. It seems the method is symmetric, so it has no practical value to me as a day-to-day program, but it's good to see new ideas put to work!