I have been working on an experimental CM compressor for the past few weeks. The performance has just recently become acceptable, so I thought I'd release it, closed source for now. It is not too complicated yet (no hash collision resolution, SSE, ISSE, or BCJ). By default it's tuned for text, but you can disable the word model. I welcome any feedback!
Stephan Busch (4th June 2013)
Just realized I had left AVX extensions enabled in code gen, probably resulting in most people not being able to run the program. New version has this and a few other bugs fixed and a reduced initialization time.
I quickly tested your MCM, I tried it on a few Tar-ed program directories and it performed better than winrar :)
Although it was much slower.
Good luck for the future of your program :3
MCM 0.0 is on rank #26 of the SqueezeChart, which means it is already in the Top 30
I will publish results later.
Ranked #21 on LTCB. http://mattmahoney.net/dc/text.html#1663
Thank you both for running these benchmarks! I'll try to see if I can improve the speed any more, as well as binary / exe / text detection.
results are now online at http://www.squeezechart.com
Dear Mr. Chartier,
did you produce only 64-bit versions? When I run either the newer or the older program, I get the message "This is not valid Win32 program.". Tested on Win XP SP3, Czech version.
Best regards,
FatBit
That's strange, what CPU do you have? The compressor requires SSE2, but nearly every CPU should support this.
It is Intel Centrino Mobile Pentium M 1,5 GHz, ~10 years old + 855PM chipset.
Ah ok, I'll see if I can remove the SSE2 requirements in the next version. It should be ready in around a week. Hopefully that will fix it.
Maybe separate builds would be a good solution: a newer/faster version and an older/slower one.
My test was in 32 bit Vista (2 GHz T3200) and it worked.
BTW, ZPAQ requires SSE2 instructions. I thought every processor has them by now. If not, you can compile with -DNOJIT but it will be slow. I know somebody compiled an older version for ARM and it worked.
I successfully ran zpaq 6.28 and zpaqd 6.27 on Win XP SP3, Czech edition, 32-bit.
Best Regards,
FatBit
It's very strange that it doesn't work on Windows XP. I'm using VS2012 to compile it so that might have something to do with it. On a side note, does anybody know a good way to figure out where to add new states to a PAQ-like state machine? I currently have 105/255 unused states. The state machine was generated with a simple brute force algorithm on enwik6.
If I remember correctly, user ENCODE had to downgrade from a newer Visual Studio to an older one because Win XP support was removed in the new version (and partially restored later?). I am not able to find the post in the forum.
Best regards,
FatBit
You could use the StateTable class in the ZPAQ reference decoder to generate a PAQ state table. http://mattmahoney.net/dc/unzpaq200.cpp
I had intended to have 255 states but due to a design error I discovered much later that only 219 states are reachable. I left it that way so I would not break compatibility with the standard.
SSE2 is supported on Pentium M. It is supported on most Intel processors since 2001 and AMD since 2003. In ZPAQ, SSE2 is only required for the MIX component, so the faster methods that don't use it (1, 2, and 3) should still work. Or you can compile with -DNOJIT for any processor.
ZPAQ will run on Windows XP, but probably not older versions. When I make calls to Windows I make sure the function is supported at least back to XP.
Quote (originally posted by Mat Chartier): It's very strange that it doesn't work on Windows XP. I'm using VS2012 to compile it so that might have something to do with it.
http://stackoverflow.com/questions/1...al-studio-2012 :
Visual Studio 2012 Update 1 has now been released, and adds official support for running apps built with VC++ 2012 on Windows XP.
I guess Microsoft forgot that 38% of PCs are still running WinXP, like it or not. :p
They try to push it out of the market at every opportunity they get. But the customer backlash is still very strong in many places.
From my statistics and analysis of the WCC results, which I will publish soon, MCM's results are truly remarkable. Of course I do not know whether the program uses a PPM-type system (byte compression) or a CM-type system (single-bit compression), but I think the way to go, if I can give some advice, is to make it simple and fast and not the other way around!
It uses CM.
Hi Nania, Currently I'm using CM with 6 contexts: o1/match, word, o2, o3, o4, o6. Contexts are selected on a byte basis. I'm not too sure how to increase the speed any more, in mcm v0.0 each context rarely hits more than two cache lines in the hash table for encoding/decoding a byte. Using some xor tricks, I recently managed to get a guarantee that each context will hit at most 2 cache lines in the hash table, but this is only a very minor performance improvement. I guess the next lowest hanging fruit is match model, it takes around 20% of compression time.
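One way to get that kind of "at most two cache lines per byte" guarantee can be sketched as follows. This is only an illustration and not MCM's actual code: the names `slot_index` and `lines_touched`, the 64-byte line size, the one-byte-per-slot layout, and the nibble split are all assumptions. The byte-level context hash picks a line-aligned base, the first four bits of the byte index into that line, and the last four bits index into a second line reached by XOR-ing the base with a value derived from the high nibble. Collision checking is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>

constexpr std::size_t kLineBytes = 64;  // assumed cache line size

// Hypothetical slot layout: one byte of state per bit-level context.
// ctx_hash  : hash of the byte-level context (for example the order-2 history)
// bit_ctx   : bits of the current byte seen so far, with a leading 1 (1..255)
// table_size: table size in bytes, a power of two, at least 2048
std::size_t slot_index(std::uint32_t ctx_hash, unsigned bit_ctx,
                       std::size_t table_size) {
    const std::size_t mask = table_size - 1;
    // Line chosen by the byte-level context; the low 6 bits are zero.
    const std::size_t base =
        (static_cast<std::size_t>(ctx_hash) * kLineBytes) & mask;
    if (bit_ctx < 16) {
        // First nibble: all 15 partial contexts live in the base line.
        return base ^ bit_ctx;
    }
    // Second nibble: count how many bits beyond the first four are known.
    unsigned extra = 0;
    while ((bit_ctx >> (extra + 4)) > 1) ++extra;
    const unsigned hi = (bit_ctx >> extra) & 15u;  // completed high nibble
    const unsigned lo =                            // partial low nibble, 1..15
        (1u << extra) | (bit_ctx & ((1u << extra) - 1u));
    // XOR with a multiple of the line size moves to another aligned line.
    const std::size_t line2 = base ^ (static_cast<std::size_t>(hi + 1) << 6);
    return line2 | lo;
}

// Counts the distinct cache lines touched while coding all 8 bits of `byte`.
std::size_t lines_touched(std::uint32_t ctx_hash, unsigned byte,
                          std::size_t table_size) {
    std::set<std::size_t> lines;
    unsigned bit_ctx = 1;
    for (int i = 7; i >= 0; --i) {
        lines.insert(slot_index(ctx_hash, bit_ctx, table_size) / kLineBytes);
        bit_ctx = (bit_ctx << 1) | ((byte >> i) & 1u);
    }
    return lines.size();
}
```

With this layout every byte touches the base line for its first four bits and one other line for the remaining four, whatever the byte value is.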
EDIT:
Also, I was just thinking of floating point CM. With the new dot product (dpps) instruction that comes with SSE4.1, it may be a feasible option. What do you guys think?
Last edited by Mat Chartier; 7th June 2013 at 21:37.
i think it's a great idea, but don't stop at that. ideally, archives should be decompressible only on i7-4770R in a full moon
Agreed Bulat, we need to use these new instruction sets so that people with old CPUs finally upgrade.
Although, I could check CPUID and have different code paths for older machines to make sure that the code runs. The main thing that I'm worried about is having consistent floating point behaviour on all machines.
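A minimal sketch of that kind of runtime dispatch, assuming GCC or Clang on x86 (where `__builtin_cpu_supports` is available) and falling back to the baseline path everywhere else. The function names are made up for illustration, and `dot_unrolled` is only a stand-in for where an SSE4.1 kernel would go. Both paths use integer arithmetic, so they give bit-identical results regardless of which one a machine picks, which sidesteps the floating point consistency worry.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

using DotFn = std::int64_t (*)(const std::int32_t*, const std::int32_t*,
                               std::size_t);

// Baseline path that runs on any CPU.
std::int64_t dot_baseline(const std::int32_t* w, const std::int32_t* x,
                          std::size_t n) {
    std::int64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += static_cast<std::int64_t>(w[i]) * x[i];
    return sum;
}

// Stand-in for a kernel written with newer instructions (hypothetical).
// It is plain C++ here; integer sums do not depend on evaluation order.
std::int64_t dot_unrolled(const std::int32_t* w, const std::int32_t* x,
                          std::size_t n) {
    std::int64_t s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += static_cast<std::int64_t>(w[i]) * x[i];
        s1 += static_cast<std::int64_t>(w[i + 1]) * x[i + 1];
        s2 += static_cast<std::int64_t>(w[i + 2]) * x[i + 2];
        s3 += static_cast<std::int64_t>(w[i + 3]) * x[i + 3];
    }
    for (; i < n; ++i) s0 += static_cast<std::int64_t>(w[i]) * x[i];
    return s0 + s1 + s2 + s3;
}

// Runtime feature check; unknown compilers or CPUs take the safe path.
bool cpu_has_sse41() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
    return __builtin_cpu_supports("sse4.1") != 0;
#else
    return false;
#endif
}

// Pick the code path once, for example at program start.
DotFn select_dot() {
    return cpu_has_sse41() ? dot_unrolled : dot_baseline;
}
```

A caller would store the result of `select_dot()` once and call through the pointer for every prediction. With MSVC the same check would go through the `__cpuid` intrinsic instead.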
I guess you mean for the mixer. In zpaq I use SSE2 for the dot product using 20 bit weights and 12 bit predictions: drop 8 bits of the weight, multiply (PMADDWD), accumulate. But SSE2 turned out to be slower than scalar code to update the weights. It would have been faster to use 16 bit weights but in my experiments I lost too much compression. You could probably do it with probabilistic weight updates.
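The arithmetic described above can be written out in scalar form as a rough sketch. It is not the ZPAQ source: the function names, the error term `err`, and the `shift` learning-rate parameter are assumptions made for illustration. Weights are kept as signed 20-bit values in 32-bit integers, the low 8 bits are dropped before the multiply so that both operands fit in 16 bits, and the update bounds each weight back into the 20-bit range. PMADDWD does the multiply step eight 16-bit pairs at a time, adding adjacent products into four 32-bit sums.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::int32_t kWeightMax = (1 << 19) - 1;  // signed 20-bit range
constexpr std::int32_t kWeightMin = -(1 << 19);

// Dot product of 20-bit weights and 12-bit signed inputs.
// Assumes >> on negative values is an arithmetic shift, which holds on
// mainstream compilers and is guaranteed from C++20.
std::int32_t mixer_dot(const std::int32_t* w, const std::int16_t* x,
                       std::size_t n) {
    std::int32_t sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        // Dropping 8 bits leaves a 12-bit weight, so the product is small
        // enough for the 16-bit lanes that PMADDWD multiplies.
        sum += (w[i] >> 8) * static_cast<std::int32_t>(x[i]);
    }
    return sum;
}

// Weight update with rounding and bounding (hypothetical parameters).
// err  : prediction error, assumed to be roughly 12-bit signed
// shift: learning-rate shift, assumed to be at least 1
void mixer_update(std::int32_t* w, const std::int16_t* x, std::size_t n,
                  std::int32_t err, int shift) {
    const std::int32_t round = 1 << (shift - 1);
    for (std::size_t i = 0; i < n; ++i) {
        std::int32_t v = w[i] + ((err * x[i] + round) >> shift);
        if (v > kWeightMax) v = kWeightMax;  // bound to 20 bits
        if (v < kWeightMin) v = kWeightMin;
        w[i] = v;
    }
}
```

The update works on full-width weights and needs the clamp, so it does not map onto 16-bit lanes as directly as the dot product does.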
Thanks for the answer Matt! I'm surprised that SSE2 wasn't faster than scalar code. I'll probably just stick to 32 bit integer weights for now.
SSE2 is faster for dot product of vectors of 16 bit signed elements, like in mixer prediction. It wasn't faster for updating 20 bit weights and bounding the values, even after I figured out how to do it in parallel.
Quote (originally posted by Mat Chartier): Agreed Bulat, we need to use these new instruction sets so that people with old CPUs finally upgrade.
Screw your users, so you have a better justification for playing with new toys, huh?
Added mcm to last zpaq benchmark test, very good for single thread.