I'm making a game in which a row of tiles may look something like this, in which each block is represented with a unicode char:
"gggggggggggggggggggggggggggggggggggtttttttttjjjjjjjjjjjjjjj"
So my method of compression shortens it to this (I may have the numbers wrong, it is just for demonstration):
"g23 t9 j17"
Seems fairly simple, right?
My friend suggests using an enum to represent each block with a number and working with bytes, reasoning that numbers are faster to compare.
So it may look something like this: 0001 0002 0004 0003 0034
I said that this is a pain to work with, and saving a few ms isn't going to make up for the pain I'd have to go through.
I'm using C# and Unity. Which method should I use for compressing world data?
- It sounds like you've already decided which approach you prefer, and reasoned that the storage or performance difference is not significant for your use case. What do you need from strangers on the Internet like us? – DMGregory, Apr 2, 2022 at 16:51
- I'm asking whether my friend's approach is better or mine, @DMGregory. – Coder2195, Apr 2, 2022 at 16:54
- "Better" under what criteria? You've raised reasonable objections that this approach is less intuitive or convenient for your style of working, and that milliseconds of performance are not an important consideration for your needs, so under those criteria, you've established that your way is "better". Other people may choose different criteria and arrive at a different answer, but that doesn't make one right and the other wrong. It's not clear why one would prefer criteria chosen by a stranger on the Internet over those chosen by someone with far more experience in the game you're making: you. – DMGregory, Apr 2, 2022 at 18:08
1 Answer
My friend suggests to use an enum and represent each block with a number and work with bytes, reasoning that numbers are faster to compare
If you write your files as UTF-8, then you are effectively already doing this in your uncompressed method: each ASCII character in a UTF-8 file is stored as a single byte (8 bits), the same size as a byte-sized block number. (Characters outside the ASCII range take two to four bytes each in UTF-8.) If you started doing bit packing, then the number method would be more efficient: it would allow you to fit multiple blocks into a byte (depending on the number of block types). However, using char gives you the advantage of creating levels in a text editor, and it makes the level data easier to understand in your watch window.
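To make the bit packing idea concrete, here is a minimal sketch that assumes there are only four block types, so each tile needs 2 bits and four tiles fit in one byte. The `Block` enum, its member names, and the `TilePacker` class are hypothetical and only there for illustration; with more block types you'd need more bits per tile.

```csharp
using System;

// Hypothetical block ids. With 4 block types, each tile needs only 2 bits.
public enum Block : byte { Grass = 0, Tree = 1, Jungle = 2, Water = 3 }

public static class TilePacker
{
    const int BitsPerTile = 2;
    const int TilesPerByte = 8 / BitsPerTile;   // 4 tiles per byte
    const int Mask = (1 << BitsPerTile) - 1;    // lowest 2 bits set

    // Packs tiles into bytes, first tile in the lowest bits of each byte.
    public static byte[] Pack(Block[] tiles)
    {
        var packed = new byte[(tiles.Length + TilesPerByte - 1) / TilesPerByte];
        for (int i = 0; i < tiles.Length; i++)
        {
            int shift = (i % TilesPerByte) * BitsPerTile;
            packed[i / TilesPerByte] |= (byte)(((int)tiles[i] & Mask) << shift);
        }
        return packed;
    }

    // The tile count is passed in because the last byte may be only partly used.
    public static Block[] Unpack(byte[] packed, int tileCount)
    {
        var tiles = new Block[tileCount];
        for (int i = 0; i < tileCount; i++)
        {
            int shift = (i % TilesPerByte) * BitsPerTile;
            tiles[i] = (Block)((packed[i / TilesPerByte] >> shift) & Mask);
        }
        return tiles;
    }
}
```

The trade-off is the one described above: this quarters the size compared to one byte per tile, but the data is no longer something you can read or edit in a text editor.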
The compression scheme you're describing is run-length encoding. It would definitely be more compact if you wrote your counts as binary numbers instead of as decimal text.
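As a rough sketch of run-length encoding with binary counts, the code below writes each run as two bytes: the tile character, then the run length. It assumes every tile is an ASCII character (so it fits in one byte) and splits runs longer than 255, since that is the most a single count byte can hold. The `RunLength` class and its method names are made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class RunLength
{
    // Encodes a row as (tile byte, count byte) pairs.
    // Assumption: tiles are ASCII characters (code < 128).
    public static byte[] Encode(string row)
    {
        var output = new List<byte>();
        int i = 0;
        while (i < row.Length)
        {
            char tile = row[i];
            if (tile > 127)
                throw new ArgumentException("Only ASCII tiles are supported in this sketch.");

            // Count how far the run extends; a count byte holds at most 255,
            // so longer runs are split into several pairs.
            int run = 1;
            while (i + run < row.Length && row[i + run] == tile && run < 255)
                run++;

            output.Add((byte)tile);
            output.Add((byte)run);
            i += run;
        }
        return output.ToArray();
    }

    // Expands (tile, count) pairs back into the original row.
    public static string Decode(byte[] data)
    {
        var row = new StringBuilder();
        for (int i = 0; i + 1 < data.Length; i += 2)
            row.Append((char)data[i], data[i + 1]);
        return row.ToString();
    }
}
```

With this layout, a row of 23 g, 9 t, and 17 j tiles encodes to 6 bytes, versus 10 characters for the text form "g23 t9 j17".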
However, for the compressed version you might be better off using an existing compression library instead of doing something painful. It's pretty simple to use SharpZipLib to decompress files. While you could likely get a higher compression ratio with a custom scheme, it's pain you could avoid and unlikely to be worthwhile unless your world data is measured in GB.
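For a sense of how little code an off-the-shelf compressor needs, here is a sketch that uses `GZipStream` from .NET's built-in `System.IO.Compression` namespace rather than SharpZipLib (a swap on my part, only because it needs no extra package). It assumes your world data is text that you store as UTF-8; the `WorldFile` class and its method names are hypothetical.

```csharp
using System.IO;
using System.IO.Compression;
using System.Text;

public static class WorldFile
{
    // Compresses the text form of the world into a gzip byte array.
    public static byte[] Compress(string worldText)
    {
        byte[] raw = Encoding.UTF8.GetBytes(worldText);
        using (var buffer = new MemoryStream())
        {
            // Dispose the GZipStream before reading the buffer so that
            // all compressed data has been flushed into it.
            using (var gzip = new GZipStream(buffer, CompressionLevel.Optimal))
                gzip.Write(raw, 0, raw.Length);
            return buffer.ToArray();
        }
    }

    // Reverses Compress and returns the original text.
    public static string Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
            return reader.ReadToEnd();
    }
}
```

You'd keep your rows as plain, editable characters in memory and only compress when writing the file, e.g. `File.WriteAllBytes(path, WorldFile.Compress(worldText))`.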
I'm assuming you're only concerned about the disk size of your world data since your runtime data will presumably be constructed world objects (tileset images, collision, etc) and you'll discard the level definition after load.
- Minor technicality: a C# char type is UTF-16, so it's two bytes. And storing the numbers as characters is also bulkier. In the example "g23 t9 j17", we could store each number in a single byte, so in binary form the whole thing could be just 6 bytes. In text form (including spaces), it's 20 bytes. Not that this will break the bank by any means. – Apr 3, 2022 at 21:40
- @DMGregory: I didn't know it was UTF-16! Clarified that you'd write files as utf8. OP's friend had an odd number of values, so it wasn't run-length encoding, but I mentioned that regardless to be clear about the merit of their method. Thanks! – idbrii, Apr 4, 2022 at 16:53