I have voxel files of 1 billion voxels; every voxel is true/false and is kept in a 1D boolean array.
What is a good way to save it to disk, for example as bytes or as a 0100010101 ASCII file, so that I can read the file back into memory quickly and efficiently?
At the moment I can write files to disk using:
savePath = System.IO.Directory.GetParent(Application.dataPath).ToString()+ "/Saved_Files" ;
var sw : System.IO.StreamWriter;
I don't know the best way to read and write 1-2 GB files.
This is what I have written so far:
function saveBW(){
    //var SW2 : System.IO.StreamWriter;
    var timeString = DateTime.Now.ToString("HH-mm");
    var fileNameFromFolder = Path.GetFileNameWithoutExtension(QPath[QDone]);
    fileNameFromFolder = stripTrailingSlash(fileNameFromFolder);
    PLYname = "MK5_aliased_" + fileNameFromFolder + "_" + timeString + ".Bo0L";
    var str = "";
    var SW2 = new System.IO.StreamWriter(savePath + "/" + PLYname);
    for (var tr = 0; tr < mesher.supernormous.Length; tr++)
    {
        // Append one character per voxel: 1 for true, 0 for false
        str += mesher.supernormous[tr] ? 1 : 0;
        // Write the buffered characters every 255 voxels, then empty the
        // buffer so the same characters are not written again
        if (tr % 255 == 0) { SW2.Write(str); str = ""; }
    }
    SW2.Write(str); // write whatever is still buffered
    SW2.Flush();
    SW2.Close();
}
1 Answer
Booleans aren't bit-sized in .NET, so they aren't a good storage format for the kind of data you want. Instead, use a BitArray - it still gives you all the manipulation you need (read a single bit value, write a single bit value), and allows you to load and store the whole array as a byte[] (eight bits per byte). This makes persistence quite easy:
var data = new BitArray(File.ReadAllBytes("MyFile.bin"));
Of course, how efficient this really is can only be settled by profiling. It might also be that you don't want to load the data unless it's actually required, so some sort of paging solution might be better; but that's beyond the scope of your question as it is.
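As a minimal sketch of the full round trip (not a definitive implementation), the one-liner above can be paired with a matching save step. The class name `VoxelBitFile`, the method names, and the 4-byte bit-count header are assumptions made for illustration; the header is there because `new BitArray(byte[])` always yields a length that is a multiple of 8, so the real voxel count has to be stored somewhere.

```csharp
using System;
using System.Collections;
using System.IO;

// Hypothetical helper: file layout is [int32 bit count][packed bits].
static class VoxelBitFile
{
    public static void Save(string path, BitArray bits)
    {
        // One byte holds eight voxels; round up for a partial last byte.
        var packed = new byte[(bits.Length + 7) / 8];
        bits.CopyTo(packed, 0);

        using (var writer = new BinaryWriter(File.Create(path)))
        {
            writer.Write(bits.Length); // assumed header: number of voxels
            writer.Write(packed);
        }
    }

    public static BitArray Load(string path)
    {
        using (var reader = new BinaryReader(File.OpenRead(path)))
        {
            int bitCount = reader.ReadInt32();
            byte[] packed = reader.ReadBytes((bitCount + 7) / 8);

            var bits = new BitArray(packed);
            bits.Length = bitCount; // drop the padding bits of the last byte
            return bits;
        }
    }
}
```

With an existing `bool[]`, usage would look like `VoxelBitFile.Save(path, new BitArray(voxels))`, followed later by `VoxelBitFile.Load(path)`. Whether this beats other approaches for a billion voxels still needs to be measured.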
5 Comments
BitArray also has a constructor that takes bool[], and the CopyTo method works with both bool[] and byte[]. It may be worth reading and writing the bytes manually if the overhead is significant in your scenario - it really isn't hard, just simple math and getting the edge cases right. Think of it as a jagged array, where each byte corresponds to a bool[] with 8 elements, and accessing an element in the byte is done as (byteVal & (1 << index)) != 0 (read) and byteVal |= (byte)((boolVal ? 1 : 0) << index) (write - only works if you don't reuse the byte).
bool[]
- assumptions aren't a good way to decide that nowadays :) Manipulating a bool can be faster/cheaper, but at the same time, for a billion voxels you now need roughly 1 GB of memory instead of roughly 125 MB; things like this are often decided by memory access patterns, which are quite tricky to properly analyze. Is the access random or sequential? Is there a better organization than a 1D array, e.g. spatial subdivision?
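For reference, the per-bit read and write described in the first comment could be wrapped in two small helpers along these lines. This is only a sketch: `BitPacking`, `GetBit` and `SetBit` are hypothetical names, and unlike the OR-only write in the comment, this version also clears a bit so that a byte can be reused.

```csharp
using System;

// Hypothetical helpers: voxel i lives in byte i / 8, at bit i % 8.
static class BitPacking
{
    public static bool GetBit(byte[] packed, long i)
    {
        // Mask out the single bit and test whether it is set.
        return (packed[i >> 3] & (1 << (int)(i & 7))) != 0;
    }

    public static void SetBit(byte[] packed, long i, bool value)
    {
        byte mask = (byte)(1 << (int)(i & 7));
        if (value)
            packed[i >> 3] |= mask;          // turn the bit on
        else
            packed[i >> 3] &= (byte)~mask;   // turn the bit off
    }
}
```

The layout puts bit 0 in the least significant bit of the first byte, which is the same order `BitArray` uses for its `byte[]` constructor.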
byte. Basically using math, applying a bitmask to that byte for each boolean value. (I haven't done it, so I don't know what that math specifically looks like... but for any given set of 8 bits you could add numeric values corresponding to each bit's position: 1, 2, 4, 8, 16, etc.) Writing that stream of bytes to a file would be 1/8 the size of writing the boolean values to the file.
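One way the math in this last comment might look, sketched under some assumptions: the existing `bool[]` is kept in memory, and the file is written and read in fixed-size chunks so that no second full-size array is needed. `PackedVoxelIO`, its method names and the 64 KiB buffer size are all hypothetical, and the caller is assumed to know the voxel count when reading back.

```csharp
using System;
using System.IO;

// Hypothetical helper: packs 8 voxels per byte, least significant bit first.
static class PackedVoxelIO
{
    const int ChunkBytes = 1 << 16; // assumed buffer size; tune by profiling

    public static void Write(string path, bool[] voxels)
    {
        var buffer = new byte[ChunkBytes];
        using (var stream = File.Create(path))
        {
            int used = 0; // number of completed bytes in the buffer
            for (int i = 0; i < voxels.Length; i++)
            {
                if (voxels[i]) buffer[used] |= (byte)(1 << (i & 7));
                if ((i & 7) == 7 && ++used == buffer.Length)
                {
                    // Buffer is full: write it out and start over.
                    stream.Write(buffer, 0, used);
                    Array.Clear(buffer, 0, used);
                    used = 0;
                }
            }
            if ((voxels.Length & 7) != 0) used++; // partially filled last byte
            stream.Write(buffer, 0, used);
        }
    }

    public static bool[] Read(string path, int voxelCount)
    {
        var voxels = new bool[voxelCount];
        var buffer = new byte[ChunkBytes];
        using (var stream = File.OpenRead(path))
        {
            int i = 0, read;
            while (i < voxelCount && (read = stream.Read(buffer, 0, buffer.Length)) > 0)
                for (int b = 0; b < read && i < voxelCount; b++)
                    for (int bit = 0; bit < 8 && i < voxelCount; bit++, i++)
                        voxels[i] = (buffer[b] & (1 << bit)) != 0;
        }
        return voxels;
    }
}
```

Each bit position contributes the value 1, 2, 4, ... 128 to its byte, which is the "add numeric values" idea from the comment expressed with shifts.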