
Re: Reading large files



> "The current method for determining if a file is binary involves reading
> in the entire file into memory in Lua, and then calling a C++
> function to determine if that string is binary. Lua is stunningly
> inefficient at reading in a large file (reading in 100MiB copies 1.4GiB)."
Maybe the last remark hints at the concatenation problem explained in
LTN 9:
 http://www.lua.org/notes/ltn009.html
This would occur if you read a file line by line and naively
concatenate the lines into one large string. Anyway, here's a simplistic
script to test binary-ness. Adjust the pattern in "find" to something
more sensible, if you like. Usage:
 lua isbin.lua <file-name>
---------------
file isbin.lua:
---------------
local now = os.clock()
local input, err = io.open(arg[1], "rb")
assert(input, err)
local isbin = false
local chunk_size = 2^12
local find = string.find
local read = input.read
repeat
 local chunk = read(input, chunk_size)
 if not chunk then break end
 if find(chunk, "[^\f\n\r\t\032-\128]") then
  isbin = true
  break
 end
until false
input:close()
now = os.clock() - now
if isbin then
 print "this file is binary..."
else
 print "this is a text file..."
end
print(string.format("this took %.3f seconds", now))
-----------
end of file
-----------
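
As for the concatenation problem from LTN 9 mentioned above, here is a rough sketch of the usual workaround: collect the pieces in a table and join them once with table.concat, rather than growing a string with ".." inside the loop. The helper name slurp_lines is made up for illustration and is not part of isbin.lua. I'm also assuming you want the lines joined by "\n", which drops a trailing newline.

```lua
-- Sketch only: slurp_lines is a hypothetical helper, not part of isbin.lua.
-- Each line is appended to a table, and the table is joined once at the end,
-- so earlier lines are not re-copied on every iteration.
local function slurp_lines(path)
  local input, err = io.open(path, "rb")
  if not input then return nil, err end
  local parts = {}
  for line in input:lines() do
    table.insert(parts, line)        -- append without touching earlier lines
  end
  input:close()
  -- Assumption: rejoin with "\n"; a trailing newline in the file is lost,
  -- and any "\r" stays in place because the file was opened in binary mode.
  return table.concat(parts, "\n")
end
```

If you just need the whole file as one string, input:read("*a") is simpler still; the table approach matters mainly when you build the string piece by piece yourself.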
Wim
