
I'm looking for an efficient way to read numbers from a text file without installing additional packages. Data.ByteString.Lazy.Char8.readInt does the trick for integers. I've read that ByteString now has a readDouble function, but when I write import Data.ByteString.Lex.Lazy.Double (readDouble), the compiler complains:

 Main.hs:4:7:
 Could not find module `Data.ByteString.Lex.Lazy.Double':
 locations searched:
 Data/ByteString/Lex/Lazy/Double.hs
 Data/ByteString/Lex/Lazy/Double.lhs

My bytestring package version is 0.9.1.5.

So, am I doing something wrong? Or maybe there is a better solution for the problem? Thanks.

Update: OK, it seems that readDouble is in the bytestring-lexing package, which is not installed by default. Any other ideas?

asked Dec 20, 2010 at 12:15
  • Just install the bytestring-lexing package then: "cabal install bytestring-lexing" Commented Dec 20, 2010 at 16:41
  • I want to do without additional packages, because my programs will be run on servers over which I have no control. Commented Dec 20, 2010 at 18:34
  • @adamax: It's worth adding that restriction to your question. Commented Dec 20, 2010 at 20:36
  • Huh? It's written in the first line. OK, I'll make it bold :) Commented Dec 20, 2010 at 21:43
  • "cabal unpack bytestring-lexing" -- now you have the source for the bytestring-lexing package. Drop it in your source tree, and now you don't need the package! Magic! Commented Dec 20, 2010 at 22:31

3 Answers


Another solution: install the bytestring-lexing package, and use readDouble, which I optimized for you.

 cabal install bytestring-lexing

The package provides optimized parsing functions for floating point literals:

 readDouble :: ByteString -> Maybe (Double, ByteString) 
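
The result pairs the parsed value with the unconsumed rest of the input, so a function of this shape can be applied repeatedly to walk through a buffer. Below is a minimal sketch of that loop, assuming whitespace-separated input. The name parseAll is hypothetical, and the parser is passed in as an argument so that the sketch does not depend on a particular package version:

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isSpace)

-- Hypothetical helper: repeatedly apply a parser with the same shape as
-- readDouble, skipping whitespace before each number. It stops at the
-- first token the parser rejects, or at the end of the input.
parseAll :: (B.ByteString -> Maybe (Double, B.ByteString)) -> B.ByteString -> [Double]
parseAll parser = go
  where
    go input =
      case parser (B.dropWhile isSpace input) of
        Just (x, rest) -> x : go rest
        Nothing        -> []
```

With the package installed, something like parseAll readDouble contents would then yield every number in the buffer.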
answered May 13, 2011 at 17:59


The only time I encountered parsing doubles on the critical path, I used this:

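{-# LANGUAGE ForeignFunctionInterface #-}
import qualified Data.ByteString.Char8 as B
import Foreign.C.Types (CDouble)
import Foreign.C.String (CString)
import System.IO.Unsafe (unsafePerformIO)

-- Bind C's atof; it returns 0.0 when no number can be parsed.
foreign import ccall unsafe "stdlib.h atof" c_atof :: CString -> IO CDouble

-- useAsCString copies the bytes into a NUL-terminated buffer for the call.
unsafeReadDouble :: B.ByteString -> CDouble
unsafeReadDouble = unsafePerformIO . flip B.useAsCString c_atof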

There wasn't anything that looked like a readDouble in bytestring at that time, though. That would probably be a better solution if it's now standard.

answered Dec 20, 2010 at 12:52

1 Comment

Thanks! I've run some experiments. To make things easier, I took atoi instead of atof and compared it with the usual show function and my naive implementation (iread). FFI totally beats show, but it loses about 20% to iread. Perhaps there is an overhead caused by the conversions to CString.
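
The comment does not show iread itself. As a rough illustration of what such a naive integer reader might look like, here is a hypothetical stand-in that folds over the leading digits of a strict ByteString. It assumes well-formed input and does no overflow checking:

```haskell
import qualified Data.ByteString.Char8 as B
import Data.Char (isDigit, ord)

-- Hypothetical stand-in for the commenter's iread: parse an optionally
-- negative decimal integer by folding over the leading digits.
-- Anything after the digits is ignored; an input without digits gives 0.
iread :: B.ByteString -> Int
iread s = case B.uncons s of
  Just ('-', rest) -> negate (digits rest)
  _                -> digits s
  where
    digits = B.foldl' step 0 . B.takeWhile isDigit
    step acc c = acc * 10 + (ord c - ord '0')
```

Unlike the FFI route, this needs no copy into a NUL-terminated buffer, which is one plausible source of the difference the comment reports.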

Here's what I came up with.

I used the function offered by JB and added two tricks which I learned from the source code of bytestring-lexing (thanks, sclv!). The first one is this function:

strict = SB.concat . LB.toChunks

It efficiently transforms a lazy bytestring into a strict one.
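
For reference, here is the same conversion with an explicit type signature, as a minimal sketch. Newer bytestring releases (0.10 and later) also export Data.ByteString.Lazy.toStrict, which does the same job, but it is not available in 0.9.1.5:

```haskell
import qualified Data.ByteString as SB
import qualified Data.ByteString.Lazy as LB

-- Collapse all chunks of a lazy ByteString into one strict ByteString.
-- The data is copied once, so this suits short inputs such as single lines.
strict :: LB.ByteString -> SB.ByteString
strict = SB.concat . LB.toChunks
```

Chunk boundaries disappear in the result, so a number that happens to be split across two chunks comes out as one contiguous buffer.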

The second trick is the function Data.ByteString.Internal.inlinePerformIO, a more efficient variant of unsafePerformIO (later bytestring releases rename it to accursedUnutterablePerformIO).

Here's the complete code, which reads numbers pretty fast:


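{-# LANGUAGE ForeignFunctionInterface #-}
import qualified Data.ByteString.Lazy.Char8 as LB
import qualified Data.ByteString as SB
import Data.ByteString.Internal (inlinePerformIO)
import Foreign.C.String (CString)
import Data.Maybe (fromJust)

-- Bind C's atof; it returns 0.0 when no number can be parsed.
foreign import ccall unsafe "stdlib.h atof" c_atof :: CString -> IO Double

-- Parse a strict ByteString; useAsCString copies it into a NUL-terminated buffer.
unsafeReadDouble :: SB.ByteString -> Double
unsafeReadDouble = inlinePerformIO . flip SB.useAsCString c_atof
{-# INLINE unsafeReadDouble #-}

-- Collapse the lazy chunks first, because useAsCString needs a strict ByteString.
readDouble :: LB.ByteString -> Double
readDouble = unsafeReadDouble . SB.concat . LB.toChunks

-- Partial: fails if the input does not start with an integer.
readInt :: LB.ByteString -> Int
readInt = fst . fromJust . LB.readInt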

And a sample program that calculates the sum of all numbers in the input:

main = LB.getContents >>= (print . sum . map readDouble . LB.lines)

It processes an 11 MB file (1M numbers) in about 0.5 seconds.

I also found several links where a much more efficient version of readInt is discussed. Presumably one can build a readDouble based on similar ideas, but I think I'll stick with my current version for now.
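
As an illustration of the pure-Haskell route, here is a minimal sketch of a double reader that needs only the bytestring package. It is not the approach from the linked discussions. The name readDoubleNaive is hypothetical, the result is not guaranteed to be correctly rounded for every input, and special values such as NaN or Infinity are not recognised:

```haskell
import qualified Data.ByteString.Lazy.Char8 as LB
import Data.Char (isDigit, ord)

-- Hypothetical sketch: parse [sign] digits [. digits] [e|E int] and
-- return the value together with the unconsumed rest of the input.
readDoubleNaive :: LB.ByteString -> Maybe (Double, LB.ByteString)
readDoubleNaive input
  | LB.null wholeDigits && LB.null fracDigits = Nothing
  | otherwise = Just (sign * mantissa * 10 ^^ expo, rest)
  where
    -- Optional leading sign.
    (sign, unsigned) = case LB.uncons input of
      Just ('-', r) -> (-1, r)
      Just ('+', r) -> (1, r)
      _             -> (1, input)
    -- Digits before the decimal point (may be empty, as in ".5").
    (wholeDigits, afterWhole) = LB.span isDigit unsigned
    -- A '.' is consumed only when at least one digit follows it.
    (fracDigits, afterFrac) = case LB.uncons afterWhole of
      Just ('.', r) | (ds, r') <- LB.span isDigit r, not (LB.null ds) -> (ds, r')
      _ -> (LB.empty, afterWhole)
    -- Optional exponent; LB.readInt handles the exponent's own sign.
    (expo, rest) = case LB.uncons afterFrac of
      Just (c, r) | c == 'e' || c == 'E', Just (e, r') <- LB.readInt r -> (e, r')
      _ -> (0, afterFrac)
    digitsToDouble :: LB.ByteString -> Double
    digitsToDouble = LB.foldl' (\acc c -> acc * 10 + fromIntegral (ord c - ord '0')) 0
    mantissa = digitsToDouble wholeDigits
             + digitsToDouble fracDigits / 10 ^^ LB.length fracDigits
```

Because it returns the remaining input, the function can be chained over a whole buffer without first splitting it into lines. Whether it actually beats the atof version would need to be measured.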

answered Dec 22, 2010 at 20:17
