\$\begingroup\$

I have a function that returns the contents of a file.

Since reading files from disk is expensive, I'd like to avoid having to read the file again after the first read.

I've come up with the function getFileContents that caches the file content in memory during the first call and returns the cached contents when called again.

Here's a short program including imports that demonstrates its behavior:

import qualified Data.ByteString as BS
import System.Directory ( getCurrentDirectory )
import System.FilePath
import Control.Exception
import Data.Typeable ( typeOf )
import Text.Printf ( printf )
import Data.IORef

main = do
  fileContentsRef <- newIORef Nothing
  -- First time reading the file accesses the disk
  _ <- getFileContents fileContentsRef
  fileContentsFromMemory <- getFileContents fileContentsRef
  print fileContentsFromMemory

getFileContents
  :: IORef (Maybe BS.ByteString) -> IO (Either IOException BS.ByteString)
getFileContents fileContentsRef = do
  refContents <- readIORef fileContentsRef
  case refContents of
    Just fileContents -> do
      putStrLn "Using cached file contents from memory"
      return $ Right fileContents
    Nothing -> readFileAndCacheContents fileContentsRef

readFileAndCacheContents
  :: IORef (Maybe BS.ByteString) -> IO (Either IOException BS.ByteString)
readFileAndCacheContents fileContentsRef = do
  putStrLn "Reading file from disk, then caching it"
  curDir <- getCurrentDirectory
  let filePath = curDir </> "aDir" </> "theFile"
  readResult <-
    (try $ BS.readFile filePath) :: IO (Either IOException BS.ByteString)
  case readResult of
    Left ex -> do
      logEx ex
      return readResult
    Right fileContents -> do
      -- Cache the file contents
      writeIORef fileContentsRef $ Just fileContents
      return readResult
 where
  logEx ex = printf "Exception of type %s: %s\n" (show (typeOf ex)) (show ex)

Was IORef the right choice in this case? Is there something to improve in the code?

asked Dec 21, 2019 at 15:15
\$\endgroup\$
Comment:

If you're going to keep a mutable copy of your string, then IORef is a fine choice. If you just want a readable copy, you can read it once and store it in e.g. a Reader environment for the rest of your program to access. That is, don't make your caching mechanism more generic than the scope in which you intend to use the cached value. Otherwise you'll inevitably experience memory leaks. (Commented Dec 27, 2019 at 9:57)
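
To illustrate the Reader suggestion in the comment above, here is a minimal sketch, not the commenter's own code. The names `Env`, `envFileContents`, `App`, `reportSize` and `printFirstLine` are hypothetical, and the contents are hard-coded as a stand-in for `BS.readFile` so the sketch runs without the file present. It assumes `ReaderT` from the transformers package.

```haskell
import qualified Data.ByteString.Char8 as BS
import Control.Monad.IO.Class (liftIO)
import Control.Monad.Trans.Reader (ReaderT, asks, runReaderT)

-- Hypothetical environment holding the file contents, read once at startup.
newtype Env = Env { envFileContents :: BS.ByteString }

-- Code running in App can access the contents without touching the disk.
type App = ReaderT Env IO

-- Length of the stored contents in bytes.
reportSize :: App Int
reportSize = asks (BS.length . envFileContents)

-- Print everything up to the first newline of the stored contents.
printFirstLine :: App ()
printFirstLine = do
  contents <- asks envFileContents
  liftIO $ BS.putStrLn (BS.takeWhile (/= '\n') contents)

main :: IO ()
main = do
  -- Stand-in for `BS.readFile filePath` (assumption made for this sketch).
  let contents = BS.pack "first line\nsecond line\n"
  size <- runReaderT (printFirstLine >> reportSize) (Env contents)
  print size
```

The contents live exactly as long as the `runReaderT` call that uses them, which is the scoping the comment argues for.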

1 Answer

\$\begingroup\$

This strategy will work for any IO action, and so should be generalized.

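{-# LANGUAGE LambdaCase #-}
import qualified Data.ByteString as BS
import Control.Exception ( IOException, try )
import Data.IORef

-- Wrap an IO action so that its result is cached after the first successful run
once :: IO a -> IO (IO a)
once ioa = do
  cache <- newIORef Nothing
  return $ readIORef cache >>= \case
    Nothing -> do
      a <- ioa
      writeIORef cache $ Just a
      return a
    Just a -> return a

-- readFileContents :: IO BS.ByteString is the action that reads the file
main = do
  fileContentsGetter <- once readFileContents
  -- First time reading the file accesses the disk
  _ <- try fileContentsGetter :: IO (Either IOException BS.ByteString)
  fileContentsFromMemory <-
    try fileContentsGetter :: IO (Either IOException BS.ByteString)
  print fileContentsFromMemory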

Note that this once is not thread-safe: if two threads call the getter at the same time, both can find the cache empty and both will read the file. System.IO.Memoize (from the io-memoize package) provides a once that isn't vulnerable to this.
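
If you'd rather not add a dependency, one way to close that race is to guard the cache with an MVar instead of an IORef. This is only a sketch under that assumption, not how System.IO.Memoize is implemented; `onceThreadSafe` and the counter in `main` are hypothetical names used for illustration.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
import Control.Monad (forM_, replicateM_)
import Data.IORef

-- Like once, but the MVar is held while the action runs, so concurrent
-- callers wait for the first result instead of running the action again.
onceThreadSafe :: IO a -> IO (IO a)
onceThreadSafe ioa = do
  cache <- newMVar Nothing
  return $ modifyMVar cache $ \cached -> case cached of
    Just a  -> return (cached, a)
    Nothing -> do
      a <- ioa
      return (Just a, a)

main :: IO ()
main = do
  -- Count how often the wrapped action actually runs.
  runs <- newIORef (0 :: Int)
  let countRun = atomicModifyIORef' runs (\n -> (n + 1, ()))
  getter <- onceThreadSafe (countRun >> return "contents")
  done <- newEmptyMVar
  -- Four threads call the getter concurrently.
  forM_ [1 .. 4 :: Int] $ \_ -> forkIO (getter >> putMVar done ())
  replicateM_ 4 (takeMVar done)
  readIORef runs >>= print
```

Because modifyMVar restores the old value when the action throws, a failed read leaves the cache empty and the next call tries again. The trade-off is that every call takes the lock, even after the value has been cached.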

(To keep the logging from logEx, catch the exception inside the definition of readFileContents, log it, and rethrow it.)
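
A possible shape for such a readFileContents, reusing the path and logEx from the question. Treat it as a sketch: the answer does not show its definition, so the details here are assumptions.

```haskell
import qualified Data.ByteString as BS
import Control.Exception (IOException, catch, throwIO, try)
import Data.Typeable (typeOf)
import System.Directory (getCurrentDirectory)
import System.FilePath ((</>))
import Text.Printf (printf)

-- Read the file; on an IOException, log it and rethrow so the caller
-- (for example `try fileContentsGetter`) still sees the failure.
readFileContents :: IO BS.ByteString
readFileContents = do
  curDir <- getCurrentDirectory
  let filePath = curDir </> "aDir" </> "theFile"
  BS.readFile filePath `catch` \ex -> do
    logEx ex
    throwIO ex
 where
  logEx :: IOException -> IO ()
  logEx ex = printf "Exception of type %s: %s\n" (show (typeOf ex)) (show ex)

main :: IO ()
main = do
  result <- try readFileContents :: IO (Either IOException BS.ByteString)
  putStrLn $ either (const "read failed") (const "read succeeded") result
```

Since the exception propagates out of the action before writeIORef runs, a failed read is not cached and the next call to the getter tries the disk again, matching the behavior of the original code.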

Matthias Braun
answered Dec 22, 2019 at 12:20
\$\endgroup\$
