Problem
I have a function that returns the contents of a file.
Since reading files from disk is expensive, I’d like to avoid having to read the file again after the first read.
I’ve come up with the function getFileContents, which caches the file contents in memory during the first call and returns the cached contents when called again.
Here’s a short program including imports that demonstrates its behavior:
import qualified Data.ByteString as BS
import System.Directory ( getCurrentDirectory )
import System.FilePath
import Control.Exception
import Data.Typeable ( typeOf )
import Text.Printf ( printf )
import Data.IORef
main = do
  fileContentsRef <- newIORef Nothing
  -- First time reading the file accesses the disk
  _ <- getFileContents fileContentsRef
  fileContentsFromMemory <- getFileContents fileContentsRef
  print fileContentsFromMemory

getFileContents
  :: IORef (Maybe BS.ByteString) -> IO (Either IOException BS.ByteString)
getFileContents fileContentsRef = do
  refContents <- readIORef fileContentsRef
  case refContents of
    Just fileContents -> do
      putStrLn "Using cached file contents from memory"
      return $ Right fileContents
    Nothing -> readFileAndCacheContents fileContentsRef

readFileAndCacheContents
  :: IORef (Maybe BS.ByteString) -> IO (Either IOException BS.ByteString)
readFileAndCacheContents fileContentsRef = do
  putStrLn "Reading file from disk, then caching it"
  curDir <- getCurrentDirectory
  let filePath = curDir </> "aDir" </> "theFile"
  readResult <-
    (try $ BS.readFile filePath) :: IO (Either IOException BS.ByteString)
  case readResult of
    Left ex -> do
      logEx ex
      return readResult
    Right fileContents -> do
      -- Cache the file contents
      writeIORef fileContentsRef $ Just fileContents
      return readResult
  where
    logEx ex = printf "Exception of type %s: %s\n" (show (typeOf ex)) (show ex)

Was IORef the right choice in this case? Is there something to improve in the code?
Solution
This strategy will work for any IO action, and so should be generalized.
-- \case needs the LambdaCase extension: {-# LANGUAGE LambdaCase #-}
once :: IO a -> IO (IO a)
once ioa = do
  cache <- newIORef Nothing
  return $ readIORef cache >>= \case
    Nothing -> do
      a <- ioa
      writeIORef cache $ Just a
      return a
    Just a -> return a

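main = do
  fileContentsGetter <- once readFileContents
  -- First time reading the file accesses the disk.
  -- try needs a concrete exception type, hence the annotations.
  _ <- try fileContentsGetter :: IO (Either IOException BS.ByteString)
  fileContentsFromMemory <-
    try fileContentsGetter :: IO (Either IOException BS.ByteString)
  print fileContentsFromMemory
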
Note that if two threads call the getter at the same time, they will both find the cache empty, and both read the file. System.IO.Memoize provides a once that isn’t vulnerable to this.
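
To make the race concrete, here is a minimal sketch of a getter that guards the cache with an MVar instead of an IORef, so a second caller blocks until the first has finished. It is only an illustration and not how System.IO.Memoize is implemented; onceMVar and countRuns are hypothetical names introduced for this example.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar
  (modifyMVar, newEmptyMVar, newMVar, putMVar, takeMVar)
import Control.Monad (replicateM_)
import Data.IORef (atomicModifyIORef', newIORef, readIORef)

-- | Hypothetical thread-safe variant of 'once'. The MVar doubles as a
-- lock: while one caller runs the action, every other caller waits in
-- modifyMVar and then finds the cached value.
onceMVar :: IO a -> IO (IO a)
onceMVar ioa = do
  cache <- newMVar Nothing
  return $ modifyMVar cache $ \cached ->
    case cached of
      Just a  -> return (cached, a)
      Nothing -> do
        a <- ioa
        return (Just a, a)

-- | Hypothetical demo helper: call the getter from several threads and
-- report how many times the wrapped action actually ran.
countRuns :: Int -> IO Int
countRuns nThreads = do
  runs   <- newIORef (0 :: Int)
  getter <- onceMVar (atomicModifyIORef' runs (\n -> (n + 1, ())))
  done   <- newEmptyMVar
  -- Start the workers; each signals completion through 'done'.
  replicateM_ nThreads $ forkIO (getter >> putMVar done ())
  -- Wait for all workers before reading the counter.
  replicateM_ nThreads $ takeMVar done
  readIORef runs
```

Because modifyMVar restores the previous contents when the action throws, a failed read leaves the cache empty and the next call retries. The trade-off of this sketch is that every call, cached or not, briefly takes the lock.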
(catch and rethrow in the definition of readFileContents lets you rescue logEx.)
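
One way readFileContents could look with the catch-and-rethrow approach is sketched below. It reuses the path and log message from the question; readFileContentsFrom is a hypothetical helper added here so that the path can be varied, and the exception is assumed to be an IOException as in the original code.

```haskell
import qualified Data.ByteString as BS
import Control.Exception (IOException, catch, throwIO)
import Data.Typeable (typeOf)
import System.Directory (getCurrentDirectory)
import System.FilePath ((</>))
import Text.Printf (printf)

-- | Read the file from the question; exceptions are logged and rethrown.
readFileContents :: IO BS.ByteString
readFileContents = do
  curDir <- getCurrentDirectory
  readFileContentsFrom (curDir </> "aDir" </> "theFile")

-- | Hypothetical helper: read any path, logging an IOException before
-- rethrowing it so that callers can still handle it with 'try'.
readFileContentsFrom :: FilePath -> IO BS.ByteString
readFileContentsFrom filePath = do
  putStrLn "Reading file from disk, then caching it"
  BS.readFile filePath `catch` \ex -> do
    logEx ex
    throwIO ex
  where
    logEx :: IOException -> IO ()
    logEx ex = printf "Exception of type %s: %s\n" (show (typeOf ex)) (show ex)
```

Since the exception is rethrown, once never stores a value for a failed read, and the try calls in main still receive a Left.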