Accidental optimization via I/O indirection

August 18, 2016

Getting rid of unnecessary IO has some pretty nice safety benefits. It can even have some (unexpected) performance benefits.

I was recently looking for ways to reduce the proliferation of IO in a Web project, and I came across a Mustache template rendering function. Its type was essentially Model -> String -> IO Text: it took a data model and the name of a template file, read and compiled the template, and then rendered it with the model:

render :: Model -> String -> IO Text
render model name = do
  templateE <- localAutomaticCompile name
  case templateE of
    (Left err)       -> return $ pack $ show err
    (Right template) -> return $ substitute template model

A complete example of this is available here.

This render function ran for every incoming HTTP request to build the outgoing HTTP response.

Since the template is static, I found that I could do the compilation just once in my main function's do block, and pass the resulting template to a pure rendering function that didn't need IO to do its job:

type TemplateE = Either ParseError Template

compile :: String -> IO TemplateE
compile name = localAutomaticCompile name

render :: Model -> TemplateE -> Text
render _ (Left e) = pack $ show e
render m (Right t) = substitute t m

main :: IO ()
main = do
  let name = "template.mustache"
  let model = Model "Hello" "world"
  view <- render model <$> compile name
  return ()

A complete example of this is available here.

Now my Mustache module only needs to export a pure render function, instead of one wrapped in IO.
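
One way to picture how a pure render function fits into request handling is sketched below. It uses only base, so Model, Template, compileTemplate, and substitute here are toy stand-ins invented for illustration, not the mustache library's API, and the template source is an inline string rather than a file. The template argument also comes first in this sketch, so that partial application yields a Model -> String handler:

```haskell
-- Toy stand-ins for illustration only; none of this is the mustache library's API.
data Model = Model String String

-- "Compiled" form of a template: its whitespace-separated tokens.
newtype Template = Template [String]

-- Stand-in for the expensive read-and-compile step.
compileTemplate :: String -> Either String Template
compileTemplate src
  | null (words src) = Left "empty template"
  | otherwise        = Right (Template (words src))

-- Replace the two known placeholders; leave every other token untouched.
substitute :: Template -> Model -> String
substitute (Template tokens) (Model greeting subject) = unwords (map fill tokens)
  where
    fill "{{greeting}}" = greeting
    fill "{{subject}}"  = subject
    fill token          = token

-- Pure rendering: no IO needed once the template has been compiled.
render :: Either String Template -> Model -> String
render (Left err) _ = err
render (Right t)  m = substitute t m

main :: IO ()
main = do
  -- Compile once at startup...
  let compiled = compileTemplate "{{greeting}} {{subject}}"
      -- ...and close over the result; each "request" only substitutes.
      handler  = render compiled
  mapM_ (putStrLn . handler) [Model "Hello" "world", Model "Goodbye" "moon"]
```

Because compiled is bound once and shared by the handler closure, the compile step is evaluated at most once, no matter how many models are rendered.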

That would have been a fine stopping point, but then I realized that reading a Mustache template from disk is not a cheap operation, and neither is compiling one.

By doing this just once, instead of once per HTTP request, I save I/O and CPU time on the order of 100μs per request.

At first this might not sound like much, but consider a server handling thousands (or more) of incoming requests per second. In my project, it increased the number of concurrent requests that my server can handle by a couple of orders of magnitude, and I got rid of an IO in a module export to boot!
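
As a back-of-the-envelope check on that intuition, the sketch below converts a per-request cost into the share of one CPU core it consumes. The 100μs figure is the one quoted above; the request rates are hypothetical round numbers, not measurements, and coreFraction is a name made up for this example:

```haskell
-- Share of one core's time spent on per-request work, given the cost of
-- that work in microseconds and the request rate per second.
coreFraction :: Double -> Double -> Double
coreFraction costMicros requestsPerSecond = costMicros * requestsPerSecond / 1e6

main :: IO ()
main = mapM_ report [1000, 5000, 10000]  -- hypothetical request rates
  where
    report rps =
      putStrLn (show rps ++ " req/s: " ++ show (coreFraction 100 rps) ++ " of a core")
```

At 10,000 requests per second, 100μs of repeated work per request adds up to a full core's worth of time every second.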