4

(This question uses JavaScript for its examples.)

Say I have a bunch of files that I need to access sequentially:

1.json
2.json
3.json
4.json
...

Say I want to transform the data in all of these files and store the transformed data in new files.

This loop goes through a list of IDs, reads a file, transforms the data, and then writes the data to a new file.

import { readFileSync, writeFileSync } from 'fs'
const ids = [1,2,3,4]
for (const id of ids) {
 const datum = JSON.parse(readFileSync(`${id}.json`))
 const transformedDatum = ....
 writeFileSync(`${id}-transformed.json`, JSON.stringify(transformedDatum))
}

In reality, the logic for all of this is many more lines of code, and I want to use functional programming techniques to make it more readable and make pure functions that don't have side effects.

So I start to go down this road.

import { readFileSync, writeFileSync } from 'fs'
function transformDatum(id) {
 const datum = JSON.parse(readFileSync(`${id}.json`))
 const transformedDatum = ....
 writeFileSync(`${id}-transformed.json`, JSON.stringify(transformedDatum))
}
const ids = [1,2,3,4]
const transformedData = ids.map(transformDatum)

This separates the list of IDs from the business logic: the IDs are passed in, and the two no longer live in the same scope.

But the function is still interacting with the file system. I want to isolate the I/O so that the function receives the file data and returns the transformed data, making it pure.

import { readFileSync, writeFileSync } from 'fs'
function transformDatum(rawDatum) {
 const transformedDatum = ....
 return transformedDatum
}
const ids = [1,2,3,4]
for (const id of ids) {
 const rawDatum = JSON.parse(readFileSync(`${id}.json`))
 const transformedDatum = transformDatum(rawDatum)
 writeFileSync(`${id}-transformed.json`, JSON.stringify(transformedDatum))
}

Now the transformDatum function is pure, but I've sort of walked back to having an outer for loop which goes through the IDs.

Is this more or less a "healthy" tradeoff to make in functional programming? Is there some better way of handling this?

Philip Kendall
asked Jun 1, 2024 at 12:50
3
  • 7
    You have just about reached the point where you need the IO monad or some equivalent construct. If you want to do full-on functional programming, this is where the training wheels come off. Commented Jun 1, 2024 at 13:22
  • Why not use async/await and reduce it back to imperative style? Or in other words, why would functional solution be more readable? Commented Jun 2, 2024 at 1:30
  • 1
    @Basilevs async/await really wouldn't do much in OP's example, and give more code as they'd have to promisify read and writefile Commented Jun 3, 2024 at 7:55

2 Answers

8

You don't need to give up map for a for-each loop; you can just chain the functions:

const saveResult = ids.map(readFile).map(transformFile).map(saveFile)

Because saveFile needs the id for the filename you will have to pass extra metadata in each call, so each function will have to return a tuple or object with the main data + the id and any other metadata that makes sense.
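
A minimal sketch of that idea, assuming each step passes along an object of the form `{ id, data }`. The `files` Map is a stand-in for the disk so the example is self-contained, and the bodies of `readFile`, `transformFile`, and `saveFile` are hypothetical (the doubling step is only a placeholder transformation):

```javascript
// In-memory stand-in for the file system (an assumption, so the sketch runs anywhere).
const files = new Map([
  ['1.json', '{"value": 1}'],
  ['2.json', '{"value": 2}'],
])

// Impure: reads from the store. Returns the id alongside the parsed data.
const readFile = (id) => ({ id, data: JSON.parse(files.get(`${id}.json`)) })

// Pure: placeholder transformation that keeps the id attached.
const transformFile = ({ id, data }) => ({ id, data: { value: data.value * 2 } })

// Impure: writes to the store and returns the name it wrote.
const saveFile = ({ id, data }) => {
  const name = `${id}-transformed.json`
  files.set(name, JSON.stringify(data))
  return name
}

const ids = [1, 2]
const saveResult = ids.map(readFile).map(transformFile).map(saveFile)
```

Here `saveResult` holds the names that were written, and only `transformFile` is pure. Note that each `map` runs over every id before the next `map` starts, so all files are read before any is written.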

If it's the reading and writing to disk that you find problematic, you could push that further into the background by adding some stream functions:

const saveResult = ids.map(getFileStream)
 .map(parseStreamToObject)
 .map(transformObject)
 .map(objectToStream)
 .map(writeFileStream)

I guess this separates out a couple more pure functions. But I think this is just a general problem with functional programming. At some point you hit the edge and want to do something that isn't just maths and have to suck it up.

answered Jun 1, 2024 at 15:08
9
  • 1
    Yup. Functional Programming 101: pass functions around like any other value. At some point you are coupled to I/O operations (read file and save file) but the bits in between don't need to. Commented Jun 1, 2024 at 15:51
  • 1
    I'm not the downvoter, but the functional programmer in me squirms a little bit at calling map with a non-pure function. Commented Jun 1, 2024 at 19:17
  • 1
    @PhilipKendall, at some point your code needs to perform I/O. Just make it clear where the "purity" ends and the side effects begin. A good name like saveFile works just fine. Commented Jun 1, 2024 at 20:33
  • 2
    Or maybe call forEach(writeFile) instead? Commented Jun 1, 2024 at 20:34
  • 2
    BTW, @Ewan, "let me add some streams..." has just become my go-to solution for every functional programming conundrum ever! All the more funny because I thought your original answer was fine, but "more streams" apparently was better. Commented Jun 2, 2024 at 3:17
1

I am not a fan of using map for things with side effects. Alternatively, you can pass your read and write functions as parameters to the transform, like so:

import { readFileSync, writeFileSync } from 'fs'
function transformDatum(id, readFunction, writeFunction) {
 const datum = JSON.parse(readFunction(`${id}.json`))
 const transformedDatum = ....
 writeFunction(`${id}-transformed.json`, JSON.stringify(transformedDatum))
}
const ids = [1,2,3,4]
for (const id of ids)
 transformDatum(id, readFileSync, writeFileSync);
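
One practical consequence of passing the functions in is that `transformDatum` can be exercised without touching the disk. A sketch, assuming a placeholder transformation (the original elides it) and hypothetical in-memory replacements `fakeRead` and `fakeWrite` in place of the `fs` functions:

```javascript
// Same shape as the function above; the uppercase step is a placeholder transformation.
function transformDatum(id, readFunction, writeFunction) {
  const datum = JSON.parse(readFunction(`${id}.json`))
  const transformedDatum = { ...datum, name: datum.name.toUpperCase() }
  writeFunction(`${id}-transformed.json`, JSON.stringify(transformedDatum))
}

// Hypothetical in-memory replacements for readFileSync and writeFileSync.
const fakeFiles = { '1.json': '{"name":"a"}' }
const fakeRead = (path) => fakeFiles[path]
const fakeWrite = (path, contents) => { fakeFiles[path] = contents }

transformDatum(1, fakeRead, fakeWrite)
console.log(fakeFiles['1-transformed.json']) // {"name":"A"}
```

The function still performs effects through whatever it is handed, so it is not pure in the strict sense; what it gains is independence from `fs`.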
answered Jun 3, 2024 at 8:37
2
  • Wouldn't this mean that transformDatum is now impure? I was thinking that transformDatum would only remain pure if the data from the file is passed in as input and the data to be written is passed out as output. If readFunction is making a call inside transformDatum, then doesn't that mean that transformDatum now has a side effect? Maybe there is a functional composition approach that could make this work, though Commented Jun 3, 2024 at 11:34
  • @rpivovar Ideally you'd read |> transform |> write for it to be pure, or at least as pure as you can get with I/O. In the code above, though, the function doesn't know about the file system or how it gets the data. It just knows that it can call read to get data and write to output it. It is still functional in that we treat the read and write functions as values. Commented Jun 3, 2024 at 11:57
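
The `|>` operator is not part of standard JavaScript as of this thread, so the `read |> transform |> write` shape has to be spelled with ordinary function composition. A sketch, assuming a hypothetical `pipe` helper, a placeholder transformation, and an in-memory `store` in place of the file system:

```javascript
// Hypothetical helper: left-to-right function composition.
const pipe = (...fns) => (input) => fns.reduce((value, fn) => fn(value), input)

// In-memory stand-in for the disk (an assumption to keep the sketch self-contained).
const store = { '3.json': '[1,2,3]' }

// Impure edges, each built per id so the filename travels with the closure.
const read = (id) => () => JSON.parse(store[`${id}.json`])
const write = (id) => (data) => { store[`${id}-transformed.json`] = JSON.stringify(data) }

// Pure middle: placeholder transformation.
const transform = (numbers) => numbers.map((n) => n * 10)

// read -> transform -> write, with I/O only at the two ends.
const processId = (id) => pipe(read(id), transform, write(id))()
processId(3)
```

Only `transform` is pure here; `read` and `write` mark the points where the side effects happen.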
