I'm interested in becoming more familiar with functional programming as a paradigm, so I'm trying to introduce a more functional style to some of my projects. I'm struggling to understand how to handle side effects with a database.
I have some functions that kind of look like this:
                     db query
                         +
    db query             |
       +                 |
       |                 v
       |      +--------->a()
       v      |
    f(type)+--+
              |
              +--------->b()
The trouble is that both `f` and `a` are impure functions because they need to do database queries. I've seen some functional projects that work by having all the state in a single place, while the rest of the application takes bits and pieces of state as function parameters. I can replicate something like that here by putting all the queries in `f`, for example, but since `b` doesn't need the database queries used by `a`, this would be really inefficient.
Is there a pattern for handling database access in functional programs?
2 Answers
Extract important business logic out of application dependencies and "wrap" it with database calls or calls to other external resources.
-- Load data
-- Process data
-- Save data based on processed result
The code structure will look like this:

    # --> logic with side effects
    data = load_data_from_external_resources()

    # --> pure logic without side effects
    result = business_logic.process(data)

    # --> logic with side effects based on result
    if result.has_records
      save(result.records)
    end
    if result.should_notify_others
      notify_others(result.message)
    end

With such an approach, the business logic is completely independent of application technologies (database, file system or third parties), both in design and at runtime.
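The load, process, save structure above can be made concrete with a small runnable sketch. Everything in it is an assumption for illustration, not part of the original answer: the invoice domain, the names `Result`, `BusinessLogic.process` and `run`, and the in-memory hash `store` that stands in for a real database client.

```ruby
require "date"

# Pure core: no database, file system, or network access.
# Result and BusinessLogic are hypothetical names for this sketch.
Result = Struct.new(:records, :message) do
  def has_records
    !records.empty?
  end

  def should_notify_others
    !message.nil?
  end
end

module BusinessLogic
  # Selects unpaid invoices that are past due and builds a message if any exist.
  def self.process(invoices, today)
    overdue = invoices.select { |inv| inv[:due_on] < today && !inv[:paid] }
    message = overdue.empty? ? nil : "#{overdue.size} overdue invoice(s)"
    Result.new(overdue, message)
  end
end

# Imperative shell: the only place that touches the outside world.
# `store` is an in-memory hash standing in for a database (an assumption).
def run(store, today)
  invoices = store[:invoices]                       # side effect: load
  result = BusinessLogic.process(invoices, today)   # pure
  store[:flagged] = result.records if result.has_records           # side effect: save
  store[:outbox] << result.message if result.should_notify_others  # side effect: notify
  result
end
```

Because `BusinessLogic.process` only takes values and returns a value, it can be tested with plain hashes and no database at all.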
For very complicated cases, where loading all the required data up front is not possible because of performance issues, we can introduce small bits of business logic and "stack" layers of application and business logic on top of each other:

    # --> load minimum required data
    data = load_minimum_data()

    # --> pure logic without side effects
    next_steps = business_logic.determine_next_steps(data)

    if next_steps.need_more_data?
      # --> load more data
      data = data + load_more_data()
    end

    # --> pure logic without side effects
    result = next_steps.process(data)

    # --> logic with side effects based on result
    if result.has_records
      save(result.records)
    end

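A minimal runnable sketch of this "stacked" variant, under stated assumptions: the `Pricing` module, the discount rules, and the `load_order` / `load_history` callables are all hypothetical stand-ins for real business logic and real queries. The point it tries to show is that the second, more expensive load only happens when the pure logic asks for it.

```ruby
# Pure logic, split into two small steps (all names are hypothetical).
module Pricing
  # Step 1: decide from the minimal data whether more data is needed.
  def self.needs_history?(order)
    order[:total] >= 100
  end

  # Step 2: compute the discount percentage from whatever was gathered.
  def self.discount(order, history)
    return 0 unless needs_history?(order)
    history.size >= 3 ? 10 : 5
  end
end

# Shell: alternates between loading and pure decisions.
# load_order and load_history are callables standing in for database queries.
def price_order(order_id, load_order:, load_history:)
  order = load_order.call(order_id)                  # load minimum required data
  history =
    if Pricing.needs_history?(order)                 # pure decision
      load_history.call(order[:customer])            # load more data only when needed
    else
      []
    end
  Pricing.discount(order, history)                   # pure result
end
```

Passing the loaders in as callables is one possible design choice; it also makes it easy to substitute counting fakes in tests to check which queries actually ran.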
- Downvoter, please explain what is wrong with the answer; I will be happy to fix or delete it. – Fabio, Feb 7, 2021 at 17:50
- I don't see what's wrong with it. Have an upvote from me. – candied_orange, Feb 7, 2021 at 19:05
- This kind of approach is often called "functional core, imperative shell". Such an architecture is a great fit for many problems, and corresponds nicely to concepts such as hexagonal architecture or onion architecture. – amon, Feb 7, 2021 at 19:49
- Have an upvote from me too, and damn the silent downvoters! – Steve, Feb 8, 2021 at 1:02
`f` only needs to do a database query because it needs `a`, which needs a database query.
`a` only needs to do a database query because no one has passed it what the result of that query would be.
Solution: pass `a` what it needs.
Side effects exist in every useful program. Otherwise you'd see nothing on the screen. The issue is where they exist. Programming functionally isn't about eliminating all side effects. It's about being formal with them: only permitting them to exist in well-controlled places, away from all your interesting business logic, the code that needs testing and debugging. That's where you keep your pure functions. Move side effects to boring structural code that people understand at a glance.
The biggest impact here is that `a` no longer controls when the database is queried. But if you want `a` to be pure, that's the cost it comes at.
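As a rough sketch of what this could look like for the question's `f`, `a` and `b`: the bodies of `a` and `b`, the `needs_extra_rows?` helper, and the `query_row` / `query_extra` callables are all invented for illustration. Here `a` and `b` are pure, and `f` becomes the boring structural code that decides when each query runs, so the query that only `a` needs is still skipped on the `b` branch.

```ruby
# Pure: works only on the rows it is handed (body is a placeholder).
def a(extra_rows)
  extra_rows.sum
end

# Pure: needs nothing beyond the row f already loaded (body is a placeholder).
def b(row)
  row[:value] * 2
end

# Pure dispatch rule: says which branch applies without running any query.
def needs_extra_rows?(row)
  row[:type] == :a
end

# Structural code with side effects: the only function that runs queries.
# query_row and query_extra are hypothetical callables standing in for
# the two database queries in the question's diagram.
def f(type_id, query_row:, query_extra:)
  row = query_row.call(type_id)
  if needs_extra_rows?(row)
    a(query_extra.call(type_id))   # second query runs only on this branch
  else
    b(row)
  end
end
```

`f` itself is still impure in this sketch, but it no longer contains anything worth unit testing beyond the wiring; the logic lives in `a`, `b` and `needs_extra_rows?`.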
- […] `f` as a parameter? In most cases it is not significant, especially in cases where you can load data asynchronously.
- […] `SELECT` statement returns data from the file system, which can be updated between select statements.
- […] `SELECT COUNT(*) FROM MyTable` returns `99`. Next week, the same query (with the same arguments, as you said) `SELECT COUNT(*) FROM MyTable` will return `120`. With a pure function, `sum(12, 30)` returns `42` today, and next week it will also return `42`. With a pure function I can save the final result and never call this function again for arguments `12` and `30`, but I cannot do that for SQL statements; I need to execute the actual SQL statement again.
- […] `"SELECT COUNT(*) FROM MyTable"` string to the database and receives different results at different times, which is the case for OP's example.