I’m working with large nested associative arrays in PHP and I need to apply transformations (like map, filter, reshape) immutably, meaning the original array should not be modified.
The problem is that every time I use array_map or array_filter, PHP creates a new array copy, which becomes very slow and memory-heavy for large datasets.
For example:
$data = [
    'users' => [
        ['id' => 1, 'name' => 'Alice', 'active' => true],
        ['id' => 2, 'name' => 'Bob', 'active' => false],
    ],
];

$mapped = array_map(fn($u) => ['id' => $u['id']], $data['users']);
$filtered = array_filter($mapped, fn($u) => $u['id'] % 2 === 0);
This works but copies arrays multiple times. On large inputs, it’s slow and inefficient.
Main question: What is the most efficient way in PHP to transform nested arrays immutably without repeatedly creating deep copies?
- This sounds like a good task to use a database. – Markus Zeller, Sep 7 at 12:18
- You might want to look at other questions like stackoverflow.com/questions/5792388/… – Progman, Sep 7 at 14:43
- I would urge you to filter before bothering to restructure the data, because otherwise you are going to process some data sets before you throw them away. Are we looking at a realistic process in your question, or does it get crazy-convoluted? – mickmackusa ♦, Sep 8 at 4:34
- Arrays in PHP are always copied when passed or returned, while objects are always passed by reference. There isn't enough information here, but if you don't need all the data you can use transducers, streams, or generators. If you do need all the data, you could use some kind of data structure that works like an array but is really an object, layered so that each transformation makes a new instance with the old one as a "fallback", and fetching only falls through to the original data when a key hasn't been overridden. That's more complex than the other alternatives. – Sylwester, Sep 8 at 13:48
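To make that last suggestion concrete, here is a minimal sketch of such a layered, array-like object. This is my own illustration, not code from the comment or the question; the Layer class and its with() method are invented names. Reads fall through to the wrapped data unless a key has been overridden, so a "change" creates a small new layer instead of copying the whole array:

final class Layer implements ArrayAccess
{
    public function __construct(
        private array $overrides = [],
        private array|ArrayAccess|null $fallback = null,
    ) {}

    // Returns a NEW layer with one key overridden; this layer and the
    // underlying data stay untouched.
    public function with(int|string $key, mixed $value): static
    {
        return new static([$key => $value] + $this->overrides, $this);
    }

    public function offsetExists(mixed $key): bool
    {
        return array_key_exists($key, $this->overrides)
            || ($this->fallback !== null && isset($this->fallback[$key]));
    }

    public function offsetGet(mixed $key): mixed
    {
        if (array_key_exists($key, $this->overrides)) {
            return $this->overrides[$key];
        }
        return $this->fallback !== null ? $this->fallback[$key] : null;
    }

    public function offsetSet(mixed $key, mixed $value): void
    {
        throw new LogicException('Layer is read-only; use with() instead.');
    }

    public function offsetUnset(mixed $key): void
    {
        throw new LogicException('Layer is read-only.');
    }
}

// Usage: wrap the original rows once, then "modify" without copying them.
$users   = new Layer([], $data['users']);
$slimmed = $users->with(0, ['id' => $users[0]['id']]); // new view; $data untouched

Whether this pays off depends on access patterns: deep stacks of layers make every read slower, which is why the comment calls it more complex than the other alternatives.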
2 Answers
As mentioned in the comments section, one option is to use generators, which basically iterate on demand. You can also use references (&) instead of values.
Example:
$data = [
    'users' => [
        ['id' => 1, 'name' => 'Alice', 'active' => true],
        ['id' => 2, 'name' => 'Bob', 'active' => false],
    ],
];

$filterPairIds = function () use (&$data) {
    foreach ($data['users'] as &$user) {
        if ($user['id'] % 2 === 0) {
            yield $user;
        }
    }
};

$filtered = $filterPairIds();

foreach ($filtered as $user) {
    print_r($user);
}
This keeps memory usage low: the generator yields one matching user at a time instead of building an intermediate array, and the original $data is left unmodified.
Example link (php sandbox): https://onlinephp.io/c/c79a7
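Building on the same idea, the question's map and filter steps can both be written as generators and chained, so neither step materializes an intermediate array. The following is a rough sketch rather than part of the answer above (keepEvenIds and mapIds are made-up names, and $data is the array from the example); it filters first, as suggested in the comments, so rows are dropped before being reshaped:

function keepEvenIds(iterable $users): Generator
{
    foreach ($users as $user) {
        if ($user['id'] % 2 === 0) { // drop rows early, before reshaping
            yield $user;
        }
    }
}

function mapIds(iterable $users): Generator
{
    foreach ($users as $user) {
        yield ['id' => $user['id']]; // reshape only the rows that survived
    }
}

// $data stays untouched; each row streams through both stages one at a time.
foreach (mapIds(keepEvenIds($data['users'])) as $user) {
    print_r($user);
}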
You might want to consider classic PHP programming:
$data = [
    'users' => [
        ['id' => 1, 'name' => 'Alice', 'active' => true],
        ['id' => 2, 'name' => 'Bob', 'active' => false],
    ],
];

$filtered = [];
foreach ($data['users'] as $key => $user) {
    if ($user['id'] % 2 === 0) {
        $filtered[$key] = ['id' => $user['id']];
    }
}

var_export($filtered);
Output:
array (
  1 =>
  array (
    'id' => 2,
  ),
)
This shouldn't be hard to read when you're used to PHP. Each user is processed separately and is only added to the final output when all processing is done. That means it doesn't create copies of big arrays, so it's efficient. Some people think that using built-in functions would be more efficient, but there's no real reason to think so in this case. Quite the opposite.
Demo: https://3v4l.org/KXYnC