I have a huge list of 4k+ software products.
My goal is simple. I have a list of products. Each of those products has a software value. I would like to extract each software occurrence and covert it into an object in the format {text, id}
.
The code below works as I need, but I would like to know if you have a better way to do it.
softwareProducts.reduce((acc = [], {software_id, software_name}) => {
if (acc.find((row) => row.id === software_id) === undefined) {
acc.push({
text: software_name,
id: software_id
})
}
return acc;
}, []);
1 Answer 1
Your explanation of what the code does is a little bit unclear. The algorithm you've shown dedupes an array of objects based on ids, choosing the first occurrence of each id. It's not so much that you're choosing "each occurrence" as much as "first occurrence".
You can improve on the linear lookup (.find()
loops over the whole acc
every time), making the quadratic algorithm a linear one using a Set
, which offers constant time lookups. This essentially removes the inner loop in exchange for a bit of extra space and a bit of allocation overhead, likely well worth it on 4k+ items (always benchmark to be sure):
const softwareProducts = [
{software_id: 1, software_name: "foo"},
{software_id: 2, software_name: "bar"},
{software_id: 1, software_name: "ignore this"},
{software_id: 3, software_name: "baz"},
{software_id: 1, software_name: "ignore this"},
{software_id: 3, software_name: "ignore this"},
];
const resultIds = new Set();
const result = [];
for (const {software_id: id, software_name: text} of softwareProducts) {
if (!resultIds.has(id)) {
resultIds.add(id);
result.push({id, text});
}
}
console.log(result);
.reduce
feels a little overworked acting as a .map
and .filter
. It's usually clearer as an explicit loop as shown above.
Normally, JS uses camelCase, but I assume software_name
comes from an API you don't have control over. In that case, it's acceptable to use, but use softwareName
if the code is under your control.
acc = []
is unnecessary; just use acc
. The default argument is a no-op.
The return value of softwareProducts.reduce
needs to be assigned to something. I assume you're doing so in your actual code.
See also Create array of unique objects by property for other approaches.
-
\$\begingroup\$ An array of objects is being created from an array of objects. The obvious performance improvement would be to not create the new array of objects but simply remove the duplicates from the original list. All I'm saying is that without knowing the input data format, nothing definitive can be said about performance tweaking. \$\endgroup\$radarbob– radarbob2024年06月14日 23:46:24 +00:00Commented Jun 14, 2024 at 23:46
-
1\$\begingroup\$ @radarbob Intuitive as that may seem, don't think that'll help. Removing one element from a list is O(n) unless it's the back element, but you can't make that guarantee here, so that would be quadratic. Feel free to post your own answer, but I'm pretty sure this will outperform it. The only microoptimization I see here is avoiding destructuring and new allocations of each object, but I doubt that's worth it so I didn't bother mentioning it. \$\endgroup\$ggorlen– ggorlen2024年06月14日 23:51:38 +00:00Commented Jun 14, 2024 at 23:51
Explore related questions
See similar questions with these tags.
acc.find()
is doing a linear search and going all the way to the end of the array for any item it doesn't find. If the list is long, keeping a separateSet
object of row ids may be a faster way to just determine if you've already added this one to the list or not. \$\endgroup\$row
come from? Is it supposed to come from thesoftwareProducts
array? \$\endgroup\$