Determine whether two arrays have the same elements with the same multiplicities

Question 1

I have written a Python function called comp that checks whether two given arrays have the same elements, with the same multiplicities. The multiplicity of a member is the number of times it appears in the array. I would like to post my code here for review, so that I can optimize it and receive any advice on how to improve it.

Here's the code for the function:

def comp(array1, array2):
 c = 0
 ta = []
 if len(array1) == 0 or len(array2) == 0:
 return None
 for i in range(len(array2)):
 for j in range(len(array2)):
 if array2[i] != array1[j] * array1[j]:
 c += 1
 ta.append(c)
 c = 0
 if len(array1) in ta:
 return False
 else:
 return True

I would appreciate any suggestions or advice on how to improve the code's efficiency, readability, or any other best practices that I might be missing.

Question 2

Could you provide some sample input?

Question 3

This code doesn't appear to work correctly. Based on the problem description, comp([1,2], [1,2]) should return True, but it returns False.

Question 4

What resource do you want to "optimize"? Memory, CPU, something else? There are tags for these, but I don't know which to apply. (BTW, I removed the "functional programming" tag, as this is clearly using an imperative style).

Question 5

Why is this code squaring elements of array1? This code makes no sense to me, and no comments explain why it's doing it that way. It's highly non-obvious how this algorithm works, although I guess the idea of looking for len(array1) in ta[] is that it's counting non-matches, so that would find a case where nothing matched? Seems weird.

Question 6

 for i in range(len(array2)):
 for j in range(len(array2)):
 if array2[i] != array1[j] * array1[j]:
 c += 1

The multiplicity of a member is the number of times it appears in the array.

Either the problem is ill-specified, or the code is wrong.

When the lengths of array{1,2} differ, we either ignore many entries or we raise IndexError.

It is very surprising for a bool predicate to return None. Consider using mypy to help you lint this code.

The identifier c apparently means "count". Take the opportunity to spell that out more fully.

What were you thinking when you chose the identifier ta?? What should I think when I read it? No idea what's behind that decision. Rename it before merging down to main.

You chose a list datastructure for something that we only do an in test on. Ordinarily I would encourage switching it to a set, for O(1) constant lookup speed. But given the crazy quadratic complexity of that loop, it's lost in the noise. Instead you should focus on choosing a sensible algorithm with complexity O(N log N) or better.

Given the lack of unit tests and poor level of specification / documentation, I would not be willing to delegate or accept maintenance tasks on this code base.

Question 7

There's no docstring for the function (and it's so generically named, it could easily be mistaken for something else).

There are no unit-tests; they should be trivial to add.

Variable names such as c and ta might mean something to you now, but it's unlikely they will mean much to the next maintainer (perhaps 6-months-older you?).

It's surprising that a predicate function can return None. If one of the inputs is empty, I'd expect the result to be true if, and only if, the other input is also empty.

It's not clear why we're multiplying array1[j] by itself when comparing. That's going to break the comparison, and should fail our unit tests.

The O(n2) algorithm that searches for matches scales poorly. We can make it much more efficient (at the expense of increased memory usage) by constructing a multiset (Counter) from each input, then comparing the two multisets for equality.

Simpler implementation

#!/usr/bin/python3
import collections
def bag_equal(array1, array2):
 """
 True if the inputs contain the same elements (including duplicates)
 in any order.
 >>> bag_equal([], [])
 True
 >>> bag_equal([], [''])
 False
 >>> bag_equal([0, 0, 2, 3, 3], (0, 2, 3, 0, 3))
 True
 >>> bag_equal([0, 0, 1], [0, 1, 1])
 False
 >>> bag_equal(["me", "you"], ["me", "you", "you"])
 False
 """
 # Test lengths first, to avoid creating counters unnecessarily
 return (len(array1) == len(array2) and
 collections.Counter(array1) == collections.Counter(array2))
if __name__ == '__main__':
 import doctest
 doctest.testmod()

Question 8

Things to check upfront

same length, else multiplicities cannot all be the same

tools to use

collections.Counter
most simple:
return Counter(items1) = Counter(items2)
one additional Counter instead of two:
```
 delta = Counter(items1)
 delta.subtract(items2)
 return not any(delta.values())
```
sorted - useful if the items cannot be hashed (meaning Counter won't work)
return sorted(items1) == sorted(items2)

Question 9

Inviting everybody to expand or otherwise improve.

Question 10

Counter.subtract() returns None, so you can't chain the .values(). delta = Counter(items1)\ndelta.subtract(items2)\nreturn not any(delta.values()) should work.

J_H J_H 41.4k3 gold badges38 silver badges157 bronze badges · Accepted Answer · 2023-05-10 16:54:10Z

 for i in range(len(array2)):
 for j in range(len(array2)):
 if array2[i] != array1[j] * array1[j]:
 c += 1

The multiplicity of a member is the number of times it appears in the array.

Either the problem is ill-specified, or the code is wrong.

When the lengths of array{1,2} differ, we either ignore many entries or we raise IndexError.

It is very surprising for a bool predicate to return None. Consider using mypy to help you lint this code.

The identifier c apparently means "count". Take the opportunity to spell that out more fully.

What were you thinking when you chose the identifier ta?? What should I think when I read it? No idea what's behind that decision. Rename it before merging down to main.

You chose a list datastructure for something that we only do an in test on. Ordinarily I would encourage switching it to a set, for O(1) constant lookup speed. But given the crazy quadratic complexity of that loop, it's lost in the noise. Instead you should focus on choosing a sensible algorithm with complexity O(N log N) or better.

Given the lack of unit tests and poor level of specification / documentation, I would not be willing to delegate or accept maintenance tasks on this code base.

Stack Exchange Network

Determine whether two arrays have the same elements with the same multiplicities

3 Answers 3

Simpler implementation

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Determine whether two arrays have the same elements with the same multiplicities

3 Answers 3

Simpler implementation

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions