As an academic exercise, I have to write a parallel algorithm that given a sorted array of $n$ integers computes the mode (i.e. the item with the highest frequency) efficiently using $p$ processors, where $p \le n$ is a constant.
The model we use to describe these algorithms allows lock/unlock primitives to synchronize concurrent access to shared variables.
We cannot use hash tables. Could anyone share any hint on the optimal algorithm to solve this problem?
Edit: rephrased question according to comments.
Edit 2: adding my solution for feedback. While trying to solve the exercise I thought of an algorithm in the lines of
- Declare two global variables, mode and modeFrequency and initialize them appropriately;
- For i where 1ドル \le i \le p,ドル invoke a concurrent process on a portion of the array.
- In each concurrent process: 3.1.. Find the local mode of the partition; 3.2. Store the local mode and its frequency in two local variables; 3.3. Compare the local mode with the global one: 3.3.1. if the mode is the same, add the local frequency to the global one; 3.3.2. if the local mode is different than the global one and the local frequency is higher than the global one, set the local mode/frequency to the global ones
- Return the global mode.
but I am not convinced of the correctness of the algorithm. Please note that I omitted locks/unlocks for brevity. Also, can the way I partition the array make any difference on the correctness of the algorithm?
-
$\begingroup$ Is p a constant, or are we in something like the PRAM model? Do you mean to ask how to do this in parallel? (See here.) $\endgroup$Raphael– Raphael2018年02月13日 16:23:59 +00:00Commented Feb 13, 2018 at 16:23
-
$\begingroup$ What did you try? Where did you get stuck? We're happy to help you understand the concepts but just solving exercises for you is unlikely to achieve that. You might find this page helpful in improving your question. $\endgroup$D.W.– D.W. ♦2018年02月13日 17:10:54 +00:00Commented Feb 13, 2018 at 17:10
-
$\begingroup$ @D.W. ♦ I know how to solve the problem in $O(n)$ with a sequential algorithm by taking advantage of the fact that the array is sorted. However I don't know how to write a parallel algorithm to solve the problem in $O(n/p),ドル if that's possible at all. And I was looking for hints on the way to proceed, not for someone to do the assignment for me. :) $\endgroup$user3075898– user30758982018年02月13日 19:59:53 +00:00Commented Feb 13, 2018 at 19:59
-
$\begingroup$ Partition 1 contains A 100 times, and some others. Partition 2 contains B 52 times and C 51 times. Partition 3 contains C 50 times and D 52 times. The processor handling Partition 2 must tell you about both B and C; the processor handling Partition 3 must tell you about both C and D. $\endgroup$gnasher729– gnasher7292018年02月13日 22:44:53 +00:00Commented Feb 13, 2018 at 22:44
1 Answer 1
Since you asked for a hint:
Partition the array.
Think about that for a while. If you need more of a hint:
If you partitioned the array into $p$ partitions and assigned each partition to a single processor, could you compute the mode of the numbers within a particular partition efficiently? Would that help get you closer to a solution to what you want to achieve?
-
$\begingroup$ @d-w thanks for the hint, I thought about that before asking this question and I had an algorithm in mind. Could you comment on its correctness please? $\endgroup$user3075898– user30758982018年02月13日 22:27:25 +00:00Commented Feb 13, 2018 at 22:27