4
$\begingroup$

I have an algorithm which, basically given an array of $n$ numbers, checks if there is any repeated numbers in the array, and returns true if there is and false otherwise.

It uses a direct access table (hashing function $h(x)=x$), which makes the running time linear. So it creates a new array, initializes all values to false, then iterates through the given array, and since each value is an index to the hash table, it accesses that array location and changes it to 1ドル$. If it is already a 1ドル,ドル then you know that it is a repeated number.

But when getting the expected running time, you need to first define a probability space. This is the part I am confused about. How can you define a probability of a number being repeated in A, if you are just given an array with arbitrary numbers?

Paresh
3,3681 gold badge21 silver badges33 bronze badges
asked Mar 6, 2013 at 15:35
$\endgroup$
5
  • $\begingroup$ Maybe a random variable that assigns to every number in the array a number of occurrences. $\endgroup$ Commented Mar 6, 2013 at 15:43
  • $\begingroup$ I don't understand, can you show an example? $\endgroup$ Commented Mar 6, 2013 at 15:51
  • $\begingroup$ Here is an average case analysis that might help: ocw.mit.edu/courses/electrical-engineering-and-computer-science/… $\endgroup$ Commented Mar 6, 2013 at 15:52
  • $\begingroup$ How much is the running time affected by the presence/absence of repeats? If none/not much, it doesn't matter ;-) If it does matter, you have to define a reasonable model of the input. Let's say $n$ numbers selected at random from among $m,ドル with replacement. Note that if $m \gg n$ the average number of repeats will approach 0, for $m < n$ the pigeonhole principle guarantees repeats. Certainly other models could be more realistic, depending on the exact surroundings. $\endgroup$ Commented Mar 6, 2013 at 16:11
  • $\begingroup$ The running time is clearly affected by the the presence of repeats, but you could simplify your calculation, if it doesn't matter which element is repeated(e.g. $\{1,2,3,4\}$ and 96 times 5 versus 20 times $\{1,\dots,5\}$). $\endgroup$ Commented Mar 6, 2013 at 16:54

1 Answer 1

1
$\begingroup$

Usually you assume uniform distribution on inputs which share a certain parameter. You could assume uniform distribution on e.g.

  1. All inputs of length $n,ドル e.g. if your input looks like $w_1\#\dots\#w_k,ドル where $w_i\in\{0,1\}^*,ドル $n=k-1+\sum_{i=1}^k|w_i|$.
  2. All multisets (bags) of $n$ integers up to $k$
  3. All $n$-tuples of integers integers up to $k$ (different,since e.g. $\{1,2\}_b$ corresponds to $(1,2)$ and $(2,1)$ and $\{1,1\}_b$ only to $(1,1)$)

You can combine this with an assumption about the distribution of the parameter, say the size $n\sim \mathrm{Geo}(0.5)$. However, this is unusual, most would consider the limit of probabilities as your parameter $n\rightarrow\infty$ instead.


If you choose 3, the probability of no repeated numbers can be calculated this way: There are different $\binom{k}{n}$ sets of size $n$ and each set corresponds to $n!$ $n$-tuples. There is a total number of $k^n$ $n$-tuples, so $$P(\text{no repeated number})=\frac{\binom{k}{n}\cdot n!}{k^n}$$

answered Mar 6, 2013 at 16:09
$\endgroup$
2
  • $\begingroup$ I don't understand your number 1. The input can be like A=[1, 3, 5, 2, 6, 3]. Elements of A can be any number. Are you trying to say all elements of A must be from some finite set? $\endgroup$ Commented Mar 6, 2013 at 16:21
  • $\begingroup$ No, that's the actually the advantage of number 1, that you don't have to consider a second parameter $k$. You take the sum of the length of the numbers (binarily encoded, i.e. $\sum_{i=1}^n \lceil log_2(a_i)\rceil$) + plus the number of numbers (since you need some separator sign) as a parameter. $\endgroup$ Commented Mar 6, 2013 at 16:41

Your Answer

Draft saved
Draft discarded

Sign up or log in

Sign up using Google
Sign up using Email and Password

Post as a guest

Required, but never shown

Post as a guest

Required, but never shown

By clicking "Post Your Answer", you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.