Find the first duplicated element

Question 1

Given an array a that contains only numbers in the range from 1 to a.length, find the first duplicate number for which the second occurrence has the minimal index. In other words, if more than one number is duplicated, return the number whose second occurrence has a smaller index than the second occurrence of any other duplicated number. If there is no such element, your program / function may result in undefined behaviour.

Example:

For a = [2, 3, 3, 1, 5, 2], the output should be firstDuplicate(a) = 3.

There are 2 duplicates: numbers 2 and 3. The second occurrence of 3 has a smaller index than the second occurrence of 2 does, so the answer is 3.

For a = [2, 4, 3, 5, 1], the output should be firstDuplicate(a) = -1.

This is code-golf, so shortest answer in bytes wins.

BONUS: Can you solve it in O(n) time complexity and O(1) additional space complexity?
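For readers who want an ungolfed baseline, the specification can be read almost literally as code. This is only a sketch: the name `first_duplicate_by_spec` is made up for illustration, and returning -1 when nothing is duplicated is an assumption taken from the second example, since the rules otherwise leave that case undefined.

```python
def first_duplicate_by_spec(a):
    """Return the value whose second occurrence has the smallest index."""
    first_seen = {}    # value -> index of its first occurrence
    second_seen = {}   # value -> index of its second occurrence
    for i, x in enumerate(a):
        if x not in first_seen:
            first_seen[x] = i
        elif x not in second_seen:
            second_seen[x] = i
    if not second_seen:
        return -1  # assumption: the challenge leaves this case undefined
    # Pick the value whose second occurrence comes earliest.
    return min(second_seen, key=second_seen.get)


print(first_duplicate_by_spec([2, 3, 3, 1, 5, 2]))  # 3
print(first_duplicate_by_spec([2, 4, 3, 5, 1]))     # -1
```

This takes O(n) time and O(n) extra space, so it does not meet the bonus.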

Question 3

Python 2, 34 bytes

O(n²) time, O(n) space

Saved 3 bytes thanks to @vaultah, and 3 more from @xnor!

lambda l:l[map(l.remove,set(l))<0]

Try it online!
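The one-liner leans on two Python 2 behaviours: `map` runs eagerly, and a list never compares less than an integer, so `map(...)<0` is `False` and indexes position 0. An ungolfed Python 3 restatement of the same idea might look like this (the name `first_duplicate_remove` is hypothetical, and the copy is an addition; the golfed lambda mutates its argument):

```python
def first_duplicate_remove(values):
    remaining = list(values)   # work on a copy instead of mutating the input
    for v in set(remaining):
        remaining.remove(v)    # list.remove deletes the first occurrence of v
    # Only second and later occurrences are left, still in original order.
    return remaining[0]        # raises IndexError when nothing is duplicated


print(first_duplicate_remove([2, 3, 3, 1, 5, 2]))  # 3
```

Each `list.remove` call is linear, which is where the O(n²) time bound comes from.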

Question 4

It looks like lambda l:l[map(l.remove,set(l))<0] works, even though the order of evaluation is weird.

Question 5

This doesn't return -1 when no duplicates are found without the 'footer code', does that code not count towards the bytes? I'm new to code golf, sorry if it's a basic question!

Question 6

@Chris_Rands Beneath the question musicman did ask if exception is okay instead of -1 and OP said its okay and musicman's answer throws exception.

Question 7

That took me a while to figure out. Well played. Getting the 0th element of l using the conditional after modifying it is really clever.

Question 8

Does Python guarantee the time and space complexity of standard-library functions like set.remove?

Question 9

JavaScript (ES6), ~~47~~ ~~36~~ ~~31~~ 25 bytes

Saved 6 bytes thanks to ThePirateBay

Returns undefined if no solution exists.

Time complexity: O(n) :-)
Space complexity: O(n) :-(

a=>a.find(c=>!(a[-c]^=1))

How?

We keep track of already encountered values by saving them as new properties of the original array a by using negative numbers. This way, they can't possibly interfere with the original entries.
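The same bookkeeping can be sketched in Python, with a side dictionary standing in for the negative-index properties `a[-c]`. The name `first_duplicate_toggle` and the `None` return (mirroring `undefined`) are assumptions made for this illustration.

```python
def first_duplicate_toggle(a):
    flags = {}  # plays the role of the extra properties a[-c]
    for c in a:
        # Like a[-c] ^= 1: the flag becomes 1 on first sight, 0 on the second.
        flags[c] = flags.get(c, 0) ^ 1
        if flags[c] == 0:
            return c
    return None  # mirrors JavaScript's undefined when no duplicate exists


print(first_duplicate_toggle([2, 3, 3, 1, 5, 2]))  # 3
print(first_duplicate_toggle([1, 2, 3, 4, 1]))     # 1
```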

Demo

let f =
a=>a.find(c=>!(a[-c]^=1))
console.log(f([2, 3, 3, 1, 5, 2]))
console.log(f([2, 4, 3, 5, 1]))
console.log(f([1, 2, 3, 4, 1]))

Question 10

25 bytes: a=>a.find(c=>!(a[-c]^=1))

Question 11

@ThePirateBay Oh, of course. Thanks!

Question 12

Note that objects in JavaScript are not necessarily implemented as hash tables, so accessing a key of an object may not take O(1) time.

Question 13

Mathematica, 24 bytes

#/.{h=___,a_,h,a_,h}:>a&

Mathematica's pattern matching capability is so cool!

Returns the original List for invalid input.

Explanation

#/.

In the input, replace...

{h=___,a_,h,a_,h}

A List with a duplicate element, with 0 or more elements before, between, and after the duplicates...

... :>a

With the duplicate element.

Question 14

Jelly, 5 bytes

Ṛœ-QṪ

Try it online!

How it works

Ṛœ-QṪ Main link. Argument: A (array)
Ṛ Yield A, reversed.
 Q Unique; yield A, deduplicated.
 œ- Perform multiset subtraction.
 This removes the rightmost occurrence of each unique element from reversed
 A, which corresponds to the leftmost occurrence in A.
 Ṫ Take; take the rightmost remaining element, i.e., the first duplicate of A.
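
A rough Python rendering of those steps, for anyone who does not read Jelly. It follows the explanation above rather than Jelly's actual implementation, and the name `first_duplicate_multiset` is hypothetical.

```python
def first_duplicate_multiset(a):
    rev = a[::-1]                            # A, reversed
    for v in dict.fromkeys(a):               # A, deduplicated (order kept)
        # Remove the rightmost occurrence of v from the reversed list,
        # which is the leftmost occurrence of v in the original list.
        offset_from_end = rev[::-1].index(v)
        del rev[len(rev) - 1 - offset_from_end]
    return rev[-1]                           # IndexError if nothing is left


print(first_duplicate_multiset([2, 3, 3, 1, 5, 2]))  # 3
```

Unlike the Jelly program, this sketch raises an exception instead of yielding 0 when there is no duplicate.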

Question 15

œ- removes the rightmost occurrences? TIL

Question 16

This doesn't seem to return -1 for no duplicates. Throwing an exception is okay as per OP but I'm not sure if 0 is even though it's not in the range.

Question 17

Pyth, 5 bytes

h.-Q{

Test suite

Remove from Q the first appearance of every element in Q, then return the first element.

Question 18

@LuisMendo Ok thanks. Sorry for creating confusion, I should learn to read...

Question 19

@Mr.Xcoder No, it's the OP's fault. That information should be in the challenge text, not just in a comment.

Question 20

Haskell, 35 bytes

f s(h:t)|h`elem`s=h|1<2=f(h:s)t
f[]

Try it online! Crashes if no duplicate is found.

Question 21

Vyxal, 4 bytes

UÞ⊍h

Try it Online!

Takes the multi-set symmetric difference, which outputs the duplicate values in order that they occur. Outputs 0 if nothing is found.

Question 22

Jelly, 6 bytes

xŒQ¬$Ḣ

Try it online!

Returns the first duplicate, or 0 if there is no duplicate.

Explanation

xŒQ¬$Ḣ Input: array M
 $ Operate on M
 ŒQ Distinct sieve - Returns a boolean mask where an index is truthy
 for the first occurrence of an element
 ¬ Logical NOT
x Copy each value in M that many times
 Ḣ Head
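
The mask-and-filter idea translates fairly directly to Python. This is an illustrative sketch only; `first_duplicate_sieve` is a made-up name, and the 0 fallback simply copies what the Jelly answer returns.

```python
def first_duplicate_sieve(m):
    seen = set()
    first_mask = []                     # True where an element appears for the first time
    for x in m:
        first_mask.append(x not in seen)
        seen.add(x)
    # Keep each value once where the negated mask is true, zero times otherwise.
    repeats = [x for x, is_first in zip(m, first_mask) if not is_first]
    return repeats[0] if repeats else 0  # head, or 0 when there is no duplicate


print(first_duplicate_sieve([2, 3, 3, 1, 5, 2]))  # 3
```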

Question 23

It's golfier to use indexing like this: ŒQi0ị.

Question 24

@EriktheOutgolfer If there are no duplicates, i0 would return 0, where ị would index and return the last value of the input instead of 0.

Question 25

Japt, 7 bytes

æ@bX ¦Y

Test it online!

Explanation

 æ@ bX ¦ Y
UæXY{UbX !=Y} Ungolfed
 Implicit: U = input array
UæXY{ } Return the first item X (at index Y) in U where
 UbX the first index of X in U
 !=Y is not equal to Y.
 In other words, find the first item which has already occurred.
 Implicit: output result of last expression
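
In Python terms, the first version amounts to something like the sketch below. The name `first_duplicate_index` is hypothetical, and `None` stands in for Japt's `undefined`.

```python
def first_duplicate_index(u):
    # Return the first item X (at index Y) whose first index in U is not Y.
    return next((x for y, x in enumerate(u) if u.index(x) != y), None)


print(first_duplicate_index([2, 3, 3, 1, 5, 2]))  # 3
```

Because `list.index` scans from the start each time, this is quadratic in the worst case.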

Alternatively:

æ@¯Y øX

Test it online!

Explanation

 æ@ ¯ Y øX
UæXY{Us0Y øX} Ungolfed
 Implicit: U = input array
UæXY{ } Return the first item X (at index Y) in U where
 Us0Y the first Y items of U (literally U.slice(0, Y))
 øX contains X.
 In other words, find the first item which has already occurred.
 Implicit: output result of last expression

Question 26

Dyalog APL, ~~27~~ ~~24~~ ~~20~~ ~~19~~ ~~13~~ ~~12~~ 11 bytes

⊢⊃⍨0⍳⍨⊢=⍴↑∪

Now modified to not depend on v16! Try it online!

How? (With input N)

⊢⊃⍨... - N at this index:
- ⍴↑∪ - N with duplicates removed, right-padded with 0 to fit N
- ⊢= - Element-wise equality with N
- 0⍳⍨ - Index of the first 0.
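
A Python sketch of the same trick, for readers without APL. The deduplicated list agrees with N up to the first repeated element, so the first mismatch marks the answer. The name `first_duplicate_padded` is an assumption for illustration.

```python
def first_duplicate_padded(n):
    unique = list(dict.fromkeys(n))                  # N with duplicates removed
    padded = unique + [0] * (len(n) - len(unique))   # right-padded with 0 to fit N
    matches = [x == u for x, u in zip(n, padded)]    # element-wise equality with N
    return n[matches.index(False)]                   # ValueError if there is no duplicate


print(first_duplicate_padded([2, 3, 3, 1, 5, 2]))  # 3
```

The padding value 0 is safe because the input only contains numbers from 1 upward.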

Question 27

nevermind, I misread the question. not enough test cases though...

Question 28

Sorry for misleading you, I also misread the question.

Question 29

Looks like 36 bytes to me.

Question 30

Oh god, iota underbar isn't in ⎕AV, is it?

Question 31

@Zacharý Right, Classic translates it to ⎕U2378 when loading. Try it online!

Question 32

Brachylog, 5 bytes

a⊇=bh

Try it online!

Explanation

a⊇=bh Input is a list.
a There is an adfix (prefix or suffix) of the input
 ⊇ and a subsequence of that adfix
 = whose elements are all equal.
 b Drop its first element
 h and output the first element of the rest.

The adfix built-in a lists first all prefixes in increasing order of length, then suffixes in decreasing order of length. Thus the output is produced by the shortest prefix that allows it, if any. If a prefix has no duplicates, the rest of the program fails for it, since every subsequence of equal elements has length 1, and the first element of its tail doesn't exist. If a prefix has a repeated element, we can choose the length-2 subsequence containing both, and the program returns the latter.

Question 33

Another 5 bytes solution: a⊇Ċ=h, which only looks at length-2 subsets.

Question 34

Python 3, ~~94~~ 92 bytes

O(n) time and O(1) extra memory.

def f(a):
 r=-1
 for i in range(len(a)):t=abs(a[i])-1;r=[r,t+1][a[t]<0>r];a[t]*=-1
 return r

Try it online!

Source of the algorithm.

Explanation

The basic idea of the algorithm is to run through the elements from left to right, keep track of the numbers that have appeared, return a number as soon as it is reached a second time, and return -1 after traversing every element without finding one.

However, it uses a clever way to store the numbers that have appeared without using extra memory: it stores them as the sign of the element indexed by the number. For example, I can represent the fact that 2 and 3 have already appeared by making a[2] and a[3] negative, if the array is 1-indexed.
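
An ungolfed sketch of the sign-marking idea may make this easier to follow. It is not the golfed answer itself: it returns early instead of carrying a running result, and the name `first_duplicate_in_place` is hypothetical. Note that it mutates its argument.

```python
def first_duplicate_in_place(a):
    for i in range(len(a)):
        t = abs(a[i]) - 1      # 0-based slot recording whether value t + 1 was seen
        if a[t] < 0:
            return t + 1       # slot already negated, so this value appeared before
        a[t] = -a[t]           # mark value t + 1 as seen by flipping the sign
    return -1                  # no value was seen twice


print(first_duplicate_in_place([2, 3, 3, 1, 5, 2]))  # 3
print(first_duplicate_in_place([2, 4, 3, 5, 1]))     # -1
```

The `abs` call is needed because the element being read may itself have been negated by an earlier step.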

Question 35

What would this do for i where a[i] > n?

Question 36

@Downgoat read the question again.

Question 37

The question says 1 to a.length but for a[i]= a.length wouldn't this go out of bounds?

Question 38

@Downgoat t=abs(a[i])-1=a.length-1

Question 39

Note from feersum: "solution is cheating because it uses integers 1 bit larger than the input."

Question 40

Perl 6, 13 bytes

*.repeated[0]

Try it

Explanation

The * is in a Term position so the whole statement is a WhateverCode lambda.
The .repeated is a method that results in every value except for the first time each value was seen.
```
say [2, 3, 3, 3, 1, 5, 2, 3].repeated.perl; # (3, 3, 2, 3).Seq
#   (      3, 3,       2, 3).Seq
```
[0] just returns the first value in the Seq.
If there is no value Nil is returned.
(Nil is the base of the Failure types, and all types are their own undefined value, so Nil is different from an undefined value in most other languages.)

Note that because .repeated generates a Seq, it doesn't start doing any work until you ask for a value, and it only does enough work to generate what you ask for.
So it would be easy to argue this has at worst O(n) time complexity, and at best constant time (two elements examined) if the second value is a repeat of the first.
Something similar can probably be said of memory complexity.
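
For comparison, a lazy equivalent can be sketched as a Python generator. This only imitates the behaviour described above and says nothing about how Perl 6 implements `.repeated`; the function name is reused purely for illustration.

```python
def repeated(values):
    """Yield every value except the first time each value is seen."""
    seen = set()
    for v in values:
        if v in seen:
            yield v
        else:
            seen.add(v)


print(list(repeated([2, 3, 3, 3, 1, 5, 2, 3])))  # [3, 3, 2, 3]
# Asking only for the first value stops the scan at the first duplicate.
print(next(repeated([2, 3, 3, 1, 5, 2]), None))  # 3
```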

Question 41

J, 12 bytes

,&_1{~~:i.0:

Try it online!

Explanation

,&_1{~~:i.0: Input: array M
 ~: Nub-sieve
 0: The constant 0
 i. Find the index of the first occurrence of 0 (the first duplicate)
,&_1 Append -1 to M
 {~ Select the value from the previous at the index of the first duplicate
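
The nub-sieve approach, including the appended -1 fallback, might be sketched in Python as follows. The name `first_duplicate_nub_sieve` is hypothetical.

```python
def first_duplicate_nub_sieve(m):
    # Nub-sieve: True where an element is the first occurrence of its value.
    sieve = [m.index(x) == i for i, x in enumerate(m)]
    # Like J's i., fall back to the length when no False entry is found.
    idx = sieve.index(False) if False in sieve else len(m)
    return (list(m) + [-1])[idx]   # append -1, then select at that index


print(first_duplicate_nub_sieve([2, 3, 3, 1, 5, 2]))  # 3
print(first_duplicate_nub_sieve([2, 4, 3, 5, 1]))     # -1
```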

Question 42

APL (Dyalog), 20 bytes

⊃n/⍨(,≢∪)¨,\n←⎕,2⍴¯1

Try it online!

2⍴¯1 negative one reshaped into a length-two list

⎕, get input (mnemonic: console box) and prepend to that

n← store that in n

,\ prefixes of n (lit. cumulative concatenation)

(...)¨ apply the following tacit function to each prefix

, [is] the ravel (just ensures that the prefix is a list)

≢ different from

∪ the unique elements[?] (i.e. does the prefix have duplicates?)

n/⍨ use that to filter n (removes all elements until the first for which a duplicate was found)

⊃ pick the first element from that
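
The prefix-based steps can be approximated in Python like this. It is a sketch under the assumption that "differs from its unique elements" can be tested by comparing lengths; `first_duplicate_prefixes` is a made-up name.

```python
def first_duplicate_prefixes(values):
    n = list(values) + [-1, -1]   # two trailing -1s guarantee a duplicate exists
    # For each prefix of n, check whether it differs from its unique elements.
    has_dup = [len(set(n[:k])) != k for k in range(1, len(n) + 1)]
    hits = [x for x, flag in zip(n, has_dup) if flag]   # filter n by that mask
    return hits[0]                # first element of what remains


print(first_duplicate_prefixes([2, 3, 3, 1, 5, 2]))  # 3
print(first_duplicate_prefixes([2, 4, 3, 5, 1]))     # -1
```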

Question 43

Wow, you got beat three times. Still, +1. And can you add an explanation of how this works?

Question 44

@Zacharý Apparently I just needed to get the ball rolling. Here you go.

Question 45

@Zacharý Eventually, I managed to beat them all.

Question 46

APL (Dyalog), 11 bytes

As per the new rules, throws an error if no duplicates exist.

⊢⊃⍨⍬⍴⍳∘≢~⍳⍨

Try it online!

⍳⍨ the indices of the first occurrence of each element

~ removed from

⍳∘≢ of all the indices

⍬⍴ reshape that into a scalar (gives zero if no data is available)

⊃⍨ use that to pick from (gives error on zero)

⊢ the argument

Question 47

Well, yeah, when the rules are changed, of course you can beat them all!

Question 48

Well, I tied you.

Question 49

APL, 15

{⊃⍵[(⍳⍴⍵)~⍵⍳⍵]}

Seems like we can return 0 instead of -1 when there are no duplicates, (thanks Adám for the comment). So 3 bytes less.

A bit of description:

⍵⍳⍵ search the argument in itself: returns for each element the index of its first occurrence
(⍳⍴⍵)~⍵⍳⍵ create a list of all indexes, remove those found in ⍵⍳⍵; i.e. remove all first elements
⊃⍵[...] of all remaining elements, take the first. If the array is empty, APL returns zero
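
The three steps map onto Python roughly as below. This is only a sketch with 0-based indices; the name `first_duplicate_indices` is hypothetical and the 0 fallback copies the behaviour described for an empty result.

```python
def first_duplicate_indices(w):
    first_idx = {w.index(x) for x in w}   # index of the first occurrence of each element
    # All indices, minus those of first occurrences.
    rest = [i for i in range(len(w)) if i not in first_idx]
    return w[rest[0]] if rest else 0      # first remaining element, or 0 if none remain


print(first_duplicate_indices([2, 3, 3, 1, 5, 2]))  # 3
print(first_duplicate_indices([2, 4, 3, 5, 1]))     # 0
```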

For reference, old solution added -1 to the list at the end, so if the list ended up empty, it would contain -1 instead and the first element would be -1.

{⊃⍵[(⍳⍴⍵)~⍵⍳⍵],¯1}

Try it on tryapl.org

Question 50

You may return a zero instead of ¯1, so {⊃⍵[(⍳⍴⍵)~⍵⍳⍵]} should do.

Question 51

Retina, ~~26~~ 24 bytes

1!`\b(\d+)\b(?<=\b\1 .*)

Try it online! Explanation: \b(\d+)\b matches each number in turn, and then the lookbehind checks whether the number is a duplicate; if it is, the 1st match is output (!), rather than the count of matches. Unfortunately putting the lookbehind first doesn't seem to work, otherwise it would save several bytes. Edit: ~~Added 7 bytes to comply with the -1 return value on no match.~~ Saved 2 bytes thanks to @MartinEnder.

Question 52

For the record, the lookaround won't backtrack. This prevents this from working if you try to put it before. I've made this mistake many times, and Martin always corrects me.

Question 53

I got 30 bytes by using a lookahead instead of a lookbehind. Also, the rules now say you don't need to return -1.

Question 54

@ValueInk But the correct answer for that test case is 3...

Question 55

OH. I misread the challenge, whoops

Question 56

PHP, ~~56 44 38~~ 32 bytes

for(;!${$argv[++$x]}++;);echo$x;

Run like this:

php -nr 'for(;!${$argv[++$x]}++;);echo$x;' -- 2 3 3 1 5 2;echo
> 3

Explanation

for(
 ;
 !${ // Loop until current value as a variable is truthy
 $argv[++$x] // The item to check for is the next item from input
 }++; // Post increment, the var is now truthy
);
echo $x; // Echo the index of the duplicate.

Tweaks

Saved 12 bytes by using variables instead of an array
Saved 6 bytes by making use of the "undefined behavior" rule for when there is no match.
Saved 6 bytes by using post-increment instead of setting to 1 after each loop

Complexity

As can be seen from the commented version of the code, the time complexity is linear O(n). In terms of memory, a maximum of n+1 variables will be assigned. So that's O(n).

Question 57

Thanks for not using a weird encoding. But you should add the error_reporting option to the byte count (or use -n, which is free).

Question 58

We've been here before. PHP notices and warnings are ignorable. I might as well pipe them to /dev/null, which is the same.

Question 59

I tend to remember the wrong comments. :) Isn't this O(n)?

Question 60

Yes it's linear

Question 61

How is that O(1) for additional space? You're literally assigning a new variable per n, which is O(n)

Question 62

K (ngn/k), 11 bytes

*(~':?',\)#

Try it online!

*(~':?',\)#
 ( )# keep the elements where the left function returns true
 (when applied to the whole array):
 ,\ prefixes
 ?' uniquify each
 ~': is it the same as previous?
* first element

Question 63

R, 34 bytes

c((x=scan())[duplicated(x)],-1)[1]

Cut a few characters off the answer from @djhurio, don't have enough reputation to comment though.

Question 64

oh...I didn't see this answer; this is good for the prior spec when missing values required -1 but with the new spec, I managed to golf it down even more. This is still solid and it's a different approach from the way he did it, so I'll give you a +1!

Accepted answer: Python 2, 34 bytes, by musicman523 (answered 2017-07-30 21:26:50Z, edited Aug 1, 2017 at 3:59)

Comments on this answer were posted, in order, by xnor (Jul 31, 2017 at 0:08), Chris_Rands (Jul 31, 2017 at 12:38), LiefdeWen (Jul 31, 2017 at 13:55), Thoth19 (Jul 31, 2017 at 22:45), and Draconis (Aug 1, 2017 at 2:30).
