11
\$\begingroup\$

Remove all repeating words from an inputted sentence.

Input will be something like cat dog cat dog bird dog Snake snake Snake and the output should be cat dog bird Snake snake. There will always be a single space separating words.

Output order must be the same as input. (Refer to the example)

You don't need to handle punctuation but capital letter handling is required.

Alex A.
asked Oct 29, 2015 at 2:31
\$\endgroup\$
4
  • 13
    \$\begingroup\$ I recommend waiting to accept an answer for at least a few days. A shorter solution may still come. \$\endgroup\$ Commented Oct 29, 2015 at 2:54
  • 1
    \$\begingroup\$ I expect similar solutions to uniqchars, except that this doesn't ban built-ins that remove duplicates. \$\endgroup\$ Commented Oct 29, 2015 at 5:34
  • 3
    \$\begingroup\$ Seeing the example, there is no special capital letter handling: Snake and snake are simply treated as different words. \$\endgroup\$ Commented Oct 29, 2015 at 13:32
  • \$\begingroup\$ @AlexA.: In fact, there already is one. codegolf.stackexchange.com/questions/62044/… \$\endgroup\$ Commented Oct 29, 2015 at 19:28

32 Answers 32

9
\$\begingroup\$

CJam, 7 chars

qS/_&S*

Can probably be much shorter... but whatever, I've almost never used CJam. ^.^

q reads input, S/ splits on spaces, _& duplicates and applies a setwise AND (therefore getting rid of duplicates), and S* re-joins on space.
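
For readers unfamiliar with CJam, the same split / de-duplicate / join pipeline can be sketched in Python. This is only an illustration, not a golfed entry: the function name `dedupe_words` is made up here, and the explicit loop stands in for what `_&` does in one step.

```python
def dedupe_words(sentence):
    """Keep the first occurrence of each word, preserving input order."""
    seen = set()   # words already emitted
    kept = []      # first occurrences, in input order
    for word in sentence.split(" "):   # like S/ : split on single spaces
        if word not in seen:
            seen.add(word)
            kept.append(word)
    return " ".join(kept)              # like S* : re-join with spaces


print(dedupe_words("cat dog cat dog bird dog Snake snake Snake"))
# prints: cat dog bird Snake snake
```

The comparison is case-sensitive, so Snake and snake both survive, as the challenge example requires.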

Online interpreter link

answered Oct 29, 2015 at 2:34
\$\endgroup\$
2
  • 1
    \$\begingroup\$ How can you even get much shorter than 7? lol \$\endgroup\$ Commented Oct 29, 2015 at 14:50
  • \$\begingroup\$ Someone just did. \$\endgroup\$ Commented Oct 31, 2015 at 2:07
8
\$\begingroup\$

Haskell, 34 bytes

import Data.List
unwords.nub.words

Usage example: (unwords.nub.words) "cat dog cat dog bird dog Snake snake Snake" -> "cat dog bird Snake snake".

answered Oct 29, 2015 at 2:40
\$\endgroup\$
8
\$\begingroup\$

APL, 20 bytes (was 22)

{1↓∊∪(∊∘' '⊂⊢)' ',⍵}

This creates an unnamed monadic function that accepts a string on the right and returns a string.

Explanation:

 ' ',⍵} ⍝ Prepend a space to the input string
 (∊∘' '⊂⊢) ⍝ Split the string on spaces using a fork
 ∪ ⍝ Select the unique elements
{1↓∊ ⍝ Join into a string and drop the leading space

Try it online

Saved 2 bytes thanks to Dennis!

answered Oct 29, 2015 at 3:36
\$\endgroup\$
1
  • 3
    \$\begingroup\$ I love any answer that uses a non-esoteric, non-golf language. \$\endgroup\$ Commented Oct 29, 2015 at 16:04
7
\$\begingroup\$

Ruby, 21 chars

->s{s.split.uniq*' '}
answered Oct 29, 2015 at 2:46
\$\endgroup\$
7
\$\begingroup\$

TeaScript, 12 bytes

TeaScript is JavaScript for golfing.

xs` `u()j` `

This is pretty short. It splits on each space, filters out duplicates, then rejoins.

Try it online

answered Oct 29, 2015 at 3:11
\$\endgroup\$
4
  • \$\begingroup\$ Is it tee-a script or tee script? \$\endgroup\$ Commented Oct 29, 2015 at 8:14
  • \$\begingroup\$ @MathiasFoster it would be "tee-script" \$\endgroup\$ Commented Oct 29, 2015 at 14:13
  • \$\begingroup\$ Does TeaScript have letters reserved for variable names? Most of them appear to be shorthands for built-in properties. \$\endgroup\$ Commented Oct 30, 2015 at 19:24
  • \$\begingroup\$ @intrepidcoder yes all of these: cdfghijklmnopstuvw are reserved for variables, they are all pre-initialized to 0. b is also reserved for a variable name, it is pre-initialized to an empty string \$\endgroup\$ Commented Oct 30, 2015 at 19:28
7
\$\begingroup\$

JavaScript (ES6) 33

(see this answer)

Test running the snippet below in an EcmaScript 6 compliant browser (implementing Set, spread operator, template strings and arrow functions - I use Firefox).

Note: the conversion to Set drops all the duplicates, and Set maintains the original ordering.

f=s=>[...Set(s.split` `)].join` `
function test() { O.innerHTML=f(I.value) }
test()
#I { width: 70% }
<input id=I value="cat dog cat dog bird dog Snake snake Snake"/><button onclick="test()">-></button>
<pre id=O></pre>
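
The ordering guarantee mentioned in the note is what makes the one-liner work. As a rough analogue in Python (chosen for illustration only; it is not the answer's language), `dict` keys are guaranteed to keep insertion order since Python 3.7, so `dict.fromkeys` can play the role of the insertion-ordered `Set`. The name `unique_in_order` is hypothetical.

```python
def unique_in_order(sentence):
    # dict.fromkeys keeps one key per distinct word, in first-seen order
    # (insertion order is guaranteed for dict since Python 3.7).
    return " ".join(dict.fromkeys(sentence.split(" ")))


print(unique_in_order("cat dog cat dog bird dog Snake snake Snake"))
# prints: cat dog bird Snake snake
```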

answered Oct 29, 2015 at 13:40
\$\endgroup\$
5
  • \$\begingroup\$ Wow wow wow... I am continually amazed by your ability to cut any solution I think up by 25% or more. +1 \$\endgroup\$ Commented Oct 29, 2015 at 15:37
  • 1
    \$\begingroup\$ Looked at the problem and immediately thought of Sets... only to realize that you'd already done it =P very nice! \$\endgroup\$ Commented Oct 29, 2015 at 18:02
  • \$\begingroup\$ how can set maintain the original ordering? \$\endgroup\$ Commented Oct 31, 2015 at 1:33
  • \$\begingroup\$ @njzk2 ask the developers of the language. It could be: a set is internally an Array, and at each insertion there is a check to reject duplicates. It's an implementation detail anyway \$\endgroup\$ Commented Oct 31, 2015 at 1:37
  • \$\begingroup\$ @njzk2 while I don't know how, I know that this fact is specified by the language: Set objects are collections of values, you can iterate its elements in insertion order. A value in the Set may only occur once; it is unique in the Set's collection. (developer.mozilla.org/it/docs/Web/JavaScript/Reference/…) \$\endgroup\$ Commented Oct 31, 2015 at 1:40
5
\$\begingroup\$

R, 22 bytes

cat(unique(scan(,"")))

This reads a string from STDIN and splits it into a vector on spaces using scan(,""), selects only unique elements, then concatenates them into a string and prints it to STDOUT using cat.

answered Oct 29, 2015 at 3:03
\$\endgroup\$
5
\$\begingroup\$

PowerShell, 15 Bytes

$args|select -u

Whoa, an actual entry where PowerShell is somewhat competitive? That's unpossible!

Takes the string as input arguments, pipes to Select-Object with the -Unique flag. Spits out an array of strings, preserving order and capitalization as requested.

Usage:

PS C:\Tools\Scripts\golfing> .\remove-repeated-words-from-string.ps1 cat dog cat dog bird dog Snake snake Snake
cat
dog
bird
Snake
snake

If this is too "cheaty" in assuming the input can be given as command-line arguments, then go for the following, at 21 bytes (down from 24; saved some bytes thanks to blabb). Interestingly, using the unary operator in this direction happens to also work if the input string is demarcated with quotes or as individual arguments, since the default -split is by spaces. Bonus.

-split$args|select -u
answered Oct 29, 2015 at 13:21
\$\endgroup\$
4
  • \$\begingroup\$ Relying on the environment's behavior of spoon-feeding the code with readily split up input...? \$\endgroup\$ Commented Oct 29, 2015 at 13:41
  • \$\begingroup\$ @manatwork I've added a clarification if the first usage is considered too "cheaty" -- since it's not clear exactly how the input is specified, we'll leave it up to the OP. \$\endgroup\$ Commented Oct 29, 2015 at 14:12
  • \$\begingroup\$ And now it is clear how efficient PowerShell's own features are. That 24 really deserves an upvote. \$\endgroup\$ Commented Oct 29, 2015 at 14:30
  • \$\begingroup\$ @timmyD you can chop off 3 bytes to the uncheaty ?? version by using the unary split and no need for "" '' in the commandline args too :\>ls -l split.ps1 & type split.ps1 & echo.&powershell -nologo -f split.ps1 cat dog cat dog bird dog Snake snake Snake -rw-rw-rw- 1 Admin 0 21 2015-11-02 19:06 split.ps1 -split$args|select -u cat dog bird Snake snake \$\endgroup\$ Commented Nov 2, 2015 at 13:38
4
\$\begingroup\$

Julia, 29 bytes

s->join(unique(split(s))," ")

This creates an unnamed function that splits the string into a vector on spaces, keeps only the unique elements (preserving order), and joins the array back into a string with spaces.

answered Oct 29, 2015 at 2:41
\$\endgroup\$
4
\$\begingroup\$

Retina, 22 bytes

 (\w+)\b(?<=\b\1\b.+)

Save the file with a trailing linefeed and run it with the -s flag.

This is fairly straightforward: it matches a single word (together with its leading space), and the lookbehind checks whether that same word has appeared in the string before. The trailing linefeed causes Retina to work in Replace mode with an empty replacement string, removing all matches.
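
The same idea can be approximated in Python, with one caveat: Python's `re` module only accepts fixed-width lookbehinds, so the variable-length `(?<=...)` above cannot be copied as-is. The sketch below swaps the lookbehind for a callback that searches the text before each match; `drop_repeats` is a hypothetical name, and this is an emulation rather than a port of the Retina program.

```python
import re


def drop_repeats(sentence):
    def replace(match):
        word = match.group(1)
        earlier = sentence[:match.start()]
        # Emulates the lookbehind: did this exact word occur earlier?
        if re.search(r"\b" + re.escape(word) + r"\b", earlier):
            return ""             # repeated word: drop it with its leading space
        return match.group(0)     # first occurrence: keep it unchanged

    # Like the Retina pattern, every candidate is a space followed by a word,
    # so the very first word of the sentence is never a candidate.
    return re.sub(r" (\w+)\b", replace, sentence)


print(drop_repeats("cat dog cat dog bird dog Snake snake Snake"))
# prints: cat dog bird Snake snake
```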

answered Oct 29, 2015 at 10:00
\$\endgroup\$
4
\$\begingroup\$

Mathematica, 39 bytes (was 43)

StringRiffle@*Keys@*Counts@*StringSplit
answered Oct 29, 2015 at 10:43
\$\endgroup\$
4
  • \$\begingroup\$ Kudos for using StringRiffle[]. \$\endgroup\$ Commented Oct 29, 2015 at 17:17
  • \$\begingroup\$ could use Keys@Counts instead of DeleteDuplicates \$\endgroup\$ Commented Oct 30, 2015 at 15:29
  • \$\begingroup\$ @branislav Does Keys@Counts preserve order? \$\endgroup\$ Commented Oct 30, 2015 at 21:13
  • \$\begingroup\$ @LegionMammal978 Counts[list] gives an association whose keys are in the same order as they first occur as elements of list. \$\endgroup\$ Commented Nov 1, 2015 at 0:25
3
\$\begingroup\$

Pyth - 9 bytes

Well this is why we're all waiting for Pyth5, could have been 5 bytes.

jdoxzN{cz

Try it online here.

answered Oct 29, 2015 at 2:50
\$\endgroup\$
2
  • \$\begingroup\$ Why isn't Pyth5 valid? It appears to be implemented. \$\endgroup\$ Commented Oct 29, 2015 at 3:34
  • \$\begingroup\$ @ThomasKwa I don't think it's finished. There hasn't been a versioned release yet. \$\endgroup\$ Commented Oct 29, 2015 at 3:40
3
\$\begingroup\$

C++11, 291 bytes

#include<iostream>
#include<string>
#include<list>
#include<sstream>
#include<algorithm>
using namespace std;main(){string s;getline(cin,s);list<string>m;stringstream b(s);while(getline(b,s,' '))if(find(m.begin(),m.end(),s)==m.end())m.push_back(s);for(auto a:m)cout<<a<<' ';cout<<endl;}

I don't see a whole lot of C++ answers compared to golfing languages, so why not. Note that this uses C++11 features, so if your compiler is sufficiently old, you may need to pass a compilation switch to make it use the C++11 standard. For g++, it's -std=c++11 (only needed for versions before 6.1, which changed the default to C++14). Try it online

answered Oct 29, 2015 at 3:15
\$\endgroup\$
5
  • \$\begingroup\$ If you compare the number of bytes with other languages, you will see why no one is using C++. \$\endgroup\$ Commented Oct 29, 2015 at 4:59
  • 3
    \$\begingroup\$ @CroCo If you realize the point of this site is to find the shortest solution in each language, you will see why I posted this answer. \$\endgroup\$ Commented Oct 29, 2015 at 5:01
  • \$\begingroup\$ sorry I'm not aware of it. \$\endgroup\$ Commented Oct 29, 2015 at 5:39
  • 1
    \$\begingroup\$ Why not use a set? It allows no duplicates by design. Just push into it. \$\endgroup\$ Commented Oct 29, 2015 at 12:17
  • 1
    \$\begingroup\$ @black A set is not guaranteed to have the items in the same order they were added. \$\endgroup\$ Commented Oct 29, 2015 at 14:53
3
\$\begingroup\$

K5, 9 bytes

" "/?" "\

FYI, this is a function.

Explanation

 " "\ Split the input on spaces
 ? Find all the unique elements
" "/ Join them back together
answered Oct 29, 2015 at 17:48
\$\endgroup\$
2
\$\begingroup\$

Matlab: 18 Bytes

unique(d,'stable')

where d is d = {'cat','dog','cat','dog','bird','dog','Snake','snake','Snake'}.

The result is 'cat' 'dog' 'bird' 'Snake' 'snake'

answered Oct 29, 2015 at 4:56
\$\endgroup\$
1
  • 4
    \$\begingroup\$ Welcome to Programming Puzzles and Code Golf! Submissions here need to either be full programs that read from STDIN and write to STDOUT, or functions which accept input and return output. As it stands, this is merely a snippet; it assumes the variable d is already assigned. You can rectify this by using a function handle: @(d)unique(d,'stable'), at the cost of 4 bytes. \$\endgroup\$ Commented Oct 29, 2015 at 21:41
2
\$\begingroup\$

Python 3, 55

l=[]
for x in input().split():l+=[x][x in l:]
print(*l)

Yeesh, this is long. Unfortunately, Python's set doesn't keep the order of the elements, so we have to do the work ourselves. We iterate through the input words, appending each word to a list l only if it isn't already in l. Then, we print the contents of l space-separated.

A string version of l would not work if some words are substrings of other words.
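
An ungolfed sketch may make the `[x][x in l:]` trick and the substring caveat easier to follow. The function name `dedupe` and the sample strings are illustrative only.

```python
def dedupe(words):
    l = []
    for x in words:
        # `x in l` is a bool; used as a slice start, False acts as 0 and
        # True as 1, so [x][False:] == [x] while [x][True:] == [].
        l += [x][x in l:]
    return l


print(*dedupe("cat dog cat dog bird dog Snake snake Snake".split()))
# prints: cat dog bird Snake snake

# Why a string accumulator would fail: substring tests give false positives.
print("cat" in "concat dog")           # True, although "cat" is not a word here
print("cat" in "concat dog".split())   # False
```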

answered Oct 29, 2015 at 5:42
\$\endgroup\$
0
2
\$\begingroup\$

C#, 38 bytes

String.Join(" ",s.Split().Distinct());
answered Oct 29, 2015 at 12:49
\$\endgroup\$
2
  • 2
    \$\begingroup\$ I'm not sure you can assume input is already populated in s, I think you should get it as an argument. \$\endgroup\$ Commented Oct 29, 2015 at 13:04
  • 3
    \$\begingroup\$ Welcome to PPCG! Please have a look at our default answer formats. Answers should either be full programs or functions. Unnamed functions (like lambda literals) are fine, but snippets which expect the code to already exist in some variable/on the stack etc. or require a REPL environment are generally disallowed unless the OP explicitly permits them. \$\endgroup\$ Commented Oct 29, 2015 at 14:01
2
\$\begingroup\$

Perl 6, 14 bytes

As a whole program the only way you would write it is 21 bytes long

say $*IN.words.unique # 21 bytes

As a lambda expression the shortest is 14 bytes

*.words.unique # 14 bytes
say ( *.words.unique ).('cat dog cat dog bird dog Snake snake Snake')
my &foo = *.words.unique;
say foo $*IN;

While the output is a List, if you put it in a stringifying context it will put a space between the elements. If it was a requirement to return a string you could just add a ~ to the front ~*.words.unique.


If snippets were allowed, you could shorten it to 13 bytes by removing the *.

$_ = 'cat dog cat dog bird dog Snake snake Snake';
say .words.unique
answered Oct 29, 2015 at 14:37
\$\endgroup\$
1
\$\begingroup\$

gs2, 3 bytes

,É-

Encoded in CP437.

STDIN is pushed at the start of the program. , splits it over spaces. É is uniq, which filters duplicates. - joins by spaces.

answered Oct 30, 2015 at 16:19
\$\endgroup\$
1
\$\begingroup\$

Python 3, 80 bytes (was 87)

Turns out the full program version is shorter.

s=input().split(' ')
print(' '.join(e for i,e in enumerate(s)if e not in s[:i]))

Did it without regex, I am happy

Try it online

answered Oct 29, 2015 at 2:42
\$\endgroup\$
1
\$\begingroup\$

Lua, 94 bytes

function c(a)l={}return a:gsub("%S+",function(b)if l[b]then return""else l[b]=true end end)end
answered Oct 29, 2015 at 2:47
\$\endgroup\$
1
  • \$\begingroup\$ An anonymous user suggested to replace ... return""else l[b]=true end end... with ...return""end l[b]=""end.... \$\endgroup\$ Commented Aug 9, 2018 at 14:06
1
\$\begingroup\$

awk, 25

BEGIN{RS=ORS=" "}!c[$0]++

Output:

$ printf "cat dog cat dog bird dog Snake snake Snake" | awk 'BEGIN{RS=ORS=" "}!c[$0]++'
cat dog bird Snake snake $ 
$ 
answered Oct 29, 2015 at 6:01
\$\endgroup\$
1
\$\begingroup\$

JavaScript, 100 bytes (was 106, then 102)

function(s){o={};s.split(' ').map(function(w){o[w]=1});a=[];for(w in o)a.push(w);return a.join(' ')}

// way too long for JS :(

answered Oct 29, 2015 at 12:35
\$\endgroup\$
2
  • \$\begingroup\$ Try using JS (aka ECMAScript) 6 arrow functions, which should save 6 bytes. Also, I can already see porting this to CoffeeScript will save at least 30 bytes. \$\endgroup\$ Commented Oct 29, 2015 at 17:56
  • \$\begingroup\$ This answer is in native JavaScript (ECMA5), there is edc65's one for es6. \$\endgroup\$ Commented Oct 29, 2015 at 17:57
1
\$\begingroup\$

Hassium, 91 bytes

func main(){d=[]foreach(w in input().split(' '))if(!(d.contains(w))){d.add(w)print(w+" ")}}

Run online and see expanded here

answered Oct 29, 2015 at 14:58
\$\endgroup\$
1
\$\begingroup\$

PHP, 59 bytes (was 64)

function r($i){echo join(" ",array_unique(split(" ",$i)));}
answered Oct 29, 2015 at 13:43
\$\endgroup\$
2
  • \$\begingroup\$ explode() → split(), implode() → join()? \$\endgroup\$ Commented Oct 29, 2015 at 13:47
  • \$\begingroup\$ Thanks! Good suggestions. It seems split is being deprecated, but I guess that does not matter for code golfing. \$\endgroup\$ Commented Oct 29, 2015 at 15:37
1
\$\begingroup\$

AppleScript, 162 bytes

Interestingly, this is almost identical to the non-repeating characters thing.

set x to(display dialog""default answer"")'s text returned's words
set o to""
repeat with i in x
considering case
if not i is in o then set o to o&i&" "
end
end
o

I didn't actually know the considering keyword before this. the more you know...

answered Oct 29, 2015 at 15:43
\$\endgroup\$
1
\$\begingroup\$

Burlesque, 6 bytes

blsq ) "cat dog cat dog bird dog Snake snake Snake"wdNBwD
cat dog bird Snake snake

Rather simple: split words, nub (nub = remove duplicates), convert back to words.

answered Oct 29, 2015 at 15:43
\$\endgroup\$
1
\$\begingroup\$

Gema, 21 characters

*\S=${$0;$0}@set{$0;}

(Very similar to the unique character solution: there are no arrays in Gema, so allowing built-in unique functions does not help us much.)

Sample run:

bash-4.3$ gema '*\S=${$0;$0}@set{$0;}' <<< 'cat dog cat dog bird dog Snake snake Snake'
cat dog bird Snake snake 
answered Oct 29, 2015 at 15:43
\$\endgroup\$
1
\$\begingroup\$

Scala, 47 bytes (was 44)

(s:String)=>s.split(" ").distinct.mkString(" ")

EDIT: using toSet might not preserve order, so I'm now using distinct // that just cost me 3 bytes :(

answered Oct 29, 2015 at 12:12
\$\endgroup\$
0
\$\begingroup\$

PHP, 37 Bytes

Assuming $s is the input string.

print_r(array_flip(explode(' ',$s)));
answered Oct 30, 2015 at 12:44
\$\endgroup\$