Remove all repeating words from an inputted sentence.
Input will be something like "cat dog cat dog bird dog Snake snake Snake" and the output should be "cat dog bird Snake snake". There will always be a single space separating words.
Output order must be the same as input. (Refer to the example)
You don't need to handle punctuation, but capital letter handling is required: words that differ only in case (Snake and snake in the example) count as different words.
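For reference, here is an ungolfed sketch of what the challenge asks for. It is not taken from any of the answers below; the function name remove_repeated_words is made up for illustration, and it assumes the input is a single line with words separated by exactly one space.

```python
def remove_repeated_words(sentence):
    """Keep only the first occurrence of each word, preserving input order."""
    seen = set()   # words already emitted; comparison is case-sensitive
    kept = []      # first occurrences, in input order
    for word in sentence.split(" "):
        if word not in seen:
            seen.add(word)
            kept.append(word)
    return " ".join(kept)


print(remove_repeated_words("cat dog cat dog bird dog Snake snake Snake"))
# → cat dog bird Snake snake
```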
CJam, 7 chars
qS/_&S*
Can probably be much shorter... but whatever I've almost never used CJam. ^.^
q reads input, S/ splits on spaces, _& duplicates and applies a setwise AND (therefore getting rid of duplicates), and S* re-joins on space.
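The three steps can be mirrored in Python as a rough sketch. The helper ordered_intersection is a hypothetical stand-in for CJam's &, written on the assumption that & keeps elements in the order they first appear in the left operand, which is what the answer relies on.

```python
def ordered_intersection(left, right):
    """Elements of left that also occur in right, each kept once, in left's order."""
    result = []
    for item in left:
        if item in right and item not in result:
            result.append(item)
    return result


def cjam_pipeline(text):
    words = text.split(" ")                      # S/  split on spaces
    unique = ordered_intersection(words, words)  # _&  duplicate, then setwise AND
    return " ".join(unique)                      # S*  join on spaces


print(cjam_pipeline("cat dog cat dog bird dog Snake snake Snake"))
# → cat dog bird Snake snake
```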
Haskell, 34 bytes
import Data.List
unwords.nub.words
Usage example: (unwords.nub.words) "cat dog cat dog bird dog Snake snake Snake" -> "cat dog bird Snake snake".
APL, 20 bytes (previously 22)
{1↓∊∪(∊∘' '⊂⊢)' ',⍵}
This creates an unnamed monadic function that accepts a string on the right and returns a string.
Explanation:
' ',⍵} ⍝ Prepend a space to the input string
(∊∘' '⊂⊢) ⍝ Split the string on spaces using a fork
∪ ⍝ Select the unique elements
{1↓∊ ⍝ Join into a string and drop the leading space
Saved 2 bytes thanks to Dennis!
- I love any answer that uses a non-esoteric, non-golf language. – Darth Egregious, Oct 29, 2015 at 16:04
Ruby, 21 chars
->s{s.split.uniq*' '}
TeaScript, 12 bytes
TeaScript is JavaScript for golfing.
xs` `u()j` `
This is pretty short. It splits on each space, filters out duplicates, then rejoins.
- Is it "tee-a script" or "tee script"? – user31556, Oct 29, 2015 at 8:14
- @MathiasFoster it would be "tee-script" – Downgoat, Oct 29, 2015 at 14:13
- Does TeaScript have letters reserved for variable names? Most of them appear to be shorthands for built-in properties. – intrepidcoder, Oct 30, 2015 at 19:24
- @intrepidcoder yes, all of these: cdfghijklmnopstuvw are reserved for variables; they are all pre-initialized to 0. b is also reserved for a variable name; it is pre-initialized to an empty string. – Downgoat, Oct 30, 2015 at 19:28
JavaScript (ES6) 33
(see this answer)
Test by running the snippet below in an ECMAScript 6 compliant browser (implementing Set, the spread operator, template strings and arrow functions; I use Firefox).
Note: the conversion to Set drops all the duplicates, and Set maintains the original ordering.
f=s=>[...Set(s.split` `)].join` `
function test() { O.innerHTML=f(I.value) }
test()
#I { width: 70% }
<input id=I value="cat dog cat dog bird dog Snake snake Snake"/><button onclick="test()">-></button>
<pre id=O></pre>
- Wow wow wow... I am continually amazed by your ability to cut any solution I think up by 25% or more. +1 – ETHproductions, Oct 29, 2015 at 15:37
- Looked at the problem and immediately thought of Sets... only to realize that you'd already done it =P very nice! – Mwr247, Oct 29, 2015 at 18:02
- how can set maintain the original ordering? – njzk2, Oct 31, 2015 at 1:33
- @njzk2 ask the developers of the language. It could be: a set is internally an Array, and at each insertion there is a check to reject duplicates. It's an implementation detail anyway – edc65, Oct 31, 2015 at 1:37
- @njzk2 while I don't know how, I know that this fact is specified by the language: Set objects are collections of values, you can iterate its elements in insertion order. A value in the Set may only occur once; it is unique in the Set's collection. (developer.mozilla.org/it/docs/Web/JavaScript/Reference/…) – edc65, Oct 31, 2015 at 1:40
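For readers more at home in Python: the closest analogue to an insertion-ordered Set is probably a dict, whose keys are unique and iterate in insertion order (guaranteed from Python 3.7). A small sketch of the same split, dedupe, join idea, with the function names chosen here for illustration only:

```python
def unique_in_order(words):
    # dict.fromkeys keeps one key per distinct word; key order is insertion order
    return list(dict.fromkeys(words))


def f(s):
    # Python counterpart of: split on space, pass through a Set, join on space
    return " ".join(unique_in_order(s.split(" ")))


print(f("cat dog cat dog bird dog Snake snake Snake"))
# → cat dog bird Snake snake
```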
R, 22 bytes
cat(unique(scan(,"")))
This reads a string from STDIN and splits it into a vector on spaces using scan(,""), selects only unique elements, then concatenates them into a string and prints it to STDOUT using cat.
PowerShell, 15 Bytes
$args|select -u
Whoa, an actual entry where PowerShell is somewhat competitive? That's unpossible!
Takes the string as input arguments, pipes to Select-Object with the -Unique flag. Spits out an array of strings, preserving order and capitalization as requested.
Usage:
PS C:\Tools\Scripts\golfing> .\remove-repeated-words-from-string.ps1 cat dog cat dog bird dog Snake snake Snake
cat
dog
bird
Snake
snake
If this is too "cheaty" in assuming the input can be given as command-line arguments, then go for the following, at 21 bytes (previously 24; saved some bytes thanks to blabb). Interestingly, using the unary operator in this direction also works whether the input string is delimited with quotes or passed as individual arguments, since the default -split is by spaces. Bonus.
-split$args|select -u
- Relying on the environment's behavior of spoon-feeding the code with readily split up input...? – manatwork, Oct 29, 2015 at 13:41
- @manatwork I've added a clarification if the first usage is considered too "cheaty" -- since it's not clear exactly how the input is specified, we'll leave it up to the OP. – AdmBorkBork, Oct 29, 2015 at 14:12
- And now it is clear how efficient PowerShell's own features are. That 24 really deserves an upvote. – manatwork, Oct 29, 2015 at 14:30
- @timmyD you can chop off 3 bytes to the uncheaty ?? version by using the unary split and no need for "" '' in the commandline args too :\>ls -l split.ps1 & type split.ps1 & echo.&powershell -nologo -f split.ps1 cat dog cat dog bird dog Snake snake Snake -rw-rw-rw- 1 Admin 0 21 2015-11-02 19:06 split.ps1 -split$args|select -u cat dog bird Snake snake – blabb, Nov 2, 2015 at 13:38
Julia, 29 bytes
s->join(unique(split(s))," ")
This creates an unnamed function that splits the string into a vector on spaces, keeps only the unique elements (preserving order), and joins the array back into a string with spaces.
Retina, 22 bytes
(\w+)\b(?<=\b\1\b.+)
Save the file with a trailing linefeed and run it with the -s flag.
This is fairly straightforward in that it matches a single word, and the lookbehind checks whether that same word has appeared in the string before. The trailing linefeed causes Retina to work in Replace mode with an empty replacement string, removing all matches.
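The lookbehind here is variable-length, which the .NET regex engine behind Retina supports but Python's re module does not (it requires fixed-width lookbehinds). So the pattern cannot be ported as-is; the sketch below swaps in a different technique, a replacement callback that remembers the words seen so far. The names drop_repeats and keep_first are invented for this example.

```python
import re


def drop_repeats(text):
    seen = set()

    def keep_first(match):
        word = match.group(1)
        if word in seen:
            return ""            # repeated word: delete it with its leading space
        seen.add(word)
        return match.group(0)    # first occurrence: keep it unchanged

    # An optional leading space followed by a run of non-space characters
    return re.sub(r" ?(\S+)", keep_first, text)


print(drop_repeats("cat dog cat dog bird dog Snake snake Snake"))
# → cat dog bird Snake snake
```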
Mathematica, 39 bytes (previously 43)
StringRiffle@*Keys@*Counts@*StringSplit
- Kudos for using StringRiffle[]. – Michael Stern, Oct 29, 2015 at 17:17
- could use Keys@Counts instead of DeleteDuplicates – sanchez, Oct 30, 2015 at 15:29
- @branislav Does Keys@Counts preserve order? – LegionMammal978, Oct 30, 2015 at 21:13
- @LegionMammal978 Counts[list] gives an association whose keys are in the same order as they first occur as elements of list. – sanchez, Nov 1, 2015 at 0:25
Pyth - 9 bytes
Well this is why we're all waiting for Pyth5, could have been 5 bytes.
jdoxzN{cz
- Why isn't Pyth5 valid? It appears to be implemented. – lirtosiast, Oct 29, 2015 at 3:34
- @ThomasKwa I don't think it's finished. There hasn't been a versioned release yet. – Alex A., Oct 29, 2015 at 3:40
C++11, 291 bytes
#include<iostream>
#include<string>
#include<list>
#include<sstream>
#include<algorithm>
using namespace std;main(){string s;getline(cin,s);list<string>m;stringstream b(s);while(getline(b,s,' '))if(find(m.begin(),m.end(),s)==m.end())m.push_back(s);for(auto a:m)cout<<a<<' ';cout<<endl;}
I don't see a whole lot of C++ answers compared to golfing languages, so why not. Note that this uses C++11 features, so if your compiler is (struck out: stuck in the dark ages) sufficiently old, you may need to pass a special compilation switch to make it use the C++11 standard. For g++, it's -std=c++11 (needed for versions before 6, where the default became gnu++14). Try it online
- If you compare the number of bytes with other languages, you will see why no one is using C++. – CroCo, Oct 29, 2015 at 4:59
- @CroCo If you realize the point of this site is to find the shortest solution in each language, you will see why I posted this answer. – user45941, Oct 29, 2015 at 5:01
- sorry I'm not aware of it. – CroCo, Oct 29, 2015 at 5:39
- Why not use a set? It allows no duplicates by design. Just push into it. – edmz, Oct 29, 2015 at 12:17
- @black A set is not guaranteed to have the items in the same order they were added. – user45941, Oct 29, 2015 at 14:53
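To illustrate the last point: std::set keeps its elements sorted by operator<, so iterating over it yields the words in sorted order rather than input order. The Python sketch below imitates that with sorted(set(...)); for ASCII words Python's string comparison gives the same ordering, with capitals before lowercase letters.

```python
words = "cat dog cat dog bird dog Snake snake Snake".split(" ")

# What iterating a sorted set of the words would produce
sorted_unique = sorted(set(words))

print(" ".join(sorted_unique))
# → Snake bird cat dog snake
```

The duplicates are gone, but the order no longer matches the input, which is why the answer uses a list plus find instead.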
K5, 9 bytes
" "/?" "\
FYI, this is a function.
Explanation
" "\ Split the input on spaces
? Find all the unique elements
" "/ Join them back together
Matlab: 18 Bytes
unique(d,'stable')
where d is d = {'cat','dog','cat','dog','bird','dog','Snake','snake','Snake'}.
The result is 'cat' 'dog' 'bird' 'Snake' 'snake'
- Welcome to Programming Puzzles and Code Golf! Submissions here need to either be full programs that read from STDIN and write to STDOUT, or functions which accept input and return output. As it stands, this is merely a snippet; it assumes the variable d is already assigned. You can rectify this by using a function handle: @(d)unique(d,'stable'), at the cost of 4 bytes. – Alex A., Oct 29, 2015 at 21:41
Python 3, 55
l=[]
for x in input().split():l+=[x][x in l:]
print(*l)
Yeesh, this is long. Unfortunately, Python's set doesn't keep the order of the elements, so we have to do the work ourselves. We iterate through the input words, keeping a list l of elements that aren't yet in l. Then, we print the contents of l space-separated.
A string version of l would not work if some words are substrings of other words.
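An ungolfed sketch of the slicing trick, plus the substring pitfall mentioned above. Since x in l is a bool, [x][x in l:] is [x][0:] (that is, [x]) when x is new and [x][1:] (empty) when it is already present. The function names are chosen for this illustration only.

```python
def dedupe(words):
    kept = []
    for x in words:
        already_seen = x in kept      # False (0) or True (1)
        kept += [x][already_seen:]    # [x][0:] == [x], [x][1:] == []
    return kept


def dedupe_with_string(words):
    # Broken variant: membership in a string is a substring test
    kept = ""
    for x in words:
        if x not in kept:
            kept += x + " "
    return kept.split()


print(dedupe(["concat", "cat", "concat"]))
# → ['concat', 'cat']
print(dedupe_with_string(["concat", "cat", "concat"]))
# → ['concat']
```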
C#, 38 bytes
String.Join(" ",s.Split().Distinct());
- I'm not sure you can assume input is already populated in s, I think you should get it as an argument. – Jacob, Oct 29, 2015 at 13:04
- Welcome to PPCG! Please have a look at our default answer formats. Answers should either be full programs or functions. Unnamed functions (like lambda literals) are fine, but snippets which expect the code to already exist in some variable/on the stack etc. or require a REPL environment are generally disallowed unless the OP explicitly permits them. – Martin Ender, Oct 29, 2015 at 14:01
Perl 6, 14 bytes
As a whole program the only way you would write it is 21 bytes long
say $*IN.words.unique # 21 bytes
As a lambda expression the shortest is 14 bytes
*.words.unique # 14 bytes
say ( *.words.unique ).('cat dog cat dog bird dog Snake snake Snake')
my &foo = *.words.unique;
say foo $*IN;
While the output is a List, if you put it in a stringifying context it will put a space between the elements. If it were a requirement to return a string, you could just add a ~ to the front: ~*.words.unique.
If snippets were allowed, you could shorten it to 13 bytes by removing the *.
$_ = 'cat dog cat dog bird dog Snake snake Snake';
say .words.unique
Python 3, 80 bytes (previously 87)
turns out the full program version is shorter
s=input().split(' ')
print(' '.join(e for i,e in enumerate(s)if e not in s[:i]))
Did it without regex, I am happy
Lua, 94 bytes
function c(a)l={}return a:gsub("%S+",function(b)if l[b]then return""else l[b]=true end end)end
- An anonymous user suggested replacing ... return""else l[b]=true end end... with ...return""end l[b]=""end.... – Jonathan Frech, Aug 9, 2018 at 14:06
awk, 25
BEGIN{RS=ORS=" "}!c[$0]++
Output:
$ printf "cat dog cat dog bird dog Snake snake Snake" | awk 'BEGIN{RS=ORS=" "}!c[$0]++'
cat dog bird Snake snake $
$
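For anyone unfamiliar with the idiom: setting RS and ORS to a space makes each word its own record and prints a space after every output record, and the pattern !c[$0]++ is true only the first time a record is seen (the count is still 0 before the post-increment), which triggers awk's default print action. A rough Python rendering of that logic, with awk_style being a name invented here:

```python
from collections import defaultdict


def awk_style(text):
    counts = defaultdict(int)   # plays the role of the awk array c
    output = []
    for record in text.split(" "):         # RS=" " : one record per word
        first_time = counts[record] == 0   # !c[$0] evaluated before the increment
        counts[record] += 1                # the ++ part
        if first_time:
            output.append(record + " ")    # ORS=" " : space after each record
    return "".join(output)


print(repr(awk_style("cat dog cat dog bird dog Snake snake Snake")))
# → 'cat dog bird Snake snake '
```

The trailing space in the result corresponds to the space before the prompt in the output shown above.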
JavaScript, 100 bytes (previously 106, then 102)
function(s){o={};s.split(' ').map(function(w){o[w]=1});a=[];for(w in o)a.push(w);return a.join(' ')}
// way too long for JS :(
- Try using JS (aka ECMAScript) 6 arrow functions, which should save 6 bytes. Also, I can already see porting this to CoffeeScript will save at least 30 bytes. – kirbyfan64sos, Oct 29, 2015 at 17:56
- This answer is in native JavaScript (ECMA5), there is edc65's one for es6. – Jacob, Oct 29, 2015 at 17:57
Hassium, 91 bytes
func main(){d=[]foreach(w in input().split(' '))if(!(d.contains(w))){d.add(w)print(w+" ")}}
Run online and see expanded here
PHP, 59 bytes (previously 64)
function r($i){echo join(" ",array_unique(split(" ",$i)));}
- explode() → split(), implode() → join()? – manatwork, Oct 29, 2015 at 13:47
- Thanks! Good suggestions. Seems split is being deprecated though, but I guess that does not matter for code golfing. – Jeroen, Oct 29, 2015 at 15:37
AppleScript, 162 bytes
Interestingly, this is almost identical to the non-repeating characters thing.
set x to(display dialog""default answer"")'s text returned's words
set o to""
repeat with i in x
considering case
if not i is in o then set o to o&i&" "
end
end
o
I didn't actually know the considering keyword before this. the more you know...
Burlesque, 6 bytes
blsq ) "cat dog cat dog bird dog Snake snake Snake"wdNBwD
cat dog bird Snake snake
Rather simple: split words, nub (nub = remove duplicates), convert back to words.
Gema, 21 characters
*\S=${$0;$0}@set{$0;}
(Very similar to the unique character solution: as there are no arrays in Gema, allowing built-in unique functions does not help us much.)
Sample run:
bash-4.3$ gema '*\S=${$0;$0}@set{$0;}' <<< 'cat dog cat dog bird dog Snake snake Snake'
cat dog bird Snake snake
Scala, 47 bytes (previously 44)
(s:String)=>s.split(" ").distinct.mkString(" ")
EDIT: using toSet might not preserve order, so I'm now using distinct // that just cost me 3 bytes :(
PHP, 37 Bytes
Assuming $s is the input string.
print_r(array_flip(explode(' ',$s)));
- Snake and snake are treated simply as different