Find characters common among all strings

Question 1

Find characters common among all strings

You are given multiple Strings as an input. Your task is to find the characters that are present in every string, and return them in any order. There may be duplicates.

If the same character is present in all strings more than once, you have to return multiple copies of it.

Example:

Input: "cattle", "throttle", "rotten", "torrent"
Output: 't', 'e', 't'
Input: "cattle", "throttle", "rotten", "tree"
Output: 't', 'e'
Input: "hauberk", "ill"
Output:
Input: "haunt", "haunting-haunt"
Output: 'h', 'u', 'a', 't', 'n'

Constraints

Input is only ASCII
1 <= input length <= 100
1 <= input[i] length <= 100

I/O format

Flexible w/ default I/O policies.

$$Let's\ see\ your\ shortest\ solutions!$$

Question 2

Suggested test case: "cattle", "throttle", "rotten", "tree" (which should give 't', 'e' if I understand correctly).

Question 3

I have a feeling this should be closed as a dupe for "set intersection", but theres multiple challenges and all are subtly different so idk

Question 4

@Seggan Set intersection would not have duplicates in the output.

Question 5

ascii only invalidates plenty of languages with custom codepages...

Question 6

@pacman256 I thought he meant that the input was ascii only?

Question 7

Jelly, 3 bytes

œ&/

Takes a single input argument with a list of strings.

Try it online!

Explanation

Reduce (/) using multiset intersection (œ&).

Question 8

Python, (削除) 65 (削除ここまで) 60 bytes

-5 thanks to Jitse (unnecessary generator-to-list inside join(): -2; replace set(X) with {*X}: -3)

lambda L:"".join(a*min(w.count(a)for w in L)for a in{*L[0]})

Attempt This Online!

lambda L:"".join( # Return a string by concatenating a sequence of
 a* # - letters
 min(w.count(a)for w in L) # - multiplied by the minimum common count,
 for a in # - taken from
 {*L[0]} # unique letters from the first string.
)

A previous, 7 bytes more expensive, solution had ...join(L and[...]) to account for the case of empty input, that was later clarified to be out of scope.

Discarded approaches: `reduce` ((削除) 112 (削除ここまで) 108 bytes) and recursive (100 bytes)

The most popular genre of the other answers, i.e. "reduce/fold multiset intersection", doesn't have a direct port to Python (no multisets). The closest would be using Counters but the imports are very verbose, and so is the .elements() method to make the results a primitive.

Explicit (112 bytes, Attempt This Online!):

lambda L:[*reduce(lambda a,b:(k:=Counter)(a)&k(b),L).elements()]
from functools import*
from collections import*

Further golfed by Stef (108 bytes, Attempt This Online!):

lambda L:[*reduce((k:=Counter).__and__,map(k,L)).elements()]
from functools import*
from collections import*

We save only a few bytes by replacing reduce with recursion (100 bytes, Attempt This Online!)

from collections import*
k=Counter
f=lambda L:[*(k(L[0])&k(f(L[1:]))).elements()]if L[1:]else[*L[0]]

Question 9

-4bytes on the reduce version: lambda L:[*reduce((k:=Counter).__and__,map(k,L)).elements()]

Question 10

R, 49 bytes

\(S)Reduce(\(x,y)y[pmatch(x,y,0)],strsplit(S,""))

Attempt This Online!

Or alternatively, this 66 bytes version:

R, 66 bytes

\(S,`+`=make.unique)Reduce(\(x,y)y[match(+x,+y,0)],strsplit(S,""))

Attempt This Online!

In the both cases the function pmatch/match compares two words (vectors of the characters) and outputs the positions of the matching characters, while subsetting produces the characters themselves. Reduce runs this process quasi-recursively and finally outputs only those characters which matched all words.

Question 11

K(ngn/k), (削除) 8 (削除ここまで) (削除) 20 (削除ここまで) 10 bytes

-10 bytes thanks to ovs.

&&/#''=:'

Try it here!

20 bytes verison

Question 12

Just {&&/#''='x} or &&/#''=:' as a tacit function works

Question 13

J, 21 bytes

a.#~[:<./+/@(=/&a.)&>

Try it online!

Takes input as list of boxed words.

For each word in the list:
- For each letter in the word:
  - Create a length 256 bit mask corresponding to the ascii alphabet, which is 1 if the letter equals that ascii character and 0 otherwise.
- Sum those bit masks. Now we have length 256 array, with each position giving us the letter count within the word.
Take the element-wise min across all those length 256 letter counts.
Use that new count mask to duplicate the letters of the ascii alphabet.

Question 14

NARS2000 (APL), 3 characters; 6 bytes

∩⍦/

/ reduce using

⍦ multi-set

∩ intersection

enter image description here

Question 15

Haskell, 75 bytes

import Data.List
f x=[y|y<-mapM(group.sort)x,[z]<-[nub$concat y]]>>=minimum

Try it online!

Explanation:

group . sort sorts the characters of a given string, and then splits the result into a list of substrings, each composed of multiple occurrences of the same character (e.g. (group . sort) "cattle" is equal to ["a", "c", "e", "l", "tt"]); the sort is necessary to ensure that all occurrences of a given character are contiguous before the group.
mapM g x is the cartesian product of the lists in map g x ^*; for example, let ys = mapM g ["cattle", "throttle", "rotten", "torrent"] where g = (group . sort):
- the first element of ys is ["a", "e", "e", "e"], where "a" is the first element of g "cattle", the first "e" is the first element of g "throttle", and so on;
- the last element of ys is ["tt", "ttt", "tt", "tt"], where the first "tt" is the last element of g "cattle", "ttt" is the last element of g "throttle", and so on.
Then, nub $ concat y concatenates all substrings of y, and nub keeps only the first occurrence of each character; for example:
- nub $ concat ["a", "e", "e", "e"] is "ae",
- nub $ concat ["tt", "ttt", "tt", "tt"] is "t".
With the [z] <- ... syntax we select only the ys such that nub $ concat y comprises a single element, since the pattern match fails if nub $ concat y has zero or more than one element; for example, the first element of ys is filtered out, while the last element of ys is kept; as a consequence, we select only those ys that contain only substrings that all contain the same character.
Finally, (>>= minimum) is equivalent to concatMap minimum; minimum applied to a list like ["tt", "ttt", "tt", "tt"] selects the shortest string, so it select the substring containing n occurrences of the same character so that every string of the initial input contains at least n occurrences of the considered character; the concat is applied so the function returns "ett" and not ["e", "tt"].

^*: the cartesian product is a very inefficient and unnecessary step for solving the problem, but it allows for a short(er) solution

Question 16

Can you add an explanation? There's some cool stuff I don't understand going on in this answer, and I want to be able to fully appreciate it.

Question 17

@DLosc done, hope it's clear enough. Thank you for your interest :)

Question 18

Retina 0.8.2, 40 bytes

%O`.
!`(.)(?<=^.*(1円+))(?=.*(¶.*2円.*)*$)

Try it online! Takes each string on its own line and outputs each common character in ASCII order on its own line. Explanation:

%O`.

Sort each string into ASCII order.

!`(.)

Output all characters...

(?<=^.*(1円+))

... from the first string, counting how many times this character has appeared so far...

(?=.*(¶.*2円.*)*$)

... that appear that many times in all of the other strings as well.

Question 19

Haskell + hgl, 5 bytes

lH nx

Fold by multiset intersection.

Attempt This Online!

Question 20

Vyxal 3 `l`, 3 bytes

/Þ)

Try it Online! (link is to literate version)

uses literate mode, ends up being a port of jelly anyways due to weird behavior in the set intersection

Question 21

1. you have an h flag. 2. This is not right, try adding a t to cattle and it will result in ttet.

Question 22

R, 81 bytes

\(l)rep(y<-unique(el(x<-strsplit(l,""))),Map(\(a)min(sapply(x,\(s)sum(s==a))),y))

Attempt This Online!

Question 23

Charcoal, 18 bytes

WS⊞υιΦθ⬤υ›NoλιNo...θκι

Try it online! Link is to verbose version of code. Takes input as a list of newline-terminated strings. Explanation:

WS⊞υι

Read the input into an array.

 θ First string
Φ Filtered where
 υ List of strings
 ⬤ All satisfy
 No Count of
 ι Current character
 λ In current string
 › Is greater than
 No Count of
 ι Current character
 θ In first string
 ... Truncated to length
 κ Outer index

Taking the input as a JSON array would save 3 bytes, as υ would be replaced with θ which itself would have to be replaced with ⌊θ (which isn't necessarily the first string but it will be consistent which is all that's required here).

Question 24

JavaScript (ES10), 62 bytes

Expects words as arrays of characters and returns in the same format.

a=>a.reduce((s,w)=>s.flatMap(c=>w.splice(w.indexOf(c)>>>0,1)))

Try it online!

Commented

a => // a[] = input array
a.reduce((s, w) => // for each word w[] in a[], with accumulator s[]:
 s.flatMap(c => // for each character c in s[]:
 w.splice( // remove from w[]:
 w.indexOf(c) // the first occurrence of c in w[]
 >>> 0, // if not found, we turn the index -1 into
 1 // the maximum unsigned 32-bit value so that
 // nothing is removed
 ) // end of splice() -> returns [c] or []
 ) // end of flatMap() -> removes all empty entries
) // end of reduce()

Question 25

vemf, 5 bytes

Takes a list of strings.

╧û╓¿\

Though vemf has no built-in multiset intersection, it doesn't do too badly, because û is made to implement multiset operations anyways (stolen from BQN, which doesn't happen to have an entry). The obvious solution would be ╧ Reduce │û\╓¿ Multiset Intersection, but it turns out leaving the Nones there until the end saves 1 byte. online

Question 26

Uiua ^SBCS, (削除) 17 (削除ここまで) 16 bytes

⍜-(⊚/↧⬚0∵◇°⊚)@0円

Try it!

-1 thanks to Tbw

Question 27

You can drop the left paren by bringing the constant null outside of the under.

Question 28

Perl 5 `-MList::Util=min,pairmap -plF`, 77 bytes

map$r[$.-1]{$_}++||$l{$_}++,@F}{pairmap{$\.=$b==$.&&$a x min map$$_{$a},@r}%l

Try it online!

Question 29

Uiua, 19 characters

▽≡/↧:⟜(∵◇/+⊞=)◴/◇⊂.

Try it here!

Question 30

Zsh & coreutils, (削除) 110 (削除ここまで) (削除) 93 (削除ここまで) 83 bytes

d()(<<<1ドル|fold -1|sort)
m=1ドル;while ((#>1))m=`comm -12 <(d $m) <(d 2ドル)`&&shift
<<<$m

Try it online!

d() is a function to turn a string like tree into a sorted column of data; e, e, r, t.
$m is a string of common letters built up by iterating.
while ((#>1)) iterates over the input arguments (enumerated by $#, and shifted each time)
comm takes 2 input files of sorted columnar data. The -12 option returns only data common to both files. The <( ) things redirect the commands d $m and d 2ドル to temp files for comm to read.

Question 31

TODO: try using the ${name:*arrayname} array intersection.

Question 32

Nibbles, 5 nibbles (2.5 bytes)

/$`&$

Attempt This Online!

/$`&$ # full program
/$`&$@ # with implicit arg shown
/ # fold over 
 $ # input:
 `& # list intersection of
 $@ # L & R elements

Question 33

TinyAPL, 4 bytes

∩⍦ᑒ/

∩⍦ multiset intersection / reduction ᑒ unboxing the vector of strings

Question 34

Pip, (削除) 16 (削除ここまで) 12 bytes

@SS_Mg@X*UQa

Takes the strings as command-line arguments. Attempt This Online!

Explanation

Pip doesn't have multiset builtins, so we have to do it by hand:

@SS_Mg@X*UQa
 a Take the first string (by convenience; could be any of them)
 UQ Uniquify
 X* Convert each character to a regex matching that character
 @ For each of those, find all matches in
 g Each command-line argument
 M Map this function to each list of lists of matches:
 SS_ Sort using string comparison (in this case, since the characters
 are all the same, this means shortest first)
@ Get the first (shortest) one
 Concatenate and autoprint (implicit)

For example, with inputs "cattle" "throttle" "rotten" "torrent":

 UQa "catle"
 X* [`c`;`a`;`t`;`l`;`e`]
 g@ [[["c"];[];[];[]];[["a"];[];[];[]];[["t";"t"];["t";"t";"t"];["t";"t"];["t";"t"]];[["l"];["l"];[];[]];[["e"];["e"];["e"];["e"]]]
 SS_M [[[];[];[];["c"]];[[];[];[];["a"]];[["t";"t"];["t";"t"];["t";"t"];["t";"t";"t"]];[[];[];["l"];["l"]];[["e"];["e"];["e"];["e"]]]
@ [[];[];["t";"t"];[];["e"]]

which outputs tte.

Question 35

Google Sheets, 279 bytes

=let(f,lambda(t,split(regexreplace(t,"\B|\b","→"),"→")),g,lambda(t,join(,sort(tocol(f(t))))),v,map(tocol(A:A,1),g),w,lambda(m,rows(v)=counta(ifna(filter(v,regexmatch(v,m))))),f(join(,unique(reduce(,f(A1),lambda(a,c,let(s,regexextract(g(A1),c&"+"),{a;ifs(w(s),s,w(c),c,1,)})))))))

Put the strings in cells A1:A and the formula in cell B1.

screenshot

Ungolfed:

=let( 
 f, lambda(t, split(regexreplace(t, "\B|\b", "→"), "→")), 
 g, lambda(t, join(, sort(tocol(f(t))))), 
 v, map(tocol(A:A, 1), g), 
 w, lambda(m, counta(v) = counta(ifna(filter(v, regexmatch(v, m))))), 
 r, reduce(, f(A1), lambda(a, c, let( 
 s, regexextract(g(A1), c & "+"), 
 vstack( 
 a, 
 ifs(w(s), s, w(c), c, 1, ) 
 ) 
 ))), 
 f(join(, unique(r))) 
)

The formula sorts characters within strings and finds repeated characters in A1 in the other strings, falling back to finding non-repeated characters in A1. When the number of matches is the same as the total number of strings, the repeated characters are included in the result (or ditto for a non-repeated character.) Finally, get uniques, possibly including repeated characters.

(Note: While this will give the correct result for all test cases presented in the question, it won't handle a triple like "aardvark" in A1 correctly when there are doubles like "lama" in other strings. Triples and above are fine in the rest of strings so "lama", "aardvark" does get the correct result. Note also that regex special characters are not escaped.)

Question 36

I read "handle duplicates" here as "handle arbitrary amount of repetition". That's the usual intended meaning.

Question 37

@Jonah yes... I agree that the usual intended meaning in computing obviously includes triples and above. The dictionary definition of "duplicate" mentions "twin, double, clone, match, mate, fellow, counterpart". Spreadsheet formula languages have no variables but constants only, and functions cannot have side-effects. Could not think of a concise way of handling the case where there are different triples or above in all strings. I will look at adding a slight improvement if and when there is a test case where the answer fails.

Question 38

05AB1E, 9 bytes

ε{æ}.»Ãéθ

(Output is sorted.)

Try it online or verify all test cases.

Explanation:

ε # Map over each string of the (implicit) input-list:
 { # Sort the characters in the string
 æ # Pop and push the powerset
 }.» # After the map: reduce the list of lists of strings by:
 Ã # Keep powerset-strings which are present in both lists
 é # Then sort the remaining powerset-strings by length
 θ # And pop and push the last/longest one
 # (which is output implicitly as result)

Question 39

Arturo, 68 bytes

$->L->join map unique L0円'a->repeat a min map L=>[enumerate&=>[&=a]]

Try it!

A port of Nicola Sap's Python answer. Takes input as a list of lists of characters as singleton strings.

Question 40

AWK,`-vRS="," -F ""` 122 bytes

1~NR{for(;i++<NF;)b[$i]++}NR>1{for(i in b)b[i]=(c=gsub(i,X))?(b[i]<c?b[i]:c):0}END{for(i in b)for(j=1;j++<=b[i];)printf i}

Try it online!

It's not working in TIO for whatever reason, but you can try this instead:

BEGIN{RS=",";FS=X}1~NR{for(;i++<NF;)b[$i]++}NR>1{for(i in b)b[i]=(c=gsub(i,X))?(b[i]<c?b[i]:c):0}END{for(i in b)for(j=1;j++<=b[i];)s=s i;print s}
BEGIN{RS=",";FS=X} # split up data (-vRS="," -F "")
1~NR{for(;i++<NF;)b[$i]++} # first record becomes dict
NR>1{for(i in b) # compare new strings to dict
b[i]=(c=gsub(i,X))?(b[i]<c?b[i]:c):0} # return matches
END{for(i in b)for(j=1;j++<=b[i];) # goes through remaining chars
print i}

Question 41

Dyalog APL, (削除) 25 (削除ここまで) 23 bytes

Well, not having a built-in multiset (that I know of (using TryAPL...)) here's a much longer solution. I couldn't quite mold ∩ to give me the multiset functionality I needed, but I'm not convinced it is not possible.

{x/⍨⌊⌿↑+/ ̈⍵∘.∊⍨ ̈⊂x←⎕AV}⁡‎⁪⁪⁠‎⁪⁡⁪⁠⁪⁢⁡⁢⁪‏⁠‎⁪⁡⁪⁠⁪⁢⁡⁣⁪‏⁠‎⁪⁡⁪⁠⁪⁢⁡⁤⁪‏⁠‎⁪⁡⁪⁠⁪⁢⁢⁡⁪‏⁠‎⁪⁡⁪⁠⁪⁢⁢⁢⁪‏‏⁡⁠⁡‌⁢‎‎⁪⁡⁪⁠⁪⁣⁤⁪‏⁠‎⁪⁡⁪⁠⁪⁤⁡⁪‏⁠‎⁪⁡⁪⁠⁪⁤⁢⁪‏⁠‎⁪⁡⁪⁠⁪⁤⁣⁪‏⁠⁪⁪‏⁡⁠⁡‌⁣‎‎⁪⁡⁪⁠⁪⁣⁣⁪‏⁠‎⁪⁡⁪⁠⁪⁤⁤⁪‏‏⁡⁠⁡‌⁤‎‎⁪⁡⁪⁠⁪⁢⁡⁡⁪‏‏⁡⁠⁡‌⁢⁡‎‎⁪⁡⁪⁠⁪⁢⁤⁪‏⁠‎⁪⁡⁪⁠⁪⁣⁡⁪‏⁠‎⁪⁡⁪⁠⁪⁣⁢⁪‏‏⁡⁠⁡‌⁢⁢‎‎⁪⁡⁪⁠⁪⁢⁡⁪‏⁠‎⁪⁡⁪⁠⁪⁢⁢⁪‏⁠‎⁪⁡⁪⁠⁪⁢⁣⁪‏‏⁡⁠⁡‌⁢⁣‎‎⁪⁡⁪⁠⁪⁢⁪‏⁠‎⁪⁡⁪⁠⁪⁣⁪‏⁠‎⁪⁡⁪⁠⁪⁤⁪‏‏⁡⁠⁡‌
 x←⎕AV # ‎⁡Using AV because it conveniently has upper- and lower-case characters, store that in x
 ∘.∊⍨ # ‎⁢Find the membership outer product between
 ⍵ ̈ # ‎⁣each input word
 ⊂ # ‎⁤and the whole (extraneous) alphabet
 +/ ̈ # ‎⁢⁡Sum over to get counts by letter
 ⌊⌿↑ # ‎⁢⁢Mix so I can find the minimum count by column. This is the per-letter multiplicity. 0 would mean its not present in every word
 x/⍨ # ‎⁢⁣Replicate over that alphabet to pull as many letters as the multiset multiplicity
💎

Created with the help of Luminespire.

Here, then, is my multi-purpose multiset intersection:

{x/⍨⌊⌿↑+/ ̈⍵∘.∊⍨ ̈⊂x←∪∊⍵}

Question 42

Ruby, 49 bytes

Input is a list of character arrays. Output is a string.

->a{a[0].uniq.map{|c|c*a.map{_1.count c}.min}*''}

Attempt This Online!

Luis Mendo 107k10 gold badges139 silver badges382 bronze badges · Accepted Answer · 2024-06-08 22:05:01Z

8

\$\begingroup\$

Jelly, 3 bytes

œ&/

Takes a single input argument with a list of strings.

Try it online!

Explanation

Reduce (/) using multiset intersection (œ&).

Share

Improve this answer

edited Jun 8, 2024 at 22:41

answered Jun 8, 2024 at 22:05

Luis Mendo's user avatar

Luis Mendo

107k10 gold badges139 silver badges382 bronze badges

\$\endgroup\$

Add a comment |

Find characters common among all strings

Find characters common among all strings

Example:

Constraints

I/O format

$$Let's\ see\ your\ shortest\ solutions!$$

27 Answers 27

Jelly, 3 bytes

Explanation

Python, (削除) 65 (削除ここまで) 60 bytes

Discarded approaches: reduce ((削除) 112 (削除ここまで) 108 bytes) and recursive (100 bytes)

R, 49 bytes

R, 66 bytes

K(ngn/k), (削除) 8 (削除ここまで) (削除) 20 (削除ここまで) 10 bytes

J, 21 bytes

NARS2000 (APL), 3 characters; 6 bytes

Haskell, 75 bytes

Retina 0.8.2, 40 bytes

Haskell + hgl, 5 bytes

Vyxal 3 l, 3 bytes

R, 81 bytes

Charcoal, 18 bytes

JavaScript (ES10), 62 bytes

Commented

vemf, 5 bytes

Uiua SBCS , (削除) 17 (削除ここまで) 16 bytes

Perl 5 -MList::Util=min,pairmap -plF, 77 bytes

Uiua, 19 characters

Zsh & coreutils, (削除) 110 (削除ここまで) (削除) 93 (削除ここまで) 83 bytes

Nibbles, 5 nibbles (2.5 bytes)

TinyAPL, 4 bytes

Pip, (削除) 16 (削除ここまで) 12 bytes

Explanation

Google Sheets, 279 bytes

05AB1E, 9 bytes

Arturo, 68 bytes

AWK,-vRS="," -F "" 122 bytes

Dyalog APL, (削除) 25 (削除ここまで) 23 bytes

Ruby, 49 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions

Discarded approaches: `reduce` ((削除) 112 (削除ここまで) 108 bytes) and recursive (100 bytes)

Vyxal 3 `l`, 3 bytes

Uiua ^SBCS, (削除) 17 (削除ここまで) 16 bytes

Perl 5 `-MList::Util=min,pairmap -plF`, 77 bytes

AWK,`-vRS="," -F ""` 122 bytes