Write a program that finds the most occurring paired letter in a string

Question 1

The program must output the letter that is paired the most. For example, if your program was given the following string:

"Sally's friend Bobby searched for seashells."

it must output L because "ll" occurs twice, which is more frequent than the other pair "bb".

Rules:

If more than one letter has 1st place for occurrences, then output all of them in alphabetical order (e.g. "Sally's friends Jimmy and Bobby rummaged for seashells." should output both L AND M [or "LM" if you please] because they both occur more frequently than other pairs.)
Letters which are tripled, quadrupled, etc. count as one pair (e.g. "lll" in "willless" is counted as only one pair of L.)
Letter pairs must be in one word (e.g. "Sally's sociable friends Sammy and Bobby searched for fabulous seashells." should output L and not S because despite "ss" having more occurrences than "ll", they are separated by spaces.)
Count only letters from the English alphabet
Case does not matter (e.g. "Ss" is the same as "SS" or "ss", and all are counted as one pair of S.)

You may read your input from wherever you please. Shortest code wins.

Question 2

Can we assume that only letters will occur in pairs or could the input contain double spaces or double ' etc?

Question 3

Can we assume that at least one letter appears twice?

Question 4

@MartinBüttner Yes, you can assume at least one letter pair occurs. However, other characters may appear in pairs as well. Only count letters.

Question 5

Even if there is only one pair, can I still print it in a list like ['l']?

Question 6

@Maltysen Yes, you may do so.

Question 7

Bash + GNU coreutils, 133

grep -Eo '([A-Z])1円+'<<<"${1^^}"|cut -c1|sort|uniq -c|sort -rn|while read n l
do((a-n&&a-0))&&exit||echo $l&&a=$n
done|sort|tr -d \\n

Testcases:

$ for t in "Sally's friend Bobby searched for seashells." \
> "Sally's friends Jimmy and Bobby rummaged for seashells." \
> "willless" \
> "Sally's sociable friends Sammy and Bobby searched for fabulous seashells." \
> "11ss11aa"
> do
> ./mostpaired.sh "$t"
> echo
> done
L
LM
LS
L
AS
$

Question 8

Does it only count letters? (testcase: 11ss11aa -> SA)

Question 9

@edc65 There I fixed it ;-). Actually, 11ss11aa -> AS :)

Question 10

I think that your sort -r needs to be sort -rn if you have 10 or more paired letters.

Question 11

@TobySpeight. Yes. Fixed.

Question 12

can make it shorter with AWK instead of while: awk '!n{n=1ドル};n==1ドル'|grep -o .$

Question 13

Pyth, (削除) 26 (削除ここまで) (削除) 25 (削除ここまで) (削除) 24 (削除ここまで) (削除) 16 (削除ここまで) 15 bytes

.M/sfthTrrz08ZG

Try it online: Demonstration

Explanation:

.M/sfthTrrz08ZG implicit: z = input string
 rz0 convert z to lower-char
 r 8 run-length-encoding (into tuples [count, char])
 f filter for tuples T, which satisfy:
 thT T[0]-1 != 0 (count > 1)
 s join to a string
.M G find the elements Z of "abcd...z", which produce the highest value:
 /...........Z count of Z in ...

Question 14

eC -> s saves one byte.

Question 15

Do you know any good resources which I could use to learn Pyth?

Question 16

@BetaDecay You can find a tutorial about Pyth at pyth.readthedocs.org It doesn't cover all functionalities and tricks, but it is a good start. And if you have any questions, just ask in the chat.

Question 17

CJam, (削除) 29 (削除ここまで) 27 bytes

leue`{2a>},s_el-$e`$z~\)-,>

Thanks to @Optimizer for golfing off 2 bytes!

Try it online in the CJam interpreter.

How it works

leu e# Read a line from STDIN and covert to uppercase.
e` e# Perform run-length encoding.
 e# Example: "AABBBC!!" -> [[2 'A] [3 'B] [1 'C] [2 '!]]
{2a>}, e# Filter out all pairs that are less of equal to [2].
s e# Stringify.
 e# Example: [[2 'A] [3 'B] [2 '!]] -> "2A3B2!"
_el e# Push a copy of the string and convert to lowercase.
 e# Example: "2A3B2!" -> "2a3b2!"
- e# Remove all character from the second string from the first.
 e# Example: "2A3B2!" "2a3b2!" - -> "AB"
$e`$ e# Sort, perform run-length encoding and sort again.
 e# Example: "ABACABDBC" -> "AAABBBCCD"
 e# -> [[3 'A] [3 'B] [2 'C] [1 'D]]
 -> [[1 'D] [2 'C] [3 'A] [3 'B]]
z~ e# Zip and dump.
 e# Example: [[1 'D] [2 'C] [3 'A] [3 'B]] -> [1 2 3 3] ['D 'C 'A 'B]
\) e# Pop out the last element from the first array.
 e# Example: [1 2 3 3] -> [1 2 3] 3
- e# Remove all occurrences of the popped element from the array.
 e# Example: [1 2 3] 3 -> [1 2]
, e# Compute the length of the remainder.
> e# Skip that many elements from the character array.

Question 18

z~\)-,> should work as far as I can see.

Question 19

@Optimizer: Shorter and a lot more intuitive. Thanks!

Question 20

Pyth - (削除) 23 (削除ここまで) (削除) 22 (削除ここまで) (削除) 21 (削除ここまで) 20 bytes

Uses regexp substitution to replace all of two or greater of alphabet to a temp value, and uses .Maximal to get all the have the highest occurrence. Thanks to @Jakube for pointing out the redundancy of sorting and saving a byte.

.M/:rz0+Z"{2,}"KC0KG

Takes input from stdin and outputs like ['l', 'm'] to stdout.

.M G Values which yield maximal amount over lowercase alphabet
 / Count
 : Regexp substitution
 rz0 Lowercased input
 +Z String concatenate current loop var 
 "{2,}" Regexp 2 or more of previous group
 KCZ Inline assign null byte to K and use value
 K Count K which is null byte

Try it online here.

Question 21

C, 155

Something different, no regexps.

m=1,i=1,n[91];
main(c,a)char**a;
{
 char*t=a[1];
 for(;c=*t++;)(c&=95)>64&&c<91&&(c-(*t&95)?i=1:(c=(n[c]+=i),i=0,m=m<c?c:m));
 for(c=0;++c<91;)n[c]-m||putchar(c);
}

Question 22

Python 2, (削除) 132 (削除ここまで) 143 bytes

def f(x):import re;x=re.findall(r'(.)1円+',x.upper());s={l:x.count(l)for l in x};print "".join(sorted([l for l in s if s[l]==max(s.values())]))

Example run:

f("Sally's friends Jimmy and Bobby rummaged for seashells.")
LM

Question 23

Probably it doesn't fulfil "Letters which are tripled, quadrupled, etc. count as one pair"

Question 24

You're right! i've tried to fix it. thanks for pointing that out :)

Question 25

CJam, 37 bytes

leue`{~_el-\(e&},1f=$e`$_W=0=f-{,1=},

Try it online

Without regular expression support, I'm afraid it will be tough to compete with Pyth. This is the best I came up with on a first pass.

Explanation:

l Get input.
eu Convert it to upper case, since case does not matter.
e` Run length encoding, to split into groups of same characters.
{ Start of block for filtering.
 ~ Unpack the length/letter pair.
 _ Copy the letter.
 el Change copy to lower case.
 - Subtract to compare. If result is non-zero, this is a letter.
 \ Swap count to top.
 ( Decrement to get truthy value for count > 1.
 e& Logical and: It's a letter, and count is > 1.
}, End of filter.
1f= Don't need the counts anymore, filter out the letters only from the RLE pairs.
$ Sort them, so that multiples of the same letter are sequential.
e` RLE again, to count how many multiples of each letter we had.
$ And sort again, to get the count/letter pairs in order of incrementing count.
_ Copy list.
W=0= Pick out count of last element, which is the highest count.
f- Remove count from pairs that have the highest count. This leaves them
 as one member lists with letter only, while others still have count/letter.
{ Start block for filter.
 ,1= Check for list length one.
}, End filter.

Question 26

Q (66)

Relatively readable to boot:

{where g=max g:.Q.A#count each group y where not differ y:upper x}

Question 27

R, 105 bytes

cat(substr(names(b<-table(regmatches(s<-toupper(readline()),gregexpr("([A-Z])\1円+",s))))[b==max(b)],1,1))

This reads a line of text from STDIN and prints a space delimited list of the most common paired letters to STDOUT.

Ungolfed + explanation:

# Read a string from STDIN, convert to uppercase
s <- toupper(readline())
# Get each match of the regex /([A-Z])1円+/
matches <- regmatches(s, gregexpr("([A-Z])\1円+", s))
# Compute the frequency of each match
freq <- table(matches)
# Get the matches with the highest frequency
highest <- names(freq)[freq == max(freq)]
# Each element of highest is the literal pair, so take the first character
first <- substr(highest, 1, 1)
# Print to STDOUT
cat(first)

Examples:

> (code)
Sally's friends Jimmy and Bobby rummaged for seashells.
L M
> (code)
Sally's friend Bobby searched for seashells.
L
> (code)
Sally's sociable friends Sammy and Bobby searched for fabulous seashells.
L
> (code)
11ss11nn
N S

You can try it online!

Question 28

You can probably get rid of the toupper if you ignore case and use perl in your gregexpr. eg cat(substr(names(b<-table(regmatches(s<-readline(),gregexpr("(\\w)\1円+",s,T,T))))[b==max(b)],1,1))

Question 29

@MickyT: I had thought about that, but it looks like the OP wants the output to be uppercase, so it'd have to use toupper to ensure that anyway.

Question 30

That's ashame, I missed that when I read the question.

Question 31

Tried the fiddle, but it seems to run forever with no outpuy in firefox. Testcase : 11ss11nn?

Question 32

@edc65 It's a problem with R-Fiddle; nothing at all works. I contacted their admin to report the issue. Fixed my regex and now your test works as expected, but it cost me 2 bytes. Thanks for pointing this stuff out, I appreciate it!

Question 33

Ruby, 60

f=->s{(?a..?z).group_by{|l|s.scan(/#{l*2}+/i).size}.max[1]}
p f["Sally's friends Jimmy and Bobby rummaged for seashells."]

group_by creates a hash (dictionary) structure where the keys are the output of the block and the values are lists of letters that result in each key. In this case, the keys are counts of 2+ runs of a letter, case-insensitive. max compares each [key,value] tuple lexicographically, so it just finds the maximum key. Then [1] returns the value list part of the tuple.

Question 34

Python 2, (削除) 185 159 (削除ここまで) 153

i=input().lower()
d=sorted({l:len(i.split(l+l))for l in map(chr,range(97,123))}.items(),None,lambda x:x[1],1)
print sorted(c for c,k in d if k==d[0][1])

Takes input as a quoted string.

Question 35

C# 160 Bytes

Where s is the input:

char? d(string s){s=s.ToUpper();return s.Select((x,i)=>new{y=x,z=i==0?(char?)null:s[i-1]}).Where(x=>x.y==x.z).GroupBy(x=>x.z).OrderBy(x=>x.Count()).Last().Key;}

Question 36

rs, 146 bytes

[^A-Za-z]/
*(.)1円+/1円1円
*(.)(?!1円)/
$/#
+*(.)#(?!.*?1円)/#1円
+*(.)(.*)#(.*)1円/2円#3円1円1円
#/
*(.)(1円*)/1円(_)^^((^^1円2円))
([^_])(_+)(?!_)(?=.*2円_)/
_/

Try it! Please! It took me forever to make the buttons even with the output box on that page...

Well, this was fairly...crazy. The logic here is kind of weird; I'll only post an explanation if someone asks. (Of course, I also said that for an INTERCAL answer whose explanation was requested...that I never explained... ;)

Question 37

I like the interpreter, but you might want to put the debug checkbox on the same line as the buttons or something. It kinda looks weird all the way up their. Still cool! +1

Question 38

Tried it (errors...) i.sstatic.net/mTioT.png

Question 39

@Maltysen I'll consider that. Thanks!

Question 40

@edc65 Damn...what was the complete error message? Sounds like it could be a PyPy.js bug. Or just the fact I never tested this on Firefox...

Question 41

JavaScript (削除) 156 (削除ここまで) 153

var x=prompt(),f={},a=0,o
x.toUpperCase().replace(/([A-Z])1円+/g,function(m,s){m=f[s]=-~f[s]
if(a==m)o.push(s)
if(a<m)a=m,o=[s]})
alert(o.sort().join(""))

Question 42

f[s]?f[s]+1:1 -> -~f[s]

Question 43

It fails with non-letters: Count only letters from the English alphabet

Question 44

Thanks @edc65. I've added the shortcut and A-Z checking.

Question 45

Your exact code, streamlned and ES6:

f=x=>{x.toUpperCase(f={},a=0,o).replace(/([A-Z])1円+/g,(m,s)=>a<(m=f[s]=-~f[s])?(a=m,o=[s]):a>m?0:o.push(s));alert(o.sort().join'')}

(the last 2 '' are really backticks, &#96

Question 46

Bash + textutils (grep, sed), 111 chars

fold -1<<<$s|uniq -iD|sort -f|uniq -ic|sort -rn|grep -i [A-Z]|sed -n '1h;G;s/\(\s*\S\+\s\)\(.\)\n1円./2円/p'|sort

Bash + awk (instead of sed), 97 chars

fold -1<<<$s|uniq -iD|sort -f|uniq -ic|sort -rn|grep -i [A-Z]|awk '!n{n=1ドル};n==1ドル{print 2ドル}'|sort

to test it, first assign s

s="Sally's friends Jimmy ää and Bobby rummaged ää for seashells."

Question 47

R, 98 bytes

Very similar to Alex's solution, but uses a substitution rather than a match to determine consecutive letters. Scan is used to get the input and also to split the substitution result on spaces.

cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\1円?)[^A-z]*','\\U\2円 ',scan(,''),T,T))))[a==max(a)])

A couple of tests

> cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\1円?)[^A-z]*','\\U\2円 ',scan(,''),T,T))))[a==max(a)])
1: 11 was a race horse, 22 was one too. 11 won one race and 22 one won too.
19: 
Read 18 items
Read 2 items
O
> cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\1円?)[^A-z]*','\\U\2円 ',scan(,''),T,T))))[a==max(a)])
1: Sally's friends Jimmy and Bobby rummaged for seashells.
9: 
Read 8 items
Read 5 items
L M
> cat(names(a<-table(scan(,'',t=gsub('([A-z]?)(\1円?)[^A-z]*','\\U\2円 ',scan(,''),T,T))))[a==max(a)])
1: 11ss11nn
2: 
Read 1 item
Read 2 items
N S
>

Digital Trauma 73.7k10 gold badges116 silver badges268 bronze badges · Accepted Answer · 2015-07-01 23:28:45Z

7

\$\begingroup\$

Bash + GNU coreutils, 133

grep -Eo '([A-Z])1円+'<<<"${1^^}"|cut -c1|sort|uniq -c|sort -rn|while read n l
do((a-n&&a-0))&&exit||echo $l&&a=$n
done|sort|tr -d \\n

Testcases:

$ for t in "Sally's friend Bobby searched for seashells." \
> "Sally's friends Jimmy and Bobby rummaged for seashells." \
> "willless" \
> "Sally's sociable friends Sammy and Bobby searched for fabulous seashells." \
> "11ss11aa"
> do
> ./mostpaired.sh "$t"
> echo
> done
L
LM
LS
L
AS
$

Share

Improve this answer

edited Jul 2, 2015 at 15:16

answered Jul 1, 2015 at 23:28

Digital Trauma's user avatar

Digital Trauma

73.7k10 gold badges116 silver badges268 bronze badges

\$\endgroup\$

5

\$\begingroup\$ Does it only count letters? (testcase: 11ss11aa -> SA) \$\endgroup\$

edc65
– edc65

2015年07月02日 04:03:26 +00:00
Commented Jul 2, 2015 at 4:03
\$\begingroup\$ @edc65 There I fixed it ;-). Actually, 11ss11aa -> AS :) \$\endgroup\$

Digital Trauma
– Digital Trauma

2015年07月02日 04:30:23 +00:00
Commented Jul 2, 2015 at 4:30
\$\begingroup\$ I think that your sort -r needs to be sort -rn if you have 10 or more paired letters. \$\endgroup\$

Toby Speight
– Toby Speight

2015年07月02日 13:06:19 +00:00
Commented Jul 2, 2015 at 13:06
\$\begingroup\$ @TobySpeight. Yes. Fixed. \$\endgroup\$

Digital Trauma
– Digital Trauma

2015年07月02日 15:16:12 +00:00
Commented Jul 2, 2015 at 15:16
\$\begingroup\$ can make it shorter with AWK instead of while: awk '!n{n=1ドル};n==1ドル'|grep -o .$ \$\endgroup\$

Nik O'Lai
– Nik O'Lai

2015年07月03日 12:49:59 +00:00
Commented Jul 3, 2015 at 12:49

Add a comment |

Stack Exchange Network

Write a program that finds the most occurring paired letter in a string

16 Answers 16

Bash + GNU coreutils, 133

Testcases:

Pyth, (削除) 26 (削除ここまで) (削除) 25 (削除ここまで) (削除) 24 (削除ここまで) (削除) 16 (削除ここまで) 15 bytes

Explanation:

CJam, (削除) 29 (削除ここまで) 27 bytes

How it works

Pyth - (削除) 23 (削除ここまで) (削除) 22 (削除ここまで) (削除) 21 (削除ここまで) 20 bytes

C, 155

Python 2, (削除) 132 (削除ここまで) 143 bytes

CJam, 37 bytes

Q (66)

R, 105 bytes

Ruby, 60

Python 2, (削除) 185 159 (削除ここまで) 153

C# 160 Bytes

rs, 146 bytes

JavaScript (削除) 156 (削除ここまで) 153

R, 98 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Hot Network Questions

Write a program that finds the most occurring paired letter in a string

16 Answers 16

Bash + GNU coreutils, 133

Testcases:

Pyth, (削除) 26 (削除ここまで) (削除) 25 (削除ここまで) (削除) 24 (削除ここまで) (削除) 16 (削除ここまで) 15 bytes

Explanation:

CJam, (削除) 29 (削除ここまで) 27 bytes

How it works

Pyth - (削除) 23 (削除ここまで) (削除) 22 (削除ここまで) (削除) 21 (削除ここまで) 20 bytes

C, 155

Python 2, (削除) 132 (削除ここまで) 143 bytes

CJam, 37 bytes

Q (66)

R, 105 bytes

Ruby, 60

Python 2, (削除) 185 159 (削除ここまで) 153

C# 160 Bytes

rs, 146 bytes

JavaScript (削除) 156 (削除ここまで) 153

R, 98 bytes

Your Answer

Sign up or log in

Post as a guest

Post as a guest

Related

Hot Network Questions