NESCA: New English Stroke Count Alphabet

Question 1

TLDR: Sort your input according to a new English alphabet somewhat based on Chinese stroke count methods.

Background: Looking up a term in a Chinese glossary or index works differently than in English, because Chinese has no alphabet; instead, entries are sorted by stroke count. (一畫 = 1 stroke, 二畫 = 2 strokes, 三畫 = 3 strokes, 四畫 = 4 strokes, and so on)

An English glossary, having an alphabet, is naturally sorted alphabetically. For this challenge, we flip that idea somewhat to follow the Chinese manner. And we'll follow some Chinese writing rules to help determine stroke order for the alphabet below.

Counting Strokes: Take 口 (kou) for example, a simple square. You'd think it is 4 strokes, but it is actually 3. The 1st being the left vertical line. The 2nd being the top horizontal and right vertical in one fluid stroke, forming the corner. And the 3rd being the lower horizontal line, completing the square. This pattern, among others, holds relatively true across Chinese characters. For sake of simplicity though, and for some added diversity in the English Stroke Count Alphabet, there is a somewhat subjective choice for stroke counts.

Defining the NESCA: First, I need to define the stroke count for each letter. For the sake of simplicity, and somewhat subjectively, I'll use the characters as they appear below. If you think a letter should have a different stroke count, please make your case, but again, in order to promote diversity in stroke counts, I made some personal judgment calls. For example, W could arguably be done in 2 strokes, where each stroke makes a v shape, but if that were the case for every letter, this new alphabet would essentially resemble the original. Hence my subjective choice of stroke counts. (For those that also read/speak Mandarin, 不好意思!)

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

3 3 1 2 4 3 2 3 3 1 3 2 4 3 1 2 2 3 1 2 1 2 4 2 3 3

a b c d e f g h i j k l m n o p q r s t u v w x y z

2 2 1 2 2 2 2 2 3 2 3 2 4 2 1 2 2 2 1 2 2 2 4 2 2 3

Letters with equal stroke counts retain their original alphabetical order. The only tie-breaker needed is between the upper- and lowercase forms of the same letter when both have the same stroke count: C and c, O and o, S and s, D and d, etc. So the New English Stroke Count Alphabet is as follows. (If I made an error, please say so; there are a lot of examples that I might have to adjust.)

The NESCA

C J O S U D G L P Q T V X A B F H I K N R Y Z E M W

c o s a b d e f g h j l n p q r t u v x y i k z m w

and more specifically...

CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw

1111111122222222222222222222222222333333333333344444
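The combined ordering above can be rebuilt mechanically from the two stroke-count tables. What follows is a minimal Python sketch, not part of the original challenge: the names `UPPER_STROKES`, `LOWER_STROKES` and `build_nesca` are made up for illustration, and it assumes lowercase m counts as 4 strokes, which is what the combined string and its stroke-count row imply.

```python
import string

# Stroke counts for A..Z and a..z, in ordinary alphabetical order.
# Assumption: lowercase m is taken as 4 strokes, matching the combined string.
UPPER_STROKES = "3 3 1 2 4 3 2 3 3 1 3 2 4 3 1 2 2 3 1 2 1 2 4 2 3 3"
LOWER_STROKES = "2 2 1 2 2 2 2 2 3 2 3 2 4 2 1 2 2 2 1 2 2 2 4 2 2 3"

strokes = dict(zip(string.ascii_uppercase, map(int, UPPER_STROKES.split())))
strokes.update(zip(string.ascii_lowercase, map(int, LOWER_STROKES.split())))


def build_nesca(stroke_counts):
    """Order letters by stroke count, then alphabetically, uppercase first on ties."""
    return "".join(
        sorted(
            stroke_counts,
            key=lambda ch: (stroke_counts[ch], ch.lower(), ch.islower()),
        )
    )


NESCA = build_nesca(strokes)
print(NESCA)
```

The sort key is a tuple, so the stroke count dominates, the case-folded letter keeps the original alphabetical order within a count, and `ch.islower()` puts the uppercase form first when both cases tie.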

Note 1: Tiebreakers - If upper and lowercase for the same letter have the same stroke count, uppercase letters take precedence.

"Cousin" precedes "cousin"
"father" precedes "Father" (because lowercase f is 2 strokes, while the uppercase is 3)
"Stop" precedes "soap" (while the o would precede t in stroke count, uppercase S precedes lowercase s)
"KO" precedes "kO" (K precedes k)
"kO" precedes "ko" (O precedes o)
"make" precedes "When" (both have 4 strokes, but m precedes W in the original alphabet)
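The pairs in Note 1 can be checked by comparing words letter by letter through their positions in the NESCA. This is only an illustrative sketch; `nesca_key` and `precedes` are hypothetical helper names, not something the challenge defines.

```python
NESCA = "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"


def nesca_key(word):
    """Turn a word into its list of NESCA positions (lists compare element-wise)."""
    return [NESCA.index(ch) for ch in word]


def precedes(first, second):
    """True if `first` sorts strictly before `second` under the NESCA."""
    return nesca_key(first) < nesca_key(second)


# The pairs listed in Note 1.
EXAMPLES = [
    ("Cousin", "cousin"),
    ("father", "Father"),
    ("Stop", "soap"),
    ("KO", "kO"),
    ("kO", "ko"),
    ("make", "When"),
]

for a, b in EXAMPLES:
    print(f"{a!r} precedes {b!r}: {precedes(a, b)}")
```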

Note 2: Input will never include any numbers, punctuation, or special characters, nor will it be empty.

Note 3: I left this challenge in the Sandbox for 2 weeks as a precaution. I'm worried a lot of people will argue against my subjective decisions in defining this alphabet (especially the letter g). I merely tried to allow for a very new and very different alphabet, and to add more diversity to Challenges.

The Challenge: Given a string input containing a sentence, a series of words, or a list of words, organize those words according to the NESCA. Output can be either a list of properly organized words or a single string of properly organized words, including duplicates should they exist.
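For reference, an ungolfed reading of the task might look like the sketch below. It is an assumption-laden illustration rather than an official reference implementation: the function name `nesca_sort` is invented here, and the choice to return the same kind of value that was passed in (string or list) is just one of the I/O options the challenge allows.

```python
NESCA = "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"
RANK = {ch: pos for pos, ch in enumerate(NESCA)}


def nesca_sort(words):
    """Sort words by the NESCA; accepts a space-separated string or a list of words."""
    if isinstance(words, str):
        # String in, string out: split on whitespace, sort, and re-join.
        return " ".join(nesca_sort(words.split()))
    # sorted() keeps every element, so duplicate words survive.
    return sorted(words, key=lambda word: [RANK[ch] for ch in word])


print(nesca_sort("A man a plan a canal panama"))
print(nesca_sort(["Carry", "on", "my", "wayward", "son"]))
```

Because a shorter list of positions compares as smaller than a longer list it is a prefix of, a word such as "no" lands before "not" without any special handling.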

EDIT: At the behest of users, I have changed my examples to use one consistent input/output format. My example formats can be found here, and the exact earlier examples can be found in the edit history.

Example Format 1 "INPUT HERE" / "OUTPUT HERE"
Example Format 2 [INPUT HERE] / [OUTPUT HERE]
Example Format 3 ["INPUT", "HERE"] / ["OUTPUT", "HERE"]
Any suitable format for your language, as per community standards.

Input / Output

"It was the best of times it was the worst of tImes" / "of of best the the tImes times It it worst was was"

"When life gives you lemons make Lemonade" / "gives Lemonade lemons life you make When"

"The journey of a thousand miles begins with one step" / "of one step a begins journey The thousand miles with"

"English Stroke Count Alphabet" / "Count Stroke Alphabet English"

"A man a plan a canal panama" / "canal a a panama plan A man"

"Carry on my wayward son" / "Carry on son my wayward"

"Close our store and begin destroying every flower green house just lose no people quietly rather than using vexing xrays yesterday it killed Zachs mini wombat" / Same as input (If you can write a better sentence than the one above, I'd much appreciate it. I'd gift reputation, but I don't know how)

"May the Force be with you" / "be the you Force May with"

"Im going to make him an offer hE cant refuse" / "cant offer an going him hE refuse to Im make"

"jello Jello JellO JEllo jellO JELlo JELlO jEllo JELLO JelLo JeLlo" / "JeLlo JelLo JellO Jello JELLO JELlO JELlo JEllo jellO jello jEllo" (Annoyed? Me too!)

"We suffer more often In imagination than IN reality" / "often suffer reality than In IN imagination more We"

"Code Golf and Coding Challenges" / "Code Coding Challenges and Golf"

"Do or DO not there is no try" / "or DO Do no not there try is"

"Failure the best teacher is" / "best teacher the Failure is"

"Can you tell that I am a Star Wars fan" / "Can Star a am fan tell that you I Wars"

"enough examples no more words" / "enough examples no more words"

Question 2

You have "Code Coding Challenges Golf and", but a precedes G in CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw?

Question 3

Could you add the inputs for all the test cases in a consistent format, please? It's making it very difficult to test my solution.

Question 4

You're currently missing a winning-condition tag? Based on the answers and challenge I assume this is a code-golf challenge?

Question 5

Awww... I read the title and expected something about Tiger Woods' 10-stroke hole 3 in the Masters.... he went from 3 under to 4 over and didn't even yell...

Question 6

You know, c cost 一畫, but C cost 壹畫.

Question 7

Japt, 53 bytes

I/O as an array of words. I wasn't able to run all the test cases 'cause trying to format the input for them all on my phone got to be infuriating.

n`CcJOoSsUabD¸fGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw

Try it (header splits input strings on spaces)

Question 8

Coding on your phone?!?! That's worth an upvote in and of itself!

Question 9

What is the purpose of the comma character between D and f?

Question 10

@Sumner18, it's the "de" compressed; the backtick encloses a compressed string.

Question 11

@Sumner18 Shaggy golfs on his phone after a couple of pints. Unclear if the latter improves his golfing or not.

Question 12

Jelly, 38 bytes

"EḂ 2JḶ]{5+cUBẋ÷ỌṫƇÆ7ɗ"CỵƊ¢Ṁċ’œ?ØẠ¤w)Þ

Try it online!

Verify all test cases

A monadic link that accepts and returns a list of words.

Explanation

"...’œ?ØẠ¤w)Þ Main monadic link
            Þ  Sort by
           )   Map [over each letter in a given word]
          w    Find index of subsequence in
         ¤     (
"...’          3928442642485912187600397757783525135099072511850472479412437675483
     œ?        rd permutation of
       ØẠ      the string "ABC...XYZabc...xyz"
         ¤     )
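Outside of Jelly, the "n-th permutation" idea can be sketched with the factorial number system. The Python below is a hedged illustration, assuming permutations are enumerated in lexicographic order and counted from 1; `perm_rank` and `perm_unrank` are hypothetical helper names, and the sketch makes no claim about how Jelly implements its built-ins.

```python
import string
from math import factorial

ALPHABET = string.ascii_uppercase + string.ascii_lowercase  # "ABC...XYZabc...xyz"
NESCA = "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"


def perm_rank(perm, base):
    """1-based lexicographic index of `perm` among the permutations of sorted `base`."""
    pool = list(base)
    rank = 0
    for ch in perm:
        pos = pool.index(ch)
        # Each smaller unused symbol accounts for (remaining - 1)! earlier permutations.
        rank += pos * factorial(len(pool) - 1)
        pool.pop(pos)
    return rank + 1


def perm_unrank(index, base):
    """Inverse of perm_rank: rebuild the permutation from its 1-based index."""
    pool = list(base)
    index -= 1  # switch to 0-based for the arithmetic
    out = []
    while pool:
        pos, index = divmod(index, factorial(len(pool) - 1))
        out.append(pool.pop(pos))
    return "".join(out)


nesca_index = perm_rank(NESCA, ALPHABET)
print(nesca_index)
print(perm_unrank(nesca_index, ALPHABET) == NESCA)
```

The printed index can be compared against the number quoted in the explanation above; the sketch itself only relies on the round trip between `perm_rank` and `perm_unrank`.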

Question 13

How did you find that permutation number? That's insane! Have an upvote!

Question 14

@Sumner18 Jelly has a built-in that does exactly that: Œ¿ is essentially the inverse of Œ?.

Question 15

@Arnauld Could that be translated to other languages? I see most other examples, like your JavaScript solution, are just hard coding the NESCA into the solution, which is fine, but I'm just imagining the possibilities.

Question 16

@Sumner18 It wouldn't make sense in practical languages. In order for this approach to save bytes, it needs the language to have compressed number literals, a built-in for the alphabet and a built-in for indexing permutations.

Question 17

@Sumner18 To give you an idea of how bad it would be, here is some JS code that only computes the correct permutation.

Question 18

Perl 5 `-a`, ~~104~~ 100 bytes

@NahuelFouilleul saved 4 bytes

sub t{pop=~y/CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw/A-z/r}say for sort{t($a)cmp t$b}@F

Try it online!

Question 19

Some ideas: pop instead of "@_" and A-z instead of A-Za-z

Question 20

@NahuelFouilleul Good call on using pop. I didn't even know that A-z was possible.

Question 21

A-z may be trickier than you think: it could also be A-t. Also, cmp depends on the locale (LC_COLLATE), and on TIO it works [tio.run/##K0gtyjH9/79S3znZyz8/… Try it online!]
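Why A-t would do as well as A-z can be seen by counting code points. The sketch below is in Python rather than Perl and assumes that surplus characters in the replacement range are simply left unused, which `zip` imitates by stopping at the shorter input; the variable names are illustrative only.

```python
NESCA = "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"

# Every code point from 'A' to 'z', including the six non-letters between 'Z' and 'a'.
span = "".join(chr(code) for code in range(ord("A"), ord("z") + 1))

# Only the first 52 replacement characters are paired with a NESCA letter.
mapping = dict(zip(NESCA, span))
used = span[: len(NESCA)]

print(len(span))   # size of the A-z range
print(used[-1])    # last replacement character actually used
print(used)
```

Since the range is strictly increasing in code point order, the non-letters in the middle do no harm: the translated words still compare in NESCA order.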

Question 22

Python 3, ~~107~~ ~~101~~ 99 bytes

Saved ~~6~~ 8 bytes (and got below 100!) thanks to ovs!!!

lambda s:s.sort(key=lambda k:[*map("CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw".find,k)])

Try it online!

Inputs a list of strings and sorts them accordingly.

Question 23

Your key function can be shortened to lambda k:[*map("CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw".find,k)].

Question 24

@ovs Very nice - thanks! :D

Question 25

You can save another 2 bytes by modifying the input list instead of returning a new one: def f(s):s.sort(key=...).

Question 26

@ovs Nice one - thanks! :-)

Question 27

R, ~~159~~ ~~134~~ 125 bytes

icuSetCollate(locale="ASCII");s=scan(,"");s[order(chartr("CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw","A-Za-z",s))]

Try it online!

Thanks to Dominic van Essen for -7 bytes.

Similar to others, translate the characters in the string using chartr into an appropriate order, then sort the strings using that order.

The default collation order in the R install on TIO, en_US.UTF8, is very odd: while, for instance, e comes before E, ekF comes after EgHTk (those being the translations of "and" and "begin" in the unchanged test case). So I switch to an ASCII locale, which compares by byte value instead.
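The translate-then-sort idea described above can be sketched in Python, where plain string comparison goes by code point and therefore needs no locale switch. This is an illustration of the technique, not a port of the R code; `TO_PLAIN` and `sort_by_translation` are names invented here.

```python
import string

NESCA = "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"

# Map the i-th NESCA letter to the i-th letter of "A..Za..z".
# Uppercase must come first because 'A'..'Z' have smaller code points than 'a'..'z'.
TO_PLAIN = str.maketrans(NESCA, string.ascii_uppercase + string.ascii_lowercase)


def sort_by_translation(words):
    """Sort words by their translated form; Python compares str by code point."""
    return sorted(words, key=lambda word: word.translate(TO_PLAIN))


print(sort_by_translation("May the Force be with you".split()))
```

Using the translated string only as a sort key avoids the second pass that translates the words back.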

Question 28

I follow you, but I really don't. I have no idea what you're saying in that second paragraph.

Question 29

@Sumner18 Try it online! -- the docs say something about string comparison here (just ctrl + F for "lexicographic"), so I just dug around until I found the commands to give me the right sort order.

Question 30

I wish I could say that I understand that too. I'm just a statistician with less than 3 years in the workforce. I'll figure it out someday!

Question 31

Seeing as we define languages by their interpreter here, why not just assume an interpreter running in an ASCII locale?

Question 32

152 bytes using intToUtf8...

Question 33

05AB1E, 38 bytes

ΣžnS•f[?θ$Ÿ)*:TMûò0Æì+Ω£μ\.—g"Ý»θä•.Isk

I/O as a list of lists of characters.

Port of @xigoi's Jelly answer, so make sure to upvote him/her as well!

Try it online or verify all test cases.

Explanation:

Σ # Sort the (implicit) input-list by:
 žn # Push the constant string "ABC...XYZabc...xyz"
 S # Convert it to a list of characters
 •f[?θ$Ÿ)*:TMûò0Æì+Ω£μ\.—g"Ý»θä•
 "# Push compressed integer 3928442642485912187600397757783525135099072511850472479412437675482
 .I # Get the 392...482nd permutation of the character-list
 s # Swap to get the current list of characters
 k # And get the index of each character in the permutation
 # (we sort on those lists of indices)
 # (after which the sorted list is output implicitly as result)

See this 05AB1E tip of mine (section How to compress large integers?) to understand why •f[?θ$Ÿ)*:TMûò0Æì+Ω£μ\.—g"Ý»θä• is 3928442642485912187600397757783525135099072511850472479412437675482. (Note that it's 1 lower than the number used in the Jelly answer, because 05AB1E uses 0-based indexing and Jelly uses 1-based indexing instead. This number is generated with the œ¿ Jelly builtin.)

Question 34

Slight correction: The Jelly program you linked completely ignores the second argument. Œ¿ is similar to œ¿, but it takes only one argument and uses its sorted version as the second argument. Here it just happens to work because the alphabet is sorted.

Question 35

@xigoi Ah, thanks a lot for mentioning that. That explains why I sometimes had trouble finding the permutation index in other challenges, when I was using that Jelly builtin. I always assumed I just had to sort the characters in the string prior to using the \$n^{th}\$ permutation when it happened, but apparently I was sometimes just using the wrong builtin in Jelly to calculate \$n\$.. Thanks for letting me know (and I'll edit my answer to reduce confusion if someone else reads it).

Question 36

Jelly actually has a naming convention for this: atoms starting with an uppercase letter are monadic and atoms starting with a lowercase letter are dyadic.

Question 37

JavaScript (ES6), 119 bytes

a=>a.sort((a,b)=>(g=s=>[...s].map(c=>"CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw".search(c)+10))(a)>g(b)||-1)

Try it online!

Question 38

Retina 0.8.2, 141 bytes

T`CcJ\O\oSsUabD\defGg\hj\L\lnP\pQqrTtuVvXxyABF\HIiKkNRYZz\EMmW\w`Ll
O`\w+
T`Ll`CcJ\O\oSsUabD\defGg\hj\L\lnP\pQqrTtuVvXxyABF\HIiKkNRYZz\EMmW\w

Try it online! Link includes test cases. Explanation: Simply replaces all letters with other letters that are in the desired sort order, then replaces them back after sorting the words into order. Note that Transliterate has several shorthand letters (such as L and l of course) which need to be quoted in the master list.

Question 39

K (ngn/k), 61 bytes

{x@<"CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"?x}

Try it online!

Takes the input as a list of words; returns a list of words.

{ } a function with parameter x
 "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"?x find the indices of each character of the input words in the NESCA string
 < grade up (the permutation that sorts those index lists in ascending order)
 x@ take the words at the graded indices
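For readers unfamiliar with grading, the same three steps can be spelled out in Python. This is a rough analogue under the assumption that grading returns the permutation of positions that would sort the keys in ascending order; `grade_up` is a hypothetical helper, not a K primitive binding.

```python
NESCA = "CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"


def grade_up(keys):
    """Return the positions that would sort `keys` ascending (stable)."""
    return sorted(range(len(keys)), key=keys.__getitem__)


words = ["Carry", "on", "my", "wayward", "son"]

# Step 1: look up every character of every word in the NESCA string.
keys = [[NESCA.index(ch) for ch in word] for word in words]

# Step 2: grade the index lists.
order = grade_up(keys)

# Step 3: index the original words with the graded positions.
result = [words[i] for i in order]

print(order)
print(result)
```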

Question 40

Very confusing, needs an explanation

Question 41

Red, 162 bytes

func[b][forall b[b/1: collect[foreach c b/1[keep index?
find/case"CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"c]keep
b/1]]sort b forall b[b/1: last b/1]]

Try it online!

Question 42

I will be joining with answers in Rebol very soon.

Question 43

@Razetime That's great, I'm looking forward to it! Is there any online "try it" suite for Rebol?

Question 44

Yeah, REBOL 2 and 3 are there. I found it through Hostilefork.

Question 45

@Razetime Thanks! BTW Arturo language is inspired (among others) by Rebol and has some functional tools that would be useful for golfing.

Question 46

Jelly, 28 bytes

"ẹ1ʋỴḂỤ$*Ɗż©zk’b4żØẠŒuÞFiⱮμÞ

A monadic Link accepting and yielding a list of words (each being a list of characters).

Try it online! Or see the test-suite.

How?

"...’b4żØẠŒuÞFiⱮμÞ - Link: words
 μÞ - sort (words) by this monadic chain, f(word):
"...’ - base 250 literal = 12827082404216683880457031718358
 b4 - in base 4 -> 2201321220213201120101312211011111212121011101113112
 ØẠ - alphabet = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
 ż - zip -> [[2,'A'],[2,'B'],[0,'C'],...,[2,'z']]
 Þ - sort by:
 Œu - to upper-case -> [[0,'C'],[0,'c'],[0,'J'],...,[3,'w']]
 F - flatten -> [0,'C',0,'c',0,'J',...,3,'w']
 Ɱ - for each character, c, in word:
 i - index (of c) in (that)

Question 47

Julia 1.0, ~~93~~ 92 bytes

l->sort(l,by=x->findlast.(i for i=x,"CcJOoSsUabDdefGghjLlnPpQqrTtuVvXxyABFHIiKkNRYZzEMmWw"))

Try it online!

works with Julia > 1.3

takes a list of words as input and returns a list of words

based on this answer by @Noodle9

edit: replace collect(x) with i for i=x (-1 byte)

Question 48

Charcoal, ~~72~~ 63 bytes

≔⭆4⭆⌕A")"∧·1]↗¿¤≕τB}VC↘"Iι§⭆α+ν↧νληUMθEι⌕ηλW−θυFNoθ⌊ι⊞υ⌊ιEυ⭆ι§ηλ

Try it online! Link is to verbose version of code. Partly inspired by @JonathanAllan's answer. Takes input as a list and outputs the sorted words on separate lines. Explanation:

≔⭆4⭆⌕A")"∧·1]↗¿¤≕τB}VC↘"Iι§⭆α+ν↧νλη

The compressed string ")"∧·1]↗¿¤≕τB}VC↘" expands to 2121001131211121220122113321001111210011011133112122 which represents the decremented stroke counts of each letter in the order AaBbC...Zz. The NESCA is then calculated by extracting the relevant letters in order of stroke count.

UMθEι⌕ηλ

Replace each string with an array of integer offsets into the NESCA.

W−θυFNoθ⌊ι⊞υ⌊ι

For each unique word in the list in ascending order, push each occurrence to the sorted list. (Minus filters out all matches, so we have to explicitly push the duplicates.)

Eυ⭆ι§ηλ

Restore each integer array to its original string and output each string on its own line.
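The four steps above can be mirrored in plain Python. This is a loose sketch of the same pipeline under stated assumptions, not a transcription of the Charcoal program: the digit string is the expansion quoted above, while `decode_nesca` and `sort_words` are names made up for this illustration.

```python
import string

# Decremented stroke counts in the order AaBbCc...Zz, as quoted in the explanation.
DIGITS = "2121001131211121220122113321001111210011011133112122"
INTERLEAVED = "".join(
    upper + lower
    for upper, lower in zip(string.ascii_uppercase, string.ascii_lowercase)
)


def decode_nesca(digits=DIGITS):
    """Collect the letters level by level: all 0s first, then 1s, 2s and 3s."""
    return "".join(
        ch
        for level in "0123"
        for ch, digit in zip(INTERLEAVED, digits)
        if digit == level
    )


def sort_words(words):
    nesca = decode_nesca()
    # Replace each word with its list of offsets into the NESCA.
    remaining = [[nesca.index(ch) for ch in word] for word in words]
    ordered = []
    while remaining:
        smallest = min(remaining)
        # Push every occurrence, since the filter below drops all matches at once.
        ordered.extend([smallest] * remaining.count(smallest))
        remaining = [key for key in remaining if key != smallest]
    # Restore each offset list to its original string.
    return ["".join(nesca[i] for i in key) for key in ordered]


print(decode_nesca())
print("\n".join(sort_words("It was the best of times it was the worst of tImes".split())))
```

Repeatedly taking the minimum is slower than a built-in sort, but it follows the explanation's wording about unique words and duplicates step by step.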

Question 49

05AB1E, 26 bytes

Σε•a¯æ·$ÎÐ+MAî+X•4Bžnø{Ssk

Try it online! Beats all other answers.

Σε•...•4Bžnø{Ssk # trimmed program
 # implicit input...
Σ # sorted by...
 sk # index of...
 # (implicit) current element in...
 ε # map over letters of...
 # (implicit) current element in sort...
 sk # in...
 # (implicit) flat...
 S # list of characters in...
 # (implicit) each element of...
 { # sorted list of...
 # (implicit) all elements of...
 •...• # 12827082404216683880457031718358...
 B # in base...
 4 # literal...
 ø # with each element paired with corresponding element from...
 žn # "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
 # implicit output

Accepted answer: Japt, 53 bytes, by Shaggy, answered 2020-11-16 21:30:54Z. Comments on it were posted by Sumner18 (2020-11-16 21:37 and 21:49), Shaggy (21:54), and Giuseppe (22:31).
