It's Nowruz and you want to set up a Haft-Sin table out of words. This means finding seven words that start with the letter s.
The Challenge
Write a program that takes a list of words separated by spaces and outputs at most the first 7 words that start with the letter s. If the s is immediately followed by the letter h, the word doesn't count, because it would not be pronounced /s/.
Input
An arbitrary-length string containing words separated by spaces.
Words must not contain anything other than letters (uppercase or lowercase), digits, and _.
These inputs are valid:
hello Puzzle code_golf 12
Start say_hello separating_by_space_is_right
I am a valid word list
And these inputs are invalid:
code-golf, #invalid_word, separating_by_comma_is_wrong
I'm an invalid word list
Output
The first 7 words that start with the letter S or s and are not followed by the letter H or h, in any acceptable format (comma-separated, space-separated, newline-separated, etc.) and in any order.
- If two words are duplicates, don't count them twice. Every word is counted once.
- If the input contains fewer than 7 words starting with s, output nothing. Don't output the words.
- The output must contain the exact word as it appears in the input. So if the input contains SuPER, the output should be SuPER and not SUPER or super or any other combination of lower and upper case.
- Pronunciation matters. The words Speed and SPEED both count as the same word. You may want to lowercase all the input, deduplicate the words, and then check them.
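The rules above can be sketched as an ungolfed reference. This is only one reading of the spec, not the asker's own code: the function name `haft_sin` is made up for illustration, and it assumes that duplicates are compared case-insensitively and that the first spelling of a word is the one to keep.

```python
def haft_sin(text: str) -> list[str]:
    """Return the first 7 distinct s-words, or [] if there are fewer than 7."""
    seen = set()   # lowercased words already accepted (assumed dedup key)
    picked = []    # accepted words, in their original casing
    for word in text.split():
        low = word.lower()
        # Must start with "s", must not start with "sh", must not be a repeat.
        if low.startswith("s") and not low.startswith("sh") and low not in seen:
            seen.add(low)
            picked.append(word)
    # All or nothing: fewer than seven valid words means no output at all.
    return picked[:7] if len(picked) >= 7 else []


print(" ".join(haft_sin(
    "speed SpEEd new book seven sad sum power fun super sister silver silly start"
)))
```

Keeping a separate lowercased set is what lets the original casing survive in the output.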
Test cases
input:
speed speed new car book seven sad sum power fun super sister silver silly start
output:
speed seven sad sum super sister silver
input:
speed SpEEd new book seven sad sum power fun super sister silver silly start
output:
speed seven sad sum super sister silver
input:
sheep speed new car book seven sad sum power fun super sister silver silly start
output:
speed seven sad sum super sister silver
input:
first second third
output:
Edited
This was my first question and I missed many special cases. I have tried to clarify them.
14 Answers
Vyxal, 28 bytes
⌈'⇩ḣh\h≠$h\s=∧;:ɽ:vḟUİ56>7*Ẏ
Try it Online! or Try some testcases! Mis-deduplication should be fixed now.
⌈'⇩ḣh\h≠$h\s=∧;:ɽ:vḟUİ56>7*Ẏ
⌈' ; Filter input, split on spaces, by:
⇩ḣ Push (input lowercased)([0], [1:])
h\h≠ Second character (input[1:][0]) isn't `h`
$h\s=∧ and first character is `s`
:ɽ Duplicate, lowercase each
:vḟ Find first occurrences of each of ^ in ^
U Uniquify ^
İ Index ^ into filtered list, resulting
in the properly deduplicated list
56> 1 if len(^) is 7 or more, otherwise 0
7*Ẏ Multiply by 7 and slice [0:that]
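The last few steps of the explanation above (deduplicate through first-occurrence indices, then slice by 7 times a 0/1 flag) might look like this in Python. It is a rough analogue rather than a translation of the Vyxal program, and the helper names `dedupe_keep_first` and `take_seven_or_nothing` are invented for this sketch.

```python
def dedupe_keep_first(words: list[str]) -> list[str]:
    """Case-insensitive dedup that keeps each word's first original spelling."""
    lowered = [w.lower() for w in words]
    # For every word, the index at which its lowercase form first occurs.
    first_idx = [lowered.index(low) for low in lowered]
    # Order-preserving uniquify of those indices.
    unique_idx = list(dict.fromkeys(first_idx))
    # Index back into the list that still has the original casing.
    return [words[i] for i in unique_idx]


def take_seven_or_nothing(words: list[str]) -> list[str]:
    """Slice [0:7] when there are at least 7 words, otherwise [0:0]."""
    n = 7 * (len(words) >= 7)  # bool times 7 gives 0 or 7
    return words[:n]


s_words = "SPEED sPeEd SPEED seven sad sum super sister silver silly start".split()
print(take_seven_or_nothing(dedupe_keep_first(s_words)))
```

Going through indices rather than uniquifying the lowercased strings directly is what preserves the original casing.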
- `SPEED sPeEd shopper SPEED new car book seven sad sum power fun super sister silver silly start` should result in `SPEED seven sad sum super sister silver`, so a simply uniquify won't suffice in this case. – Kevin Cruijssen, Mar 23, 2022 at 8:09
- @KevinCruijssen that should be fixed now – a stone arachnid, Mar 23, 2022 at 14:57
- Great, +1 from me. :) – Kevin Cruijssen, Mar 23, 2022 at 15:24
R, 79 bytes (previously 57, 61, 58)
Edit: -2 bytes thanks to "regex stealing" by pajonk
Edit2: +21 bytes to remove case-sensitive duplicates, while returning the originally cased input
q=(o=grep("^s(?!h)",scan(,""),T,T,T))[!duplicated(tolower(o))];if(q[7]>F)q[1:7]
Outputs the first 7 unique words starting with 's' or 'S' but not 'sh' or 'Sh', if there are at least 7.
Otherwise errors without outputting anything.
R, 82 bytes (previously 59, 63, 61)
(q=(o=grep("^s(?!h)",scan(,""),T,T,T))[!duplicated(tolower(o))])[1:7][length(q)>6]
As above, but exits quietly without erroring if there are fewer than 7 valid words.
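For readers who don't know R, the shape of both versions (a case-insensitive `^s(?!h)` filter, then a "not duplicated" mask over the lowercased words) can be approximated in Python. This is a sketch of the idea only: `r_style` is a hypothetical name, and the boolean list imitates what R's `duplicated()` returns rather than calling anything from R.

```python
import re

# Case-insensitive "starts with s, not followed by h", as in the R pattern.
S_WORD = re.compile(r"^s(?!h)", re.IGNORECASE)


def r_style(words: list[str]) -> list[str]:
    # o: the words that pass the regex, original casing kept.
    o = [w for w in words if S_WORD.search(w)]
    lowered = [w.lower() for w in o]
    # True where the same lowercase word already appeared earlier in the list.
    duplicated = [low in lowered[:i] for i, low in enumerate(lowered)]
    q = [w for w, dup in zip(o, duplicated) if not dup]
    # Quiet variant: return nothing unless there are more than 6 words.
    return q[:7] if len(q) > 6 else []


print(r_style("sheep speed SPEED new seven sad sum super sister silver silly".split()))
```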
- @pajonk - Thanks! I even think I saw it, but didn't understand it, and didn't try it out... Now I need to figure-out how it works... – Dominic van Essen, Mar 23, 2022 at 7:36
- "Negative lookahead" is the search term to look for. Example link – pajonk, Mar 23, 2022 at 7:40
- `SPEED sPeEd shopper SPEED new car book seven sad sum power fun super sister silver silly start` should result in `SPEED seven sad sum super sister silver`, so a simply uniquify won't suffice in this case. – Kevin Cruijssen, Mar 23, 2022 at 8:10
- @KevinCruijssen - Fixed now, but it cost a lot... – Dominic van Essen, Mar 23, 2022 at 8:45
JavaScript (ES6), 61 bytes (previously 62)
Saved 1 byte thanks to @l4m2
s=>(a=[...new Set(s.match(/\bs(?!h)\w*/g))]).slice(a[6]||7,7)
- `s` does not appear to register as a valid word. – Jonathan Allan, Mar 22, 2022 at 22:58
- @JonathanAllan Well, I guess `s` should be matched indeed. Now fixed. – Arnauld, Mar 22, 2022 at 23:05
- @l4m2 I think `a[6]` may be a number. – Arnauld, Mar 22, 2022 at 23:56
- @Arnauld How can a number start with `s`? – l4m2, Mar 23, 2022 at 0:01
Retina 0.8.2, 47 bytes
+msi`(^(.+)$.+)^\2$
$1
Gi`^s(?!h)
1!`.+(¶.+){6}
Try it online! Explanation:
+msi`(^(.+)$.+)^\2$
$1
Delete case-insensitive duplicates.
Gi`^s(?!h)
Keep only words beginning with s but not sh.
1!`.+(¶.+){6}
Select the first seven words.
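The three stages can be imitated with Python's `re` module, which helps to see what each pattern does. This is an approximation and not a Retina interpreter: it assumes the words arrive one per line, and `retina_style` is a name invented for this sketch.

```python
import re

# Stage 1 pattern: a whole line, anything after it, then a later whole line
# equal to it ignoring case. Replacing with group 1 drops the later copy.
DUPLICATE = re.compile(r"(^(.+)$.+)^\2$", re.M | re.S | re.I)


def retina_style(text: str) -> list[str]:
    # The "+" option in Retina: repeat the replacement until nothing changes.
    while True:
        reduced = DUPLICATE.sub(r"\1", text)
        if reduced == text:
            break
        text = reduced
    # Stage 2 (G with i): keep only lines starting with s but not sh.
    kept = [ln for ln in text.split("\n") if re.match(r"s(?!h)", ln, re.I)]
    # Stage 3: the first run of seven non-empty lines, if one exists.
    m = re.search(r".+(?:\n.+){6}", "\n".join(kept))
    return m.group(0).split("\n") if m else []


words = "SPEED sPeEd shopper SPEED new car book seven sad sum power fun super sister silver silly start"
print(retina_style(words.replace(" ", "\n")))
```

The deduplication leaves empty lines behind, which is harmless here because the filter stage discards them.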
- What about the lone word, `s`? – Jonathan Allan, Mar 22, 2022 at 22:56
- @JonathanAllan I should have guessed... – Neil, Mar 23, 2022 at 0:38
- `SPEED sPeEd shopper SPEED new car book seven sad sum power fun super sister silver silly start` should result in `SPEED seven sad sum super sister silver`, so a simply uniquify won't suffice in this case. – Kevin Cruijssen, Mar 23, 2022 at 8:10
- @KevinCruijssen Better now? – Neil, Mar 23, 2022 at 8:42
- 34 bytes in Retina 1: Try it online! – Neil, Mar 23, 2022 at 8:48
Jelly, 24 bytes
Assuming that we must handle uppercase S and H too and that we must/may return the leftmost distinct "s-words"
ḣ2ŒliⱮ=Ø.
ḲQçƇ)hsḣJf7ḢƊ$
A monadic Link that accepts a list of characters and yields a list of the words.
How?
ḣ2ŒliⱮ=Ø. - Helper Link, valid word?: list of characters, Word; identifiers ("hs")
ḣ2 - head Word to index two - e.g. "Child" -> "Ch"
Œl - lower-case -> X -> X = "ch"
Ɱ - map across C in identifiers with:
i - first (1-indexed) index of C in X -> [2,0]
('h' at index 2, no 's' exists)
Ø. - [0,1]
= - equal?
ḲQçƇ)hsḣJf7ḢƊ$ - Link get s-words: list of characters, T
Ḳ - split T at space characters -> Words
Q - deduplicate
)hs - set the right argument to "hs"
Ƈ - filter keep those Words for which:
ç - call the helper Link as a dyad - f(Word, "hs")
$ - last two links as a monad - f(ValidWords):
Ɗ - last three links as a monad - g(ValidWords):
J - range of length -> [1,2,...,number of valid words]
7 - seven
f - filter-keep -> [7] or [] if less than seven valid words
Ḣ - head -> 7 or 0
ḣ - head of ValidWords to that index
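The helper Link's test (the 1-indexed positions of "h" and "s" in the lowercased first two characters must equal [0, 1]) can be mirrored in Python to see why it accepts a lone `s` and rejects `sh`. The function name `is_s_word` is made up for this sketch, and `str.find` plus one stands in for Jelly's index atom.

```python
def is_s_word(word: str) -> bool:
    head = word[:2].lower()  # first two characters, lowercased
    # str.find returns -1 when absent, so adding 1 gives a 1-indexed
    # position with 0 meaning "not found", as in the explanation above.
    positions = [head.find(c) + 1 for c in "hs"]
    # Valid only if there is no "h" in the head and "s" is its first character.
    return positions == [0, 1]


for w in ["Child", "speed", "Sheep", "s", "hs"]:
    print(w, is_s_word(w))
```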
- `SPEED sPeEd shopper SPEED new car book seven sad sum power fun super sister silver silly start` should result in `SPEED seven sad sum super sister silver`, so a simply deduplicate won't suffice in this case. – Kevin Cruijssen, Mar 23, 2022 at 8:09
05AB1E, 27 (or 22?) bytes
#ʒlć'sQsн'hÊ*}DlDÙkèDg7@7*£
Assumes differently cased words (e.g. speed/SPEED/sPeEd) are all the same for the uniquify. Otherwise this could have been 22 bytes by replacing the DlDÙkè with Ù.
Try it online or verify some more test cases.
Explanation:
# # Split the (implicit) input-string by spaces
ʒ # Filter this list of words by:
l # Convert it to lowercase
ć # Extract head; pop remainder-string and first char separated
'sQ '# Check if this head is an "s"
s # Swap so the remainder-string is at the top
н # Pop and push its first character
'hÊ '# Check that it's NOT equal to a "h"
* # Check that both were truthy
}D # After the filter: duplicate the resulting list of words
l # Convert each to lowercase
D # Duplicate it again
Ù # Uniquify the top copy
k # Get all its indices in the lowercase list
è # Use it to index in the regular case-insensitive list
D # Duplicate the list
g # Pop and push its length
7@ # Check if it's >=7
7* # Multiply that 0/1 by 7 (either 0 or 7)
£ # Leave that many leading words from the list
# (after which the resulting list is output implicitly)
Python 3, 103 bytes (previously 111, 102)
I am pretty sure this isn't a perfect solution, as my regex skills are far from perfect and this seems an unnecessarily long way to check for "any word character that is not h or H", but it works. Takes a list of the words, and returns a set of the seven words, or nothing if it cannot find seven words.
import re
def f(x):
 r=set()
 for i in x:
  if re.match("[Ss](?![Hh])",i):r|={i}
  if len(r)>6:return r
Edit -9 bytes: realized that it didn't have to check if the words only contained alphanumerics
Edit +1 bytes: @a stone arachnid pointed out my regex failed for input of s
- unfortunately @astonearachnid, that does not fully (to my understanding, at least) match the challenge spec, as that will match non-word characters as well, and the challenge was not specific (to my reading) on whether the input will contain invalid words or if it will only be valid words – des54321, Mar 22, 2022 at 21:41
- actually @astonearachnid you're right, I skipped over a bit of the input specification when I was reading it – des54321, Mar 22, 2022 at 21:47
- @des54321 The thing we actually want to use here is a negative lookahead, so `s(?!h)`. – Lazy, Mar 22, 2022 at 21:53
- @Lazy that only works if we don't need to handle uppercase, which I think we do, and thus we'd need `[sS](?![hH])`, 1 byte longer than what I have – des54321, Mar 22, 2022 at 22:00
- @des54321 my mistake, using `[^hH]` fails for input of `s` while `[sS](?![hH])` does not. – a stone arachnid, Mar 23, 2022 at 0:22
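The difference discussed in these comments is easy to check directly. The short script below compares a pattern that consumes a second character with the negative-lookahead version; the variable names are only for illustration.

```python
import re

# Requires a second character that is not h/H, so a lone "s" cannot match.
consuming = re.compile(r"[sS][^hH]")
# Only asserts that the next character (if any) is not h/H.
lookahead = re.compile(r"[sS](?![hH])")

for word in ["s", "sad", "sheep", "Shop", "Super"]:
    print(word, bool(consuming.match(word)), bool(lookahead.match(word)))
```

The two patterns agree on every word of two or more letters; they differ only on the one-letter word `s`.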
PowerShell Core for Windows, 68 bytes
Thanks @Julian for the inspiration
($r=$args|sls '\bs(?!h)\S*'-a|% m*|% v*e|sort -u -t 7)*!($r.count-7)
The alias sort is not defined for Linux PowerShell and TIO. Linux requires sort-object.
Less golfed:
$result = $args|select-string '\bs(?!h)\S*' -allMatches|% matches|% value|sort -unique -top 7
$result*($result.count-eq7)
PowerShell Core, 75 bytes (previously 78)
($r=-split$args-match'^s[^h]*$'|?{!($_-in$u);$u+=,$_})[0..6]*($r.count-ge7)
-3 bytes thanks to mazzy!
- nice. Try it online! – mazzy, Mar 28, 2022 at 21:17
Charcoal, 38 bytes
WS⊞υι≔Φυ∧›=↧§ι0s=↧§ι1h=κ⌕↧υ↧ιυ...υ∧‹6Lυ7
Try it online! Link is to verbose version of code. Takes input as a list of newline-terminated strings. Explanation:
WS⊞υι
Input the list of words.
≔Φυ∧›=↧§ι0s=↧§ι1h=κ⌕↧υ↧ιυ
Case-insensitively filter out all words that don't start with an s or start with sh or are duplicates.
...υ∧‹6Lυ7
Output the first seven remaining words if there are more than six of them.
JavaScript (Node.js), 64 bytes
s=>(a=s.match(/(\bs(?!h)\w*\b)(?<!\b\1 .*)/gi)).slice(a[6]||7,7)
Fixed Arnauld's uppercase problem with 3 bytes
PHP, 131 bytes (previously 134)
$a=explode(" ",$argn);foreach($a as$v){$l=strtolower($v);if($l[0]=="s"&&$l[1]<>"h"){$b[$l]="$v ";if(count($b)==7){echo join($b);}}}
Explanation: Used associative array key to prevent dups instead of in_array function.
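The associative-array idea carries over to a Python dict keyed by the lowercased word. The sketch below is an illustration and not a port of the PHP: `keyed_dedupe` is a hypothetical name, and it deliberately uses `setdefault` so that the first spelling of a word is kept, whereas a plain assignment would let a later spelling overwrite it.

```python
def keyed_dedupe(text: str) -> list[str]:
    by_key = {}  # lowercase word -> first original spelling seen
    for word in text.split(" "):
        low = word.lower()
        # First character is "s" and the second (if any) is not "h".
        if low[:1] == "s" and low[1:2] != "h":
            by_key.setdefault(low, word)  # a repeated key is a no-op
            if len(by_key) == 7:
                # Seventh distinct word reached: stop and return them in order.
                return list(by_key.values())
    return []  # fewer than seven distinct s-words: output nothing


print(keyed_dedupe("speed SpEEd new book seven sad sum power fun super sister silver silly start"))
```

Returning as soon as the seventh key appears also means the words are emitted at most once.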
----- Previous answer -----
$a=explode(" ",$argn);foreach($a as$v){if($v[0]=="s"&&$v[1]<>"h"&&!in_array($v,$b)&&$i<7){$b[]=$v;$i++;}}if($i==7){echo join($b," ");}
Explanation: Straightforward conversion of input to array, step through array testing each word against the rules, if a word passes the test add it to second array, and only print second array if 7 words pass the test.
… s or must we handle uppercase S too? What about the h/H?
… Super if it's at the start? And should we exclude sHould?