In Irish, most consonants come in broad (velarized) and slender (palatalized) variants, and the orthography marks them with the neighboring vowels, which are divided the same way. This gives rise to the caol le caol agus leathan le leathan (slender with slender and broad with broad) rule: a medial sequence of consonants must have the same class of vowel on either side. In leabhar, bh is surrounded by two broad vowels, so it is broad as well; in cailín, l is surrounded by two slender vowels, so it is slender. a, o and u are broad and e and i are slender (likewise for the vowels with a fada: á, ó and ú are broad, é and í are slender); ae (but not áe, aé, or áé) is also considered broad.
Given a word, output whether it follows this rule.
Input
You may assume that the input has only the following characters with their uppercase variants:
aábcdeéfghiílmnoóprstuú
AÁBCDEÉFGHIÍLMNOÓPRSTUÚ
Input will be given in the NFC normalization form.
Tests
Valid:
deartháireacha
madra
nuachtán
gaolta
ceannasaithe
snámhann
fómhair
laethanta
béar
Bealtaine
hAoine
ball
tree
ggg
laEthanta
agus
úsáideoir
Invalid:
codegolf
delta
alishanoi
ABI
anseo
breithlá
aéco
áeco
áéco
(Note that anseo and breithlá are Irish words, but they happen not to follow this rule. You should still output a falsy answer for them for the sake of simplicity.)
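For reference, the rule can be sketched in ungolfed Python. This is only one reading of the specification above, not part of the original challenge; the name `follows_rule` and the two-variable state are illustrative choices.

```python
BROAD = set("aáoóuú")
SLENDER = set("eéií")

def follows_rule(word: str) -> bool:
    """Return True if every medial consonant cluster has one vowel class on both sides."""
    # Lowercase first, then fold the digraph "ae" into a plain broad "a".
    # The accented spellings áe, aé and áé are deliberately left untouched.
    w = word.lower().replace("ae", "a")
    last_vowel = None        # class of the most recent vowel: "B" or "S"
    seen_consonant = False   # has a consonant appeared since that vowel?
    for ch in w:
        if ch in BROAD or ch in SLENDER:
            cls = "B" if ch in BROAD else "S"
            # A cluster is only checked when it has a vowel on both sides.
            if last_vowel is not None and seen_consonant and cls != last_vowel:
                return False
            last_vowel, seen_consonant = cls, False
        else:
            seen_consonant = True
    return True
```

Under this reading, adjacent vowels of different classes (as in béar) are fine, because only vowels separated by at least one consonant are compared.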
7 Answers
Retina 0.8.2, 39 bytes
ae|[aáoóuú]
#@
[eéií]
@#
i)1`(#|@)\w+\1
Try it online! Link includes test cases. Outputs an inverted result, i.e. 0 for valid, 1 for invalid. Explanation:
i)`
Run the whole script case-insensitively.
ae|[aáoóuú]
#@
Mark broad vowels with #@.
[eéií]
@#
Mark slender vowels with @#.
1`(#|@)\w+\1
Check for consonants surrounded by vowels of different types (which equates to having surrounding symbols of the same type).
Note that Retina defaults to the ISO-8859-1 code page so all of the Irish vowels only cost one byte.
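The same marker idea can be tried in Python with the standard `re` module. This is a rough port rather than the Retina program itself; the function name `violates_rule` is made up for the sketch, and it keeps the inverted output convention (1 for invalid, 0 for valid).

```python
import re

def violates_rule(word: str) -> int:
    # Mark broad vowels (including the digraph "ae") with "#@".
    s = re.sub(r"ae|[aáoóuú]", "#@", word, flags=re.IGNORECASE)
    # Mark slender vowels with "@#".
    s = re.sub(r"[eéií]", "@#", s, flags=re.IGNORECASE)
    # Only consonants are still word characters at this point, so a run of
    # \w framed by two identical marker symbols is a cluster whose
    # neighbouring vowels belong to different classes.
    return 1 if re.search(r"(#|@)\w+\1", s) else 0
```

The broad substitution has to run first so that the e of ae is consumed before the slender pass sees it.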
Haskell + hgl, 51 bytes (down from 78, then 60)
(ma.*pPX"[ei][^aeiou]+[aou]"~<rv)<rmD<<skX"ae_"<mtL
Explanation
This works by making some reductions on the input and then running a simple regex on both the string and its reverse.
- mtL: convert to lower case.
- skX"ae_": replace ae with a.
- rmD: remove the síntí fada.
- ~<rv: on both the string and its reverse ...
- pPX"[ei][^aeiou]+[aou]": check if the pattern slender-cluster-broad occurs ...
- ma: get the logical or.
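The steps above can be mirrored in Python. This is a loose sketch, not a translation of the hgl builtins: accent removal is done here with `unicodedata` (NFD decomposition, then dropping combining marks), which is an assumption about what removing the síntí fada amounts to, and `follows_rule` is a hypothetical name.

```python
import re
import unicodedata

# Slender vowel, one or more consonants, broad vowel.
PATTERN = re.compile(r"[ei][^aeiou]+[aou]")

def follows_rule(word: str) -> bool:
    # Lowercase and fold "ae" to "a" before stripping accents,
    # so that áe, aé and áé are not folded by mistake.
    s = word.lower().replace("ae", "a")
    # Decompose, then drop the combining acute accents.
    s = "".join(c for c in unicodedata.normalize("NFD", s)
                if not unicodedata.combining(c))
    # Searching the reversed string covers broad-cluster-slender.
    return not (PATTERN.search(s) or PATTERN.search(s[::-1]))
```

Running one pattern over the string and its reverse avoids writing out the mirrored alternative in the regex.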
No regex, 69 bytes
(ma.*pP(ah'$xys"ei"<>so(nxy W5)<>xys"aou")~<rv)<rmD<<sk(ʃa<*ʃe)<mtL
Reflection
- There should be a version of gkY but for regexes.
- There should be pPX but not for regexes. That is, combine pP with ah'.
- I could probably add a builtin for "aeiouáéíóú".
- This regex reminds me of the regex used here. There might be a way to systematize this a bit to save bytes.
- Just as xay has a string version, there should be a string version of nxy.
- Just as there are (<>?) and (?<>), there should be (<>*), (*<>), (<>+) and (+<>).
- We could use some more versions of sY, or "split along". There's sYe to split along characters from a list of options, but it would be nice to have the opposite, to split along characters absent from a list. It would also be good to have variants for all of these that remove empty strings from the output, i.e. to split along clusters of matching characters.
Jelly, 30 bytes
Œlœṣ⁾aeKe€ØCœpƊO%65%9Ḃj€-FSƝ1e
A monadic Link that accepts a list of the specified characters and yields 0 for valid or 1 for invalid.
How?
Œlœṣ⁾aeKe€ØCœpƊO%65%9Ḃj€-FSƝ1e - Link: list of characters (limited to those specified)
Œl - convert to lowercase (works for fada e.g.: 'Ó' -> 'ó')
⁾ae - "ae"
œṣ - split {Lowercased} at occurrences of {"ae"}
K - join with spaces (we'll treat this as a broad vowel later)
Ɗ - last three links as a monad:
ØC - consonants -> "BCD...Zbcd...z"
€ - for each {C in our string}:
e - {C} exists in {"BCD...Zbcd...z"}?
œp - partition {our string} at truthy indices of {that}, discarding the borders
O - cast to ordinals
%65 - mod 65
%9 - mod 9
Ḃ - mod 2
j€- - join each with -1
F - flatten
Ɲ - for neighbouring pairs:
S - sum
1e - contains 1?
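To see why the modulo chain works, here is a Python sketch of the same pipeline. It is an illustration only: the names `broad_bit` and `violates_rule` are invented, and the consonant set is restricted to the letters the challenge allows rather than Jelly's full constant.

```python
def broad_bit(ch: str) -> int:
    # ord % 65 % 9 % 2 gives 1 for a, o, u, á, ó, ú and for the space
    # standing in for "ae", and 0 for e, i, é, í.
    return ord(ch) % 65 % 9 % 2

def violates_rule(word: str) -> int:
    # Replace each "ae" with a space, which is classed as broad below.
    s = " ".join(word.lower().split("ae"))
    consonants = set("bcdfghlmnprst")  # assumption: only the allowed input letters
    groups, current = [], []
    for ch in s:
        if ch in consonants:
            groups.append(current)     # partition at consonants, dropping them
            current = []
        else:
            current.append(broad_bit(ch))
    groups.append(current)
    flat = []
    for g in groups:
        for i, bit in enumerate(g):
            if i:
                flat.append(-1)        # -1 between vowels of the same group
            flat.append(bit)
    # Within a group every neighbouring sum is -1 or 0; a sum of exactly 1
    # can only come from a 0 and a 1 that sit in different groups.
    return int(any(a + b == 1 for a, b in zip(flat, flat[1:])))
```

The -1 separators are what keep adjacent vowels of different classes inside one group from being flagged.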
05AB1E, 31 bytes
l„ae¬:žPS¡õKü2εεN<è"eiéí"så}Ë}P
Try it online or verify all test cases.
Explanation:
l # Convert the (implicit) input-string to lowercase
„ae # Push string "ae"
¬ # Push its first character "a" (without popping)
: # Replace all "ae" with "a" in the lowercase input
žP # Push the consonants constant "bcdfghjklmnpqrstvwxz"
S¡ # Split the string on each consonant
õK # Remove all empty strings (where multiple adjacent consonants were)
ü2 # Pop and push all overlapping pairs
ε # Map over each pair:
ε # Map over both vowel-strings in each pair:
N # Push the 0-based index
< # Decrease it by 1
è # Use it to index into the vowel-string;
# N=0 → -1 → last character; N=1 → 0 → first character
"eiéí"så # Check if this character is one of "eiéí"
}Ë # After the inner map: Check if both vowel-checks are the same;
# [1,1] for both slender; [0,0] for both broad; or
# [0,1] or [1,0] for invalid
}P # After the outer map: Check if all are truthy
# (after which the result is output implicitly)
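The split-and-compare approach reads fairly directly in Python as well. This is a hedged sketch of the idea, not a transcription of the 05AB1E program; `follows_rule` is a made-up name and the consonant list is copied from the constant quoted in the explanation.

```python
import re

def follows_rule(word: str) -> bool:
    # Lowercase and fold "ae" to "a".
    s = word.lower().replace("ae", "a")
    # Split on every consonant and drop the empty strings left by
    # adjacent consonants, leaving only the vowel runs.
    runs = [r for r in re.split(r"[bcdfghjklmnpqrstvwxz]", s) if r]
    # For each neighbouring pair of runs, the last vowel of the left run
    # and the first vowel of the right run must agree on slenderness.
    return all((left[-1] in "eiéí") == (right[0] in "eiéí")
               for left, right in zip(runs, runs[1:]))
```

A word with fewer than two vowel runs, such as tree or ggg, produces no pairs and is therefore accepted.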
Perl -Mutf8 -MUnicode::Normalize, 72 bytes
Thanks @bb94!
sub{s/ae/a/ig;$_=NFD$_;s/\pM//ug;($_.z.reverse)=~/[ei][^aeiouz]+[aou]/i}
Perl -Mutf8 -MUnicode::Normalize, 75 bytes
Inverted truthy and falsey.
sub{s/ae/a/ig;$_=NFD$_;s/\pM//ug;s/[^aeiou]+/_/gi;/[aou]_[ei]|[ei]_[aou]/i}
- Can I remove NFD part so I could let the user apply such function as prerequisite? – IY5dVSjABEeV, Jul 31, 2024 at 14:03
- Sorry if it wasn’t clear; you should assume that the input is in NFC. – bb94, Jul 31, 2024 at 17:42
- 72 bytes: sub{s/ae/a/ig;$_=NFD$_;s/\pM//ug;($_.z.reverse)=~/[ei][^aeiouz]+[aou]/i} – bb94, Aug 2, 2024 at 1:37
Charcoal, 54 bytes
¹ＦＥＥ⪫⪪Ｓae¦a℅﹪℅↧ι¹³²−№]aouvι№eiι«Ｆ∧ι∧№υ⁰№υ±ι⎚Ｆ↔ι≔⟦⟧υ⊞υι
Try it online! Link is to verbose version of code. Outputs a Charcoal boolean, i.e. - for valid, nothing if not. Explanation:
¹
Assume the word is valid.
ＦＥＥ⪫⪪Ｓae¦a℅﹪℅↧ι¹³²−№]aouvι№eiι«
Replace ae in the input with a, then for each letter, lowercase it, then reduce its ordinal modulo 132, which converts á to ], ú to v (which fortunately isn't a valid input) and the other accented letters to their base letter, then finally determine whether it's a broad vowel (mapped to 1), a slender vowel (mapped to -1) or a consonant (mapped to 0). Loop over the resulting list of letter types.
Ｆ∧ι∧№υ⁰№υ±ι⎚
If this is a vowel, the previous vowel was the other type of vowel, and there was an intervening consonant, then clear the canvas, marking the word as invalid.
Ｆ↔ι≔⟦⟧υ
If this is a vowel then clear any saved letter types.
⊞υι
Save the type of this letter.
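The modulo-132 mapping and the saved-types loop can be sketched in Python as follows. The names `letter_type` and `follows_rule` are invented for the illustration, and the sketch lowercases the whole word before folding ae (a deviation from the step order described above, assumed here so that mixed-case input such as laEthanta folds correctly).

```python
def letter_type(ch: str) -> int:
    # ord % 132 maps á to "]", ú to "v", and é, í, ó to e, i, o;
    # unaccented letters are below 132 and stay as they are.
    c = chr(ord(ch) % 132)
    # 1 for broad vowels, -1 for slender vowels, 0 for consonants.
    return "]aouv".count(c) - "ei".count(c)

def follows_rule(word: str) -> bool:
    valid = True
    saved = []  # letter types seen since (and including) the last vowel
    for t in map(letter_type, word.lower().replace("ae", "a")):
        # A vowel, preceded by a vowel of the other type, with at least
        # one consonant in between, breaks the rule.
        if t and 0 in saved and -t in saved:
            valid = False
        if t:
            saved = []   # a vowel starts a fresh window
        saved.append(t)
    return valid
```

Keeping only the types since the last vowel means each vowel is compared with its nearest predecessor, never with an earlier one.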
Raku (Perl 6) (rakudo), 55 bytes (down from 56)
{s:g:i/ae/a/;$_|.flip!~~m:i:m/[e|i]<-[aeiou]>+<[aou]>/}
Explanation
The string-and-reverse trick was inspired by Wheat Wizard’s Haskell + hgl answer.
{s:g:i/ae/a/;$_|.flip!~~m:i:m/[e|i]<-[aeiou]>+<[aou]>/}
s:g:i/ae/a/; # Replace all instances of `ae` with `a` in the input (case-insensitively)
$_|.flip!~~ # Does neither that string nor its reverse match...
m:i:m/[e|i]<-[aeiou]>+<[aou]>/ # the regex /[ei][^aeiou]+[aou]/ (case- and diacritic-insensitively)?
💎
Created with the help of Luminespire.
- aé, áe, and áé are used in Irish at all. I’ll say that only ae should be.
- ggg isn't an Irish word and that's listed as a "valid" Irish word in the examples. While the title refers to being a "valid Irish word", the purpose of the exercise solely focuses on the slender/broad division for its vowels. It's not asking to vet any other spelling, nor to check if the word itself exists in an Irish dictionary. We're only asked to apply the broad/slender rule. Non-existing words can still be judged for correctness on broad/slender. If your interpretation were correct then the valid examples would only contain actual Irish words, which is not the case.