Check if a given string is valid romaji

Question 1

Your program is given a string consisting entirely of lowercase letters at STDIN (or closest alternative). The program must then output a truthy or falsey value, depending on whether the input is valid romaji.

Rules:

It must be possible to divide the entire string into a sequence of kana without any leftover characters.
Each kana can be a single vowel (aeiou)
Each kana can also be a consonant p, g, z, b, d, k, s, t, n, h, m, or r followed by a vowel. For example, ka and te are valid kana, but qa is not.
The exceptions to the above rule are that zi, di, du, si, ti, and tu are not valid kana.
The following are also valid kana: n, wa, wo, ya, yu, yo, ji, vu, fu, chi, shi, tsu.
If a particular consonant is valid before an i (e.g. ki, pi), the i can be replaced by ya, yu, or yo and the result is still valid (e.g. kya, kyu, kyo).
The exceptions to the above rule are chi and shi, for which the y has to be dropped too (i.e. cha, chu, cho, sha, shu, sho).
It is also valid to double consonants if they are the first character of a kana (kka is valid but chhi is not)
Shortest answer wins. All regular loopholes are disallowed.
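
As a non-golfed reference for the rules above, here is a minimal Python 3 sketch that builds the kana set from the rules and checks whether a string splits into kana with nothing left over. The names `build_kana` and `is_romaji` are made up for illustration. Two readings are assumptions rather than something the rules state outright: ji also yields jya, jyu, jyo, and the y-forms (kya and so on) can be doubled like their base kana. The empty string is accepted here; the challenge does not say either way.

```python
VOWELS = "aiueo"

def build_kana():
    """Build the set of valid kana from the rules (illustrative sketch)."""
    # Consonant + vowel, minus the listed exceptions.
    plain = {c + v for c in "pgzbdkstnhmr" for v in VOWELS}
    plain -= {"zi", "di", "du", "si", "ti", "tu"}
    plain |= {"wa", "wo", "ya", "yu", "yo", "ji", "vu", "fu"}
    # Wherever "Ci" is valid, "Cya", "Cyu", "Cyo" are valid too
    # (assumption: this includes ji -> jya, jyu, jyo).
    plain |= {k[0] + "y" + v for k in plain if k[1] == "i" for v in "auo"}
    # Kana whose first consonant may be doubled (kka, kkya, ...).
    doubled = {k[0] + k for k in plain}
    # Kana that cannot be doubled: vowels, n, tsu, and the chi/shi families.
    fixed = set(VOWELS) | {"n", "tsu"}
    fixed |= {h + v for h in ("ch", "sh") for v in "iauo"}
    return plain | doubled | fixed

KANA = build_kana()
MAX_LEN = max(map(len, KANA))  # longest kana, e.g. "kkya"

def is_romaji(s):
    """True if s can be split entirely into kana (dynamic programming)."""
    ok = [True] + [False] * len(s)  # ok[i]: s[:i] splits into kana
    for end in range(1, len(s) + 1):
        ok[end] = any(
            ok[end - n] and s[end - n:end] in KANA
            for n in range(1, min(MAX_LEN, end) + 1)
        )
    return ok[len(s)]
```

The table-based check runs in time linear in the input length, at the cost of being far longer than a golfed regex.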

List of all valid kana:

Can have double consonant:
ba, bu, be, bo, bi
ga, gu, ge, go, gi
ha, hu, he, ho, hi
ka, ku, ke, ko, ki
ma, mu, me, mo, mi
na, nu, ne, no, ni
pa, pu, pe, po, pi
ra, ru, re, ro, ri
sa, su, se, so
za, zu, ze, zo
da, de, do
ta, te, to
wa, wo
ya, yu, yo
fu
vu
ji
Cannot have double consonant:
a, i, u, e, o
tsu
chi, cha, cho, chu
shi, sha, sho, shu
n

Test cases

Pass:

kyoto
watashi
tsunami
bunpu
yappari

Fail:

yi
chhi
zhi
kyi
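
The test cases above can be checked against a readable, non-golfed regex. The following Python 3 sketch is one possible transcription of the rules, not either answer's submission; `ROMAJI` and `is_valid_romaji` are hypothetical names, and the treatment of jya/jyu/jyo and doubled y-forms is an assumption about how the rules should be read.

```python
import re

# One kana per alternative; a backreference (\1, \2, ...) right after a
# captured consonant allows that same consonant to be doubled.
ROMAJI = re.compile(r"""
    (?: [aiueo]                                  # bare vowel
      | ([pgbknhmr])\1? (?: [aiueo] | y[auo] )   # full rows, optional y-glide
      | ([sz])\2? [aueo]                         # no si / zi
      | ([dt])\3? [aeo]                          # no di / du / ti / tu
      | (w)\4? [ao]
      | (y)\5? [auo]
      | ([fv])\6? u
      | (j)\7? (?: i | y[auo] )                  # assumption: jya, jyu, jyo
      | tsu
      | [cs]h[iauo]                              # chi/shi families, never doubled
      | n                                        # syllabic n
    )*
""", re.VERBOSE)

def is_valid_romaji(s):
    """True if the whole string is a sequence of kana."""
    return ROMAJI.fullmatch(s) is not None

PASS = ["kyoto", "watashi", "tsunami", "bunpu", "yappari"]
FAIL = ["yi", "chhi", "zhi", "kyi"]

for word in PASS + FAIL:
    print(word, is_valid_romaji(word))
```

Using `fullmatch` rather than deleting matches with `sub` avoids accepting a string whose leftover pieces happen to be removed in a different order than a real split would allow.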

Question 2

How do we win? Is this a code golf?

Question 3

Need test cases. Also could do with a list of all valid kana instead of the rules

Question 4

@RobertFraser both is preferred - test cases are not rules

Question 5

n cannot be doubled. I know enough about the Japanese alphabets to say that. If n was doubled, it would need to have a vowel after, but then it wouldn't be n. So if kanna was a word (just making it up), it'd actually be ka n na.
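
The reading described in this comment can be illustrated with a small segmenter. This is a Python 3 sketch under stated assumptions: `kana_set` and `segment` are hypothetical names, the kana set follows the challenge rules, and doubled n-row kana (nna, nni, ...) are deliberately left out so that "nn" before a vowel is read as syllabic n plus an n-row kana, as the comment suggests. The challenge's own list does allow na to be doubled, so validity is the same either way; only the split differs.

```python
from functools import lru_cache

def kana_set():
    vowels = "aiueo"
    plain = {c + v for c in "pgzbdkstnhmr" for v in vowels}
    plain -= {"zi", "di", "du", "si", "ti", "tu"}
    plain |= {"wa", "wo", "ya", "yu", "yo", "ji", "vu", "fu"}
    plain |= {k[0] + "y" + v for k in plain if k[1] == "i" for v in "auo"}
    # Doubling, except for the n row (assumption taken from the comment).
    doubled = {k[0] + k for k in plain if k[0] != "n"}
    fixed = set(vowels) | {"n", "tsu"}
    fixed |= {h + v for h in ("ch", "sh") for v in "iauo"}
    return plain | doubled | fixed

KANA = kana_set()

def segment(s):
    """Return one split of s into kana, or None if no split exists."""
    @lru_cache(maxsize=None)
    def go(i):
        if i == len(s):
            return ()
        # Try the longest candidate first so "na" is preferred over "n" + "a".
        for n in (4, 3, 2, 1):
            piece = s[i:i + n]
            if len(piece) == n and piece in KANA:
                rest = go(i + n)
                if rest is not None:
                    return (piece,) + rest
        return None

    result = go(0)
    return None if result is None else list(result)

print(segment("kanna"))    # ['ka', 'n', 'na']
print(segment("yappari"))  # ['ya', 'ppa', 'ri']
print(segment("kyi"))      # None
```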

Question 6

You know, I wanted to make a solution using unicodedata, but it'll definitely be longer than a regex solution. Partial program

Question 7

Ruby, 149 bytes (previously 96)

Regex solution to match all the valid kana. Interestingly, "ecchi" is not valid according to the current rules, but perhaps it's for the best.

->s{s.gsub(/(?![dt]u)(sh|ch|([gbknhmrp])\2?y?|([zdst])\3?)?[auo]|(\g<2>)?\4?[ie]|(\g<3>)\5?e|ww?[ao]|n|tsu|([fv])\6?u|jj?i|j?y?[aou]|yy[aou]/){}==""}

Try it online! feat. Cruel Angel's Thesis

Question 8

It fails on the simple tests zi and zye.

Question 9

@DeadPossum fixed.

Question 10

Python 2, 166 bytes

Long regex solution
Try it online

I think f-strings (available from Python 3.6) could help shorten it by factoring out the repeated [auo] and {1,2}.
Unfortunately, I can't check that myself right now :c

import re
lambda x:re.sub('[bghkmnpr]~([auoei]|y[auo])|[sz]~[auoe]|[dt]~[aeo]|w~[ao]|([fv]~|ts)u|(j~|[cs]h)(i|y[auo])|y~[auo]|[auoien]'.replace('~','{1,2}'),'',x)==''

Question 11

re.sub('~','{1,2}',(your regex)) is shorter than (your regex).replace('~','{1,2}') by 1 byte.

Question 12

Your regex is also failing on a simple test case: bku. Doubled consonants have to be the same consonant.
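
The difference this comment points out can be shown in isolation. In the Python 3 sketch below, `loose` and `strict` are illustrative names and the patterns are reduced to two consonants and one vowel; they are not the answer's full regex. A quantifier on a character class accepts any two letters from the class, while a capture group followed by an optional backreference requires the second letter to repeat the first.

```python
import re

loose = re.compile(r"[bk]{1,2}u")    # one or two letters, each from the class
strict = re.compile(r"([bk])\1?u")   # second letter must equal the first

for word in ("ku", "kku", "bku"):
    print(word, bool(loose.fullmatch(word)), bool(strict.fullmatch(word)))
```

Only the backreference form rejects bku while still accepting kku.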

Value Ink · Accepted Answer (the Ruby answer above) · 2017-05-24 00:42:13Z

Comments:

It fails on the simple tests zi and zye. – Dead Possum, May 24, 2017 at 13:32

@DeadPossum fixed. – Value Ink, May 24, 2017 at 23:01
