\$\begingroup\$

(This post is partly self-plagiarized.)

Objective

Given a Hangul syllable, toggle its vowel harmony.

Introduction to Hangul syllables

Hangul (한글) is the Korean writing system, invented by Sejong the Great. Hangul syllables occupy the Unicode code points U+AC00 – U+D7A3. A Hangul syllable consists of an initial consonant, a vowel, and an optional final consonant.

The initial consonants are:

ᄀ ᄁ ᄂ ᄃ ᄄ ᄅ ᄆ ᄇ ᄈ ᄉ ᄊ ᄋ ᄌ ᄍ ᄎ ᄏ ᄐ ᄑ ᄒ

The vowels are:

ᅡ ᅢ ᅣ ᅤ ᅥ ᅦ ᅧ ᅨ ᅩ ᅪ ᅫ ᅬ ᅭ ᅮ ᅯ ᅰ ᅱ ᅲ ᅳ ᅴ ᅵ

The final consonants are:

(none) ᄀ ᄁ ᆪ ᄂ ᆬ ᆭ ᄃ ᄅ ᆰ ᆱ ᆲ ᆳ ᆴ ᆵ ᄚ ᄆ ᄇ ᄡ ᄉ ᄊ ᄋ ᄌ ᄎ ᄏ ᄐ ᄑ ᄒ

For example, has initial consonant , vowel , and final consonant .

South Korean dictionary order

The consonants and vowels above are listed in South Korean dictionary order. Syllables are sorted first by initial consonant, then by vowel, and finally by the (optional) final consonant.

The Unicode block for Hangul syllables contains every consonant/vowel combination, and is entirely sorted in South Korean dictionary order.

The Unicode block can be seen here, and the first 256 characters are shown for illustrative purposes:

가각갂갃간갅갆갇갈갉갊갋갌갍갎갏감갑값갓갔강갖갗갘같갚갛개객갞갟갠갡갢갣갤갥갦갧갨갩갪갫갬갭갮갯갰갱갲갳갴갵갶갷갸갹갺갻갼갽갾갿걀걁걂걃걄걅걆걇걈걉걊걋걌걍걎걏걐걑걒걓걔걕걖걗걘걙걚걛걜걝걞걟걠걡걢걣걤걥걦걧걨걩걪걫걬걭걮걯거걱걲걳건걵걶걷걸걹걺걻걼걽걾걿검겁겂것겄겅겆겇겈겉겊겋게겍겎겏겐겑겒겓겔겕겖겗겘겙겚겛겜겝겞겟겠겡겢겣겤겥겦겧겨격겪겫견겭겮겯결겱겲겳겴겵겶겷겸겹겺겻겼경겾겿곀곁곂곃계곅곆곇곈곉곊곋곌곍곎곏곐곑곒곓곔곕곖곗곘곙곚곛곜곝곞곟고곡곢곣곤곥곦곧골곩곪곫곬곭곮곯곰곱곲곳곴공곶곷곸곹곺곻과곽곾곿
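Because the block is laid out in dictionary order with 21 vowels and 28 final-consonant slots (27 finals plus "none"), a code point can be split into its three indices with plain integer arithmetic. The following is a minimal sketch of that idea; `decomposeSyllable` and `composeSyllable` are hypothetical helper names, not part of the challenge.

```javascript
// Sketch only: helper names are made up for illustration.
const HANGUL_BASE = 0xAC00; // first syllable in the block, 가
const HANGUL_LAST = 0xD7A3; // last syllable in the block
const VOWEL_COUNT = 21;
const FINAL_COUNT = 28;     // 27 final consonants plus "none"

// Split a precomposed syllable into zero-based indices
// (initial consonant, vowel, final consonant) in dictionary order.
function decomposeSyllable(ch) {
  const code = ch.codePointAt(0);
  if (code < HANGUL_BASE || code > HANGUL_LAST) {
    throw new RangeError("not a precomposed Hangul syllable");
  }
  const offset = code - HANGUL_BASE;
  return {
    initial: Math.floor(offset / (VOWEL_COUNT * FINAL_COUNT)),
    vowel: Math.floor(offset / FINAL_COUNT) % VOWEL_COUNT,
    final: offset % FINAL_COUNT,
  };
}

// Inverse of decomposeSyllable.
function composeSyllable({ initial, vowel, final }) {
  return String.fromCodePoint(
    HANGUL_BASE + (initial * VOWEL_COUNT + vowel) * FINAL_COUNT + final
  );
}

console.log(decomposeSyllable("가")); // { initial: 0, vowel: 0, final: 0 }
console.log(decomposeSyllable("과")); // { initial: 0, vowel: 9, final: 0 }
```

The second call matches the listing above: 과 is the 253rd character shown, and 252 = 9 × 28.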

Vowel Harmony

Korean vowels express vowel harmony as positive-negative pairs. They are paired as follows:

(Positive) - (Negative)
ᅡ - ᅥ
ᅢ - ᅦ
ᅣ - ᅧ
ᅤ - ᅨ
ᅩ - ᅮ
ᅪ - ᅯ
ᅫ - ᅰ
ᅬ - ᅱ
ᅭ - ᅲ

Note that ᅳ, ᅴ, and ᅵ lack counterparts. More accurately, ᅵ is neither positive nor negative. ᅳ and ᅴ are negative, but their positive counterparts have vanished historically. As such, Hangul syllables whose vowel is ᅳ, ᅴ, or ᅵ are considered invalid input.
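For reference, the pairing can be sketched as an ungolfed lookup. This is only an illustration under the assumption that vowels are numbered 0 to 20 in the dictionary order listed earlier; `HARMONY_PAIRS` and `toggleHarmony` are hypothetical names.

```javascript
// Vowel indices follow the dictionary order given in the question (ᅡ = 0, ..., ᅵ = 20).
// Each entry is [positive, negative], copied from the pairing table.
const HARMONY_PAIRS = [
  [0, 4], [1, 5], [2, 6], [3, 7],                 // ᅡ-ᅥ  ᅢ-ᅦ  ᅣ-ᅧ  ᅤ-ᅨ
  [8, 13], [9, 14], [10, 15], [11, 16], [12, 17], // ᅩ-ᅮ  ᅪ-ᅯ  ᅫ-ᅰ  ᅬ-ᅱ  ᅭ-ᅲ
];

// Map every paired vowel index to its counterpart, in both directions.
const PARTNER = new Map();
for (const [positive, negative] of HARMONY_PAIRS) {
  PARTNER.set(positive, negative);
  PARTNER.set(negative, positive);
}

// Toggle the vowel harmony of one precomposed Hangul syllable.
// Throws for the three vowels without a counterpart (indices 18 to 20)
// and for anything whose computed index is not in the table.
function toggleHarmony(ch) {
  const code = ch.codePointAt(0);
  const vowel = Math.floor((code - 0xAC00) / 28) % 21;
  const partner = PARTNER.get(vowel);
  if (partner === undefined) {
    throw new RangeError("vowel has no harmony counterpart");
  }
  // Moving one vowel step changes the code point by 28 (the number of final slots).
  return String.fromCodePoint(code + (partner - vowel) * 28);
}

console.log(toggleHarmony("뷁")); // 봵
console.log(toggleHarmony("냥")); // 녕
```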

I/O format

Flexible. In particular, I/O as Unicode code points is okay.

Examples

뷁 → 봵
냥 → 녕
멍 → 망
망 → 멍
asked Feb 27 at 2:30
\$\endgroup\$
  • \$\begingroup\$ I tried answering this in Retina but it came out at 2399 bytes... \$\endgroup\$ Commented Feb 28 at 0:40

4 Answers

\$\begingroup\$

JavaScript (Node.js), 47 bytes

x=>x+28*((x=(x+68)/28%21)<4?4:x<8?-4:x<13?5:-5)

Try it online!

JavaScript (Node.js), 49 bytes by Arnauld

x=>x+((x%=588)<44|x>519?4:x<156?-4:x<296?5:-5)*28

Try it online!

JavaScript (Node.js), 50 bytes

x=>x+((x+68)%588<224?-4:5)*((x-44)%588<252||-1)*28

Try it online!
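A note on why the constants work, as far as I can tell: 0xAC00 + 68 = 44100 = 75 × 588, and 588 = 21 × 28 is the number of syllables per initial consonant, so adding 68 aligns the block to a multiple of 588. The sketch below is a small checking harness, not part of the answer. It compares the three functions against a plain offset table over every valid syllable; `OFFSETS` and `countMismatches` are hypothetical names.

```javascript
// The three submissions, copied verbatim; each maps a code point to a code point.
const f47 = x=>x+28*((x=(x+68)/28%21)<4?4:x<8?-4:x<13?5:-5);
const f49 = x=>x+((x%=588)<44|x>519?4:x<156?-4:x<296?5:-5)*28;
const f50 = x=>x+((x+68)%588<224?-4:5)*((x-44)%588<252||-1)*28;

// Offset to add to each vowel index 0..17; indices 18..20 are invalid input.
const OFFSETS = [4, 4, 4, 4, -4, -4, -4, -4, 5, 5, 5, 5, 5, -5, -5, -5, -5, -5];

// Count the valid syllables for which f disagrees with the table.
function countMismatches(f) {
  let mismatches = 0;
  for (let code = 0xAC00; code <= 0xD7A3; code++) {
    const vowel = Math.floor((code - 0xAC00) / 28) % 21;
    if (vowel >= 18) continue; // no counterpart, skipped as invalid input
    if (f(code) !== code + OFFSETS[vowel] * 28) mismatches++;
  }
  return mismatches;
}

console.log(countMismatches(f47), countMismatches(f49), countMismatches(f50));
```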

answered Feb 27 at 2:53
\$\endgroup\$
  • \$\begingroup\$ 49 bytes, I think. \$\endgroup\$ Commented Feb 27 at 15:34
\$\begingroup\$

x86-64 machine code, 26 bytes

8D 97 54 54 FF FF 6A AC 58 01 C2 78 0A 04 E4 7B F5 01 C2 79 F4 F7 D8 01 F8 C3

Try it online!

Following the standard calling convention for Unix-like systems (from the System V AMD64 ABI), this takes a 32-bit integer in EDI and returns a 32-bit integer in EAX.

The offsets that need to be added to the vowel indices are [4, 4, 4, 4, -4, -4, -4, -4, 5, 5, 5, 5, 5, -5, -5, -5, -5, -5, ?, ?, ?]. The ?s are don't-care values, which are set to -3 to fit the pattern of n repeats of n and -n. Because the code works on the combined character code, the pattern is scaled up by a factor of 28.

In assembly:

f: lea edx, [rdi - 43948] # Set EDX to the character code minus 43948.
 # With this offset, the Hangul characters start at 28*3.
r3: push -84; pop rax # Set EAX to -84 = -28 * 3.
r: add edx, eax # Add EAX to EDX.
 js e # Jump if the result is negative (offset in EAX).
 add al, -28 # Decrease EAX by 28, using its low byte for shortness.
 jpo r3 # Jump back if the sum of the low 8 bits is odd
 # (which occurs at -28 * 6) to reset EAX.
 add edx, eax # Add EAX to EDX.
 jns r # Jump if the result is not negative.
 neg eax # Negate EAX (to reverse the offset).
e: add eax, edi # Add EDI (the original character code) to EAX.
 ret # Return.
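The control flow may be easier to follow in a high-level form. The following is a rough JavaScript transliteration of the loop, written as a sketch: `rest` plays the role of EDX, `n * 28` the magnitude held in EAX, and `toggleByCountdown` is a hypothetical name.

```javascript
// Rough transliteration of the assembly above (a sketch, not the author's code).
function toggleByCountdown(code) {
  // lea edx, [rdi - 43948]: shift so the Hangul block starts at 28 * 3.
  let rest = code - 0xAC00 + 3 * 28;
  let n = 3; // EAX = -28 * n
  while (true) {
    rest -= n * 28;                      // r:  add edx, eax
    if (rest < 0) return code - n * 28;  //     js e  (negative offset)
    n += 1;                              //     add al, -28
    if (n === 6) { n = 3; continue; }    //     jpo r3 (reset after the run of -5s)
    rest -= n * 28;                      //     add edx, eax
    if (rest < 0) return code + n * 28;  //     neg eax (positive offset)
    // otherwise: jns r, loop back with the same n
  }
}

console.log(String.fromCodePoint(toggleByCountdown("멍".codePointAt(0)))); // 망
```

Each pass subtracts one run of the pattern (3 × -3, 4 × +4, 4 × -4, 5 × +5, 5 × -5), which adds up to 21 vowel slots, so the cycle repeats once per initial consonant.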
answered Feb 27 at 18:51
\$\endgroup\$
\$\begingroup\$

Uiua 0.15.0-dev.2, <s>32</s> 31 bytes SBCS

⍜(◿21÷28+68)(⨬(+8◿10+1)◿8⊸<4-4)

Try on Uiua Pad!

Takes a Unicode code point as input and output (the link uses under F @\0 to convert to and from code points).

-1 byte inspired by l4m2's comment

Explanation

⍜(◿21÷28+68)(⨬(+8◿10+1)◿8⊸<4-4)
⍜( ) # do this first, then undo at the end:
 ◿21÷28+68 # add 68, divide by 28, mod 21
 -4 # minus 4
 ⨬ ⊸<4 # test if less than 4
 ◿8 # if so, mod 8
 (+8◿10+1) # if not, add 1, mod 10, add 8
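The same modular trick can be sketched in JavaScript on integer vowel indices. This is an approximation of the steps above, not a translation of Uiua semantics: JavaScript's `%` keeps the sign of the dividend, so the "minus 4, mod 8" step is written as "plus 4, mod 8", which gives the same result for indices 0 to 7. `toggleVowelIndex` and `toggleCodePoint` are hypothetical names.

```javascript
// Map a vowel index (0..17) to the index of its harmony counterpart.
function toggleVowelIndex(v) {
  return v < 8
    ? (v + 4) % 8          // 0..3 <-> 4..7, same as (v - 4) mod 8 with a floored modulus
    : ((v - 3) % 10) + 8;  // 8..12 <-> 13..17: subtract 4, add 1, mod 10, add 8
}

// Apply the index mapping to a whole code point.
function toggleCodePoint(code) {
  const v = Math.floor((code + 68) / 28) % 21; // add 68, divide by 28, mod 21
  return code + (toggleVowelIndex(v) - v) * 28;
}

console.log(String.fromCodePoint(toggleCodePoint("망".codePointAt(0)))); // 멍
```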
answered Feb 27 at 3:29
\$\endgroup\$
  • \$\begingroup\$ Does dividing first make it longer? \$\endgroup\$ Commented Feb 27 at 3:33
\$\begingroup\$

Charcoal, 38 bytes

c/o+c×28I§⪪"{⊞¶∨A⧴⧴C〜pNo↙✳⊖@";"2÷+68c/oθ28

Try it online! Link is to verbose version of code. I/O is in characters. Explanation:

 θ Input character
 c/o Take the ordinal
 + Plus
 "..." Compressed look-up table of offsets
 ⪪ Split into substrings of length
 2 Literal integer `2`
 § Indexed by
 θ Input character
 c/o Take the ordinal
 + Plus
 68 Literal integer `68`
 ÷ Integer divided by
 28 Literal integer `28`
 I Cast to integer
 × Multiplied by
 28 Literal integer `28`
c/o Convert to character
 Implicitly print

(You can change the I/O format to ordinals by replacing all of the c/os with Is.)
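For readers who do not know Charcoal, the table-lookup approach can be sketched in JavaScript. The table contents below are an assumption: the actual compressed string is not decoded here, and the `+0` entries for the three invalid vowels are placeholders. The sketch also assumes that the index wraps around cyclically, which it emulates with an explicit modulus; `TABLE`, `CHUNKS` and `toggleByTable` are hypothetical names.

```javascript
// Assumed table: one two-character signed offset per vowel index (21 entries).
// The last three entries belong to invalid input and are placeholders.
const TABLE = "+4+4+4+4-4-4-4-4+5+5+5+5+5-5-5-5-5-5+0+0+0";
const CHUNKS = TABLE.match(/.{2}/g); // split into substrings of length 2

function toggleByTable(ch) {
  const code = ch.codePointAt(0);
  // (ordinal + 68) integer-divided by 28, wrapped to the table length.
  const index = Math.floor((code + 68) / 28) % CHUNKS.length;
  // Cast the chunk to an integer, multiply by 28, add to the ordinal.
  return String.fromCodePoint(code + Number(CHUNKS[index]) * 28);
}

console.log(toggleByTable("냥")); // 녕
```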

answered Feb 28 at 1:00
\$\endgroup\$
