Characters
Let’s call these Unicode characters English IPA consonants:
bdfhjklmnprstvwzðŋɡʃʒθ
And let’s call these Unicode characters English IPA vowels:
aeiouæɑɔəɛɜɪʊʌː
(Yes, ː is just the long vowel mark, but treat it as a vowel for the purpose of this challenge.)
Finally, these are primary and secondary stress marks:
ˈˌ
Note that
ɡ(U+0261) is not a lowercase g, and the primary stress markerˈ(U+02C8) is not an apostrophe, andː(U+02D0) is not a colon.
Your task
Given a word, stack the vowels on top of the consonants they follow, and place the stress markers beneath the consonants they precede. (As the question title hints, such a writing system, where consonant-vowel sequences are packed together as a unit, is called an abugida.) Given the input ˈbætəlʃɪp, produce the output:
æə ɪ
btlʃp
ˈ
A word is guaranteed to be a string of consonants, vowels, and stress marks, as defined above. There will never be consecutive stress marks, and they will always be placed at the start of the word and/or before a consonant.
Test cases
There may be consecutive vowels. For example, kənˌɡrætjʊˈleɪʃən becomes
ɪ
ə æ ʊeə
knɡrtjlʃn
ˌ ˈ
If a word starts with a vowel, print it on the "baseline" with the consonants: əˈpiːl becomes
ː
i
əpl
ˈ
A test case with an initial, stressed vowel: ˈælbəˌtrɔs becomes
ə ɔ
ælbtrs
ˈ ˌ
A long word: ˌsuːpərˌkaləˌfrædʒəˌlɪstɪˌkɛkspiːæləˈdoʊʃəs becomes
æ
ː ː ʊ
uə aə æ əɪ ɪɛ iəoə
sprklfrdʒlstkkspldʃs
ˌ ˌ ˌ ˌ ˌ ˈ
A nonsense example with an initial diphthong, lots of vowel stacking, and no stress markers: eɪbaeioubaabaaa becomes
u
o
i a
eaa
ɪaaa
ebbb
Reference implementation
Your program should produce the same output as this Python script:
consonants = 'bdfhjklmnprstvwzðŋɡʃʒθ'
vowels = 'aeiouæɑɔəɛɜɪʊʌː'
stress_marks = 'ˈˌ'
def abugidafy(word):
tiles = dict()
x = y = 0
is_first = True
for c in word:
if c in stress_marks:
tiles[x + 1, 1] = c
elif c in consonants or is_first:
y = 0
x += 1
tiles[x, y] = c
is_first = False
elif c in vowels:
y -= 1
tiles[x, y] = c
is_first = False
else:
raise ValueError('Not an IPA character: ' + c)
xs = [x for (x, y) in tiles.keys()]
ys = [y for (x, y) in tiles.keys()]
xmin, xmax = min(xs), max(xs)
ymin, ymax = min(ys), max(ys)
lines = []
for y in range(ymin, ymax + 1):
line = [tiles.get((x, y), ' ') for x in range(xmin, xmax + 1)]
lines.append(''.join(line))
return '\n'.join(lines)
print(abugidafy(input()))
Rules
You may write a function or a full program.
If your program has a Unicode character/string type, you can assume inputs and outputs use those. If not, or you read/write from STDIN, use the UTF-8 encoding.
You may produce a string containing newlines, or a list of strings representing rows, or an array of Unicode characters.
Each row of output may contain any amount of trailing spaces. If you produce a string, it may have a single trailing newline.
Your program should produce the correct output for arbitrarily long words with arbitrarily long vowel chains, but may assume that the input word is always valid.
If there are no stress markers, your output may optionally include a final empty row (containing nothing, or spaces).
The shortest answer (in bytes) wins.
4 Answers 4
NARS2000 APL, 138 bytes
⍉⌽⊃E,⍨ ̈↓∘' ' ̈∨/ ̈∊∘M ̈E←(1+(W∊M←'ˌˈ')++\W∊'bdfhjklmnprstvwzðŋɡʃʒθ')⊂W←⍞
Python, 222 bytes
(202 characters)
import re
def f(s):y=[w[0]in'ˈˌ'and w or' '+w for w in re.split('([ˈˌ]?[bdfhjklmnprstvwzðŋɡʃʒθ]?[aeiouæɑɔəɛɜɪʊʌː]*)',s)[1::2]];return[[x[i-1:i]or' 'for x in y]for i in range(max(len(w)for w in y),0,-1)]
Returns an array of unicode characters with an array for each row (containing single spaces for each space required)
Not sure where one can get decent output online yet (and I haven't even got the tools to test it properly here either).
I have loaded a version to ideone that just uses English consonants and vowels with , and . as stress marks, where I have fudged the test cases to conform.
JavaScript (ES6), 181 bytes
f=
s=>(a=s.match(/[ˈˌ]?.[aeiouæɑɔəɛɜɪʊʌː]*/g).map(s=>/[ˈˌ]/.test(s)?s:` `+s)).map(s=>(l=s.length)>m&&(t=s,m=l),m=0)&&[...t].map(_=>a.map(s=>s[m]||` `,--m).join``).join`
`
;
<input oninput=o.textContent=f(this.value)><pre id=o>
Go, 609 bytes
import."strings"
type P struct{x,y int}
func M(s[]int)(m,n int){for _,e:=range s{if e<=m{m=e};if e>=n{n=e}};return}
func f(s string)(L string){T,x,y,F,R:=make(map[P]rune),0,0,1>0,ContainsRune
for _,r:=range s{if R("ˈˌ",r){T[P{x+1,1}]=r}else if R("bdfhjklmnprstvwzðŋɡʃʒθ",r)||F{y=0;x++;T[P{x,y}],F=r,1<0}else if R("aeiouæɑɔəɛɜɪʊʌː",r){y--;T[P{x,y}],F=r,1<0}}
var u,v[]int
for k:=range T{u,v=append(u,k.x),append(v,k.y)}
for y,Y:=M(v);y<Y+1;y++{o:=[]string{}
for x,X:=M(u);x<X+1;x++{if r,ok:=T[P{x,y}];ok{o=append(o,string(r))}else{o=append(o," ")}}
L+=TrimPrefix(Join(o,"")," ")+"\n"}
return}
A direct port of the reference implementation.
Ungolfed Explanation
// map of x-y coords to a character
type tile map[P]rune
func (t tile) Xs() (o []int) {
for k := range t {
o = append(o, k.x)
}
return
}
func (t tile) Ys() (o []int) {
for k := range t {
o = append(o, k.y)
}
return
}
// coordinate pair
type P struct{ x, y int }
func min[T int](s []T) T {
var m T
for _, e := range s {
if e <= m {
m = e
}
}
return m
}
func max[T int](s []T) T {
var m T
for _, e := range s {
if e >= m {
m = e
}
}
return m
}
func f(s string) string {
C, V, S := "bdfhjklmnprstvwzðŋɡʃʒθ", "aeiouæɑɔəɛɜɪʊʌː", "ˈˌ" // constant strs
tiles := make(tile) // the actual map
x, y := 0, 0 // (x0,y0) is the leftmost center
isFirst := true // is this character the first of a line?
for _, r := range s { // for each char...
if strings.ContainsRune(S, r) { // if it's stress...
tiles[P{x + 1, 1}] = r // put it underneath the next syllable
} else if strings.ContainsRune(C, r) || isFirst { // if it's a consonant or the first letter...
y = 0
x++
tiles[P{x, y}] = r // place on the baseline
isFirst = false
} else if strings.ContainsRune(V, r) { // if's a vowel...
y--
tiles[P{x, y}] = r // place 1 above the baseline at the current x
isFirst = false
}
}
// get the ranges for outputting into a string
xs, ys := tiles.Xs(), tiles.Ys()
xmin, xmax := min(xs), max(xs)
ymin, ymax := min(ys), max(ys)
lines := []string{}
for y := ymin; y < ymax+1; y++ { // for each vowel at height y...
line := func() (o []string) {
for x := xmin; x < xmax+1; x++ { // for each syllable on that vowel height...
if r, ok := tiles[P{x, y}]; ok { // get the char
o = append(o, string(r))
} else {
o = append(o, " ") // use a space if there is no vowel there
}
}
return
}()
lines = append(lines, strings.TrimPrefix(strings.Join(line, ""), " ")) // add to the output
}
return strings.Join(lines, "\n") // return the string
}
ɜ, you left it out :-) And British will complain about theirɒ\$\endgroup\$ɜ, so this should be a full General American vowel set now. \$\endgroup\$biiiiiiiiiiiʒ(As in "not the bees") \$\endgroup\$