Phonemic Abugida

Question 1

Characters

Let’s call these Unicode characters English IPA consonants:

bdfhjklmnprstvwzðŋɡʃʒθ

And let’s call these Unicode characters English IPA vowels:

aeiouæɑɔəɛɜɪʊʌː

(Yes, ː is just the long vowel mark, but treat it as a vowel for the purpose of this challenge.)

Finally, these are primary and secondary stress marks:

ˈˌ

Note that ɡ (U+0261) is not a lowercase g, the primary stress marker ˈ (U+02C8) is not an apostrophe, and ː (U+02D0) is not a colon.
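Since several of these characters have ASCII look-alikes, checking code points first can save some debugging. The sketch below is one possible way to classify a single character; the function name `classify` and the labels it returns are illustrative choices, and only the three character sets come from the challenge text.

```python
CONSONANTS = 'bdfhjklmnprstvwzðŋɡʃʒθ'
VOWELS = 'aeiouæɑɔəɛɜɪʊʌː'
STRESS_MARKS = 'ˈˌ'


def classify(ch):
    """Return 'consonant', 'vowel', 'stress', or None for one character.

    Hypothetical helper: the name and labels are not part of the challenge.
    Assumes ch is exactly one character.
    """
    if ch in CONSONANTS:
        return 'consonant'
    if ch in VOWELS:
        return 'vowel'
    if ch in STRESS_MARKS:
        return 'stress'
    return None


# Each IPA character is followed by its ASCII look-alike.
for ch in ['ɡ', 'g', 'ˈ', "'", 'ː', ':']:
    print(f'U+{ord(ch):04X} {ch!r} -> {classify(ch)}')
```

The ASCII look-alikes fall through to `None`, which is a quick way to spot a pasted `g`, apostrophe, or colon in a test word.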

Your task

Given a word, stack the vowels on top of the consonants they follow, and place the stress markers beneath the consonants they precede. (As the question title hints, such a writing system, where consonant-vowel sequences are packed together as a unit, is called an abugida.) Given the input ˈbætəlʃɪp, produce the output:

æə ɪ
btlʃp
ˈ

A word is guaranteed to be a string of consonants, vowels, and stress marks, as defined above. There will never be consecutive stress marks, and they will always be placed at the start of the word and/or before a consonant.
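Answers may assume valid input, but a checker for these guarantees can be handy when writing your own test words. This is a minimal sketch under a literal reading of the paragraph above; `is_valid_word` is a hypothetical helper, not something the challenge defines.

```python
CONSONANTS = 'bdfhjklmnprstvwzðŋɡʃʒθ'
VOWELS = 'aeiouæɑɔəɛɜɪʊʌː'
STRESS_MARKS = 'ˈˌ'


def is_valid_word(word):
    """Check the stated input guarantees (hypothetical helper)."""
    allowed = CONSONANTS + VOWELS + STRESS_MARKS
    for i, ch in enumerate(word):
        if ch not in allowed:
            return False
        if ch in STRESS_MARKS:
            nxt = word[i + 1:i + 2]  # '' if the mark is the last character
            if nxt and nxt in STRESS_MARKS:
                return False  # consecutive stress marks
            if i != 0 and (not nxt or nxt not in CONSONANTS):
                return False  # not at the start and not before a consonant
    return True


print(is_valid_word('ˈælbəˌtrɔs'))  # stress before an initial vowel is fine
print(is_valid_word('bəˈæ'))        # stress before a non-initial vowel is not
```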

Test cases

There may be consecutive vowels. For example, kənˌɡrætjʊˈleɪʃən becomes

      ɪ
ə  æ ʊeə
knɡrtjlʃn
  ˌ   ˈ

If a word starts with a vowel, print it on the "baseline" with the consonants: əˈpiːl becomes

 ː
 i
əpl
 ˈ

A test case with an initial, stressed vowel: ˈælbəˌtrɔs becomes

  ə ɔ
ælbtrs
ˈ  ˌ

A long word: ˌsuːpərˌkaləˌfrædʒəˌlɪstɪˌkɛkspiːæləˈdoʊʃəs becomes

               æ
ː              ː ʊ
uə aə æ əɪ ɪɛ  iəoə
sprklfrdʒlstkkspldʃs
ˌ  ˌ ˌ   ˌ  ˌ    ˈ

A nonsense example with an initial diphthong, lots of vowel stacking, and no stress markers: eɪbaeioubaabaaa becomes

 u
 o
 i a
 eaa
ɪaaa
ebbb

Reference implementation

Your program should produce the same output as this Python script:

consonants = 'bdfhjklmnprstvwzðŋɡʃʒθ'
vowels = 'aeiouæɑɔəɛɜɪʊʌː'
stress_marks = 'ˈˌ'

def abugidafy(word):
    tiles = dict()
    x = y = 0
    is_first = True
    for c in word:
        if c in stress_marks:
            tiles[x + 1, 1] = c
        elif c in consonants or is_first:
            y = 0
            x += 1
            tiles[x, y] = c
            is_first = False
        elif c in vowels:
            y -= 1
            tiles[x, y] = c
            is_first = False
        else:
            raise ValueError('Not an IPA character: ' + c)
    xs = [x for (x, y) in tiles.keys()]
    ys = [y for (x, y) in tiles.keys()]
    xmin, xmax = min(xs), max(xs)
    ymin, ymax = min(ys), max(ys)
    lines = []
    for y in range(ymin, ymax + 1):
        line = [tiles.get((x, y), ' ') for x in range(xmin, xmax + 1)]
        lines.append(''.join(line))
    return '\n'.join(lines)

print(abugidafy(input()))

Try it on Ideone.

Rules

You may write a function or a full program.
If your program has a Unicode character/string type, you can assume inputs and outputs use those. If not, or you read/write from STDIN, use the UTF-8 encoding.
You may produce a string containing newlines, or a list of strings representing rows, or an array of Unicode characters.
Each row of output may contain any amount of trailing spaces. If you produce a string, it may have a single trailing newline.
Your program should produce the correct output for arbitrarily long words with arbitrarily long vowel chains, but may assume that the input word is always valid.
If there are no stress markers, your output may optionally include a final empty row (containing nothing, or spaces).
The shortest answer (in bytes) wins.
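Because the rules allow several output shapes, trailing spaces, and an optional empty final row, comparing an answer's output with the expected output takes a little normalisation. The following is a sketch of one way to do that; `normalize` and `same_output` are hypothetical names, and the comparison is slightly more lenient than the rules (it ignores any number of empty trailing rows).

```python
def normalize(output):
    """Reduce an answer's output to a canonical list of row strings.

    Accepts a newline-separated string, a list of row strings, or a list of
    lists of single characters. Hypothetical helper, not from the challenge.
    """
    rows = output.split('\n') if isinstance(output, str) else list(output)
    rows = [''.join(row).rstrip(' ') for row in rows]  # trailing spaces are allowed
    while rows and rows[-1] == '':                     # trailing newline or empty final row
        rows.pop()
    return rows


def same_output(actual, expected):
    """True if two outputs agree once the permitted variations are removed."""
    return normalize(actual) == normalize(expected)


print(same_output('æə ɪ \nbtlʃp\nˈ    \n', ['æə ɪ', 'btlʃp', 'ˈ']))
```

Leading spaces are kept, since they determine which column a vowel or stress mark belongs to.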

Question 2

Poor ɜ, you left it out :-) And the British will complain about their ɒ

Question 3

Oops, I did! I added ɜ, so this should be a full General American vowel set now.

Question 4

Should each of these characters count as one byte in whichever language is used, regardless of its code page, to strike a balance between competing golfing languages? Or is it part of the challenge, in your opinion, to find which language can actually do it in the fewest bytes, period?

Question 5

Is there a maximum number of vowels after a consonant that our program should recognize? If not, add a test case like biiiiiiiiiiiʒ (as in "not the bees").
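For what it's worth, the reference implementation moves one row up for every vowel that follows a consonant, so the output height simply grows with the longest vowel chain. Here is a rough sketch of that relationship; `output_height` is a hypothetical helper and assumes a valid word containing at least one consonant or vowel.

```python
CONSONANTS = 'bdfhjklmnprstvwzðŋɡʃʒθ'
STRESS_MARKS = 'ˈˌ'


def output_height(word):
    """Rows needed: longest vowel chain + baseline + optional stress row."""
    longest = run = 0
    has_stress = False
    first = True
    for ch in word:
        if ch in STRESS_MARKS:
            has_stress = True
        elif ch in CONSONANTS or first:
            run = 0        # a new column starts on the baseline
            first = False
        else:
            run += 1       # each further vowel stacks one row higher
            longest = max(longest, run)
    return longest + 1 + has_stress


word = 'b' + 'i' * 11 + 'ʒ'
print(word, output_height(word))
```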

Question 6

@JonathanAllan The latter; Unicode I/O is part of the challenge. I'll add a note about that.
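To make the scoring consequence concrete: in UTF-8, each non-ASCII character used in this challenge takes two bytes, so character count and byte count diverge. A small sketch, where `utf8_score` is a hypothetical helper name:

```python
def utf8_score(source):
    """Byte count of a submission when it is stored as UTF-8."""
    return len(source.encode('utf-8'))


consonants = 'bdfhjklmnprstvwzðŋɡʃʒθ'
# 16 ASCII letters at one byte each plus 6 IPA letters at two bytes each.
print(len(consonants), utf8_score(consonants))  # 22 28
```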

Question 7

NARS2000 APL, 138 bytes

⍉⌽⊃E,⍨¨↓∘' '¨∨/¨∊∘M¨E←(1+(W∊M←'ˌˈ')++\W∊'bdfhjklmnprstvwzðŋɡʃʒθ')⊂W←⍞

Question 8

You can remove the initial ⍞← as output is implied. Also, byte count should be exactly twice the character count, as per this. So this should be 138 bytes.

Question 9

Python, 222 bytes

(202 characters)

import re
def f(s):y=[w[0]in'ˈˌ'and w or' '+w for w in re.split('([ˈˌ]?[bdfhjklmnprstvwzðŋɡʃʒθ]?[aeiouæɑɔəɛɜɪʊʌː]*)',s)[1::2]];return[[x[i-1:i]or' 'for x in y]for i in range(max(len(w)for w in y),0,-1)]

Returns an array of Unicode characters, with one array per row (containing a single space for each space required).

Not sure where one can get decent output online yet (and I haven't even got the tools to test it properly here either).
I have loaded a version to ideone that just uses English consonants and vowels with , and . as stress marks, where I have fudged the test cases to conform.
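For readers who want to follow the idea without the golfing, here is an ungolfed sketch of the same regex approach. It is not the answer's exact code: it swaps `re.split` for `re.findall` and drops empty matches, because `re.split` treats patterns that can match the empty string differently in newer Python versions. The names `TOKEN` and `rows` are made up for this sketch, and a non-empty valid word is assumed.

```python
import re

# One token per output column: optional stress mark, optional consonant, any vowels.
TOKEN = re.compile('[ˈˌ]?[bdfhjklmnprstvwzðŋɡʃʒθ]?[aeiouæɑɔəɛɜɪʊʌː]*')


def rows(word):
    """Return the output as a list of rows, each a list of single characters."""
    # Pad columns without a stress mark so index 0 is always the stress slot,
    # index 1 the baseline character, and indices 2+ the stacked vowels.
    cols = [t if t[0] in 'ˈˌ' else ' ' + t for t in TOKEN.findall(word) if t]
    height = max(len(c) for c in cols)
    # Read every column from its highest slot down to the stress slot.
    return [[c[i - 1:i] or ' ' for c in cols] for i in range(height, 0, -1)]


for row in rows('ˈbætəlʃɪp'):
    print(''.join(row))
```

Like the golfed version, this always emits the stress row, which the rules permit even when it contains only spaces.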

Question 10

JavaScript (ES6), 181 bytes

f=
s=>(a=s.match(/[ˈˌ]?.[aeiouæɑɔəɛɜɪʊʌː]*/g).map(s=>/[ˈˌ]/.test(s)?s:` `+s)).map(s=>(l=s.length)>m&&(t=s,m=l),m=0)&&[...t].map(_=>a.map(s=>s[m]||` `,--m).join``).join`
`
;

<input oninput=o.textContent=f(this.value)><pre id=o>

Question 11

Go, 609 bytes

import."strings"
type P struct{x,y int}
func M(s[]int)(m,n int){for _,e:=range s{if e<=m{m=e};if e>=n{n=e}};return}
func f(s string)(L string){T,x,y,F,R:=make(map[P]rune),0,0,1>0,ContainsRune
for _,r:=range s{if R("ˈˌ",r){T[P{x+1,1}]=r}else if R("bdfhjklmnprstvwzðŋɡʃʒθ",r)||F{y=0;x++;T[P{x,y}],F=r,1<0}else if R("aeiouæɑɔəɛɜɪʊʌː",r){y--;T[P{x,y}],F=r,1<0}}
var u,v[]int
for k:=range T{u,v=append(u,k.x),append(v,k.y)}
for y,Y:=M(v);y<Y+1;y++{o:=[]string{}
for x,X:=M(u);x<X+1;x++{if r,ok:=T[P{x,y}];ok{o=append(o,string(r))}else{o=append(o," ")}}
L+=TrimPrefix(Join(o,"")," ")+"\n"}
return}

Attempt This Online!

A direct port of the reference implementation.

Ungolfed Explanation

import "strings"

// map of x-y coords to a character
type tile map[P]rune
func (t tile) Xs() (o []int) {
	for k := range t {
		o = append(o, k.x)
	}
	return
}
func (t tile) Ys() (o []int) {
	for k := range t {
		o = append(o, k.y)
	}
	return
}
// coordinate pair
type P struct{ x, y int }
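// min starts from the zero value, so it returns the smaller of 0 and the
// smallest element; for the 1-based x coordinates that makes xmin 0, not 1
func min[T int](s []T) T {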
	var m T
	for _, e := range s {
		if e <= m {
			m = e
		}
	}
	return m
}
func max[T int](s []T) T {
	var m T
	for _, e := range s {
		if e >= m {
			m = e
		}
	}
	return m
}
func f(s string) string {
	C, V, S := "bdfhjklmnprstvwzðŋɡʃʒθ", "aeiouæɑɔəɛɜɪʊʌː", "ˈˌ" // constant strs
	tiles := make(tile) // the actual map
	x, y := 0, 0 // current column and row; row 0 is the baseline, columns start at 1
	isFirst := true // is this the first non-stress character of the word?
	for _, r := range s { // for each char...
		if strings.ContainsRune(S, r) { // if it's stress...
			tiles[P{x + 1, 1}] = r // put it underneath the next syllable
		} else if strings.ContainsRune(C, r) || isFirst { // if it's a consonant or the first letter...
			y = 0
			x++
			tiles[P{x, y}] = r // place on the baseline
			isFirst = false
		} else if strings.ContainsRune(V, r) { // if it's a vowel...
			y--
			tiles[P{x, y}] = r // place one row above the previous tile at the current x
			isFirst = false
		}
	}
	// get the ranges for outputting into a string
	xs, ys := tiles.Xs(), tiles.Ys()
	xmin, xmax := min(xs), max(xs)
	ymin, ymax := min(ys), max(ys)
	lines := []string{}
	for y := ymin; y < ymax+1; y++ { // for each output row, top to bottom...
		line := func() (o []string) {
			for x := xmin; x < xmax+1; x++ { // for each column in that row...
				if r, ok := tiles[P{x, y}]; ok { // get the char
					o = append(o, string(r))
				} else {
					o = append(o, " ") // use a space if there is no vowel there
				}
			}
			return
		}()
		lines = append(lines, strings.TrimPrefix(strings.Join(line, ""), " ")) // drop the always-blank column 0 (xmin is 0) and add the row to the output
	}
	return strings.Join(lines, "\n") // return the string
}

Attempt This Online!

The NARS2000 APL answer above was posted by Oberon (answered Sep 10, 2016 at 19:51, edited Sep 12, 2016 at 10:03) and is the accepted answer. The comment on it is by Adám (Sep 12, 2016 at 7:34).
