Friday, 30 July 2010

gut, foot, hoot


Warren Maguire’s website has a nice map of the British Isles showing preliminary results of his survey of answers to the question
Which of the words gut, foot and hoot rhyme for you?

The coloured dots on his map nicely display the distribution in the country of the three typical setups. In Scotland and Northern Ireland foot and hoot rhyme (the ‘Scottish’ system, blue dots). Everywhere else they don’t. In the north of England foot and gut rhyme (the ‘Northern’ system, yellow dots). In the south of England, and in non-Scottish, non-Northern English in general, none of the three rhyme (the ‘Southern’ system, red dots).
             gut   foot   hoot
‘Scottish’   ʌ     u      u      blue
‘Northern’   ʊ     ʊ      uː     yellow
‘Southern’   ʌ     ʊ      uː     red

Because people were allowed to give more than one answer, there are also mixed possibilities, as in Warren’s own Northern Irish speech, in which (green dots) foot can rhyme either with gut (fʌt-ɡʌt) or with hoot (fʉt-hʉt), but presumably not both at once.
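The three systems, plus the mixed case, come down to two yes/no answers: does foot rhyme with gut, and does foot rhyme with hoot? Here is a minimal Python sketch of that classification. The function name and the way the answers are encoded are my own assumptions for illustration, not anything taken from the survey itself.

```python
def classify(foot_gut, foot_hoot):
    """Name the rhyme system implied by a speaker's two answers.

    foot_gut:  True if the speaker says foot rhymes with gut
    foot_hoot: True if the speaker says foot rhymes with hoot
    """
    if foot_gut and foot_hoot:
        return "mixed"      # both answers given, as in the green-dot case
    if foot_hoot:
        return "Scottish"   # blue
    if foot_gut:
        return "Northern"   # yellow
    return "Southern"       # red: none of the three rhyme


print(classify(foot_gut=True, foot_hoot=False))
```

A respondent who ticks both options is reporting variation between the two rhymes, not a three-way rhyme, which is why that case gets its own label.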

These keywords represent my own lexical sets STRUT, FOOT, and GOOSE. I didn’t choose gut as a keyword, even though it’s a commoner word than strut, because I judged that one speaker’s gut could well be confused with another speaker’s got.

Thursday, 29 July 2010

two placenames

Two placenames today.

One is Duisburg in Germany, recently in the news because of the tragedy at the Love Parade. In the British media, beside the ‘established anglicization’ (OBGP) ˈdjuːzbɜːɡ, I also heard newsreaders say ˈdjuːɪzbɜːɡ, an obvious spelling pronunciation. In German this place is ˈdyːsbʊʁk, which does not exactly follow the spelling. Personally, given that I learnt German in Kiel in the far north of the country, I tend to pronounce it ˈdyːsbʊɐç (like ˈhambʊɐç Hamburg) unless I remind myself not to.

The other placename is Slaugham, a village just off the main A23 road from London to Brighton, near the intriguingly named Pease Pottage. Driving past, I’ve sometimes idly wondered how this written form is to be interpreted: how do the locals pronounce this name? Does it rhyme with Maugham mɔːm? The answer is no.

Yesterday I was watching a traffic police video programme on television, when the action moved to this area. As the officers in the pursuit car reported their position over the radio I noted with interest that they called it ˈslɑːfəm. So it’s like laughter, not like slaughter. The old BBC Pronouncing Dictionary of British Names says it can be either ˈslɑːfəm or ˈslæfəm, prioritizing the latter.

Wednesday, 28 July 2010

sound comparisons

Warren Maguire’s comments on yesterday’s blog remind me that I have not previously written about the interesting Sound Comparisons website.

This is the showcase for a research project conducted at the University of Edinburgh in 2005–2007. The website offers you ‘sound comparisons’ for about a hundred English words pronounced in fifty or so different native-speaker accents, mainly but not exclusively British. They are presented in narrowish IPA transcription, and for many (but not all) there are sound clips. There’s no connected speech.

Unusually, there are also a dozen or so ‘historical’ accents/varieties covered, ranging from Proto-Germanic to Shakespearean. Strangely, no native speakers of these varieties seem to have been available to offer recordings.

There are also a dozen or so ‘other Germanic’ varieties, giving the cognates in the relevant languages of the items in the English word list. If you’ve always wanted to listen to the Frisian word for ear, this is where to find it. (It’s iˑər, which you could easily take as some kind of post-Gaelic Scottish.)

With my browser at least the sound clips are rather flaky: you tend to get two plays (perhaps overlapping) of the word you ask for, followed shortly or indeed after quite a time by other words seemingly chosen at random.

Presumably because the research money was not renewed, the website now gives the impression of having been abandoned. I hope this is not the case: it would be nice to have the missing sound files, e.g. for London or Norwich. It would be nice to have some connected speech. It would be nice to have more accents represented.

Perhaps Warren Maguire’s ongoing research will fill some of these gaps.

Tuesday, 27 July 2010

k-backing

More than two years ago (blog, 25 Mar 2008) I reported on the work being done by Jenny Cheshire, Sue Fox, Paul Kerswill and Eivind Torgersen on the speech of young Londoners living in the inner city. Traditional Cockney has given way to what they call “Multicultural London English”.

Eivind and Paul (pictured) have now kindly made available to me some sound clips of this new variety. I am not at liberty to let you hear any extended samples, but what I can do is let you listen to one or two words or phrases.

One of the innovations they identify is ‘k-backing’.

We’re used to the idea that velars tend to accommodate to the place of the following vowel, being somewhat fronter before front vowels and backer before back vowels. We routinely compare the initial plosive of keep with that of cool and perhaps use this to illustrate the notion of allophones of a phoneme.

The k-backing innovation is a kind of exaggeration of the backing of velars before back vowels. Rather than a mildly retracted k in words such as car, come, caught, many younger inner-London speakers have a very retracted plosive, perhaps even a uvular q.

Listen to clips of a young Anglo (= ethnically white) speaker pronouncing the phrases he's comi..., coming into it, yeah? and (a) parked car. This speaker has a multicultural friendship network. Note the backed ks, which are typical of such multiculturally-oriented Anglos and of non-Anglos. Anglos whose friendship network is Anglo-only do it just slightly less. Older people don’t do it at all.

No one knows where this innovation comes from.

= = =

In other news, the next International Congress of Phonetic Sciences will be held in Hong Kong from the 17th to the 21st of August 2011. The website has recently gone live here.

Monday, 26 July 2010

disunification (2)

Michael Everson correctly identifies a number of reasons to advocate the disunification of the Latin letters beta, theta, and chi from their Greek versions. If this happened, as IPA symbols we would use the Latin versions rather than the Greek ones.
He quotes briefly, without identifying the source, from the IPA 1949 Principles booklet. Here, more fully, is what it says there (The Principles of the International Phonetic Association, pages 1-2). Although unattributed, these are clearly Daniel Jones’s words.
Note the very clear intention to treat IPA θ (vertical) as distinct from Greek theta (typically oblique). Greek letters are to be incorporated into the IPA only as roman [sic] adaptations.
As Jones says, Greek theta has an alternative form, ϑ. This is encoded at U+03D1, whereas ordinary θ is at U+03B8.

In English printed texts that mix the Latin and Greek scripts, the Greek letters are typically oblique, the Latin ones upright. The purpose is to distinguish clearly between the two scripts (whereas the IPA wants everything in the same script). Here is an example, from Abbott and Mansfield’s Primer of Greek Grammar (my copy printed in 1949).
I think disunification of Latin and Greek beta, theta, chi would be a good thing.

An existing disunification that might be thought surprising is that of the IPA symbol for a voiced velar plosive, ɡ, U+0261, from ordinary lower-case g, U+0067. In many fonts there is no difference in the appearance of these two; in other fonts there is, e.g. in Times New Roman ɡ g (which I hope shows up properly in your browser). The IPA is on record as declaring that the two symbol shapes are equivalent and interchangeable. Nevertheless many phoneticians persist in treating them as distinct, which justifies Unicode’s disunification.
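One practical consequence is that a search for "g" will not find ɡ unless the software folds the two together. Here is a minimal Python sketch using only the standard unicodedata module. It suggests that normalization does not help, since U+0261 has no decomposition to U+0067, so the folding has to be done explicitly. The helper fold_ipa_g is a hypothetical name of my own.

```python
import unicodedata

IPA_G = "\u0261"    # the IPA symbol ɡ
PLAIN_G = "g"       # ordinary lower-case g, U+0067


def same_after_nfkc(a, b):
    """True if two strings become identical under NFKC normalization."""
    return unicodedata.normalize("NFKC", a) == unicodedata.normalize("NFKC", b)


def fold_ipa_g(text):
    """Hypothetical helper: treat the two g shapes as one for searching."""
    return text.replace(IPA_G, PLAIN_G)


print(unicodedata.name(IPA_G))
print(same_after_nfkc(IPA_G, PLAIN_G))
print(fold_ipa_g("\u0261\u028ct") == "g\u028ct")
```

Whether to fold at all is a design choice: it suits searching and sorting, but would discard a distinction that some phoneticians want to keep.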

It is worth noting that a number of obsolete, derecognized former IPA symbols are located in the Unicode block Latin Extended-B. They include ƍ ƞ ƪ ƫ ƺ ƾ ƻ. This is also where we find upper-case versions of certain IPA symbols. These might be used in orthographies, though not in phonetic texts as such: Ɔ Ə Ɛ Ɣ Ɯ Ɵ Ʊ Ʌ. I have sometimes had to correct careless authors who used them in place of the lower-case phonetic symbols.

Another difficult area is that of letters with diacritics. It is possible to encode any such letter by using the base form plus one (or more) of the Combining Diacritical Marks provided in Unicode 0300–036F. However, doing so puts you at the mercy of the designers of fonts, browsers and word processing software, who may or may not have done the necessary work to make diacritics line up correctly above, below, or through the base letter. For “accented” letters used in orthographies Unicode provides separate encoding, as for example in the case of precomposed á ê ï õ ù ă ē į ő ů ç đ ġ ķ ň. However no precomposed combinations are provided for explicitly phonetic use. Obviously, since the range of possible combinations is potentially enormous we cannot expect to have many of these; but it would certainly be convenient to have precomposed versions of the symbols for the French nasalized vowels (blog, 15 July), ɑ̃ ɛ̃ ɔ̃ œ̃, which are abundantly attested in printed texts.
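The difference between the two kinds of letter shows up under Unicode normalization. In this minimal Python sketch (standard library only; nfc_length is my own hypothetical helper), NFC composition turns a base letter plus combining mark into a single code point where a precomposed form exists, and leaves two code points where it does not.

```python
import unicodedata

ACUTE = "\u0301"   # COMBINING ACUTE ACCENT
TILDE = "\u0303"   # COMBINING TILDE


def nfc_length(base, mark):
    """Number of code points left after NFC composition of base + mark."""
    return len(unicodedata.normalize("NFC", base + mark))


# Orthographic a + acute composes to the single precomposed letter U+00E1.
print("a + acute:", nfc_length("a", ACUTE))

# The French nasalized vowel symbols stay as base letter plus combining tilde.
for vowel in "\u0251\u025b\u0254\u0153":
    print(f"U+{ord(vowel):04X} + tilde:", nfc_length(vowel, TILDE))
```

So any software that counts characters, sets cursor positions, or compares strings has to be prepared for the phonetic symbols to occupy two code points each.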

Friday, 23 July 2010

disunification (1)

Consider the following pairs of symbols: a а ä ӓ æ ӕ c с e е è ѐ ë ё i і j ј o о p р s ѕ x х y у. Can you see any difference between the members of each pair? Nor can I. Nor can anyone.

However in each pair the first symbol is a letter of the Latin alphabet, while the second is Cyrillic.

Correspondingly, the two members of each pair have different Unicode encodings. While Latin a is U+0061, Cyrillic а is U+0430. While Latin j is U+006A, Cyrillic ј (used in writing Serbian) is U+0458. And so on.
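You can check these encodings for yourself. Here is a minimal Python sketch using the standard unicodedata module; the three pairs are just samples from the list above, and describe is a hypothetical helper name.

```python
import unicodedata

# A few of the look-alike pairs: (Latin, Cyrillic).
pairs = [("a", "\u0430"), ("j", "\u0458"), ("o", "\u043e")]


def describe(ch):
    """Return 'U+XXXX NAME' for a single character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


for latin, cyrillic in pairs:
    # The glyphs look alike, but the strings never compare equal.
    print(describe(latin), "|", describe(cyrillic), "| equal:", latin == cyrillic)
```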

This situation is convenient in that it keeps all the basic Latin letters together in the block 0021–00FF (I give Unicode numbers in the usual hexadecimal form) and all the Cyrillic letters together in the block 0400–04FF. But it is also highly inconvenient, because it opens up potential breaches in security. Now that non-ASCII letters are allowed in URLs, the fact that two differently coded letters look identical could be exploited for malicious purposes, for phishing or scamming. While www.facebook.com is a website you know and love (or not), “www.fасеbооk.com” would be somewhere quite different. (In the latter case, the Latin a,c,e,o have been replaced by the identical-looking Cyrillic equivalents.)

That is why they tell you not to click on links in emails, but to type them into the browser yourself.

It’s not quite as bad as that, because the domain name authorities will (we hope) refuse to register such deceptive domain names. On the other hand there is nothing to stop someone using this sort of thing as their Facebook name.
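One way such a check might work is to reject any name that mixes scripts. The sketch below is only a rough illustration in Python. The standard library has no script property, so it guesses the script from the first word of each character's Unicode name, which is an assumption of mine and not how any registry is known to do it. The spoofed string is constructed for the example.

```python
import unicodedata


def scripts_used(label):
    """Rough guess at the scripts in a string, taken from the first word
    of each letter's Unicode character name (e.g. 'LATIN', 'CYRILLIC')."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


def looks_mixed(label):
    """True if letters from more than one script appear in the label."""
    return len(scripts_used(label)) > 1


genuine = "facebook"
# Looks the same, but a, c, e, o are the Cyrillic letters.
spoof = "f\u0430\u0441\u0435b\u043e\u043ek"

print(looks_mixed(genuine))
print(looks_mixed(spoof), sorted(scripts_used(spoof)))
```

A name written entirely in look-alike Cyrillic letters would pass this test, so a real check would need a table of confusable characters as well.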

By the time it came to encoding IPA symbols, the Unicode consortium had become aware of this danger and resolved to take a much more conservative line. The new policy was that if two characters (“glyphs”) look the same, then normally they should have the same encoding. That’s why although most phonetic symbols are located in the IPA Extensions block (0250–02AF) some aren’t. We use the basic Latin a b c… rather than having special IPA ones. We also use the “Latin-1 Supplement” coding for the characters æ ç ð ø (U+00E6, U+00E7, U+00F0, U+00F8) since they occur in the ordinary spelling of Danish, French, Icelandic, and Norwegian. We also use the “Latin Extended-A” coding for the ħ (U+0127) used in Maltese orthography, for the œ (U+0153) used in French, and even for the ŋ (U+014B) used in spelling Sami and Mende. None of these is repeated in the IPA Extensions block, though ћ is separately coded for Cyrillic (Serbian, U+045B).

Worse, the phonetic symbols β θ χ (U+03B2, U+03B8, U+03C7) are to be found only in the “Greek and Coptic” block, since they are treated as identical with the Greek letters beta, theta and chi.

Fortunately, our IPA ɫ is not lumped in with Polish ł, nor ɪ (lax front unrounded vowel, small cap i) with Turkish dotless ı or Greek iota ι.

Meanwhile — rather incredibly, and going to the other extreme — our phonetic schwa ə is among the IPA symbols at U+0259, while the identical-appearing ǝ and ә are respectively LATIN SMALL LETTER TURNED E (U+01DD) of the Pan-Nigerian alphabet and CYRILLIC SMALL LETTER SCHWA (U+04D9) as used in Azerbaijani orthography.

The problem we face in all such cases is that of the “unification” versus “disunification” of identical-looking symbols.

More on this next week. Meanwhile, you might like to read Michael Everson’s discussion here.

Thursday, 22 July 2010

crux

A few days ago the Guardian crossword included a word clued in such a way as to require crux to be a homophone of crooks. I remember noticing it at the time I solved the crossword and thinking that I have a distinction between krʌks and krʊks, whereas the compiler, Rufus, presumably did not. But it did not occur to me to write a letter to the editor about it.

Others did not hesitate. Two days ago there was a letter saying that Rufus must be a northerner (and by implication not properly educated) because he pronounces the two words the same. Today someone writes from an address in Greater London as follows.
…whereas “crux” as pronounced oop north rhymes with crooks as pronounced in t’ south … it does not rhyme with crooks in t’ north, where it approximates to “crewks”.

Clear? Let me explain. Popular northern speech merges the STRUT and FOOT sets, making dull rhyme with full and cut with put. (Hence the eye-dialect joke spelling 'oop' for up in the comment.) However among the FOOT words (i.e. words that have ʊ in RP and in ‘General American’) there is a variable subset in which some northerners (and also some Irish people) use a long vowel uː. This subset includes several words spelt -ook, such as book, cook, look… and crook.

This is why you have to be careful when selecting minimal pairs to test for the STRUT-FOOT merger. Cut vs. put is fine; but luck vs. look is not. Nor is crux vs. crooks, the pair at issue here.
                           crux     crooks
most speakers of English   krʌks    krʊks
some northerners           krʊks    krʊks
other northerners          krʊks    kruːks
(As usual the notation ʊ can cover a multitude of qualities for northern speech, ranging from close to mid and from back to central. Some people use a kind of ə for both STRUT and FOOT. The point is not the exact phonetic quality involved but the sameness or differentness of the vowel qualities in particular lexical sets or subsets.)
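The reasoning about test pairs can be put as a small check. A pair is a good diagnostic for the STRUT-FOOT merger only if every group that has the merger rhymes the two words and every group without it does not. The Python sketch below encodes the vowels from the table, together with cut and put as described above. The group labels, the MERGED set, and the function names are my own assumptions for illustration.

```python
# Vowel used in each word by each speaker group.
VOWELS = {
    "most speakers of English": {"cut": "ʌ", "put": "ʊ", "crux": "ʌ", "crooks": "ʊ"},
    "some northerners":         {"cut": "ʊ", "put": "ʊ", "crux": "ʊ", "crooks": "ʊ"},
    "other northerners":        {"cut": "ʊ", "put": "ʊ", "crux": "ʊ", "crooks": "uː"},
}

# Groups that merge STRUT and FOOT.
MERGED = {"some northerners", "other northerners"}


def rhymes(group, w1, w2):
    """True if the group uses the same vowel in both words."""
    return VOWELS[group][w1] == VOWELS[group][w2]


def reliable_test(w1, w2):
    """A pair diagnoses the merger only if merged speakers rhyme it
    and unmerged speakers do not."""
    return all(rhymes(g, w1, w2) == (g in MERGED) for g in VOWELS)


print(reliable_test("cut", "put"))
print(reliable_test("crux", "crooks"))
```

The second pair fails because the ‘other northerners’ have the merger and yet keep crux and crooks apart by means of the long vowel.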

It’s difficult even for those who understand this situation to explain it in simple terms in a line or two of a letter to the editor.