
30 April 2010

Ignite Ithaca talk

this is storming the internets thanks to all my friends promoting it. they will probably drive more traffic than me posting this here, but nevertheless, here it is: my Ignite Ithaca talk entitled "Why nobody ever taught you how to write good (and what you can do about it)".

[embedded object: http://www.youtube.com/v/DkkZ77lnuas&hl=en_US&fs=1&]

Posted by Ed Cormany at 10:28 PM 5 comments


01 February 2010

newsflash: language peeves potentially irritable

coming across the twitter tubes this morning (via @jillianp), this story out of New Zealand: "Research could dismay English language purists". in other news, "Water is wet", "Vegetarians not so keen on meat", etc. etc.


i shouldn't mock the bit of news that prompted this piece: a USD $400K+ grant to do a massive morphological survey of English. this could be incredibly useful. it's just the "context" that the writer put it in. gems like:
It is the first time the morphology of the English language has been looked at in this depth since rules were first laid out in the 19th century.
because before the 19th century, there were no rules! sheer and utter chaos! it's a miracle people could even form words.

but this brings up an interesting question: why did the (mostly inaccurate) grammar texts of the 19th century become sacrosanct to the so-called "purists"? and there's no doubt that they've taken on a mystical value, because they have the ability to trump basic logical reasoning. present a "purist" with two options (150 years of hearsay based on something initially wrong, or the collaborative research of renowned language experts) and they'll choose the former every time. and it can't just be anti-institutional, "down with the man!" sentiment; if that were the case, they should have rejected the prescriptivist poppycock (to borrow Geoff Pullum's term) in the first place.

oh well, we all know that the surest way to go insane is to argue with irrational people. so instead, i'll just pretend that the headline on the story was "Research could be pivotal for English linguists" and go about my day.

Posted by Ed Cormany at 12:18 PM 0 comments


19 January 2010

give and take: math and linguistics

this is a response to the excellent post "Why Linguists Should Study Math" over at The Lousy Linguist, which i found via fellow Cornellian @nmashton on twitter. i was going to just write a comment there, but i realized that it would probably become rather long.


first of all, let me say that i am in absolute agreement with the sentiments put forth by Chris in his post. in fact, i'm going to be auditing the brand new, never-before-offered Statistics for Linguists course this semester. but i think that one major point needs to be added.

simply: there is a grave asymmetry in linguists learning math versus mathematicians (or statisticians, or computer scientists, etc.) learning linguistics.

here's the scenario. you're a grad student in linguistics. this means that you went to high school once, and probably were rather good at most of your subjects, or you wouldn't be a student anymore. in high school, they made you learn math. if you were really good at it, you made it through single-variable calculus; if not, probably trig. even if you didn't like it and haven't touched math since, you should have a decent sense of How Math Works, in case you need to pick it up again.

but the converse just isn't true. i've audited the NLP course at Cornell, which is taught by an excellent professor in the CS department who has a very solid grounding in theoretical linguistics. but that almost doesn't matter given the fact that there are zero prerequisites for the course. that's right, no LING101, no nothing. the demographics of the ~80-person lecture break down roughly as 70 CS undergrads, 9 linguistics undergrads, and 1 lonely linguistics grad student.

so what's the big problem? they'll learn as they go, right? learning by doing is the best way, no? wrong. as has been shown time after time on Language Log and elsewhere for this and other fields (law, education, etc.), these would-be NLPers have a complex against linguistics. i think they recognize that they're uninformed on the finer points of linguistic theory, but because "hell, i speak a language!" they don't think they need any more expertise to solve complex linguistic problems. throw more code at it, throw more servers at it, we can brute force our way through. i've watched them re-invent the wheel, and it's a square wheel with an off-center axis. and they're not looking to refine its design, or ask those crazy round-wheeler linguists what they've got cooking in their lab. instead they're trying to make titanium and carbon-fiber square wheels, thinking that will improve things. the mantra is to strive for good enough rather than (i concede, unattainable) perfection.

i think that linguists are more and more cognizant of the need for mathematical training. and for those who just aren't math types, they're willing to go find fellow linguists who are, or even statisticians and computer scientists outside their departments to collaborate with. but nobody comes knocking on the linguistics department door. it's open, guys, and seriously, you could stand to visit. we won't bite.

Posted by Ed Cormany at 12:15 PM 6 comments


07 December 2009

seek and ye shall not find

morphological revelations on my morning comb through twitter and facebook statuses:

whoa. on the other hand, this isn't entirely unexpected. morphology tends to be entropic, that is, it favors simplicity and regularity and minimal expression, and moves in that direction over time. this doesn't mean the language apocalypse is upon us any more than the heat death of the universe, as predicted by physical entropy, is. just like physical entropy, language entropy can be locally reduced by other factors, particularly token frequency. that is to say—in the broadest terms—speakers are likelier to hang on to irregular forms of words that are used all the time, and tend to regularize words that aren't as common.

that brings me to my "whoa" moment. i just hadn't realized that 'seek' was possibly on the cusp of regularization. so the question is, how does 'seek'/'sought' stack up to other verbs with past tense forms in -ought? to get a comprehensive list, i turned to a reverse dictionary, which yielded just five non-compound -ought pasts: bought, fought, thought, brought, and our test case, sought. next, to test their frequencies, i headed to wordcount.org, a nifty visualization of frequency in the British National Corpus. admittedly the BNC might not give the most precise results for predicting the tendencies of young speakers in Michigan, but it should be accurate enough. here are their ranks (not token counts; smaller numbers indicate higher frequency):

buy/bought: 785/1129
fight/fought: 1484/3204
think/thought: 102/152*
bring/brought: 631/461
seek/sought: 1875/1895

the data reveals that i perhaps shouldn't be as surprised as i was. 'seek' is the least frequent of the five verbs, although strangely 'fought' is the least frequent past tense form. i starred 'thought' since its frequency is probably affected considerably by use of the noun 'thought'. also of note is the fact that 'bring' is the only item whose past tense is more frequent than the base form; this is due to the fact that 'bring' requires a progressive present tense ("I bring the wine" ≠ "I am bringing the wine" but rather "I (habitually) bring the wine"). despite—or perhaps owing in part to—its frequency, 'bring' is subject to taking on a different irregular pattern, 'bring'/'brang'/'brung' in many children's speech and some adult dialects.
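for anyone who wants to double-check the comparisons above, here is a minimal sketch in Python. the rank pairs are copied straight from the list; the variable names (`ranks`, `past_forms` and so on) are just illustrative choices, and nothing here queries wordcount.org or the BNC.

```python
# ranks copied from the list above: verb -> (base-form rank, past-form rank).
# these are ranks, not token counts, so a SMALLER number means MORE frequent.
ranks = {
    "buy": (785, 1129),
    "fight": (1484, 3204),
    "think": (102, 152),   # starred above: 'thought' is also a noun
    "bring": (631, 461),
    "seek": (1875, 1895),
}

# past tense forms, only used for readable output
past_forms = {
    "buy": "bought",
    "fight": "fought",
    "think": "thought",
    "bring": "brought",
    "seek": "sought",
}

# the least frequent item is the one with the LARGEST rank number
least_frequent_base = max(ranks, key=lambda verb: ranks[verb][0])
least_frequent_past = max(ranks, key=lambda verb: ranks[verb][1])

# verbs whose past form outranks (is more frequent than) the base form
past_beats_base = [verb for verb, (base, past) in ranks.items() if past < base]

print("least frequent base form:", least_frequent_base)
print("least frequent past form:", past_forms[least_frequent_past])
print("past more frequent than base:", past_beats_base)
```

since ranks only give an ordering, this can't say how much rarer one form is than another; that would take the underlying token counts.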

anyhow, to wrap this up, it looks like 'sought' might well be the best candidate of these forms to undergo regularization, even if i hadn't expected it before. the only other form that might do the same is 'fought'-->'fighted', but i think that would be even more surprising...i'm actually wondering why its frequency turned out to be so low in the BNC.

a postscript: although i certainly have 'sought' as the past tense of 'seek' in its basic sense "to look for", 'seeked' is also in my lexicon. it's the past tense of the relatively new lexical item 'seek' "to move rapidly through a video or audio clip". 'sought' is terrible as its past tense:

i seeked ahead 2 minutes to skip the commercials.
*i sought ahead 2 minutes to skip the commercials.

this kind of regularization is a common symptom of generating a new, distinct lexical entry from an existing form, cf. the classic case bad/worse/worst vs. bad/badder/baddest.

[UPDATE] regarding 'wrought', which is very low frequency and which i (rightly) eliminated from consideration as not being a productive past form, i commented the following on the ongoing facebook thread that prompted all this:
'wrought' is a strange case...it's actually the old past participle of 'work' (e.g. "wrought iron" = "worked iron" ≠ "wreaked iron"), and the historical past tense of 'wreak' is regular 'wreaked'. they got conflated because both 'work' and 'wreak' were used in the "____ havoc" idiom. since 'wrought' is almost never used outside the idiom any more, it probably doesn't fit into the regularization question here.

Posted by Ed Cormany at 9:16 AM 5 comments


27 August 2009

surprisal for dogs

today's Frazz comic:

i don't know if anybody has actually done research on dogs' abilities to learn frequency-based patterns (although we had cottontop tamarins not so long ago). and unlike the [struck out: grammar] pattern-sensitive monkeys, Mario didn't even wait to confirm the probability-based prediction, he just went for it.

Posted by Ed Cormany at 10:35 AM 1 comments

