Wednesday, December 29, 2004

Tsunami help

tsunamihelp.blogspot.com has rapidly become the central clearinghouse for information about aid efforts after the disaster. Also, closer to home, D. Sivakumar of IBM Almaden is involved with redroots.org:

Red Roots is a network of compassionate individuals who care about the less fortunate. Our immediate goal is to get a million donations to the American Red Cross disaster relief fund for the Asian Tsunami crisis (through Amazon): Go to Amazon and donate 5ドル to the American Red Cross. Call, email, or visit 2-4 individuals you know and get them to do likewise (donate 5,ドル sign up more people). The emphasis here is on the million, less on the 5ドル (or Amazon or Red Cross). If each of us donates and finds two new participants, we can very quickly reach the million mark. If this experiment succeeds, we hope to build a much larger network of compassion whose potential we can only imagine.
Spread the word...

Copyright

Maybe this is what we need to solve Jeff and Alan Selman's problems with copyright:

Science Commons is a new project of Creative Commons and will launch on January 1, 2005.

The mission of Science Commons is to encourage scientific innovation by making it easier for scientists, universities, and industries to use literature, data, and other scientific intellectual property and to share their knowledge with others. Science Commons works within current copyright and patent law to promote legal and technical mechanisms that remove barriers to sharing.


Tuesday, December 28, 2004

Numb3rs

Just saw a preview for the new CBS series Numb3rs. As Lance had previously mentioned, the premise of this series is that an FBI agent uses his math whiz brother to help him solve crimes, CSI-style.

I was worried that all this would do is reinforce traditional stereotypes of mathematicians. At least the actor playing the mathematician doesn't have thick glasses. However the preview is not encouraging. There are exchanges like 'Life is more than just numbers ! Life is all about numbers !', and inane sequences where the math whiz says 'There is no statistical evidence for X', his brother says 'X will happen', and lo and behold, X happens.

(subtext: mathematicians are hopeless geeks out of touch with the real world, but are useful as part of a dog-and-pony show).

sigh...

[Update: actually it wasn't so bad in the actual show. See the review ]

Rudbeckia Hirta has a scenario that the series could use. As an aside, the following story is an amusing counterweight to the "mathematician = number cruncher" stereotype:
One striking characteristic of Grothendieck’s mode of thinking is that it seemed to rely so little on examples. This can be seen in the legend of the so-called “Grothendieck prime”. In a mathematical conversation, someone suggested to Grothendieck that they should consider a particular prime number. “You mean an actual number?” Grothendieck asked. The other person replied, yes, an actual prime number. Grothendieck suggested, “All right,take 57.”

Monday, December 27, 2004

And then, the perfect room decoration.

From Sciencenews, via BB:

[Dick] Termes takes a unique perspective in his art. In effect, he imagines crawling inside a transparent ball, then looking out with one eye in all directions from a center point. He then transfers what he sees onto the outside of a sphere.


Suppose, for example, that the ball he's inside is in a starkly geometric, cubical room. The room is defined by three sets of parallel lines. From his perspective inside the ball, these lines lead to six vanishing points. And the lines are curved. This six-point perspective, with the six vertices of an octahedron serving as the vanishing points, becomes the basis of his spherical paintings.


Sunday, December 26, 2004

The Latest Scian Melt..

There are now blog carnivals for many topics: the Scian Melt collects articles about science and India, and was started by Scientific Indian. The latest melt is at Patrix's place, and I will be hosting the next one.

From the Melt:
The content of The Scian Melt comes from you, our loyal readers. If you have written a post on your weblog that gives an insight into science and relates to India, don't hesitate to send it in.
Please send submissions for the next Scian Melt to me or to melt [at] thescian [dot] com.

Horrible...

I just got up today morning to read this: death toll is approaching (削除) 10,000 (削除ここまで) 50,000 over South-east Asia, Sri Lanka, India, and the Maldives. One heartening aspect of this terrible tragedy: a blogger's network through Malaysia has been posting on the tsunamis, relaying information "from the ground" all over Malaysia.

For more news, visit the Reuters Alertnet. Bloggers covering the story include Nitin, Rajan and Amit.

Update: What I never understood was why there was no warning of the tsunamis, given that the earthquake was over 1000 miles away from the Indian coast. Apparently,
None of the countries most severely affected - including India, Thailand, Indonesia and Sri Lanka - had a tsunami warning mechanism or tidal gauges to alert people to the wall of water that followed a massive earthquake, said Waverly Person of the USGS National Earthquake Information Centre.
(via InstaPundit)

Further,
As the tsunami crosses the deep ocean, its length from crest to crest may be a hundred miles or more, and its height from crest to trough will only be a few feet or less. They can not be felt aboard ships nor can they be seen from the air in the open ocean. In the deepest oceans, the waves will reach speeds exceeding 600 miles per hour (970 km/hr). When the tsunami enters the shoaling water of coastlines in its path, the velocity of its waves diminishes and the wave height increases. It is in these shallow waters that a large tsunami can crest to heights exceeding 100 feet (30 m) and strike with devastating force.
You don't get much warning with those kinds of speeds.

Update II: There is now a clearing house of information related to the tsunamis @ tsunamihelp.blogspot.com

Thursday, December 23, 2004

An interesting 2D conjecture...

Prahladh Harsha and Jaikumar Radhakrishnan post a summary (on the Complexity blog) of this year's FSTTCS at Chennai, India. One of the papers they mention is Testing geometric convexity by Rademacher and Vempala.

The paper discusses (in the property-testing framework) the problem of checking if a given shape S is convex. The natural approximate formulation of this problem is: is there a convex shape C such that the symmetric difference of S and C has volume at most an ε-fraction of vol(S) ? Of course, one needs a way of asking queries like "is x in S ? ". They use this membership oracle and a random oracle that returns a (uniform) random point of S to show that
  • In 1D, the problem can be solved efficiently in time 1/ε
  • In Rn, the problem can be solved in time exponential in n and polynomial in 1/ε
An interesting conjecture emerges from all of this. The fact that the problem is solvable in 1D suggests the use of line queries: pick two random points x, y and test whether the intersection of the line joining them and S is convex. They show that in general one might need an exponential number of such line queries (which in itself is interesting) and conjecture that using two-dimensional sectional queries might work instead.

Wednesday, December 22, 2004

Oribots...

An artist in Australia has designed origami folding robots. The idea is that in the future, nano paper will electronically fold itself into various shapes. In other words, robotics, self assembly and folding, all in one. Fascinating...




via Engadget..

Tuesday, December 21, 2004

The beginning of the end ?

Jeff points out more doom-and-gloom news on the post-graduate education front, this time from the NYT. Some of the points that caught my attention:
China, which has declared that transforming 100 universities into world-class research institutions is a national priority, is persuading top Chinese scholars to return home from American universities.
This hits close to home: Andrew Yao, formerly of Princeton, is now at Tsinghua University, one of the best universities in China.
Steven B. Sample, president of the University of Southern California - which last year had 6,647 foreign students, the most of any American university - said colleagues who lead other universities had expressed anxiety at professional meetings."But we compete no holds barred among ourselves for the best faculty, for students, for gifts and for grants, and that's one of the reasons for our strength," Dr. Sample said. "Now we'll compete with some overseas universities. Fine with me, bring 'em on."
Apart from using probably the least appropriate phrase possible to describe this situation, maybe Prof. Sample should also recognize that this is not just a matter of other countries catching up; it is a problem of institutional shifts in US goverment policy, such as loss of funding for basic science, and barriers to entry for students and researchers alike.

Some 28 percent fewer Indian students applied to attend American graduate schools this fall than last year, according to a survey by the Council of Graduate Schools. This matched the overall decline for all foreign students.

Rabindranath Panda, the education consul at India's consulate in New York, said that huge private investments in Indian higher education in recent years had greatly increased options at home for Indian students, and that those who wished to study abroad were increasingly looking at universities not only in the United States and Britain but also in France, Germany, Singapore and elsewhere.

This is true: it is getting easier and easier to get student loans in India (something that was not possible even 10 years ago), and this allows students to be creative about where they choose to go.

This is a point worth noting. A vast majority of students from India (and likely from other countries) come to the US for Masters degrees after which they move into the corporate world. US academic superiority may last long enough to continue drawing Ph.D candidates, albeit at a declining rate, but huge chunks of revenue are generated from Masters programs in universities, and if students start to find that MS degrees from US universities do not confer a significant enough advantage over a degree from (say) the UK or Australia, the extra pain and suffering associated with merely entering the US will make it worth their while to take their money and go elsewhere.




Monday, December 20, 2004

adobe provides pdf to text ?

Xeni@BoingBoing points us to an Adobe site that promises to convert PDF to text or HTML. Apart from the fact that Adobe "may occasionally access the content you submit", it seems like a useful tool. But for anyone with Linux, pdftotext seems equally adept (or not) at converting documents. I tried converting some of my papers with the Adobe tool. First of all, if you choose the HTML 3.2 option, the website chugs away for many minutes, and then reports the ever-so-informative "error occurred" error message.

So I tried being even kinder: I directed the script to generate text, and for windows. The source file is here, and the results are here. This is the conversion that pdftotext did. I can't say that one is significantly better than the other, but you'd think Adobe could do a better job.

p.s I use Type I fonts: no bitmap nonsense. So there is no excuse really....

Memory, Entropy and Time

Sean Caroll over at Preposterous Universe discusses an intriguing (but not new) idea, that
the thermodynamic arrow of time -- the fact entropy is very small in the past, and tends to grow on purely statistical grounds -- is responsible for the fact that we can remember the past but not the future.

Friday, December 17, 2004

Very nice math posters.

I am a sucker for things that explain math to a general audience. Graham Leuschke points to this collection of posters (and detailed web pages explaining them) placed by the Newton Institute for Mathematical Sciences in the London Underground as part of the celebrations commemorating World Mathematical Year 2000.

In honor of our current fascination with Lorenz attractors, here is the corresponding poster (click it for more information):




Boing Boing on a roll

Not that Boing Boing needs pointers from me, but they appear to be on a roll lately. Apart from the post on chaotic crochet, they have a number of posts on mathematical/geometric subjects. How can you resist posts with lines like "Geometry rules" and "When random numbers were cool" ?

Here, for your reading pleasure..
  • How to make a desk and chair from one sheet of plywood (the site has been Boinged, but should be back up shortly)
  • A description (from 1959) of how to generate physical randomness
  • A collection of random numbers from RAND.

Computational crochet, anyone ?

Yes, we can fold stuff, but can we knit it ?

Dr Hinke Osinga and Professor Bernd Krauskopf, of Bristol University's engineering mathematics department, used 25,511 crochet stitches to represent the Lorenz equations. The equations describe the nature of chaotic systems - such as the weather or a turbulent river.
The paper describing this, with such section titles as "Comparing with crocheting the hyperbolic plane" and "A shapeless crocheted topological disk" appears in the Mathematical Intelligencer.
The academics are offering a bottle of champagne to anyone who cares to follow the pattern published in the journal Mathematics Intelligencer.
Hmm. My mother is a math teacher and does crochet. HMMM.....

(via Boing Boing)

Wednesday, December 15, 2004

Gödel and Einstein

Palle Yourgrau has a new book coming out titled 'A World Without Time: The Forgotten Legacy of Gödel and Einstein'. In an excerpt at the Chronicle of Higher Education, he discusses the notion of "Gödel universes", solutions proposed by Gödel to the field equations of general relativity,
solutions in which time would undergo a shocking transformation. The mathematics, the physics, and the philosophy of Gödel's results were all new. In the possible worlds governed by these new cosmological solutions, the so-called "rotating" or "Gödel universes," it turned out that the space-time structure is so greatly warped or curved by the distribution of matter that there exist timelike, future-directed paths by which a spaceship, if it travels fast enough -- and Gödel worked out the precise speed and fuel requirements, omitting only the lunch menu -- can penetrate into any region of the past, present, or future.

Gödel, the union of Einstein and Kafka, had for the first time in human history proved, from the equations of relativity, that time travel was not a philosopher's fantasy but a scientific possibility. Yet again he had somehow contrived, from within the very heart of mathematics, to drop a bomb into the laps of the philosophers. The fallout, however, from this mathematical bomb was even more perilous than that from the incompleteness theorem. Gödel was quick to point out that if we can revisit the past, then it never really "passed." But a time that fails to "pass" is no time at all.
The only story I know involving Gödel and Einstein is the apocryphal one about Gödel finding loopholes in the U.S Constitution that could allow a dictator to take over, and then being convinced by Einstein not to share this with immigration authorities. I had never heard about Gödel dabbling in general relativity: this sounds rather interesting.



Via ALDaily...

Weather report...

It is now -40 degrees in the Mordor quadrant of Hell.

Two of my friends got their Green Cards approved, well before the expected date...

Sariel, help is on the way !

[META] A request for commenters

This is a request for commenters. I really like hearing from people and I would reply to more comments if I could. Unfortunately many commenters post anonymously, and so my only recourse is to post another comment and hope they visit the page again (Blogger doesn't yet have RSS feeds for comments).

Now of course you might post anonymously by design, and that's perfectly fine. However I know that the Blogger interface for comment posting is not the greatest: you have to register a profile to post a comment with your name attached to it, and many people are understandably loathe to do so. So if you are a commenter who posts anonymously because you don't want to go thru the hassle of setting up a Blogger profile:
  • You can post anonymously with an appropriately spam-proof version of your name/email address (or even just a name)
  • If you do decide to set up a profile, spambots will not be able to harvest your ID because of the way Blogger does linking.
Keep those comments coming !

Tuesday, December 14, 2004

Patents and Research.

Erich Kunhardt has written an op-ed in the NYT today suggesting that "inventing", or the filing of patents, become a part of academic life, along with research and teaching. Prof. Kunhardt unfortunately never makes clear how he distinguishes inventing from research itself.

The primary purpose of a patent (at least originally) was to enable innovation by encouraging the dissemination of ideas while protecting the inventor. The bargain went somewhat like: "you agree to disclose the nature of your invention, and in return no one else gets to make money off of it". Not anything can be patented: the usual test of what is patentable is that it should be novel, have utility, and be easily describable and reproducible. As was drilled into me a couple of times by patent experts, a patent does not give you ownership of an idea: it merely prevents anyone else from making money from it, for a period of time (17 years for "standard" patents).

The world of patents today bears only a small similarity to this original view. Many companies generate patent portfolios for use in cross licensing agreements, in which case the number of patents filed is often a useful statistic (quality be darned). The US PTO has laughably bad standards for accepting patents, leading to absurd situations where a 6 year old patents swinging sideways.

So at least by current mechanisms, asking academics to file patents appears to be a bad idea. So maybe the academic community should come up with its own standards for the value of a patent:
However, unlike research, there is no established peer-review process for evaluating inventions, no way to evaluate the academic significance of a new idea beyond its potential economic value. That must change, and the academic community, perhaps with a push from the professional societies and the financial support from the government, should take the lead in clarifying the principles for doing so.
But Prof. Kunhardt never quite says how a patent should be evaluated "beyond its economic value". In fact, earlier in the article he says:
Patents, along with available investment capital, are an excellent measure of the potential for job creation. America's competitive advantage in the global economy has long rested on our ability to generate intellectual property - patents and other expressions of creativity - and to leverage it by creating companies or increasing market share.
I would argue that the primary purpose of a patent is economic: this is why the utility clause is important when establishing the value of one. The novelty and reproducibility elements are already familiar parts of any research endeavour: to publish we need to be novel, and reproducibility needs no defense.

The line separating the university from the technical school is already blurring. Students already view a university as a place that provides a service, rather than as a place where one gets an education. Start making researchers file patents as a condition for tenure, and the quality of "blue-sky" research will go down tremendously as one focuses on the economic factors influencing one's research rather than the scientific ones.



Update I: emmous points me to this wonderful article by Jeff Ullman (shame on me for not knowing this) on the patent process. It has some interesting anecdotes from his time as an expert witness in patent lawsuits.

Update II: Jason Schultz, a lawyer at EFF, writes this commentary on the current state of affairs in the patent world. (salon.com - paid registration or free one-day pass)

Flawed reasoning

A new article from CNET is titled "Science, engineering Ph.D. numbers buck downturn". The article bases this claim on the latest numbers on granting doctorates in science and engineering in the period from 2002-2003, up 2.8% from the previous period. The "bucking" here refers to the new worries about drops in enrollment in graduate programs.

What the article does not point out is the blindingly obvious problem with the claim implicit in the title. Ph.D graduation trends reflect enrollment trends from at least five and probably six years earlier, i.e from the period from 1996-1998. It will come as no surprise to anyone that this period had high enrollment, given as it came at the peak of the dot-com boom (although personally I am a tad surprised, since many people chose to drop out and join startups).

Whether or not there is a true slump in enrollment right now is an interesting debate to have. But graduation figures reflecting enrollment from many years ago (and before the dot-com bust and 9/11, and thus arguably eons ago) should not be part of any reasonable argument either for or against such a claim.

Sunday, December 12, 2004

LinkRank

There is a new site called LinkRanks, from pubsub.com, that purports to do page-rank-like ranking of domains, based not on regular linking, but on weblog linking:

LinkRanks are our way of measuring the strength, persistence, and vitality of links appearing in weblogs. When PubSub reads a new weblog entry, we pull out any URIs we find and attach them to the entry in a separate field. This allows our users to include domain names or linked file types when creating subscriptions....

Unlike Google's PageRank system, LinkRanks are not iterative. Rather, we base LinkRanks on a simple formula that only looks at local links - links which are within one or two steps of any target site. Also, it's important to note that we only look at links which are in weblog entries - we don't read any of the other links on the page, like the side bars or blogrolls.

Ok. Fair enough. They even go on to describe the formula they use to compute these rankings. Basically, given a source site S and a target site T that it links to, they want to estimate the "relative likelihood" of clicking on the link to T from S. They do this by weighting the importance of the link to T relative to the total number of outbound links from S.

So far so good. But the formula they use is somewhat bizarre. If S0 is the total number of outgoing links from S and ST is the number of links from S to T (think of T as a domain), then their formula calculates the relative weight of this link as



They then go on to do some standard summing over links etc. What I don't see is: they claim this measures a kind of likelihood that you would click on a link to T from S. But the probability of doing that under a uniform assumption is merely ST/So. So what model would generate this form ? This is clearly trying to downweight the influence of ST for small values of So, but how ?

In any case:
If you comment on LinkRanks in your blog, we'd like you to try an experiment with us. In weblog entries that talk about LinkRanks, include this URN somewhere:

http://psi.pubsub.com/20040413:linkranks:1
Consider it done.

Pseudorandomness

Erica Klarreich does a nice job of introducing the notion of pseudorandomness in a general setting. She starts by describes the basic problem of generating truly random bits. She then moves on to practical approaches that people have used to generate randomness, and also talks of what it means for a sequence to be random (loosely of course). She also describes the more interesting notion of cryptographic randomness, with a brief quote from David Wagner.

The article also discusses simple algorithms for generating randomness, the linear congruential generator being one that she alludes to. She also covers the von Neumann trick for converting a biased coin into a fair coin.

All in all, a decent article. I complain constantly about science writers, so it seems only fair to mention when one does a reasonable job. She could have gone a little deeper into why we care about randomness at all, and I would have liked to see some comments from other experts on pseudorandomness in the academic community, but overall, the article was nice.

A little surfing pointed me to her home page, which has a collection of other articles she has written (mostly on mathematical/computer science-related topics). She also has a Ph.D in mathematics ! She's one of us !!


Via Marginal Revolution, which is also an entertaining note on the game theory of Rock, Paper, Scissors.


ipods+windows+firewire HOWTO

I just acquired an iPod. It's a great gadget, and I've been using it as TiVo for radio, via Ben Hammersley's fantastic RadioPod application. It took me a bit of effort to get it to work - judging by the volume of comments on the various iPod forums I was surfing, I wasn't the only one with a problem. What follows is a description of what you need to do to get the iPod to work on Windows with firewire; I thought I'd put it up for Google to archive and serve out to others in the future.

Using an iPod on a Mac is trivial (unsurprisingly). Sariel has already described how to get one to talk to Linux. It is surprisingly tricky to get one to work on windows though. The problem is not the operating system; it's that Firewire is much more common on Macs than on PCs. PCs of course have USB 2.0, which by spec is faster than firewire, but in practice is a lot slower. There are also some claims that USB 2.0 hangs on long transfers to an iPod (30 minutes or longer).
Why should I care ? The catch with iPods is that they drain a lot of power when syncing with a computer. You have to make sure the iPod is fully charged before syncing (if you transfer a lot of songs: this is not a problem that I have had yet).

Let's assume for the remainder of this discussion that you have decided (for speed/other reasons) that you want to use firewire to sync with an iPod. Now this is where things get interesting. There are two kinds of firewire cables; the 6 pin kind and the 4-pin kind. The extra 2 pins are power suppliers for the firewire device.



Many PCs come with a 4 pin socket, and so you can't charge the iPod while syncing it (which you can do with the 6 pin socket). The iPod itself (the 4G version at any rate) comes only with a 6 pin cable. At this point, here are the options:
  • You can get a 6-pin to 4-pin converter, and assume that you will not need to recharge.
  • You can get a 6-pin firewire PCMCIA card (for a laptop) or a PCI card (for a desktop)
  • You can give up and go with USB 2.0
The problem with the first option is this: it seems impossible to find a plug that has a female 6 pin end and a 4 pin male end (to fit into the 4 pin socket on the computer). If you're near Fry's in the Bay Area, this would not be a problem, but for me this wasn't an option, and I drew a blank at the places I tried (Radio Shack very huffily informed me: "We don't do firewire").

If you have a desktop, you can manage with option 2. However with a laptop, the PCMCIA card does not provide the power lines you wanted the 6 pin firewire slot for in the first place ! In order to get a powered slot, you need to buy an extra AC adapter, and these are not sold with the cards ! In fact I have yet to figure out how to get one - the forums suggest that you have to buy a generic adapter that can supply power in different modes, and match it to your card. Total cost: 40ドル-50,ドル depending on the adapter and card. Yechhhh...

As it turns out, there is a solution, and a surprisingly elegant one at that. Newnex sells a little device that has a 4 pin male end (to plug into the computer) and two 6 pin female ends.



All you need now is a regular 6 to 6 firewire cable, which any computer store carries. Connect the iPod cable to one socket of the device, and connect the 6-6 cable to the other socket and to the iPod AC adapter. You're set, at total cost 25ドル or so.

It took me a long time to figure this out, and this was after I bought my iPod, and was getting very frustrated with the lack of information out there. I hope this will be useful to someone.


Saturday, December 11, 2004

Incompetent Indian consulates...

This will come as no surprise to Indian readers, but for non-Indians, planning to make a trip to India, this is a cautionary tale. Unfortunately, one geom blogger was harmed in the making of this story.
I guess that implicitly I assumed a certain level of competence from the consulate people, which was not there. I thought that their work is to give me (and other people) visas to India. Judging by their performance, I have no clue what they think their work is. Probably they think of themselves as some kind of form verifiers, that should verify that the applicant filled correctly his name/date of birth/etc on the form.
I've lost track of the number of times I've raised this plaintive cry to the heavens (replace "consulate" by "government office", and "visas to India" by "service"). Typical low-level government functionaries view themselves as protecting their precious job against interlopers like us, who actually might want some help. Anything that requires them to do real work either does not happen, or needs to be appropriately "greased".

Thursday, December 09, 2004

Differing standards for sharing experimental support structures

I've been writing some simple code to help my wife with some data analysis (she's a 'wetwork' biologist). According to her, if she writes a paper and uses analysis based on the code I write, I am required to supply the code to anyone who might request it. Of course, I could charge money for it: the commercial aspect is not really the point. The crucial point is that there is a common understanding in her field that all data used in the course of a paper must be made available on demand, no matter how long it took to make it (like reagents) or how precious the resource is.

Of course, in practice people have ways of getting around this. But the significant aspect of this understanding is that it is normative; people are expected to follow this baseline behaviour model, and deviation is viewed as inappropriate.

Now in experimental areas in computer science I find that we are far from such a baseline expectation. It seems to me that someone is doing me a favor if they provide me with the code they used in their paper, and even conferences that explicitly focus on experimental work (as opposed to conferences that deal with applied topics) don't expect authors to provide code.

There are logistical difficulties with releasing code if you work at a research lab like I do, but for researchers working in academia, this is clearly not a problem. So is this lower level of expectation appropriate for our community: is the analogy with biology (and probably other experimental sciences) inapt ? Or are we really slacking off on our scientific responsibilities ?




I should add that I often hide behind my corporate wall: I don't have publicly available code either, and haven't really tried to push the myriad pieces of paper that go into getting approval for a release. The point really is the different baseline that we have; people adapt their behaviour appropriately.

Algorithmic Self-Assembly

Now this is neat stuff:
Reporting in the December issue of the journal Public Library of Science (PLoS) Biology, Caltech assistant professor Erik Winfree and his colleagues show that DNA ''tiles'' can be programmed to assemble themselves into a crystal bearing a pattern of progressively smaller ''triangles within triangles,'' known as a Sierpinski triangle. This fractal pattern is more complex than patterns found in natural crystals because it never repeats. Natural crystals, by contrast, all bear repeating patterns like those commonly found in the tiling of a bathroom floor. And, because each DNA tile is a tiny knot of DNA with just 150 base pairs (an entire human genome has some 3 billion), the resulting Sierpinski triangles are microscopic. The Winfree team reports growing micron-size DNA crystals (about a hundredth the width of a human hair) that contain numerous Sierpinski triangles
So where's the computation ?
The novel aspect of the research is the translation of an algorithm--the basic method underlying a computer program--into the process of crystal growth. A well-known algorithm for drawing a Sierpinski triangle starts with a sequence of 0s and 1s. It redraws the sequence over and over again, filling up successive rows on a piece of paper, each time performing binary addition on adjacent digits.

The result is a Sierpinski triangle built out of 0s and 1s. To embed this algorithm in crystal growth, the scientists represented written rows of binary ''0s'' and ''1s'' as rows of DNA tiles in the crystal--some tiles stood for 0, and others for 1. To emulate addition, the sticky ends were designed to ensure that whenever a free tile stuck to tiles already in the crystal, it represented the sum of the tiles it was sticking to.

....This concept, according to Winfree's coauthor and Caltech research fellow Paul W. K. Rothemund, has inspired an entirely new research field, ''algorithmic self-assembly,'' in which scientists study the implications of embedding computation into crystal growth.
Algorithmic self-assembly will be familiar to STOC/FOCS goers. My friend Ashish Goel has often tried to convince me that this is the Next Big Thing, and has a couple of papers (with Len Adleman and others) on this topic...

Wednesday, December 08, 2004

Someone please Fisk this...

Jeff intones a resounding NO to the question posed by this gem:
We are currently paying a large amount of money to attend this University and receive an education. If I have paid to be taught something, shouldn't there be a repercussion for the teacher rather than, or at least as well as, the student when knowledge has not been taught?
This article is crying out for a comprehensive Fisking: I am too busy trying to keep my head from exploding when I read it (but Tall, Dark and Mysterious is more than equal to the task !) . However, I will excerpt shamelessly and out of context to make unsubtle and unwarranted points by gratuitous highlighting:
...Students pay teachers to educate us, yet they are then allowed to tell us how much we're learning. The whole situation seems akin to a boss paying her employee to clean toilets and the employee turning around and telling the employer how much she is or isn't happy with the cleaning job. IfI'm paying someone to do my housekeeping, I'll be the one to tell the receiver of my hard-earned money exactly how well they did....

We are currently paying a large amount of money to attend this University and receive an education. If I have paid to be taught something, ....

...I pay the teacher to teach me, and then I get slapped with the label of failure if the teacher deems that I haven't learned the correct information?

...Students have paid someone to teach them, they have been taught, but an arbitrary grade makes it seem as though this learning never occurred. Their newfound education is not recognized, and they have, in essence, paid money to be told that they are idiots. If I want to be told that I'm an idiot, I could just get drunk and leave embarrassing messages on the phone machines of attractive men -- for free.

An old Sanskrit saying used to subjugate poor oppressed students back in India roughly said that while one was studying, the most important person in one's life was the teacher (and the teacher's wife: students lived at the "ashram" or school during their entire period of learning). One's parents came next, and finally, after all of them, came God.

I guess they forgot to insert Mammon before all of these....
One of my first posts on the Geomblog was about coin-tossing. A paper by Diaconis, Holmes and Montgomery had just shown that physical coin-tossing is inherently biased (the result of the toss is the same as the starting position roughly 51%) of the time.

The full paper is now available. It takes the precession of a tossed coin into account when doing its analysis, and among other things has some nice pictures of coin-tossing machines.

The following section made me wonder (emphasis added):
Poincare's arguments suggest that as a roulette ball is spun more and more vigorously the numbers become closer and closer to uniformly distributed. There are numerous studies ( [Barnhart, 1992], [Bass, 1985]) suggesting that real roulette may not be vigorous enough to wash out the initial conditions.
Hmm. So I'm writing my grant proposal for the NSF, and in the travel section I put down 2 trips to Las Vegas ? 1 to Reno ? Maybe a couple to Monte Carlo for completeness ? And what about some legal fees when I get arrested for throwing roulette balls "vigorously"?

These are important questions ....

Tuesday, December 07, 2004

String Theory

I have always felt a sneaking envy of physicists (especially theoretical ones). I read A Brief History of Time in high school and for a brief time thereafter was convinced that I wanted to be a physicist.

Physics has always had the great metaphysical flourishes, the profound raison d'être, and the intuitive qualities that (at the time) I felt a drier discipline like theory CS didn't quite have; my original interest in computer science came from AI, Turing Tests, and Turing machines, and it was fun to immerse myself in discussions of mind-body separations and Chinese Room arguments.

My views evolved over the years. As I studied more, and learnt more (especially about complexity theory), I began to realize the deeper meaning of "computation as a phenomenon" and how the study of computational complexity is really about understanding different metaphors of computation and how these metaphors address the fundamental question of "what can you do, and how well can you do it ?"

Now it almost seems like the wheel has turned full circle. A recent issue of the Scientific American talks of black holes as computing devices, and current theories of quantum gravity all appear to reduce to wrestling with discrete space-time, which together with quantum computing, suggests a more fundamental role for computation in nature than one might have envisaged thirty years ago.

In that context, it is interesting to read quotes like this (via Not Even Wrong):
String theorists in general seem to have trouble getting their minds around the idea that it is even possible the theory is wrong. Jeff Harvey does admit that sometimes he wakes up thinking "What am I doing spending my whole career on something that cannot be proved experimentally"
It's nice to feel that we don't have to worry about something like that :)

Sunday, December 05, 2004

Dept. of "Speak now, or forever hold your peace"

From the NYT (emphasis mine):
Congress has cut the budget for the National Science Foundation, an engine for research in science and technology, just two years after endorsing a plan to double the amount given to the agency.

Supporters of scientific research, in government and at universities, noted that the cut came as lawmakers earmarked more money for local projects like the Rock and Roll Hall of Fame in Cleveland and the Punxsutawney Weather Museum in Pennsylvania.

David M. Stonner, director of Congressional affairs at the science foundation, said on Monday that the reduction might be just the beginning of a period of austerity. Congress, Mr. Stonner said, told the agency to expect "a series of flat or slightly declining budgets for the next several years."

In renewing the legal authority for science programs in late 2002, Congress voted to double the budget of the science foundation by 2007. The agency finances the work and training of many mathematicians, physicists, chemists, engineers, computer scientists, biologists and environmental experts.

The 388ドル billion spending bill for the current fiscal year, approved by both houses of Congress on Nov. 20, provides 5ドル.473 billion for the National Science Foundation, which is 105ドル million less than it got last year and 272ドル million less than President Bush requested.

In the Sunday Op-Ed by Tom Friedman:

Think about this. We are facing a mounting crisis in science and engineering education. The generation of scientists, engineers and mathematicians who were spurred to get advanced degrees by the 1957 Soviet launch of Sputnik and the challenge by President John Kennedy to put a man on the moon is slowly retiring.

But because of the steady erosion of science, math and engineering education in U.S. high schools, our cold war generation of American scientists is not being fully replenished. We traditionally filled the gap with Indian, Chinese and other immigrant brainpower. But post-9/11, many of these foreign engineers are not coming here anymore, and, because the world is now flat and wired, many others can stay home and innovate without having to emigrate.

If we don't do something soon and dramatic to reverse this "erosion," Shirley Ann Jackson, the president of Rensselaer Polytechnic and president of the American Association for the Advancement of Science, told me, we are not going to have the scientific foundation to sustain our high standard of living in 15 or 20 years.

Instead of doubling the N.S.F. budget - to support more science education and research at every level - this Congress decided to cut it! Could anything be more idiotic?


More on PCs ..

Jeff has more thoughts on program-committeeing: his main point being that with the number of back-channels that already exist to favor the well-connected in the review process, adding one more (PC members contacting authors for clarifications) isn't that bad.

In a sense, he is right that backchannels always exist and always will. After all, any conference review system is an inherently error-prone process of determining quality/correctness in a short period of time. Precisely because of this lack of time, it is impossible to evaluate a paper in vacuo, solely on its stated merits, and thus extraneous factors like authorship will play a role.

What interests me though is a related point. He mentions the back channel of going around giving talks about one's work. The reason that this is an effective back channel is that it helps a potential reviewer: after all, if they hear my one-hour talk, they will get a much better (and possibly much more positive) view of my paper than they would have acquired merely by reading it.

In that sense, good marketing appears to be an important part of the process. In fact, in the past people have recommended that I go and give talks on my (unpublished) work precisely for this reason. However this goes back to a point I was discussing earlier: in a double blind review process, it is generally felt that one is ethically bound not to do anything that might break anonymity (give talks, put papers on web sites etc). But is giving a talk on a paper under review (in a non-double-blind scenario) not similar ? After all, doing this gives me a competitive edge on the poor soul from Outer Timbuktu who can't travel to important places giving talks, and sending out drafts is fraught with its own dangers.


Friday, December 03, 2004

Do computational geometry, get a Peace Prize:
Having failed to quell months of escalating unrest in three southern provinces by force, Thailand's unorthodox prime minister is hoping plane loads of origami peace bombs will defuse the tension.

Thaksin Shinawatra has urged all 63 million Thais to make at least one paper bird in the next fortnight so they can be dropped on the three restive provinces on December 5 as a sign of goodwill to mark King Bhumibol Adulyadej's birthday...

Via Boing Boing..

Update: Well..... maybe not .....

Thursday, December 02, 2004

Isn't this sad (emphasis mine):
The council consists of, and takes testimony from, the sort of people who have spent enough time probing deep questions to cultivate expertise and esteem. These people used to be called monks. Today we call them nerds.

He continues on:

I like nerds. I often play chess with a member of the bioethics council, and if playing chess with a nerd doesn't make you a nerd, I don't know what does. Nerds care about facts and doing the right thing. They ask probing questions and challenge assumptions. The nerds in this room can translate between the red-state language of religion and the blue-state language of policy. Washington needs more of them. If it weren't for nerds, nobody here—as opposed to at the Vatican—would be seriously debating bioethics.

But nerds have trouble communicating the essence of the council's mandate: humanity
I like journalists. I even read some of their writing. They care about a story, and getting the right message. If it weren't for journalists, no one would know anything about what was going on in the world.

But journalists have trouble lifting above cliched storylines and the easy jab, and conveying the truth of complex matters.

Wednesday, December 01, 2004

PC discussions

Sariel riffs on Lance's discussion of program committees, and Oded Goldreich's somewhat controversial idea of having PC members contact authors for more information.

I have to pick a bone with a few things he said though (emphasis mine):
IMHO, the PC member should be show caution in contacting the authors, and use the input from the authors with care, but rejecting a paper just because it contains some minor technical error which prevents the PC from fairly judging its quality, seems to me to be an excuse to reject papers without reviewing them carefully. As OG mentioned, this additional input (a correction, etc) probably would have a negative impact on the evaluation of the paper anyway. In any case, I think the PC member should be trusted in using his/her/it common sense and clearcut rules should be avoided.
I am not sure I agree with this. People even in the relatively homogenous theory community have widely divergent views on matters (for example, OG's comments on PCs), and to rely on as ephemeral a notion as "common sense" is asking for confusion and ambiguity when some clearcut rules could make things so much simpler.

The tacit rule we have now: "Do not contact authors" is quite clearcut. It might prevent some paper from getting a fair shake, but it is fair (or at least equally unfair). Changing this to allow for PC contact with authors only introduces even more variability into the review process; maybe I feel conscientious and contact authors, whereas others do not. Maybe I believe honestly that a paper written by a friend is really good and needs to be interpreted "correctly", and contact the friend for a clarification I can use. There are myriad ways in which, applying what I believe to be good sense and fairness, I can reduce the overall fairness of the review process as seen by authors in general.

Ken Clarkson suggested over at Sariel's place that any contact with the authors should be made via the committee as a whole (or via the chair). This is more reasonable, and in fact is how most contact happens right now (when authors withdraw papers, or make bug reports), but to weaken the current proscription on author contact would open up a can of worms that no amount of trusting in good sense can prevent.

The other comment I take exception to is:
In particular, most papers (even if incorrect and junk) are harmless even if accepted to a very good conference.
We are trained in theory to think about resources (time/space/randomness/what have you). I claim that a slot in a good conference is also a resource, and with limits on this resource come hard decisions as to how to use it effectively. This is a zero sum game: an incorrect/junk paper that gets into a conference harms the correct/non-junk papers that didn't. Sariel goes on to to describe a more egregious kind of accepted paper, which I agree causes more grief in the long term, but in the long term we are all dead anyway ! Not making proper local choices (i.e what gets accepted to this conference) is harmful, and the harmful effects of this add up over time.

Monday, November 29, 2004

Streaming(?) Cell Processors and Graphics

This is a topic of direct research interest to me: Via the EE Times:
The eagerly anticipated Cell processor from IBM, Toshiba and Sony leverages a multicore 64-bit Power architecture with an embedded streaming processor, high-speed I/O, SRAM and dynamic multiplier in an effort, the partners hope, to revolutionize distributed computing architectures.
...

Each processing element comprises a Power-architecture 64-bit RISC CPU, a highly sophisticated direct-memory access controller and up to eight identical streaming processors. The Power CPU, DMA engine and streaming processors all reside on a very fast local bus. And each processing element is connected to its neighbors in the cell by high-speed "highways." Designed by Rambus Inc. with a team from Stanford University, these highways — or parallel bundles of serial I/O links — operate at 6.4 GHz per link. One of the ISSCC papers describes the link characteristics, as well as the difficulties of developing high-speed analog transceiver circuits in SOI technology.

The streaming processors, described in another paper, are self-contained SIMD units that operate autonomously once they are launched.

They include a 128-kbyte local pipe-lined SRAM that goes between the stream processor and the local bus, a bank of one hundred twenty-eight 128-bit registers and a bank of four floating-point and four integer execution units, which appear to operate in single-instruction, multiple-data mode from one instruction stream. Software controls data and instruction flow through the processor.
One theme of my research is demonstrating the power of modern GPUs for a wide variety of applications, and how their streaming nature represents a new model of computing akin to, but not exactly like the model used for traditional streaming algorithms. I wonder to what extent the new Cell processor fits in with this notion.

Clarity in writing

Via Critical Mass comes this screed criticizing obscure writing in literary theory circles. Much of the article is "inside baseball" for literary theorists, but there is at least one point that relates to what scientists do. It concerns one of the arguments literary theorists appear to make in defence of their alleged obscurity:
Scientists have their jargon—why can't theorists have theirs? Again, this is a valid question with a simple pragmatic answer. The public tolerates scientific jargon and not theory jargon because it believes that scientists need jargon to extend their researches and produce practical knowledge that benefits all. Only when scientists appear to abandon the common good does their language come under attack (for example, Swift's portrait of mathematicians in Book III of Gulliver's Travels, or contemporary ridicule of sociologese and psychobabble). Come the day when the theorists are able to demonstrate that their jargon enhances human life, and isn't just pretension and science-envy, public mistrust of them will end. Constantly claiming to foment social justice isn't sufficient.
I don't remember the Gulliver's Travels allusion, but the idea that only when scientists appear to abandon the common good does their language come under attack appears interesting, and in the light of the all the current evolution imbroglios, not quite right.

Alan Sokal's hilarious spoof was one of the first salvos fired at the obscurity of literary theorists (although his intent was quite different); it is interesting to see followups, even if my standpoint is of the mildly amused observer indulging in schadenfreude.

Friday, November 26, 2004

सस्ता और टिकाऊ

Via The Opti Mystic, this gem from the latest edition of the Indian blog carnival (or the Bharateeya Blog Mela):
Indian farmers have come up with what they think is the real thing to keep crops free of bugs. Instead of paying hefty fees to international chemical companies for patented pesticides, they are reportedly spraying their cotton and chilli fields with Coca-Cola.

(from the Guardian, via Jagadish)

In Hindi, there is a phrase that captures this: सस्ता और टिकाऊ
It literally translates as "Cheap, and solid", and is often used in advertisement tag lines.

An interesting problem:

emmous asks an interesting question:
Let N be a set of nonnegative integers (amen!). We say that N has the decomposition property if there are nonempty subsets N1 and N2, maybe overlapping or identical but both containing at least two elements, such that

N = { n1 + n2 | n1N1 and n2N2}.

Can we find such two subsets for a given N or it's NP-complete?
The source of this question is a paper by Mateesca, Salomaa and Yu.

I found this question interesting because we know that if N indeed had such a structure, and we were presented N1 and N2 instead, we could rank and select elements from N far more efficiently than if we had N itself.

Wednesday, November 24, 2004

Now this really makes me angry..

I had the (mis)fortune to be busy with papers around the time of the great November cataclysm, so I had no time to obsess over election results. In retrospect, I find though that there are some issues that rile me up a lot more than politics, and one of them is evolution.

It was disturbing enough to start seeing, a few years ago, that school districts were trying to teach (un)intelligent design along with evolution theory in classrooms. It has become even worse though, with districts in Georgia, Wisconsin, Maryland, Ohio and now Pennsylvania all choosing to teach such nonsense in classrooms (see the National Center for Science Education for more details)

The usual argument that has come up in discussion when I start spouting off is: "Well, the students who are really interested in science will learn the truth about evolution, so this doesn't really matter". This is of course correct: I can't imagine someone getting into a graduate program in biology without having been able to verify for themselves the basic facts of (micro)evolution under the microscope or understanding the fossil record. But scientists don't make their own policy or control their own funding; politicians do, and politicians are by and large people who got their science education in school and maybe a little in college, their background tending to be more in law/economics/business. It is crucial that we get this right, and get it right early on. If students are taught all kinds of gobbledygook in school, it messes up their fundamentals later on: we need only look at math education to see how that works.

I actually don't think everyone should be indoctrinated into the cult of evolution. Personally, I have been very confused myself about what statements regarding evolution are facts, and what is theory, and what is falsifiable: after all, a statement that we are descended from the apes is not inherently falsifiable. For more discussion, and a point-by-point elucidation of falsifiable tests of evolution, check out the fascinating talk.origins FAQ.

What upsets me more is the undermining of the basic scientific foundation of theory based on falsifiable premises[1]. (un)ID is a a big fat negation: humans did NOT develop via evolution, because evolution can't explain everything yet, and there are discrepancies. It is not a working theory of the world, and it can't even be falsified ! I mean, how do I prove that some intelligent being did NOT intervene with my development from Nim Chimpsky.

To place it on a par with evolution, as a 'competing theory of biological development', is to indulge in the kind of fake 'fairness' that science journalists have picked up as a bad habit from the political reporters, and wilfully ignores the volumes and volumes of evidence that support both the facts of evolution and many of the theoretical mechanisms.

There is a lot that is not understood about how evolution works, specifically exactly how (削除) genes (削除ここまで) traits get passed on, and how species evolve. And this is a genuine discussion worth having. But we don't teach children Newton's Laws with a big warning sticker, and we don't because we know that this provides a good first approximation to the laws of mechanics, and to understand the discrepancies predicted by relativity requires a lot more sophistication.

Should we point out that a lot of the "theory of evolution" is just a theory ? Absolutely. But we should separate out what is theory from what is not. Painting with such a broad brush is ludicrous, and cannot in good faith be interpreted as anything but a political power play to subvert the very basis of scientific enlightenment where it matters the most.

Science is not a political football. It is not a game. It is the foundation of the modern world. Deal with it.



[1] Before you start beating me over the head with Lakatos, Kuhn and Feyerabend, let me at least argue that falsificationism is a more accurate description of the day-to-day process of doing science.

Tuesday, November 23, 2004

The beleagured computational genomicists...

Much is being made of this article by John Derbyshire of NRO on the threat posed by computational genomicists to the "liberal elite". One point that has not been made though: much of computational genomics is based on reasoning about phylogenetic structure, inferring gene propogation cross species. This of course is based on the only-believed-by- 35%-of-the-US-population theory of evolution.

It is rather amusing that conservative commentators who spring to the defense of research that undermines the "liberal establishment" are not particularly concerned that this very research is based on the theory that people in their corner are trying their best to discredit.

wireless fun...

I forgot to mention that if you are in Phladelphia, a place that has both the FECSWATM and the TRCPWATM is Intermezzo, on 32nd and Walnut.

How to increase the number of hits on your blog...

This is a cottage industry in the blogosphere, with various authors willing to give you advice (free and paid) on how to increase traffic to your blog. In fact, some time back the Geomblog was cited as an example of how NOT to increase traffic to your blog, by exemplifying the kind of blog that "is for people who only talk to their own kind".

Well I have my own personal lesson on how to increase traffic to your blog, and I'll even give it out for free. The answer is: POLITICS. My evidence ?



This picture is a snapshot of my traffic over the past one month: visits in green, and page hits in purple. The prefix is my average traffic per day: not a lot, but then how many geometers are there to begin with ? The climb starts the day I posted my article about election cartograms. It lasted for a week, and I am still seeing the effects of it.

Anonymous submissions and ethical practice

Over in Lance-land, a post about chilling out after a deadline has mutated (in the comments section) into a discussion of anonymous submission to conferences. My question, which I posted there, is an ethical one.

Often I will go around giving talks about papers that are not yet published. My reason for doing so (and I have come recently to this way of thinking) is that as long as I am not fearful of being scooped, or preempted, giving a talk is how one disseminates work, and asssuming that the paper will get published eventually, this is OK, especially if the work is something I am excited about and want to talk about. Related activities might include submitting an e-print, or putting up a web page etc.

The ethical dilemma is then this: if I am submitting to a double blind review process (that many conferences adopt), is dissemination of the above kind an underhanded way of subverting the anonymity of the review process and should it be avoided at all costs ?

Obviously a way out is to just not talk about work under review. However, this doesn't tend to be common practice: many important results have spread long before they were submitted or accepted to a conference. I imagine that some nuance based on the "importance" of the paper might come into play. If I believe that the knowledge of this result will be of great interest to the world at large, then it is ok to talk about it. However, this gets us into a slippery area of reasoning: after all, one doesn't write a paper and submit it to a conference if one believes that the results are of no interest to anyone.

Comments ?
Posting has been slow, as I've been travelling, and doing some paper writing (you know, scholarly work, the kind that bloggers do in their pajamas). My internet connection has been down, and in the process of running around trying to make my deadline from various WiFi-enabled cafes, I have discovered the Four Essential Criteria for a Successful Wireless AdventureTM.
  1. A clear internet signal (preferably free)
  2. Good cell phone reception
  3. Accesssible power sockets
  4. Some reasonable espresso
You'd be surprised how difficult it is to satisfy all four of these constraints, and how essential they really are.

Along with these are the Two Recommended Criteria for A Pleasant Wireless AdventureTM:
  1. Comfy armchairs: a table and chair is NOT designed for typing on a laptop
  2. Pleasant music: it's not even Thanksgiving and I am already trying to free my mind of inane Christmas jingles !
anyway, I shall now kick back and take Lance's advice, at least until my next deadline...

Friday, November 19, 2004

Open problems.

This year's CG workshop promised to have a focus on open problems, and we had more than our share ! Check out the CG Blog for a list.

Thursday, November 18, 2004

I am blogging from inside the deranged cubist fantasy that is Stata Center, MIT's computer science building. The CG workshop is being held here in a lush auditorium, and the internets are flowing like wine :)



I hope to have updates as the day goes on: am looking forward to an intriguing talk by Godfreid Toussaint on Computational Geometric Aspects of Musical Rhythm.

Monday, November 15, 2004

So much for SIGGRAPH :)

I am happy to announce that this year's Fall CG workshop will have a fully-compliant 21st century web surfing experience: a website, a blog, and even a wiki !!

Hervé Brönnimann from PolyTech U has set up a comp geom space that (with the aid of volunteers like your humble blogger, Ken Clarkson, Jeff Erickson and Sariel Har-Peled) now has all kinds of goodies for the computational geometry community.

Firstly, we now have a blog for the Fall CG workshop. If you plan to attend the workshop, and would like to post entries, send mail to sureshv@gmail.com with the subject line "FWCG" to get an account, or for instructions on how to post anonymously. Please supply enough information to identify yourself :)

Secondly, we have a computational geometry wiki that runs off the excellent Mediawiki platform (the same one that runs wikipedia). The beauty of a wiki is that anyone can edit it (once you are registered), and there's lots of information on the site to help you out. We hope that through volunteer efforts, we can build this up into a valuable resource for the CG community.

Some cool features of the wiki:
  • You can write math in latex, and it will convert the text appropriately
  • You can use geombib reference for citations (thanks to Sariel)
  • Anyone can edit any page; although this seems like recipe for disaster, it actually works out reasonably well. So if you have some CG-related content you'd like to add in, or can help contribute to existing pages, dive right in.
Sariel has converted his core-sets survey into wiki format: check it out...


Thursday, November 11, 2004

A victory for robust statistics ?

In all the brouhaha over election predictions, it is worth observing that robust statistics scored a clear victory.

According to Wes Colley and J. Richard Gott, merely taking the median of all polls in the last month was good enough to predict the election outcome with almost 100% accuracy (they got Hawaii, with only 2 polls, wrong).

And what is a robust statistic ? One way of defining it is that it is a statistic on a set of numbers that requires at least a constant fraction of the input to go to infinity before the statistic itself goes to infinity (for the median, this number is 50%; however, only one number needs to go to infinity to pull the mean along with it). A robust statistic is often useful as a non-parametric measure of a data set that is sensitive to outliers; one of the best examples in two dimensions is Tukey's halfspace depth.

Aren't elections educational ?

Wednesday, November 10, 2004

USTCON is in L

Read all about it !! Lance has the annoncement, and David Molnar has additional reading.

Note that this paper could also have been titled in 4 letters:

SL = L

Tuesday, November 09, 2004

The fever catches on...

I promise (or at least try to promise) that this will be my last cartogram post. Looks like map fever has caught on a big way since my first cartogram. If you are not already sated by maps, maps and more maps, check out the extensive resources on election cartograms at UCSB provided by Sara Fabrikant.

Her maps have Alaska and Hawaii (which I and others shamelessly ignored). I must say that Alaska looks like an albatross in this map.

Monday, November 08, 2004

Making your own maps...

Since I first posted the cartograms, many people have emailed me asking if I could modify the cartograms to do various kinds of displays. Although changing color schemes is easy, changing the map itself is not (since I didn't generate the polygons for the map myself).

Anthony Robinson, a master's student at Penn State, works on various aspects of geographic visualization within the geoVISTA center, and has provided a toolkit developed by folks at Penn State that allows you to download Election 2004 results and play with them inside an exploratory analysis tool. He also provides the election data to work with !

Definitely check it out if you want to create your own maps; the folks at Penn State have been doing great work in scientific visualization for a while.

Sunday, November 07, 2004

Color schemes

One of the side discussions around the purple map was: what is a proper color scheme to color such a map, beyond just a red-blue-purple mode. Many ideas were suggested, and some of them are reflected on the page, but till a commenter mentioned it, I didn't realize that cartographers have studied this problem in great depth.

Cindy Brewer, a cartographer at Penn State, has designed a beautiful web site that helps with coloring choropleths (the technical term for maps where each region is colored with a single color). You decide what kind of data you have and how many categories of colors you needs, and the flash applet generates a color scheme, as well as color values in a variety of color maps (RGB/CMYK/etc).

Using a bivariate 10-value divergent color scheme (color spreads out from the middle of the data range), here is the Election 2004 cartogram:



For more maps, visit my new cartogram page.

More cartograms and some histograms

Gartner, Shalizi and Newman have another cartogram, made using a different ("diffusion") approach.



They also had some histograms of vote totals on this page, which they took down because of some anomalous values. I got the idea to do something similar, and here is the result (click for the big pic).



Friday, November 05, 2004

The 'Purple Haze', revisited.

[Update: Nov 8]: More color schemes (one based on colorbrewer.org ) and a toolkit to make your own maps !

By now, Robert Vanderbei's purple map of the voting counts in Election 2004 has crisscrossed the internet several times. On his web page, he answers some questions about these maps:
Can you warp the counties so that each county's area is proportional to its total vote count? Such warped maps are called cartograms. There are already several of these at the state-by-state level on the web. I haven't seen any at the county-by-county level. A few years ago I collaborated briefly with David Dobkin and Stephen North on algorithms for producing cartograms. I can say that making a cartogram with so many individual elements (counties) would be very difficult.
Cartograms are indeed hard to compute: this is an interesting geometric problem, as Jeff Erickson also points out.

As it turns out, the place I work at does cartograms, and quite well at that ! Stephen North (the one mentioned above) and colleagues have developed methods for computing cartograms and I used their approach to create a cartogram of the election results. I used the data from a paper by Daniel Keim, Stephen North, and Christian Panse that uses the medial axis to construct a cartogram.

[UPDATE: the original maps had a problem in labelling Nevada because of a mismatch between county names in the two data sets: the new maps shown here reflect the correction. Thanks to the commenter who pointed this out.]

I used a variety of color schemes: for more information, and larger pics, see my new cartograms page. (note: for formatting purposes I used HTML to scale the images; if you download the image, it will be bigger)

  1. A Vanderbei-like color scheme, where the relative proportion of votes for Bush or Kerry turns the county red or blue respectively: purple regions are roughly even.



  2. The winner-take-all color scheme: a county that Bush won is marked in red, and one that Kerry one is marked in blue.


  3. Red-blue (suggested by a poster on the blog). Start with a baseline of white for equal vote-share. as the vote percentage for the winner increases, increase the strength of red or blue appropriately.



  4. Grayscale (suggested by Kathryn Myronuk). Start with a baseline of white for equal vote-share. As the vote percentage for the winner increases, make the color grayer.


  5. ROYGBIV (suggested by Kathryn Myronuk). Again, white is neutral, but go towards the R side or the V side of the color spectrum depending on who wins and how much (thresholds at .7, 0.6, .53)




It takes a while getting used to a cartogram, and having to ensure that counties still touch after being inflated can make the problem quite difficult to solve while still retaining an overall representative shape. However, it is interesting to see how large California and the North-east really are in context, and how the entire middle-west of the country shrinks.

As far as I know, the complexity of computing a cartogram (where the typical formulation would be to minimize some error metric on the discrepancy between the area of a region and weight associated with it) is unknown, and is probably NP-hard (I once thought I had a proof for NP-hardness for rectangular cartograms, but it foundered). However, much of the challenge in cartogram design comes from trying to balance accuracy and aesthetics; the map when distorted should still look like the original map !

For more on this, check out the AT&T Info Viz page on spatial data transformation.

Thursday, November 04, 2004

Unicode is cool

This is my name, typed in Hindi using the INSCRIPT keyboard.

सुरेश वेंकटसुब्रमणियन
and in Tamil:

சுரேஷ் வெங்கடசுப்ரமணியன்


In order to do this earlier, I needed all kinds of LaTeX goo...

Incidentally, this site has all kinds of useful information on entering and reading characters from different languages.

Aside: I just realized that Safari mangles the fonts: Hindi slightly, but the Tamil is completely unrecognizable. I can understand that the mapping table for Tamil is not loaded, but the Hindi rendering has the right map, but is wrongly rendered. Interesting...

Monday, November 01, 2004

Another e-publishing model

Another example of how ease of publishing on the internet is making new models of academic discourse possible: Brad DeLong points to the Encyclopedia of Economic and Business History, a new article collection that
is designed to provide students and laymen with high quality reference articles in the field. Articles for the Online Encyclopedia are written by experts, screened by a group of authorities, and carefully edited. A distinguished Advisory Board recommends entry topics, assists in the selection of authors, and defines the project's scope.
This places it somewhat below a survey journal in terms of scholarly status, but above other web resources for laymen like Wikipedia or Mathworld or even ScienceWorld in terms of reliability and academic rigor.

It's a laudible endeavour. I suspect it works well in this area because economics have a tradition of public writings in addition to their academic work (as evidenced by the number of economics blogs). Given the levels of economic illiteracy among the public, and the importance of economic issues in making policy, such a resource can be of immense benefit for those of us trying to follow the often arcane and impenetrable discussions that swirl around issues like outsourcing, the economy, social security and tax cuts.
Subscribe to: Comments (Atom)

Disqus for The Geomblog

AltStyle によって変換されたページ (->オリジナル) /