Essentially as an exercise, I tried to write the smallest possible UTF-8 encoder in Lua [1]. Compared to a naive implementation like the one in [2], it is around 2.6 times shorter. Still, I am wondering whether the code could be shortened further (not counting whitespace removal).

[1] https://gist.github.com/b0ae016da7b8f0b221ff
[2] http://lwn.net/Articles/493167/ (that implementation doesn't handle 4-byte codes)
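
For readers who want a concrete picture of the problem, here is a rough sketch of a loop-based encoder. It is not the code from [1]; the name utf8_encode and the overall structure are my own assumptions for illustration. It assumes the input is a valid code point in the range 0 to 0x10FFFF and does not reject surrogates or out-of-range values. The idea is to peel off 6 bits at a time into continuation bytes until what remains fits in the lead byte, which covers the 2-, 3- and 4-byte cases with one loop.

```lua
-- Works on Lua 5.1 (global unpack) and 5.2+ (table.unpack).
local unpack = table.unpack or unpack

-- Hypothetical helper: encode one Unicode code point as a UTF-8 string.
-- Assumes 0 <= cp <= 0x10FFFF; no validation is performed.
local function utf8_encode(cp)
  if cp < 0x80 then
    return string.char(cp)              -- 1 byte: 0xxxxxxx
  end
  local tail = {}                       -- continuation bytes, in order
  local lead_max = 0x3F                 -- payload capacity of the lead byte
  local prefix = 0x80                   -- lead byte marker bits
  repeat
    -- Low 6 bits become a continuation byte: 10xxxxxx.
    table.insert(tail, 1, 0x80 + cp % 0x40)
    cp = math.floor(cp / 0x40)
    -- Each extra byte costs one payload bit in the lead byte
    -- (0x1F, 0x0F, 0x07) and adds one marker bit (0xC0, 0xE0, 0xF0).
    lead_max = math.floor(lead_max / 2)
    prefix = 0x80 + math.floor(prefix / 2)
  until cp <= lead_max
  return string.char(prefix + cp, unpack(tail))
end

-- Usage: the euro sign U+20AC encodes to three bytes.
local euro = utf8_encode(0x20AC)
print(#euro)
```

A table of fixed thresholds (0x80, 0x800, 0x10000) with one branch per length is the more common way to write this; the loop trades a little clarity for fewer lines.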