Try writing "The first conquistador in what is now the US was Juan Ponce de Leon and the last was Don Juan de Onate Salazar." There's an o-with-acute-accent in Leon, and there's an n-with-tilde in Onate).
Try writing "Noroveirusyking a HliX", which is a headline from today's MorgunblaXiX (a newspaper in Iceland).
Try writing "Don't use the <blink> element!"
Or try writing some of the other problems I pointed out earlier. (A parent to this comment.)
* They do not work. *
If the challenge was "... as long as the input is in ASCII and doesn't include the '<' and '>' and '&' characters" then that's different. But that's not the challenge.At the very least, raise an exception for out-of-range characters. The current code hacks some Latin-1 characters to ASCII, others to "X", and encodes characters >= 256 to &# escape codes. This is wrong.
To which kens added that because the server doesn't set the content-type encoding, if the browser autodetects the ASCII as being utf-7 then there's another possible attack.
XSi! Antligen! Tschuss!
-----