Re: Convert utf8 string to iso8859-1 (Latin1)


On 2015-02-11 20:29, Sean Conner wrote:
It was thus said that the Great Igor Medeiros once stated:
Dear contributors,
Is there a way to convert a string whose characters are encoded in UTF-8 to
a string with characters encoded in ISO-8859-1, using only the Lua standard
libraries? I cannot use libraries that contain C code.
If there is, could you tell me how to do that, or point me to a site with
this information?
 I don't know of any existing Lua code to do this, but the concept is
straightforward:
	1. Convert the UTF-8 sequence to a Unicode codepoint
	   (http://en.wikipedia.org/wiki/UTF-8).
	2. Convert the Unicode codepoint
	   (http://www.unicode.org/Public/UCD/latest/charts/CodeCharts.pdf WARNING:
	   LARGE PDF) to an ISO-8859-1 codepoint
	   (http://en.wikipedia.org/wiki/ISO/IEC_8859-1).
	3. Go back to step 1 if there is more data.
 -spc (That should be enough to get you going ... )
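
For interpreters without a utf8 library (Lua 5.1 or 5.2, for instance), the steps above can be sketched in plain Lua. This is only a rough sketch, not tested code from the thread: the name utf8_to_latin1_manual and the "?" replacement are assumptions made for illustration, the input is not validated as well-formed UTF-8, and step 2 relies on Latin-1 byte values being equal to Unicode codepoints U+0000 to U+00FF.

```lua
-- Hypothetical helper (not from the thread): UTF-8 to ISO-8859-1 using only
-- the string library. Characters outside Latin-1 become `replacement`.
function utf8_to_latin1_manual(s, replacement)
  replacement = replacement or "?"
  -- A multi-byte sequence is a lead byte (0xC0-0xFF) followed by
  -- continuation bytes (0x80-0xBF); ASCII bytes are left untouched.
  return (s:gsub("[\192-\255][\128-\191]*", function(seq)
    local b1, b2 = seq:byte(1, 2)
    -- Only two-byte sequences led by 0xC2 or 0xC3 encode U+0080..U+00FF.
    if #seq == 2 and (b1 == 0xC2 or b1 == 0xC3) then
      -- Step 1: low bits of the lead byte, then 6 bits of the continuation.
      local codepoint = (b1 - 0xC0) * 64 + (b2 - 0x80)
      -- Step 2: the Latin-1 byte value equals the codepoint.
      return string.char(codepoint)
    end
    return replacement
  end))
end

-- "café" in UTF-8 (é = 0xC3 0xA9) becomes "caf" followed by byte 0xE9.
print(utf8_to_latin1_manual("caf\195\169") == "caf\233")
```

Step 3 is handled by gsub, which walks the whole string and calls the function once per multi-byte sequence.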
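Actually, the first 256 Unicode codepoints (U+0000 to U+00FF) are identical to ISO-8859-1, so step 2 needs no mapping table; only the UTF-8 encoding differs. The following suffices to do the conversion with Lua 5.3, provided the input contains only characters that exist in Latin-1: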
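 function utf8_to_latin1(s)
   local r = {}
   -- utf8.codes yields each codepoint; Latin-1 bytes equal codepoints 0-255.
   for _, c in utf8.codes(s) do
     -- string.char raises an error if c > 255 (not representable in Latin-1).
     r[#r + 1] = string.char(c)
   end
   return table.concat(r)
 end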
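
Note that string.char rejects values above 255, so the function above raises an error as soon as the input contains a character outside Latin-1 (the euro sign, for example). If substituting is preferable to failing, a variant along these lines might do; the name utf8_to_latin1_lossy and the default "?" replacement are assumptions for illustration, and it still requires Lua 5.3 or later for the utf8 library.

```lua
-- Hypothetical lossy variant (assumes Lua 5.3+): substitute a replacement
-- string for any codepoint that has no Latin-1 equivalent.
function utf8_to_latin1_lossy(s, replacement)
  replacement = replacement or "?"
  local parts = {}
  for _, c in utf8.codes(s) do
    -- Codepoints 0-255 map directly to Latin-1 bytes; the rest are replaced.
    parts[#parts + 1] = c <= 0xFF and string.char(c) or replacement
  end
  return table.concat(parts)
end

-- U+20AC (euro sign) is not in Latin-1, so it is replaced.
print(utf8_to_latin1_lossy("\u{20AC}5") == "?5")
```

utf8.codes itself still raises an error on invalid UTF-8 input, so this variant only covers well-formed strings.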
HTH, Christian
