Re: Replace specific comma's in a string.


>>>>> "Coda" == Coda Highland <chighland@gmail.com> writes:
 Coda> Your discovery that it can't be done without loops is also fairly
 Coda> accurate. CSV parsing is one of the classic examples of "you
 Coda> really shouldn't try to do that with a regexp". If it's possible
 Coda> for values to CONTAIN quotes (i.e. by escaping) instead of just
 Coda> being DELIMITED by them, it's actually impossible (unless you use
 Coda> some Perlisms that go beyond the technical formalism of regular
 Coda> expressions).

Nonsense; CSV is clearly a regular language even when allowing quotes
inside the values.

Here is the definition from RFC 4180 (excluding the obvious terminals):

 file = [header CRLF] record *(CRLF record) [CRLF]
 header = name *(COMMA name)
 record = field *(COMMA field)
 name = field
 field = (escaped / non-escaped)
 escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
 non-escaped = *TEXTDATA

which corresponds to this regexp (assuming that negated classes such as
[^"] also match newlines, except where \r and \n are explicitly excluded):

^(("([^"]|"")*"|[^",\r\n]*)(,"([^"]|"")*"|,[^",\r\n]*)*(\r\n|$))*$
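
As a quick sanity check, the regexp above can be tried with Python's re
module. This is only a sketch and not part of the original message: the
helper name is_rfc4180 and the sample strings are made up for illustration.
re.fullmatch is used because Python's $ also matches just before a trailing
newline, which would otherwise let a bare LF at the end of the input slip
through.

```python
import re

# The regexp from the message, copied verbatim. In Python a negated class
# such as [^"] matches \r and \n, which is the assumption stated above.
CSV_RE = re.compile(
    r'^(("([^"]|"")*"|[^",\r\n]*)(,"([^"]|"")*"|,[^",\r\n]*)*(\r\n|$))*$'
)

def is_rfc4180(text: str) -> bool:
    """Hypothetical helper: True if the whole of text matches the regexp."""
    return CSV_RE.fullmatch(text) is not None

# Illustrative inputs only; none of these come from the thread.
samples = [
    'a,b,c\r\n1,2,3',          # two plain records separated by CRLF
    '"he said ""hi""",x',      # doubled quotes inside a quoted field
    '"multi\r\nline",ok',      # CRLF inside a quoted field
    'a"b,c',                   # stray quote in an unquoted field
    '"unterminated,x',         # quoted field that never closes
]

for s in samples:
    print(repr(s), is_rfc4180(s))
```

Note that this only answers whether the input is well-formed; splitting the
input into fields still needs a loop over the matches, since a repeated
capture group keeps only its last iteration.
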
 Coda> Meanwhile, gsub is LESS expressive than regexps.
Indeed.
-- 
Andrew.
