I found an answer for this here, but it's in PHP.
I would like to match an array like [123, "hehe", "lala"] but only if the array syntax is correct.
I made this regex /\["?.+"?(?:,"?.+"?)*\]/.
The problem is that if the input is [123, "hehe, "lala"], the regex matches, even though the syntax is incorrect.
How can I make it only match if the array syntax is correct?
My problem is making the second " required when the first " is matched.
Edit: I'm only trying to do this with strings and numbers inside the array.
3 Answers
You can try this regex: /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/
Each item should be either:
"([^"]|\\")*?": a string that starts and ends with " and contains anything but ". A " inside the string must be escaped (\").
\d+: a number
Each item should be followed by:
\s*,?\s*: an optional comma with any amount of whitespace before or after it.
The closing bracket must not be directly preceded by a comma: (?<!,)
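As a quick check, here is a minimal sketch that runs the pattern from this answer against the two inputs from the question. The third sample and the variable names are my own additions, not part of the answer. Because the comma is optional in \s*,?\s*, a list without any commas is accepted as well.

```javascript
// Pattern copied verbatim from the answer; the lookbehind (?<!,) needs an ES2018+ engine.
const arrayPattern = /\[((\d+|"([^"]|\\")*?")\s*,?\s*)*(?<!,)\]/;

// The first two samples come from the question; the third is an extra probe (assumption).
const samples = [
  '[123, "hehe", "lala"]', // well-formed
  '[123, "hehe, "lala"]',  // closing quote missing after hehe
  '[1 2 3]',               // no commas at all
];

const outcomes = samples.map((text) => arrayPattern.test(text));
samples.forEach((text, i) => console.log(text, "=>", outcomes[i]));
// prints true, false, true (the last one because the comma is optional)
```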
Comments
You must have two (or more) separate expressions (using the | operator) in order to do that.
So it would be something like this:
/\[\s*("[^"]*"|[0-9]+)(\s*,\s*("[^"]*"|[0-9]+))*\s*\]/
(You may also want to use ^ at the start and $ at the end to make sure nothing else appears before/after the array: /^...snip...$/ to match the string from start to finish.)
If you need floating-point numbers with exponents, add a period and the 'e' character: [0-9.eE]+ (which is why I did not use \d+, since that allows only digits). Making sure a number is actually valid is, obviously, much more complicated (sign, exponent with or without a sign, digits only before or only after the decimal point, ...).
You could also support single quoted strings. That too is a separate expression: '[^']*'.
You may want to allow spaces before and after the brackets too (start: /^\s*\[... and end: ...\]\s*$/).
Finally, if you really want to support JavaScript strings, you would need to add support for the backslash. Something like this: ("([^"\\]|\\.)*").
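The pieces described above can be combined roughly as follows. This is only a sketch: the names doubleQuoted, singleQuoted, integer, item and strictArray are mine, and the string sub-patterns use [^"\\] rather than a plain [^"] so that a backslash can only be consumed as part of an escape pair.

```javascript
// Building blocks; all names here are illustrative, not taken from the answer.
const doubleQuoted = String.raw`"(?:[^"\\]|\\.)*"`; // "..." with backslash escapes
const singleQuoted = String.raw`'(?:[^'\\]|\\.)*'`; // '...' with backslash escapes
const integer = String.raw`[0-9]+`;                 // digits only, as in the answer
const item = `(?:${doubleQuoted}|${singleQuoted}|${integer})`;

// Anchored, with optional whitespace around the brackets and around each comma.
const strictArray = new RegExp(
  String.raw`^\s*\[\s*${item}(?:\s*,\s*${item})*\s*\]\s*$`
);

console.log(strictArray.test(String.raw`["a\"b", 'c', 12]`)); // true
console.log(strictArray.test(String.raw`["\"]`));             // false: string never closes
console.log(strictArray.test('[123, "hehe, "lala"]'));        // false
console.log(strictArray.test(' [1, 2] '));                    // true
```

Like the expression in the answer, this requires at least one item, so an empty [] is rejected.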
Note
Your .+ expression would match " and , too, and without ^ and $ a string such as the following matches your expression just fine:
This Array ["test", 123, true, "this"] Here
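Both points can be checked with a few lines. This is a small sketch: the anchored variant is my own illustration of adding ^ and $ to the question's regex, not something the asker wrote.

```javascript
// The regex from the question, unchanged.
const original = /\["?.+"?(?:,"?.+"?)*\]/;
// Hypothetical anchored variant, to show that anchors alone do not fix .+
const anchored = /^\["?.+"?(?:,"?.+"?)*\]$/;

const embedded = 'This Array ["test", 123, true, "this"] Here';
const malformed = '[123, "hehe, "lala"]';

console.log(original.test(embedded));  // true: matches inside the longer text
console.log(anchored.test(embedded));  // false: anchors reject the surrounding text
console.log(anchored.test(malformed)); // true: .+ still swallows the stray quote
```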
Comments
When the input string should only consist of an array literal, then JSON.parse is the way to go. If the input parses without error, you can still continue to check whether the parsed array consists of only numbers and strings (through normal iteration).
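A minimal sketch of that approach, assuming the whole input is supposed to be one array (the function name isStringNumberArray is made up for this example):

```javascript
// Returns true only if `text` parses as a JSON array of strings and numbers.
function isStringNumberArray(text) {
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch (err) {
    return false; // not valid JSON at all
  }
  return (
    Array.isArray(parsed) &&
    parsed.every((v) => typeof v === "string" || typeof v === "number")
  );
}

console.log(isStringNumberArray('[123, "hehe", "lala"]'));       // true
console.log(isStringNumberArray('[123, "hehe, "lala"]'));        // false
console.log(isStringNumberArray('["test", 123, true, "this"]')); // false: contains a boolean
```

Keep in mind that JSON is stricter than JavaScript literal syntax: single-quoted strings, hex numbers and the like are rejected by JSON.parse.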
If, however, you need to find substring matches inside a longer input string, then realise that the escaping backslash can escape other characters too, not least a backslash itself. That's why some of the answers here fail on some inputs.
Also:
- numbers can be more than just digits (decimal point, sign, hex, octal, binary, scientific, underscore separators, ...)
- string literals can be encoded with single quotes too
Here are some test cases to verify whether a solution matches the array literals in them (arrays containing only number or string literals):
Positive cases:
array [] in text
array [ ] in text
array [ -1 ] in text
array [0b101] in text
array [1E+10] in text
array [1.1e-1_0] in text
array [0xFFaF] in text
array [-9.01] in text
array [.01] in text
array [+12.] in text
array [1_2.3_4e5_6] in text
array [""] in text
array ["\\"] in text
array ["\\\""] in text
array ["]]]]]]]]]"] in text
array ["\"]...."] in text
array [ '', ''] in text
array ['you\'re',"in" ] in text
Negative cases:
no array [ in text
no array ][ in text]
no array [0p10] in text
no array [1e] in text
no array [e-10] in text
no array [FFaF] in text
no array [-9,] in text
no array [+] in text
no array [.] in text
no array [,01] in text
no array [1_2_.3] in text
no array [_2.3] in text
no array [2._3] in text
no array ["] in text
no array ["\"] in text"
no array ['"] in text
no array ["" ""] in text
no array [1 2 3] in text
And here is a regex for it:
\[(?:(?<!\])\s*(?:(?:[+-]?(?:(?:\d_?)+(?<!_)(?:\.(?:\d_?)*)?|\.(?:\d_?)+)(?<!_)(?:e[+-]?(?:\d_?)+)?|0b(?:[01]_?)+|0x(?:[\da-f]_?)+)|(['"])(?:\\.|(?!\1)[^\\])*?\1)(?<!_)\s*(?:,|(?=\])))*(?<!,)\s*\]
It needs the i flag.
See it on regex101
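For a local check without regex101, a small harness along these lines can run the expression (with the backreference written as \1) over a handful of the cases listed above. The variable names are mine and only a sample of the cases is included. String.raw keeps the backslashes in the test inputs exactly as written.

```javascript
// The regex from this answer, with the i flag it needs.
const arrayLiteral = /\[(?:(?<!\])\s*(?:(?:[+-]?(?:(?:\d_?)+(?<!_)(?:\.(?:\d_?)*)?|\.(?:\d_?)+)(?<!_)(?:e[+-]?(?:\d_?)+)?|0b(?:[01]_?)+|0x(?:[\da-f]_?)+)|(['"])(?:\\.|(?!\1)[^\\])*?\1)(?<!_)\s*(?:,|(?=\])))*(?<!,)\s*\]/i;

// A sample of the positive cases.
const shouldMatch = [
  String.raw`array [] in text`,
  String.raw`array [ -1 ] in text`,
  String.raw`array [0xFFaF] in text`,
  String.raw`array ['you\'re',"in" ] in text`,
];

// A sample of the negative cases.
const shouldNotMatch = [
  String.raw`no array [ in text`,
  String.raw`no array [1e] in text`,
  String.raw`no array [-9,] in text`,
  String.raw`no array ["\"] in text"`,
  String.raw`no array [1 2 3] in text`,
];

// Collect every case that behaves differently from its list.
const failures = [
  ...shouldMatch.filter((s) => !arrayLiteral.test(s)),
  ...shouldNotMatch.filter((s) => arrayLiteral.test(s)),
];
console.log(failures.length === 0 ? "all sampled cases behave as listed" : failures);
```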
NB: The ECMAScript syntax rules for number and string literals go beyond this. For instance, string literals can include character escape codes (like \251, \u2024, \x10, \u{2F804}, ...) and while the above regex will not reject those, it will not check whether a valid character code is specified.
.+ does not restrict much, only line breaks. Also, you are missing anchors, ^ and $, on both sides. Note that you need to make sure you support escape sequences, too. It is hardly a job for a regex in the end, though possible. JSON.parse does that: try { JSON.parse('["one", 1,2]'); console.log("valid"); } catch(e) { console.log("invalid"); }