4

I'm pulling a JSON feed that is invalid JSON. It's missing quotes entirely. I've tried a few things, like explode() and str_replace(), to get the string looking a bit more like valid JSON, but because there's a nested array of objects inside, it generally gets mangled.

Here's an example:

id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5

Are there any JSON parsers for PHP out there that can handle invalid JSON like this?

Edit: I'm trying to use json_decode() on this string. It returns nothing.
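
A quick way to confirm that the failure is a syntax problem rather than, say, an encoding issue is to ask PHP for the last JSON error. This is only a minimal sketch using the sample string above; json_last_error() and json_last_error_msg() are standard PHP functions (the latter needs PHP 5.5 or later), and the variable names are made up for illustration.

```php
<?php
// The feed from the question: unquoted keys, single-quoted strings, no outer braces.
$feed = "id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5";

// json_decode() returns null when the input cannot be parsed.
$result = json_decode($feed);
var_dump($result);

// The error code and message say why decoding failed.
if (json_last_error() !== JSON_ERROR_NONE) {
    echo 'json_decode failed: ', json_last_error_msg(), "\n";
}
```

Note that null is also what json_decode() returns for the valid input "null", so checking json_last_error() is the more reliable test.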

asked Oct 15, 2009 at 21:29
10
  • 1
    I don't believe numbers need quotes in JSON. Commented Oct 15, 2009 at 21:31
  • But the "keys" do, don't they? Like id:43015 should be "id":43015, right? Commented Oct 15, 2009 at 21:37
  • Yes, the problem is that the key names like "id" are not quoted Commented Oct 15, 2009 at 21:39
  • Additionally single quotes around strings are not allowed in JSON Commented Oct 15, 2009 at 21:39
  • 1
    You are right. The only solution I see is to patch one of the available parsers. Commented Oct 15, 2009 at 21:39

7 Answers

12
  1. All the quotes should be double quotes " and not single quotes '.
  2. All the keys should be quoted.
  3. The whole element should be an object.
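 function my_json_decode($s) {
     // Escape existing double quotes, then turn single quotes into double quotes.
     $s = str_replace(
         array('"', "'"),
         array('\"', '"'),
         $s
     );
     // Quote every bare word that is followed by a colon.
     $s = preg_replace('/(\w+):/i', '"\1":', $s);
     // Wrap the result in braces so that it decodes as a single object.
     return json_decode(sprintf('{%s}', $s));
 }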
answered Oct 15, 2009 at 21:52

1 Comment

Try setting a value to a URL or anything else with a colon in it. This will not work. (e.g. id:43015,name:'http:John Doe',lev ...)
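
One way around the colon problem raised in this comment is to match quoted strings and bare keys in a single pass, so that text inside a string is consumed as a string and never mistaken for a key. The following is only a sketch: relaxed_json_to_strict() is a hypothetical helper, not part of PHP or any library, and it assumes that single-quoted values do not already contain backslash-escaped double quotes.

```php
<?php
// Hypothetical helper: quote bare keys and convert single-quoted strings
// to double-quoted ones without touching the contents of any string.
function relaxed_json_to_strict(string $s): string
{
    $single = "'((?:[^'\\\\]|\\\\.)*)'";               // group 1: body of a single-quoted string
    $double = '"(?:[^"\\\\]|\\\\.)*"';                 // double-quoted string, kept as is
    $key    = '\b([A-Za-z_][A-Za-z0-9_]*)(?=\s*:)';    // group 2: bare key followed by a colon

    return preg_replace_callback(
        "~$single|$double|$key~s",
        function (array $m): string {
            if (isset($m[2]) && $m[2] !== '') {
                return '"' . $m[2] . '"';              // bare key: add quotes
            }
            if ($m[0][0] === "'") {
                $body = str_replace("\\'", "'", $m[1] ?? '');  // \' is not a JSON escape
                $body = str_replace('"', '\\"', $body);        // assumption: no pre-escaped "
                return '"' . $body . '"';
            }
            return $m[0];                              // already a valid double-quoted string
        },
        $s
    );
}

$feed = "id:43015,name:'http:John Doe',level:15";
$data = json_decode('{' . relaxed_json_to_strict($feed) . '}', true);
var_dump($data);
```

Because the regex engine tries the string alternatives at the opening quote first, the `http:` inside the value is swallowed as part of the string and is never seen by the key pattern.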
5

This regex will do the trick

$json = preg_replace('/([{,])(\s*)([A-Za-z0-9_\-]+?)\s*:/','$1"$3":',$json);
answered Mar 26, 2015 at 2:38

4

In my experience, Marko's answer doesn't work anymore. For newer PHP versions, use this instead:

$a = "{id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:988}],classs:0,subclass:5}";
$a = preg_replace('/(,|\{)[ \t\n]*(\w+)[ ]*:[ ]*/','$1"$2":',$a);
$a = preg_replace('/":\'?([^\[\]\{\}]*?)\'?[ \n\t]*(,"|\}$|\]$|\}\]|\]\}|\}|\])/','":"$1"$2',$a);
print_r($a);
answered Dec 28, 2012 at 21:20

1 Comment

Support for arrays:
$a = preg_replace('/(,|\{)[ \t\n]*(\w+)[ ]*:[ ]*/','$1"$2":',$a);
$a = preg_replace('/(,|\[)[ \t\n]*\'?\"?(\w+)\'?\"?/','$1"$2"',$a);
$a = preg_replace('/":\'?\"?([^\[\]\{\}]*?)\'?\"?[ \n\t]*(,"|\}$|\]$|\}\]|\]\}|\}|\])/','":"$1"$2',$a);
2

I know this question is old, but I hope this helps someone.

I had a similar problem, in that I wanted to accept JSON as a user input, but didn't want to require tedious "quotes" around every key. Furthermore, I didn't want to require quotes around the values either, but still parse valid numbers.

The simplest way seemed to be writing a custom parser.

I came up with this, which parses to nested associative / indexed arrays:

function loose_json_decode($json) {
    $rgxjson = '%((?:\{[^\{\}\[\]]*\})|(?:\[[^\{\}\[\]]*\]))%';
    $rgxstr = '%("(?:[^"\\\\]*|\\\\\\\\|\\\\"|\\\\)*"|\'(?:[^\'\\\\]*|\\\\\\\\|\\\\\'|\\\\)*\')%';
    $rgxnum = '%^\s*([+-]?(\d+(\.\d*)?|\d*\.\d+)(e[+-]?\d+)?|0x[0-9a-f]+)\s*$%i';
    $rgxchr1 = '%^'.chr(1).'\\d+'.chr(1).'$%';
    $rgxchr2 = '%^'.chr(2).'\\d+'.chr(2).'$%';
    $chrs = array(chr(2),chr(1));
    $escs = array(chr(2).chr(2),chr(2).chr(1));
    $nodes = array();
    $strings = array();
    # escape use of chr(1)
    $json = str_replace($chrs,$escs,$json);
    # parse out existing strings
    $pieces = preg_split($rgxstr,$json,-1,PREG_SPLIT_DELIM_CAPTURE);
    for($i=1;$i<count($pieces);$i+=2) {
        $strings []= str_replace($escs,$chrs,str_replace(array('\\\\','\\\'','\\"'),array('\\','\'','"'),substr($pieces[$i],1,-1)));
        $pieces[$i] = chr(2) . (count($strings)-1) . chr(2);
    }
    $json = implode($pieces);
    # parse json
    while(1) {
        $pieces = preg_split($rgxjson,$json,-1,PREG_SPLIT_DELIM_CAPTURE);
        for($i=1;$i<count($pieces);$i+=2) {
            $nodes []= $pieces[$i];
            $pieces[$i] = chr(1) . (count($nodes)-1) . chr(1);
        }
        $json = implode($pieces);
        if(!preg_match($rgxjson,$json)) break;
    }
    # build associative array
    for($i=0,$l=count($nodes);$i<$l;$i++) {
        $obj = explode(',',substr($nodes[$i],1,-1));
        $arr = $nodes[$i][0] == '[';
        if($arr) {
            for($j=0;$j<count($obj);$j++) {
                if(preg_match($rgxchr1,$obj[$j])) $obj[$j] = $nodes[+substr($obj[$j],1,-1)];
                else if(preg_match($rgxchr2,$obj[$j])) $obj[$j] = $strings[+substr($obj[$j],1,-1)];
                else if(preg_match($rgxnum,$obj[$j])) $obj[$j] = +trim($obj[$j]);
                else $obj[$j] = trim(str_replace($escs,$chrs,$obj[$j]));
            }
            $nodes[$i] = $obj;
        } else {
            $data = array();
            for($j=0;$j<count($obj);$j++) {
                $kv = explode(':',$obj[$j],2);
                if(preg_match($rgxchr1,$kv[0])) $kv[0] = $nodes[+substr($kv[0],1,-1)];
                else if(preg_match($rgxchr2,$kv[0])) $kv[0] = $strings[+substr($kv[0],1,-1)];
                else if(preg_match($rgxnum,$kv[0])) $kv[0] = +trim($kv[0]);
                else $kv[0] = trim(str_replace($escs,$chrs,$kv[0]));
                if(preg_match($rgxchr1,$kv[1])) $kv[1] = $nodes[+substr($kv[1],1,-1)];
                else if(preg_match($rgxchr2,$kv[1])) $kv[1] = $strings[+substr($kv[1],1,-1)];
                else if(preg_match($rgxnum,$kv[1])) $kv[1] = +trim($kv[1]);
                else $kv[1] = trim(str_replace($escs,$chrs,$kv[1]));
                $data[$kv[0]] = $kv[1];
            }
            $nodes[$i] = $data;
        }
    }
    return $nodes[count($nodes)-1];
}

Note that it does not catch errors or bad formatting...

For your situation, it looks like you'd want to add {}'s around it (as json_decode also requires):

$data = loose_json_decode('{' . $json . '}');

which for me yields:

array(6) {
  ["id"]=>
  int(43015)
  ["name"]=>
  string(8) "John Doe"
  ["level"]=>
  int(15)
  ["systems"]=>
  array(1) {
    [0]=>
    array(5) {
      ["t"]=>
      int(6)
      ["glr"]=>
      int(1242)
      ["n"]=>
      string(6) "server"
      ["s"]=>
      int(185)
      ["c"]=>
      int(9)
    }
  }
  ["classs"]=>
  int(0)
  ["subclass"]=>
  int(5)
}
answered Jan 28, 2017 at 19:33

1
$json = preg_replace('/([{,])(\s*)([A-Za-z0-9_\-]+?)\s*:/','$1"$3":',$json); // wrap bare keys in double quotes
$json = str_replace("'",'"', $json); // replace single quotes with double quotes

This solution seems to be enough for most common purposes.
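
As a rough end-to-end check, here are the two lines applied to the feed from the question, followed by one input where they fall short. This is a sketch only: decode_loose_feed() is a made-up wrapper name, and the braces are added because the feed has none.

```php
<?php
// Hypothetical wrapper around the two replacements above.
function decode_loose_feed(string $feed): ?array
{
    $json = '{' . $feed . '}';  // the feed lacks the outer braces
    $json = preg_replace('/([{,])(\s*)([A-Za-z0-9_\-]+?)\s*:/', '$1"$3":', $json);
    $json = str_replace("'", '"', $json);
    $data = json_decode($json, true);
    return is_array($data) ? $data : null;  // null signals a decode failure
}

$feed = "id:43015,name:'John Doe',level:15,systems:[{t:6,glr:1242,n:'server',s:185,c:9}],classs:0,subclass:5";
$data = decode_loose_feed($feed);
var_dump($data['name'], $data['systems'][0]['n']);

// Limitation: a value containing ", word:" is mistaken for a key,
// so the repaired text is no longer valid JSON.
$broken = decode_loose_feed("note:'a, b:c'");
var_dump($broken);
```

The second call shows why "most common purposes" is the right qualifier: the key regex has no notion of being inside a string, and the blanket quote replacement will likewise break any value that contains an apostrophe.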

answered Aug 17, 2015 at 11:02

0

I'd say your best bet is to download the source of a JSON decoder (they're not huge) and fiddle with it, especially if you know what's wrong with the JSON you're trying to decode.

The example you provided needs { } around it, too, which may help.

answered Oct 15, 2009 at 21:51

0

This is my solution for removing trailing, leading, and repeated commas. It can be combined with the other answers that replace single quotes and add quotes around JSON keys. I realize this is not relevant to the OP, as it deals with a different kind of invalid JSON, but I hope it helps someone who finds this question through a Google search.

function replace_unquoted_text ($json, $f)
{
    // Find every double-quoted string together with its byte offset.
    $matches = array();
    preg_match_all('/(")(?:(?=(\\\\?))\2.)*?\1/', $json, $matches, PREG_OFFSET_CAPTURE);
    //echo '<pre>' . json_encode($matches[0]) . '</pre>';
    // Boundaries alternate: unquoted start, quoted start, quoted end, ..., end of input.
    $matchIndexes = [0];
    foreach ($matches[0] as $match)
    {
        array_push($matchIndexes, $match[1]);
        array_push($matchIndexes, strlen($match[0]) + $match[1]);
    }
    array_push($matchIndexes, strlen($json));
    $components = [];
    for ($n = 0; $n < count($matchIndexes); $n += 2)
    {
        $startIDX = $matchIndexes[$n];
        $finalExclIDX = $matchIndexes[$n + 1];
        //echo $startIDX . ' -> ' . $finalExclIDX . '<br>';
        // Copy the quoted string that precedes this unquoted segment unchanged.
        // This happens before the empty-segment check so the string is never dropped.
        $prevIDX = ($n === 0) ? 0 : $matchIndexes[$n - 1];
        array_push($components, substr($json, $prevIDX, $startIDX - $prevIDX));
        $len = $finalExclIDX - $startIDX;
        if ($len === 0) continue;
        // Apply the callback to the unquoted segment only.
        array_push($components, $f(substr($json, $startIDX, $len)));
    }
    //echo '<pre>' . json_encode($components) . '</pre>';
    return implode("", $components);
}
function json_decode_lazy ($jsonSnip) {
    return json_decode(fix_lazy_json($jsonSnip));
}
function fix_lazy_json ($json) {
    return replace_unquoted_text($json, 'fix_lazy_snip');
}
function fix_lazy_snip ($jsonSnip) {
    return remove_multi_commas_snip(remove_leading_commas_snip(remove_trailing_commas_snip($jsonSnip)));
}
function remove_leading_commas ($json) {
    return replace_unquoted_text($json, 'remove_leading_commas_snip');
}
function remove_leading_commas_snip ($jsonSnip) {
    return preg_replace('/([{[]\s*)(,\s*)*/', '$1', $jsonSnip);
}
function remove_trailing_commas ($json) {
    return replace_unquoted_text($json, 'remove_trailing_commas_snip');
}
function remove_trailing_commas_snip ($jsonSnip) {
    return preg_replace('/(,\s*)*,(\s*[}\]])/', '$2', $jsonSnip);
}
function remove_multi_commas ($json) {
    return replace_unquoted_text($json, 'remove_multi_commas_snip');
}
function remove_multi_commas_snip ($jsonSnip) {
    return preg_replace('/(,\s*)+,/', ',', $jsonSnip);
}
json_decode_lazy('[,,{,,,"a":17,,, "b":13,,,,},,,]'); // decodes the repaired text [{"a":17, "b":13}]

See on repl.it.

answered Jul 26, 2021 at 17:47
