Questions
- Are there edge-cases that I've missed?
- Do HTTP response header values ever contain JSON-like data?
- Any style pointers related to code readability?
- Are there other/better Vim (versions 8 or higher) builtins for what I'm attempting to do?
Story time
Currently I'm working on integrating an API with Vim (classy flavored Vim, not NeoVim), but the API, or the proxy I wrote, seems to intermittently return malformed HTTP responses. In an ideal world I'd split on \r\n\r\n to separate the headers from the body data of HTTP responses, but this timeline isn't ideal 🤷
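For reference, the ideal-world split could look something like the following minimal sketch. `SplitHTTPResponse` is a hypothetical helper name, not part of the plugin, and it assumes the response really does contain a blank line between headers and body.

```vim
" Hypothetical helper: split a well-formed response into header lines
" and a body at the first blank line (\r\n\r\n)
function! SplitHTTPResponse(response) abort
  let l:separator = "\r\n\r\n"
  let l:boundary = stridx(a:response, l:separator)
  if l:boundary < 0
    " No blank line found, so treat everything as body data
    return { 'headers': [], 'body': a:response }
  endif
  let l:head = strpart(a:response, 0, l:boundary)
  let l:body = strpart(a:response, l:boundary + len(l:separator))
  return { 'headers': split(l:head, "\r\n"), 'body': l:body }
endfunction

echo SplitHTTPResponse("HTTP/1.1 200 OK\r\nContent-Type: application/json\r\n\r\n{ \"key\": \"value\" }")
```

With the malformed responses shown next there is no blank line to find, so everything would land in `body`, which is why this approach is not enough on its own.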
Here are a few examples of the most offensively malformed HTTP responses:
Status line joined to body, and no headers
HTTP/1.1 200 OK{ "key": "value" }
No status line, and body joined to headers
Content-Type: application/json{ "key": "value" }
No status line or headers and dictionaries are touching
{ "key": "value" }{ "key": "value" }
Dictionaries are touching, with nested data or other fun stuff
{ "key": "\"val}{}ue\"" }{ "nested": { "dict": [419, 68] } }
... and for clarity, here's what an ideal HTTP response would look like:
HTTP/1.1 200 OK
Server: SimpleHTTP/0.6 Python/3.12.6
Date: Sat, 28 Sep 2024 23:29:00 GMT
Content-Type: application/json
{ "key": "value" }
{ "key": "\"val}{}ue\"" }
{ "nested": { "dict": [419, 68] } }
Notes about input
So far I've yet to find a simple RegExp that also handles nested dictionary data-structures, and/or escaped double-quotes, and/or curly-braces within double-quotes.
And the more complex RegExp attempts that satisfy some of the parsing requirements both fail with certain inputs, and more importantly do not spark joy.
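To illustrate the kind of failure meant here, below is a small sketch using a deliberately naive pattern. The pattern is an assumption made up for this example, not one of the actual attempts.

```vim
" Illustrative naive pattern (an assumption, not one of the real attempts)
let g:naive_pattern = '{[^}]*}'
let g:nested_input = '{ "nested": { "dict": [419, 68] } }'

" The match stops at the first closing curly-brace, so the outer one is lost
let g:naive_match = matchstr(g:nested_input, g:naive_pattern)
echo g:naive_match
" → { "nested": { "dict": [419, 68] }
```

The truncated match is no longer valid JSON, and the same pattern would also stop early on a closing curly-brace inside a quoted string.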
Currently I don't need to worry about the API sending a list as a response, e.g. nothing like:
[{ "key": "value" }]
... But lists within dictionary values are expected.
Current solution
So far I've written a character-by-character scanning loop that tracks whether it is inside a string and how deeply it is nested within curly-braces, handles escaped quotes correctly (I believe), and calls json_decode whenever it detects a whole dictionary:
function! ExtractJSONDicts(data) abort
  let l:dictionary_list = []
  let l:index = 0
  let l:slice_start = 0
  let l:inside_string = v:false
  let l:escape_count = 0
  let l:curly_depth = 0
  while l:index < len(a:data)
    let l:character = a:data[l:index]
    if l:inside_string
      if l:character == '\'
        " Count consecutive backslashes in front of the next character
        let l:escape_count += 1
      else
        if l:character == '"'
          " An even count means the double-quote itself is not escaped
          if l:escape_count % 2 == 0
            let l:inside_string = v:false
          else
            let l:inside_string = v:true
          endif
        endif
        let l:escape_count = 0
      endif
    elseif l:character == '"'
      let l:inside_string = v:true
    elseif l:character == '{'
      let l:curly_depth += 1
    elseif l:character == '}'
      let l:curly_depth -= 1
      if l:curly_depth == 0
        " A whole dictionary spans from l:slice_start to l:index
        call add(l:dictionary_list, json_decode(a:data[l:slice_start:l:index]))
        let l:slice_start = l:index + 1
        let l:inside_string = v:false
        let l:escape_count = 0
      endif
    elseif l:curly_depth == 0
      " Skip characters outside of any dictionary (status line, headers)
      let l:slice_start = l:index + 1
    endif
    let l:index += 1
  endwhile
  return l:dictionary_list
endfunction
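As a quick sanity check, the malformed examples from earlier can be fed through the function like so. This is only a sketch that assumes `ExtractJSONDicts` from above has already been sourced, and it leans on the builtin `assert_equal()` and `v:errors` rather than the real unit tests.

```vim
" Assumes ExtractJSONDicts() from above is already defined
let v:errors = []

" Status line joined to body, and no headers
call assert_equal([{'key': 'value'}],
      \ ExtractJSONDicts('HTTP/1.1 200 OK{ "key": "value" }'))

" No status line, and body joined to headers
call assert_equal([{'key': 'value'}],
      \ ExtractJSONDicts('Content-Type: application/json{ "key": "value" }'))

" No status line or headers and dictionaries are touching
call assert_equal([{'key': 'value'}, {'key': 'value'}],
      \ ExtractJSONDicts('{ "key": "value" }{ "key": "value" }'))

" Escaped quotes, curly-braces within a string, and nested data
call assert_equal([{'key': '"val}{}ue"'}, {'nested': {'dict': [419, 68]}}],
      \ ExtractJSONDicts('{ "key": "\"val}{}ue\"" }{ "nested": { "dict": [419, 68] } }'))

" An empty list here means every assertion held
echo v:errors
```

Single-quoted Vim strings keep backslashes literally, which is why the escaped-quote input can be pasted as-is.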
For the brevity of this post, here is where the unit tests can be double-checked.
Related documentation
:help channel-raw
:help channel-open-options
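For context, a raw-mode channel might be wired up to the parser roughly as in the sketch below. The function names and the address passed in are hypothetical, `ExtractJSONDicts` is assumed to be sourced already, and the real plugin's callback differs.

```vim
" Hypothetical callback: in raw mode Vim hands over data as received,
" so a message may hold several responses joined together
function! HandleRawChunk(channel, message) abort
  for l:dictionary in ExtractJSONDicts(a:message)
    echomsg string(l:dictionary)
  endfor
endfunction

" Hypothetical wrapper, the address (e.g. 'localhost:8080') is up to the caller
function! OpenExampleChannel(address) abort
  let l:options = { 'mode': 'raw', 'callback': 'HandleRawChunk' }
  return ch_open(a:address, l:options)
endfunction
```

Wrapping `ch_open()` in a function keeps the sketch from trying to connect anywhere when it is sourced.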
For curious readers, here be a perma-link to the source code of the callback parser function in the context of the plugin that I'm working on.