We have encountered some memory issues when validating JSON.
We are using:
- Newtonsoft.Json v13.0.3
- Newtonsoft.Json.Schema v4.0.1
- .NET 9.0.304
For validation we have this code:
```csharp
var validationResult = new Lazy<ReplyErrorReqV1Payload>();
var streamReader = new StreamReader(stream);
var jsonReader = new JsonTextReader(streamReader);
var validatingReader = new JSchemaValidatingReader(jsonReader);
validatingReader.Schema = schema;
validatingReader.ValidationEventHandler += (o, a) =>
{
    // simplified error handling...
    Console.WriteLine(a.Message);
};
while (await validatingReader.ReadAsync(cancellationToken))
{
    // force the reader through the stream...
}
```
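One partial mitigation, sketched below under the assumption that you only need a bounded error message: truncate `a.Message` before logging or storing it. This does not stop Json.NET from materializing the token during parsing, but it keeps the 10M-char value out of any copies you retain yourself (logs, error collections). The `MaxMessageLength` constant is illustrative.

```csharp
// Sketch: cap how much of the validation message we retain.
// Assumption: a truncated message is acceptable for diagnostics.
const int MaxMessageLength = 512; // illustrative limit, tune as needed

validatingReader.ValidationEventHandler += (o, a) =>
{
    var message = a.Message.Length > MaxMessageLength
        ? a.Message.Substring(0, MaxMessageLength) + " ...(truncated)"
        : a.Message;
    Console.WriteLine(message);
};
```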
Let's say we have this schema:
```json
{
  "type": "object",
  "additionalProperties": false,
  "required": ["demoId"],
  "properties": {
    "demoId": {
      "type": "string",
      "maxLength": 50
    }
  }
}
```
and we send a disproportionately large JSON payload into validation:

```json
{
  "demoId": "... some very large string token - let's say 10M chars ..."
}
```
While debugging this, I noticed that when the ValidationEventHandler fires, the error message contains the entire 10M-char value. According to dotMemory, the value is also held in memory about 4 times, which works out to roughly 10M chars * 2 bytes * 4 instances ≈ 80 MB.
Is there a way to handle this efficiently without writing a custom JsonTextReader or custom validation for extreme token lengths in the validated JSON?
I have stepped through the decompiled code and cannot find any hook or overridable method that would let me lower the memory requirements.
JsonTextReader always completely materializes every string it encounters, even when you call .Skip() -- and there are no extension points to disable this. See e.g. "JsonConvert Deserialize Object out of memory exception". (I'm a bit surprised about the string being in memory 4 times, though!) Switching to a System.Text.Json-based validator would not help directly either: it would parse the document into a System.Text.Json.Nodes.JsonNode hierarchy and just check the value's length in MaxLengthKeyword, so the entire 10M-char string would get materialized as well.
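Since the reader itself offers no hook, one fail-fast alternative is to guard one level lower, at the Stream. The sketch below (a hypothetical `LengthLimitedStream`, under the assumption that capping the *total* payload size is acceptable for your endpoint) throws before a pathological token can ever be materialized as a string:

```csharp
using System;
using System.IO;

// Sketch: a forward-only stream wrapper that aborts the read once the
// payload exceeds a byte budget. Assumption: legitimate payloads are small,
// so a total-size cap is an acceptable proxy for a per-token cap.
public sealed class LengthLimitedStream : Stream
{
    private readonly Stream _inner;
    private readonly long _maxBytes;
    private long _bytesRead;

    public LengthLimitedStream(Stream inner, long maxBytes)
    {
        _inner = inner ?? throw new ArgumentNullException(nameof(inner));
        _maxBytes = maxBytes;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = _inner.Read(buffer, offset, count);
        _bytesRead += n;
        if (_bytesRead > _maxBytes)
            throw new InvalidDataException($"Payload exceeds {_maxBytes} bytes.");
        return n;
    }

    // Minimal Stream plumbing for a read-only, non-seekable wrapper.
    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => _bytesRead;
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Usage would be a one-line change in the original snippet, e.g. `var streamReader = new StreamReader(new LengthLimitedStream(stream, 1_000_000));`. The trade-off is that validation fails with an I/O-level exception rather than a schema error, so you would handle it separately from ValidationEventHandler.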