JSON is slow: deserialization and DeserializeObject

The latency of downloading anything from a server to a client is expected to increase with the size of the file.  Simply put, the bigger the file, the longer it takes to download.  But there is also a common step when writing an API that returns JSON, the deserialization of the data content, which can cause slowness.  Let’s first start with these two concepts.

  • Serialization
  • Deserialization

The act of serialization is to take some form of data, perhaps the result set of a database query, and write that data into a file for storage on a file server or hard drive.  The content of the file is structured in a format such as JSON, but it can also be XML or binary.  The important point is that the content of the file is in a format that can be understood by a client.

The act of deserialization is the opposite.  You write some code to read the data within a file and then cast it into a specific object structure or class.
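Here is a minimal sketch of both concepts in one round trip.  The Guitar class and its properties are assumptions made only for this illustration; it uses Newtonsoft.Json's JsonConvert, the same library as the examples further down.

```csharp
using Newtonsoft.Json;

// Serialization: in-memory object -> JSON text
var guitar = new Guitar { Make = "Fender", Strings = 6 };
string json = JsonConvert.SerializeObject(guitar);
// json now holds {"Make":"Fender","Strings":6}

// Deserialization: JSON text -> in-memory object
Guitar restored = JsonConvert.DeserializeObject<Guitar>(json);

// A hypothetical class used only for this illustration
public class Guitar
{
    public string Make { get; set; }
    public int Strings { get; set; }
}
```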

If you look at this from a strictly programming perspective, consider that you want to pass a Guitar object to a method within the same program.  You do that by creating an instance of the class, populating it with values (via the constructor, perhaps), and then passing that instance as a parameter of the method.  But what happens if you want to send a Guitar to an API on another server?  You can do that if you have all the attributes of the Guitar you want to send: the data is serialized (from memory or retrieved from a data source), passed over HTTP, and then deserialized on the receiving end.  Here is a common example.

var assembly = Assembly.GetExecutingAssembly();
// Embedded resource names are usually namespace-qualified,
// e.g. "MyApi.Guitars.json" - adjust to match your project
var resourceName = "Guitars.json";
Guitars guitars;
using (Stream stream = assembly.GetManifestResourceStream(resourceName))
{
    using (StreamReader reader = new StreamReader(stream))
    {
        string result = await reader.ReadToEndAsync();
        // Deserialize the raw JSON into the Guitars class (Newtonsoft.Json)
        guitars = JsonConvert.DeserializeObject<Guitars>(result);
    }
}
if (guitars?.Guitar?.Any() == true)
{
    return Ok(guitars);
}

By doing this you are sure that the contents of the Guitars.json file are in a format which maps to the Guitars class.  Whatever receives this result can be confident of the JSON structure being received and can use the data after populating it into a Guitars class.  Consider it a kind of contract.
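For reference, a plausible shape for that contract might look like the following.  The article does not show the actual classes, so the property names here are assumptions based on the `guitars?.Guitar` access in the code above.

```csharp
using System.Collections.Generic;
using Newtonsoft.Json;

// A small sample payload matching the assumed contract
string json = "{\"Guitar\":[{\"Make\":\"Gibson\",\"Model\":\"SG\"}]}";
Guitars guitars = JsonConvert.DeserializeObject<Guitars>(json);
// guitars.Guitar[0].Make is "Gibson"

// Hypothetical contract classes for Guitars.json
public class Guitars
{
    public List<Guitar> Guitar { get; set; }
}

public class Guitar
{
    public string Make { get; set; }
    public string Model { get; set; }
}
```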

If the JSON file is large (I consider 600 KB large), you might see high latency under high volume (I consider around 10 concurrent requests per second high volume).

Something you can try if you are experiencing this is to skip deserializing the object on the server side, where the server side is the one receiving the request and sending back the data in response.  Consider the following.

var assembly = Assembly.GetExecutingAssembly();
var resourceName = "Guitars.json";
string result = String.Empty;
using (Stream stream = assembly.GetManifestResourceStream(resourceName))
{
    using (StreamReader reader = new StreamReader(stream))
    {
        // Read the raw JSON as a string; no deserialization step
        result = await reader.ReadToEndAsync();
    }
}
if (!String.IsNullOrEmpty(result))
{
    return Ok(result);
}

In this case, you simply send back the JSON file without casting it to a Guitars class.  Consider this a method without a contract, but since you own the Guitars.json file, you can have high confidence that it is in the correct format.  So you need to make a trade-off if your solution has this symptom: do you want it to be performant or foolproof?  At the moment I am not aware of another easy-ish solution.  NOTE: you will likely require more memory in this scenario, since the entire raw string is held in memory.
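One detail to watch when returning the raw string: in ASP.NET Core, a string returned through Ok() is written by the default string formatter as text/plain, so clients that check the content type may mis-handle it.  Here is a sketch of one way around that; the controller and the LoadRawJson helper are hypothetical names, not from the original example.

```csharp
using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api/[controller]")]
public class GuitarsController : ControllerBase
{
    [HttpGet]
    public IActionResult Get()
    {
        string result = LoadRawJson();
        if (string.IsNullOrEmpty(result))
        {
            return NotFound();
        }
        // Content() states the content type explicitly, so the raw
        // string is still served as application/json
        return Content(result, "application/json");
    }

    // Hypothetical helper wrapping the stream-reading code shown earlier
    private string LoadRawJson()
    {
        var assembly = System.Reflection.Assembly.GetExecutingAssembly();
        using var stream = assembly.GetManifestResourceStream("Guitars.json");
        if (stream == null) return string.Empty;
        using var reader = new System.IO.StreamReader(stream);
        return reader.ReadToEnd();
    }
}
```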

What you also need to consider is that this code runs in real time.  Ask yourself whether this needs to be a real-time transaction, or whether it can instead be near real-time.  I wrote this article here about decoupling solutions using Azure messaging products.  If you have a program that serializes files, you can run it offline and send the output to Blob Storage, which triggers an Azure Function.  That Azure Function can then deserialize the data and place it where it needs to be, so you can stream it instead of doing all the work in real time, which can be slow.
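That decoupled pipeline could be sketched roughly like this, using the Azure Functions blob trigger model.  The function name, container name, and the Guitars type are all assumptions for illustration, not a definitive implementation.

```csharp
using System.IO;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;
using Newtonsoft.Json;

public static class ProcessGuitarsBlob
{
    // Fires when a serialized file lands in the (assumed) "guitars" container
    [FunctionName("ProcessGuitarsBlob")]
    public static void Run(
        [BlobTrigger("guitars/{name}")] Stream blob,
        string name,
        ILogger log)
    {
        using var reader = new StreamReader(blob);
        string json = reader.ReadToEnd();

        // Deserialize offline, outside the request path
        var guitars = JsonConvert.DeserializeObject<Guitars>(json);

        // Place the result where it needs to be (database, cache, etc.)
        log.LogInformation("Processed blob {Name}", name);
    }
}
```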

If I had this problem, this is how I would solve it.