Improve autocomplete performance with custom binary serialization

Using BinaryReader and BinaryWriter for fast custom de/serialization of C# objects

Improve autocomplete performance with custom binary serialization

The goal of 0.1 seconds

During the development of the back-end services for our upcoming SQL Studio web app, we wanted to implement fast autocomplete help to the user as he or she is typing the code. We set the goal for "fast" to be around 100 ms.

Autocomplete requires knowing the meta-data (data about the data) of any database.

Stateless containers

The back-end uses .Net Core code written in C# and hosted in Docker containers currently running on Amazon Web Services. The container instances can come and go and each container can serve many users. Keeping things in memory of a container is not very scalable. Each user request can be routed by the load-balancer to hit the different container.

Fetching the meta-data is a set of costly operations that hit the database, so
we needed to address the problem of caching the meta-data for any kind of database. The decision was made to use Redis cache server in which we'll put binary representation of the database.

Least effort approach

The problem of efficiently saving and reloading of the cached meta-data boils down to the serialization and deserialization of lists of objects containing the said meta-data.

In the first implementation we just needed the things to work, so we used the basic approach - save the objects to byte array using BinaryFormatter. It has nice and simple to use methods Serialize and Deserialize. Give it an object and it will use reflection and magic to spit out the byte array describing that object. Great - runs perfectly for sample databases with 10 tables, views and other objects.

But real databases are nowhere near 10 tables or procedures. Think more like thousands of objects. The most important was the deserialization part of the process, which for one such big database took around 400 ms. Combined with other required steps, like parsing and static analysis of the script document that user is typing, this simply was not acceptable.

Going "closer to the metal"

There are many libraries for C# that deal with serialization, offering various types of added value and functionality. But we decided to use our own using BinaryReader and BinaryWriter classes. These classes offer serialization of basic .Net types and that's all we needed. Lower that that would be harder, since we'd have to move within the (memory) streams by steps precisely the size of the data to be saved.

This way we have no versioning of serialized messages or format descriptions of any kind - but that was unnecessary anyways. Every class that needs to be serialized must have it's own implementation of how its created when read from byte array or how it converts itself to such array of bytes.

Total effect of using this technique resulted in deserialization that takes around 20 ms for the aforementioned real-world database - only 5% of the original basic approach. Now we have more room to perform other more interesting stuff for the autocomplete within the set performance goal.