Description
Hello! This is my first issue ever, so please go easy on me.
I have published a JSON library that I had been thinking about during my high school breaks. I hope to share the knowledge I have gained from it, and if possible, I would like to ask you to add it to your benchmarks.
The following article describes the general idea.
The following is my repository.
There are three major optimizations I used, setting aside the unsafe tricks.
- (Staged) sync.Pool

  By setting up stages in the pool, slices and other buffers can be reused more reliably. The idea is to create an array of pools and specify the size of slice that each pool accepts. My implementation is here. This eliminates waste and simultaneously filters out objects that are too large or too small.

- Bit-wise operations (SWAR)

  This can be used for many things, but I use it for byte searching. You can search multiple characters at once, which is efficient when assembly is not available. My implementation is here. It uses unsafe, but you can still replace that with encoding/binary. The algorithms are described here.

- Custom data structures

  One thing I noticed while profiling the decoder is that lookups tend to be the bottleneck (in this case, checking whether a field with a given name exists). So I first wrote a hash map using the Robin Hood hashing algorithm, which was faster than the standard map because it is quite strong at handling keys that don't exist. I also tried cuckoo hashing, which has an interesting insertion scheme and O(1) lookups. In the end I settled on a perfect hash function, since the key (field name) to value (field information) pairs are known from the start and never change afterwards. I also used an OAAT FNV hash function to lowercase the field name and hash it at the same time. My implementation (although it's probably too simple) is here.
These are the optimizations I used.
Next, regarding the addition to the benchmarks: I created a repository for reference. The results of benchmarks taken on darwin/amd64 with an Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz are here, and the code for the benchmarks is in the same repository. Check it out if you're interested.
As you can see, the library is quite strong at stream-decoding JSON thanks to the techniques described above.
By the way, I made the library compatible with the standard encoding/json, so the usage is the same.
Thank you for reading and have a good day.