Description
Hello! This is my first issue ever, so please go easy on me.
I have published a JSON library that I had been thinking about during my high school breaks. I hope to share the knowledge I have gained from it, and if possible, I would like to ask you to add it to your benchmarks.
The following article describes the general idea.
The following is my repository.
There are three major optimizations I used, setting aside the unsafe tricks.
- (Staged) sync.Pool

  By setting up stages in the pool, slices and other buffers can be reused more reliably. The idea is to create an array of pools and specify the size of slice that each pool accepts. My implementation is here. This eliminates waste and simultaneously filters out objects that are too large or too small.

- Bit-wise operations (SWAR)

  This can be used for many things, but I use it for byte searching. You can search multiple characters at once, which is efficient when assembly is not available. My implementation is here. It uses unsafe, but you can still replace that with encoding/binary. The algorithms are described here.

- Custom data structures

  One thing I noticed while profiling the decoder is that lookups tend to be the bottleneck (in this case, checking whether a field with a given name exists). So I first wrote a hash map using the Robin Hood hashing algorithm, which was faster than the standard map because it is quite strong at handling keys that don't exist. I also tried cuckoo hashing, which has an interesting insertion scheme and O(1) lookups. In the end I settled on a perfect hash function, since the key (field name) to value (field information) pairs are known from the start and never change afterwards. I also used an OAAT FNV hash function to lowercase the field name and hash it at the same time. My implementation (although it's probably too simple) is here.
These are the optimizations I used.
Next, regarding the addition to the benchmarks: I created a repository for reference. The results of benchmarks taken on darwin/amd64 with an Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz are here, and the code for the benchmarks is in the same repository. Check it out if you're interested.
As you can see, the library is quite strong at stream-decoding JSON thanks to the techniques described above.
By the way, I made the library compatible with the standard encoding/json, so the usage is the same.
Thank you for reading and have a good day.