
suggestions about its performance / addition to the benchmarks #8

Open
@sugawarayuuta

Description


Hello! This is my first issue ever, so go easy on me.
I have published a JSON library that I worked on during my high-school breaks, and I hope to share the knowledge I have gained from it. If possible, I would also like to ask you to add it to your benchmarks.

The following articles describe the general idea

The following is my repository

There are three major optimizations I used, excluding anything involving unsafe.

  1. (Staged) sync.Pool
    By setting up stages in the pool, slices and similar objects can be reused more reliably.
    The idea is to create an array of pools and specify the size of the slices that each pool accepts.
    My implementation is here.
    This eliminates waste and at the same time filters out objects that are too large or too small for a given pool.

  2. Bit-wise operations (SWAR)
    This can be used for several things, but I use it for byte searching. You can check multiple bytes at once, so it can be efficient when assembly is not available.
    My implementation is here.
    It uses unsafe, but you can still replace that with encoding/binary.
    The algorithms are described here.

  3. Custom data structures
    One thing I noticed while profiling the decoder is that lookups tend to be the bottleneck (in this case, checking whether a field with a given name exists). First I created a hash map using the Robin Hood hashing algorithm, which was faster than the standard map because it is quite strong at handling keys that don't exist. I also tried cuckoo hashing, which has an interesting insertion scheme and O(1) lookups. In the end I settled on a perfect hash function, since the key (field name) to value (field information) pairs are known from the start and never change afterwards. I also used a one-at-a-time (OAAT) FNV hash function to lowercase the field name and hash it at the same time.
    My implementation (although it's probably too simple) is here.
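The staged-pool idea from item 1 can be sketched roughly as below. The size classes, type names, and growth policy are my own illustrative choices for this sketch, not the library's actual layout:

```go
package main

import (
	"fmt"
	"math/bits"
	"sync"
)

// stagedPool keeps one sync.Pool per power-of-two size class, so a
// returned slice is only handed out to callers that can actually use
// its capacity. Hypothetical sketch of the "staged pool" idea.
type stagedPool struct {
	pools [8]sync.Pool // classes for capacities 64, 128, ..., 8192
}

const minShift = 6 // smallest class: 1<<6 = 64 bytes

// classOf maps a requested size to the smallest class whose
// capacity can hold it.
func classOf(size int) int {
	if size <= 1<<minShift {
		return 0
	}
	return bits.Len(uint(size-1)) - minShift
}

func (p *stagedPool) Get(size int) []byte {
	c := classOf(size)
	if c >= len(p.pools) {
		return make([]byte, size) // too large for any class: allocate directly
	}
	if v := p.pools[c].Get(); v != nil {
		return v.([]byte)[:size]
	}
	return make([]byte, size, 1<<(c+minShift))
}

func (p *stagedPool) Put(b []byte) {
	c := classOf(cap(b))
	// Only recycle slices whose capacity exactly matches a class,
	// so a pool never hands back less capacity than its class promises.
	if c < len(p.pools) && cap(b) == 1<<(c+minShift) {
		p.pools[c].Put(b[:0:cap(b)])
	}
}

func main() {
	var p stagedPool
	b := p.Get(100) // served from the 128-byte class
	fmt.Println(len(b), cap(b))
	p.Put(b)
}
```

Because each pool only ever holds slices of one capacity, a Get never receives something too small to use or wastefully large, which is the "eliminates waste" property described above.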
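The SWAR byte search from item 2 can be sketched with the classic "has zero byte" trick, using encoding/binary instead of unsafe as suggested. The function name and structure here are illustrative, not the library's actual code:

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// indexByteSWAR returns the index of the first occurrence of c in b,
// or -1. It scans eight bytes per step: XORing with a repeated pattern
// turns matching bytes into zero bytes, and the (w - ones) &^ w & highs
// expression sets the top bit of every byte lane that is zero.
func indexByteSWAR(b []byte, c byte) int {
	pattern := uint64(c) * 0x0101010101010101 // c repeated in every byte lane
	i := 0
	for ; i+8 <= len(b); i += 8 {
		w := binary.LittleEndian.Uint64(b[i:]) ^ pattern // zero lane where b[j] == c
		if m := (w - 0x0101010101010101) &^ w & 0x8080808080808080; m != 0 {
			// With a little-endian load, the lowest set bit marks the
			// first matching byte within the word.
			return i + bits.TrailingZeros64(m)/8
		}
	}
	for ; i < len(b); i++ { // scalar tail for the last < 8 bytes
		if b[i] == c {
			return i
		}
	}
	return -1
}

func main() {
	fmt.Println(indexByteSWAR([]byte(`{"name":"value"}`), '"')) // prints 1
}
```

A false "zero" flag can only appear in lanes above the first real match (borrow propagation), so taking the lowest set bit is always correct.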
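The field-lookup scheme from item 3 can be sketched as a seed-searched perfect hash combined with an OAAT FNV function that lowercases while it hashes. All names, constants, and the seed-search construction below are my own simplified assumptions; a real perfect-hash builder may work differently:

```go
package main

import "fmt"

// hashFold is a one-at-a-time FNV-1a variant that lowercases ASCII
// letters as it hashes, so case-insensitive field matching needs no
// separate lowering pass. Constants are the standard 32-bit FNV ones.
func hashFold(seed uint32, s string) uint32 {
	h := seed ^ 2166136261 // offset basis mixed with a table seed
	for i := 0; i < len(s); i++ {
		c := s[i]
		if 'A' <= c && c <= 'Z' {
			c += 'a' - 'A'
		}
		h = (h ^ uint32(c)) * 16777619 // FNV prime
	}
	return h
}

type field struct {
	name string
	idx  int // e.g. the struct field's index
}

// buildPerfect searches for a seed under which every known field name
// lands in a distinct slot of a power-of-two table. This works because
// the key set is fixed when the table is built and never changes.
func buildPerfect(names []string) (seed uint32, table []field) {
	size := 1
	for size < len(names)*2 {
		size *= 2
	}
search:
	for seed = 1; ; seed++ {
		table = make([]field, size)
		for i, n := range names {
			slot := hashFold(seed, n) & uint32(size-1)
			if table[slot].name != "" {
				continue search // collision: try the next seed
			}
			table[slot] = field{name: n, idx: i}
		}
		return seed, table
	}
}

func main() {
	seed, table := buildPerfect([]string{"id", "name", "createdAt"})
	slot := hashFold(seed, "NAME") & uint32(len(table)-1) // case-insensitive lookup
	fmt.Println(table[slot].name, table[slot].idx)
}
```

In a real decoder the caller would still compare the stored name after the lookup, since an unknown input key can hash to an occupied slot.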

These are the optimizations I used.

Next, regarding the benchmark addition: I created a repository for reference. The benchmark results, taken on darwin/amd64 with an Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz, are here; the benchmark code is in the same repository. Check it out if you're interested.
As you may already see, the library is quite strong at stream-decoding JSON thanks to the techniques I described above.
By the way, I made the library compatible with the standard encoding/json, so the usage is the same.

Thank you for reading and have a good day.
