Evaluate Varints and / or Simple Packing / Compression For Leaf Components #10

zicklag · 2024-08-21T00:19:42Z

The only reason so far I'm not sure that Borsh is perfect for Leaf is that it doesn't have any sort of compression for small integers.

If we want to keep the canonical property, then adding any sort of varint encoding or packing would require making it mandatory, or else a new data type. I lean towards making it mandatory if we're going to use it, as an extra data type, just for the sake of encoding/decoding/storage efficiency doesn't match the semantic nature of Leaf schemas very well.

Looking into varint encodings, while they do take less space, they can have a negative impact on performance:

Note that varint encoding saves space on the wire but is actually relatively slow, because it requires a lot of branches to encode/decode. It's not as slow as text encoding, of course, but as binary formats go it's not so great. This is one reason why Cap'n Proto decided to reverse this decision and just put fixed-width ints on the wire. Cap'n Proto also includes an optional "packing" algorithm which compresses away zero-valued bytes, which produces similar message sizes to Protobuf but is generally faster because the algorithm uses less branching.

https://stackoverflow.com/questions/24614553/why-is-varint-an-efficient-data-representation

I also found the Cap'nProto packing algorithm: https://capnproto.org/encoding.html#packing

That could be useful. Looks like it's designed to be extremely efficient to implement in a native language, but not sure if it'd slow things down in JS or something like that.

For now we are more concerned with a working prototype, but this is something we should think about and carefully weigh the pros and cons of before a Leaf 1.0.

One advantage of not packing is that Borsh is very simple to implement right now. Packing adds an extra complication, but it's still quite simple, so I'm not too worried about it.

It seems like the storage savings we could get from packing might be worth it, when we want to work well on resource constrained and low bandwidth devices. It might save a lot of storage/network costs over the long term.

zicklag · 2024-10-16T14:11:17Z

Polkadot's SCALE encoding is really similar to Borsh, but it also has compact integers. I think we need to do some serious thinking about whether Borsh or SCALE makes more sense for leaf.

Edit: Credit to @AireshBhat for pointing out SCALE to me.

anderspitman · 2024-10-16T17:37:21Z

Another option to check out would be QUIC's int packing. It's very simple.

zicklag · 2024-10-16T20:41:28Z

Another option to check out would be QUIC's int packing. It's very simple.

Interesting!

Looking at the QUIC int packing algorithm seems great for speed and storage efficiency, but only allows encoding up to 62 bits, which is a problem if you need a full 64 bits unfortunately.

A little research later...

Ah! The SCALE encoding for compact integers is almost identical to the encoding for packed QUIC integers, except that it allows you to expand beyond 62 bits for very large integers.
I'm pretty happy with that.

The only concerns with compact integers were performance and otherwise confusion with the schema system, but I think we'll be fine for both of those, especially if QUIC is doing something very similar for integer packing, and parity did design SCALE to be efficient for low-end devices, so we're probably fine.

It looks like there might be nicer TypeScript libraries with type inference for SCALE, too, which could be nice if we don't have to make our own.

zicklag added the leaf Related to the Leaf Protocol specification label Nov 11, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Evaluate Varints and / or Simple Packing / Compression For Leaf Components #10

Evaluate Varints and / or Simple Packing / Compression For Leaf Components #10

zicklag commented Aug 21, 2024

zicklag commented Oct 16, 2024 •

edited

Loading

anderspitman commented Oct 16, 2024

zicklag commented Oct 16, 2024 •

edited

Loading

Evaluate Varints and / or Simple Packing / Compression For Leaf Components #10

Evaluate Varints and / or Simple Packing / Compression For Leaf Components #10

Comments

zicklag commented Aug 21, 2024

zicklag commented Oct 16, 2024 • edited Loading

anderspitman commented Oct 16, 2024

zicklag commented Oct 16, 2024 • edited Loading

zicklag commented Oct 16, 2024 •

edited

Loading

zicklag commented Oct 16, 2024 •

edited

Loading