[![Build Status](https://github.com/aklomp/base64/actions/workflows/test.yml/badge.svg)](https://github.com/aklomp/base64/actions/workflows/test.yml)

This is an implementation of a base64 stream encoding/decoding library in C99
- with SIMD (AVX2, NEON, AArch64/NEON, SSSE3, SSE4.1, SSE4.2, AVX) and
+ with SIMD (AVX2, AVX512, NEON, AArch64/NEON, SSSE3, SSE4.1, SSE4.2, AVX) and
[OpenMP](http://www.openmp.org) acceleration. It also contains wrapper functions
to encode/decode simple length-delimited strings. This library aims to be:

@@ -19,6 +19,10 @@ will pick an optimized codec that lets it encode/decode 12 or 24 bytes at a
time, which gives a speedup of four or more times compared to the "plain"
bytewise codec.

+ AVX512 support is currently limited to encoding, which uses the AVX512 VL and VBMI
+ instructions. Decoding reuses the AVX2 implementation. CPUs from the Cannonlake
+ generation (manufactured in 2018) onward support these instructions.
+
NEON support is hardcoded to on or off at compile time, because portable
runtime feature detection is unavailable on ARM.

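The added paragraph ties the AVX512 encoder to two instruction set extensions, VL and VBMI. As a rough illustration of what a runtime check for that pair could look like, here is a minimal sketch using the GCC/Clang `__builtin_cpu_supports` builtin. The helper name `cpu_has_avx512_vl_vbmi` is hypothetical and not part of this library, and the library's own detection code may well work differently.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not part of the library's API): reports whether the
 * running CPU advertises both AVX512 VL and AVX512 VBMI.
 * Assumes a GCC- or Clang-compatible compiler; on other compilers or on
 * non-x86 targets it conservatively reports false.
 */
static bool cpu_has_avx512_vl_vbmi(void)
{
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
	/* Initialise the compiler's CPU feature cache before querying it. */
	__builtin_cpu_init();

	/* Both extensions are needed, so require both feature bits. */
	return __builtin_cpu_supports("avx512vl")
	    && __builtin_cpu_supports("avx512vbmi");
#else
	return false;
#endif
}
```

A caller could consult a check like this before forcing the AVX512 codec with `BASE64_FORCE_AVX512`, and otherwise leave codec selection to the library's own runtime detection.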
@@ -59,6 +63,9 @@ optimizations described by Wojciech Muła in a
[articles](http://0x80.pl/notesen/2016-01-17-sse-base64-decoding.html).
His own code is [here](https://github.com/WojciechMula/toys/tree/master/base64).

+ The AVX512 encoder is based on code from Wojciech Muła's
+ [base64simd](https://github.com/WojciechMula/base64simd) library.
+
The OpenMP implementation was added by Ferry Toth (@htot) from [Exalon Delft](http://www.exalondelft.nl).

## Building
@@ -76,8 +83,8 @@ To compile just the "plain" library without SIMD codecs, type:
make lib/libbase64.o
```

- Optional SIMD codecs can be included by specifying the `AVX2_CFLAGS`, `NEON32_CFLAGS`, `NEON64_CFLAGS`,
- `SSSE3_CFLAGS`, `SSE41_CFLAGS`, `SSE42_CFLAGS` and/or `AVX_CFLAGS` environment variables.
+ Optional SIMD codecs can be included by specifying the `AVX2_CFLAGS`, `AVX512_CFLAGS`,
+ `NEON32_CFLAGS`, `NEON64_CFLAGS`, `SSSE3_CFLAGS`, `SSE41_CFLAGS`, `SSE42_CFLAGS` and/or `AVX_CFLAGS` environment variables.
A typical build invocation on x86 looks like this:

```sh
@@ -93,6 +100,15 @@ Example:
AVX2_CFLAGS=-mavx2 make
```

+ ### AVX512
+
+ To build and include the AVX512 codec, set the `AVX512_CFLAGS` environment variable to a value that will turn on AVX512 support in your compiler, typically `-mavx512vl -mavx512vbmi`.
+ Example:
+
+ ```sh
+ AVX512_CFLAGS="-mavx512vl -mavx512vbmi" make
+ ```
+
The codec will only be used if runtime feature detection shows that the target machine supports AVX2.

### SSSE3
@@ -208,6 +224,7 @@ Mainly there for testing purposes, this is also useful on ARM where the only way
The following constants can be used:

- `BASE64_FORCE_AVX2`
+ - `BASE64_FORCE_AVX512`
- `BASE64_FORCE_NEON32`
- `BASE64_FORCE_NEON64`
- `BASE64_FORCE_PLAIN`
@@ -434,7 +451,7 @@ x86 processors
| i7-4770 @ 3.4 GHz DDR1600 OPENMP 4 thread | 4884\* | 7099\* | 4917\* | 7057\* | 4799\* | 7143\* | 4902\* | 7219\* |
| i7-4770 @ 3.4 GHz DDR1600 OPENMP 8 thread | 5212\* | 8849\* | 5284\* | 9099\* | 5289\* | 9220\* | 4849\* | 9200\* |
| i7-4870HQ @ 2.5 GHz | 1471\* | 3066\* | 6721\* | 6962\* | 7015\* | 8267\* | 8328\* | 11576\* |
- | i5-4590S @ 3.0 GHz | 3356 | 3197 | 4363 | 6104 | 4243 | 6233 | 4160 | 6344 |
+ | i5-4590S @ 3.0 GHz | 3356 | 3197 | 4363 | 6104 | 4243\* | 6233 | 4160\* | 6344 |
| Xeon X5570 @ 2.93 GHz | 2161 | 1508 | 3160 | 3915 | - | - | - | - |
| Pentium4 @ 3.4 GHz | 896 | 740 | - | - | - | - | - | - |
| Atom N270 | 243 | 266 | 508 | 387 | - | - | - | - |