Bug report
When creating a zip file with a zip64 entry in a streaming way (unseekable file), my understanding from the spec is that the headers should be set up so that:
- The local file header general flags field has bit 3 set (§4.4.4)
- The local file header crc32, compressed size, and uncompressed size fields are set to 0 (§4.4.4)
- The local file header has a zip64 (0x0001) extra record so that data descriptor sizes are interpreted as 64-bit integers (§4.3.9.2)
- A data descriptor header exists (§4.3.9.1)
zipfile normally does all this, but the second point seems to have broken after #103861/#103863. With that change, the local file header's compressed and uncompressed sizes are set to 0xffffffff instead of 0x00000000. (0xffffffff is correct for the usual case where files are seekable and data descriptors are not used.)
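To make the expected layout concrete, here is a minimal sketch that packs such a local file header and the matching data descriptor by hand with struct. The function and constant names are made up for illustration and are not part of zipfile; the field order follows my reading of the local file header and zip64 data descriptor layouts in the spec.

```python
import struct

LOCAL_HEADER_SIG = 0x04034B50
DATA_DESCRIPTOR_SIG = 0x08074B50
FLAG_DATA_DESCRIPTOR = 0x0008   # general purpose flag, bit 3
ZIP64_EXTRA_ID = 0x0001
ZIP64_VERSION = 45              # "version needed to extract" 4.5


def streamed_zip64_local_header(name: bytes, method: int = 0) -> bytes:
    """Hypothetical helper: local file header for a streamed zip64 entry."""
    # zip64 extra record: id, payload length (16), then two 64-bit sizes.
    # Both are zero because the real sizes are not known yet.
    extra = struct.pack('<HHQQ', ZIP64_EXTRA_ID, 16, 0, 0)
    fixed = struct.pack(
        '<LHHHHHLLLHH',
        LOCAL_HEADER_SIG,
        ZIP64_VERSION,
        FLAG_DATA_DESCRIPTOR,   # bit 3 set
        method,                 # 0 = stored
        0,                      # last mod time
        0x0021,                 # last mod date (1980-01-01)
        0,                      # crc32: zero, real value goes in the descriptor
        0,                      # compressed size: zero, NOT 0xffffffff
        0,                      # uncompressed size: zero, NOT 0xffffffff
        len(name),
        len(extra),
    )
    return fixed + name + extra


def zip64_data_descriptor(crc: int, compress_size: int, file_size: int) -> bytes:
    """Hypothetical helper: data descriptor with 64-bit size fields."""
    return struct.pack('<LLQQ', DATA_DESCRIPTOR_SIG, crc, compress_size, file_size)
```

The entry would then be written as the header, the payload, and the descriptor, in that order, without ever seeking back.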
From initial testing, it seems like this is all that's needed to fix the issue:
--- zipfile.py	2023-06-08 22:29:05.000000000 -0400
+++ zipfile2.py	2023-06-28 17:50:45.841709476 -0400
@@ -463,8 +463,9 @@
             fmt = '<HHQQ'
             extra = extra + struct.pack(fmt,
                                         1, struct.calcsize(fmt)-4, file_size, compress_size)
-            file_size = 0xffffffff
-            compress_size = 0xffffffff
+            if not (self.flag_bits & _MASK_USE_DATA_DESCRIPTOR):
+                file_size = 0xffffffff
+                compress_size = 0xffffffff
             min_version = ZIP64_VERSION
 
         if self.compress_type == ZIP_BZIP2:
To reproduce
import zipfile


class UnseekableFile:
    """Wrapper exposing only write() and flush(), so zipfile cannot seek."""

    def __init__(self, fp):
        self.fp = fp

    def write(self, data):
        return self.fp.write(data)

    def flush(self):
        self.fp.flush()


with open('test.zip', 'wb') as f_raw:
    with zipfile.ZipFile(UnseekableFile(f_raw), 'w') as z:
        with z.open('foobar', 'w', force_zip64=True) as f:
            f.write(b'Hello, world!')
The resulting file looks like this, as reported by zipdetails (lines prefixed with + are good, lines prefixed with - are problematic).
0000 LOCAL HEADER #1 04034B50
0004 Extract Zip Spec 2D '4.5'
0005 Extract OS 00 'MS-DOS'
+0006 General Purpose Flag 0008
+ [Bit 3] 1 'Streamed'
0008 Compression Method 0000 'Stored'
000A Last Mod Time 00210000 'Mon Dec 31 19:00:00 1979'
+000E CRC 00000000
-0012 Compressed Length FFFFFFFF
-0016 Uncompressed Length FFFFFFFF
001A Filename Length 0006
001C Extra Length 0014
001E Filename 'foobar'
+0024 Extra ID #0001 0001 'ZIP64'
+0026 Length 0010
+0028 Uncompressed Size 0000000000000000
+0030 Compressed Size 0000000000000000
0038 PAYLOAD Hello, world!
+0045 STREAMING DATA HEADER 08074B50
+0049 CRC EBE6C6E6
+004D Compressed Length 000000000000000D
+0055 Uncompressed Length 000000000000000D
005D CENTRAL HEADER #1 02014B50
0061 Created Zip Spec 2D '4.5'
0062 Created OS 03 'Unix'
0063 Extract Zip Spec 2D '4.5'
0064 Extract OS 00 'MS-DOS'
0065 General Purpose Flag 0008
[Bit 3] 1 'Streamed'
0067 Compression Method 0000 'Stored'
0069 Last Mod Time 00210000 'Mon Dec 31 19:00:00 1979'
006D CRC EBE6C6E6
0071 Compressed Length 0000000D
0075 Uncompressed Length 0000000D
0079 Filename Length 0006
007B Extra Length 0000
007D Comment Length 0000
007F Disk Start 0000
0081 Int File Attributes 0000
[Bit 0] 0 'Binary Data'
0083 Ext File Attributes 01800000
0087 Local Header Offset 00000000
008B Filename 'foobar'
0091 END CENTRAL HEADER 06054B50
0095 Number of this disk 0000
0097 Central Dir Disk no 0000
0099 Entries in this disk 0001
009B Total Entries 0001
009D Size of Central Dir 00000034
00A1 Offset to Central Dir 0000005D
00A5 Comment Length 0000
Done
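For anyone without zipdetails at hand, the two problematic fields can also be checked with a few lines of Python. This is only a rough sketch: inspect_first_entry is a hypothetical helper name (not part of zipfile), it looks only at the first local file header, and the sizes_ok rule encodes my reading of §4.4.4 rather than anything authoritative.

```python
import struct


def inspect_first_entry(data: bytes) -> dict:
    """Hypothetical helper: decode the first local file header in `data`."""
    (sig, _version, flags, _method, _mtime, _mdate,
     _crc, csize, usize, name_len, extra_len) = struct.unpack_from(
        '<LHHHHHLLLHH', data, 0)
    if sig != 0x04034B50:
        raise ValueError('data does not start with a local file header')

    # Walk the extra field records looking for the zip64 record (id 0x0001).
    extra = data[30 + name_len:30 + name_len + extra_len]
    has_zip64 = False
    pos = 0
    while pos + 4 <= len(extra):
        rec_id, rec_len = struct.unpack_from('<HH', extra, pos)
        if rec_id == 0x0001:
            has_zip64 = True
        pos += 4 + rec_len

    streamed = bool(flags & 0x0008)  # general purpose flag, bit 3
    return {
        'streamed': streamed,
        'zip64_extra': has_zip64,
        'compressed_size': csize,
        'uncompressed_size': usize,
        # Assumption: with bit 3 set, both 32-bit size fields should be zero.
        'sizes_ok': not streamed or (csize == 0 and usize == 0),
    }
```

Passing it the bytes of test.zip from the reproduction script should report sizes_ok as False on an affected version, since both size fields hold 0xffffffff there.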
Your environment
- CPython versions tested on: 3.11.4
- Operating system and architecture: Alpine Linux x86_64