Permanent errors in metadata following 2.3.0 upgrade #17090
I ran a scrub and came back to find the pool degraded. I've never seen this particular situation before. Not only is the pool degraded, dmesg is filling indefinitely with CPU hang messages (they're still going right now).
Looking through the kernel log, I can see that the event that precipitated these indefinite kernel hangs was a set of disk resets. Usually that's recoverable, but that doesn't appear to be the case with 2.3.0 and kernel 6.12.
zpool status:
I wouldn't attribute this to ZFS changes right away. The "too many slow I/Os" tag on one of your disks only triggers after multiple I/Os were each delayed for at least 30 seconds. So, with that, you've got 3 damaged/dying disks in a RAIDZ2. Have you tried reseating the disks (or moving them to different slots)? With that many disks I assume they're on a backplane, but it might be worth moving things around a bit to rule out the data or power cables, the backplane, etc.
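The 30-second figure mentioned above corresponds to the `zio_slow_io_ms` module tunable (default 30000 ms). A hedged sketch for checking it and the per-vdev slow-I/O counts; the pool name `tank` is a placeholder:

```shell
# I/Os delayed longer than this threshold (in ms) are counted as "slow"
# and can eventually fault a vdev with "too many slow I/Os".
cat /sys/module/zfs/parameters/zio_slow_io_ms

# -s adds a SLOW column showing per-vdev slow-I/O counts.
zpool status -s tank
```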
I have the same situation on Ubuntu 22.04.5 LTS. I replaced those 2 disks and resilvered, then the errors came back. That's not normal; something is not right with 2.3.0. I tried `zpool clear pool1` and the checksum counts instantly started climbing. That did not happen with ZFS 2.2.x.
If you haven't upgraded your pool features, you can move back to 2.2.x. This does sound like a hardware issue, though.
Ah, got it working: `zfs set direct=disabled pool1`.
Now the checksum counts no longer rise and the error counts stay clean. When I use the default (`standard`) I get errors; with `disabled` it works exactly as it did with 2.2.x, regardless of how the disks are configured.
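For reference, `direct` is a per-dataset property whose valid values are `standard`, `always`, and `disabled`. A sketch of the workaround described above, using the `pool1` name from that comment:

```shell
# Show the current Direct I/O policy (2.3.0 default is "standard":
# honor O_DIRECT requests from applications).
zfs get direct pool1

# Disable Direct I/O entirely so all I/O goes through the ARC,
# matching the 2.2.x behavior.
zfs set direct=disabled pool1
```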
What applications are you using that utilize O_DIRECT?
Only KVM (OpenNebula) running on top, hosting VMs on virtio, with the disk driver configured as name='qemu' type='qcow2' cache='none' discard='unmap'. We used the PPA downstream a while ago to jump onto 2.2.2 because of the silent-corruption issue, then stayed there. Now that 2.3.0 has caused some surprises, we're switching off the downstream again; the official repos seem to be at 2.2.2, so it's all fine for now ;). I just wanted to share that there are circumstances where switching straight to 2.3.0 causes immediate issues with the other two `direct` options; this may be what the reporter is facing as well. Direct I/O without caching is often very slow, even with SSDs, and with HDDs it can be an absolute nightmare ;). I'm just thinking that `standard` as the default may not be the best approach for every setup, as many will suffer, especially setups where caching is important. For me, all good now.
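The cache='none' setting in that libvirt disk definition is what makes QEMU open the backing file with O_DIRECT, which ZFS 2.3.0's `direct=standard` now honors. A minimal disk element sketch for context; the file path and target name are placeholders:

```xml
<disk type='file' device='disk'>
  <!-- cache='none' opens the image with O_DIRECT; a cached mode such as
       cache='writeback' would avoid Direct I/O on the ZFS side entirely. -->
  <driver name='qemu' type='qcow2' cache='none' discard='unmap'/>
  <source file='/var/lib/libvirt/images/vm.qcow2'/>
  <target dev='vda' bus='virtio'/>
</disk>
```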
Did you update your kernel as well when upgrading the ZFS version? I got errors and data corruption because of #16873, which was caused by a power-management defaults change in the kernel.
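Assuming the power-management change referred to above is the SATA link power management (LPM) policy default, the active policy can be inspected through sysfs; `host0` below is a hypothetical example, not taken from this thread:

```shell
# Print the active LPM policy for each SATA host; aggressive policies
# such as med_power_with_dipm have been reported to cause link resets
# on some drives.
grep . /sys/class/scsi_host/host*/link_power_management_policy

# Try the most conservative policy on one host (assumption: host0 is
# the affected controller; this setting does not persist across reboots).
echo max_performance > /sys/class/scsi_host/host0/link_power_management_policy
```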
System information
Describe the problem you're observing
Now that 2.3.0 was merged into trixie, I just upgraded my system to 2.3.0 (from the 2.2.x series). Before rebooting, the pool was clean, having finished a scrub a couple of weeks ago. After reboot, I got an email from zed letting me know a resilver had occurred at boot and multiple permanent errors were present.
The pool is comprised of 6 vdevs, each 11 disks in raidz2. Compression and encryption are enabled. I am using a dedicated L2ARC device, a single SSD, with secondarycache set to `metadata`.
Inspecting the "corrupt" metadata nodes:
Exactly how concerned should I be here? I have been waiting a long time for the 2.3.0 release, and I'm somewhat worried I may have rushed into it today.