Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ZTS: Corrupt all block copies with corrupt_file_at_level, test L1 corruption #11141

Closed
wants to merge 3 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Oct 30, 2020

Motivation and Context

In ZTS, list_file_blocks is supposed to output the level, vdev path, offset, and length of blocks in a given file.
It fails in the following ways:

  • Only the path of the first child of a mirrored vdev is output
    This prevents corrupt_blocks_at_level from corrupting all copies of the blocks, which is desired for reproducing a particular issue we encountered in the async DMU project.
  • Only L0 blocks and only the first DVA per block are output
    This prevents corrupt_blocks_at_level from working at any level other than 0.

Description

Only the path of the first child of a mirrored vdev is output

The first part of list_file_blocks transforms the pool configuration output by zdb -C $pool into shell code to set up a shell variable, VDEV_MAP, that maps from vdev id to the underlying vdev path. This variable is a simple indexed array. However, the vdev id in a DVA is only the id of the top level vdev.

When the pool is mirrored, the top level vdev is a mirror and its children are the mirrored devices. So, what we need is to map from the top level vdev id to a list of the underlying vdev paths. list_file_blocks does not need to work for raidz vdevs, so we can disregard that case.

Only L0 blocks and only the first DVA per block are output

The second part of list_file_blocks transforms the object description output by zdb -ddddd $ds $objnum into a stream of lines of the form "level path offset length" for the indirect blocks in the given file. The current code only works for the first copy of L0 blocks.

Add one more -d to the zdb command so we get all block copies and rewrite the transformation to match more than L0 and output all DVAs.

How Has This Been Tested?

Added a test case.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@ghost ghost added Component: Test Suite Indicates an issue with the test framework or a test case Status: Code Review Needed Ready for review and testing labels Oct 30, 2020
@ghost ghost requested a review from jwk404 October 30, 2020 23:08
@ghost ghost force-pushed the list-more-blocks branch 2 times, most recently from 59cf6b1 to c484790 Compare October 30, 2020 23:15
@ghost
Copy link
Author

ghost commented Oct 30, 2020

  • Sign-off on all the commits
  • ALL the commits

@ghost ghost force-pushed the list-more-blocks branch from c484790 to 7fa9a0b Compare October 30, 2020 23:24
@ghost
Copy link
Author

ghost commented Oct 30, 2020

  • Simplify the added test to avoid duplicating non-corrupted checksum tests from filetest_001_pos

@ghost ghost force-pushed the list-more-blocks branch from 7fa9a0b to 9f55214 Compare November 1, 2020 14:52
@ghost
Copy link
Author

ghost commented Nov 1, 2020

  • Rebased
  • Added -n flag to make gawk decode hex, squelched the resulting unknown argument warning on FreeBSD
  • Added typeset for local vdev

@ghost ghost force-pushed the list-more-blocks branch from 9f55214 to 7f0ad28 Compare November 1, 2020 15:00
@ghost
Copy link
Author

ghost commented Nov 1, 2020

  • Amend the correct commit

@behlendorf
Copy link
Contributor

These tests appears to have failed during the Debian 10 test run.

@ghost
Copy link
Author

ghost commented Nov 2, 2020

Ok back to plan B

Ryan Moeller added 3 commits November 2, 2020 15:40
L1 and L2 indirect blocks have more than one copy on disk.

Show all the indirect block copies.

Signed-off-by: Ryan Moeller <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
@ghost ghost force-pushed the list-more-blocks branch from 7f0ad28 to 5766308 Compare November 2, 2020 20:41
@ghost
Copy link
Author

ghost commented Nov 2, 2020

  • Rebased
  • Check for gawk by testing if the -n flag is supported

@ghost
Copy link
Author

ghost commented Nov 2, 2020

Ah, Debian uses mawk, I'd not heard of that one before.

@ghost ghost requested a review from behlendorf November 5, 2020 19:57
@behlendorf behlendorf added Status: Accepted Ready to integrate (reviewed, tested) and removed Status: Code Review Needed Ready for review and testing labels Nov 5, 2020
@behlendorf behlendorf closed this in 94deb47 Nov 6, 2020
behlendorf pushed a commit that referenced this pull request Nov 6, 2020
The second part of list_file_blocks transforms the object description
output by zdb -ddddd $ds $objnum into a stream of lines of the form
"level path offset length" for the indirect blocks in the given file.
The current code only works for the first copy of L0 blocks.  L1 and
L2 indirect blocks have more than one copy on disk.

Add one more -d to the zdb command so we get all block copies and
rewrite the transformation to match more than L0 and output all DVAs.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11141
behlendorf pushed a commit that referenced this pull request Nov 6, 2020
Add a new test case which corrupts all level 1 block in a file.
Then verifies that corruption is detected and repaired.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11141
@ghost ghost deleted the list-more-blocks branch November 6, 2020 16:05
behlendorf pushed a commit that referenced this pull request Nov 11, 2020
The first part of list_file_blocks transforms the pool configuration
output by zdb -C $pool into shell code to set up a shell variable,
VDEV_MAP, that maps from vdev id to the underlying vdev path. This
variable is a simple indexed array. However, the vdev id in a DVA is
only the id of the top level vdev.

When the pool is mirrored, the top level vdev is a mirror and its
children are the mirrored devices. So, what we need is to map from
the top level vdev id to a list of the underlying vdev paths.
ist_file_blocks does not need to work for raidz vdevs, so we can
disregard that case.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11141
behlendorf pushed a commit that referenced this pull request Nov 11, 2020
The second part of list_file_blocks transforms the object description
output by zdb -ddddd $ds $objnum into a stream of lines of the form
"level path offset length" for the indirect blocks in the given file.
The current code only works for the first copy of L0 blocks.  L1 and
L2 indirect blocks have more than one copy on disk.

Add one more -d to the zdb command so we get all block copies and
rewrite the transformation to match more than L0 and output all DVAs.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11141
behlendorf pushed a commit that referenced this pull request Nov 11, 2020
Add a new test case which corrupts all level 1 block in a file.
Then verifies that corruption is detected and repaired.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes #11141
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
The first part of list_file_blocks transforms the pool configuration
output by zdb -C $pool into shell code to set up a shell variable,
VDEV_MAP, that maps from vdev id to the underlying vdev path. This
variable is a simple indexed array. However, the vdev id in a DVA is
only the id of the top level vdev.

When the pool is mirrored, the top level vdev is a mirror and its
children are the mirrored devices. So, what we need is to map from
the top level vdev id to a list of the underlying vdev paths.
ist_file_blocks does not need to work for raidz vdevs, so we can
disregard that case.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes openzfs#11141
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
The second part of list_file_blocks transforms the object description
output by zdb -ddddd $ds $objnum into a stream of lines of the form
"level path offset length" for the indirect blocks in the given file.
The current code only works for the first copy of L0 blocks.  L1 and
L2 indirect blocks have more than one copy on disk.

Add one more -d to the zdb command so we get all block copies and
rewrite the transformation to match more than L0 and output all DVAs.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes openzfs#11141
jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
Add a new test case which corrupts all level 1 block in a file.
Then verifies that corruption is detected and repaired.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes openzfs#11141
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
The first part of list_file_blocks transforms the pool configuration
output by zdb -C $pool into shell code to set up a shell variable,
VDEV_MAP, that maps from vdev id to the underlying vdev path. This
variable is a simple indexed array. However, the vdev id in a DVA is
only the id of the top level vdev.

When the pool is mirrored, the top level vdev is a mirror and its
children are the mirrored devices. So, what we need is to map from
the top level vdev id to a list of the underlying vdev paths.
ist_file_blocks does not need to work for raidz vdevs, so we can
disregard that case.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes openzfs#11141
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
The second part of list_file_blocks transforms the object description
output by zdb -ddddd $ds $objnum into a stream of lines of the form
"level path offset length" for the indirect blocks in the given file.
The current code only works for the first copy of L0 blocks.  L1 and
L2 indirect blocks have more than one copy on disk.

Add one more -d to the zdb command so we get all block copies and
rewrite the transformation to match more than L0 and output all DVAs.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes openzfs#11141
sempervictus pushed a commit to sempervictus/zfs that referenced this pull request May 31, 2021
Add a new test case which corrupts all level 1 block in a file.
Then verifies that corruption is detected and repaired.

Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ryan Moeller <[email protected]>
Closes openzfs#11141
behlendorf pushed a commit that referenced this pull request Dec 29, 2024
FreeBSD recently removed non-standard hex numbers support from awk.
Neither it supports -n argument, enabling it in gawk.  Instead of
depending on those rewrite list_file_blocks() fuction to handle the
hex math in shell instead of awk.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #11141
behlendorf pushed a commit that referenced this pull request Dec 29, 2024
procfs might be not mounted on FreeBSD.  Plus checking for specific
PID might be not exactly reliable.  Check for empty list of jobs
instead.

Premature loop exit can result in failed test and failed cleanup,
failing also some following tests.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #11141
behlendorf pushed a commit that referenced this pull request Dec 29, 2024
This test takes 3 minutes on RELEASE FreeBSD bots, but on CURRENT,
probably due to debugging it has in kernel, it does not complete
within 10 minutes, ending up killed.  As I see all the redacting
here happens within the first ~128MB of the file, so I hope it
won't matter if there is 1GB of data instead of 2GB.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes #11141
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Dec 29, 2024
FreeBSD recently removed non-standard hex numbers support from awk.
Neither it supports -n argument, enabling it in gawk.  Instead of
depending on those rewrite list_file_blocks() fuction to handle the
hex math in shell instead of awk.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#11141
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Dec 29, 2024
procfs might be not mounted on FreeBSD.  Plus checking for specific
PID might be not exactly reliable.  Check for empty list of jobs
instead.

Premature loop exit can result in failed test and failed cleanup,
failing also some following tests.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#11141
behlendorf pushed a commit to behlendorf/zfs that referenced this pull request Dec 29, 2024
This test takes 3 minutes on RELEASE FreeBSD bots, but on CURRENT,
probably due to debugging it has in kernel, it does not complete
within 10 minutes, ending up killed.  As I see all the redacting
here happens within the first ~128MB of the file, so I hope it
won't matter if there is 1GB of data instead of 2GB.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#11141
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Jan 26, 2025
FreeBSD recently removed non-standard hex numbers support from awk.
Neither it supports -n argument, enabling it in gawk.  Instead of
depending on those rewrite list_file_blocks() fuction to handle the
hex math in shell instead of awk.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#11141
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Jan 26, 2025
procfs might be not mounted on FreeBSD.  Plus checking for specific
PID might be not exactly reliable.  Check for empty list of jobs
instead.

Premature loop exit can result in failed test and failed cleanup,
failing also some following tests.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#11141
lundman pushed a commit to openzfsonwindows/openzfs that referenced this pull request Jan 26, 2025
This test takes 3 minutes on RELEASE FreeBSD bots, but on CURRENT,
probably due to debugging it has in kernel, it does not complete
within 10 minutes, ending up killed.  As I see all the redacting
here happens within the first ~128MB of the file, so I hope it
won't matter if there is 1GB of data instead of 2GB.

Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Tony Hutter <[email protected]>
Reviewed-by: Tino Reichardt <[email protected]>
Signed-off-by:Alexander Motin <[email protected]>
Sponsored by: iXsystems, Inc.
Closes openzfs#11141
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Component: Test Suite Indicates an issue with the test framework or a test case Status: Accepted Ready to integrate (reviewed, tested)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant