Add more control/visibility and speedup spa_load_verify(). #13022
Use error thresholds from policy to control whether to scrub data and/or metadata. If a threshold is set to UINT64_MAX, then the caller probably does not care about the result and we may skip that part. By default import neither sets the data error threshold nor reads the error counter, so skip the data scrub for faster import. Metadata are still scrubbed, and the import fails if even a single error is found. While there, just for symmetry, return the number of metadata errors in case the threshold is not set to zero and we haven't reached it. Signed-off-by: Alexander Motin <[email protected]>
Does anybody know a good motivation for scrubbing the last few TXGs during a normal import? Is it more than a waste of time?
My colleague measured a pretty large pool import time after a crash during active writes with dedup enabled, and he found that disabling metadata verification in that case reduces import time from ~75 seconds to ~20. It is too tasty to ignore. Does anybody know why we do all this scrubbing, and why we go up to TXG_DEFER_SIZE TXGs back? I don't see any relation to that number, at least.
I like the approach taken here, I think this makes good sense.
Given the size of many modern pools, I think it probably makes sense to go even farther. For many pools it's completely impractical to scrub the data even during a rewind. That's an operation which could take weeks, or longer. I wouldn't be against changing the default spa_load_verify_data value to B_FALSE. Or if we wanted to be clever, maybe only perform the data scrub on rewind when the pool is all SSD? Though I'm not sure it's worth the complexity.
This looks reasonable. I'm curious, are you changing the policy by modifying libzfs directly? I do not recall whether we can set those via the CLI.
I'm not sure I'd go as far as turning off spa_load_verify by default, as I do recall it catching errors in some, albeit rare, circumstances and then importing the previous txg successfully.
@pzakha At this point I have just disabled the data scan when the result is not used and completed the policy API. There is indeed no user-space part for it now. If somebody has preferences for how it could be set on the command line, please speak up or feel free to take over. I'd prefer it to be done on the command line via the policy API rather than via global tunables as it is now.
Use error thresholds from policy to control whether to scrub data and/or metadata. If threshold is set to UINT64_MAX, then caller probably does not care about result and we may skip that part. By default import neither set the data error threshold nor read the error counter, so skip the data scrub for faster import. Metadata are still scrubbed and fail if even single error found. While there just for symmetry return number of metadata errors in case threshold is not set to zero and we haven't reached it. Reviewed-by: Brian Behlendorf <[email protected]> Reviewed-by: Pavel Zakharov <[email protected]> Signed-off-by: Alexander Motin <[email protected]> Closes openzfs#13022 (cherry picked from commit f2c5bc1)
Use error thresholds from the policy to control whether to verify data and/or metadata. If a threshold is set to UINT64_MAX, then the caller probably does not care about the number of errors and we may skip that part to import the pool faster. By default, import neither sets the data error threshold nor reads the error counter; the count was only reported to dbgmsg, which is not very useful in everyday life. Metadata are still verified, and the import fails if even a single error is found.
While there, just for symmetry, return the number of metadata errors in case the threshold is not set to zero and we haven't reached it.
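The threshold rule described above can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: `struct load_policy`, `struct verify_plan`, `plan_verify()` and `verify_passed()` are hypothetical names invented here, and the sketch ignores rewind handling and everything else spa_load_verify() takes into account.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the load policy (not the real struct). */
struct load_policy {
	uint64_t max_meta_errors;	/* UINT64_MAX: caller does not care */
	uint64_t max_data_errors;	/* UINT64_MAX: caller does not care */
};

/* Which parts of the verification pass are worth running. */
struct verify_plan {
	bool verify_meta;
	bool verify_data;
};

/*
 * A threshold of UINT64_MAX can never be exceeded, so counting errors
 * of that kind cannot change the outcome and the pass may be skipped.
 */
struct verify_plan
plan_verify(const struct load_policy *p)
{
	struct verify_plan plan;

	assert(p != NULL);
	plan.verify_meta = (p->max_meta_errors != UINT64_MAX);
	plan.verify_data = (p->max_data_errors != UINT64_MAX);
	return (plan);
}

/* Verification passes if every counted error total is within its threshold. */
bool
verify_passed(const struct load_policy *p, struct verify_plan plan,
    uint64_t meta_errors, uint64_t data_errors)
{
	assert(p != NULL);
	if (plan.verify_meta && meta_errors > p->max_meta_errors)
		return (false);
	if (plan.verify_data && data_errors > p->max_data_errors)
		return (false);
	return (true);
}
```

With an assumed default of `{ 0, UINT64_MAX }` (metadata threshold of zero, data threshold unset), this model verifies metadata only and fails on the first metadata error, which matches the default import behavior the description gives.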
How Has This Been Tested?
Importing a pool on FreeBSD after a system crash during active writes, I see the time spent inside spa_load_verify() drop from 6.5s to 1.5s due to the skipped data verification.