-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Wait in libzfs_init() for the /dev/zfs device #3431
Conversation
return (0); | ||
} else if (errno == ENOENT) { | ||
if (NSEC2MSEC(gethrtime() - start) > busy_timeout) | ||
usleep(10 * MILLISEC); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should be an else sched_yield();
here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually intentionally dropped it because we're only going to be spinning briefly. But I can add it back, it does appear to be POSIX so there aren't compatibility concerns.
There is a pre-existing issue where |
A fix for this issue is something that we would want to backport to 0.6.4, but the library changes and behavior changes are not something that are good to backport. Lets split it into two patches. One implements things without restructuring layering/functionality and another that does. Then we can backport the first patch. |
That's amusing the test failures are due to parts of the test suite assuming the auto-loading behavior. When I refresh this I'll update the offending scripts accordingly.
Do you have the issue number? I'd like to see the full bug. We should not need to use @ryao this doesn't strike me as a critical issue worth applying to a point release. |
Refreshed:
|
ASSERT(g_zfs != NULL); | ||
if ((g_zfs = libzfs_init()) == NULL) { | ||
(void) fprintf(stderr, "%s", libzfs_error_init(errno)); | ||
return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't a libzfs_init()
failure here be fatal, especially since you've removed the ASSERT
?
Now if libzfs_init()
fails, you return early from import_pool()
. That function returns a void, so you can't return the error to the caller (zhack_spa_open()
) which obliviously rumbles on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, yes it should be. I'll fix it by reverting this hunk so it matches upstream. I clearly just replaced all this calls without fully inspecting them.
Refreshed with the |
timeout_str = getenv("ZFS_MODULE_TIMEOUT"); | ||
if (timeout_str) { | ||
timeout = strtol(timeout_str, NULL, 0); | ||
timeout = MIN(MAX(timeout, (10 * 60)), 0); /* 0 <= N <= 600 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timeout
here is always 0. I think you mean to swap MIN
and MAX
:
timeout = MAX(MIN(timeout, (10 * 60)), 0); /* 0 <= N <= 600 */
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's embarrassing, and exactly what I deserve for rushing this small tweak to the original patch. Originally it didn't have a minimum bound.
@behlendorf There is no issue number for the misuse of zpool import in the start scripts (yet). I noticed it when debugging this problem on Friday. When the pool is imported by the module reading zpool.cache, a verbatim import is done, but when the script does it, a non-verbatim import is done. This poses problems for txg rollback if something goes wrong and is only necessary when a change is done to the pool vdev configuration. My employer considers backporting a fix for this to be important because this issue affects the flocker tutorial in anywhere from 1/3 to 2/3 of all runs. I spent yesterday backporting my original patch to 0.6.4.1 (as 0.6.4.1.flocker.0). I did it in a manner roughly equivalent to what you do when you tag releases, such that there are source tarballs that can be used to build packages using the generic rpm/deb instructions. They have autotools products included, so no ./autogen.sh is necessary. I pushed the branches to a public git repository earlier today: https://github.com/ClusterHQ/spl/tree/flocker-backports The tarballs are available at a temporary location. If anyone wants them, I can provide the link, but I am going to avoid posting the temporary link to the issue tracker so that people in the future do not encounter a broken link. If anyone asks for the tarballs, I'll post the official link after it is up. |
Refreshed to include the restructured I've also added an else clause to the |
Looks good to me. |
@@ -9,4 +9,4 @@ ConditionPathExists=!@sysconfdir@/zfs/zpool.cache | |||
[Service] | |||
Type=oneshot | |||
RemainAfterExit=yes | |||
ExecStart=@sbindir@/zpool import -d /dev/disk/by-id -aN | |||
ExecStart=/sbin/modprobe zfs && @sbindir@/zpool import -d /dev/disk/by-id -aN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest adding a -F
here to match the behavior when the module auto-imports the pool itself to fix #3434.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than sneak that in here. Let's tackle that in a separate patch because why that's desirable is going to need a fair bit of explanation and it would be a change in existing behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is fine.
@behlendorf I am not a fan of backporting the changes to the libzfs API because they ideally shouldn't change across point releases, but it is your call. You will have no further complaints from me as long as you are aware of that concern. As a side note, issues with library API changes are not an issue for Gentoo because Gentoo's packaging keeps the userland binaries in a single package. It is more likely to be an issue for other distributions because their packaging breaks the tools and libraries into discrete packages. |
@behlendorf Those comments aside, this looks good to me. |
That's a very good point. This patch does remove one public function and add another which could potentially cause breakage. I suspect the odds of this causing breakage are quite low but it's definitely possible. I've split this patch as you originally suggested in to two parts, we'll only cherry pick the first part to the release branch to avoid any API changes. Thanks for keeping me honest about this. |
While module loading itself is synchronous the creation of the /dev/zfs device is not. This is because /dev/zfs is typically created by a udev rule after the module is registered and presented to user space through sysfs. This small window between module loading and device creation can result in spurious failures of libzfs_init(). This patch closes that race by extending libzfs_init() so it can detect that the modules are loaded and only if required wait for the /dev/zfs device to be created. This allows scripts to reliably use the following shell construct without the need for additional error handling. $ /sbin/modprobe zfs && /sbin/zpool import -a To minimize the potential time waiting in libzfs_init() a strategy similar to adaptive mutexes is employed. The function will busy-wait for up to 10ms based on the expectation that the modules were just loaded and therefore the /dev/zfs will be created imminently. If it takes longer than this it will fall back to polling for up to 10 seconds. This behavior can be customized to some degree by setting the following new environment variables. This functionality is provided for backwards compatibility with existing scripts which depend on the module auto-load behavior. By default module auto-loading is now disabled. * ZFS_MODULE_LOADING="YES|yes|ON|on" - Attempt to load modules. * ZFS_MODULE_TIMEOUT="<seconds>" - Seconds to wait for /dev/zfs The zfs-import-* systemd service files have been updated to call '/sbin/modprobe zfs' so they no longer rely on the legacy auto-loading behavior. Signed-off-by: Brian Behlendorf <[email protected]> Issue openzfs#2556
All fprintf() error messages are moved out of the libzfs_init() library function where they never belonged in the first place. A libzfs_error_init() function is added to provide useful error messages for the most common causes of failure. Additionally, in libzfs_run_process() the 'rc' variable was renamed to 'error' for consistency with the rest of the code base. Signed-off-by: Brian Behlendorf <[email protected]>
The failures here were due to the test suite needing an update to explicitly load the zfs modules for ztest. I'll update the tests and get this merged today. |
While module loading itself is synchronous the creation of the /dev/zfs
device is not. This is because /dev/zfs is typically created by a udev
rule after the module is registered and presented to user space through
sysfs. This small window between module loading and device creation
can result in spurious failures of libzfs_init().
This patch closes that race by extending libzfs_init() so it can detect
that the modules are loaded and only if required wait for the /dev/zfs
device to be created. This allows scripts to reliably use the following
shell construct without the need for additional error handling.
$ /sbin/modprobe zfs && /sbin/zpool import -a
To minimize the potential time waiting in libzfs_init() a strategy
similar to adaptive mutexes is employed. The function will busy-wait
for up to 10ms based on the expectation that the modules were just
loaded and therefore the /dev/zfs will be created imminently. If it
takes longer than this it will fall back to polling for up to 10 seconds.
This behavior can be customized to some degree by setting the following
new environment variables. This functionality is provided for backwards
compatibility with existing scripts which depend on the module auto-load
behavior. By default module auto-loading is now disabled.
The additional small changes were also made.
consistency with the rest of the code base.
library function where they never belonged in the first place. A
libzfs_error_init() function was added to provide useful error
messages for the most common causes of failure.
'/sbin/modprobe zfs' so they no longer rely on the legacy auto-loading
behavior.
Signed-off-by: Brian Behlendorf [email protected]
Issue #2556