memory leak in trim branch #6185

Closed
ghost opened this issue May 31, 2017 · 6 comments

@ghost

ghost commented May 31, 2017

System information

Type                  Version/Name
Distribution Name     gentoo
Distribution Version
Linux Kernel          4.7.10
Architecture          x86
ZFS Version           (see below)
SPL Version

Describe the problem you're observing

This code (branch 'ntrim2', commit f727c0e) contains a memory leak that causes the OOM killer to kill anything sizeable in userland during the following test:

  1. Create a raidz3 pool consisting of 12 SSDs of about 250 GB each. The TRIM defaults aren't changed (i.e., autotrim=off, forcetrim=off).
  2. Start filling up the pool with 1 GB files, using a tool we wrote in-house (in Java).

The tool prints a line of progress every 100 files. Before the pool fills up completely, the OOM killer starts killing userland processes left, right, and center (around 30 of them within a few minutes!).
The same test shows no such behaviour with up-to-date ZFS on Linux and otherwise comparable settings.
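
The Java tool itself isn't attached; purely for illustration, a minimal C sketch of a comparable workload (file names, sizes, and chunking are placeholders, not the actual tool) would be:

/* Write 1 GiB files into a directory until writing fails (e.g. ENOSPC),
 * printing a line of progress every 100 files. Illustrative sketch only. */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define	FILE_SIZE	(1024L * 1024 * 1024)	/* 1 GiB per file */
#define	CHUNK_SIZE	(1024 * 1024)		/* written in 1 MiB chunks */

int
main(int argc, char **argv)
{
	const char *dir = (argc > 1) ? argv[1] : ".";
	static char buf[CHUNK_SIZE];
	char path[4096];
	long n;

	memset(buf, 0xab, sizeof (buf));

	for (n = 0; ; n++) {
		long written = 0;
		int fd;

		snprintf(path, sizeof (path), "%s/file-%06ld", dir, n);
		if ((fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644)) < 0) {
			perror("open");
			return (1);
		}
		while (written < FILE_SIZE) {
			ssize_t w = write(fd, buf, sizeof (buf));

			if (w <= 0) {
				printf("stopped after %ld files (%s)\n",
				    n, strerror(errno));
				close(fd);
				return (0);
			}
			written += w;
		}
		close(fd);
		if (n > 0 && (n % 100) == 0)
			printf("%ld files written\n", n);
	}
}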

The machine I'm testing this on has 16 GB of RAM. This is what I believe to be the relevant excerpt of /proc/slabinfo after the OOM killer has effectively terminated my test:

# name <active_objs> <num_objs> <objsize> <objperslab> <pagesperslab> : tunables <limit> <batchcount> <sharedfactor> : slabdata <active_slabs> <num_slabs> <sharedavail>
kmalloc-1024 14019478 14019524 1024 32 8 : tunables 0 0 0 : slabdata 476884 476884 0

During the test, I can see <active_objs> (column 2) grow by ~200k-300k every minute.
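
For illustration only, a minimal sketch of one way to watch that counter (a watch/grep one-liner over /proc/slabinfo works just as well; nothing here is specific to ZFS):

/* Poll /proc/slabinfo once a minute and print the kmalloc-1024 counters.
 * Reading /proc/slabinfo usually requires root. Illustrative sketch only. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int
main(void)
{
	for (;;) {
		FILE *fp = fopen("/proc/slabinfo", "r");
		char line[512];

		if (fp == NULL) {
			perror("fopen");
			return (1);
		}
		while (fgets(line, sizeof (line), fp) != NULL) {
			char name[64];
			unsigned long active, total;

			if (sscanf(line, "%63s %lu %lu",
			    name, &active, &total) == 3 &&
			    strcmp(name, "kmalloc-1024") == 0)
				printf("kmalloc-1024 active_objs=%lu "
				    "num_objs=%lu\n", active, total);
		}
		fclose(fp);
		sleep(60);
	}
}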

Is there any information I can supply to help debug this?

Addendum: exporting and re-importing the pool doesn't free up the memory (according to slabinfo).

This may be another instance of #5626

@behlendorf
Contributor

Referencing #5925.

@ghost
Author

ghost commented Jun 1, 2017

I did some 'divide and conquer' testing and I believe this change:

commit 19a677f
Author: Isaac Huang [email protected]
Date: Wed May 24 19:02:02 2017 -0400

Remove vdev_raidz_map_alloc()

Rather than hacking `vdev_raidz_map_alloc()` to get the child
offsets calculate the values directly.

Signed-off-by: Isaac Huang <[email protected]>

introduced the memory leak, since the behaviour is not seen with the previous commit (e83b488).

@ghost
Author

ghost commented Jun 1, 2017

This diff takes care of the kmem leak:

$ git diff
diff --git a/module/zfs/vdev_raidz.c b/module/zfs/vdev_raidz.c
index ccde111..5ca0820 100644
--- a/module/zfs/vdev_raidz.c
+++ b/module/zfs/vdev_raidz.c
@@ -137,6 +137,10 @@ vdev_raidz_map_free(raidz_map_t *rm)
 {
        int c;
 
+       /* raidz_map_t without abd allocation from vdev_raidz_trim() */
+       if (rm->rm_col[0].rc_abd == NULL)
+               goto out;
+
        for (c = 0; c < rm->rm_firstdatacol; c++) {
                abd_free(rm->rm_col[c].rc_abd);
 
@@ -149,6 +153,9 @@ vdev_raidz_map_free(raidz_map_t *rm)
 
        if (rm->rm_abd_copy != NULL)
                abd_free(rm->rm_abd_copy);
+
+out:
+       kmem_free(rm, offsetof(raidz_map_t, rm_col[rm->rm_scols]));
 }
 
 static void
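
In simplified form (stand-in types and plain malloc()/free() in place of the kmem and abd interfaces; this is not the actual ZFS code), the pattern the diff restores is: the map header must always be released, even when the TRIM path allocated it without any column abds.

/* Simplified illustration of the leak the diff above plugs. The map
 * header is allocated with a flexible column array; the TRIM path
 * attaches no abds to the columns. If the free routine only tears down
 * the abds and never releases the header itself, one header leaks per
 * I/O, which matches the kmalloc-1024 growth reported above. */
#include <stddef.h>
#include <stdlib.h>

typedef struct {
	void	*rc_abd;	/* NULL for maps built by the TRIM path */
} col_t;

typedef struct {
	int	rm_scols;	/* number of columns allocated */
	col_t	rm_col[];	/* flexible array, sized at allocation */
} map_t;

static map_t *
map_alloc(int scols)
{
	map_t *rm = calloc(1, offsetof(map_t, rm_col[scols]));

	rm->rm_scols = scols;
	return (rm);
}

static void
map_free(map_t *rm)
{
	int c;

	/* Map from the TRIM path: no abds were ever attached. */
	if (rm->rm_col[0].rc_abd == NULL)
		goto out;

	for (c = 0; c < rm->rm_scols; c++)
		free(rm->rm_col[c].rc_abd);	/* abd_free() stand-in */
out:
	/*
	 * The header itself must always be released; dropping this call
	 * is exactly the leak fixed above.
	 */
	free(rm);	/* kmem_free(rm, offsetof(...)) stand-in */
}

int
main(void)
{
	map_free(map_alloc(12));
	return (0);
}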

@behlendorf
Contributor

behlendorf commented Jun 1, 2017

@ms-fastlta thanks for running this down. I accidentally dropped that kmem_free() when removing the conditional (which was intentional). I'll fold this fix into the TRIM PR.

@ghost
Author

ghost commented Jun 2, 2017

@behlendorf
YW :-)
I later compared this to the illumos code and saw the same; not being sufficiently familiar with the code, I opted for the "safe" route.
If there's anything I can do, feel free to let me know; I will be away for the coming two weeks, though.

@behlendorf
Contributor

Closing. This fix was merged into the open TRIM PR.
