-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add /sys/kernel/wait_for_device_probe attribute. #10
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Being written anything waits for all device probe to complete before returning. After that `udevadm settle` used by ZFS scripts really can provide system with all disks detected for boot pool import. Ticket: NAS-108200
ghost
approved these changes
Aug 17, 2021
ghost
pushed a commit
that referenced
this pull request
Oct 6, 2021
commit 67069a1 upstream. ASan reported a memory leak caused by info_linear not being deallocated. The info_linear was allocated during in perf_event__synthesize_one_bpf_prog(). This patch adds the corresponding free() when bpf_prog_info_node is freed in perf_env__purge_bpf(). $ sudo ./perf record -- sleep 5 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.025 MB perf.data (8 samples) ] ================================================================= ==297735==ERROR: LeakSanitizer: detected memory leaks Direct leak of 7688 byte(s) in 19 object(s) allocated from: #0 0x4f420f in malloc (/home/user/linux/tools/perf/perf+0x4f420f) #1 0xc06a74 in bpf_program__get_prog_info_linear /home/user/linux/tools/lib/bpf/libbpf.c:11113:16 #2 0xb426fe in perf_event__synthesize_one_bpf_prog /home/user/linux/tools/perf/util/bpf-event.c:191:16 #3 0xb42008 in perf_event__synthesize_bpf_events /home/user/linux/tools/perf/util/bpf-event.c:410:9 #4 0x594596 in record__synthesize /home/user/linux/tools/perf/builtin-record.c:1490:8 #5 0x58c9ac in __cmd_record /home/user/linux/tools/perf/builtin-record.c:1798:8 #6 0x58990b in cmd_record /home/user/linux/tools/perf/builtin-record.c:2901:8 #7 0x7b2a20 in run_builtin /home/user/linux/tools/perf/perf.c:313:11 #8 0x7b12ff in handle_internal_command /home/user/linux/tools/perf/perf.c:365:8 #9 0x7b2583 in run_argv /home/user/linux/tools/perf/perf.c:409:2 #10 0x7b0d79 in main /home/user/linux/tools/perf/perf.c:539:3 #11 0x7fa357ef6b74 in __libc_start_main /usr/src/debug/glibc-2.33-8.fc34.x86_64/csu/../csu/libc-start.c:332:16 Signed-off-by: Riccardo Mancini <[email protected]> Acked-by: Ian Rogers <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Alexei Starovoitov <[email protected]> Cc: Andrii Nakryiko <[email protected]> Cc: Daniel Borkmann <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: John Fastabend <[email protected]> Cc: KP Singh <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Martin KaFai Lau <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Song Liu <[email protected]> Cc: Yonghong Song <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Hanjun Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ghost
pushed a commit
that referenced
this pull request
Oct 6, 2021
commit 41d5854 upstream. I got several memory leak reports from Asan with a simple command. It was because VDSO is not released due to the refcount. Like in __dsos_addnew_id(), it should put the refcount after adding to the list. $ perf record true [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.030 MB perf.data (10 samples) ] ================================================================= ==692599==ERROR: LeakSanitizer: detected memory leaks Direct leak of 439 byte(s) in 1 object(s) allocated from: #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559bce4aa8ee in dso__new_id util/dso.c:1256 #2 0x559bce59245a in __machine__addnew_vdso util/vdso.c:132 #3 0x559bce59245a in machine__findnew_vdso util/vdso.c:347 #4 0x559bce50826c in map__new util/map.c:175 #5 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787 #6 0x559bce512f6b in machines__deliver_event util/session.c:1481 #7 0x559bce515107 in perf_session__deliver_event util/session.c:1551 #8 0x559bce51d4d2 in do_flush util/ordered-events.c:244 #9 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323 #10 0x559bce519bea in __perf_session__process_events util/session.c:2268 #11 0x559bce519bea in perf_session__process_events util/session.c:2297 #12 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017 #13 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234 #14 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026 #15 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858 #16 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 #17 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 #18 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 #19 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539 #20 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308 Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7fea52341037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559bce520907 in nsinfo__copy util/namespaces.c:169 #2 0x559bce50821b in map__new util/map.c:168 #3 0x559bce503c92 in machine__process_mmap2_event util/machine.c:1787 #4 0x559bce512f6b in machines__deliver_event util/session.c:1481 #5 0x559bce515107 in perf_session__deliver_event util/session.c:1551 #6 0x559bce51d4d2 in do_flush util/ordered-events.c:244 #7 0x559bce51d4d2 in __ordered_events__flush util/ordered-events.c:323 #8 0x559bce519bea in __perf_session__process_events util/session.c:2268 #9 0x559bce519bea in perf_session__process_events util/session.c:2297 #10 0x559bce2e7a52 in process_buildids /home/namhyung/project/linux/tools/perf/builtin-record.c:1017 #11 0x559bce2e7a52 in record__finish_output /home/namhyung/project/linux/tools/perf/builtin-record.c:1234 #12 0x559bce2ed4f6 in __cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2026 #13 0x559bce2ed4f6 in cmd_record /home/namhyung/project/linux/tools/perf/builtin-record.c:2858 #14 0x559bce422db4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313 #15 0x559bce2acac8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365 #16 0x559bce2acac8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409 #17 0x559bce2acac8 in main /home/namhyung/project/linux/tools/perf/perf.c:539 #18 0x7fea51e76d09 in __libc_start_main ../csu/libc-start.c:308 SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s). Signed-off-by: Namhyung Kim <[email protected]> Acked-by: Jiri Olsa <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: http://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Hanjun Guo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
ghost
pushed a commit
that referenced
this pull request
Oct 6, 2021
commit 57f0ff0 upstream. It's later supposed to be either a correct address or NULL. Without the initialization, it may contain an undefined value which results in the following segmentation fault: # perf top --sort comm -g --ignore-callees=do_idle terminates with: #0 0x00007ffff56b7685 in __strlen_avx2 () from /lib64/libc.so.6 #1 0x00007ffff55e3802 in strdup () from /lib64/libc.so.6 #2 0x00005555558cb139 in hist_entry__init (callchain_size=<optimized out>, sample_self=true, template=0x7fffde7fb110, he=0x7fffd801c250) at util/hist.c:489 #3 hist_entry__new (template=template@entry=0x7fffde7fb110, sample_self=sample_self@entry=true) at util/hist.c:564 #4 0x00005555558cb4ba in hists__findnew_entry (hists=hists@entry=0x5555561d9e38, entry=entry@entry=0x7fffde7fb110, al=al@entry=0x7fffde7fb420, sample_self=sample_self@entry=true) at util/hist.c:657 #5 0x00005555558cba1b in __hists__add_entry (hists=hists@entry=0x5555561d9e38, al=0x7fffde7fb420, sym_parent=<optimized out>, bi=bi@entry=0x0, mi=mi@entry=0x0, sample=sample@entry=0x7fffde7fb4b0, sample_self=true, ops=0x0, block_info=0x0) at util/hist.c:288 #6 0x00005555558cbb70 in hists__add_entry (sample_self=true, sample=0x7fffde7fb4b0, mi=0x0, bi=0x0, sym_parent=<optimized out>, al=<optimized out>, hists=0x5555561d9e38) at util/hist.c:1056 #7 iter_add_single_cumulative_entry (iter=0x7fffde7fb460, al=<optimized out>) at util/hist.c:1056 #8 0x00005555558cc8a4 in hist_entry_iter__add (iter=iter@entry=0x7fffde7fb460, al=al@entry=0x7fffde7fb420, max_stack_depth=<optimized out>, arg=arg@entry=0x7fffffff7db0) at util/hist.c:1231 #9 0x00005555557cdc9a in perf_event__process_sample (machine=<optimized out>, sample=0x7fffde7fb4b0, evsel=<optimized out>, event=<optimized out>, tool=0x7fffffff7db0) at builtin-top.c:842 #10 deliver_event (qe=<optimized out>, qevent=<optimized out>) at builtin-top.c:1202 #11 0x00005555558a9318 in do_flush (show_progress=false, oe=0x7fffffff80e0) at util/ordered-events.c:244 #12 __ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP, timestamp=timestamp@entry=0) at util/ordered-events.c:323 #13 0x00005555558a9789 in __ordered_events__flush (timestamp=<optimized out>, how=<optimized out>, oe=<optimized out>) at util/ordered-events.c:339 #14 ordered_events__flush (how=OE_FLUSH__TOP, oe=0x7fffffff80e0) at util/ordered-events.c:341 #15 ordered_events__flush (oe=oe@entry=0x7fffffff80e0, how=how@entry=OE_FLUSH__TOP) at util/ordered-events.c:339 #16 0x00005555557cd631 in process_thread (arg=0x7fffffff7db0) at builtin-top.c:1114 #17 0x00007ffff7bb817a in start_thread () from /lib64/libpthread.so.0 #18 0x00007ffff5656dc3 in clone () from /lib64/libc.so.6 If you look at the frame #2, the code is: 488 if (he->srcline) { 489 he->srcline = strdup(he->srcline); 490 if (he->srcline == NULL) 491 goto err_rawdata; 492 } If he->srcline is not NULL (it is not NULL if it is uninitialized rubbish), it gets strdupped and strdupping a rubbish random string causes the problem. Also, if you look at the commit 1fb7d06, it adds the srcline property into the struct, but not initializing it everywhere needed. Committer notes: Now I see, when using --ignore-callees=do_idle we end up here at line 2189 in add_callchain_ip(): 2181 if (al.sym != NULL) { 2182 if (perf_hpp_list.parent && !*parent && 2183 symbol__match_regex(al.sym, &parent_regex)) 2184 *parent = al.sym; 2185 else if (have_ignore_callees && root_al && 2186 symbol__match_regex(al.sym, &ignore_callees_regex)) { 2187 /* Treat this symbol as the root, 2188 forgetting its callees. */ 2189 *root_al = al; 2190 callchain_cursor_reset(cursor); 2191 } 2192 } And the al that doesn't have the ->srcline field initialized will be copied to the root_al, so then, back to: 1211 int hist_entry_iter__add(struct hist_entry_iter *iter, struct addr_location *al, 1212 int max_stack_depth, void *arg) 1213 { 1214 int err, err2; 1215 struct map *alm = NULL; 1216 1217 if (al) 1218 alm = map__get(al->map); 1219 1220 err = sample__resolve_callchain(iter->sample, &callchain_cursor, &iter->parent, 1221 iter->evsel, al, max_stack_depth); 1222 if (err) { 1223 map__put(alm); 1224 return err; 1225 } 1226 1227 err = iter->ops->prepare_entry(iter, al); 1228 if (err) 1229 goto out; 1230 1231 err = iter->ops->add_single_entry(iter, al); 1232 if (err) 1233 goto out; 1234 That al at line 1221 is what hist_entry_iter__add() (called from sample__resolve_callchain()) saw as 'root_al', and then: iter->ops->add_single_entry(iter, al); will go on with al->srcline with a bogus value, I'll add the above sequence to the cset and apply, thanks! Signed-off-by: Michael Petlan <[email protected]> CC: Milian Wolff <[email protected]> Cc: Jiri Olsa <[email protected]> Fixes: 1fb7d06 ("perf report Use srcline from callchain for hist entries") Link: https //lore.kernel.org/r/[email protected] Reported-by: Juri Lelli <[email protected]> Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jan 27, 2022
commit fdc1223 upstream. If the string passed into qcom_pil_info_store() isn't as long as PIL_RELOC_NAME_LEN we'll try to copy the string assuming the length is PIL_RELOC_NAME_LEN to the io space and go beyond the bounds of the string. Let's only copy as many byes as the string is long, ignoring the NUL terminator. This fixes the following KASAN error: BUG: KASAN: global-out-of-bounds in __memcpy_toio+0x124/0x140 Read of size 1 at addr ffffffd35086e386 by task rmtfs/2392 CPU: 2 PID: 2392 Comm: rmtfs Tainted: G W 5.16.0-rc1-lockdep+ #10 Hardware name: Google Lazor (rev3+) with KB Backlight (DT) Call trace: dump_backtrace+0x0/0x410 show_stack+0x24/0x30 dump_stack_lvl+0x7c/0xa0 print_address_description+0x78/0x2bc kasan_report+0x160/0x1a0 __asan_report_load1_noabort+0x44/0x50 __memcpy_toio+0x124/0x140 qcom_pil_info_store+0x298/0x358 [qcom_pil_info] q6v5_start+0xdf0/0x12e0 [qcom_q6v5_mss] rproc_start+0x178/0x3a0 rproc_boot+0x5f0/0xb90 state_store+0x78/0x1bc dev_attr_store+0x70/0x90 sysfs_kf_write+0xf4/0x118 kernfs_fop_write_iter+0x208/0x300 vfs_write+0x55c/0x804 ksys_pwrite64+0xc8/0x134 __arm64_compat_sys_aarch32_pwrite64+0xc4/0xdc invoke_syscall+0x78/0x20c el0_svc_common+0x11c/0x1f0 do_el0_svc_compat+0x50/0x60 el0_svc_compat+0x5c/0xec el0t_32_sync_handler+0xc0/0xf0 el0t_32_sync+0x1a4/0x1a8 The buggy address belongs to the variable: .str.59+0x6/0xffffffffffffec80 [qcom_q6v5_mss] Memory state around the buggy address: ffffffd35086e280: 00 00 00 00 02 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 ffffffd35086e300: 00 02 f9 f9 f9 f9 f9 f9 00 00 00 06 f9 f9 f9 f9 >ffffffd35086e380: 06 f9 f9 f9 05 f9 f9 f9 00 00 00 00 00 06 f9 f9 ^ ffffffd35086e400: f9 f9 f9 f9 01 f9 f9 f9 04 f9 f9 f9 00 00 01 f9 ffffffd35086e480: f9 f9 f9 f9 00 00 00 00 00 00 00 01 f9 f9 f9 f9 Fixes: 549b67d ("remoteproc: qcom: Introduce helper to store pil info in IMEM") Signed-off-by: Stephen Boyd <[email protected]> Reviewed-by: Bjorn Andersson <[email protected]> Signed-off-by: Bjorn Andersson <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Mar 28, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Apr 7, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Apr 7, 2022
[ Upstream commit 4224cfd ] When bringing down the netdevice or system shutdown, a panic can be triggered while accessing the sysfs path because the device is already removed. [ 755.549084] mlx5_core 0000:12:00.1: Shutdown was called [ 756.404455] mlx5_core 0000:12:00.0: Shutdown was called ... [ 757.937260] BUG: unable to handle kernel NULL pointer dereference at (null) [ 758.031397] IP: [<ffffffff8ee11acb>] dma_pool_alloc+0x1ab/0x280 crash> bt ... PID: 12649 TASK: ffff8924108f2100 CPU: 1 COMMAND: "amsd" ... #9 [ffff89240e1a38b0] page_fault at ffffffff8f38c778 [exception RIP: dma_pool_alloc+0x1ab] RIP: ffffffff8ee11acb RSP: ffff89240e1a3968 RFLAGS: 00010046 RAX: 0000000000000246 RBX: ffff89243d874100 RCX: 0000000000001000 RDX: 0000000000000000 RSI: 0000000000000246 RDI: ffff89243d874090 RBP: ffff89240e1a39c0 R8: 000000000001f080 R9: ffff8905ffc03c00 R10: ffffffffc04680d4 R11: ffffffff8edde9fd R12: 00000000000080d0 R13: ffff89243d874090 R14: ffff89243d874080 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #10 [ffff89240e1a39c8] mlx5_alloc_cmd_msg at ffffffffc04680f3 [mlx5_core] #11 [ffff89240e1a3a18] cmd_exec at ffffffffc046ad62 [mlx5_core] #12 [ffff89240e1a3ab8] mlx5_cmd_exec at ffffffffc046b4fb [mlx5_core] #13 [ffff89240e1a3ae8] mlx5_core_access_reg at ffffffffc0475434 [mlx5_core] #14 [ffff89240e1a3b40] mlx5e_get_fec_caps at ffffffffc04a7348 [mlx5_core] #15 [ffff89240e1a3bb0] get_fec_supported_advertised at ffffffffc04992bf [mlx5_core] #16 [ffff89240e1a3c08] mlx5e_get_link_ksettings at ffffffffc049ab36 [mlx5_core] #17 [ffff89240e1a3ce8] __ethtool_get_link_ksettings at ffffffff8f25db46 #18 [ffff89240e1a3d48] speed_show at ffffffff8f277208 #19 [ffff89240e1a3dd8] dev_attr_show at ffffffff8f0b70e3 #20 [ffff89240e1a3df8] sysfs_kf_seq_show at ffffffff8eedbedf #21 [ffff89240e1a3e18] kernfs_seq_show at ffffffff8eeda596 #22 [ffff89240e1a3e28] seq_read at ffffffff8ee76d10 #23 [ffff89240e1a3e98] kernfs_fop_read at ffffffff8eedaef5 #24 [ffff89240e1a3ed8] vfs_read at ffffffff8ee4e3ff #25 [ffff89240e1a3f08] sys_read at ffffffff8ee4f27f #26 [ffff89240e1a3f50] system_call_fastpath at ffffffff8f395f92 crash> net_device.state ffff89443b0c0000 state = 0x5 (__LINK_STATE_START| __LINK_STATE_NOCARRIER) To prevent this scenario, we also make sure that the netdevice is present. Signed-off-by: suresh kumar <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Apr 19, 2022
[ Upstream commit fe2640b ] In remove_phb_dynamic() we use &phb->io_resource, after we've called device_unregister(&host_bridge->dev). But the unregister may have freed phb, because pcibios_free_controller_deferred() is the release function for the host_bridge. If there are no outstanding references when we call device_unregister() then phb will be freed out from under us. This has gone mainly unnoticed, but with slub_debug and page_poison enabled it can lead to a crash: PID: 7574 TASK: c0000000d492cb80 CPU: 13 COMMAND: "drmgr" #0 [c0000000e4f075a0] crash_kexec at c00000000027d7dc #1 [c0000000e4f075d0] oops_end at c000000000029608 #2 [c0000000e4f07650] __bad_page_fault at c0000000000904b4 #3 [c0000000e4f076c0] do_bad_slb_fault at c00000000009a5a8 #4 [c0000000e4f076f0] data_access_slb_common_virt at c000000000008b30 Data SLB Access [380] exception frame: R0: c000000000167250 R1: c0000000e4f07a00 R2: c000000002a46100 R3: c000000002b39ce8 R4: 00000000000000c0 R5: 00000000000000a9 R6: 3894674d000000c0 R7: 0000000000000000 R8: 00000000000000ff R9: 0000000000000100 R10: 6b6b6b6b6b6b6b6b R11: 0000000000008000 R12: c00000000023da80 R13: c0000009ffd38b00 R14: 0000000000000000 R15: 000000011c87f0f0 R16: 0000000000000006 R17: 0000000000000003 R18: 0000000000000002 R19: 0000000000000004 R20: 0000000000000005 R21: 000000011c87ede8 R22: 000000011c87c5a8 R23: 000000011c87d3a0 R24: 0000000000000000 R25: 0000000000000001 R26: c0000000e4f07cc8 R27: c00000004d1cc400 R28: c0080000031d00e8 R29: c00000004d23d800 R30: c00000004d1d2400 R31: c00000004d1d2540 NIP: c000000000167258 MSR: 8000000000009033 OR3: c000000000e9f474 CTR: 0000000000000000 LR: c000000000167250 XER: 0000000020040003 CCR: 0000000024088420 MQ: 0000000000000000 DAR: 6b6b6b6b6b6b6ba3 DSISR: c0000000e4f07920 Syscall Result: fffffffffffffff2 [NIP : release_resource+56] [LR : release_resource+48] #5 [c0000000e4f07a00] release_resource at c000000000167258 (unreliable) #6 [c0000000e4f07a30] remove_phb_dynamic at c000000000105648 #7 [c0000000e4f07ab0] dlpar_remove_slot at c0080000031a09e8 [rpadlpar_io] #8 [c0000000e4f07b50] remove_slot_store at c0080000031a0b9c [rpadlpar_io] #9 [c0000000e4f07be0] kobj_attr_store at c000000000817d8c #10 [c0000000e4f07c00] sysfs_kf_write at c00000000063e504 #11 [c0000000e4f07c20] kernfs_fop_write_iter at c00000000063d868 #12 [c0000000e4f07c70] new_sync_write at c00000000054339c #13 [c0000000e4f07d10] vfs_write at c000000000546624 #14 [c0000000e4f07d60] ksys_write at c0000000005469f4 #15 [c0000000e4f07db0] system_call_exception at c000000000030840 #16 [c0000000e4f07e10] system_call_vectored_common at c00000000000c168 To avoid it, we can take a reference to the host_bridge->dev until we're done using phb. Then when we drop the reference the phb will be freed. Fixes: 2dd9c11 ("powerpc/pseries: use pci_host_bridge.release_fn() to kfree(phb)") Reported-by: David Dai <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Tested-by: Sachin Sant <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
May 27, 2022
[ Upstream commit 4503cc7 ] Do not allow to write timestamps on RX rings if PF is being configured. When PF is being configured RX rings can be freed or rebuilt. If at the same time timestamps are updated, the kernel will crash by dereferencing null RX ring pointer. PID: 1449 TASK: ff187d28ed658040 CPU: 34 COMMAND: "ice-ptp-0000:51" #0 [ff1966a94a713bb0] machine_kexec at ffffffff9d05a0be truenas#1 [ff1966a94a713c08] __crash_kexec at ffffffff9d192e9d truenas#2 [ff1966a94a713cd0] crash_kexec at ffffffff9d1941bd truenas#3 [ff1966a94a713ce8] oops_end at ffffffff9d01bd54 truenas#4 [ff1966a94a713d08] no_context at ffffffff9d06bda4 truenas#5 [ff1966a94a713d60] __bad_area_nosemaphore at ffffffff9d06c10c truenas#6 [ff1966a94a713da8] do_page_fault at ffffffff9d06cae4 truenas#7 [ff1966a94a713de0] page_fault at ffffffff9da0107e [exception RIP: ice_ptp_update_cached_phctime+91] RIP: ffffffffc076db8b RSP: ff1966a94a713e98 RFLAGS: 00010246 RAX: 16e3db9c6b7ccae4 RBX: ff187d269dd3c180 RCX: ff187d269cd4d018 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 RBP: ff187d269cfcc644 R8: ff187d339b9641b0 R9: 0000000000000000 R10: 0000000000000002 R11: 0000000000000000 R12: ff187d269cfcc648 R13: ffffffff9f128784 R14: ffffffff9d101b70 R15: ff187d269cfcc640 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 truenas#8 [ff1966a94a713ea0] ice_ptp_periodic_work at ffffffffc076dbef [ice] truenas#9 [ff1966a94a713ee0] kthread_worker_fn at ffffffff9d101c1b truenas#10 [ff1966a94a713f10] kthread at ffffffff9d101b4d truenas#11 [ff1966a94a713f50] ret_from_fork at ffffffff9da0023f Fixes: 77a7811 ("ice: enable receive hardware timestamping") Signed-off-by: Arkadiusz Kubalewski <[email protected]> Reviewed-by: Michal Schmidt <[email protected]> Tested-by: Dave Cain <[email protected]> Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel) Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jun 10, 2022
[ Upstream commit 6b9dbed ] pty_write() invokes kmalloc() which may invoke a normal printk() to print failure message. This can cause a deadlock in the scenario reported by syz-bot below: CPU0 CPU1 CPU2 ---- ---- ---- lock(console_owner); lock(&port_lock_key); lock(&port->lock); lock(&port_lock_key); lock(&port->lock); lock(console_owner); As commit dbdda84 ("printk: Add console owner and waiter logic to load balance console writes") said, such deadlock can be prevented by using printk_deferred() in kmalloc() (which is invoked in the section guarded by the port->lock). But there are too many printk() on the kmalloc() path, and kmalloc() can be called from anywhere, so changing printk() to printk_deferred() is too complicated and inelegant. Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so that printk() will not be called, and this deadlock problem can be avoided. Syzbot reported the following lockdep error: ====================================================== WARNING: possible circular locking dependency detected 5.4.143-00237-g08ccc19a-dirty truenas#10 Not tainted ------------------------------------------------------ syz-executor.4/29420 is trying to acquire lock: ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline] ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023 but task is already holding lock: ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> truenas#2 (&port->lock){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 tty_port_tty_get drivers/tty/tty_port.c:288 [inline] <-- lock(&port->lock); tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47 serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767 serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] <-- lock(&port_lock_key); serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870 serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156 [...] -> truenas#1 (&port_lock_key){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198 <-- lock(&port_lock_key); call_console_drivers kernel/printk/printk.c:1819 [inline] console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504 vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024 <-- lock(console_owner); vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 register_console+0x8b3/0xc10 kernel/printk/printk.c:2829 univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681 console_init+0x49d/0x6d3 kernel/printk/printk.c:2915 start_kernel+0x5e9/0x879 init/main.c:713 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 -> #0 (console_owner){....}-{0:0}: [...] lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734 console_trylock_spinning kernel/printk/printk.c:1773 [inline] <-- lock(console_owner); vprintk_emit+0x307/0x470 kernel/printk/printk.c:2023 vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 fail_dump lib/fault-inject.c:45 [inline] should_fail+0x67b/0x7c0 lib/fault-inject.c:144 __should_failslab+0x152/0x1c0 mm/failslab.c:33 should_failslab+0x5/0x10 mm/slab_common.c:1224 slab_pre_alloc_hook mm/slab.h:468 [inline] slab_alloc_node mm/slub.c:2723 [inline] slab_alloc mm/slub.c:2807 [inline] __kmalloc+0x72/0x300 mm/slub.c:3871 kmalloc include/linux/slab.h:582 [inline] tty_buffer_alloc+0x23f/0x2a0 drivers/tty/tty_buffer.c:175 __tty_buffer_request_room+0x156/0x2a0 drivers/tty/tty_buffer.c:273 tty_insert_flip_string_fixed_flag+0x93/0x250 drivers/tty/tty_buffer.c:318 tty_insert_flip_string include/linux/tty_flip.h:37 [inline] pty_write+0x126/0x1f0 drivers/tty/pty.c:122 <-- lock(&port->lock); n_tty_write+0xa7a/0xfc0 drivers/tty/n_tty.c:2356 do_tty_write drivers/tty/tty_io.c:961 [inline] tty_write+0x512/0x930 drivers/tty/tty_io.c:1045 __vfs_write+0x76/0x100 fs/read_write.c:494 [...] other info that might help us debug this: Chain exists of: console_owner --> &port_lock_key --> &port->lock Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: b6da31b ("tty: Fix data race in tty_insert_flip_string_fixed_flag") Signed-off-by: Qi Zheng <[email protected]> Acked-by: Jiri Slaby <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Cc: Akinobu Mita <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 21, 2022
[ Upstream commit 6b9dbed ] pty_write() invokes kmalloc() which may invoke a normal printk() to print failure message. This can cause a deadlock in the scenario reported by syz-bot below: CPU0 CPU1 CPU2 ---- ---- ---- lock(console_owner); lock(&port_lock_key); lock(&port->lock); lock(&port_lock_key); lock(&port->lock); lock(console_owner); As commit dbdda84 ("printk: Add console owner and waiter logic to load balance console writes") said, such deadlock can be prevented by using printk_deferred() in kmalloc() (which is invoked in the section guarded by the port->lock). But there are too many printk() on the kmalloc() path, and kmalloc() can be called from anywhere, so changing printk() to printk_deferred() is too complicated and inelegant. Therefore, this patch chooses to specify __GFP_NOWARN to kmalloc(), so that printk() will not be called, and this deadlock problem can be avoided. Syzbot reported the following lockdep error: ====================================================== WARNING: possible circular locking dependency detected 5.4.143-00237-g08ccc19a-dirty #10 Not tainted ------------------------------------------------------ syz-executor.4/29420 is trying to acquire lock: ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: console_trylock_spinning kernel/printk/printk.c:1752 [inline] ffffffff8aedb2a0 (console_owner){....}-{0:0}, at: vprintk_emit+0x2ca/0x470 kernel/printk/printk.c:2023 but task is already holding lock: ffff8880119c9158 (&port->lock){-.-.}-{2:2}, at: pty_write+0xf4/0x1f0 drivers/tty/pty.c:120 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #2 (&port->lock){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 tty_port_tty_get drivers/tty/tty_port.c:288 [inline] <-- lock(&port->lock); tty_port_default_wakeup+0x1d/0xb0 drivers/tty/tty_port.c:47 serial8250_tx_chars+0x530/0xa80 drivers/tty/serial/8250/8250_port.c:1767 serial8250_handle_irq.part.0+0x31f/0x3d0 drivers/tty/serial/8250/8250_port.c:1854 serial8250_handle_irq drivers/tty/serial/8250/8250_port.c:1827 [inline] <-- lock(&port_lock_key); serial8250_default_handle_irq+0xb2/0x220 drivers/tty/serial/8250/8250_port.c:1870 serial8250_interrupt+0xfd/0x200 drivers/tty/serial/8250/8250_core.c:126 __handle_irq_event_percpu+0x109/0xa50 kernel/irq/handle.c:156 [...] -> #1 (&port_lock_key){-.-.}-{2:2}: __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:110 [inline] _raw_spin_lock_irqsave+0x35/0x50 kernel/locking/spinlock.c:159 serial8250_console_write+0x184/0xa40 drivers/tty/serial/8250/8250_port.c:3198 <-- lock(&port_lock_key); call_console_drivers kernel/printk/printk.c:1819 [inline] console_unlock+0x8cb/0xd00 kernel/printk/printk.c:2504 vprintk_emit+0x1b5/0x470 kernel/printk/printk.c:2024 <-- lock(console_owner); vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 register_console+0x8b3/0xc10 kernel/printk/printk.c:2829 univ8250_console_init+0x3a/0x46 drivers/tty/serial/8250/8250_core.c:681 console_init+0x49d/0x6d3 kernel/printk/printk.c:2915 start_kernel+0x5e9/0x879 init/main.c:713 secondary_startup_64+0xa4/0xb0 arch/x86/kernel/head_64.S:241 -> #0 (console_owner){....}-{0:0}: [...] lock_acquire+0x127/0x340 kernel/locking/lockdep.c:4734 console_trylock_spinning kernel/printk/printk.c:1773 [inline] <-- lock(console_owner); vprintk_emit+0x307/0x470 kernel/printk/printk.c:2023 vprintk_func+0x8d/0x250 kernel/printk/printk_safe.c:394 printk+0xba/0xed kernel/printk/printk.c:2084 fail_dump lib/fault-inject.c:45 [inline] should_fail+0x67b/0x7c0 lib/fault-inject.c:144 __should_failslab+0x152/0x1c0 mm/failslab.c:33 should_failslab+0x5/0x10 mm/slab_common.c:1224 slab_pre_alloc_hook mm/slab.h:468 [inline] slab_alloc_node mm/slub.c:2723 [inline] slab_alloc mm/slub.c:2807 [inline] __kmalloc+0x72/0x300 mm/slub.c:3871 kmalloc include/linux/slab.h:582 [inline] tty_buffer_alloc+0x23f/0x2a0 drivers/tty/tty_buffer.c:175 __tty_buffer_request_room+0x156/0x2a0 drivers/tty/tty_buffer.c:273 tty_insert_flip_string_fixed_flag+0x93/0x250 drivers/tty/tty_buffer.c:318 tty_insert_flip_string include/linux/tty_flip.h:37 [inline] pty_write+0x126/0x1f0 drivers/tty/pty.c:122 <-- lock(&port->lock); n_tty_write+0xa7a/0xfc0 drivers/tty/n_tty.c:2356 do_tty_write drivers/tty/tty_io.c:961 [inline] tty_write+0x512/0x930 drivers/tty/tty_io.c:1045 __vfs_write+0x76/0x100 fs/read_write.c:494 [...] other info that might help us debug this: Chain exists of: console_owner --> &port_lock_key --> &port->lock Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: b6da31b ("tty: Fix data race in tty_insert_flip_string_fixed_flag") Signed-off-by: Qi Zheng <[email protected]> Acked-by: Jiri Slaby <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> Cc: Akinobu Mita <[email protected]> Cc: Vlastimil Babka <[email protected]> Cc: Steven Rostedt (Google) <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Aug 14, 2022
[ Upstream commit ad25f5c ] There's a locking issue with the per-netns list of calls in rxrpc. The pieces of code that add and remove a call from the list use write_lock() and the calls procfile uses read_lock() to access it. However, the timer callback function may trigger a removal by trying to queue a call for processing and finding that it's already queued - at which point it has a spare refcount that it has to do something with. Unfortunately, if it puts the call and this reduces the refcount to 0, the call will be removed from the list. Unfortunately, since the _bh variants of the locking functions aren't used, this can deadlock. ================================ WARNING: inconsistent lock state 5.18.0-rc3-build4+ truenas#10 Not tainted -------------------------------- inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage. ksoftirqd/2/25 [HC0[0]:SC1[1]:HE1:SE0] takes: ffff888107ac4038 (&rxnet->call_lock){+.?.}-{2:2}, at: rxrpc_put_call+0x103/0x14b {SOFTIRQ-ON-W} state was registered at: ... Possible unsafe locking scenario: CPU0 ---- lock(&rxnet->call_lock); <Interrupt> lock(&rxnet->call_lock); *** DEADLOCK *** 1 lock held by ksoftirqd/2/25: #0: ffff8881008ffdb0 ((&call->timer)){+.-.}-{0:0}, at: call_timer_fn+0x5/0x23d Changes ======= ver truenas#2) - Changed to using list_next_rcu() rather than rcu_dereference() directly. Fixes: 17926a7 ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both") Signed-off-by: David Howells <[email protected]> cc: Marc Dionne <[email protected]> cc: [email protected] Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Aug 14, 2022
[ Upstream commit 391e982 ] It is trivial to craft a module to trigger OOB access in this line: if (info->secstrings[strhdr->sh_size - 1] != '\0') { BUG: unable to handle page fault for address: ffffc90000aa0fff PGD 100000067 P4D 100000067 PUD 100066067 PMD 10436f067 PTE 0 Oops: 0000 [truenas#1] PREEMPT SMP PTI CPU: 7 PID: 1215 Comm: insmod Not tainted 5.18.0-rc5-00007-g9bf578647087-dirty truenas#10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-4.fc34 04/01/2014 RIP: 0010:load_module+0x19b/0x2391 Fixes: ec2a295 ("module: harden ELF info handling") Signed-off-by: Alexey Dobriyan <[email protected]> [rebased patch onto modules-next] Signed-off-by: Luis Chamberlain <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Sep 3, 2022
[ Upstream commit 5a42f11 ] Fix the following scenario: 1. ethtool -L $IFACE rx 8 tx 96 2. xdpsock -q 10 -t -z Above refers to a case where user would like to attach XSK socket in txonly mode at a queue id that does not have a corresponding Rx queue. At this moment ice's XSK logic is tightly bound to act on a "queue pair", e.g. both Tx and Rx queues at a given queue id are disabled/enabled and both of them will get XSK pool assigned, which is broken for the presented queue configuration. This results in the splat included at the bottom, which is basically an OOB access to Rx ring array. To fix this, allow using the ids only in scope of "combined" queues reported by ethtool. However, logic should be rewritten to allow such configurations later on, which would end up as a complete rewrite of the control path, so let us go with this temporary fix. [420160.558008] BUG: kernel NULL pointer dereference, address: 0000000000000082 [420160.566359] #PF: supervisor read access in kernel mode [420160.572657] #PF: error_code(0x0000) - not-present page [420160.579002] PGD 0 P4D 0 [420160.582756] Oops: 0000 [truenas#1] PREEMPT SMP NOPTI [420160.588396] CPU: 10 PID: 21232 Comm: xdpsock Tainted: G OE 5.19.0-rc7+ truenas#10 [420160.597893] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [420160.609894] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420160.616968] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420160.639421] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420160.646650] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420160.655893] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420160.665166] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420160.674493] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420160.683833] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420160.693211] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420160.703645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420160.711783] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420160.721399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420160.731045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420160.740707] PKRU: 55555554 [420160.745960] Call Trace: [420160.750962] <TASK> [420160.755597] ? kmalloc_large_node+0x79/0x90 [420160.762703] ? __kmalloc_node+0x3f5/0x4b0 [420160.769341] xp_assign_dev+0xfd/0x210 [420160.775661] ? shmem_file_read_iter+0x29a/0x420 [420160.782896] xsk_bind+0x152/0x490 [420160.788943] __sys_bind+0xd0/0x100 [420160.795097] ? exit_to_user_mode_prepare+0x20/0x120 [420160.802801] __x64_sys_bind+0x16/0x20 [420160.809298] do_syscall_64+0x38/0x90 [420160.815741] entry_SYSCALL_64_after_hwframe+0x63/0xcd [420160.823731] RIP: 0033:0x7fa86a0dd2fb [420160.830264] Code: c3 66 0f 1f 44 00 00 48 8b 15 69 8b 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 f3 0f 1e fa b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 8b 0c 00 f7 d8 64 89 01 48 [420160.855410] RSP: 002b:00007ffc1146f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 [420160.866366] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa86a0dd2fb [420160.876957] RDX: 0000000000000010 RSI: 00007ffc1146f680 RDI: 0000000000000003 [420160.887604] RBP: 000055d7113a0520 R08: 00007fa868fb8000 R09: 0000000080000000 [420160.898293] R10: 0000000000008001 R11: 0000000000000246 R12: 000055d7113a04e0 [420160.909038] R13: 000055d7113a0320 R14: 000000000000000a R15: 0000000000000000 [420160.919817] </TASK> [420160.925659] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_me coretemp ioatdma mei ipmi_si wmi ipmi_msghandler acpi_pad acpi_power_meter ip_tables x_tables autofs4 ixgbe i40e crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci mdio dca libahci lpc_ich [last unloaded: ice] [420160.977576] CR2: 0000000000000082 [420160.985037] ---[ end trace 0000000000000000 ]--- [420161.097724] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420161.107341] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420161.134741] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420161.144274] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420161.155690] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420161.168088] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420161.179295] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420161.190420] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420161.201505] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420161.213628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420161.223413] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420161.234653] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420161.245893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420161.257052] PKRU: 55555554 Fixes: 2d4238f ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: George Kuruvinakunnel <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Sep 10, 2022
commit a0e44c6 upstream. A transaction of type BINDER_TYPE_WEAK_HANDLE can fail to increment the reference for a node. In this case, the target proc normally releases the failed reference upon close as expected. However, if the target is dying in parallel the call will race with binder_deferred_release(), so the target could have released all of its references by now leaving the cleanup of the new failed reference unhandled. The transaction then ends and the target proc gets released making the ref->proc now a dangling pointer. Later on, ref->node is closed and we attempt to take spin_lock(&ref->proc->inner_lock), which leads to the use-after-free bug reported below. Let's fix this by cleaning up the failed reference on the spot instead of relying on the target to do so. ================================================================== BUG: KASAN: use-after-free in _raw_spin_lock+0xa8/0x150 Write of size 4 at addr ffff5ca207094238 by task kworker/1:0/590 CPU: 1 PID: 590 Comm: kworker/1:0 Not tainted 5.19.0-rc8 truenas#10 Hardware name: linux,dummy-virt (DT) Workqueue: events binder_deferred_func Call trace: dump_backtrace.part.0+0x1d0/0x1e0 show_stack+0x18/0x70 dump_stack_lvl+0x68/0x84 print_report+0x2e4/0x61c kasan_report+0xa4/0x110 kasan_check_range+0xfc/0x1a4 __kasan_check_write+0x3c/0x50 _raw_spin_lock+0xa8/0x150 binder_deferred_func+0x5e0/0x9b0 process_one_work+0x38c/0x5f0 worker_thread+0x9c/0x694 kthread+0x188/0x190 ret_from_fork+0x10/0x20 Acked-by: Christian Brauner (Microsoft) <[email protected]> Signed-off-by: Carlos Llamas <[email protected]> Cc: stable <[email protected]> # 4.14+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Sep 14, 2022
[ Upstream commit 5a42f11 ] Fix the following scenario: 1. ethtool -L $IFACE rx 8 tx 96 2. xdpsock -q 10 -t -z Above refers to a case where user would like to attach XSK socket in txonly mode at a queue id that does not have a corresponding Rx queue. At this moment ice's XSK logic is tightly bound to act on a "queue pair", e.g. both Tx and Rx queues at a given queue id are disabled/enabled and both of them will get XSK pool assigned, which is broken for the presented queue configuration. This results in the splat included at the bottom, which is basically an OOB access to Rx ring array. To fix this, allow using the ids only in scope of "combined" queues reported by ethtool. However, logic should be rewritten to allow such configurations later on, which would end up as a complete rewrite of the control path, so let us go with this temporary fix. [420160.558008] BUG: kernel NULL pointer dereference, address: 0000000000000082 [420160.566359] #PF: supervisor read access in kernel mode [420160.572657] #PF: error_code(0x0000) - not-present page [420160.579002] PGD 0 P4D 0 [420160.582756] Oops: 0000 [#1] PREEMPT SMP NOPTI [420160.588396] CPU: 10 PID: 21232 Comm: xdpsock Tainted: G OE 5.19.0-rc7+ #10 [420160.597893] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019 [420160.609894] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420160.616968] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420160.639421] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420160.646650] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420160.655893] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420160.665166] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420160.674493] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420160.683833] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420160.693211] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420160.703645] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420160.711783] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420160.721399] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420160.731045] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420160.740707] PKRU: 55555554 [420160.745960] Call Trace: [420160.750962] <TASK> [420160.755597] ? kmalloc_large_node+0x79/0x90 [420160.762703] ? __kmalloc_node+0x3f5/0x4b0 [420160.769341] xp_assign_dev+0xfd/0x210 [420160.775661] ? shmem_file_read_iter+0x29a/0x420 [420160.782896] xsk_bind+0x152/0x490 [420160.788943] __sys_bind+0xd0/0x100 [420160.795097] ? exit_to_user_mode_prepare+0x20/0x120 [420160.802801] __x64_sys_bind+0x16/0x20 [420160.809298] do_syscall_64+0x38/0x90 [420160.815741] entry_SYSCALL_64_after_hwframe+0x63/0xcd [420160.823731] RIP: 0033:0x7fa86a0dd2fb [420160.830264] Code: c3 66 0f 1f 44 00 00 48 8b 15 69 8b 0c 00 f7 d8 64 89 02 b8 ff ff ff ff eb bc 0f 1f 44 00 00 f3 0f 1e fa b8 31 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 8b 0c 00 f7 d8 64 89 01 48 [420160.855410] RSP: 002b:00007ffc1146f618 EFLAGS: 00000246 ORIG_RAX: 0000000000000031 [420160.866366] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fa86a0dd2fb [420160.876957] RDX: 0000000000000010 RSI: 00007ffc1146f680 RDI: 0000000000000003 [420160.887604] RBP: 000055d7113a0520 R08: 00007fa868fb8000 R09: 0000000080000000 [420160.898293] R10: 0000000000008001 R11: 0000000000000246 R12: 000055d7113a04e0 [420160.909038] R13: 000055d7113a0320 R14: 000000000000000a R15: 0000000000000000 [420160.919817] </TASK> [420160.925659] Modules linked in: ice(OE) af_packet binfmt_misc nls_iso8859_1 ipmi_ssif intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp mei_me coretemp ioatdma mei ipmi_si wmi ipmi_msghandler acpi_pad acpi_power_meter ip_tables x_tables autofs4 ixgbe i40e crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd ahci mdio dca libahci lpc_ich [last unloaded: ice] [420160.977576] CR2: 0000000000000082 [420160.985037] ---[ end trace 0000000000000000 ]--- [420161.097724] RIP: 0010:ice_xsk_pool_setup+0x44/0x7d0 [ice] [420161.107341] Code: f3 48 83 ec 40 48 8b 4f 20 48 8b 3f 65 48 8b 04 25 28 00 00 00 48 89 44 24 38 31 c0 48 8d 04 ed 00 00 00 00 48 01 c1 48 8b 11 <0f> b7 92 82 00 00 00 48 85 d2 0f 84 2d 75 00 00 48 8d 72 ff 48 85 [420161.134741] RSP: 0018:ffffc9002d2afd48 EFLAGS: 00010282 [420161.144274] RAX: 0000000000000050 RBX: ffff88811d8bdd00 RCX: ffff888112c14ff8 [420161.155690] RDX: 0000000000000000 RSI: ffff88811d8bdd00 RDI: ffff888109861000 [420161.168088] RBP: 000000000000000a R08: 000000000000000a R09: 0000000000000000 [420161.179295] R10: 000000000000889f R11: 0000000000000000 R12: 000000000000000a [420161.190420] R13: 000000000000000a R14: 0000000000000000 R15: ffff888117611828 [420161.201505] FS: 00007fa869fc1f80(0000) GS:ffff8897e0880000(0000) knlGS:0000000000000000 [420161.213628] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [420161.223413] CR2: 0000000000000082 CR3: 00000001d076c001 CR4: 00000000007706e0 [420161.234653] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [420161.245893] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [420161.257052] PKRU: 55555554 Fixes: 2d4238f ("ice: Add support for AF_XDP") Signed-off-by: Maciej Fijalkowski <[email protected]> Tested-by: George Kuruvinakunnel <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Sep 14, 2022
commit a0e44c6 upstream. A transaction of type BINDER_TYPE_WEAK_HANDLE can fail to increment the reference for a node. In this case, the target proc normally releases the failed reference upon close as expected. However, if the target is dying in parallel the call will race with binder_deferred_release(), so the target could have released all of its references by now leaving the cleanup of the new failed reference unhandled. The transaction then ends and the target proc gets released making the ref->proc now a dangling pointer. Later on, ref->node is closed and we attempt to take spin_lock(&ref->proc->inner_lock), which leads to the use-after-free bug reported below. Let's fix this by cleaning up the failed reference on the spot instead of relying on the target to do so. ================================================================== BUG: KASAN: use-after-free in _raw_spin_lock+0xa8/0x150 Write of size 4 at addr ffff5ca207094238 by task kworker/1:0/590 CPU: 1 PID: 590 Comm: kworker/1:0 Not tainted 5.19.0-rc8 #10 Hardware name: linux,dummy-virt (DT) Workqueue: events binder_deferred_func Call trace: dump_backtrace.part.0+0x1d0/0x1e0 show_stack+0x18/0x70 dump_stack_lvl+0x68/0x84 print_report+0x2e4/0x61c kasan_report+0xa4/0x110 kasan_check_range+0xfc/0x1a4 __kasan_check_write+0x3c/0x50 _raw_spin_lock+0xa8/0x150 binder_deferred_func+0x5e0/0x9b0 process_one_work+0x38c/0x5f0 worker_thread+0x9c/0x694 kthread+0x188/0x190 ret_from_fork+0x10/0x20 Acked-by: Christian Brauner (Microsoft) <[email protected]> Signed-off-by: Carlos Llamas <[email protected]> Cc: stable <[email protected]> # 4.14+ Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Nov 18, 2022
[ Upstream commit 97f88a3 ] I found a null pointer reference in arch_prepare_kprobe(): # echo 'p cmdline_proc_show' > kprobe_events # echo 'p cmdline_proc_show+16' >> kprobe_events Kernel attempted to read user page (0) - exploit attempt? (uid: 0) BUG: Kernel NULL pointer dereference on read at 0x00000000 Faulting instruction address: 0xc000000000050bfc Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA PowerNV Modules linked in: CPU: 0 PID: 122 Comm: sh Not tainted 6.0.0-rc3-00007-gdcf8e5633e2e #10 NIP: c000000000050bfc LR: c000000000050bec CTR: 0000000000005bdc REGS: c0000000348475b0 TRAP: 0300 Not tainted (6.0.0-rc3-00007-gdcf8e5633e2e) MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 88002444 XER: 20040006 CFAR: c00000000022d100 DAR: 0000000000000000 DSISR: 40000000 IRQMASK: 0 ... NIP arch_prepare_kprobe+0x10c/0x2d0 LR arch_prepare_kprobe+0xfc/0x2d0 Call Trace: 0xc0000000012f77a0 (unreliable) register_kprobe+0x3c0/0x7a0 __register_trace_kprobe+0x140/0x1a0 __trace_kprobe_create+0x794/0x1040 trace_probe_create+0xc4/0xe0 create_or_delete_trace_kprobe+0x2c/0x80 trace_parse_run_command+0xf0/0x210 probes_write+0x20/0x40 vfs_write+0xfc/0x450 ksys_write+0x84/0x140 system_call_exception+0x17c/0x3a0 system_call_vectored_common+0xe8/0x278 --- interrupt: 3000 at 0x7fffa5682de0 NIP: 00007fffa5682de0 LR: 0000000000000000 CTR: 0000000000000000 REGS: c000000034847e80 TRAP: 3000 Not tainted (6.0.0-rc3-00007-gdcf8e5633e2e) MSR: 900000000280f033 <SF,HV,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 44002408 XER: 00000000 The address being probed has some special: cmdline_proc_show: Probe based on ftrace cmdline_proc_show+16: Probe for the next instruction at the ftrace location The ftrace-based kprobe does not generate kprobe::ainsn::insn, it gets set to NULL. In arch_prepare_kprobe() it will check for: ... prev = get_kprobe(p->addr - 1); preempt_enable_no_resched(); if (prev && ppc_inst_prefixed(ppc_inst_read(prev->ainsn.insn))) { ... If prev is based on ftrace, 'ppc_inst_read(prev->ainsn.insn)' will occur with a null pointer reference. At this point prev->addr will not be a prefixed instruction, so the check can be skipped. Check if prev is ftrace-based kprobe before reading 'prev->ainsn.insn' to fix this problem. Fixes: b4657f7 ("powerpc/kprobes: Don't allow breakpoints on suffixes") Signed-off-by: Li Huafei <[email protected]> [mpe: Trim oops] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jan 9, 2023
[ Upstream commit 93c660c ] ASAN reports an use-after-free in btf_dump_name_dups: ERROR: AddressSanitizer: heap-use-after-free on address 0xffff927006db at pc 0xaaaab5dfb618 bp 0xffffdd89b890 sp 0xffffdd89b928 READ of size 2 at 0xffff927006db thread T0 #0 0xaaaab5dfb614 in __interceptor_strcmp.part.0 (test_progs+0x21b614) truenas#1 0xaaaab635f144 in str_equal_fn tools/lib/bpf/btf_dump.c:127 truenas#2 0xaaaab635e3e0 in hashmap_find_entry tools/lib/bpf/hashmap.c:143 truenas#3 0xaaaab635e72c in hashmap__find tools/lib/bpf/hashmap.c:212 truenas#4 0xaaaab6362258 in btf_dump_name_dups tools/lib/bpf/btf_dump.c:1525 truenas#5 0xaaaab636240c in btf_dump_resolve_name tools/lib/bpf/btf_dump.c:1552 truenas#6 0xaaaab6362598 in btf_dump_type_name tools/lib/bpf/btf_dump.c:1567 truenas#7 0xaaaab6360b48 in btf_dump_emit_struct_def tools/lib/bpf/btf_dump.c:912 truenas#8 0xaaaab6360630 in btf_dump_emit_type tools/lib/bpf/btf_dump.c:798 truenas#9 0xaaaab635f720 in btf_dump__dump_type tools/lib/bpf/btf_dump.c:282 truenas#10 0xaaaab608523c in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:236 truenas#11 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 truenas#12 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 truenas#13 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 truenas#14 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 truenas#15 0xaaaab5d65990 (test_progs+0x185990) 0xffff927006db is located 11 bytes inside of 16-byte region [0xffff927006d0,0xffff927006e0) freed by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) truenas#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 truenas#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 truenas#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 truenas#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 truenas#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 truenas#6 0xaaaab6353e10 in btf__add_field tools/lib/bpf/btf.c:2032 truenas#7 0xaaaab6084fcc in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:232 truenas#8 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 truenas#9 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 truenas#10 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 truenas#11 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 truenas#12 0xaaaab5d65990 (test_progs+0x185990) previously allocated by thread T0 here: #0 0xaaaab5e2c7c4 in realloc (test_progs+0x24c7c4) truenas#1 0xaaaab634f4a0 in libbpf_reallocarray tools/lib/bpf/libbpf_internal.h:191 truenas#2 0xaaaab634f840 in libbpf_add_mem tools/lib/bpf/btf.c:163 truenas#3 0xaaaab636643c in strset_add_str_mem tools/lib/bpf/strset.c:106 truenas#4 0xaaaab6366560 in strset__add_str tools/lib/bpf/strset.c:157 truenas#5 0xaaaab6352d70 in btf__add_str tools/lib/bpf/btf.c:1519 truenas#6 0xaaaab6353ff0 in btf_add_enum_common tools/lib/bpf/btf.c:2070 truenas#7 0xaaaab6354080 in btf__add_enum tools/lib/bpf/btf.c:2102 truenas#8 0xaaaab6082f50 in test_btf_dump_incremental tools/testing/selftests/bpf/prog_tests/btf_dump.c:162 truenas#9 0xaaaab6097530 in test_btf_dump tools/testing/selftests/bpf/prog_tests/btf_dump.c:875 truenas#10 0xaaaab6314ed0 in run_one_test tools/testing/selftests/bpf/test_progs.c:1062 truenas#11 0xaaaab631a0a8 in main tools/testing/selftests/bpf/test_progs.c:1697 truenas#12 0xffff9676d214 in __libc_start_main ../csu/libc-start.c:308 truenas#13 0xaaaab5d65990 (test_progs+0x185990) The reason is that the key stored in hash table name_map is a string address, and the string memory is allocated by realloc() function, when the memory is resized by realloc() later, the old memory may be freed, so the address stored in name_map references to a freed memory, causing use-after-free. Fix it by storing duplicated string address in name_map. Fixes: 919d2b1 ("libbpf: Allow modification of BTF and add btf__add_str API") Signed-off-by: Xu Kuohai <[email protected]> Signed-off-by: Andrii Nakryiko <[email protected]> Acked-by: Martin KaFai Lau <[email protected]> Link: https://lore.kernel.org/bpf/[email protected] Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jan 9, 2023
…g the sock [ Upstream commit 3cf7203 ] There is a race condition in vxlan that when deleting a vxlan device during receiving packets, there is a possibility that the sock is released after getting vxlan_sock vs from sk_user_data. Then in later vxlan_ecn_decapsulate(), vxlan_get_sk_family() we will got NULL pointer dereference. e.g. #0 [ffffa25ec6978a38] machine_kexec at ffffffff8c669757 truenas#1 [ffffa25ec6978a90] __crash_kexec at ffffffff8c7c0a4d truenas#2 [ffffa25ec6978b58] crash_kexec at ffffffff8c7c1c48 truenas#3 [ffffa25ec6978b60] oops_end at ffffffff8c627f2b truenas#4 [ffffa25ec6978b80] page_fault_oops at ffffffff8c678fcb truenas#5 [ffffa25ec6978bd8] exc_page_fault at ffffffff8d109542 truenas#6 [ffffa25ec6978c00] asm_exc_page_fault at ffffffff8d200b62 [exception RIP: vxlan_ecn_decapsulate+0x3b] RIP: ffffffffc1014e7b RSP: ffffa25ec6978cb0 RFLAGS: 00010246 RAX: 0000000000000008 RBX: ffff8aa000888000 RCX: 0000000000000000 RDX: 000000000000000e RSI: ffff8a9fc7ab803e RDI: ffff8a9fd1168700 RBP: ffff8a9fc7ab803e R8: 0000000000700000 R9: 00000000000010ae R10: ffff8a9fcb748980 R11: 0000000000000000 R12: ffff8a9fd1168700 R13: ffff8aa000888000 R14: 00000000002a0000 R15: 00000000000010ae ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 truenas#7 [ffffa25ec6978ce8] vxlan_rcv at ffffffffc10189cd [vxlan] truenas#8 [ffffa25ec6978d90] udp_queue_rcv_one_skb at ffffffff8cfb6507 truenas#9 [ffffa25ec6978dc0] udp_unicast_rcv_skb at ffffffff8cfb6e45 truenas#10 [ffffa25ec6978dc8] __udp4_lib_rcv at ffffffff8cfb8807 truenas#11 [ffffa25ec6978e20] ip_protocol_deliver_rcu at ffffffff8cf76951 truenas#12 [ffffa25ec6978e48] ip_local_deliver at ffffffff8cf76bde truenas#13 [ffffa25ec6978ea0] __netif_receive_skb_one_core at ffffffff8cecde9b truenas#14 [ffffa25ec6978ec8] process_backlog at ffffffff8cece139 truenas#15 [ffffa25ec6978f00] __napi_poll at ffffffff8ceced1a truenas#16 [ffffa25ec6978f28] net_rx_action at ffffffff8cecf1f3 truenas#17 [ffffa25ec6978fa0] __softirqentry_text_start at ffffffff8d4000ca truenas#18 [ffffa25ec6978ff0] do_softirq at ffffffff8c6fbdc3 Reproducer: https://github.com/Mellanox/ovs-tests/blob/master/test-ovs-vxlan-remove-tunnel-during-traffic.sh Fix this by waiting for all sk_user_data reader to finish before releasing the sock. Reported-by: Jianlin Shi <[email protected]> Suggested-by: Jakub Sitnicki <[email protected]> Fixes: 6a93cc9 ("udp-tunnel: Add a few more UDP tunnel APIs") Signed-off-by: Hangbin Liu <[email protected]> Reviewed-by: Jiri Pirko <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Mar 24, 2023
[ Upstream commit cb80242 ] When running kfence_test, I found some testcases failed like this: # test_out_of_bounds_read: EXPECTATION FAILED at mm/kfence/kfence_test.c:346 Expected report_matches(&expect) to be true, but is false not ok 1 - test_out_of_bounds_read The corresponding call-trace is: BUG: KFENCE: out-of-bounds read in kunit_try_run_case+0x38/0x84 Out-of-bounds read at 0x(____ptrval____) (32B right of kfence-truenas#10): kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc The kfence_test using the first frame of call trace to check whether the testcase is succeed or not. Commit 6a00ef4 ("riscv: eliminate unreliable __builtin_frame_address(1)") skip first frame for all case, which results the kfence_test failed. Indeed, we only need to skip the first frame for case (task==NULL || task==current). With this patch, the call-trace will be: BUG: KFENCE: out-of-bounds read in test_out_of_bounds_read+0x88/0x19e Out-of-bounds read at 0x(____ptrval____) (1B left of kfence-truenas#7): test_out_of_bounds_read+0x88/0x19e kunit_try_run_case+0x38/0x84 kunit_generic_run_threadfn_adapter+0x12/0x1e kthread+0xc8/0xde ret_from_exception+0x0/0xc Fixes: 6a00ef4 ("riscv: eliminate unreliable __builtin_frame_address(1)") Signed-off-by: Liu Shixin <[email protected]> Tested-by: Samuel Holland <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Apr 20, 2023
[ Upstream commit 4e264be ] When a system with E810 with existing VFs gets rebooted the following hang may be observed. Pid 1 is hung in iavf_remove(), part of a network driver: PID: 1 TASK: ffff965400e5a340 CPU: 24 COMMAND: "systemd-shutdow" #0 [ffffaad04005fa50] __schedule at ffffffff8b3239cb #1 [ffffaad04005fae8] schedule at ffffffff8b323e2d #2 [ffffaad04005fb00] schedule_hrtimeout_range_clock at ffffffff8b32cebc #3 [ffffaad04005fb80] usleep_range_state at ffffffff8b32c930 #4 [ffffaad04005fbb0] iavf_remove at ffffffffc12b9b4c [iavf] #5 [ffffaad04005fbf0] pci_device_remove at ffffffff8add7513 #6 [ffffaad04005fc10] device_release_driver_internal at ffffffff8af08baa #7 [ffffaad04005fc40] pci_stop_bus_device at ffffffff8adcc5fc #8 [ffffaad04005fc60] pci_stop_and_remove_bus_device at ffffffff8adcc81e #9 [ffffaad04005fc70] pci_iov_remove_virtfn at ffffffff8adf9429 #10 [ffffaad04005fca8] sriov_disable at ffffffff8adf98e4 #11 [ffffaad04005fcc8] ice_free_vfs at ffffffffc04bb2c8 [ice] #12 [ffffaad04005fd10] ice_remove at ffffffffc04778fe [ice] #13 [ffffaad04005fd38] ice_shutdown at ffffffffc0477946 [ice] #14 [ffffaad04005fd50] pci_device_shutdown at ffffffff8add58f1 #15 [ffffaad04005fd70] device_shutdown at ffffffff8af05386 #16 [ffffaad04005fd98] kernel_restart at ffffffff8a92a870 #17 [ffffaad04005fda8] __do_sys_reboot at ffffffff8a92abd6 #18 [ffffaad04005fee0] do_syscall_64 at ffffffff8b317159 #19 [ffffaad04005ff08] __context_tracking_enter at ffffffff8b31b6fc #20 [ffffaad04005ff18] syscall_exit_to_user_mode at ffffffff8b31b50d #21 [ffffaad04005ff28] do_syscall_64 at ffffffff8b317169 #22 [ffffaad04005ff50] entry_SYSCALL_64_after_hwframe at ffffffff8b40009b RIP: 00007f1baa5c13d7 RSP: 00007fffbcc55a98 RFLAGS: 00000202 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f1baa5c13d7 RDX: 0000000001234567 RSI: 0000000028121969 RDI: 00000000fee1dead RBP: 00007fffbcc55ca0 R8: 0000000000000000 R9: 00007fffbcc54e90 R10: 00007fffbcc55050 R11: 0000000000000202 R12: 0000000000000005 R13: 0000000000000000 R14: 00007fffbcc55af0 R15: 0000000000000000 ORIG_RAX: 00000000000000a9 CS: 0033 SS: 002b During reboot all drivers PM shutdown callbacks are invoked. In iavf_shutdown() the adapter state is changed to __IAVF_REMOVE. In ice_shutdown() the call chain above is executed, which at some point calls iavf_remove(). However iavf_remove() expects the VF to be in one of the states __IAVF_RUNNING, __IAVF_DOWN or __IAVF_INIT_FAILED. If that's not the case it sleeps forever. So if iavf_shutdown() gets invoked before iavf_remove() the system will hang indefinitely because the adapter is already in state __IAVF_REMOVE. Fix this by returning from iavf_remove() if the state is __IAVF_REMOVE, as we already went through iavf_shutdown(). Fixes: 9745780 ("iavf: Add waiting so the port is initialized in remove") Fixes: a841733 ("iavf: Fix race condition between iavf_shutdown and iavf_remove") Reported-by: Marius Cornea <[email protected]> Signed-off-by: Stefan Assmann <[email protected]> Reviewed-by: Michal Kubiak <[email protected]> Tested-by: Rafal Romanowski <[email protected]> Signed-off-by: Tony Nguyen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jun 2, 2023
[ Upstream commit 05bb016 ] ACPICA commit 770653e3ba67c30a629ca7d12e352d83c2541b1e Before this change we see the following UBSAN stack trace in Fuchsia: #0 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 truenas#1.2 0x000020d0f660777f in ubsan_get_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:41 <libclang_rt.asan.so>+0x3d77f truenas#1.1 0x000020d0f660777f in maybe_print_stack_trace() compiler-rt/lib/ubsan/ubsan_diag.cpp:51 <libclang_rt.asan.so>+0x3d77f truenas#1 0x000020d0f660777f in ~scoped_report() compiler-rt/lib/ubsan/ubsan_diag.cpp:387 <libclang_rt.asan.so>+0x3d77f truenas#2 0x000020d0f660b96d in handlepointer_overflow_impl() compiler-rt/lib/ubsan/ubsan_handlers.cpp:809 <libclang_rt.asan.so>+0x4196d truenas#3 0x000020d0f660b50d in compiler-rt/lib/ubsan/ubsan_handlers.cpp:815 <libclang_rt.asan.so>+0x4150d truenas#4 0x000021e4213b3302 in acpi_ds_init_aml_walk(struct acpi_walk_state*, union acpi_parse_object*, struct acpi_namespace_node*, u8*, u32, struct acpi_evaluate_info*, u8) ../../third_party/acpica/source/components/dispatcher/dswstate.c:682 <platform-bus-x86.so>+0x233302 truenas#5 0x000021e4213e2369 in acpi_ds_call_control_method(struct acpi_thread_state*, struct acpi_walk_state*, union acpi_parse_object*) ../../third_party/acpica/source/components/dispatcher/dsmethod.c:605 <platform-bus-x86.so>+0x262369 truenas#6 0x000021e421437fac in acpi_ps_parse_aml(struct acpi_walk_state*) ../../third_party/acpica/source/components/parser/psparse.c:550 <platform-bus-x86.so>+0x2b7fac truenas#7 0x000021e4214464d2 in acpi_ps_execute_method(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/parser/psxface.c:244 <platform-bus-x86.so>+0x2c64d2 truenas#8 0x000021e4213aa052 in acpi_ns_evaluate(struct acpi_evaluate_info*) ../../third_party/acpica/source/components/namespace/nseval.c:250 <platform-bus-x86.so>+0x22a052 truenas#9 0x000021e421413dd8 in acpi_ns_init_one_device(acpi_handle, u32, void*, void**) ../../third_party/acpica/source/components/namespace/nsinit.c:735 <platform-bus-x86.so>+0x293dd8 truenas#10 0x000021e421429e98 in acpi_ns_walk_namespace(acpi_object_type, acpi_handle, u32, u32, acpi_walk_callback, acpi_walk_callback, void*, void**) ../../third_party/acpica/source/components/namespace/nswalk.c:298 <platform-bus-x86.so>+0x2a9e98 truenas#11 0x000021e4214131ac in acpi_ns_initialize_devices(u32) ../../third_party/acpica/source/components/namespace/nsinit.c:268 <platform-bus-x86.so>+0x2931ac truenas#12 0x000021e42147c40d in acpi_initialize_objects(u32) ../../third_party/acpica/source/components/utilities/utxfinit.c:304 <platform-bus-x86.so>+0x2fc40d truenas#13 0x000021e42126d603 in acpi::acpi_impl::initialize_acpi(acpi::acpi_impl*) ../../src/devices/board/lib/acpi/acpi-impl.cc:224 <platform-bus-x86.so>+0xed603 Add a simple check that avoids incrementing a pointer by zero, but otherwise behaves as before. Note that our findings are against ACPICA 20221020, but the same code exists on master. Link: acpica/acpica@770653e3 Signed-off-by: Bob Moore <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 25, 2023
[ Upstream commit 99d4850 ] Found by leak sanitizer: ``` ==1632594==ERROR: LeakSanitizer: detected memory leaks Direct leak of 21 byte(s) in 1 object(s) allocated from: #0 0x7f2953a7077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x556701d6fbbf in perf_env__read_cpuid util/env.c:369 #2 0x556701d70589 in perf_env__cpuid util/env.c:465 #3 0x55670204bba2 in x86__is_amd_cpu arch/x86/util/env.c:14 #4 0x5567020487a2 in arch__post_evsel_config arch/x86/util/evsel.c:83 #5 0x556701d8f78b in evsel__config util/evsel.c:1366 #6 0x556701ef5872 in evlist__config util/record.c:108 #7 0x556701cd6bcd in test__PERF_RECORD tests/perf-record.c:112 #8 0x556701cacd07 in run_test tests/builtin-test.c:236 #9 0x556701cacfac in test_and_print tests/builtin-test.c:265 #10 0x556701cadddb in __cmd_test tests/builtin-test.c:402 #11 0x556701caf2aa in cmd_test tests/builtin-test.c:559 #12 0x556701d3b557 in run_builtin tools/perf/perf.c:323 #13 0x556701d3bac8 in handle_internal_command tools/perf/perf.c:377 #14 0x556701d3be90 in run_argv tools/perf/perf.c:421 #15 0x556701d3c3f8 in main tools/perf/perf.c:537 #16 0x7f2952a46189 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 SUMMARY: AddressSanitizer: 21 byte(s) leaked in 1 allocation(s). ``` Fixes: f7b58cb ("perf mem/c2c: Add load store event mappings for AMD") Signed-off-by: Ian Rogers <[email protected]> Acked-by: Ravi Bangoria <[email protected]> Tested-by: Arnaldo Carvalho de Melo <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 25, 2023
[ Upstream commit b684c09 ] ppc_save_regs() skips one stack frame while saving the CPU register states. Instead of saving current R1, it pulls the previous stack frame pointer. When vmcores caused by direct panic call (such as `echo c > /proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the backtrace correctly. On further analysis, it was found that it was because of mismatch between r1 and NIP. GDB uses NIP to get current function symbol and uses corresponding debug info of that function to unwind previous frames, but due to the mismatching r1 and NIP, the unwinding does not work, and it fails to unwind to the 2nd frame and hence does not show the backtrace. GDB backtrace with vmcore of kernel without this patch: --------- (gdb) bt #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69 #1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974 #2 0x0000000000000063 in ?? () #3 0xc000000003579320 in ?? () --------- Further analysis revealed that the mismatch occurred because "ppc_save_regs" was saving the previous stack's SP instead of the current r1. This patch fixes this by storing current r1 in the saved pt_regs. GDB backtrace with vmcore of patched kernel: -------- (gdb) bt #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8) at ./arch/powerpc/include/asm/kexec.h:69 #1 __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974 #2 0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n") at kernel/panic.c:358 #3 0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155 #4 0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602 #5 0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163 #6 0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340 #7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352 #8 0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00, buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582 #9 0xc0000000005b4264 in ksys_write (fd=<optimized out>, buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2) at fs/read_write.c:637 #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:171 #11 0xc00000000000c270 in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:192 -------- Nick adds: So this now saves regs as though it was an interrupt taken in the caller, at the instruction after the call to ppc_save_regs, whereas previously the NIP was there, but R1 came from the caller's caller and that mismatch is what causes gdb's dwarf unwinder to go haywire. Signed-off-by: Aditya Gupta <[email protected]> Fixes: d16a58f ("powerpc: Improve ppc_save_regs()") Reivewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 25, 2023
commit 8785436 upstream. Shift operation of 'exp' and 'shift' variables exceeds the maximum number of shift values in the u32 range leading to UBSAN shift-out-of-bounds. ... [ 6.120512] UBSAN: shift-out-of-bounds in drivers/hid/amd-sfh-hid/sfh1_1/amd_sfh_desc.c:149:50 [ 6.120598] shift exponent 104 is too large for 64-bit type 'long unsigned int' [ 6.120659] CPU: 4 PID: 96 Comm: kworker/4:1 Not tainted 6.4.0amd_1-next-20230519-dirty #10 [ 6.120665] Hardware name: AMD Birman-PHX/Birman-PHX, BIOS SFH_with_HPD_SEN.FD 04/05/2023 [ 6.120667] Workqueue: events amd_sfh_work_buffer [amd_sfh] [ 6.120687] Call Trace: [ 6.120690] <TASK> [ 6.120694] dump_stack_lvl+0x48/0x70 [ 6.120704] dump_stack+0x10/0x20 [ 6.120707] ubsan_epilogue+0x9/0x40 [ 6.120716] __ubsan_handle_shift_out_of_bounds+0x10f/0x170 [ 6.120720] ? psi_group_change+0x25f/0x4b0 [ 6.120729] float_to_int.cold+0x18/0xba [amd_sfh] [ 6.120739] get_input_rep+0x57/0x340 [amd_sfh] [ 6.120748] ? __schedule+0xba7/0x1b60 [ 6.120756] ? __pfx_get_input_rep+0x10/0x10 [amd_sfh] [ 6.120764] amd_sfh_work_buffer+0x91/0x180 [amd_sfh] [ 6.120772] process_one_work+0x229/0x430 [ 6.120780] worker_thread+0x4a/0x3c0 [ 6.120784] ? __pfx_worker_thread+0x10/0x10 [ 6.120788] kthread+0xf7/0x130 [ 6.120792] ? __pfx_kthread+0x10/0x10 [ 6.120795] ret_from_fork+0x29/0x50 [ 6.120804] </TASK> ... Fix this by adding the condition to validate shift ranges. Fixes: 93ce5e0 ("HID: amd_sfh: Implement SFH1.1 functionality") Cc: [email protected] Tested-by: Kai-Heng Feng <[email protected]> Signed-off-by: Basavaraj Natikar <[email protected]> Signed-off-by: Akshata MukundShetty <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Benjamin Tissoires <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 31, 2023
[ Upstream commit b684c09 ] ppc_save_regs() skips one stack frame while saving the CPU register states. Instead of saving current R1, it pulls the previous stack frame pointer. When vmcores caused by direct panic call (such as `echo c > /proc/sysrq-trigger`), are debugged with gdb, gdb fails to show the backtrace correctly. On further analysis, it was found that it was because of mismatch between r1 and NIP. GDB uses NIP to get current function symbol and uses corresponding debug info of that function to unwind previous frames, but due to the mismatching r1 and NIP, the unwinding does not work, and it fails to unwind to the 2nd frame and hence does not show the backtrace. GDB backtrace with vmcore of kernel without this patch: --------- (gdb) bt #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=<optimized out>, newregs=0xc000000004f8f8d8) at ./arch/powerpc/include/asm/kexec.h:69 #1 __crash_kexec (regs=<optimized out>) at kernel/kexec_core.c:974 #2 0x0000000000000063 in ?? () #3 0xc000000003579320 in ?? () --------- Further analysis revealed that the mismatch occurred because "ppc_save_regs" was saving the previous stack's SP instead of the current r1. This patch fixes this by storing current r1 in the saved pt_regs. GDB backtrace with vmcore of patched kernel: -------- (gdb) bt #0 0xc0000000002a53e8 in crash_setup_regs (oldregs=0x0, newregs=0xc00000000670b8d8) at ./arch/powerpc/include/asm/kexec.h:69 #1 __crash_kexec (regs=regs@entry=0x0) at kernel/kexec_core.c:974 #2 0xc000000000168918 in panic (fmt=fmt@entry=0xc000000001654a60 "sysrq triggered crash\n") at kernel/panic.c:358 #3 0xc000000000b735f8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:155 #4 0xc000000000b742cc in __handle_sysrq (key=key@entry=99, check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:602 #5 0xc000000000b7506c in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1163 #6 0xc00000000069a7bc in pde_write (ppos=<optimized out>, count=<optimized out>, buf=<optimized out>, file=<optimized out>, pde=0xc00000000362cb40) at fs/proc/inode.c:340 #7 proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:352 #8 0xc0000000005b3bbc in vfs_write (file=file@entry=0xc000000006aa6b00, buf=buf@entry=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=count@entry=2, pos=pos@entry=0xc00000000670bda0) at fs/read_write.c:582 #9 0xc0000000005b4264 in ksys_write (fd=<optimized out>, buf=0x61f498b4f60 <error: Cannot access memory at address 0x61f498b4f60>, count=2) at fs/read_write.c:637 #10 0xc00000000002ea2c in system_call_exception (regs=0xc00000000670be80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:171 #11 0xc00000000000c270 in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:192 -------- Nick adds: So this now saves regs as though it was an interrupt taken in the caller, at the instruction after the call to ppc_save_regs, whereas previously the NIP was there, but R1 came from the caller's caller and that mismatch is what causes gdb's dwarf unwinder to go haywire. Signed-off-by: Aditya Gupta <[email protected]> Fixes: d16a58f ("powerpc: Improve ppc_save_regs()") Reivewed-by: Nicholas Piggin <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> Link: https://msgid.link/[email protected] Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Sep 20, 2023
[ Upstream commit 7962ef1 ] In 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") it only was freeing if strcmp(evsel->tp_format->system, "syscalls") returned zero, while the corresponding initialization of evsel->priv was being performed if it was _not_ zero, i.e. if the tp system wasn't 'syscalls'. Just stop looking for that and free it if evsel->priv was set, which should be equivalent. Also use the pre-existing evsel_trace__delete() function. This resolves these leaks, detected with: $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin ================================================================= ==481565==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) truenas#1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) truenas#2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 truenas#3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 truenas#4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 truenas#5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 truenas#6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212 truenas#7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 truenas#8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 truenas#9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 truenas#10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 truenas#11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 truenas#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 truenas#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) truenas#1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) truenas#2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 truenas#3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 truenas#4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 truenas#5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 truenas#6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205 truenas#7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 truenas#8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 truenas#9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 truenas#10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 truenas#11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 truenas#12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 truenas#13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s). [root@quaco ~]# With this we plug all leaks with "perf trace sleep 1". Fixes: 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Sep 20, 2023
commit 314ded5 upstream. The kernel IRQ system needs the irq affinity notifier to be clear before attempting to free the irq, see WARN_ON log below. On a normal driver unload we don't have this issue since we do the complete cleanup of the irq resources. To fix this, put the important resources cleanup in a helper function and use it in both normal driver unload and shutdown flows. [ 4497.498434] ------------[ cut here ]------------ [ 4497.498726] WARNING: CPU: 0 PID: 9 at kernel/irq/manage.c:2034 free_irq+0x295/0x340 [ 4497.499193] Modules linked in: [ 4497.499386] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.4.0-rc4+ truenas#10 [ 4497.499876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 [ 4497.500518] Workqueue: events do_poweroff [ 4497.500849] RIP: 0010:free_irq+0x295/0x340 [ 4497.501132] Code: 85 c0 0f 84 1d ff ff ff 48 89 ef ff d0 0f 1f 00 e9 10 ff ff ff 0f 0b e9 72 ff ff ff 49 8d 7f 28 ff d0 0f 1f 00 e9 df fd ff ff <0f> 0b 48 c7 80 c0 008 [ 4497.502269] RSP: 0018:ffffc90000053da0 EFLAGS: 00010282 [ 4497.502589] RAX: ffff888100949600 RBX: ffff88810330b948 RCX: 0000000000000000 [ 4497.503035] RDX: ffff888100949600 RSI: ffff888100400490 RDI: 0000000000000023 [ 4497.503472] RBP: ffff88810330c7e0 R08: ffff8881004005d0 R09: ffffffff8273a260 [ 4497.503923] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881009ae000 [ 4497.504359] R13: ffff8881009ae148 R14: 0000000000000000 R15: ffff888100949600 [ 4497.504804] FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [ 4497.505302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4497.505671] CR2: 00007fce98806298 CR3: 000000000262e005 CR4: 0000000000370ef0 [ 4497.506104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4497.506540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4497.507002] Call Trace: [ 4497.507158] <TASK> [ 4497.507299] ? free_irq+0x295/0x340 [ 4497.507522] ? __warn+0x7c/0x130 [ 4497.507740] ? free_irq+0x295/0x340 [ 4497.507963] ? report_bug+0x171/0x1a0 [ 4497.508197] ? handle_bug+0x3c/0x70 [ 4497.508417] ? exc_invalid_op+0x17/0x70 [ 4497.508662] ? asm_exc_invalid_op+0x1a/0x20 [ 4497.508926] ? free_irq+0x295/0x340 [ 4497.509146] mlx5_irq_pool_free_irqs+0x48/0x90 [ 4497.509421] mlx5_irq_table_free_irqs+0x38/0x50 [ 4497.509714] mlx5_core_eq_free_irqs+0x27/0x40 [ 4497.509984] shutdown+0x7b/0x100 [ 4497.510184] pci_device_shutdown+0x30/0x60 [ 4497.510440] device_shutdown+0x14d/0x240 [ 4497.510698] kernel_power_off+0x30/0x70 [ 4497.510938] process_one_work+0x1e6/0x3e0 [ 4497.511183] worker_thread+0x49/0x3b0 [ 4497.511407] ? __pfx_worker_thread+0x10/0x10 [ 4497.511679] kthread+0xe0/0x110 [ 4497.511879] ? __pfx_kthread+0x10/0x10 [ 4497.512114] ret_from_fork+0x29/0x50 [ 4497.512342] </TASK> Fixes: 9c2d080 ("net/mlx5: Free irqs only on shutdown callback") Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Shay Drory <[email protected]> Signed-off-by: Mathieu Tortuyaux <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Sep 27, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 truenas#10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 truenas#11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 truenas#12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 truenas#13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 truenas#14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 truenas#15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 truenas#16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 truenas#17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 truenas#18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 truenas#19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 truenas#20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 truenas#21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 truenas#22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 truenas#23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 truenas#24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 truenas#25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 truenas#26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 truenas#27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 truenas#10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 truenas#11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 truenas#12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 truenas#13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 truenas#14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 truenas#15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 truenas#16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 truenas#17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Oct 2, 2023
[ Upstream commit 7962ef1 ] In 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") it only was freeing if strcmp(evsel->tp_format->system, "syscalls") returned zero, while the corresponding initialization of evsel->priv was being performed if it was _not_ zero, i.e. if the tp system wasn't 'syscalls'. Just stop looking for that and free it if evsel->priv was set, which should be equivalent. Also use the pre-existing evsel_trace__delete() function. This resolves these leaks, detected with: $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin ================================================================= ==481565==ERROR: LeakSanitizer: detected memory leaks Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212 #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) Direct leak of 40 byte(s) in 1 object(s) allocated from: #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097) #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966) #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307 #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333 #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458 #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480 #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205 #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891 #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156 #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323 #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377 #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421 #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537 #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f) SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s). [root@quaco ~]# With this we plug all leaks with "perf trace sleep 1". Fixes: 3cb4d5e ("perf trace: Free syscall tp fields in evsel->priv") Acked-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Riccardo Mancini <[email protected]> Link: https://lore.kernel.org/lkml/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Oct 2, 2023
commit 314ded5 upstream. The kernel IRQ system needs the irq affinity notifier to be clear before attempting to free the irq, see WARN_ON log below. On a normal driver unload we don't have this issue since we do the complete cleanup of the irq resources. To fix this, put the important resources cleanup in a helper function and use it in both normal driver unload and shutdown flows. [ 4497.498434] ------------[ cut here ]------------ [ 4497.498726] WARNING: CPU: 0 PID: 9 at kernel/irq/manage.c:2034 free_irq+0x295/0x340 [ 4497.499193] Modules linked in: [ 4497.499386] CPU: 0 PID: 9 Comm: kworker/0:1 Tainted: G W 6.4.0-rc4+ #10 [ 4497.499876] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 [ 4497.500518] Workqueue: events do_poweroff [ 4497.500849] RIP: 0010:free_irq+0x295/0x340 [ 4497.501132] Code: 85 c0 0f 84 1d ff ff ff 48 89 ef ff d0 0f 1f 00 e9 10 ff ff ff 0f 0b e9 72 ff ff ff 49 8d 7f 28 ff d0 0f 1f 00 e9 df fd ff ff <0f> 0b 48 c7 80 c0 008 [ 4497.502269] RSP: 0018:ffffc90000053da0 EFLAGS: 00010282 [ 4497.502589] RAX: ffff888100949600 RBX: ffff88810330b948 RCX: 0000000000000000 [ 4497.503035] RDX: ffff888100949600 RSI: ffff888100400490 RDI: 0000000000000023 [ 4497.503472] RBP: ffff88810330c7e0 R08: ffff8881004005d0 R09: ffffffff8273a260 [ 4497.503923] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8881009ae000 [ 4497.504359] R13: ffff8881009ae148 R14: 0000000000000000 R15: ffff888100949600 [ 4497.504804] FS: 0000000000000000(0000) GS:ffff88813bc00000(0000) knlGS:0000000000000000 [ 4497.505302] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 4497.505671] CR2: 00007fce98806298 CR3: 000000000262e005 CR4: 0000000000370ef0 [ 4497.506104] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 4497.506540] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 4497.507002] Call Trace: [ 4497.507158] <TASK> [ 4497.507299] ? free_irq+0x295/0x340 [ 4497.507522] ? __warn+0x7c/0x130 [ 4497.507740] ? free_irq+0x295/0x340 [ 4497.507963] ? report_bug+0x171/0x1a0 [ 4497.508197] ? handle_bug+0x3c/0x70 [ 4497.508417] ? exc_invalid_op+0x17/0x70 [ 4497.508662] ? asm_exc_invalid_op+0x1a/0x20 [ 4497.508926] ? free_irq+0x295/0x340 [ 4497.509146] mlx5_irq_pool_free_irqs+0x48/0x90 [ 4497.509421] mlx5_irq_table_free_irqs+0x38/0x50 [ 4497.509714] mlx5_core_eq_free_irqs+0x27/0x40 [ 4497.509984] shutdown+0x7b/0x100 [ 4497.510184] pci_device_shutdown+0x30/0x60 [ 4497.510440] device_shutdown+0x14d/0x240 [ 4497.510698] kernel_power_off+0x30/0x70 [ 4497.510938] process_one_work+0x1e6/0x3e0 [ 4497.511183] worker_thread+0x49/0x3b0 [ 4497.511407] ? __pfx_worker_thread+0x10/0x10 [ 4497.511679] kthread+0xe0/0x110 [ 4497.511879] ? __pfx_kthread+0x10/0x10 [ 4497.512114] ret_from_fork+0x29/0x50 [ 4497.512342] </TASK> Fixes: 9c2d080 ("net/mlx5: Free irqs only on shutdown callback") Signed-off-by: Saeed Mahameed <[email protected]> Reviewed-by: Shay Drory <[email protected]> Signed-off-by: Mathieu Tortuyaux <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Oct 2, 2023
commit 0b0747d upstream. The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled. PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0 The lock is used to synchronize different sysfs operations, it doesn't protect any resource that will be touched by an interrupt. Consequently it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock. Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Cc: [email protected] Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Oct 17, 2023
[ Upstream commit a154f5f ] The following call trace shows a deadlock issue due to recursive locking of mutex "device_mutex". First lock acquire is in target_for_each_device() and second in target_free_device(). PID: 148266 TASK: ffff8be21ffb5d00 CPU: 10 COMMAND: "iscsi_ttx" #0 [ffffa2bfc9ec3b18] __schedule at ffffffffa8060e7f truenas#1 [ffffa2bfc9ec3ba0] schedule at ffffffffa8061224 truenas#2 [ffffa2bfc9ec3bb8] schedule_preempt_disabled at ffffffffa80615ee truenas#3 [ffffa2bfc9ec3bc8] __mutex_lock at ffffffffa8062fd7 truenas#4 [ffffa2bfc9ec3c40] __mutex_lock_slowpath at ffffffffa80631d3 truenas#5 [ffffa2bfc9ec3c50] mutex_lock at ffffffffa806320c truenas#6 [ffffa2bfc9ec3c68] target_free_device at ffffffffc0935998 [target_core_mod] truenas#7 [ffffa2bfc9ec3c90] target_core_dev_release at ffffffffc092f975 [target_core_mod] truenas#8 [ffffa2bfc9ec3ca0] config_item_put at ffffffffa79d250f truenas#9 [ffffa2bfc9ec3cd0] config_item_put at ffffffffa79d2583 truenas#10 [ffffa2bfc9ec3ce0] target_devices_idr_iter at ffffffffc0933f3a [target_core_mod] truenas#11 [ffffa2bfc9ec3d00] idr_for_each at ffffffffa803f6fc truenas#12 [ffffa2bfc9ec3d60] target_for_each_device at ffffffffc0935670 [target_core_mod] truenas#13 [ffffa2bfc9ec3d98] transport_deregister_session at ffffffffc0946408 [target_core_mod] truenas#14 [ffffa2bfc9ec3dc8] iscsit_close_session at ffffffffc09a44a6 [iscsi_target_mod] truenas#15 [ffffa2bfc9ec3df0] iscsit_close_connection at ffffffffc09a4a88 [iscsi_target_mod] truenas#16 [ffffa2bfc9ec3df8] finish_task_switch at ffffffffa76e5d07 truenas#17 [ffffa2bfc9ec3e78] iscsit_take_action_for_connection_exit at ffffffffc0991c23 [iscsi_target_mod] truenas#18 [ffffa2bfc9ec3ea0] iscsi_target_tx_thread at ffffffffc09a403b [iscsi_target_mod] truenas#19 [ffffa2bfc9ec3f08] kthread at ffffffffa76d8080 truenas#20 [ffffa2bfc9ec3f50] ret_from_fork at ffffffffa8200364 Fixes: 36d4cb4 ("scsi: target: Avoid that EXTENDED COPY commands trigger lock inversion") Signed-off-by: Junxiao Bi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Mike Christie <[email protected]> Signed-off-by: Martin K. Petersen <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Nov 20, 2023
[ Upstream commit a84fbf2 ] Generating metrics llc_code_read_mpi_demand_plus_prefetch, llc_data_read_mpi_demand_plus_prefetch, llc_miss_local_memory_bandwidth_read, llc_miss_local_memory_bandwidth_write, nllc_miss_remote_memory_bandwidth_read, memory_bandwidth_read, memory_bandwidth_write, uncore_frequency, upi_data_transmit_bw, C2_Pkg_Residency, C3_Core_Residency, C3_Pkg_Residency, C6_Core_Residency, C6_Pkg_Residency, C7_Core_Residency, C7_Pkg_Residency, UNCORE_FREQ and tma_info_system_socket_clks would trigger an address sanitizer heap-buffer-overflows on a SkylakeX. ``` ==2567752==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020003ed098 at pc 0x5621a816654e bp 0x7fffb55d4da0 sp 0x7fffb55d4d98 READ of size 4 at 0x5020003eee78 thread T0 #0 0x558265d6654d in aggr_cpu_id__is_empty tools/perf/util/cpumap.c:694:12 #1 0x558265c914da in perf_stat__get_aggr tools/perf/builtin-stat.c:1490:6 #2 0x558265c914da in perf_stat__get_global_cached tools/perf/builtin-stat.c:1530:9 #3 0x558265e53290 in should_skip_zero_counter tools/perf/util/stat-display.c:947:31 #4 0x558265e53290 in print_counter_aggrdata tools/perf/util/stat-display.c:985:18 #5 0x558265e51931 in print_counter tools/perf/util/stat-display.c:1110:3 #6 0x558265e51931 in evlist__print_counters tools/perf/util/stat-display.c:1571:5 #7 0x558265c8ec87 in print_counters tools/perf/builtin-stat.c:981:2 #8 0x558265c8cc71 in cmd_stat tools/perf/builtin-stat.c:2837:3 #9 0x558265bb9bd4 in run_builtin tools/perf/perf.c:323:11 #10 0x558265bb98eb in handle_internal_command tools/perf/perf.c:377:8 #11 0x558265bb9389 in run_argv tools/perf/perf.c:421:2 #12 0x558265bb9389 in main tools/perf/perf.c:537:3 ``` The issue was the use of testing a cpumap with NULL rather than using empty, as a map containing the dummy value isn't NULL and the -1 results in an empty aggr map being allocated which legitimately overflows when any member is accessed. Fixes: 8a96f45 ("perf stat: Avoid SEGV if core.cpus isn't set") Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Nov 20, 2023
[ Upstream commit ede72dc ] Fuzzing found that an invalid tracepoint name would create a memory leak with an address sanitizer build: ``` $ perf stat -e '*:o/' true event syntax error: '*:o/' \___ parser error Run 'perf list' for a list of valid events Usage: perf stat [<options>] [<command>] -e, --event <event> event selector. use 'perf list' to list available events ================================================================= ==59380==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4 byte(s) in 2 object(s) allocated from: #0 0x7f38ac07077b in __interceptor_strdup ../../../../src/libsanitizer/asan/asan_interceptors.cpp:439 #1 0x55f2f41be73b in str util/parse-events.l:49 #2 0x55f2f41d08e8 in parse_events_lex util/parse-events.l:338 #3 0x55f2f41dc3b1 in parse_events_parse util/parse-events-bison.c:1464 #4 0x55f2f410b8b3 in parse_events__scanner util/parse-events.c:1822 #5 0x55f2f410d1b9 in __parse_events util/parse-events.c:2094 #6 0x55f2f410e57f in parse_events_option util/parse-events.c:2279 #7 0x55f2f4427b56 in get_value tools/lib/subcmd/parse-options.c:251 #8 0x55f2f4428d98 in parse_short_opt tools/lib/subcmd/parse-options.c:351 #9 0x55f2f4429d80 in parse_options_step tools/lib/subcmd/parse-options.c:539 #10 0x55f2f442acb9 in parse_options_subcommand tools/lib/subcmd/parse-options.c:654 #11 0x55f2f3ec99fc in cmd_stat tools/perf/builtin-stat.c:2501 #12 0x55f2f4093289 in run_builtin tools/perf/perf.c:322 #13 0x55f2f40937f5 in handle_internal_command tools/perf/perf.c:375 #14 0x55f2f4093bbd in run_argv tools/perf/perf.c:419 #15 0x55f2f409412b in main tools/perf/perf.c:535 SUMMARY: AddressSanitizer: 4 byte(s) leaked in 2 allocation(s). ``` Fix by adding the missing destructor. Fixes: 865582c ("perf tools: Adds the tracepoint name parsing support") Signed-off-by: Ian Rogers <[email protected]> Cc: He Kuang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Nov 21, 2023
[ Upstream commit a84fbf2 ] Generating metrics llc_code_read_mpi_demand_plus_prefetch, llc_data_read_mpi_demand_plus_prefetch, llc_miss_local_memory_bandwidth_read, llc_miss_local_memory_bandwidth_write, nllc_miss_remote_memory_bandwidth_read, memory_bandwidth_read, memory_bandwidth_write, uncore_frequency, upi_data_transmit_bw, C2_Pkg_Residency, C3_Core_Residency, C3_Pkg_Residency, C6_Core_Residency, C6_Pkg_Residency, C7_Core_Residency, C7_Pkg_Residency, UNCORE_FREQ and tma_info_system_socket_clks would trigger an address sanitizer heap-buffer-overflows on a SkylakeX. ``` ==2567752==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x5020003ed098 at pc 0x5621a816654e bp 0x7fffb55d4da0 sp 0x7fffb55d4d98 READ of size 4 at 0x5020003eee78 thread T0 #0 0x558265d6654d in aggr_cpu_id__is_empty tools/perf/util/cpumap.c:694:12 #1 0x558265c914da in perf_stat__get_aggr tools/perf/builtin-stat.c:1490:6 #2 0x558265c914da in perf_stat__get_global_cached tools/perf/builtin-stat.c:1530:9 #3 0x558265e53290 in should_skip_zero_counter tools/perf/util/stat-display.c:947:31 #4 0x558265e53290 in print_counter_aggrdata tools/perf/util/stat-display.c:985:18 #5 0x558265e51931 in print_counter tools/perf/util/stat-display.c:1110:3 #6 0x558265e51931 in evlist__print_counters tools/perf/util/stat-display.c:1571:5 #7 0x558265c8ec87 in print_counters tools/perf/builtin-stat.c:981:2 #8 0x558265c8cc71 in cmd_stat tools/perf/builtin-stat.c:2837:3 #9 0x558265bb9bd4 in run_builtin tools/perf/perf.c:323:11 #10 0x558265bb98eb in handle_internal_command tools/perf/perf.c:377:8 #11 0x558265bb9389 in run_argv tools/perf/perf.c:421:2 #12 0x558265bb9389 in main tools/perf/perf.c:537:3 ``` The issue was the use of testing a cpumap with NULL rather than using empty, as a map containing the dummy value isn't NULL and the -1 results in an empty aggr map being allocated which legitimately overflows when any member is accessed. Fixes: 8a96f45 ("perf stat: Avoid SEGV if core.cpus isn't set") Signed-off-by: Ian Rogers <[email protected]> Acked-by: Namhyung Kim <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Ian Rogers <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Peter Zijlstra <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Dec 4, 2023
[ Upstream commit 7196398 ] KMSAN reported the following uninit-value access issue: ===================================================== BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] BUG: KMSAN: uninit-value in ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: __alloc_pages+0x75d/0xe80 mm/page_alloc.c:4591 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] __page_frag_cache_refill+0x9a/0x2c0 mm/page_alloc.c:4691 page_frag_alloc_align+0x91/0x5d0 mm/page_alloc.c:4722 page_frag_alloc include/linux/gfp.h:322 [inline] __netdev_alloc_skb+0x215/0x6d0 net/core/skbuff.c:728 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0x237/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b CPU: 0 PID: 12950 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd truenas#10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== ppp_sync_input() checks the first 2 bytes of the data are PPP_ALLSTATIONS and PPP_UI. However, if the data length is 1 and the first byte is PPP_ALLSTATIONS, an access to an uninitialized value occurs when checking PPP_UI. This patch resolves this issue by checking the data length. Fixes: 1da177e ("Linux-2.6.12-rc2") Signed-off-by: Shigeru Yoshida <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Dec 4, 2023
[ Upstream commit fb317eb ] KMSAN reported the following kernel-infoleak issue: ===================================================== BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 copy_to_iter include/linux/uio.h:197 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420 skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline] netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg net/socket.c:1066 [inline] __sys_recvfrom+0x476/0x860 net/socket.c:2246 __do_sys_recvfrom net/socket.c:2264 [inline] __se_sys_recvfrom net/socket.c:2260 [inline] __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline] tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170 tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324 genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545 genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Bytes 34-35 of 36 are uninitialized Memory access of size 36 starts at ffff88802d464a00 Data copied to user address 00007ff55033c0a0 CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd truenas#10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and the length of TLV value passed as an argument, and aligns the result to a multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes. If the size of struct tlv_desc plus the length of TLV value is not aligned, the current implementation leaves the remaining bytes uninitialized. This is the cause of the above kernel-infoleak issue. This patch resolves this issue by clearing data up to an aligned size. Fixes: d0796d1 ("tipc: convert legacy nl bearer dump to nl compat") Signed-off-by: Shigeru Yoshida <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jan 8, 2024
[ Upstream commit a2e36cd ] This allows it to break the following circular locking dependency. Aug 10 07:01:29 dg1test kernel: ====================================================== Aug 10 07:01:29 dg1test kernel: WARNING: possible circular locking dependency detected Aug 10 07:01:29 dg1test kernel: 6.4.0-rc7+ truenas#10 Not tainted Aug 10 07:01:29 dg1test kernel: ------------------------------------------------------ Aug 10 07:01:29 dg1test kernel: wireplumber/2236 is trying to acquire lock: Aug 10 07:01:29 dg1test kernel: ffff8fca5320da18 (&fctx->lock){-...}-{2:2}, at: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: but task is already holding lock: Aug 10 07:01:29 dg1test kernel: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: which lock already depends on the new lock. Aug 10 07:01:29 dg1test kernel: the existing dependency chain (in reverse order) is: Aug 10 07:01:29 dg1test kernel: -> truenas#3 (&event->list_lock#2){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: -> truenas#2 (&device->intr.lock){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_inth_allow+0x2c/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x181/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100 Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0 Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau] Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm] Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau] Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0 Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90 Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc Aug 10 07:01:29 dg1test kernel: -> truenas#1 (&event->refs_lock#4){....}-{2:2}: Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_state+0x37/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy_allow+0x63/0xd0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_uevent_mthd+0x4d/0x70 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_ioctl+0x10b/0x250 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_object_mthd+0xa8/0x1f0 [nouveau] Aug 10 07:01:29 dg1test kernel: nvif_event_allow+0x2a/0xa0 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_enable_signaling+0x78/0x80 [nouveau] Aug 10 07:01:29 dg1test kernel: __dma_fence_enable_signaling+0x5e/0x100 Aug 10 07:01:29 dg1test kernel: dma_fence_add_callback+0x4b/0xd0 Aug 10 07:01:29 dg1test kernel: nouveau_cli_work_queue+0xae/0x110 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_gem_object_close+0x1d1/0x2a0 [nouveau] Aug 10 07:01:29 dg1test kernel: drm_gem_handle_delete+0x70/0xe0 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl_kernel+0xa5/0x150 [drm] Aug 10 07:01:29 dg1test kernel: drm_ioctl+0x256/0x490 [drm] Aug 10 07:01:29 dg1test kernel: nouveau_drm_ioctl+0x5a/0xb0 [nouveau] Aug 10 07:01:29 dg1test kernel: __x64_sys_ioctl+0x91/0xd0 Aug 10 07:01:29 dg1test kernel: do_syscall_64+0x3c/0x90 Aug 10 07:01:29 dg1test kernel: entry_SYSCALL_64_after_hwframe+0x72/0xdc Aug 10 07:01:29 dg1test kernel: -> #0 (&fctx->lock){-...}-{2:2}: Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240 Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: other info that might help us debug this: Aug 10 07:01:29 dg1test kernel: Chain exists of: &fctx->lock --> &device->intr.lock --> &event->list_lock#2 Aug 10 07:01:29 dg1test kernel: Possible unsafe locking scenario: Aug 10 07:01:29 dg1test kernel: CPU0 CPU1 Aug 10 07:01:29 dg1test kernel: ---- ---- Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2); Aug 10 07:01:29 dg1test kernel: lock(&device->intr.lock); Aug 10 07:01:29 dg1test kernel: lock(&event->list_lock#2); Aug 10 07:01:29 dg1test kernel: lock(&fctx->lock); Aug 10 07:01:29 dg1test kernel: *** DEADLOCK *** Aug 10 07:01:29 dg1test kernel: 2 locks held by wireplumber/2236: Aug 10 07:01:29 dg1test kernel: #0: ffff8fca53177bf8 (&device->intr.lock){-...}-{2:2}, at: nvkm_intr+0x29/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: truenas#1: ffff8fca41208610 (&event->list_lock#2){-...}-{2:2}, at: nvkm_event_ntfy+0x50/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: stack backtrace: Aug 10 07:01:29 dg1test kernel: CPU: 6 PID: 2236 Comm: wireplumber Not tainted 6.4.0-rc7+ truenas#10 Aug 10 07:01:29 dg1test kernel: Hardware name: Gigabyte Technology Co., Ltd. Z390 I AORUS PRO WIFI/Z390 I AORUS PRO WIFI-CF, BIOS F8 11/05/2021 Aug 10 07:01:29 dg1test kernel: Call Trace: Aug 10 07:01:29 dg1test kernel: <TASK> Aug 10 07:01:29 dg1test kernel: dump_stack_lvl+0x5b/0x90 Aug 10 07:01:29 dg1test kernel: check_noncircular+0xe2/0x110 Aug 10 07:01:29 dg1test kernel: __lock_acquire+0x14e3/0x2240 Aug 10 07:01:29 dg1test kernel: lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: ? lock_acquire+0xc8/0x2a0 Aug 10 07:01:29 dg1test kernel: _raw_spin_lock_irqsave+0x4b/0x70 Aug 10 07:01:29 dg1test kernel: ? nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nouveau_fence_wait_uevent_handler+0x2b/0x100 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_client_event+0xf/0x20 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_event_ntfy+0x9b/0xf0 [nouveau] Aug 10 07:01:29 dg1test kernel: ga100_fifo_nonstall_intr+0x24/0x30 [nouveau] Aug 10 07:01:29 dg1test kernel: nvkm_intr+0x12c/0x240 [nouveau] Aug 10 07:01:29 dg1test kernel: __handle_irq_event_percpu+0x88/0x240 Aug 10 07:01:29 dg1test kernel: handle_irq_event+0x38/0x80 Aug 10 07:01:29 dg1test kernel: handle_edge_irq+0xa3/0x240 Aug 10 07:01:29 dg1test kernel: __common_interrupt+0x72/0x160 Aug 10 07:01:29 dg1test kernel: common_interrupt+0x60/0xe0 Aug 10 07:01:29 dg1test kernel: asm_common_interrupt+0x26/0x40 Aug 10 07:01:29 dg1test kernel: RIP: 0033:0x7fb66174d700 Aug 10 07:01:29 dg1test kernel: Code: c1 e2 05 29 ca 8d 0c 10 0f be 07 84 c0 75 eb 89 c8 c3 0f 1f 84 00 00 00 00 00 f3 0f 1e fa e9 d7 0f fc ff 0f 1f 80 00 00 00 00 <f3> 0f 1e fa e9 c7 0f fc> Aug 10 07:01:29 dg1test kernel: RSP: 002b:00007ffdd3c48438 EFLAGS: 00000206 Aug 10 07:01:29 dg1test kernel: RAX: 000055bb758763c0 RBX: 000055bb758752c0 RCX: 00000000000028b0 Aug 10 07:01:29 dg1test kernel: RDX: 000055bb758752c0 RSI: 000055bb75887490 RDI: 000055bb75862950 Aug 10 07:01:29 dg1test kernel: RBP: 00007ffdd3c48490 R08: 000055bb75873b10 R09: 0000000000000001 Aug 10 07:01:29 dg1test kernel: R10: 0000000000000004 R11: 000055bb7587f000 R12: 000055bb75887490 Aug 10 07:01:29 dg1test kernel: R13: 000055bb757f6280 R14: 000055bb758875c0 R15: 000055bb757f6280 Aug 10 07:01:29 dg1test kernel: </TASK> Signed-off-by: Dave Airlie <[email protected]> Tested-by: Danilo Krummrich <[email protected]> Reviewed-by: Danilo Krummrich <[email protected]> Signed-off-by: Danilo Krummrich <[email protected]> Link: https://patchwork.freedesktop.org/patch/msgid/[email protected] Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jan 8, 2024
[ Upstream commit 0550d46 ] KMSAN reported the following uninit-value access issue: lo speed is unknown, defaulting to 1000 ===================================================== BUG: KMSAN: uninit-value in ib_get_width_and_speed drivers/infiniband/core/verbs.c:1889 [inline] BUG: KMSAN: uninit-value in ib_get_eth_speed+0x546/0xaf0 drivers/infiniband/core/verbs.c:1998 ib_get_width_and_speed drivers/infiniband/core/verbs.c:1889 [inline] ib_get_eth_speed+0x546/0xaf0 drivers/infiniband/core/verbs.c:1998 siw_query_port drivers/infiniband/sw/siw/siw_verbs.c:173 [inline] siw_get_port_immutable+0x6f/0x120 drivers/infiniband/sw/siw/siw_verbs.c:203 setup_port_data drivers/infiniband/core/device.c:848 [inline] setup_device drivers/infiniband/core/device.c:1244 [inline] ib_register_device+0x1589/0x1df0 drivers/infiniband/core/device.c:1383 siw_device_register drivers/infiniband/sw/siw/siw_main.c:72 [inline] siw_newlink+0x129e/0x13d0 drivers/infiniband/sw/siw/siw_main.c:490 nldev_newlink+0x8fd/0xa60 drivers/infiniband/core/nldev.c:1763 rdma_nl_rcv_skb drivers/infiniband/core/netlink.c:239 [inline] rdma_nl_rcv+0xe8a/0x1120 drivers/infiniband/core/netlink.c:259 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Local variable lksettings created at: ib_get_eth_speed+0x4b/0xaf0 drivers/infiniband/core/verbs.c:1974 siw_query_port drivers/infiniband/sw/siw/siw_verbs.c:173 [inline] siw_get_port_immutable+0x6f/0x120 drivers/infiniband/sw/siw/siw_verbs.c:203 CPU: 0 PID: 11257 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd truenas#10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== If __ethtool_get_link_ksettings() fails, `netdev_speed` is set to the default value, SPEED_1000. In this case, if `lanes` field of struct ethtool_link_ksettings is not initialized, an uninitialized value is passed to ib_get_width_and_speed(). This causes the above issue. This patch resolves the issue by initializing `lanes` to 0. Fixes: cb06b6b ("RDMA/core: Get IB width and speed from netdev") Signed-off-by: Shigeru Yoshida <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Leon Romanovsky <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jan 8, 2024
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 truenas#1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 truenas#2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f truenas#3 [fffffe0000549eb8] do_nmi at ffffffff81079582 truenas#4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- truenas#5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b truenas#6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 truenas#7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 truenas#8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] truenas#9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] truenas#10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] truenas#11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] truenas#12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb truenas#13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 truenas#14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 truenas#15 [ffff88aa841efb88] device_del at ffffffff82179d23 truenas#16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] truenas#17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] truenas#18 [ffff88aa841efde8] process_one_work at ffffffff811c589a truenas#19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff truenas#20 [ffff88aa841eff10] kthread at ffffffff811d87a0 truenas#21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jan 23, 2024
[ Upstream commit 7196398 ] KMSAN reported the following uninit-value access issue: ===================================================== BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] BUG: KMSAN: uninit-value in ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: __alloc_pages+0x75d/0xe80 mm/page_alloc.c:4591 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] __page_frag_cache_refill+0x9a/0x2c0 mm/page_alloc.c:4691 page_frag_alloc_align+0x91/0x5d0 mm/page_alloc.c:4722 page_frag_alloc include/linux/gfp.h:322 [inline] __netdev_alloc_skb+0x215/0x6d0 net/core/skbuff.c:728 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0x237/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b CPU: 0 PID: 12950 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== ppp_sync_input() checks the first 2 bytes of the data are PPP_ALLSTATIONS and PPP_UI. However, if the data length is 1 and the first byte is PPP_ALLSTATIONS, an access to an uninitialized value occurs when checking PPP_UI. This patch resolves this issue by checking the data length. Fixes: 1da177e ("Linux-2.6.12-rc2") Signed-off-by: Shigeru Yoshida <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jan 23, 2024
[ Upstream commit fb317eb ] KMSAN reported the following kernel-infoleak issue: ===================================================== BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 copy_to_iter include/linux/uio.h:197 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420 skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline] netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg net/socket.c:1066 [inline] __sys_recvfrom+0x476/0x860 net/socket.c:2246 __do_sys_recvfrom net/socket.c:2264 [inline] __se_sys_recvfrom net/socket.c:2260 [inline] __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Uninit was created at: slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline] tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170 tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324 genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545 genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b Bytes 34-35 of 36 are uninitialized Memory access of size 36 starts at ffff88802d464a00 Data copied to user address 00007ff55033c0a0 CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 ===================================================== tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and the length of TLV value passed as an argument, and aligns the result to a multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes. If the size of struct tlv_desc plus the length of TLV value is not aligned, the current implementation leaves the remaining bytes uninitialized. This is the cause of the above kernel-infoleak issue. This patch resolves this issue by clearing data up to an aligned size. Fixes: d0796d1 ("tipc: convert legacy nl bearer dump to nl compat") Signed-off-by: Shigeru Yoshida <[email protected]> Reviewed-by: Simon Horman <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jan 23, 2024
[ Upstream commit e3e82fc ] When creating ceq_0 during probing irdma, cqp.sc_cqp will be sent as a cqp_request to cqp->sc_cqp.sq_ring. If the request is pending when removing the irdma driver or unplugging its aux device, cqp.sc_cqp will be dereferenced as wrong struct in irdma_free_pending_cqp_request(). PID: 3669 TASK: ffff88aef892c000 CPU: 28 COMMAND: "kworker/28:0" #0 [fffffe0000549e38] crash_nmi_callback at ffffffff810e3a34 #1 [fffffe0000549e40] nmi_handle at ffffffff810788b2 #2 [fffffe0000549ea0] default_do_nmi at ffffffff8107938f #3 [fffffe0000549eb8] do_nmi at ffffffff81079582 #4 [fffffe0000549ef0] end_repeat_nmi at ffffffff82e016b4 [exception RIP: native_queued_spin_lock_slowpath+1291] RIP: ffffffff8127e72b RSP: ffff88aa841ef778 RFLAGS: 00000046 RAX: 0000000000000000 RBX: ffff88b01f849700 RCX: ffffffff8127e47e RDX: 0000000000000000 RSI: 0000000000000004 RDI: ffffffff83857ec0 RBP: ffff88afe3e4efc8 R8: ffffed15fc7c9dfa R9: ffffed15fc7c9dfa R10: 0000000000000001 R11: ffffed15fc7c9df9 R12: 0000000000740000 R13: ffff88b01f849708 R14: 0000000000000003 R15: ffffed1603f092e1 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0000 -- <NMI exception stack> -- #5 [ffff88aa841ef778] native_queued_spin_lock_slowpath at ffffffff8127e72b #6 [ffff88aa841ef7b0] _raw_spin_lock_irqsave at ffffffff82c22aa4 #7 [ffff88aa841ef7c8] __wake_up_common_lock at ffffffff81257363 #8 [ffff88aa841ef888] irdma_free_pending_cqp_request at ffffffffa0ba12cc [irdma] #9 [ffff88aa841ef958] irdma_cleanup_pending_cqp_op at ffffffffa0ba1469 [irdma] #10 [ffff88aa841ef9c0] irdma_ctrl_deinit_hw at ffffffffa0b2989f [irdma] #11 [ffff88aa841efa28] irdma_remove at ffffffffa0b252df [irdma] #12 [ffff88aa841efae8] auxiliary_bus_remove at ffffffff8219afdb #13 [ffff88aa841efb00] device_release_driver_internal at ffffffff821882e6 #14 [ffff88aa841efb38] bus_remove_device at ffffffff82184278 #15 [ffff88aa841efb88] device_del at ffffffff82179d23 #16 [ffff88aa841efc48] ice_unplug_aux_dev at ffffffffa0eb1c14 [ice] #17 [ffff88aa841efc68] ice_service_task at ffffffffa0d88201 [ice] #18 [ffff88aa841efde8] process_one_work at ffffffff811c589a #19 [ffff88aa841efe60] worker_thread at ffffffff811c71ff #20 [ffff88aa841eff10] kthread at ffffffff811d87a0 #21 [ffff88aa841eff50] ret_from_fork at ffffffff82e0022f Fixes: 44d9e52 ("RDMA/irdma: Implement device initialization definitions") Link: https://lore.kernel.org/r/[email protected] Suggested-by: "Ismail, Mustafa" <[email protected]> Signed-off-by: Shifeng Li <[email protected]> Reviewed-by: Shiraz Saleem <[email protected]> Signed-off-by: Jason Gunthorpe <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Mar 29, 2024
[ Upstream commit c957280 ] From commit a304e1b ("[PATCH] Debug shared irqs"), there is a test to make sure the shared irq handler should be able to handle the unexpected event after deregistration. For this case, let's apply MT76_REMOVED flag to indicate the device was removed and do not run into the resource access anymore. BUG: KASAN: use-after-free in mt7921_irq_handler+0xd8/0x100 [mt7921e] Read of size 8 at addr ffff88824a7d3b78 by task rmmod/11115 CPU: 28 PID: 11115 Comm: rmmod Tainted: G W L 5.17.0 truenas#10 Hardware name: Micro-Star International Co., Ltd. MS-7D73/MPG B650I EDGE WIFI (MS-7D73), BIOS 1.81 01/05/2024 Call Trace: <TASK> dump_stack_lvl+0x6f/0xa0 print_address_description.constprop.0+0x1f/0x190 ? mt7921_irq_handler+0xd8/0x100 [mt7921e] ? mt7921_irq_handler+0xd8/0x100 [mt7921e] kasan_report.cold+0x7f/0x11b ? mt7921_irq_handler+0xd8/0x100 [mt7921e] mt7921_irq_handler+0xd8/0x100 [mt7921e] free_irq+0x627/0xaa0 devm_free_irq+0x94/0xd0 ? devm_request_any_context_irq+0x160/0x160 ? kobject_put+0x18d/0x4a0 mt7921_pci_remove+0x153/0x190 [mt7921e] pci_device_remove+0xa2/0x1d0 __device_release_driver+0x346/0x6e0 driver_detach+0x1ef/0x2c0 bus_remove_driver+0xe7/0x2d0 ? __check_object_size+0x57/0x310 pci_unregister_driver+0x26/0x250 __do_sys_delete_module+0x307/0x510 ? free_module+0x6a0/0x6a0 ? fpregs_assert_state_consistent+0x4b/0xb0 ? rcu_read_lock_sched_held+0x10/0x70 ? syscall_enter_from_user_mode+0x20/0x70 ? trace_hardirqs_on+0x1c/0x130 do_syscall_64+0x5c/0x80 ? trace_hardirqs_on_prepare+0x72/0x160 ? do_syscall_64+0x68/0x80 ? trace_hardirqs_on_prepare+0x72/0x160 entry_SYSCALL_64_after_hwframe+0x44/0xae Reported-by: Mikhail Gavrilov <[email protected]> Closes: https://lore.kernel.org/linux-wireless/CABXGCsOdvVwdLmSsC8TZ1jF0UOg_F_W3wqLECWX620PUkvNk=A@mail.gmail.com/ Fixes: 9270270 ("wifi: mt76: mt7921: fix PCI DMA hang after reboot") Tested-by: Mikhail Gavrilov <[email protected]> Signed-off-by: Deren Wu <[email protected]> Signed-off-by: Felix Fietkau <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Mar 29, 2024
[ Upstream commit fd5860a ] The loop inside nfs_netfs_issue_read() currently does not disable interrupts while iterating through pages in the xarray to submit for NFS read. This is not safe though since after taking xa_lock, another page in the mapping could be processed for writeback inside an interrupt, and deadlock can occur. The fix is simple and clean if we use xa_for_each_range(), which handles the iteration with RCU while reducing code complexity. The problem is easily reproduced with the following test: mount -o vers=3,fsc 127.0.0.1:/export /mnt/nfs dd if=/dev/zero of=/mnt/nfs/file1.bin bs=4096 count=1 echo 3 > /proc/sys/vm/drop_caches dd if=/mnt/nfs/file1.bin of=/dev/null umount /mnt/nfs On the console with a lockdep-enabled kernel a message similar to the following will be seen: ================================ WARNING: inconsistent lock state 6.7.0-lockdbg+ truenas#10 Not tainted -------------------------------- inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage. test5/1708 [HC0[0]:SC0[0]:HE1:SE1] takes: ffff888127baa598 (&xa->xa_lock#4){+.?.}-{3:3}, at: nfs_netfs_issue_read+0x1b2/0x4b0 [nfs] {IN-SOFTIRQ-W} state was registered at: lock_acquire+0x144/0x380 _raw_spin_lock_irqsave+0x4e/0xa0 __folio_end_writeback+0x17e/0x5c0 folio_end_writeback+0x93/0x1b0 iomap_finish_ioend+0xeb/0x6a0 blk_update_request+0x204/0x7f0 blk_mq_end_request+0x30/0x1c0 blk_complete_reqs+0x7e/0xa0 __do_softirq+0x113/0x544 __irq_exit_rcu+0xfe/0x120 irq_exit_rcu+0xe/0x20 sysvec_call_function_single+0x6f/0x90 asm_sysvec_call_function_single+0x1a/0x20 pv_native_safe_halt+0xf/0x20 default_idle+0x9/0x20 default_idle_call+0x67/0xa0 do_idle+0x2b5/0x300 cpu_startup_entry+0x34/0x40 start_secondary+0x19d/0x1c0 secondary_startup_64_no_verify+0x18f/0x19b irq event stamp: 176891 hardirqs last enabled at (176891): [<ffffffffa67a0be4>] _raw_spin_unlock_irqrestore+0x44/0x60 hardirqs last disabled at (176890): [<ffffffffa67a0899>] _raw_spin_lock_irqsave+0x79/0xa0 softirqs last enabled at (176646): [<ffffffffa515d91e>] __irq_exit_rcu+0xfe/0x120 softirqs last disabled at (176633): [<ffffffffa515d91e>] __irq_exit_rcu+0xfe/0x120 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(&xa->xa_lock#4); <Interrupt> lock(&xa->xa_lock#4); *** DEADLOCK *** 2 locks held by test5/1708: #0: ffff888127baa498 (&sb->s_type->i_mutex_key#22){++++}-{4:4}, at: nfs_start_io_read+0x28/0x90 [nfs] truenas#1: ffff888127baa650 (mapping.invalidate_lock#3){.+.+}-{4:4}, at: page_cache_ra_unbounded+0xa4/0x280 stack backtrace: CPU: 6 PID: 1708 Comm: test5 Kdump: loaded Not tainted 6.7.0-lockdbg+ Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014 Call Trace: dump_stack_lvl+0x5b/0x90 mark_lock+0xb3f/0xd20 __lock_acquire+0x77b/0x3360 _raw_spin_lock+0x34/0x80 nfs_netfs_issue_read+0x1b2/0x4b0 [nfs] netfs_begin_read+0x77f/0x980 [netfs] nfs_netfs_readahead+0x45/0x60 [nfs] nfs_readahead+0x323/0x5a0 [nfs] read_pages+0xf3/0x5c0 page_cache_ra_unbounded+0x1c8/0x280 filemap_get_pages+0x38c/0xae0 filemap_read+0x206/0x5e0 nfs_file_read+0xb7/0x140 [nfs] vfs_read+0x2a9/0x460 ksys_read+0xb7/0x140 Fixes: 000dbe0 ("NFS: Convert buffered read paths to use netfs when fscache is enabled") Suggested-by: Jeff Layton <[email protected]> Signed-off-by: Dave Wysochanski <[email protected]> Reviewed-by: Jeff Layton <[email protected]> Reviewed-by: David Howells <[email protected]> Signed-off-by: Trond Myklebust <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Apr 29, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 truenas#1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 truenas#2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e truenas#3 [fffffe00003fced0] do_nmi at ffffffff8922660d truenas#4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 truenas#5 [ffffa655314979e8] io_serial_in at ffffffff89792594 truenas#6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 truenas#7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 truenas#8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 truenas#9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 truenas#10 [ffffa65531497ac8] console_unlock at ffffffff89316124 truenas#11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 truenas#12 [ffffa65531497b68] printk at ffffffff89318306 truenas#13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 truenas#14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] truenas#15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] truenas#16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] truenas#17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] truenas#18 [ffffa65531497f10] kthread at ffffffff892d2e72 truenas#19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
May 1, 2024
[ Upstream commit f8bbc07 ] vhost_worker will call tun call backs to receive packets. If too many illegal packets arrives, tun_do_read will keep dumping packet contents. When console is enabled, it will costs much more cpu time to dump packet and soft lockup will be detected. net_ratelimit mechanism can be used to limit the dumping rate. PID: 33036 TASK: ffff949da6f20000 CPU: 23 COMMAND: "vhost-32980" #0 [fffffe00003fce50] crash_nmi_callback at ffffffff89249253 #1 [fffffe00003fce58] nmi_handle at ffffffff89225fa3 #2 [fffffe00003fceb0] default_do_nmi at ffffffff8922642e #3 [fffffe00003fced0] do_nmi at ffffffff8922660d #4 [fffffe00003fcef0] end_repeat_nmi at ffffffff89c01663 [exception RIP: io_serial_in+20] RIP: ffffffff89792594 RSP: ffffa655314979e8 RFLAGS: 00000002 RAX: ffffffff89792500 RBX: ffffffff8af428a0 RCX: 0000000000000000 RDX: 00000000000003fd RSI: 0000000000000005 RDI: ffffffff8af428a0 RBP: 0000000000002710 R8: 0000000000000004 R9: 000000000000000f R10: 0000000000000000 R11: ffffffff8acbf64f R12: 0000000000000020 R13: ffffffff8acbf698 R14: 0000000000000058 R15: 0000000000000000 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 #5 [ffffa655314979e8] io_serial_in at ffffffff89792594 #6 [ffffa655314979e8] wait_for_xmitr at ffffffff89793470 #7 [ffffa65531497a08] serial8250_console_putchar at ffffffff897934f6 #8 [ffffa65531497a20] uart_console_write at ffffffff8978b605 #9 [ffffa65531497a48] serial8250_console_write at ffffffff89796558 #10 [ffffa65531497ac8] console_unlock at ffffffff89316124 #11 [ffffa65531497b10] vprintk_emit at ffffffff89317c07 #12 [ffffa65531497b68] printk at ffffffff89318306 #13 [ffffa65531497bc8] print_hex_dump at ffffffff89650765 #14 [ffffa65531497ca8] tun_do_read at ffffffffc0b06c27 [tun] #15 [ffffa65531497d38] tun_recvmsg at ffffffffc0b06e34 [tun] #16 [ffffa65531497d68] handle_rx at ffffffffc0c5d682 [vhost_net] #17 [ffffa65531497ed0] vhost_worker at ffffffffc0c644dc [vhost] #18 [ffffa65531497f10] kthread at ffffffff892d2e72 #19 [ffffa65531497f50] ret_from_fork at ffffffff89c0022f Fixes: ef3db4a ("tun: avoid BUG, dump packet on GSO errors") Signed-off-by: Lei Chen <[email protected]> Reviewed-by: Willem de Bruijn <[email protected]> Acked-by: Jason Wang <[email protected]> Reviewed-by: Eric Dumazet <[email protected]> Acked-by: Michael S. Tsirkin <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Jun 13, 2024
[ Upstream commit 769e6a1 ] ui_browser__show() is capturing the input title that is stack allocated memory in hist_browser__run(). Avoid a use after return by strdup-ing the string. Committer notes: Further explanation from Ian Rogers: My command line using tui is: $ sudo bash -c 'rm /tmp/asan.log*; export ASAN_OPTIONS="log_path=/tmp/asan.log"; /tmp/perf/perf mem record -a sleep 1; /tmp/perf/perf mem report' I then go to the perf annotate view and quit. This triggers the asan error (from the log file): ``` ==1254591==ERROR: AddressSanitizer: stack-use-after-return on address 0x7f2813331920 at pc 0x7f28180 65991 bp 0x7fff0a21c750 sp 0x7fff0a21bf10 READ of size 80 at 0x7f2813331920 thread T0 #0 0x7f2818065990 in __interceptor_strlen ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:461 truenas#1 0x7f2817698251 in SLsmg_write_wrapped_string (/lib/x86_64-linux-gnu/libslang.so.2+0x98251) truenas#2 0x7f28176984b9 in SLsmg_write_nstring (/lib/x86_64-linux-gnu/libslang.so.2+0x984b9) truenas#3 0x55c94045b365 in ui_browser__write_nstring ui/browser.c:60 truenas#4 0x55c94045c558 in __ui_browser__show_title ui/browser.c:266 truenas#5 0x55c94045c776 in ui_browser__show ui/browser.c:288 truenas#6 0x55c94045c06d in ui_browser__handle_resize ui/browser.c:206 truenas#7 0x55c94047979b in do_annotate ui/browsers/hists.c:2458 truenas#8 0x55c94047fb17 in evsel__hists_browse ui/browsers/hists.c:3412 truenas#9 0x55c940480a0c in perf_evsel_menu__run ui/browsers/hists.c:3527 truenas#10 0x55c940481108 in __evlist__tui_browse_hists ui/browsers/hists.c:3613 truenas#11 0x55c9404813f7 in evlist__tui_browse_hists ui/browsers/hists.c:3661 truenas#12 0x55c93ffa253f in report__browse_hists tools/perf/builtin-report.c:671 truenas#13 0x55c93ffa58ca in __cmd_report tools/perf/builtin-report.c:1141 truenas#14 0x55c93ffaf159 in cmd_report tools/perf/builtin-report.c:1805 truenas#15 0x55c94000c05c in report_events tools/perf/builtin-mem.c:374 truenas#16 0x55c94000d96d in cmd_mem tools/perf/builtin-mem.c:516 truenas#17 0x55c9400e44ee in run_builtin tools/perf/perf.c:350 truenas#18 0x55c9400e4a5a in handle_internal_command tools/perf/perf.c:403 truenas#19 0x55c9400e4e22 in run_argv tools/perf/perf.c:447 truenas#20 0x55c9400e53ad in main tools/perf/perf.c:561 truenas#21 0x7f28170456c9 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 truenas#22 0x7f2817045784 in __libc_start_main_impl ../csu/libc-start.c:360 truenas#23 0x55c93ff544c0 in _start (/tmp/perf/perf+0x19a4c0) (BuildId: 84899b0e8c7d3a3eaa67b2eb35e3d8b2f8cd4c93) Address 0x7f2813331920 is located in stack of thread T0 at offset 32 in frame #0 0x55c94046e85e in hist_browser__run ui/browsers/hists.c:746 This frame has 1 object(s): [32, 192) 'title' (line 747) <== Memory access at offset 32 is inside this variable HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork ``` hist_browser__run isn't on the stack so the asan error looks legit. There's no clean init/exit on struct ui_browser so I may be trading a use-after-return for a memory leak, but that seems look a good trade anyway. Fixes: 05e8b08 ("perf ui browser: Stop using 'self'") Signed-off-by: Ian Rogers <[email protected]> Cc: Adrian Hunter <[email protected]> Cc: Alexander Shishkin <[email protected]> Cc: Andi Kleen <[email protected]> Cc: Athira Rajeev <[email protected]> Cc: Ben Gainey <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: James Clark <[email protected]> Cc: Jiri Olsa <[email protected]> Cc: Kajol Jain <[email protected]> Cc: Kan Liang <[email protected]> Cc: K Prateek Nayak <[email protected]> Cc: Li Dong <[email protected]> Cc: Mark Rutland <[email protected]> Cc: Namhyung Kim <[email protected]> Cc: Oliver Upton <[email protected]> Cc: Paran Lee <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Ravi Bangoria <[email protected]> Cc: Sun Haiyong <[email protected]> Cc: Tim Chen <[email protected]> Cc: Yanteng Si <[email protected]> Cc: Yicong Yang <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Arnaldo Carvalho de Melo <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 16, 2024
commit 9d274c1 upstream. We have been seeing crashes on duplicate keys in btrfs_set_item_key_safe(): BTRFS critical (device vdb): slot 4 key (450 108 8192) new key (450 108 8192) ------------[ cut here ]------------ kernel BUG at fs/btrfs/ctree.c:2620! invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 0 PID: 3139 Comm: xfs_io Kdump: loaded Not tainted 6.9.0 #6 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.3-2.fc40 04/01/2014 RIP: 0010:btrfs_set_item_key_safe+0x11f/0x290 [btrfs] With the following stack trace: #0 btrfs_set_item_key_safe (fs/btrfs/ctree.c:2620:4) #1 btrfs_drop_extents (fs/btrfs/file.c:411:4) #2 log_one_extent (fs/btrfs/tree-log.c:4732:9) #3 btrfs_log_changed_extents (fs/btrfs/tree-log.c:4955:9) #4 btrfs_log_inode (fs/btrfs/tree-log.c:6626:9) #5 btrfs_log_inode_parent (fs/btrfs/tree-log.c:7070:8) #6 btrfs_log_dentry_safe (fs/btrfs/tree-log.c:7171:8) #7 btrfs_sync_file (fs/btrfs/file.c:1933:8) #8 vfs_fsync_range (fs/sync.c:188:9) #9 vfs_fsync (fs/sync.c:202:9) #10 do_fsync (fs/sync.c:212:9) #11 __do_sys_fdatasync (fs/sync.c:225:9) #12 __se_sys_fdatasync (fs/sync.c:223:1) #13 __x64_sys_fdatasync (fs/sync.c:223:1) #14 do_syscall_x64 (arch/x86/entry/common.c:52:14) #15 do_syscall_64 (arch/x86/entry/common.c:83:7) #16 entry_SYSCALL_64+0xaf/0x14c (arch/x86/entry/entry_64.S:121) So we're logging a changed extent from fsync, which is splitting an extent in the log tree. But this split part already exists in the tree, triggering the BUG(). This is the state of the log tree at the time of the crash, dumped with drgn (https://github.com/osandov/drgn/blob/main/contrib/btrfs_tree.py) to get more details than btrfs_print_leaf() gives us: >>> print_extent_buffer(prog.crashed_thread().stack_trace()[0]["eb"]) leaf 33439744 level 0 items 72 generation 9 owner 18446744073709551610 leaf 33439744 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da item 0 key (450 INODE_ITEM 0) itemoff 16123 itemsize 160 generation 7 transid 9 size 8192 nbytes 8473563889606862198 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 204 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417704.983333333 (2024-05-22 15:41:44) mtime 1716417704.983333333 (2024-05-22 15:41:44) otime 17592186044416.000000000 (559444-03-08 01:40:16) item 1 key (450 INODE_REF 256) itemoff 16110 itemsize 13 index 195 namelen 3 name: 193 item 2 key (450 XATTR_ITEM 1640047104) itemoff 16073 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 3 key (450 EXTENT_DATA 0) itemoff 16020 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 4096 ram 12288 extent compression 0 (none) item 4 key (450 EXTENT_DATA 4096) itemoff 15967 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 4096 nr 8192 item 5 key (450 EXTENT_DATA 8192) itemoff 15914 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 ... So the real problem happened earlier: notice that items 4 (4k-12k) and 5 (8k-12k) overlap. Both are prealloc extents. Item 4 straddles i_size and item 5 starts at i_size. Here is the state of the filesystem tree at the time of the crash: >>> root = prog.crashed_thread().stack_trace()[2]["inode"].root >>> ret, nodes, slots = btrfs_search_slot(root, BtrfsKey(450, 0, 0)) >>> print_extent_buffer(nodes[0]) leaf 30425088 level 0 items 184 generation 9 owner 5 leaf 30425088 flags 0x100000000000000 fs uuid e5bd3946-400c-4223-8923-190ef1f18677 chunk uuid d58cb17e-6d02-494a-829a-18b7d8a399da ... item 179 key (450 INODE_ITEM 0) itemoff 4907 itemsize 160 generation 7 transid 7 size 4096 nbytes 12288 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0 sequence 6 flags 0x10(PREALLOC) atime 1716417703.220000000 (2024-05-22 15:41:43) ctime 1716417703.220000000 (2024-05-22 15:41:43) mtime 1716417703.220000000 (2024-05-22 15:41:43) otime 1716417703.220000000 (2024-05-22 15:41:43) item 180 key (450 INODE_REF 256) itemoff 4894 itemsize 13 index 195 namelen 3 name: 193 item 181 key (450 XATTR_ITEM 1640047104) itemoff 4857 itemsize 37 location key (0 UNKNOWN.0 0) type XATTR transid 7 data_len 1 name_len 6 name: user.a data a item 182 key (450 EXTENT_DATA 0) itemoff 4804 itemsize 53 generation 9 type 1 (regular) extent data disk byte 303144960 nr 12288 extent data offset 0 nr 8192 ram 12288 extent compression 0 (none) item 183 key (450 EXTENT_DATA 8192) itemoff 4751 itemsize 53 generation 9 type 2 (prealloc) prealloc data disk byte 303144960 nr 12288 prealloc data offset 8192 nr 4096 Item 5 in the log tree corresponds to item 183 in the filesystem tree, but nothing matches item 4. Furthermore, item 183 is the last item in the leaf. btrfs_log_prealloc_extents() is responsible for logging prealloc extents beyond i_size. It first truncates any previously logged prealloc extents that start beyond i_size. Then, it walks the filesystem tree and copies the prealloc extent items to the log tree. If it hits the end of a leaf, then it calls btrfs_next_leaf(), which unlocks the tree and does another search. However, while the filesystem tree is unlocked, an ordered extent completion may modify the tree. In particular, it may insert an extent item that overlaps with an extent item that was already copied to the log tree. This may manifest in several ways depending on the exact scenario, including an EEXIST error that is silently translated to a full sync, overlapping items in the log tree, or this crash. This particular crash is triggered by the following sequence of events: - Initially, the file has i_size=4k, a regular extent from 0-4k, and a prealloc extent beyond i_size from 4k-12k. The prealloc extent item is the last item in its B-tree leaf. - The file is fsync'd, which copies its inode item and both extent items to the log tree. - An xattr is set on the file, which sets the BTRFS_INODE_COPY_EVERYTHING flag. - The range 4k-8k in the file is written using direct I/O. i_size is extended to 8k, but the ordered extent is still in flight. - The file is fsync'd. Since BTRFS_INODE_COPY_EVERYTHING is set, this calls copy_inode_items_to_log(), which calls btrfs_log_prealloc_extents(). - btrfs_log_prealloc_extents() finds the 4k-12k prealloc extent in the filesystem tree. Since it starts before i_size, it skips it. Since it is the last item in its B-tree leaf, it calls btrfs_next_leaf(). - btrfs_next_leaf() unlocks the path. - The ordered extent completion runs, which converts the 4k-8k part of the prealloc extent to written and inserts the remaining prealloc part from 8k-12k. - btrfs_next_leaf() does a search and finds the new prealloc extent 8k-12k. - btrfs_log_prealloc_extents() copies the 8k-12k prealloc extent into the log tree. Note that it overlaps with the 4k-12k prealloc extent that was copied to the log tree by the first fsync. - fsync calls btrfs_log_changed_extents(), which tries to log the 4k-8k extent that was written. - This tries to drop the range 4k-8k in the log tree, which requires adjusting the start of the 4k-12k prealloc extent in the log tree to 8k. - btrfs_set_item_key_safe() sees that there is already an extent starting at 8k in the log tree and calls BUG(). Fix this by detecting when we're about to insert an overlapping file extent item in the log tree and truncating the part that would overlap. CC: [email protected] # 6.1+ Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Omar Sandoval <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Jul 16, 2024
commit be346c1 upstream. The code in ocfs2_dio_end_io_write() estimates number of necessary transaction credits using ocfs2_calc_extend_credits(). This however does not take into account that the IO could be arbitrarily large and can contain arbitrary number of extents. Extent tree manipulations do often extend the current transaction but not in all of the cases. For example if we have only single block extents in the tree, ocfs2_mark_extent_written() will end up calling ocfs2_replace_extent_rec() all the time and we will never extend the current transaction and eventually exhaust all the transaction credits if the IO contains many single block extents. Once that happens a WARN_ON(jbd2_handle_buffer_credits(handle) <= 0) is triggered in jbd2_journal_dirty_metadata() and subsequently OCFS2 aborts in response to this error. This was actually triggered by one of our customers on a heavily fragmented OCFS2 filesystem. To fix the issue make sure the transaction always has enough credits for one extent insert before each call of ocfs2_mark_extent_written(). Heming Zhao said: ------ PANIC: "Kernel panic - not syncing: OCFS2: (device dm-1): panic forced after error" PID: xxx TASK: xxxx CPU: 5 COMMAND: "SubmitThread-CA" #0 machine_kexec at ffffffff8c069932 #1 __crash_kexec at ffffffff8c1338fa #2 panic at ffffffff8c1d69b9 #3 ocfs2_handle_error at ffffffffc0c86c0c [ocfs2] #4 __ocfs2_abort at ffffffffc0c88387 [ocfs2] #5 ocfs2_journal_dirty at ffffffffc0c51e98 [ocfs2] #6 ocfs2_split_extent at ffffffffc0c27ea3 [ocfs2] #7 ocfs2_change_extent_flag at ffffffffc0c28053 [ocfs2] #8 ocfs2_mark_extent_written at ffffffffc0c28347 [ocfs2] #9 ocfs2_dio_end_io_write at ffffffffc0c2bef9 [ocfs2] #10 ocfs2_dio_end_io at ffffffffc0c2c0f5 [ocfs2] #11 dio_complete at ffffffff8c2b9fa7 #12 do_blockdev_direct_IO at ffffffff8c2bc09f #13 ocfs2_direct_IO at ffffffffc0c2b653 [ocfs2] #14 generic_file_direct_write at ffffffff8c1dcf14 #15 __generic_file_write_iter at ffffffff8c1dd07b #16 ocfs2_file_write_iter at ffffffffc0c49f1f [ocfs2] #17 aio_write at ffffffff8c2cc72e #18 kmem_cache_alloc at ffffffff8c248dde #19 do_io_submit at ffffffff8c2ccada #20 do_syscall_64 at ffffffff8c004984 #21 entry_SYSCALL_64_after_hwframe at ffffffff8c8000ba Link: https://lkml.kernel.org/r/[email protected] Link: https://lkml.kernel.org/r/[email protected] Fixes: c15471f ("ocfs2: fix sparse file & data ordering issue in direct io") Signed-off-by: Jan Kara <[email protected]> Reviewed-by: Joseph Qi <[email protected]> Reviewed-by: Heming Zhao <[email protected]> Cc: Mark Fasheh <[email protected]> Cc: Joel Becker <[email protected]> Cc: Junxiao Bi <[email protected]> Cc: Changwei Ge <[email protected]> Cc: Gang He <[email protected]> Cc: Jun Piao <[email protected]> Cc: <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Sep 9, 2024
[ Upstream commit a699781 ] A sysfs reader can race with a device reset or removal, attempting to read device state when the device is not actually present. eg: [exception RIP: qed_get_current_link+17] #8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede] #9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3 #10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4 #11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300 #12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c #13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b #14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3 #15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1 #16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f #17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb crash> struct net_device.state ffff9a9d21336000 state = 5, state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100). The device is not present, note lack of __LINK_STATE_PRESENT (0b10). This is the same sort of panic as observed in commit 4224cfd ("net-sysfs: add check for netdevice being present to speed_show"). There are many other callers of __ethtool_get_link_ksettings() which don't have a device presence check. Move this check into ethtool to protect all callers. Fixes: d519e17 ("net: export device speed and duplex via sysfs") Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show") Signed-off-by: Jamie Bainbridge <[email protected]> Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
usaleem-ix
pushed a commit
that referenced
this pull request
Nov 4, 2024
Fix __hci_cmd_sync_sk() to return not NULL for unknown opcodes. __hci_cmd_sync_sk() returns NULL if a command returns a status event. However, it also returns NULL where an opcode doesn't exist in the hci_cc table because hci_cmd_complete_evt() assumes status = skb->data[0] for unknown opcodes. This leads to null-ptr-deref in cmd_sync for HCI_OP_READ_LOCAL_CODECS as there is no hci_cc for HCI_OP_READ_LOCAL_CODECS, which always assumes status = skb->data[0]. KASAN: null-ptr-deref in range [0x0000000000000070-0x0000000000000077] CPU: 1 PID: 2000 Comm: kworker/u9:5 Not tainted 6.9.0-ga6bcb805883c-dirty #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 Workqueue: hci7 hci_power_on RIP: 0010:hci_read_supported_codecs+0xb9/0x870 net/bluetooth/hci_codec.c:138 Code: 08 48 89 ef e8 b8 c1 8f fd 48 8b 75 00 e9 96 00 00 00 49 89 c6 48 ba 00 00 00 00 00 fc ff df 4c 8d 60 70 4c 89 e3 48 c1 eb 03 <0f> b6 04 13 84 c0 0f 85 82 06 00 00 41 83 3c 24 02 77 0a e8 bf 78 RSP: 0018:ffff888120bafac8 EFLAGS: 00010212 RAX: 0000000000000000 RBX: 000000000000000e RCX: ffff8881173f0040 RDX: dffffc0000000000 RSI: ffffffffa58496c0 RDI: ffff88810b9ad1e4 RBP: ffff88810b9ac000 R08: ffffffffa77882a7 R09: 1ffffffff4ef1054 R10: dffffc0000000000 R11: fffffbfff4ef1055 R12: 0000000000000070 R13: 0000000000000000 R14: 0000000000000000 R15: ffff88810b9ac000 FS: 0000000000000000(0000) GS:ffff8881f6c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f6ddaa3439e CR3: 0000000139764003 CR4: 0000000000770ef0 PKRU: 55555554 Call Trace: <TASK> hci_read_local_codecs_sync net/bluetooth/hci_sync.c:4546 [inline] hci_init_stage_sync net/bluetooth/hci_sync.c:3441 [inline] hci_init4_sync net/bluetooth/hci_sync.c:4706 [inline] hci_init_sync net/bluetooth/hci_sync.c:4742 [inline] hci_dev_init_sync net/bluetooth/hci_sync.c:4912 [inline] hci_dev_open_sync+0x19a9/0x2d30 net/bluetooth/hci_sync.c:4994 hci_dev_do_open net/bluetooth/hci_core.c:483 [inline] hci_power_on+0x11e/0x560 net/bluetooth/hci_core.c:1015 process_one_work kernel/workqueue.c:3267 [inline] process_scheduled_works+0x8ef/0x14f0 kernel/workqueue.c:3348 worker_thread+0x91f/0xe50 kernel/workqueue.c:3429 kthread+0x2cb/0x360 kernel/kthread.c:388 ret_from_fork+0x4d/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 Fixes: abfeea4 ("Bluetooth: hci_sync: Convert MGMT_OP_START_DISCOVERY") Signed-off-by: Sungwoo Kim <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]>
wongsyrone
pushed a commit
to wongsyrone/truenas-scale-linux
that referenced
this pull request
Feb 17, 2025
[ Upstream commit c7b87ce0dd10b64b68a0b22cb83bbd556e28fe81 ] libtraceevent parses and returns an array of argument fields, sometimes larger than RAW_SYSCALL_ARGS_NUM (6) because it includes "__syscall_nr", idx will traverse to index 6 (7th element) whereas sc->fmt->arg holds 6 elements max, creating an out-of-bounds access. This runtime error is found by UBsan. The error message: $ sudo UBSAN_OPTIONS=print_stacktrace=1 ./perf trace -a --max-events=1 builtin-trace.c:1966:35: runtime error: index 6 out of bounds for type 'syscall_arg_fmt [6]' #0 0x5c04956be5fe in syscall__alloc_arg_fmts /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:1966 truenas#1 0x5c04956c0510 in trace__read_syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2110 truenas#2 0x5c04956c372b in trace__syscall_info /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:2436 truenas#3 0x5c04956d2f39 in trace__init_syscalls_bpf_prog_array_maps /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:3897 truenas#4 0x5c04956d6d25 in trace__run /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:4335 truenas#5 0x5c04956e112e in cmd_trace /home/howard/hw/linux-perf/tools/perf/builtin-trace.c:5502 truenas#6 0x5c04956eda7d in run_builtin /home/howard/hw/linux-perf/tools/perf/perf.c:351 truenas#7 0x5c04956ee0a8 in handle_internal_command /home/howard/hw/linux-perf/tools/perf/perf.c:404 truenas#8 0x5c04956ee37f in run_argv /home/howard/hw/linux-perf/tools/perf/perf.c:448 truenas#9 0x5c04956ee8e9 in main /home/howard/hw/linux-perf/tools/perf/perf.c:556 truenas#10 0x79eb3622a3b7 in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58 truenas#11 0x79eb3622a47a in __libc_start_main_impl ../csu/libc-start.c:360 truenas#12 0x5c04955422d4 in _start (/home/howard/hw/linux-perf/tools/perf/perf+0x4e02d4) (BuildId: 5b6cab2d59e96a4341741765ad6914a4d784dbc6) 0.000 ( 0.014 ms): Chrome_ChildIO/117244 write(fd: 238, buf: !, count: 1) = 1 Fixes: 5e58fcf ("perf trace: Allow allocating sc->arg_fmt even without the syscall tracepoint") Signed-off-by: Howard Chu <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Namhyung Kim <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Being written anything waits for all device probe to complete before
returning. After that
udevadm settle
used by ZFS scripts reallycan provide system with all disks detected for boot pool import.
Ticket: NAS-108200