Description
A Firefox developer filed a bug report about Rust symbols not being correctly represented in crash reports:
0 XUL GeckoCrash toolkit/xre/nsAppRunner.cpp:5093 context
1 XUL gkrust_shared::panic_hook toolkit/library/rust/shared/lib.rs:240 frame_pointer
2 XUL core::ops::function::Fn::call src/libcore/ops/function.rs:69 cfi
3 XUL rust_panic_with_hook src/libstd/panicking.rs:482 cfi
4 XUL continue_panic_fmt src/libstd/panicking.rs:385 cfi
5 XUL rust_begin_panic src/libstd/panicking.rs:312 cfi
6 XUL panic_fmt src/libcore/panicking.rs:85 cfi
7 XUL panic src/libcore/panicking.rs:49 cfi
The expectation was that, e.g., rust_panic_with_hook would have been std::panicking::rust_panic_with_hook.
We had recently upgraded to Rust 1.34, which led to comparing object files before and after the upgrade. I compared Linux x86-64 binaries; I don't see why the analysis doesn't apply to OS X's Mach-O files, but it's possible the results are different there. (The above crash is from OS X, and we have crashes using Rust 1.33 that do display std::panicking::rust_panic_with_hook and similar.) The ELF symbol table in both cases lists rust_panic_with_hook as _ZN3std9panicking20rust_panic_with_hook$UNIQUE_ID, so that wasn't the problem.
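As an illustration only (not part of the original investigation), the sketch below shows why the mangled name matters: in the legacy Rust mangling scheme the symbol is a sequence of length-prefixed path components, so a tool holding the mangled name can recover the full std::panicking::... path. The function names demangle_legacy and is_hash are hypothetical, the sample symbol is the Rust 1.33 linkage name quoted in the debug information later in this report, and escape sequences such as $LT$ are deliberately not handled.

```rust
/// Decode a legacy-mangled Rust symbol of the form `_ZN<len><ident>...E`
/// into a `::`-separated path. Returns `None` if the input does not follow
/// that layout. Escapes such as `$LT$` are left untouched (assumption: this
/// sketch only needs plain identifiers).
fn demangle_legacy(symbol: &str) -> Option<String> {
    let mut rest = symbol.strip_prefix("_ZN")?.strip_suffix('E')?;
    let mut parts: Vec<&str> = Vec::new();

    while !rest.is_empty() {
        // Each component is a decimal length followed by that many bytes.
        let digits = rest.bytes().take_while(|b| b.is_ascii_digit()).count();
        if digits == 0 {
            return None;
        }
        let len: usize = rest[..digits].parse().ok()?;
        let end = digits.checked_add(len)?;
        parts.push(rest.get(digits..end)?);
        rest = &rest[end..];
    }

    // The final component is normally a hash like `h` + 16 hex digits; drop it.
    if parts.len() > 1 && parts.last().map_or(false, |p| is_hash(p)) {
        parts.pop();
    }
    if parts.is_empty() {
        None
    } else {
        Some(parts.join("::"))
    }
}

/// True for the trailing disambiguator component, e.g. `he447c38467745511`.
fn is_hash(component: &str) -> bool {
    component.len() == 17
        && component.starts_with('h')
        && component[1..].bytes().all(|b| b.is_ascii_hexdigit())
}

fn main() {
    let mangled = "_ZN3std9panicking20rust_panic_with_hook17he447c38467745511E";
    match demangle_legacy(mangled) {
        Some(path) => println!("{}", path),
        None => println!("not a legacy-mangled Rust symbol"),
    }
}
```

A bare name such as rust_panic_with_hook has no path components to recover, which is why a consumer that only sees the short name cannot print the module prefix.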
We then looked at the debug information. Rust 1.33 generated, according to readelf --debug-dump=info:
<3><2155a724>: Abbrev Number: 301 (DW_TAG_subprogram)
<2155a726> DW_AT_low_pc : 0x5738510
<2155a72a> DW_AT_high_pc : 0x6b2
<2155a72e> DW_AT_frame_base : 1 byte block: 54 (DW_OP_reg4 (esp))
<2155a730> DW_AT_linkage_name: (indirect string, offset: 0x9d7fc09): _ZN3std9panicking20rust_panic_with_hook17he447c38467745511E
<2155a734> DW_AT_name : (indirect string, offset: 0x9d7fc45): rust_panic_with_hook
<2155a738> DW_AT_decl_file : 11
<2155a739> DW_AT_decl_line : 447
<2155a73b> DW_AT_external : 1
<2155a73b> DW_AT_noreturn : 1
Notice the existence of both DW_AT_name and DW_AT_linkage_name. Rust 1.34, in contrast, generated:
<1><216d7afa>: Abbrev Number: 293 (DW_TAG_subprogram)
<216d7afc> DW_AT_low_pc : 0x5734990
<216d7b00> DW_AT_high_pc : 0x6ae
<216d7b04> DW_AT_name : (indirect string, offset: 0x9db1282): rust_panic_with_hook
which drops the DW_AT_linkage_name and is also significantly less informative than its predecessor.
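The comparison above can be automated in a rough way. The following is a minimal sketch, assuming the textual layout of readelf --debug-dump=info shown in the two dumps above: it lists every DW_TAG_subprogram entry that has a DW_AT_name but no DW_AT_linkage_name. The function name subprograms_missing_linkage_name is hypothetical, and the check is only a heuristic, since some subprogram entries legitimately carry no linkage name.

```rust
/// Scan the text output of `readelf --debug-dump=info` and return the
/// `DW_AT_name` of each `DW_TAG_subprogram` entry that has no
/// `DW_AT_linkage_name`. Entries without a name are ignored.
fn subprograms_missing_linkage_name(dump: &str) -> Vec<String> {
    let mut missing = Vec::new();
    let mut in_subprogram = false;
    let mut has_linkage = false;
    let mut name: Option<String> = None;

    for line in dump.lines() {
        if line.contains("Abbrev Number:") {
            // A new entry starts here, so settle the previous one first.
            if in_subprogram && !has_linkage {
                if let Some(n) = name.take() {
                    missing.push(n);
                }
            }
            in_subprogram = line.contains("(DW_TAG_subprogram)");
            has_linkage = false;
            name = None;
        } else if in_subprogram {
            if line.contains("DW_AT_linkage_name") {
                has_linkage = true;
            } else if line.contains("DW_AT_name") {
                // The attribute value follows the last ": " on the line.
                name = line.rsplit(": ").next().map(|s| s.trim().to_string());
            }
        }
    }

    // Settle the final entry in the dump.
    if in_subprogram && !has_linkage {
        if let Some(n) = name.take() {
            missing.push(n);
        }
    }
    missing
}

fn main() {
    // Abbreviated copy of the Rust 1.34 entry quoted above.
    let dump = "<1><216d7afa>: Abbrev Number: 293 (DW_TAG_subprogram)\n\
                <216d7afc> DW_AT_low_pc : 0x5734990\n\
                <216d7b00> DW_AT_high_pc : 0x6ae\n\
                <216d7b04> DW_AT_name : (indirect string, offset: 0x9db1282): rust_panic_with_hook\n";
    for name in subprograms_missing_linkage_name(dump) {
        println!("missing DW_AT_linkage_name: {}", name);
    }
}
```

Run against the Rust 1.33 entry, the same function reports nothing, because that entry carries both attributes.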
I'm not familiar enough with rustc to know what might have caused this regression. One of my colleagues pointed out #58208, which changed how various bits of panic infrastructure are imported into libstd. It's not clear to me whether it's that specific change, or how the compiler internally describes crate:: symbols to LLVM, or something else entirely.
Activity
pnkfelix commented on Apr 18, 2019
Some first steps that would be good here:
rustc itself. (Or a whole crate to feed to cargo, if necessary...)
In any case, I'm not 100% sure what priority to assign to this. Obviously we want our debug info to be high quality. But the first step is to find out whether this change was deliberate or not, and if it was deliberate, what was the issue motivating the change.
froydnj commented on Apr 18, 2019
A reasonably small testcase is:
Using rustc 1.34, you get a very different result:
nikomatsakis commented on Apr 23, 2019
This is a fairly serious bug for the FF folk as it is breaking their crash report infrastructure. It would be good to do a bisection to verify if, indeed, #58208 is at fault (I don't know why that should be the case). Is there any connection between the use of Rust 2018 and debuginfo?
Gankra commented on Apr 23, 2019
I believe we don't currently use Rust 2018, as upgrading is low priority and requires fixes to our tooling.
froydnj commented on Apr 23, 2019
Some investigation showed that the dropping of module names is not limited to std symbols. Which means that this is not (specifically) a Rust 2018 issue.
froydnj commented on Apr 26, 2019
I am in the process of bisecting this.
froydnj commented on Apr 27, 2019
OK, I have run into an issue that must involve some subtlety with rustc that I don't understand. I can use the above instructions with a released rustc to get an artifact I can analyze with readelf. But when I try to do the same with a freshly-built rustc from stage2, the resulting objects in the archive suddenly contain no debug information (even with -C debuginfo=2)...and significantly more .o files from core rust (e.g. compiler_builtins) than the released rustc.
My command for the stage2 rustc is ./build/x86_64-unknown-linux-gnu/stage2/bin/rustc --crate-type staticlib -o libdwarf-test.a ~/dwarf-test.rs -C debuginfo=2.
What is going on here? Why isn't the stage2 rustc acting like the rustc from releases? I don't think the number of object files is really a big issue, but the debug information bit makes it basically impossible to write a bisection script that is reasonably efficient.
glandium commented on Apr 27, 2019
Search for debuginfo in the rust section of config.toml.
michaelwoerister commented on Apr 29, 2019
Try with the following config.toml settings:
That should make sure that each crate of the standard library results in one object file, and that debuginfo is generated for the standard library (but not the compiler).
froydnj commented on Apr 29, 2019
debuginfo and debuginfo-only-std do seem to make the testcase work. They do not explain why -C debuginfo=2 doesn't produce any debuginfo, not even basic information about the function being compiled.
Anyway, several attempts at bisection later, and I haven't been able to reproduce the issue when compiling 1.34.0 on my machine and testing against that. I must be using the wrong flags somewhere along the way.
pnkfelix commented on Apr 29, 2019
@froydnj have you tried using a docker image to replicate the exact form used by our builders when they make the distribution artifacts?