Closed
Description
I found that these pathological examples pass the lexer and the parser:
    fn main() {
        println!("{}", "Foo"_);
        println!("{}", 10._);
    }
The reason is that `scan_optional_raw_name` eats `_`, but reports no suffix. `scan_number` distinguishes between decimal points and field/method syntax by `is_xid_start`, which in fact excludes `_`.
According to the reference, it seems these examples shouldn't be allowed.
Tested on stable (`rustc 1.17.0 (56124baa9 2017-04-24)`) and nightly (`rustc 1.19.0-nightly (6a5fc9eec 2017-05-02)`) on the playground.
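To illustrate the explanation above, here is a toy sketch of the decision the lexer makes after the digits and a `.`. It is not rustc's actual code: the function names are hypothetical, and `char::is_alphabetic` is only an assumed stand-in for the real XID_Start check. What it shares with the real property is that `_` is excluded, which is why `10._` ends up being read as a float.

```rust
/// Rough stand-in for Unicode XID_Start (an assumption for this sketch).
/// Like the real property, it does not include `_`.
fn is_xid_start_approx(c: char) -> bool {
    c.is_alphabetic()
}

/// Check as described in this issue: after the digits and a `.`, the dot is
/// treated as a decimal point unless the next character is a second `.`
/// or could start an identifier according to the XID_Start test.
fn dot_is_decimal_point_lenient(next: Option<char>) -> bool {
    match next {
        Some('.') => false,
        Some(c) => !is_xid_start_approx(c),
        None => true,
    }
}

/// Stricter variant: `_` also counts as a possible identifier start,
/// so `10._` is read as `10` followed by `.` and `_`.
fn dot_is_decimal_point_strict(next: Option<char>) -> bool {
    match next {
        Some('.') => false,
        Some(c) => !(c == '_' || is_xid_start_approx(c)),
        None => true,
    }
}
```

With the lenient check, `dot_is_decimal_point_lenient(Some('_'))` is true, so the `.` in `10._` is taken as a decimal point. The strict variant returns false for the same input, leaving the `.` to be handled as field/method syntax.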
Activity
qnighy commented on May 3, 2017
According to the reference, decimal points are optionally followed by another "decimal literal". Since `_` isn't a decimal literal, it shouldn't be allowed.
Here the parser is recognizing `_` in `10._` as a suffix rather than a decimal literal followed by the point. If not, `10._f64` would be lexed as `10._` + `f64`, but actually the parser complains `_f64` is not a valid suffix.
In addition, the reference says `f32` and `f64` are the only valid suffixes. So `_` shouldn't be allowed as a suffix either.
nikomatsakis commented on May 11, 2017
It seems to me that `_` should be considered part of the number. So I would expect `10._f64` to work, indeed, just as `10.2_f64` or `2_f64` works.
@arielb1 points out that `"a"_` is also accepted, which seems like a bug.
Basically, I do not expect a suffix to begin with `_`, I think, but I do expect `_` to be allowed in numeric literals anywhere (including the end).
nikomatsakis commented on May 11, 2017
Shifting nomination to T-lang for now.
joshtriplett commented on May 11, 2017
The language team discussed this in today's meeting. We also considered cases like `42.method_on_numbers()`, or `42._method_on_numbers()`, both of which ought to be parsed as methods. The conclusion we came to was that it's ambiguous (for our mental lexers if nothing else) to allow a trailing underscore on a floating-point literal immediately following a `.` with no subsequent digits. Parsing `_[a-z]` in a way that puts the `_` and the `[a-z]` in separate tokens feels incredibly confusing. It's not unreasonable to have to write `42.0f64` or `42.0_f64` instead if you want a suffix.
(By contrast, `42_f64` or `42_u64` seems fine, and unambiguous, though whether you should write that might be a good question for the style team.)
So, the recommendation from the language team would be to always parse `42._ident` as a field or method, and never include the `_` as part of the floating-point literal.
One question that didn't come up in the language team meeting: does anything rely on the current (apparently inconsistent) lexing behavior here? Would fixing this ambiguity break anything?
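For reference, a small sketch of the literal spellings that this thread treats as fine and unambiguous. The function names are hypothetical and exist only to hold the literals; the literals themselves are the ones quoted in the comments.

```rust
/// Float spellings mentioned in this thread as unambiguous: digits follow
/// the decimal point, or there is no decimal point at all.
fn unambiguous_float_literals() -> [f64; 6] {
    [42.0f64, 42.0_f64, 42_f64, 10.2_f64, 2_f64, 4_000.000_000]
}

/// An integer literal with an underscore before its type suffix.
fn suffixed_integer() -> u64 {
    42_u64
}
```

In each of these, any underscore sits between digits or directly before the suffix of a literal that already has its digits, never immediately after the `.`.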
qnighy commented on May 12, 2017
I'll try to write a patch for this issue.
nikomatsakis commented on May 12, 2017
@joshtriplett I would think not, at least for numbers, given that it results in an error. Not sure about `""_`, which... apparently parses but does not error? (When do we ever support suffixes on strings, besides `#`?)
Rollup merge of rust-lang#41946 - qnighy:disallow-dot-underscore-in-f…
qnighy commented on May 15, 2017
@joshtriplett Do you mean by "never include" also forbidding literals like `4_000.000_000`?
joshtriplett commented on May 16, 2017
@qnighy No, that's fine. I meant specifically that `42._ident` or similar should never lex as a floating-point number `42._` with suffix `ident`; an underscore immediately after the dot should not lex as part of the literal.
qnighy commented on May 16, 2017
@joshtriplett I see. Then #41946 and #41990 will close this issue.
Auto merge of #41990 - qnighy:disallow-underscore-suffix-for-string-l…
qnighy commented on Jun 6, 2017
Closing since #41946 and #41990 are merged.