Skip to content

Commit

Permalink
WIP - Allow most characters in node key names and any primitive as di…
Browse files Browse the repository at this point in the history
…ctionary key
  • Loading branch information
odelalleau committed Dec 8, 2020
1 parent 3af2f85 commit 200fef2
Show file tree
Hide file tree
Showing 3 changed files with 7 additions and 10 deletions.
7 changes: 2 additions & 5 deletions omegaconf/grammar/OmegaConfGrammarLexer.g4
Original file line number Diff line number Diff line change
Expand Up @@ -71,13 +71,10 @@ QUOTED_VALUE:
mode INTERPOLATION_MODE;

NESTED_INTER_OPEN: INTER_OPEN -> type(INTER_OPEN), pushMode(INTERPOLATION_MODE);
INTER_COLON: ':' WS? -> type(COLON), mode(VALUE_MODE);
INTER_COLON: ':' WS? -> mode(VALUE_MODE);
INTER_CLOSE: '}' -> popMode;

DOT: '.';
INTER_ID: ID -> type(ID);
// A special type of `ID` that contains the "@" character. It has to be a different
// lexer token because resolver names must not contain this character.
ID_WITH_AT: (ID|'@')+;
LIST_INDEX: INT_UNSIGNED;
KEY: ~[${}[\]:. \t]+; // interpolation key, may contain any non special character

This comment has been minimized.

Copy link
@omry

omry Dec 8, 2020

Owner

If this is an interpolation key, let's call it INTER_KEY.
KEY is way too generic.

This comment has been minimized.

Copy link
@omry

omry Dec 8, 2020

Owner

I am a bit concerned about the approach here. are you sure this is the full list of special characters?

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 8, 2020

Author Collaborator

If this is an interpolation key, let's call it INTER_KEY.

Ok.

I am a bit concerned about the approach here. are you sure this is the full list of special characters?

The list of special characters can easily be identified by looking at other rules in INTERPOLATION_MODE. Going through it in order, it is: ${:}. \t. I added brackets in case we may want to add support for ${list[index]} in the future. I think I could add () in case we want to add support for ${function(arg)}. Can you think of any other need for more special characters?

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 8, 2020

Author Collaborator

Can you think of any other need for more special characters?

I'm going to add quotes (single+double) as well. This may allow later something like ${"key with spaces".x} (which wouldn't work right now since whitespaces are skipped).

Edit: also at some point maybe we'll want to allow ${"[email protected]".name}, another use case for quotes

This comment has been minimized.

Copy link
@omry

omry Dec 8, 2020

Owner

alright, sounds good.

INTER_WS: WS -> skip;
4 changes: 2 additions & 2 deletions omegaconf/grammar/OmegaConfGrammarParser.g4
Original file line number Diff line number Diff line change
Expand Up @@ -48,8 +48,8 @@ sequence: element (COMMA element)*;

interpolation: interpolationNode | interpolationResolver;
interpolationNode: INTER_OPEN DOT* configKey (DOT configKey)* INTER_CLOSE;
interpolationResolver: INTER_OPEN (interpolation | ID) COLON sequence? BRACE_CLOSE;
configKey: interpolation | ID | ID_WITH_AT | LIST_INDEX;
interpolationResolver: INTER_OPEN (interpolation | ID) INTER_COLON sequence? BRACE_CLOSE;
configKey: interpolation | ID | KEY;

This comment has been minimized.

Copy link
@omry

omry Dec 8, 2020

Owner

Can you give examples for interpolation, ID and KEY?

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 8, 2020

Author Collaborator
  • interpolation: ${x.y}, ${foo:bar}
  • ID: foo_10, _abc
  • KEY: user@domain, ^&*weird_key-+=

This comment has been minimized.

Copy link
@omry

omry Dec 8, 2020

Owner

Aren't all IDs also KEYs?

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 9, 2020

Author Collaborator

Yes they are, but I need to make the difference so that interpolationResolver (the line just above) can use only IDs. If I use a single lexer token I can't differentiate between valid resolver names and valid key names.


// Primitive types.

Expand Down
6 changes: 3 additions & 3 deletions omegaconf/grammar_visitor.py
Original file line number Diff line number Diff line change
Expand Up @@ -78,7 +78,7 @@ def defaultResult(self) -> List[Any]:
def visitConfigKey(self, ctx: OmegaConfGrammarParser.ConfigKeyContext) -> str:
from ._utils import _get_value

# interpolation | ID | ID_WITH_AT | LIST_INDEX
# interpolation | ID | KEY
assert ctx.getChildCount() == 1
child = ctx.getChild(0)
if isinstance(child, OmegaConfGrammarParser.InterpolationContext):
Expand Down Expand Up @@ -168,7 +168,7 @@ def visitInterpolationResolver(
) -> Optional["Node"]:
from ._utils import _get_value

# INTER_OPEN (interpolation | ID) COLON sequence? BRACE_CLOSE;
# INTER_OPEN (interpolation | ID) INTER_COLON sequence? BRACE_CLOSE;
resolver_name = None
inputs = []
inputs_str = []
Expand Down Expand Up @@ -208,7 +208,7 @@ def visitDictKeyValuePair(
) -> Tuple[Any, Any]:
from ._utils import _get_value

assert ctx.getChildCount() == 3 # ID COLON element
assert ctx.getChildCount() == 3 # primitive COLON element

This comment has been minimized.

Copy link
@omry

omry Dec 8, 2020

Owner

Did you forget to include the actual change that enables primitive?

Can you test this in Hydra as well?
just to be sure we are not stepping into a landmine?

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 8, 2020

Author Collaborator

Did you forget to include the actual change that enables primitive?

Didn't forget, but yes it needs to be added. I only changed the grammar in this commit so that we can agree about it, before I add the supporting code.

Can you test this in Hydra as well?

Will do once I implement it entirely.

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 9, 2020

Author Collaborator

Did you forget to include the actual change that enables primitive?

Didn't forget

Oops, actually I did forget. Looking into it now. First attempt's triggering the "Attempting Full Context" warning, will see if I can get around it.

This comment has been minimized.

Copy link
@odelalleau

odelalleau Dec 9, 2020

Author Collaborator

First attempt's triggering the "Attempting Full Context" warning, will see if I can get around it.

Ok, so this is a bit annoying. This is because if we declare:

dictKeyValuePair: primitive COLON element;

then there is an ambiguity with : because this character is also allowed in primitive. So for instance {foo::bar} is ambiguous: is the key foo or foo:?

The fix I would suggest would be to create a slightly different version of primitive that wouldn't include : and would only be used for dictionary keys. I could do this (or something else, if you have a better idea), however I'm not even sure why we wanted to do this in the first place (except because you asked whether it would work). What's the motivation exactly?

key_node = ctx.getChild(0)
assert (
isinstance(key_node, TerminalNode)
Expand Down

2 comments on commit 200fef2

@omry
Copy link
Owner

@omry omry commented on 200fef2 Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are done with this commit, right? (it should be a part of the bigger interpolation diff?)

@odelalleau
Copy link
Collaborator Author

@odelalleau odelalleau commented on 200fef2 Dec 18, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we are done with this commit, right? (it should be a part of the bigger interpolation diff?)

Yes, it was just a draft to kickstart the discussion before the "real" implementation. The (almost) final version ended up in #445 as 4bbc308 (you've reviewed it already)

Please sign in to comment.