Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Workaround for Dataframe object in the config: check whether value is string before mandatory missing check #553

Merged
merged 7 commits into from
Feb 25, 2021

Conversation

jeffknaide
Copy link
Contributor

@jeffknaide jeffknaide commented Feb 21, 2021

Closes #566

sidestep source of facebookresearch/hydra#1285 at the omegaconf level

@omry
Copy link
Owner

omry commented Feb 22, 2021

Thanks @jeffknaide.
We have some outstanding discussions around the internal behavior of MISSING, we will address this once we comes to a conclusion there.

@omry omry added this to the OmegaConf 2.1 milestone Feb 22, 2021
@omry omry added the blocked label Feb 22, 2021
@omry
Copy link
Owner

omry commented Feb 22, 2021

Blocked by #543.

Copy link
Owner

@omry omry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jeffknaide.
See inline comments.

Comment on lines 41 to 42
def __eq__(self, other: Any) -> Any:
return self
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find __eq__ returning self very odd. what's going on here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when a pandas DataFrame runs df.__eq__(val), it returns a DataFrame of booleans with the same shape as the original DataFrame. Here I was trying to mimic the behavior of a df comparison resulting in the same class being returned, as this is partially what caused issues originally. perhaps this should be made clearer with a comment?

Copy link
Owner

@omry omry Feb 24, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha.
Yes, add a comment what the behavior we are mimicking is.
I guess we can't mimic it without overriding both eq and bool?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i don't think so -- if we don't override eq, the comparison to string would return a bool and then not raise the expected error on the conditional. eg if Dataframe() == "???" would equate to if False (which wouldn't raise the expected error) without the eq override.

@@ -34,6 +34,17 @@ def __deepcopy__(self, memo: Any) -> Any:
raise NotImplementedError()


class Dataframe:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment here.
something like:

Emulates Pandas Dataframe equality behavior.

@omry omry changed the title check whether value is string before mandatory missing check Workaround for Dataframe object in the config: check whether value is string before mandatory missing check Feb 24, 2021
@omry omry requested a review from odelalleau February 24, 2021 23:37
@omry
Copy link
Owner

omry commented Feb 24, 2021

lgtm.
@odelalleau, this could interact with your outstanding PR to change the semantics of missing.

@odelalleau
Copy link
Collaborator

lgtm.
@odelalleau, this could interact with your outstanding PR to change the semantics of missing.

At first glance I don't foresee any issue (crossing fingers). Just mentioning that maybe there could be a new utility function to avoid repeating if isinstance(x, str) and x == "???" everywhere?

@omry
Copy link
Owner

omry commented Feb 24, 2021

Yeah, was thinking that too.

@jeffknaide
Copy link
Contributor Author

@odelalleau @omry oh, yes, that makes sense. I just added is_mandatory_missing to _utils and referenced it where i had been using if isinstance(x, str) and x == "???".

There was also an inner function defined in BaseContainer._resolve_with_default called is_mandatory_missing which was returned of get_value_kind. I just removed that inner function in favor of is_mandatory_missing defined in _utils. This seems to break the tests, though, so I must be missing something that's happening within get_value_kind

@odelalleau
Copy link
Collaborator

odelalleau commented Feb 25, 2021

I must be missing something that's happening within get_value_kind

It’s almost certainly because that function can be called on a node: it starts with _get_value(..) to obtain the node’s value if that’s the case. So in BaseContainer the problem is likely related to if is_mandatory_missing(resolved):, and replacing ˋresolved` with its value here should fix it.

@omry
Copy link
Owner

omry commented Feb 25, 2021

I think we will want to take a closer look at this in a followup.
can you undo the refactoring changes?

@odelalleau
Copy link
Collaborator

can you undo the refactoring changes?

Most could be kept, only the change in _resolve_with_default() needs reverting

@omry
Copy link
Owner

omry commented Feb 25, 2021

Maybe. it just feels like there are a bit too many layers here. The logic can be slightly reorganized to be simpler.

@jeffknaide
Copy link
Contributor Author

@omry - i just reverted previous commit. let me know if you want me to put the utils function back while leaving _resolve_with_default alone.

Copy link
Owner

@omry omry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jeffknaide.

@omry omry merged commit babb872 into omry:master Feb 25, 2021
@jeffknaide jeffknaide deleted the allow-dataframes branch February 25, 2021 20:13
@Jasha10 Jasha10 removed the blocked label Feb 26, 2021
odelalleau pushed a commit to odelalleau/omegaconf that referenced this pull request Mar 10, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Illegal Comparisons when handling pandas DataFrame objects
4 participants