Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ARROW-16073: [R] clean-up date time unit testing once tzdb is available on Windows #12883

Closed
wants to merge 24 commits into from

Conversation

dragosmg
Copy link
Contributor

No description provided.

@github-actions
Copy link

@dragosmg dragosmg force-pushed the datetime_unit_testing_cleanup branch from ddc0a7e to 08497fd Compare April 21, 2022 11:22
@dragosmg dragosmg marked this pull request as ready for review April 21, 2022 15:00
@dragosmg dragosmg marked this pull request as draft April 21, 2022 15:19
@jonkeane
Copy link
Member

Would you mind rebasing this and looking into the windows CI failures? We might be too late already, but we really should make sure this is done + more importantly there are no issues before we release

@dragosmg dragosmg force-pushed the datetime_unit_testing_cleanup branch from 35e8b66 to 06ba19e Compare April 26, 2022 08:20
@dragosmg
Copy link
Contributor Author

All window jobs seem to be tripping up in the same spot (involving formats), with the following error message:

Invalid: Cannot find locale 'English_United States.1252': locale::facet::_S_create_c_locale name not valid

As far as I can tell it has to do with the format, in general, and "%b" + "%B", in particular, as they rely on the locale in order to figure out the abbreviated / full month names.

@dragosmg dragosmg changed the title ARROW-16073 [R] clean-up date time unit testing once tzdb is available on Windows ARROW-16073: [R] clean-up date time unit testing once tzdb is available on Windows Apr 27, 2022
@dragosmg dragosmg marked this pull request as ready for review April 28, 2022 09:48
@dragosmg
Copy link
Contributor Author

We could link to ARROW-13133 to solve the remaining locale-related issues. Not sure a new Jira is needed.

@wjones127
Copy link
Member

wjones127 commented Apr 28, 2022

All window jobs seem to be tripping up in the same spot (involving formats)

@dragosmg This is expected. We don't support locales other than the "C" locale, so you can't format timestamps in different locales. From my earlier research, this seems to be an issue with MINGW itself. R gets around this by vendoring the strftime implementation:

However, not all OSes (notably Windows) provided strptime and many issues were found for those which did, so since 2000 R has used a fork of code from glibc. The forked code uses the system's strftime to find the locale-specific day and month names and any AM/PM indicator.

On some platforms (including Windows and by default on macOS) the system's strftime is replaced (along with most of the rest of the C-level datetime code) by code modified from IANA's tzcode distribution (https://www.iana.org/time-zones).
(from https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/strptime)

@jonkeane
Copy link
Member

And presumably, we could set the locale for windows only on the tests that still have the skip here, like we do in test-dplyr-funcs-datetime.R ?

@dragosmg
Copy link
Contributor Author

Thanks @wjones127 The absence of locale support is showing up in a number of different places (for example, some of the datetime parsing). Do you think it's worth having a separate Jira to discuss locale support? ARROW-13133 touches on locale, but doesn't go far enough.

@wjones127
Copy link
Member

I made an issue for looking into locales on Windows MINGW systems / RTools: https://issues.apache.org/jira/browse/ARROW-16399

@dragosmg
Copy link
Contributor Author

dragosmg commented Apr 29, 2022

@jonkeane I believe there aren't any outstanding suggestions. Would you have time for another look?

@jonkeane jonkeane closed this in 3c03d49 Apr 29, 2022
@ursabot
Copy link

ursabot commented May 5, 2022

Benchmark runs are scheduled for baseline = 48a6e25 and contender = 3c03d49. 3c03d49 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Finished ⬇️0.97% ⬆️0.04%] test-mac-arm
[Finished ⬇️0.0% ⬆️0.0%] ursa-i9-9960x
[Finished ⬇️0.08% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 3c03d493 ec2-t3-xlarge-us-east-2
[Finished] 3c03d493 test-mac-arm
[Finished] 3c03d493 ursa-i9-9960x
[Finished] 3c03d493 ursa-thinkcentre-m75q
[Finished] 48a6e258 ec2-t3-xlarge-us-east-2
[Finished] 48a6e258 test-mac-arm
[Finished] 48a6e258 ursa-i9-9960x
[Finished] 48a6e258 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@dragosmg dragosmg deleted the datetime_unit_testing_cleanup branch May 13, 2022 08:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants