-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Randomness conciseness #10
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I commented on things that should change. In brief: I don't love the environment variable system we've got now, but unless we get rid of it completely we should not silently override it. Most changes requested are small
@@ -206,6 +206,7 @@ def calc_score_dict_uncertainty(model, x_df, y_df, split_name, | |||
'INTPTLAT', 'INTPTLON'] | |||
|
|||
tr, va, te = make_xy_data_splits.load_xy_splits( | |||
data_dir = '../cook-county/cleaning-cook-county/data_dir', |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This shouldn't be necessary. load_xy_splits
uses the data directory set by the users environment variable. I don't love the environment variable either, but by hardcoding this path here it silently breaks how the DATA_DIR environment variable works
@@ -7,7 +7,7 @@ | |||
parser = argparse.ArgumentParser() | |||
parser.add_argument('--steps_to_run', default='1,2,3,4,5', type=str) | |||
args = parser.parse_args() | |||
|
|||
os.environ['DATA_DIR'] = 'data_dir' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can add this as the default on the next line, but we shouldn't overwrite the user's environment here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-Dont override environment variable, but can proivde ./data_dir
as a default if needed
- Why was the rounding change necessary?
…ch will generate yearly semi and quarterly dfs in one command. This script also has option to add code to easily generate monthly, biweekly, and weekly dfs. The data has been rearranged to do this correct with new "data_dir" in the right spot. Also, this commit added compare_timesteps.py and compare_timesteps_fit_and_predict.py, scripts which can compare the predictive power of 2 timesteps using one as a baseline.
Pull request for updated code. Changes in annual ipynb, make_xy_data_splits, and intro ipynb. data dir is now included with a couple large files removed.