-
Notifications
You must be signed in to change notification settings - Fork 7
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RGDR class implementation #68
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice work! I played with it and checked the code. All look pretty good to me.
But still I have some concerns about the structure of the code. To me it seems too crowded to put RGDR class, all functions doing actual calculations and visualization functions all in one place.
I understand that this fits well with the design if we have a dimensionality
folder and put all methods (e.g. rgdr, pca, mca, t-sne...) there, each in a separate file. But if we only want to have RGDR as an independent service and only wrappers for other methods. Then it is better to restructure RGDR module into a few separate files, for readability and maintenance.
We can think about it later when this module grows bigger. For now, just go ahead merge it!
Again, thanks a lot for the endeavor 😄!
def plot_correlation( | ||
self, | ||
precursor: xr.DataArray, | ||
ax1: Optional[plt.Axes] = None, | ||
ax2: Optional[plt.Axes] = None, | ||
) -> List[Type[mpl.collections.QuadMesh]]: | ||
"""Generates a figure showing the correlation and p-value results with the | ||
initiated RGDR class and input precursor field. | ||
|
||
Args: | ||
precursor (xr.DataArray): Precursor field data with the dimensions | ||
'latitude', 'longitude', and 'anchor_year' | ||
ax1 (plt.Axes, optional): a matplotlib axis handle to plot | ||
the correlation values into. If None, an axis handle will be created | ||
instead. | ||
ax2 (plt.Axes, optional): a matplotlib axis handle to plot | ||
the p-values into. If None, an axis handle will be created instead. | ||
|
||
Returns: | ||
List[mpl.collections.QuadMesh]: List of matplotlib artists. | ||
""" | ||
|
||
if not isinstance(precursor, xr.DataArray): | ||
raise ValueError("Please provide an xr.DataArray, not a dataset") | ||
|
||
corr, p_val = correlation(precursor, self.timeseries, corr_dim="anchor_year") | ||
|
||
if (ax1 is None) and (ax2 is None): | ||
_, (ax1, ax2) = plt.subplots(ncols=2) | ||
elif (ax1 is None) or (ax2 is None): | ||
raise ValueError( | ||
"Either pass axis handles for both ax1 and ax2, or pass neither." | ||
) | ||
|
||
plot1 = corr.plot.pcolormesh(ax=ax1, cmap="viridis") # type: ignore | ||
plot2 = p_val.plot.pcolormesh(ax=ax2, cmap="viridis") # type: ignore | ||
|
||
ax1.set_title("correlation") | ||
ax2.set_title("p-value") | ||
|
||
return [plot1, plot2] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, it makes more sense to place these "plot" codes in a visual.py
module and create a method in the RGDR class to call these "plot" codes. This will reduces the size of rgdr.py
file and makes it more readable. It is fine for now. But when the code grows again due to the implementation of new methods (e.g. regression map?). We need to restructure the code a bit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK! Yeah we can do this when it's required. Although moving the plotting methods would only reduce the number of lines by ~25. Most of the lines are the docstrings.
combining `weighted` and `groupby`. An open PR adds supports for this functionality | ||
(https://github.com/pydata/xarray/pull/5480), but this branch was never merged. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice finding, haha. Maybe we should leave a comment there to push them a bit.
return reduced_data | ||
|
||
|
||
def geographical_cluster_center( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very nice implementation. We can add the center in the visualization in next iteration, together the restructuring of all plot codes if necessary.
Hi Yang, thanks for the review. RGDR does feel cluttered, however it currently only contains the following;
I think the size and structure of the file is fine at the moment. We could move out the correlation method along with the pearsonr_nan function, but that would indeed be appropriate to do when implementing different correlation/regression methods. We could refactor it to something like;
|
Two docstring corrections Co-authored-by: Yang <[email protected]>
Kudos, SonarCloud Quality Gate passed! |
This PR implements RGDR as a class, with the previously implemented DBSCAN functionality integrated into the class, and with plotting methods available for users to play with the RGDR configuration.
Note that this is a standalone implementation of RGDR, without any knowledge about splits/folds implemented explicitly.
Example:
Closes #38