Commit e31d454

Implement NuFFT capability
The ability to interpolate to "loose" visibility points is implemented through the non-uniform FFT or NuFFT. Closes #17
2 parents 9a2c52c + 62d1a7e commit e31d454

32 files changed: +1721 −369 lines
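The commit's headline feature is interpolation of model visibilities to "loose" (un-gridded) u,v points via the NuFFT. As background, here is a minimal sketch of the 1-D non-uniform direct Fourier transform that a NuFFT approximates efficiently (by gridding onto an oversampled FFT with a convolution kernel). This is an illustrative toy, not MPoL's implementation; the names `nudft`, `image`, and `us` are hypothetical:

```python
import numpy as np

# 1-D toy "image" of N pixels at coordinates x_n = n / N
N = 16
rng = np.random.default_rng(0)
image = rng.normal(size=N)

def nudft(image, us):
    """Direct non-uniform DFT: V(u) = sum_n I_n * exp(-2j*pi*u*x_n).

    This is the exact (but O(N*M)) transform that a NuFFT approximates
    quickly by gridding onto an oversampled FFT and applying a kernel
    correction.
    """
    x = np.arange(len(image)) / len(image)
    return np.array([np.sum(image * np.exp(-2j * np.pi * u * x)) for u in us])

# Sanity check: at integer frequencies the NUDFT must agree with the FFT
k = np.arange(N)
V_direct = nudft(image, k)
V_fft = np.fft.fft(image)

# "Loose" visibility points: arbitrary, non-integer spatial frequencies
us = np.array([0.3, 2.7, 5.5])
V_loose = nudft(image, us)
```

The direct sum is exact but scales poorly; the whole point of the NuFFT is to get the same answer at arbitrary u,v points in near-FFT time.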

.github/workflows/package.yml  (+1 −1)

@@ -7,7 +7,7 @@ on:
 
 jobs:
   deploy:
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
 
     steps:
       - uses: actions/checkout@v3

.github/workflows/tests.yml  (+2 −7)

@@ -74,7 +74,7 @@ jobs:
   # (but don't deploy to gh-pages)
   docs:
     needs: tests # don't bother running if a test failed
-    runs-on: ubuntu-latest
+    runs-on: ubuntu-20.04
     steps:
       - uses: actions/checkout@v3
       - name: Set up Python
@@ -87,11 +87,6 @@ jobs:
       - name: Install Pandoc dependency
        run: |
          sudo apt-get install pandoc
-      - name: Set up node
-        uses: actions/setup-node@v2
-      - name: Install mermaid.js dependency
-        run: |
-          npm install @mermaid-js/mermaid-cli
       - name: Cache/Restore the .mpol folder cache
        uses: actions/cache@v2
        env:
@@ -104,4 +99,4 @@ jobs:
       - name: Build the docs
        run: |
          make -C docs clean
-         make -C docs html MERMAID_PATH="../node_modules/.bin/"
+         make -C docs html

docs/Makefile  (+1 −7)

@@ -14,7 +14,6 @@ help:
 .PHONY: help Makefile html clean
 
 CI-NOTEBOOKS := ci-tutorials/PyTorch.ipynb ci-tutorials/gridder.ipynb ci-tutorials/optimization.ipynb ci-tutorials/crossvalidation.ipynb ci-tutorials/initializedirtyimage.ipynb
-CHARTS := _static/mmd/build/SimpleNet.svg _static/mmd/build/ImageCube.svg _static/mmd/build/BaseCube.svg _static/mmd/build/SkyModel.svg
 clean:
 	rm -rf _build
 	# rm -rf ${CI-NOTEBOOKS}
@@ -36,10 +35,5 @@ _static/fftshift/build/plot.png: _static/fftshift/src/plot.py
 	mkdir -p _static/fftshift/build
 	python _static/fftshift/src/plot.py $@
 
-# mermaid.js files
-_static/mmd/build/%.svg: _static/mmd/src/%.mmd
-	mkdir -p _static/mmd/build
-	${MERMAID_PATH}mmdc -i $^ -o $@
-
-html: ${CHARTS} _static/baselines/build/baselines.csv _static/fftshift/build/plot.png
+html: _static/baselines/build/baselines.csv _static/fftshift/build/plot.png
 	python -m sphinx -T -E -b html -d _build/doctrees -D language=en . _build/html

docs/api.rst  (+5 −7)

@@ -22,19 +22,21 @@ Gridding
 --------
 
 .. automodule:: mpol.gridding
-    :members:
 
 Datasets and Cross-Validation
 -----------------------------
 
 .. automodule:: mpol.datasets
-    :members:
 
 Images
 ------
 
 .. automodule:: mpol.images
-    :members:
+
+Fourier
+-------
+
+.. automodule:: mpol.fourier
 
 
 Precomposed Modules
@@ -43,14 +45,11 @@ Precomposed Modules
 For convenience, we provide some "precomposed" `modules <https://pytorch.org/docs/stable/notes/modules.html>`_ which may be useful for simple imaging or modeling applications. In general, though, we encourage you to compose your own set of layers if your application requires it. The source code for a precomposed network can provide a useful starting point. We also recommend checking out the PyTorch documentation on `modules <https://pytorch.org/docs/stable/notes/modules.html>`_.
 
 .. automodule:: mpol.precomposed
-    :members:
-
 
 Losses
 ------
 
 .. automodule:: mpol.losses
-    :members:
 
 
 Connectors
@@ -61,4 +60,3 @@ The objects in the Images and Precomposed modules are focused on bringing some i
 Connectors are a PyTorch layer to help compute those residual visibilities (on a gridded form).
 
 .. automodule:: mpol.connectors
-    :members:

docs/changelog.md  (+2 −0)

@@ -5,6 +5,8 @@
 ## v0.1.2
 
 - Switched documentation backend to [MyST-NB](https://myst-nb.readthedocs.io/en/latest/index.html).
+- Switched documentation theme to [Sphinx Book Theme](https://sphinx-book-theme.readthedocs.io/en/latest/index.html).
+- Added {class}`~mpol.fourier.NuFFT` layer, allowing the direct forward modeling of un-gridded :math:`u,v` data. Closes GitHub issue [#17](https://github.com/MPoL-dev/MPoL/issues/17).
 
 ## v0.1.1

docs/ci-tutorials/PyTorch.md  (+14 −19)

@@ -14,17 +14,16 @@ kernelspec:
 
 ```{code-cell}
 :tags: [hide-cell]
-%matplotlib inline
 %run notebook_setup
 ```
 
 # Introduction to PyTorch: Tensors and Gradient Descent
 
-This tutorial provides an introduction to PyTorch tensors, automatic differentiation, and optimization with gradient descent.
+This tutorial provides a gentle introduction to PyTorch tensors, automatic differentiation, and optimization with gradient descent outside of any specifics about radio interferometry or the MPoL package itself.
 
 ## Introduction to Tensors
 
-Tensors are matrices, similar to numpy arrays, with the added benefit that they can be used to calculate gradients (more on that later). MPoL is built on PyTorch, and uses a form of gradient descent optimization to find the "best" image given a dataset and choice of regularizers.
+Tensors are multi-dimensional arrays, similar to numpy arrays, with the added benefit that they can be used to calculate gradients (more on that later). MPoL is built on the [PyTorch](https://pytorch.org/) machine learning library, and uses a form of gradient descent optimization to find the "best" image given some dataset and loss function, which may include regularizers.
 
 We'll start this tutorial by importing the torch and numpy packages. Make sure you have [PyTorch installed](https://pytorch.org/get-started/locally/) before proceeding.
 
@@ -65,7 +64,7 @@ print(f"Torch tensor multiplication result: {prod_tensor}")
 
 +++
 
-PyTorch provides a key functionality---the ability to calculate the gradients on tensors. Let's start by creating a tensor with a single value. Here we are setting ``requires_grad = True``, we'll see why this is important in a moment.
+PyTorch allows us to calculate the gradients on tensors, which is a key functionality underlying MPoL. Let's start by creating a tensor with a single value. Here we are setting ``requires_grad = True``; we'll see why this is important in a moment.
 
 ```{code-cell}
 x = torch.tensor(3.0, requires_grad=True)
@@ -78,7 +77,7 @@ Let's define some variable $y$ in terms of $x$:
 y = x ** 2
 ```
 
-We see that the value of $y$ is as we expect---nothing too strange here.
+We see that the value of $y$ is as we expect---nothing too strange here.
 
 ```{code-cell}
 print(f"x: {x}")
@@ -87,26 +86,24 @@ print(f"y: {y}")
 
 But what if we wanted to calculate the gradient of $y$ with respect to $x$? Using calculus, we find that the answer is $\frac{dy}{dx} = 2x$. The derivative evaluated at $x = 3$ is $6$.
 
-The magic is that can use PyTorch to get the same answer---no analytic derivative needed!
+We can use PyTorch to get the same answer---no analytic derivative needed!
 
 ```{code-cell}
 y.backward() # populates gradient (.grad) attributes of y with respect to all of its independent variables
 x.grad # returns the grad attribute (the gradient) of y with respect to x
 ```
 
-PyTorch uses the concept of automatic differentiation to calculate the derivative. Instead of computing the derivative as we would by hand, the program is using a computational graph and mechanistic application of the chain rule. For example, a computational graph with several operations on $x$ resulting in a final output $y$ will use the chain rule to compute the differential associated with each operation and multiply these differentials together to get the derivative of $y$ with respect to $x$.
+PyTorch uses the concept of [automatic differentiation](https://arxiv.org/abs/1502.05767) to calculate the derivative. Instead of computing the derivative as we would by hand, the program uses a computational graph and the mechanistic application of the chain rule. For example, a computational graph with several operations on $x$ resulting in a final output $y$ will use the chain rule to compute the differential associated with each operation and multiply these differentials together to get the derivative of $y$ with respect to $x$.
 
 +++
 
 ## Optimizing a Function with Gradient Descent
 
-If we were on the side of a hill in the dark and we wanted to get down to the bottom of a valley, how would we do it?
+If we were on the side of a hill in the dark and we wanted to get down to the bottom of a valley, how might we do it?
 
-We wouldn't be able to see all the way to the bottom of the valley, but we could feel which way is down based on the incline of where we are standing. We would take steps in the downward direction and we'd know when to stop when the ground felt flat.
+We can't see all the way to the bottom of the valley, but we can feel which way is down based on the incline of where we are standing. We might take steps in the downward direction and we'd know when to stop when the ground finally felt flat. We would also need to consider how large our steps should be. If we take very small steps, it will take us a longer time than if we take larger steps. However, if we take large leaps, we might completely miss the flat part of the valley, and jump straight across to the other side of the valley.
 
-Before we leap, though, we need to consider how large our steps should be. If we take very small steps, it will take us a longer time than if we take larger steps. However, if we take large leaps, we might completely miss the flat part of the valley, and jump straight across to the other side of the valley.
-
-We can look at the gradient descent from a more mathematical lense by looking at the graph $y = x^2$:
+Now let's take a more quantitative look at the gradient descent using the function $y = x^2$:
 
 ```{code-cell}
 def y(x):
@@ -115,7 +112,7 @@ def y(x):
 
 We will choose some arbitrary place to start on the left side of the hill and use PyTorch to calculate the tangent.
 
-Note that Matplotlib requires numpy arrays instead of PyTorch tensors, so in the following code you might see the occasional ``detach().numpy()`` or ``.item()`` calls, which are used to convert PyTorch tensors to numpy arrays and scalar values, respectively. When it comes time to use MPoL for RML imaging, or any large production run, we'll try to keep the calculations native to PyTorch tensors as long as possible, to avoid the overhead of converting types.
+Note that the plotting library Matplotlib requires numpy arrays instead of PyTorch tensors, so in the following code you might see the occasional ``detach().numpy()`` or ``.item()`` calls, which are used to convert PyTorch tensors to numpy arrays and scalar values, respectively, for plotting. When it comes time to use MPoL for RML imaging, or any large production run, we'll try to keep the calculations native to PyTorch tensors as long as possible, to avoid the overhead of converting types.
 
 ```{code-cell}
 x = torch.linspace(-5, 5, 100)
@@ -143,16 +140,14 @@ plt.ylim(ymin=0, ymax=25)
 plt.show()
 ```
 
-We see we need to go to the right to go down toward the minimum. For a multivariate function, the gradient will point in the direction of the steepest downward slope. When we take steps, we find the x coordinate of our new location by this equation:
+We see we need to go to the right to go down toward the minimum. For a multivariate function, the gradient will be a vector pointing in the direction of the steepest downward slope. When we take steps, we find the x coordinate of our new location by:
 
 $x_\mathrm{new} = x_\mathrm{current} - \nabla y(x_\mathrm{current}) * (\mathrm{step\,size})$
 
 where:
 
 - $x_\mathrm{current}$ is our current x value
-
 - $\nabla y(x_\mathrm{current})$ is the gradient at our current point
-
 - $(\mathrm{step\,size})$ is a value we choose that scales our steps
 
 We will choose ``step_size = 0.1``:
@@ -206,7 +201,7 @@ plt.ylabel(r"$y$")
 plt.show()
 ```
 
-The gradient at our new point (shown in orange) is still not close to zero, meaning we haven't reached the minimum. We continue this process of checking if the gradient is nearly zero, and taking a step in the direction of steepest descent until we reach the bottom of the valley. We'll say we've reached the bottom of the valley when the absolute value of the gradient is $<0.1$:
+The gradient at our new point (shown in orange) is still not close to zero, meaning we haven't reached the minimum. We'll continue this process of checking if the gradient is nearly zero, and take a step in the direction of steepest descent until we reach the bottom of the valley. We'll say we've reached the bottom of the valley when the absolute value of the gradient is $<0.1$:
 
 ```{code-cell}
 x = torch.linspace(-5, 5, 100)
@@ -288,7 +283,7 @@ y_large_coords.append(y_large_step_new.item())
 
 plt.scatter(x_large_coords, y_large_coords) # plot points showing steps
 plt.scatter(x_large_coords[-1], y_large_coords[-1], c="C1")
-
+plt.text(9, 70, "step 1", va="center")
 
 plt.xlim(xmin=-20, xmax=20)
 plt.ylim(ymin=-1, ymax=260)
@@ -297,7 +292,7 @@ plt.ylabel(r"$y$")
 plt.show()
 ```
 
-*Note the change in scale.* With only one step, we already see that we stepped *right over* the minimum to somewhere far up the other side of the valley (orange point)! This is not good. If we kept iterating with the same learning rate, we'd find that the optimization process diverges and the step sizes start blowing up. This is why it is important to pick the proper step size by setting the learning rate appropriately. Steps that are too small take a long time while steps that are too large render the optimization process invalid. In this case, a reasonable choice appears to be ``step size = 0.6``, which would have reached pretty close to the minimum after only 3 steps.
+*Note the change in scale!* With only one step, we already see that we stepped *right over* the minimum to somewhere far up the other side of the valley (orange point)! This is not good. If we kept iterating with the same learning rate, we'd find that the optimization process diverges and the step sizes start blowing up. This is why it is important to pick the proper step size by setting the learning rate appropriately. Steps that are too small take a long time while steps that are too large render the optimization process invalid. In this case, a reasonable choice appears to be ``step size = 0.6``, which would have reached pretty close to the minimum after only 3 steps.
 
 To sum up, optimizing a function with gradient descent consists of
docs/ci-tutorials/crossvalidation.md  (+2 −1)

@@ -44,6 +44,7 @@ from mpol import (
     datasets,
     gridding,
     images,
+    fourier,
     losses,
     precomposed,
 )
@@ -175,7 +176,7 @@ k_fold_datasets = [(train, test) for (train, test) in cv]
 ```
 
 ```{code-cell}
-flayer = images.FourierCube(coords=coords)
+flayer = fourier.FourierCube(coords=coords)
 flayer.forward(torch.zeros(dset.nchan, coords.npix, coords.npix))
 ```
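The cell above moves `FourierCube` from `mpol.images` to the new `mpol.fourier` module. Conceptually, such a layer maps an image cube of shape `(nchan, npix, npix)` to gridded visibilities with a per-channel 2-D FFT. The following is a hypothetical numpy analogue, not MPoL's implementation (whose shift, normalization, and pixel-scaling conventions may differ):

```python
import numpy as np

def fourier_cube(image_cube):
    """Per-channel 2-D FFT of an (nchan, npix, npix) image cube.

    Hypothetical stand-in for a FourierCube-style layer; a real
    implementation also handles pixel-area scaling and axis conventions.
    """
    shifted = np.fft.ifftshift(image_cube, axes=(-2, -1))
    vis = np.fft.fft2(shifted, axes=(-2, -1))
    return np.fft.fftshift(vis, axes=(-2, -1))

cube = np.zeros((3, 8, 8))  # mirrors torch.zeros(dset.nchan, coords.npix, coords.npix)
vis = fourier_cube(cube)    # an all-zero image transforms to all-zero visibilities
```

The zero-image call mirrors the tutorial's `flayer.forward(torch.zeros(...))` smoke test: the output keeps the input shape, one visibility grid per channel.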
