
Commit 5a8fc89

authored Jan 18, 2022
Refactor regression code and docs (for #109) (#170)
* refactored regression code and docs
* added int to float data conversion, and methods for vector-type regressors
* added isotonic regression
* added docs for regression
1 parent 1a871a3 commit 5a8fc89

File tree

docs/make.jl
docs/src/index.md
docs/src/lreg.md
src/MultivariateStats.jl
src/lreg.jl
test/lreg.jl

6 files changed: +292 -10 lines changed


‎docs/make.jl

+5 -1

@@ -8,7 +8,11 @@ makedocs(
     sitename = "MultivariateStats.jl",
     modules = [MultivariateStats],
     pages = ["Home"=>"index.md",
-             "whiten.md", "lda.md", "pca.md", "mds.md",
+             "whiten.md",
+             "lreg.md",
+             "lda.md",
+             "pca.md",
+             "mds.md",
              "Development"=>"api.md"]
 )

‎docs/src/index.md

+1 -1

@@ -11,7 +11,7 @@ end
 [MultivariateStats.jl](https://github.com/JuliaStats/MultivariateStats.jl) is a Julia package for multivariate statistical analysis. It provides a rich set of useful analysis techniques, such as PCA, CCA, LDA, ICA, etc.
 
 ```@contents
-Pages = ["whiten.md", "lda.md", "pca.md", "mds.md", "api.md"]
+Pages = ["whiten.md", "lreg.md", "lda.md", "pca.md", "mds.md", "api.md"]
 Depth = 2
 ```

‎docs/src/lreg.md

+167 (new file)

# Regression

The package provides functions to perform *Linear Least Square*, *Ridge*, and *Isotonic Regression*.

## Examples

```@setup REGex
using Plots
gr(fmt=:svg)
```

Performing [`llsq`](@ref) and [`isotonic`](@ref) regression on the *cars* data set:

```@example REGex
using MultivariateStats, RDatasets, Plots

# load cars dataset
cars = dataset("datasets", "cars")

# calculate regression models
a = llsq(cars[!,:Speed], cars[!,:Dist])
b = isotonic(cars[!,:Speed], cars[!,:Dist])

# plot results
p = scatter(cars[!,:Speed], cars[!,:Dist], xlab="Speed", ylab="Distance",
            leg=:topleft, lab="data")
let xs = cars[!,:Speed]
    plot!(p, xs, map(x->a[1]*x+a[2], xs), lab="llsq")
    plot!(p, xs, b, lab="isotonic")
end
```

For a single response vector `y` (without using bias):

```julia
using MultivariateStats, Statistics   # Statistics provides `mean`

# prepare data
X = rand(1000, 3)               # feature matrix
a0 = rand(3)                    # ground truths
y = X * a0 + 0.1 * randn(1000)  # generate response

# solve using llsq
a = llsq(X, y; bias=false)

# do prediction
yp = X * a

# measure the error
rmse = sqrt(mean(abs2.(y - yp)))
print("rmse = $rmse")
```

For a single response vector `y` (using bias):

```julia
# prepare data
X = rand(1000, 3)
a0, b0 = rand(3), rand()
y = X * a0 .+ b0 .+ 0.1 * randn(1000)

# solve using llsq
sol = llsq(X, y)

# extract results
a, b = sol[1:end-1], sol[end]

# do prediction
yp = X * a .+ b'
```

For a matrix `X` of column-stored regressors and a matrix `Y` whose columns are the corresponding dependent variables:

```julia
# prepare data
X = rand(3, 1000)
A0, b0 = rand(3, 5), rand(1, 5)
Y = (X' * A0 .+ b0) + 0.1 * randn(1000, 5)

# solve using llsq
sol = llsq(X, Y, dims=2)

# extract results
A, b = sol[1:end-1,:], sol[end,:]

# do prediction
Yp = X'*A .+ b'
```

## Linear Least Square

[Linear Least Square](http://en.wikipedia.org/wiki/Linear_least_squares_(mathematics))
finds the linear combination(s) of the given variables that best fit the responses by
minimizing the squared error between them.
This can be formulated as the following optimization problem:

```math
\mathop{\mathrm{minimize}}_{(\mathbf{a}, b)} \
\frac{1}{2} \|\mathbf{y} - (\mathbf{X} \mathbf{a} + b)\|^2
```

Sometimes the coefficient matrix is given in a transposed form, in which case
the optimization is modified as:

```math
\mathop{\mathrm{minimize}}_{(\mathbf{a}, b)} \
\frac{1}{2} \|\mathbf{y} - (\mathbf{X}^T \mathbf{a} + b)\|^2
```

The package provides the following function to solve the above problems:

```@docs
llsq
```
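
When ``\mathbf{X}`` has full column rank and the bias term is ignored, the minimizer satisfies the normal equations ``\mathbf{X}^T \mathbf{X} \mathbf{a} = \mathbf{X}^T \mathbf{y}``. A rough sketch of that textbook solution, shown only for comparison (it is not necessarily how `llsq` is implemented internally):

```julia
using LinearAlgebra

# ordinary least squares via the normal equations (no bias term),
# for comparison with llsq(X, y; bias=false)
X = rand(100, 3)
y = rand(100)
a = (X'X) \ (X'y)   # solves XᵀX a = Xᵀy
```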

## Ridge Regression

Compared to linear least squares, [Ridge Regression](http://en.wikipedia.org/wiki/Tikhonov_regularization)
uses an additional quadratic term to regularize the problem:

```math
\mathop{\mathrm{minimize}}_{(\mathbf{a}, b)} \
\frac{1}{2} \|\mathbf{y} - (\mathbf{X} \mathbf{a} + b)\|^2 +
\frac{1}{2} \mathbf{a}^T \mathbf{Q} \mathbf{a}
```

The transposed form:

```math
\mathop{\mathrm{minimize}}_{(\mathbf{a}, b)} \
\frac{1}{2} \|\mathbf{y} - (\mathbf{X}^T \mathbf{a} + b)\|^2 +
\frac{1}{2} \mathbf{a}^T \mathbf{Q} \mathbf{a}
```

The package provides the following function to solve the above problems:

```@docs
ridge
```
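
For instance, a minimal sketch using synthetic data and a scalar regularizer (vector- and matrix-valued `r` are passed the same way):

```julia
using MultivariateStats

# ridge regression with a scalar regularizer, i.e. Q = 0.1*I
X = rand(1000, 3)
a0, b0 = rand(3), rand()
y = X * a0 .+ b0 .+ 0.1 * randn(1000)

sol = ridge(X, y, 0.1)
a, b = sol[1:end-1], sol[end]   # coefficients and bias, as with llsq
yp = X * a .+ b                 # predictions
```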

## Isotonic Regression

[Isotonic regression](https://en.wikipedia.org/wiki/Isotonic_regression) or
monotonic regression fits a sequence of observations with a fitted line that is
non-decreasing (or non-increasing) everywhere. The problem is defined as a weighted
least-squares fit ``{\hat {y}}_{i} \approx y_{i}`` for all ``i``, subject to
the constraint that ``{\hat {y}}_{i} \leq {\hat {y}}_{j}`` whenever
``x_{i} \leq x_{j}``.
This gives the following quadratic program:

```math
\min \sum_{i=1}^{n} w_{i}({\hat {y}}_{i}-y_{i})^{2}
\text{ subject to } {\hat {y}}_{i} \leq {\hat {y}}_{j}
\text{ for all } (i,j) \in E
```

where ``E=\{(i,j):x_{i}\leq x_{j}\}`` specifies the partial ordering of
the observed inputs ``x_{i}``.

The package provides the following function to solve the above problem:

```@docs
isotonic
```
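
A small sketch of the interface (the numbers are illustrative; the optional third argument supplies the weights ``w_i`` from the objective above):

```julia
using MultivariateStats

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 3.0, 2.0, 2.0, 5.0]

yhat = isotonic(x, y)        # non-decreasing fit; the violating middle values are pooled
w = [1.0, 1.0, 2.0, 1.0, 1.0]
yhat_w = isotonic(x, y, w)   # weighted fit
```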

---

### References

[^1]: Best, M.J., Chakravarti, N. Active set algorithms for isotonic regression; A unifying framework. Mathematical Programming 47, 425–439 (1990).

‎src/MultivariateStats.jl

+1

@@ -26,6 +26,7 @@ module MultivariateStats
     # lreg
     llsq,       # Linear Least Square regression
     ridge,      # Ridge regression
+    isotonic,   # Isotonic regression
 
     # whiten
     Whitening,  # Type: Whitening transformation

‎src/lreg.jl

+100 -7

@@ -1,6 +1,6 @@
-# Ridge Regression (Tikhonov regularization)
+# Regression
 
-#### auxiliary
+## Auxiliary
 
 function lreg_chkdims(X::AbstractMatrix, Y::AbstractVecOrMat, trans::Bool)
     mX, nX = size(X)
@@ -17,8 +17,23 @@ _vaug(X::AbstractMatrix{T}) where T = vcat(X, ones(T, 1, size(X,2)))::Matrix{T}
 _haug(X::AbstractMatrix{T}) where T = hcat(X, ones(T, size(X,1), 1))::Matrix{T}
 
 
-## linear least square
+## Linear Least Square Regression
 
+
+"""
+    llsq(X, y; ...)
+
+Solve the linear least square problem.
+
+Here, `y` can be either a vector, or a matrix where each column is a response vector.
+
+This function accepts two keyword arguments:
+
+- `dims`: whether input observations are stored as rows (`1`) or columns (`2`). (default is `1`)
+- `bias`: whether to include the bias term `b`. (default is `true`)
+
+The function returns the solution `a`. In particular, when `y` is a vector (matrix), `a` is also a vector (matrix). If `bias` is true, then the returned array is augmented as `[a; b]`.
+"""
 function llsq(X::AbstractMatrix{T}, Y::AbstractVecOrMat{T};
               trans::Bool=false, bias::Bool=true,
               dims::Union{Integer,Nothing}=nothing) where {T<:Real}
@@ -37,9 +52,32 @@ function llsq(X::AbstractMatrix{T}, Y::AbstractVecOrMat{T};
     end
     _ridge(X, Y, zero(T), dims == 2, bias)
 end
+llsq(x::AbstractVector{T}, y::AbstractVector{T}; kwargs...) where {T<:Real} =
+    llsq(x[:,:], y; dims=1, kwargs...)
+
+## Ridge Regression (Tikhonov regularization)
+
+"""
+
+    ridge(X, y, r; ...)
+
+Solve the ridge regression problem.
+
+Here, ``y`` can be either a vector, or a matrix where each column is a response vector.
+
+The argument `r` gives the quadratic regularization matrix ``Q``, which can be in either of the following forms:
+
+- `r` is a real scalar, then ``Q`` is considered to be `r * eye(n)`, where `n` is the dimension of `a`.
+- `r` is a real vector, then ``Q`` is considered to be `diagm(r)`.
+- `r` is a real symmetric matrix, then ``Q`` is simply considered to be `r`.
 
-## ridge regression
+This function accepts two keyword arguments:
 
+- `dims`: whether input observations are stored as rows (`1`) or columns (`2`). (default is `1`)
+- `bias`: whether to include the bias term `b`. (default is `true`)
+
+The function returns the solution `a`. In particular, when `y` is a vector (matrix), `a` is also a vector (matrix). If `bias` is true, then the returned array is augmented as `[a; b]`.
+"""
 function ridge(X::AbstractMatrix{T}, Y::AbstractVecOrMat{T}, r::Union{Real, AbstractVecOrMat};
                trans::Bool=false, bias::Bool=true,
                dims::Union{Integer,Nothing}=nothing) where {T<:Real}
@@ -50,19 +88,24 @@ function ridge(X::AbstractMatrix{T}, Y::AbstractVecOrMat{T}, r::Union{Real, Abst
     d = lreg_chkdims(X, Y, dims == 2)
     if isa(r, Real)
         r >= zero(r) || error("r must be non-negative.")
-        r = convert(T, r)
+        r = convert(T <: AbstractFloat ? T : Float64, r)
     elseif isa(r, AbstractVector)
         length(r) == d || throw(DimensionMismatch("Incorrect length of r."))
     elseif isa(r, AbstractMatrix)
         size(r) == (d, d) || throw(DimensionMismatch("Incorrect size of r."))
     end
     _ridge(X, Y, r, dims == 2, bias)
 end
+ridge(x::AbstractVector{T}, y::AbstractVector{T}, r::Union{Real, AbstractVecOrMat};
+    kwargs...) where {T<:Real} = ridge(x[:,:], y, r; dims=1, kwargs...)
 
-## implementation
+### implementation
 
-function _ridge(X::AbstractMatrix{T}, Y::AbstractVecOrMat{T},
+function _ridge(_X::AbstractMatrix{T}, _Y::AbstractVecOrMat{T},
                 r::Union{Real, AbstractVecOrMat}, trans::Bool, bias::Bool) where {T<:Real}
+    # convert integer data to Float64
+    X = T <: AbstractFloat ? _X : convert(Array{Float64}, _X)
+    Y = T <: AbstractFloat ? _Y : convert(Array{Float64}, _Y)
     if bias
         if trans
             X_ = _vaug(X)
@@ -108,3 +151,53 @@ function _ridge_reg!(Q::AbstractMatrix, r::AbstractMatrix, bias::Bool)
     end
     return Q
 end
+
+## Isotonic Regression
+
+"""
+    isotonic(x, y[, w])
+
+Solve the isotonic regression problem using the pool adjacent violators algorithm[^1].
+
+Here `x` is the regressor vector, `y` is the response vector, and `w` is an optional
+vector of weights.
+
+The function returns a prediction vector of the same size as the regressor vector `x`.
+"""
+function isotonic(x::AbstractVector{T}, y::AbstractVector{T},
+                  w::AbstractVector{T} = ones(T, length(y))) where {T<:Real}
+    n = length(x)
+    n == length(y) || throw(DimensionMismatch("Dimensions of x and y mismatch."))
+
+    idx = sortperm(x)
+
+    # PAVA (pool adjacent violators algorithm)
+    J = map(i->(Float64(y[i]*w[i]), w[i], [i]), idx)
+    i = 1
+    B₀ = J[i]
+    while i < length(J)
+        B₊ = J[i+1]
+        if B₀[1] <= B₊[1]       # step 1: order preserved, advance
+            B₀ = B₊
+            i += 1
+        else                    # step 2: violation, pool the two blocks
+            ww = B₀[2] + B₊[2]
+            J[i] = ((B₀[1]*B₀[2]+B₊[1]*B₊[2])/ww, ww, append!(B₀[3], B₊[3]))
+            deleteat!(J, i+1)
+            B₀ = J[i]
+            while i > 1         # step 2.1: back-merge while the left neighbor violates
+                B₋ = J[i-1]
+                if B₀[1] <= B₋[1]
+                    ww = B₀[2] + B₋[2]
+                    J[i] = ((B₀[1]*B₀[2]+B₋[1]*B₋[2])/ww, ww, append!(B₀[3], B₋[3]))
+                    deleteat!(J, i-1)
+                    i -= 1
+                else
+                    break
+                end
+            end
+        end
+    end
+    [y for (y,w,ii) in J for i in sort(ii)]   # expand block values back to one prediction per observation
+end
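
The integer-to-float conversion added in `_ridge` above means that `llsq` and `ridge` also accept integer-typed data; a rough sketch of the effect (the data here is illustrative, mirroring the new tests below):

```julia
using MultivariateStats

# integer inputs are converted to Float64 internally,
# so the returned coefficients are floating point
X = rand(1:10, 50, 3)   # Matrix{Int}
y = rand(1:10, 50)      # Vector{Int}

a = llsq(X, y)          # eltype(a) == Float64
r = ridge(X, y, 1)      # an integer regularizer is converted as well
```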

‎test/lreg.jl

+18 -1

@@ -3,7 +3,7 @@ using Test
 using LinearAlgebra
 using StableRNGs
 
-@testset "Ridge Regression" begin
+@testset "Regression" begin
 
     rng = StableRNG(34568)
 
@@ -184,4 +184,21 @@ using StableRNGs
     aa = ridge(Xt, y1, r; dims=2, bias=true)
     @test aa ≈ Aa[:,1]
 
+
+    ## isotonic
+    xx = [3.3, 3.3, 3.3, 6, 7.5, 7.5]
+    yy = [4, 5, 1, 6, 8, 7.0]
+    a = isotonic(xx, yy)
+    b = [[1,2,3],[4],[5,6]]
+    @test a ≈ xx atol=0.1
+
+    ## using various types
+    @testset for (T, TR) in [(Int,Float64), (Float32, Float32)]
+        X, y = rand(T, 10, 3), rand(T, 10)
+        a = llsq(X, y)
+        @test eltype(a) == TR
+        a = ridge(X, y, one(T))
+        @test eltype(a) == TR
+    end
+
 end
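
These additions are exercised by the package's test suite; assuming `test/runtests.jl` includes this file (the usual layout), a quick way to run them locally is:

```julia
using Pkg
Pkg.test("MultivariateStats")
```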
