Commit a961286

pval tests and multi-handling, multi-handling w. errorbar output
Some edge-case fixes:
- pval did not handle multi-dimensional returns from compfunctions well. This is now fixed: the mean is taken along axis 0 only.
- errorbar output for multi-dimensional statfunction output had too many length-1 axes. This should now be fixed.
- Documentation for pval had some errors.
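To illustrate the first fix, here is a minimal NumPy-only sketch of the difference between averaging over everything and averaging along axis 0. It does not call the library itself; the array `stat` and its sizes are made up for illustration, standing in for the per-resample statistics that pval collects.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical bootstrap statistics: 1000 resamples, each yielding 3 values.
stat = rng.normal(loc=[-1.0, 0.0, 1.0], scale=1.0, size=(1000, 3))

# Default-style criterion: elementwise test for > 0, applied per resample.
pval_stat = [s > 0 for s in stat]

# Averaging over every element collapses the result to a single scalar.
old_result = np.mean(pval_stat)

# Averaging along axis 0 only gives one proportion per statistic element.
new_result = np.mean(pval_stat, axis=0)

print(np.ndim(old_result), new_result.shape)
```

With a scalar-valued statfunction the two forms agree, so only multi-dimensional returns are affected.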
1 parent 242d2ca commit a961286

7 files changed, +130 -72 lines changed

.github/workflows/python-package.yml

+1 -2
@@ -7,8 +7,7 @@ on:
   [push, pull_request]

 jobs:
-  build:
-
+  tests:
     runs-on: ${{ matrix.os }}
     strategy:
       fail-fast: false

LICENSE

+2 -2
@@ -1,7 +1,7 @@
 BSD 3-Clause License

-Copyright (c) 2021, Constantine Evans / Evans Foundation for Molecular Medicine,
-and other contributors.
+Copyright (c) 2021, Constantine Evans, the Evans Foundation for Molecular
+Medicine, and other contributors.

 All rights reserved.

setup.cfg

+8 -7
@@ -3,7 +3,7 @@ name = scikits.bootstrap
 description = Bootstrap confidence interval estimation routines for Numpy/Scipy/Pandas
 license = BSD 3-Clause
 author = Constantine Evans
-author_email = cevans@evanslabs.org
+author_email = cevans@costinet.org

 long_description = file: README.md
 long_description_content_type = text/markdown; charset=UTF-8
@@ -12,14 +12,18 @@ platforms = any

 version = 1.1.0.dev4

-classifiers =
+classifiers =
     Development Status :: 5 - Production/Stable
     Environment :: Console
     Intended Audience :: Developers
     Intended Audience :: Science/Research
     License :: OSI Approved :: BSD License
     Programming Language :: Python
     Programming Language :: Python :: 3
+    Programming Language :: Python :: 3.6
+    Programming Language :: Python :: 3.7
+    Programming Language :: Python :: 3.8
+    Programming Language :: Python :: 3.9
     Programming Language :: Python :: Implementation :: PyPy
     Programming Language :: Python :: Implementation :: CPython
     Topic :: Scientific/Engineering
@@ -31,7 +35,7 @@ include_package_data = True
 package_dir =
     =src
 packages = find_namespace:
-install_requires =
+install_requires =
     numpy
     pyerf
     typing_extensions; python_version<"3.8"
@@ -45,11 +49,8 @@ exclude =
     tests

 [flake8]
-# Some sane defaults for the code style checker flake8
 max_line_length = 127
-extend_ignore = E203, W503
-# ^ Black-compatible
-# E203 and W503 have edge cases handled by black
+extend_ignore = E203, W503, F403, F405
 exclude =
     .tox
     build

src/scikits/bootstrap/bootstrap.py

+22 -34
@@ -18,13 +18,11 @@
         Tuple,
         Iterable,
         Iterator,
-        Protocol,
     )
 else:
     from typing_extensions import Literal
     from typing import Union, Iterable, Any, Optional, Iterator, Callable, Tuple
 import warnings
-from numpy.random import randint
 import numpy as np
 import pyerf

@@ -81,7 +79,7 @@ class InstabilityWarning(UserWarning):
 # def __call__(self, *args: Any, weights: np.ndarray = None) -> Any:
 # ...

-DataType = Union[Tuple[np.ndarray, ...], np.ndarray]
+DataType = Union[Tuple[Union[np.ndarray, Sequence[Any]], ...], np.ndarray, "pd.Series"]
 SeedType = Union[
     None,
     int,
@@ -94,7 +92,7 @@ class InstabilityWarning(UserWarning):

 @overload
 def ci(
-    data: Union[Tuple[Union[np.ndarray, Sequence[Any]], ...], np.ndarray, "pd.Series"],
+    data: DataType,
     statfunction: Optional[StatFunctionWithWeights] = None,
     alpha: Union[float, Iterable[float]] = 0.05,
     n_samples: int = 10000,
@@ -112,7 +110,7 @@ def ci(

 @overload
 def ci(
-    data: Union[Tuple[Union[np.ndarray, Sequence[Any]], ...], np.ndarray, "pd.Series"],
+    data: DataType,
     statfunction: Optional[StatFunctionWithWeights] = None,
     alpha: Union[float, Iterable[float]] = 0.05,
     n_samples: int = 10000,
@@ -130,7 +128,7 @@ def ci(

 @overload
 def ci(
-    data: Union[Tuple[Union[np.ndarray, Sequence[Any]], ...], np.ndarray, "pd.Series"],
+    data: DataType,
     statfunction: Optional[StatFunction] = None,
     alpha: Union[float, Iterable[float]] = 0.05,
     n_samples: int = 10000,
@@ -148,7 +146,7 @@ def ci(

 @overload
 def ci(
-    data: Union[Tuple[Union[np.ndarray, Sequence[Any]], ...], np.ndarray, "pd.Series"],
+    data: DataType,
     statfunction: Optional[StatFunction] = None,
     alpha: Union[float, Iterable[float]] = 0.05,
     n_samples: int = 10000,
@@ -164,7 +162,7 @@ def ci(


 def ci(
-    data: Union[Tuple[Union[np.ndarray, Sequence[Any]], ...], np.ndarray, "pd.Series"],
+    data: DataType,
     statfunction: Optional[Union[StatFunctionWithWeights, StatFunction]] = None,
     alpha: Union[float, Iterable[float]] = 0.05,
     n_samples: int = 10000,
@@ -205,7 +203,7 @@ def ci(
         intervals. If it is an iterable, alpha is assumed to be an iterable of
         each desired percentile.
     n_samples: float, optional
-        The number of bootstrap samples to use (default=10000)
+        The number of bootstrap samples to use (default=10_000)
     method: string, optional
         The method to use: one of 'pi', 'bca', or 'abc' (default='bca')
     output: string, optional
@@ -219,8 +217,8 @@ def ci(
         If False, assume data is a single array. If True or "paired",
         assume data is a tuple/other iterable of arrays of the same length that
         should be sampled together (eg, values in each array at a particular index are
-        linked in some way). If None, decide based on whether the data is an
-        actual tuple. If "independent", sample the tuple of arrays separately.
+        linked in some way). If None, "paired" is used if data is an actual
+        tuple, and False otherwise. If "independent", sample the tuple of arrays separately.
         For True/"paired", each array must be the same length. (default=None)

         An example of a situation where True/"paired" might be useful is if you have
@@ -431,11 +429,11 @@ def ci(
             out = stat[(nvals, np.indices(nvals.shape)[1:].squeeze())]
     elif output == "errorbar":
         if nvals.ndim == 1:
-            out = abs(statfunction(*tdata) - stat[nvals])[np.newaxis].T
+            out = np.abs(statfunction(*tdata) - stat[nvals])[np.newaxis].T
         else:
-            out = abs(
+            out = np.abs(
                 statfunction(*tdata) - stat[(nvals, np.indices(nvals.shape)[1:])]
-            )[np.newaxis].T
+            ).T.squeeze()
     else:
         raise ValueError("Output option {0} is not supported.".format(output))
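The shape change in the multi-dimensional errorbar branch can be checked with a small NumPy-only sketch. It reuses the indexing expressions from the hunk above, but the array sizes, the index values in `nvals`, and the names `point` and `picked` are assumptions made for illustration and are not taken from the library.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, k = 1000, 3

# Stand-in for the sorted bootstrap statistics: one column per output element.
stat = np.sort(rng.normal(size=(n_samples, k)), axis=0)

# Stand-in for statfunction(*tdata): a length-k point estimate.
point = stat.mean(axis=0)

# Hypothetical low/high percentile indices, one pair per output element.
nvals = np.array([[25] * k, [975] * k])  # shape (2, k)

# Same advanced-indexing expression as in the diff; the result has shape (1, 2, k).
picked = stat[(nvals, np.indices(nvals.shape)[1:])]
diff = np.abs(point - picked)

old = diff[np.newaxis].T  # previous expression: two trailing length-1 axes
new = diff.T.squeeze()    # new expression: length-1 axes dropped

print(old.shape, new.shape)
```

Note that `squeeze()` removes every length-1 axis, which is fine here because this branch is only reached when `nvals` is not one-dimensional.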

@@ -460,7 +458,7 @@ def _ci_abc(
     n = tdata[0].shape[0] * 1.0
     nn = tdata[0].shape[0]

-    I = np.identity(nn)
+    Imatrix = np.identity(nn)
     ep = epsilon / n * 1.0
     p0 = np.repeat(1.0 / n, nn)

@@ -469,7 +467,7 @@ def _ci_abc(
     except TypeError as e:
         raise TypeError("statfunction does not accept correct arguments for ABC") from e

-    di_full = I - p0
+    di_full = Imatrix - p0
     tp = np.fromiter(
         (statfunction(*tdata, weights=p0 + ep * di) for di in di_full), dtype=float
     )
@@ -716,15 +714,15 @@ def bootstrap_indices_moving_block(
 def pval(
     data: DataType,
     statfunction: StatFunction = np.average,
-    compfunction: Callable[[Any], bool] = lambda s: cast(bool, s > 0),
+    compfunction: Callable[[Any], Any] = lambda s: cast(bool, s > 0),
     n_samples: int = 10000,
     multi: Optional[bool] = None,
     seed: SeedType = None,
-) -> "np.number[Any]":
+) -> "Union[np.number[Any], np.ndarray]":
     """
     Given a set of data ``data``, a statistics function ``statfunction`` that
-    applies to that data, and the criteriafunction ``compfunction``, computes the
-    bootstrap probability thatthe statistics function ``statfunction`` on that data
+    applies to that data, and the criteria function ``compfunction``, computes the
+    bootstrap probability that the statistics function ``statfunction`` on that data
     satisfies the the criteria function ``compfunction``. Data points are assumed to
     be delineated by axis 0.
@@ -742,9 +740,10 @@ def pval(
         to these samples individually.
     compfunction: function (stat) -> True or False
         This function should accept result of the statfunction computed on the samples of
-        data from ``data``. It is applied to these results individually.
+        data from ``data``. It is applied to these results individually. The default
+        tests for each element of statfunction output being > 0.
     n_samples: float, optional
-        The number of bootstrap samples to use (default=10000)
+        The number of bootstrap samples to use (default=10_000).
     multi: boolean, optional
         If False, assume data is a single array. If True, assume data is a tuple/other
         iterable of arrays of the same length that should be sampled together. If None,
@@ -756,17 +755,6 @@ def pval(
         The probability that the statistics defined by the statfunction satisfies the
         criteria defined by the compfunction.

-    Examples
-    --------
-    To calculate the confidence intervals for the mean of some numbers:
-
-    >> boot.ci( np.randn(100), np.average )
-
-    Given some data points in arrays x and y calculate the confidence intervals
-    for all linear regression coefficients simultaneously:
-
-    >> boot.ci( (x,y), scipy.stats.linregress )
-
     References
     ----------
     Efron, An Introduction to the Bootstrap. Chapman & Hall 1993
@@ -796,4 +784,4 @@ def pval(

     pval_stat = [compfunction(s) for s in stat]
     # print pval_stat
-    return np.mean(pval_stat)
+    return cast("Union[np.number[Any], np.ndarray]", np.mean(pval_stat, axis=0))

src/scikits/bootstrap/py.typed

Whitespace-only changes.
