Commit 14a6549

Python SDK version 1.1 (#263)
* style: fix linter errors in tests
* docs: correct inline documentation for MetricsApiConfig class
* refactor: move METRICS_API to MetricsApiConfig so that it can be overridden during SDK development
* feat: add a logger for debugging
* refactor: move submission code to its own file
* feat: add timeout parameter
* fix: remove unnecessary debugging statements
* feat: when the process is exiting, send any queued requests and wait for all outstanding network calls to complete
* refactor: move ALLOWED_HTTP_HOSTS enforcement into the core Metrics processing object, so it can be shared across multiple implementations (coming soon)
* docs: clean up README.md and make sure all parameters are documented
* feat: make it easier for users to override the new `timeout` parameter
* release: bump version to 1.1.0
* Ignore .envrc files (sometimes used in Python development)
* style: proper formatting with black
1 parent 097fa6e commit 14a6549

11 files changed: +160 -74 lines changed

.gitignore

+1
Original file line numberDiff line numberDiff line change
@@ -1,3 +1,4 @@
+.envrc
 .vscode/
 node_modules/
 packages/*/node_modules/

packages/python/README.md

+15-15
@@ -15,7 +15,7 @@ pip install readme-metrics

 ## Usage

-Just include the `MetricsMiddleware` into your API!
+Just include the `MetricsMiddleware` in any WSGI app!

 ```python
 from readme_metrics import MetricsApiConfig, MetricsMiddleware
@@ -25,8 +25,8 @@ app = Flask(__name__)
 app.wsgi_app = MetricsMiddleware(
     app.wsgi_app,
     MetricsApiConfig(
-        '<<apiKey>>',
-        lambda req: {
+        api_key='<<your-readme-api-key>>',
+        grouping_function=lambda req: {
             'api_key': 'unique api_key of the user',
             'label': 'label for us to show for this user (ie email, project name, user name, etc)',
             'email': 'email address for user'
@@ -39,25 +39,25 @@ app.wsgi_app = MetricsMiddleware(

 There are a few options you can pass in to change how the logs are sent to ReadMe. These can be passed in `MetricsApiConfig`.

-Ex)
-
 ```python
 MetricsApiConfig(
-    '<<apiKey>>',
-    lambda req: {
+    api_key='<<your-readme-api-key>>',
+    grouping_function=lambda req: {
         'api_key': 'unique api_key of the user',
         'label': 'label for us to show for this user (ie email, project name, user name, etc)',
         'email': 'email address for user'
     },
     buffer_length=1,
-    denylist=['credit_card'] # Prevents credit_card in the request from being sent to readme
+    denylist=['password'] # Prevents a request or response's "password" field from being sent to ReadMe
 )
 ```

-| Option | Use |
-| :----------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| development_mode | **default: false** If true, the log will be separate from normal production logs. This is great for separating staging or test data from data coming from customers |
-| denylist | **optional** An array of keys from your API requests and responses headers and bodies that you wish to denylist from sending to ReadMe.<br /><br />If you configure a denylist, it will override any allowlist configuration. |
-| allowlist | **optional** An array of keys from your API requests and responses headers and bodies that you only wish to send to ReadMe. |
-| buffer_length | **default: 10** Sets the number of API calls that should be recieved before the requests are sent to ReadMe |
-| allowed_http_hosts | A list of allowed http hosts for sending data to the ReadMe API. |
+| Option | Use |
+| :--------------------- | :------------------------------------------------- |
+| development_mode | **default: false** If true, the log will be separate from normal production logs. This is great for separating staging or test data from data coming from customers. |
+| background_worker_mode | **default: true** If true, requests to the ReadMe API will be made in a background thread. If false, the ReadMe API request will be made synchronously in the main thread, potentially slowing down your HTTP service. |
+| denylist | **optional** An array of keys from your API requests and responses headers and bodies that are blocked from being sent to ReadMe. Both the request and response will be checked for these keys, in their HTTP headers, form fields, URL parameters, and JSON request/response bodies. JSON is only checked at the top level, so a nested field will still be sent even if its key matches one of the keys in `denylist`.<br /><br />If you configure a denylist, it will override any allowlist configuration. |
+| allowlist | **optional** An array of keys from your API requests and responses headers and bodies that you only wish to send to ReadMe. All other semantics match `denylist`. Like `denylist`, only the top level of JSON request/response bodies are filtered. If this option is configured, **only** the whitelisted properties will be sent. |
+| buffer_length | **default: 10** Sets the number of API calls that should be recieved before the requests are sent to ReadMe. |
+| allowed_http_hosts | A list of HTTP hosts which should be logged to ReadMe. If this is present, a request will only be sent to ReadMe if its Host header matches one of the allowed hosts. |
+| timeout | Timeout (in seconds) for calls back to the ReadMe Metrics API. Default 3 seconds. |
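
The `denylist` and `allowlist` rows describe filtering that only looks at top-level keys, with `denylist` taking precedence when both are set. A minimal sketch of that rule on a plain dictionary follows. `filter_top_level` is a hypothetical helper written for illustration, not a function from `readme_metrics`, and whether the SDK drops or redacts a blocked value is not shown in this diff (the sketch drops it).

```python
from typing import Any, Dict, List, Optional


def filter_top_level(
    body: Dict[str, Any],
    denylist: Optional[List[str]] = None,
    allowlist: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical illustration of top-level-only key filtering."""
    if denylist:
        # A configured denylist overrides any allowlist.
        return {k: v for k, v in body.items() if k not in denylist}
    if allowlist:
        # Only the allowlisted top-level keys survive.
        return {k: v for k, v in body.items() if k in allowlist}
    return dict(body)


body = {"user": "ada", "password": "hunter2", "profile": {"password": "nested"}}

# The top-level "password" is removed; the nested one is untouched.
filtered = filter_top_level(body, denylist=["password"])
```

The nested `profile.password` value surviving is the case the README warns about: sensitive fields inside nested JSON need to be handled before they reach the middleware.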
packages/python/readme_metrics/Metrics.py

+37-26
@@ -1,3 +1,5 @@
+import atexit
+import math
 import queue
 import threading
 import requests
@@ -6,18 +8,19 @@

 from werkzeug import Request

-from readme_metrics import MetricsApiConfig, ResponseInfoWrapper
+from readme_metrics import MetricsApiConfig
+from readme_metrics.publisher import publish_batch
 from readme_metrics.PayloadBuilder import PayloadBuilder
+from readme_metrics.ResponseInfoWrapper import ResponseInfoWrapper


 class Metrics:
     """
-    This is the internal central controller classinvoked by the WSGI middleware. It
-    handles the creation, queueing, and submission of the requests.
+    This is the internal central controller class invoked by the ReadMe middleware. It
+    queues requests for submission. The submission is processed by readme_metrics.publisher.publish_batch().
     """

     PACKAGE_NAME: str = "readme/metrics"
-    METRICS_API: str = "https://metrics.readme.io"

     def __init__(self, config: MetricsApiConfig):
         """
@@ -37,38 +40,46 @@ def __init__(self, config: MetricsApiConfig):
         )
         self.queue = queue.Queue()

+        atexit.register(self.exit_handler)
+
     def process(self, request: Request, response: ResponseInfoWrapper) -> None:
         """Enqueues a request/response combination to be submitted the API.

         Args:
             request (Request): Request object
             response (ResponseInfoWrapper): Response object
         """
-        self.queue.put(self.payload_builder(request, response))
+        if not self.host_allowed(request.environ["HTTP_HOST"]):
+            self.config.LOGGER.debug(
+                f"Not enqueueing request, host {request.environ['HTTP_HOST']} not in ALLOWED_HTTP_HOSTS"
+            )
+            return

+        self.queue.put(self.payload_builder(request, response))
         if self.queue.qsize() >= self.config.BUFFER_LENGTH:
+            args = (self.config, self.queue)
             if self.config.IS_BACKGROUND_MODE:
-                threading.Thread(target=self._processAll, daemon=True).start()
+                thread = threading.Thread(target=publish_batch, daemon=True, args=args)
+                thread.start()
             else:
-                self._processAll()
+                publish_batch(*args)

-    def _processAll(self) -> None:
-        result_list = []
-        while not self.queue.empty():
-            obj = self.queue.get_nowait()
-            if obj:
-                result_list.append(obj)
+    def exit_handler(self) -> None:
+        if not self.queue.empty():
+            args = (self.config, self.queue)
+            for _ in range(math.ceil(self.queue.qsize() / self.config.BUFFER_LENGTH)):
+                if self.config.IS_BACKGROUND_MODE:
+                    thread = threading.Thread(
+                        target=publish_batch, daemon=True, args=args
+                    )
+                    thread.start()
+                else:
+                    publish_batch(*args)
+        self.queue.join()

-        payload = json.dumps(result_list)
-
-        version = importlib.import_module(__package__).__version__
-
-        readme_result = requests.post(
-            self.METRICS_API + "/request",
-            auth=(self.config.README_API_KEY, ""),
-            data=payload,
-            headers={
-                "Content-Type": "application/json",
-                "User-Agent": "readme-metrics-python@" + version,
-            },
-        )
+    def host_allowed(self, host):
+        if self.config.ALLOWED_HTTP_HOSTS:
+            return host in self.config.ALLOWED_HTTP_HOSTS
+        else:
+            # If the allowed_http_hosts has not been set (None by default), send off the data to be queued
+            return True
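
The new `exit_handler` starts `ceil(qsize / BUFFER_LENGTH)` publish calls and then waits on `queue.join()`. The sketch below works through that arithmetic with 25 queued items and a buffer length of 10. `fake_publish` is a hypothetical stand-in for `publish_batch` that records batches instead of posting them, and `BUFFER_LENGTH = 10` is assumed from the documented default.

```python
import math
import queue

BUFFER_LENGTH = 10  # assumed: the documented default buffer length


def fake_publish(q: queue.Queue, sent: list) -> None:
    """Hypothetical stand-in for publish_batch: drain at most one batch."""
    batch = []
    try:
        while len(batch) < BUFFER_LENGTH:
            batch.append(q.get_nowait())
    except queue.Empty:
        pass
    sent.append(batch)
    # Each get must be matched by task_done(), or join() never returns.
    for _ in batch:
        q.task_done()


q = queue.Queue()
for i in range(25):
    q.put(i)

sent = []
batches_needed = math.ceil(q.qsize() / BUFFER_LENGTH)
for _ in range(batches_needed):
    fake_publish(q, sent)

# Returns at once here because every item has been marked done.
q.join()
```

Rounding up matters for the last partial batch: 25 items need three calls (10, 10, 5), where integer division would schedule only two and leave five items unsent at exit.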

packages/python/readme_metrics/MetricsApiConfig.py

+36-15
@@ -1,36 +1,43 @@
 from typing import List, Any, Callable

+from readme_metrics.util import util_build_logger
+

 class MetricsApiConfig:
     """ReadMe Metrics API configuration object

     Attributes:
-        README_API_KEY (str): (required) Your ReadMe API key
-        GROUPING_FUNCTION (lambda): (required) Grouping function to construct an
+        README_API_KEY (str) Your ReadMe API key
+        GROUPING_FUNCTION (lambda): Grouping function to construct an
             identity object. It receives the current request as a parameter, and must
             return a dictionary containing at least an "id" field, and optionally
             "label" and "email" fields.

             The main purpose of the identity object is to identify the API's caller.
-        BUFFER_LENGTH (int, optional): Number of requests to buffer before sending data
+        BUFFER_LENGTH (int): Number of requests to buffer before sending data
             to ReadMe. Defaults to 10.
-        IS_DEVELOPMENT_MODE (bool, optional): Determines whether you are running in
+        IS_DEVELOPMENT_MODE (bool): Determines whether you are running in
             development mode. Defaults to False.
-        IS_BACKGROUND_MODE (bool, optional): Determines whether to issue the call to
+        IS_BACKGROUND_MODE (bool): Determines whether to issue the call to
             the ReadMe API in a background thread. Defaults to True.
-        DENYLIST (List[str], optional): An array of headers and JSON body properties to
+        DENYLIST (List[str]): An array of headers and JSON body properties to
             skip sending to ReadMe.

             If you configure a denylist, it will override any allowlist configuration.
-        ALLOWLIST (List[str], optional): An array of headers and JSON body properties to
+        ALLOWLIST (List[str]): An array of headers and JSON body properties to
             send to ReadMe.

             If this option is configured, ONLY the allowlisted properties will be sent.
-        BLACKLIST (List[str], optional): Deprecated, prefer using an denylist.
-        WHITELIST (List[str], optional): Deprecated, prefer using an allowlist.
-        ALLOWED_HTTP_HOSTS (List[str] (optional)): A list of allowed http hosts for sending
+        ALLOWED_HTTP_HOSTS (List[str]): A list of allowed http hosts for sending
             data to the ReadMe API.
+        METRICS_API (str): Base URL of the ReadMe metrics API.
+        METRICS_API_TIMEOUT (int): Timeout (in seconds) for metrics API calls.
+        LOGGER (logging.Logger): Logger used by all classes and methods in the
+            readme_metrics packge. Defaults to a basic console logger with log level
+            CRITICAL.

+            You can adjust logging settings by manipulating LOGGER, or you can replace
+            LOGGER entirely with your application's Logger.
     """

     README_API_KEY: str = None
@@ -41,6 +48,8 @@ class MetricsApiConfig:
     DENYLIST: List[str] = []
     ALLOWLIST: List[str] = []
     ALLOWED_HTTP_HOSTS: List[str] = []
+    METRICS_API: str = "https://metrics.readme.io"
+    METRICS_API_TIMEOUT: int = 3

     def __init__(
         self,
@@ -54,6 +63,7 @@ def __init__(
         blacklist: List[str] = None,
         whitelist: List[str] = None,
         allowed_http_hosts: List[str] = None,
+        timeout: int = 3,
     ):
         """Initializes an instance of the MetricsApiConfig object

@@ -71,20 +81,29 @@ def __init__(
                 development mode. Defaults to False.
             background_worker_mode (bool, optional): Determines whether to issue the
                 call to the ReadMe API in a background thread. Defaults to True.
-            denylist (List[str], optional): An array of headers and JSON body
-                properties to skip sending to ReadMe. Defaults to None.
+            denylist (List[str], optional): An array of keys from your API requests and
+                responses headers and bodies that are blocked from being sent to ReadMe.
+                Both the request and response will be checked for these keys, in their
+                HTTP headers, form fields, URL parameters, and JSON request/response
+                bodies. JSON is only checked at the top level, so a nested field will
+                still be sent even if its key matches one of the keys in `denylist`.
+                Defaults to None.

                 If you configure a denylist, it will override any allowlist
                 configuration.
             allowlist (List[str], optional): An array of headers and JSON body
-                properties to send to ReadMe. Defaults to None.
+                properties to send to ReadMe. Similar semantics to `denylist`; defaults
+                to None.

                 If this option is configured, ONLY the whitelisted properties will be
                 sent.
             blacklist (List[str], optional): Deprecated, prefer denylist.
             whitelist (List[str], optional): Deprecated, prefer allowlist.
-            allowed_http_hosts (List[str], optional): A list of allowed http hosts for sending data
-                to the ReadMe API.
+            allowed_http_hosts (List[str], optional): A list of HTTP hosts which should be
+                logged to ReadMe. If this is present, requests will only be sent to ReadMe
+                whose Host header matches one of the allowed hosts.
+            timeout (int): Timeout (in seconds) for calls back to the ReadMe Metrics API.
+                Default 3 seconds.
         """
         self.README_API_KEY = api_key
         self.GROUPING_FUNCTION = grouping_function
@@ -94,3 +113,5 @@ def __init__(
         self.DENYLIST = denylist or blacklist or []
         self.ALLOWLIST = allowlist or whitelist or []
         self.ALLOWED_HTTP_HOSTS = allowed_http_hosts
+        self.METRICS_API_TIMEOUT = timeout
+        self.LOGGER = util_build_logger()
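
The docstring says `LOGGER` defaults to a console logger at level `CRITICAL` and can be adjusted or replaced. The body of `util_build_logger` is not part of this diff, so the sketch below uses a hypothetical `build_console_logger` with the same documented default to show what raising the level changes. The logger name `example_metrics` is also an assumption.

```python
import logging


def build_console_logger(name: str = "example_metrics") -> logging.Logger:
    """Hypothetical console logger that defaults to CRITICAL, as documented."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.CRITICAL)
    if not logger.handlers:
        # Avoid stacking duplicate handlers on repeated calls.
        logger.addHandler(logging.StreamHandler())
    return logger


logger = build_console_logger()
# At CRITICAL, debug and info messages from the SDK would be suppressed.
quiet = logger.isEnabledFor(logging.DEBUG)

# Lowering the level makes debug output visible.
logger.setLevel(logging.DEBUG)
verbose = logger.isEnabledFor(logging.DEBUG)
```

With the real config object, the equivalent adjustment would presumably be `config.LOGGER.setLevel(logging.DEBUG)`, which would surface the "Not enqueueing request" debug message added in `Metrics.process`.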

packages/python/readme_metrics/MetricsMiddleware.py

+1-6
@@ -108,12 +108,7 @@ def _start_response(_status, _response_headers, *args):
             )

             # Send off data to be queued (and processed) by ReadMe if allowed
-            if self.config.ALLOWED_HTTP_HOSTS:
-                if environ["HTTP_HOST"] in self.config.ALLOWED_HTTP_HOSTS:
-                    self.metrics_core.process(req, res)
-            else:
-                # If the allowed_http_hosts has not been set (None by default), send off the data to be queued
-                self.metrics_core.process(req, res)
+            self.metrics_core.process(req, res)

             yield data

packages/python/readme_metrics/__init__.py

+1-1
@@ -1,4 +1,4 @@
 from readme_metrics.MetricsApiConfig import MetricsApiConfig
 from readme_metrics.MetricsMiddleware import MetricsMiddleware

-__version__ = "1.0.6"
+__version__ = "1.1.0"

packages/python/readme_metrics/publisher.py

@@ -0,0 +1,48 @@
+import importlib
+import json
+import math
+from queue import Empty, Queue
+import time
+
+import requests
+
+
+def publish_batch(config, queue):
+    result_list = []
+    try:
+        try:
+            while not queue.empty() and len(result_list) < config.BUFFER_LENGTH:
+                payload = queue.get_nowait()
+                result_list.append(payload)
+        except Empty:
+            pass
+
+        if len(result_list) == 0:
+            return
+
+        version = importlib.import_module(__package__).__version__
+        url = config.METRICS_API + "/request"
+
+        readme_result = requests.post(
+            url,
+            auth=(config.README_API_KEY, ""),
+            data=json.dumps(result_list),
+            headers={
+                "Content-Type": "application/json",
+                "User-Agent": f"readme-metrics-python@{version}",
+            },
+            timeout=config.METRICS_API_TIMEOUT,
+        )
+        config.LOGGER.info(
+            f"POST to {url} with {len(result_list)} items returned {readme_result.status_code}"
+        )
+        if not readme_result.ok:
+            config.LOGGER.exception(readme_result.text)
+            raise Exception(f"POST to {url} returned {readme_result.status_code}")
+    except Exception as e:
+        # Errors in the Metrics SDK should never cause the application to
+        # throw an error. Log it but don't re-raise.
+        config.LOGGER.exception(e)
+    finally:
+        for _ in result_list:
+            queue.task_done()
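
`publish_batch` logs failures without re-raising, and its `finally` block marks every dequeued item as done so a later `queue.join()` cannot hang. The sketch below isolates that pattern with a transport that always fails. `publish`, `failing_send`, and the logger name are hypothetical and stand in for `publish_batch`, `requests.post`, and `config.LOGGER`.

```python
import logging
import queue

logger = logging.getLogger("example_publisher")  # hypothetical logger name
logger.addHandler(logging.NullHandler())


def failing_send(batch):
    """Hypothetical transport that always fails, standing in for requests.post."""
    raise ConnectionError("metrics endpoint unreachable")


def publish(q: queue.Queue, send, batch_size: int = 10) -> int:
    batch = []
    try:
        while len(batch) < batch_size:
            try:
                batch.append(q.get_nowait())
            except queue.Empty:
                break
        if batch:
            send(batch)
    except Exception as exc:
        # Swallow and log: the host application must not see SDK failures.
        logger.error("publish failed: %s", exc)
    finally:
        # Mark items done even on failure so queue.join() cannot block forever.
        for _ in batch:
            q.task_done()
    return len(batch)


q = queue.Queue()
for item in ("a", "b", "c"):
    q.put(item)

taken = publish(q, failing_send)  # does not raise despite the failing transport
q.join()  # returns because all three items were marked done
```

The trade-off in this pattern is that a failed batch is dropped, not retried: the items are already off the queue when the send fails.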

packages/python/readme_metrics/tests/MetricsMiddleware_test.py

+1-5
@@ -1,4 +1,4 @@
-import pytest
+import pytest # pylint: disable=import-error
 import requests
 import json

@@ -49,10 +49,6 @@ def mockMiddlewareConfig():
     )


-# Verify that metrics has fields for metrics API and package name
-assert Metrics.METRICS_API != None
-assert Metrics.PACKAGE_NAME != None
-
 # Mock callback for handling middleware response
 class MetricsCoreMock:
     def process(self, req, res):
