@@ -14,3 +14,74 @@ create a virtual environment and install packages in `requirements.txt`
python3 -m venv venv && source venv/bin/activate
```

+ and then install the dependencies:
+
+ ```bash
+ pip install -r requirements.txt
+ ```
+
+ The dataset used is the UCI Heart Failure Clinical Records dataset: https://archive.ics.uci.edu/ml/datasets/Heart+failure+clinical+records
+
+ It can be downloaded to the `data/` directory by running the following command in the terminal:
+
+ ```bash
+ wget https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv -P data/
+ ```
+
+ To train the ML models we next need a training dataset and a holdout test dataset. The
+ `create_test_dataset.py` script splits the original dataset into a training set (`data/train.json`) with 80% of
+ the data and a holdout test set (`data/test.json`) with the remaining 20%.
+ It also generates an example JSON file for use in the curl request to the model server API
+ (`data/test_post_request.json`), which contains the first example in the test dataset.
+
+ ```bash
+ python -m create_test_dataset
+ ```
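
The contents of `create_test_dataset.py` are not shown in this change, so the following is only a minimal sketch of how such an 80/20 split could be done with the standard library. The function names, the fixed seed, the list-of-records JSON layout, and the `DEATH_EVENT` target column name are all assumptions made for illustration; the actual script may differ.

```python
import csv
import json
import random
from pathlib import Path


def split_dataset(rows, test_fraction=0.2, seed=42):
    """Shuffle rows reproducibly and return (train, test) lists."""
    shuffled = list(rows)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def load_rows(csv_path):
    """Read the CSV into a list of dicts, converting every value to float."""
    with open(csv_path, newline="") as f:
        return [{k: float(v) for k, v in row.items()} for row in csv.DictReader(f)]


def write_outputs(rows, data_dir="data", target="DEATH_EVENT"):
    """Write train.json, test.json and an example request body (assumed layout)."""
    train, test = split_dataset(rows)
    out = Path(data_dir)
    out.mkdir(parents=True, exist_ok=True)
    (out / "train.json").write_text(json.dumps(train))
    (out / "test.json").write_text(json.dumps(test))
    # Example request body: features of the first test row, target column dropped.
    features = {k: v for k, v in test[0].items() if k != target}
    (out / "test_post_request.json").write_text(json.dumps({"features": features}))


if __name__ == "__main__":
    write_outputs(load_rows("data/heart_failure_clinical_records_dataset.csv"))
```

Fixing the seed keeps the holdout set stable between runs, so results on `data/test.json` stay comparable across experiments.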
+
+ ## Model Serving
+
+ As this dataset is small (299 examples in total), only simple ML models have been chosen (Random Forests, SVMs, MLPs)
+ to reduce the risk of overfitting. Because the models are lightweight, training occurs on
+ server startup.
+
+ To start the server and train the model, run the following command from the terminal:
+
+ ```bash
+ uvicorn api:app
+ ```
+
+ We can then test a POST request to the `/predict` endpoint from the browser at http://localhost:8000/docs, using a request body such as
+
+ ```json
+ {
+   "features": {
+     "age": 94.0,
+     "anaemia": 0.0,
+     "creatinine_phosphokinase": 582.0,
+     "diabetes": 1.0,
+     "ejection_fraction": 38.0,
+     "high_blood_pressure": 1.0,
+     "platelets": 263358.03,
+     "serum_creatinine": 1.83,
+     "serum_sodium": 134.0,
+     "sex": 1.0,
+     "smoking": 0.0,
+     "time": 27.0
+   }
+ }
+ ```
+
+ or send a curl request, e.g.:
+ ```bash
+ curl -X POST --header "Content-Type: application/json" -d @data/test_post_request.json http://localhost:8000/predict
+ ```
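
For callers who prefer Python over curl, the same request could be sketched with the standard library as below. The helper names `build_predict_request` and `predict` are hypothetical, and the code assumes the server replies with a JSON body; the response schema is not described here, so it is simply printed.

```python
import json
import urllib.request

PREDICT_URL = "http://localhost:8000/predict"


def build_predict_request(features, url=PREDICT_URL):
    """Build (but do not send) a JSON POST request for the /predict endpoint."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, url=PREDICT_URL, timeout=10):
    """Send the request and decode the response, assumed to be JSON."""
    req = build_predict_request(features, url)
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Reuse the example body generated alongside the test dataset.
    with open("data/test_post_request.json") as f:
        payload = json.load(f)
    print(predict(payload["features"]))
```

Keeping request construction separate from sending makes the payload and headers easy to inspect without a running server.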
+
+ ## Tests
+
+ To run the unit tests for the API, run the following command in the terminal:
+
+ ```bash
+ python -m pytest test_api.py --cov=api --cov-report=term
+ ```