update requirements and complete README.md

Allaway11 · Allaway11 · commit 34f68cce014e · 2021-06-22T19:38:17.000+01:00
diff --git a/README.md b/README.md
@@ -14,3 +14,74 @@ create a virtual environment and install packages in `requirements.txt`
 python3 -m venv venv && source venv/bin/activate
 ```
 
+and then,
+
+```bash
+pip install -r requirements.txt
+```
+
+The dataset being used is: https://archive.ics.uci.edu/ml/datasets/Heart+failure+clinical+records
+
+and can be downloaded to the data directory using the following command executed in the terminal:
+
+```bash
+wget https://archive.ics.uci.edu/ml/machine-learning-databases/00519/heart_failure_clinical_records_dataset.csv -P data/
+```
+
+To train ML models we next need to create a training dataset and a holdout test dataset.
+To achieve this we can use the `create_test_dataset.py` script to split the original
+dataset into a training dataset (data/train.json) with 80% of the data and a holdout
+test set (data/test.json) with 20% of the data.
+An example JSON file to be used as the body of the curl request to the model server API
+is also generated; it contains the first example in the test dataset
+(data/test_post_request.json).
+
+```bash
+python -m create_test_dataset
+```
+
+## Model Serving
+
+Because this dataset is small (299 examples in total), only simple ML models (Random Forests, SVMs, MLPs)
+have been chosen, to reduce the risk of overfitting. Because these models are lightweight to train,
+training occurs on server startup.
+
+To start up the server and train the model we can run the following command from the terminal:
+
+```bash
+uvicorn api:app
+```
+
+We can then test a POST request against the `/predict` endpoint in the browser at http://localhost:8000/docs, using a request body such as:
+
+```json
+{
+  "features":{
+    "age":94.0,
+    "anaemia":0.0,
+    "creatinine_phosphokinase":582.0,
+    "diabetes":1.0,
+    "ejection_fraction":38.0,
+    "high_blood_pressure":1.0,
+    "platelets":263358.03,
+    "serum_creatinine":1.83,
+    "serum_sodium":134.0,
+    "sex":1.0,
+    "smoking":0.0,
+    "time":27.0
+  }
+}
+```
+
+or send a curl request, e.g.:
+```bash
+curl -X POST --header "Content-Type: application/json" -d @data/test_post_request.json http://localhost:8000/predict   
+```
+
+## Tests
+
+To run the unit tests associated with the API, run the following command in the terminal:
+
+```bash
+python -m pytest test_api.py --cov=api --cov-report=term
+```
diff --git a/api.py b/api.py
@@ -47,4 +47,5 @@ async def get_model_predictions(request: PredictRequest) -> ModelResponse:
 
 
 if __name__ == "__main__":
+    # Entry point for running the uvicorn server directly, e.g. under a debugger
     uvicorn.run(app, host="0.0.0.0", port=8000)
diff --git a/requirements.txt b/requirements.txt
@@ -1,6 +1,6 @@
 fastapi==0.65.2
-numpy==1.20.3
-pandas==1.2.4
+numpy==1.19.5
+pandas==1.1.5
 pydantic==1.8.2
 pytest==6.2.4
 pytest-cov==2.12.1
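
The 80/20 split and the example request body described in the README changes can be sketched as follows. This is a minimal illustration using only the standard library, not the contents of `create_test_dataset.py`; the helper names `split_rows` and `build_post_request`, the seed, and the synthetic rows are assumptions made for the example.

```python
import random


def split_rows(rows, test_fraction=0.2, seed=0):
    """Shuffle rows reproducibly and return (train, test) lists.

    Hypothetical helper: the real script may split differently.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


def build_post_request(test_rows):
    """Wrap the first test example in the {"features": ...} body shape
    shown in the README. Assumes the rows hold feature columns only."""
    return {"features": test_rows[0]}


# Synthetic stand-in for the 299-row dataset (two feature columns only).
rows = [{"age": float(40 + i % 50), "time": float(i)} for i in range(299)]

train, test = split_rows(rows)
payload = build_post_request(test)

print(len(train), len(test))  # prints "239 60"
```

With 299 rows, a 20% holdout rounds to 60 test examples and leaves 239 for training. A fixed seed keeps the split reproducible between runs.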
