In this example we demonstrate an end-to-end BLS (Business Logic Scripting) example in the Python backend. The model repository should contain the PyTorch, AddSub, and BLS models. The PyTorch and AddSub models calculate the sum and difference of INPUT0 and INPUT1 and put the results in OUTPUT0 and OUTPUT1 respectively. The goal of the BLS model is the same as the PyTorch and AddSub models, but instead of calculating the sum and difference itself, the BLS model passes the input tensors to the PyTorch or AddSub model and returns that model's responses as the final response. The additional parameter MODEL_NAME determines which model will be used for calculating the final outputs.
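Ignoring the Triton specifics, the computation that both the PyTorch and AddSub models perform can be sketched in plain Python (a toy stand-in for illustration, not the actual model.py):

```python
import numpy as np

def add_sub(input0, input1):
    """Element-wise sum and difference, as the AddSub/PyTorch models compute."""
    input0 = np.asarray(input0, dtype=np.float32)
    input1 = np.asarray(input1, dtype=np.float32)
    # OUTPUT0 is the sum, OUTPUT1 is the difference.
    return input0 + input1, input0 - input1

output0, output1 = add_sub([0.5, 1.0, 0.25, 2.0], [0.25, 0.5, 0.25, 1.0])
```

The BLS model does not re-implement this math; it forwards the inputs to one of these two models and relays the result.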
- Create the model repository:
$ mkdir -p models/add_sub/1
$ mkdir -p models/bls/1
$ mkdir -p models/pytorch/1
# Copy the Python models
$ cp examples/add_sub/model.py models/add_sub/1/
$ cp examples/add_sub/config.pbtxt models/add_sub/
$ cp examples/bls/model.py models/bls/1/
$ cp examples/bls/config.pbtxt models/bls/
$ cp examples/pytorch/model.py models/pytorch/1/
$ cp examples/pytorch/config.pbtxt models/pytorch/
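For reference, the BLS model's config.pbtxt declares the extra MODEL_NAME input alongside the regular tensor inputs. A sketch of what it might contain is shown below (the exact data types and dims are assumptions; consult the config.pbtxt shipped with the example):

```
name: "bls"
backend: "python"

input [
  {
    name: "MODEL_NAME"
    data_type: TYPE_STRING
    dims: [ 1 ]
  },
  {
    name: "INPUT0"
    data_type: TYPE_FP32
    dims: [ 4 ]
  },
  {
    name: "INPUT1"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
output [
  {
    name: "OUTPUT0"
    data_type: TYPE_FP32
    dims: [ 4 ]
  },
  {
    name: "OUTPUT1"
    data_type: TYPE_FP32
    dims: [ 4 ]
  }
]
```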
- Start the Triton server:
$ tritonserver --model-repository `pwd`/models
- Send inference requests to the server:
$ python3 examples/bls/client.py
You should see output similar to the following:
=========='add_sub' model result==========
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) + INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT0 ([0.7290179 1.5889243 1.2588708 0.9553937])
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) - INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT0 ([-0.02932483 -0.22716594 0.04308355 0.28689077])
=========='pytorch' model result==========
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) + INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT0 ([0.7290179 1.5889243 1.2588708 0.9553937])
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) - INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT0 ([-0.02932483 -0.22716594 0.04308355 0.28689077])
=========='undefined' model result==========
Failed to process the request(s) for model instance 'bls_0', message: TritonModelException: Failed for execute the inference request. Model 'undefined_model' is not ready.
At:
/tmp/python_backend/models/bls/1/model.py(110): execute
The bls model file is heavily commented with explanations about each of the function calls.
The client.py script sends three inference requests to the 'bls' model with different values for the "MODEL_NAME" input. As explained earlier, "MODEL_NAME" determines which model the 'bls' model will use for calculating the final outputs. The first request uses the "add_sub" model, the second uses the "pytorch" model, and the third uses an incorrect model name to demonstrate error handling during inference request execution.
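Conceptually, the routing and error handling exercised by these three requests can be sketched in plain Python (a toy illustration; the real model uses the Python backend's pb_utils API and raises a TritonModelException):

```python
import numpy as np

# Toy stand-ins for the two downstream models; both compute sum and difference.
MODELS = {
    "add_sub": lambda a, b: (a + b, a - b),
    "pytorch": lambda a, b: (a + b, a - b),
}

def bls_execute(model_name, input0, input1):
    """Route the request to the model selected by MODEL_NAME."""
    if model_name not in MODELS:
        # The real BLS model surfaces this as a failed inference response.
        raise ValueError(f"Model '{model_name}' is not ready.")
    return MODELS[model_name](np.asarray(input0), np.asarray(input1))

results = {}
for name in ["add_sub", "pytorch", "undefined"]:
    try:
        results[name] = bls_execute(name, [1.0, 2.0], [0.5, 0.5])
    except ValueError as err:
        results[name] = str(err)
```

The first two requests succeed with identical outputs, while the third returns an error instead of tensors, mirroring the output shown above.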