
Commit 8f845c0

authored Aug 9, 2021
Add documentation for BLS (triton-inference-server#70)
* Add documentation for BLS
* Review edits
1 parent 4c01991 commit 8f845c0

File tree

10 files changed

+484
-20
lines changed


‎README.md

+84-7
@@ -1,5 +1,5 @@
11
<!--
2-
# Copyright (c) 2021, NVIDIA CORPORATION. All rights reserved.
2+
# Copyright 2020-2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
33
#
44
# Redistribution and use in source and binary forms, with or without
55
# modification, are permitted provided that the following conditions
@@ -44,6 +44,7 @@ any C++ code.
4444
* [Error Handling](#error-handling)
4545
* [Managing Shared Memory](#managing-shared-memory)
4646
* [Building From Source](#building-from-source)
47+
* [Business Logic Scripting (beta)](#business-logic-scripting-beta)
4748

4849
## Quick Start
4950

@@ -471,6 +472,79 @@ properly set the `--shm-size` flag depending on the size of your inputs and
471472
outputs. The default value for docker run command is `64MB` which is very
472473
small.
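As a rough aid for choosing `--shm-size`, the payload size of the tensors involved can be estimated from their data types and shapes. The sketch below is illustrative only: `tensor_bytes` and `BYTES_PER_ELEMENT` are hypothetical names, not part of any Triton API. It assumes fixed-width data types and ignores any bookkeeping overhead, so the result is best read as a lower bound.

```python
import math

# Bytes per element for a few fixed-width data types.
BYTES_PER_ELEMENT = {
    "TYPE_FP16": 2,
    "TYPE_FP32": 4,
    "TYPE_INT32": 4,
    "TYPE_FP64": 8,
    "TYPE_INT64": 8,
}


def tensor_bytes(data_type, dims):
    """Payload size of one tensor (hypothetical helper, not a Triton API)."""
    return BYTES_PER_ELEMENT[data_type] * math.prod(dims)


# Two FP32 inputs and two FP32 outputs, each with dims [4].
tensors = [("TYPE_FP32", [4])] * 4
total = sum(tensor_bytes(data_type, dims) for data_type, dims in tensors)
print(total)
```

Multiplying such an estimate by the number of requests that may be in flight at once gives a first guess for the value to pass to `--shm-size`.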
473474

475+
# Business Logic Scripting (beta)
476+
477+
Triton's
478+
[ensemble](https://github.com/triton-inference-server/server/blob/main/docs/architecture.md#ensemble-models)
479+
feature supports many use cases where multiple models are composed into a
480+
pipeline (or more generally a DAG, directed acyclic graph). However, there are
481+
many other use cases that are not supported because as part of the model
482+
pipeline they require loops, conditionals (if-then-else), data-dependent
483+
control-flow and other custom logic to be intermixed with model execution. We
484+
call this combination of custom logic and model executions *Business Logic
485+
Scripting (BLS)*.
486+
487+
Starting from 21.08, you can implement BLS in your Python model. A new set of
488+
utility functions allows you to execute inference requests on other models being
489+
served by Triton as a part of executing your Python model. The example below shows
490+
how to use this feature:
491+
```python
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    ...
    def execute(self, requests):
        ...
        # Create an InferenceRequest object. `model_name`,
        # `requested_output_names`, and `inputs` are the required arguments and
        # must be provided when constructing an InferenceRequest object. Make sure
        # to replace `inputs` argument with a list of `pb_utils.Tensor` objects.
        inference_request = pb_utils.InferenceRequest(
            model_name='model_name',
            requested_output_names=['REQUESTED_OUTPUT_1', 'REQUESTED_OUTPUT_2'],
            inputs=[<pb_utils.Tensor object>])

        # `pb_utils.InferenceRequest` supports request_id, correlation_id, and model
        # version in addition to the arguments described above. These arguments
        # are optional. An example containing all the arguments:
        # inference_request = pb_utils.InferenceRequest(model_name='model_name',
        #   requested_output_names=['REQUESTED_OUTPUT_1', 'REQUESTED_OUTPUT_2'],
        #   inputs=[<list of pb_utils.Tensor objects>],
        #   request_id="1", correlation_id=4, model_version=1)

        # Execute the inference_request and wait for the response
        inference_response = inference_request.exec()

        # Check if the inference response has an error
        if inference_response.has_error():
            raise pb_utils.TritonModelException(inference_response.error().message())
        else:
            # Extract the output tensors from the inference response.
            output1 = pb_utils.get_output_tensor_by_name(inference_response, 'REQUESTED_OUTPUT_1')
            output2 = pb_utils.get_output_tensor_by_name(inference_response, 'REQUESTED_OUTPUT_2')

            # Decide the next steps for model execution based on the received output
            # tensors. It is possible to use the same output tensors for the final
            # inference response too.
```
532+
533+
A complete example for BLS in Python backend is included in the
534+
[Examples](#examples) section.
535+
536+
## Limitations
537+
538+
- The number of inference requests that can be executed as a part of your model
539+
execution is limited by the amount of shared memory available to the Triton
540+
server. If you are using Docker to start the Triton server, you can control the
541+
shared memory usage using the
542+
[`--shm-size`](https://docs.docker.com/engine/reference/run/) flag.
543+
- You need to make sure that the inference requests performed as a part of your model
544+
do not create a circular dependency. For example, if model A performs an inference request
545+
on itself and there are no more model instances ready to execute the inference request, the
546+
model will block on the inference execution forever.
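The circular-dependency limitation can be illustrated without Triton. The sketch below is a simplified simulation under stated assumptions: `InstancePool` and `run_self_call` are hypothetical stand-ins, each model instance is modeled as one semaphore permit, and a short timeout replaces the indefinite wait so that the script terminates.

```python
import threading


class InstancePool:
    """Hypothetical stand-in for a model's instances: one permit per instance."""

    def __init__(self, instance_count):
        self._slots = threading.Semaphore(instance_count)

    def infer(self, handler, timeout=0.2):
        # Wait for a free instance. The real call has no timeout, so it
        # would wait here forever when no instance becomes free.
        if not self._slots.acquire(timeout=timeout):
            return "blocked: no free instance"
        try:
            return handler()
        finally:
            self._slots.release()


def run_self_call(instance_count):
    pool = InstancePool(instance_count)

    # Model A's handler issues a nested request to model A itself.
    def model_a():
        return pool.infer(lambda: "ok")

    return pool.infer(model_a)


single = run_self_call(1)
double = run_self_call(2)
print(single)
print(double)
```

With one instance the nested request can never be scheduled, because the only instance is busy waiting for it. With two instances the nested request finds a free instance and completes.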
547+
474548
# Examples
475549

476550
For using the Triton Python client in these examples you need to install
@@ -486,12 +560,15 @@ find the files in [examples/add_sub](examples/add_sub).
486560
## AddSubNet in PyTorch
487561

488562
In order to use this model, you need to install PyTorch. We recommend using
489-
`pip` method mentioned in the [PyTorch
490-
website](https://pytorch.org/get-started/locally/). Make sure that PyTorch is
491-
available in the same Python environment as other dependencies. If you need
492-
to create another Python environment, please refer to the "Changing Python
493-
Runtime Path" section of this readme. You can find the files for this example
494-
in [examples/pytorch](examples/pytorch).
563+
`pip` method mentioned in the [PyTorch website](https://pytorch.org/get-started/locally/).
564+
Make sure that PyTorch is available in the same Python environment as other
565+
dependencies. Alternatively, you can create a [Python Execution Environment](#using-custom-python-execution-environments).
566+
You can find the files for this example in [examples/pytorch](examples/pytorch).
567+
568+
## Business Logic Scripting
569+
570+
The BLS example needs the dependencies required for both of the above examples.
571+
You can find the complete example instructions in [examples/bls](examples/bls/README.md).
495572

496573
# Reporting problems, asking questions
497574

‎examples/add_sub/client.py

-1
@@ -25,7 +25,6 @@
2525
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
2626

2727
from tritonclient.utils import *
28-
import tritonclient.grpc as grpcclient
2928
import tritonclient.http as httpclient
3029

3130
import numpy as np

‎examples/add_sub/config.pbtxt

-2
@@ -32,15 +32,13 @@ input [
3232
name: "INPUT0"
3333
data_type: TYPE_FP32
3434
dims: [ 4 ]
35-
3635
}
3736
]
3837
input [
3938
{
4039
name: "INPUT1"
4140
data_type: TYPE_FP32
4241
dims: [ 4 ]
43-
4442
}
4543
]
4644
output [

‎examples/add_sub/model.py

-2
@@ -24,8 +24,6 @@
2424
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
2525
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
2626

27-
import numpy as np
28-
import sys
2927
import json
3028

3129
# triton_python_backend_utils is available in every Triton Python model. You

‎examples/bls/README.md

+104
@@ -0,0 +1,104 @@
1+
<!--
2+
# Copyright 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
3+
#
4+
# Redistribution and use in source and binary forms, with or without
5+
# modification, are permitted provided that the following conditions
6+
# are met:
7+
# * Redistributions of source code must retain the above copyright
8+
# notice, this list of conditions and the following disclaimer.
9+
# * Redistributions in binary form must reproduce the above copyright
10+
# notice, this list of conditions and the following disclaimer in the
11+
# documentation and/or other materials provided with the distribution.
12+
# * Neither the name of NVIDIA CORPORATION nor the names of its
13+
# contributors may be used to endorse or promote products derived
14+
# from this software without specific prior written permission.
15+
#
16+
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
17+
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
18+
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
19+
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
20+
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
21+
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
22+
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
23+
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
24+
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
25+
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
26+
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
27+
-->
28+
29+
# BLS Example
30+
31+
This is an end-to-end example of
32+
[BLS](../../README.md#business-logic-scripting-beta) in Python backend. The
33+
[model repository](https://github.com/triton-inference-server/server/blob/main/docs/model_repository.md)
34+
should contain [PyTorch](../pytorch), [AddSub](../add_sub), and [BLS](../bls) models.
35+
The [PyTorch](../pytorch) and [AddSub](../add_sub) models
36+
calculate the sum and difference of the `INPUT0` and `INPUT1` and put the
37+
results in `OUTPUT0` and `OUTPUT1` respectively. The goal of the BLS model is
38+
the same as [PyTorch](../pytorch) and [AddSub](../add_sub) models but the
39+
difference is that the BLS model will not calculate the sum and difference by
40+
itself. The BLS model will pass the input tensors to the [PyTorch](../pytorch)
41+
or [AddSub](../add_sub) models and return the responses of that model as the
42+
final response. The additional parameter `MODEL_NAME` determines which model
43+
will be used for calculating the final outputs.
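The routing described above can be sketched in plain Python. This is not the Triton API: `add_sub`, `MODELS`, and `bls` are hypothetical stand-ins that only mirror the control flow, with a dictionary lookup playing the role of the inference request on the selected model.

```python
def add_sub(input0, input1):
    """Stand-in for the 'add_sub' and 'pytorch' models: sum and difference."""
    total = [a + b for a, b in zip(input0, input1)]
    difference = [a - b for a, b in zip(input0, input1)]
    return total, difference


# Both served models compute the same result in this example.
MODELS = {"add_sub": add_sub, "pytorch": add_sub}


def bls(model_name, input0, input1):
    # Forward the inputs to the model selected by MODEL_NAME and return
    # that model's outputs unchanged as the final response.
    model = MODELS.get(model_name)
    if model is None:
        raise RuntimeError(f"Model '{model_name}' is not ready.")
    return model(input0, input1)


output0, output1 = bls("add_sub", [1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5])
print(output0)
print(output1)

try:
    bls("undefined_model", [1.0], [1.0])
except RuntimeError as error:
    print(error)
```

The final call shows the error path: an unknown model name produces an error instead of a result, which is the case the third client request exercises.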
44+
45+
1. Create the model repository:
46+
47+
```console
48+
$ mkdir -p models/add_sub/1
49+
$ mkdir -p models/bls/1
50+
$ mkdir -p models/pytorch/1
51+
52+
# Copy the Python models
53+
$ cp examples/add_sub/model.py models/add_sub/1/
54+
$ cp examples/add_sub/config.pbtxt models/add_sub/
55+
$ cp examples/bls/model.py models/bls/1/
56+
$ cp examples/bls/config.pbtxt models/bls/
57+
$ cp examples/pytorch/model.py models/pytorch/1/
58+
$ cp examples/pytorch/config.pbtxt models/pytorch/
59+
```
60+
61+
2. Start the tritonserver:
62+
63+
```
64+
tritonserver --model-repository `pwd`/models
65+
```
66+
67+
3. Send inference requests to server:
68+
69+
```
70+
python3 examples/bls/client.py
71+
```
72+
73+
You should see an output similar to the output below:
74+
75+
```
76+
=========='add_sub' model result==========
77+
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) + INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT0 ([0.7290179 1.5889243 1.2588708 0.9553937])
78+
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) - INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT1 ([-0.02932483 -0.22716594 0.04308355 0.28689077])
79+
80+
81+
=========='pytorch' model result==========
82+
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) + INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT0 ([0.7290179 1.5889243 1.2588708 0.9553937])
83+
INPUT0 ([0.34984654 0.6808792 0.6509772 0.6211422 ]) - INPUT1 ([0.37917137 0.9080451 0.60789365 0.33425143]) = OUTPUT1 ([-0.02932483 -0.22716594 0.04308355 0.28689077])
84+
85+
86+
=========='undefined' model result==========
87+
Failed to process the request(s) for model instance 'bls_0', message: TritonModelException: Failed for execute the inference request. Model 'undefined_model' is not ready.
88+
89+
At:
90+
/tmp/python_backend/models/bls/1/model.py(110): execute
91+
```
92+
93+
The [bls](./model.py) model file is heavily commented with explanations about
94+
each of the function calls.
95+
96+
## Explanation of the Client Output
97+
98+
The [client.py](./client.py) sends three inference requests to the 'bls'
99+
model with different values for the "MODEL_NAME" input. As explained earlier,
100+
"MODEL_NAME" determines the model name that the "bls" model will use for
101+
calculating the final outputs. In the first request, it will use the "add_sub"
102+
model and in the second request it will use the "pytorch" model. The third
103+
request uses an incorrect model name to demonstrate error handling during
104+
the inference request execution.

‎examples/bls/client.py

+94
@@ -0,0 +1,94 @@
1+
# Copyright 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
#
3+
# Redistribution and use in source and binary forms, with or without
4+
# modification, are permitted provided that the following conditions
5+
# are met:
6+
# * Redistributions of source code must retain the above copyright
7+
# notice, this list of conditions and the following disclaimer.
8+
# * Redistributions in binary form must reproduce the above copyright
9+
# notice, this list of conditions and the following disclaimer in the
10+
# documentation and/or other materials provided with the distribution.
11+
# * Neither the name of NVIDIA CORPORATION nor the names of its
12+
# contributors may be used to endorse or promote products derived
13+
# from this software without specific prior written permission.
14+
#
15+
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
16+
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17+
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
18+
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
19+
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
20+
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
21+
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
22+
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
23+
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
24+
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
25+
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

from tritonclient.utils import *
import tritonclient.http as httpclient
import numpy as np

model_name = "bls"
shape = [4]

with httpclient.InferenceServerClient("localhost:8000") as client:
    input0_data = np.random.rand(*shape).astype(np.float32)
    input1_data = np.random.rand(*shape).astype(np.float32)
    inputs = [
        httpclient.InferInput("INPUT0", input0_data.shape,
                              np_to_triton_dtype(input0_data.dtype)),
        httpclient.InferInput("INPUT1", input1_data.shape,
                              np_to_triton_dtype(input1_data.dtype)),
        httpclient.InferInput("MODEL_NAME", [1],
                              np_to_triton_dtype(np.object_)),
    ]
    inputs[0].set_data_from_numpy(input0_data)
    inputs[1].set_data_from_numpy(input1_data)

    # Will perform the inference request on the 'add_sub' model.
    inputs[2].set_data_from_numpy(np.array(['add_sub'], dtype=np.object_))

    outputs = [
        httpclient.InferRequestedOutput("OUTPUT0"),
        httpclient.InferRequestedOutput("OUTPUT1"),
    ]

    response = client.infer(model_name,
                            inputs,
                            request_id=str(1),
                            outputs=outputs)

    result = response.get_response()
    print("=========='add_sub' model result==========")
    print("INPUT0 ({}) + INPUT1 ({}) = OUTPUT0 ({})".format(
        input0_data, input1_data, response.as_numpy("OUTPUT0")))
    print("INPUT0 ({}) - INPUT1 ({}) = OUTPUT1 ({})".format(
        input0_data, input1_data, response.as_numpy("OUTPUT1")))

    # Will perform the inference request on the pytorch model:
    inputs[2].set_data_from_numpy(np.array(['pytorch'], dtype=np.object_))
    response = client.infer(model_name,
                            inputs,
                            request_id=str(1),
                            outputs=outputs)

    result = response.get_response()
    print("\n")
    print("=========='pytorch' model result==========")
    print("INPUT0 ({}) + INPUT1 ({}) = OUTPUT0 ({})".format(
        input0_data, input1_data, response.as_numpy("OUTPUT0")))
    print("INPUT0 ({}) - INPUT1 ({}) = OUTPUT1 ({})".format(
        input0_data, input1_data, response.as_numpy("OUTPUT1")))

    # Will perform the same inference request on an undefined model. This leads
    # to an exception:
    print("\n")
    print("=========='undefined' model result==========")
    try:
        inputs[2].set_data_from_numpy(np.array(['undefined_model'], dtype=np.object_))
        response = client.infer(model_name,
                                inputs,
                                request_id=str(1),
                                outputs=outputs)
    except InferenceServerException as e:
        print(e.message())

‎examples/bls/config.pbtxt

+66
@@ -0,0 +1,66 @@
1+
# Copyright 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
#
3+
# Redistribution and use in source and binary forms, with or without
4+
# modification, are permitted provided that the following conditions
5+
# are met:
6+
# * Redistributions of source code must retain the above copyright
7+
# notice, this list of conditions and the following disclaimer.
8+
# * Redistributions in binary form must reproduce the above copyright
9+
# notice, this list of conditions and the following disclaimer in the
10+
# documentation and/or other materials provided with the distribution.
11+
# * Neither the name of NVIDIA CORPORATION nor the names of its
12+
# contributors may be used to endorse or promote products derived
13+
# from this software without specific prior written permission.
14+
#
15+
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
16+
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17+
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
18+
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
19+
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
20+
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
21+
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
22+
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
23+
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
24+
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
25+
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
26+
27+
name: "bls"
28+
backend: "python"
29+
30+
input [
31+
{
32+
name: "MODEL_NAME"
33+
data_type: TYPE_BYTES
34+
dims: [ 1 ]
35+
}
36+
]
37+
input [
38+
{
39+
name: "INPUT0"
40+
data_type: TYPE_FP32
41+
dims: [ 4 ]
42+
}
43+
]
44+
input [
45+
{
46+
name: "INPUT1"
47+
data_type: TYPE_FP32
48+
dims: [ 4 ]
49+
}
50+
]
51+
output [
52+
{
53+
name: "OUTPUT0"
54+
data_type: TYPE_FP32
55+
dims: [ 4 ]
56+
}
57+
]
58+
output [
59+
{
60+
name: "OUTPUT1"
61+
data_type: TYPE_FP32
62+
dims: [ 4 ]
63+
}
64+
]
65+
66+
instance_group [{ kind: KIND_CPU }]

‎examples/bls/model.py

+136
@@ -0,0 +1,136 @@
1+
# Copyright 2021, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
2+
#
3+
# Redistribution and use in source and binary forms, with or without
4+
# modification, are permitted provided that the following conditions
5+
# are met:
6+
# * Redistributions of source code must retain the above copyright
7+
# notice, this list of conditions and the following disclaimer.
8+
# * Redistributions in binary form must reproduce the above copyright
9+
# notice, this list of conditions and the following disclaimer in the
10+
# documentation and/or other materials provided with the distribution.
11+
# * Neither the name of NVIDIA CORPORATION nor the names of its
12+
# contributors may be used to endorse or promote products derived
13+
# from this software without specific prior written permission.
14+
#
15+
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY
16+
# EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
17+
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
18+
# PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR
19+
# CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,
20+
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
21+
# PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
22+
# PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY
23+
# OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
24+
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
25+
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

import json

# triton_python_backend_utils is available in every Triton Python model. You
# need to use this module to create inference requests and responses. It also
# contains some utility functions for extracting information from model_config
# and converting Triton input/output types to numpy types.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Your Python model must use the same class name. Every Python model
    that is created must have "TritonPythonModel" as the class name.
    """
    def initialize(self, args):
        """`initialize` is called only once when the model is being loaded.
        Implementing `initialize` function is optional. This function allows
        the model to initialize any state associated with this model.

        Parameters
        ----------
        args : dict
          Both keys and values are strings. The dictionary keys and values are:
          * model_config: A JSON string containing the model configuration
          * model_instance_kind: A string containing model instance kind
          * model_instance_device_id: A string containing model instance device ID
          * model_repository: Model repository path
          * model_version: Model version
          * model_name: Model name
        """

        # You must parse model_config. JSON string is not parsed here
        self.model_config = json.loads(args['model_config'])

    def execute(self, requests):
        """`execute` must be implemented in every Python model. `execute`
        function receives a list of pb_utils.InferenceRequest as the only
        argument. This function is called when an inference request is made
        for this model. Depending on the batching configuration (e.g. Dynamic
        Batching) used, `requests` may contain multiple requests. Every
        Python model must create one pb_utils.InferenceResponse for every
        pb_utils.InferenceRequest in `requests`. If there is an error, you can
        set the error argument when creating a pb_utils.InferenceResponse.

        Parameters
        ----------
        requests : list
          A list of pb_utils.InferenceRequest

        Returns
        -------
        list
          A list of pb_utils.InferenceResponse. The length of this list must
          be the same as `requests`
        """

        responses = []
        # Every Python model must iterate over every one of the requests
        # and create a pb_utils.InferenceResponse for each of them.
        for request in requests:
            # Get INPUT0
            in_0 = pb_utils.get_input_tensor_by_name(request, "INPUT0")

            # Get INPUT1
            in_1 = pb_utils.get_input_tensor_by_name(request, "INPUT1")

            # Get Model Name
            model_name = pb_utils.get_input_tensor_by_name(
                request, "MODEL_NAME")

            # Model Name string
            model_name_string = model_name.as_numpy()[0]

            # Create inference request object
            infer_request = pb_utils.InferenceRequest(
                model_name=model_name_string,
                requested_output_names=["OUTPUT0", "OUTPUT1"],
                inputs=[in_0, in_1])

            # Perform synchronous blocking inference request
            infer_response = infer_request.exec()

            # Make sure that the inference response doesn't have an error. If
            # it has an error, raise an exception.
            if infer_response.has_error():
                raise pb_utils.TritonModelException(
                    infer_response.error().message())

            # Create InferenceResponse. You can set an error here in case
            # there was a problem with handling this inference request.
            # Below is an example of how you can set errors in inference
            # response:
            #
            # pb_utils.InferenceResponse(
            #    output_tensors=...,
            #    error=pb_utils.TritonError("An error occurred"))
            #
            # Because the infer_response of the models contains the final
            # outputs with correct output names, we can just pass the list
            # of outputs to the InferenceResponse object.
            inference_response = pb_utils.InferenceResponse(
                output_tensors=infer_response.output_tensors())
            responses.append(inference_response)

        # You should return a list of pb_utils.InferenceResponse. Length
        # of this list must match the length of `requests` list.
        return responses

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is OPTIONAL. This function allows
        the model to perform any necessary clean ups before exit.
        """
        print('Cleaning up...')

‎examples/pytorch/config.pbtxt

-2
@@ -32,15 +32,13 @@ input [
3232
name: "INPUT0"
3333
data_type: TYPE_FP32
3434
dims: [ 4 ]
35-
3635
}
3736
]
3837
input [
3938
{
4039
name: "INPUT1"
4140
data_type: TYPE_FP32
4241
dims: [ 4 ]
43-
4442
}
4543
]
4644
output [

‎examples/pytorch/model.py

-6
@@ -24,8 +24,6 @@
2424
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
2525
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
2626

27-
import numpy as np
28-
import sys
2927
import json
3028
from torch import nn
3129

@@ -41,21 +39,17 @@ class AddSubNet(nn.Module):
4139
Simple AddSub network in PyTorch. This network outputs the sum and
4240
subtraction of the inputs.
4341
"""
44-
4542
def __init__(self):
4643
super(AddSubNet, self).__init__()
4744

4845
def forward(self, input0, input1):
49-
"""
50-
"""
5146
return (input0 + input1), (input0 - input1)
5247

5348

5449
class TritonPythonModel:
5550
"""Your Python model must use the same class name. Every Python model
5651
that is created must have "TritonPythonModel" as the class name.
5752
"""
58-
5953
def initialize(self, args):
6054
"""`initialize` is called only once when the model is being loaded.
6155
Implementing `initialize` function is optional. This function allows
