ResNet18 deployment on AWS Lambda
Below you can find an introductory tutorial describing deployment of a ResNet18 image classifier using torchlambda. This is only an example; for more sophisticated use cases (e.g. base64 encoding of images or testing the deployment locally) see the other tutorials section.
Below is the code (model.py) to load ResNet18 from torchvision and compile it as torchscript:
import torch
import torchvision

# Instantiate ResNet18 (randomly initialized weights) and compile it to TorchScript
model = torchvision.models.resnet18()
torch.jit.script(model).save("model.ptc")
Invoke it from CLI:
$ python model.py
You should get model.ptc in your current working directory.
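Optionally, you can sanity-check the saved model before deploying it. Below is a minimal sketch (not a required tutorial step; the 224x224 input size is an arbitrary example):

import torch

# Load the compiled TorchScript model and run a dummy forward pass
model = torch.jit.load("model.ptc")
model.eval()

with torch.no_grad():
    # Random image-like input; 224x224 is just an example size
    dummy = torch.rand(1, 3, 224, 224)
    output = model(dummy)

print(output.shape)  # torch.Size([1, 1000]) - one logit per ImageNet class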
torchlambda uses C++ to deploy models, hence it might be harder for end users to provide the necessary source code. To alleviate some of those issues, easy-to-understand YAML settings can be used to define outputs and various elements of the neural network's deployment.
Please run the following:
torchlambda settings
This command will generate a torchlambda.yaml file with all available settings for you to modify according to your needs. You can see all of them with a short description below.
Generated YAML settings:
---
grad: False # Turn gradient on/off
validate_json: true # Validate correctness of JSON parsing
model: /opt/model.ptc # Path to model to load
input: # Define properties of input
  name: data # Name of field containing data
  validate_field: true # Whether above field will be checked for correctness
  type: float # Type of data in this field (array assumed or base64)
  shape: [1, 3, width, height] # Input shapes (int or name of field as str)
  validate_shape: true # Whether to validate fields containing shape info
  cast: float # Type to which tensor will be cast before inference (if any)
  divide: 255 # Value by which it will be divided (if any)
  normalize: # Whether to normalize the tensor
    means: [0.485, 0.456, 0.406] # Using those means
    stddevs: [0.229, 0.224, 0.225] # And those standard deviations
return: # Finally return something in JSON
  output: # Unmodified output from neural network
    type: double # Cast to double type (AWS SDK compatible)
    name: output # Name of the field where value(s) will be returned
    item: false # If we return a single value use true; a neural network usually returns more (an array)
  result: # Return another field result by modifying output
    operations: argmax # Apply argmax (more operations can be specified as list)
    arguments: 1 # Over first dimension (more or no arguments can be specified)
    type: int # Type returned will be integer
    name: result # Named result
    item: true # It will be a single item
Many fields already have sensible defaults (see the YAML settings file reference), hence they will be left as-is for now. In our case we will only define the bare minimum:
---
input:
  shape: [1, 3, width, height]
  type: byte
  cast: float
  divide: 255
  normalize:
    means: [0.485, 0.456, 0.406]
    stddevs: [0.229, 0.224, 0.225]
return:
  result:
    operations: argmax
    type: int
    name: label
    item: true
- input - a tensor of shape [1, 3, width, height], where the first two dimensions are batch and channel (always static and equal to 1 and 3 respectively), while width and height are variable. The exact width and height will be passed as int fields in the JSON request. The type of the data is specified as byte (an image with values in the range [0, 255], cast on AWS Lambda's side to C++'s uint8_t). The created tensor will be cast to float and divided by 255 to fall into the [0, 1] range, as ResNet18 expects (a sketch of this preprocessing pipeline is shown after this list).
- Data will be normalized per channel with the ImageNet pre-calculated means and standard deviations.
- return - return the output of the network modified by the argmax operation, which creates result. Our returned type will be int, and the JSON field name (torchlambda always returns JSON) will be label. argmax over the tensor will produce a single value (by default the operation is applied over all dimensions), hence item is specified as true.
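To make the settings concrete, here is a Python sketch of the same preprocessing pipeline the settings describe (byte tensor, cast to float, division by 255, per-channel normalization). This is for illustration only and is not code torchlambda generates or runs:

import torch

width, height = 64, 64

# Simulated request data: an image as bytes in [0, 255], shaped [1, 3, width, height]
data = torch.randint(0, 256, (1, 3, width, height), dtype=torch.uint8)

# cast: float and divide: 255 from the settings
tensor = data.to(torch.float32) / 255.0

# normalize: per channel, with ImageNet means and standard deviations
means = torch.tensor([0.485, 0.456, 0.406]).view(1, 3, 1, 1)
stddevs = torch.tensor([0.229, 0.224, 0.225]).view(1, 3, 1, 1)
normalized = (tensor - means) / stddevs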
Save the above content in torchlambda.yaml.
Now that we have our settings, we can generate C++ code based on them. Run the following:
$ torchlambda template --yaml torchlambda.yaml
You should see a new folder called torchlambda in your current directory with a main.cpp file inside.
If you don't care about C++ you can move on to the next section. If you want to know a little more (or your deployment needs more customization), carry on reading.
If the YAML settings cannot fulfil your needs, torchlambda offers you a basic C++ template you can start your deployment code from.
Run this simple command (no settings needed in this case):
$ torchlambda template --destination custom_deployment
This time you can find a new folder custom_deployment with main.cpp inside. This file is minimal, reasonable, and working C++ code one should be able to follow easily. It does exactly the same thing (except dynamic shapes) as we did above via settings, but this time the file is readable (the previous main.cpp might be quite hard to grasp as it is autogenerated).
Generated code:
#include <aws/core/Aws.h>
#include <aws/core/utils/base64/Base64.h>
#include <aws/core/utils/json/JsonSerializer.h>
#include <aws/core/utils/memory/stl/AWSString.h>
#include <aws/lambda-runtime/runtime.h>

#include <torch/script.h>
#include <torch/torch.h>

/*!
 *
 *                  HANDLE REQUEST
 *
 */

static aws::lambda_runtime::invocation_response
handler(torch::jit::script::Module &module,
        const Aws::Utils::Base64::Base64 &transformer,
        const aws::lambda_runtime::invocation_request &request) {

  const Aws::String data_field{"data"};

  /*!
   *
   *            PARSE AND VALIDATE REQUEST
   *
   */

  const auto json = Aws::Utils::Json::JsonValue{request.payload};
  if (!json.WasParseSuccessful())
    return aws::lambda_runtime::invocation_response::failure(
        "Failed to parse input JSON file.", "InvalidJSON");

  const auto json_view = json.View();
  if (!json_view.KeyExists(data_field))
    return aws::lambda_runtime::invocation_response::failure(
        "Required data was not provided.", "InvalidJSON");

  /*!
   *
   *       LOAD DATA, TRANSFORM TO TENSOR, NORMALIZE
   *
   */

  const auto base64_data = json_view.GetString(data_field);
  Aws::Utils::ByteBuffer decoded = transformer.Decode(base64_data);

  torch::Tensor tensor =
      torch::from_blob(decoded.GetUnderlyingData(),
                       {
                           static_cast<long>(decoded.GetLength()),
                       },
                       torch::kUInt8)
          .reshape({1, 3, 64, 64})
          .toType(torch::kFloat32) /
      255.0;

  torch::Tensor normalized_tensor = torch::data::transforms::Normalize<>{
      {0.485, 0.456, 0.406}, {0.229, 0.224, 0.225}}(tensor);

  /*!
   *
   *                  MAKE INFERENCE
   *
   */

  auto output = module.forward({normalized_tensor}).toTensor();
  const int label = torch::argmax(output).item<int>();

  /*!
   *
   *                   RETURN JSON
   *
   */

  return aws::lambda_runtime::invocation_response::success(
      Aws::Utils::Json::JsonValue{}
          .WithInteger("label", label)
          .View()
          .WriteCompact(),
      "application/json");
}

int main() {
  /*!
   *
   *            LOAD MODEL ON CPU
   *        & SET IT TO EVALUATION MODE
   *
   */

  /* Turn off gradient */
  torch::NoGradGuard no_grad_guard{};
  /* No optimization during first pass as it might slow down inference by 30s */
  torch::jit::setGraphExecutorOptimize(false);

  constexpr auto model_path = "/opt/model.ptc";

  torch::jit::script::Module module = torch::jit::load(model_path, torch::kCPU);
  module.eval();

  /*!
   *
   *            INITIALIZE AWS SDK
   *        & REGISTER REQUEST HANDLER
   *
   */

  Aws::SDKOptions options;
  Aws::InitAPI(options);
  {
    const Aws::Utils::Base64::Base64 transformer{};
    const auto handler_fn =
        [&module,
         &transformer](const aws::lambda_runtime::invocation_request &request) {
          return handler(module, transformer, request);
        };
    aws::lambda_runtime::run_handler(handler_fn);
  }
  Aws::ShutdownAPI(options);
  return 0;
}
For more info run torchlambda template --help or check out the documentation.
Requests for AWS Lambda functions created by torchlambda should be in JSON format. Copy and run the code below to create an example small payload in the desired format:
import json

import numpy as np


def create_payload():
    width = 64
    height = 64
    # Random image with values in [0, 255] (np.random.randint's high is exclusive)
    data = (
        np.random.randint(low=0, high=256, size=(1, 3, width, height))
        .flatten()
        .tolist()
    )
    payload = {"width": width, "height": height, "data": data}

    with open("payload.json", "w") as file:
        json.dump(payload, file)


if __name__ == "__main__":
    create_payload()
It should create a payload.json file with a randomly generated image in the data field, with width and height equal to 64.
Keep this file around, as it will be needed at the end.
Notice the small image size (AWS Lambda limits the size of request payloads). If you wish to send larger images to AWS Lambda you should use base64 encoding, described in the base64 image encoding tutorial.
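For reference, below is a hedged sketch of how such a base64 payload could be built. The field names and payload layout here are assumptions for illustration; the base64 image encoding tutorial and your YAML settings are authoritative:

import base64
import json

import numpy as np

width, height = 256, 256

# Raw uint8 image laid out as [1, 3, width, height], as in the array payload above
image = np.random.randint(low=0, high=256, size=(1, 3, width, height), dtype=np.uint8)

# base64-encode the raw bytes so a larger image stays compact in JSON
encoded = base64.b64encode(image.tobytes()).decode("ascii")

payload = {"width": width, "height": height, "data": encoded}
with open("payload_base64.json", "w") as file:
    json.dump(payload, file)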
Now we have our model and source code. It's time to deploy it as an AWS Lambda-ready .zip package.
Run from command line:
$ torchlambda build ./torchlambda --compilation "-Wall -O2"
The above will create a torchlambda.zip file ready for deployment. Notice the --compilation argument, where you can pass any C++ compilation flags (here -O2 for increased performance). There are many more things one could set during this step; check torchlambda build --help or the documentation for the full list of available options and their descriptions.
Our deployment package is roughly 30MB in size (AWS Lambda has a 250MB limit), hence we can ship our model as an additional layer (so AWS S3 won't be involved).
To create it run:
$ torchlambda layer ./model.ptc --destination "model.zip"
You will receive a model.zip layer in your current working directory (--destination is optional). See torchlambda layer --help or the documentation for more info.
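Lambda layers are extracted under /opt at runtime, which is why the settings point at /opt/model.ptc. If you want to sanity-check the archive layout, here is a short sketch (assuming model.ptc sits at the zip root):

import zipfile

# model.ptc at the archive root will be mounted as /opt/model.ptc on AWS Lambda
with zipfile.ZipFile("model.zip") as archive:
    print(archive.namelist())  # expect something like ['model.ptc']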
From now on you can mostly follow the tutorial from AWS Lambda's C++ Runtime. It is assumed you have the AWS CLI configured; check Configuring the AWS CLI otherwise (or see the Test Lambda deployment locally tutorial).
First create the following trust policy JSON file:
$ cat trust-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Service": ["lambda.amazonaws.com"]
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
Run from your shell:
$ aws iam create-role --role-name demo --assume-role-policy-document file://trust-policy.json
Note down the role Arn returned by that command; it will be needed in the next step.
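Alternatively, if you prefer Python over the AWS CLI, the same role can be created with boto3 (a sketch, assuming boto3 is installed and your AWS credentials are configured):

import json

import boto3

trust_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": ["lambda.amazonaws.com"]},
            "Action": "sts:AssumeRole",
        }
    ],
}

iam = boto3.client("iam")
response = iam.create_role(
    RoleName="demo",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Note this ARN down - it is needed when creating the function
print(response["Role"]["Arn"])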
Create the deployment function with the script below:
$ aws lambda create-function --function-name demo \
--role <specify role arn from step 5.2 here> \
--runtime provided --timeout 30 --memory-size 1024 \
--handler torchlambda --zip-file fileb://torchlambda.zip
We already have our ResNet18 packed appropriately, so run the following to make a layer from it:
$ aws lambda publish-layer-version --layer-name model \
--description "Resnet18 neural network model" \
--license-info "MIT" \
--zip-file fileb://model.zip
Please save the LayerVersionArn just like in 6.2 and insert it below to add this layer to the function from the previous step:
$ aws lambda update-function-configuration \
--function-name demo \
--layers <specify layer arn from above here>
This completes the deployment configuration; our model is now ready to receive incoming requests.
In this final step you will send payload.json (created in step 6) to our AWS Lambda function and check whether you get a correct response.
Simply run from CLI:
$ aws lambda invoke --function-name demo --payload file://payload.json output.json
You should get the following response in output.json (your label may vary, as the image and the neural network weights are random):
$ cat output.json
{"label": 40}
Congratulations, you have deployed ResNet18 classifier using only AWS Lambda in a few simple steps!