5.17.2. onnxruntime¶
- GitHub(MIT License): https://github.com/microsoft/onnxruntime 
- Youtube: https://www.youtube.com/@ONNXRuntime 
- [example]inference: https://github.com/microsoft/onnxruntime-inference-examples 
- [example]training: https://github.com/microsoft/onnxruntime-training-examples 
简介¶
- ONNX Runtimeis a cross-platform inference and training machine-learning accelerator.
- ONNX Runtime inferencecan enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms
- ONNX Runtime trainingcan accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts.
安装¶
pip install onnxruntime-gpu
or
pip install onnxruntime
使用¶
PyTorch CV¶
Export the model using torch.onnx.export
torch.onnx.export(model,                                # model being run
                  torch.randn(1, 28, 28).to(device),    # model input (or a tuple for multiple inputs)
                  "fashion_mnist_model.onnx",           # where to save the model (can be a file or file-like object)
                  input_names = ['input'],              # the model's input names
                  output_names = ['output'])            # the model's output names
Load the onnx model with onnx.load:
import onnx
onnx_model = onnx.load("fashion_mnist_model.onnx")
onnx.checker.check_model(onnx_model)
Create inference session using ort.InferenceSession:
import onnxruntime as ort
import numpy as np
x, y = test_data[0][0], test_data[0][1]
ort_sess = ort.InferenceSession('fashion_mnist_model.onnx')
outputs = ort_sess.run(None, {'input': x.numpy()})
# Print Result
predicted, actual = classes[outputs[0][0].argmax(0)], classes[y]
print(f'Predicted: "{predicted}", Actual: "{actual}"')
PyTorch NLP¶
Export Model:
# Export the model
torch.onnx.export(model,                     # model being run
                (text, offsets),           # model input (or a tuple for multiple inputs)
                "ag_news_model.onnx",      # where to save the model (can be a file or file-like object)
                export_params=True,        # store the trained parameter weights inside the model file
                opset_version=10,          # the ONNX version to export the model to
                do_constant_folding=True,  # whether to execute constant folding for optimization
                input_names = ['input', 'offsets'],   # the model's input names
                output_names = ['output'], # the model's output names
                dynamic_axes={'input' : {0 : 'batch_size'},    # variable length axes
                              'output' : {0 : 'batch_size'}})
Load the model using onnx.load:
import onnx
onnx_model = onnx.load("ag_news_model.onnx")
onnx.checker.check_model(onnx_model)
Create inference session with ort.infernnce:
import onnxruntime as ort
import numpy as np
ort_sess = ort.InferenceSession('ag_news_model.onnx')
outputs = ort_sess.run(None, {'input': text.numpy(),
                            'offsets':  torch.tensor([0]).numpy()})
# Print Result
result = outputs[0].argmax(axis=1)+1
print("This is a %s news" %ag_news_label[result[0]])





