TrOCR: Optimized for Qualcomm Devices

TrOCR is an end-to-end text recognition approach that uses pre-trained image transformer and text transformer models for image understanding and wordpiece-level text generation.

This model is based on the implementation of TrOCR found here. This repository contains pre-exported model files optimized for Qualcomm® devices. You can use the Qualcomm® AI Hub Models library to export with custom configurations. More details on model performance across various devices can be found here.

Qualcomm AI Hub Models uses Qualcomm AI Hub Workbench to compile, profile, and evaluate this model. Sign up to run these models on a hosted Qualcomm® device.

Getting Started

There are two ways to deploy this model on your device:

Option 1: Download Pre-Exported Models

Below are pre-exported model assets ready for deployment.

Runtime | Precision | Chipset | SDK Versions | Download
ONNX | float | Universal | QAIRT 2.42, ONNX Runtime 1.24.3 | Download
QNN_DLC | float | Universal | QAIRT 2.45 | Download
TFLITE | float | Universal | QAIRT 2.45 | Download

For more device-specific assets and performance metrics, visit TrOCR on Qualcomm® AI Hub.

Option 2: Export with Custom Configurations

Use the Qualcomm® AI Hub Models Python library to compile and export the model with your own:

  • Custom weights (e.g., fine-tuned checkpoints)
  • Custom input shapes
  • Target device and runtime configurations

This option is ideal if you need to customize the model beyond the default configuration provided here.

See our repository for TrOCR on GitHub for usage instructions.

Model Details

Model Type: Image to text

Model Stats:

  • Model checkpoint: trocr-small-stage1
  • Input resolution: 320x320
  • Number of parameters (decoder): 38.3M
  • Model size (decoder) (float): 146 MB
  • Number of parameters (encoder): 23.0M
  • Model size (encoder) (float): 87.8 MB
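The reported float sizes can be sanity-checked from the parameter counts. The sketch below assumes that "float" means 32-bit floats (4 bytes per parameter) and that the sizes above are expressed in units of 2^20 bytes; neither assumption is stated in this card, and the helper name `float32_size_mib` is made up for illustration.

```python
# Rough size check: parameters x 4 bytes, expressed in MiB (2**20 bytes).
# Assumptions (not stated in the model card): float == float32, MB == MiB.
BYTES_PER_FLOAT32 = 4
MIB = 1024 ** 2


def float32_size_mib(num_params: float) -> float:
    """Return the approximate weight size in MiB for a float32 model."""
    return num_params * BYTES_PER_FLOAT32 / MIB


decoder_mib = float32_size_mib(38.3e6)  # decoder parameter count from Model Stats
encoder_mib = float32_size_mib(23.0e6)  # encoder parameter count from Model Stats

print(f"decoder: {decoder_mib:.1f} MiB")
print(f"encoder: {encoder_mib:.1f} MiB")
```

Under these assumptions the estimates land close to the listed 146 MB and 87.8 MB. Small differences are expected because the parameter counts are rounded and exported files carry some non-weight overhead.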

Performance Summary

Model | Runtime | Precision | Chipset | Inference Time (ms) | Peak Memory Range (MB) | Primary Compute Unit
decoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.142 ms | 1 - 220 MB | NPU
decoder | ONNX | float | Snapdragon® 8 Elite Mobile | 1.25 ms | 0 - 218 MB | NPU
decoder | ONNX | float | Snapdragon® X2 Elite | 1.157 ms | 68 - 68 MB | NPU
decoder | ONNX | float | Snapdragon® X Elite | 2.232 ms | 67 - 67 MB | NPU
decoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 1.435 ms | 0 - 247 MB | NPU
decoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 2.101 ms | 0 - 74 MB | NPU
decoder | ONNX | float | Qualcomm® QCS9075 | 2.739 ms | 7 - 16 MB | NPU
decoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 1.25 ms | 0 - 218 MB | NPU
decoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.137 ms | 1 - 172 MB | NPU
decoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 1.249 ms | 0 - 187 MB | NPU
decoder | QNN_DLC | float | Snapdragon® X2 Elite | 1.63 ms | 7 - 7 MB | NPU
decoder | QNN_DLC | float | Snapdragon® X Elite | 2.172 ms | 7 - 7 MB | NPU
decoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 1.358 ms | 0 - 275 MB | NPU
decoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 4.119 ms | 7 - 105 MB | NPU
decoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 1.989 ms | 1 - 3 MB | NPU
decoder | QNN_DLC | float | Qualcomm® SA8775P | 2.854 ms | 7 - 106 MB | NPU
decoder | QNN_DLC | float | Qualcomm® QCS9075 | 2.557 ms | 7 - 15 MB | NPU
decoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 2.703 ms | 3 - 219 MB | NPU
decoder | QNN_DLC | float | Qualcomm® SA7255P | 4.119 ms | 7 - 105 MB | NPU
decoder | QNN_DLC | float | Qualcomm® SA8295P | 2.663 ms | 0 - 49 MB | NPU
decoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 1.249 ms | 0 - 187 MB | NPU
decoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 1.116 ms | 0 - 175 MB | NPU
decoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 1.214 ms | 0 - 190 MB | NPU
decoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 1.404 ms | 0 - 281 MB | NPU
decoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 4.202 ms | 0 - 103 MB | NPU
decoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 2.042 ms | 0 - 2 MB | NPU
decoder | TFLITE | float | Qualcomm® SA8775P | 2.876 ms | 0 - 102 MB | NPU
decoder | TFLITE | float | Qualcomm® QCS9075 | 2.594 ms | 0 - 83 MB | NPU
decoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 2.459 ms | 0 - 217 MB | NPU
decoder | TFLITE | float | Qualcomm® SA7255P | 4.202 ms | 0 - 103 MB | NPU
decoder | TFLITE | float | Qualcomm® SA8295P | 2.669 ms | 0 - 43 MB | NPU
decoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 1.214 ms | 0 - 190 MB | NPU
encoder | ONNX | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.156 ms | 1 - 150 MB | NPU
encoder | ONNX | float | Snapdragon® 8 Elite Mobile | 5.278 ms | 16 - 166 MB | NPU
encoder | ONNX | float | Snapdragon® X2 Elite | 4.579 ms | 48 - 48 MB | NPU
encoder | ONNX | float | Snapdragon® X Elite | 11.423 ms | 47 - 47 MB | NPU
encoder | ONNX | float | Snapdragon® 8 Gen 3 Mobile | 7.824 ms | 16 - 234 MB | NPU
encoder | ONNX | float | Qualcomm® QCS8550 (Proxy) | 10.913 ms | 0 - 60 MB | NPU
encoder | ONNX | float | Qualcomm® QCS9075 | 14.132 ms | 15 - 19 MB | NPU
encoder | ONNX | float | Snapdragon® 8 Elite For Galaxy Mobile | 5.278 ms | 16 - 166 MB | NPU
encoder | QNN_DLC | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.312 ms | 1 - 115 MB | NPU
encoder | QNN_DLC | float | Snapdragon® 8 Elite Mobile | 5.46 ms | 2 - 113 MB | NPU
encoder | QNN_DLC | float | Snapdragon® X2 Elite | 5.176 ms | 2 - 2 MB | NPU
encoder | QNN_DLC | float | Snapdragon® X Elite | 12.007 ms | 2 - 2 MB | NPU
encoder | QNN_DLC | float | Snapdragon® 8 Gen 3 Mobile | 8.005 ms | 0 - 192 MB | NPU
encoder | QNN_DLC | float | Qualcomm® QCS8275 (Proxy) | 37.042 ms | 2 - 107 MB | NPU
encoder | QNN_DLC | float | Qualcomm® QCS8550 (Proxy) | 11.31 ms | 2 - 4 MB | NPU
encoder | QNN_DLC | float | Qualcomm® SA8775P | 13.979 ms | 2 - 108 MB | NPU
encoder | QNN_DLC | float | Qualcomm® QCS9075 | 15.257 ms | 2 - 12 MB | NPU
encoder | QNN_DLC | float | Qualcomm® QCS8450 (Proxy) | 19.596 ms | 0 - 306 MB | NPU
encoder | QNN_DLC | float | Qualcomm® SA7255P | 37.042 ms | 2 - 107 MB | NPU
encoder | QNN_DLC | float | Qualcomm® SA8295P | 19.704 ms | 2 - 232 MB | NPU
encoder | QNN_DLC | float | Snapdragon® 8 Elite For Galaxy Mobile | 5.46 ms | 2 - 113 MB | NPU
encoder | TFLITE | float | Snapdragon® 8 Elite Gen 5 Mobile | 4.074 ms | 0 - 110 MB | NPU
encoder | TFLITE | float | Snapdragon® 8 Elite Mobile | 5.205 ms | 6 - 118 MB | NPU
encoder | TFLITE | float | Snapdragon® 8 Gen 3 Mobile | 7.714 ms | 7 - 198 MB | NPU
encoder | TFLITE | float | Qualcomm® QCS8275 (Proxy) | 36.817 ms | 7 - 120 MB | NPU
encoder | TFLITE | float | Qualcomm® QCS8550 (Proxy) | 10.695 ms | 7 - 9 MB | NPU
encoder | TFLITE | float | Qualcomm® SA8775P | 13.51 ms | 7 - 111 MB | NPU
encoder | TFLITE | float | Qualcomm® QCS9075 | 13.937 ms | 6 - 65 MB | NPU
encoder | TFLITE | float | Qualcomm® QCS8450 (Proxy) | 19.779 ms | 7 - 311 MB | NPU
encoder | TFLITE | float | Qualcomm® SA7255P | 36.817 ms | 7 - 120 MB | NPU
encoder | TFLITE | float | Qualcomm® SA8295P | 19.074 ms | 8 - 236 MB | NPU
encoder | TFLITE | float | Snapdragon® 8 Elite For Galaxy Mobile | 5.205 ms | 6 - 118 MB | NPU
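Because the encoder and decoder are profiled separately, the table does not directly give the time to recognize one image. A rough estimate can be sketched under the assumption that the encoder runs once per image and the decoder runs once per generated token, which is how autoregressive encoder-decoder models are typically executed. The function name and the 20-token output length are hypothetical, and the estimate ignores pre-processing, post-processing, and host overhead.

```python
# Back-of-the-envelope latency for one image, using numbers from the table.
# Assumptions (not stated in the model card): one encoder pass per image,
# one decoder pass per generated token, no other overhead.
def estimate_latency_ms(encoder_ms: float, decoder_ms: float, num_tokens: int) -> float:
    """Estimate end-to-end inference time in milliseconds."""
    if num_tokens < 0:
        raise ValueError("num_tokens must be non-negative")
    return encoder_ms + decoder_ms * num_tokens


# TFLITE rows for Snapdragon 8 Elite Gen 5 Mobile: encoder 4.074 ms, decoder 1.116 ms.
# The output length of 20 tokens is an arbitrary illustrative value.
gen5_tflite_ms = estimate_latency_ms(encoder_ms=4.074, decoder_ms=1.116, num_tokens=20)

# Same runtime on Qualcomm QCS8275 (Proxy): encoder 36.817 ms, decoder 4.202 ms.
qcs8275_tflite_ms = estimate_latency_ms(encoder_ms=36.817, decoder_ms=4.202, num_tokens=20)

print(f"Snapdragon 8 Elite Gen 5 (TFLITE): ~{gen5_tflite_ms:.1f} ms")
print(f"QCS8275 (TFLITE): ~{qcs8275_tflite_ms:.1f} ms")
```

Under this model, the decoder dominates for longer text lines even though a single decoder pass is faster than the encoder pass, so the decoder rows matter more when comparing chipsets for long outputs.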

License

  • The license for the original implementation of TrOCR can be found here.
