ort-vision-sdk

High-level SDKs for computer vision inference on top of ONNX Runtime. The repository ships two sibling packages, one for Python (servers and scripts) and one for the browser (TypeScript). Both expose the same task-oriented API (Classifier, Detector, Segmenter) and the same typed result shapes.

| Package | Registry | Directory | Install |
| --- | --- | --- | --- |
| ort-vision-sdk | PyPI | sdk-python/ | pip install ort-vision-sdk |
| @mauriciobenjamin700/ort-vision-sdk-web | npm | sdk-js-web/ | npm install @mauriciobenjamin700/ort-vision-sdk-web onnxruntime-web |

Language

This documentation is bilingual. Use the language selector at the top of the page to switch between Português (BR) and English (US).

What it is

Using onnxruntime directly forces you to pick and configure execution providers, letterbox/resize/normalize/to_chw/batch your image, decode the model output (anchor grids, NMS, mask prototypes), map boxes back from the letterboxed input to the original image, and resolve class indices to human-readable labels — all repeated per task family.
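To make the letterboxing and box-mapping steps concrete, here is a minimal pure-Python sketch of the geometry involved. It is not the SDK's code: the names `Letterbox`, `letterbox_params`, and `unletterbox_box` are hypothetical, the 640x640 input size is an assumption, and real pipelines usually round the resized dimensions to whole pixels.

```python
from typing import NamedTuple


class Letterbox(NamedTuple):
    """Geometry of an aspect-preserving resize with centered padding."""
    scale: float  # factor applied to the original image
    pad_x: float  # padding added on the left, in input pixels
    pad_y: float  # padding added on the top, in input pixels


def letterbox_params(orig_w: int, orig_h: int,
                     input_w: int = 640, input_h: int = 640) -> Letterbox:
    # Use the smaller ratio so the whole image fits inside the model input.
    scale = min(input_w / orig_w, input_h / orig_h)
    new_w, new_h = orig_w * scale, orig_h * scale
    # Split the leftover space evenly on both sides.
    return Letterbox(scale, (input_w - new_w) / 2, (input_h - new_h) / 2)


def unletterbox_box(box: tuple[float, float, float, float], lb: Letterbox,
                    orig_w: int, orig_h: int) -> tuple[float, float, float, float]:
    """Map an xyxy box from model-input pixels back to original-image pixels."""
    def clamp(value: float, upper: float) -> float:
        return max(0.0, min(value, upper))

    x1, y1, x2, y2 = box
    return (
        clamp((x1 - lb.pad_x) / lb.scale, orig_w),
        clamp((y1 - lb.pad_y) / lb.scale, orig_h),
        clamp((x2 - lb.pad_x) / lb.scale, orig_w),
        clamp((y2 - lb.pad_y) / lb.scale, orig_h),
    )


# A 1280x720 frame is scaled by 0.5 and padded by 140 px at the top and bottom.
lb = letterbox_params(1280, 720)
print(lb)
print(unletterbox_box((100, 200, 300, 400), lb, 1280, 720))
```

Forgetting to subtract the padding before dividing by the scale is a common source of boxes that drift away from their objects on non-square images.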

ort-vision-sdk does all of that for you and returns a typed result in the Ultralytics idiom (boxes.xyxy, cls, conf, names, ...) so existing code ports over with minimal edits. It takes you from a raw image (path, bytes, NumPy array, or PIL image in Python; URL/Blob/canvas/etc. in the browser) to a typed result in one call.
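As an illustration of the decoding work that gets hidden, the following is a sketch of textbook greedy non-maximum suppression (NMS) over xyxy boxes. It is class-agnostic and written in plain Python for readability; the function names and the 0.45 IoU threshold are assumptions, and the SDK's own implementation may differ.

```python
Box = tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two xyxy boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: list[Box], scores: list[float],
        iou_threshold: float = 0.45) -> list[int]:
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: list[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a kept one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # the second box overlaps the first and is dropped
```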

What's in the box

| Task | Class | Models supported |
| --- | --- | --- |
| Classification | Classifier | Any ONNX classifier with output (1, num_classes) (torchvision-style) |
| Object detection | Detector | Anchor-free YOLO heads: v8, v9, v10, v11, v12, v26 |
| Instance segmentation | Segmenter | YOLO-seg heads: v8-seg, v11-seg, v26-seg (+ prototypes) |

All three tasks return the same envelope shape: a list (list[Results] in Python, Results[] in Web) with one entry per input image, so a single image yields a list of length 1. You can therefore switch tasks without rewriting the code that consumes the result.
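Because the envelope is the same for every task, the unwrapping step can be written once. The sketch below assumes only that `predict` returns a sequence with one item for a single image; `single` is a hypothetical helper, not part of the SDK.

```python
from typing import Sequence, TypeVar

T = TypeVar("T")


def single(results: Sequence[T]) -> T:
    """Unwrap a one-image envelope, failing loudly on any other length."""
    if len(results) != 1:
        raise ValueError(f"expected exactly 1 result, got {len(results)}")
    return results[0]


# Stand-in values; with the SDK this would be e.g. single(det.predict("street.jpg")).
print(single(["only-result"]))
```

An explicit check like this tends to surface mistakes earlier than a bare `[0]`, which would silently discard extra entries.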

Quick install

pip install ort-vision-sdk            # CPU only (default)
pip install "ort-vision-sdk[gpu]"     # adds onnxruntime-gpu (CUDA / TensorRT)
pip install "ort-vision-sdk[opencv]"  # adds the OpenCV image backend

Requires Python 3.10+.

npm install @mauriciobenjamin700/ort-vision-sdk-web onnxruntime-web

onnxruntime-web is a peer dependency — you bring your own version and ship the matching .wasm files.

First steps

Python:

from ort_vision_sdk import Detector

det = Detector("yolov8n.onnx")          # labels="coco" by default
result = det.predict("street.jpg")[0]   # list[DetectionResults], length 1
for d in result:
    print(d.name, d.conf, d.box.xyxy)

Web (TypeScript):

import { Detector } from "@mauriciobenjamin700/ort-vision-sdk-web";

const det = await Detector.create("/models/yolov8n.onnx");
const result = (await det.predict("/images/street.jpg"))[0];
for (const d of result) {
  console.log(d.className, d.confidence, d.bbox.asXyxy());
}
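A common next step is to drop low-confidence detections before using them. The sketch below relies only on the `name` and `conf` attributes shown in the Python snippet; `keep_confident` is a hypothetical helper, and the stand-in objects replace a real result so the example runs without a model file.

```python
from types import SimpleNamespace


def keep_confident(detections, min_conf: float = 0.5) -> list:
    """Return the detections whose confidence is at least min_conf."""
    return [d for d in detections if d.conf >= min_conf]


# Stand-ins for the items yielded by iterating a detection result.
fake_result = [
    SimpleNamespace(name="person", conf=0.91),
    SimpleNamespace(name="dog", conf=0.32),
]
print([d.name for d in keep_confident(fake_result)])  # prints ['person']
```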

Continue with Installation and Quick start.

Status

Alpha — the public API is stable enough to build against, but minor versions may introduce breaking changes until 1.0. Pin the version range you build against.

License

MIT.