ort-vision-sdk

High-level SDKs for computer vision inference on top of ONNX Runtime. The repository ships two sibling packages that share the same task-oriented API (Classifier, Detector, Segmenter) and the same typed result shapes: one for Python (servers and scripts) and one for the browser (TypeScript).

Package	Registry	Directory	Install
`ort-vision-sdk`	PyPI	`sdk-python/`	`pip install ort-vision-sdk`
`@mauriciobenjamin700/ort-vision-sdk-web`	npm	`sdk-js-web/`	`npm install @mauriciobenjamin700/ort-vision-sdk-web onnxruntime-web`

Language

This documentation is bilingual. Use the language selector at the top of the page to switch between Português (BR) and English (US).

What it is

Using onnxruntime directly forces you to pick and configure execution providers, letterbox/resize/normalize/to_chw/batch your image, decode the model output (anchor grids, NMS, mask prototypes), map boxes back from the letterboxed input to the original image, and resolve class indices to human-readable labels — all repeated per task family.
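To make the letterbox and map-back steps concrete, here is a minimal sketch of the geometry in plain Python. It is not the SDK's implementation: the names `Letterbox`, `letterbox_params`, and `unletterbox_box` are hypothetical, and centered, unrounded padding is an assumption (real pipelines often round the padding to whole pixels).

```python
from typing import NamedTuple, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


class Letterbox(NamedTuple):
    """Hypothetical record of how an image was fitted into the model input."""
    scale: float  # resize factor applied to the original image
    pad_x: float  # padding added on the left, in model-input pixels
    pad_y: float  # padding added on the top, in model-input pixels


def letterbox_params(src_w: int, src_h: int, dst_w: int, dst_h: int) -> Letterbox:
    """Scale the image to fit inside the target while keeping its aspect ratio,
    then center it (assumption: padding is split evenly on both sides)."""
    scale = min(dst_w / src_w, dst_h / src_h)
    pad_x = (dst_w - src_w * scale) / 2
    pad_y = (dst_h - src_h * scale) / 2
    return Letterbox(scale, pad_x, pad_y)


def unletterbox_box(box: Box, lb: Letterbox, src_w: int, src_h: int) -> Box:
    """Map an xyxy box from letterboxed coordinates back to the original image,
    clamping the result to the image bounds."""
    def clamp(value: float, upper: int) -> float:
        return max(0.0, min(float(upper), value))

    x1, y1, x2, y2 = box
    return (
        clamp((x1 - lb.pad_x) / lb.scale, src_w),
        clamp((y1 - lb.pad_y) / lb.scale, src_h),
        clamp((x2 - lb.pad_x) / lb.scale, src_w),
        clamp((y2 - lb.pad_y) / lb.scale, src_h),
    )


# A 1280x720 frame fitted into a 640x640 model input.
lb = letterbox_params(1280, 720, 640, 640)
restored = unletterbox_box((0, 140, 640, 500), lb, 1280, 720)
print(lb)
print(restored)
```

The same scale and padding values have to be kept from preprocessing until after decoding, which is one reason this bookkeeping is easy to get wrong by hand.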

ort-vision-sdk does all of that for you and returns a typed result in the Ultralytics idiom (boxes.xyxy, cls, conf, names, ...) so existing code ports over with minimal edits. A single call takes you from a raw image (path, bytes, NumPy array, or PIL image in Python; URL, Blob, canvas, and so on in the browser) to a typed result.
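For readers unfamiliar with the NMS step named in the decode stage, the sketch below shows a plain greedy, class-agnostic non-maximum suppression. It only illustrates the technique: the function names `iou` and `nms` and the 0.5 threshold are assumptions, and whether or how the SDK applies NMS for a given model head is not stated here.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two xyxy boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box above the threshold.
    Returns the indices of the kept boxes, best score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


boxes = [(10, 10, 110, 110), (12, 12, 112, 112), (200, 200, 300, 300)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
print(kept)  # the second box overlaps the first heavily and is dropped
```

Production implementations are usually vectorized and often run per class; this quadratic version favors readability.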

What's in the box

Task	Class	Models supported
Classification	`Classifier`	Any ONNX classifier with output `(1, num_classes)` (torchvision-style)
Object detection	`Detector`	Anchor-free YOLO heads: v8, v9, v10, v11, v12, v26
Instance segmentation	`Segmenter`	YOLO-seg heads: v8-seg, v11-seg, v26-seg (+ prototypes)
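As an illustration of what handling a `(1, num_classes)` classifier output involves, the sketch below turns one row of scores into ranked labels. It assumes the model emits raw logits (some exports already apply softmax), and the helper names `softmax` and `top_k` as well as the label list are hypothetical, not part of the SDK.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(output: Sequence[Sequence[float]], names: Sequence[str],
          k: int = 3) -> List[Tuple[str, float]]:
    """Take an output of shape (1, num_classes), drop the batch dimension,
    and return the k most probable (label, probability) pairs."""
    probs = softmax(output[0])
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    return [(names[i], probs[i]) for i in ranked]


# Toy output for a three-class model (values are made up for illustration).
output = [[1.0, 3.0, 0.5]]
names = ["cat", "dog", "bird"]
top = top_k(output, names, k=2)
for label, prob in top:
    print(f"{label}: {prob:.3f}")
```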

All three tasks return the same envelope shape, a list (list[Results] in Python, Results[] on the web) with one entry per input image, so you can switch tasks without rewriting the code that consumes the result.

Quick install

Python

pip install ort-vision-sdk            # CPU only (default)
pip install "ort-vision-sdk[gpu]"     # adds onnxruntime-gpu (CUDA / TensorRT)
pip install "ort-vision-sdk[opencv]"  # adds the OpenCV image backend

Requires Python 3.10+.

Web (browser)

npm install @mauriciobenjamin700/ort-vision-sdk-web onnxruntime-web

onnxruntime-web is a peer dependency — you bring your own version and ship the matching .wasm files.

First steps

Python

from ort_vision_sdk import Detector

det = Detector("yolov8n.onnx")          # labels="coco" by default
result = det.predict("street.jpg")[0]   # list[DetectionResults], length 1
for d in result:
    print(d.name, d.conf, d.box.xyxy)

Web (browser)

import { Detector } from "@mauriciobenjamin700/ort-vision-sdk-web";

const det = await Detector.create("/models/yolov8n.onnx");
const result = (await det.predict("/images/street.jpg"))[0];
for (const d of result) {
  console.log(d.className, d.confidence, d.bbox.asXyxy());
}

Continue with Installation and Quick start.

Status

Alpha — the public API is stable enough to build against, but minor versions may introduce breaking changes until 1.0. Pin the version range you build against.

Source code & issues: https://github.com/mauriciobenjamin700/ort-vision-sdk
Python package: https://pypi.org/project/ort-vision-sdk/
Web package: https://www.npmjs.com/package/@mauriciobenjamin700/ort-vision-sdk-web

License

MIT.