Reference — Python API
Public surface of the ort-vision-sdk package (everything is importable
directly from ort_vision_sdk).
Tasks
| Class | Description |
|---|---|
| Classifier | Image classification (output (1, num_classes)). |
| Detector | Object detection (anchor-free YOLO heads). |
| Segmenter | Instance segmentation (YOLO-seg heads). |
| VisionTask | Common base class (do not instantiate directly). |
| DetectorHead | Type for the detection decoder family (e.g. "yolo"). |
| SegmenterHead | Type for the segmentation decoder family (e.g. "yolo-seg"). |
Each task exposes three inference variants with the same signature:
predict(), async_predict() (runs via asyncio.to_thread), and ort_async_predict()
(uses InferenceSession.run_async). All three return a list[Results] with one
entry per input image.
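The asyncio.to_thread note suggests that async_predict() offloads the blocking call to a worker thread. The sketch below illustrates that general pattern only; `blocking_predict` is a hypothetical stand-in, not the SDK's predict(), and the string results are placeholders.

```python
import asyncio
import time


def blocking_predict(image_id: int) -> list[str]:
    """Hypothetical stand-in for a synchronous predict() call."""
    time.sleep(0.01)  # simulate blocking inference work
    return [f"results-for-image-{image_id}"]


async def async_predict(image_id: int) -> list[str]:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(blocking_predict, image_id)


async def predict_many(image_ids: list[int]) -> list[list[str]]:
    # gather() returns results in the same order as its inputs.
    return await asyncio.gather(*(async_predict(i) for i in image_ids))


batches = asyncio.run(predict_many([0, 1, 2]))
```

Because the work runs in threads, this pattern helps most when the underlying call releases the GIL, which native inference runtimes typically do.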
Constructors (summary)
Classifier(model_path, *, labels=None, providers=None, session_options=None,
input_size=(224, 224), mean=..., std=..., apply_softmax=True)
Detector(model_path, *, head="yolo", labels="coco", providers=None,
session_options=None, input_size=(640, 640), conf_threshold=0.25,
iou_threshold=0.45, max_detections=300)
Segmenter(model_path, *, head="yolo-seg", labels="coco", providers=None,
session_options=None, input_size=(640, 640), conf_threshold=0.25,
iou_threshold=0.45, max_detections=300, mask_threshold=0.5)
Detector.predict() and Segmenter.predict() accept per-call overrides:
conf_threshold, iou_threshold, classes.
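To make the three overrides concrete, here is a minimal pure-Python sketch of how such parameters are commonly applied: a confidence cut, an optional class filter, then greedy non-maximum suppression. It is an illustration under assumptions, not the SDK's decoder; in particular, the per-class NMS and the `(box, conf, cls)` tuple layout are choices made for this example.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def filter_detections(dets, conf_threshold=0.25, iou_threshold=0.45,
                      classes=None, max_detections=300):
    """Filter (box, conf, cls) tuples, then apply greedy per-class NMS."""
    candidates = [
        d for d in dets
        if d[1] >= conf_threshold and (classes is None or d[2] in classes)
    ]
    # Highest confidence first, so stronger boxes suppress weaker ones.
    candidates.sort(key=lambda d: d[1], reverse=True)
    kept = []
    for box, conf, cls in candidates:
        if len(kept) >= max_detections:
            break
        # Assumption: boxes only suppress others of the same class.
        if all(k[2] != cls or iou(box, k[0]) <= iou_threshold for k in kept):
            kept.append((box, conf, cls))
    return kept


dets = [
    ((0, 0, 10, 10), 0.9, 0),
    ((1, 1, 11, 11), 0.8, 0),    # overlaps the first box heavily
    ((50, 50, 60, 60), 0.7, 1),
    ((0, 0, 10, 10), 0.1, 0),    # below the default confidence threshold
]
kept = filter_detections(dets)
only_class_1 = filter_detections(dets, classes=[1])
```

Raising iou_threshold keeps more overlapping boxes; lowering conf_threshold admits weaker candidates before suppression runs.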
Result envelopes
| Envelope | Bulk view | Iterating yields | Notable fields |
|---|---|---|---|
| ClassificationResults | probs | n/a (single result) | cls, conf, name, probabilities |
| DetectionResults | boxes | DetectionResult | cls, conf, box.xyxy, cropped_image |
| SegmentationResults | boxes, masks | SegmentationResult | cls, conf, box.xyxy, mask, segmented_image |
Every envelope also exposes names, orig_img, orig_shape, path, and an
optional speed timings dict.
Bulk views (Ultralytics-style)
| Class | Attributes |
|---|---|
| Boxes | xyxy, xywh, xyxyn, xywhn, cls, conf, data |
| Probs | top1, top5, top1conf, top5conf, data |
| Masks | data, xyxy |
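The attribute names above follow Ultralytics conventions. The sketch below shows how those box formats and top-k indices relate to each other, assuming the Ultralytics meaning of each name: xywh is center x, center y, width, height, and the trailing n means normalized by the original image size. The helper names and the (height, width) shape order are assumptions for illustration, not SDK functions.

```python
def xyxy_to_xywh(box):
    """Corner format (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize_xyxy(box, orig_shape):
    """Scale pixel corners into [0, 1]; orig_shape is assumed (height, width)."""
    height, width = orig_shape
    x1, y1, x2, y2 = box
    return (x1 / width, y1 / height, x2 / width, y2 / height)


def top_k(probabilities, k=5):
    """Indices of the k largest probabilities, highest first."""
    order = sorted(range(len(probabilities)),
                   key=lambda i: probabilities[i], reverse=True)
    return order[:k]


box = (100, 50, 300, 250)
xywh = xyxy_to_xywh(box)                  # center-based form
xyxyn = normalize_xyxy(box, (500, 1000))  # image is 500 high, 1000 wide
best_two = top_k([0.1, 0.6, 0.3], k=2)    # analogous to top1 / top5
```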
Per-instance types
| Type | Canonical fields | Ultralytics aliases |
|---|---|---|
| DetectionResult | class_id, class_name, confidence, bbox, cropped_image | cls, name, conf, box |
| SegmentationResult | + mask, segmented_image | cls, name, conf, box |
| ClassificationResult | class_id, class_name, confidence | cls, name, conf |
| ClassProbability | class_id, class_name, probability | cls, name |
| BoundingBox | x1, y1, x2, y2 + xyxy | n/a |
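One common way to offer canonical fields alongside short aliases is a dataclass with read-only properties. The sketch below shows that pattern with the field names from the table; the class names `BoundingBoxSketch` and `DetectionSketch` are hypothetical, and the SDK's real types may be implemented differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBoxSketch:
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def xyxy(self) -> tuple[float, float, float, float]:
        # Convenience view over the four corner coordinates.
        return (self.x1, self.y1, self.x2, self.y2)


@dataclass(frozen=True)
class DetectionSketch:
    class_id: int
    class_name: str
    confidence: float
    bbox: BoundingBoxSketch

    # Aliases: each property forwards to its canonical field.
    @property
    def cls(self) -> int:
        return self.class_id

    @property
    def name(self) -> str:
        return self.class_name

    @property
    def conf(self) -> float:
        return self.confidence

    @property
    def box(self) -> BoundingBoxSketch:
        return self.bbox


det = DetectionSketch(
    class_id=16,
    class_name="dog",
    confidence=0.87,
    bbox=BoundingBoxSketch(1.0, 2.0, 3.0, 4.0),
)
```

Keeping the aliases as properties means there is a single stored value per field, so the canonical and alias names cannot drift apart.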
Images and labels
| Symbol | Description |
|---|---|
| load_image(image) | Loads any supported input into an HWC uint8 RGB ndarray. |
| ImageInput | Union type of the inputs accepted by predict(). |
| ImageArray | Alias for the HWC uint8 RGB ndarray. |
| resolve_labels(spec, ...) | Resolves a LabelSpec into dict[int, str]. |
| LabelSpec | Union type accepted by labels= (preset, list, dict, path, None). |
| COCO_CLASSES | Tuple with the 80 classes of the COCO preset. |
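As a rough illustration of what resolving a label spec into dict[int, str] can involve, the sketch below handles a preset name, a list, a dict, and None. It is not the SDK's resolve_labels: the function name, the `PRESETS` table with its two-class "demo" entry, and the choice to return an empty dict for None are all assumptions, and path inputs are left out because the file format is not described here.

```python
from collections.abc import Mapping

# Hypothetical preset table for the example; the real COCO preset has 80 classes.
PRESETS = {"demo": ("cat", "dog")}


def resolve_labels_sketch(spec):
    """Normalize several label spec shapes into a dict[int, str]."""
    if spec is None:
        # Assumption: no labels means an empty mapping.
        return {}
    if isinstance(spec, str):
        if spec not in PRESETS:
            raise KeyError(f"unknown preset: {spec!r}")
        return dict(enumerate(PRESETS[spec]))
    if isinstance(spec, Mapping):
        # Coerce keys to int so JSON-style string keys also work.
        return {int(k): str(v) for k, v in spec.items()}
    # Any other iterable is treated as names ordered by class index.
    return dict(enumerate(str(name) for name in spec))


from_list = resolve_labels_sketch(["person", "bicycle"])
from_dict = resolve_labels_sketch({"3": "car"})
from_preset = resolve_labels_sketch("demo")
```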
Source of truth
The full signatures, with types and docstrings, live in the source at
sdk-python/src/ort_vision_sdk/.
This page summarizes the public surface exported in __init__.py.