Reference — Python API

Public surface of the ort-vision-sdk package (everything is importable directly from ort_vision_sdk).

Tasks

| Class | Description |
| --- | --- |
| `Classifier` | Image classification (model output of shape `(1, num_classes)`). |
| `Detector` | Object detection (anchor-free YOLO heads). |
| `Segmenter` | Instance segmentation (YOLO-seg heads). |
| `VisionTask` | Common base class (do not instantiate directly). |
| `DetectorHead` | Type naming the supported detection decoder families (e.g. `"yolo"`). |
| `SegmenterHead` | Type naming the supported segmentation decoder families (e.g. `"yolo-seg"`). |

Each task exposes three inference variants with the same signature: predict() (synchronous), async_predict() (offloads the blocking call to a worker thread via asyncio.to_thread), and ort_async_predict() (uses ONNX Runtime's InferenceSession.run_async). All three return a list[Results] with one entry per input image.
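
The asyncio.to_thread variant can be pictured with the sketch below. It does not import the SDK: StandInTask is a hypothetical stand-in, and both the list-of-paths input and the dict results are assumptions made only to keep the example runnable. It shows the general pattern of wrapping a blocking predict() so that several calls can be awaited from one event loop.

```python
import asyncio
import time


class StandInTask:
    """Hypothetical stand-in for a task class; not the SDK implementation."""

    def predict(self, images):
        # Simulate blocking work (for example an inference session run)
        # and return one result per input image.
        results = []
        for image in images:
            time.sleep(0.01)
            results.append({"source": image})
        return results

    async def async_predict(self, images):
        # Same signature as predict(); the blocking call runs in a worker
        # thread so the event loop stays free for other coroutines.
        return await asyncio.to_thread(self.predict, images)


async def main():
    task = StandInTask()
    # Two independent calls awaited concurrently from a single event loop.
    return await asyncio.gather(
        task.async_predict(["a.jpg", "b.jpg"]),
        task.async_predict(["c.jpg"]),
    )


first, second = asyncio.run(main())
print(len(first), len(second))
```

Because the work still happens in ordinary threads, this pattern mainly helps when the blocking call releases the GIL or when the event loop has other I/O to service in the meantime.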

Constructors (summary)

Classifier(model_path, *, labels=None, providers=None, session_options=None,
           input_size=(224, 224), mean=..., std=..., apply_softmax=True)

Detector(model_path, *, head="yolo", labels="coco", providers=None,
         session_options=None, input_size=(640, 640), conf_threshold=0.25,
         iou_threshold=0.45, max_detections=300)

Segmenter(model_path, *, head="yolo-seg", labels="coco", providers=None,
          session_options=None, input_size=(640, 640), conf_threshold=0.25,
          iou_threshold=0.45, max_detections=300, mask_threshold=0.5)

Detector.predict() and Segmenter.predict() accept per-call overrides: conf_threshold, iou_threshold, classes.
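
One plausible way to read "per-call overrides" is that a value passed to predict() takes precedence over the constructor default for that call only. The sketch below illustrates that fallback with a hypothetical StandInDetector that filters already-decoded candidates. The class, the Detection type, the raw argument, and the `>=` comparison are all assumptions; the SDK's actual post-processing may differ, and iou_threshold (which would feed non-maximum suppression) is left out for brevity.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Detection:
    """Hypothetical decoded candidate: a class index and a score."""
    class_id: int
    confidence: float


class StandInDetector:
    """Hypothetical stand-in showing only the override pattern."""

    def __init__(self, conf_threshold: float = 0.25):
        self.conf_threshold = conf_threshold

    def predict(
        self,
        raw: Sequence[Detection],
        *,
        conf_threshold: Optional[float] = None,
        classes: Optional[Sequence[int]] = None,
    ) -> list:
        # A per-call value wins; otherwise use the constructor default.
        threshold = self.conf_threshold if conf_threshold is None else conf_threshold
        keep = [d for d in raw if d.confidence >= threshold]
        # classes, when given, restricts the output to those class ids.
        if classes is not None:
            allowed = set(classes)
            keep = [d for d in keep if d.class_id in allowed]
        return keep


raw = [Detection(0, 0.9), Detection(2, 0.4), Detection(0, 0.3), Detection(5, 0.1)]
detector = StandInDetector()

default_run = detector.predict(raw)                     # constructor threshold
strict_run = detector.predict(raw, conf_threshold=0.5)  # override for this call
class_run = detector.predict(raw, classes=[0])          # keep class 0 only
print(len(default_run), len(strict_run), len(class_run))
```

The override does not mutate the detector, so a later call without arguments falls back to the constructor values again.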

Result envelopes

| Envelope | Bulk view | Iterating yields | Notable fields |
| --- | --- | --- | --- |
| `ClassificationResults` | `probs` | n/a (single result) | `cls`, `conf`, `name`, `probabilities` |
| `DetectionResults` | `boxes` | `DetectionResult` | `cls`, `conf`, `box.xyxy`, `cropped_image` |
| `SegmentationResults` | `boxes`, `masks` | `SegmentationResult` | `cls`, `conf`, `box.xyxy`, `mask`, `segmented_image` |

Every envelope also exposes names, orig_img, orig_shape, path, and an optional speed timings dict.

Bulk views (Ultralytics-style)

| Class | Attributes |
| --- | --- |
| `Boxes` | `xyxy`, `xywh`, `xyxyn`, `xywhn`, `cls`, `conf`, `data` |
| `Probs` | `top1`, `top5`, `top1conf`, `top5conf`, `data` |
| `Masks` | `data`, `xyxy` |
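
The four box attributes are different views of the same coordinates. The sketch below assumes the Ultralytics conventions that the section title refers to: xyxy holds the corner coordinates in pixels, xywh holds the box center plus width and height, and the trailing n marks values divided by the image width and height. The helper names are hypothetical, and plain tuples stand in for whatever array type the SDK actually returns.

```python
def xyxy_to_xywh(box):
    """Corner format (x1, y1, x2, y2) to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box, width, height):
    """Divide x-like entries by the image width and y-like entries by its height.

    Works for both layouts because entries alternate between x and y.
    """
    a, b, c, d = box
    return (a / width, b / height, c / width, d / height)


# A box in an image that is 400 pixels wide and 500 pixels high.
img_w, img_h = 400, 500
xyxy = (100, 50, 300, 250)

xywh = xyxy_to_xywh(xyxy)
xyxyn = normalize(xyxy, img_w, img_h)
xywhn = normalize(xywh, img_w, img_h)

print(xywh)
print(xyxyn)
print(xywhn)
```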

Per-instance types

| Type | Canonical fields | Ultralytics aliases |
| --- | --- | --- |
| `DetectionResult` | `class_id`, `class_name`, `confidence`, `bbox`, `cropped_image` | `cls`, `name`, `conf`, `box` |
| `SegmentationResult` | `DetectionResult` fields + `mask`, `segmented_image` | `cls`, `name`, `conf`, `box` |
| `ClassificationResult` | `class_id`, `class_name`, `confidence` | `cls`, `name`, `conf` |
| `ClassProbability` | `class_id`, `class_name`, `probability` | `cls`, `name` |
| `BoundingBox` | `x1`, `y1`, `x2`, `y2` + `xyxy` | n/a |
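
The canonical-field and alias pairing in the table can be modeled as read-only properties that forward to the canonical attribute. The sketch below is one way to do that, not the SDK's code: SketchBoundingBox and SketchDetection are hypothetical names, and cropped_image is omitted to keep the example free of image dependencies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchBoundingBox:
    """Hypothetical mirror of the BoundingBox row: four corners plus xyxy."""
    x1: float
    y1: float
    x2: float
    y2: float

    @property
    def xyxy(self):
        return (self.x1, self.y1, self.x2, self.y2)


@dataclass(frozen=True)
class SketchDetection:
    """Hypothetical mirror of the DetectionResult row (without cropped_image)."""
    class_id: int
    class_name: str
    confidence: float
    bbox: SketchBoundingBox

    # Ultralytics-style aliases: each one forwards to a canonical field.
    @property
    def cls(self):
        return self.class_id

    @property
    def name(self):
        return self.class_name

    @property
    def conf(self):
        return self.confidence

    @property
    def box(self):
        return self.bbox


det = SketchDetection(
    class_id=16,
    class_name="dog",
    confidence=0.87,
    bbox=SketchBoundingBox(10.0, 20.0, 110.0, 220.0),
)
print(det.cls, det.name, det.conf, det.box.xyxy)
```

Keeping the aliases as properties means there is a single stored value per field, so the two spellings cannot drift apart.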

Images and labels

| Symbol | Description |
| --- | --- |
| `load_image(image)` | Loads any supported input into an HWC uint8 RGB `ndarray`. |
| `ImageInput` | Union type of the inputs accepted by `predict()`. |
| `ImageArray` | Alias for the HWC uint8 RGB `ndarray`. |
| `resolve_labels(spec, ...)` | Resolves a `LabelSpec` into `dict[int, str]`. |
| `LabelSpec` | Union type accepted by `labels=` (preset, list, dict, path, None). |
| `COCO_CLASSES` | Tuple with the 80 classes of the COCO preset. |
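
To make the `dict[int, str]` target shape concrete, the sketch below normalizes two of the listed spec kinds, a list and a dict. sketch_resolve_labels is a hypothetical function, not the SDK's resolve_labels. The preset, path, and None cases are deliberately left out because this page does not describe how they behave.

```python
from collections.abc import Mapping


def sketch_resolve_labels(spec):
    """Hypothetical normalizer: list or dict in, dict[int, str] out."""
    if isinstance(spec, Mapping):
        # Coerce keys to int so string keys such as "1" become indices.
        return {int(key): str(value) for key, value in spec.items()}
    if isinstance(spec, (list, tuple)):
        # Position in the sequence becomes the class index.
        return dict(enumerate(str(name) for name in spec))
    raise TypeError(f"unsupported label spec in this sketch: {type(spec).__name__}")


from_list = sketch_resolve_labels(["cat", "dog"])
from_dict = sketch_resolve_labels({"0": "cat", "3": "bird"})
print(from_list)
print(from_dict)
```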

Source of truth

The full signatures, with types and docstrings, live in the source at sdk-python/src/ort_vision_sdk/. This page summarizes the public surface exported in __init__.py.