# Interpret prediction results from video object tracking models
After requesting a prediction, Vertex AI returns results based on your
model's objective. Predictions from an object tracking model return the times
and locations of tracked objects, labeled according to the labels that you
defined. The model assigns a confidence score to each prediction, which
indicates how confident the model is that it correctly identified and tracked
an object. The higher the number, the greater the model's confidence in the
correctness of the prediction.
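In practice, a common first step is to discard tracks whose score falls below an application-specific threshold. The following sketch is illustrative only, not part of the Vertex AI SDK; it assumes predictions are plain dictionaries in the batch output format shown in the next section, and the 0.5 cutoff is a hypothetical starting point:

```
# A minimal sketch: keep only tracks whose confidence clears a cutoff.
# The 0.5 threshold is a hypothetical default; tune it against your own
# precision/recall requirements.
def filter_tracks(predictions, threshold=0.5):
    return [track for track in predictions if track["confidence"] >= threshold]
```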
#### Example batch prediction output
The following sample is the predicted result for a model that tracks
cats and dogs in a video. Each result includes a label (`cat` or
`dog`) for the object being tracked, a time segment that specifies
when and for how long the object is being tracked, and a bounding box that
describes the location of the object.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-08-29 UTC."],[],[],null,["# Interpret prediction results from video object tracking models\n\nAfter requesting a prediction, Vertex AI returns results based on your model's objective. Predictions from an object tracking model return time and locations of objects to track, according to your own defined labels. The model assigns a confidence score to each prediction, which communicates how confident your model accurately identified and tracked an object. The higher the number, the higher the model's confidence in the correctness of the prediction.\n\n\u003cbr /\u003e\n\n#### Example batch prediction output\n\nThe following sample is the predicted result for a model that tracks\ncats and dogs in a video. Each result includes a label (`cat` or\n`dog`) for the object being tracked, a time segment that specifies\nwhen and for how long the object is being tracked, and a bounding box that\ndescribes the location of the object.\n\n\n| **Note**: The following JSON Lines example includes line breaks for\n| readability. In your JSON Lines files, line breaks are included only after each\n| each JSON object.\n\n\u003cbr /\u003e\n\n\n```\n{\n \"instance\": {\n \"content\": \"gs://bucket/video.mp4\",\n \"mimeType\": \"video/mp4\",\n \"timeSegmentStart\": \"1s\",\n \"timeSegmentEnd\": \"5s\"\n }\n \"prediction\": [{\n \"id\": \"1\",\n \"displayName\": \"cat\",\n \"timeSegmentStart\": \"1.2s\",\n \"timeSegmentEnd\": \"3.4s\",\n \"frames\": [{\n \"timeOffset\": \"1.2s\",\n \"xMin\": 0.1,\n \"xMax\": 0.2,\n \"yMin\": 0.3,\n \"yMax\": 0.4\n }, {\n \"timeOffset\": \"3.4s\",\n \"xMin\": 0.2,\n \"xMax\": 0.3,\n \"yMin\": 0.4,\n \"yMax\": 0.5,\n }],\n \"confidence\": 0.7\n }, {\n \"id\": \"1\",\n \"displayName\": \"cat\",\n \"timeSegmentStart\": \"4.8s\",\n \"timeSegmentEnd\": \"4.8s\",\n \"frames\": [{\n \"timeOffset\": \"4.8s\",\n \"xMin\": 0.2,\n \"xMax\": 0.3,\n \"yMin\": 0.4,\n \"yMax\": 0.5,\n }],\n \"confidence\": 0.6\n }, {\n \"id\": \"2\",\n \"displayName\": \"dog\",\n \"timeSegmentStart\": \"1.2s\",\n \"timeSegmentEnd\": \"3.4s\",\n \"frames\": [{\n \"timeOffset\": \"1.2s\",\n \"xMin\": 0.1,\n \"xMax\": 0.2,\n \"yMin\": 0.3,\n \"yMax\": 0.4\n }, {\n \"timeOffset\": \"3.4s\",\n \"xMin\": 0.2,\n \"xMax\": 0.3,\n \"yMin\": 0.4,\n \"yMax\": 0.5,\n }],\n \"confidence\": 0.5\n }]\n}\n```"]]