Throughout our APIs and this documentation, you will see frequent reference to jobs, tasks, and TDOs (Temporal Data Objects). These are the workhorse objects of the aiWARE platform. Here's a quick look at each of these, in reverse order:
TDO
A TDO is an all-purpose container object, aggregating information about jobs, assets, and temporal data, among other things. Important facts about TDOs:
- Generally, the lifecycle of a TDO is managed manually. Although some engines may create a TDO on their own, more commonly you create a TDO and submit it when kicking off a job with createJob().
- When you no longer need a TDO, you can delete it and/or its contents programmatically. Otherwise, it lives forever.
- TDOs you create are generally visible (and thus usable) only by members of your organization.
- You will often create an empty TDO programmatically and then run an ingestion task on it to populate it with a media asset.
- When processing a media file referenced in your TDO, an engine will produce its own output (e.g., transcription output) in the form of an AION asset, which will be attached to your TDO by reference.
- A TDO can contain multiple assets of multiple types.
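As a minimal sketch of creating an empty TDO programmatically (the timestamp values are illustrative; startDateTime and stopDateTime are Unix timestamps), the mutation looks like this:

```graphql
mutation createEmptyTDO {
  createTDO(input: {
    startDateTime: 1611600000
    stopDateTime: 1611600300
  }) {
    id
    status
  }
}
```

The id returned here is what you would pass as the target when you later run an ingestion or processing job against this TDO.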
Task
The task is the smallest unit of work in aiWARE. Important facts to know about tasks are:
- A task specifies an engine that will be run against a TDO.
- Tasks are run as part of a job (see below).
- A task can be queried at any time using the GraphQL task() method.
- The possible status values that a task can have are shown below.
enum TaskStatus {
pending
running
complete
queued
accepted
failed
cancelled
standby_pending
waiting
resuming
aborted
paused
}
[Note] If a task finishes with a status of aborted or failed, it will cause the job it is part of to finish with a status of failed.
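For example, a sketch of checking a task's status with the task() query (the task ID here is a placeholder):

```graphql
query checkTaskStatus {
  task(id: "your-task-id") {
    id
    status
  }
}
```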
Job
The job is a higher-level unit of work that wraps one or more tasks.
[Note] If you need to aggregate jobs into an even higher-level unit of work, consider using Veritone Automate Studio to create a multi-job workflow.
Important facts to know about jobs are:
- You can create and queue (and thus essentially launch, immediately and asynchronously) a job using the GraphQL createJob() method.
- A job needs to operate against a TDO. You should specify the TDO's ID in the targetId property when you call createJob().
- The order in which you list tasks in your call to createJob() is important. If your job needs to ingest a media file, the ingestion-engine task should be the first task in your list of tasks.
- You should check a job's status using the job ID returned by createJob(). (See Run a job using GraphQL for an example of how to do this.)
- A job can have any of the status values shown below.
enum JobStatus {
pending
complete
running
cancelled
queued
failed
}
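To check a job's status (and the status of each of its tasks), you can run a query along these lines; the job ID is a placeholder:

```graphql
query checkJobStatus {
  job(id: "your-job-id") {
    id
    status
    tasks {
      records {
        id
        status
      }
    }
  }
}
```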
Ingestion tasks
Ingestion is the process by which an adapter brings media files into a Data Center, digital asset management (DAM), or media asset management (MAM) system. When a file is ingested, it's generally copied to a secure location, registered with the host system, and optionally chunked, transcoded, tagged, indexed, thumbnailed, and/or subjected to other "normalizing" operations, such that the system can operate on all ingested files reliably, using the same APIs, with the same expectations, no matter where a file originally came from. In aiWARE, a file can undergo cognitive processing if and only if it has been ingested. The two most common ways to ingest a media file for processing in aiWARE are:
- Create a TDO and pull the media asset into it, in one operation, using createTDOWithAsset(). See Create TDO and upload asset for more information.
- Create a TDO manually and then run an ingestion job on it using createJob() in conjunction with an appropriate ingestion engine (also called an adapter).
aiWARE offers many ready-to-use ingestion engines tailored to various intake scenarios, such as pulling videos (or other files) from YouTube, Google Drive, Dropbox, etc. To see a list of the available ingestion engines (adapters), run the following GraphQL query:
Example
query listIngestionEngines {
engines(filter: {
type: Ingestion
}) {
records {
name
id
}
}
}
[Note] You'll commonly use the Webstream Adapter — with ID "9e611ad7-2d3b-48f6-a51b-0a1ba40feab4" — to pull files from public URIs.
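For example, a sketch of a createJob() call that runs the Webstream Adapter as the first (ingestion) task, followed by a processing-engine task. The target ID, URL, and second engine ID are illustrative, and the payload fields an adapter accepts vary by adapter:

```graphql
mutation createIngestionJob {
  createJob(input: {
    targetId: "1100548727"
    tasks: [
      {
        engineId: "9e611ad7-2d3b-48f6-a51b-0a1ba40feab4"
        payload: { url: "https://example.com/video.mp4" }
      },
      {
        engineId: "c0e55cde-340b-44d7-bb42-2e0d65e98255"
      }
    ]
  }) {
    id
    targetId
    status
  }
}
```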
Example jobs
This example assumes that you have already created a TDO with ID 1100548727.
Example
mutation {
launchSingleEngineJob(
input: {
targetId: "1100548727"
engineId:"c0e55cde-340b-44d7-bb42-2e0d65e98255"
fields:[]
}
) {
id
targetId
status
}
}
To launch a job against a file on the web, run a mutation that looks like this:
Example
mutation {
launchSingleEngineJob(
input: {
uploadUrl:"https://s3-wzd-dv-fulfill-or-1.s3-us-west-2.amazonaws.com/HitchhikersGuide.mp4"
engineId:"c0e55cde-340b-44d7-bb42-2e0d65e98255"
fields:[]
}
) {
id
targetId
status
}
}
In both of the above mutations, the job runs on your organization's default cluster. It's important to understand how clusters work in aiWARE: in this context, "cluster" means a cluster of AI processing controllers -- essentially, an AI processing instance. The clusterId field tells aiWARE where to send the work. To run the job on a particular cluster, specify its ID like this:
Example
mutation {
launchSingleEngineJob(
input: {
uploadUrl:"https://s3-wzd-dv-fulfill-or-1.s3-us-west-2.amazonaws.com/HitchhikersGuide.mp4"
engineId:"c0e55cde-340b-44d7-bb42-2e0d65e98255"
fields: [
{ fieldName:"clusterId", fieldValue:"rt-1cdc1d6d-a500-467a-bc46-d3c5bf3d6901" }
]
}
) {
id
targetId
status
}
}
Configuration fields required by an engine can also be set this way.
Example
fields: [
{ fieldName:"inputIsImage", fieldValue:"true" },
{ fieldName:"minConfidence", fieldValue:"0.5" },
]
In the same way, you can specify any special parameters that a cognition engine needs to know about (e.g., diarise and automaticPunctuation for a transcription engine).
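For example, a transcription engine that supports those parameters might be launched like this (whether a given engine honors diarise and automaticPunctuation depends on the engine; check its documentation):

```graphql
mutation launchTranscriptionJob {
  launchSingleEngineJob(
    input: {
      targetId: "1100548727"
      engineId: "c0e55cde-340b-44d7-bb42-2e0d65e98255"
      fields: [
        { fieldName: "diarise", fieldValue: "true" },
        { fieldName: "automaticPunctuation", fieldValue: "true" }
      ]
    }
  ) {
    id
    status
  }
}
```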