Engines are processing units that accept data, such as audio, video, images, and text. As the main unit of cognitive computing in aiWARE, engines process the data using algorithms and machine learning techniques. The extracted output can be used to generate accurate predictions and insights, and automate many tasks.
How engines work
Engines are packaged and deployed as Docker images. Engines built in Automate Studio run on Node-RED.
Each Docker image is an executable package that includes everything needed to run your engine: the code, a runtime, libraries, environment variables, and configuration files. If your engine can process a file in chunks, without depending on the order of those chunks, aiWARE can scale your processing horizontally and automatically.
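The order-independence requirement can be illustrated with a small sketch (the function names here are hypothetical, not the aiWARE API): if each chunk is handled by a stateless function, the chunks can run on many workers in any order, which is exactly what lets aiWARE scale the same pattern across engine instances.

```python
from concurrent.futures import ThreadPoolExecutor

def process_chunk(chunk: bytes) -> dict:
    # Hypothetical stand-in for an engine's per-chunk logic,
    # e.g. transcribing one WAV segment. It must not depend on
    # any other chunk, so chunks can be processed in any order.
    return {"bytes": len(chunk)}

def process_file(chunks: list[bytes]) -> list[dict]:
    # Because process_chunk is stateless, the pool may execute
    # the chunks concurrently and in any order; map still
    # returns results in the original chunk order.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(process_chunk, chunks))
```

A stateful engine could not be parallelized this way, which is why aiWARE distinguishes the two kinds of engine below.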
Stateful engines that require contextual knowledge based on the order of data frames are called stream engines. In aiWARE, stateless engines that can be scaled horizontally are called segment engines (or "chunk" engines).
Engine types
There are three main types of engines in aiWARE:
- Adapters (also known as ingestion engines)
- Cognitive engines
- Aggregator engines
Adapters (ingestion engines)
Adapters bring data from other sources into aiWARE. The data can either be a real-time stream or a bounded file, and can be structured or unstructured. Once the data is in aiWARE, it can be processed by cognitive engines to derive insights for your end users.
Cognitive engines
Cognitive engines process data brought in by adapters, employing sophisticated algorithms and machine learning techniques to produce additional data from which you can derive actionable insights. Examples of what a cognitive engine does include natural language processing, transcription, and object detection. You can build a workflow of cognitive engines, run sequentially or in parallel, each one enhancing the target output data set. For example, a workflow could include the following engines:
- Ingest video stream (adapter)
- Transcribe video to text (cognitive)
- Translate to another language (cognitive)
- Do sentiment analysis (cognitive)
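Conceptually, a sequential workflow like the one above is a composition of processing steps, each consuming the previous step's output. A minimal sketch (all function names and return values are hypothetical placeholders, not the aiWARE workflow API):

```python
def transcribe(video: dict) -> str:
    # Hypothetical cognitive step: video -> transcript text.
    return video["audio_text"]

def translate(text: str) -> str:
    # Hypothetical cognitive step: placeholder "translation"
    # standing in for a real translation engine.
    return text.upper()

def sentiment(text: str) -> str:
    # Hypothetical cognitive step: text -> sentiment label.
    return "positive" if "GOOD" in text else "neutral"

def run_workflow(video: dict) -> str:
    # Sequential workflow: each engine enhances the output of
    # the previous one, as in the example steps above.
    return sentiment(translate(transcribe(video)))
```

In aiWARE the chaining is handled by the platform rather than by your code, but the data-flow shape is the same.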
Aggregator engines
Aggregator engines consolidate the outputs from all the cognitive engines run within a job and create a new output data set for use in aiWARE. While the inputs to a cognitive engine come from a single source (usually the ingested data from an adapter or the output of another cognitive engine), the inputs to an aggregator engine come from multiple sources. An aggregator can be either intracategory (processing results from a single cognitive capability) or intercategory (working across two or more capabilities).
Below is a sample workflow that includes an intracategory aggregator engine at the end.
- Ingest audio stream (adapter)
- Transcribe audio using cloud engine provider A (cognitive)
- Transcribe audio using cloud engine provider B (cognitive)
- Transcribe audio using cloud engine provider C (cognitive)
- Select the best result from all three transcripts (aggregator)
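The final "select the best result" step can be sketched as an intracategory aggregator that compares the three transcripts by a confidence score (the field names here are illustrative, not the aiWARE output schema):

```python
def select_best(transcripts: list[dict]) -> dict:
    # Intracategory aggregation: every input comes from the same
    # capability (transcription), possibly from different vendors.
    # Here we simply keep the transcript whose engine reported
    # the highest confidence.
    return max(transcripts, key=lambda t: t["confidence"])
```

A real aggregator might instead align the transcripts word by word and vote, but the single-capability, multi-source shape is the same.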
[Note] Aggregator engines are not currently available to deploy through Veritone Developer.
Engine processing modes
Veritone engines run in one of three processing modes: chunk, stream, and batch. Engines built in Developer run in Docker; engines built in Automate Studio run on Node-RED. The following table summarizes the differences between the modes.
| Chunk | Stream | Batch |
|---|---|---|
| Processes one chunk at a time, producing AION outputs, which are then aggregated by the Output Writer to produce the engine results for the task. The chunks are normally the outputs of a parent task (e.g., Stream Ingestor splitting a stream into chunks of WAV files) and can be processed in any order, and by multiple engine instances, to produce a quicker result. | Receives stream input from either an external source or a parent task; may produce AION outputs as the stream is processed, or other stream data (e.g., Stream Ingestor). The engine needs the data in a single context to avoid losing accuracy or incurring complexity. For example, tracking objects across frames is easier in a single stream than collating them across multiple chunks. | May not consume any input data from a parent task of the job; instead, it typically uses the task payload to perform its business logic, and its outputs are often produced directly into the container TDO. Batch engines are typically associated with V1F (Iron-based) engines. |
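The single-context requirement for stream engines can be made concrete with a toy stateful tracker (a hypothetical sketch, not an aiWARE interface): it assigns stable IDs to objects across frames, which only works if one engine instance sees the frames in order.

```python
class ObjectTracker:
    # Toy stream-mode engine: it keeps state across frames, so it
    # must consume the whole stream in order, in a single context.
    # Splitting the stream into chunks processed by independent
    # instances would break the ID assignments.
    def __init__(self):
        self.next_id = 0
        self.known = {}  # object label -> stable track ID

    def process_frame(self, labels: list[str]) -> dict:
        ids = {}
        for label in labels:
            if label not in self.known:
                self.known[label] = self.next_id
                self.next_id += 1
            ids[label] = self.known[label]
        return ids
```

A chunk-mode engine, by contrast, would be a pure per-chunk function with no `self.known`-style state carried between inputs.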
Develop an engine
To extend aiWARE with custom engine processing, develop a cognitive engine. Follow our user guide or review our guidelines for writing engines.
We recommend writing or using a custom engine when:
- You have a custom model (either written from scratch or pre-trained on your data)
- You want to work with a new version of a foundation model
- You want to execute custom code as a task in an aiWARE job (in most cases, Automate is the best option, but if you have existing code or a container, or a complex workflow, using it as an engine may be best)