Title	Cognitive engine capabilities

URL Name	000003959

Audience	Public

Product (Internal List) aiWare - aiWare

Body

Each engine class contains a set of capabilities, defined by the type of data the engines output.

Class	Capability	Description
Audio	Audio fingerprinting	Recognizes a specific audio segment, such as a radio advertisement, as it appears in a longer audio file or on its own.
Biometrics	Face detection	Detects the presence of one or multiple faces in an image or video.
Biometrics	Face recognition	Identifies one or multiple people in an image or video by associating each individual's face with their name.
Biometrics	Speaker verification	Determines the similarity between the speaker's voice in an audio file and the voice of a person with a specified username. In `enroll` mode, the engine enrolls the speaker's voice into the library under the username.
Data	Correlation	Associates two data products based on some commonality, such as occurrence over time. For example, it may associate weather data on a given date with stock prices on that date.
Data	Geolocation	Identifies the geographic location of a person or object in the real world or some virtual equivalent.
Data	Brand safety	Processes media to determine where content falls on a scale of sensitivity or concern.
Facial features	Facial features	Computes metrics pertaining to face movement using a series of face landmarks and audio.
Speech	Speaker detection	aka Speaker Separation, Diarization. Partitions an input audio stream into segments according to who is speaking when.
Speech	Speaker recognition	aka Speaker Identification. Identifies speakers in an audio file based on trained recordings of their voice.
Speech	Transcription	Converts speech audio to text.
Text	Anomaly detection	Assigns a value to each item in a time series according to how anomalous the item is.
Text	Content classification	Categorizes one or multiple documents according to a pre-defined ontology.
Text	Entity extraction	aka Named-entity recognition. Classifies named entities located in unstructured text into pre-defined categories such as people, organizations and locations.
Text	Keyword extraction	Identifies key terms and/or phrases that appear in documents, based on parts of speech, salience, or other criteria.
Text	Language identification	Detects one or multiple natural languages in text.
Text	Sentiment analysis	Classifies text according to sentiment. May include a score representing negative, neutral or positive, or include a wider breadth of tags such as "happy" or "excited."
Text	Summarization	Generates a summary of written text.
Text	Text extraction	Extracts textual information from documents and expresses that extracted text in a structured format.
Text	Translation	Translates natural language from a text source. Includes translating plain text, rich text, extracted text, recognized text (OCR), and transcripts.
Verification	Face verification	Determines the similarity between the face in an image and the face of a person with a specified username. In `enroll` mode, the engine enrolls the face image into the library under the username.
Verification	Speaker verification	Determines the similarity between the speaker's voice in an audio file and the voice of a person with a specified username. In `enroll` mode, the engine enrolls the speaker's voice into the library under the username.
Vision	Image classification	Classifies the entire image, for example as "landscape" or "basketball game," rather than objects within the image.
Vision	License plate recognition (ALPR)	Produces a text string of alphanumeric characters for each license plate recognized in an image or video.
Vision	Logo detection	Recognizes one or more logos or branding elements in an image or video.
Vision	Object detection	Detects one or multiple objects or concepts in an image or video from a general/broad ontology, such as "car" or "person."
Vision	Text recognition	aka Optical Character Recognition. Converts alphanumeric characters in a document, image, or video to text.
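
The class-to-capability grouping in the table above can be sketched as a simple lookup. This is a minimal illustration only: the names `ROWS`, `group_by_class`, and `CAPABILITIES` are hypothetical and are not part of any aiWare API, and only a few rows from the table are included.

```python
from collections import defaultdict

# A handful of (class, capability) rows copied from the table above,
# written as tab-separated text. This is illustrative data, not an API response.
ROWS = """\
Audio\tAudio fingerprinting
Biometrics\tFace detection
Biometrics\tFace recognition
Speech\tTranscription
Text\tTranslation
Vision\tObject detection
"""


def group_by_class(rows: str) -> dict[str, list[str]]:
    """Map each engine class to the list of capabilities listed under it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for line in rows.splitlines():
        if not line.strip():
            continue  # skip blank lines
        engine_class, capability = line.split("\t")[:2]
        grouped[engine_class].append(capability)
    return dict(grouped)


# Hypothetical lookup table built from the sample rows.
CAPABILITIES = group_by_class(ROWS)
```

With this sketch, `CAPABILITIES["Biometrics"]` holds the two Biometrics capabilities from the sample rows, in table order.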

Created Date	1/10/2024 11:30 PM

Last Modified Date	1/16/2024 9:51 PM

Last Published Date	1/16/2024 9:51 PM

Article Record Type	Documentation

Veritone Record Type	Documentation

Article Number	000003959
