Media configuration

Article number: 000004157
Product: aiWare

The tables below index engine capabilities by media type to help you choose the appropriate engine type for your media.

Your media may be: 

  • Video
  • Image
  • Audio file
  • Data file (such as text)
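As a rough illustration of sorting files into the four media types above, the following sketch guesses a category from a file name using Python's standard `mimetypes` module. The function name `media_category` and the rule that anything not recognized as video, image, or audio counts as a data file are assumptions made for this example; they are not part of aiWARE.

```python
import mimetypes

# Top-level MIME type -> media type named in this article.
MIME_TO_MEDIA = {
    "video": "Video",
    "image": "Image",
    "audio": "Audio",
}


def media_category(filename: str) -> str:
    """Guess the article's media type from a file name (heuristic only)."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    if mime_type:
        top_level = mime_type.split("/", 1)[0]
        if top_level in MIME_TO_MEDIA:
            return MIME_TO_MEDIA[top_level]
    # Assumption: text and unrecognized files are treated as data files.
    return "Data file"


if __name__ == "__main__":
    for name in ("clip.mp4", "photo.jpg", "speech.wav", "notes.txt"):
        print(name, "->", media_category(name))
```

A guess based on the file extension is only a starting point; the actual content of the file decides which engines can process it.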

Engines for each type of media are listed below.

API calls

To call a single engine, see the API example for running a job using the launch single engine template.

Select engines in the UI

To choose engines for your media when registering engines in the Developer utility, see Step 2 - Functionality.

Video

Video also uses the engines listed under Image and Audio to process the image and audio portions of a video.

| Class | Capability | Description |
|---|---|---|
| Biometrics | Face detection | Detects faces in an image or video. |
| Biometrics | Face recognition | Identifies people in an image or video by associating each individual's face with their name. |
| Facial Features | Facial features | Computes metrics pertaining to face movement using a series of face landmarks and audio. |
| Vision | License plate recognition (ALPR) | Produces a text string of alphanumeric characters for each license plate recognized in an image or video. |
| Vision | Logo detection | Recognizes logos or branding elements in an image or video. |
| Vision | Object detection | Detects objects or concepts in an image or video from a general or broad ontology, such as "car" or "person." |
| Vision | Text recognition (OCR) | Optical Character Recognition. Converts alphanumeric characters to text in a document, image, or video. |

Image

| Class | Capability | Description |
|---|---|---|
| Biometrics | Face detection | Detects faces in an image or video. |
| Biometrics | Face recognition | Identifies people in an image or video by associating each individual's face with their name. |
| Vision | Image classification | Classifies the entire image rather than objects within an image, such as "landscape" or "basketball game." |
| Vision | License plate recognition (ALPR) | Produces a text string of alphanumeric characters for each license plate recognized in an image or video. |
| Vision | Logo detection | Recognizes logos or branding elements in an image or video. |
| Vision | Object detection | Detects objects or concepts in an image or video from a general or broad ontology, such as "car" or "person." |
| Vision | Text recognition (OCR) | Optical Character Recognition. Converts alphanumeric characters to text in a document, image, or video. |

Audio

| Class | Capability | Description |
|---|---|---|
| Audio | Audio fingerprinting | Recognizes a specific audio segment, such as a radio advertisement, as it appears in a longer audio file or on its own. |
| Biometrics | Speaker verification | Determines the similarity between the speaker's voice in an audio file and the voice of a person with a specified username. In enroll mode, the engine enrolls the speaker's voice into the library under the username. |
| Speech | Speaker detection | Also known as speaker separation or diarization. Partitions an input audio stream into segments according to who is speaking when. |
| Speech | Speaker recognition | Also known as speaker identification. Identifies speakers in an audio file based on trained recordings of their voice. |
| Speech | Transcription | Converts speech audio to text. |
| Verification | Speaker verification | Determines the similarity between the speaker's voice in an audio file and the voice of a person with a specified username. In enroll mode, the engine enrolls the speaker's voice into the library under the username. |

Data file

| Class | Capability | Description |
|---|---|---|
| Data | Correlation | Associates two data products based on some commonality, such as occurrence over time. For example, it may associate weather data on a given date with stock prices on that date. |
| Data | Geolocation | Identifies the geographic location of a person or object in the real world or some virtual equivalent. |
| Data | Brand safety | Processes media to determine where content falls on a scale of sensitivity or concern. |
| Text | Anomaly detection | Assigns a value to each item in a time series according to how anomalous the item is. |
| Text | Content classification | Categorizes one or multiple documents according to a pre-defined ontology. |
| Text | Entity extraction | Also known as named-entity recognition. Classifies named entities located in unstructured text into pre-defined categories such as people, organizations, and locations. |
| Text | Keyword extraction | Identifies key terms and/or phrases that appear in documents, based on parts of speech, salience, or other criteria. |
| Text | Language identification | Detects one or multiple natural languages in text. |
| Text | Sentiment analysis | Classifies text according to sentiment. May include a score representing negative, neutral, or positive, or include a wider breadth of tags such as "happy" or "excited." |
| Text | Summarization | Generates a summary of written text. |
| Text | Text extraction | Extracts textual information from documents and expresses that extracted text in a structured format. |
| Text | Translation | Translates natural language from a text source. Includes translating plain text, rich text, extracted text, recognized text (OCR), and transcripts. |
| Verification | Face verification | Determines the similarity between the face in an image and the face associated with a specified username. In enroll mode, the engine enrolls the face image into the library under the username. |
| Vision | Text recognition (OCR) | Optical Character Recognition. Converts alphanumeric characters to text in a document, image, or video. |
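The tables in this article can also be treated as a lookup from media type to (class, capability) pairs. The sketch below is a minimal illustration of that idea, including the note that video also uses the Image and Audio engines. The names `CAPABILITIES` and `capabilities_for` are hypothetical and are not part of any aiWARE API; the rows are copied from the tables in this article.

```python
# (class, capability) pairs per media type, copied from this article's tables.
# The structure and names are illustrative, not an aiWARE API.
CAPABILITIES = {
    "Video": [
        ("Biometrics", "Face detection"),
        ("Biometrics", "Face recognition"),
        ("Facial Features", "Facial features"),
        ("Vision", "License plate recognition (ALPR)"),
        ("Vision", "Logo detection"),
        ("Vision", "Object detection"),
        ("Vision", "Text recognition (OCR)"),
    ],
    "Image": [
        ("Biometrics", "Face detection"),
        ("Biometrics", "Face recognition"),
        ("Vision", "Image classification"),
        ("Vision", "License plate recognition (ALPR)"),
        ("Vision", "Logo detection"),
        ("Vision", "Object detection"),
        ("Vision", "Text recognition (OCR)"),
    ],
    "Audio": [
        ("Audio", "Audio fingerprinting"),
        ("Biometrics", "Speaker verification"),
        ("Speech", "Speaker detection"),
        ("Speech", "Speaker recognition"),
        ("Speech", "Transcription"),
        ("Verification", "Speaker verification"),
    ],
    "Data file": [
        ("Data", "Correlation"),
        ("Data", "Geolocation"),
        ("Data", "Brand safety"),
        ("Text", "Anomaly detection"),
        ("Text", "Content classification"),
        ("Text", "Entity extraction"),
        ("Text", "Keyword extraction"),
        ("Text", "Language identification"),
        ("Text", "Sentiment analysis"),
        ("Text", "Summarization"),
        ("Text", "Text extraction"),
        ("Text", "Translation"),
        ("Verification", "Face verification"),
        ("Vision", "Text recognition (OCR)"),
    ],
}


def capabilities_for(media_type: str) -> list[tuple[str, str]]:
    """Return the (class, capability) pairs that apply to a media type."""
    rows = list(CAPABILITIES[media_type])
    if media_type == "Video":
        # Video also uses the engines listed under Image and Audio.
        for extra in ("Image", "Audio"):
            for row in CAPABILITIES[extra]:
                if row not in rows:
                    rows.append(row)
    return rows


if __name__ == "__main__":
    for engine_class, capability in capabilities_for("Video"):
        print(f"{engine_class}: {capability}")
```

Keeping the Video list limited to its own table and merging the Image and Audio rows at lookup time avoids duplicating rows that the article lists under more than one media type.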