
Using speaker separation

000001917

Veritone speaker separation technology identifies, classifies, and tracks individual speakers in multi-person conversations. The system takes voice data from an uploaded audio or video file and transcribes it into text. The resulting transcript can then be used to create a person-by-person exchange that is broken into timestamped paragraphs with speaker labels. A new paragraph begins each time the speaker changes, so you can see who said what and exactly when. Once a file has been transcribed, you can edit the text, search it by keyword, and export it in a variety of formats.
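
To illustrate the "new paragraph on each speaker change" behavior described above, here is a minimal Python sketch. It is not Veritone's implementation or API: the Word and Paragraph types, their field names, and the group_by_speaker function are hypothetical, and the sketch assumes the transcription step has already produced words tagged with a speaker label and start/end times in seconds.

```python
from dataclasses import dataclass


@dataclass
class Word:
    # Hypothetical shape of one transcribed word (assumption, not a Veritone type).
    text: str
    speaker: str
    start: float  # seconds from the beginning of the file
    end: float


@dataclass
class Paragraph:
    speaker: str
    start: float
    end: float
    text: str


def group_by_speaker(words):
    """Merge consecutive words from the same speaker into one paragraph."""
    paragraphs = []
    for word in words:
        if paragraphs and paragraphs[-1].speaker == word.speaker:
            # Same speaker is still talking: extend the current paragraph.
            current = paragraphs[-1]
            current.text += " " + word.text
            current.end = word.end
        else:
            # Speaker changed (or first word): start a new paragraph.
            paragraphs.append(
                Paragraph(word.speaker, word.start, word.end, word.text)
            )
    return paragraphs
```

Each returned paragraph carries the speaker label plus the start time of its first word and the end time of its last word, which is the information a timestamped, speaker-labeled transcript needs.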

A typical workflow consists of three steps:

  • Upload and transcribe: Drag and drop or upload files from your local computer into Veritone, then transcribe the speech in each file into text.
  • Edit: Review and edit your speaker separation and transcript using Veritone’s built-in features and in-app text editor.
  • Export: Download the transcript in a variety of formats, such as .txt or .ttml.
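
As a rough picture of what the export step produces, the sketch below renders speaker-labeled paragraphs as plain text. The line layout ("[HH:MM:SS] Speaker: text") and the function names are assumptions made for illustration. The actual layout of Veritone's .txt and .ttml exports is not described here and may differ.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time offset in seconds as HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def to_plain_text(paragraphs):
    """Build a .txt-style transcript from (speaker, start_seconds, text) tuples.

    The output layout is hypothetical, not Veritone's export format.
    """
    lines = []
    for speaker, start, text in paragraphs:
        lines.append(f"[{format_timestamp(start)}] {speaker}: {text}")
    return "\n".join(lines) + "\n"
```

Calling to_plain_text with a list of paragraphs returns a single string, one line per paragraph, that can be written to a file with the standard open(...).write(...) call.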

Get started

Each workflow step is covered in more detail in the Speaker Separation section of our help center. Click a link below to learn more.

9/26/2025 12:09 AM