Information

Title	Using speaker separation

URL Name	000001917

Audience	Public

Product Selection

Product (Internal List) aiWare - aiWare

Article Details

Body

Veritone speaker separation technology identifies, classifies, and tracks individual speakers in multi-person conversations. The system takes voice data from an uploaded audio or video file and transcribes it into text. The resulting transcript can then be used to create a person-by-person exchange that’s broken into timestamped paragraphs with speaker labels. A new paragraph begins each time the speaker changes, so you can see who said what and exactly when. Once a file has been transcribed, you can edit the text, search it by keyword, and export it in a variety of formats.
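
To illustrate the "new paragraph on each change in speaker" idea, here is a minimal Python sketch. It is not Veritone's implementation or API: the `Segment` structure, the `group_by_speaker` and `format_timestamp` helpers, and the sample dialogue are all hypothetical, made up only to show how speaker-labeled fragments could be merged into timestamped paragraphs.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One transcribed fragment (hypothetical structure, not Veritone's schema)."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float
    text: str


def group_by_speaker(segments):
    """Merge consecutive segments from the same speaker into one paragraph."""
    paragraphs = []
    for seg in segments:
        if paragraphs and paragraphs[-1].speaker == seg.speaker:
            # Same speaker is still talking: extend the current paragraph.
            last = paragraphs[-1]
            last.end = seg.end
            last.text = f"{last.text} {seg.text}"
        else:
            # Speaker changed (or this is the first segment): new paragraph.
            paragraphs.append(Segment(seg.speaker, seg.start, seg.end, seg.text))
    return paragraphs


def format_timestamp(seconds):
    """Render a time offset in seconds as MM:SS."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


# Made-up sample input, for illustration only.
segments = [
    Segment("Speaker 1", 0.0, 2.1, "Thanks for joining."),
    Segment("Speaker 1", 2.1, 4.0, "Shall we begin?"),
    Segment("Speaker 2", 4.2, 6.5, "Yes, go ahead."),
    Segment("Speaker 1", 6.7, 9.0, "Great."),
]

paragraphs = group_by_speaker(segments)
for p in paragraphs:
    print(f"[{format_timestamp(p.start)}] {p.speaker}: {p.text}")
```

In this sketch the first two fragments collapse into a single paragraph because the speaker does not change, while the later "Speaker 1" fragment starts a new paragraph because another speaker spoke in between.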

A typical workflow consists of:

Upload and transcribe: Drag and drop or upload files into Veritone from your local computer, then transcribe the speech in each file into text.
Edit: Review and edit the speaker separation and transcript using Veritone’s built-in features and in-app text editor.
Export: Download the transcript in a variety of formats, such as .txt or .ttml.

Get started

Each of the workflow steps is covered in more detail in the Speaker Separation section of our help center.

Created Date	9/26/2025 12:09 AM

Last Modified Date	9/26/2025 12:10 AM

Last Published Date	9/26/2025 12:10 AM

Article Record Type	Documentation

Veritone Record Type	Documentation

Article Number	000001917
