Work with search

000003997
Public
Product Selection
aiWare - aiWare
Article Details

The Veritone Search API offers a highly customizable search function that includes core functions supporting parsing, aggregation, and auto-completion. Searches are performed using a variety of supported query types, expressed in JSON.

Search overview

Most search operations make use of optional query configurations that allow you to control the search behavior and narrow results by defining aspects of the content and values.

The Veritone Search API utilizes GraphQL to provide a more efficient way to deliver data with greater flexibility than a traditional REST approach. GraphQL is a query language that operates over a single endpoint using conventional HTTP requests and returning JSON responses. The structure not only lets you call multiple nested resources in a single query, it also allows you to define requests so that the query you send matches the data you receive.

To make effective use of the Search API, you'll need to know a few things about how data is stored in Veritone, the various options for structuring queries, and requirements for performing successful requests.

Base URL

Veritone uses a single endpoint for accessing the API. All calls to the API are POST requests served over HTTPS with application/json encoded bodies. The base URL varies based on the geographic region where the services will run. When configuring your integration, choose the base URL that supports your geographic location from the list below.

Region          Base URL
United States   https://api.veritone.com/v3/graphql
Europe          https://api.uk.veritone.com/v3/graphql
[Note] The above base URLs are provided for use within SaaS environments. On-prem deployments access the API using an endpoint that is custom configured to the environment.

Making sample requests

To make it easier to explore, write, and test the API, we set up the GraphQL Sandbox, an interactive playground that gives you a code editor with autocomplete, validation, and syntax error highlighting features. Use the sandbox interface to construct and execute queries, experiment with different schema modifications, and browse documentation. In addition, the sandbox automatically passes the Authorization header with a valid token when you're logged into the Veritone system.

Veritone's GraphQL Sandbox interface is the recommended method for ad hoc API requests, but calls can be made using any HTTP client. All requests must be HTTP POST requests to the base URL for your geographic region, with the GraphQL query passed in the query parameter of an application/json encoded body. Requests must be authenticated using an API token. Pass the token in the Authorization header with the value Bearer <token>. If you're using a raw HTTP client, the query must be sent as a string with all quotes escaped.

To interact with the API, you need to authorize yourself using a bearer token. Get instructions on our authentication page.
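For readers using a raw HTTP client, the request described above might be assembled roughly as follows. This is a minimal sketch using only the Python standard library. It assumes the common GraphQL convention of a JSON body with a single "query" key, and the token value and sample query are placeholders rather than values taken from Veritone. The request is built but not sent.

```python
import json
import urllib.request

# Base URL for the United States region, taken from the table above.
BASE_URL = "https://api.veritone.com/v3/graphql"


def build_request(graphql_query: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a GraphQL query."""
    # json.dumps escapes the quotes inside the query string, which is the
    # "string with all quotes escaped" requirement for raw HTTP clients.
    body = json.dumps({"query": graphql_query}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Placeholder query, modeled on the term search example in this article.
SAMPLE_QUERY = """
query {
  searchMedia(search: {
    index: ["global"]
    query: { operator: "term", field: "programName", value: "mike" }
    limit: 10
  }) {
    jsondata
  }
}
"""

request = build_request(SAMPLE_QUERY, token="YOUR_API_TOKEN")
# To send it, pass `request` to urllib.request.urlopen() and parse the
# JSON response body.
```

Letting the JSON encoder do the escaping avoids hand-escaping quotes in the query text.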

Query basics and syntax

The Veritone Search API gives you the flexibility to build a variety of query types to search and retrieve indexed media and mentions content. The Search API allows you to combine a series of simple elements together to construct queries as simple or as complex as you'd like in JSON format. Although queries are customizable, there is a common structure and set of core parameters that each must use. In addition, there are a number of optional filters, components, and syntax options that can be specified to modify a query.

Content type

Searches in Veritone are performed against two types of content: media and mentions. Each request must specify one of the following search content types as the root operation of the request.

  • Search media: The Search Media operation searches media files and assets for matching records.
  • Search mentions: The Search Mentions operation searches for matching records in mentions and watchlists.

Search operations return results that are sorted by TDO startTime in descending order (in other words, the latest is listed first).

Required query parameters

Regardless of the level of complexity, each search query in Veritone operates on four core elements:

  • index
  • field
  • operator
  • an operator value or variable(s)

Index

All search functionality runs against Veritone's public and private index databases. The index field defines whether to search the organization's public or private index (or both) for matching documents. There are two possible index values: "global," which refers to the public media index, and "mine," which refers to private media uploaded to an account. Each request must specify at least one index value enclosed in brackets.

Field

Content in Veritone is made searchable by mapping to specific document and data types. The field parameter defines the type of data/document to be searched, and each query must specify a value for the field property to be successful. If a field value is not provided, an empty result set will be returned. The field parameter uses a definitive set of values. See the related topics section below for a list.

Operator

Operators are the heart of the query. They describe the type of search that will be performed. Each operator uses one or more additional properties that add greater definition to the query criteria. All queries must specify at least one operator. The table below provides an overview of the different operators that can be used.

Operator name    Description
range            Finds matching items within a specified date/time range.
term             Finds an exact match for a word or phrase.
terms            Finds exact matches for multiple words or phrases.
query_string     Searches plain text strings composed of terms, phrases, and Boolean logic.
word_proximity   Finds exact matches for words that are not located next to one another.
exists           Checks for the presence of a specified engine output data type.
and              Finds records for matching values when all of the conditions are true.
or               Finds matching records when at least one of the conditions is true.
query_object     Searches for an array of objects with a common field.

Grouping

To apply two or more Booleans to a single field, group multiple terms or clauses together with parentheses. Below is an example of a search for Kobe or Bryant and basketball or Lakers.

value: "(kobe OR bryant) AND (basketball OR lakers)"

Escaping special characters

Veritone Search API supports escaping special characters in a query string to ensure they aren't interpreted by the query parser. To search for any of the following special characters literally, use a double backslash before the character.

Special characters: + - ! ( ) { } [ ] ^ " ~ * ? :

In the example below, the ! is escaped since it's part of the company name "ideaDes!gns".

value: "ideaDes\\!gns"

Use a single backslash to escape quotes around a phrase in a query string. The example below escapes the double quotes around "Kobe Bryant" to ensure they're interpreted literally and not as string closers.

value: "\"kobe bryant\" AND (basketball OR lakers)"
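
The double-backslash rule can be applied programmatically before a term is placed in a query string. The helper below is a sketch, not part of the Veritone API. The function name is hypothetical, and the character set is copied from the list above. It is intended for characters that should be searched literally inside a term, not for the quotes that delimit a phrase.

```python
# Special characters from the list above that the query parser would
# otherwise interpret.
SPECIAL_CHARS = set('+-!(){}[]^"~*?:')


def escape_query_term(term: str) -> str:
    """Prefix each special character in a term with a double backslash."""
    # "\\\\" is a Python literal for two backslash characters.
    return "".join("\\\\" + ch if ch in SPECIAL_CHARS else ch for ch in term)


print(escape_query_term("ideaDes!gns"))  # prints: ideaDes\\!gns
```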

Query schema and types

Veritone Search API supports both simple queries and more advanced queries that use content schemas, query modifiers (such as wildcards and select statements), and multiple operators to provide an even broader range of searching options.

Schema

A schema is created by supplying a Search Media or Search Mentions content type as the root operation and then providing a query object with an operator, index, field, and a list of other properties that describe the data you want to retrieve. All search requests specify jsondata as the response field.

Sample search query schema

 query{                 => The base query parameter used to set up the GraphQL query object. (required)
 -------request fields-----------
  searchMedia(search:{  => The content type to be searched and search variable. (required)
    index: array        => Specifies whether to search the public ("global") or private ("mine") index. (required)
    query: object       => The query object that defines the pipeline for the data to be returned. (required)
      field: string     => The data or document type to be searched. (required)
      operator: string  => The type of search to be performed. Common types include term, range, and query_string. (required)
      value: string     => The exact data to be searched for in a field. (required)
    offset: integer     => The zero-based index of the first record to show in the response. (optional)
    limit: integer      => The maximum number of results to return. (optional)
  }){
 -------return fields------------
    jsondata:           => A JSON object with response data. Each search request must specify jsondata as the return field. (required)
  }
}
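
The annotated schema above can be turned into a concrete query string with a small helper. The sketch below is illustrative only. The function names are hypothetical, and it assumes that the search argument accepts a standard GraphQL input object literal, in which keys are unquoted and commas are optional separators.

```python
import json


def to_graphql(value) -> str:
    """Render a Python value as a GraphQL input literal (unquoted keys)."""
    if isinstance(value, dict):
        fields = ", ".join(f"{key}: {to_graphql(val)}" for key, val in value.items())
        return "{" + fields + "}"
    if isinstance(value, (list, tuple)):
        return "[" + ", ".join(to_graphql(item) for item in value) + "]"
    # json.dumps yields valid literals for strings, numbers, and booleans.
    return json.dumps(value)


def search_media(index, query, offset=0, limit=10) -> str:
    """Assemble a searchMedia request from the core schema elements."""
    search = {"index": index, "query": query, "offset": offset, "limit": limit}
    return "query { searchMedia(search: " + to_graphql(search) + ") { jsondata } }"


print(search_media(
    ["global"],
    {"operator": "term", "field": "programName", "value": "mike"},
))
```

The offset and limit defaults here are arbitrary choices for the sketch, not documented API defaults.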

Query types

Veritone supports several types of queries, which are identified by the operator field. Queries are organized in a hierarchical structure and use nested fields in curly braces. The schema uses the query parameter as a wrapper around a subquery that contains an operator, field, and a value or variables. The various query types allow you to customize the search; available options are described below.

Term/terms

The term/terms operators are used to find exact matches of keywords or phrases. A term/terms query can match text, numbers, and dates. Searches produce a list of records that contain the keywords or phrases, no matter where they appear in the text.

[Note]
  • Enclose single keywords and phrases in quotation marks (e.g., "basketball", "Mike Jones").
  • A phrase includes two or more words together, separated by a space (e.g., "free throw line").
  • The order of the terms in a phrase is respected. Only instances where the words appear in the same order as the input value are returned.

Search by term

The term operator is used to find an exact match for a single keyword or phrase. The following example shows a term search against the global index to find media files where the word "Mike" appears in the program name.

query: {
    operator: "term"
    field: "programName"
    value: "mike"
}

Search by terms

The terms operator allows you to search for multiple keywords or phrases. Search terms can include any string of words or phrases separated by commas. When submitting multiple search terms, search uses a default OR behavior and will return results that contain an exact match of any of the words/phrases.

[Note] When building a terms query, note that the terms operator and the values property are written in plural form.

The example below is a terms search query for the words "football" and "touchdown" found in transcript text in the user's private media index.

query: {
    operator: "terms"
    field: "transcript.transcript"
    values: ["football", "touchdown"]
  }

Date/time range search

The range operator can be used to find content that falls into a specified time frame. This query supports the following types of ranges:

  • Inclusive range: Uses a combination of comparison property filters to find matching content between a specific starting and ending date.
  • Open-ended range: Uses a single comparison property filter to search before or after a specific point in time.

A range query must include at least one comparison property and specify the date as the value (using Unix/Epoch Timestamp format in milliseconds and UTC time zone).

Comparison properties

Name   Description
gt     greater than: Searches for documents created after the specified date/time.
lt     less than: Searches for documents created before the specified date/time.
gte    greater than or equal to: Searches for documents created on or after the specified date/time.
lte    less than or equal to: Searches for documents created on or before the specified date/time.

Inclusive date range

To search for records created between specified dates, use two of the comparison property options to define the to and from dates. The following example is a search against the public index for the 10 most recent media files timestamped on or after 1/26/2017, 8:05:54 AM UTC, and before 6/23/2017, 6:30:00 AM UTC.

query: {
    operator: "range"
    field: "absoluteStartTimeMs"
    gte: 1485417954000
    lt: 1498199400000
}

Open-ended date range

To search with an open-ended range (e.g., find files before a specified date), use just one of the comparison property options. The example below shows a search for the 30 most recent media files in the public index timestamped after 3/15/2017 at 6:30:00 AM.

query: {
    operator: "range"
    field: "absoluteStartTimeMs"
    gt: 1489559400000
  }
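
Because range values must be Unix timestamps in milliseconds and UTC, it can help to compute them from calendar dates rather than by hand. The following sketch uses only the Python standard library. The helper names are hypothetical, and the resulting dictionary simply mirrors the shape of the range queries above.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to Unix time in milliseconds."""
    if moment.tzinfo is None:
        raise ValueError("use a timezone-aware datetime to avoid local-time surprises")
    # Integer division of timedeltas avoids floating-point rounding.
    return (moment - EPOCH) // timedelta(milliseconds=1)


def range_clause(field: str, **bounds: datetime) -> dict:
    """Build a range query; bounds may be any of gt, gte, lt, lte."""
    allowed = {"gt", "gte", "lt", "lte"}
    if not bounds or set(bounds) - allowed:
        raise ValueError("expected at least one of gt, gte, lt, lte")
    clause = {"operator": "range", "field": field}
    clause.update({name: to_epoch_ms(when) for name, when in bounds.items()})
    return clause


end = datetime(2017, 6, 23, 6, 30, tzinfo=timezone.utc)
print(range_clause("absoluteStartTimeMs", lt=end))
# {'operator': 'range', 'field': 'absoluteStartTimeMs', 'lt': 1498199400000}
```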

Query string

The query_string operator performs full-text searches and constructs queries as text strings using terms, phrases, and Boolean logic. Its main advantage is a compact syntax: the parser breaks a single string into the simpler structured queries it represents.

[Note]
  • The query string supports more complex query constructs, such as wildcard searches.
  • Multiple query statements that are not joined by the and operator will return results that match any of the conditions.
  • Date and time values are not supported in a query string.

The example below returns transcripts where "Kobe Bryant" or "Lakers" was found.

query: {
    operator: "query_string"
    field: "transcript.transcript"
    value: "\"kobe bryant\" lakers"
}

Using and/or in a query string

You can specify a broader or more narrow search by using the and and or operators between values in a query string. When using and/or within a quoted string, they are treated as part of the field value (not the main query operator) and must be written in all uppercase. In addition, multiple terms in a query string can be grouped together using parentheses to make the logic clear. Note that terms in parentheses are processed first. The following example is a search for transcripts where "Kobe Bryant" was found with either the word "basketball" or the word "Lakers."

query: {
    operator: "query_string"
    field: "transcript.transcript"
    value: "\"kobe bryant\" AND (basketball OR lakers)"
}
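
Query string values like this one can be composed from smaller pieces instead of being typed by hand. The helpers below are a sketch with hypothetical names. They only concatenate text according to the rules described in this section (uppercase AND/OR, parentheses for grouping, quotes around phrases) and do not validate the result.

```python
import json


def phrase(text: str) -> str:
    """Wrap a multi-word phrase in double quotes so it is matched as a unit."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Group alternatives in parentheses, joined by uppercase OR."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join clauses with uppercase AND."""
    return " AND ".join(terms)


value = all_of(phrase("kobe bryant"), any_of("basketball", "lakers"))
print(value)              # "kobe bryant" AND (basketball OR lakers)

# json.dumps adds the backslash escapes needed when the value is embedded
# in the query as a string literal.
print(json.dumps(value))  # "\"kobe bryant\" AND (basketball OR lakers)"
```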

Word proximity

While a term/terms query matches a phrase only when its words appear in the exact order of the input value, a word proximity search allows the words to be separated or to appear in a different order. The word_proximity operator uses the Boolean inOrder property to control whether the words must appear in the input order, and the distance property to specify the maximum number of words that can separate them.

Word proximity properties

  • inOrder (required, Boolean): A Boolean that when set to false searches for all of the words in an order different than the input value. Note that if the distance property has a value of 0, inOrder will be set to true. Example: inOrder: false
  • distance (required, integer): The number of non-search words that can separate the specified terms to be found. To match words in the same order as the input value, use a distance value of 0. (Note that a 0 value sets inOrder to true.) To transpose two words, enter a value of 1. Example: distance: 10

The following query finds the terms "Atleti," "football," and "Campiones" in transcript text when they appear within 10 words of one another in any order.

query: {
    operator: "word_proximity"
    field: "transcript.transcript"
    values: ["Atleti football Campiones"]
    inOrder: false
    distance: 10
}

Exists operator

The exists operator is used to check for the presence of a specific field. This query is useful for retrieving matching media files of a specific data type. If the specified field is not found, the operator will consider it non-existent and it will return no results. The example below is a search for media files with the field "veritone-file.mimetype".

query: {
    operator: "exists"
    name: "veritone-file.mimetype"
  }

Using and/or as query operators

The and and or operators allow you to combine two or more query clauses with Boolean logic to create broader or more narrow search results. These operators chain conditional statements together and always evaluate to true or false. As main query operators, and/or are case insensitive and can be written as uppercase or lowercase. It's important to note that when using these operators in compound queries, and takes precedence over or.

and Operator

The and operator matches documents where both terms exist anywhere in the text of a single record. The search below returns results for "Morning Show AM" mentions found on November 9, 2017.

query: {
    operator: "and",
    conditions: [
            {
                    operator: "term"
                    field: "trackingUnitName"
                    value: "Morning Show AM"
            },
            {
                    operator: "term"
                    field: "mentionDate"
                    value: "2017-11-09"
            }
     ]
  }

or Operator

The or operator connects two conditions and returns results if either condition is true. The example below shows a search for the name "Bob" or "Joe" or "Sue" in a transcript, or for records that were created on or after November 17, 2017 at 9:00 AM.

query: {
    operator: "or",
    conditions: [
            {
                    operator: "terms",
                    field: "transcript.transcript"
                    values: ["bob", "joe", "sue"]
            },
            {
                    operator: "range",
                    field: "absoluteStartTimeMs"
                    gte: 1510909200000
            }
     ]
  }

Combining "and" and "or"

The and and or (and not) operators can be combined as multi-level subqueries to create a compound condition.

[Note] When these operators are used together, not is evaluated first, then and, and finally or.

The example below is a search for records found between November 17, 2017 at 9:00 AM and December 1, 2017 at noon where the word "basketball" or "Lakers" was found in a transcript, or the NBA logo was detected.

query: {
    operator: "and",
    conditions: [
        {
            operator: "range"
            field: "absoluteStartTimeMs"
            gte: 1510909200000
            lte: 1512129600000
        },
        {
            operator: "or"
            conditions: [
                {
                    operator: "query_string"
                    field: "transcript.transcript"
                    value: "basketball OR lakers"
                },
                {
                    operator: "term"
                    field: "logo-recognition.series.found"
                    value: "NBA"
                }
            ]
        }
    ]
}
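
Deeply nested conditions are easy to get wrong by hand, as missing braces and commas are hard to spot. One option is to compose the structure from small functions and render it afterwards. The sketch below builds the same compound condition as a plain Python dictionary. All helper names are hypothetical, and rendering the dictionary into GraphQL input syntax is left out.

```python
def term(field: str, value: str) -> dict:
    return {"operator": "term", "field": field, "value": value}


def query_string(field: str, value: str) -> dict:
    return {"operator": "query_string", "field": field, "value": value}


def time_range(field: str, **bounds: int) -> dict:
    # bounds are epoch-millisecond values keyed by gt, gte, lt, or lte
    return {"operator": "range", "field": field, **bounds}


def and_(*conditions: dict) -> dict:
    return {"operator": "and", "conditions": list(conditions)}


def or_(*conditions: dict) -> dict:
    return {"operator": "or", "conditions": list(conditions)}


# Between November 17, 2017 9:00 AM and December 1, 2017 noon (UTC), and
# either a transcript match or an NBA logo detection.
query = and_(
    time_range("absoluteStartTimeMs", gte=1510909200000, lte=1512129600000),
    or_(
        query_string("transcript.transcript", "basketball OR lakers"),
        term("logo-recognition.series.found", "NBA"),
    ),
)
```

Because each helper returns a complete clause, the nesting of the function calls mirrors the nesting of the conditions arrays.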

Query object

The query_object operator allows an array of objects to be queried independently of one another. Each query_object uses a Boolean operator (e.g., and, or, not) to combine a list of nested subqueries. Nested subqueries can use any operator type. The below example is a search for an ESPN logo or the words "touchdown" and "ruled" or "college" in a transcript on or before December 11, 2017 at 9:11 PM.

operator: "query_object"
    query:{
        operator: "or"
        conditions: [{
            operator: "range"
            field: "absoluteStartTimeMs"
            lte: 1513026660000
        },
        {
            operator: "term"
            field: "logo-recognition.series.found"
            value: "ESPN"
        },
        {
            operator: "query_string"
            field: "transcript.transcript"
            value: "touchdown AND (ruled OR college)"
        }
        ]
    }

Query modifiers

Not

A not modifier returns search results that do not include the specified values. Any of the query operators can be negated by adding the property not with a value of true. Note that negation is currently unsupported at the compound operator level. The below example is a search that excludes the terms "NFL," "football," and "game."

query: {
    operator: "terms"
    field: "transcript.transcript"
    values: ["nfl", "football", "game"]
    not: true
}

Wildcard searches

The wildcard modifier is useful when you want to search various forms of a word. Wildcard searches can be run on individual terms, using ? to replace a single character, and * to replace zero or more characters. For example, he* returns results that would include her, help, hello, helicopter and any other words that begin with "he". Searching he? will only match three-letter words that start with "he", such as hem, hen, and her.

[Note]
  • There can be more than one wildcard in a single search term or phrase, and the two wildcard characters can be used in combination. (e.g., j??* will match words with three or more characters that start with the letter j.)
  • Wildcard matching is only supported within single terms and not within phrase queries. (e.g., m* will match "method" but not "meet there".)
  • A wildcard symbol cannot be used as the first character of a search.
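
The behavior of ? and * can be tried out locally. Python's fnmatch module uses the same meaning for these two characters, so the sketch below serves as an analogy for the rules above. It is not the matcher Veritone uses, and the word list is made up for illustration.

```python
from fnmatch import fnmatchcase

words = ["he", "hem", "hen", "her", "help", "hello", "helicopter", "she"]

# * stands for zero or more characters.
star = [w for w in words if fnmatchcase(w, "he*")]
print(star)      # ['he', 'hem', 'hen', 'her', 'help', 'hello', 'helicopter']

# ? stands for exactly one character.
question = [w for w in words if fnmatchcase(w, "he?")]
print(question)  # ['hem', 'hen', 'her']

# Combined: two required characters, then anything.
combined = [w for w in ["jo", "jam", "jumper"] if fnmatchcase(w, "j??*")]
print(combined)  # ['jam', 'jumper']
```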

Select statements

A select query modifier lets you retrieve only the data that you want. It also allows you to combine data from multiple property sources. Submitting a request with the select parameter returns full record matches for all of the specified values. To use the select filter, enter a comma-separated list of field names to return. The example below is a search for records where "Kobe Bryant" along with either the word "basketball" or the word "Lakers" is found in either a transcript or a file uploaded to Veritone CMS.

query: {
    operator: "query_string"
    field: "transcript.transcript"
    value: "\"kobe bryant\" AND (basketball OR lakers)"
    },
  select: ["transcript.transcript", "veritone-file"]
}
Additional Technical Documentation Information
Properties
2/12/2024 10:23 PM
Documentation
Translation Information
English
