Entity extraction engines

000004111
Public
Product Selection
aiWare - aiWare
Article Details
[API][Partial]
[Search][Yes]
[UI][No]

Entity extraction, sometimes called named entity extraction, is a Natural Language Processing (NLP) task that labels words, phrases, or even concepts in text. Entity extraction engines classify named entities found in unstructured text into predefined categories such as People, Organizations, or Locations. They can also denote times and dates, which can be useful when pre-processing large chunks of text.

Use cases

When building a search engine, you might want to index people, places, and things so your customers can find content based on those categories. For example, if users want to find articles about the city of London, articles about people named London can be set to rank lower.
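
The ranking idea above can be sketched in a few lines of Python. This is only an illustration, not part of aiWARE: the document list, the `score` and `rank` helpers, the boost and penalty weights, and the "location" class value are all assumptions made for the example (only "person" appears in the examples in this article).

```python
from typing import Any, Dict, List

Doc = Dict[str, Any]

# Hypothetical indexed documents. Each one carries entity objects shaped like
# the items of the "object" array in entity engine output.
DOCS: List[Doc] = [
    {"id": "author-bio", "entities": [
        {"type": "namedEntity", "label": "London",
         "objectCategory": [{"class": "person"}]},
    ]},
    {"id": "city-guide", "entities": [
        {"type": "namedEntity", "label": "London",
         "objectCategory": [{"class": "location"}]},  # class value is assumed
    ]},
    {"id": "paris-guide", "entities": [
        {"type": "namedEntity", "label": "Paris",
         "objectCategory": [{"class": "location"}]},
    ]},
]


def score(doc: Doc, query: str, preferred_class: str,
          boost: float = 2.0, penalty: float = 0.5) -> float:
    """Add up a weight for every entity whose label matches the query."""
    total = 0.0
    for entity in doc["entities"]:
        if entity.get("label", "").lower() != query.lower():
            continue
        classes = {c.get("class") for c in entity.get("objectCategory", [])}
        # Entities in the preferred category count more than other matches.
        total += boost if preferred_class in classes else penalty
    return total


def rank(docs: List[Doc], query: str, preferred_class: str) -> List[str]:
    """Return IDs of matching documents, best match first."""
    scored = [(score(d, query, preferred_class), d["id"]) for d in docs]
    hits = [(s, doc_id) for s, doc_id in scored if s > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in hits]


print(rank(DOCS, "London", "location"))
```

With these sample documents, the article about the city is listed before the one about the person, and the article that never mentions London is left out.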

Engine input

Entity extraction engines can specify supportedInputFormats in their manifest to declare the MIME types they support natively (e.g. text/plain, application/pdf). In this case, each engine is given the entire file as its input and is responsible for outputting the entire list of extracted entities in its .aion output.
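
As a rough illustration of what declaring native formats means, the sketch below checks a file's MIME type against a manifest fragment. Only the supportedInputFormats key and the two MIME types come from this article; the rest of a real manifest is omitted, and the `accepts_natively` helper and its matching rules (case-insensitive, parameters ignored) are assumptions, not a description of how the platform routes files.

```python
from typing import Any, Dict

# Manifest fragment only. A real engine manifest contains other fields that
# are not shown here.
MANIFEST: Dict[str, Any] = {
    "supportedInputFormats": ["text/plain", "application/pdf"],
}


def accepts_natively(manifest: Dict[str, Any], mime_type: str) -> bool:
    """Return True if the MIME type is listed in supportedInputFormats."""
    # Drop parameters such as "; charset=utf-8" and compare case-insensitively.
    base_type = mime_type.split(";", 1)[0].strip().lower()
    supported = {m.lower() for m in manifest.get("supportedInputFormats", [])}
    return base_type in supported


print(accepts_natively(MANIFEST, "text/plain; charset=utf-8"))
print(accepts_natively(MANIFEST, "image/png"))
```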

Training and libraries

If entity extraction engines are made trainable with libraries, they can map their output back to entities in the libraries they were trained on by including an entityId in their engine output.

Engine output

See the entity validation contract json-schema.

Examples

Here is an example output that only specifies a label for the identified entity:

{
  "validationContracts": ["entity"],
  "object": [
    {
      "type": "namedEntity",
      "label": "John",
      "sentence": 1
    }
  ]
}

A more involved example includes a label, confidence, a mapping to a category classification taxonomy, sentiment readings, and page/paragraph/sentence referencing (all optional):

{
  "validationContracts": ["entity"],
  "object": [
    {
      "type": "namedEntity",
      "label": "John Smith",
      "confidence": 0.5,
      "objectCategory": [
        {
          "class": "person"
        }
      ],
      "sentiment": {
        "positiveValue": 1,
        "negativeValue": 0
      },
      "page": 1,
      "paragraph": 1,
      "sentence": 1
    }
  ]
}
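
For engines written in Python, output like the two examples above could be assembled with small helpers such as the ones below. This is a minimal sketch: `make_entity` and `make_output` are hypothetical names, the 0 to 1 range check on confidence is an assumption rather than something stated in this article, and sentiment is left out for brevity. The entity validation contract json-schema remains the authority on which fields and values are allowed.

```python
import json
from typing import Any, Dict, List, Optional


def make_entity(label: str,
                confidence: Optional[float] = None,
                category: Optional[str] = None,
                page: Optional[int] = None,
                paragraph: Optional[int] = None,
                sentence: Optional[int] = None) -> Dict[str, Any]:
    """Build one namedEntity object, adding only the optional fields given."""
    entity: Dict[str, Any] = {"type": "namedEntity", "label": label}
    if confidence is not None:
        # Assumption: confidence is a score between 0 and 1.
        if not 0.0 <= confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")
        entity["confidence"] = confidence
    if category is not None:
        entity["objectCategory"] = [{"class": category}]
    for key, value in (("page", page), ("paragraph", paragraph),
                       ("sentence", sentence)):
        if value is not None:
            entity[key] = value
    return entity


def make_output(entities: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap entity objects in the top-level structure used in the examples."""
    return {"validationContracts": ["entity"], "object": entities}


output = make_output([
    make_entity("John Smith", confidence=0.5, category="person",
                page=1, paragraph=1, sentence=1),
])
print(json.dumps(output, indent=2))
```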

Library entity example

This is an example output that maps an extracted entity back to an aiWARE library entity.

{
  "validationContracts": ["entity"],
  "object": [
    {
      "type": "namedEntity",
      "entityId": "<ID of the entity from an aiWARE library>",
      "libraryId": "<Optional ID of the library the entity is contained in>",
      "sentence": 1
    }
  ]
}
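
One way a trained engine might produce such objects is to look up each extracted label in an index built from its training libraries. The sketch below is an assumption-laden illustration: the `link_to_library` helper, the `library_index` dictionary, and the "entity-123" and "library-456" values are all invented for the example and do not correspond to real aiWARE identifiers or APIs.

```python
from typing import Any, Dict, Optional


def link_to_library(label: str,
                    sentence: int,
                    library_index: Dict[str, str],
                    library_id: Optional[str] = None) -> Dict[str, Any]:
    """Return a library-linked entity object, or a label-only one if unknown."""
    entity_id = library_index.get(label.casefold())
    if entity_id is None:
        # No library match: fall back to the plain label form.
        return {"type": "namedEntity", "label": label, "sentence": sentence}
    linked: Dict[str, Any] = {
        "type": "namedEntity",
        "entityId": entity_id,
        "sentence": sentence,
    }
    if library_id is not None:
        linked["libraryId"] = library_id
    return linked


# Hypothetical index: case-folded entity name -> entity ID (placeholder values).
library_index = {"john smith": "entity-123"}

print(link_to_library("John Smith", 1, library_index, "library-456"))
print(link_to_library("Jane Doe", 2, library_index, "library-456"))
```
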
Additional Technical Documentation Information
Properties
5/7/2024 6:14 PM
5/7/2024 6:15 PM
5/7/2024 6:15 PM