This is the first step in the user guide to building an engine. To see all the steps, see the user guide process page.
Steps
- Clone the repo. You have several options:
- Create the Dockerfile.
Create an empty file named Dockerfile (using, for example, the touch command in bash).
index.js and keyword-extraction.js
The code for an engine can be simple. It must respond to two HTTP requests (corresponding to a GET on /ready and a POST on /process), being sure — in the case of the /process call — to output JSON that obeys an applicable AION sub-schema.
The code for index.js is:
'use strict';
// Minimal AION engine entry point: serves the /ready and /process webhooks
// and delegates all cognition to keyword-extraction.js.
const KeywordExtraction = require("./keyword-extraction.js"),
    express = require('express'),
    multer = require('multer'),
    upload = multer({storage: multer.memoryStorage()}),
    app = express();
// Multer middleware that reads the uploaded "chunk" field into memory (req.file.buffer).
let chunkUpload = upload.single('chunk');
let server = app.listen( 8080 );
// Allow long-running processing calls (10 minutes) before the socket times out.
server.setTimeout( 10 * 60 * 1000 );
// READY WEBHOOK — polled by the platform to confirm the engine is up.
app.get('/ready', (req, res) => {
    res.status(200).send('OK');
});
// PROCESS WEBHOOK — receives one uploaded chunk, returns the engine's JSON output.
// (Not async: nothing here awaits; getOutput is synchronous.)
app.post('/process', chunkUpload, (req, res) => {
    try {
        if (!req.file) {
            // No "chunk" part in the upload: a client error, not a server failure.
            return res.status(400).send('Missing "chunk" file upload.');
        }
        let buffer = req.file.buffer.toString();
        let output = KeywordExtraction.getOutput( buffer, null );
        return res.status(200).send( output );
    } catch (error) {
        // Send the message, not the Error object: Express JSON-stringifies an
        // Error to "{}" because its properties are non-enumerable.
        return res.status(500).send( error.message || String(error) );
    }
});
The "cognition" logic in keyword-extraction.js is a simple keyword-extraction routine that looks like this:
// Extracts a de-duplicated list of words ("keywords") from a chunk of text and
// wraps them in the AION keyword output structure.
//
// buffer  - the chunk text to analyze (string).
// configs - reserved for engine configuration; currently unused.
//
// Returns { validationContracts: ["keyword"], object: [ { type, label }, ... ] }.
// Throws on bad input (e.g. a non-string buffer); the /process webhook's
// try/catch translates that into an HTTP 500 instead of silently returning
// undefined with HTTP 200 as the old swallowed-exception version did.
function getOutput(buffer, configs) {
    // Pass the chunk buffer in, get a de-duplicated word array out.
    const getWordArray = (text) => {
        // Replace every run of characters that is not alphanumeric, hyphen,
        // or apostrophe with a single space, then split on whitespace runs.
        const words = text.replace(/[^0-9a-zA-Z-']+/g, " ").split(/\s+/);
        // Use a Map, not a plain object: a plain object's `in` lookup hits
        // Object.prototype, which silently dropped words like "constructor"
        // or "toString". Keys are lower-cased words; values keep the casing
        // of the first occurrence, in insertion order.
        const undupe = new Map();
        for (const word of words) {
            if (word.length === 0) continue;    // artifacts of split()
            if (word.startsWith("-")) continue; // skip leading-hyphen tokens
            const key = word.toLowerCase();
            if (!undupe.has(key)) undupe.set(key, word);
        }
        return [...undupe.values()];
    };
    // Pass a word array in, get [ keywordObject ] out.
    const getVtnObjectArray = (words) =>
        words.map((word) => ({ type: "keyword", label: word }));
    return {
        "validationContracts": [
            "keyword"
        ],
        "object": getVtnObjectArray(getWordArray(buffer))
    };
}
// CommonJS export; guarded so this body also loads in environments where
// `module` is not defined (e.g. when evaluated as an ES module).
if (typeof module !== "undefined" && module.exports) {
    module.exports = { getOutput };
}
A simple segment engine in Node.js
Here is how a segment (chunk) engine could look if written in Node.js:
// Reusable segment (chunk) engine skeleton: serves /ready and /process and
// delegates all cognition to the user-written my-cognition-logic.js module.
const MyCognitionLogic = require("./my-cognition-logic.js"),
    express = require('express'),
    multer = require('multer'),
    upload = multer({storage: multer.memoryStorage()}),
    app = express();
let chunkUpload = upload.single('chunk'); // Multer middleware: reads the uploaded "chunk" field into req.file.buffer
let server = app.listen( 8080 );
server.setTimeout( 10 * 60 * 1000 ); // allow long processing calls (10 min)
// READY WEBHOOK — polled by the platform to confirm the engine is up.
app.get('/ready', (req, res) => {
    res.status(200).send('OK');
});
// PROCESS WEBHOOK — receives one uploaded chunk, returns the cognition output.
// (Not async: nothing here awaits; getOutput is called synchronously.)
app.post('/process', chunkUpload, (req, res) => {
    try {
        if (!req.file) {
            // No "chunk" part in the upload: a client error, not a server failure.
            return res.status(400).send('Missing "chunk" file upload.');
        }
        let input = req.file.buffer.toString();
        let output = MyCognitionLogic.getOutput( input );
        return res.status(200).send( output );
    } catch (error) {
        // Send the message, not the Error object: Express JSON-stringifies an
        // Error to "{}" because its properties are non-enumerable.
        return res.status(500).send( error.message || String(error) );
    }
});
[Note] This reusable skeleton does no "cognition" per se — it merely delegates cognitive processing to a custom, user-written external module called my-cognition-logic.js.
In this example, the task of reading the file-upload stream is handled by a third-party open-source middleware module called Multer.
package.json
While a package.json file isn't mandatory, one is included in this example:
{
"name": "hello-world",
"version": "1.0.0",
"description": "",
"main": "index.js",
"scripts": {
"test": "echo \"Error: no test specified\" && exit 1"
},
"keywords": [],
"author": "",
"license": "ISC",
"dependencies": {
"express": "^4.17.1",
"multer": "^1.4.1"
}
}
What to do next
Register the engine and upload the file. See Create an engine.