Classificationbox uses machine learning to automatically classify many types of data, including text, images, and other structured or unstructured content. Because it provides continuous, live learning, Classificationbox avoids the need for expensive training sessions, GPUs, or large data sets.
Classificationbox has a variety of uses:
- Learn how your company is perceived by grouping tweets into positive and negative sentiment
- Automatically group photos of cats and dogs
- Group emails into spam and non-spam categories
- Build a classifier to detect the language of a piece of text based on previously taught examples
Classificationbox in aiWARE
You can use Classificationbox in aiWARE by uploading .classificationbox files to a library and then using the Classificationbox engine to process content and classify images or frames from videos.
Run Classificationbox
When you run Classificationbox, it provides an interactive administration console that includes everything you need to get going.
- Make sure you have Docker running with at least 2 CPUs and 4GB RAM.
- Run this code in your terminal to start the box:
MB_KEY="YOURKEYHERE"
docker run -p 8080:8080 -e "MB_KEY=$MB_KEY" machinebox/classificationbox
- Go to http://localhost:8080/ in your browser to see what your box can do.
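Once the box is running, you interact with it over its HTTP API. The sketch below shows the general shape of creating a model, teaching it an example, and asking for a prediction. The endpoint paths, payload fields, and the `sentiment` model name are assumptions based on the Classificationbox HTTP API; verify them against the console at http://localhost:8080/ before relying on them.

```python
import json
import urllib.request

# Base URL of a locally running Classificationbox (see the docker run step above).
BASE_URL = "http://localhost:8080"

def teach_payload(class_name, text):
    """Build the JSON body for a teach request: one text feature per example."""
    return {
        "class": class_name,
        "inputs": [{"key": "text", "type": "text", "value": text}],
    }

def post_json(path, payload):
    """POST a JSON payload to the box and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    # Create a model, teach one example, then ask for a prediction.
    # These paths follow the Classificationbox HTTP API as documented in its
    # console; treat them as a sketch, not an authoritative reference.
    post_json("/classificationbox/models",
              {"id": "sentiment", "name": "sentiment",
               "classes": ["positive", "negative"]})
    post_json("/classificationbox/models/sentiment/teach",
              teach_payload("positive", "I love this product"))
    prediction = post_json("/classificationbox/models/sentiment/predict",
                           {"limit": 1,
                            "inputs": [{"key": "text", "type": "text",
                                        "value": "great service"}]})
    print(prediction)
```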
Updating Classificationbox
If you already have Classificationbox installed, you can update it with the following:
docker pull machinebox/classificationbox:latest
A few tools exist to help you train your classifier:
- imgclass - Train Classificationbox with images from your hard drive
- textclass - Train Classificationbox with text files to build a text-based classifier
Best practices
- The quality of a classifier depends largely on the input data and how you teach the model.
- Aim to have at least 100 examples for each class - exact requirements differ by case.
- Have the same (or very similar) number of examples per class.
- Ensure the quality of the examples you train with, and make sure each example is in the correct class.
- Take a random selection of 80% of your examples for teaching, and use the other 20% for validating. The percentage of the validation set that the model predicts correctly is the model accuracy. You can experiment with different data sets and compare their accuracies to decide which gives you the best results.
- To avoid a biased model, teach examples in random order. Do not teach all the examples for one class in a group; instead, spread the teaching out among all classes.
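The split-and-validate workflow above can be sketched as follows. Here `examples` is a hypothetical list of (class, input) pairs, and `predict` stands in for whatever function calls your trained model:

```python
import random

def split_examples(examples, teach_fraction=0.8, seed=42):
    """Shuffle examples (so classes are interleaved, not taught in blocks)
    and split them into a teaching set and a validation set."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)  # random order avoids a class-order bias
    cut = int(len(shuffled) * teach_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(validation, predict):
    """Fraction of validation examples whose predicted class matches the label."""
    correct = sum(1 for label, value in validation if predict(value) == label)
    return correct / len(validation)
```

Teach the model from the first set, then report `accuracy` over the second; repeating this with different data sets lets you compare them on equal footing.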