The document classification and clustering product helps customers organize their documents into topical sets and assign discovered labels to each document.
The package is a complete solution for:

While most classification and clustering products rely on bag-of-words and co-occurrence statistics, the Lymba document classification and clustering product differs in that it uses the rich output of the K-Platform to categorize and label documents. The K-Platform products that can be configured for use during classification or clustering are:
Feature extraction for document classification and clustering is also configurable.
The clustering step is optional. If the document set is already labeled, or has a meaningful structure on disk such as email folders, then those labels (or the labels implied by that structure) are used instead of discovering new ones through clustering.
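The fallback described above can be illustrated with a minimal Python sketch. It is not the product's implementation or API: the function names `labels_from_folders` and `choose_labels`, the `cluster_fn` hook, and the rule "the parent folder name is the label" are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, Iterable, List, Optional


def labels_from_folders(paths: Iterable[str]) -> Optional[Dict[str, str]]:
    """Map each document path to the name of its parent folder.

    Returns None when any document has no enclosing folder (or when
    there are no documents), meaning the layout carries no usable labels.
    Assumption: the immediate parent folder name is the label.
    """
    labels: Dict[str, str] = {}
    for p in paths:
        parent = PurePosixPath(p).parent.name
        if not parent:  # top-level file: no folder to take a label from
            return None
        labels[p] = parent
    return labels or None


def choose_labels(
    paths: Iterable[str],
    cluster_fn: Callable[[List[str]], Dict[str, str]],
) -> Dict[str, str]:
    """Use folder-derived labels when available, otherwise cluster."""
    docs = list(paths)
    labels = labels_from_folders(docs)
    if labels is not None:
        return labels
    # Hypothetical hook: any function that assigns a label to each path.
    return cluster_fn(docs)
```

With this sketch, a path such as `mail/billing/a.eml` would receive the label `billing`, while a flat list of files with no folders would be passed to whatever clustering function is supplied.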
Document classification and clustering has wide applicability and is often the first step in any Natural Language Processing application that needs to organize and label documents quickly. See PowerAgent™ for an application of this product tailored for Customer Relationship Management.