Turning Information
... into Knowledge

Content recognition and categorization


During the analysis of a document, all nodes in the taxonomy tree that are addressed by the text-analysis process are highlighted, and the ensemble of highlighted nodes indicates the thematic areas covered by the document.
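The highlighting step can be pictured with a small sketch. Everything in it is an illustrative assumption rather than the InfoCodex implementation: the toy taxonomy, the word-to-node lexicon, and the choice to highlight the ancestors of each matched node are all hypothetical.

```python
import re
from collections import Counter

# Hypothetical toy taxonomy: child node -> parent node (None marks a root).
TAXONOMY = {
    "science": None,
    "biology": "science",
    "genetics": "biology",
    "physics": "science",
    "finance": None,
    "banking": "finance",
}

# Hypothetical lexicon mapping surface words to taxonomy nodes.
LEXICON = {
    "gene": "genetics",
    "dna": "genetics",
    "cell": "biology",
    "quantum": "physics",
    "loan": "banking",
}


def highlight_nodes(text):
    """Count how often each taxonomy node is addressed by words in the text."""
    hits = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        node = LEXICON.get(word)
        # Assumption: a hit on a node also highlights all of its ancestors.
        while node is not None:
            hits[node] += 1
            node = TAXONOMY[node]
    return hits


doc = "The gene encodes a protein; DNA damage alters the cell."
highlighted = highlight_nodes(doc)
print(sorted(highlighted.items()))
# → [('biology', 3), ('genetics', 2), ('science', 3)]
```

In this sketch, the set of nodes with non-zero counts plays the role of the "ensemble of highlighted nodes" for the document.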

The thematic areas of each document are then projected into a 100-dimensional content space. Finally, the documents are categorized by a self-organizing neural network (Kohonen map), which groups them into "well-organized bookshelves." The neural network also provides a scientifically founded similarity measure, based on information-theoretical principles, that allows documents to be compared according to their content.
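As a rough illustration of how a self-organizing map can group documents, here is a minimal textbook-style sketch in plain Python. Only the 100 dimensions come from the text. The map size, learning-rate and neighbourhood schedules, and the synthetic document vectors are assumptions made for the example, and the sketch does not reproduce the InfoCodex projection or its similarity measure.

```python
import math
import random

N_DIMS = 100        # content-space dimensionality stated in the text
ROWS, COLS = 3, 3   # hypothetical map size (each cell is one "bookshelf")


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def best_matching_unit(weights, vec):
    """Return the (row, col) of the cell whose weight vector is closest to vec."""
    return min(weights, key=lambda cell: sq_dist(weights[cell], vec))


def train_som(vectors, epochs=10, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small Kohonen map; schedules are illustrative assumptions."""
    rng = random.Random(seed)
    weights = {(r, c): [rng.random() for _ in range(N_DIMS)]
               for r in range(ROWS) for c in range(COLS)}
    total = epochs * len(vectors)
    step = 0
    for _ in range(epochs):
        order = vectors[:]
        rng.shuffle(order)
        for vec in order:
            frac = step / total
            lr = lr0 * (1.0 - frac)                 # learning rate decays to 0
            sigma = sigma0 * (1.0 - frac) + 0.3     # neighbourhood shrinks
            br, bc = best_matching_unit(weights, vec)
            for (r, c), w in weights.items():
                grid_d2 = (r - br) ** 2 + (c - bc) ** 2
                h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
                # Pull each cell toward the document, weighted by grid proximity.
                for i in range(N_DIMS):
                    w[i] += lr * h * (vec[i] - w[i])
            step += 1
    return weights


# Synthetic stand-ins for projected documents: two themes, five documents each.
data_rng = random.Random(1)


def make_doc(active):
    return [(1.0 if i in active else 0.0) + data_rng.uniform(0, 0.05)
            for i in range(N_DIMS)]


docs = ([make_doc(range(0, 10)) for _ in range(5)]
        + [make_doc(range(50, 60)) for _ in range(5)])

weights = train_som(docs)
shelves = [best_matching_unit(weights, d) for d in docs]
print(shelves)  # one map cell per document
```

Documents that land in the same or neighbouring cells are treated as thematically close; a production system would need a principled similarity measure on top of this, which the sketch leaves out.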

This content recognition and categorization technology works across several different languages, recognizing, for example, that an English translation of a French, German, Italian, or Spanish document has the same content and contains the same information as the original document.

Unlike other systems, InfoCodex categorizes documents automatically, without any user intervention. This eliminates the cumbersome and costly training otherwise required for document classification, a significant advantage.


26. - 27.09.2016 Keynote at Big Data Analytics and Data Mining, London

14.03.16 WebPlatform Ares4 with InfoCodex presented at CeBIT 2016 Hannover

16.09.15 Presentation at SEMANTICS Vienna 2015 "Scientific Discovery by Machine ..." (Session 1.4)

02.06.2015 Presentation at Swiss Competitive Intelligence

26.02.13 Semantic Tech Turns up Biomarkers

13.02.13 Discovery of Novel Biomarkers (joint paper with Merck & Co)