Are you planning to build a knowledge center
for corporate information? The number of information sources
to include can be very large, ranging from central file
servers, user PCs, and e-mail archives (including attachments) - all spread
across the corporate network - to external Web sites and Internet search
results. All of this information is merged into a single
ordered document collection.
The spider agents within InfoCodex automatically gather the information from the disparate information sources and prepare it for further analysis by the content analysis engine.
InfoCodex supports all commonly used document formats such as MS Word, PDF, HTML, XML, PPT, PS, Excel, TXT, RTF, various mail formats including attachments, various image file formats, and any other file format for which the user can supply an i-filter.
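The gathering step described above can be illustrated with a minimal sketch. This is not InfoCodex code and does not use any InfoCodex API: the function name `gather_documents`, the `SUPPORTED_EXTENSIONS` set, and the choice to identify formats by file extension are all assumptions made for illustration only. It simply walks a set of directories and collects the files whose extension matches a list loosely based on the formats named above.

```python
from pathlib import Path

# Hypothetical extension list, loosely based on the formats named in the text.
# A real product would likely detect formats by content, not by extension.
SUPPORTED_EXTENSIONS = {
    ".doc", ".docx", ".pdf", ".html", ".htm", ".xml",
    ".ppt", ".ps", ".xls", ".txt", ".rtf", ".eml",
}


def gather_documents(roots):
    """Walk each root directory and return the files with a supported extension.

    `roots` is an iterable of directory paths (for example, mounted file shares).
    The result is a sorted list of pathlib.Path objects.
    """
    found = []
    for root in roots:
        for path in Path(root).rglob("*"):
            # Compare extensions case-insensitively so "REPORT.PDF" is included.
            if path.is_file() and path.suffix.lower() in SUPPORTED_EXTENSIONS:
                found.append(path)
    return sorted(found)
```

A call such as `gather_documents(["/srv/shared", "/home/alice/docs"])` (both paths are placeholders) would return one merged list of candidate documents from both locations, which mirrors the idea of combining disparate sources into a single collection.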