Are you planning to build a knowledge center
for corporate information? The number of information sources to be
included can be very large: central file servers, user PCs and e-mail
archives (including attachments), all spread across the corporate
network, as well as external Web sites and Internet search results.
All of this information is merged into a single ordered document
collection.
The spider agents within InfoCodex automatically gather the
information from these disparate sources and prepare it for further
analysis by the content analysis engine.
InfoCodex supports all commonly used document formats such
as MS Word, PDF, HTML, XML, PPT, PS, Excel, TXT, RTF, various mail formats
including attachments, various image file formats, and any other file
format for which the user can supply an i-filter.
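
As a rough illustration of the gathering step described above, the
following Python sketch walks a set of directories and keeps only the
files whose extension is on a list of supported formats. It is not
InfoCodex code: the function name gather_documents and the extension
list SUPPORTED_EXTENSIONS are assumptions made for this example, and a
real spider would also have to cover mail archives, Web sites and
search results.

```python
import os
from pathlib import Path

# Hypothetical extension list, loosely based on the formats named above.
# It is an assumption for this sketch, not the product's actual list.
SUPPORTED_EXTENSIONS = {
    ".doc", ".docx", ".pdf", ".html", ".htm", ".xml",
    ".ppt", ".ps", ".xls", ".txt", ".rtf",
}


def gather_documents(roots):
    """Walk each root directory and return the supported files found."""
    found = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = Path(dirpath) / name
                # Compare extensions case-insensitively (".PDF" == ".pdf").
                if path.suffix.lower() in SUPPORTED_EXTENSIONS:
                    found.append(path)
    return sorted(found)
```

Calling gather_documents(["/srv/files", "/home/alice"]) (both paths are
placeholders) would return one sorted list of paths drawn from both
locations, which mirrors the idea of merging many sources into a single
collection.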