Autonomy’s flagship search product is the IDOL content engine. Naturally, this has been incorporated into the iManage product line, starting with some customers on WorkSite Server 8.4, and many on WorkSite 8.5.
The 8.2 Verity search engine is at the end of its life, as far as iManage support is concerned. No new fixes will be provided for Verity issues, and some of the newest file formats cannot be parsed for inclusion in the full-text index.
The WorkSite 8.3 Vivisimo search engine rarely comes up in discussion. When iManage was bought by Autonomy, it relinquished its rights to work closely with this competing search product. Therefore, WorkSite customers using the 8.3 indexer should also be making plans to upgrade to the IDOL indexer.
Now that you’ve been convinced to upgrade your indexer, will your existing hardware be up for the job?
What do you have?
Autonomy support will provide you with a recommendation based on specific details of your proposed platform and usage:
- How many users
- How much content, per WorkSite database
- Current and projected query load
- Geographic distribution – where are your users, where are your databases
- Hardware specifications of the system(s) where you hope to run IDOL
Number of Users
A simplistic way to determine the number of users is to count how many active records exist in the user table, or the number of user licenses renewed with iManage. However, the actual workload that concerns the indexer is the number of users who would carry out searches. You can determine how many users really work with iManage by reviewing the WorkSite database’s activity table: mhgroup.dochistory.
Do you anticipate any significant changes soon in the number of users? Additional departments being trained on WorkSite, new offices, etc?
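As a rough illustration of the activity-table approach, the sketch below counts distinct users with recent activity. It assumes the activity records have already been exported as (user ID, timestamp) pairs; the row layout, the function name, and the 90-day window are all assumptions made for the example, not part of the WorkSite schema.

```python
from datetime import datetime, timedelta

def count_active_users(activity_rows, as_of, window_days=90):
    """Count distinct users with at least one activity record in the window.

    activity_rows: iterable of (user_id, timestamp) pairs, a hypothetical
    export of the activity table. window_days is an arbitrary choice.
    """
    cutoff = as_of - timedelta(days=window_days)
    return len({user for user, when in activity_rows if cutoff <= when <= as_of})

# Made-up sample data, for illustration only.
rows = [
    ("JSMITH", datetime(2011, 5, 2)),
    ("JSMITH", datetime(2011, 5, 9)),
    ("ALEE", datetime(2011, 4, 20)),
    ("OLDUSER", datetime(2009, 1, 15)),
]
active = count_active_users(rows, as_of=datetime(2011, 6, 1))
print(active)
```

Comparing this figure with the raw count of user records gives a sense of how many licensed users are likely to generate search traffic.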
Amount of Content
First, determine how many items are in the database. It’s much more reliable to count the records at the database level than to rely on document numbers: that way the total is not thrown off by deleted items, numbers consumed by folder profiles, or an offset in the first document number used.
The size of the content can be determined in a couple of ways: by looking at total disk space used, or by average file size.
The number ten million is significant – it’s recommended that a single content engine be configured to index no more than 10,000,000 items. The work should be spread across multiple content engines in order to ensure better ongoing performance.
What sort of growth do you project? How many emails and other items are being added monthly?
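To put the ten-million guideline together with a growth projection, here is a minimal sketch of the arithmetic. The item counts are made-up figures, the function names are hypothetical, and linear monthly growth is an assumption; Autonomy support’s actual recommendation will weigh more factors than this.

```python
import math

MAX_ITEMS_PER_ENGINE = 10_000_000  # the per-engine guideline quoted above

def engines_needed(item_count, max_per_engine=MAX_ITEMS_PER_ENGINE):
    """Smallest number of content engines that keeps each at or under the limit."""
    return max(1, math.ceil(item_count / max_per_engine))

def projected_items(current_items, monthly_additions, months):
    """Projected item count, assuming steady (linear) monthly growth."""
    return current_items + monthly_additions * months

# Made-up figures, for illustration only.
current = 14_500_000
monthly = 250_000
today = engines_needed(current)
in_three_years = engines_needed(projected_items(current, monthly, 36))
print(today, in_three_years)
```

Sizing for the projected figure rather than today’s count avoids having to redistribute the index shortly after going live.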
Query load
The query load determines how far the system needs to scale. How many queries happen per day, on average and at peak load? How many queries must be handled per second? What is the time-out threshold commonly found in user settings?
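If you can obtain a list of query timestamps (from a search log, for example), the daily and per-second figures can be summarized along these lines. This is only a sketch: the source of the timestamps and the function name are assumptions, and the sample data is invented.

```python
from collections import Counter
from datetime import datetime

def query_load(timestamps):
    """Summarize query volume from a list of datetime objects."""
    if not timestamps:
        raise ValueError("no query timestamps supplied")
    # Bucket queries by calendar day and by whole second.
    per_day = Counter(ts.date() for ts in timestamps)
    per_second = Counter(ts.replace(microsecond=0) for ts in timestamps)
    return {
        "avg_per_day": len(timestamps) / len(per_day),  # averaged over days with activity
        "peak_per_day": max(per_day.values()),
        "peak_per_second": max(per_second.values()),
    }

# Invented sample data, for illustration only.
sample = [
    datetime(2011, 6, 1, 9, 0, 0),
    datetime(2011, 6, 1, 9, 0, 0, 500000),
    datetime(2011, 6, 1, 9, 0, 1),
    datetime(2011, 6, 2, 14, 30, 5),
]
load = query_load(sample)
print(load)
```

Note that the daily average here only covers days on which at least one query was logged, so quiet weekends with no entries will not pull it down.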
Next time – Hardware
In my next post, I’ll dig into the details of the hardware question: will you virtualize, where will the files be stored, and so on.