Tuesday 21 April 2009

FAST ESP Doc Processing Latency - partner briefing


Complexity of Content - The more complex the data, the more operations need to be performed on it.

Complexity of Pipeline - The greater the number of stages, the longer it takes for content to pass through the pipeline.

Processing batch size - Larger batches are better for throughput, even though each batch takes longer to process and so has higher latency. They also require fewer write operations in the long term. A batch size of around 1 MB is optimal.
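To make the batch-size trade-off concrete, here is a minimal sketch of a toy cost model. It is not taken from FAST ESP: the function name and both cost figures (a fixed per-batch overhead, standing in for the write operation, and a per-document processing cost) are hypothetical values chosen only for illustration.

```python
def batch_metrics(batch_docs, per_doc_s=0.01, per_batch_overhead_s=0.5):
    """Return (latency_s, throughput_docs_per_s) for one batch.

    Toy model with assumed numbers: every batch pays a fixed overhead once,
    plus a constant cost for each document it contains.
    """
    latency = per_batch_overhead_s + batch_docs * per_doc_s
    throughput = batch_docs / latency
    return latency, throughput


for n in (1, 10, 100, 1000):
    latency, throughput = batch_metrics(n)
    print(f"{n:>5} docs/batch: latency {latency:7.2f} s, "
          f"throughput {throughput:6.1f} docs/s")
```

Under these assumptions both numbers rise with batch size: the fixed overhead is spread over more documents, so throughput improves, while each batch takes longer to complete.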

Large loads into memory - Loading large dictionaries, matchers and lists into memory slows down processing rates.

No. of nodes - In an ideal scenario, there is always an idle document processor available whenever content is submitted. If this is not the case, the submit call will have to wait for a document processor to become available.
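The effect of having too few document processors can be illustrated with a small queue simulation. This is only a sketch under stated assumptions, not a model of FAST ESP itself: the function name, the fixed arrival interval and the fixed per-document service time are all hypothetical.

```python
import heapq


def average_wait(num_processors, num_docs=1000,
                 arrival_interval_s=0.1, service_time_s=0.35):
    """Average time a submit call waits for an idle processor.

    Assumed toy workload: one document arrives every arrival_interval_s
    seconds and each takes service_time_s seconds to process.
    """
    # Min-heap of the times at which each processor next becomes idle.
    free_at = [0.0] * num_processors
    heapq.heapify(free_at)

    total_wait = 0.0
    for i in range(num_docs):
        arrival = i * arrival_interval_s
        earliest_free = heapq.heappop(free_at)
        start = max(arrival, earliest_free)  # wait only if nobody is idle
        total_wait += start - arrival
        heapq.heappush(free_at, start + service_time_s)
    return total_wait / num_docs


for processors in (2, 3, 4):
    print(f"{processors} processors: "
          f"average wait {average_wait(processors):.2f} s")
```

With these assumed figures, four processors are enough for every submit call to find an idle processor immediately, while with two or three the backlog, and therefore the wait, keeps growing.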
