Monday 12 January 2009

Deployment Considerations for FAST ESP

Balancing hardware COST vs. Down time Cost.

Downtime costs money, but so does redundancy! We need to weigh up the costs and tradeoffs. FAST has a built in fault tolerance mechanism called FIX ML.

Solution:

The FIXML from which index is generated could reside on a high availability SAN. Re-generation of the index from this could take in the order of minutes to hours.

Redundancy.

How is Fault Tolerance handled?
Answering.
Fault tolerant can be made with reference to 5 main subsystems.

- Administration sub-system

- Processing sub-system, including the content distributor + Document processing

- Search Subsystem, including the Query & Results Processing service + the Search service.

-Indexing Subsystem, including the index dispatcher + the indexer

- Connector Subsystem

We can scale independently each of these modules.

There are 3 fault tolerance models.

Fail Safe (full functionality if fail). For example, redundancy with multiple clusters. Careerbuilder have two identical installations, with identical crawler setups, identical document processing. One is located in a data centre on the east coast. The other in a data centre on the west coast.

Fail Soft (limited functionality if fail). For example, search fault tolerance. A 2 row architecture.

Fail Stop (No functionality if fail). For example, a single node install.


- Each service can be made safe independently. We need to balance system cost with fault tolerance requirements.
- Built-in software load balancing functionality is provided by the search API. This conducts a round robin to the QR server. This is suitable for smaller installs.

It is advisable to use an external load balancer for larger installations.

Different parts of the system can have varying degrees of fault tolerance.

Simple mirroring of the search nodes will provide high availability to the level required for most applications. This is what we refer to as adding a new row.

FAST is made up of 5 key modules or subsystems:

The search service can be configured to be Fail Safe while the content submission service can be configured to be Fail Soft.

In summary, the extra expense of adding redundancy should be calculated against the consequences of downtime and likelihood of severe failures

Follow up.
- What is the cost to the business of downtime?
- What are the likelihood of severe failures?

- How long can employees tolerate waiting for new content to be searchable? When does it become unacceptable?
- Indexing, Document processing, search can run on separate nodes? Which of these would require fault-tolerance?

What recovery processes are in built to allow recovery of the system in the event of data

loss?

FAST ESP can automatically back-up content using its native FASTXML format. In the case where there was only one row for some reason. We would advise backing up the FIXML on a SAN or tape. If a column were to go down we could re-build the index from the backed up FIXML without the requirement to re-feed and re-process the content. The FI XML can provide a system roll-back to recover data.

How do we replace a failed node or add increased capacity?

If a node goes down we can easily replace that node in isolation. FAST is a distributed system based on the Grid computing model. We can invest where it hurts.

If we need faster ingestion – we simply append a new node & increase the Doc Processors on that node.

If we need a Higher QPS – we simply append a new node & add several Query Processers to that node.

Assuming that hardware is identical among all nodes, the scaling of query rate is near-linear with respect to the number of rows since there are no interactions or co-dependencies between them.

Increasing Data volume we can add additional columns and each column will handle a partition of the entire index.

For example, 3 columns have 1/3 of the index each. The content distributor distributes the ingested documents evenly across the nodes.

Fast ESP footprint.

How does the index reflect the size of the original content? What is the Ratio? What is the installation footprint size?

Answer.

Size of the index depends on at least 50 factors including

- Document types.

- Number of navigators required.

- Number of sort able fields.

- Linguistic expansion.

- Number of queries per second.

- Rate of ingestion.

Index size can be anywhere between 20 – 200% of original content.

An example case of when it could be 200% of original size, is for a DB of rows, let’s say 2 K each.

Performing document side linguistic expansion - Adding lemmatization, entity extraction, linguistic processing. That 2 K can now become 10 K.

An example case of when it could be 20% of original size is for Word, PDF, Excel, PPT files that have massive amounts of formatting data and images. The formatting noise and images can be stripped out and the text compressed.

Response.

It may be larger but is the objective to reduce the storage size or to reduce the cost to perform more tasks in a shorter time?

We typically have on average a 30-50% smaller footprint than our competitors including DBs. How do we know? We replace a lot of them - Endeca at Bestbuy.com.

Follow Up.
- How many documents will actually be indexed?
- What types of documents? What is the make-up?
- What is the average size of each document?
- What is the growth rate of the collection?
- How many navigators are required? sortable? Integer, double, string?

Performance.
How many queries can be handled on a single node? How many documents can be handled on a single node?
ANSWERING:
On a single node FAST ESP can handle between 100 and 300 queries per second & index 3 to 5 giga bytes per hour. That is 600,000 to 1 million 500 KB word docs.

We have had instances of up to 15 million documents on a single box.
At Thompson Financial they are using our next gen product Mars to achieve roughly a 200ms indexing latency.

Based on my personal hands on experiences, I personally worked on a POC for Play. Here we were able to support up to 200 QPS for 7 million records on a single box.

FIXML- XML that follows an internal FAST xml format. It is the flat form from which the binary index is created.

fail-soft system - When system components fail, a fail-soft system continues to operate, but with reduced functionality. Such systems are also often said to provide “graceful degradation”.

fail-stop system - will not provide any functionality if system components fail. It may return false results – a situation often referred to as a “Byzantine failure”.

SAN or a NAS - “storage area network” and “network attached storage”. The servers used are remote high-performance drives shared across multiple machines, and often connected with a fibre channel.

RAID “redundant array of independent disks” – a configuration of multiple drives used to provide fault tolerance (via mirroring or parity checking) or higher performance (via striping). Frequently used to increase search- engine performance and provide a certain level of redundancy.

MTBF - It is the mean time between failures - the total elapsed time subtracted by downtime divided by the number of failures of the component.

Grid Computing The creation of a "virtual supercomputer" composed of a network of loosely-coupled computers, acting in concert to perform very large tasks.

The Advantage. Each node can be purchased as commodity hardware at lower cost than a supercomputer. Economies of producing commodity hardware, compared to the lower efficiency of designing and constructing a small number of custom supercomputers.


1 comment:

Daniel Tunkelang said...

Just for the record, FAST did not replace Endeca at Best Buy. But I would be curious to see FAST or whoever was powering search at Best Buy at the time explain this.