Sunday, February 22, 2009

SINEQUA: SPEED AND VOLUME, keeping the relevance and the rich functionality


 

Sinequa has just finished a first series of tests on our new version of Sinequa CS. I must confess I'm very proud.

Even without any specific optimization, the results are generating a lot of enthusiasm here. Sinequa has long been ahead in terms of relevance and functionality. When others did not see the point of managing security, linguistics or connectivity, we had already solved these issues three years ago. We have now developed a new architecture that builds the options needed for enterprise search into the kernel of the technology, while delivering first-class performance. Sinequa's technology today offers an unparalleled level of performance for this level of functionality. Detailed product data sheets are coming soon, but in the meantime, here are a few points:

Number of queries with a large number of users: up to 1700 simultaneous queries per second on a single dual-processor server (average response time around 10 milliseconds). In production, our most demanding customer today handles up to 400 queries per second, but with multiple servers; what we see here is an improvement of around 50 times over the previous release of the technology. More importantly, it is more than sufficient to serve any customer's needs.

Capacity to manage large volumes per server: a single server has indexed around 100 million enterprise documents in a few dozen hours, and the server's limits had not been reached. The server is a four-processor machine with 32 GB of RAM (yes… it takes what it takes), so this is very promising. It represents a huge improvement for Sinequa, especially considering that performance scales linearly with the number of servers. We can now index the entirety of the enterprise content without consuming a lot of hardware resources, in a reasonable time and with sufficiently frequent refreshes. For precise indexing times and volumes, I'll wait until I have all the data per document type, since a PDF, a Word document, an Excel spreadsheet and an HTML document can be quite different. As an example, one entry-level server (4 processors and 8 GB of RAM) can index a little more than 1000 press documents per second, which means around 100 million documents in 24 hours (per server).
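As a quick sanity check on the press-document figure, here is a minimal sketch that converts a per-second indexing rate into a daily total, and back. The function names are hypothetical, and it assumes the rate is sustained for the full 24 hours, which is an assumption rather than something measured here.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds

def docs_per_day(docs_per_second):
    """Documents indexed in 24 hours at a constant rate (assumed sustained)."""
    return docs_per_second * SECONDS_PER_DAY

def rate_needed(total_docs, hours):
    """Constant rate (documents per second) needed to index total_docs in the given hours."""
    return total_docs / (hours * 3600)

# Exactly 1000 documents per second gives 86.4 million per day.
print(docs_per_day(1000))

# Reaching 100 million in 24 hours needs roughly 1157 documents per second,
# i.e. "a little more than 1000".
print(round(rate_needed(100_000_000, 24)))
```

So "around 100 million in 24 hours" is a rounding of a rate slightly above 1000 documents per second.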

Capacity to index a database on an entry-level server (4 processors and 8 GB of RAM): 5000 rows (or database objects) per second, which gives around 20 million rows per hour and, in the end, 100 million database objects indexed in a little over 5 hours. Maximum number of insertions per second: 10,000, which means 100 million in less than three hours. I recently read the performance figures of a competitor who was proudly indexing 30 million database objects in ten hours on one server. Sinequa is 6 to 7 times faster, and we are talking about a competitor whose main competitive advantage is supposed to be scalability.
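The arithmetic behind these database figures can be sketched as follows. This is only an illustration: the helper name is hypothetical, the rates are assumed to be constant over the whole run, and the competitor is assumed to have run on comparable hardware, which the figures quoted do not confirm.

```python
def hours_to_index(total_objects, objects_per_second):
    """Hours needed to index total_objects at a constant rate (assumed sustained)."""
    return total_objects / objects_per_second / 3600

TARGET = 100_000_000

# Typical rate quoted: 5000 objects per second.
typical_hours = hours_to_index(TARGET, 5_000)

# Maximum rate quoted: 10,000 insertions per second.
peak_hours = hours_to_index(TARGET, 10_000)

# Competitor figure quoted: 30 million objects in ten hours.
competitor_rate = 30_000_000 / (10 * 3600)  # objects per second
speedup = 5_000 / competitor_rate

print(f"typical: {typical_hours:.2f} h")      # about 5.56 hours
print(f"peak: {peak_hours:.2f} h")            # about 2.78 hours
print(f"competitor: {competitor_rate:.0f} objects/s")
print(f"speedup: {speedup:.1f}x")             # about 6x at exactly 5000 objects/s
```

Using the rounded figure of 20 million objects per hour instead of exactly 5000 per second gives a ratio of about 6.7, hence "6 to 7 times".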

We are eager to see this new release of Sinequa exposed to users and content inside the enterprise; the rich functionality of Sinequa, combined with this level of performance, should give results that users will notice and vote for. We don't have long to wait, as the first customer will be in production next month…
