Query-Driven Indexing in Large-Scale Distributed Systems: Efficient query processing with distributed indexes - Softcover

Skobeltsyn, Gleb

 
9783639150179: Query-Driven Indexing in Large-Scale Distributed Systems: Efficient query processing with distributed indexes

Synopsis

Efficient and effective search in large-scale data repositories requires complex indexing solutions deployed on a large number of servers. Commercial Web search engines already rely on such systems to return relevant query results while keeping processing times within a comfortable sub-second limit. Nevertheless, the exponential growth of content on the Web poses serious scalability challenges. Coping with these challenges requires novel indexing solutions that remain scalable while preserving search accuracy. In this work we introduce and explore the concept of query-driven indexing - an index construction strategy that uses caching techniques to adapt to the querying patterns expressed by users. We propose abandoning the strict distinction between indexing and caching, and instead building a distributed indexing structure, or distributed cache, that is optimized for the current query load. Our experimental and theoretical analysis shows that query-driven indexing is especially beneficial when the content is (geographically) distributed in a Peer-to-Peer network.

"Synopsis" may belong to another edition of this book.

About the author

Gleb Skobeltsyn is a post-doctoral researcher at École Polytechnique Fédérale de Lausanne (EPFL), Switzerland. He received his PhD from EPFL in January 2009. His research focuses on query-driven mechanisms for P2P Information Retrieval, Web search engine architectures, caching techniques, large-scale data management and social networks.

"About this title" may belong to another edition of this book.