Scalable Similarity Search in Metric Spaces

Authors

BATKO Michal GENNARO Claudio PASQUALE Savino ZEZULA Pavel

Year of publication 2004
Type Article in Proceedings
Conference Pre-proceedings of the Sixth Thematic Workshop of the EU Network of Excellence DELOS
MU Faculty or unit

Institute of Computer Science

Citation
Field Computer hardware and software
Keywords distributed data; scalable structures; similarity search; metric space
Description Similarity search in metric spaces represents an important paradigm for content-based retrieval of many applications. Existing centralized search structures can speed-up retrieval, but they do not scale up to large volume of data because the response time is linearly increasing with the size of the searched file. The proposed GHT* index is a scalable and distributed structure. By exploiting parallelism in a dynamic network of computers, the GHT* achieves practically constant search time for similarity range queries in data-sets of arbitrary size. The amount of replicated routing information on each server increases logarithmically. At the same time, the potential for interquery parallelism is increasing with the growing data-sets because the relative number of servers utilized by individual queries is decreasing. All these properties are verified by experiments on a prototype system using real-life data-sets.
Related projects:

You are running an old browser version. We recommend updating your browser to its latest version.

More info