An Approach to Large Scale Interactive Retrieval of Cultural Heritage

Authors: Takami, Masato; Bell, Peter; Ommer, Björn
Editors: Reinhard Klein and Pedro Santos
Date: 2014-12-16
Year: 2014
ISBN: 978-3-905674-63-7
ISSN: 2312-6124
DOI: https://doi.org/10.2312/gch.20141307
URL: https://diglib.eg.org/handle/10.2312/gch.20141307.087-095
Pages: 87-95

Abstract: Large scale digitization campaigns are simplifying the accessibility of a rapidly increasing number of images from cultural heritage. However, digitization alone is not sufficient to effectively open up these valuable resources. Retrieval and analysis within these datasets is currently based mainly on manual annotation and laborious preprocessing. Not only is this a tedious task that rapidly becomes infeasible due to the enormous data load; we also risk being biased toward seeing only what an annotator focused on beforehand. Thus, a lot of potential is wasted. One of the most prevalent tasks is discovering similar objects in a dataset to find relations therein. The majority of existing systems for this task detect similar objects using visual feature keypoints. While fast, these methods are limited to detecting only near duplicates because of their keypoint-based representation. In this work we propose a search method that can detect similar objects even if they exhibit considerable variability. Our procedure learns models of the appearance of objects and trains a classifier to find related instances. We address a central problem of such learning-based methods: the need for appropriate negative and positive training samples. To avoid a highly complicated hard-negative mining stage, we propose a pooling procedure for gathering generic negatives. Moreover, a bootstrap approach is presented to aggregate positive training samples. A comparison with existing search methods on cultural heritage benchmark problems demonstrates that our approach yields significantly improved detection performance. Moreover, we show examples of searching across different types of datasets, e.g., drafts and photographs.
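
The abstract only outlines the training procedure. As an illustration (not the authors' implementation), a minimal sketch of exemplar-style retrieval with a pooled set of generic negatives and positive bootstrapping might look as follows; the feature dimensions, parameters, and toy data below are assumptions for demonstration only.

```python
# Hypothetical sketch: train a linear classifier for one query exemplar against a
# fixed pool of generic negatives, then bootstrap positives from top retrievals.
import numpy as np
from sklearn.svm import LinearSVC

def retrieve(query_feat, dataset_feats, generic_negatives, rounds=3, top_k=5):
    """Rank dataset regions for one query; iteratively add the highest-scoring
    retrievals as additional positive training samples (bootstrapping)."""
    positives = [query_feat]
    for _ in range(rounds):
        X = np.vstack([np.vstack(positives), generic_negatives])
        y = np.concatenate([np.ones(len(positives)),
                            -np.ones(len(generic_negatives))])
        clf = LinearSVC(C=1.0).fit(X, y)          # pooled negatives, no hard-negative mining
        scores = clf.decision_function(dataset_feats)
        ranked = np.argsort(-scores)
        # Aggregate the top-ranked candidates as new positives for the next round.
        positives = [query_feat] + [dataset_feats[i] for i in ranked[:top_k]]
    return ranked, scores

# Toy usage with random vectors standing in for real image descriptors.
rng = np.random.default_rng(0)
dataset = rng.normal(size=(1000, 128))    # descriptors of candidate regions
negatives = rng.normal(size=(500, 128))   # pooled generic negatives
query = dataset[42] + 0.1 * rng.normal(size=128)
ranking, _ = retrieve(query, dataset, negatives)
print(ranking[:10])
```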