Spatial Coding for Large-scale Partial-duplicate Image Search

Talk title: Spatial Coding for Large-scale Partial-duplicate Image Search

Speaker: Qi Tian (Professor)



  The bag-of-visual-words model is widely used in state-of-the-art large-scale image retrieval systems. It represents each image as a bag of visual words by quantizing each local image descriptor to its closest visual word. However, feature quantization reduces the discriminative power of local features, which causes many false visual word matches. Recently, several geometric verification methods have been proposed to check the geometric consistency of matched features in a post-processing step. Although they improve retrieval precision, they are either too computationally expensive to ensure real-time response or limited to local verification. To address this dilemma, we propose a novel scheme, Spatial Coding, designed for large-scale partial-duplicate image retrieval. The spatial relationships among visual words are encoded in global region maps. Based on these region maps, we develop a spatial verification approach that efficiently detects false matches of local features and consequently improves retrieval performance greatly.
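
  The verification idea can be illustrated with a minimal sketch. It is not the speaker's implementation: it assumes the region maps reduce to two binary matrices recording, for every pair of matched features, whether one lies to the right of or above the other, and it ignores rotation, which the actual scheme is reported to handle. The function names `relative_maps` and `spatial_verify` are hypothetical. Matches whose relative positions disagree between the query and the database image are removed one at a time, worst first.

```python
def relative_maps(points):
    """Build binary maps of pairwise relative positions (assumed map layout).

    xmap[i][j] is 1 when point j lies strictly to the right of point i;
    ymap[i][j] is 1 when point j lies strictly above point i.
    """
    n = len(points)
    xmap = [[int(points[j][0] > points[i][0]) for j in range(n)] for i in range(n)]
    ymap = [[int(points[j][1] > points[i][1]) for j in range(n)] for i in range(n)]
    return xmap, ymap


def spatial_verify(query_pts, db_pts):
    """Return the indices of matches that are spatially consistent.

    query_pts[k] and db_pts[k] are the (x, y) positions of the k-th matched
    feature in the query image and in the database image.
    """
    keep = list(range(len(query_pts)))
    while len(keep) > 1:
        qx, qy = relative_maps([query_pts[k] for k in keep])
        dx, dy = relative_maps([db_pts[k] for k in keep])
        n = len(keep)
        conflicts = [0] * n
        for i in range(n):
            for j in range(n):
                # XOR flags a pair whose relative order differs between images.
                c = (qx[i][j] ^ dx[i][j]) + (qy[i][j] ^ dy[i][j])
                conflicts[i] += c
                conflicts[j] += c
        worst = max(range(n), key=conflicts.__getitem__)
        if conflicts[worst] == 0:
            break  # every remaining pair agrees in both images
        del keep[worst]  # drop the most inconsistent match and re-check
    return keep


# Three matches related by scaling and translation, plus one false match.
query = [(0, 0), (1, 1), (2, 2), (3, 3)]
database = [(10, 10), (12, 12), (14, 14), (5, 20)]
print(spatial_verify(query, database))  # prints [0, 1, 2]
```

  Because only the relative order of feature positions is compared, this check is unaffected by translation and uniform scaling, and each pass uses nothing more than bitwise comparisons. The pairwise maps grow quadratically with the number of matched features, so the sketch is meant for the handful of matches between one query and one candidate image.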
  Experiments in partial-duplicate image retrieval, using a database of one million images from ImageNet, show that our approach can effectively detect duplicate images under rotation, scale changes, occlusion, and background clutter at very low computational cost. Spatial coding achieves a 53% improvement in mean average precision and a 46% reduction in time cost over the baseline bag-of-visual-words approach. It performs even better than full geometric verification while being much less computationally expensive. Our demo on a 10-million-image dataset further demonstrates the scalability of our approach.

  Qi Tian is currently a Full Professor in the Department of Computer Science at the University of Texas at San Antonio (UTSA). During 2008-2009, he took a one-year faculty leave at Microsoft Research Asia (MSRA) in the Media Computing Group. He received his Ph.D. in ECE from the University of Illinois at Urbana-Champaign (UIUC). Dr. Tian’s research interests focus on multimedia information retrieval, and he has published over 180 refereed journal and conference papers. He was a co-author of an ACM ICMCS 2012 Best Paper, an MMM 2013 Best Paper, a Top 10% Paper Award in MMSP 2011, a Best Student Paper in ICASSP 2006, and a Best Paper Candidate in PCM 2007. His research projects are funded by NSF, ARO, DHS, Google, FXPAL, NEC, SALSI, CIAS, Akiira Media Systems, HP and UTSA. He received the 2010 ACM Service Award. He is a guest editor of IEEE Transactions on Multimedia, Journal of Computer Vision and Image Understanding, etc., an associate editor of IEEE Transactions on Circuits and Systems for Video Technology (TCSVT) and Multimedia Systems Journal (MSJ), and a member of the editorial boards of Journal of Multimedia (JMM) and Journal of Machine Vision and Applications (MVA). Dr. Tian is a Chaired Professor at Tsinghua University and a Guest/Adjunct Professor at the University of Science and Technology of China (USTC), Zhejiang University, Xi’an Jiaotong University, Xidian University, and the Institute of Computing Technology, Chinese Academy of Sciences.