

This thesis addresses the detection of duplicated images. More precisely, the developed system is able to discriminate possibly modified copies of original images from other, unrelated images. The proposed method is referred to as content-based since it relies only on content analysis techniques rather than on image tagging, as done in watermarking. The proposed content-based duplicate detection system classifies a test image by associating it with a label that corresponds to one of the known original images. The classification is performed in four steps. In the first step, the test image is described using global statistics about its content. In the second step, the most likely original images are efficiently selected using a spatial indexing technique called R-Tree. The third step consists in using binary detectors to estimate the probability that the test image is a duplicate of each of the original images selected in the second step. Indeed, each original image known to the system is associated with an adapted binary detector, based on a support vector classifier, that estimates the probability that a test image is one of its duplicates. Finally, the fourth and last step consists in choosing the most probable original by picking the one with the highest estimated probability. Comparative experiments have shown that the proposed content-based image duplicate detector greatly outperforms detectors that use the same image description but rely on simpler distance functions rather than on a classification algorithm. Additional experiments are carried out to compare the proposed system with existing state-of-the-art methods.
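
The four classification steps can be illustrated with a minimal sketch. It is not the system developed in this thesis: every function name, feature, and parameter below is hypothetical, and two components are deliberately replaced by simple stand-ins. The R-Tree is replaced by a linear scan over axis-aligned boxes, and each support vector classifier is replaced by a logistic-style score that decreases with feature distance. The rejection threshold is likewise an assumption, added so that unrelated images can be left unlabeled.

```python
import math
from statistics import mean, pstdev


def describe(pixels):
    """Step 1: describe an image by global statistics of its content.
    Assumption: mean and standard deviation of pixel intensities."""
    return (mean(pixels), pstdev(pixels))


def select_candidates(feature, originals, radius):
    """Step 2: select the most likely originals with a range query.
    Stand-in for the R-Tree: a linear scan testing an axis-aligned box."""
    return [
        label
        for label, ref in originals.items()
        if all(abs(f - r) <= radius for f, r in zip(feature, ref))
    ]


def duplicate_probability(feature, ref, scale=5.0):
    """Step 3: binary detector attached to one original.
    Stand-in for a trained support vector classifier: a score in (0, 1]
    that equals 1 at zero distance and decays as the distance grows."""
    distance = math.dist(feature, ref)
    return 2.0 / (1.0 + math.exp(distance / scale))


def classify(pixels, originals, radius=20.0, threshold=0.5):
    """Run the four steps; return the label of the most probable
    original, or None when no candidate is convincing (assumed rule)."""
    feature = describe(pixels)
    candidates = select_candidates(feature, originals, radius)
    scores = {
        label: duplicate_probability(feature, originals[label])
        for label in candidates
    }
    if not scores:
        return None
    # Step 4: pick the original with the highest estimated probability.
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None


# Toy usage with two known originals, given as flat intensity lists.
originals = {
    "a": describe([10, 10, 20, 20]),
    "b": describe([200, 210, 220, 230]),
}
near_copy_label = classify([12, 11, 21, 19], originals)
unrelated_label = classify([100, 120, 90, 110], originals)
```

In this toy run the slightly perturbed copy is labeled "a", while the unrelated image falls outside every candidate box and receives no label. A real implementation would substitute a true spatial index in step 2 and classifiers trained per original in step 3.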
