# Difference of Gaussians (DoG) - how is it used to find scale invariant features?

## Recommended Posts

Hi,

I have a theoretical question about the detector part of the SIFT algorithm which, as explained by D. Lowe in his paper, uses the DoG to detect keypoints in an input image.

Problem formulation: let x be my original image, and let y be a 2x-zoomed version of x.

In the SIFT algorithm, I don't understand how the detector can find the same given keypoint from image x in image y, i.e. how exactly the 'scale space' DoG pyramid is used to achieve this 'scale invariance'.

Please note that I have watched quite a few videos, e.g. https://www.youtube.com/watch?v=U0wqePj4Mx0&t=1s, and read a few tutorials about it, e.g. http://docs.opencv.org/trunk/da/df5/tutorial_py_sift_intro.html and a few others. I understand how the DoG pyramid is constructed, by successive blurring of the image. I also understand that keypoints are the maxima or minima found by comparing each pixel with its neighbours in the adjacent images of the DoG pyramid. (I understand that the difference of two images blurred with different sigmas gives an edge-like image.)

However, every tutorial seems to 'skip' the fundamental part that would clearly explain why this makes the detector scale invariant.

Or, in other words, can someone give an 'intuitive' explanation of how and why the DoG is used in such a way that the keypoints it finds are invariant to scale? E.g. how can it match/find the same keypoint, let's say the corner of a box, in both image x and image y? (Of course the corner will be 'bigger' in image y, because y is zoomed 2x as stated in the problem formulation.)

best regards,

OD


The intuitive explanation is that key features (e.g. box corners) should persist in the image at any scale, whether it has been down-sampled, rotated or stretched. Because the detector searches for extrema over scale as well as position, each keypoint is found together with the scale (sigma) at which its DoG response peaks. Zooming the image 2x does not destroy that extremum; it just moves it one octave up the pyramid, so the same corner is detected again, at roughly double the sigma.
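One way to make that persistence concrete: measure, at a feature's centre, which sigma gives the strongest DoG response, and check that it roughly doubles when the image is zoomed 2x. A toy check, where `best_sigma`, the sigma grid, and the factor 1.26 are my own illustrative choices, not Lowe's:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def best_sigma(img, y, x, sigmas):
    # Return the sigma whose DoG response at pixel (y, x) is strongest.
    responses = [abs((gaussian_filter(img, 1.26 * s)
                      - gaussian_filter(img, s))[y, x]) for s in sigmas]
    return float(sigmas[int(np.argmax(responses))])

# x: a single blob of size sigma=3; y: the same image zoomed 2x.
yy, xx = np.mgrid[:64, :64]
x_img = np.exp(-((yy - 32)**2 + (xx - 32)**2) / (2 * 3.0**2))
y_img = zoom(x_img, 2)

sigmas = [1.5 * 1.26**i for i in range(10)]
s_x = best_sigma(x_img, 32, 32, sigmas)
s_y = best_sigma(y_img, 64, 64, sigmas)
# s_y should come out roughly 2 * s_x (within one step of the sigma grid):
# the same feature is detected again, one octave higher in the pyramid.
```

This is exactly the role of the pyramid: the feature's spatial extremum survives zooming, and the scale axis of the search tells you by how much it was zoomed.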