Scale-invariant feature operator
In the fields of computer vision and image analysis, the scale-invariant feature operator (or SFOP) is an algorithm to detect local features in images. The algorithm was published by Förstner et al. in 2009.[1]
Algorithm
The scale-invariant feature operator (SFOP) is based on two theoretical concepts:
- the spiral model[2]
- the feature operator[3]
Desired properties of keypoint detectors:
- Invariance and repeatability for object recognition
- Accuracy to support camera calibration
- Interpretability: corners and circles in particular should be among the detected keypoints (see figure)
- As few control parameters as possible with clear semantics
- Complementarity to known detectors
Figure: scale-invariant corner/circle detector.
Theory
Maximize the weight
Maximize the weight {\displaystyle w}, defined as the inverse of the variance of a point {\displaystyle \mathbf {p} }:
{\displaystyle w(\mathbf {p} ,\alpha ,\tau ,\sigma )=\left(N(\sigma )-2\right){\frac {\lambda _{min}(M(\mathbf {p} ,\alpha ,\tau ,\sigma ))}{\Omega (\mathbf {p} ,\alpha ,\tau ,\sigma )}}}
comprising:
1. the image model[2]
- Distance d of an edge from a reference point p in a spiral feature
{\displaystyle {\begin{aligned}\Omega (\mathbf {p} ,\alpha ,\tau ,\sigma )&=\sum _{n=1}^{N(\sigma )}\left[(\mathbf {q} _{n}-\mathbf {p} )^{T}\mathbf {R} _{\alpha }\nabla _{\tau }g(\mathbf {q} _{n})\right]^{2}G_{\sigma }(\mathbf {q} _{n}-\mathbf {p} )\\&=N(\sigma )\operatorname {tr} \left\{\mathbf {R} _{\alpha }\nabla _{\tau }\nabla _{\tau }^{T}\mathbf {R} _{\alpha }^{T}*\mathbf {p} \mathbf {p} ^{T}G_{\sigma }(\mathbf {p} )\right\}\end{aligned}}}
2. the smaller eigenvalue of the structure tensor
{\displaystyle \underbrace {M(\mathbf {p} ,\alpha ,\tau ,\sigma )} _{\text{structure tensor}}=\underbrace {G_{\sigma }(\mathbf {p} )} _{\text{weighted summation}}*\underbrace {(\mathbf {R} _{\alpha }\nabla _{\tau }\nabla _{\tau }^{T}\mathbf {R} _{\alpha }^{T})} _{\text{squared rotated gradients}}}
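As a rough illustration of how the two ingredients combine, the following sketch evaluates the weight for a single candidate point. It assumes the three independent entries of the symmetric 2x2 structure tensor and the value of Ω have already been computed elsewhere. The function names and the parameter `n_pixels` (standing in for N(σ)) are hypothetical and not taken from the original publication.

```python
import math


def smaller_eigenvalue(m11, m12, m22):
    """Smaller eigenvalue of the symmetric 2x2 matrix [[m11, m12], [m12, m22]]."""
    half_trace = 0.5 * (m11 + m22)
    # Half the gap between the two eigenvalues (closed form for 2x2 matrices).
    radius = math.hypot(0.5 * (m11 - m22), m12)
    return half_trace - radius


def keypoint_weight(m11, m12, m22, omega, n_pixels):
    """Evaluate w = (N - 2) * lambda_min(M) / Omega for one candidate point.

    n_pixels plays the role of N(sigma); omega is the model error Omega.
    Both are assumed to be supplied by the caller.
    """
    if omega <= 0.0:
        raise ValueError("omega must be positive")
    return (n_pixels - 2) * smaller_eigenvalue(m11, m12, m22) / omega


# Example: M = [[2, 1], [1, 2]] has eigenvalues 1 and 3.
print(keypoint_weight(2.0, 1.0, 2.0, omega=0.5, n_pixels=12))  # 20.0
```

The closed-form eigenvalue avoids a general eigen-decomposition, which is convenient when the weight has to be evaluated at every pixel and scale.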
Reduce the search space
Reduce the 5-dimensional search space by
- linking the differentiation scale {\displaystyle \tau } to the integration scale
- {\displaystyle \tau =\sigma /3}
- solving for the optimal {\displaystyle {\hat {\alpha }}} using the model
- {\displaystyle \Omega (\alpha )=a-b\cos(2\alpha -2\alpha _{0})}
- and determining the parameters from three angles, e.g.
- {\displaystyle \Omega (0^{\circ }),\Omega (60^{\circ }),\Omega (120^{\circ })\quad \rightarrow \quad a,b,\alpha _{0}\quad \rightarrow \quad {\hat {\alpha }}}
- pre-selection possible:
- {\displaystyle \alpha =0^{\circ }\,\rightarrow \,{\text{junctions}},\quad \alpha =90^{\circ }\,\rightarrow \,{\text{circular features}}}
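The three-angle fit can be written in closed form, because the samples at 0°, 60° and 120° are equally spaced over one period of cos(2α). The sketch below recovers a, b and α₀ from the three values of Ω. The function name is hypothetical, and taking the optimal angle to be α₀ (where the model is minimal for b ≥ 0) is an assumption of this sketch: it treats a smaller Ω as a larger weight and the eigenvalue term as independent of α.

```python
import math


def fit_omega_model(omega_0, omega_60, omega_120):
    """Fit Omega(alpha) = a - b*cos(2*alpha - 2*alpha0) to three samples.

    Returns (a, b, alpha0_deg) with b >= 0 and alpha0_deg in [0, 180).
    """
    angles_deg = (0.0, 60.0, 120.0)
    values = (omega_0, omega_60, omega_120)

    # The cosine terms cancel over three equally spaced samples.
    a = sum(values) / 3.0

    # c = b*cos(2*alpha0) and s = b*sin(2*alpha0), via discrete orthogonality.
    c = (2.0 / 3.0) * sum(
        (a - v) * math.cos(math.radians(2.0 * t)) for v, t in zip(values, angles_deg)
    )
    s = (2.0 / 3.0) * sum(
        (a - v) * math.sin(math.radians(2.0 * t)) for v, t in zip(values, angles_deg)
    )

    b = math.hypot(c, s)
    alpha0_deg = (0.5 * math.degrees(math.atan2(s, c))) % 180.0
    return a, b, alpha0_deg


# Samples generated from a = 5, b = 2, alpha0 = 30 degrees.
a, b, alpha0_deg = fit_omega_model(4.0, 4.0, 7.0)
alpha_hat_deg = alpha0_deg  # assumed optimum: the minimum of the fitted model
```

When b is close to zero the model is nearly constant in α, so the recovered α₀ carries little information and should be treated with care.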
Filter potential keypoints
- non-maxima suppression over scale, space and angle
- thresholding the isotropy {\displaystyle \lambda _{2}(M)}: the eigenvalues characterize the shape of the keypoint, and the smallest eigenvalue has to be larger than a threshold {\displaystyle T_{\lambda }} derived from the noise variance {\displaystyle V(n)} and the significance level {\displaystyle S}:
- {\displaystyle T_{\lambda }(V(n),\tau ,\sigma ,S)={\frac {N(\sigma )}{16\pi \tau ^{4}}}V(n)\chi _{2,S}^{2}}
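The threshold is straightforward to evaluate numerically. The sketch below is a minimal illustration: the function names are hypothetical, N(σ) is passed in as `n_sigma` because its definition is not given here, and the differentiation scale is tied to the integration scale by τ = σ/3 as described earlier. For two degrees of freedom the chi-square quantile has a closed form, so no statistics library is needed. Whether the significance level S corresponds to the cumulative probability S or 1 − S is left to the caller.

```python
import math


def chi2_quantile_2dof(probability):
    """Quantile of the chi-square distribution with 2 degrees of freedom.

    Uses the closed-form CDF F(x) = 1 - exp(-x / 2).
    """
    if not 0.0 <= probability < 1.0:
        raise ValueError("probability must lie in [0, 1)")
    return -2.0 * math.log(1.0 - probability)


def eigenvalue_threshold(noise_variance, sigma, n_sigma, chi2_value):
    """T_lambda = N(sigma) / (16*pi*tau^4) * V(n) * chi2, with tau = sigma / 3."""
    if sigma <= 0.0:
        raise ValueError("sigma must be positive")
    tau = sigma / 3.0
    return n_sigma / (16.0 * math.pi * tau ** 4) * noise_variance * chi2_value


# Example call with illustrative (not published) parameter values.
chi2_value = chi2_quantile_2dof(0.95)
threshold = eigenvalue_threshold(
    noise_variance=0.01, sigma=3.0, n_sigma=50.0, chi2_value=chi2_value
)
```

A candidate would then be kept only if the smaller eigenvalue of its structure tensor exceeds `threshold`.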
Results
Interpretability of SFOP keypoints
Results of different detectors on a Siemens star:
- SFOP: junctions red, circular features cyan
- Edge-based Regions
- Intensity-based Regions
References
- [1] Förstner, Wolfgang; Dickscheid, Timo; Schindler, Falko (2009). "Detecting interpretable and accurate scale-invariant keypoints". 2009 IEEE 12th International Conference on Computer Vision. pp. 2256–2263. CiteSeerX 10.1.1.667.2530. doi:10.1109/ICCV.2009.5459458. ISBN 978-1-4244-4420-5.
- [2] Bigün, J. (1990). "A Structure Feature for Some Image Processing Applications Based on Spiral Functions". Computer Vision, Graphics, and Image Processing. 51 (2): 166–194.
- [3] Förstner, Wolfgang (1994). "A Framework for Low Level Feature Extraction". European Conference on Computer Vision. Vol. 3. Stockholm, Sweden. pp. 383–394.