
Steerable filter

Image processing filter that can be rotated to any orientation


In image processing, a steerable filter is an orientation-selective filter that can be computationally rotated to any direction. Rather than designing a new filter for each orientation, a steerable filter is synthesized from a linear combination of a small, fixed set of "basis filters". This approach is efficient and is widely used for tasks that involve directionality, such as edge detection, texture analysis, and shape-from-shading.[1][2]

The principle of steerability has been generalized in deep learning to create equivariant neural networks, which can recognize features in data regardless of their orientation or position.[3][4]


Example

A common example of a steerable filter is the first derivative of a two-dimensional Gaussian function. This filter responds strongly to oriented image features like edges. It is constructed from two basis filters: the partial derivative of the Gaussian with respect to the horizontal direction ($G_x$) and the vertical direction ($G_y$).

If $G$ is the Gaussian function, and $G_x = \partial G / \partial x$ and $G_y = \partial G / \partial y$ are its partial derivatives (which measure the rate of change in the $x$ and $y$ directions, respectively), a new filter oriented at an angle $\theta$ can be synthesized with the formula:

$$G_\theta = \cos(\theta)\, G_x + \sin(\theta)\, G_y$$

Here, the basis filters $G_x$ and $G_y$ are weighted by $\cos(\theta)$ and $\sin(\theta)$ to "steer" the filter's sensitivity to the desired orientation. This is equivalent to taking the dot product of the unit direction vector $(\cos\theta, \sin\theta)$ with the filter's gradient $\nabla G$.[1]
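
The steering identity can be verified numerically. The sketch below is a minimal illustration added here (the grid size and the width $\sigma$ are arbitrary choices, not from the article): it builds the two Gaussian-derivative basis filters on a sampling grid and checks that their cosine/sine combination equals the Gaussian's derivative taken directly along the direction $\theta$.

```python
import numpy as np

# Sampling grid and Gaussian width (arbitrary illustrative choices)
sigma = 2.0
x, y = np.meshgrid(np.arange(-8, 9), np.arange(-8, 9))

G = np.exp(-(x**2 + y**2) / (2 * sigma**2))

# Basis filters: partial derivatives of the Gaussian
Gx = -x / sigma**2 * G   # dG/dx
Gy = -y / sigma**2 * G   # dG/dy

# Steer to an arbitrary orientation theta
theta = np.deg2rad(30)
G_theta = np.cos(theta) * Gx + np.sin(theta) * Gy

# Direct construction: derivative of G along the unit vector (cos θ, sin θ)
u = np.cos(theta) * x + np.sin(theta) * y   # coordinate along θ
G_direct = -u / sigma**2 * G

assert np.allclose(G_theta, G_direct)
```

Because the identity is exact, a filter response at any orientation can be obtained by storing only the responses to $G_x$ and $G_y$ and blending them with the two trigonometric weights, which is what makes the approach computationally efficient.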


Generalization in deep learning: Equivariant neural networks


The concept of steerability is foundational to equivariant neural networks, a class of models in deep learning designed to understand symmetries in data.[5] A network is considered equivariant to a transformation (like a rotation) if transforming the input and then passing it through the network produces the same result as passing the input through the network first and then transforming the output. Formally, for a transformation $g$ and a network $f$, this property is defined as $f(g \cdot x) = g \cdot f(x)$.
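
The property is easy to check numerically for a symmetry that is exact on a pixel grid. The sketch below is an illustration added here (not from the article): translation stands in for the transformation $g$, and a single convolution stands in for the network $f$.

```python
import numpy as np
from scipy.ndimage import convolve

rng = np.random.default_rng(0)
image = rng.standard_normal((32, 32))
kernel = rng.standard_normal((3, 3))

def f(x):
    # "Network": one convolution layer; 'wrap' makes translation an exact symmetry
    return convolve(x, kernel, mode="wrap")

def g(x):
    # Transformation: translate the image by (2, 5) pixels
    return np.roll(x, shift=(2, 5), axis=(0, 1))

# Equivariance: f(g(x)) == g(f(x))
assert np.allclose(f(g(image)), g(f(image)))
```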

This built-in understanding of geometry makes models more data-efficient. For example, a network equivariant to rotation does not need to be shown an object in multiple orientations to learn to recognize it; it inherently understands that a rotated object is still the same object. This leads to better generalization and performance, particularly in scientific applications.[3]

Mathematical foundation

Equivariant neural networks use principles from group theory to create operations that respect geometric symmetries, such as the SO(3) group for 3D rotations or the E(3) group for rotations and translations.[3]

Instead of learning standard filter kernels, these networks learn how to combine a fixed set of basis kernels. These basis functions are chosen so that they have well-defined behaviors under transformation groups.

  • Spherical harmonics are frequently used as basis functions because they form a complete set of functions that behave predictably under rotation, making them ideal for creating steerable 3D kernels (see the sketch after this list).[6]
  • Features within the network are treated as geometric tensors, which are mathematical objects (like scalars or vectors) that are "typed" by their behavior under transformations. These types correspond to the irreducible representations (irreps) of the group.[3]
  • The tensor product is the fundamental operation used to combine these typed features in a way that preserves equivariance, guaranteeing that the network as a whole respects the desired symmetry.[3]
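
One instance of that predictable behavior is simple to verify: rotating the sphere about the $z$-axis by an angle $\alpha$ multiplies $Y_\ell^m$ by the phase $e^{im\alpha}$. The check below is an illustration added here; it uses SciPy's sph_harm, whose argument convention places the azimuthal angle before the polar angle.

```python
import numpy as np
from scipy.special import sph_harm

l, m = 2, 1
alpha = 0.7     # rotation angle about the z-axis
theta = 1.1     # polar angle
phi = 0.4       # azimuthal angle

# SciPy convention: sph_harm(m, l, azimuthal, polar)
rotated = sph_harm(m, l, phi + alpha, theta)
phased = np.exp(1j * m * alpha) * sph_harm(m, l, phi, theta)

assert np.isclose(rotated, phased)
```

For a general 3D rotation, the $2\ell + 1$ components of degree $\ell$ mix among themselves through a Wigner D-matrix, which is exactly the "typed" transformation behavior that irreducible representations formalize.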

Frameworks like e3nn simplify the construction of these networks by automating the complex mathematics of irreducible representations and tensor products.[3]
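
As a concrete sketch of what such a framework automates, the snippet below checks that a tensor product of two vector features is equivariant: rotating the inputs and then combining them matches combining them first and then rotating the output. This is an illustration added here, assuming the e3nn o3 module (Irreps, FullTensorProduct, rand_matrix, D_from_matrix) as found in recent e3nn releases, and is not taken from the article.

```python
import torch
from e3nn import o3  # assumes the e3nn library (pip install e3nn)

# "1o" features: 3D vectors (degree 1, odd parity)
irreps_in = o3.Irreps("1x1o")

# The full tensor product decomposes 1o (x) 1o into 0e + 1e + 2e
tp = o3.FullTensorProduct(irreps_in, irreps_in)

x1 = irreps_in.randn(1, -1)
x2 = irreps_in.randn(1, -1)

R = o3.rand_matrix()                     # random 3D rotation
D_in = irreps_in.D_from_matrix(R)        # how the inputs transform
D_out = tp.irreps_out.D_from_matrix(R)   # how the output transforms

rotate_then_combine = tp(x1 @ D_in.T, x2 @ D_in.T)
combine_then_rotate = tp(x1, x2) @ D_out.T

assert torch.allclose(rotate_then_combine, combine_then_rotate, atol=1e-5)
```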

Applications

Steerable and equivariant models are highly effective for problems with inherent geometric symmetries, particularly in scientific applications.


References


