[WIP] Remove Gaussian blur code alternatives that are exotic or didn't work very well #159

Draft: wants to merge 1 commit into base branch develop

@griwodz (Member) commented Aug 6, 2024

This PR cannot build before PR #156 is merged into it.

Description

PopSift had a wide variety of downscaling and Gaussian filtering modes. These were written to increase the potential parallelism of PopSift and to explore the trade-off between sequential operation and wider Gaussian filters, the impact of various filter widths on quality, etc.

In practice, it seems that users stay with the default. The less successful downscaling/filtering options have therefore been removed.

Two options have been retained from the original PopSift code. Both implement the classical sequence that starts by scaling the input image, performs Gaussian filtering incrementally through the levels of an octave, and downscales the first level of the following octave from the third-to-last level of the previous octave. These are the non-interpolating and the interpolating approaches.
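
The build order described above can be sketched as a small planning function. This is only an illustration, not PopSift code: the name `pyramid_plan`, the tuple layout, and the choice of 6 levels per octave in the usage note are assumptions made for the example.

```python
def pyramid_plan(num_octaves, levels_per_octave):
    """List (octave, level, source) steps in an order in which they can be computed.

    Hypothetical helper for illustration; it only records where each
    level's pixels come from, it does not touch any image data.
    """
    plan = []
    for octave in range(num_octaves):
        if octave == 0:
            # The very first level is created by scaling the input image.
            plan.append((0, 0, "scaled input image"))
        else:
            # Later octaves start from the third-to-last level of the previous octave.
            src_level = levels_per_octave - 3
            plan.append((octave, 0, f"downscale octave {octave - 1} level {src_level}"))
        for level in range(1, levels_per_octave):
            # Within an octave, each level is blurred incrementally from the one before.
            plan.append((octave, level, f"blur octave {octave} level {level - 1}"))
    return plan
```

With `pyramid_plan(2, 6)`, the first level of octave 1 is taken from level 3 of octave 0, and every other level depends only on its predecessor in the same octave.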

The classical "non-interpolating" approach

It takes the neighbouring pixels and multiplies them with the weights from the pre-computed Gaussian filter. The width of this filter varies for each level to ensure a regular scale space.
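
As a rough illustration of this approach, the following sketch filters one image row on the CPU. It is not the CUDA kernel from this PR: `gauss_weights` and `filter_row_direct` are hypothetical names, the radius is passed in by the caller rather than derived per level, and clamping at the border is an assumption.

```python
import math

def gauss_weights(sigma, radius):
    """Normalized 1D Gaussian weights for offsets -radius..radius (illustrative)."""
    raw = [math.exp(-(k * k) / (2.0 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def filter_row_direct(row, weights):
    """Weighted sum of neighbouring pixels; reads outside the row are clamped."""
    radius = len(weights) // 2
    last = len(row) - 1
    out = []
    for x in range(len(row)):
        acc = 0.0
        for k, w in enumerate(weights):
            src = min(max(x + k - radius, 0), last)  # clamp to the image border (assumption)
            acc += w * row[src]  # one multiplication per neighbouring pixel
        out.append(acc)
    return out
```

A vertical pass would apply the same weights along columns, so a full 2D blur is two such separable passes.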

The alternative "interpolating" approach

It starts with the same Gaussian filters, although their width is increased to a multiple of 2. The weights of two neighbouring cells g[n] and g[n+1] are then reformulated as a relative weight between these two cells, i.e.:
g[n]*I + g[n+1]*J = a*b*I + (1-a)*b*J = b*(a*I + (1-a)*J), where b = g[n] + g[n+1] and a = g[n]/(g[n] + g[n+1])

The appropriate values for b can then be read from an array, while the neighbouring pixels I and J can be read through the texture engine with linear interpolation using the weight a. This means that the term a*I + (1-a)*J is handled by the texture hardware, leaving only one multiplication per 2 pixels.
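
The reformulation can be checked with a small CPU emulation. This is a sketch under several assumptions: `tex_linear` stands in for a linearly filtered, clamped texture fetch (real CUDA texture coordinates include a half-texel offset that is ignored here), an odd tap count is padded with one zero weight to reach an even width, and all function names are hypothetical.

```python
import math

def tex_linear(row, pos):
    """Emulate a linearly filtered, clamped 1D texture fetch at real position pos."""
    last = len(row) - 1
    i = math.floor(pos)
    frac = pos - i
    lo = row[min(max(i, 0), last)]
    hi = row[min(max(i + 1, 0), last)]
    return (1.0 - frac) * lo + frac * hi

def pair_weights(weights):
    """Turn taps g[n], g[n+1] into (b, offset) pairs, padding to an even tap count."""
    g = list(weights)
    if len(g) % 2:
        g.append(0.0)  # assumption: widen the filter with a zero tap
    pairs = []
    for n in range(0, len(g), 2):
        b = g[n] + g[n + 1]
        a = g[n] / b  # relative weight of the first pixel I
        pairs.append((b, 1.0 - a))  # fetching at I's position + (1-a) yields a*I + (1-a)*J
    return pairs

def filter_row_interpolated(row, weights):
    """Gaussian row filter using one interpolated fetch and one multiply per tap pair."""
    radius = len(weights) // 2
    pairs = pair_weights(weights)
    out = []
    for x in range(len(row)):
        acc = 0.0
        for m, (b, offset) in enumerate(pairs):
            first = x - radius + 2 * m  # position of pixel I for this pair
            acc += b * tex_linear(row, first + offset)
        out.append(acc)
    return out
```

For a 5-tap filter this needs 3 fetches and 3 multiplications per output pixel instead of 5, and the result matches the plain weighted sum up to floating-point rounding. On a GPU the interpolation weight is quantized by the texture unit, so the hardware result would be slightly less exact than this emulation.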

Features list

  • This PR does not add any features. It removes features.
  • Faster and less accurate downscaling and Gaussian filtering modes have been removed.
  • Enums for downscaling modes have been changed: "OpenCV mode" has been removed, and "PopSift mode" and "VLFeat mode" have been renamed to reflect their actual functions.

Implementation remarks

  • Removed the config param ScalingMode (always use default).
  • Removed fixed scaling code.
  • Removed code to downscale everything directly from input image.
  • Removed the narrower Gauss filter width called "OpenCV mode".
  • Removed functions to interpolate from first image plane.
  • Removed specialized version to create very first level from input image.
  • Removed Gauss filter tables for direct downscaling using absolute tables.
  • Removed deprecated scaling mode "OpenCV". OpenCV was buggy when this code was written. It has improved since then.
  • Removed downscaling by interpolation, which could not be called with any parameter.
  • Restructured the calling code for pyramid building to make sure that host code that starts the CUDA kernels is located in the same code file, making the CUDA kernels static. This prevents the overhead for linkable CUDA code.
  • Simplified the solution with absolute sources. Returned to a solution without shuffle and identical code structure for horizontal and vertical Gaussian filtering.
  • Simplified and unified code for absolute source interpolated Gaussian filtering.
  • Use horiz_from_input_image exclusively for octave 0. Direct downscaling is now only used for the input image. Note that initial blur is assumed for every input image, even when it is later interpreted as initially unblurred. That does make a difference, but is apparently recommended.
  • Renamed extrema refinement modes to have more intuitive names. They are no longer tied to PopSift vs VLFeat (except that the command line parameters of the test code retain the old terms so far).

@griwodz griwodz added the "in progress" and "cuda" (issues related to cuda versions) labels Aug 6, 2024
@griwodz griwodz changed the title from "Remove Gaussian blur code alternatives that are never used or didn't work very well." to "[WIP] Remove Gaussian blur code alternatives that are exotic or didn't work very well" Aug 6, 2024
@griwodz griwodz self-assigned this Aug 6, 2024
@griwodz griwodz marked this pull request as draft August 12, 2024 10:57
work very well.

Remove the config param ScalingMode (always use default).
Remove fixed scaling code.
Remove code to downscale everything directly from input image.
Remove the narrower Gauss filter width called "OpenCV mode".
Remove functions to interpolate from first image plane.
Remove specialized version to create very first level from input image.
Remove Gauss filter tables for direct downscaling using absolute tables.
Remove deprecated scaling mode "OpenCV".

OpenCV was buggy when this code was written. It has improved since then.
Also downscaling by interpolation, which could not be called with any parameter, is removed.
Restructure the calling code for the last 2 pyramid building functions

Move host code for normalized source kernel into kernel's file.
Normalized source mode is only used for the input image. It uses the normalization feature of CUDA textures to scale the input image while creating the first octave.
Simplify the solution with absolute sources.
Return to a solution without shuffle and identical code structure for horizontal and vertical Gaussian filtering.
Host functions to call Gaussian filtering from point textures moved into the kernels' code file.
Host functions to call Gaussian filtering from interpolated textures moved into the kernels' code file.
Simplified and unified code for absolute source interpolated Gaussian filtering.
Use horiz_from_input_image exclusively for octave 0.
Direct downscaling is now only used for the input image. Note that initial blur is assumed for every input image, even when it is later interpreted as initially unblurred. That does make a difference, but is apparently recommended.
Extrema refinement modes have more intuitive names and are no longer tied to PopSift vs VLFeat
(except that the command line parameters of the test code retain the old terms so far).
@griwodz griwodz force-pushed the dev/prune-pyramid-code branch from c94da94 to 9a20c15 Compare August 15, 2024 07:28