sledzias/libcvd-cl

libcvd-cl

libcvd-cl implements several computer vision algorithms in a simple and extensible framework of standard OpenCL and C++.

It shares the concepts of its algorithms with libcvd, but the algorithms in libcvd-cl have been redesigned from scratch for highly parallel architectures.

Included algorithms

  • Simple image blur

    • Blur grayscale images (BlurGrayStep)
    • Blur colour images (BlurRichStep)
  • Remarks:

    • OpenCL code is pre-generated with a given convolution kernel.
  • Point cloud pre-processing

    • Convert (x,y) to (u,v) or (u,v,q) for a known camera (ToUvqUvStep)
    • Filter positions by depth (ClipDepthStep)
    • Perform arbitrary pixel->value mappings (FxyStep)
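As a rough illustration of the first two pre-processing steps, the sketch below assumes a simple pinhole camera and treats q as inverse depth. Both are assumptions: the library keeps camera information per pixel (CameraState), and the names `pixel_to_uvq`, `clip_depth`, `fx`, `fy`, `cx` and `cy` are hypothetical, not part of libcvd-cl.

```python
def pixel_to_uvq(x, y, depth, fx, fy, cx, cy):
    """Map pixel (x, y) with a known depth to normalised (u, v, q).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); q is taken to be inverse depth (an assumption).
    """
    if depth <= 0:
        raise ValueError("depth must be positive")
    u = (x - cx) / fx
    v = (y - cy) / fy
    q = 1.0 / depth
    return (u, v, q)


def clip_depth(points, near, far):
    """Keep only (x, y, depth) points with near <= depth <= far."""
    return [p for p in points if near <= p[2] <= far]


# Filter by depth first, then convert the surviving points.
points = [(320, 240, 2.0), (100, 50, 0.1), (400, 300, 9.0)]
kept = clip_depth(points, near=0.5, far=5.0)
uvq = [pixel_to_uvq(x, y, d, fx=500.0, fy=500.0, cx=320.0, cy=240.0)
       for x, y, d in kept]
```

In an OpenCL version each point would map to one work-item, since no point depends on any other.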
  • FAST

    • Find corners in grayscale images (PreFastGrayStep then FastGrayStep)
    • Find corners in colour images (PreFastRichStep then FastRichStep)
  • Remarks:

    • Tests 16 pixels in a ring, with variable corner ring size and threshold.
    • OpenCL on CPU is slow, but OpenCL on a reasonable GPU is faster than C++ on CPU.
    • Code is almost entirely branch-free, intended for GPU and not CPU.
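The 16-pixel ring test can be sketched in plain Python as follows. This is a reference sketch of the general FAST segment test, not the library's kernel: it uses branches freely, whereas the OpenCL code is described as almost branch-free. Reading "corner ring size" as the minimum length of a contiguous bright or dark arc is an assumption, and `is_fast_corner` and `min_run` are hypothetical names.

```python
# 16 offsets (dx, dy) on a radius-3 circle, in order around the ring.
RING = [(0, -3), (1, -3), (2, -2), (3, -1), (3, 0), (3, 1), (2, 2), (1, 3),
        (0, 3), (-1, 3), (-2, 2), (-3, 1), (-3, 0), (-3, -1), (-2, -2), (-1, -3)]


def longest_circular_run(flags):
    """Length of the longest run of True values, treating flags as a ring."""
    n = len(flags)
    best = run = 0
    for f in flags + flags:  # doubling the list handles wrap-around
        run = run + 1 if f else 0
        best = max(best, run)
    return min(best, n)


def is_fast_corner(image, x, y, threshold, min_run=9):
    """Segment test at (x, y); image is a list of rows of grey values."""
    if not (3 <= y < len(image) - 3 and 3 <= x < len(image[0]) - 3):
        raise ValueError("pixel is too close to the image border")
    centre = image[y][x]
    ring = [image[y + dy][x + dx] for dx, dy in RING]
    brighter = [p > centre + threshold for p in ring]
    darker = [p < centre - threshold for p in ring]
    run = max(longest_circular_run(brighter), longest_circular_run(darker))
    return run >= min_run
```

A pixel is reported as a corner when at least `min_run` consecutive ring pixels are all brighter, or all darker, than the centre by more than `threshold`.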
  • HIPS

    • Build descriptors for grayscale images (HipsGrayStep or HipsBlendGrayStep)
    • Build descriptors for colour images (HipsRichStep or HipsBlendRichStep)
    • Remove descriptors with high bit count (HipsClipStep)
    • Match descriptors by brute force (HipsFindStep)
    • Build balanced tree/forest in C++ (HipsMakeTreeStep)
    • Search balanced tree/forest in C++ (lossy, lossless) and OpenCL (lossy only) (HipsTreeFindStep)
  • Remarks:

    • Tests 64 pixels in concentric circles.
    • Search kernels have optional naive rotational invariance using barrel shift.
    • OpenCL on CPU is slow, but OpenCL on a reasonable GPU is faster than C++ on CPU.
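The clipping, brute-force matching and barrel-shift ideas above can be illustrated with a small sketch. It assumes each descriptor fits in a single 64-bit word and uses the Hamming distance as the match error; the library's actual descriptor layout and error measure may differ, and all function names here are hypothetical.

```python
MASK64 = (1 << 64) - 1


def popcount(x):
    """Number of set bits in a non-negative integer."""
    return bin(x).count("1")


def rotl64(x, r):
    """Barrel shift: rotate a 64-bit word left by r bits."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def clip_descriptors(descriptors, max_bits):
    """Drop descriptors whose bit count exceeds max_bits."""
    return [d for d in descriptors if popcount(d) <= max_bits]


def find_best(test, references, rotations=(0,)):
    """Brute-force search; returns (error, index, rotation) or None.

    The test descriptor is rotated by each candidate amount and compared
    with every reference, which gives a naive rotational invariance.
    """
    best = None
    for index, ref in enumerate(references):
        for r in rotations:
            candidate = (popcount(rotl64(test, r) ^ ref), index, r)
            if best is None or candidate < best:
                best = candidate
    return best
```

For example, `find_best(test, refs, rotations=range(0, 64, 8))` tries eight evenly spaced rotations of the test descriptor against every reference.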
  • 3-point pose using RANSAC

    • Assign the identity matrix (MatIdentStep)
    • Randomly select point triples (RandomIntStep then MixUvqUvStep)
    • Generate Jacobian matrix (PoseUvqWlsStep)
    • Perform Cholesky decomposition and back-substitution (CholeskyStep)
    • Exponentiate SE3 matrix (SE3ExpStep)
    • Use SE3 matrix for point transforms (SE3Run1Step)
    • Evaluate SE3 matrix for inliers (SE3ScoreStep)
  • Remarks:

    • Basic iterative computation, but does not refine with all inliers.
    • OpenCL on CPU is quite fast, OpenCL on GPU is extremely fast.
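To illustrate the Cholesky decomposition and back-substitution step, here is a minimal pure-Python solver for a small symmetric positive-definite system A x = b. It shows the general technique only; it is not the library's kernel, and `cholesky_solve` is a hypothetical name.

```python
import math


def cholesky_solve(A, b):
    """Solve A x = b for symmetric positive-definite A (list of rows)."""
    n = len(A)
    # Factorise A = L * L^T, with L lower triangular.
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                d = A[i][i] - s
                if d <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(d)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    # Forward substitution: L y = b.
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    # Back substitution: L^T x = y.
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x
```

In a RANSAC setting, many such small systems (one per hypothesis) can be solved independently, which may be part of why this stage parallelises well.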
  • 2D point cloud "radar" matching

    • Compute point radar (makePointRadar())
    • Match two point radars (matchPointRadars())
  • Remarks:

    • Highly experimental, not published in the literature.
    • Matches by spatial distribution of points in 2D.
    • Conceptually similar to "semi-local constraints".
    • Naturally invariant to 2D/3D scale, 2D translation and 2D rotation.
    • Intolerant of large shear, 3D translation, and 3D rotation.
    • Potentially useful for inter-frame tracking.
  • 3D point cloud "galaxy" matching

    • Compute point galaxy (makePointGalaxy())
    • Match two point galaxies (matchPointGalaxies())
  • Remarks:

    • Highly experimental; has yet to show promise.
    • Matches by spatial distribution of points in 3D.
    • Conceptually similar to "semi-local constraints".
    • Naturally invariant to 3D translation, 3D rotation and 3D scale by a single scalar.
    • Requires consistent (x,y,z) scale.
    • Potentially useful for inter-frame tracking.

Components

  • Python scripts to generate OpenCL source

    In general, fixed-size loops in the OpenCL code are replaced by pre-generated, unrolled code. This allows values to be kept in device registers instead of arrays that may (even implicitly) live in global memory, and it allows constants to be propagated and numeric expressions to be simplified. The result is faster code for only slightly more developer effort.

    Not all kernels require code generation, but for consistency even those that don't have a Python script, which is simply one large print statement.
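A minimal sketch of this kind of generator is shown below. It emits an unrolled one-dimensional blur kernel for a fixed list of weights, so the weights appear as literals in the OpenCL source and zero-weight taps are dropped entirely. The kernel layout, the buffer-based addressing and the name `generate_blur_source` are assumptions made for illustration; the project's real scripts may be organised quite differently.

```python
def generate_blur_source(weights, kernel_name="blur_row"):
    """Return OpenCL C source for an unrolled horizontal convolution."""
    if len(weights) % 2 != 1:
        raise ValueError("expected an odd number of weights")
    radius = len(weights) // 2
    lines = [
        f"__kernel void {kernel_name}(__global const float *src,",
        "                       __global float *dst,",
        "                       const int width)",
        "{",
        "    const int x = (int)get_global_id(0);",
        "    const int row = (int)get_global_id(1) * width;",
        "    float acc = 0.0f;",
    ]
    for i, w in enumerate(weights):
        if w == 0:
            continue  # simplification: a zero tap contributes nothing
        offset = i - radius
        sign = "+" if offset >= 0 else "-"
        # One line per tap replaces the loop; reads are clamped to the row.
        lines.append(
            f"    acc += {float(w)!r}f * "
            f"src[row + clamp(x {sign} {abs(offset)}, 0, width - 1)];"
        )
    lines += ["    dst[row + x] = acc;", "}"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(generate_blur_source([0.25, 0.5, 0.25]))
```

Running the script prints a kernel with three `acc +=` lines and no loop, which could then be written to a source file and compiled by the OpenCL runtime.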

  • C++ classes representing the OpenCL environment

    • Worker class bundles an OpenCL device, its context, and its command queue.
  • C++ classes defining data states

    • Camera information per pixel (CameraState)
    • Integer counter (CountState)
    • HIPS balanced tree state (HipsTreeState)
    • Image data of any format (ImageState)
    • Variable-size list data of any type (ListState)
    • Matrix/vector data of small fixed size (MatrixState)
    • (u,v) point cloud (UvState)
    • (u,v,q) point cloud (UvqState)
    • ((u,v,q),(u,v)) point pair cloud (UvqUvState)
  • Remarks:

    • States may be created once and linked by multiple steps.
    • Most states have methods to translate data to/from reasonable C++ types.
    • Each state is bound to exactly one worker.
  • C++ classes defining processing steps

    • See "Included algorithms" above.
  • Remarks:

    • Steps may link multiple input and output states.
    • Steps may be created once and run repeatedly as part of a pipeline.
    • Steps may be timed individually.
    • Each step is bound to exactly one worker, inferred from the worker of its states.
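The relationship between workers, states and steps described above can be modelled in a few lines. This is a conceptual sketch in Python rather than the library's C++ classes; `Worker`, `State`, `Step`, `run` and `measure` are hypothetical stand-ins, and the real classes wrap OpenCL devices, buffers and kernels.

```python
import time


class Worker:
    """Stands in for an OpenCL device, its context and command queue."""

    def __init__(self, name):
        self.name = name


class State:
    """A piece of data bound to exactly one worker."""

    def __init__(self, worker, value=None):
        self.worker = worker
        self.value = value


class Step:
    """Links input and output states; its worker is inferred from them."""

    def __init__(self, inputs, outputs, fn):
        states = list(inputs) + list(outputs)
        if not states:
            raise ValueError("a step needs at least one state")
        worker = states[0].worker
        if any(s.worker is not worker for s in states):
            raise ValueError("all states of a step must share one worker")
        self.worker = worker
        self.inputs = list(inputs)
        self.outputs = list(outputs)
        self.fn = fn

    def run(self):
        results = self.fn(*[s.value for s in self.inputs])
        if len(self.outputs) == 1:
            results = (results,)
        for state, value in zip(self.outputs, results):
            state.value = value

    def measure(self, repeat=10):
        """Average wall-clock seconds per run over `repeat` runs."""
        start = time.perf_counter()
        for _ in range(repeat):
            self.run()
        return (time.perf_counter() - start) / repeat
```

States are created once and shared: the output state of one step can be passed as the input state of the next, so a pipeline is simply a list of steps run in order.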
  • Test driver programs

    • Messy; not intended for code reuse.
    • May serve as working examples for library consumers.

Resources for OpenCL developers
