|
| 1 | ++++ |
| 2 | +date = "2019-03-26T21:56:55+01:00" |
| 3 | +title = "Rolling shutter cameras" |
| 4 | +tags = ["example"] |
| 5 | +categories = ["general"] |
| 6 | +draft = false |
| 7 | +description = "Absolute pose estimation for a rolling shutter camera" |
| 8 | +weight = 1 |
| 9 | +author = "Rami Abuzerr, Paul Breiding" |
| 10 | +group = "applications" |
| 11 | ++++ |
| 12 | + |
| 13 | + |
| 14 | + |
| 15 | +We want to find the absolute pose for a calibrated rolling shutter camera. Suppose we have the following information available: |
| 16 | + |
| 17 | +- A set of 3D scene points expressed in the world coordinate frame. |
| 18 | +- Sets of 2D points in the two-dimensional images, where each set corresponds to one 3D point. |
| 19 | + |
| 20 | +Let us load the data: |
| 21 | +```julia |
| 22 | +using DelimitedFiles |
| 23 | +link1 = "https://gist.githubusercontent.com/PBrdng/e17d0e3bc4d983734238b9cb8386d560/raw/07272b125a6ad03c791fdf99e741318f1d85149b/3Dpoints" |
| 24 | +link2 = "https://gist.githubusercontent.com/PBrdng/e17d0e3bc4d983734238b9cb8386d560/raw/07272b125a6ad03c791fdf99e741318f1d85149b/2Dpoints" |
| 25 | +points3D = readdlm(download(link1)) |> transpose |
| 26 | +points2D = readdlm(download(link2)) |> transpose |
| 27 | +``` |
| 28 | + |
| 29 | +Now `points2D` is a $3\times N$ matrix, where $N$ is the number of images. The first two rows of `points2D` are the coordinates of the respective images, and the third row gives the *scanlines*. |
| 30 | +A rolling shutter camera captures images line by line over time. These lines are called scanlines. |
| 31 | + |
| 32 | +<figure> |
| 33 | +<p style="text-align:center;"><img src="/images/rollingshutter1.png" width="600px"/></p> |
| 34 | +<figcaption style="text-align:center;"> |
| 35 | +The 2D image taken with our rolling shutter camera. The color encodes the scanline that takes a picture of that point. |
| 36 | +</figcaption> |
| 37 | +</figure> |
| 38 | + |
| 39 | +Similarly `points3D` is a $4\times N$ matrix, where $N$ is the number of images. The columns of `points2D` and `points3D` are synchronized in the sense that the $i$-th column of `points2D` is the image of the $i$-th column of `points3D`. |
| 40 | + |
| 41 | +<figure> |
| 42 | +<p style="text-align:center;"><img src="/images/rollingshutter2.png" width="600px"/></p> |
| 43 | +<figcaption style="text-align:center;"> |
| 44 | +The 3D points that we take an image from. The color encodes the scanline that takes a picture of that point. |
| 45 | +</figcaption> |
| 46 | +</figure> |
| 47 | + |
| 48 | + |
| 49 | +We want to reconstruct the camera from this data. For this, we use a model introduced by Saurer et al.[^1]. Their morel introduces a linear velocity in the camera projection matrix to approximate the motion of the rolling shutter camera. Furthermore, the model assumes that each scanline takes exactly the same time, and the relative camera translation $t_n$ at the $n$-th scanline is |
| 50 | + |
| 51 | +$$ |
| 52 | +t_n = \mathbf{v}n \tau \quad |
| 53 | +$$ |
| 54 | + |
| 55 | +where $\mathbf{v}\in\mathbb R^3$ denotes the constant linear camera velocity, and $\tau$ is the time taken to complete each scanline, and $n$ ist the number of scanline. |
| 56 | +Making the assumption of a constant linear velocity, we can express a pixel on the $n$th scanline as: |
| 57 | + |
| 58 | +$$x_n = \begin{bmatrix} R & t - t_n \end{bmatrix} X,$$ |
| 59 | + |
| 60 | +where $R$ is an orthogonal matrix (we assume calibrated cameras). |
| 61 | + |
| 62 | +The variables $R$ and $t$ describe the camera’s position and orientation in the world frame, which is also its position for the first captured scanline. The variable $t_n$ represents the camera’s position for the $n$-th scanline. The correspondence between $x_n$ and $X$ signifies a relationship between a 2D point in the image and its corresponding 3D point. The given data in this equation consist of a set of 2D image points denoted by $x_n$ and their corresponding 3D points represented by $X$. The unknowns in this equation are $R$, $t$, and the linear velocity $v$ in $t_n$, amounting to a total of 9 degrees of freedom (3 each for $R$, $t$, and $v$). Consequently, we need at least |
| 63 | + |
| 64 | +$$ m = 5$$ |
| 65 | + |
| 66 | +images for reconstruction. |
| 67 | + |
| 68 | +By taking the cross product of $x_n$ , we obtain $x_n \times (\begin{bmatrix} R & t - t_n \end{bmatrix} X) = 0$ |
| 69 | +Assuming noisy data, we attempt to reconstruct $R,t$ and $v$ by solving the optimization problem |
| 70 | + |
| 71 | +$$\min\quad \left\Vert x_n \times (\begin{bmatrix} R & t - t_n \end{bmatrix} X)\right\Vert. $$ |
| 72 | + |
| 73 | +### Implementation with Homotopy Continuation |
| 74 | + |
| 75 | +Now let's solve this optimization problem using homotopy continuation by solving the critical equations. In our case we have |
| 76 | + |
| 77 | +$$\tau = 0.01.$$ |
| 78 | + |
| 79 | +```julia |
| 80 | +using HomotopyContinuation, LinearAlgebra, StatsBase |
| 81 | + |
| 82 | +τ = 0.01 |
| 83 | +m = 5 |
| 84 | +@var x[1:3, 1:m], u[1:2, 1:m] |
| 85 | +@var R[1:3, 1:3], t[1:3], v[1:3], n[1:m] |
| 86 | + |
| 87 | +cams = [[R t-(n[i]*τ).*v] for i in 1:m] # m = 5 cameras |
| 88 | +y = [cams[i] * [x[:,i]; 1] for i in 1:m] # m = 5 images |
| 89 | +``` |
| 90 | + |
| 91 | +Now we define the function that we want to minimize. |
| 92 | + |
| 93 | +```julia |
| 94 | +g = [cross(y[i], [u[:,i]; 1]) for i in 1:m] |
| 95 | +``` |
| 96 | + |
| 97 | +Since $R$ is constrained to be an orthogonal matrix and $v$ is constrained to have norm $1$, we use Lagrange multipliers for optimization. |
| 98 | + |
| 99 | +```julia |
| 100 | +k = R * R' - diagm(ones(3)) |
| 101 | +Rconstraints = [k[i,j] for i in 1:3, j in 1:3 if i<=j] |
| 102 | +vconstraints = transpose(v) * v - 1 |
| 103 | + |
| 104 | +@var l1[1:6], l2 # Lagrange multipliers |
| 105 | +L = transpose(g) * g - transpose(l1) * Rconstraints - l2 * vconstraints |
| 106 | +Lag = differentiate(L, [vec(R); t; v; l1; l2]); |
| 107 | + |
| 108 | +F = System(Lag, variables = [vec(R); t; v; l1; l2], parameters = [vec(x); vec([u; n'])]); |
| 109 | +``` |
| 110 | + |
| 111 | +Solving `F` then gives all critical points. We want to solve this system many times for different parameters. |
| 112 | +First, we need an initial solution for a random set of parameters. |
| 113 | + |
| 114 | +```julia |
| 115 | +p0 = randn(ComplexF64, 30) |
| 116 | +S0 = solve(F, target_parameters = p0) |
| 117 | +start = solutions(S0); |
| 118 | +``` |
| 119 | + |
| 120 | +The array `start` contains 480 solutions. Now, we track these 80 solutions towards the photo parameters we are interested in |
| 121 | + |
| 122 | +```julia |
| 123 | +N = size(points2D, 2) |
| 124 | +s = StatsBase.sample(collect(1:N), m; replace=false) |
| 125 | + |
| 126 | +X = points2D[1:3, s] |
| 127 | +Y = points2D[:, s] |
| 128 | + |
| 129 | +p1 = [vec(X); vec(Y)] |
| 130 | +S1 = solve(F, start, start_parameters = p0, target_parameters = p1) |
| 131 | +``` |
| 132 | + |
| 133 | +We solved the system `F` using parameter homotopies, with the start parameter `p0` and the target parameter `p1`. |
| 134 | +Then we evaluate each solution `r` with the parameter `r` computing the norm of `g(r,p)` and return the solution that has the smallest norm. |
| 135 | + |
| 136 | +```julia |
| 137 | +G = System(vcat(g...), variables = [vec(R); t; v], parameters = [vec(x); vec([u; n'])]); |
| 138 | +function find_min(sols, p) |
| 139 | + sols = [r[1:15] for r in sols] |
| 140 | + a = map(sols) do r |
| 141 | + norm(G(r, p)) |
| 142 | + end |
| 143 | + i = findmin(a) |
| 144 | + sols[i[2]] |
| 145 | +end |
| 146 | +recovery = find_min(real_solutions(S1), p1) |
| 147 | +R1, t1, v1 = reshape(recovery[1:9], 3, 3), recovery[10:12], recovery[13:15] |
| 148 | +``` |
| 149 | + |
| 150 | +Here is the reconstructed data: |
| 151 | + |
| 152 | +<figure> |
| 153 | +<p style="text-align:center;"><img src="/images/rollingshutter3.png" width="600px"/></p> |
| 154 | +<figcaption style="text-align:center;"> |
| 155 | +The reconstructed camera is in red, the velocity $v$ is in green. The plot shows the 3D points and the image points on the camera plane. |
| 156 | +</figcaption> |
| 157 | +</figure> |
| 158 | + |
| 159 | +[^1]: O. Saurer, M. Pollefeys, and G. H. Lee. *A Minimal Solution to the Rolling Shutter Pose Estimation Problem*. Computer Vision and Geometry Lab, ETH Zurich, and Department of Mechanical Engineering, National University of Singapore, 2013. |
0 commit comments