Let's try a simple optimization problem: fitting a circle to 2D data.

A circle with centre $(x, y)$ and radius $r$ satisfies $$r^2 = (x - x_i)^2 + (y - y_i)^2$$ for every point $(x_i, y_i)$ on its boundary.

The measurement residual for each point $i$ is defined as its distance to the circle boundary: $$f(x, y, r, x_i, y_i) = \sqrt{(x - x_i)^2 + (y - y_i)^2} - r$$

Taking the partial derivatives with respect to each parameter we get

$$\frac{\partial f}{\partial x} = \frac{ x - x_i }{\sqrt{(x-x_i)^2 + (y-y_i)^2}}$$

$$\frac{\partial f}{\partial y} = \frac{ y - y_i }{\sqrt{(x-x_i)^2 + (y-y_i)^2}}$$

$$\frac{\partial f}{\partial r} = -1$$

These derivatives are a bit unwieldy. It would be best to avoid the square root in the residual.

We can redefine the residual to compare squared distances instead: $$f(x, y, r, x_i, y_i) = r^2 - (x - x_i)^2 - (y - y_i)^2$$

Taking the partial derivatives with respect to each parameter we get

$$\frac{\partial f}{\partial x} = -2(x - x_i)$$

$$\frac{\partial f}{\partial y} = -2(y - y_i)$$

$$\frac{\partial f}{\partial r} = 2r$$

We employ a little trick: define $m$ such that $r = m^2$ (i.e. $m$ is the square root of $r$). By optimizing over $m$ instead of $r$, it is impossible to end up with a negative radius.
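
Substituting $r = m^2$ into the residual and applying the chain rule gives the derivative the analytical cost function below actually needs:

$$\frac{\partial f}{\partial m} = \frac{\partial f}{\partial r}\,\frac{\partial r}{\partial m} = 2r \cdot 2m = 4m^3$$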

With this information we can implement two cost functions: CircleFitAuto, which relies on Ceres' automatic differentiation with Jets, and CircleFitAnalytical, which computes the derivatives analytically at each step.

```cpp
class CircleFitAuto
{
public:
  CircleFitAuto(const double x, const double y) : x_(x), y_(y) {}

  // Templated so Ceres can evaluate it with Jets for automatic derivatives.
  template <typename T>
  bool operator()(const T* const x, const T* const y, const T* const m,
                  T* residual) const
  {
    const T r = *m * *m;  // r = m^2 guarantees a non-negative radius
    const T xp = x_ - *x;
    const T yp = y_ - *y;
    residual[0] = r * r - xp * xp - yp * yp;
    return true;
  }

private:
  const double x_;  // measured point, x coordinate
  const double y_;  // measured point, y coordinate
};
```
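
CircleFitAuto is consumed through ceres::AutoDiffCostFunction; a minimal sketch for one measured point (`px` and `py` are assumed placeholder names, not from the original harness):

```cpp
// Sketch: wrap the functor so Ceres can auto-differentiate it.
// <1, 1, 1, 1> = residual size, then sizes of the x, y and m blocks.
ceres::CostFunction* cost =
    new ceres::AutoDiffCostFunction<CircleFitAuto, 1, 1, 1, 1>(
        new CircleFitAuto(px, py));
```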

```cpp
struct CircleFitAnalytical : public ceres::SizedCostFunction<1, 1, 1, 1>
{
  CircleFitAnalytical(const double x, const double y) : x_(x), y_(y) {}
  virtual ~CircleFitAnalytical() = default;

  virtual bool Evaluate(double const* const* parameters, double* residuals,
                        double** jacobians) const
  {
    const double x = parameters[0][0];
    const double y = parameters[1][0];
    const double m = parameters[2][0];
    const double r = m * m;
    const double xp = x_ - x;
    const double yp = y_ - y;

    residuals[0] = r * r - xp * xp - yp * yp;

    // Ceres may request jacobians for any subset of the parameter blocks,
    // so each pointer has to be checked individually.
    if (jacobians != nullptr) {
      if (jacobians[0] != nullptr) jacobians[0][0] = 2 * (x_ - x);   // df/dx
      if (jacobians[1] != nullptr) jacobians[1][0] = 2 * (y_ - y);   // df/dy
      if (jacobians[2] != nullptr) jacobians[2][0] = 4 * m * m * m;  // df/dm
    }
    return true;
  }

private:
  const double x_;
  const double y_;
};
```
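
The analytical version already derives from ceres::CostFunction, so no wrapper is needed; a sketch (again with assumed names):

```cpp
// Sketch: add the analytical cost directly to the problem.
problem.AddResidualBlock(new CircleFitAnalytical(px, py), nullptr, &x, &y, &m);
```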


We can generate points around the unit circle, perturbing the radius with normal noise (stddev = 0.1):

```cpp
#include <array>
#include <cmath>
#include <random>
#include <vector>

std::random_device rd{};
std::mt19937 gen{ rd() };
std::normal_distribution<double> dist{ 0.0, 0.1 };

std::vector<std::array<double, 2>> data;
const int samples = 100;
const double angle_step = 2 * M_PI / samples;
for (int i = 0; i < samples; ++i) {
  const double theta = angle_step * i;
  const double radius = 1.0 + dist(gen);  // unit radius plus Gaussian noise
  data.push_back({ std::cos(theta) * radius, std::sin(theta) * radius });
}
```
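
Putting it together, a minimal solve might look like the sketch below (the initial guesses and output formatting are assumptions, not from the original harness; it assumes `<ceres/ceres.h>` and `<iostream>` are included):

```cpp
// Sketch: fit the circle to the generated data with the auto-diff cost.
double x = 0.0, y = 0.0, m = 0.5;  // assumed initial guesses (r = m^2 = 0.25)

ceres::Problem problem;
for (const auto& p : data) {
  problem.AddResidualBlock(
      new ceres::AutoDiffCostFunction<CircleFitAuto, 1, 1, 1, 1>(
          new CircleFitAuto(p[0], p[1])),
      nullptr, &x, &y, &m);
}

ceres::Solver::Options options;
ceres::Solver::Summary summary;
ceres::Solve(options, &problem, &summary);

std::cout << "centre = (" << x << ", " << y << "), r = " << m * m << "\n";
```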


Visualising the data

Running the optimization with different sample sizes
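
One way to time the two variants is to wrap the solve in a `std::chrono` timer; a minimal sketch, reusing the `options`, `problem`, and `summary` objects from above (Ceres also reports `summary.total_time_in_seconds`):

```cpp
#include <chrono>

// Sketch: time a single solve; repeat per sample size and per cost function.
const auto start = std::chrono::steady_clock::now();
ceres::Solve(options, &problem, &summary);
const auto stop = std::chrono::steady_clock::now();
const auto us =
    std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
std::cout << "solve took " << us << " us\n";
```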

The results are compelling: for this problem the analytical derivative is 25-30x faster than Ceres' auto-diff Jets.

This is most likely because the analytical derivative for this problem is so simple to evaluate.

If a relatively straightforward analytical derivative exists, I would use it!

…especially when algebraic tools can do the maths for you.
