Support vector regression dual
The \(\epsilon\)-insensitive objective, for a linear model \(w^{T} x\) with no bias term, is
\[\min_{w} \frac{1}{2} \|w\|^{2} + C \sum_{i = 1}^{n} \max(0, |w^{T} x_{i} - y_{i}| - \epsilon) \enspace .\]
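As a concrete check, this objective can be minimised numerically. The sketch below (Python with numpy/scipy; the synthetic data and the values of \(\epsilon\) and \(C\) are arbitrary choices for illustration) uses the derivative-free Nelder-Mead method, since the \(\epsilon\)-insensitive loss is not differentiable everywhere.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.standard_normal((d, n))      # columns x_i, as in the text
w_true = np.array([1.0, -2.0, 0.5])
y = X.T @ w_true + 0.1 * rng.standard_normal(n)
epsilon, C = 0.1, 1.0                # illustrative values

def primal_objective(w):
    residuals = X.T @ w - y
    eps_loss = np.maximum(0.0, np.abs(residuals) - epsilon)
    return 0.5 * w @ w + C * eps_loss.sum()

# Convex but non-smooth, so use a derivative-free method for this small d.
res = minimize(primal_objective, np.zeros(d), method="Nelder-Mead",
               options={"xatol": 1e-10, "fatol": 1e-10, "maxiter": 10000})
w_primal = res.x
print(w_primal, primal_objective(w_primal))
```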
An equivalent problem can be formed using slack variables \(\xi_{i}\), which at the optimum equal \(\max(0, |w^{T} x_{i} - y_{i}| - \epsilon)\):
\[\begin{aligned} \min_{w, \xi} \; & \frac{1}{2} \|w\|^{2} + C \sum_{i = 1}^{n} \xi_{i} \\ \text{s.t.} \; & \xi_{i} \ge w^{T} x_{i} - y_{i} - \epsilon, \\ & \xi_{i} \ge -w^{T} x_{i} + y_{i} - \epsilon, \\ & \xi_{i} \ge 0 \enspace . \end{aligned}\]Switching to vector notation, with the \(x_{i}\) as the columns of \(X\) and Lagrange multipliers \(\alpha, \beta, \gamma \ge 0\) for the three constraint groups, the Lagrangian is
\[\begin{aligned} L(w, \xi, \alpha, \beta, \gamma) & = \frac{1}{2} \|w\|^{2} + C 1^{T} \xi + \alpha^{T} (X^{T} w - y - \epsilon 1 - \xi) \\ & \quad + \beta^{T} (-X^{T} w + y - \epsilon 1 - \xi) - \gamma^{T} \xi \enspace . \end{aligned}\]Minimising with respect to the primal variables,
\[\begin{aligned} \frac{\partial L}{\partial w} & = w + X \alpha - X \beta = 0, & w & = X (\beta - \alpha) \\ \frac{\partial L}{\partial \xi} & = C 1 - \alpha - \beta - \gamma = 0 \enspace . \end{aligned}\]
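The closed-form minimiser \(w = X (\beta - \alpha)\) can be sanity-checked numerically: for any fixed multipliers, minimising the \(w\)-dependent part of the Lagrangian should return \(X (\beta - \alpha)\). A minimal sketch, with arbitrary multipliers chosen purely for illustration:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.standard_normal((d, n))
alpha = rng.uniform(0.0, 0.5, n)   # arbitrary nonnegative multipliers
beta = rng.uniform(0.0, 0.5, n)

# Terms of the Lagrangian involving w; everything else is constant in w.
def L_w(w):
    return 0.5 * w @ w + (alpha - beta) @ (X.T @ w)

res = minimize(L_w, np.zeros(d), method="BFGS")
print(res.x, X @ (beta - alpha))   # should agree to solver tolerance
```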
The dual problem is therefore
\[\begin{aligned} \max_{\alpha, \beta, \gamma} \; & -\frac{1}{2} (\beta - \alpha)^{T} X^{T} X (\beta - \alpha) + y^{T} (\beta - \alpha) - \epsilon 1^{T} (\beta + \alpha) \\ \text{s.t.} \; & \alpha \ge 0, \; \beta \ge 0, \; \gamma \ge 0, \; \alpha + \beta + \gamma = C 1 \enspace . \end{aligned}\]Eliminating \(\gamma\) leaves the constraint \(\alpha + \beta \le C 1\). At the optimum \(\alpha_{i} \beta_{i} = 0\), since reducing both by the same amount leaves \(\beta - \alpha\) unchanged while decreasing \(\epsilon 1^{T} (\beta + \alpha)\); hence, writing \(\upsilon = \beta - \alpha\), we have \(1^{T} (\alpha + \beta) = \|\upsilon\|_{1}\) and \(|\upsilon_{i}| \le \alpha_{i} + \beta_{i} \le C\), so the dual is equivalent to
\[\begin{aligned} \min_{\upsilon} \; & \frac{1}{2} \upsilon^{T} X^{T} X \upsilon - y^{T} \upsilon + \epsilon \|\upsilon\|_{1} \\ \text{s.t.} \; & |\upsilon_{i}| \le C \enspace , \end{aligned}\]with the primal solution recovered as \(w = X \upsilon\).
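To verify the derivation end to end, this box-constrained \(\ell_{1}\)-penalised problem can be solved with a proximal-gradient loop: the proximal operator of \(\epsilon \|\cdot\|_{1}\) plus the box indicator is coordinate-wise soft-thresholding followed by clipping to \([-C, C]\). A minimal sketch on the same synthetic data as above (step size and iteration count are illustrative, not tuned):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 50, 3
X = rng.standard_normal((d, n))               # columns x_i
w_true = np.array([1.0, -2.0, 0.5])
y = X.T @ w_true + 0.1 * rng.standard_normal(n)
epsilon, C = 0.1, 1.0

K = X.T @ X                                   # Gram matrix X^T X
step = 1.0 / np.linalg.eigvalsh(K)[-1]        # 1/L for the smooth part

# Proximal gradient (ISTA) on
#   1/2 v^T K v - y^T v + epsilon ||v||_1,  subject to |v_i| <= C.
v = np.zeros(n)
for _ in range(20000):
    g = K @ v - y                             # gradient of the smooth part
    z = v - step * g
    v = np.sign(z) * np.maximum(np.abs(z) - step * epsilon, 0.0)  # soft-threshold
    v = np.clip(v, -C, C)                     # project onto the box
w_dual = X @ v                                # recover w = X v
print(w_dual)
```

Up to optimisation tolerance, `w_dual` here should agree with `w_primal` from the first sketch, since strong duality holds for this convex quadratic program.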