% ------------------------------------------------------------------- % ckbs: Constrained Kalman-Bucy Smoother Program: Copyright (C) 2006 % Authors: Bradlely Bell: bradbell at washington dot edu % Gianluigi Pillonetto: giapi at dei dot unipd dot it % License: GNU General Public License Version 2 % ------------------------------------------------------------------- % $begin ckbs_newton_step$$newlinech %$$ %$spell % ckbs % blkdiag % Hdia % Hlow % Bdia % tri % ds % dy % du % mu % Kuhn % $$% % section Affine Constrained Kalman Bucy Smoother Newton Step$$ % % $index ckbs_newton_step$$% index newton_step$$ % %$index step, Newton$$% % head Syntax$$ % $syntax/[/ds/, /dy/, /du/] = ckbs_newton_step(/ % mu/, /s/, /y/, /u/, /b/, /d/, /Bdia/, /Hdia/, /Hlow/)/$$% % head Purpose$$ % This routine computes one step of Newton's method for solving % the non-linear Kuhn-Tucker conditions for % the$latex \mu$$-relaxed affine constrained Kalman-Bucy smoother problem. % % % head Problem$$ % Given % $latex \mu \in \B{R}_+$$, % latex H \in \B{R}^{p \times p}$$, %$latex d \in \B{R}^p$$, % latex b \in \B{R}^r$$, and % $latex B \in \B{R}^{r \times p}$$, % the latex \mu$$-relaxed affine constrained Kalman-Bucy smoother problem is: %$latex $% \begin{array}{rl} % {\rm minimize} & \frac{1}{2} y^\T H y + d^\T y % - \mu \sum_{i=1}^r \log(s_i) % \; {\rm w.r.t} \; y \in \B{R}^p \; , \; s \in \B{R}_+^r % \\ % {\rm subject \; to} & s + b + B y = 0 % \end{array} %$ $$% In addition, latex H$$ is symmetric block tri-diagonal with each block of % size $latex n \times n$$and % latex B$$ is block diagonal with each block of size$latex m \times n$$% (there is an integer latex N$$ such that % $latex p = n * N$$and latex r = m * N$$). % % %$subhead Lagrangian$$% We use latex u \in \B{R}^r$$ % to denote the Lagrange multipliers corresponding to the constraint equation. % The corresponding Lagrangian is % $latex $% L(y, s, u) = % \frac{1}{2} y^\T H y + d^\T y % - \mu \sum_{i=1}^r \log(s_i) % + u^\T (s + b + B y) %$ $$% The partial gradients of the Lagrangian are given by % latex $% \begin{array}{rcl} % \nabla_y L(y, s, u ) & = & H y + B^\T u + d \\ % \nabla_s L(y, s, u ) & = & u - \mu / s \\ % \nabla_u L(y, s, u ) & = & s + b + B y \\ % \end{array} %$$$ % where$latex \mu / s $$is the component by component division of % latex \mu$$ by the components of the $latex s$$. % Note, from the second equation, that we only need consider % latex u \geq 0$$ because$latex s \geq 0$$. % % subhead Kuhn-Tucker Conditions$$ % We use $latex D(s)$$% to denote the diagonal matrix with latex s$$ % along its diagonal and$latex 1_r$$% to denote the vector, of length latex r$$ with all its components % equal to one. % We define % $latex F : \B{R}^{r + p + r} \rightarrow \B{R}^{r + p + r}$$% by % latex $% F(s, y, u) % = % \left( % \begin{array}{c} % s + b + B y \\ % H y + B^\T u + d \\ % D(s) D(u) 1_r - \mu 1_r % \end{array} % \right) %$$$ % The Kuhn-Tucker conditions for a solution of the %$latex \mu$$-relaxed constrained affine Kalman-Bucy smoother problem is % latex F(s, y, u) = 0$$. % % $subhead Newton Step$$% Given a value for latex (s, y, u)$$, the Newton step %$latex ( \Delta s^\T , \Delta y^\T , \Delta u^\T )^\T$$solves the problem: % latex $% F^{(1)} (s, y, u) % \left( \begin{array}{c} \Delta s \\ \Delta y \\ \Delta u \end{array} \right) % = % - F(s, y, u) %$$$ % % $head mu$$% The argument italic mu$$ is a positive scalar specifying the % relaxation parameter$latex \mu$$. % % head s$$ % The argument $italic s$$is a column vector of length latex r$$. % All the elements of$italic s$$are greater than zero. % % head y$$ % The argument $italic y$$is a column vector of length latex p$$ % %$head u$$% The argument italic u$$ is a column vector of length $latex r$$. % All the elements of italic s$$ are greater than zero. % %$head b$$% The argument italic b$$ is a column vector of length $latex r$$. % % head d$$ % The argument$italic d$$is a column vector of length latex p$$ % % $head Bdia$$% The argument italic Bdia$$ is an$latex m \times n \times N$$array. % For latex k = 1 , \ldots , N$$ we define % $latex B_k \in \B{R}^{m \times n}$$by % latex $% B_k = Bdia(:, :, k) %$$$ % %$head B$$% The matrix latex B$$ is defined by % $latex $% B % = % \left( \begin{array}{cccc} % B_1 & 0 & 0 & \\ % 0 & B_2 & 0 & 0 \\ % 0 & 0 & \ddots & 0 \\ % & 0 & 0 & B_N % \end{array} \right) %$ $$% % head Hdia$$ % The argument$italic Hdia$$is an latex n \times n \times N$$ array. % For $latex k = 1 , \ldots , N$$we define % latex H_k \in \B{R}^{n \times n}$$ by %$latex $% H_k = Hdia(:, :, k) %$ $$% % % head Hlow$$ % The argument $italic Hlow$$is an latex n \times n \times N$$ array. % For$latex k = 1 , \ldots , N$$we define % latex L_k \in \B{R}^{n \times n}$$ by % $latex $% L_k = Hlow(:, :, k) %$ $$% % head H$$ % The matrix$latex H$$is defined by % latex $% H % = % \left( \begin{array}{cccc} % H_1 & L_2^\T & 0 & \\ % L_2 & H_2 & L_3^\T & 0 \\ % 0 & \ddots & \ddots & \ddots \\ % & 0 & L_N & H_N % \end{array} \right) %$$$ % % $head ds$$% The result italic ds$$ is a column vector of length$latex r$$% equal to the latex \Delta s$$ components of the Newton step. % % $head dy$$% The result italic dy$$ is a column vector of length$latex p$$% equal to the latex \Delta y$$ components of the Newton step. % % $head du$$% The result italic du$$ is a column vector of length$latex r$$% equal to the latex \Delta u$$ components of the Newton step. % % $children# % example/newton_step_ok.m % #$$% % head Example$$ % The file$xref/newton_step_ok.m/$$contains an example and test of % code ckbs_newton_step$$. % It returns true if $code ckbs_newton_step$$passes the test % and false otherwise. % % head Method$$ % The derivative of$latex F$$is given by % latex $% F^{(1)} (s, y, u) = % \left( % \begin{array}{ccccc} % D( 1_r ) & B & 0 \\ % 0 & H & B^\T \\ % D( u ) & 0 & D(s) % \end{array} % \right) %$$$ % It follows that % $latex $% \left( \begin{array}{ccc} % D( 1_r ) & B & 0 \\ % 0 & H & B^\T \\ % D( u ) & 0 & D(s) % \end{array} \right) % \left( \begin{array}{c} \Delta s \\ \Delta y \\ \Delta u \end{array} \right) % = % - % \left( \begin{array}{c} % s + b + B y \\ % H y + B^\T u + d \\ % D(s) D(u) 1_r - \mu 1_r % \end{array} \right) %$ $$% Replacing the bottom row by the bottom row minus % latex D(u)$$ times the top row, % we obtain the reduced set of equations %$latex $% \left( \begin{array}{cc} % H & B^\T \\ % -D(u) B & D(s) % \end{array} \right) % \left( \begin{array}{c} \Delta y \\ \Delta u \end{array} \right) % = % \left( \begin{array}{c} % - H y - B^\T u - d \\ % \mu 1_r + D(u) ( b + B y ) % \end{array} \right) %$ $$% Replacing the top row by latex H^{-1}$$ times the top row we obtain % $latex $% \left( \begin{array}{cc} % D( 1_p ) & H^{-1} B^\T \\ % -D(u) B & D(s) % \end{array} \right) % \left( \begin{array}{c} \Delta y \\ \Delta u \end{array} \right) % = % \left( \begin{array}{c} % - y - H^{-1}( B^\T u + d ) \\ % \mu 1_r + D(u) ( b + B y ) % \end{array} \right) %$ $$% Replacing the bottom row by % the latex D(u)^{-1}$$ times the bottom row % plus$latex B$$times the top row, % we obtain the reduced equation % latex $% \begin{array}{rcl} % [ B H^{-1} B^\T + D(s / u) ] \Delta u % & = & % \mu ( 1_r / u ) + b - B H^{-1}( B^\T u + d ) % \end{array} %$$$ % Thus we obtain the following solution for the Newton step: % $latex $% \begin{array}{rcl} % \Delta u & = & % [ B H^{-1} B^\T + D(s / u) ]^{-1} % [ \mu ( 1_r / u ) + b - B H^{-1}( B^\T u + d ) ] % \\ % \Delta y & = & % - y - H^{-1} [ d + B^\T ( u + \Delta u ) ] \\ % \\ % \Delta s & = & % - s - b - B ( y + \Delta y ) % \end{array} %$ $$% It follows from the matrix inversion lemma that % latex $% [ B H^{-1} B^\T + D(s / u) ]^{-1} % = % D(s/u)^{-1} - D(s/u)^{-1} B [ H + B^\T D(s/u)^{-1} B ]^{-1} B^\T D(s/u)^{-1} %$$$ % %$end % --------------------------------------------------------------------------- function [ds, dy, du] = ckbs_newton_step(mu, s, y, u, b, d, Bdia, Hdia, Hlow) m = size(Bdia, 1); n = size(Bdia, 2); N = size(Bdia, 3); p = n * N; r = m * N; % % check sizes if size(s,1) ~= r | size(u,1) ~= r | size(b,1) ~= r r size(s,1) size(u,1) size(b,1) error('ckbs_newton_step: argument sizes do not agree'); end if size(y,1) ~= p | size(d,1) ~= p p size(y,1) size(d,1) error('ckbs_newton_step: argument sizes do not agree'); end if size(Hdia,1) ~= n | size(Hdia,2) ~= n | ... size(Hlow,1) ~= n | size(Hlow,2) ~= n n size(Hdia,1) size(Hdia,2) size(Hlow,1) size(Hlow,2) error('ckbs_newton_step: argument sizes do not agree'); end if size(Hdia,3) ~= N | size(Hlow,3) ~= N N size(Hdia,3) size(Hlow,3) error('ckbs_newton_step: argument sizes do not agree'); end if min(s) <= 0 error('ckbs_newton_step: min(s) <= 0 '); end if min(u) <= 0 error('ckbs_newton_step: min(u) <= 0 '); end % % compute diagonal of D(s/u)^{-1} u_sinv = u ./ s; % % compute the diagonal blocks for H + B^T * D(s/u)^{-1} * B Hplus = zeros(n, n, N); blk_m = 1 : m; for k = 1 : N Bk = Bdia(:,:,k); Dk = diag( u_sinv(blk_m) ); Hplus(:,:,k) = Hdia(:,:,k) + Bk' * Dk * Bk; blk_m = blk_m + m; end % % compute right hand side in equation for Delta u Bt_u_d = ckbs_blkdiag_mul_t(Bdia, u) + d; [e, lambda] = ckbs_tridiag_solve(Hdia, Hlow, Bt_u_d); B_e = ckbs_blkdiag_mul(Bdia, e); rhs = mu ./ u + b - B_e; % % du = [ B H^{-1} B^T + D(s / u) ]^{-1} * rhs % = [ D(u/s) - D(u/s) B ( H + B^T D(u/s) B )^{-1} B^T D(u/s) ] * rhs u_sinv_rhs = u_sinv .* rhs; Bt_u_sinv_rhs = ckbs_blkdiag_mul_t(Bdia, u_sinv_rhs); [e, lambda] = ckbs_tridiag_solve(Hplus, Hlow, Bt_u_sinv_rhs); B_e = ckbs_blkdiag_mul(Bdia, e); u_sinv_B_e = u_sinv .* B_e; du = u_sinv_rhs - u_sinv_B_e; % % dy = - y - H^{-1} [ d + B^T ( u + du ) ] Bt_u_du = ckbs_blkdiag_mul_t(Bdia, u + du); dy = - y - ckbs_tridiag_solve(Hdia, Hlow, d + Bt_u_du); % ds = - s - b - ckbs_blkdiag_mul(Bdia, y + dy); return end