# source:projects/ckbs/trunk/src/ckbs_newton_step.m

Last change on this file was 642, checked in by aleksand, 20 months ago

This is a template file for making commits to the ckbs repository.
Lines with no 'at' characters, are general comments not connected to
a specifi file. Lines containing an 'at' character are "file name"
followed by comment. Next line must be empty for ./commit.sh files to work.

File size: 9.6 KB
Line
1% -------------------------------------------------------------------
2% ckbs: Constrained Kalman-Bucy Smoother Program: Copyright (C) 2006-2011
4%          Gianluigi Pillonetto: giapi at dei dot unipd dot it
5%          Aleksandr Aravkin:    saravkin at eos dot ubc dot ca
7% -------------------------------------------------------------------
8% $begin ckbs_newton_step$$newlinech %$$ 9%$spell
10%       ckbs
11%       blkdiag
12%       Hdia
13%       Hlow
14%       Bdia
15%       tri
16%       ds
17%       dy
18%       du
19%       mu
20%       Kuhn
21%   tridiagonal
22% $$23% 24% section Affine Constrained Kalman Bucy Smoother Newton Step$$
25%
26% $index ckbs_newton_step$$27% index newton_step$$ 28% 29%$index step, Newton$$30% 31% head Syntax$$
32% $codei/[/ds/, /dy/, /du/] = ckbs_newton_step(/ 33% mu/, /s/, /y/, /u/, /b/, /d/, /Bdia/, /Hdia/, /Hlow/)/$$34% 35% head Purpose$$ 36% This routine computes one step of Newton's method for solving 37% the non-linear Kuhn-Tucker conditions for 38% the$latex \mu$$-relaxed affine constrained Kalman-Bucy smoother problem. 39% 40% 41% head Problem$$
42% Given
43% $latex \mu \in \B{R}_+$$, 44% latex H \in \B{R}^{p \times p}$$, 45%$latex d \in \B{R}^p$$, 46% latex b \in \B{R}^r$$, and
47% $latex B \in \B{R}^{r \times p}$$, 48% the latex \mu$$-relaxed affine constrained Kalman-Bucy smoother problem is: 49%$latex $50% \begin{array}{rl} 51% {\rm minimize} & \frac{1}{2} y^\R{T} H y + d^\R{T} y 52% - \mu \sum_{i=1}^r \log(s_i) 53% \; {\rm w.r.t} \; y \in \B{R}^p \; , \; s \in \B{R}_+^r 54% \\ 55% {\rm subject \; to} & s + b + B y = 0 56% \end{array} 57%$ $$58% In addition, latex H$$ is symmetric block tri-diagonal with each block of
59% size $latex n \times n$$and 60% latex B$$ is block diagonal with each block of size$latex m \times n$$61% (there is an integer latex N$$ such that
62% $latex p = n * N$$and latex r = m * N$$). 63% 64% 65%$subhead Lagrangian$$66% We use latex u \in \B{R}^r$$
67% to denote the Lagrange multipliers corresponding to the constraint equation.
68% The corresponding Lagrangian is
69% $latex $70% L(y, s, u) = 71% \frac{1}{2} y^\R{T} H y + d^\R{T} y 72% - \mu \sum_{i=1}^r \log(s_i) 73% + u^\R{T} (s + b + B y) 74%$ $$75% The partial gradients of the Lagrangian are given by 76% latex $77% \begin{array}{rcl} 78% \nabla_y L(y, s, u ) & = & H y + B^\R{T} u + d \\ 79% \nabla_s L(y, s, u ) & = & u - \mu / s \\ 80% \nabla_u L(y, s, u ) & = & s + b + B y \\ 81% \end{array} 82%$$$ 83% where$latex \mu / s $$is the component by component division of 84% latex \mu$$ by the components of the $latex s$$. 85% Note, from the second equation, that we only need consider 86% latex u \geq 0$$ because$latex s \geq 0$$. 87% 88% subhead Kuhn-Tucker Conditions$$
89% We use $latex D(s)$$90% to denote the diagonal matrix with latex s$$ 91% along its diagonal and$latex 1_r$$92% to denote the vector, of length latex r$$ with all its components
93% equal to one.
94% We define
95% $latex F : \B{R}^{r + p + r} \rightarrow \B{R}^{r + p + r}$$96% by 97% latex $98% F(s, u, y) 99% = 100% \left( 101% \begin{array}{c} 102% s + b + B y \\ 103% D(s) D(u) 1_r - \mu 1_r\\ 104% H y + B^\R{T} u + d 105% \end{array} 106% \right) 107%$$$ 108% The Kuhn-Tucker conditions for a solution of the 109%$latex \mu$$-relaxed constrained affine Kalman-Bucy smoother problem is 110% latex F(s, u, y) = 0$$.
111%
112% $subhead Newton Step$$113% Given a value for latex (s, u, y)$$, the Newton step 114%$latex ( \Delta s^\R{T} , \Delta u^\R{T} , \Delta y^\R{T} )^\R{T}$$solves the problem: 115% latex $116% F^{(1)} (s, u, y) 117% \left( \begin{array}{c} \Delta s \\ \Delta u \\ \Delta y \end{array} \right) 118% = 119% - F(s, u, y) 120%$$$
121%
122% $head mu$$123% The argument icode mu$$ is a positive scalar specifying the 124% relaxation parameter$latex \mu$$. 125% 126% head s$$
127% The argument $icode s$$is a column vector of length latex r$$. 128% All the elements of$icode s$$are greater than zero. 129% 130% head y$$
131% The argument $icode y$$is a column vector of length latex p$$ 132% 133%$head u$$134% The argument icode u$$ is a column vector of length $latex r$$. 135% All the elements of icode s$$ are greater than zero. 136% 137%$head b$$138% The argument icode b$$ is a column vector of length $latex r$$. 139% 140% head d$$ 141% The argument$icode d$$is a column vector of length latex p$$
142%
143% $head Bdia$$144% The argument icode Bdia$$ is an$latex m \times n \times N$$array. 145% For latex k = 1 , \ldots , N$$ we define
146% $latex B_k \in \B{R}^{m \times n}$$by 147% latex $148% B_k = Bdia(:, :, k) 149%$$$ 150% 151%$head B$$152% The matrix latex B$$ is defined by
153% $latex $154% B 155% = 156% \left( \begin{array}{cccc} 157% B_1 & 0 & 0 & \\ 158% 0 & B_2 & 0 & 0 \\ 159% 0 & 0 & \ddots & 0 \\ 160% & 0 & 0 & B_N 161% \end{array} \right) 162%$ $$163% 164% head Hdia$$ 165% The argument$icode Hdia$$is an latex n \times n \times N$$ array.
166% For $latex k = 1 , \ldots , N$$we define 167% latex H_k \in \B{R}^{n \times n}$$ by 168%$latex $169% H_k = Hdia(:, :, k) 170%$ $$171% 172% 173% head Hlow$$
174% The argument $icode Hlow$$is an latex n \times n \times N$$ array. 175% For$latex k = 1 , \ldots , N$$we define 176% latex L_k \in \B{R}^{n \times n}$$ by
177% $latex $178% L_k = Hlow(:, :, k) 179%$ $$180% 181% head H$$ 182% The matrix$latex H$$is defined by 183% latex $184% H 185% = 186% \left( \begin{array}{cccc} 187% H_1 & L_2^\R{T} & 0 & \\ 188% L_2 & H_2 & L_3^\R{T} & 0 \\ 189% 0 & \ddots & \ddots & \ddots \\ 190% & 0 & L_N & H_N 191% \end{array} \right) 192%$$$
193%
194% $head ds$$195% The result icode ds$$ is a column vector of length$latex r$$196% equal to the latex \Delta s$$ components of the Newton step.
197%
198% $head dy$$199% The result icode dy$$ is a column vector of length$latex p$$200% equal to the latex \Delta y$$ components of the Newton step.
201%
202% $head du$$203% The result icode du$$ is a column vector of length$latex r$$204% equal to the latex \Delta u$$ components of the Newton step.
205%
206% $children# 207% example/newton_step_ok.m 208% #$$209% 210% head Example$$ 211% The file$cref newton_step_ok.m$$contains an example and test of 212% code ckbs_newton_step$$.
213% It returns true if $code ckbs_newton_step$$passes the test 214% and false otherwise. 215% 216% head Method$$ 217% The derivative of$latex F$$is given by 218% latex $219% F^{(1)} (s, y, u) = 220% \left( 221% \begin{array}{ccccc} 222% D( 1_r ) & 0 & B \\ 223% D( u ) & 0 & D(s) \\ 224% 0 & B^\R{T} & H 225% \end{array} 226% \right) 227%$$$
228% It follows that
229% $latex $230% \left(\begin{array}{ccc} 231% I & 0 & B \\ 232% U & S & 0\\ 233% 0 & B^T & C \\ 234% \end{array}\right) 235% \left(\begin{array}{ccc} 236% \Delta s \\ \Delta y \\ \Delta u 237% \end{array}\right) 238% = 239% - 240% \left(\begin{array}{ccc} 241% s + b + By \\ 242% SU\B{1} - \mu\B{1}\\ 243% Cy + B^Tu + d 244% \end{array}\right)\;. 245%$ $$246% Below, latex r_i$$ refers to row$latex i$$. 247% Applying the row operations 248% latex $249% \begin{array}{ccc} 250% r_2 &=& r_2 - U r_1 \\ 251% r_3 &=& r_3 - B^T S^{-1} r_2 252% \end{array}\;, 253%$$$
254% we obtain the equivalent system
255% $latex $256% \left(\begin{array}{ccc} 257% I & 0 & B \\ 258% 0 & S & -UB\\ 259% 0 & 0 & C + B^T S^{-1} U B 260% \end{array}\right) 261% \left(\begin{array}{ccc} 262% \Delta s \\ \Delta y \\ \Delta u 263% \end{array}\right) 264% = 265% - 266% \left(\begin{array}{ccc} 267% s + b + B y \\ 268% -U(b + B y) - \mu\B{1}\\ 269% Cy + B^T u + d + B^T S^{-1} 270% \left(U(b + B y) + \mu \B{1} 271% \right) 272% \end{array}\right)\;. 273%$ $$274% latex \Delta y$$ is obtained 275% from a single block tridiagonal solve (see third row of system). 276% Then we immediately have 277%$latex $278% \Delta u = US^{-1}(b + B(y + \Delta y)) + \frac{\mu}{s} 279%$ $$280% and 281% latex $282% \Delta s = -s -b -B(y + \Delta y)\;. 283%$$$.
284% \$end
285% ---------------------------------------------------------------------------
286function [ds, dy, du] = ckbs_newton_step(mu, s, y, u, b, d, Bdia, Hdia, Hlow)
287    m = size(Bdia, 1);
288    n = size(Bdia, 2);
289    N = size(Bdia, 3);
290    p = n * N;
291    r = m * N;
292    %
293    % check sizes
294    if size(s,1) ~= r | size(u,1) ~= r | size(b,1) ~= r
295        r
296        size(s,1)
297        size(u,1)
298        size(b,1)
299        error('ckbs_newton_step: argument sizes do not agree');
300    end
301    if size(y,1) ~= p | size(d,1) ~= p
302        p
303        size(y,1)
304        size(d,1)
305        error('ckbs_newton_step: argument sizes do not agree');
306    end
307    if size(Hdia,1) ~= n | size(Hdia,2) ~= n | ...
308            size(Hlow,1) ~= n | size(Hlow,2) ~= n
309        n
310        size(Hdia,1)
311        size(Hdia,2)
312        size(Hlow,1)
313        size(Hlow,2)
314        error('ckbs_newton_step: argument sizes do not agree');
315    end
316    if size(Hdia,3) ~= N | size(Hlow,3) ~= N
317        N
318        size(Hdia,3)
319        size(Hlow,3)
320        error('ckbs_newton_step: argument sizes do not agree');
321    end
322    if min(s) <= 0
323        error('ckbs_newton_step: min(s) <= 0 ');
324    end
325    if min(u) <= 0
326        error('ckbs_newton_step: min(u) <= 0 ');
327    end
328    %
329    % compute diagonal of D(u/s)
330    u_sinv     = u ./ s;
331    %
332    % compute the diagonal blocks for H + B^T * D(u/s) * B
333    Hplus = zeros(n, n, N);
334    blk_m = 1 : m;
335    for k = 1 : N
336        Bk           = Bdia(:,:,k);
337        Dk           = diag( u_sinv(blk_m) );
338        Hplus(:,:,k) = Hdia(:,:,k) + Bk' * Dk * Bk;
339        blk_m = blk_m + m;
340    end
341    %
342    % compute right hand side in equation for Delta y
343    Bt_u_d        = ckbs_blkdiag_mul_t(Bdia, u) + d;
344    Cy            = ckbs_blktridiag_mul(Hdia, Hlow, y);
345    By            = ckbs_blkdiag_mul(Bdia, y);
346    row3          = Cy + Bt_u_d + ckbs_blkdiag_mul_t(Bdia, (u.*(b + By) + mu)./s);
347
348    % compute du
349    [dy, lambda]  = ckbs_tridiag_solve(Hplus, Hlow, -row3);
350
351    Bydy = ckbs_blkdiag_mul(Bdia, y + dy);
352    % compute du
353    du            = (u.*(b + Bydy) + mu)./s;
354
355    % compute ds
356    ds            =  -s - b - Bydy;
357    return
358end
Note: See TracBrowser for help on using the repository browser.