source: branches/cache/plan.txt @ 3338

Last change on this file since 3338 was 3338, checked in by bradbell, 6 years ago

Add one result to EndOp? so always have i_var < num_var.

cache.hpp: Add copyright and svn Id.
forward0sweep.hpp: check all operators (even after skip).
forward1sweep.hpp: check all operators (even after skip).
player.hpp: set num_cache_rec_ in all cases.
plan.txt: add copyright and svn Id.

  • Property svn:keywords set to Id
File size: 5.7 KB
Line 
1/* $Id: plan.txt 3338 2014-09-17 12:24:49Z bradbell $ */
2/* --------------------------------------------------------------------------
3CppAD: C++ Algorithmic Differentiation: Copyright (C) 2003-14 Bradley M. Bell
4
5CppAD is distributed under multiple licenses. This distribution is under
6the terms of the
7                    Eclipse Public License Version 1.0.
8
9A copy of this license is included in the COPYING file of this distribution.
10Please visit http://www.coin-or.org/CppAD/ for information on other licenses.
11-------------------------------------------------------------------------- */
12
13Plan for implementing caching
14
150. Next Step:
16        Implement Step 4.1, sorting last_use.
17
181. In cppad/local/ad_fun.hpp add following to ADFun private data.
19                pod_vector<size_t> cache_;
20        where
21                taylor_[ cap_order_taylor_ * cache[i_var] + k ]
22        is the the cache copy of
23                taylor_[ cap_order_taylor_ * i_var + k ]
24        for k = 0 , ... , num_order_taylor_ - 1.
25
262. During forward mode, directly after
27                taylor_[ cap_order_taylor_ * i_var + p ]
28        is computed, copy
29                taylor_[ cap_order_taylor_ * i_var + k ]
30        to
31                taylor_[ cap_order_taylor_ * i_var + k ]
32        for k = 0 , ... , p-1.
33
343. For each opertion in the tape, and each argument arg that is an index in
35        taylor_, change arg to cache[arg].
36
374. Caching an Operation Sequence:
38
39        4.1 Determine last_use[i_var]:
40    Use a forward analysis to determine last_use[i_var], the last result index
41        that uses variable i_var as an operand. In the special case where i_var
42        is not used, last_use[i_var] = i_var. Sort this vector so that
43        last_use[ order[i] ] <= last_use[ order[j] ] whenever i < j.
44
45
46        4.2 Determine cache[i_var]:
47       
48                4.2.1 Use an forward analysis that starts with an empty cache vector
49                and no indices available. Initialize last_use index i_last = 0;
50
51                4.2.1 For each variable i_var, if there is a cache index available, set
52                i_cache to that index. If not, create another cache slot and set
53                i_cache to the corresponding index. Set cache[i_var] = i_cache, and
54                mark the corresponding index as in use.
55
56                4.2.2 While last_use[ order[i_last] ] <= i_var, mark
57                cache[ order[i_last] ] as aviable and increment i_last.
58
59        4.3 Change operand indices:
60                For each variable index i_var that is an operand, map i_var to
61                cache[i_var].
62 
63
645. Example Tape:
65        y[0] = 5 * (x[0] + x[1])
66        y[1] = y[0] + 5 * (x[0] - x[1]) = 10 * x[0];
67
68        o=1    v=1    Inv     | fz[0]=1
69        o=2    v=2    Inv     | fz[0]=2
70        o=3    v=3    Subvv    vl=1     vr=2    | fz[0]=-1
71        o=4    v=4    Mulpv    pl=5     vr=3    | fz[0]=-5
72        o=5    v=5    Addvv    vl=1     vr=2    | fz[0]=3
73        o=6    v=6    Mulpv    pl=5     vr=5    | fz[0]=15
74        o=7    v=7    Addvv    vl=4     vr=6    | fz[0]=10
75        o=8    v=     End     
76
77        partial_0 ( y[0] + 2 * y[1]) = 25
78        partial_1 ( y[0] + 2 * y[1]) = 5
79
806. Example Cache Index determination:
81        i_var   last_use    cache
82            0          0        8
83            1          5        8
84            2          5        9
85            3          4       10
86            4          7       11
87            5          6       10
88            6          7        8
89            7          7        9
90
917. Example Zero Order Forward Mode:
92        o=1 v=1 c=8  Inv                         | z[8] =z[1]           =1
93        o=2 v=2 c=9  Inv                         | z[9] =z[2]           =2
94        o=3 v=3 c=10 Addvv vl=1 cl=8  vr=2 cr=9  | z[10]=z[3]=z[8]+z[9] =3
95        o=4 v=4 c=11 Mulpv pl=5       vr=3 cr=10 | z[11]=z[4]=5*z[10]   =15
96        o=5 v=5 c=10 Subvv vl=1 cl=8  vr=2 cr=9  | z[10]=z[5]=z[8]-z[9] =-1
97        o=6 v=6 c=8  Mulpv pl=5       vr=5 cr=10 | z[8] =z[6]=5*z[10]   =-5
98        o=7 v=7 c=9  Addvv vl=4 cl=11 vr=6 cr=8  | z[9] =z[7]=z[11]+z[8]=10
99
1008. Example First Order Reverse Mode:
101        z[*]=0, z[4]=1, z[7]=2
102
103        o=7 v=7 c=9  Addvv vl=4 cl=11 vr=6 cr=8  | z[9]+=z[7]      z[9]=2
104                                                 | z[8]+=z[9],     z[8]=2
105                                                 | z[11]+=z[9],    z[11]=2
106                                                 | z[9]=0
107
108        o=6 v=6 c=8  Mulpv pl=5       vr=5 cr=10 | z[8]+=z[6],     z[8]=2
109                                                 | z[10]+=5*z[8],  z[10]=10
110                                                 | z[8]=0
111
112        o=5 v=3 c=10 Subvv vl=1 cl=8  vr=2 cr=9  | z[10]+=z[5],    z[10]=10
113                                                 | z[8]+=z[10],    z[8]=10
114                                                 | z[9]-=z[10].    z[9]=-10
115                                                 | z[10]=0
116
117        o=4 v=4 c=11 Mulpv pl=5       vr=3 cr=10 | z[11]+=z[4],    z[11]=3
118                                                 | z[10]+=5*z[11], z[10]=15
119                                                 | z[11]=0
120
121        o=3 v=3 c=10 Addvv vl=1 cl=8  vr=2 cr=9  | z[10]+=z[3],    z[10]=15
122                                                 | z[8]+=z[10],    z[8]=25
123                                                 | z[9]+=z[10],    z[9]=5
124                                                 | z[10]=0
125
126        o=2 v=2 c=9  Inv                         | z[2]=z[9],      z[2]=5
127        o=1 v=1 c=8  Inv                         | z[1]=z[8],      z[1]=25
128
129
1309. Changes to data sructure:
131        ADFun::num_var_tape_:         no longer also the size of ADFun::taylor_
132        ADFun::num_cache_tape_:       number of cache entires in taylor_:
133        ADFun::cache_[num_var_tape_]: is the cache index for each variable
134
135
13610. Forward and Reverse:
137        10.1: If num_cache_tape_ is zero, cache will not be used.
138        otherwise, the special cache operatios will be done.
139
140        10.2: if NDEBUG is not defined, the varialbe
141        cache2variable[ADFun::num_cache_tape_] will map the cache index
142        to the corresonding current variable index. This will be used for tracing
143        and to check that the variables have the proper order for each operation;
144        i.e., the operations have lower variable index than the results.
145
14611. Operations:
147        The check that variables have proper order will be moved from each
148        individual operation into forward and reverse sweeps.
149
15012. Optmization:
151        compute the last_use and cache indices and change the
152        arguments to use cache indices instead of variable indices.
Note: See TracBrowser for help on using the repository browser.