Econ Trash: Comment for Roger

Tuesday, March 29, 2016

Comment for Roger

This post is an expanded follow on comment for Roger Sparks in this thread.

In particular, for an LTI system $A_d = e^{A_c T_s} \approx I + A_c T_s$ (the approximation holding when the matrix norm $\lVert A_c T_s \rVert = \lVert A_c \rVert T_s \ll 1$, ... a generalization of the approximation you may recall from this comment I left you on John Handley's blog). The subscripts $d$ and $c$ denote discrete and continuous time, respectively. Note that $A_d$ is different for every sample period $T_s$ used. $B_d$ is generally more complicated, but can be approximated as $B_d \approx A^{-1}_c (A_d - I) B_c$. This approximation is exact when $\mathbf{u}$ is a "zero order hold" (ZOH) function, meaning it stays constant over each sample period. More generally, for LTI systems with an arbitrary input $\mathbf{u}$ you must integrate:
$$
\mathbf{x}_{n+1} = e^{A_c T_s} \mathbf{x}_{n} + \int\limits_{0}^{T_s} e^{A_c (T_s - \tau)} B_c\, \mathbf{u}(\tau + n T_s) d \tau \tag 9
$$
and substituting for time-invariant $A_d$ from above:
$$
\mathbf{x}_{n+1} = A_d \mathbf{x}_{n} + \int\limits_{0}^{T_s} e^{A_c (T_s - \tau)} B_c\, \mathbf{u}(\tau + n T_s) d \tau \tag {9a}
$$
or more generally still, for a time-varying linear system:
$$
\mathbf{x}_{n+1} = \Phi_{(n+1) T_s, n T_s} \mathbf{x}_{n} + \int\limits_{n T_s}^{(n+1) T_s} \Phi_{(n+1) T_s, \tau} B_c (\tau) \mathbf{u}(\tau) d \tau \tag {10}
$$
and now substituting for a time varying version of the $A_d$ matrix (indexed by the sample period number $n$):
$$
\mathbf{x}_{n+1} = A_{dn} \mathbf{x}_{n} + \int\limits_{n T_s}^{(n+1) T_s} \Phi_{(n+1) T_s, \tau} B_c (\tau) \mathbf{u}(\tau) d \tau \tag {10a}
$$
where $\Phi_{t_2, t_1}$ is the state transition matrix from time $t_1$ to time $t_2$. Or you can, in principal, avoid the integration by doing a Taylor expansion on $\mathbf{u}$ and find a separate $B_{di}$ for each $i^{th}$ term. As Brian has previously pointed out, a numerical integration accomplishes much the same thing, only instead of Taylor terms sampled at one time, it uses $\mathbf{u}$ sampled at several times over a single nominal sample period.

Additionally, as I mentioned in a comment above, some non-linear dynamic systems can be approximated over each sample period as a linear system. For example, you may have a non-linear system such as $\dot{\mathbf{x}} = \mathbf{f}(\mathbf{x},\mathbf{u} ,t$), where $\mathbf{f}, \mathbf{x}$ and $\mathbf{u}$ are generally vectors. One way to accomplish this is to approximate $\mathbf{f}$ as $A_c$ and $B_c$ where these matrices are the Jacobians of $\mathbf{f}$ wrt $\mathbf{x}$ (i.e. $A_c = \partial \mathbf{f}/\partial\mathbf{x}$) and $\mathbf{u}$ (i.e. $B_c = \partial \mathbf{f}/\partial\mathbf{u}$) respectively, and to rewrite the differential equation in terms of a small perturbation of $\mathbf{x}$ (call it $\delta{\mathbf{x}}$) about a nominal solution $\mathbf{x}^*$, i.e. $\delta{\mathbf{x}} \equiv \mathbf{x} - \mathbf{x}^*$. Similarly define $\delta{\mathbf{u}} \equiv \mathbf{u} - \mathbf{u}^*$. Then we can write for an approximate perturbation to this non-linear differential equation $\delta{\dot{\mathbf{x}}} = A_c \delta\mathbf{x} + B_c \delta\mathbf{u}$. The discrete time approximation I gave above ($A_d \approx I + A_c T_s$) can be seen in this light as a two term vector Taylor expansion of $\mathbf{f}$ wrt $\mathbf{x}$, i.e. $A_d \approx I + (\partial\mathbf{f}/\partial\mathbf{x}) T_s$. Note that the "region of validity" for $A_c$ and $B_c$ may be small, and typically they are recalculated around different nominal values $\mathbf{x}^*$, $\mathbf{u}^*$ and time $t$ as the system evolves in time. An application might be, for example, a model of perturbations of a rocket about a nominal desired trajectory for purposes of implementing a feedback control system to keep it on course. As long as it never is allowed to wander too far off course, the approximation may be adequate, provided the system is "re-linearized" on a regular basis. This might suggest an economics application of a central bank "feedback control" rule for setting interest rates or determining the "high powered" money supply to "keep the economy on course" about some non-linear pre-determined solution to a set of non-linear differential equations, based on feedback from other measured macro variables, such as the inflation rate, the NGDP level or the unemployment rate.

Another huge advantage of linear systems is the ability to analyze and manipulate them in the "frequency domain" rather than the time domain analysis I give above. This means the ability to use continuous and discrete time Fourier transforms, Laplace transforms and Z-transforms. They also have advantages in terms of being able to analyze their response to noise, modeling errors and other uncertainties, and to design optimal "state estimators" (e.g. Kalman filters) perhaps designed to work in conjunction with optimal feedback or open loop control laws, which can effectively change the natural modes of the system to more desirable ones: for example, to stabilize an unstable system by moving its unstable eigenvalues to the "right half plane" (i.e. move modes with positive real components such that they instead have negative real components... in what's known with regard to Laplace transforms as the "S-plane"). The discrete time equivalent is to move unstable eigenvalues of $A_d$ from outside the "unit circle" to inside the unit circle (this unit circle drawn in the "Z-plane" associated with the Z-transform, which is the discrete time counterpart to the Laplace transform). Sometimes you hear economists speak of "unit roots." What they're referring to are "roots" (in the sense of roots to the "characteristic polynomial" of $A_d$, which are the eigenvalues, AKA "poles" of $A_d$) on the unit circle (corresponding to the imaginary axis of the S-plane in the continuous time world), which produces "marginally stable" outputs, such as non-rising non-decaying step impulse responses (the response of a perfect "integrator") or constant amplitude oscillations (sin and cos waves, or more generally "complex exponentials," i.e. $c\, e^{i \omega n T_s}$ with $i \equiv \sqrt{-1}$, $c$ a constant and $\omega$ the unitless frequency of oscillation (which you can practically take to be radians per sample)). Everything in quotes can be Googled for more information (those are all common terms).

I think you can probably see how important the $A$ matrix ($A_c$ or $A_d$) is to a linear system. It characterizes all the dynamics of the system, such as whether or not it's naturally stable or likely to oscillate, how fast it will respond to inputs (the time constants), and which modes are dominant (i.e. which modes are weighted with the largest coefficients). Of course the other matrices ($B$, $C$ and $D$) are also important, and you need them, for example, to analyze steady state responses or to determine if a system is "controllable" (can we control the states?) or "observable" (do we have enough sensors to observe all the states?). But $A$ is really the heart of the system.

-----------------------------------------------------------------------------------------------------------------

Appendix A: Details about one way to calculate discrete time matrices for non-linear systems

For the non-linear approximate discrete time case, it's sometimes beneficial to numerically calculate $A_d$ and $B_d$ from numerical solutions to the non-linear equations directly rather than via the partial differential Jacobian technique discussed above, especially if you have some idea of the covariance of $\delta \mathbf{x}$ (call it $C_\mathbf{x}$) and $\delta \mathbf{u}$ (call it $C_\mathbf{u}$). For example, say the eigenvalues (variances) of $C_\mathbf{x}$ are $\{\sigma^2_{\mathbf{x}, i}\}$, for $i = 1,...,n$ (assuming $\mathbf{x}$ is $n$-dimensional), and the corresponding orthonormal matrix of eigenvectors is $V_\mathbf{x} = [\mathbf{v}_{\mathbf{x}, i}]$. These eigenvectors are orthonormal since covariances are always positive semi-definite, and we'll assume positive definite (i.e. a full $n$ degrees of freedom (equivalently all variances are > 0)) for this exercise. Then a good way to calculate $A_{d n}$ which captures some of the effects of the higher order terms of the Taylor expansion as well as the dependence of $\mathbf{f}$ on $t$ (which we've so far assumed to be negligible over a sample period) is the following sum of outer products:
$$
A_{d n} = \sum_{i=1}^n [\mathbf{x}_i (t_{n+1}) - \mathbf{x}^* (t_{n+1})] \cdot \mathbf{v}^T_{\mathbf{x}, i} /\sigma_{\mathbf{x}, i} \tag {11}
$$
where $t_{m} \equiv m T_s$ and $\mathbf{x}_i (t_{n+1})$ is the numerical solution at time $t_{n+1}$ to the non-linear differential equation after perturbing the nominal solution $\mathbf{x}^* (t_{n})$ by $\mathbf{v}_{\mathbf{x}, i} \sigma_{\mathbf{x}, i}$ and propagating it forward by one sample period $T_s$ to time $t_{n+1}$. In other words we form $\mathbf{x}_i (t_n) = \mathbf{x}^* (t_{n}) + \delta \mathbf{x}_{i, n}$ where $\delta \mathbf{x}_{i, n} = \mathbf{v}_{\mathbf{x}, i} \sigma_{\mathbf{x}, i}$ and propagate that forward one time step to get $\mathbf{x}_i (t_{n+1})$. In general $C_\mathbf{x}$ and $C_\mathbf{u}$ will also be functions of time, and thus so will their eigenvalues and eigenvectors (though I've neglected a time index for them in the above). An analogous procedure exists for finding $B_{d n}$ using a formula similar to $(11)$ but with the eigenvalues and eigenvectors of $C_\mathbf{u}$ rather than $C_\mathbf{x}$. This procedure (for both $A_d$ and $B_d$) scales the non-linear effects by the size of the standard deviations ($\sigma$) of each degree of freedom in each covariance, thus in some sense capturing the expected size of the effects of higher order terms / non-linearities. Note that $(11)$ is a one-sided difference, whereas a more robust method would use both positive and negative perturbations ($\pm \delta$).

One way to picture what's going on in $(11)$ is to imagine that $C_\mathbf{x}$ describes a multivariate spatial Gaussian probability distribution that's long and narrow (i.e. its 1-$\sigma$ ellipsoidal iso-probability density contour is cigar shaped). Then a perturbation in $\mathbf{x}$ from the nominal is most likely to be located ~$1 \sigma$ one direction or the other along the major axis of this distribution (with $\mathbf{x}^*$ right in the middle of course). For the sake of argument, say it's exactly $1 \sigma$ along the positive major axis (it actually works for any principal axis of the ellipsoid, but imagine the major one for the sake of argument), and it has no perturbation components along the other axes. Then the inner product of this perturbation and the row vector $\mathbf{v}^T_{\mathbf{x}, j}/\sigma_{\mathbf{x}, j}$ will be $1$ and it will be $0$ for $\mathbf{v}^T_{\mathbf{x}, i}/\sigma_{\mathbf{x}, i},\; i \ne j$ where we're assuming the major axis is the $j^{th}$. That's because $V_\mathbf{x}$ is orthornormal (i.e. its columns are unit length and mutually orthogonal). Such a perturbation will give an exact result for the resultant perturbation propagated to time $t_{n+1}$ (assuming $\delta \mathbf{u} = 0$). In other words, such a perturbation at time $t_n$ (call it $\delta \mathbf{x}_j (t_n)$) will give an exact result for $\delta \mathbf{x}_j (t_{n+1}) = A_{d n} \delta \mathbf{x}_j (t_n)$. This is because, substituting $(11)$ for $A_{d n}$ we have:
$$
\begin{align}
A_{d n} \delta \mathbf{x}_j (t_n) & = \left( \sum_{i=1}^n [\mathbf{x}_i (t_{n+1}) - \mathbf{x}^* (t_{n+1})] \cdot \mathbf{v}^T_{\mathbf{x}, i, n} /\sigma_{\mathbf{x}, i, n} \right) \cdot \delta \mathbf{x}_i (t_n)\\
& = \left( \sum_{i=1}^n [\mathbf{x}_i (t_{n+1}) - \mathbf{x}^* (t_{n+1})] \cdot \mathbf{v}^T_{\mathbf{x}, i, n} /\sigma_{\mathbf{x}, i, n} \right) \cdot \mathbf{v}_{\mathbf{x}, j, n} \sigma_{\mathbf{x}, j, n}\\
& = \sum_{i=1}^n [\mathbf{x}_i (t_{n+1}) - \mathbf{x}^* (t_{n+1})] (\mathbf{v}^T_{\mathbf{x}, i, n} \cdot \mathbf{v}_{\mathbf{x}, j, n}) ( \sigma_{\mathbf{x}, j, n} /\sigma_{\mathbf{x}, i, n}) \\
& = [\mathbf{x}_j (t_{n+1}) - \mathbf{x}^* (t_{n+1})] (\mathbf{v}^T_{\mathbf{x}, j, n} \cdot \mathbf{v}_{\mathbf{x}, j, n}) ( \sigma_{\mathbf{x}, j, n}/\sigma_{\mathbf{x}, j, n})\\
& = \mathbf{x}_j (t_{n+1}) - \mathbf{x}^* (t_{n+1})\\
& = \delta \mathbf{x}_j (t_{n+1}) \tag {12}
\end{align}
$$
where I've added time indices where appropriate. Thus, in some sense, a "likely" perturbation gives exact results, non-linearities and all. The Jacobian method does not capture this, and may be equally difficult to calculate (although when closed form expressions are available, it's likely easier) [1]. Of course, this exact "likely" perturbation still has probability = 0, so we don't expect to actually avoid non-linear distortions altogether, but in some sense we can expect to account for them in a way not captured by the Jacobian method. Doing so is perhaps a bit conservative, but if our purpose is (for example) to keep track of how errors propagate through the system (e.g. with a Kalman filter), it's generally better to err on the conservative side, lest we fool ourselves into thinking we have a better state estimate than we actually do (which can cause us to weight new information ("innovations") with smaller gains than we should at future times). You can think of this problem as wrongly adopting overly strong "priors." If a more conservative result is desired, the perturbations can be made to be greater than 1-$\sigma$ in size.

[1] In fact, you can think of this method as a crude numerical estimate of $e^{(\partial\mathbf{f}/\partial\mathbf{x}) T_s}$, where the $\partial \mathbf{x}$ deviations (and thus the implicit $\partial \mathbf{f}$ deviations) aren't very infinitesimal at all, and where $T_s$ is implicitly folded into the differencing estimate directly rather than as a separate multiplier. But the point isn't to approximate the matrix exponential, but to find something better. However, if the system were actually LTI, it would be an exact representation of $e^{(\partial\mathbf{f}/\partial\mathbf{x}) T_s}$. And if the system were linear but time varying, it would produce exactly $\Phi_{n+1, n}$ where I've simplified the time indices from $(10)$ above.

-----------------------------------------------------------------

See comment below:

Below: Example of error propagation (falling objects with initial horizontal velocity): blue curve includes air resistance, and the red curve doesn't. Blue trajectory is non-linear and red is linear. Error ellipses are 1-sigma.

30 comments :

Roger SparksMarch 29, 2016 at 9:09 PM
I just discovered this as I was trying to understand how you placed Excel spreadsheets on the blog. (I think you are using a Microsoft account and the online app?)

This will take some study! Thank you very much for placing it here.

With a quick glance, I notice that you relate linear systems to the "frequency domain". I spent some time with filters and the Fourier Transform a couple of years back. I never attained a grasp of the math behind the transform but was successful in embedding prepackaged code into a C++ program that I still use. I had a lot of fun with the source code of Amaseis which is an earthquake recording software package written by Alan Jones. This was (and is) in conjunction with an amateur seismic station that I built a few years back. Too much. I will stop here.

Thanks again, for all of this.
ReplyDelete
Replies
Roger SparksMarch 30, 2016 at 5:40 AM
Good Morning Tom, I am beginning to get a glimmer of understanding here. But just a glimmer.

I am back at your very first comment on Brian's post. Here you define the terms and in particular

X = 1 - α1(1-θ)

Here X is a constant. Am I correct in thinking that A and B will also be constants? Or are they variables that act within X space in the sense that α1 and θ move in a relationship such that X is constant?

Does this get to the heart of the difference between linear and non-linear?

I know these are dumb questions but I must build on the fragments already within my grasp.

Your comment on Excel technique has not yet shown up when I refresh Brian's blog. I am looking forward to seeing that. Thanks.

Back to bed here. Too early to stay up with a busy day forthcoming.

This is all fascinating to me. I thank you so much for presenting it.
ReplyDelete
Replies
Tom BrownMarch 30, 2016 at 11:59 AM
Roger, I may be leading you astray with all this linear system theory! I don't really know. If your interest is primarily in economic modeling, I really am no expert on that. Where it overlaps with things I know I can go on at length (as evidenced above!). I'm not sure how much of what I wrote here applies to what you're interested in.

You can take my series of posts (starting with SIM and going through SIM7 and this one) as a cautionary tale of what can go wrong when an electrical engineer takes a cursory glance at SFC models and thinks he knows all about them based on his knowledge of linear systems. I do think I understand SFC models a bit better now, and I understand why I had miscommunications with people like Ramanan better now, but I'm FAR from an expert.

From my engineer's viewpoint, actually staying stock-flow consistent doesn't seem that important, as long as we're approximately stock-flow consistent. I eventually figured out how to stay SFC while changing sample periods (for example), but big-picture, I think I had it basically correct the 1st time (or the 2nd time: SIM2). But maybe there's something important I'm missing. I definitely have not read all of G&L's book (I've really only absorbed a portion of Chapter 3).

Of course my other purpose here was to understand Jason (the physicist's) complaint ... something I have not fully accomplished, but I'm OK w/ that.
ReplyDelete
Replies
Roger SparksApril 1, 2016 at 10:03 AM
This comment has been removed by the author.
ReplyDelete
Replies
Roger SparksApril 1, 2016 at 10:27 AM
Do you visualize the expansion of money as an "object" like I do?

It's a strange object that materializes at the will of government. Government borrows from itself. Where there was no money, now there is money.

This new money disappears if government taxes it back. A constant tax rate based on every money exchange is one way to analyze the collection.

This strange object is measured by money and transactions. The product (money*transactions) forms a parabola if plotted on money-transaction axis's. The curve is the "size" of the object and is constant at every point on the curve. The object has NO time parameters.

Linear System Theory can be used to predict how taxation might, over time , collect the object.

My method was not Linear System Theory? While the results were the same, I divided the whole object into two parts. Then I assigned one part to government, the second part to the private sector, with the size of the parts being dependent upon a uniform series of time periods.

Is this making any sense at all?

When we divide the object this way, only a small part of the entire object is visible in any time period. The visible part is the expansion related to the government tax collection. Thus we see GDP = G/T.

The privately held part of the object remains un-expanded, being only visible to the extent that wealth is visible.

[I know this is hard to visualize. It is something like complex numbers that are both present and not real.]

In SIM, we go on to add two additional complexities: We add more money (a new object) each period, AND, we draw part of the new money from previously issued money. You are familiar with the resulting trend to the steady state limit.

Concluding, do other people think of an injection of money as creating a GDP object (like I visualize)?
ReplyDelete
Replies
Tom BrownApril 1, 2016 at 3:24 PM
Roger, I couldn't help it: my question to myself in a comment above about "What concrete thing does Jason imagine H is?" prompted me to write a couple of more comments/questions there. Check it out and see if your mental image of the SIM world at all matches mine. I think it allows for some uncertainty about what the velocity of money is, but it doesn't allow for uncertainty about the stock of money.
ReplyDelete
Replies

Add comment

Subscribe to: Post Comments ( Atom )

USEFUL SYMBOLS

≡ ≈ ≠

~

∝

\pm

\mp

\cdot

\int

\oint

∵ ∴
≤ ≥ ∞ ‖ ∗ ∀ ∂ ∃ ∈ ∉ ∅

\div

¬ × ℝ ℂ ℕ ℚ ℤ ℜ ℑ √ ∛ ∜
ᵢ ⱼ ₖ ₘ ₙ ← → ↑ ↓ ↔ ↕ ↖ ↗
↘ ↙ ⇆ ⇄ ⇅ ⇵ ⇒ ⇔ ∇
₀ ₁ ₂ ₃ ₄ ₜ ⁿ ⁱ ⁰ ¹ ² ³

More here

Αα	Alpha	Νν	Nu
Ββ	Beta	Ξξ	Xi
Γγ	Gamma	Οο	Omicron
Δδ	Delta	Ππ	Pi
Εε	Epsilon	Ρρ	Rho
Ζζ	Zeta	Σσς	Sigma
Ηη	Eta	Ττ	Tau
Θθ	Theta	Υυ	Upsilon
Ιι	Iota	Φφ	Phi
Κκ	Kappa	Χχ	Chi
Λλ	Lambda	Ψψ	Psi
Μμ	Mu	Ωω	Omega