
580.691 Learning Theory. Reza Shadmehr. System identification via subspace analysis.

Further reading: Peter van Overschee and Bart De Moor (1996) Subspace Identification for Linear Systems: Theory, Implementation, Applications. Kluwer Academic, The Netherlands, pp. 1-56.


Presentation Transcript


  1. 580.691 Learning Theory. Reza Shadmehr. System identification via subspace analysis. Further reading: Peter van Overschee and Bart De Moor (1996) Subspace Identification for Linear Systems: Theory, Implementation, Applications. Kluwer Academic, The Netherlands, pp. 1-56.

  2. The problem of state estimation. We have a system that has some hidden states x. In this system, we know precisely what the parameters are: we know A, B, C, D, and the noises. We also know the inputs to the system (inputs are u; they may be zero), and we have measured the outputs from the system (observations y). Task: our job is to estimate the hidden states x. Solution: the Kalman filter. The Kalman filter shows us how to combine our predictions about y (which are derived from our belief about x) with the actual measurements of y, and then update our estimate of the state x.

  3. State estimation via the Kalman filter. Given inputs u and measurements y, estimate state x. We know the parameters of the system precisely (A, B, C, D, Q, R). On each trial n:
  - Our prior estimate of state x (mean xhat(n|n-1)) and our uncertainty (variance P(n|n-1)).
  - Our predicted observation in trial n: yhat(n) = C xhat(n|n-1) + D u(n).
  - The Kalman gain, a ratio between uncertainty in our prediction and uncertainty in the measurement: K(n) = P(n|n-1) C' (C P(n|n-1) C' + R)^-1.
  - Our posterior estimate of state x: xhat(n|n) = xhat(n|n-1) + K(n) (y(n) - yhat(n)).
  - Our posterior uncertainty: P(n|n) = (I - K(n) C) P(n|n-1).
  - Prior estimate of state (and its uncertainty) for the next trial: xhat(n+1|n) = A xhat(n|n) + B u(n), P(n+1|n) = A P(n|n) A' + Q.
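A minimal numpy sketch of this trial-by-trial filter (the function name, array shapes, and variable names here are my own, not from the slides):

```python
import numpy as np

def kalman_filter(A, B, C, D, Q, R, u, y, x0, P0):
    """Estimate hidden states given known parameters.
    u: (T, m) inputs; y: (T, o) observations; x0, P0: prior mean and variance."""
    n = A.shape[0]
    xhat, P = x0, P0
    states = []
    for un, yn in zip(u, y):
        yhat = C @ xhat + D @ un                      # predicted observation
        K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)  # Kalman gain
        xhat = xhat + K @ (yn - yhat)                 # posterior state estimate
        P = (np.eye(n) - K @ C) @ P                   # posterior uncertainty
        states.append(xhat)
        xhat = A @ xhat + B @ un                      # prior for the next trial
        P = A @ P @ A.T + Q
    return np.array(states)
```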

  4. The problem of control. [The slide shows two equations: the cost per step, and the system dynamics.] We have a system that has some hidden states x. In this system, we know precisely what the parameters are: we know A, B, C, D, and the noises. Therefore, we know the system's dynamics. Task: our job is to provide some inputs u to this system so that it produces the outputs y that we desire. The optimal u and y minimize the cumulative cost. We need to find a feedback control policy that produces the optimal u. Solution: optimal feedback control. Find an optimal control policy that generates a motor command as a function of the estimated state of the system x.

  5. The problem of system identification. We have a system that has some hidden states x. We have given it some inputs u and observed some outputs y; we know these inputs and outputs. In this system, we do not know A, B, C, D, or the noises (of course, we also do not know the hidden states x). In fact, we do not even know the dimensions of the hidden state x. Task: our job is to estimate the "best" system parameters A, B, C, and D so that given the inputs u, we in fact observe the outputs y. Solution: system identification via subspace analysis.

  6. The problem statement (deterministic system). We are given a sequence of inputs and outputs, u(k) and y(k) for k = 1, ..., p. We believe that this data was generated by a system that has this structure:

x(k+1) = A x(k) + B u(k)
y(k) = C x(k) + D u(k)

Find model parameters A, B, C, D (and x(1)) so as to minimize the sum of squared errors between the predicted and observed measurements y.
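To make the generative model concrete, here is a minimal numpy simulator for this structure (the function name and shapes are my own):

```python
import numpy as np

def simulate(A, B, C, D, u, x1):
    """Generate y(1..p) from x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k)."""
    x, Y = x1, []
    for uk in u:                  # u: (p, m) input history
        Y.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(Y)            # (p, o) output history
```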

  7. Non-uniqueness of the solution. Given our data (input u and output y), there are many parameter values that can give us precisely this sequence of outputs y for inputs u. That is, there is no unique solution to our problem. To show this, suppose we know the true parameters A, B, C, D, and x(1):

x(k+1) = A x(k) + B u(k)    (x: nx1, A: nxn, B: nxm, u: mx1)
y(k) = C x(k) + D u(k)      (y: ox1, C: oxn, D: oxm)

If we now multiply the state by some arbitrary invertible nxn matrix T (inserting the identity T^-1 T where needed), we get the equivalent system

T x(k+1) = (T A T^-1) T x(k) + (T B) u(k)
y(k) = (C T^-1) T x(k) + D u(k)

So we see that the same (u, y) can be generated with the equivalent system whose state is T x and whose parameters are T A T^-1, T B, C T^-1, and D.
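A quick numerical check of this equivalence (a random toy system and transform T; the simulator is the same sketch as above, and all names are my own):

```python
import numpy as np

def simulate(A, B, C, D, u, x):
    Y = []
    for uk in u:
        Y.append(C @ x + D @ uk)
        x = A @ x + B @ uk
    return np.array(Y)

rng = np.random.default_rng(0)
n, m, o = 3, 1, 2
A = 0.9 * np.eye(n) + 0.05 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((o, n))
D = rng.standard_normal((o, m))
x1 = rng.standard_normal(n)
u = rng.standard_normal((50, m))

T = rng.standard_normal((n, n))   # arbitrary (almost surely invertible) matrix
Ti = np.linalg.inv(T)
y1 = simulate(A, B, C, D, u, x1)
y2 = simulate(T @ A @ Ti, T @ B, C @ Ti, D, u, T @ x1)
print(np.allclose(y1, y2))        # True: both systems produce the same outputs
```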

  8. Projecting a vector onto another vector. When we project a onto b, we get a vector in the direction of b with magnitude (a . b) / |b|, i.e.

proj_b(a) = ((a . b) / (b . b)) b    (read: a projected onto b)

It is easy to see that if a and b are perpendicular, then the projection of a onto b is the zero vector. (Thanks to Vincent Ethier.)
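In code (a two-line numpy sketch; names are my own):

```python
import numpy as np

def project(a, b):
    """Project vector a onto vector b: ((a . b) / (b . b)) * b."""
    return (a @ b) / (b @ b) * b

print(project(np.array([2.0, 1.0]), np.array([1.0, 0.0])))  # [2. 0.]
print(project(np.array([0.0, 3.0]), np.array([1.0, 0.0])))  # perpendicular: [0. 0.]
```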

  9. Projecting row spaces of matrices. Suppose that each row of the matrix A represents a vector. Because A has 3 columns, each of the row vectors lives in a 3D space. If the row vectors are linearly independent, then they can serve as a basis set, meaning that we can construct any 3D vector as a linear combination of the row vectors; the basis set spans the entire 3D space. We say that the rank of A is 3, because the row vectors span a 3D space. Each of the row vectors of matrix B is also a vector in 3D space. However, there are only two rows, so the space spanned by the row vectors of B is at most 2D (if they are linearly independent, this space is exactly 2D); the rank of B is at most 2. The space spanned by the row vectors of B represents a plane. This plane is a subspace of the 3D space where the row vectors of A live. We can project each row vector of A onto this subspace.

  10. [Figure: a plane inside the 3D space, axes x, y, z.] The row vectors of B span this plane. This is a subspace of the 3D space where the row vectors of A and B live. Projecting A onto B means projecting each row vector of A onto the space spanned by the row vectors of B.

  11. B-perp defines the space that is perpendicular to the space spanned by the row vectors of B. In this case, the space spanned by B is a plane, so the perpendicular space is simply a line in 3D. The reason why B-perp is a line and not a plane is that we want every vector in the subspace defined by B to be perpendicular to B-perp. [Figure: the line through the origin perpendicular to the plane spanned by the rows of B, axes x, y, z.] It then follows that the projector onto this perpendicular space is Pi_Bperp = I - B'(B B')^-1 B, so projecting A onto B-perp removes from each row of A its component in the row space of B.
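A numpy illustration of both projectors (the specific matrices here are my own toy example):

```python
import numpy as np

B = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])    # two rows spanning the xy-plane
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0],
              [7.0, 8.0, 9.0]])

Pi_B = B.T @ np.linalg.inv(B @ B.T) @ B   # projector onto the row space of B
Pi_Bperp = np.eye(3) - Pi_B               # projector onto the perpendicular space

print(A @ Pi_B)      # rows of A projected onto the plane (z components removed)
print(A @ Pi_Bperp)  # leftover components, perpendicular to the rows of B
```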

  12. Plan of attack. The U and Y matrices written here are simply a history of our input and output data, stacked so that row block k of U holds u(k), u(k+1), ..., u(k+j-1), and similarly for Y. We know that each row of matrix Y lives in a j-dimensional space that is spanned by the row vectors in X and U. We will project Y onto the space perpendicular to that spanned by U. By doing so, we will get rid of the contribution of U, and keep only the components of Y that live in the subspace described by X (the history of the hidden states, which is of course unknown). Therefore, by projecting Y onto U-perp, we will end up with a new matrix that is precisely equal to CX. Although we do not know C or X, we will be able to exactly compute the product CX. Because C is a constant matrix, we will in fact know the history of the hidden states up to a constant multiple. This is just what we needed, because we can then solve for the unknown parameters A, B, C, and D up to a constant multiple. In practice, i is much smaller than j, which means that the row vectors in Y and U describe subspaces of the j-dimensional space. The matrices U and Y are called "Hankel" matrices; a sketch for building them follows below.
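A minimal sketch for building such a block Hankel matrix (the function name and shapes are my own):

```python
import numpy as np

def block_hankel(sig, i, j):
    """Stack i block rows of a time series: block row k holds sig[k : k+j].
    sig: (p, d) history; returns an (i*d, j) Hankel matrix."""
    assert i + j - 1 <= len(sig)
    return np.vstack([sig[k:k + j].T for k in range(i)])

u = np.arange(10.0).reshape(-1, 1)   # a scalar input history, p = 10
print(block_hankel(u, 3, 8))
# [[0. 1. 2. 3. 4. 5. 6. 7.]
#  [1. 2. 3. 4. 5. 6. 7. 8.]
#  [2. 3. 4. 5. 6. 7. 8. 9.]]
```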

  13. The important idea to notice is that each row vector in Y can be written as a linear function of the row vectors in X and the row vectors in U. Therefore, the row vectors in Y live in the subspace defined by the row vectors of U and X.

  14. In these equations, we know the matrices Y and U, but we do not know any of the other matrices. We see that the row vectors of Y live in the subspace defined by the row vectors of U and the row vectors of X. Now what we will do is find a subspace that is perpendicular to the subspace defined by U. We will then project Y onto this subspace. By doing so, we will get rid of the influence of U on Y, and keep only the influence of X.

  15. We compute O = Y Pi_Uperp, where Y and Pi_Uperp are known but the factorization of O into the unknown matrices (C, CA, CA^2, ... times X) is not. We just did an amazing thing: we computed a matrix O whose row vectors live in the subspace defined by the hidden-state matrix X. In fact, each row vector of O is related to the hidden states by a constant matrix. For example, we see that if A were a scalar, all we would have to do to recover it is to find the ratio of two adjacent rows of O.
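A numpy sketch of this projection step (names are my own; it assumes the rows of U are linearly independent so that U U' is invertible):

```python
import numpy as np

def perp_project(Y, U):
    """O = Y (I - U'(U U')^-1 U): remove from the rows of Y their
    components in the row space of U, leaving only the part due to X."""
    j = U.shape[1]
    Pi_Uperp = np.eye(j) - U.T @ np.linalg.solve(U @ U.T, U)
    return Y @ Pi_Uperp
```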

  16. We notice that the matrix O is composed of row vectors that are linearly dependent: each is a linear combination of the rows of X. Therefore, the rank of matrix O is equal to the rank of X. (Rank is simply the dimensionality of the subspace spanned by the row vectors of the matrix.) The rank of X is simply the dimension of the vector x. The important observation is that by examining the rank of matrix O, we arrive at the dimensionality of the hidden state vector x. Although we do not know the values of the hidden states, we now know the dimensions of the vector x. We now perform a singular value decomposition of O, keeping the n nonzero singular values: O = P S V'. Each column of V' is nx1, where n is the dimension of vector x. The dimensions of matrix S are nxn. The dimensions of each block of P are the same as those of matrix C, which are the same as those of CA, CA^2, etc.
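In code, the rank test might look like this (the tolerance and names are my own choices):

```python
import numpy as np

def state_dimension(O, tol=1e-8):
    """The rank of O equals the dimension n of the hidden state x:
    count the singular values that are not (numerically) zero."""
    s = np.linalg.svd(O, compute_uv=False)
    return int(np.sum(s > tol * s[0]))
```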

  17. Our state estimate is Xhat = S V' (keeping only the n nonzero singular values). Our state estimate is proportional to the actual state via an arbitrary, unknown but constant invertible square matrix T: Xhat = T X. Because adjacent columns of Xhat are the states at adjacent time steps, we can estimate A by regressing the matrix V without its first column onto the matrix V without its last column.
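A sketch of these two steps (names are my own; the shift regression as written neglects the input term, e.g. u = 0, otherwise u must enter the regression as in the summary below):

```python
import numpy as np

def estimate_states(O, n):
    """Keep the n largest SVD components: Xhat = S_n V_n' equals the
    true state history X up to an unknown invertible transform T."""
    _, s, Vt = np.linalg.svd(O)
    return np.diag(s[:n]) @ Vt[:n, :]       # (n, j) state estimates

def estimate_A(Xhat):
    """Least squares on the one-step shift: Xhat without its first
    column = Ahat @ (Xhat without its last column)."""
    return Xhat[:, 1:] @ np.linalg.pinv(Xhat[:, :-1])
```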

  18. Summary. We believe that the data was generated from the linear system x(k+1) = A x(k) + B u(k), y(k) = C x(k) + D u(k). We are given the inputs and outputs u(k), y(k) for k = 1, ..., p.
  - Pick i to be an integer greater than the estimated size of the hidden states. Pick j = p - 2i + 1.
  - Form the block Hankel matrices U and Y.
  - Compute the projection O = Y Pi_Uperp.
  - Compute the singular value decomposition of O.
  - Estimate the hidden states.
  - With the estimated hidden states, find an estimate of the model parameters (see the sketch below).
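Putting the whole recipe together, here is a compact sketch (all names are my own; it is a direct transcription of the steps above and glosses over details that van Overschee and De Moor treat carefully, so treat it as illustrative rather than a production implementation):

```python
import numpy as np

def subspace_id(u, y, i, n=None, tol=1e-8):
    """Simplified subspace identification. u: (p, m), y: (p, o).
    Returns parameter estimates equal to the true A, B, C, D only up
    to an unknown invertible state transform T."""
    p = len(u)
    j = p - 2 * i + 1
    U = np.vstack([u[k:k + j].T for k in range(i)])   # (i*m, j) Hankel
    Y = np.vstack([y[k:k + j].T for k in range(i)])   # (i*o, j) Hankel

    # Project Y onto the space perpendicular to the rows of U.
    O = Y @ (np.eye(j) - U.T @ np.linalg.solve(U @ U.T, U))

    # The rank of O gives the state dimension; keep that many components.
    _, s, Vt = np.linalg.svd(O)
    if n is None:
        n = int(np.sum(s > tol * s[0]))
    Xhat = np.diag(s[:n]) @ Vt[:n, :]                 # estimated states

    # Least squares for [A B; C D] from (x(k+1), y(k)) vs. (x(k), u(k)).
    lhs = np.vstack([Xhat[:, 1:], y[:j - 1].T])
    rhs = np.vstack([Xhat[:, :-1], u[:j - 1].T])
    theta = lhs @ np.linalg.pinv(rhs)
    return theta[:n, :n], theta[:n, n:], theta[n:, :n], theta[n:, n:]
```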

  19. Observations on the subspace method Our state estimate is proportional to the actual state via the unknown but constant matrix T. Because matrix T is unknown, our estimate of model parameters will not be equal to the actual parameters that produced the data. What we have done is recover another system that is equivalent to the original system. Our system will produce the same output as the original system when it is given the same input.

  20. Recovering the exact parameters. Suppose we know that matrix A is diagonal. In general, Ahat will not be diagonal. To make it diagonal, let us begin by finding the eigenvalues of Ahat and the corresponding eigenvectors, writing Ahat = E L E^-1, where L is the diagonal matrix of eigenvalues. We also had Ahat = T A T^-1. Eigenvalues are unchanged by a similarity transform, so Ahat and A have the same eigenvalues, and a diagonal matrix has its eigenvalues on its diagonal. So if A is known to be diagonal, then it must be equal to the matrix L, the eigenvalues of Ahat.
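In code (the numerical Ahat here is a made-up example):

```python
import numpy as np

Ahat = np.array([[1.1, -0.3],
                 [0.2,  0.6]])      # hypothetical estimate, Ahat = T A T^-1
evals, evecs = np.linalg.eig(Ahat)  # Ahat @ evecs = evecs @ np.diag(evals)

# Eigenvalues survive the similarity transform, so if the true A is
# diagonal, its diagonal entries are exactly these eigenvalues.
print(np.diag(evals))               # recovered A (up to eigenvalue ordering)
```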

  21. Examples
