Press "Enter" to skip to content

A Distributed Algorithm That Finds Almost Best Possible Estimate Under Non-Vanishing and Time-Varying Measurement Noise

  • Published in: IEEE Control Systems Letters (L-CSS)
  • Authors: Jin Gyu Lee and Hyungbo Shim
  • DOI: 10.1109/LCSYS.2019.2923475
  • Abstract: In this letter, we review an existing distributed least-squares solver and share some new insights on it. Then, by the observation that an estimation of a constant vector under output noise can be translated into finding the least-squares solution, we present an algorithm for distributed estimation of the state of linear time-invariant systems under measurement noise. The proposed algorithm consists of a network of local observers, where each of them utilizes local measurements and information transmitted from the neighbors. It is proven that even under non-vanishing and time-varying measurement noise, we could obtain an almost best possible estimate with arbitrary precision. Some discussions regarding the plug-and-play operation are also given.

This work is a continued effort of CDSL on the distributed estimation problem, providing a method to handle non-vanishing and time-varying measurement noise. The main idea of this paper is the distributed least-squares solver. For more technical details, we refer the reader to the paper.

A distributed least-squares solver

We first introduce the distributed least-squares solver for the linear equation Ax = b, developed by G. Shi and B. D. O. Anderson, in which the individual dynamics of agent i is given by

$$\dot{x}_i(t) = -A_i^T(A_ix_i(t) - b_i) + \gamma\sum_{j \in \mathcal{N}_i}(x_j(t) - x_i(t)),$$

where \gamma > 0 is the coupling gain, and \mathcal{N}_i is a subset of \mathcal{N} := \{1, 2, \dots, N\} whose elements are the indices of the agents that send their information to agent i. Each agent i uses only its local data A_i and b_i in order to solve the global problem Ax = b, where A = {\rm col}(A_1, \dots, A_N) and b = {\rm col}(b_1, \dots, b_N).
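As an illustration, the following minimal sketch simulates this solver on an undirected ring of four agents with randomly generated data (all values here are illustrative assumptions, not from the paper). Each agent integrates its own dynamics using only (A_i, b_i) and its neighbors' states, and all agents end up near the least-squares solution of the stacked problem.

```python
# Minimal sketch of the Shi-Anderson distributed least-squares solver.
# Network, data, and gain below are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp

rng = np.random.default_rng(0)
N, n, m = 4, 3, 2                       # agents, unknowns, rows per agent
A_list = [rng.standard_normal((m, n)) for _ in range(N)]
b_list = [rng.standard_normal(m) for _ in range(N)]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}   # undirected ring
gamma = 50.0                            # large coupling gain

def dynamics(t, x):
    x = x.reshape(N, n)
    dx = np.empty_like(x)
    for i in range(N):
        grad = A_list[i].T @ (A_list[i] @ x[i] - b_list[i])  # local gradient
        coup = sum(x[j] - x[i] for j in neighbors[i])        # diffusive coupling
        dx[i] = -grad + gamma * coup
    return dx.ravel()

sol = solve_ivp(dynamics, (0.0, 20.0), np.zeros(N * n))
x_final = sol.y[:, -1].reshape(N, n)

A, b = np.vstack(A_list), np.concatenate(b_list)
print("agent estimates:\n", x_final)
print("least-squares solution:", np.linalg.lstsq(A, b, rcond=None)[0])
```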

Another insight into this distributed least-squares solver comes from the so-called blended dynamics approach. In this approach, it has been shown that the behavior of the heterogeneous multi-agent system

$$\dot{x}_i(t) = f_i(t,x_i(t)) + \gamma \sum_{j \in \mathcal{N}_i} (x_j(t) - x_i(t)), \quad i \in \mathcal{N},$$

with large coupling gain \gamma, is approximated by the behavior of the blended dynamics defined by

$$\dot{\hat{x}}(t) = \frac{1}{N}\sum_{i=1}^N f_i(t,\hat{x}(t)), \qquad \hat x(0) = \frac{1}{N} \sum_{i=1}^N x_i(0)$$

when the blended dynamics has a certain type of stability, such as asymptotic stability with respect to a compact set. In our case, the blended dynamics of the distributed least-squares solver is the gradient descent for the minimization of \|Ax - b\|^2, given by

$$\dot{\hat{x}}(t) = -\frac{1}{N}A^T(A\hat{x}(t) - b),$$

which is a linear system. Since its solution converges exponentially to the least-squares solution (A^TA)^{-1}A^Tb, the solution of the distributed least-squares solver is expected to converge to a neighborhood of the least-squares solution.
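To see where this blended dynamics comes from, sum the agent dynamics over i. When the communication is bidirectional, each coupling term x_j(t) - x_i(t) is cancelled by its mirror term x_i(t) - x_j(t), so that
$$\frac{1}{N}\sum_{i=1}^N \dot{x}_i(t) = -\frac{1}{N}\sum_{i=1}^N A_i^T(A_ix_i(t) - b_i) \approx -\frac{1}{N}A^T(A\hat{x}(t) - b),$$
where the approximation replaces each x_i(t) by the synchronized trajectory \hat{x}(t) enforced by the large coupling gain, and uses \sum_{i=1}^N A_i^TA_i = A^TA and \sum_{i=1}^N A_i^Tb_i = A^Tb.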

The least-squares solver is a best possible estimator

Consider for a moment the estimation of a constant vector \chi from a corrupted measurement b = A\chi + o, where A is a known output matrix and o is an unknown constant noise. In this setting, any estimator \mathsf{E} can be represented as a mapping from the pair (A, b) to its estimate \mathsf{E}(A, b). Then, a fundamental limitation arises that applies to any possible estimator, as follows.

For any estimator \mathsf{E}, there exists a pair (\chi,o) such that the estimate \mathsf{E}(A, b) has an estimation error larger than or equal to |o|/\sigma_1(A), i.e., $$|\mathsf{E}(A, b) - \chi| = |\mathsf{E}(A, A\chi + o) - \chi| \ge |o|/\sigma_1(A),$$
where \sigma_1(A) denotes the smallest singular value of A.

According to this fundamental limitation, let us call \mathsf{E} a best possible estimator if there is no pair (\chi, o) such that
$$|\mathsf{E}(A, A\chi + o) - \chi| > |o|/\sigma_1(A).$$
In fact, the mapping \mathsf{E}_{\text{lss}}, which maps the pair (A, b) to the least-squares solution of Ax = b, is an example of a best possible estimator, because we have $$\mathsf{E}_{\text{lss}}(A, A\chi + o) = \chi + (A^TA)^{-1}A^To,$$
from which we get, for any \chi and o,
$$|\mathsf{E}_{\text{lss}}(A, A\chi + o) - \chi| = |(A^TA)^{-1}A^To| \le |o|/\sigma_1(A),$$
where the inequality follows because the largest singular value of (A^TA)^{-1}A^T is 1/\sigma_1(A).
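As a sanity check, this bound is easy to verify numerically; the following sketch uses illustrative random data (the sizes and the instance are assumptions, not from the paper).

```python
# Numerical check that the least-squares estimate attains the bound |o|/sigma_1(A).
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((6, 3))                 # full column rank, generically
chi = rng.standard_normal(3)                    # true constant vector
o = rng.standard_normal(6)                      # unknown constant noise
b = A @ chi + o

chi_hat = np.linalg.lstsq(A, b, rcond=None)[0]  # E_lss(A, b)
sigma_1 = np.linalg.svd(A, compute_uv=False).min()  # smallest singular value

print("error |E_lss(A,b) - chi|:", np.linalg.norm(chi_hat - chi))
print("bound |o|/sigma_1(A):    ", np.linalg.norm(o) / sigma_1)
```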

A distributed algorithm that finds almost best possible estimate

Based on the foregoing discussion, we now propose a distributed state estimator for the linear system \dot{\chi}(t) = S\chi(t) and y_i(t) = G_i\chi(t) + n_i(t). For the time being, suppose that information about the state vector \chi(t) is available in the form of the vectors b_i(t) = A_i\chi(t) + o_i(t), i = 1, \dots, N, where the stacked matrix A = {\rm col}(A_1, \dots, A_N) has full column rank. Here we note that the vector \chi(t) to be estimated is time-varying, and so the distributed least-squares solver may not efficiently track its variation. Therefore, inspired by the internal model principle, our idea is to embed the model of the state \chi(t) into the distributed least-squares solver as
$$\dot{\chi}_i(t) = S\chi_i(t) -\kappa A_i^T(A_i \chi_i(t) - b_i(t)) + \kappa\gamma\sum_{j \in \mathcal{N}_i}(\chi_j(t) - \chi_i(t)).$$
The algorithm achieves an almost best possible estimate of \chi(t) with sufficiently large gains \kappa and \gamma.
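As a rough illustration, the following sketch simulates this estimator on a two-agent network observing a harmonic oscillator; S, the matrices A_i, the noise, the graph, and the gains are all illustrative assumptions, and b_i(t) = A_i\chi(t) + o_i(t) is fed to each agent directly, as assumed above.

```python
# Minimal sketch of the proposed distributed estimator (illustrative setup).
import numpy as np
from scipy.integrate import solve_ivp

S = np.array([[0.0, 1.0], [-1.0, 0.0]])                     # dchi/dt = S chi
A_list = [np.array([[1.0, 0.0]]), np.array([[0.0, 1.0]])]   # A = col(A_1, A_2) = I
neighbors = {0: [1], 1: [0]}
kappa, gamma = 20.0, 5.0                                    # large gains
N, n = 2, 2

def o(i, t):
    return 0.05 * np.sin(3.0 * t + i)   # non-vanishing, time-varying noise

def dynamics(t, z):
    chi, X = z[:n], z[n:].reshape(N, n)                     # plant and estimates
    dX = np.empty_like(X)
    for i in range(N):
        b_i = A_list[i] @ chi + o(i, t)                     # noisy information
        innov = A_list[i].T @ (A_list[i] @ X[i] - b_i)
        coup = sum(X[j] - X[i] for j in neighbors[i])
        dX[i] = S @ X[i] - kappa * innov + kappa * gamma * coup
    return np.concatenate([S @ chi, dX.ravel()])

z0 = np.concatenate([np.array([1.0, 0.0]), np.zeros(N * n)])
sol = solve_ivp(dynamics, (0.0, 30.0), z0, max_step=0.01)
chi_T = sol.y[:n, -1]
X_T = sol.y[n:, -1].reshape(N, n)
print("estimation errors:", [float(np.linalg.norm(x - chi_T)) for x in X_T])
```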

Here, the notion of a best possible estimate is extended to cover non-vanishing and time-varying but bounded output noise o(t) in the sense of limit supremum, i.e., the estimate \hat{\chi}(t) generated by the knowledge of the output matrix A and the output A\chi(t) + o(t) is called a best possible estimate if we have
$$\limsup_{t \to \infty} |\hat{\chi}(t) - \chi(t)| \le \frac{1}{\sigma_1(A)} \limsup_{t \to \infty} |o(t)|.$$

1. Utilization of local partial state observer

Now, let us consider the system \dot{\chi}(t) = S\chi(t) with the measurement output y_i(t) = G_i\chi(t) + n_i(t). Then, the information vector b_i(t) can be generated by a local partial state observer that estimates the detectable portion of the state \chi(t) from the actual measurement y_i(t).

In particular, by performing the detectability decomposition for each pair (G_i, S), we obtain an orthogonal matrix \begin{bmatrix} A_i^T & B_i^T\end{bmatrix} such that
$$\begin{bmatrix} A_i \\ B_i \end{bmatrix} S \begin{bmatrix}A_i^T & B_i^T \end{bmatrix} = \begin{bmatrix} \bar{S}_i^{11} & 0 \\ \bar{S}_i^{21} & \bar{S}_i^{22} \end{bmatrix}, \quad G_i\begin{bmatrix} A_i^T & B_i^T \end{bmatrix} = \begin{bmatrix} \bar{G}_i & 0 \end{bmatrix},$$
and that (\bar{G}_i, \bar{S}_i^{11}) is detectable. Then, one can always find a local partial state observer of the form
$$\dot{b}_i(t) = P_i(b_i(t), y_i(t)),$$
which provides an estimate of A_i\chi(t) with the estimation error o_i(t) := b_i(t) - A_i\chi(t). Note that the detectability of the pair (G, S), where G = {\rm col}(G_1, \dots, G_N), ensures that the matrix A has full column rank.
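For a concrete picture of the decomposition, the sketch below (with illustrative matrices) builds the observability matrix of a single pair (G_i, S) and extracts A_i and B_i from its SVD. In general, B_i should span only the undetectable (unstable and unobservable) subspace; in this example the single unobservable mode is unstable, so the unobservable and undetectable subspaces coincide and the SVD recovers the decomposition.

```python
# Sketch of the detectability decomposition for one pair (G_i, S).
# Illustrative matrices; the unobservable mode (eigenvalue +1) is unstable,
# so the null space of the observability matrix is the undetectable subspace.
import numpy as np

S = np.diag([-1.0, -2.0, 1.0])
G_i = np.array([[1.0, 1.0, 0.0]])
n = S.shape[0]

# Observability matrix col(G_i, G_i S, ..., G_i S^{n-1}).
O = np.vstack([G_i @ np.linalg.matrix_power(S, k) for k in range(n)])
_, s, Vt = np.linalg.svd(O)
r = int(np.sum(s > 1e-9))               # dimension of the observable subspace
A_i, B_i = Vt[:r], Vt[r:]               # orthonormal rows; [A_i; B_i] orthogonal

T = np.vstack([A_i, B_i])
S_bar, G_bar = T @ S @ T.T, G_i @ T.T
print("top-right block of S_bar (should be ~0):\n", S_bar[:r, r:])
print("G_i [A_i^T  B_i^T] (last columns ~0):\n", G_bar)
```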

An example of a local partial state observer is given by
$$\dot{b}_i(t) = \bar{S}_i^{11}b_i(t) - \bar{L}_i(\bar{G}_i b_i(t) - y_i(t)),$$ where \bar{L}_i is any matrix such that \bar{S}_i^{11} - \bar{L}_i\bar{G}_i is Hurwitz, which exists because (\bar{G}_i, \bar{S}_i^{11}) is detectable. For this case, the estimation error o_i satisfies $$\dot{o}_i(t) = (\bar{S}_i^{11} - \bar{L}_i\bar{G}_i)o_i(t) + \bar{L}_in_i(t) =: \bar{S}_i^H o_i(t) + \bar{L}_in_i(t),$$ and thus,
$$o_i(t) = e^{\bar{S}_i^{H}t}o_i(0) + \int_0^t e^{\bar{S}_i^{H}(t-\tau)}\bar{L}_in_i(\tau) d\tau,$$
which shows that o_i(t) is bounded whenever n_i(t) is bounded; in addition, the stacked errors o(t) = {\rm col}(o_1(t), \dots, o_N(t)) and noises n(t) = {\rm col}(n_1(t), \dots, n_N(t)) satisfy
$$\limsup_{t \to \infty} |o(t)| \le \sqrt{\sum_{i=1}^N \left|(\bar{S}_i^H)^{-1}\bar{L}_i\right|^2}\,\limsup_{t \to \infty} |n(t)|.$$
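To make this construction concrete, the following sketch (with an illustrative detectable pair and assumed observer poles) computes \bar{L}_i by pole placement on the dual pair and checks that, under a constant noise n_i, the error settles at -(\bar{S}_i^H)^{-1}\bar{L}_i n_i, which is where the matrix (\bar{S}_i^H)^{-1}\bar{L}_i in the bound above enters.

```python
# Sketch of the example observer: pick L_bar so S_bar11 - L_bar G_bar is
# Hurwitz, then check the steady-state error under constant noise.
# Matrices and pole locations are illustrative assumptions.
import numpy as np
from scipy.integrate import solve_ivp
from scipy.signal import place_poles

S_bar11 = np.array([[0.0, 1.0], [0.0, 0.0]])
G_bar = np.array([[1.0, 0.0]])          # (G_bar, S_bar11) observable

# Observer gain via the dual problem: eig(S_bar11 - L_bar G_bar) = {-2, -3}.
L_bar = place_poles(S_bar11.T, G_bar.T, [-2.0, -3.0]).gain_matrix.T
S_H = S_bar11 - L_bar @ G_bar           # Hurwitz by construction

n_const = np.array([0.1])               # constant, non-vanishing noise
sol = solve_ivp(lambda t, o: S_H @ o + L_bar @ n_const,
                (0.0, 20.0), np.array([1.0, -1.0]))
o_inf = sol.y[:, -1]                    # error after the transient has decayed

print("steady-state error:          ", o_inf)
print("-(S_H)^{-1} L_bar n_i:       ", -np.linalg.solve(S_H, L_bar @ n_const))
print("bound |S_H^{-1} L_bar| |n_i|:",
      np.linalg.norm(np.linalg.inv(S_H) @ L_bar, 2) * np.linalg.norm(n_const))
```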
