Researchers at the University of Toronto Institute for Aerospace Studies (UTIAS) have taken a significant step towards reliable predictions of complex dynamical systems when the available data are highly uncertain or information is missing. This work could have numerous applications, ranging from predicting the performance of aircraft engines to forecasting changes in global climate or the spread of viruses.

In a recent paper published in *Nature*, Professor **Prasanth B. Nair** (UTIAS) and **Kevin Course** (UTIAS PhD candidate) introduce a new machine learning algorithm that surmounts the real-world challenge of imperfect knowledge about system dynamics. This computer-based mathematical modelling approach is used for problem solving and better decision making in complex systems, where many components interact with each other.

“For the first time, we are able to apply state estimation to problems where we don’t know the governing equations, or the governing equations have a lot of missing terms,” says Course, the first author of the new paper.

“In contrast to standard techniques, which usually require a state estimate to infer the governing equations and vice-versa, our method learns the missing terms in the mathematical model and a state estimate simultaneously.”

State estimation, also known as data assimilation, refers to the process of combining observational data with computer models to estimate the current state of a system. Traditionally it requires strong assumptions about the type of uncertainties that exist in a mathematical model.

“For example, let’s say you have constructed a computer model that predicts the weather, and at the same time, you have access to real-time data from weather stations providing actual temperature readings,” says Nair. “Due to the model’s inherent limitations and simplifications, which are often unavoidable when dealing with complex real-world systems, the model predictions may not match the actual observed temperature you are seeing.

“State estimation combines the model’s prediction with the actual observations to provide a corrected or better-calibrated estimate of the current temperature. It effectively assimilates the data into the model to correct its state.”
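The correction Nair describes can be illustrated with the classical scalar Kalman update, a textbook technique and not the new algorithm. The sketch below assumes a known model, Gaussian errors and a single temperature reading; the function name and all numbers are hypothetical.

```python
def assimilate(forecast, forecast_var, observation, observation_var):
    """Blend a model forecast with one noisy observation (scalar Kalman update)."""
    # A gain near 1 trusts the observation; a gain near 0 trusts the model.
    gain = forecast_var / (forecast_var + observation_var)
    estimate = forecast + gain * (observation - forecast)
    estimate_var = (1.0 - gain) * forecast_var
    return estimate, estimate_var


# Hypothetical numbers: the model forecasts 20.0 degrees with variance 4.0,
# while a weather station reads 22.0 degrees with variance 1.0.
estimate, estimate_var = assimilate(20.0, 4.0, 22.0, 1.0)
print(estimate, estimate_var)
```

Because the station reading is assumed to be the less uncertain of the two, the corrected estimate lands closer to the observation than to the forecast, and its variance is smaller than either input.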

Until now, however, it has been difficult to estimate the underlying state of complex dynamical systems when the governing equations are completely or partially unknown. The new algorithm provides a rigorous statistical framework to address this long-standing problem.

“This problem is akin to deciphering the ‘laws’ that a system obeys without having explicit knowledge about them,” says Nair, whose research group is developing algorithms for mathematical modelling of systems and phenomena that are encountered in various areas of engineering and science.

A byproduct of Course and Nair’s algorithm is that it also helps to characterize missing terms or the entirety of the governing equations, which determine how the values of unknown variables change when one or more of the known variables change.

The main innovation underpinning the work is a reparametrization trick for stochastic variational inference with Markov Gaussian processes that enables an approximate Bayesian approach to solve such problems. This new development allows researchers to deduce the equations that govern the dynamics of complex systems and arrive at a state estimate using indirect and noisy measurements.
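For readers unfamiliar with the term, the generic reparametrization trick can be sketched for a single Gaussian variable. This is only the basic idea, not the paper's version for Markov Gaussian processes: a random sample is rewritten as a deterministic function of the parameters plus independent noise, so gradients can be estimated by simple averaging. The test function and all values below are illustrative assumptions.

```python
import random


def reparam_gradients(mu, sigma, n_samples, rng):
    """Monte Carlo gradients of E[z**2] for z ~ N(mu, sigma**2)."""
    grad_mu = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)   # noise that does not depend on mu or sigma
        z = mu + sigma * eps        # reparametrized sample
        df_dz = 2.0 * z             # derivative of f(z) = z**2
        grad_mu += df_dz            # chain rule: dz/dmu = 1
        grad_sigma += df_dz * eps   # chain rule: dz/dsigma = eps
    return grad_mu / n_samples, grad_sigma / n_samples


grad_mu, grad_sigma = reparam_gradients(1.0, 0.5, 100_000, random.Random(0))
print(grad_mu, grad_sigma)  # exact values are 2*mu = 2.0 and 2*sigma = 1.0
```

Since E[z²] = μ² + σ², the exact gradients are known here, which makes it easy to check that the sampled estimates are close.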

“Our approach is computationally attractive since it leverages stochastic, that is randomly determined, approximations that can be efficiently computed in parallel, and in addition, it does not rely on computationally expensive forward solvers in training,” says Course.

While Course and Nair approached their research from a theoretical viewpoint, they were able to demonstrate practical impact by applying their algorithm to problems ranging from modelling fluid flow to predicting the motion of black holes.

“Our work is relevant to several branches of science, engineering and finance, as researchers from these fields often interact with systems where first-principles models are difficult to construct or existing models are insufficient to explain system behaviour,” says Nair.

“We believe this work will open the door for practitioners in these fields to better intuit the systems they study,” adds Course. “Even in situations where high-fidelity mathematical models are available, this work can be used for probabilistic model calibration and to discover missing physics in existing models.

“We have also been able to successfully use our approach to efficiently train neural stochastic differential equations, which are a type of machine learning model that has shown promising performance for time-series datasets.”
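To give a sense of what such a model looks like, a neural stochastic differential equation replaces the hand-written drift term of an SDE with a small neural network. The sketch below simulates one with the standard Euler-Maruyama scheme. It is not the authors' training method, and the network weights, noise level and step size are hypothetical, untrained values chosen only for illustration.

```python
import math
import random

# Hypothetical, untrained weights for a one-hidden-layer drift network.
W1, B1 = [1.5, -0.7], [0.0, 0.3]
W2 = [-1.0, 0.5]


def drift(x):
    """Tiny neural network f(x) = sum_k W2[k] * tanh(W1[k] * x + B1[k])."""
    return sum(w2 * math.tanh(w1 * x + b1) for w1, b1, w2 in zip(W1, B1, W2))


def simulate(x0, sigma, dt, steps, rng):
    """Euler-Maruyama discretisation of dx = drift(x) dt + sigma dW."""
    path = [x0]
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(dt))  # Brownian increment ~ N(0, dt)
        path.append(path[-1] + drift(path[-1]) * dt + sigma * dw)
    return path


path = simulate(x0=1.0, sigma=0.2, dt=0.01, steps=500, rng=random.Random(0))
print(len(path), path[-1])
```

A step-by-step simulation of this kind is an example of the forward solvers that, according to Course, the new approach does not need during training.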

While the new paper primarily addresses challenges in state estimation and governing equation discovery, the researchers say it provides a general groundwork for robust data-driven techniques in computational science and engineering.

“As an example, our research group is currently using this framework to construct probabilistic reduced-order models of complex systems. We hope to expedite decision-making processes integral to the optimal design, operation and control of real-world systems,” says Nair.

“Additionally, we are also studying how the inference methods stemming from our research may offer deeper statistical insights into stochastic differential equation-based generative models that are now widely used in many artificial intelligence applications.”