Department of Mathematics and Systems Analysis

Research

Mathematical Statistics and Data Science

The research group in Mathematical Statistics and Data Science studies advanced methods and models for analysing and representing data. We employ probability theory and stochastic processes to rigorously model uncertainty and randomness, and abstract and linear algebra to understand the structure of statistical models and the relationships between their parameters.

News and events

Join stochastics@list.aalto.fi to stay updated on probability and statistics at Aalto University.
Join stochastics-finland@list.aalto.fi for announcements on probability and statistics in Finland.


Members

Pauliina Ilmonen
Professor
Multivariate extreme values, functional data analysis, cancer epidemiology
Lasse Leskelä
Professor
Mathematical statistics, probability theory, network analysis
Kaie Kubjas
Associate Professor
Algebraic statistics
Vanni Noferini
Associate Professor
Network analysis, random matrix theory

Jukka Kohonen
University Lecturer
Statistics, combinatorics
Pekka Pere
University Lecturer
Statistics
Jonas Tölle
Senior University Lecturer
Stochastic processes, probability theory


Publications

Individual publication records, with links to full articles where available, are listed on the Aalto research page, which also gives an overview of research output for the Mathematical Statistics and Data Science area.

Selected publications

Teaching

We teach courses in probability and statistics at all levels. Some of the offered courses are eligible as a basis for an SHV degree in insurance mathematics. Doctoral education in probability and statistics is coordinated by the Finnish Doctoral Education Network in Stochastics and Statistics (FDNSS).

Seminars

Upcoming seminars

  • 27.4. 15:15  MSc Ian Välimaa (Aalto): Mid-term review: Consistent clustering in tensor block models – Y405
  • 5.5. 10:15  BSc Kerkko Konola: Latent efficient price recovery in ETFs: A simulation study (MSc presentation) – M3 (M234)

    Recovering the latent efficient price from limit order book data is a fundamental challenge in econometrics, since the price that price discovery aims to reveal is never directly observed. This thesis studies whether statistical methods can reliably recover the latent price, and under what conditions one approach outperforms another. The analysis is carried out in a Monte Carlo simulation framework in which the true efficient price is known by construction, allowing a direct comparison of estimation accuracy. The data generating process incorporates the empirically relevant features of market microstructure. A driftless Brownian motion drives the latent price, order flow follows a persistent AR(1) process with linear price impact, and microstructure noise is state-dependent and heteroskedastic. Two estimators are compared across three microstructure regimes: a misspecified linear Kalman filter and a nonlinear XGBoost model. The Kalman filter performs better when distortions are mild, reacting to directional price changes roughly 0.3 seconds faster, whereas XGBoost achieves up to 48% lower mean squared error in the high-noise regime by capturing nonlinear order flow patterns the filter cannot represent. The framework is then extended to a multi-ETF setting in which three funds with different liquidity profiles track the same underlying index. Cross-asset information improves XGBoost uniformly across all ETFs, while the Kalman filter responds asymmetrically, improving for the less liquid ETFs but deteriorating for the most liquid one. The two estimators converge to near-identical accuracy once cross-asset information is available. Taken together, the results suggest that efficient price recoverability is regime- and specification-dependent, with implications for estimator selection and the design of cross-ETF statistical arbitrage strategies.
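
As a rough illustration of the simulation setup described in the abstract above, the following Python sketch generates a latent random-walk price, AR(1) order flow, and a noisy observed price, then applies a simple local-level Kalman filter. It is not the thesis's code: the function names `simulate` and `kalman_local_level`, all parameter values, the placement of the linear impact term in the observed price, and the form of the state-dependent noise are assumptions made for illustration. The XGBoost estimator and the multi-ETF extension are not covered.

```python
import numpy as np


def simulate(n=2000, sigma_p=0.01, phi=0.9, sigma_f=1.0,
             impact=0.005, noise_base=0.02, noise_slope=0.02, seed=0):
    """Simulate a latent price, order flow, and observed price.

    All parameter values are illustrative assumptions, not taken
    from the thesis.
    """
    rng = np.random.default_rng(seed)

    # Latent efficient price: driftless random walk, i.e. a
    # discretised driftless Brownian motion.
    price = 100.0 + np.cumsum(sigma_p * rng.standard_normal(n))

    # Order flow: persistent AR(1) process.
    shocks = sigma_f * rng.standard_normal(n)
    flow = np.zeros(n)
    for t in range(1, n):
        flow[t] = phi * flow[t - 1] + shocks[t]

    # State-dependent, heteroskedastic noise: the standard deviation
    # grows with the size of the order flow (assumed functional form).
    noise_std = noise_base + noise_slope * np.abs(flow)

    # Observed price: latent price plus a linear order-flow impact
    # plus microstructure noise (assumed placement of the impact term).
    observed = price + impact * flow + noise_std * rng.standard_normal(n)
    return price, flow, observed


def kalman_local_level(y, q, r):
    """Scalar Kalman filter for a random-walk state observed with noise.

    q is the assumed state innovation variance and r the assumed
    constant observation noise variance.
    """
    n = len(y)
    est = np.empty(n)
    x, p = y[0], r  # initialise at the first observation
    est[0] = x
    for t in range(1, n):
        p_pred = p + q                 # predict: variance grows by q
        gain = p_pred / (p_pred + r)   # Kalman gain
        x = x + gain * (y[t] - x)      # update with the new observation
        p = (1.0 - gain) * p_pred
        est[t] = x
    return est


if __name__ == "__main__":
    price, flow, observed = simulate()
    est = kalman_local_level(observed, q=1e-4, r=4e-3)
    print("MSE of raw observations:", np.mean((observed - price) ** 2))
    print("MSE of filtered estimate:", np.mean((est - price) ** 2))
```

Because the filter assumes a constant noise variance and ignores order flow, it is misspecified relative to the simulated process, in the same spirit as the linear filter mentioned in the abstract. Since the true price is known by construction, the estimation error can be computed directly.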


Projects and networks

  • FiRST – Finnish Centre of Excellence in Randomness and Structures, 2022–2029
  • NordicMathCovid, 2020–2022
  • COSTNET – European Cooperation for Statistics of Network Data Science, 2016–2020
  • Past projects...
           

Page content by: webmaster-math [at] list [dot] aalto [dot] fi