Department of Mathematics and Systems Analysis

Mathematical Statistics and Data Science

The research group in Mathematical Statistics and Data Science studies advanced methods and models for analysing and representing data. We employ probability theory and stochastic processes to rigorously model uncertainty and randomness, and abstract and linear algebra to understand the structure of statistical models and the relationships between their parameters.

News and events

LiBERA Winter School, 4–9 Jan 2027
Lammi Summer School, 25–29 May 2026
Past events...

Join stochastics@list.aalto.fi to stay updated on probability and statistics at Aalto University.
Join stochastics-finland@list.aalto.fi for announcements on probability and statistics in Finland.

Members

	Pauliina Ilmonen, Professor: Multivariate extreme values, functional data analysis, cancer epidemiology
	Lasse Leskelä, Professor: Mathematical statistics, probability theory, network analysis
	Kaie Kubjas, Associate Professor: Algebraic statistics
	Vanni Noferini, Associate Professor: Network analysis, random matrix theory

	Jukka Kohonen, University Lecturer: Statistics, combinatorics
	Pekka Pere, University Lecturer: Statistics
	Jonas Tölle, Senior University Lecturer: Stochastic processes, probability theory

Publications

Individual publication records, with links to full articles when available, can be found on the Aalto research page, which also gives an overview of research output for the Mathematical Statistics and Data Science area.

Selected publications

J Pere, P Ilmonen, L Viitasaari. On extreme quantile region estimation under heavy-tailed elliptical distributions. Journal of Multivariate Analysis 2024.
F Arrigo, DJ Higham, V Noferini, R Wood. Weighted enumeration of nonbacktracking walks on weighted graphs. SIAM Journal on Matrix Analysis and Applications 2024.
M Bloznelis, L Leskelä. Clustering and percolation on superpositions of Bernoulli random graphs. Random Structures & Algorithms 2023.
M Hinz, JM Tölle, L Viitasaari. Variability of paths and differential equations with BV-coefficients. Annales de l’Institut Henri Poincaré - Probabilités et Statistiques 2023.
A Belyaeva, K Kubjas, LJ Sun, C Uhler. Identifying 3D genome organization in diploid organisms via Euclidean distance geometry. SIAM Journal on Mathematics of Data Science 2022.
J Alho, E Arjas, J Karvanen, L Leskelä, E Läärä, P Pere. Tilastotieteen sanasto. Suomen Tilastoseura 2023.

Teaching

We teach courses in probability and statistics at all levels. Some of the offered courses are eligible as a basis for an SHV degree in insurance mathematics. Doctoral education in probability and statistics is coordinated by the Finnish Doctoral Education Network in Stochastics and Statistics (FDNSS).

Seminars

Upcoming seminars

27.4. 15:15 MSc Ian Välimaa (Aalto): Mid-term review: Consistent clustering in tensor block models – Y405
5.5. 10:15 BSc Kerkko Konola: Latent efficient price recovery in ETFs: A simulation study (MSc presentation) – M3 (M234)
Recovering the latent efficient price from limit order book data is a fundamental challenge in econometrics, since the price that price discovery aims to reveal is never directly observed. This thesis studies whether statistical methods can reliably recover the latent price, and under what conditions one approach outperforms another. The analysis is carried out in a Monte Carlo simulation framework in which the true efficient price is known by construction, allowing a direct comparison of estimation accuracy. The data generating process incorporates the empirically relevant features of market microstructure. A driftless Brownian motion drives the latent price, order flow follows a persistent AR(1) process with linear price impact, and microstructure noise is state-dependent and heteroskedastic. Two estimators are compared across three microstructure regimes: a misspecified linear Kalman filter and a nonlinear XGBoost model. The Kalman filter performs better when distortions are mild, reacting to directional price changes roughly 0.3 seconds faster, whereas XGBoost achieves up to 48% lower mean squared error in the high-noise regime by capturing nonlinear order flow patterns the filter cannot represent. The framework is then extended to a multi-ETF setting in which three funds with different liquidity profiles track the same underlying index. Cross-asset information improves XGBoost uniformly across all ETFs, while the Kalman filter responds asymmetrically, improving for the less liquid ETFs but deteriorating for the most liquid one. The two estimators converge to near-identical accuracy once cross-asset information is available. Taken together, the results suggest that efficient price recoverability is regime- and specification-dependent, with implications for estimator selection and the design of cross-ETF statistical arbitrage strategies.

Aalto Stochastics and Statistics Seminar

Projects and networks

FiRST – Finnish Centre of Excellence in Randomness and Structures, 2022–2029
NordicMathCovid, 2020–2022
COSTNET – European Cooperation for Statistics of Network Data Science, 2016–2020
Past projects...

Page content by: webmaster-math [at] list [dot] aalto [dot] fi
