Mathematical Statistics and Data Science
The research group in Mathematical Statistics and Data Science studies advanced methods and models for analysing and representing data. We employ probability theory and stochastic processes to rigorously model uncertainty and randomness, and abstract and linear algebra to understand the structure of statistical models and the relationships between their parameters.
News and events
Join stochastics@list.aalto.fi to stay updated on probability and statistics at Aalto University.
Join stochastics-finland@list.aalto.fi for announcements on probability and statistics in Finland.
Members
- Pauliina Ilmonen, Professor: multivariate extreme values, functional data analysis, cancer epidemiology
- Kaie Kubjas, Associate Professor: algebraic statistics
- Lasse Leskelä, Associate Professor: mathematical statistics, network analysis, probability theory
- Vanni Noferini, Associate Professor: network analysis, random matrix theory
Publications
Individual publication records and links to full articles, when available, can be found on the Aalto research page, where you can also find an overview of research output for the Mathematical Statistics and Data Science area.
Selected publications
- F Arrigo, DJ Higham, V Noferini, R Wood. Weighted enumeration of nonbacktracking walks on weighted graphs. SIAM Journal on Matrix Analysis and Applications 2024.
- M Bloznelis, L Leskelä. Clustering and percolation on superpositions of Bernoulli random graphs. Random Structures & Algorithms 2023.
- M Hinz, JM Tölle, L Viitasaari. Variability of paths and differential equations with BV-coefficients. Annales de l’Institut Henri Poincaré - Probabilités et Statistiques 2023.
- A Belyaeva, K Kubjas, LJ Sun, C Uhler. Identifying 3D genome organization in diploid organisms via Euclidean distance geometry. SIAM Journal on Mathematics of Data Science 2022.
- J Alho, E Arjas, J Karvanen, L Leskelä, E Läärä, P Pere. Tilastotieteen sanasto. Suomen Tilastoseura 2023.
Teaching
We teach courses in probability and statistics at all levels. Some of the offered courses are eligible as a basis for an SHV degree in insurance mathematics. Doctoral education in probability and statistics is coordinated by the Finnish Doctoral Education Network in Stochastics and Statistics (FDNSS).
Seminars
Upcoming seminars
- 9.12. 11:15 Meeri Palokangas (Aalto University): A quasi-likelihood-based gradient boosting machine for humanitarian demand prediction (MSc presentation) – M203
Predicting food aid demand requires balancing theoretical soundness with computational practicality while handling messy, zero-inflated humanitarian data. This thesis develops a natural gradient boosting approach for probabilistic demand forecasting by implementing the Extended Quasi-Likelihood (EQL) function within NGBoost, enabling native support for Tweedie-parametrized Compound Poisson-Gamma distributions without the computational barriers of exact likelihood estimation. The method generates full predictive distributions, which allow supply planners to quantify uncertainty and make risk-aware decisions in prepositioning and predictive procurement. The approach is evaluated on handover data from the World Food Programme (WFP) from April 2024 to March 2025, along with backtesting across six years. Results show conservative calibration errors of for central quantiles and relatively strong performance at extreme values, where humanitarian logistics decisions are most consequential. Although point prediction accuracy does not outperform the benchmark LGBM approach, the superior distributional calibration directly addresses WFP's stated need for quantified uncertainty in supply chain planning, which justifies the trade-off. The work bridges quasi-likelihood statistical theory with practical machine learning and makes training feasible on standard hardware, whereas full-likelihood approximation methods for these kinds of distributions are too heavy for repeated computation. However, the theoretical regularity (in the Riemannian sense) of the quasi-likelihood statistical manifolds, which is proposed in this work, remains unproven, leaving open important theoretical questions about why the approach works empirically.
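As background for the abstract above, the following minimal Python sketch illustrates why Tweedie compound Poisson-gamma models suit zero-inflated demand data. It is not taken from the thesis and does not reproduce its NGBoost implementation. It uses the standard textbook mapping from a Tweedie mean `mu`, dispersion `phi` and power `1 < p < 2` to a Poisson rate and gamma jump sizes, together with the Tweedie unit deviance, which is the building block of quasi-likelihood methods. The function names and the parameter values in the usage note are hypothetical.

```python
import math


def cpg_parameters(mu, phi, p):
    """Map Tweedie (mean mu, dispersion phi, power 1 < p < 2) to
    compound Poisson-gamma parameters: Poisson rate, gamma shape, gamma scale."""
    if not 1.0 < p < 2.0:
        raise ValueError("power parameter p must lie strictly between 1 and 2")
    if mu <= 0.0 or phi <= 0.0:
        raise ValueError("mu and phi must be positive")
    lam = mu ** (2.0 - p) / (phi * (2.0 - p))   # expected number of jumps
    shape = (2.0 - p) / (p - 1.0)               # gamma shape of each jump
    scale = phi * (p - 1.0) * mu ** (p - 1.0)   # gamma scale of each jump
    return lam, shape, scale


def prob_zero(mu, phi, p):
    """Probability of an exact zero: no Poisson jumps occur."""
    lam, _, _ = cpg_parameters(mu, phi, p)
    return math.exp(-lam)


def unit_deviance(y, mu, p):
    """Tweedie unit deviance for 1 < p < 2; finite at y = 0."""
    if y < 0.0 or mu <= 0.0:
        raise ValueError("y must be non-negative and mu positive")
    return 2.0 * (
        y ** (2.0 - p) / ((1.0 - p) * (2.0 - p))
        - y * mu ** (1.0 - p) / (1.0 - p)
        + mu ** (2.0 - p) / (2.0 - p)
    )
```

For example, `prob_zero(2.0, 1.5, 1.5)` gives the point mass at zero for one hypothetical parameter choice, and `unit_deviance(0.0, 2.0, 1.5)` shows that the deviance stays finite for zero observations. The exact Tweedie density, by contrast, involves an infinite series when `y > 0`, which is the kind of cost that quasi-likelihood approaches avoid.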
Projects and networks
Page content by: webmaster-math [at] list [dot] aalto [dot] fi