A tale of tails, ties and dependence for high-dimensional random matrices
Time: Mon 2025-12-15, 15:15
Location: D3, Lindstedtsvägen 5
Speaker: Johannes Heiny
Abstract: The dramatic growth in computing power and the improvement of data collection devices have made it necessary to study and interpret sometimes overwhelming amounts of data in an efficient and tractable way. Random matrix theory (RMT) has emerged as a powerful framework for analyzing high-dimensional data across a wide range of modern applications. As datasets grow in size and complexity, classical statistical assumptions often break down, while RMT provides asymptotic laws and spectral insights that remain stable in the high-dimensional regime. These tools enable robust estimation, anomaly detection, dimensionality reduction, and the separation of signal from noise in complex systems. By modeling data-dependent matrices (such as covariance, correlation, kernel, and adjacency matrices), RMT offers principled approaches for understanding their eigenvalue distributions and fluctuations. Consequently, RMT has become indispensable in fields including machine learning, finance, network science, wireless communications, and genomics, where large-scale structure and uncertainty must be navigated effectively.
In this talk, I will provide an overview of the challenges of dependence estimation for high-dimensional data, highlighting distinct phenomena for light- and heavy-tailed marginal distributions. We will see that self-normalization can often stabilize the eigenvalue distribution of large matrices.
Moreover, we provide distribution-free results for multivariate empirical versions of Kendall's Tau and Spearman's Rho in a setting where the dimension grows at most proportionally to the sample size. Although rank-based measures are known to be well suited for discrete and heavy-tailed data, previous work in the field has focused mostly on the continuous and light-tailed case. We close this gap by imposing only mild assumptions and allowing for general types of distributions. Interestingly, our analysis reveals that a non-trivial adjustment of classical Kendall's Tau is needed to obtain a pivotal limiting distribution in the presence of tied data.