Why Self-Supervised Encoders Want to Be Normal

arXiv:2604.27743v1 Announce Type: cross
Abstract: We develop a geometric and information-theoretic framework for encoder-decoder learning built on the Information Bottleneck (IB) principle. Recasting IB as

arXiv:2604.24499v1 Announce Type: cross
Abstract: Information theory is a powerful framework for capturing aspects of dynamical systems with multiple degrees of freedom. Mathematically, the dynamics can be represented as a continuous curve $\mathcal{C}$ on a suitable hyperplane in flat space, and the Fisher information provides the norm of an infinitesimal displacement along this curve. In many applications, however, we do not have direct access to $\mathcal{C}$. Instead, we have to reconstruct it from a time series of measurements (obtained as samples of size $n$), which are represented by an ordered set of points $\widehat{\mathcal{C}}$ on the same hyperplane. In this work, we calculate the bias of the Fisher information for large $n$, which provides a quantitative estimate of how accurately the dynamics of a system can be reconstructed from a given set of sampled data. Based on this result, we show that clustering the degrees of freedom reduces the bias and thus improves the accuracy with which the resulting system can be described using the same data. Inspired by a recent proposal for such a clustering, we provide a quantitative assessment of the loss of information, which allows one to estimate how much information about the dynamics of a system can reliably be extracted from a given set of data. We illustrate our findings in the case of a simple compartmental model. Although the latter is inspired by epidemiology, the results of this work are applicable to very general dynamical models with multiple degrees of freedom.
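As a rough illustration of the clustering argument (not the paper's own estimator), the sketch below simulates a toy SIR-style compartmental model, compares the statistical length of the sampled curve $\widehat{\mathcal{C}}$ with that of the true curve $\mathcal{C}$, and shows how the sampling-induced bias changes when compartments are merged. The model parameters, the discrete Fisher-Rao length used as the length measure, and the sample size $n$ are all assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def sir_curve(steps=50, beta=0.4, gamma=0.1):
    """Toy SIR-style compartmental model: each time point is a probability
    vector (S, I, R) on the simplex, i.e. a point on a hyperplane."""
    s, i, r = 0.99, 0.01, 0.0
    curve = []
    for _ in range(steps):
        curve.append([s, i, r])
        ds = -beta * s * i
        di = beta * s * i - gamma * i
        s, i, r = s + ds, i + di, r + gamma * i
    return np.array(curve)

def fisher_length(curve):
    """Discrete statistical length of a curve of distributions, using the
    Fisher-Rao distance on the simplex between successive points:
    d(p, q) = 2 * arccos( sum_i sqrt(p_i * q_i) )."""
    bc = np.sqrt(curve[:-1] * curve[1:]).sum(axis=1)
    return (2.0 * np.arccos(np.clip(bc, -1.0, 1.0))).sum()

def sampled_curve(curve, n):
    """Replace each exact distribution by a multinomial estimate from n samples."""
    out = []
    for p in curve:
        p = p / p.sum()  # guard against floating-point drift
        out.append(rng.multinomial(n, p) / n)
    return np.array(out)

def clustered(curve, groups):
    """Coarse-grain by summing probabilities within each group of compartments."""
    return np.stack([curve[:, g].sum(axis=1) for g in groups], axis=1)

true_curve = sir_curve()
n, reps = 200, 500

# Bias of the statistical length induced by sampling noise (Monte Carlo average).
bias_full = np.mean(
    [fisher_length(sampled_curve(true_curve, n)) for _ in range(reps)]
) - fisher_length(true_curve)

# Merge I and R into a single compartment: fewer degrees of freedom, smaller
# bias, at the price of losing information about the dynamics inside the cluster.
groups = [[0], [1, 2]]
bias_clust = np.mean(
    [fisher_length(clustered(sampled_curve(true_curve, n), groups)) for _ in range(reps)]
) - fisher_length(clustered(true_curve, groups))

print(f"sampling bias, 3 compartments: {bias_full:.4f}")
print(f"sampling bias, 2 compartments: {bias_clust:.4f}")
```

In this toy setting the bias of the estimated length is systematically positive, since sampling noise makes successive empirical distributions differ more than the true ones, and it shrinks when degrees of freedom are clustered, which is the qualitative effect the abstract describes.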

