For data clustering tasks where the clusters are expected to form complicated geometric patterns, it is often advantageous to consider non-linear dimension reduction techniques, such as spectral clustering. In spectral clustering, information on the similarity between data points is used to construct a graph, together with a discrete diffusion operator on this graph known as the graph Laplacian. Graph Laplacians encode the geometric information contained in the data via the eigenfunctions associated with their small eigenvalues. These spectral properties provide powerful tools for data clustering and data classification tasks. We will give an introduction to graph-based spectral clustering and classification. When a large number of data points is available, one may instead consider continuum limits of the graph Laplacian, both to gain insight and, potentially, as the basis for numerical methods. We summarize recent insights into the properties of these algorithms, obtained by investigating their corresponding formulations in the large data limit. These results open doors to the design of new algorithms, both in specific application domains such as image segmentation and for large data regimes more generally.
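As a rough illustration of the construction described above, the following minimal sketch builds a similarity graph on two concentric rings, forms the unnormalized graph Laplacian L = D - W, and splits the points by the sign of the eigenvector belonging to the second-smallest eigenvalue. The Gaussian similarity kernel, the bandwidth `sigma`, the toy data set, and the function names `two_rings` and `spectral_bipartition` are all assumptions made for this example; they are not taken from the talk.

```python
import numpy as np


def two_rings(n_inner=30, n_outer=60):
    """Toy data (an assumption for illustration): two concentric rings."""
    t_in = np.linspace(0.0, 2.0 * np.pi, n_inner, endpoint=False)
    t_out = np.linspace(0.0, 2.0 * np.pi, n_outer, endpoint=False)
    inner = np.column_stack([np.cos(t_in), np.sin(t_in)])          # radius 1
    outer = 3.0 * np.column_stack([np.cos(t_out), np.sin(t_out)])  # radius 3
    points = np.vstack([inner, outer])
    truth = np.r_[np.zeros(n_inner, dtype=int), np.ones(n_outer, dtype=int)]
    return points, truth


def spectral_bipartition(points, sigma=0.5):
    """Split points in two using the unnormalized graph Laplacian.

    The Gaussian kernel and the bandwidth sigma are illustrative choices.
    """
    # Pairwise squared Euclidean distances between all points.
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    # Similarity (weight) matrix of the graph; no self-loops.
    W = np.exp(-sq_dist / (2.0 * sigma**2))
    np.fill_diagonal(W, 0.0)
    # Unnormalized graph Laplacian: degree matrix minus weight matrix.
    L = np.diag(W.sum(axis=1)) - W
    # L is symmetric, so eigh applies; eigenvalues come back in ascending order.
    eigvals, eigvecs = np.linalg.eigh(L)
    # Eigenvector of the second-smallest eigenvalue; its sign pattern
    # gives a two-way partition of the graph.
    second = eigvecs[:, 1]
    return (second > 0).astype(int), eigvals


points, truth = two_rings()
labels, eigvals = spectral_bipartition(points)
# Cluster labels are only defined up to a swap of 0 and 1.
agreement = max(np.mean(labels == truth), np.mean(labels == 1 - truth))
print("agreement with ring membership:", agreement)
print("three smallest eigenvalues:", eigvals[:3])
```

The two rings are not linearly separable, which is why a method such as k-means on the raw coordinates would typically struggle here, whereas the small eigenvalues of the Laplacian reflect how weakly the two rings are connected in the graph. For more than two clusters, a common variant clusters the rows of the first few eigenvectors instead of thresholding a single one.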
Prof. Franca Hoffmann is a Bonn Junior Fellow at the University of Bonn (Germany) and AIMS-Carnegie Research Chair in Data Science at Quantum Leap Africa, AIMS Rwanda. After completing her PhD at the Cambridge Centre for Analysis at the University of Cambridge (UK) in 2017, she held the position of von Karman instructor at the California Institute of Technology (US) from 2017 to 2020.
Franca Hoffmann's research is focused on the interface between applied mathematics and data analysis, with particular interest in the development of novel tools for data analysis and mathematical approaches to machine learning. This involves graph-based methods for unsupervised and semi-supervised learning, with a focus on data clustering and classification, graph Laplacians and their continuum counterparts, spectral analysis, uncertainty quantification, and consistency analysis.