Multilevel diffusion-based and spectral graph methods for clustering and anomaly detection
PhD: Università della Svizzera italiana
English
Graph partitioning, clustering, and anomaly detection are fundamental tasks in applications involving interacting or interconnected entities naturally modeled as large graphs. In many such settings, labeled data are scarce, delayed, or unreliable, thereby increasing the importance of unsupervised methods. However, diffusion and spectral methods, while effective, often become impractical on large-scale graphs because they require the computation of multiple orthonormal eigenvectors. This thesis makes three principal contributions in graph partitioning and unsupervised multilevel learning on graphs. First, we develop GraphLab.jl, an open-source Julia framework for research and teaching in graph partitioning. It provides coordinate, inertial, and spectral bisection, random-sphere separators, space-filling-curve orderings, and nested dissection, together with benchmarking and visualization utilities. Second, we introduce Multilevel Diffusion Clustering (MDC), a multilevel spectral method for multiway clustering on undirected graphs. MDC computes a diffusion-Laplacian embedding only on a coarse graph and refines the induced partition during uncoarsening through diffusion-weighted kernel k-means, limiting intensive computation of multiple orthonormal eigenvectors to a smaller, coarsest graph representation. Third, we propose Multilevel Anomaly Detection (MAD), a framework for directed graphs. MAD uses feature-derived anomaly scores to guide selective coarsening and mitigate severe class imbalance. At the coarse level, a spectral analysis on a feature-based similarity graph yields anomaly indicators, which are then propagated and refined during uncoarsening to recover fine-scale anomalies. Experiments on synthetic benchmarks and real-world graphs illustrate the broad applicability of the proposed multilevel pipelines, with applications including scientific computing and financial transaction networks.
-
Collections
-
-
Language
-
-
Classification
-
Computer science and technology
-
License
-
-
Open access status
-
green
-
Identifiers
-
-
Persistent URL
-
https://n2t.net/ark:/12658/srd1335226