Metric and Topological Approaches to Network Data Analysis
Abstract: Directed graphs naturally arise when modeling a variety of complex phenomena, ranging from causal dependencies between neural units in the brain to order-preserving representations of general time series data. Because these graphs may themselves be too complex for direct analysis, it is desirable to obtain hierarchical representations that encode the information in such graphs at multiple scales. Placing the path homology theory of digraphs—that was recently developed by Grigor’yan, Lin, Muranov, and Yau—in the framework of persistent homology is one method for obtaining such a multiscale abstraction. However, to justify this method for data analysis, one needs an appropriate stability theory relating perturbations in the input graphs to those in the persistent path homology outputs. While the landscape of such stability theories remains largely unexplored, we will show how techniques from metric geometry can provide one effective stability theory with strong guarantees. By enriching the resulting metric structure on digraphs with probability measures, we will show how to additionally obtain Riemannian structures that enable statistical techniques—i.e. taking Fréchet means, performing principal component analysis, and classifying via support vector machines—on the space of digraphs.