
Models and Algorithms for Temporal Betweenness Centrality and Dynamic Distributed Data Structures / Cruciani, Antonio. - (2025 Mar 07).

Models and Algorithms for Temporal Betweenness Centrality and Dynamic Distributed Data Structures

CRUCIANI, ANTONIO
2025-03-07

Abstract

Networks are ubiquitous mathematical objects for modeling a wide range of real-world systems, from social networks and communication infrastructures to biological processes such as protein interactions. A distinctive feature of these real-world networks is that they evolve over time, with nodes and edges changing dynamically. This temporal evolution introduces several new and challenging problems, ranging from the choice of the “right” evolving network model to the design and analysis of both centralized and distributed algorithms in many scenarios. In this thesis, we start by considering temporal graphs, where interactions are timestamped. They offer a rich model for capturing the dynamics of evolving networks from a centralized perspective, but they also introduce new computational challenges, since traditional static graph algorithms often cannot be directly applied to temporal data. Designing efficient centralized algorithms for temporal graphs requires dealing with both the structural and temporal dimensions of the network, while ensuring that solutions are not only accurate but also scalable. For instance, temporal centrality metrics, such as temporal betweenness centrality, must account for the ordering and timing of interactions. Compared with the static case, this complicates their computation, but it also yields more insightful results about the importance of the nodes in the network over time. In particular, temporal betweenness assigns to each node a value based on the fraction of optimal temporal paths passing through it. Buß et al. (KDD, 2020) gave algorithms to compute various notions of temporal betweenness centrality, including perhaps the most natural one, shortest temporal betweenness. Their algorithm computes the centrality values of all nodes in time O(n³T²), where n is the size of the network and T is the total number of time steps. For real-world networks, which easily contain tens of thousands of nodes, this complexity becomes prohibitive. Thus, it is reasonable to consider fast approximation algorithms that allow for measuring the relative importance of nodes in very large temporal graphs.

In this thesis, we consider the problem of efficiently computing temporal betweenness rankings and scores. We start by considering proxies (i.e., fast heuristics) for the temporal betweenness ranking that take into account global and local properties of the network. We compare several such proxies on a diverse set of real-world networks. To this end, we define a novel temporal degree notion called the pass-through-degree, which measures the number of pairs of neighbors of a node that are temporally connected through it, and we show that it can be computed in nearly linear time for all nodes. Moreover, we observe that it is surprisingly competitive as a proxy for temporal betweenness centrality. Next, we consider the problem of approximating the temporal betweenness scores of all the nodes in the network. To this end, we develop a novel sampling-based approximation algorithm that computes probabilistically guaranteed, high-quality temporal betweenness estimates (of nodes and temporal edges). The algorithm uses advanced tools from statistical learning theory and combinatorial optimization to estimate the temporal betweenness of all the nodes up to a small absolute error ε with probability at least 1 − δ, where ε, δ ∈ (0, 1).
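To give a concrete flavor of such (ε, δ) guarantees, the sketch below shows the basic Monte Carlo estimator one obtains from Hoeffding's inequality plus a union bound over the nodes. It is only an illustrative baseline under simplified assumptions, not the sharper learning-theoretic machinery used in the thesis, and the path-sampling oracle `sample_internal_nodes` is a hypothetical placeholder for a temporal shortest-path routine.

```python
import math
import random
from collections import defaultdict


def hoeffding_sample_size(eps: float, delta: float, n_nodes: int) -> int:
    """Samples needed so that, by Hoeffding's inequality and a union bound over
    the n_nodes estimates, every estimate is within eps of its expectation with
    probability at least 1 - delta."""
    return math.ceil(math.log(2.0 * n_nodes / delta) / (2.0 * eps ** 2))


def estimate_temporal_betweenness(nodes, sample_internal_nodes, eps, delta, seed=0):
    """Baseline Monte Carlo estimator: for m sampled source/target pairs, add 1/m
    to every node returned by sample_internal_nodes(s, z), a hypothetical oracle
    that samples one optimal temporal s-z path uniformly at random and returns
    its internal nodes (an empty list if no temporal path exists)."""
    rng = random.Random(seed)
    m = hoeffding_sample_size(eps, delta, len(nodes))
    estimate = defaultdict(float)
    for _ in range(m):
        s, z = rng.sample(nodes, 2)          # uniform pair of distinct nodes
        for v in sample_internal_nodes(s, z):
            estimate[v] += 1.0 / m
    return estimate, m


if __name__ == "__main__":
    # With eps = 0.01 and delta = 0.1 on 10,000 nodes, this baseline already
    # needs only about 61,000 sampled pairs, independently of the number of
    # time steps T.
    print(hoeffding_sample_size(0.01, 0.1, 10_000))
```

Even this crude bound illustrates why sampling is attractive: the number of samples depends on ε, δ, and n only, whereas the exact algorithm pays for every node pair and every time step.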
We empirically show that the proposed algorithm achieves tight theoretical guarantees and significantly improves the state of the art in terms of running time, approximation quality, sample size, and allocated memory, enabling very precise approximations on very large temporal graphs.

In the second part of the thesis, we consider dynamic network models from a distributed computing perspective, in which designing algorithms that provably work in highly dynamic environments presents its own set of non-trivial challenges. Networks undergoing frequent changes due to node churn (nodes leaving and joining the network), rapid topology shifts, or adversarial behavior demand decentralized solutions that can adapt in real time. These challenges emerge mostly in peer-to-peer systems, where the network structure is constantly evolving. We study the fundamental problem of performing computations in a distributed system subject to a heavy churn rate (i.e., a high number of nodes joining and leaving the network in each round). We begin by extending the model of Becchetti et al. (SODA, 2020) to obtain dynamic random graph models that evolve forever: in the first model, edges can be faulty, i.e., each edge disappears at each round with some probability; in the second, at every round new nodes join the network according to a stochastic process and each node currently in the network disappears with a certain probability; in the third, we consider a combination of the two models above, in which edges can be faulty and nodes can join and leave the network. We run extensive simulations to measure how long it takes a message starting at a random node to reach all, or almost all, the nodes. The simulations show that, for large ranges of the model parameters, the information spreads very fast, i.e., at a rate compatible with logarithmic growth as a function of the number of nodes in the network. We continue by studying robust and efficient distributed algorithms for building and maintaining distributed data structures in dynamic graphs in which nodes are continuously replaced by an oblivious adversary. We present a novel algorithm that builds and maintains, with high probability, a skip list for poly(n) rounds despite O(n/log n) adversarial churn per round (where n is the “stable” network size). Our algorithm requires a maintenance overhead proportional to the churn rate. In addition, it is scalable in the sense that messages are small and every node sends a limited number of messages in each round. A consequence of our maintenance algorithm is that it opens up opportunities for more general distributed computation in distributed dynamic graph models.
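As a rough illustration of the kind of simulation described above, the following toy experiment floods a message over a simplistic dynamic random graph in which every node requests d random links per round and each resulting edge independently fails with probability p_fault. It is a stand-in written under these simplified assumptions, not the exact evolving models derived from Becchetti et al. (SODA, 2020), and all parameter names are illustrative.

```python
import random


def rounds_to_spread(n=1000, d=3, p_fault=0.2, coverage=1.0, seed=0, max_rounds=10_000):
    """Round-based flooding on a toy dynamic random graph: each round every node
    requests d uniformly random neighbors, each resulting undirected edge
    independently fails with probability p_fault, and the message crosses every
    surviving edge that has exactly one informed endpoint. Returns the first
    round at which at least a `coverage` fraction of the n nodes is informed."""
    rng = random.Random(seed)
    informed = {rng.randrange(n)}            # the message starts at a random node
    for rnd in range(1, max_rounds + 1):
        # Sample this round's edge set, dropping the faulty edges.
        edges = set()
        for u in range(n):
            for _ in range(d):
                v = rng.randrange(n)
                if u != v and rng.random() > p_fault:
                    edges.add((min(u, v), max(u, v)))
        # Push step: informed endpoints inform the other endpoint of each edge.
        newly_informed = set()
        for u, v in edges:
            if (u in informed) != (v in informed):
                newly_informed.add(v if u in informed else u)
        informed |= newly_informed
        if len(informed) >= coverage * n:
            return rnd
    return max_rounds


if __name__ == "__main__":
    for n in (1_000, 4_000, 16_000):
        # On this toy model, rounds to full coverage should grow slowly
        # (roughly logarithmically) with n.
        print(n, rounds_to_spread(n=n))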
Files in this record:
There are no files associated with this record.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.12571/34764