Fast extraction of top-k distances from graph data is a primitive of paramount importance in the fields of data mining, network analytics and machine learning, where ranked distances are exploited for several purposes (e.g. link prediction or network classification). While investigation on computational methods to address this retrieval task for regularly sized, static inputs has been extensive, much less is known when managed graphs are massive, i.e. having millions of vertices/edges, and time-evolving, i.e. when their structure can grow over time, a scenario that introduces a number of scalability and effectiveness issues otherwise not arising. Since, nowadays, most real-world applications exploiting top-k distances have to handle inherently dynamic and rapidly growing graphs, in this paper we present the first dynamic indexing scheme that supports very fast queries on top-k distances when graphs are massive and incrementally time- evolving. We assess the scalability and effectiveness of our method through extensive experimentation on both real-world and artificial graph datasets
Top-k Distance Queries on Large Time-Evolving Graphs
Andrea D'ascenzo
;
2023-01-01
Abstract
Fast extraction of top-k distances from graph data is a primitive of paramount importance in the fields of data mining, network analytics and machine learning, where ranked distances are exploited for several purposes (e.g. link prediction or network classification). While investigation on computational methods to address this retrieval task for regularly sized, static inputs has been extensive, much less is known when managed graphs are massive, i.e. having millions of vertices/edges, and time-evolving, i.e. when their structure can grow over time, a scenario that introduces a number of scalability and effectiveness issues otherwise not arising. Since, nowadays, most real-world applications exploiting top-k distances have to handle inherently dynamic and rapidly growing graphs, in this paper we present the first dynamic indexing scheme that supports very fast queries on top-k distances when graphs are massive and incrementally time- evolving. We assess the scalability and effectiveness of our method through extensive experimentation on both real-world and artificial graph datasetsI documenti in IRIS sono protetti da copyright e tutti i diritti sono riservati, salvo diversa indicazione.


