Welke, Pascal and Horváth, Tamás and Wrobel, Stefan (2016) Min-Hashing for Probabilistic Frequent Subtree Feature Spaces. In: Lecture Notes in Computer Science, 9956 (1). Springer, Cham, Bari, pp. 67-82. ISBN 9783319463063
Full text not available from this repository.

Abstract
We propose a fast algorithm for approximating graph similarities. For its advantageous semantic and algorithmic properties, we define the similarity between two graphs by the Jaccard-similarity of their images in a binary feature space spanned by the set of frequent subtrees generated for some training dataset. Since the feature space embedding is computationally intractable, we use a probabilistic subtree isomorphism operator based on a small sample of random spanning trees and approximate the Jaccard-similarity by min-hash sketches. The partial order on the feature set defined by subgraph isomorphism allows for a fast calculation of the min-hash sketch, without explicitly performing the feature space embedding. Experimental results on real-world graph datasets show that our technique results in a fast algorithm. Furthermore, the approximated similarities are well-suited for classification and retrieval tasks in large graph datasets.
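
To illustrate the approximation step, here is a minimal Python sketch of generic min-hashing for sets of binary features. It is not the authors' algorithm: it assumes each graph has already been embedded as an explicit set of feature indices, whereas the paper uses the partial order on the frequent subtrees to avoid performing that embedding. The function names, the use of explicit random permutations, and the sketch size `k` are assumptions chosen for illustration.

```python
import random


def make_permutations(num_features, k, seed=0):
    """Draw k random permutations of the feature indices 0..num_features-1."""
    rng = random.Random(seed)
    perms = []
    for _ in range(k):
        perm = list(range(num_features))
        rng.shuffle(perm)  # perm[f] is the rank of feature f
        perms.append(perm)
    return perms


def minhash_sketch(features, permutations):
    """For each permutation, keep the smallest rank among the present features.

    An empty feature set yields None at every position.
    """
    return [min((perm[f] for f in features), default=None)
            for perm in permutations]


def estimate_jaccard(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of agreeing sketch positions."""
    matches = sum(1 for a, b in zip(sketch_a, sketch_b) if a == b)
    return matches / len(sketch_a)


def exact_jaccard(a, b):
    """Exact Jaccard similarity of two sets (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


if __name__ == "__main__":
    # Hypothetical feature sets standing in for the images of two graphs.
    graph_a = set(range(0, 60))
    graph_b = set(range(30, 90))
    perms = make_permutations(num_features=100, k=500)
    approx = estimate_jaccard(minhash_sketch(graph_a, perms),
                              minhash_sketch(graph_b, perms))
    print("exact:", exact_jaccard(graph_a, graph_b), "estimate:", approx)
```

Under a uniformly random permutation, two sets share the same minimum-rank element with probability equal to their Jaccard similarity, so the fraction of agreeing sketch positions is an unbiased estimate whose variance shrinks as `k` grows.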