KAUST Research Workshop on Optimization and Big Data
Identifying frequent subgraphs is a core operation for many analytics and machine learning algorithms on graphs. Frequent subgraph mining is computationally expensive rendering many graph analytics algorithms prohibitive for large graphs. Our group developed SmartPSI, a frequent subgraph mining algorithm based on the notion of pivoted subgraph isomorphism. Within SmartPSI we propose two radically different algorithms, called optimistic and pessimistic, each one suitable for different inputs. We also include a classifier based on machine learning, that is trained on-the-fly to decide dynamically which algorithm to execute for each part of the input graph. Finally, we implement an optimizer that generates for each algorithm a low-cost execution plan. Our experimental evaluation with large real graphs reveals that SmartPSI achieves up to 6 times performance improvement compared to the state-of-the-art distributed frequent subgraph mining system.