Computationally efficient link prediction in a variety of social networks Academic Article uri icon

abstract

  • Online social networking sites have become increasingly popular over the last few years. As a result, new interdisciplinary research directions have emerged in which social network analysis methods are applied to networks containing hundreds of millions of users. Unfortunately, links between individuals may be missing either due to an imperfect acquirement process or because they are not yet reflected in the online network (i.e., friends in the real world did not form a virtual connection). The primary bottleneck in link prediction techniques is extracting the structural features required for classifying links. In this article, we propose a set of simple, easy-to-compute structural features that can be analyzed to identify missing links. We show that by using simple structural features, a machine learning classifier can successfully identify missing links, even when applied to a predicament of classifying links between individuals with at least one common friend. We also present a method for calculating the amount of data needed in order to build more accurate classifiers. The new Friends measure and Same community features we developed are shown to be good predictors for missing links. An evaluation experiment was performed on ten large social networks datasets: Academia.edu, DBLP, Facebook, Flickr, Flixster, Google+, Gowalla, TheMarker, Twitter, and YouTube. Our methods can provide social network site operators with the capability of helping users to find known, offline contacts and to discover new friends online. They may also be used for exposing hidden links in online social networks.

publication date

  • January 1, 2013