Incremental learning with accuracy prediction of social and individual properties from mobile-phone data Conference Paper


  • As truly ubiquitous wearable computers, mobile phones are quickly becoming the primary source for social, behavioral, and environmental sensing and data collection. Today's smart phones are equipped with a growing number of sensors and accessible data types that enable the collection of dozens of signals regarding the phone, its user, and their environment. A great deal of research effort in academia and industry is put into mining this data for higher-level sense-making, such as understanding user context, inferring social networks, learning individual features, and so on. In many cases this analysis work is the result of exploratory forays and trial and error. Adding to the challenge, the devices themselves are limited platforms, so data collection campaigns must be carefully designed to collect the signals at the appropriate frequency without exhausting the device's limited battery and processing power. Currently, however, there is no structured methodology for the design of mobile data collection and analysis initiatives. In this work we investigate the properties of learning and inference of real-world data collected via mobile phones over time. In particular, we analyze how the ability to predict individual parameters and social links is incrementally enhanced with the accumulation of additional data. To do so we use the Friends and Family dataset, which contains rich data signals gathered from the smart phones of 140 adult members of an MIT-based young-family residential community for over a year, and which is one of the most comprehensive mobile phone datasets gathered in academia to date. We develop several models for predicting social and individual properties from sensed mobile phone data over time, including detection of life partners, ethnicity, and whether a person is a student or not.
Finally, we propose a method for predicting the maximal learning accuracy possible for the learning task at hand, based on an initial set of measurements. This has various practical implications, such as better design of mobile data collection campaigns and evaluation of planned analysis strategies.

publication date

  • January 1, 2012