On the optimality of averaging in distributed statistical learning Academic Article

abstract

  • A common approach to statistical learning with Big-data is to randomly split it among m machines and learn the parameter of interest by averaging the m individual estimates. In this paper, focusing on empirical risk minimization, or equivalently M-estimation, we study the statistical error incurred by this strategy. We consider two large-sample settings: First, a classical setting where the number of parameters p is fixed, and the number of samples per machine n → ∞. Second, a high-dimensional regime where both p, n → ∞ with p/n → κ ∈ (0, 1). For both regimes and under suitable assumptions, we present asymptotically exact expressions for this estimation error. In the fixed-p setting, we prove that to leading order averaging is as accurate as the centralized solution. We also derive the second-order error terms, and show that these can be non-negligible, notably for nonlinear models. The high-dimensional setting, in contrast …
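  A minimal sketch of the split-and-average strategy studied in the abstract, assuming ordinary least squares as the M-estimator on synthetic data (the variable names, the choice of estimator, and the use of numpy are illustrative assumptions, not taken from the paper):

    # Split-and-average vs. centralized M-estimation (least-squares sketch).
    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic regression data: N total samples, p parameters, m machines.
    # Fixed-p regime: p stays small while samples per machine n = N/m grows.
    N, p, m = 10_000, 5, 10
    theta_true = rng.standard_normal(p)
    X = rng.standard_normal((N, p))
    y = X @ theta_true + rng.standard_normal(N)

    # Centralized solution: one least-squares fit on all the data.
    theta_central = np.linalg.lstsq(X, y, rcond=None)[0]

    # Distributed solution: split the data among m machines,
    # fit each local M-estimate, then average the m estimates.
    local_estimates = [
        np.linalg.lstsq(Xs, ys, rcond=None)[0]
        for Xs, ys in zip(np.array_split(X, m), np.array_split(y, m))
    ]
    theta_avg = np.mean(local_estimates, axis=0)

    # To leading order (p fixed, n -> infinity) the two errors should match.
    print("centralized error:", np.linalg.norm(theta_central - theta_true))
    print("averaging error:  ", np.linalg.norm(theta_avg - theta_true))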

publication date

  • January 1, 2014

published in