Computing Gaussian mixture models with EM using side-information Academic Article


  • Abstract Estimation of Gaussian mixture models is an efficient and popular technique for clustering and density estimation. An EM procedure is widely used to estimate the model parameters. In this paper we show how side information in the form of equivalence constraints can be incorporated into this procedure, leading to improved clustering results. Equivalence constraints are prior knowledge concerning pairs of data points, indicating if the points arise from the same source (positive constraint) or from different sources (negative constraint). Such constraints can be gathered automatically in some learning problems, and are a natural form of supervision in others. We present a closed form EM procedure for handling positive constraints, and a Generalized EM procedure using a Markov net for the incorporation of negative constraints. Using publicly available data sets we demonstrate …

publication date

  • January 1, 2003