Pseudo-Spherical Contrastive Divergence

Published in The 35th Conference on Neural Information Processing Systems (NeurIPS 2021), 2021

Lantao Yu, Jiaming Song, Yang Song, Stefano Ermon. The 35th Conference on Neural Information Processing Systems. NeurIPS 2021.


Energy-based models (EBMs) offer a flexible way to parametrize probability distributions. However, because their partition function is intractable, they are typically trained via contrastive divergence for maximum likelihood estimation. In this paper, we propose pseudo-spherical contrastive divergence (PS-CD) to generalize maximum likelihood learning of EBMs. PS-CD is derived from the maximization of a family of strictly proper homogeneous scoring rules, which avoids computing the intractable partition function and provides a generalized family of learning objectives that includes contrastive divergence as a special case. Moreover, PS-CD allows us to flexibly choose among various learning objectives for training EBMs without additional computational cost or variational minimax optimization. Theoretical analysis of the proposed method and experiments on both synthetic data and commonly used image datasets demonstrate the effectiveness of PS-CD and its superiority over maximum likelihood and f-EBMs. Based on a set of recently proposed indicative generative model evaluation metrics, we also provide an analysis of the modeling tradeoffs of different objectives in the PS-CD family on image generation tasks, justifying its modeling flexibility.
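
The way a homogeneous scoring rule sidesteps the partition function can be illustrated with a small sketch. It assumes the model is q(x) proportional to exp(-E(x)) and takes one standard form of the pseudo-spherical objective with parameter gamma > 0, namely (1/gamma) log E_p[q^gamma] - (1/(gamma+1)) log of the integral of q^(gamma+1). Under that assumption the normalizer cancels, and the gradient reduces to a difference of two self-normalized averages of the energy gradient, weighted by exp(-gamma E): one over data, one over model samples. Setting gamma to 0 makes the weights uniform and gives the usual contrastive divergence gradient. The toy energy E(x) = theta * x^2 / 2, the function names, and the use of exact Gaussian sampling in place of MCMC are all illustrative assumptions, not the paper's implementation.

```python
import math
import random


def softmax_weights(energies, gamma):
    """Self-normalized weights proportional to exp(-gamma * E(x))."""
    logits = [-gamma * e for e in energies]
    m = max(logits)  # subtract the max for numerical stability
    unnorm = [math.exp(v - m) for v in logits]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def ps_cd_gradient(data, model_samples, theta, gamma):
    """Estimate the ascent direction in theta for the toy energy
    E(x) = theta * x^2 / 2 (hypothetical example, dE/dtheta = x^2 / 2).

    Returns (weighted model average of dE/dtheta) minus
    (weighted data average of dE/dtheta); gamma = 0 gives plain CD.
    """
    w_data = softmax_weights([0.5 * theta * x * x for x in data], gamma)
    w_model = softmax_weights([0.5 * theta * x * x for x in model_samples], gamma)
    pos = sum(w * 0.5 * x * x for w, x in zip(w_data, data))
    neg = sum(w * 0.5 * x * x for w, x in zip(w_model, model_samples))
    return neg - pos


# Toy usage: data come from N(0, 1), i.e. the "true" theta is 1,
# while the current model has theta = 2 (variance 1 / theta).
random.seed(0)
theta = 2.0
data = [random.gauss(0.0, 1.0) for _ in range(20000)]
model_samples = [random.gauss(0.0, 1.0 / math.sqrt(theta)) for _ in range(20000)]
for gamma in (0.0, 0.5, 1.0):
    print(gamma, ps_cd_gradient(data, model_samples, theta, gamma))
```

In this toy setting every choice of gamma should push theta in the same direction (toward the data-generating value), but with a different weighting of the samples, which is the sense in which the family offers different objectives at the same sampling cost.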