SemiCCA: Efficient Semi-supervised Learning of Canonical Correlations


Canonical correlation analysis (CCA) is a powerful tool for analyzing multi-dimensional paired data. However, CCA tends to perform poorly when the number of paired samples is limited, which is often the case in practice. To cope with this problem, we propose a semi-supervised variant of CCA, named SemiCCA, that incorporates additional unpaired samples to mitigate overfitting. The advantages of the proposed method over previous methods are its computational efficiency and its intuitive operation: it smoothly bridges the generalized eigenvalue problems of CCA and principal component analysis (PCA), so its solution can be computed efficiently by solving a single eigenvalue problem, as in the original CCA. © 2013 Information Processing Society of Japan.
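
The bridging idea can be illustrated with a minimal sketch. The abstract does not state the exact formulation, so everything below is an assumption made for illustration: a hypothetical trade-off parameter `beta` in [0, 1] linearly blends the two generalized eigenvalue problems, paired covariances are centered with paired-sample means, all-sample covariances with all-sample means, and a small ridge term `reg` keeps the right-hand matrix positive definite. The function name `semi_cca_sketch` and its arguments are hypothetical as well.

```python
import numpy as np


def _cov(A, B):
    """Cross-covariance of A and B (rows are samples), normalized by N."""
    Ac = A - A.mean(axis=0)
    Bc = B - B.mean(axis=0)
    return Ac.T @ Bc / A.shape[0]


def semi_cca_sketch(Xp, Yp, Xu, Yu, beta=0.5, reg=1e-8):
    """Illustrative blend of the CCA and PCA generalized eigenproblems.

    Xp, Yp : paired samples, shapes (n, dx) and (n, dy)
    Xu, Yu : additional unpaired samples, shapes (mx, dx) and (my, dy)
    beta   : assumed trade-off; 1 gives CCA on the paired samples,
             0 gives PCA on all samples of each view
    Returns eigenvalues (descending) and the stacked weights [wx; wy].
    """
    dx, dy = Xp.shape[1], Yp.shape[1]
    Xall, Yall = np.vstack([Xp, Xu]), np.vstack([Yp, Yu])

    # CCA side: covariances estimated from the paired samples only.
    Sxy = _cov(Xp, Yp)
    B_cca = np.block([[np.zeros((dx, dx)), Sxy],
                      [Sxy.T, np.zeros((dy, dy))]])
    C_cca = np.block([[_cov(Xp, Xp), np.zeros((dx, dy))],
                      [np.zeros((dy, dx)), _cov(Yp, Yp)]])

    # PCA side: per-view covariances estimated from all samples.
    B_pca = np.block([[_cov(Xall, Xall), np.zeros((dx, dy))],
                      [np.zeros((dy, dx)), _cov(Yall, Yall)]])
    C_pca = np.eye(dx + dy)

    # Blend both sides, then solve the single problem B w = lambda C w.
    B = beta * B_cca + (1.0 - beta) * B_pca
    C = beta * C_cca + (1.0 - beta) * C_pca + reg * np.eye(dx + dy)

    # Reduce to a standard symmetric problem via Cholesky, C = L L^T.
    L = np.linalg.cholesky(C)
    A = np.linalg.solve(L, B)            # L^{-1} B
    M = np.linalg.solve(L, A.T).T        # L^{-1} B L^{-T}
    M = (M + M.T) / 2.0                  # remove round-off asymmetry
    vals, V = np.linalg.eigh(M)          # ascending order
    order = np.argsort(vals)[::-1]
    W = np.linalg.solve(L.T, V[:, order])  # map back: w = L^{-T} v
    return vals[order], W
```

In this sketch the cost is that of one symmetric eigendecomposition of size `dx + dy`, the same order as for plain CCA, which is the efficiency property the abstract emphasizes. The first `dx` rows of each returned column would be read as the weights for the first view and the remaining `dy` rows as the weights for the second view.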

IPSJ Transactions on Mathematical Modeling and its Applications (TOM)