The well-known Eckart-Young theorem asserts t hat the truncated singular value decomposition, obtained by discarding all but the first k largest singular values and their corresponding left and right singular vectors, is the best rank-k approximation in the sense of least squares to the original matrix. In other words, singular values alone serve well as unambiguous indicators of proximity to the data matrix. Unlike continuous data, the decomposition of a matrix with discrete data which is subject to the requirement that its approximations have the same type of data is a harder task and it is even harder when it comes to ranking these approximations. This work generalizes the notion of singular value decomposition via a sequence of variational formulations to discrete-type data. The process itself can guarantee neither the orthogonality, as is expected of discrete data, nor the ordering of best approximations. However, at the end of the undertaking, it is shown that a quantity analogous to the singular values and a truncated low rank factorization for discrete data analogous to the truncated singular value decomposition for continuous data are attainable. Our empirical study shows the applicability of our method to cluster analysis and pattern discovery using real-life data.
All Science Journal Classification (ASJC) codes