Integer Matrix Approximation and Data Mining

Bo Dong, Matthew M. Lin, Haesun Park

Research output: Contribution to journal › Article

Abstract

Integer datasets frequently appear in many applications in science and engineering. To analyze these datasets, we consider an integer matrix approximation technique that can preserve the original dataset characteristics. Because integers are discrete in nature, to the best of our knowledge, no previously proposed technique developed for real numbers can be successfully applied. In this study, we first conduct a thorough review of current algorithms that can solve integer least squares problems, and then we develop an alternative least square method based on an integer least squares estimation to obtain the integer approximation of the integer matrices. We discuss numerical applications for the approximation of randomly generated integer matrices as well as studies of association rule mining, cluster analysis, and pattern extraction. Our computed results suggest that our proposed method can calculate a more accurate solution for discrete datasets than other existing methods.
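As a rough illustration of the kind of problem the abstract describes, the sketch below approximates an integer matrix A by a product UV of two integer factors, updating one factor at a time. Everything here is an assumption made for illustration and is not the authors' algorithm: the low-rank formulation itself, the rank k, the hypothetical names round_ls and integer_als, and above all the use of plain rounding of the real least squares solution in place of a proper integer least squares solver.

```python
import numpy as np


def round_ls(B, A):
    """Solve min_X ||A - B X||_F over the reals, then round X to integers.

    Rounding is a crude stand-in for a true integer least squares solver.
    """
    X, *_ = np.linalg.lstsq(B.astype(float), A.astype(float), rcond=None)
    return np.rint(X).astype(np.int64)


def integer_als(A, k, iters=20, seed=0):
    """Alternate rounded least squares updates of integer factors U (m x k), V (k x n)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    U = rng.integers(-2, 3, size=(m, k))  # small random integer start (assumption)
    V = round_ls(U, A)
    best_err, best_U, best_V = np.linalg.norm(A - U @ V), U, V
    for _ in range(iters):
        U = round_ls(V.T, A.T).T  # update U with V fixed
        V = round_ls(U, A)        # update V with U fixed
        err = np.linalg.norm(A - U @ V)
        # Rounding can increase the error, so keep the best pair seen so far.
        if err < best_err:
            best_err, best_U, best_V = err, U.copy(), V.copy()
    return best_U, best_V, best_err


if __name__ == "__main__":
    A = np.array([[2, 4, 6], [1, 2, 3], [3, 6, 9]])
    U, V, err = integer_als(A, k=1)
    print(U.ravel(), V.ravel(), err)
```

Because rounding does not guarantee a decrease in the residual at each step, the sketch tracks the best factor pair rather than returning the last one. The result depends on the random starting point, so trying several seeds is a reasonable precaution.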

Original language: English
Pages (from-to): 198-224
Number of pages: 27
Journal: Journal of Scientific Computing
Volume: 75
Issue number: 1
DOI: 10.1007/s10915-017-0531-7
Publication status: Published - 2018 Apr 1

Fingerprint

Matrix Approximation
Integer Matrix
Data mining
Integer
Association rules
Cluster analysis
Least Squares Estimation
Association Rule Mining
Least Squares Problem
Approximation
Least Square Method
Engineering
Calculate
Alternatives

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Numerical Analysis
  • Engineering (all)
  • Computational Theory and Mathematics
  • Computational Mathematics
  • Applied Mathematics

Cite this

Dong, Bo ; Lin, Matthew M. ; Park, Haesun. / Integer Matrix Approximation and Data Mining. In: Journal of Scientific Computing. 2018 ; Vol. 75, No. 1. pp. 198-224.
@article{56b0eeb89cef4cbfb3522ff80def2cb4,
title = "Integer Matrix Approximation and Data Mining",
abstract = "Integer datasets frequently appear in many applications in science and engineering. To analyze these datasets, we consider an integer matrix approximation technique that can preserve the original dataset characteristics. Because integers are discrete in nature, to the best of our knowledge, no previously proposed technique developed for real numbers can be successfully applied. In this study, we first conduct a thorough review of current algorithms that can solve integer least squares problems, and then we develop an alternative least square method based on an integer least squares estimation to obtain the integer approximation of the integer matrices. We discuss numerical applications for the approximation of randomly generated integer matrices as well as studies of association rule mining, cluster analysis, and pattern extraction. Our computed results suggest that our proposed method can calculate a more accurate solution for discrete datasets than other existing methods.",
author = "Dong, Bo and Lin, {Matthew M.} and Park, Haesun",
year = "2018",
month = "4",
day = "1",
doi = "10.1007/s10915-017-0531-7",
language = "English",
volume = "75",
pages = "198--224",
journal = "Journal of Scientific Computing",
issn = "0885-7474",
publisher = "Springer New York",
number = "1",

}


TY - JOUR

T1 - Integer Matrix Approximation and Data Mining

AU - Dong, Bo

AU - Lin, Matthew M.

AU - Park, Haesun

PY - 2018/4/1

Y1 - 2018/4/1

N2 - Integer datasets frequently appear in many applications in science and engineering. To analyze these datasets, we consider an integer matrix approximation technique that can preserve the original dataset characteristics. Because integers are discrete in nature, to the best of our knowledge, no previously proposed technique developed for real numbers can be successfully applied. In this study, we first conduct a thorough review of current algorithms that can solve integer least squares problems, and then we develop an alternative least square method based on an integer least squares estimation to obtain the integer approximation of the integer matrices. We discuss numerical applications for the approximation of randomly generated integer matrices as well as studies of association rule mining, cluster analysis, and pattern extraction. Our computed results suggest that our proposed method can calculate a more accurate solution for discrete datasets than other existing methods.

UR - http://www.scopus.com/inward/record.url?scp=85028974202&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85028974202&partnerID=8YFLogxK

U2 - 10.1007/s10915-017-0531-7

DO - 10.1007/s10915-017-0531-7

M3 - Article

AN - SCOPUS:85028974202

VL - 75

SP - 198

EP - 224

JO - Journal of Scientific Computing

JF - Journal of Scientific Computing

SN - 0885-7474

IS - 1

ER -