Developing a stroke severity index based on administrative data was feasible using data mining techniques

Sheng Feng Sung, Cheng Yang Hsieh, Yea-Huei Kao, Huey Juan Lin, Chih-Hung Chen, Yu Wei Chen, Ya Han Hu

Research output: Contribution to journal › Article

51 Citations (Scopus)

Abstract

Objectives: Case-mix adjustment is difficult for stroke outcome studies using administrative data. However, relevant prescription, laboratory, procedure, and service claims might be surrogates for stroke severity. This study proposes a method for developing a stroke severity index (SSI) by using administrative data.

Study Design and Setting: We identified 3,577 patients with acute ischemic stroke from a hospital-based registry and analyzed claims data containing a large number of features. Stroke severity was measured using the National Institutes of Health Stroke Scale (NIHSS). We used two data mining methods and conventional multiple linear regression (MLR) to develop prediction models, and compared model performance according to the Pearson correlation coefficient between the SSI and the NIHSS. We validated these models in four independent cohorts by using hospital-based registry data linked to a nationwide administrative database.

Results: We identified seven predictive features and developed three models. The k-nearest neighbor model (correlation coefficient, 0.743; 95% confidence interval: 0.737, 0.749) performed slightly better than the MLR model (0.742; 0.736, 0.747), followed by the regression tree model (0.737; 0.731, 0.742). In the validation cohorts, the correlation coefficients were between 0.677 and 0.725 for all three models.

Conclusion: The claims-based SSI enables adjustment for disease severity in stroke studies using administrative data.
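The evaluation described in the abstract (predict a severity score from claims features, then correlate it with the NIHSS) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the data are synthetic, the seven binary features and their weights are hypothetical stand-ins for the paper's unnamed predictors, the value of k is arbitrary, and the Fisher z-transform is only one common way to obtain a confidence interval for a correlation, not necessarily the one the authors used.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_ci(r, n, z_crit=1.959964):
    """Approximate 95% CI for r via the Fisher z-transform (an assumption here)."""
    z = math.atanh(r)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


def knn_predict(train_X, train_y, query, k=5):
    """k-nearest-neighbor regression: mean target of the k closest training rows."""
    by_distance = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


def make_synthetic(n, seed=0):
    """Synthetic cohort: seven hypothetical binary claims flags and a noisy severity score."""
    rng = random.Random(seed)
    weights = [4.0, 3.0, 2.5, 2.0, 1.5, 1.0, 0.5]  # hypothetical, not from the paper
    X, y = [], []
    for _ in range(n):
        row = [float(rng.random() < 0.3) for _ in weights]
        score = 3.0 + sum(w * v for w, v in zip(weights, row)) + rng.gauss(0, 2.0)
        X.append(row)
        y.append(max(0.0, score))  # severity scores cannot be negative
    return X, y


# Fit on one part of the synthetic cohort, evaluate on the held-out part.
X, y = make_synthetic(400)
train_X, train_y = X[:300], y[:300]
test_X, test_y = X[300:], y[300:]

pred = [knn_predict(train_X, train_y, q, k=10) for q in test_X]
r = pearson_r(pred, test_y)
lo, hi = fisher_ci(r, len(test_y))
print(f"r = {r:.3f}, 95% CI ({lo:.3f}, {hi:.3f})")
```

The same `pearson_r` call could be applied to predictions from a linear regression or a regression tree, which is how the three model families in the abstract are placed on a common scale.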

Original language: English
Pages (from-to): 1292-1300
Number of pages: 9
Journal: Journal of Clinical Epidemiology
Volume: 68
Issue number: 11
DOI: 10.1016/j.jclinepi.2015.01.009
Publication status: Published - 2015 Nov 1

Fingerprint

Data Mining
Stroke
Linear Models
National Institutes of Health (U.S.)
Registries
Risk Adjustment
Prescriptions
Outcome Assessment (Health Care)
Databases
Confidence Intervals

All Science Journal Classification (ASJC) codes

  • Epidemiology

Cite this

Sung, Sheng Feng ; Hsieh, Cheng Yang ; Kao, Yea-Huei ; Lin, Huey Juan ; Chen, Chih-Hung ; Chen, Yu Wei ; Hu, Ya Han. / Developing a stroke severity index based on administrative data was feasible using data mining techniques. In: Journal of Clinical Epidemiology. 2015 ; Vol. 68, No. 11. pp. 1292-1300.
@article{0ab102ddc84c4df7b4fc8cc9aec69ab3,
title = "Developing a stroke severity index based on administrative data was feasible using data mining techniques",
abstract = "Objectives Case-mix adjustment is difficult for stroke outcome studies using administrative data. However, relevant prescription, laboratory, procedure, and service claims might be surrogates for stroke severity. This study proposes a method for developing a stroke severity index (SSI) by using administrative data. Study Design and Setting We identified 3,577 patients with acute ischemic stroke from a hospital-based registry and analyzed claims data with plenty of features. Stroke severity was measured using the National Institutes of Health Stroke Scale (NIHSS). We used two data mining methods and conventional multiple linear regression (MLR) to develop prediction models, comparing the model performance according to the Pearson correlation coefficient between the SSI and the NIHSS. We validated these models in four independent cohorts by using hospital-based registry data linked to a nationwide administrative database. Results We identified seven predictive features and developed three models. The k-nearest neighbor model (correlation coefficient, 0.743; 95{\%} confidence interval: 0.737, 0.749) performed slightly better than the MLR model (0.742; 0.736, 0.747), followed by the regression tree model (0.737; 0.731, 0.742). In the validation cohorts, the correlation coefficients were between 0.677 and 0.725 for all three models. Conclusion The claims-based SSI enables adjusting for disease severity in stroke studies using administrative data.",
author = "Sung, {Sheng Feng} and Hsieh, {Cheng Yang} and Yea-Huei Kao and Lin, {Huey Juan} and Chih-Hung Chen and Chen, {Yu Wei} and Hu, {Ya Han}",
year = "2015",
month = "11",
day = "1",
doi = "10.1016/j.jclinepi.2015.01.009",
language = "English",
volume = "68",
pages = "1292--1300",
journal = "Journal of Clinical Epidemiology",
issn = "0895-4356",
publisher = "Elsevier USA",
number = "11",

}

TY - JOUR
T1 - Developing a stroke severity index based on administrative data was feasible using data mining techniques
AU - Sung, Sheng Feng
AU - Hsieh, Cheng Yang
AU - Kao, Yea-Huei
AU - Lin, Huey Juan
AU - Chen, Chih-Hung
AU - Chen, Yu Wei
AU - Hu, Ya Han
PY - 2015/11/1
Y1 - 2015/11/1
AB - Objectives Case-mix adjustment is difficult for stroke outcome studies using administrative data. However, relevant prescription, laboratory, procedure, and service claims might be surrogates for stroke severity. This study proposes a method for developing a stroke severity index (SSI) by using administrative data. Study Design and Setting We identified 3,577 patients with acute ischemic stroke from a hospital-based registry and analyzed claims data with plenty of features. Stroke severity was measured using the National Institutes of Health Stroke Scale (NIHSS). We used two data mining methods and conventional multiple linear regression (MLR) to develop prediction models, comparing the model performance according to the Pearson correlation coefficient between the SSI and the NIHSS. We validated these models in four independent cohorts by using hospital-based registry data linked to a nationwide administrative database. Results We identified seven predictive features and developed three models. The k-nearest neighbor model (correlation coefficient, 0.743; 95% confidence interval: 0.737, 0.749) performed slightly better than the MLR model (0.742; 0.736, 0.747), followed by the regression tree model (0.737; 0.731, 0.742). In the validation cohorts, the correlation coefficients were between 0.677 and 0.725 for all three models. Conclusion The claims-based SSI enables adjusting for disease severity in stroke studies using administrative data.

UR - http://www.scopus.com/inward/record.url?scp=84945475138&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84945475138&partnerID=8YFLogxK
U2 - 10.1016/j.jclinepi.2015.01.009
DO - 10.1016/j.jclinepi.2015.01.009
M3 - Article
C2 - 25700940
AN - SCOPUS:84945475138
VL - 68
SP - 1292
EP - 1300
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
SN - 0895-4356
IS - 11
ER -