Variable Selection of Travel Demand Models for Paratransit Service

A Data Mining Approach

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

To forecast demand for Americans with Disabilities Act (ADA) travel, many complicated factors need to be considered, including, but not limited to, socioeconomic data and service operational characteristics. In other words, choosing suitable explanatory variables to construct an ADA travel demand model is an important and complex task. Data mining techniques provide a promising way to select the explanatory variables. They work especially well in situations where the number of relevant variables is large and where the interactions among variables or models are not clear. In this study, we applied data mining techniques to select variables and build models. Census data was mined to select candidate variables. We also compared the performance of three types of models: (1) a traditional linear model, (2) a traditional model with variables selected by the classification and regression tree (CART) method, and (3) a traditional model with variables selected by the random forest (RF) method. The results show that the fraction of senior citizens (age > 65), average household size (owner occupied), fraction of African Americans, fraction of Hispanics, annual household income, fraction of males, median age, fraction of households with a family member with a disability, and the total population are the significant variables for modeling ADA travel demand.
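
The abstract does not describe the authors' implementation, so the following is only a minimal sketch of the kind of criterion a CART-style regression tree uses to rank candidate variables: the largest reduction in squared error achievable by a single threshold split. The function names, column names, and numbers are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def sse(values):
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split_gain(x, y):
    """Largest SSE reduction from one threshold split on x.

    This is the split criterion of a regression tree, applied once
    (a decision stump) rather than recursively.
    """
    pairs = sorted(zip(x, y))
    total = sse([t for _, t in pairs])
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold separates equal x values
        left = [t for _, t in pairs[:i]]
        right = [t for _, t in pairs[i:]]
        best = max(best, total - sse(left) - sse(right))
    return best


def rank_variables(columns, y):
    """Order candidate variable names by their best single-split gain."""
    gains = {name: best_split_gain(x, y) for name, x in columns.items()}
    return sorted(gains, key=gains.get, reverse=True)


# Hypothetical toy data: six zones, two candidate census variables.
columns = {
    "frac_senior": [0.10, 0.12, 0.15, 0.20, 0.25, 0.30],
    "median_age": [40, 31, 38, 33, 41, 35],
}
trips = [100, 110, 130, 200, 240, 290]  # made-up ADA trip counts

ranking = rank_variables(columns, trips)
```

A full CART or random forest procedure would apply such splits recursively and, for the forest, average over many bootstrapped trees. The stump above only illustrates the ranking idea before the selected variables are passed to a linear model.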

Original language: English
Title of host publication: CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals
Editors: Xuedong Yan, Yu Zhang, Yafeng Yin
Publisher: American Society of Civil Engineers (ASCE)
Pages: 1568-1579
Number of pages: 12
ISBN (Electronic): 9780784479292
DOIs: https://doi.org/10.1061/9780784479292.144
Publication status: Published - 2015 Jan 1
Event: 15th COTA International Conference of Transportation Professionals: Efficient, Safe, and Green Multimodal Transportation, CICTP 2015 - Beijing, China
Duration: 2015 Jul 24 → 2015 Jul 27

Publication series

Name: CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals

Other

Other: 15th COTA International Conference of Transportation Professionals: Efficient, Safe, and Green Multimodal Transportation, CICTP 2015
Country: China
City: Beijing
Period: 15-07-24 → 15-07-27

Fingerprint

Data mining
travel
disability
demand
act
household size
linear model
household income
family member
census
candidacy
citizen
regression
interaction
performance

All Science Journal Classification (ASJC) codes

  • Transportation

Cite this

Shen, C-W., & Kuo, P. (2015). Variable Selection of Travel Demand Models for Paratransit Service: A Data Mining Approach. In X. Yan, Y. Zhang, & Y. Yin (Eds.), CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals (pp. 1568-1579). (CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals). American Society of Civil Engineers (ASCE). https://doi.org/10.1061/9780784479292.144
Shen, Chung-Wei ; Kuo, Pei-fen. / Variable Selection of Travel Demand Models for Paratransit Service : A Data Mining Approach. CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals. editor / Xuedong Yan ; Yu Zhang ; Yafeng Yin. American Society of Civil Engineers (ASCE), 2015. pp. 1568-1579 (CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals).
@inproceedings{0be278db934a4240941fd83a17ec5007,
title = "Variable Selection of Travel Demand Models for Paratransit Service: A Data Mining Approach",
abstract = "To forecast demand for Americans with Disabilities Act (ADA) travel, many complicated factors need to be considered, including, but not limited to, socioeconomic data and service operational characteristics. In other words, choosing suitable explanatory variables to construct an ADA travel demand model is an important and complex task. Data mining techniques provide a promising way to select the explanatory variables. They work especially well in situations where the number of relevant variables is large and where the interactions among variables or models are not clear. In this study, we applied data mining techniques to select variables and build models. Census data was mined to select candidate variables. We also compared the performance of three types of models: (1) a traditional linear model, (2) a traditional model with variables selected by the classification and regression tree (CART) method, and (3) a traditional model with variables selected by the random forest (RF) method. The results show that the fraction of senior citizens (age > 65), average household size (owner occupied), fraction of African Americans, fraction of Hispanics, annual household income, fraction of males, median age, fraction of households with a family member with a disability, and the total population are the significant variables for modeling ADA travel demand.",
author = "Chung-Wei Shen and Pei-fen Kuo",
year = "2015",
month = "1",
day = "1",
doi = "10.1061/9780784479292.144",
language = "English",
series = "CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals",
publisher = "American Society of Civil Engineers (ASCE)",
pages = "1568--1579",
editor = "Xuedong Yan and Yu Zhang and Yafeng Yin",
booktitle = "CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals",
address = "United States",

}

Shen, C-W & Kuo, P 2015, Variable Selection of Travel Demand Models for Paratransit Service: A Data Mining Approach. in X Yan, Y Zhang & Y Yin (eds), CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals. CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals, American Society of Civil Engineers (ASCE), pp. 1568-1579, 15th COTA International Conference of Transportation Professionals: Efficient, Safe, and Green Multimodal Transportation, CICTP 2015, Beijing, China, 15-07-24. https://doi.org/10.1061/9780784479292.144

Variable Selection of Travel Demand Models for Paratransit Service : A Data Mining Approach. / Shen, Chung-Wei; Kuo, Pei-fen.

CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals. ed. / Xuedong Yan; Yu Zhang; Yafeng Yin. American Society of Civil Engineers (ASCE), 2015. p. 1568-1579 (CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals).

TY - GEN

T1 - Variable Selection of Travel Demand Models for Paratransit Service

T2 - A Data Mining Approach

AU - Shen, Chung-Wei

AU - Kuo, Pei-fen

PY - 2015/1/1

Y1 - 2015/1/1

AB - To forecast demand for Americans with Disabilities Act (ADA) travel, many complicated factors need to be considered, including, but not limited to, socioeconomic data and service operational characteristics. In other words, choosing suitable explanatory variables to construct an ADA travel demand model is an important and complex task. Data mining techniques provide a promising way to select the explanatory variables. They work especially well in situations where the number of relevant variables is large and where the interactions among variables or models are not clear. In this study, we applied data mining techniques to select variables and build models. Census data was mined to select candidate variables. We also compared the performance of three types of models: (1) a traditional linear model, (2) a traditional model with variables selected by the classification and regression tree (CART) method, and (3) a traditional model with variables selected by the random forest (RF) method. The results show that the fraction of senior citizens (age > 65), average household size (owner occupied), fraction of African Americans, fraction of Hispanics, annual household income, fraction of males, median age, fraction of households with a family member with a disability, and the total population are the significant variables for modeling ADA travel demand.

UR - http://www.scopus.com/inward/record.url?scp=84953268983&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84953268983&partnerID=8YFLogxK

U2 - 10.1061/9780784479292.144

DO - 10.1061/9780784479292.144

M3 - Conference contribution

T3 - CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals

SP - 1568

EP - 1579

BT - CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals

A2 - Yan, Xuedong

A2 - Zhang, Yu

A2 - Yin, Yafeng

PB - American Society of Civil Engineers (ASCE)

ER -

Shen C-W, Kuo P. Variable Selection of Travel Demand Models for Paratransit Service: A Data Mining Approach. In Yan X, Zhang Y, Yin Y, editors, CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals. American Society of Civil Engineers (ASCE). 2015. p. 1568-1579. (CICTP 2015 - Efficient, Safe, and Green Multimodal Transportation - Proceedings of the 15th COTA International Conference of Transportation Professionals). https://doi.org/10.1061/9780784479292.144