U-Air: When urban air quality inference meets big data

Yu Zheng, Furui Liu, Hsun Ping Hsieh

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

545 Citations (Scopus)

Abstract

Information about urban air quality, e.g., the concentration of PM2.5, is of great importance for protecting human health and controlling air pollution. While a city has only a limited number of air quality monitoring stations, air quality varies non-linearly across urban spaces and depends on multiple factors, such as meteorology, traffic volume, and land use. In this paper, we infer real-time, fine-grained air quality information throughout a city, based on the (historical and real-time) air quality data reported by existing monitoring stations and a variety of data sources we observed in the city, such as meteorology, traffic flow, human mobility, the structure of road networks, and points of interest (POIs). We propose a semi-supervised learning approach based on a co-training framework that consists of two separate classifiers. One is a spatial classifier based on an artificial neural network (ANN), which takes spatially related features (e.g., the density of POIs and the length of highways) as input to model the spatial correlation between the air quality of different locations. The other is a temporal classifier based on a linear-chain conditional random field (CRF), which uses temporally related features (e.g., traffic and meteorology) to model the temporal dependency of air quality at a location. We evaluated our approach with extensive experiments based on five real data sources obtained in Beijing and Shanghai. The results show the advantages of our method over four categories of baselines: linear/Gaussian interpolations, classical dispersion models, well-known classification models such as decision trees and CRFs, and ANNs.
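The co-training idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the ANN and CRF are replaced by a simple nearest-centroid stand-in, the data are synthetic, and every name here (`CentroidClassifier`, `co_train`, `infer`, the two class labels) is hypothetical. Combining the two classifiers by multiplying their class probabilities is likewise an assumption made for this example and is not stated in the abstract.

```python
import math
import random


class CentroidClassifier:
    """Stand-in classifier (NOT the paper's ANN or CRF): nearest class centroid."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, v in enumerate(x):
                acc[i] += v
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {c: [v / counts[c] for v in s] for c, s in sums.items()}
        return self

    def predict_proba(self, x):
        # Closer centroids get exponentially larger scores; normalize to sum to 1.
        scores = {c: math.exp(-math.dist(x, m)) for c, m in self.centroids.items()}
        total = sum(scores.values())
        return {c: s / total for c, s in scores.items()}


def co_train(spatial, temporal, labels, rounds=10, per_round=2):
    """Each view's classifier pseudo-labels its most confident unlabeled
    samples, which then join the training set shared by both classifiers."""
    labels = list(labels)  # None marks a location without a monitoring station
    clf_s, clf_t = CentroidClassifier(), CentroidClassifier()

    def refit():
        known = [i for i, v in enumerate(labels) if v is not None]
        clf_s.fit([spatial[i] for i in known], [labels[i] for i in known])
        clf_t.fit([temporal[i] for i in known], [labels[i] for i in known])

    for _ in range(rounds):
        refit()
        if all(v is not None for v in labels):
            break
        for clf, view in ((clf_s, spatial), (clf_t, temporal)):
            scored = []
            for i, v in enumerate(labels):
                if v is None:
                    proba = clf.predict_proba(view[i])
                    best = max(proba, key=proba.get)
                    scored.append((proba[best], i, best))
            scored.sort(reverse=True)  # most confident first
            for _conf, i, best in scored[:per_round]:
                labels[i] = best
    refit()
    return clf_s, clf_t, labels


def infer(clf_s, clf_t, xs, xt):
    """Assumed combination rule: pick the class with the largest probability product."""
    ps, pt = clf_s.predict_proba(xs), clf_t.predict_proba(xt)
    joint = {c: ps[c] * pt[c] for c in ps}
    return max(joint, key=joint.get)


# Synthetic demo data: two well-separated classes, two feature views per location.
rng = random.Random(0)
CENTERS = {"good": 0.0, "unhealthy": 5.0}


def sample(center):
    return [center + rng.uniform(-1, 1), center + rng.uniform(-1, 1)]


truth, spatial, temporal, labels = [], [], [], []
for name, center in CENTERS.items():
    for k in range(12):
        truth.append(name)
        spatial.append(sample(center))   # e.g. POI density, highway length
        temporal.append(sample(center))  # e.g. traffic, meteorology
        labels.append(name if k < 2 else None)  # only 2 labeled "stations" per class

clf_s, clf_t, final_labels = co_train(spatial, temporal, labels)
prediction = infer(clf_s, clf_t, [4.6, 5.2], [5.1, 4.8])
```

The point of the two-view design is that each classifier sees a different feature set, so a location one classifier labels with high confidence can serve as new training data for the other. The synthetic classes here are deliberately easy to separate; real air quality data would not be.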

Original language: English
Title of host publication: KDD 2013 - 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Editors: Rajesh Parekh, Jingrui He, Inderjit S. Dhillon, Paul Bradley, Yehuda Koren, Rayid Ghani, Ted E. Senator, Robert L. Grossman, Ramasamy Uthurusamy
Publisher: Association for Computing Machinery
Pages: 1436-1444
Number of pages: 9
ISBN (Electronic): 9781450321747
DOIs
Publication status: Published - 2013 Aug 11
Event: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013 - Chicago, United States
Duration: 2013 Aug 11 → 2013 Aug 14

Publication series

Name: Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Volume: Part F128815

Other

Other: 19th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, KDD 2013
Country: United States
City: Chicago
Period: 13-08-11 → 13-08-14

All Science Journal Classification (ASJC) codes

  • Software
  • Information Systems
