Spatio-temporal data mining extracts and analyzes useful information embedded in large spatio-temporal databases. Cluster analysis, one such data mining technique, makes it possible to investigate the spatio-temporal variation of data. Previous studies in cluster analysis indicate that the optimal number of clusters can vary with the temporal scale of the input data. This study employs multi-scale wavelet transforms and self-organizing map neural networks to mine air pollutant data. Experimental results show that regions determined with the wavelet transform approach reduce the small local regions that arise when only small-scale input data are used, and improve the over-smoothed regions that arise when only a single set of large-scale input data is used. The results of cluster analysis using data generated from the discrete wavelet transform and the continuous wavelet transform are also discussed in this paper. Data generated from the continuous wavelet transform provide detailed time-variation features that can be used to detect the spatial variation of air pollutants within a selected time period.
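
To make the general idea concrete, the following is a minimal sketch of clustering station time series on multi-scale wavelet features with a self-organizing map. It is not the implementation used in this study: the Haar wavelet, the one-dimensional map topology, the learning-rate and neighborhood schedules, and all function names (`haar_approximation`, `train_som`, `best_unit`, `cluster_stations`) are assumptions chosen for illustration, and the station series are synthetic.

```python
import math
import random


def haar_approximation(series, level):
    """Approximation coefficients after `level` Haar DWT steps (assumed wavelet)."""
    approx = list(series)
    for _ in range(level):
        if len(approx) < 2:
            break
        # Simplifying assumption: drop a trailing sample when the length is odd.
        n = len(approx) - len(approx) % 2
        approx = [(approx[i] + approx[i + 1]) / math.sqrt(2.0)
                  for i in range(0, n, 2)]
    return approx


def best_unit(weights, vector):
    """Index of the map unit whose weights are closest to `vector`."""
    return min(range(len(weights)),
               key=lambda j: sum((w - x) ** 2 for w, x in zip(weights[j], vector)))


def train_som(vectors, n_units, epochs=50, seed=0):
    """Train a 1-D self-organizing map (hypothetical schedules); return unit weights."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    # Initialize each unit from a randomly chosen input vector.
    weights = [list(rng.choice(vectors)) for _ in range(n_units)]
    for epoch in range(epochs):
        frac = epoch / epochs
        rate = 0.5 * (1.0 - frac)                 # decaying learning rate
        radius = (n_units / 2.0) * (1.0 - frac)   # shrinking neighborhood
        for v in rng.sample(vectors, len(vectors)):
            bmu = best_unit(weights, v)
            for j in range(n_units):
                # Gaussian neighborhood around the best-matching unit.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    weights[j][k] += rate * h * (v[k] - weights[j][k])
    return weights


def cluster_stations(station_series, level, n_units):
    """Assign each station to a map unit using features at one temporal scale."""
    features = [haar_approximation(s, level) for s in station_series]
    weights = train_som(features, n_units)
    return [best_unit(weights, f) for f in features]


# Synthetic example: four stations with 16 samples each (made-up values).
stations = [
    [10 + math.sin(t) for t in range(16)],
    [10 + math.cos(t) for t in range(16)],
    [40 + math.sin(t) for t in range(16)],
    [40 + math.cos(t) for t in range(16)],
]
labels_fine = cluster_stations(stations, level=1, n_units=2)
labels_coarse = cluster_stations(stations, level=3, n_units=2)
```

Running `cluster_stations` at several values of `level` gives one set of region labels per temporal scale, which can then be compared. A continuous wavelet transform would replace `haar_approximation` with coefficients at a chosen scale and is not shown here.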