A complete emotional expression in natural face-to-face conversation typically follows a complex temporal course. In this paper, we propose a temporal course modeling-based error weighted cross-correlation model (TCM-EWCCM) for speech emotion recognition. In TCM-EWCCM, a TCM-based cross-correlation model (CCM) is first used both to model the temporal evolution of the extracted acoustic and prosodic features individually and to capture the statistical dependencies among paired acoustic-prosodic features in different emotional states. Then, a Bayesian classifier weighting scheme, error weighted classifier combination, is adopted to exploit the contributions of the individual TCM-based CCM classifiers for different acoustic-prosodic feature pairs and thereby enhance speech emotion recognition accuracy. Experimental results on the NCKU-CASC corpus demonstrate that modeling the complex temporal structure, together with the statistical dependencies and the contributions of paired features in natural conversational speech, improves speech emotion recognition performance.
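The combination step can be illustrated with a minimal sketch of one common form of error-weighted classifier fusion. It is not the formulation used in TCM-EWCCM: the function names `confusion_weights` and `combine`, the add-one smoothing, the two emotion labels, and all numbers are assumptions made for illustration, and hand-written posteriors stand in for the outputs of the TCM-based CCM classifiers.

```python
def confusion_weights(true_labels, predicted_labels, classes):
    """Estimate P(true class | predicted class) from held-out decisions.

    Add-one smoothing (an assumption of this sketch) keeps every weight nonzero.
    """
    counts = {p: {t: 1.0 for t in classes} for p in classes}
    for t, p in zip(true_labels, predicted_labels):
        counts[p][t] += 1.0
    return {
        p: {t: counts[p][t] / sum(counts[p].values()) for t in classes}
        for p in classes
    }


def combine(posteriors, weights, classes):
    """Fuse per-classifier posteriors, weighting each by its empirical reliability.

    posteriors: one dict per classifier, mapping class -> P(class | features)
    weights:    one table per classifier, as returned by confusion_weights
    """
    scores = {c: 0.0 for c in classes}
    for post, w in zip(posteriors, weights):
        for predicted in classes:
            for true in classes:
                scores[true] += w[predicted][true] * post[predicted]
    total = sum(scores.values())
    return {c: s / total for c, s in scores.items()}


# Hypothetical data: classifier A is reliable on held-out data, classifier B is not.
classes = ["happy", "angry"]
truth = ["happy", "happy", "happy", "angry", "angry", "angry"]
pred_a = ["happy", "happy", "happy", "angry", "angry", "angry"]
pred_b = ["happy", "angry", "happy", "angry", "happy", "angry"]

weights = [
    confusion_weights(truth, pred_a, classes),
    confusion_weights(truth, pred_b, classes),
]
posteriors = [
    {"happy": 0.7, "angry": 0.3},  # classifier A on a new utterance
    {"happy": 0.2, "angry": 0.8},  # classifier B on the same utterance
]
fused = combine(posteriors, weights, classes)
```

With these made-up numbers, a plain average of the two posteriors would favor "angry" (0.55), whereas the error-weighted fusion favors "happy" (about 0.53), because the less reliable classifier's confident vote is discounted.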