This paper proposes a scalable audio encoder that integrates a modified psychoacoustic model with the embedded zerotree wavelet (EZW) algorithm, both based on the wavelet packet transform. The performance of modern audio coding systems depends strongly on the psychoacoustic model, which is used to remove perceptually irrelevant audio information. However, traditional psychoacoustic analysis based on the Fast Fourier Transform (FFT) imposes a considerable computational load. The modified psychoacoustic model proposed in this paper is driven directly by the wavelet packet transform rather than by the FFT. EZW, which is also derived from the wavelet packet transform, is used for scalable coding. Experimental results show that, compared with other audio encoders such as MPEG Layer I and the EZW algorithm without a psychoacoustic model, the proposed encoder achieves better perceptual sound quality and lower computational complexity at low bit rates.
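
As a rough illustration of how per-subband energies, a typical input to a masking-threshold estimate, can be obtained from a wavelet packet decomposition without computing an FFT, consider the minimal sketch below. The orthonormal Haar filter, the tree depth, and the function names (`haar_step`, `wavelet_packet`, `subband_energies`) are assumptions made for illustration only; they are not the filters or the psychoacoustic model of the proposed encoder.

```python
import math


def haar_step(x):
    """Split an even-length sequence into orthonormal Haar low/high halves."""
    s = 1.0 / math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) * s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) * s for i in range(0, len(x), 2)]
    return low, high


def wavelet_packet(x, depth):
    """Full wavelet packet tree: both the low and high halves are split again
    at every level (a plain DWT would split only the low half).

    Assumes len(x) is divisible by 2**depth.
    Returns 2**depth subbands in natural tree order, which is not
    the same as increasing-frequency order.
    """
    bands = [list(x)]
    for _ in range(depth):
        next_bands = []
        for band in bands:
            low, high = haar_step(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


def subband_energies(bands):
    """Sum of squared coefficients in each subband."""
    return [sum(c * c for c in band) for band in bands]


# Hypothetical 32-sample frame containing a single sinusoid.
frame = [math.sin(2.0 * math.pi * 3.0 * n / 32.0) for n in range(32)]
bands = wavelet_packet(frame, depth=3)
energies = subband_energies(bands)
```

Because the Haar filter pair is orthonormal, the subband energies sum to the energy of the input frame, so they can stand in for a coarse spectral energy estimate. A practical coder would likely use longer filters and a tree shaped to approximate critical bands.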