Apertureless scattering near-field optical microscopy (A-SNOM) is generally performed with heterodyne detection, which provides a higher signal-to-noise (S/N) ratio than homodyne detection. Accordingly, this study constructs a robust interference-based model of the detection signal that accounts for both the tip-enhancement phenomenon and the background electric field reflected from the tip, and uses it to analyze the amplitude and phase of heterodyne detection signals at different harmonics of the tip vibration frequency. The analytical results indicate that, for small phase modulation depths, the tip scattering noise at high-order harmonics follows a high-order Bessel function and therefore decays more rapidly than the near-field interaction signal. It is also shown that the signal contrast improves as the wavelength of the illuminating light source is increased or the incident angle is reduced. Compared with the homodyne technique, the heterodyne technique yields a much higher signal contrast in the visible region. The results presented in this study provide an improved understanding of the complex signal detected in heterodyne A-SNOM and suggest potential means of improving its S/N ratio and, in turn, its signal contrast.
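
The Bessel-function argument can be made concrete with a short numerical sketch. It does not reproduce the model of this study; it only assumes, as the abstract suggests, that the background contribution at the n-th harmonic scales as the Bessel function of the first kind J_n evaluated at the phase modulation depth. The function names and the value of `modulation_depth` are hypothetical and chosen purely for illustration.

```python
import math


def bessel_j(n, x, terms=30):
    """Bessel function of the first kind J_n(x) for integer n >= 0,
    evaluated from its power series (adequate for small |x|)."""
    total = 0.0
    for k in range(terms):
        coeff = (-1) ** k / (math.factorial(k) * math.factorial(n + k))
        total += coeff * (x / 2) ** (2 * k + n)
    return total


def small_argument_estimate(n, x):
    """Leading-order term of J_n(x) for small x: (x/2)^n / n!."""
    return (x / 2) ** n / math.factorial(n)


# Hypothetical phase modulation depth in radians (an assumption, not a
# value taken from the study).
modulation_depth = 0.3

for n in range(1, 5):
    exact = bessel_j(n, modulation_depth)
    estimate = small_argument_estimate(n, modulation_depth)
    print(f"harmonic {n}: J_n = {exact:.3e}, (x/2)^n/n! = {estimate:.3e}")
```

For a small modulation depth, each higher harmonic order multiplies the assumed background weight by roughly (x/2)/n, which is the rapid decay referred to above.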