Performance evaluation is key to many computer applications. Many techniques and profiling tools are available for measuring performance, but most of them depend on the hardware and software on which they run. On a new or less popular platform, programmers usually have few analysis tools at their disposal, which has been a serious problem for application development on many embedded systems. A performance analysis tool based on a purely software mechanism is therefore important for developing embedded applications. This paper describes a software mechanism for analyzing program performance on a wide range of platforms via code instrumentation at the source level. We implement this mechanism in a pure-software profiling toolkit called Moduletracer, which works with a public-domain tool, CIL, to instrument C programs. The toolkit helps programmers understand the behavior of applications by generating and analyzing traces, and helps them identify potential performance problems.
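To make the idea of source-level instrumentation concrete, the following is a minimal sketch of what an instrumented C function and its trace buffer might look like. It is not Moduletracer's actual output or trace format. The names `trace_event`, `trace_record`, `trace_count_calls`, `trace_dump`, and the example function `square_sum` are hypothetical and chosen only for illustration. The sketch assumes that the standard C `clock()` function is an acceptable time source; a real embedded target may need a platform-specific timer instead.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical trace record; NOT Moduletracer's actual trace format. */
typedef enum { EV_ENTER, EV_EXIT } event_kind;

typedef struct {
    event_kind kind;   /* function entry or exit */
    const char *func;  /* name of the instrumented function */
    clock_t stamp;     /* processor time when the event was recorded */
} trace_event;

#define TRACE_CAPACITY 1024

static trace_event trace_buf[TRACE_CAPACITY];
static size_t trace_len = 0;

/* Append one event to the in-memory trace; drop it if the buffer is full. */
void trace_record(event_kind kind, const char *func) {
    assert(func != NULL);
    if (trace_len < TRACE_CAPACITY) {
        trace_buf[trace_len].kind = kind;
        trace_buf[trace_len].func = func;
        trace_buf[trace_len].stamp = clock();  /* assumption: clock() is available */
        trace_len++;
    }
}

/* An application function as it might look AFTER instrumentation:
 * the two trace_record calls are the inserted probes, the rest is
 * the programmer's original code. */
int square_sum(int n) {
    trace_record(EV_ENTER, "square_sum");
    int total = 0;
    for (int i = 1; i <= n; i++) {
        total += i * i;
    }
    trace_record(EV_EXIT, "square_sum");
    return total;
}

/* Simple trace analysis: count completed calls to a given function. */
size_t trace_count_calls(const char *func) {
    size_t calls = 0;
    for (size_t i = 0; i < trace_len; i++) {
        if (trace_buf[i].kind == EV_EXIT && strcmp(trace_buf[i].func, func) == 0) {
            calls++;
        }
    }
    return calls;
}

/* Write the trace as text, one event per line: kind, function, timestamp. */
void trace_dump(FILE *out) {
    for (size_t i = 0; i < trace_len; i++) {
        fprintf(out, "%s %s %ld\n",
                trace_buf[i].kind == EV_ENTER ? "enter" : "exit",
                trace_buf[i].func,
                (long)trace_buf[i].stamp);
    }
}
```

Because the probes are ordinary C code that relies only on the standard library, a sketch like this needs no hardware counters or operating-system support. Calling `square_sum(3)` and then `trace_dump(stdout)` would write one entry event and one exit event for `square_sum`, which an analysis step could pair up to estimate the time spent in the function.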