With multi-core processors now ubiquitous in computer systems, software tools that make the best use of the available computing resources are urgently needed. Spurious dependences and cache misses in general-purpose applications seriously restrict how much parallelism can be extracted on multi-core machines. Existing tools are limited in their ability to detect data dependences thoroughly and identify prefetch candidates at the same time, and some of them cannot profile large-scale applications. To address these problems, we propose a novel profiler, called (Formula presented.) , that performs both data dependence profiling and stride reference profiling. Data dependence profiling uses a hash-based scheme to detect actual data dependences and uses timestamps to filter out useless ones. Stride reference profiling uses value profiling to capture the stride pattern of each dynamic load and selects the profitable loads as prefetch candidates for compilers. To demonstrate the effectiveness of (Formula presented.) , we evaluated it on several SPEC CPU2006, MPI2007 and OMP2012 benchmarks on an Intel i7-4700 machine. Experimental results show that (Formula presented.) produces accurate profiling results, including the expected data dependences and prefetch candidates, which in turn expose more opportunities for extracting parallelism.
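The two profiling passes named above can be illustrated with a minimal sketch. It is not the profiler's actual implementation, which the abstract does not detail. The trace format `(pc, op, addr)`, the function names `profile_dependences` and `profile_strides`, and the parameters `min_samples` and `threshold` are all hypothetical. The way timestamps are used here (reporting a write-after-read only when the read is newer than the last write to the same address) is one plausible reading of "filtering via timestamps", stated as an assumption.

```python
from collections import Counter, defaultdict


def profile_dependences(trace):
    """Detect RAW/WAR/WAW dependences with hash tables keyed by address.

    trace: iterable of (pc, op, addr) with op in {'R', 'W'} (assumed format).
    Returns a set of (kind, source_pc, sink_pc) tuples.
    """
    last_write = {}  # addr -> (timestamp, pc) of the most recent write
    last_read = {}   # addr -> (timestamp, pc) of the most recent read
    deps = set()
    for ts, (pc, op, addr) in enumerate(trace):
        w = last_write.get(addr)
        r = last_read.get(addr)
        if op == 'R':
            if w is not None:
                deps.add(('RAW', w[1], pc))
            last_read[addr] = (ts, pc)
        else:
            # Assumption: a read older than the last write is stale,
            # so the timestamp comparison filters that WAR out.
            if r is not None and (w is None or r[0] > w[0]):
                deps.add(('WAR', r[1], pc))
            elif w is not None:
                deps.add(('WAW', w[1], pc))
            last_write[addr] = (ts, pc)
    return deps


def profile_strides(trace, min_samples=4, threshold=0.75):
    """Histogram address deltas per load and keep loads with a dominant stride.

    Returns {load_pc: stride} for loads whose most frequent non-zero stride
    accounts for at least `threshold` of the observed deltas.
    """
    last_addr = {}
    hist = defaultdict(Counter)  # load pc -> Counter of address deltas
    for pc, op, addr in trace:
        if op != 'R':
            continue
        if pc in last_addr:
            hist[pc][addr - last_addr[pc]] += 1
        last_addr[pc] = addr
    candidates = {}
    for pc, counts in hist.items():
        total = sum(counts.values())
        stride, hits = counts.most_common(1)[0]
        if total >= min_samples and stride != 0 and hits / total >= threshold:
            candidates[pc] = stride
    return candidates


# Hypothetical trace of a reduction loop: load a[i], load sum, store sum.
trace = []
for i in range(6):
    trace += [(0x10, 'R', 1000 + 8 * i), (0x18, 'R', 2000), (0x20, 'W', 2000)]

print(profile_dependences(trace))
print(profile_strides(trace))
```

On this trace the array load at the hypothetical address `0x10` shows a constant stride of 8 bytes and is selected, while the load of the accumulator has stride 0 and is skipped. The accumulator produces a loop-carried read-after-write between the store and the next iteration's load.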