(Formula presented.): a data dependence and stride reference patterns profiling infrastructure

Hairong Yu, Guohui Li, Lih Chyun Shu

Research output: Contribution to journal › Article

1 Citation (Scopus)

Abstract

Although multi-core processors are now widespread in modern computer systems, software tools that make the best use of the available computing resources are more urgently needed than ever. Spurious dependences and cache misses lurking in general-purpose applications seriously restrict the parallelism that can be extracted on today's prevalent multi-core machines. Existing tools are limited in their ability to detect data dependences thoroughly and to supply prefetched objects at the same time, and some of them cannot profile large-scale applications. To address these problems, we propose a novel profiler, called (Formula presented.), that performs both data dependence and stride reference profiling. Data dependence profiling uses a hash-based scheme to detect actual data dependences while filtering out useless dependences via timestamps. Stride reference profiling uses value profiling to record the stride pattern of each dynamic load and selects the profitable loads as prefetched objects for compilers. To demonstrate the effectiveness of (Formula presented.), we evaluated it using several SPEC CPU2006, MPI2007 and OMP2012 benchmarks on an Intel i7-4700 machine. Experimental results show that (Formula presented.) produces accurate profiling results, including the expected data dependences and prefetched objects, which in turn create more opportunities for extracting parallelism.
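
The two profiling modes described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the class names, the `min_ratio` and `min_samples` thresholds, and the exact timestamp rule are assumptions made for illustration. A Python dict stands in for the hash table, a logical clock supplies timestamps, and a read is treated as "useless" for write-after-read purposes once a newer write to the same address has been recorded.

```python
from collections import Counter, defaultdict


class DependenceProfiler:
    """Toy hash-based dependence detector (hypothetical, for illustration only)."""

    def __init__(self):
        self.clock = 0                        # logical timestamp, one tick per access
        self.last_write = {}                  # addr -> (instr, timestamp)
        self.last_reads = defaultdict(dict)   # addr -> {instr: timestamp}
        self.deps = set()                     # (kind, source_instr, sink_instr)

    def read(self, instr, addr):
        self.clock += 1
        if addr in self.last_write:
            writer, _ = self.last_write[addr]
            self.deps.add(("RAW", writer, instr))
        self.last_reads[addr][instr] = self.clock

    def write(self, instr, addr):
        self.clock += 1
        write_ts = 0
        if addr in self.last_write:
            writer, write_ts = self.last_write[addr]
            self.deps.add(("WAW", writer, instr))
        # Assumed timestamp filter: only reads newer than the previous write
        # can form a write-after-read dependence with this write.
        for reader, read_ts in self.last_reads[addr].items():
            if read_ts > write_ts:
                self.deps.add(("WAR", reader, instr))
        self.last_write[addr] = (instr, self.clock)


class StrideProfiler:
    """Toy per-load stride histogram (hypothetical, for illustration only)."""

    def __init__(self):
        self.prev_addr = {}                   # load instr -> last address seen
        self.strides = defaultdict(Counter)   # load instr -> stride histogram

    def load(self, instr, addr):
        if instr in self.prev_addr:
            self.strides[instr][addr - self.prev_addr[instr]] += 1
        self.prev_addr[instr] = addr

    def prefetch_candidates(self, min_ratio=0.8, min_samples=4):
        """Return {load instr: dominant stride} for loads with a stable non-zero stride."""
        selected = {}
        for instr, hist in self.strides.items():
            total = sum(hist.values())
            stride, count = hist.most_common(1)[0]
            if total >= min_samples and stride != 0 and count / total >= min_ratio:
                selected[instr] = stride
        return selected


# Dependence demo: five accesses to one address by statements S1..S5.
dp = DependenceProfiler()
dp.write("S1", 0x10)
dp.read("S2", 0x10)
dp.write("S3", 0x10)
dp.read("S4", 0x10)
dp.write("S5", 0x10)

# Stride demo: L1 walks an array with stride 8, L2 jumps irregularly.
sp = StrideProfiler()
for a in (0, 8, 16, 24, 32, 40):
    sp.load("L1", a)
for a in (0, 100, 3, 57, 20, 9):
    sp.load("L2", a)
candidates = sp.prefetch_candidates()
```

In the dependence demo, the read by `S2` is older than the write by `S3`, so the timestamp check keeps it from being paired with the later write by `S5`. In the stride demo, only `L1` has a dominant stride and is selected as a prefetch candidate.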

Original language: English
Pages (from-to): 770-788
Number of pages: 19
Journal: Journal of Supercomputing
Volume: 72
Issue number: 2
DOIs: 10.1007/s11227-015-1612-8
Publication status: Published - 2016 Feb 1

All Science Journal Classification (ASJC) codes

  • Software
  • Theoretical Computer Science
  • Information Systems
  • Hardware and Architecture

Cite this

@article{009e6b2c16dd4bbf9b8929f6ad738af3,
title = "(Formula presented.): a data dependence and stride reference patterns profiling infrastructure",
author = "Yu, Hairong and Li, Guohui and Shu, {Lih Chyun}",
year = "2016",
month = "2",
day = "1",
doi = "10.1007/s11227-015-1612-8",
language = "English",
volume = "72",
pages = "770--788",
journal = "Journal of Supercomputing",
issn = "0920-8542",
publisher = "Springer Netherlands",
number = "2",
}

(Formula presented.) : a data dependence and stride reference patterns profiling infrastructure. / Yu, Hairong; Li, Guohui; Shu, Lih Chyun.

In: Journal of Supercomputing, Vol. 72, No. 2, 01.02.2016, p. 770-788.

TY - JOUR

T1 - (Formula presented.)

T2 - a data dependence and stride reference patterns profiling infrastructure

AU - Yu, Hairong

AU - Li, Guohui

AU - Shu, Lih Chyun

PY - 2016/2/1

Y1 - 2016/2/1

N2 - Despite the widespread use of multi-core processors in modern computer systems, developing software tools so as to make best use of available computing resources has never been more urgent. This is because a considerable amount of spurious dependence and cache misses lurking in general-purpose applications restricts seriously the extraction of potential parallelism on the nowadays prevalent multi-core machines. Existing tools are limited in their ability to thoroughly detect data dependence and provide prefetched objects simultaneously. Further, some of the tools are unable to profile large-scale applications. To address this problem, we propose a novel profiler, called (Formula presented.) , that performs both data dependence and stride reference profiling. Data dependence profiling employs a hash-based scheme to detect actual data dependence while filtering out useless dependence via timestamps. Stride reference profiling employs value profiling to profile the stride pattern for each dynamic load and select the profitable loads as prefetched objects for compilers. To demonstrate the effectiveness of (Formula presented.) , we have evaluated it using several SPEC CPU2006, MPI2007 and OMP2012 benchmarks on an Intel i7-4700 machine. Experimental results show that (Formula presented.) produces accurate profiling results, including expected data dependence and prefetched objects, which in turn contributes to more opportunities for extracting parallelism.

UR - http://www.scopus.com/inward/record.url?scp=84954419672&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84954419672&partnerID=8YFLogxK

U2 - 10.1007/s11227-015-1612-8

DO - 10.1007/s11227-015-1612-8

M3 - Article

AN - SCOPUS:84954419672

VL - 72

SP - 770

EP - 788

JO - Journal of Supercomputing

JF - Journal of Supercomputing

SN - 0920-8542

IS - 2

ER -