Prefetch optimizations on large-scale applications via parameter value prediction

Shih Wei Liao, Tzu Han Hung, Donald Nguyen, Hucheng Zhou, Chinyen Chou, ChiaHeng Tu

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

4 Citations (Scopus)

Abstract

A typical data center application requires the processor cycles of thousands of machines. Even a single-digit performance improvement can significantly reduce the cost and power consumption of a data center. Unfortunately, achieving sustained improvement, even if modest, is difficult. Data centers are dynamic environments where applications are frequently released and servers are continually upgraded. For maintainability and fault tolerance, the physical capabilities and configuration of the servers are abstracted from the application programmer. We study application performance under different processor prefetch configurations. These configurations are largely transparent to the programmer, yet we observe a wide range of performance when comparing the worst and best configurations, with relative performance improvement ranging from 1.4% to 75.1%. Alarmingly, one application that consumes many processor cycles has a 23.6% improvement. Default prefetch configurations favor aggressively prefetching memory, which benefits most applications, but some data center applications have highly tuned memory behavior and aggressive prefetching severely decreases performance. We develop a tuning framework which attempts to predict the optimal configuration based on hardware performance counters. It applies to a large number of performance-critical data center applications without modifying the source code or binaries. The framework achieves performance within 1% of the best performance of a suite of important data center applications.

Original language: English
Title of host publication: ICS'09 - Proceedings of the 23rd International Conference on Supercomputing
Pages: 519-520
Number of pages: 2
DOI: https://doi.org/10.1145/1542275.1542359
Publication status: Published - 2009 Nov 24
Event: 23rd International Conference on Supercomputing, ICS'09 - Yorktown Heights, NY, United States
Duration: 2009 Jun 8 to 2009 Jun 12

Publication series

Name: Proceedings of the International Conference on Supercomputing

Other

Other: 23rd International Conference on Supercomputing, ICS'09
Country: United States
City: Yorktown Heights, NY
Period: 09-06-08 to 09-06-12

Fingerprint

Servers
Data storage equipment
Maintainability
Fault tolerance
Electric power utilization
Tuning
Hardware
Costs

All Science Journal Classification (ASJC) codes

  • Computer Science(all)

Cite this

Liao, S. W., Hung, T. H., Nguyen, D., Zhou, H., Chou, C., & Tu, C. (2009). Prefetch optimizations on large-scale applications via parameter value prediction. In ICS'09 - Proceedings of the 23rd International Conference on Supercomputing (pp. 519-520). [1542359] (Proceedings of the International Conference on Supercomputing). https://doi.org/10.1145/1542275.1542359
Liao, Shih Wei ; Hung, Tzu Han ; Nguyen, Donald ; Zhou, Hucheng ; Chou, Chinyen ; Tu, ChiaHeng. / Prefetch optimizations on large-scale applications via parameter value prediction. ICS'09 - Proceedings of the 23rd International Conference on Supercomputing. 2009. pp. 519-520 (Proceedings of the International Conference on Supercomputing).
@inproceedings{913cdcfa0e1845ed95868a66ea16c0f2,
title = "Prefetch optimizations on large-scale applications via parameter value prediction",
abstract = "A typical data center application requires the processor cycles of thousands of machines. Even a single-digit performance improvement can significantly reduce the cost and power consumption of a data center. Unfortunately, achieving sustained improvement, even if modest, is difficult. Data centers are dynamic environments where applications are frequently released and servers are continually upgraded. For maintainability and fault tolerance, the physical capabilities and configuration of the servers are abstracted from the application programmer. We study application performance under different processor prefetch configurations. These configurations are largely transparent to the programmer, yet we observe a wide range of performance when comparing the worst and best configurations, with relative performance improvement ranging from 1.4{\%} to 75.1{\%}. Alarmingly, one application that consumes many processor cycles has a 23.6{\%} improvement. Default prefetch configurations favor aggressively prefetching memory, which benefits most applications, but some data center applications have highly tuned memory behavior and aggressive prefetching severely decreases performance. We develop a tuning framework which attempts to predict the optimal configuration based on hardware performance counters. It applies to a large number of performance-critical data center applications without modifying the source code or binaries. The framework achieves performance within 1{\%} of the best performance of a suite of important data center applications.",
author = "Liao, {Shih Wei} and Hung, {Tzu Han} and Donald Nguyen and Hucheng Zhou and Chinyen Chou and ChiaHeng Tu",
year = "2009",
month = "11",
day = "24",
doi = "10.1145/1542275.1542359",
language = "English",
isbn = "9781605584980",
series = "Proceedings of the International Conference on Supercomputing",
pages = "519--520",
booktitle = "ICS'09 - Proceedings of the 23rd International Conference on Supercomputing",
}

Liao, SW, Hung, TH, Nguyen, D, Zhou, H, Chou, C & Tu, C 2009, Prefetch optimizations on large-scale applications via parameter value prediction. in ICS'09 - Proceedings of the 23rd International Conference on Supercomputing., 1542359, Proceedings of the International Conference on Supercomputing, pp. 519-520, 23rd International Conference on Supercomputing, ICS'09, Yorktown Heights, NY, United States, 09-06-08. https://doi.org/10.1145/1542275.1542359

Prefetch optimizations on large-scale applications via parameter value prediction. / Liao, Shih Wei; Hung, Tzu Han; Nguyen, Donald; Zhou, Hucheng; Chou, Chinyen; Tu, ChiaHeng.

ICS'09 - Proceedings of the 23rd International Conference on Supercomputing. 2009. p. 519-520 1542359 (Proceedings of the International Conference on Supercomputing).

TY - GEN

T1 - Prefetch optimizations on large-scale applications via parameter value prediction

AU - Liao, Shih Wei

AU - Hung, Tzu Han

AU - Nguyen, Donald

AU - Zhou, Hucheng

AU - Chou, Chinyen

AU - Tu, ChiaHeng

PY - 2009/11/24

Y1 - 2009/11/24

AB - A typical data center application requires the processor cycles of thousands of machines. Even a single-digit performance improvement can significantly reduce the cost and power consumption of a data center. Unfortunately, achieving sustained improvement, even if modest, is difficult. Data centers are dynamic environments where applications are frequently released and servers are continually upgraded. For maintainability and fault tolerance, the physical capabilities and configuration of the servers are abstracted from the application programmer. We study application performance under different processor prefetch configurations. These configurations are largely transparent to the programmer, yet we observe a wide range of performance when comparing the worst and best configurations, with relative performance improvement ranging from 1.4% to 75.1%. Alarmingly, one application that consumes many processor cycles has a 23.6% improvement. Default prefetch configurations favor aggressively prefetching memory, which benefits most applications, but some data center applications have highly tuned memory behavior and aggressive prefetching severely decreases performance. We develop a tuning framework which attempts to predict the optimal configuration based on hardware performance counters. It applies to a large number of performance-critical data center applications without modifying the source code or binaries. The framework achieves performance within 1% of the best performance of a suite of important data center applications.

UR - http://www.scopus.com/inward/record.url?scp=70449727094&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70449727094&partnerID=8YFLogxK

U2 - 10.1145/1542275.1542359

DO - 10.1145/1542275.1542359

M3 - Conference contribution

AN - SCOPUS:70449727094

SN - 9781605584980

T3 - Proceedings of the International Conference on Supercomputing

SP - 519

EP - 520

BT - ICS'09 - Proceedings of the 23rd International Conference on Supercomputing

ER -

Liao SW, Hung TH, Nguyen D, Zhou H, Chou C, Tu C. Prefetch optimizations on large-scale applications via parameter value prediction. In ICS'09 - Proceedings of the 23rd International Conference on Supercomputing. 2009. p. 519-520. 1542359. (Proceedings of the International Conference on Supercomputing). https://doi.org/10.1145/1542275.1542359