Off-loading application controlled data prefetching in numerical codes for multi-core processors

  • Carsten Trinitis
  • Josef Weidendorfer
Research Output: Contribution to journal Article Peer-review

Abstract

An important issue when designing numerical code in High Performance Computing is cache optimisation, which is needed to exploit the performance potential of a given target architecture. This includes techniques to improve memory access locality as well as prefetching. Inherent algorithm constraints often limit the first approach, which typically uses a blocking technique. While automatic prefetching mechanisms exist in hardware and/or compilers, they cannot complement blocking with additional prefetching. We provide an infrastructure for off-loading application controlled prefetching on a chip multiprocessor, which allows further improvement of numerical code already optimised by standard cache optimisation. Clear benefits are shown for real workloads on existing hardware.

Publication Information

Original language

English

Pages from-to (Number of pages)

Pages 22-28

Journal (Volume, Issue Number)

International Journal of Computational Science and Engineering (Volume 4, Issue 1)

Publication milestones

  • Published - 01/11/2008

Publication status

Published - 01/11/2008

ISSN

1742-7185

External Publication IDs

  • handle.net: 10547/251814
  • Scopus: 56349135574
