Data-Driven Thread Execution on Heterogeneous Processors.

Bibliographic Details
Title: Data-Driven Thread Execution on Heterogeneous Processors.
Authors: Arandi, Samer (arandi@najah.edu); Matheou, George (geomat@cs.ucy.ac.cy); Kyriacou, Costas (eng.kc@frederick.ac.cy); Evripidou, Paraskevas (skevos@cs.ucy.ac.cy)
Source: International Journal of Parallel Programming. Apr 2018, Vol. 46, Issue 2, pp. 198-224. 27 p.
Subjects: Simultaneous multithreading processors, Heterogeneous computing, Virtual machine systems, Data flow computing, High performance computing
Abstract: In this paper we report our experience in implementing and evaluating the Data-Driven Multithreading (DDM) model on a heterogeneous multi-core processor. DDM is a non-blocking multithreading model that decouples the synchronization from the computation portions of a program, allowing them to execute asynchronously in a dataflow manner. Thread dependencies are determined by the compiler/programmer, while thread scheduling is done dynamically at runtime based on data availability. The target processor for this implementation is the Cell processor. We call this implementation the Data-Driven Multithreading Virtual Machine for the Cell processor (DDM-VMc). Thread scheduling is handled in software by the Power Processing Element core of the Cell, while the Synergistic Processing Element cores execute the program threads. DDM-VMc virtualizes the parallel resources of the Cell, handles the heterogeneity of the cores, manages the Cell memory hierarchy efficiently, and supports distributed execution across a cluster of Cell nodes. DDM-VMc has been implemented on a single Cell processor with six computation cores, as well as on a cluster of four Cell processors with 24 computation cores. We present an in-depth performance analysis of DDM-VMc using a suite of standard computational benchmarks. The evaluation shows that DDM-VMc scales well and effectively tolerates scheduling overheads as well as memory and communication latencies. Furthermore, DDM-VMc compares favorably with other platforms targeting the Cell processor, such as CellSs and Sequoia. [ABSTRACT FROM AUTHOR]
Copyright of International Journal of Parallel Programming is the property of Springer Nature and its content may not be copied or emailed to multiple sites without the copyright holder's express written permission. Additionally, content may not be used with any artificial intelligence tools or machine learning technologies. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
Database: Engineering Source
ISSN: 0885-7458
DOI: 10.1007/s10766-016-0486-6