
dc.contributor.author: Pérez Pavón, Borja
dc.contributor.author: Stafford Fernández, Esteban
dc.contributor.author: Bosque Orero, José Luis
dc.contributor.author: Beivide Palacio, Ramón
dc.contributor.author: Mateo, S.
dc.contributor.author: Teruel, X.
dc.contributor.author: Martorell, X.
dc.contributor.author: Ayguadé, E.
dc.contributor.other: Universidad de Cantabria [es_ES]
dc.date.accessioned: 2020-10-13T18:25:57Z
dc.date.available: 2020-10-13T18:25:57Z
dc.date.issued: 2019
dc.identifier.issn: 0743-7315
dc.identifier.issn: 1096-0848
dc.identifier.other: CVE-2014-18166; TIN2016-76635-C2-2-R (AEI/FEDER, UE); TIN2015-65316-P [es_ES]
dc.identifier.uri: http://hdl.handle.net/10902/19331
dc.description.abstract: The emergence of heterogeneous systems has been very notable recently. The nodes of the most powerful computers integrate several compute accelerators, such as GPUs. Profiting from such node configurations is not a trivial endeavour. OmpSs is a framework for task-based parallel applications that allows the execution of OpenCL kernels on different compute devices. However, it does not support the co-execution of a single kernel on several devices. This paper presents an extension of OmpSs that rises to this challenge, and presents Auto-Tune, a load balancing algorithm that automatically adjusts its internal parameters to suit the hardware capabilities and application behavior. The extension allows programmers to take full advantage of the computing devices with negligible impact on the code. It takes care of two main issues: first, the automatic distribution of datasets and the management of device memory address spaces; second, the implementation of a set of load balancing algorithms to adapt to the particularities of applications and systems. Experimental results reveal that the co-execution of single kernels on all the devices in the node is beneficial in terms of performance and energy consumption, and that Auto-Tune gives the best overall results. [es_ES]
dc.description.sponsorship: This work has been supported by the University of Cantabria with grant CVE-2014-18166, the Generalitat de Catalunya under grant 2014-SGR-1051, the Spanish Ministry of Economy, Industry and Competitiveness under contracts TIN2016-76635-C2-2-R (AEI/FEDER, UE) and TIN2015-65316-P, and the Spanish Government through the Programa Severo Ochoa (SEV-2015-0493). [es_ES]
dc.format.extent: 13 p. [es_ES]
dc.language.iso: eng [es_ES]
dc.publisher: Elsevier [es_ES]
dc.rights: Attribution-NonCommercial-NoDerivatives 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by-nc-nd/4.0/ [*]
dc.source: Journal of Parallel and Distributed Computing, Volume 125, March 2019, Pages 45-57 [es_ES]
dc.subject.other: Heterogeneous systems [es_ES]
dc.subject.other: OmpSs programming model [es_ES]
dc.subject.other: OpenCL [es_ES]
dc.subject.other: Co-execution [es_ES]
dc.title: Auto-tuned OpenCL kernel co-execution in OmpSs for heterogeneous systems [es_ES]
dc.type: info:eu-repo/semantics/article [es_ES]
dc.relation.publisherVersion: https://doi.org/10.1016/j.jpdc.2018.11.001 [es_ES]
dc.rights.accessRights: openAccess [es_ES]
dc.type.version: acceptedVersion [es_ES]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivatives 4.0 International.