dc.contributor.author | Postigo Díaz, Daniel | |
dc.contributor.author | Herreros Cerro, David | |
dc.contributor.author | Barón, Eloy | |
dc.contributor.author | Camarero Coterillo, Cristobal | |
dc.contributor.author | Fuentes Saez, Pablo | |
dc.contributor.other | Universidad de Cantabria | es_ES |
dc.date.accessioned | 2024-10-15T17:31:10Z | |
dc.date.available | 2024-10-15T17:31:10Z | |
dc.date.issued | 2024 | |
dc.identifier.isbn | 979-8-4007-0648-6 | |
dc.identifier.other | PID2019-105660RB-C22 | es_ES |
dc.identifier.other | TED2021-131176B-I00 | es_ES |
dc.identifier.other | PID2022-136454NB-C21 | es_ES |
dc.identifier.uri | https://hdl.handle.net/10902/34259 | |
dc.description.abstract | A hotspot traffic pattern of communications can be a common phenomenon in HPC topologies that causes significant and lasting network performance degradation. This performance deterioration remains persistent over time, intensifying its impact even after the cessation of the detrimental traffic injection into the network. To understand its causes and effects, we analyze the network behavior under different hotspot traffic scenarios and compare the performance on various topologies. We examine both the performance drop due to traffic flows with endpoint contention, and the recovery process of the network after this phenomenon has occurred, if swift action is taken to mitigate it. Our results show that some topologies are more resilient to hotspot traffic than others, both to reduce the performance drop and/or to accelerate the recovery process. In particular, Flattened Butterfly is more resilient to congestion and consistently demonstrates a rapid recovery. The results of the analysis reinforce the need for mechanisms with effective and expeditious action to reduce the magnitude and duration of the performance drop. Furthermore, they highlight behavioral differences between topologies that can affect the effectiveness of mechanisms using congestion-based metrics. | es_ES |
dc.description.sponsorship | This work has been supported by Grants PID2019-105660RB-C22, TED2021-131176B-I00 and PID2022-136454NB-C21 funded by MICIU/AEI/10.13039/501100011033 and by ERDF/EU; by the Spanish Ministry of Science and Innovation Ramón y Cajal RYC2021-033959-I, and the European HiPEAC Network of Excellence. The experiments have been executed on the Altamira HPC cluster, at the Institute of Physics of Cantabria (IFCA-CSIC). | es_ES |
dc.format.extent | 9 p. | es_ES |
dc.language.iso | eng | es_ES |
dc.publisher | Association for Computing Machinery | es_ES |
dc.rights | © 2024 Copyright held by the owner/author(s). This work is licensed under a Creative Commons Attribution International 4.0 License. | es_ES |
dc.rights.uri | http://creativecommons.org/licenses/by/4.0/ | * |
dc.source | SNTA '24: proceedings of the Seventh International Workshop on Systems and Network Telemetry and Analytics, New York, Association for Computing Machinery, 2024. Pisa, 15-23 | es_ES |
dc.subject.other | Network congestion | es_ES |
dc.subject.other | Hotspot pattern | es_ES |
dc.subject.other | Endpoint congestion | es_ES |
dc.subject.other | High-performance interconnection networks | es_ES |
dc.title | Defining the boundaries for endpoint congestion management in networks for high-performance computing | es_ES |
dc.type | info:eu-repo/semantics/conferenceObject | es_ES |
dc.relation.publisherVersion | https://doi.org/10.1145/3660320.3660333 | es_ES |
dc.rights.accessRights | openAccess | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2017-2020/PID2019-105660RB-C22/ES/REDES DE INTERCONEXION, ACELERADORES HARDWARE Y OPTIMIZACION DE APLICACIONES/ | es_ES |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-136454NB-C21/ES/ARQUITECTURA Y PROGRAMACION DE COMPUTADORES ESCALABLES DE ALTO RENDIMIENTO Y BAJO CONSUMO III-UC (TEAM-MATES UC)/ | es_ES |
dc.identifier.DOI | 10.1145/3660320.3660333 | |
dc.type.version | publishedVersion | es_ES |