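Application optimization and load balancing in heterogeneous systems with Xeon Phi (original title: Optimización de aplicaciones y equilibrio de carga en sistemas heterogéneos con Xeon Phi)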

Herrera Arcila, Adrián

Date

2017-09-08

Advisor(s)

Bosque Orero, José Luis

Rights

Attribution-NonCommercial-NoDerivs 3.0 Spain

Keywords

Parallelism

Accelerator

Computational cost or complexity

Heterogeneous system

Load balancing

Intel Xeon Phi

Abstract

ABSTRACT: Human curiosity is innate and boundless, and for many it is part of the meaning of life. But as mysteries are explored and solved, new ones of greater scale and complexity emerge. Today, many of the studies and investigations that try to shed light on these problems require extremely complex software tools (simulations, numerical analysis, forecasting, ...) and therefore carry a high computational cost. To cope with this, high-performance systems have moved from a raw-power model to a collaborative one, in which diverse hardware components work in a distributed way to run these tools, sharing the total cost in parallel. New hardware devices have been designed specifically for this model; in particular, Intel has developed the Xeon Phi, an accelerator with dozens of processing cores that handles tasks with a high degree of parallelism. This work sets out the value of the Xeon Phi for today's needs and explains how to optimize applications to run on it and obtain maximum performance. It also introduces a set of metrics, methods and algorithms for exploiting the combined computational power of a heterogeneous environment made up of a traditional Xeon processor and this accelerator, a process known as load balancing. In this environment, a set of tests is run on two typical high-performance computing applications, N-body simulation and image filtering, putting the developed methods into practice. Finally, the test results are analysed, explaining the performance improvements each method produced on the chosen programs and the main reasons behind those gains.

Collections this item belongs to

G2454 Academic works [375]

Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Spain