Lobachevsky State University of Nizhny Novgorod


Faculty of Computational Mathematics and Cybernetics


Current news

Seminar "On the experience of optimizing applications for Intel Xeon Phi"

The seminar will take place on October 15 at 18:00 in room 114.

Speaker: Roman Lygin (Intel).


The Intel(R) Xeon Phi(TM) coprocessor is gaining momentum in the HPC community, and software developers are eager to try it with their legacy applications. However, developers often struggle to use the available compute power efficiently, which may undermine their first experience and lead to disappointment. Although Intel does deliver on its promise of the same programming models and tools across Xeon and Xeon Phi processors, developers still need to invest reasonable effort to port their applications to the coprocessor efficiently.
This talk will present a practical case study of porting Tachyon, an open source ray tracer that is part of the SpecMPI suite, to Xeon Phi. The initial port showed disappointing performance: for example, the combined Xeon and Xeon Phi version ran 2.6x slower than the Xeon-only version. Achieving good performance required code modifications that improved both the Xeon and the Xeon Phi parts. The talk will demonstrate the use of Intel(R) Cluster Studio XE to pinpoint the problems and will highlight the key code changes that helped achieve significant improvements (up to 7x over the initial baseline, and a 1.8x speedup over the improved Xeon version). The application exploits parallelism at multiple levels: a symmetric MPI execution model, OpenMP-based multi-threading, and explicit SIMD (using SSE2/AVX/Xeon Phi instructions).
The talk will highlight the use of software tools, namely Intel(R) Trace Analyzer and Collector and Intel(R) VTune(TM) Amplifier XE, in combination with the MPI* and OpenMP* programming models, as well as a SIMD-enabled 3D vector operations library (reused and extended from Embree, the open source ray tracer by Intel Labs). Algorithmic changes include MPI-based dynamic scheduling, the introduction of explicit intrinsics-based SIMD support, and enabling greater OpenMP parallelism capacity.
The case study illustrates Intel's message that "optimization for Xeon pays off on Xeon Phi, and vice versa". The talk may be of interest to software developers who could apply this practical experience, these ideas, and these approaches to their own projects.

Document dated: 10.10.2013 14:00



© ITLab, Nizhny Novgorod, 2009