1st International Workshop on
RESource DISaggregation
in High-Performance Computing

Held together with the International Conference on High Performance Computing in Asia-Pacific Region (HPC Asia'22)

14th January 2022, Kobe, Japan

About The Event

Introduction

Disaggregation is an emerging compute paradigm that splits existing monolithic servers into a number of consolidated single-resource pools that communicate over a fast interconnect. This model decouples individual hardware resources, including tightly coupled ones such as processors and memory, and enables the creation of logical compute platforms with flexible and dynamic hardware configurations. The concept of disaggregation is driven by various recent trends in computation. From an application perspective, the increasing importance of data analytics and machine learning workloads in HPC centers brings an unprecedented need for memory capacity. This need is in stark contrast with the growing imbalance in the peak compute-to-memory capacity ratio of traditional system-board-based server platforms, where memory modules are co-located with processors. On the hardware front, the proliferation of heterogeneous, special-purpose computing elements promotes the need for configurable compute platforms, while at the same time the increasing maturity of optical interconnects elevates the prospects of distance independence in networking infrastructure. The workshop intends to explore various aspects of resource disaggregation and their implications for high performance computing, both in dedicated HPC centers and in cloud environments.

Organizers

RIKEN Center for Computational Science, JAPAN

Lawrence Berkeley National Laboratory, USA

Agenda

All times in Japan Standard Time (UTC+09)

Welcome and Introduction

Keynote: Composable Disaggregated Infrastructure Delivers Adaptive Data Performance for AI-driven HPC

Sumit Puri (CEO & Cofounder, Liqid)

Abstract: Attendees will learn how composable disaggregated infrastructure (CDI) solutions from Liqid improve performance and flexibility for AI-driven HPC. The Liqid Matrix software platform disaggregates the elements of the data center and permits users to pool and deploy GPU, FPGA, NVMe, and other accelerator technologies to align with an application's workload requirements. Once the job is complete, those resources can be released for use by other applications, providing flexibility and efficiency for AI+HPC. Use cases include government, academia, enterprise, and more, with performance benchmarks on real-life workloads. Find out how composable disaggregated infrastructure significantly accelerates time-to-value for scientific research, product development, media and broadcast, enterprise data analytics, and much more.

Coffee Break

Roundtable on the Role of Photonics for Disaggregation -- moderated by John Shalf

Yu Tanaka (Fujitsu), Mark Wade (Ayar Labs), Keren Bergman (Columbia University)

Compute Express Link (CXL*): Changing the Industry Landscape

Debendra Das Sharma (Intel)

Abstract: CXL is a dynamic multi-protocol interconnect technology designed to support accelerators and memory devices. CXL provides a rich set of protocols that include I/O semantics similar to PCIe (i.e., CXL.io), caching protocol semantics (i.e., CXL.cache), and memory access semantics (i.e., CXL.mem) over PCIe PHY. The CXL 2.0 specification enables additional usage models beyond CXL 1.1, while being fully backward compatible with CXL 1.1 (and CXL 1.0). CXL 2.0 enables dynamic resource allocation, including memory and accelerator disaggregation across multiple domains. It enables switching, managed hot-plug, security enhancements, persistent memory support, memory error reporting, and telemetry. The availability of commercial IP blocks, verification IPs, and industry-standard internal interfaces enables CXL to be widely deployed across the industry. These, along with a well-defined compliance program, will ensure smooth interoperability across CXL devices in the industry.

CXL Memory and SMDK as a First Step toward Memory Disaggregation

Cheolmin Park (Samsung)

Abstract: AI services and heterogeneous computing have been introduced into memory systems, and as a result traditional memory architectures increasingly struggle to provide enough data to fully utilize CPUs and accelerators. As the global technology leader in the memory industry, Samsung Electronics is preparing memory disaggregation technology. Samsung's CXL memory expander and SMDK (Scalable Memory Development Kit) are the first step toward memory disaggregation. CXL is a new interface that disaggregates memory from CPUs and allows for flexibility and availability of different processors and media characteristics. SMDK is a software development kit optimized for heterogeneous memory systems and provides a software solution for disaggregated memory architectures. This talk will present an overview of Samsung's SMDK and CXL memory expander.

Rethinking Disaggregated Memory in HPC: A Userspace Software Approach

Ivy Peng (Lawrence Livermore National Laboratory)

Abstract: Current HPC systems use a server-centric architecture and static resource allocation. Compute nodes are homogeneously configured with compute and memory resources, and job resources are allocated based on peak usage. The first part of this talk will examine memory utilization on several leadership HPC systems equipped with high-resolution monitoring infrastructure. Our analysis abstracts several representative temporal and spatial memory imbalance patterns in production jobs. The second part will introduce userspace software that supports on-demand memory expansion from network-attached memory. Although the exact hardware and architecture of disaggregated memory are a moving target, our solution requires neither kernel modifications nor privileged access and thus can be adopted in the HPC environment. Furthermore, with a focus on scalability and adaptability, application-specific optimizations can be achieved with no system-wide impact. Finally, early results of microbenchmarks and two memory-intensive applications will be presented.

Adjourn

Event Venue

RIKEN Center for Computational Science (R-CCS) - Online Event

RIKEN R-CCS, Japan

RIKEN is Japan's largest comprehensive research institution renowned for high-quality research in a diverse range of scientific disciplines. Founded in 1917 as a private research foundation in Tokyo, RIKEN has grown rapidly in size and scope, today encompassing a network of world-class research centers and institutes across Japan.

Contact Us

If you have any comments/questions, do not hesitate to contact us.

Address

7 Chome-1-26 Minatojima Minamimachi,
Chuo Ward, Kobe,
Hyogo 650-0047,
JAPAN