2nd International Workshop on
RESource DISaggregation
in High-Performance Computing

Held in conjunction with The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'22)

Friday, 18th November 2022, Dallas, TX, USA

Introduction

Disaggregation is an emerging compute paradigm that splits existing monolithic servers into a number of consolidated single-resource pools that communicate over a fast interconnect. This model decouples individual hardware resources, including tightly coupled ones such as processors and memory, and enables the composition of logical compute platforms with flexible and dynamic hardware configurations.

The concept of disaggregation is driven by several recent trends in computing. From an application perspective, the growing importance of data analytics and machine learning workloads in HPC centers brings an unprecedented need for memory capacity, in stark contrast with the growing imbalance in the peak compute-to-memory capacity ratio of traditional system-board-based server platforms, where memory modules are co-located with processors. On the hardware front, the proliferation of heterogeneous, special-purpose computing elements promotes the need for configurable compute platforms, while at the same time the increasing maturity of optical interconnects improves the prospects of distance independence in networking infrastructure.

The workshop intends to explore various aspects of resource disaggregation, composability, and their implications for high performance computing, both in dedicated HPC centers and in cloud environments. RESDIS aims to bring together researchers and industrial practitioners to foster discussion, collaboration, and mutual exchange of knowledge and experience related to future disaggregated systems.



Call for Papers

The RESDIS program committee solicits original, high-quality submissions of unpublished results related to the theme of resource disaggregation and composable systems. Topics of interest include, but are not limited to:

- Disaggregated hardware in high-performance computing
- Operating systems and runtime support for disaggregated platforms
- Simulation of disaggregated platforms with existing infrastructure
- Runtime systems and programming abstractions for disaggregation and composability
- Networking for disaggregation, including silicon photonics and optical interconnects
- Implications of resource disaggregation for scientific computing and HPC applications
- Algorithm design for disaggregated and composable systems
- Disaggregated high throughput storage
- Resource management in disaggregated and composable platforms

The workshop proceedings will be published electronically via the IEEE Computer Society Digital Library. Submitted manuscripts should be formatted using the IEEE conference template. The maximum paper length is 8 pages, not including references and other appendices. Prospective authors should submit their papers in PDF format through Linklings' submission site.



Important Dates

- Submission deadline (extended): Friday, August 19th, AoE (original deadline: August 5th)
- Acceptance notification
- Final paper deadline
- Workshop date: Friday, November 18th, 2022

Organization

Workshop Chairs


Intel Corporation, USA & RIKEN, Japan

Lawrence Berkeley National Laboratory, USA

Columbia University, USA

Program Committee


Manya Ghobadi

Massachusetts Institute of Technology, USA

Madeleine Glick

Columbia University, USA

Ian Karlin

NVIDIA, USA

John (Jack) Lange

Oak Ridge National Laboratory, USA

Ivy Peng

Lawrence Livermore National Laboratory, USA

Yu Tanaka

Fujitsu, Japan

Gaël Thomas

Télécom SudParis, France

Agenda

All times in Central Standard Time (UTC-6)

Welcome and Introduction

Opening Distinguished Lecture: Putting Memory and Computing in a Single Pool over Compute Express Link

Prof. Myoungsoo Jung (Korea Advanced Institute of Science and Technology)

Abstract: Compute Express Link (CXL) has recently attracted significant attention thanks to its excellent hardware heterogeneity management and resource disaggregation capabilities. Even though there is as yet no commercially available product or platform integrating CXL 2.0/3.0 into memory pooling, it is expected to make memory-resource disaggregation far more practical and efficient than ever before. In this lecture, we will argue why existing computing and memory resources require a new interface for cache coherency and demonstrate how CXL can put the different types of resources into a disaggregated pool. As a use case scenario, this lecture will show two real-system examples: a CXL 2.0-based end-to-end system that directly connects a host processor complex to remote memory resources over CXL's memory protocol, and a CXL-integrated storage expansion system prototype. At the end of the lecture, we introduce a set of hardware prototypes designed to support future CXL 3.0 systems as part of our ongoing project.

Methodology for Evaluating the Potential of Disaggregated Memory Systems

Nan Ding (LBNL), Samuel Williams (LBNL), Hai Ah Nam (NERSC), Taylor Groves (NERSC), Muaaz Gul Awan (NERSC), Christopher Delay (NERSC), Oguz Selvitopi (LBNL), Leonid Oliker (LBNL), Nicholas Wright (NERSC)

Abstract: Tightly coupled HPC systems have rigid memory allocation, which can result in expensive under-utilization of memory resources. As novel memory and network technologies mature, disaggregated memory systems are becoming a promising solution for future HPC systems, allowing workloads to use the available memory of the entire system. We propose a design framework to explore the disaggregated memory system design space. The framework incorporates memory capacity, network bandwidth, and the ratio of local to remote memory accesses, and provides an intuitive approach to guide machine configurations based on technology trends and workload characteristics. We apply our framework to analyze eleven workloads from five computational scenarios: AI training, data analysis, genomics, protein, and traditional HPC. We demonstrate the ability of our methodology to understand the potential and pitfalls of a disaggregated memory system and to motivate machine configurations. Our methodology shows that 10 of our 11 applications/workflows can leverage disaggregated memory without affecting performance.
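The core trade-off such a framework weighs can be illustrated with a simple weighted-average access model; this is a toy sketch for intuition, not the authors' actual methodology, and the latency figures below are placeholder assumptions rather than measured values:

```python
def effective_access_time(local_fraction, t_local_ns, t_remote_ns):
    """Weighted-average memory access time for a given local:remote access split."""
    return local_fraction * t_local_ns + (1.0 - local_fraction) * t_remote_ns

# Placeholder numbers: 100 ns for node-local DRAM, 600 ns for network-attached memory.
baseline = effective_access_time(1.0, 100, 600)  # all accesses local
mixed = effective_access_time(0.9, 100, 600)     # 90% local, 10% remote
slowdown = mixed / baseline
print(f"effective access time: {mixed:.0f} ns ({slowdown:.1f}x baseline)")
```

A workload that keeps its hot working set local (high `local_fraction`) sees little slowdown even with slow remote memory, which is the intuition behind letting most applications spill capacity into a disaggregated pool.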

TPP: Transparent Page Placement for CXL-Enabled Tiered Memory (Invited talk)

Hasan Maruf (University of Michigan)

Abstract: CXL-based memory expansion decouples CPU and memory within a single server and enables flexible server design with different generations and types of memory technologies. It can balance fleet-wide resource utilization and address the memory bandwidth and capacity scaling challenges in hyperscale datacenters. Without efficient memory management, however, such systems can significantly degrade application-level performance. We propose a novel OS-level, application-transparent page placement mechanism (TPP) for efficient CXL-memory management. TPP employs lightweight mechanisms to identify hot and cold pages and place them in the appropriate memory tiers. It enables page allocation to work independently from page reclamation logic, which are tightly coupled in today's Linux kernel. As a result, the local memory tier has memory headroom for new allocations. At the same time, TPP can promptly promote performance-critical hot pages trapped in the slow memory tiers to the fast tier node. Both promotion and demotion mechanisms work transparently without prior knowledge of an application's memory access behavior. TPP improves Linux's performance by up to 18% and outperforms state-of-the-art solutions for tiered memory by 10–17%. TPP has been in active use in Meta's datacenters for over a year, and parts of it have been merged into the Linux kernel since v5.18.

Coffee Break

First-Generation Memory Disaggregation for Cloud Platforms (Invited talk)

Daniel Berger (Microsoft Corporation)

Abstract: Similar to HPC, public cloud platforms require scaling memory capacity beyond what is feasible on traditional system boards. CXL promises to enable this memory capacity scaling. However, cloud platforms come with stringent performance requirements and seek to support opaque virtual machines without modifications. The CXL standard does not specify how to compose full-stack systems from its building blocks or how to expose CXL to the user. This talk presents a first-generation design for public cloud platforms, focusing in particular on the CXL system topology and how CXL is exposed to the user. The resulting design can reduce DRAM cost by 7% while guaranteeing performance within 1-5% of same-NUMA-node memory allocations.

Industry Talk: Compute Express Link (CXL*): Open Interconnect for building Composable Systems

Debendra Das Sharma (Intel Corporation)

Abstract: CXL is a dynamic multi-protocol interconnect technology designed to support accelerators and memory devices. CXL provides a rich set of protocols that include I/O semantics similar to PCIe (i.e., CXL.io), caching protocol semantics (i.e., CXL.cache), and memory access semantics (i.e., CXL.mem) over the PCIe PHY. The CXL 2.0 specification enables additional usage models beyond CXL 1.1, while remaining fully backwards compatible with CXL 1.1 (and CXL 1.0). CXL 2.0 enables dynamic resource allocation, including memory and accelerator disaggregation across multiple domains, as well as switching, managed hot-plug, security enhancements, persistent memory support, memory error reporting, and telemetry. CXL 3.0 doubles the bandwidth while enabling larger-scale composable systems with fabrics. The availability of commercial IP blocks, verification IPs, and industry-standard internal interfaces enables CXL to be widely deployed across the industry. These, along with a well-defined compliance program, will ensure smooth interoperability across CXL devices in the industry.

Panel Discussion and Workshop Q&A

Myoungsoo Jung (Korea Advanced Institute of Science and Technology), Nan Ding (LBNL), Hasan Maruf (University of Michigan), Daniel Berger (Microsoft Corporation), Debendra Das Sharma (Intel Corporation)

Adjourn

Event Venue

Room D172, Kay Bailey Hutchison Convention Center Dallas, TX, USA

Kay Bailey Hutchison Convention Center

Located in the heart of downtown Dallas, the convention center features 1 million square feet of prime exhibit space and is close to hotels, dining, attractions, shopping, and public transportation.

The convention center is located just 30 minutes from Dallas/Fort Worth International (DFW) airport and seven minutes from Dallas Love Field (DAL) airport, with several public transportation options readily available. Dallas Area Rapid Transit (DART) has a light rail station at the convention center and operates a free D-Link shuttle bus to area shopping and entertainment.

Contact Us

If you have any comments/questions, do not hesitate to contact us.

Address

2501 NE Century Blvd, Hillsboro,
OR 97124, USA