3rd International Workshop on
RESource DISaggregation
in High-Performance Computing

Held together with The International Conference for High Performance Computing, Networking, Storage, and Analysis (SC'23)

Friday, 17th November 2023, Denver, CO, USA

Introduction

Disaggregation is an emerging compute paradigm that splits existing monolithic servers into a number of consolidated single-resource pools that communicate over a fast interconnect. This model decouples individual hardware resources, including tightly coupled ones such as processors and memory, and enables the composition of logical compute platforms with flexible and dynamic hardware configurations.

The concept of disaggregation is driven by various recent trends in computation. From an application perspective, the increasing importance of data analytics and machine learning workloads in HPC centers brings an unprecedented need for memory capacity, in stark contrast with the growing imbalance in the peak compute-to-memory capacity ratio of traditional system-board-based server platforms, where memory modules are co-located with processors. Meanwhile, traditional simulation workloads leave memory underutilized. On the hardware front, the proliferation of heterogeneous, special-purpose computing elements promotes the need for configurable compute platforms, while the increasing maturity of optical interconnects raises the prospects of better distance independence in networking infrastructure.

The workshop intends to explore various aspects of resource disaggregation and composability, and their implications for high performance computing, both in dedicated HPC centers and in cloud environments. RESDIS aims to bring together researchers and industrial practitioners to foster discussion, collaboration, and the mutual exchange of knowledge and experience related to future disaggregated systems.



Call for Papers

The RESDIS program committee solicits original, high-quality submissions of unpublished results related to the theme of resource disaggregation and composable systems. Topics of interest include, but are not limited to:

- Disaggregated hardware in high-performance computing
- Operating systems and runtime support for disaggregated platforms
- Simulation of disaggregated platforms with existing infrastructure
- Runtime systems and programming abstractions for disaggregation and composability
- Networking for disaggregation, including silicon photonics and optical interconnects
- Implications of resource disaggregation for scientific computing and HPC applications
- Algorithm design for disaggregated and composable systems
- Disaggregated high throughput storage
- Disaggregated heterogeneous accelerators (GPUs, FPGAs, AI Accelerators, etc.)
- Resource management in disaggregated and composable platforms

The workshop proceedings will be published electronically via the IEEE Computer Society Digital Library. Submitted manuscripts should be formatted using the templates and the CCS2012 guide available at: http://www.acm.org/publications/article-templates/proceedings-template.html/. Regular papers must be between 6 and 12 pages, including references and figures; short papers are up to 4 pages, including references and figures. Prospective authors should submit their papers in PDF format through Linklings’ submission site:





Important Dates

Submission deadline (extended): August 18th (Fri) AoE

Acceptance notification

Final paper deadline

Workshop date

Organization

Workshop Chairs


Intel Corporation, USA & RIKEN, Japan

Lawrence Berkeley National Laboratory, USA

IBM Research Europe, Ireland

Program Committee


Michael Aguilar

Sandia National Laboratories, USA

Larry Dennison

Nvidia, USA

John (Jack) Lange

Oak Ridge National Laboratory, USA

Ivy Peng

KTH Royal Institute of Technology, Sweden

Yu Tanaka

Fujitsu, Japan

Gaël Thomas

Télécom SudParis, France

Agenda

All times in Mountain Standard Time (UTC-7)

Welcome and Introduction

Keynote: The Open Chiplet Economy and its Application to HPC

Bapiraju Vinnakota (Lawrence Berkeley National Laboratory)

Abstract: Multiple technological and business trends are driving a change to realizing semiconductor products as systems in package (SiP) that integrate multiple die, usually referred to as chiplets, into a single package. Compared to monolithic SoCs, chiplet-based designs require an evolution in architecture, interfaces, design and manufacturing flows. Several large companies have already made the transition to chiplet-based designs with proprietary tools and technologies. Many have not and need the expertise to do so. A new and open chiplet economy is emerging to help more companies adopt chiplet technologies. It is based on, and will require, collaboration and standardization along multiple dimensions, ensuring that companies are able to interact in an open, efficient and scalable manner. This talk will profile the chiplet economy, including its motivations, standards, participants and current status. It will close with an overview of the potential relevance of the open chiplet economy to HPC.

Sunfish: An Open Centralized Composable HPC Management Framework

Phil Cayton (Intel Corporation), Michael Aguilar (Sandia National Laboratories), and Christian Pinto (IBM Research Europe)

Abstract: Traditional HPC systems are provisioned with static, fixed quantities of resources (e.g., memory, storage, accelerators, CPU) to execute requested computation. This is not sufficient for today’s datacenters running modern dynamic workloads, resulting in workloads executing on systems not optimized for their needs. Datacenters often end up over-provisioning systems with hardware resources to provide workload versatility in HPC clusters. Extending Composable Disaggregated Infrastructure (CDI) to HPC architectures enables servers to be composed out of physically disaggregated resources to match the requirements of a workload. Central resource management, using a standardized interface, enables client applications to monitor, compose, and intelligently optimize resource provisioning. The OpenFabrics Alliance, in collaboration with DMTF, SNIA, and the CXL Consortium, is developing the Sunfish Management Framework for intelligent HPC CDI control. The goal of Sunfish is to enable interoperability through common interfaces for connecting workloads with resources, without having to worry about underlying hardware technologies.

RISA: Round-Robin Intra-Rack Friendly Scheduling Algorithm for Disaggregated Datacenters

Rashadul Kabir (Colorado State University), Ryan G. Kim (Intel Corporation), Mahdi Nikdast (Colorado State University)

Abstract: Recent trends see a move away from a fixed-resource server-centric datacenter model to a more adaptable “disaggregated” datacenter model. These disaggregated datacenters can then dynamically group resources to the specific requirements of an incoming workload, thereby improving efficiency. To properly utilize these disaggregated datacenters, workload allocation techniques must examine the current state of the datacenter and choose resources that optimize not only the current workload request but also future ones. Since disaggregated datacenters are severely bottlenecked by the available network resources, our work proposes a heuristic-based approach called RISA, which significantly reduces the network usage of workload allocations in disaggregated datacenters. Compared to the state-of-the-art, RISA reduces the power consumption for optical components by 33% and reduces the average CPU-RAM round-trip latency by 50%. Additionally, RISA significantly outperforms the state-of-the-art in terms of execution time.

Coffee Break

Disaggregation in Practice: Industry Session

Phillip Clark (Liqid), Jianping Jiang (XConn), John Ihnotic (GigaIO), Keren Bergman (Xscape Photonics)

Panel Discussion

Michael Aguilar (Sandia National Laboratories), Rashadul Kabir (Colorado State University), Phillip Clark (Liqid), Dan Ernst (Microsoft), Jianping Jiang (XConn), John Ihnotic (GigaIO), Keren Bergman (Xscape Photonics)

Adjourn

Event Venue

Colorado Convention Center, 700 14th St, Denver, CO 80202

Colorado Convention Center

Located in the heart of downtown, the Colorado Convention Center is top-ranked in the nation and close to hotels, dining, attractions, shopping, and public transportation.

Contact Us

If you have any comments/questions, do not hesitate to contact us.

Address

2501 NE Century Blvd, Hillsboro,
OR 97124, USA