Feature Articles: Technology Development Trends of the IOWN 2.0 Era—From Communications to Computing

Vol. 24, No. 4, pp. 22–29, Apr. 2026. https://doi.org/10.53829/ntr202604fa2

Initiatives toward Multiple Vendor-sourced Composable Servers in DCI Technology Development

Kensuke Koda and Kazuo Ninokata

Abstract

NTT Software Innovation Center is developing Data-Centric Infrastructure (DCI) technologies toward the implementation of NTT’s IOWN (Innovative Optical and Wireless Network). Focusing on composable servers as a key component of DCI, this article introduces the mechanism (device management interface and framework) for building and operating an infrastructure that combines multiple vendor-sourced devices and examines the issues in achieving multiple vendor-sourced composable servers at NTT.

Keywords: DCI, composable server, multiple vendor-sourced composable servers

1. Introduction

NTT Software Innovation Center is developing Data-Centric Infrastructure (DCI) toward the implementation of NTT’s Innovative Optical and Wireless Network (IOWN). In the overall architecture defined by the IOWN Global Forum, DCI is positioned as a fundamental layer that enables highly efficient data processing in both distributed datacenter and heterogeneous computing environments. It is one of the essential foundations in the overall architecture of IOWN. We have previously introduced work on the documentation of reference implementation models of DCI functional architecture and compute clusters at the IOWN Global Forum [1] as well as work on the efficient use and power-consumption reduction of hardware resources using DCI as demonstrated at the NTT Pavilion at Expo 2025 Osaka, Kansai, Japan [2]. At NTT, DCI consists of hardware that connects multiple composable servers, graphics processing unit (GPU) servers, etc. via a network and a DCI controller that optimally allocates these interconnected central processing units (CPUs), GPUs, and other resources.

This article focuses on a DCI architecture using composable servers that can flexibly combine and use devices (Fig. 1). A composable server connects a host server and a Peripheral Component Interconnect Express/Compute Express Link (PCIe/CXL) expansion box (resource box) through a PCIe/CXL fabric switch, so that CPUs, GPUs, storage, and CXL memory can be combined and used with a high degree of freedom (Fig. 2).


Fig. 1. DCI configuration using composable servers.


Fig. 2. Composable server overview.

We at NTT have collaborated with various vendors of composable server products and demonstrated the design and operation of systems that combine products from multiple vendors. Our aim is to build multiple vendor-sourced composable servers and to operate a container infrastructure on them via DCI controller software.

This article introduces a management standard and a framework for multiple vendor-sourced composable servers, including the background to their establishment, both of which are being studied for use in implementing DCI controller software. It also discusses the issues involved in developing multiple vendor-sourced composable servers at NTT. We first introduce a management standard called Redfish, specified by the Distributed Management Task Force (DMTF) [3] for controlling datacenter infrastructure [4], the functions of which are now implemented in servers and network devices. We then describe in detail a framework proposed by the OpenFabrics Alliance (OFA) [5], the Sunfish framework [6]. This framework can manage composable-server hardware (servers, memory, accelerators, etc.) from different vendors in an integrated and uniform manner via a standard management interface such as Redfish. Since Sunfish provides logical models that enable the management and dynamic composition (lifecycle management) of computing resources independent of the physical configuration, it can be thought of as a service framework that can also be used in the DCI controller software developed by NTT. As of December 2025, the Sunfish version is 0.5, and OFA is actively working to release the next set of specifications as version 1.0.

2. Trends in the standardization of multiple vendor-sourced composable servers

We introduce the design of a standard management interface and framework for building and operating multiple vendor-sourced composable servers.

Given the spread of cloud computing, virtualization, and hyperscale environments, the need to automate the management of a large number of servers through a unified application programming interface (API) has grown. Meanwhile, the Intelligent Platform Management Interface (IPMI) has become widely used as a management interface for servers and other products. However, IPMI is an old standard, drafted in 1998, and it has problems such as being unable to handle a large number of servers via representational state transfer (REST) APIs and having weak extensibility. Against this background, DMTF standardized Redfish in 2015. Redfish features include a foundation in RESTful APIs, JSON (JavaScript Object Notation), and HTTPS (Hypertext Transfer Protocol Secure), enhanced security, provision of structured hardware configuration information, and a design that supports scale-out management. The Redfish data model has a tree structure (resource tree) that can be accessed via a hierarchical URI (uniform resource identifier).
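As a minimal sketch of how a client might traverse such a resource tree, the following Python example follows "@odata.id" links starting from the service root. The in-memory dictionary MOCK_SERVICE and the helper get() are assumptions introduced here to stand in for HTTPS GET requests to a real Redfish service; the resource contents are illustrative only.

```python
# Hypothetical in-memory stand-in for a Redfish service: URI -> JSON resource.
MOCK_SERVICE = {
    "/redfish/v1": {
        "@odata.id": "/redfish/v1",
        "Systems": {"@odata.id": "/redfish/v1/Systems"},
    },
    "/redfish/v1/Systems": {
        "@odata.id": "/redfish/v1/Systems",
        "Members": [{"@odata.id": "/redfish/v1/Systems/1"}],
    },
    "/redfish/v1/Systems/1": {
        "@odata.id": "/redfish/v1/Systems/1",
        "Name": "host-1",
    },
}


def get(uri):
    """Stand-in for an HTTPS GET; a real client would send the request here."""
    return MOCK_SERVICE[uri]


def walk(uri, seen=None):
    """Return every URI reachable from `uri` by following @odata.id links."""
    seen = [] if seen is None else seen
    if uri in seen or uri not in MOCK_SERVICE:
        return seen
    seen.append(uri)
    _follow(get(uri), seen)
    return seen


def _follow(node, seen):
    """Recurse through dicts and lists, visiting each linked resource once."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "@odata.id" and isinstance(value, str):
                walk(value, seen)
            else:
                _follow(value, seen)
    elif isinstance(node, list):
        for item in node:
            _follow(item, seen)


for uri in walk("/redfish/v1"):
    print(uri)
```

Each resource carries its own URI, so a client needs only the service root to discover the rest of the tree.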

Major vendors have adopted Redfish, which is now used for overall management of modern datacenters. A new demand arose for pooling devices within servers and PCIe/CXL expansion boxes in a datacenter (resource pooling), as in the case of composable servers, and for using those devices during dynamic reconfigurations. To meet this demand, DMTF began to add the concept of “composability” to standard specifications in 2017; thus, Redfish became applicable to the control of composable servers [7].
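To illustrate what a composition request can look like, the following sketch builds a JSON body that names the resource blocks to be combined into one system, in the style of the specific composition request described in the composability white paper [7]. The function name and the resource block URIs are hypothetical, and the exact request form and target URI should be taken from [7] and from each product's documentation.

```python
import json


def specific_composition_request(name, resource_block_uris):
    """Build a JSON body that lists the resource blocks to compose."""
    return {
        "Name": name,
        "Links": {
            "ResourceBlocks": [{"@odata.id": uri} for uri in resource_block_uris]
        },
    }


body = specific_composition_request(
    "logical-server-1",
    [
        # Hypothetical resource block URIs, for illustration only.
        "/redfish/v1/CompositionService/ResourceBlocks/ComputeBlock0",
        "/redfish/v1/CompositionService/ResourceBlocks/GpuBlock3",
    ],
)
print(json.dumps(body, indent=2))
```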

Beyond interface specifications such as Redfish, OFA Sunfish is an initiative that defines a framework to make composable servers easier to operate. Sunfish is an open architecture for operating and managing composable servers. It provides a framework for connecting devices such as CPUs, memory, storage, and GPUs over the network and for flexibly combining those devices to configure a logical server. The Sunfish architecture is shown in Fig. 3.


Fig. 3. Sunfish architecture overview.

Sunfish has the following three features:

(1) Vendor-neutral

(2) Abstracted resource representation model

(3) Standards-based open management interface

We first explain the vendor-neutral feature. When each vendor product exposes its own Redfish service, the namespaces of different products may overlap. Sunfish addresses this problem by assigning unique identifiers (IDs) to products within the Sunfish management space. To support products that provide vendor-native APIs or tools, Sunfish provides a layer (Sunfish Agent) that converts these interfaces to Redfish. This design corresponds to a repository service that can manage multiple vendor products in an integrated and uniform manner.

We next explain the second feature, the abstracted resource representation model. In the Sunfish Service, Sunfish abstracts and manages information on servers, storage, and fabric configuration collected via a Sunfish Agent as a Redfish resource tree. In this way, the Sunfish Service is designed so that a system manager or management tool can pool, allocate, and reconfigure resources at the logical level without having to worry about physical matters, such as which servers have GPUs or which memory is connected to which servers.

The third feature is a standards-based open management interface. The Sunfish APIs make use of DMTF Redfish and SNIA Swordfish [8]. Because resource management and logical server configuration are performed through the RESTful APIs that these standards provide, Sunfish is designed to accommodate future extensions and the integration of heterogeneous hardware.

We now explain the Sunfish management method. The Sunfish Service integrates the Redfish resource trees of the individual devices into a single Redfish resource tree that it manages (Fig. 4). When registering with the Sunfish Service, each device is assigned a unique ID within the Sunfish management space. This mechanism prevents the namespaces of the devices' Redfish resource trees from colliding. The client can thus operate on the integrated resource tree using the IDs assigned by the Sunfish Service. The Sunfish Service also maintains an ID correspondence table, so that when an operation targets a composable server product under a Sunfish Agent, the Sunfish Service converts the ID to the namespace ID managed by that Sunfish Agent before requesting the operation.


Fig. 4. Resource integration and management by Sunfish Service.
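The ID correspondence table described above can be sketched as a two-way mapping between service-wide IDs and agent-local IDs. This is an illustration under assumptions, not the Sunfish reference implementation: the class name IdTable, its methods, and the use of UUIDs as service-wide IDs are all introduced here.

```python
import uuid


class IdTable:
    """Maps service-wide IDs to (agent name, agent-local ID) pairs."""

    def __init__(self):
        self._by_global = {}  # service-wide ID -> (agent, local ID)
        self._by_local = {}   # (agent, local ID) -> service-wide ID

    def register(self, agent, local_id):
        """Assign a service-wide ID; registering twice returns the same ID."""
        key = (agent, local_id)
        if key in self._by_local:
            return self._by_local[key]
        global_id = str(uuid.uuid4())  # assumed ID scheme
        self._by_global[global_id] = key
        self._by_local[key] = global_id
        return global_id

    def resolve(self, global_id):
        """Translate a service-wide ID back to the agent and its local ID."""
        return self._by_global[global_id]


table = IdTable()
# Two agents may both expose a device named "GPU1" without colliding.
gpu_a = table.register("agent-a", "GPU1")
gpu_b = table.register("agent-b", "GPU1")
print(table.resolve(gpu_a), table.resolve(gpu_b))
```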

Finally, we explain the flow of creating a logical server using Sunfish. For simplicity, it is assumed that information on the resources required to configure a logical server is already known.

  • The client requests the Sunfish Service to create a logical server using the namespace IDs managed by the Sunfish Service.
  • On the basis of the client’s request, the Sunfish Service requests a Sunfish Agent having the stated resources to secure those resources for configuring a logical server. At this time, the resource IDs are converted to the namespace IDs managed by the Sunfish Agent.
  • Now, based on the request from the Sunfish Service, the Sunfish Agent requests the composable server product to secure those resources.
  • The composable server product secures resources according to the Sunfish Agent’s request.
  • The Sunfish Service updates the configuration information of its own Redfish resource tree.
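The flow above can be sketched in Python as follows. The classes Product, Agent, and Service, their methods, and the "agent:local" ID scheme are hypothetical names introduced for illustration; error handling and the actual Redfish requests are omitted.

```python
class Product:
    """Hypothetical composable server product that tracks secured resources."""

    def __init__(self):
        self.secured = set()

    def secure(self, local_ids):
        # Step 4: the product secures the requested resources.
        self.secured.update(local_ids)


class Agent:
    """Hypothetical Sunfish Agent fronting one product."""

    def __init__(self, product):
        self.product = product

    def secure(self, local_ids):
        # Step 3: the agent forwards the request to the product.
        self.product.secure(local_ids)


class Service:
    """Hypothetical Sunfish Service holding the integrated view."""

    def __init__(self):
        self.agents = {}    # agent name -> Agent
        self.id_table = {}  # service ID -> (agent name, agent-local ID)
        self.tree = {}      # logical server name -> list of service IDs

    def register(self, agent_name, agent, local_id):
        service_id = f"{agent_name}:{local_id}"  # assumed ID scheme
        self.agents[agent_name] = agent
        self.id_table[service_id] = (agent_name, local_id)
        return service_id

    def create_logical_server(self, name, service_ids):
        # Step 1 is the client's call to this method with service IDs.
        # Step 2: convert service IDs to agent-local IDs, grouped by agent.
        per_agent = {}
        for service_id in service_ids:
            agent_name, local_id = self.id_table[service_id]
            per_agent.setdefault(agent_name, []).append(local_id)
        for agent_name, local_ids in per_agent.items():
            self.agents[agent_name].secure(local_ids)
        # Step 5: record the composition in the service's own tree.
        self.tree[name] = list(service_ids)


product = Product()
service = Service()
agent = Agent(product)
cpu = service.register("agent-a", agent, "CPU0")
gpu = service.register("agent-a", agent, "GPU1")
service.create_logical_server("ls-1", [cpu, gpu])
print(sorted(product.secured), service.tree["ls-1"])
```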

Work is also under way in Sunfish on reference implementations, which will include the device registration function described above. To implement the above flow for creating a logical server, however, it is still necessary to define, among other things, how the resources held by a Sunfish Agent are recognized and incorporated into the Redfish resource tree of the Sunfish Service. In terms of architecture, it is also necessary to determine the units in which Sunfish Agents should be configured.

3. Issues in achieving multiple vendor-sourced composable servers

We believe that we need to address the following two key issues to achieve multiple vendor-sourced composable servers.

3.1 Issues in achieving integration and management

As described above, we are promoting the formulation of standards toward the integration and management of multiple vendor-sourced composable servers, as in Sunfish at OFA. However, in terms of actual operation, we have come to understand through tests and trials at NTT Software Innovation Center that a number of problems remain in terms of product gaps and architecture. We introduce two major problems.

(1) Resource extraction and management method

In Sunfish, the resources of each product are integrated and managed using a Redfish resource tree in the Sunfish Service. The problem is how to configure the resource information under a Sunfish Agent, notify the Sunfish Service of that information, and integrate it. Some groups of resources depend on one another because they are physically connected, so this must also be taken into account. Targeting a fabric switch product equipped with the Redfish composability function, we are currently testing the implementation of functions required to configure resources via Sunfish, including configuration operations. However, when devices within a resource box are allocated to a server, product specifications may call for a procedure that allocates them not in units of devices but in units of ports that group those devices. Ports and devices also may not correspond one-to-one, so we became aware of the need for a resource-extraction mechanism that takes such product specifications into account (Fig. 5).


Fig. 5. Difficulty of resource management by product specifications.
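As a rough sketch of what such a resource-extraction mechanism has to compute, the following function takes a request expressed in devices and returns the ports that must be allocated, together with any devices that come along unrequested. The function name, the port-to-devices mapping, and the device names are hypothetical, and the sketch assumes each device sits behind exactly one port.

```python
def ports_for(requested, port_devices):
    """Return (ports to allocate, extra devices that come along with them)."""
    requested = set(requested)
    # Select every port that exposes at least one requested device.
    ports = sorted(
        port for port, devices in port_devices.items()
        if requested & set(devices)
    )
    covered = set()
    for port in ports:
        covered.update(port_devices[port])
    missing = requested - covered
    if missing:
        raise KeyError(f"no port exposes: {sorted(missing)}")
    return ports, sorted(covered - requested)


# Hypothetical resource box: one port groups two GPUs, another one SSD.
PORT_DEVICES = {"port0": ["gpu0", "gpu1"], "port1": ["nvme0"]}
print(ports_for(["gpu0"], PORT_DEVICES))
```

In this example, requesting only gpu0 also hands gpu1 to the same server, which is the kind of side effect the integrated resource tree would need to reflect.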

(2) Sunfish Agent constituent units

Another issue is the need to study the constituent units of a Sunfish Agent. The constituent units of a Sunfish Agent mainly fall into the following two patterns, as shown in Fig. 6.


Fig. 6. Sunfish Agent constituent units.

Pattern 1: A Sunfish Agent is configured for each host server, fabric switch, and resource box.

The advantage of this pattern is that a composable server can be configured using different vendor products of the host server, fabric switch, and resource box. The disadvantage is that, similar to the issue of resource extraction and management method, the dependent relationship due to physical connections between devices must be maintained on the Sunfish Service side, meaning this function must be implemented there.

Pattern 2: A single Sunfish Agent is configured by grouping devices having a dependent relationship.

The advantage of this pattern is that there is no need to determine the physical configuration of resources on the Sunfish Service side. Another advantage is that it is easy to detect the range of impact on products when a failure occurs since a Sunfish Agent exists for each dependent relationship. The disadvantage is that restoration work becomes complicated when a failure occurs in the Sunfish Agent. In such a case, the state of all devices managed by the Sunfish Agent must be restored in a consistent manner on the basis of those dependent relationships.

Multiple patterns therefore exist for the constituent units of Sunfish Agents, each with its own advantages and disadvantages. We have been targeting Pattern 2 from the viewpoint of the independence of the Sunfish Service and Sunfish Agent.
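One way to picture the difference between the two patterns is to treat physical connections as a graph. Pattern 1 assigns one agent per device, whereas Pattern 2 assigns one agent per connected group. The sketch below derives the Pattern 2 groups; the function name and device names are hypothetical, and treating "dependent relationship" as plain graph connectivity is a simplifying assumption.

```python
def group_by_dependency(devices, links):
    """Group devices into connected components of the physical-link graph."""
    neighbours = {device: set() for device in devices}
    for a, b in links:
        neighbours[a].add(b)
        neighbours[b].add(a)
    groups, seen = [], set()
    for start in devices:
        if start in seen:
            continue
        # Depth-first search collects everything reachable from `start`.
        stack, group = [start], []
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.append(node)
            stack.extend(neighbours[node] - seen)
        groups.append(sorted(group))
    return groups


DEVICES = ["host1", "switch1", "box1", "host2", "switch2", "box2"]
LINKS = [("host1", "switch1"), ("switch1", "box1"),
         ("host2", "switch2"), ("switch2", "box2")]

# Pattern 2: one agent per group; Pattern 1 would use one agent per device.
for group in group_by_dependency(DEVICES, LINKS):
    print(group)
```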

3.2 Toward actual operation of composable servers

Finally, we share the issues that must be addressed toward actual operation as obtained from tests and trials of composable-server functions.

(1) Time required for creating and reconfiguring a logical server

The time required to create and reconfigure a logical server can be an issue in actual operation. Since restarting a server takes time, it is necessary to complete operations such as allocating and releasing resources without restarting the server to the extent possible. Some vendors provide a dynamic reconfiguration function that does not require a server restart when creating and reconfiguring a logical server.

(2) Mechanism for managing/updating physical configuration information

Sunfish requires the physical connection configuration between a server and resource box to abstract resources. It is therefore important that Sunfish be able to obtain such physical connection information from a composable server. If it cannot, physical connection information will have to be managed manually, which drives up infrastructure operating costs and introduces the risk of failures caused by human operational errors. Because composable servers already make the overall system more complex, a mechanism to manage and update physical configuration information is all the more necessary. To achieve multiple vendor-sourced composable servers, we are studying an approach that encourages composable server vendors to equip their products with this functionality.

(3) Increase in cable-connecting work between devices

Depending on the number of lanes, PCIe expansion boxes and PCIe fabric switches typically require many PCIe cables. Cable thickness can also differ greatly between generations. In the fifth generation, for example, cables were not only thick but also short, which made cable handling somewhat difficult. Cable-length limitations also made it necessary to consider physical racking positions, including those of the servers to be connected. A PCIe expansion box may become unusable if, for example, the GPU auxiliary power cable is not the one provided by the vendor, and boxes have been known to fail when used with a commercially available part instead of the vendor-provided one. In short, there are many issues to consider in actual operations.

(4) Improved availability

A composable server generally consists of three components: server, fabric switch, and resource box. Improved availability of fabric switches and resource boxes is therefore necessary, and the question of how to achieve that is an important issue. For example, if resource boxes and the devices they incorporate take on a simple active/standby configuration in units of resource boxes, standby resources will be left unused, which runs contrary to the original purpose of composable servers, which is to use resources efficiently.

(5) Need for a configuration that takes lower layers into account

When configuring a logical server, it is desirable not simply to select resources from resource pools but to select them with as much awareness of the lower layers as possible, taking into account the workloads that will run on those resources. For example, when using NVIDIA’s GPUDirect RDMA offload function, whether resources are selected with awareness of PCIe switches and root complexes can affect workload performance and data-processing efficiency.
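A minimal sketch of such lower-layer-aware selection is shown below: it prefers a GPU and a network interface card (NIC) attached to the same PCIe switch and falls back to any available pair otherwise. The function name, the device-to-switch mappings, and the device names are hypothetical; a real selector would obtain the topology from the composable server product.

```python
def pick_pair(gpus, nics):
    """Prefer a GPU and NIC under the same PCIe switch; else take any pair.

    `gpus` and `nics` map a device name to the PCIe switch it is attached to.
    Returns (gpu, nic, same_switch), or None if either pool is empty.
    """
    for gpu, gpu_switch in gpus.items():
        for nic, nic_switch in nics.items():
            if gpu_switch == nic_switch:
                return gpu, nic, True
    if gpus and nics:
        # No co-located pair exists, so fall back to the first of each pool.
        return next(iter(gpus)), next(iter(nics)), False
    return None


GPUS = {"gpu0": "sw-a", "gpu1": "sw-b"}  # hypothetical topology
NICS = {"nic0": "sw-b"}
print(pick_pair(GPUS, NICS))
```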

(6) Secure configuration operations

There is concern that a system will halt if a composable server exhibits an inconsistency in its state management. It is therefore vitally important to have thorough transaction management to implement multiple vendor-sourced composable servers.
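One common way to keep state consistent across several configuration operations is to pair each operation with a compensating action and undo the completed ones when a later one fails. The sketch below illustrates that general pattern only; the function name and the example operations are hypothetical and are not taken from Sunfish or any vendor product.

```python
def apply_all(steps):
    """Run (do, undo) pairs in order; undo completed steps if one fails."""
    done = []
    try:
        for do, undo in steps:
            do()
            done.append(undo)
    except Exception:
        # Roll back in reverse order so dependent state is released first.
        for undo in reversed(done):
            undo()
        return False
    return True


attached = set()  # hypothetical record of devices attached to a server


def failing_step():
    raise RuntimeError("fabric switch rejected the request")


ok = apply_all([
    (lambda: attached.add("gpu0"), lambda: attached.discard("gpu0")),
    (failing_step, lambda: None),
])
print(ok, attached)
```

Here the second operation fails, so the first is undone and no partially configured state is left behind.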

4. Future developments

As an initiative toward multiple vendor-sourced composable servers in DCI technology development, we introduced Sunfish, which is currently considered a powerful framework for integration and management. We described its design and pointed out issues in the actual operation of composable servers.

Going forward, we will boost our research and development efforts toward DCI-2 [9], which is scheduled for commercialization in FY2026. Specifically, we plan to integrate multiple vendor-sourced composable servers into the DCI-2 system while incorporating the design philosophy of Sunfish. We will also propose functions that are lacking in the current Sunfish specification and its reference implementations to OFA Sunfish and promote the integration and management standard for multiple vendor-sourced infrastructure. On top of that, we will work to enable operations that link multiple vendor-sourced composable servers and a container infrastructure. For example, now that Kubernetes, the de facto standard container infrastructure, provides a dynamic resource allocation function, work has begun on integrating with this function to dynamically allocate devices within the resource box of a composable server to a worker node [10]. NTT is also participating in this effort. Finally, with the aim of expanding DCI’s operational functions, we plan to develop technologies for mobility use cases that require AI and video processing, as well as other use cases, while pursuing technology development for DCI controller software and proposing system references to the IOWN Global Forum.

References

[1] K. Ninokata and C. Schumacher, “Data-centric Infrastructure for Enabling Practical Use of IOWN,” NTT Technical Review, Vol. 23, No. 7, pp. 36–42, July 2025.
https://doi.org/10.53829/ntr202507fa4
[2] J. Oka, X. Shi, S. Mizuno, S. Suzuki, Y. Nakazawa, and M. Takagi, “A Pavilion Clad in Emotions: Harmonized Communication Experiences between People and Objects,” NTT Technical Review, Vol. 23, No. 10, pp. 45–50, Oct. 2025.
https://doi.org/10.53829/ntr202510fa5
[3] DMTF,
https://www.dmtf.org/
[4] Redfish,
https://www.dmtf.org/standards/redfish
[5] OFA,
https://www.openfabrics.org/
[6] Sunfish Framework,
https://www.openfabrics.org/openfabrics-management-framework/
[7] Redfish Composability White Paper (DSP2050),
https://www.dmtf.org/dsp/DSP2050
[8] SNIA Swordfish,
https://www.snia.org/forums/smi/swordfish
[9] S. Kinoshita, “IOWN INTEGRAL,” NTT Technical Review, Vol. 23, No. 3, pp. 26–35, Mar. 2025.
https://doi.org/10.53829/ntr202503fa2
[10] CoHDI (Composable Hardware in Disaggregated Infrastructure),
https://github.com/CoHDI

Kensuke Koda
Senior Research Engineer, System Software Project, Software Innovation Center, NTT, Inc.
He received a B.E. and M.E. in engineering science from Osaka University in 2011 and 2013. He has been with NTT since 2023, and his research interests include the development of composable disaggregated infrastructure using multiple vendor-sourced products and the implementation of integrated system software.

Kazuo Ninokata
Director, System Software Project, Software Innovation Center, NTT, Inc.
He received a B.E. in mechanical engineering from Osaka University in 2001 and an M.S. in information science from Nara Institute of Science and Technology in 2003. He has been with NTT since 2003, and his research interests include DCI and the development of service architecture with AI and composable disaggregated computing.
