- The paper presents Apollo, the first large-scale deployment of optical circuit switching in datacenters, demonstrating significant cost, power, and latency improvements.
- It details the innovative Palomar design—a 3D MEMS-based switch offering millisecond-scale switching with 136x136 non-blocking ports and low insertion loss.
- Apollo combines wavelength-division-multiplexed (WDM) transceivers with circulators for bidirectional links, which enhances bandwidth flexibility for the demanding ML workloads of modern datacenters.
An Overview of Apollo: Implementing Optical Circuit Switching at Datacenter Scale
The paper "Mission Apollo: Landing Optical Circuit Switching at Datacenter Scale" describes Apollo, which the authors present as the first large-scale deployment of optical circuit switches (OCSes) in datacenter networks. The authors, from Google LLC, cover the motivation, design, and implementation of the system, as well as the future implications of integrating OCS technology into hyperscale datacenters.
The move to optical circuit switching is motivated by rapidly evolving workloads such as machine learning and by the limitations of networks built only from electrical packet switches (EPSes). The main advantages of OCSes are substantial cost and power reductions, lower latency, independence from data rate and wavelength, and greater flexibility in managing network topology. An OCS steers light directly along a path from source to destination, with no packet processing in between, which is what makes it agnostic to data rate and wavelength.
Design and Implementation Details
Apollo's design principles center on manufacturability, serviceability, and reliability. At the core of the architecture is a 3D MEMS-based OCS known internally as Palomar. Palomar uses a single camera image to control multiple MEMS mirrors that steer the optical signals, which significantly simplifies the system compared to conventional designs. The Palomar OCS provides millisecond-scale switching across 136x136 non-blocking ports, with insertion loss below 2 dB and return loss of about -38 dB.
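To make "non-blocking" concrete, the following is a minimal sketch, not the paper's control software: it models an OCS as a one-to-one mapping between input and output ports, where any free input can be connected to any free output regardless of existing circuits. The class name `OpticalCircuitSwitch` and its methods are hypothetical; only the 136-port size comes from the paper.

```python
from typing import Dict, Optional


class OpticalCircuitSwitch:
    """Toy model of an N x N non-blocking OCS (hypothetical API, for illustration)."""

    def __init__(self, num_ports: int = 136) -> None:
        self.num_ports = num_ports
        self._in_to_out: Dict[int, int] = {}  # active cross-connects
        self._out_to_in: Dict[int, int] = {}  # reverse index, to detect busy outputs

    def connect(self, in_port: int, out_port: int) -> None:
        """Create a circuit; fails only if one of the two ports is already in use."""
        for port in (in_port, out_port):
            if not 0 <= port < self.num_ports:
                raise ValueError(f"port {port} out of range")
        if in_port in self._in_to_out:
            raise ValueError(f"input {in_port} already connected")
        if out_port in self._out_to_in:
            raise ValueError(f"output {out_port} already connected")
        self._in_to_out[in_port] = out_port
        self._out_to_in[out_port] = in_port

    def disconnect(self, in_port: int) -> None:
        """Tear down the circuit that starts at in_port."""
        out_port = self._in_to_out.pop(in_port)
        del self._out_to_in[out_port]

    def route(self, in_port: int) -> Optional[int]:
        """Return the output port light from in_port is steered to, if any."""
        return self._in_to_out.get(in_port)


ocs = OpticalCircuitSwitch(136)
ocs.connect(0, 135)
print(ocs.route(0))  # 135
```

The model has no notion of packets or data rate: a circuit is just a pairing of ports, which mirrors why the same switch can serve several transceiver generations.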
Complementing the OCS platform, Apollo incorporates wavelength-division-multiplexed (WDM) transceivers and circulators that enable bidirectional communication over a single fiber strand, effectively doubling the OCS port radix and making better use of the fiber infrastructure. The paper emphasizes the development of WDM technology across multiple interconnect generations (40, 100, 200, and 400 GbE) and its compatibility with bidirectional operation. Co-design of the transceiver technology kept both cost and power consumption low, while advanced digital signal processing mitigates interference.
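The radix-doubling claim can be checked with simple arithmetic. The sketch below assumes a simplified model that is not spelled out in the text above: without circulators, a duplex link needs one cross-connect per direction (two in total), while with circulators both directions share one fiber and therefore one cross-connect. The function name is hypothetical; the figure of 136 cross-connects is the Palomar port count.

```python
def duplex_links_per_ocs(cross_connects: int, bidirectional: bool) -> int:
    """Number of full-duplex links one OCS can carry under the assumed model."""
    # Assumption: one fiber (one cross-connect) per link with circulators,
    # two fibers (separate transmit and receive paths) without them.
    fibers_per_link = 1 if bidirectional else 2
    return cross_connects // fibers_per_link


print(duplex_links_per_ocs(136, bidirectional=False))  # 68
print(duplex_links_per_ocs(136, bidirectional=True))   # 136
```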
Practical Implications and Future Directions
Through the Apollo system, Google achieves significant improvements in network cost, power consumption, and flexibility of bandwidth provisioning. By eliminating the Spine layer, reducing reliance on power-hungry EPSes, and providing high-bandwidth, low-latency paths, the Apollo architecture fits the needs of contemporary datacenter operations, especially ML training, which demands predictable, high-throughput interconnects.
For future work, the authors envision extending optical switching to further layers of the network. They propose hardware enhancements that increase OCS port count for better scalability and configuration flexibility, and that switch faster while maintaining reliability.
This direction points toward datacenter architectures dominated by optical switching, with more adaptive and efficient networking both within datacenters and across broader network hierarchies. Continued research into optical switching technologies is an important step in that longer-term evolution.
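One way to picture a network without a Spine layer is a direct mesh, in which the blocks that would otherwise attach to the Spine are wired to each other through OCS circuits. The sketch below is an illustrative assumption, not the paper's provisioning algorithm: it splits each block's uplinks evenly across all other blocks. The names `uniform_mesh`, `num_blocks`, and `uplinks_per_block` are hypothetical.

```python
from itertools import combinations
from typing import Dict, Tuple


def uniform_mesh(num_blocks: int, uplinks_per_block: int) -> Tuple[Dict[Tuple[int, int], int], int]:
    """Evenly spread each block's uplinks over direct links to every other block.

    Returns (links per block pair, uplinks left unused on each block).
    """
    if num_blocks < 2:
        raise ValueError("a mesh needs at least two blocks")
    # Every block has (num_blocks - 1) peers; give each peer an equal share.
    links_per_pair, spare = divmod(uplinks_per_block, num_blocks - 1)
    mesh = {pair: links_per_pair for pair in combinations(range(num_blocks), 2)}
    return mesh, spare


mesh, spare = uniform_mesh(num_blocks=4, uplinks_per_block=12)
print(mesh[(0, 1)], spare)  # 4 0
```

Because the circuits are set in the OCS rather than in fixed cabling, a non-uniform split (more links between blocks that exchange more traffic) would only change the mapping, not the physical fiber plant.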
In essence, the paper contributes substantially to the exploration of optical circuit switching as an integral component of future datacenter networks, highlighting both the technological challenges and immense opportunities that accompany the transition from traditional EPS to OCS-based frameworks.