Coherent Co-Packaged Optics (CPO)
- Coherent CPO is a cutting-edge optical interconnect technology that integrates coherent transceivers with high-speed ASICs to deliver Tb/s-scale, low-latency communications.
- It leverages advanced modulation formats like QPSK and QAM with silicon photonics and DSPs to achieve high spectral efficiency and energy performance.
- High-density integration and precise packaging enable multi-wavelength optical channels with reduced insertion loss, supporting disaggregated AI and scalable data center systems.
Coherent co-packaged optics (CPO) refers to the integration of coherent optical transceivers—capable of both amplitude and phase encoding—directly alongside high-speed digital ASICs at the package or module level. By leveraging silicon photonic devices and sophisticated electronic DSP engines, coherent CPO provides Tb/s-scale, spectrally efficient, and ultra-low-latency communication, facilitating the disaggregation of memory and compute in next-generation AI and data center systems. This technology targets orders-of-magnitude improvements in bandwidth density and per-bit energy consumption compared to conventional pluggable optics, addressing interconnect bottlenecks in AI/ML scaling, high-radix switching, and petabit-class system fabrics.
1. Fundamental Principles of Coherent Co-Packaged Optics
Coherent CPO leverages both the amplitude and phase of the optical carrier by employing IQ-modulation formats such as QPSK and QAM (16-QAM, 64-QAM), often with dual-polarization multiplexing for increased spectral efficiency. At the receiver, a coherent architecture based on a 90° optical hybrid mixes the incoming signal with a local oscillator (LO), yielding in-phase and quadrature (I/Q) photocurrents. This configuration enables the recovery of both amplitude and phase, which is essential for high-order modulation and for digital compensation of impairments such as chromatic dispersion and polarization mode dispersion (Moazeni, 2023).
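The mixing step described above can be illustrated with a minimal sketch of an ideal 90° hybrid followed by balanced detection. It assumes a lossless hybrid, unit photodiode responsivity, and no noise; the function name `hybrid_balanced_detect` is hypothetical and not taken from any cited work.

```python
import cmath
import math


def hybrid_balanced_detect(e_sig: complex, e_lo: complex) -> tuple:
    """Return (I, Q) balanced photocurrents for an idealized 90-degree hybrid.

    Assumptions (illustrative only): lossless hybrid, unit responsivity,
    no shot or thermal noise, signal and LO in the same polarization.
    """
    # Four output ports: signal combined with the LO at 0, 180, 90, 270 degrees.
    p0 = abs((e_sig + e_lo) / 2) ** 2
    p180 = abs((e_sig - e_lo) / 2) ** 2
    p90 = abs((e_sig + 1j * e_lo) / 2) ** 2
    p270 = abs((e_sig - 1j * e_lo) / 2) ** 2
    # Balanced detection cancels |e_sig|^2 and |e_lo|^2, leaving the beat terms:
    # I = Re(e_sig * conj(e_lo)), Q = Im(e_sig * conj(e_lo)).
    return p0 - p180, p90 - p270


# Example: a QPSK symbol at phase 3*pi/4 mixed with a unit-amplitude, zero-phase LO.
symbol = 0.1 * cmath.exp(1j * 3 * math.pi / 4)
i_cur, q_cur = hybrid_balanced_detect(symbol, 1.0 + 0j)
recovered = complex(i_cur, q_cur)  # equals the transmitted field for this ideal LO
```

With a unit, zero-phase LO the pair (I, Q) reproduces the complex field directly; in practice the LO has a frequency and phase offset, which is why the receiver DSP still has to track carrier phase.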
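Spectral efficiency for an $M$-ary format is given by $\mathrm{SE} = \log_2 M$ in [bit/s/Hz], and for dual polarization, $\mathrm{SE}_{\mathrm{DP}} = 2\log_2 M$. Example: 16-QAM ($M = 16$) yields 4 bit/s/Hz per polarization, or 8 bit/s/Hz with dual polarization.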
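As a quick check of the arithmetic, the sketch below computes the ideal spectral efficiency and the corresponding raw line rate. It assumes the usual idealization of log2(M) bits per symbol per polarization with Nyquist signaling and no FEC or framing overhead; the function names and the 64 GBd symbol rate are illustrative assumptions.

```python
import math


def spectral_efficiency(m: int, dual_pol: bool = False) -> float:
    """Ideal spectral efficiency in bit/s/Hz for an M-ary format."""
    bits_per_symbol = math.log2(m)
    return 2 * bits_per_symbol if dual_pol else bits_per_symbol


def raw_line_rate_gbps(m: int, symbol_rate_gbd: float, dual_pol: bool = False) -> float:
    """Raw (pre-FEC, overhead-free) line rate in Gb/s at a given symbol rate."""
    return spectral_efficiency(m, dual_pol) * symbol_rate_gbd


se_16qam = spectral_efficiency(16)                     # 4.0 bit/s/Hz
se_dp_16qam = spectral_efficiency(16, dual_pol=True)   # 8.0 bit/s/Hz
# Hypothetical operating point: dual-polarization 16-QAM at 64 GBd.
rate = raw_line_rate_gbps(16, 64.0, dual_pol=True)     # 512.0 Gb/s
```

Net throughput is lower than the raw figure once FEC and framing overhead are subtracted.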
Laser phase noise (linewidth $\Delta\nu$) introduces phase variance over a given symbol duration $T_s$, as $\sigma_\phi^2 = 2\pi \Delta\nu T_s$. This requires robust digital carrier phase estimation (CPE) algorithms in the receiver DSP to keep residual phase error within operational margins.
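To give a sense of scale, the following sketch evaluates the standard Wiener phase-noise model, in which the phase variance grows as 2πΔν·T per elapsed time T. The 100 kHz linewidth and 32 GBd symbol rate are assumed example values, not figures from the cited works, and `phase_noise_variance` is a hypothetical helper name.

```python
import math


def phase_noise_variance(linewidth_hz: float, symbol_rate_baud: float,
                         n_symbols: int = 1) -> float:
    """Phase variance (rad^2) accumulated over n_symbols symbol periods.

    Uses the Wiener model sigma^2 = 2*pi*linewidth*T. For a coherent link,
    linewidth_hz would typically be the combined transmitter + LO linewidth
    (an assumption of this sketch).
    """
    symbol_duration_s = 1.0 / symbol_rate_baud
    return 2.0 * math.pi * linewidth_hz * symbol_duration_s * n_symbols


# Assumed example: 100 kHz linewidth at 32 GBd.
var_one_symbol = phase_noise_variance(100e3, 32e9)          # about 1.96e-5 rad^2
var_100_symbols = phase_noise_variance(100e3, 32e9, 100)    # about 1.96e-3 rad^2
std_one_symbol = math.sqrt(var_one_symbol)                  # about 4.4e-3 rad
```

The per-symbol drift is small, but it accumulates linearly with the averaging window, which bounds how many symbols a CPE algorithm can average over before phase tracking degrades.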
2. High-Density Integration, Bandwidth Density, and Shoreline Scaling
A central metric for CPO is shoreline (beachfront) bandwidth density: the aggregate throughput per mm of chip edge interfacing directly with optical I/O. IBM's OTV-1 CPO modules employ a silicon-photonic die (PIC) flip-chip bonded to an ASIC, with optical I/O fanned out through a molded polymer waveguide array at 50 μm pitch, far finer than conventional fiber-array pitches (Knickerbocker et al., 2024). This enables:
- 20 fibers/mm at 50 μm pitch, versus ≈8 fibers/mm for standard pluggables.
- Effective beachfront density $N_\lambda / p$ (optical channels per unit edge length), where $N_\lambda$ is WDM channel count and $p$ is pitch.
- Bandwidth density 5, e.g., 6, scaling to 7 at 8 pitch.
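The density figures in the list above follow from simple geometry, sketched below. Fiber count per mm is the reciprocal of pitch; the bandwidth density additionally assumes each fiber carries a number of WDM channels at a fixed per-channel rate. The 8 wavelengths and 100 Gb/s per wavelength are hypothetical inputs chosen only to show the calculation, and both function names are illustrative.

```python
def fibers_per_mm(pitch_um: float) -> float:
    """Number of waveguides or fibers that fit along 1 mm of chip edge."""
    return 1000.0 / pitch_um


def shoreline_density_tbps_per_mm(pitch_um: float, n_wavelengths: int,
                                  rate_per_wavelength_gbps: float) -> float:
    """Aggregate edge bandwidth density in Tb/s per mm.

    Assumes every fiber carries n_wavelengths channels at the same rate
    (an idealization; real modules reserve some positions for other uses).
    """
    total_gbps = fibers_per_mm(pitch_um) * n_wavelengths * rate_per_wavelength_gbps
    return total_gbps / 1000.0


dense = fibers_per_mm(50.0)     # 20.0 fibers/mm at 50 um pitch
coarse = fibers_per_mm(125.0)   # 8.0 fibers/mm at an assumed 125 um pitch
# Hypothetical channel plan: 8 wavelengths per fiber at 100 Gb/s each.
density = shoreline_density_tbps_per_mm(50.0, 8, 100.0)  # 16.0 Tb/s/mm
```

Halving the pitch doubles the density for a fixed channel plan, which is why pitch scaling and WDM channel count are treated as the two main levers.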
Insertion loss per channel is kept to 9 dB, with best channels at 0 dB, and crosstalk 1 dB at 2 channel spacing (Knickerbocker et al., 2024).
Vertical integration is addressed by high-density evanescent couplers, such as overlapping inverse double-taper structures enabling 3 insertion loss, 4 bandwidth, and 5 lateral and 6 vertical 7 alignment tolerance (Weninger et al., 2022).
3. Modulation Formats, Transceiver Architectures, and Building Blocks
High-order modulation and multiplexing are core to CPO performance. Microring-assisted Mach-Zehnder modulators (MRA-MZMs) and ring-assisted Mach-Zehnder interferometer (RAMZI) transmitters, both built on microring modulators (MRMs), are enabling technologies:
- Microring-assisted MZMs (MRA-MZMs) demonstrate 8, 9, and energy-per-bit 0 for QPSK and 1 for 16-QAM (Geravand et al., 24 Sep 2025).
- C2PO (Coherent Co-packaged Optics using offset-QAM-16) transmitters employ phase-constant amplitude modulation via RAMZI structures, enabling 400 Gb/s per wavelength at 9.65 dBm laser power, with 10–100× less photonic area than conventional MZI-based QAM transmitters, and DSP-free carrier phase recovery due to the constant-envelope offset-QAM constellation (Sturm et al., 13 Jun 2025).
- QD frequency comb lasers with mode-locked architectures deliver 4 lines at 100 GHz spacing, enabling 5 Tb/s over a single fiber and projected 6 Tb/s with polarization multiplexing and extended WDM (Geravand et al., 24 Sep 2025).
The integration of modulators, balanced photodiodes, advanced DSP engines (for equalization, CPE, and soft-decision FEC), and high-speed electrical drivers is achieved within 7 mm electrical paths, minimizing signal degradation and I/O energy.
4. Packaging, Coupling, and Reliability Engineering
Co-packaged modules adopt advanced assembly approaches for thermal and mechanical stability, electrical and optical co-integration, and minimized loss:
- Polymer waveguide fan-outs route optical channels from the PIC edge to standard SMF arrays via adiabatic tapers, achieving low-loss, high-reliability attachment (Knickerbocker et al., 2024).
- Thermal integrity is ensured via co-integration with thermoelectric coolers and heat spreaders to stabilize modulator resonances under adjacent ASIC power densities. Resonance shifts on silicon photonics (8) necessitate active feedback control for high-order WDM and dense MRM arrays.
- All OTV-1 modules passed rigorous JEDEC standards (thermal cycling, damp-heat, high/low temp storage) with channel IL shifts 9 dB after stress (Knickerbocker et al., 2024).
- Vertical couplers facilitate multi-tier stacking within co-packaged designs, enabling substantially higher port counts via fine I/O pitch (0) and tolerance for passive assembly (Weninger et al., 2022).
Key trade-offs involve alignment (sub-1 for <20 μm pitch), crosstalk, and thermal expansion management (Knickerbocker et al., 2024).
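The active feedback control mentioned for resonance stabilization can be sketched as a simple integral loop that trims a per-ring heater until the resonance sits at the target wavelength. Everything numeric here is an assumption for illustration: the thermal shift of 0.08 nm/K, the heater efficiency of 0.2 nm/mW, the loop gain, and the temperature step are not values from the cited works, and `simulate_ring_lock` is a hypothetical name.

```python
def simulate_ring_lock(ambient_temps_k, target_nm=1550.0, cold_resonance_nm=1549.0,
                       ref_temp_k=300.0, shift_nm_per_k=0.08,
                       heater_nm_per_mw=0.2, loop_gain=0.5):
    """Simulate an integral heater loop locking a microring to target_nm.

    Returns (errors_nm, final_heater_mw). All default parameters are
    assumed, illustrative values rather than measured device data.
    """
    heater_mw = 0.0
    errors_nm = []
    for temp_k in ambient_temps_k:
        # Resonance red-shifts with both ambient temperature and heater power.
        resonance_nm = (cold_resonance_nm
                        + shift_nm_per_k * (temp_k - ref_temp_k)
                        + heater_nm_per_mw * heater_mw)
        error_nm = target_nm - resonance_nm
        errors_nm.append(error_nm)
        # Integral update; the heater can only add heat, so clamp at zero.
        heater_mw = max(0.0, heater_mw + loop_gain * error_nm / heater_nm_per_mw)
    return errors_nm, heater_mw


# Assumed scenario: the neighboring ASIC warms the ring by 5 K halfway through.
temps = [300.0] * 20 + [305.0] * 20
errors, heater = simulate_ring_lock(temps)
```

In this model the loop backs off the heater when the ambient temperature rises, so the ring needs enough heater bias headroom to absorb the largest expected temperature swing. Real controllers also have to derive the error signal from a monitor photodiode rather than reading the resonance directly.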
5. Energy Efficiency, Performance Metrics, and Latency
CPO targets sub-1 pJ/bit energy budgets, enabled by:
- Short electrical link lengths across the chip/package interface, eliminating the dominant copper PCB trace losses of traditional board-level optics (Knickerbocker et al., 2024, Moazeni, 2023).
- Photonic devices (MRMs, MRA-MZMs) with 2 modulation energy (Geravand et al., 24 Sep 2025, Sturm et al., 13 Jun 2025).
- Shared local oscillator sources for multiple coherent channels, exploiting WDM and polarization multiplexing to maximize per-port throughput.
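The significance of a per-bit energy target becomes clearer when converted to interconnect power. The sketch below performs that conversion; the 10 Tb/s aggregate rate is an assumed example, and `interconnect_power_w` is a hypothetical helper.

```python
def interconnect_power_w(energy_pj_per_bit: float, throughput_tbps: float) -> float:
    """Optical I/O power in watts for a given energy per bit and throughput.

    1 pJ/bit * 1 Tb/s = 1e-12 J/bit * 1e12 bit/s = 1 W, so the two unit
    prefixes cancel and the product is already in watts.
    """
    return energy_pj_per_bit * throughput_tbps


# Assumed example: 10 Tb/s of off-package bandwidth.
power_at_target = interconnect_power_w(1.0, 10.0)   # 10.0 W at 1 pJ/bit
power_at_half = interconnect_power_w(0.5, 10.0)     # 5.0 W at 0.5 pJ/bit
```

Because power scales linearly with both factors, any increase in per-package bandwidth must be matched by a proportional reduction in energy per bit to hold the I/O power budget constant.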
Performance metrics for typical CPO systems:
| Metric | Near-Term Value | Next-Gen Value |
|---|---|---|
| Bandwidth/lane | 400 Gb/s (16-QAM @32GBd) | 800 Gb/s–1 Tb/s (64-QAM @64–75GBd) |
| Beachfront density | 6× vs. pluggables (~10 Tb/s/mm target) | >10 Tb/s/mm |
| Energy/bit | Sub-1 pJ/bit | 3 pJ/bit projected |
| Pre-FEC BER | 4 | Post-FEC 5 |
| Latency | 6 ns (DSP, <1 m opt) | total 7 ns/hop |
The combination of high parallel optical port count, low insertion loss, and low electrical and optical path latency is critical to meeting synchronous AI workload requirements, such as maintaining round-trip latencies 8s to prevent compute starvation in distributed learning (Moazeni, 2023).
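A rough per-hop latency budget can be assembled from the fiber time of flight plus the electronic processing delay. The sketch below assumes a group index of about 1.468 for standard single-mode fiber near 1550 nm; the 10 ns DSP delay in the example is a placeholder rather than a figure from the cited works, and the function names are illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def fiber_delay_ns(length_m: float, group_index: float = 1.468) -> float:
    """One-way propagation delay in ns through fiber of the given length.

    group_index = 1.468 is an assumed typical value for standard SMF.
    """
    return length_m * group_index / SPEED_OF_LIGHT_M_PER_S * 1e9


def hop_latency_ns(length_m: float, dsp_delay_ns: float) -> float:
    """Total one-way latency per hop: fiber flight time plus DSP delay."""
    return fiber_delay_ns(length_m) + dsp_delay_ns


one_meter = fiber_delay_ns(1.0)            # roughly 4.9 ns per meter of fiber
# Placeholder DSP delay of 10 ns, chosen only to illustrate the sum.
hop = hop_latency_ns(1.0, dsp_delay_ns=10.0)
```

At in-rack distances the fiber contributes only a few nanoseconds, so DSP and FEC processing tend to dominate the budget, which is one motivation for the reduced-DSP schemes discussed elsewhere in this article.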
6. System-Level Implications: Disaggregated AI and Petascale Interconnects
Coherent CPO enables system architectures previously precluded by copper or intensity-modulation optics:
- Disaggregated memory and compute: Direct RDMA over coherent links, bypassing local HBM and reducing software overhead by 9s per transaction (Moazeni, 2023).
- High-radix circuit-switched topologies: Reconfigurable fat-tree or mesh networks with microsecond reconfiguration, each node equipped with multiple CPO ports (Moazeni, 2023).
- Scaling laws: For 0 accelerators and 1 memory pools, required port count per device 2 and package edge density 3 Tb/s/mm to yield aggregate 4 Tb/s per GPU.
- AI training: Fivefold improvement in throughput and elimination of inter-GPU link stalls when using CPO over copper or board-level optics (Knickerbocker et al., 2024).
A plausible implication is that CPO-based fabrics will become indispensable for memory-coherent exascale AI systems as the marginal performance benefit from transistor scaling further diminishes.
7. Outlook, Challenges, and Trade-Offs
While CPO demonstrates substantial readiness for deployment, several challenges must be addressed:
- Resonator stability and wavelength control become complex at high WDM channel counts; per-ring feedback and temperature control are essential (Geravand et al., 24 Sep 2025).
- Thermal crosstalk in densely packed modulators (MRMs, MRA-MZMs) requires isolation strategies (Geravand et al., 24 Sep 2025).
- Extending fine-pitch assembly and beachfront density to 5m pitch mandates alignment tolerances below standard passive assembly processes (Knickerbocker et al., 2024).
- Insertion loss of packaging (fiber coupling, vertical interconnects) must approach 6 dB/facet for energy budgets and system margin (Knickerbocker et al., 2024, Weninger et al., 2022).
- Advanced modulation schemes (offset-QAM, DSP-free CPE) can further reduce per-bit DSP energy and circuit complexity while preserving robustness (Sturm et al., 13 Jun 2025).
Current research trajectories include comb-driven multi-Tb/s transmitters (Geravand et al., 24 Sep 2025), further MRM/RAMZI integration (Sturm et al., 13 Jun 2025), and scaling vertical coupler density (Weninger et al., 2022), outlining a roadmap to package-scale, petabit-class coherent optical interconnects.
References:
- (Moazeni, 2023) Next-generation Co-Packaged Optics for Future Disaggregated AI Systems
- (Knickerbocker et al., 2024) Next generation Co-Packaged Optics Technology to Train & Run Generative AI Models in Data Centers and Other Computing Applications
- (Geravand et al., 24 Sep 2025) Comb-Driven Coherent Optical Transmitter for Scalable DWDM Interconnects
- (Weninger et al., 2022) High Density Vertical Optical Interconnects for Passive Assembly
- (Sturm et al., 13 Jun 2025) C2PO: Coherent Co-packaged Optics using offset-QAM-16 for Beyond PAM-4 Optical I/O