
Presenters:
- Forward-Looking Technologies for AI/ML Datacenter Clusters
Presenter: Katharine Schmidtke, Eribel Systems LLC, United States
- LPO Technology: System Integration Insights, Progress, and Challenges
Presenter: Yi Tang, Cisco Systems Inc., United States
- Translating AI/ML System Architecture Into Optical Requirements
Presenter: Craig Thompson, NVIDIA Corp., United States
- Silicon Photonics and Advanced 3-D Assembly for Short-Reach Optical Interconnects
Presenter: Joris Van Campenhout, IMEC, Belgium
Introduction: The rapid advancement of Artificial Intelligence (AI) and Machine Learning (ML) is fundamentally reshaping datacenter network architectures. Presentations at the OFC Summit on Optics for AI Datacenters highlighted a clear trajectory driven by the insatiable demand for data. Three interconnected themes emerged: the sheer scale of future connectivity needs, the ongoing shift from traditional pluggable optics to Co-Packaged Optics (CPO), and the longer-term evolution towards highly integrated Optical I/O (OIO).
- The Exploding Need for Connectivity and Bandwidth in AI Clusters
The core driver for optical innovation is the massive growth in AI model complexity and size. Performance scales with compute power, leading to future AI clusters potentially connecting hundreds of thousands of processing units (xPUs) [Ref: Eribel]. This isn’t just about external (North-South) traffic; the critical bottleneck is increasingly the East-West traffic within the AI fabric itself, handling the intense communication between processing elements during training and inference [Ref: NVIDIA]. This internal traffic is sensitive to latency and jitter, and it requires enormous, reliable capacity [Ref: NVIDIA].
Consequently, the demand for optical connectivity bandwidth is growing exponentially, pushing speeds from 400G towards 800G, 1.6T, and likely beyond [Ref: ECOC Market Focus, NVIDIA]. Meeting this demand involves complex trade-offs between increasing the number of fibers, leveraging Dense Wavelength Division Multiplexing (DWDM), and boosting data rates per optical lane (e.g., towards 200 Gbps/lane) [Ref: Eribel]. This scaling introduces significant challenges, including managing network congestion, handling traffic bursts, accommodating diverse AI architectures, and critically, reducing the cost per bit for these high-bandwidth solutions [Ref: Eribel]. Furthermore, the sheer volume of optical connections needed deep within AI systems (for scale-up and scale-out) necessitates advancements in high-volume, cost-effective manufacturing processes [Ref: Eribel].
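As a rough illustration of the trade-off described above, one-direction link bandwidth can be approximated as fibers × wavelengths per fiber × data rate per lane. The sketch below is a minimal Python example; the function name and the three configurations are assumptions chosen for illustration, not figures from the presentations.

```python
def aggregate_gbps(fibers: int, wavelengths_per_fiber: int, gbps_per_lane: float) -> float:
    """One-direction bandwidth in Gbps = fibers x wavelengths per fiber x lane rate."""
    return fibers * wavelengths_per_fiber * gbps_per_lane


# Three hypothetical ways to reach 1.6 Tbps in one direction.
# The fiber and wavelength counts are illustrative assumptions.
options = {
    "8 fibers, 1 wavelength, 200G lanes": aggregate_gbps(8, 1, 200),
    "2 fibers, 4 wavelengths, 200G lanes": aggregate_gbps(2, 4, 200),
    "16 fibers, 1 wavelength, 100G lanes": aggregate_gbps(16, 1, 100),
}

for name, gbps in options.items():
    print(f"{name}: {gbps / 1000:.1f} Tbps")
```

All three configurations reach the same aggregate rate; they differ in how much of the burden falls on fiber count, on WDM components, or on per-lane electronics, which is where the cost-per-bit trade-off arises.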
- The Shift Towards Co-Packaged Optics (CPO)
As bandwidth and density requirements escalate, traditional pluggable optical modules face power consumption and faceplate density limitations. CPO represents a major architectural shift, bringing the optical interfaces much closer to the primary processing ASICs (like GPUs or switches) within the same package. NVIDIA, a key proponent, highlighted CPO as crucial for achieving consistent low latency and the high port density required for their large, high-radix switches designed for AI fabrics (a backplane capacity of 115 Tbps was mentioned) [Ref: NVIDIA, Optics.org].
The primary motivation for CPO is power efficiency. While Linear Pluggable Optics (LPO) offers an intermediate step, aiming to reduce power compared to retimed pluggables (e.g., <9W vs ~15W for 800G) [Ref: Cisco], CPO targets more aggressive power savings, aiming below 5 pJ/bit and potentially saving megawatts in large AI deployments [Ref: NVIDIA, Optics.org]. This often involves integrating silicon photonics engines, potentially using efficient components like Micro Ring Modulators (MRMs), directly with the switch ASIC, requiring close collaboration with manufacturing partners like TSMC and component suppliers [Ref: Optics.org].
However, CPO development is complex. It requires significant advancements in thermally stable, high-performance components (lasers, modulators, detectors, TIAs), new packaging and testing methodologies, and ensuring long-term reliability for components that may not be easily field-replaceable [Ref: NVIDIA]. Research presented by IMEC on high-speed, low-loss modulators (e.g., using LiTaO3/LNOI) could potentially serve as enabling technology for the high-performance optical engines needed in CPO [Ref: IMEC].
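To put the power figures quoted in this section on a common scale, module power can be converted to energy per bit (watts divided by Gbps gives nJ/bit, so multiplying by 1000 gives pJ/bit). The Python sketch below uses the approximate 800G figures cited above (~15 W retimed, <9 W LPO, <5 pJ/bit CPO target) as upper bounds. The port count is purely an assumption for illustration and does not come from the presentations.

```python
def pj_per_bit(power_w: float, rate_gbps: float) -> float:
    """Energy per bit in pJ. W / Gbps = nJ/bit, so multiply by 1000 for pJ/bit."""
    return power_w / rate_gbps * 1000.0


def power_w(energy_pj_per_bit: float, rate_gbps: float) -> float:
    """Power in watts for a given energy per bit (pJ) and line rate (Gbps)."""
    return energy_pj_per_bit * rate_gbps / 1000.0


RATE_GBPS = 800
retimed_pj = pj_per_bit(15.0, RATE_GBPS)  # ~15 W retimed pluggable
lpo_pj = pj_per_bit(9.0, RATE_GBPS)       # <9 W LPO, treated as an upper bound
cpo_w = power_w(5.0, RATE_GBPS)           # <5 pJ/bit CPO target, upper bound

print(f"Retimed pluggable: {retimed_pj:.2f} pJ/bit")
print(f"LPO:               {lpo_pj:.2f} pJ/bit")
print(f"CPO at 5 pJ/bit:   {cpo_w:.1f} W per 800G")

# ASSUMPTION: a hypothetical deployment size, chosen only to show the scale.
ASSUMED_PORTS = 500_000
saving_mw = (15.0 - cpo_w) * ASSUMED_PORTS / 1e6
print(f"Saving vs retimed across {ASSUMED_PORTS:,} ports: {saving_mw:.1f} MW")
```

Under these assumptions the per-port difference is on the order of 10 W, which is how a large cluster arrives at megawatt-scale savings.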
- The Evolution Towards Optical I/O (OIO)
Looking beyond CPO, the next frontier in integration is Optical I/O (OIO), sometimes referred to as 2.5D or 3D heterogeneous integration. OIO envisions bringing optical connections directly onto the chip package or potentially even the chip itself, enabling extremely high bandwidth density and further reducing power consumption by minimizing electrical trace lengths.
IMEC presented research exploring key OIO building blocks [Ref: IMEC OFC Page]:
- 2.5D OIO: Focuses on integrating photonic ICs alongside electronic ICs on an interposer. Research shows pathways to dense WDM (e.g., 32 channels demonstrated using compact Si-rings with high thermal tuning efficiency of ~5.85 mW/π) and targets ultra-low power (<1 pJ/bit).
- 3D OIO: Explores direct bonding techniques (like wafer-to-wafer hybrid bonding) for seamless electrical-optical integration and advanced, low-loss coupling methods (evanescent or grating couplers) for efficient light transfer between chips and fibers.
NVIDIA also includes OIO (2.5D/3D) in its longer-term roadmap, indicating it’s seen as a necessary step for future generations of AI systems requiring even greater interconnect bandwidth and efficiency [Ref: NVIDIA]. While promising significant gains, OIO relies on maturing advanced packaging technologies, ensuring robust thermal management, and developing high-yield manufacturing processes suitable for these complex integrated assemblies.
Conclusion: The OFC Summit presentations painted a picture of continuous optical innovation driven by AI. The escalating demand for bandwidth and connectivity within massive AI clusters is pushing the industry beyond traditional pluggables. CPO represents the current major transition, promising significant power and density improvements, while OIO signifies the future direction, aiming for even tighter integration and efficiency. Successfully navigating these transitions requires overcoming significant technical challenges in component performance, thermal management, packaging, reliability, and high-volume manufacturing.
References & Further Search:
- OFC 2024 Context: TechInsights Blog: “Optical Fiber Communications (OFC) 2024: The Year of the AI Data Center” – https://www.techinsights.com/blog/optical-fiber-communications-ofc-2024-year-ai-data-center
- Katharine Schmidtke Bio: UCSB IEE Profile – https://iee.ucsb.edu/people/global-advisory-board/katharine-schmidtke [Affiliation for Eribel presenter]
- LPO Market Context/Challenges: DataInsightsMarket Report Summary: “Linear Pluggable Optics (LPO) Market’s Consumer Landscape…” – https://www.datainsightsmarket.com/reports/linear-pluggable-optics-lpo-1632326 [Context for LPO]
- NVIDIA CPO/SiPh Strategy: Optics.org News: “Nvidia reveals plan to scale AI ‘factories’ with co-packaged optics” – https://optics.org/news/16/3/26 [Details on NVIDIA CPO direction]
- IMEC OFC 2024 Contributions: IMEC Event Page: “OFC 2024 | imec” – https://www.imec-int.com/en/events/ofc-2024 [Details on OIO research, Si-rings, bonding]
- General Bandwidth Trends: ECOC Market Focus Info (mentions 1.6T) – https://www.ecocexhibition.com/visit/market-focus/market-focus-session-information/
Based on the trends discussed during this OFC Summit, I identified the following opportunities:
- Micro Optic Fly-over Cabling for CPO:
- Assessment: This is a major opportunity. CPO inherently involves bringing optical engines onto the main board, necessitating internal fiber routing from the engines to the faceplate or other system components. Fly-over assemblies are a key enabling technology for this, bypassing high-loss PCB traces for high-speed signals.
- Requirements/Refinements: The demand will be for extremely high-density, low-profile assemblies. Key features will include low insertion loss, excellent bend performance for routing within tight chassis spaces, precise length matching (especially for parallel optics), and potentially novel connector interfaces for mating directly with the optical engines on board. Reliability and manufacturability at scale will be crucial.
- Support for Diverse Interface Types (Parallel Optics & DWDM/Duplex):
- Assessment: Flexibility will be essential. While CPO engines might have many internal lanes, the external interfaces will likely adhere to standards for interoperability.
- Requirements/Refinements:
- Parallel Optics: Support for interfaces like DR4 (4+4 fibers SMF) and potentially DR8 (8+8 fibers SMF) for 800G/1.6T will require high-quality MPO/MTP (e.g., 12F, 16F, 24F) connectivity solutions.
- Duplex/WDM Optics: Support for standards like FR4 (4 wavelengths on duplex SMF) and future WDM variants will require high-density duplex connectors (LC is standard, but SN/MDC are gaining traction for density).
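The fiber counts behind the parallel-optics interfaces above can be worked out directly: each lane needs one transmit and one receive fiber. The Python sketch below picks the smallest MPO/MTP fiber count, from the sizes named above, that fits an interface. The helper names and the "smallest connector that fits" rule are assumptions for illustration; actual connector choices depend on the relevant standard and the system design.

```python
MPO_SIZES = (12, 16, 24)  # connector fiber counts named in the text

# Lanes per direction for the parallel interfaces mentioned above.
INTERFACE_LANES = {"DR4": 4, "DR8": 8}


def fibers_needed(lanes_per_direction: int) -> int:
    """One transmit plus one receive fiber per lane."""
    return 2 * lanes_per_direction


def smallest_mpo(fibers: int) -> int:
    """Return the smallest listed MPO size that holds the given fiber count."""
    for size in MPO_SIZES:
        if size >= fibers:
            return size
    raise ValueError(f"No single listed MPO size holds {fibers} fibers")


for name, lanes in INTERFACE_LANES.items():
    fibers = fibers_needed(lanes)
    mpo = smallest_mpo(fibers)
    print(f"{name}: {fibers} fibers -> MPO-{mpo} ({mpo - fibers} positions unused)")
```

This shows why a 16-fiber connector matters for DR8: a 12-fiber MPO that comfortably carries DR4 cannot carry 8+8 fibers in a single ferrule.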
- Fiber Shuffles (Standard & PM Fiber), including for ELS:
- Assessment: This is another significant opportunity, directly linked to the practical implementation of CPO, especially architectures using External Laser Sources (ELS).
- Requirements/Refinements:
- ELS Connectivity: ELS modules need to distribute light to multiple CPO engines. This requires custom fiber routing assemblies (shuffles), potentially breaking out from a multi-fiber ELS output connector to several engine inputs. Low loss and precise routing are key.
- PM Fiber: Some CPO architectures may require Polarization-Maintaining (PM) fiber between the ELS and the modulator engine to maintain a specific polarization state. Offering PM fiber shuffle assemblies could be a valuable, specialized capability, though likely lower volume than standard SMF. This requires specific expertise in handling and terminating PM fiber accurately.
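A fiber shuffle of the kind described above is essentially a routing table from ELS output fiber positions to engine inputs. The Python sketch below builds one such table for an even breakout. The function name, the contiguous assignment scheme, and the 16-fiber/4-engine example are hypothetical; a real assembly would follow the specific ELS and engine pinouts.

```python
def build_shuffle(els_fibers: int, engines: int) -> dict:
    """Map each engine index to a contiguous block of ELS output fiber positions.

    ASSUMPTION: fibers are split evenly and assigned in contiguous blocks.
    """
    if engines <= 0 or els_fibers % engines != 0:
        raise ValueError("ELS fiber count must divide evenly across engines")
    per_engine = els_fibers // engines
    return {
        engine: list(range(engine * per_engine, (engine + 1) * per_engine))
        for engine in range(engines)
    }


# Hypothetical example: one 16-fiber ELS connector feeding 4 optical engines.
shuffle = build_shuffle(els_fibers=16, engines=4)
for engine, positions in shuffle.items():
    print(f"Engine {engine}: ELS fiber positions {positions}")
```

Keeping the mapping explicit like this makes it easier to check that every ELS output lands on exactly one engine input before an assembly is built.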
- High-Density Cable and Connectivity Solutions:
- Assessment: Absolutely critical and foundational. All the above trends converge on the need for much higher density at the faceplate, within the chassis, and potentially between racks.
- Requirements/Refinements: Leverage and adapt our existing high-density portfolio. This includes promoting solutions using small form factor connectors (SN, MDC, high-fiber MPOs), reduced-diameter cabling, and structured cabling designs that facilitate management of hundreds or thousands of fibers while maintaining performance and managing airflow.