Swathi Jayaramaiah, Staff Product Marketing Manager and Madhumita Sanyal, Senior Staff Technical Marketing Manager, Synopsys
The bandwidth needed to process increasingly complex data is exploding, driving 800G and 1.6T data rates. This growth stems from numerous factors, such as an increasing number of users and devices per user, faster access rates, new access methods, and additional diverse services. While the projected growth of 800G and 1.6T is 2x and 4x over a nine-year timeframe, application growth ranges from 7x to 55x for different content streams, as seen in Table 1.
Table 1: Bandwidth growth values
As seen in Table 1, the bandwidth demand is most visible in data center switching, which shows a 16.3x increase over a period of eight years. Connectivity for data center rack units (RUs) is mostly copper, while optics is used everywhere else. In RUs, switches are increasing speeds from 12.8T to 25.6T, 51.2T, and 102.4T. The same speed evolution is seen in pluggable and co-packaged optics, with speeds moving from 400G to 800G to 1.6T Ethernet and beyond. Previously, a 12.8T switch needed 32 instances of x8 50G SerDes. For next-generation switches, 112G and soon 224G SerDes are becoming essential. A higher speed SerDes has a smaller footprint, lower cost, less power, and shorter time to market.
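To make this arithmetic concrete, the short sketch below (a hypothetical illustration using nominal per-lane rates, not a reference to any particular product) shows how aggregate switch capacity follows from the SerDes instance count, lanes per instance, and per-lane data rate.

```python
# Hypothetical sketch: how aggregate switch capacity relates to SerDes
# instance count, lanes per instance, and per-lane data rate.
def switch_capacity_gbps(instances: int, lanes_per_instance: int, lane_rate_gbps: float) -> float:
    """Aggregate bandwidth = instances x lanes per instance x per-lane rate."""
    return instances * lanes_per_instance * lane_rate_gbps

# Previous generation: 32 instances of x8 50G SerDes
print(switch_capacity_gbps(32, 8, 50))    # 12800.0 Gb/s = 12.8T

# Next generation: 128 instances of x4 SerDes running nominal 100G lanes
print(switch_capacity_gbps(128, 4, 100))  # 51200.0 Gb/s = 51.2T
```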
The IEEE 802.3 working group has defined the 400G standard, and the Ethernet Technology Consortium has defined and published the higher speed 800G standard. The IEEE 802.3 standard for 400G uses multi-lane distribution (MLD) to distribute data from a single Media Access Control (MAC) channel across 16 Physical Coding Sublayer (PCS) lanes. The Ethernet Technology Consortium's 800G standard uses a MAC scaled up to 800 Gb/s, along with two 400 Gb/s PCSs (with modifications) to drive 8x100G lanes. There is a total of 32 PCS lanes (2x16 from the 400G standard), all with the RS(544,514) forward error correction (FEC) supported in the 400G standard.
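The sketch below is a simplified, non-spec-accurate illustration of the two ideas above: round-robin multi-lane distribution of blocks across PCS lanes, and the nominal overhead added by RS(544,514) FEC. The function name and block granularity are illustrative assumptions, not definitions from IEEE 802.3 or the Ethernet Technology Consortium specification.

```python
# Simplified illustration (not spec-accurate) of multi-lane distribution:
# blocks from a single MAC channel are dealt round-robin across the PCS lanes.
def distribute_blocks(blocks, num_pcs_lanes=16):
    """Return a list of per-lane block lists, round-robin as in MLD."""
    lanes = [[] for _ in range(num_pcs_lanes)]
    for i, block in enumerate(blocks):
        lanes[i % num_pcs_lanes].append(block)
    return lanes

# RS(544,514) FEC adds 30 parity symbols per 544-symbol codeword:
fec_overhead = (544 - 514) / 514
print(f"RS(544,514) overhead: {fec_overhead:.1%}")  # ~5.8%

# An 800G MAC scaled across two 400G PCSs uses 2 x 16 = 32 PCS lanes,
# mapped onto 8 physical lanes of 100G each.
print(32 // 8, "PCS lanes per 100G physical lane")   # 4
```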
Some applications, such as automotive or printers, require lower Ethernet data rates from 10M to 25G, but in the case of automotive, the data must be of higher quality and reliability. On the other end of the Ethernet speed spectrum, AI, hyperscale data center, and telecom applications are already well on their way to 400G Ethernet systems and are approaching speeds of up to 800G. SoC designs for many such applications are already complex without factoring in the need to incorporate high-speed Ethernet. Additionally, integrating an Ethernet IP subsystem may not be a core competency for some SoC design teams.
This article explains the Ethernet MAC and PHY layers and uses a case study to describe different Ethernet design configurations for 400G/800G links.
As shown in Figure 1, a complete Ethernet IP subsystem includes a PHY and a MAC. An IEEE 802.3 compliant Ethernet IP subsystem can range from a simple 100G MAC/PCS and 50G SerDes PHY system to a more complicated 800G system with multiple MACs/PCSs in different configurations, interfacing with a 56G/112G SerDes.
A PHY consists of a PCS plus a SerDes, which includes the PMA and PMD.
Figure 1: Integrated Ethernet IP including MAC and PHY
From an architectural view, as shown in Figure 2, Ethernet fills the bottom two layers of the seven-layer Open Systems Interconnection (OSI) model: the physical layer and the data link layer.
Figure 2: Ethernet layers in the Open Systems Interconnection (OSI) model
The physical layer, including the PCS, PMA, and PMD, transmits and receives the unstructured raw bit stream over a physical medium. Functions such as serialization, auto-negotiation, and link training are implemented in the physical layer. The Physical Medium Dependent (PMD) sublayer handles the medium, which can range from a short reach cable to long reach over backplanes and optical fibers. The PMD is a medium-dependent serial interface that executes bit timing and signal encoding. The next sublayer, sitting on top of the PMD, is the Physical Medium Attachment (PMA), where the rate per lane and the number of lanes are set. In addition, the PMA executes local and remote loopback testing along with data framing and test pattern generation.
A high-speed SerDes (consisting of a PMA and PMD) generally runs at 56G or 112G and comes in 1-, 2-, or 4-lane configurations as a x1/x2/x4 SerDes. Lower speed SerDes are available as 10G, 25G, and 32G PHYs.
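As a quick illustration of these lane options, the following sketch (nominal rates only, no coding or FEC overhead) tabulates the aggregate bandwidth of x1/x2/x4 configurations at 56G and 112G.

```python
# Hypothetical sketch: aggregate bandwidth of x1/x2/x4 SerDes configurations
# at nominal 56G and 112G per-lane rates.
for lane_rate_gbps in (56, 112):
    for lanes in (1, 2, 4):
        print(f"x{lanes} {lane_rate_gbps}G SerDes -> {lanes * lane_rate_gbps} Gb/s aggregate")
```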
PCS transfers information to and from the MAC or other PCS clients such as a repeater. PCS performs frame delineation, encoding/decoding such as 8b/10b or 64b/66b, fault information transport, deskew of received data and data restoration.
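To illustrate why 64b/66b is favored over 8b/10b at higher speeds, the short sketch below compares the nominal line-coding overhead of the two encodings mentioned above (ignoring FEC, alignment markers, and other framing).

```python
# Nominal line-coding overhead of the two PCS encodings mentioned above
# (ignores FEC, alignment markers, and other framing).
def coding_overhead(payload_bits: int, coded_bits: int) -> float:
    return (coded_bits - payload_bits) / payload_bits

print(f"8b/10b  overhead: {coding_overhead(8, 10):.0%}")   # 25%
print(f"64b/66b overhead: {coding_overhead(64, 66):.1%}")  # ~3.1%
```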
A high-speed PCS is usually available at 200G/400G/800G data rates, while lower speed PCS options range from 1G to 100G. A higher speed PCS usually has a configurable number of channels that can operate independently at different rates. For example, a 400G PCS could be configured as a single 400G channel, two 200G channels, or four independent 100G channels.
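A minimal sketch of this channelization idea follows; it assumes simple power-of-two splits down to 100G, whereas the modes an actual PCS supports are product-specific.

```python
# Hypothetical sketch: ways a 400G PCS could be channelized, assuming
# power-of-two splits (real products define their own supported modes).
def channel_splits(total_gbps=400, min_channel_gbps=100):
    splits = []
    channel = total_gbps
    while channel >= min_channel_gbps:
        splits.append((total_gbps // channel, channel))
        channel //= 2
    return splits

for count, rate in channel_splits():
    print(f"{count} x {rate}G")  # 1 x 400G, 2 x 200G, 4 x 100G
```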
The data link layer, including the MAC layer and the Logical Link Control (LLC) layer, provides node-to-node data transfer between two directly connected nodes. The MAC, in addition to flow control, handles errors coming from the physical layer.
MACs are also available at 200G/400G/800G data rates as well as at lower speeds from 10M to 100G. The MAC configuration options mirror the PCS options mentioned above.
As can be seen from the number of options described above, Ethernet complexity and variations are numerous. For example, a 51.2T Ethernet switch running at a 100 Gbps line rate could implement Ethernet in at least three different configurations, as shown in Figure 3.
Configuration 1 - Monolithic Topology: This configuration places 512 lanes of 100G SerDes on all edges of a single monolithic die, implemented with 128 instances of x4 112G Long Reach (LR) SerDes coupled with Quad or Octal PCS and MAC. Factors to consider are available beachfront and floorplanning to ensure optimal routing, MAC/PCS placement, and overall timing feasibility.
Configuration 2 - Two-Die Topology: This is a two-die implementation, with the dies connected by 112G Extra Short Reach (XSR) SerDes. Each die includes 64 instances of x4 112G LR SerDes and Quad or Octal PCS and MAC. The advantages of a multi-die implementation are increased beachfront and better yield from each smaller die versus a monolithic die.
Configuration 3 - Companion Die Topology: This configuration uses 112G XSR SerDes on the host side connected to eight companion dies on the line side. Each companion die consists of 16 instances of x4 112G LR SerDes and Quad or Octal PCS and MAC. The advantage is that the main die can be built in a more aggressive process node while the companion dies stay in older, more mature processes.
For configuration 3, whether the chiplet is 1.6T (32 chiplets, each with four x4 112G LR SerDes instances), 3.2T (16 chiplets, each with eight instances), or 6.4T (8 chiplets, each with sixteen instances), the right granularity needs to be determined by evaluating various block partitioning strategies. In addition, reference clock routing needs to be considered.
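The partitioning arithmetic above can be sketched as follows, assuming nominal 100G lanes and x4 SerDes instances (four lanes per instance); actual partitioning must also account for the beachfront, routing, and clocking constraints noted in the text.

```python
# Hypothetical partitioning arithmetic for a 51.2T switch built from chiplets,
# assuming nominal 100G lanes and x4 SerDes instances (4 lanes per instance).
SWITCH_GBPS = 51_200
LANE_GBPS = 100
LANES_PER_SERDES = 4

for chiplet_gbps in (1_600, 3_200, 6_400):  # 1.6T, 3.2T, 6.4T chiplet options
    chiplets = SWITCH_GBPS // chiplet_gbps
    lanes_per_chiplet = chiplet_gbps // LANE_GBPS
    serdes_per_chiplet = lanes_per_chiplet // LANES_PER_SERDES
    print(f"{chiplet_gbps / 1000:.1f}T chiplet: {chiplets} chiplets, "
          f"{serdes_per_chiplet} x4 SerDes instances each")
```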
It's essential to perform package escape studies to meet the crosstalk specification, build the power delivery network, and run power integrity simulations, all of which ensure consistent performance across the board. Figure 3 shows the three configurations described above.
Figure 3: Case study of different Ethernet configurations
In addition to the above, hardening is another essential consideration. Hardening involves what-if analysis of block partitioning to optimize beachfront, front-end and back-end integration with a full RTL-to-GDS flow that draws on SerDes, PCS, and MAC design knowledge, as well as close collaboration with EDA tools for sign-off.
To drive efficiency, simplify design efforts, and reduce time to market, designers need an integrated and validated 400G/800G MAC, PCS, and 56G/112G SerDes. Interface latency and power optimizations become easier when the integration is performed by designers with the required knowledge and expertise in MAC, PCS, and SerDes functionality, configuration, and implementation.
Use cases are changing as high-performance computing applications evolve into areas like AI, automation, device packaging, and many other new ways complex data will be used and processed. In addition to traditional Ethernet, several 800G use cases are emerging as die disaggregation and heterogeneous dies become popular due to yield and cost concerns. OIF is working to introduce 3.2T and 6.4T standards. The challenges of 800G will impact designers in many ways, including in the evolving chiplet market, of which 400G/800G solutions are a key part.
Synopsys offers an integrated 200G/400G/800G Ethernet solution consisting of MAC, PCS, and PMA/PMD IP. The MAC is IEEE-compliant and configurable to fit the needs of high-performance computing (HPC), AI, and networking SoCs. The DesignWare® 56G and 112G PHY IP are silicon-proven and available in advanced FinFET processes, offering superior BER with maximum performance.