Quantcast
Channel: Cadence Functional Verification

Use Verisium SimAI to Accelerate Verification Closure with Big Compute Savings

$
0
0

Verisium SimAI App harnesses the power of machine learning technology with the Cadence Xcelium Logic Simulator - the ultimate breakthrough in accelerating verification closure. It builds models from regressions run in the Xcelium simulator, enabling the generation of new regressions with specific targets. The Verisium SimAI app also features cousin bug hunting, a unique capability that uses information from difficult-to-hit failures to expose cousin bugs. With these advanced machine learning techniques, Verisium SimAI offers the potential for a significant boost in productivity, promising an exciting future for our users.

Figure 1: Regression compression and coverage maximization with Verisium SimAI 

What can I do with Verisium SimAI?

You can exercise different use cases with Verisium SimAI as per your requirements. For some users, the goal might be regression compression and improving coverage regain. Coverage maximization and hitting new bins could be another goal. Other users may be interested in exposing hard-to-hit failures, bug hunting for difficult to find issues. Verisium SimAI allows users to take on any of these challenges to achieve the desired results.

Let's go into some more details of these use cases and scenarios where using SimAI can have a big positive impact.

  1. Using SimAI for Regression Compression and Coverage Regain

Unlock up to 10X compute savings with SimAI!

Verisium SimAI can be used to compress regressions and regain coverage. This flow involves setting up your regression environment for SimAI, running your random regressions with coverage and randomization data followed by training, and finally, synthesizing and running the SimAI-generated compressed regressions. The synthesized regression may prune tests that do not help meet the goal and add more runs for the most relevant tests, as well as add run-specific constraints. This flow can also be used to target specific areas like areas involving a high code churn or high complexity.

You can check out the details of this flow with illustrative examples in the following Rapid Adoption Kits (RAK) available on the Cadence Learning and Support Portal (Cadence customer credentials needed):

 

  1. Using SimAI for Coverage Maximization and Targeting coverage holes

Reduce your Functional Coverage Holes by up to 40% using SimAI!

Verisium SimAI can be used for iterative coverage maximization. This is most effective when regressions are largely saturated, and SimAI will explicitly try to hit uncovered bins, which may be hard-to-hit (but not impossible) coverage holes. This is achieved using iterative learning technology where with each iteration, SimAI does some exploration and determines how well it performed. This technique can also be used for bug hunting by using holes as targets of interest.

See more details on the Cadence Learning and Support Portal:

 

  1. Using SimAI for Bug Hunting

Discover and fix bugs faster using SimAI!

Verisium SimAI has a new bug hunting flow which can be used to target the goal of exposing hard-to-hit failure conditions. This is achieved using an iterative framework and by targeting failures or rare bins. The goal to target failures is best exercised when the overall failure rate is typically low (below 5%). Iterative learning can be used to improve the ability to target specific areas. Use the SimAI bug hunting use case to target rare events, low hit coverage bins, and low hit failure signatures.

See more details on the Cadence Learning and Support Portal:

Unlock compute savings, reduce your functional coverage holes, and discover and fix bugs faster with the power of machine learning technology now enabled by Verisium SimAI!

Please keep visiting  https://support.cadence.com/raks to download new RAKs as they become available.

Please note that you will need the Cadence customer credentials to log on to the Cadence Online  Support  https://support.cadence.com/, your 24/7 partner for getting help in resolving issues related to Cadence software or learning Cadence tools and technologies.

Happy Learning!


Flow Control Credit Updates in PCIe 6.1 ECN

$
0
0

As technology continues to evolve at a rapid pace, the importance of robust and efficient interconnect standards cannot be overstated. Peripheral Component Interconnect Express (PCIe) has been a cornerstone in high-speed data transfer, enabling seamless communication between various hardware components.   

With the advent of PCIe 6.1 ECN, a significant advancement in speed and efficiency, ensuring the accuracy and reliability of its operations is paramount. One critical aspect of this is the verification of shared credit updates. For detailed understanding on Shared Credit, please refer Understanding PCIe 6.0 Shared Flow Control. 

In this blog, we will discuss why this verification is essential and what it entails.  

Introduction 

PCIe 6.1 ECN brings numerous advancements over earlier versions, such as increased bandwidth and faster data transfer speeds.   

A crucial mechanism for efficient data transmission in PCIe 6.0 is the credit-based flow control system. In this system, devices monitor credits, representing the buffer capacity available for incoming data.   

When a device transmits data, it uses credits, which are replenished or adjusted once the data is received and processed. This system ensures that the sender does not overload the receiver.  

Given the critical role of shared credit updates in maintaining the integrity and efficiency of data transfers, verification of these updates is crucial.  Proper management of credit updates is essential to ensure data integrity, as any discrepancies can lead to data loss, corruption, or system crashes.   

Verification also guarantees efficient resource allocation, preventing scenarios where some components are starved of credit while others have an excess, thus avoiding inefficiencies.  Credit inefficiencies pose issues in low power negotiations by preventing devices from entering low power states. Additionally, verification involves checking for proper error handling mechanisms, ensuring that the system can recover gracefully from errors in credit updates and maintain overall stability.   

PCIe 6.1 ECN Flow Control Optimizations Over PCIe 6.0

PCIe 6.1 ECN builds on the FLIT-based architecture introduced in PCIe 6.0, further optimizing flow control mechanisms to handle increased data rates and improved efficiency.  PCIe 6.1 ECN introduced refinements in credit management, making the allocation and advertisement of credits more precise, which helps in reducing bottlenecks and improving data flow efficiency.  Enhancements in flow control protocols ensure better management of buffer spaces and more efficient credit allocation. These enhancements are designed to handle the increased data rates and throughput demands of next-generation applications, ensuring robust and efficient data flow across PCIe devices.  

Below are some major updates: 

  1. There have been improvements in error detection and correction mechanisms in PCIe 6.1 ECN to enhance flow control reliability by ensuring that corrupted data packets are detected and handled appropriately without disrupting the flow of valid packets.  
  2. The merged credit system, which was a key feature introduced int PCIe 6.0 to simplify and optimize credit management, was further enhanced in PCIe 6.1 ECN to improve performance and efficiency.  
  3. PCIe 6.1 ECN introduced better algorithms for allocating and reclaiming merged credits to handle high data rates, introduced more robust error detection and correction mechanism reducing the degradation or system instability. 
  4. PCIe 6.1 ECN provided clear guidelines on how to implement the merged credit system correctly, helping developers to implement more reliable systems. For more details, please refer to Specifications section 2.6.1 Flow Control (FC) Rules.

Summary 

In summary, PCIe 6.0 is a complex protocol with many verification challenges. You must understand many new Spec changes and think about the robust verification plan for the new features and backward compatible tests impacted by new features. Cadence’s PCIe 6.0 Verification IP is fully compliant with the latest PCIe Express 6.0 specifications and provides an effective and efficient way to verify the components interfacing with the PCIe 6.0 interface. Cadence VIP for PCIe 6.0 provides exhaustive verification of PCIe-based IP and SoCs, and we are working with early adopter customers to speed up every verification stage.   

More Information

For more info on how Cadence PCIe Verification IP and Triple Check VIP enable users to confidently verify PCIe 6.0, see VIP for PCI Express, VIP for Compute Express Link  and TripleCheck for PCI Express  

See the PCI-SIG website for more details on PCIe in general and the different PCI standards.  

For more information on PCIe 6.0 new features, please visit PCIeLaneMarginPCIe6.0LaneMargin, and Demonstrating PCIe 6.0 Equalization Procedure.

Training Insights – Palladium Emulation Course for Beginner and Advanced Users

$
0
0

The Cadence Palladium Emulation Platform is a hardware system that implements the design, accelerating its execution and verification. Itoffers the highest performance and fastest bring-up times for pre-silicon validation of billion-gate designs, using a custom processor built by Cadence.

This Palladium Introduction course is based on the Palladium 23.03 ISR4 version and covers the following modules:

  • Introduction
  • Palladium flow
  • Running a design on the Palladium system

This course starts with an “Introduction” module that explains Palladium and other verification platforms to show its place in the big picture. It also compares Palladium with Protium and simulation and discusses its usage and limitations.

The “Palladium Flow” module includes two stages at a high level, which are Compile and Run. Then, it covers these stages in detail. First, it covers the ICE compile flow and IXCOM compile flow steps in detail. Then it explains Run, which is common for both ICE and IXCOM modes.

The third module, “Running Design on the Palladium System,” covers all the items required for running your design on the Palladium system, including:

  • Software stack requirements
  • Basic concepts required to understand the flow
  • Compute machine requirements

In addition, this course contains labs for both the ICE and IXCOM flows with detailed steps to exercise the features provided by the Palladium system. The lab explains a practical example of multiple counters and exercising their signals for force, monitor, and deposit features, along with frequency calculation using a real-time clock. The course is available on the Cadence support page:

There is also a Digital Badge available. You will find the Badge exam opportunity when you enroll in the Online training or after you have taken the training as "live" training.

For questions and inquiries, or issues with registration, reach out to us at Cadence Training. Want to stay up to date on webinars and courses? Subscribe to Cadence Training emails. To view our complete training offerings, visit the Cadence Training website.

Related Training Bytes

Related Courses

Related Blogs

DDR5 UDIMM Evolution to Clock Buffered DIMMs (CUDIMM)

$
0
0

DDR5 is the latest generation of PCDDR memory that is used in a wide range of application like data centers, Laptops and personal computers, autonomous driving systems, servers, cloud computing, and gaming are now increasingly being used for AI applications with advances in memory bandwidth and density to allow DDR5 DIMMs (Dual Inline Memory Modules) to support densities higher then 256 GB per DIMM card. The highest speed DDR5 SDRAM devices can support data rates of up to 8800 MTps.

DDR5 SO-DIMMs and UDIMMs

One of the most recognized uses of PCDDR is with client devices like laptops and personal computers. These client devices mostly use two types of DDR5 DIMMs called SO-DIMM (Small Outline Dual Inline Memory Module) and UDIMM (Unbuffered Dual Inline Memory Module).

These types of DIMMs have no signal regeneration or buffering (which, for example, the Registering Clock Driver or the RCD does for clocks/command/control signals for a registered DIMMs). A typical 2-Rank UDIMM with x8 DDR5 SDRAM components has 8 or 10 components per rank depending on the system ECC (Error Correction Code) memory being part of the DIMM.

Why DDR5 Clock Buffer and CUDIMM?

Clocks are one of the most important signals for synchronous devices, and DDR5 SDRAMs are no exception. The host is responsible for the fanout to all the DRAM input ports, such as clocks for UDIMMs. Driving of all these DRAM clocks can put quite a bit of load on the host output drivers, thus affecting the signal quality, which can result in unexpected memory errors. This issue gets amplified when operating at the higher clock and data rates where the clock sign als transition from one logic value to the next over a very short time. To solve these signal integrity issues with DRAM clocks, JEDEC has come up with a new type of DDR5 DIMM component that is called DDR5 clock buffer. Clock buffers can be used for both DDR5 SO-DIMMs and DDR5 UDIMMs. DDR5 UDIMMs that include a clock buffer component as part of the DIMM card are called DDR5 CUDIMMs (Clock Buffered UDIMMs).

DDR5 Clock Buffer Overview

DDR5 Clock Buffer is a simple logic device that takes in two sets of input clock pins and drives two sets of clock pins as output per channel. The clock buffer device can operate in three types of clock modes: -

  • PLL bypass mode: In this mode, the clock buffer just passes on the input clocks to output without any kind of signal buffering. The PLL bypass mode enabled CUDIMM devices behave like traditional UDIMMs without any buffering of the clocks. This is why it’s also referred to as legacy mode. Recommended CUDIMM operating speeds in PLL bypass mode are typically limited to 3000 MHz.
  • Single PLL mode: In the single PLL Mode, the clock buffer device will use a Phase Lock Loop (PLL) for the regeneration of the incoming host clock to create a better-quality clock that is sent to the DRAMs. However, since there is only one PLL that is used in this mode, both sub channel output clocks will be driven based on only one set of input clocks with the other set of input clocks remaining unused.
  • Dual PLL mode: In this mode, the clock buffer will use two PLLs to independently generate each sub channel output clock based on each set of incoming host clocks. The second set of PLL can be turned on or off on the fly if needed to save power.

Beyond the clock modes, clock buffers provide additional flexibility to the system designers with register-controlled additional signal delays, optional output clock enable/disable per bit feature, drive strength and termination choices, etc. All DDR5 clock buffer device control word registers are accessible via DDR5 DIMM sideband.

Cadence VIPs offers a compressive memory subsystem solution that includes memory models for DDR5 SDRAM, DDR5 RCD, DDR5 DB, DDR5 clock buffer, all types of DDR5 DIMMs, including the DDR5 CUDIMMs, DFI Memory Controller/PHY VIPs, and a system VIP compliant to JEDEC specifications defined for each of those devices along with latest DFI Specification.

More information on Cadence DDR5 DIMM VIP is available at the Cadence VIP Memory Models website.

Jasper Formal Fundamentals 2403 Course for Starting Formal Verification

$
0
0

The course "Jasper Formal Fundamentals v24.03" introduces formal analysis to those who want to use formal analysis for design or verification. 

To optimally benefit from this course, you must already have sufficient knowledge of the System Verilog assertions to be capable of writing properties for formal verification. Hence, this training provides a module on formal analysis to help cover this essential background. 

In this course, you will learn how to code efficient SVA Properties for formal analysis, understand formal complexity and how to overcome it, and learn the basics of formal coverage.

After completing this course, you will be able to:

  • Define reusable, functionally correct SVA properties that are efficient for formal tools. These shall use abstract auxiliary code to simplify descriptions, make code maintenance easier, reduce debug time, and reduce tool-proof runtime.
  • Set up, run, and analyze results from formal analysis.
  • Identify designs upon which formal is likely to be successful while understanding formal complexity issues and how to identify and overcome them.
  • Use a systematic property development process to approach a completely new verification problem.
  • Understand the basics of formal coverage.

 The most recently updated release includes new modules on:

  • "Basic complexity handling" which discusses the complexity in formal and how to identify and handle them.
  • "Complexity reduction methods” which discusses the complexity reduction methods and which is suitable for which type of complexity problem.
  • “Coverage in formal” which discusses the basics of coverage in formal verification and how coverage can be used in formal.   

Take this course to learn the basics of formal verification. 

What's Next? 

You can check out the complete training: Jasper Formal Fundamentals. There is a free online version of the training available 24/7 for all customers with a Cadence Learning and Support Portal account. If you are interested in an instructor-led version of the training, please contact Cadence Training. And don't forget to obtain your digital badge after completing the training!

You can also check Jasper University page for more materials on formal analysis and Jasper apps. 

Related Trainings 

Jasper Formal Expert Training Course | Cadence

Verilog Language and Application Training Course | Cadence

SystemVerilog for Design and Verification Training Course | Cadence

SystemVerilog Assertions Training Course | Cadence

Related Training Bytes 

Jasper Formal Property Verification (FPV) App: Basic Usage Demo (Video)

Jasper Formal Methodology playlist

Related Training Blogs

It’s the Digital Era; Why Not Showcase Your Brand Through a Digital Badge!

Training Insights: Introducing the C++ Course for All Your C++ Learning Needs!

Training Insights: Reaching Your Verification Closure Using Verisium Manager

Training Insights - Free Online Courses on Cadence Learning and Support Portal

Partial Header Encryption in Integrity and Data Encryption for PCIe

$
0
0
Cadence PCIe/CXL VIP support for Partial Header Encryption in Integrity and Data Encryption.(read more)

Cadence Verisium Debug Introduces Verisium Debug App Store

$
0
0

Verisium Debug, the Cadence unified debug platform, offers a variety of debugging capabilities, including RTL debug, UVM testbench debug, UPF debug, and DMS debug. From IP to SoC level debug, the user can take the benefits of the rich debugging features to reduce the time for debug.

Not only the common and advanced debug features, Verisium Debug also provides Python-based interface API, which enables capabilities allowing users to customize functions with Verisium Debug Python API to access from design, waveform databases and add functions to Verisium Debug’s GUI for visualization purposes. With Verisium Debug’s Python API, users can turn repetitive works into automatic programs or reduce efforts to create in-house utilities with well-established infrastructure from Verisium Debug.

Here is an example of how the user uses Python API to create a customized function. Users can write a Python program to extract signals in a specific design scope and report the values of the extracted signals. From Fig 1., you can understand the procedure of the traversal steps.

  1. Import Python library in Verisium Debug package.
  2. Setup the database for traversal.
  3. Search the scope with the hierarchy information in the design DB.
  4. Query the signal list and the values of the signals.
  5. Print out the results.

Fig 1. Procedure of Verisium Debug Python Program

The result from the Verisium Debug Python App can be used for post-process design checking or fed into other utilities in the design flow.

The concept is very straightforward. With Verisium Debug and the Python API environment enabled, you can easily query any information that is stored in the databases of Verisium Debug. The result can be outputted in text format, or you can also use the API to display the results back to Verisium Debug’s GUI.

The Verisium Debug Python API is an important capability and resource for Verisium Debug users. To make Verisium Debug Python API easier to access, from Verisium Debug 24.10 release, Verisium Debug introduced the new Verisium Debug Python App Store.

Verisium Debug Apps Store

Fig 2. Verisium Debug App Store

The Python App Store includes ready-to-use Python App examples with the availabilities of original source code documents, which help the user to understand how to start writing an app that fits their use case.

Verisium Debug App Store Example

Fig 3. Example apps in Verisium Debug App Store

The Verisium Debug Python App Store can also be used by a team as an app management system. App creators can share the developed apps across teams within their companies. The in-house created apps will become easy to manage, and engineers can easily access the apps from the central location, which makes it possible for users to see the updated available Verisium Debug Apps from the Verisium Debug App Store.

Check the following videos for more information about Verisium Debug Python API:

Customize Verisium Debug with Python API

Verisium Debug Customized Apps with Python API

Unveiling the Capabilities of Verisium Manager for Optimized Operations

$
0
0

In SoC development, the verification cycle is a crucial phase that ensures products meet their specifications and function correctly. However, the complexity of modern SoC projects, with their constant data flow, multiple validation teams working in parallel, and tight schedules, presents significant challenges. This article explores these challenges and introduces Verisium Manager as a solution that embodies the 'One Tool Fits All' concept. This means that Verisium Manager is designed to handle all aspects of the verification process for SoC development, from planning to coverage analysis to regression testing, thereby addressing the complex needs of SoC verification.

The Hurdles in Traditional Validation Cycles

 A typical validation process involves planning, coverage analysis, and regression testing. This complexity is compounded by using separate tools for each activity, leading to multiple control environments, APIs, and databases, not to mention the array of tool owners. Such fragmentation results in constant data transfer and translation between systems, from the planning tool to the coverage analysis tool and then to the regression testing tool. This continuous movement of data causes delays, system instability, poor user experiences, and, ultimately, a dip in the quality of the validation process.

The use of multiple platforms leads to inefficiency and reduced productivity. What's needed is a unified system that can streamline the workflow, simplify the verification process, and enhance its effectiveness.

Envisioning the Ideal Solution: Verisium Manager

 The cornerstone of an efficient validation cycle is integration and simplicity. The ideal solution is a singular platform that consolidates planning, coverage analysis, and regression management into one smooth, unified process. Verisium Manager emerges as this much-needed solution, encompassing all the functionalities necessary to streamline the validation process. Its comprehensive nature instills confidence in its ability to handle all aspects of the verification cycle. It can be fully customized to address and enforce any validation methodology and can facilitate smooth integration into any customer environment.

Features that stand out in Verisium Manager include: 

  • Unified Workflow: It acts as a single cockpit from which all activities are orchestrated, ensuring the validation teams' work is uninterrupted and seamlessly integrated.
  • Customization and Integration: Verisium Manager supports customizing test-plan structures and mapping results per project, ensuring a perfect fit for various project requirements. Its ability to smoothly integrate into the project's environment and compute platforms is unparalleled.
  • Support for Continuous Updates and Migration: The tool accommodates constant updates to project data and supports the migration of legacy data, ensuring that no historical data is lost in the transition to a new system.

Addressing Project-Specific Needs

 Verisium Manager recognizes diversity in different projects and offers project-specific solutions, including:

 Enforcing Project Test-Plan Structures and Attributes: It supports and enforces each project's unique test-plan structure and mapping guidelines.

  • Unified Data Views and Measurements: Verisium Manager promotes a unified view of data across all teams and enforces unified measurements, ensuring consistency and clarity in the validation process.
  • Enabling Project-Specific Actions and Integrations: The tool is designed to support project-specific actions directly from its graphical user interface and allows for smooth integration with in-house databases, dashboards, and the project execution stack.

Verisium Manager is the epitome of efficiency in software/hardware validation. Its differentiating features, such as support for customization, unified data view, and comprehensive coverage and regression requirements, make it an indispensable tool for any validation team looking to elevate their workflow.


A Brief on Message Bus Interface in PIPE

$
0
0

PHY Interface for the PCI Express (PCIe), SATA, USB, DisplayPort, and USB4 Architectures (PIPE) enables the development of the Physical Layer (PHY) and Media Access Layer (MAC) design separately, providing a standard communication interface between these two components in the system.

In recent years, the PIPE interface specification has incorporated many enhancements to support new features and advancements happening in the supported protocols. As the supported features increase, so does the count of signals on PIPE interface. To address the issue of increasing signal count, the message bus interface was introduced in PIPE 4.4 and utilized for PCIe lane margining at the receiver and elastic buffer depth control.

In PIPE 5.0, all the legacy PIPE signals without critical timing requirements were mapped into message bus registers so that their associated functionality could be accessed via the message bus interface instead of implementing dedicated signals. It was decided that any new feature added in the new version of PIPE specification will be available only via message bus accesses unless they have critical timing requirements that need dedicated signals.

Message Bus Interface

The message bus interface provides a way to initiate and participate in non-latency-sensitive PIPE operations using a small number of wires. It also enables future PIPE operations to be added without adding additional wires. The use of this interface requires the device to be in a power state with PCLK running.

Control and status bits used for PIPE operations are mapped into 8-bit registers that are hosted in 12-bit address spaces in the PHY and the MAC. The registers are accessed using read-and-write commands driven over the signals M2P_MessageBus[7:0] and P2M_MessageBus[7:0]. These signals are synchronous with the PCLK and are reset with Reset#.

Message Bus Interface Commands

The 4-bit commands are used for accessing the PIPE registers across the message bus. A transaction consists of a command and any associated address and data.

All the following are time multiplexed over the bus from MAC and PHY:

  1. Commands (write_uncommitted, write_committed, read, read completion, write_ack)
  2. 12-bit address used for all types and read and writes
  3. 8-bit data, either read or written

There can be cases where multiple PIPE interface signals can change on the same PCLK. To address such cases, the concept of write_uncommitted and write_committed is introduced.

The uncommitted write should be saved into a write buffer, and its associated data values are updated into the relevant PIPE register at a future time when a write_committed is received, taking effect during the same PCLK cycle. Once a write_committed is sent, no new writes, whether committed or uncommitted, and any read command may be sent until a write_ack is received. Also, it is allowed to send NOP commands between write uncommitted and write committed. 

A simple timing demonstration of message bus:

Message Address Space

MAC and PHY each implement unique 12-bit address spaces. These address spaces will host registers associated with the PIPE operations. MAC accesses PHY registers using M2P_MessageBus[7:0], and PHY accesses the MAC registers using the M2P_MessageBus[7:0].

The MAC and PHY access specific bits in the registers to: initiate operations, Initiate handshakes, and Indicate status.

Each 12-bit address space is divided into four main regions: the receiver address region, the transmitter address region, the common address region, and the vendor-specific address region.

Each register field has an attribute description of either level or 1-cycle assertion. When a level field is written, the value written is maintained by the hardware until the next write to that field or until a reset occurs. When a 1-cycle field is written to assert the value high, the hardware maintains the assertion for only a single cycle and then automatically resets the value to zero on the next cycle.

Cadence has a mature Verification IP solution for the verification of various aspects and topologies of PIPE PHY design. For more details, you may refer to the Simulation VIP for PIPE PHY | Cadence page, or you may send an email to support@cadence.com.

Deferrable Memory Write Usage and Verification Challenges

$
0
0

The application of real-time data processing or responsiveness is crucial, such as in high-performance computing, data centers, or applications requiring low-latency data transfers. It enables efficient use of PCIe bandwidth and resources by intelligently managing memory write operations based on system dynamics and workload priorities. By effectively leveraging Deferrable Memory Write [DMWr], Devices can achieve optimized performance and responsiveness, aligning with the evolving demands of modern computing applications.

What Is Deferrable Memory Write?

Deferrable Memory Write (DMWr) ECN introduced this new memory transaction type, which was later officially incorporated in PCIe 5.0 to CXL2.0. This enhanced type of memory transaction is Deferrable Memory Write [DMWr], which flows as another type of existing Read/Write memory transaction; the major difference of this Deferrable Memory Write, where the Requester attempts to write to a given location in Memory Space using the non-posted DMWr TLP Type, it Postponing their completion of memory write transactions to improve overall system efficiency and performance, those memory write operation can be delay or deferred until other priority task complete.

The Deferrable Memory Write (DMWr) requires the Completer to return an acknowledgment to the Requester and provides a mechanism for the recipient to defer (temporarily refuse to service) the Request.

DMWr provides a mechanism for Endpoints and hosts to choose to carry out or defer incoming DMWr Requests. This mechanism can be used by Endpoints and Hosts to simplify the design of flow control, reduce latency, and improve throughput. The Deferrable Memory writes TLP format in Figure A.

 

(Fig A) Deferrable Memory writes TLP format.

Example Scenario

Here's how the DMWr works with a simplified example:Imagine a system with an endpoint device (Device A) and a host CPU (Device B). Device B wants to write data to Device A's memory, but due to varying reasons such as system bus congestion or prioritization of other transactions, Device A can defer the completion of the memory write request. Just follow these steps:

  1. Initiation of Memory Write: Device B initiates a memory write transaction to Device A. This involves sending the memory write request along with the data payload over the PCIe physical layer link.
  2. Acknowledgment and Deferral: Upon receiving the memory write request, Device A acknowledges the transaction but may decide to defer its completion. Device A sends an acknowledgment (ACK) back to Device B, indicating it has received the data and intends to complete the write operation but not immediately.
  3. Deferred Completion: Device A defers the completion of the memory write operation to a later, more opportune time. This deferral allows Device A to prioritize other transactions or optimize the use of system resources, such as memory bandwidth or processor availability.
  4. Completion and Response: At a later point, Device A completes the deferred memory write operation and sends a completion indication back to Device B. This completion typically includes any status updates or additional information related to the transaction.

Usage or Importance of DMWr

Deferrable Memory Write usage provides the improvement in the following aspects:

  • Reduced Latency: By deferring less critical memory write operations, more critical transactions can be processed with lower latency, improving overall system responsiveness.
  • Improved Efficiency: Optimizes the utilization of system resources such as memory bandwidth and CPU cycles, enhancing the efficiency of data transfers within the PCIe architecture.
  • Enhanced Performance: Allows devices to manage and prioritize transactions dynamically, potentially increasing overall system throughput and reducing contention.

Challenges in the Implementation of DMWr Transactions

The implementation of deferrable memory writes (DMWr) introduces several advancements and challenges in terms of usage and verification:

  1. Timing and Synchronization: DMWr allows transactions to be deferred, complicating timing requirements or completing them within acceptable timing windows to avoid protocol violations. Ensuring proper synchronization between devices becomes critical to prevent data loss or corruption.
  2. Protocol Compliance: Verification must ensure compliance with ECN PCIe 6.0 and CXL specifications regarding when and how DMWr transactions can be initiated and completed.
  3. Performance Optimization: While DMWr can improve overall system performance by reducing latency, verifying its impact on system performance and ensuring it meets expected benchmarks is crucial.
  4. Error Handling: Handling errors related to deferred transactions adds complexity. Verifying error detection and recovery mechanisms under various scenarios (e.g., timeout during deferral) is essential.

Verification Challenges of DMWr Transactions

The challenges to verifying the DMWr transaction consist of all checks with respect to Function, Timing, Protocol compliance, improvement, Error scenario, and security usage on purpose, as well as Data integrity at the PCIe and CXL.

  1. Functional Verification: Verifying the correct implementation of DMWr at both ends of the PCIe link (transmitter and receiver) to ensure proper functionality and adherence to specifications.
  2. Timing Verification: Validating timing constraints associated with deferring writes and ensuring transactions are completed within specified windows without violating protocol rules.
  3. Protocol Compliance Verification: Checking that DMWr transactions adhere to PCIe and CXL protocol rules, including ordering rules and any restrictions on deferral based on the transaction type.
  4. Performance Verification: Assessing the impact of DMWr on overall system performance, including latency reduction and bandwidth utilization, through simulation and testing.
  5. Error Scenario Verification: Creating and testing scenarios to verify error handling mechanisms related to DMWr, such as timeouts, retries, and recovery procedures.
  6. Security Considerations: Assessing potential security vulnerabilities related to DMWr, such as data integrity risks during deferred transactions or exposure to timing-based attacks.

Major verification challenges and approaches are timing and synchronization verification in the context of implementing deferrable memory writes (DMWr), which is crucial due to the inherent complexities introduced by deferred transactions. Here are the key issues and approaches to address them:

Timing and Synchronization Issues

  1. Transaction Completion Timing:
    • Issue: Ensuring deferred transactions are completed within the specified time window without violating protocol timing constraints.
    • Approach: Design an internal timer and checker to model worst-case scenarios where transactions are deferred and verify that they are complete within allowable latency limits. This involves simulating various traffic loads and conditions to assess timing under different scenarios.
  2. Ordering and Dependencies:
    • Issue: Verifying that transactions deferred using DMWr maintain the correct ordering and dependencies relative to non-deferred transactions.
    • Approach: Implement test scenarios that include mixed traffic of DMWr and non-DMWr transactions. Verify through simulation or emulation that dependencies and ordering requirements are correctly maintained across the PCIe link.
  3. Interrupt Handling and Response Times:
    • Issue: Verify the handling of interrupts and ensure timely responses from devices involved in DMWr transactions.
    • Approach: Implement test cases that simulate interrupt generation during DMWr transactions. Measure and verify the response times to interrupts to ensure they meet system latency requirements.

In conclusion, while deferrable memory writes in PCIe and CXL offer significant performance benefits, their implementation and verification present several challenges related to timing, protocol compliance, performance optimization, and error handling. Addressing these challenges requires rigorous testing and testbench of traffic, advanced verification methodologies, and a thorough understanding of PCIe specifications and also the motivation behind introducing this Deferrable Write is effectively used in the CXL further. Outcomes of Deferrable Memory Write verify that the performance benefits of DMWr (reduced latency, improved throughput) are achieved without compromising timing integrity or violating protocol specifications.

In summary, PCIe and CXL are complex protocols with many verification challenges. You must understand many new Spec changes and consider the robust verification plan for the new features and backward compatible tests impacted by new features. Cadence's PCIe 6.0 Verification IP is fully compliant with the latest PCIe Express 6.0 specifications and provides an effective and efficient way to verify the components interfacing with the PCIe 6.0 interface. Cadence VIP for PCIe 6.0 provides exhaustive verification of PCIe-based IP and SoCs, and we are working with Early Adopter customers to speed up every verification stage.

More Information

Training Webinar: Protium X2: Using Save/Restart for Debugging

$
0
0

Cadence Protium prototyping platforms rapidly bring up an SoC or system prototype and provide a pre-silicon platform for early software development, SoC verification, system validation, and hardware regressions.

In this Training Webinar, we will explore debugging using Save/Restart on Protium X2. This feature saves execution time and lets you focus on actual debugging. The system state can be saved before the bug appears and restartS directly from there without spending time in initial execution. We’ll cover key concepts and applications, explore Save/Restart performance metrics, and provide examples to help you understand the concepts.

Agenda:

  • The key concepts of debugging using save/restart
  • Capabilities, limitations, and performance metrics
  • Some examples to enable and use save/restart on the Protium X2 system

Date and Time

Thursday, November 7, 2024
07:00 PST San Jose / 10:00 EST New York / 15:00 GMT London / 16:00 CET Munich / 17:00 IST Jerusalem / 20:30 IST Bangalore / 23:00 CST Beijing

REGISTER

To register for this webinar, sign in with your Cadence Support account (email ID and password) to log in to the Learning and Support System*. Then select Enrol to register for the session. Once registered, you’ll receive a confirmation email containing all login details.

A quick reminder:

  • If you haven’t received a registration confirmation within 1 hour of registering, please check your spam folder and ensure your pop-up blockers are off and cookies are enabled.
  • For issues with registration or other inquiries, reach out to eur_training_webinars@cadence.com.

Want to See More Webinars?

You can find recordings of all past webinars here 

Like This Topic?

  • Take this opportunity and register for the free online course related to this webinar topic:

Protium Introduction Training

The course includes slides with audio and downloadable lab exercises designed to emphasize the topics covered in the lecture. There is also a Digital Badge available for the training.

   

Want to share this and other great Cadence learning opportunities with someone else?

Tell them to subscribe.

Hungry for Training? Choose the Cadence Training Menu that’s right for you.

To view our complete training offerings, visit the Cadence Training website.

Related Courses

Related Blogs

Related Training Bytes

Please see the course learning maps for a visual representation of courses and course relationships. Regional course catalogs may be viewed here

Training Webinar: Fast Track RTL Debug with the Verisium Debug Python App Store

$
0
0

As a verification engineer, you’re surely looking for ways to automate the debugging process. Have you developed your own scripts to ease specific debugging steps that tools don’t offer?

Working with scripts locally and manually is challenging—so is reusing and organizing them. What if there was a way to create your own app with the required functionality and register it with the tool?

The answer to that question is “Yes!” The Verisium Debug Python App Store lets you instantly add additional features and capabilities to your Verisium Debug Application using Python Apps that interact with Verisium Debug via the Python API.

Join me, Principal Education Application Engineer Bhairava Prasad, for this Training Webinar and discover the Verisium Debug Python App Store. The app store allows you to search for existing apps, learn about them, install or uninstall them, and even customize existing apps.

Date and Time

Wednesday, November 20, 2024
07:00 PST San Jose / 10:00 EST New York / 15:00 GMT London / 16:00 CET Munich / 17:00 IST Jerusalem / 20:30 IST Bangalore / 23:00 CST Beijing

REGISTER

To register for this webinar, sign in with your Cadence Support account (email ID and password) to log in to the Learning and Support System*. Then select Enroll to register for the session. Once registered, you’ll receive a confirmation email containing all login details.

A quick reminder:

  • If you haven’t received a registration confirmation within one hour of registering, please check your spam folder and ensure your pop-up blockers are off and cookies are enabled.
  • For issues with registration or other inquiries, reach out to eur_training_webinars@cadence.com.

 Like this topic?

 Want to share this and other great Cadence learning opportunities with someone else? Tell them to subscribe.

 Hungry for Training? Choose the Cadence Training Menu that’s right for you.

Related Courses

Related Blogs

Related Training Bytes

Please see course learning maps a visual representation of courses and course relationships. Regional course catalogs may be viewed here.

  *If you don’t have a Cadence Support account, go to Cadence User Registration and complete the requested information. Or visit Registration Help

Versatile Use Case for DDR5 DIMM Discrete Component Memory Models

$
0
0

DDR5 DIMM Architectures

The DDR5 generation of Double Data Rate DRAM memories has experienced rapid adoption in recent years. In particular, the JEDEC-defined DDR5 Dual Inline Memory Module (DIMM) cards have become a mainstay for systems looking for high-density, high-bandwidth, off-chip random access memory[1]. Within a short time, the DIMM architecture evolved from an interconnected hierarchy of only SDRAM memory devices (UDIMM[2]) to complex subsystems of interconnected components (RDIMM/LRDIMM/MRDIMM[3]).

Diagram of a typical DDR5 DIMM today

DIMM Designs and Popular Verification Use Cases

The growing complexity of the DIMMs presented a challenge for pre-silicon verification engineers who could no longer simply validate against single DDR5 SDRAM memory models. They needed to consider how their designs would perform against DIMMs connected to each channel and operating at gigahertz clock speeds.

To address this verification gap, Cadence developed DDR5 DIMM Memory Models that encapsulated all of the architectural complexities presented by real-world DIMMs based on a robust, easy-to-use, easy-to-debug, and easy-to-reconfigure methodology.

This memory-subsystem-in-a-single-instance model has seen explosive adoption among the traditional IP Developer and SOC Integrator customers of Cadence Memory Models. The Cadence DIMM models act as a single unit with all of the relevant DIMM components instantiated and interconnected within, and with all AC/Timing parameters among the various components fully matched out-of-the-box, based on JEDEC specifications as well as datasheets of actual devices in the market. The typical use-case for the DIMM models has been where the DUT is a DDR5 Memory Controller + PHY IP stack, and the validation plan mandated compliance with the JEDEC standards and Memory Device vendor datasheets.

Diagram of a typical MC+DFI DUT Testbench utilizing the DIMM Memory Model

Unique Use Case for the DIMM Discrete Component Models

Although the Cadence DIMM models have enjoyed tremendous proliferation because of their cohesive implementation and unified user API, the actual DIMM Models are built on top of powerful, flexible discrete component models, each of which was designed to stand on its own as a complete SystemVerilog UVM-based VIP. All of these discrete component models exist in the Cadence VIP Catalog as standalone VIPs, complete with their own protocol compliance checking capabilities and their own configuration mappings comprehensively modeling individual AC/Timing parameters.

Because of this deliberate design decision, the Cadence DIMM Discrete Component Models can support a unique use-case scenario. Some users seek to develop IC Designs for the various DIMM components. Such users need verification environments that can model the individual components of a DIMM and allow them the option to replace one or another component with their Component Design IP. They can then validate that their component design is fully compatible with the rest of the components on the DIMM and meets the integrity of the overall DIMM compliance with JEDEC standards or Memory Vendor datasheets.

The Cadence Memory VIP portfolio today includes various examples that demonstrate how customers can create DIMM “wrappers” by selecting from among the available DIMM discrete component models and “stitching” them together to build their own custom testbench around their specific Component Design IP.

Use-case where DUT is a DIMM Component, testbench made up of DIMM Discrete Component models

A Solution for Unique Component Scenarios

The Cadence DDR5 DIMM Memory Models and DIMM Discrete Component Models can provide users with a flexible approach to validating their specific component designs with a fully populated pre-silicon environment.

Augmented Verification Capabilities

When the DIMM “wrapper” model is augmented with the Cadence DFI VIP[4] that can simulate an MC+PHY stack and offers a SystemVerilog UVM test API to the verification engineer, the overall testbench transforms into a formidable pre-silicon validation vehicle. The DFI VIP is designed as a combination of an independent DFI MC VIP and a DFI PHY VIP connected to each other via the DFI Standard Interface and capable of operating seamlessly as a single unit. It presents a UVM Sequence API to the user into the DFI MC VIP with the Memory Interface of the PHY VIP connected to the DIMM “wrapper” model.

With this testbench in hand, the user can then fully take advantage of the UVM Sequence Library that comes with the DFI VIP to enable deep validation of their Component Design inside the DIMM “wrapper” model.

DIMM Component DUT testbench augmented with a DFI VIP

Verification Capabilities Further Enhanced

A possible further enhancement comes with the potential addition of an instance of the Cadence DIMM Memory Model in a Passive Monitor mode at the DRAM Memory Interface. The DIMM Passive Monitor consumes the same configuration describing the DIMM “wrapper” in the testbench, and thus can act as a reference model for the DIMM wrapper. If the DIMM Passive Monitor responds successfully to accesses from the DFI VIP, but the DIMM wrapper does not, then it exposes potential bugs in the DUT Components or in the settings of their AC/Timing parameters inside the DIMM wrapper.

DIMM Component DUT testbench with DFI VIP and a DIMM Passive Monitor Memory Model

Debuggability, Interface Visibility, and Protocol Compliance

One of the key benefits of the DIMM Discrete Component Models that become manifest, whether in terms of the unique use-case scenario described here, or when working with the wholly unified DDR5 DIMM Memory Models, is the increased debuggability of the protocol functionality. The intentional separation of the discrete components of a DIMM allows the user to have full visibility of the memory traffic at every datapath landmark within a DIMM structure. For example, in modeling an LRDIMM or MRDIMM, the interface between the RCD component and the SDRAM components, the interface between the RCD component and the DB components, and the interface between the SDRAM components and the DB components—all are visible and accessible to the user. The user has full access to dump the values and states of the wire interconnects at these interfaces to the waveform viewer and thus can observe and correlate the activity against any protocol violations flagged in the trace logs by any one or more of the DIMM Discrete Component Models.

Access to these interfaces is freely available when using the DIMM Discrete Component Models. On the unified DDR5 DIMM Memory Models, a feature called Debug Ports enables the same level of visibility into the individual interconnects amidst the SDRAM components, RCD components, and DB components. When combined with the Waveform Debugger[5] capability that comes built-in with the VIPs and Memory Models offered by Cadence and used with the Cadence Verisium Debug[6] tool, the enhanced debuggability becomes a powerful platform.

With these debug accesses enabled, the user can pull out transaction streams, chip state and bank state streams, mode register streams, and error message streams all right next to their RTL signals in the same Verisium Debug waveform viewer window to debug failures all in one place. The Verisium Debug tool also parses all of the log files to probe and extract messages into a fully integrated Smart Log in a tabbed window fully hyperlinked to the waveform viewer, all at your fingertips.

A Solution for Every Scenario

Cadence's DDR5 DIMM Memory Models and DIMM Discrete Component Models, partnered with the Cadence DFI VIP, can provide users with a robust and flexible approach to validating their designs thoroughly and effectively in pre-silicon verification environments ahead of tapeout commitments. The solution offers unparalleled latitude in debuggability when the Debug Ports and Waveform Debugger functions of the Memory Models are switched on and boosted with the use of the Cadence Verisium Debug tool.


[1] Shyam Sharma, DDR5 DIMM Design and Verification Considerations, 13 Jan 2023.

[2] Shyam Sharma, DDR5 UDIMM Evolution to Clock Buffered DIMMs (CUDIMM), 23 Sep 2024.

[3] Kos Gitchev, DDR5 12.8Gbps MRDIMM IP: Powering the Future of AI, HPC, and Data Centers, 26 Aug 2024.

[4] Chetan Shingala and Salehabibi Shaikh, How to Verify JEDEC DRAM Memory Controller, PHY, or Memory Device?, 29 Mar 2022.

[5] Rahul Jha, Cadence Memory Models - The Gold Standard, 15 Apr 2024.

[6] Manisha Pradhan, Accelerate Design Debugging Using Verisium Debug, 11 Jul 2023.

Randomization considerations for PCIe Integrity and Data Encryption Verification Challenges

$
0
0
Peripheral Component Interconnect Express (PCIe) is a high-speed interface standard widely used for connecting processors, memory, and peripherals. With the increasing reliance on PCIe to handle sensitive data and critical high-speed data transfer, e...(read more)

USB4 Sideband Channel Is Not a Side Business

$
0
0

The USB4 specification has been around for several years now. Two years ago, USB4 version 2.0 was also released by the USB Promoter Group. This specification enables up to 80Gbps link speed per direction in symmetric mode and 120Gbps link speed in asymmetric mode.

Be it Gen 2, Gen 3, or Gen 4 link speeds of 20Gbps, 40Gbps, 80Gbps, or 120Gbps, the sideband channel is indispensable for the stable operation of the high-speed link. It plays a multi-faceted role.

The sideband channel is a simple two-wire low-speed link operating at 1Mbps using 10-bit start/stop encoding that does not require any synchronization of its own. It supports different transaction types, namely, Link Type (LT), Administrative Type (AT), and Re-timer Type (RT), each having their specific roles mentioned in the table below.

The sideband channel role starts from the very beginning of bringing-up of a USB4 High-Speed link through various phases of lane initialization, to link establishment, handling different aspects of link management like Sleep entry, Disconnect/Re-connect, and to supporting link recovery when the link becomes unreliable for operation.

During lane initialization phases, its functionality ranges from being a medium for router detection by way of SBTX line being made logic high by the router in phase 2, to helping each router in getting USB4 port characteristics information of its link partner in phase 3, to broadcasting the negotiated USB4 port parameters and starting High-Speed Tx and Rx traffic in phase 4, and to negotiating transmitter Feed-Forward Equalization (TxFFE) parameters between each router or re-timer and a router or re-timer adjacent to it in phase 5.

In phase 3, the USB4 port characteristics that are exchanged determine the nature of the USB4 link. These characteristics indicate bonding support, lane speed (Gen 4, Gen 3, or Gen 2), RS-FEC enabled, and asymmetric link support (3Tx, 3Rx, or only symmetric link support).

During phase 4, a USB4 port sets the lane speed and starts transmission. Routers send broadcast RT transactions to each other bearing the negotiated link characteristics. Once these transactions are complete, routers activate the transmitter on each enabled lane at the selected speed and send SLOS1 on the high-speed link. For a Gen 3 or Gen 2 link, a router indicates this start of transmission to the link partner by sending an LT_Resume transaction and setting Tx active bit in the TxFFE register of sideband register space. For a Gen 4 link, a router transmits LFPS after setting the start TxFFE bit in the SB register space of the link partner to indicate that TxFFE negotiation process can start. After LFPS is stopped, back-to-back Gen4 TS1 with the indication field set to 1h are transmitted.

At the end of phase 4, the transmitters and receivers are ready to go to phase 5, where transmitter Feed-Forward Equalization (TxFFE) parameters between each link partner are negotiated. On completion of phase 5, the high-speed link is ready to send and receive traffic and proceed with high-speed link training.

While all sideband transaction types support the lane initialization processes in various phases, link type transactions have an important role to play once the USB4 high-speed link becomes active.

LT transactions are also used to signal a change in the lane adapter state due to events such as a lane disconnect or lane disable. They are also used for communicating sleep events from a USB4 port in a router to a USB4 port on the other router.

The Gen 4 link transitions from symmetric to asymmetric and vice-versa between two routers get requested and acknowledged using LT_SwitchRx2Tx and LT_SwitchAck link type transactions, respectively.

The Gen 4 link recovery uses ELT_Recovery extended link type transaction to initiate and complete the link recovery.

The sideband channel for a TBT3-Compatible USB4 port should also honor the variations that it has for TBT3-compatibility. This includes support for the bidirectional sideband channel. 

From a functionality perspective, the sideband channel may look simple. However, it has a very wide scope and a broad impact on USB4 functioning. It is, in essence, a system that contributes significantly to the smooth running of High-Speed USB4 Link.

Hence, verifying high-speed USB4 functionalities in combination with the sideband channel cannot be overlooked. It is an important verification planning advice that qualifying a USB4 design with the sideband channel functionality should never be underestimated.

To save on simulation run time for high-speed link verification, an approach that is usually taken is to bypass the sideband channel processes if the design under verification supports it and focus only on high-speed link protocol traffic. However, it does not in any way reduce the importance of verifying the high-speed link with the sideband channel processes enabled before the closure of the verification cycle, as the sideband channel is responsible for establishing, maintaining, and providing a recovery mechanism in a USB4 port.

Cadence has a mature Verification IP solution for the verification of various aspects of USB4 version 2.0 and version 1.0 design, with verification capabilities provided to do a comprehensive verification of these.

You may refer to https://www.cadence.com/en_US/home/tools/system-design-and-verification/verification-ip/simulation-vip.html for more information. 




<script src="https://jsc.adskeeper.com/r/s/rssing.com.1596347.js" async> </script>