“Simulation Acceleration” and “Accelerated Verification” are terms commonly used to describe a verification environment in which the Device Under Test (DUT) is synthesized and runs on the emulator, while the testbench runs on the simulator. This blog describes a technique for maximizing the performance of the simulator and thus increasing the overall performance of Simulation Acceleration.
In Simulation Acceleration (SA), the verification environment is partitioned into two levels. The lower level contains the components that are connected to the DUT – the BFM, transactor, collector, and so on. These components are implemented in a synthesizable language and loaded onto the emulator with the DUT. The upper level of the verification environment implements functionalities such as sequences, generation constraints, and checkers. These components run on the simulator and are often modeled at the transaction level (TLM).
There is no real parallelism between these two platforms. While the synthesized code runs on the emulator, the simulator waits. When there is a context switch (for example, when the BFM calls get_next_item()), the emulator halts and the simulator runs. When the simulator performs another context switch, the emulator resumes execution and the simulator halts, and so on until the end of the test.
This flow is illustrated in the following diagram. When the test begins, the upper level generates the units and their fields, runs initialization methods, and so forth. After generating the first data item or items, it passes them to the lower level (for example, via SCEMI pipes). The lower-level transactors then interact with the DUT, and the upper level halts. When the lower level needs to pass a status to the upper level, or to request the next items, the upper level starts running again and the lower level halts, and so on until the end of the test.
In a paper presented at DVCon 2012 (summarized in this blog), Intel Principal Engineer Blake French, a Validation Architect for the Xeon Phi™ product line, presented a methodology defined by his team and aimed at improving the performance of Simulation Acceleration. His solution focused on the time spent in the simulator, Specman. The revolutionary methodology that he described was to omit all temporal activity from the e code. If there are no events and no TCMs, the Specman kernel is not activated after every context switch. As the Intel analysis showed, this methodology resulted in an impressive performance improvement, from 40 kHz to 60 kHz.
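To put the reported figures in perspective, here is the arithmetic they imply. It assumes the kHz numbers are design clock cycles per second of wall-clock time. It also uses a simple two-term model that follows from the emulator and the simulator never running in parallel; the symbols t_emu and t_sim are introduced here for illustration and are not taken from the paper.

```latex
\begin{align*}
t_{\text{cycle}} &= t_{\text{emu}} + t_{\text{sim}} \\
t_{\text{cycle}}^{\text{before}} &= \frac{1}{40\,\text{kHz}} = 25\,\mu\text{s},
\qquad
t_{\text{cycle}}^{\text{after}} = \frac{1}{60\,\text{kHz}} \approx 16.7\,\mu\text{s} \\
\text{speedup} &= \frac{60}{40} = 1.5,
\qquad
\text{wall-clock time saved} = 1 - \frac{40}{60} = \frac{1}{3} \approx 33\%
\end{align*}
```

Under this model, and assuming the emulator time per cycle is unchanged, the roughly 8.3 µs saved per cycle comes entirely from the simulator side.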
To implement such an environment, the interface between the HW and SW is based on function calls, using DPI. The following diagram illustrates this flow–the only interface between the levels consists of function calls. When e code is called, it performs the required action (update the reference model, generate data items, …), but it does not start any TCM or emit any events.
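As a rough illustration of what a function-call-only interface can look like on the hardware side, here is a minimal SystemVerilog sketch using a standard DPI-C import. The module, port, and function names (collector_sketch, pkt_valid, pkt_data, tb_new_item) are hypothetical. The sketch also does not show the declarations Specman itself generates for an exported e method; that part is tool-specific and is left out here.

```systemverilog
// Illustrative sketch only: module, port, and function names are hypothetical.
module collector_sketch (
  input logic        clk,
  input logic        pkt_valid,
  input logic [31:0] pkt_data
);
  // Assumed testbench-side function, reachable through DPI-C.
  // As a function (not a task), it cannot consume simulation time.
  import "DPI-C" function void tb_new_item(input int unsigned data);

  // One function call per collected item; no events or tasks are involved.
  always @(posedge clk) begin
    if (pkt_valid)
      tb_new_item(pkt_data);
  end
endmodule
```

Because the imported routine is a function, control returns to the caller as soon as the software side finishes its work, which matches the requirement that the called code does not start any time-consuming thread.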
After French and his team presented this methodology and its considerable effect on the SA performance, many customers approached Cadence asking for this “No Kernel Specman you gave to Intel.” Well, in 2012 there wasn’t any special version of Specman. To implement an environment in which there are no temporals defined in the e code, Intel had to tailor its code to work around Specman’s default behavior.
In the last several years, however, Specman has been enhanced to support this “No Temporals” use model, so that creating such verification environments is now much simpler.
The first enhancement to support this mode was the addition of a procedural interface to coverage. e coverage groups are based on events, and coverage is sampled when the relevant event is emitted. To enable coverage collection in a “No Temporals” mode–an environment in which no events are emitted–a new method was added. With this procedural interface for triggering coverage collection of a cover group, event emission is not compulsory. Instead, one can call the method covers.sample_cg().
For example:
extend my_monitor_u {
    cur_pkt : pkt;

    // This method is exported, to be called by the collector
    new_item(pkt : pkt) is {
        cur_pkt = pkt;
        sample_coverage();
    };

    sample_coverage() is {
        covers.sample_cg("my_monitor_u.pkt_ended", me);
    };
};

export DPI-C my_monitor_u.new_item();
Another challenge with the “No Temporals” mode is garbage collection, which Specman normally performs automatically. When the kernel is not active, garbage collection must be initiated from user code. A new function, snDoGcIfNeeded(), was added to address this issue. This function checks the memory and, if required, performs garbage collection.
For example:
import snSvDPI::*;

module bfm();
    task drive_transaction(inout trans trans);
        //..
        // Let Specman check memory and collect garbage if required
        snDoGcIfNeeded();
    endtask
endmodule
The newest addition to the set of enhancements supporting the “No Temporals” mode will be included in the next release of Specman, 16.1. By default, Specman behavior relies on the assumption that a test scenario is implemented in TCMs. If no TCM is started when the test begins (in run() ), the test ends. To overcome this undesired termination of the test, a new flag, -no_temporal_expressions, is being added to Specman. When this flag is set, the test will not stop, even when there are no active e threads.
With these Specman capabilities (a function-based interface, procedural activation of coverage collection, a procedural interface to garbage collection, and the no_temporal_expressions flag), implementing the “Acceleration Verification with no Specman kernel” that attracted so much attention and so many fans when Intel introduced it four years ago will be much simpler.
Efrat Shneydor, Team Specman