Specman's Automatic GC Settings mechanism is aimed at eliminating the need for users to control the parameters that determine each garbage collection's behavior.
Setting config mem -automatic_gc_settings=STANDARD tells Specman to calculate all of these parameters itself, so that its memory management system works optimally.
The only parameter left for the user to play with is -optimal_process_size (aka OPS). This parameter is important because many of the other automatically calculated parameters are derived from it. To set it optimally, one should ask the following question:
WHAT SIZE MEMORY IMAGE DO I WANT MY PROCESS TO HAVE?
Let's say, for instance, that you have 20 GB free on your machine, and you have 2 simulations running in parallel. The optimal process size for each simulation would be 10GB, so you just assign OPS a value of 10GB.
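As a back-of-the-envelope sketch of that arithmetic in Python (the variable names are ours, purely for illustration):

    # Divide the machine's free RAM evenly among the parallel simulations
    free_ram_gb = 20
    parallel_sims = 2
    ops_gb = free_ram_gb / parallel_sims  # -> 10 GB per Specman process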
Now, what if you don't know? Or don't care?
Specman then sets this value itself, based on the amount of available RAM on the machine on which the Specman process is to run. Note that this can be quite a big number. It might make Specman run fast (hardly performing any GCs), but it consumes a lot of memory. The reasoning is that if the user does not care about the process size, Specman will use the maximum available memory in order to produce a smooth run.
Realistically, though, most users do care about their process size and want to limit it on one hand, while giving it enough liberty to avoid memory issues on the other. So, is there a good way to calculate an efficient OPS, one that ensures the machine uses only as many resources as needed?
Let's start by saying that it is not possible to know exactly how memory settings will affect a program run unless they have already been measured on EXACTLY the same run. Nevertheless, knowledge from similar runs can give us some hints. We can collect run information and analyze it in a way that helps us understand how efficiently the environment runs and whether we need to take action to achieve fewer OOM (Out Of Memory) failures, better performance, more effective memory utilization, etc. However, when dealing with batch tests, choosing a specific test as "representative" for measurement purposes can be as difficult as choosing specific memory settings, so the information must be collected on a representative group of runs, or on all the batch runs.
So what exactly do we need to look for in our log file in order to calculate the optimal size for our environment? Let's first introduce three concepts:
1) "Static" (Live) Specman heap - this is the basic size of Specman dynamic memory that mostly belongs to persistent objects and generally remains stable during the simulation
2) "Static" non-Specman heap - Same as 1) but for all the rest of the players in the process
3) We would also like to define the memory requirement for the process during copy GC (since whilst Copy GC is operating, we are most likely to hit the peak of the memory consumption, as Specman might double its memory to perform the GC.
(2 X Maximum SN live heap) + (Garbage) + (Maximum non-SN heap)
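As a rough Python sketch of this estimate (the function and parameter names are ours, not Specman terminology):

    # Peak process memory expected while a copy GC is in progress,
    # per the formula above; all values in consistent units (e.g. MB)
    def peak_requirement(max_sn_live_heap, garbage, max_non_sn_heap):
        # Specman may temporarily double its live heap while copying
        return 2 * max_sn_live_heap + garbage + max_non_sn_heap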
Collecting the Relevant Data
Now, in order to sample these values, we need to collect the relevant memory and garbage collection information. To do so, set the following configuration parameters in the pilot simulation:
- print_process_size=TRUE - Prints the entire process size at three stages of every GC: before, at the peak, and after
- show_mem_raw=TRUE - Prints Specman's memory consumption, including the top consumers
- print_debug_msgs=TRUE - Prints debug messages, including the exact phases of GC
- setenv SPECMAN_MEMORY_ACCOUNTING - An environment variable that provides information about Specman's dynamic allocations
Finding a value for Specman Live heap
Analyzing the resulting log, let's first estimate the static Specman heap. There are three printouts that can help us determine this value:
1. The last line of the Copy (or disk-based) GC printout, which shows the new process size after the GC:
"Done - new size is nnn bytes"
For example:
MEMORY_DEBUG: process size after GC:
MEMORY_DEBUG: VSIZE = 1990940, RSS = 1792688
Done - new size is 1804478256 bytes.
2. "Total size of reachable data" line in show mem "Process sizes" table
For example:
Total allocated size of numerics: 12176 +
Total allocated size of structs: 34568K
Total size of reachable data: 1719M +
Total size in free blocks: 343K +
Total size of unreachable data: 375M
Heap size: 2096M
3. The last line of the OTF GC printout: "Done - total size of reachable data is nnn..."
For example:
MEMORY_DEBUG: process size at the peak memory usage:
MEMORY_DEBUG: VSIZE = 3653716, RSS = 3514240
Done - total size of reachable data is 1,096,707,344 bytes (plus 2,417,146,640 free).
There will be several instances of these printouts (as many as the number of GCs in the simulation), and we need to choose the highest value printed. Printout no. 1 above is the most accurate and we should take the value from it, but the others should also be considered (if show mem is used, and if an OTF GC is encountered).
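When the log contains many GC printouts, a small script can pull out the candidate values for you. The sketch below is only illustrative: it assumes the log lines look exactly like the examples above, and the regular expressions and names are ours.

    import re

    # Matches printout no. 1 (copy/disk-based GC) and no. 3 (OTF GC)
    new_size_re = re.compile(r"Done - new size is ([\d,]+) bytes")
    reachable_re = re.compile(r"Done - total size of reachable data is ([\d,]+) bytes")

    def max_live_heap(log_path):
        # Return the highest live-heap value reported in the log, in bytes
        best = 0
        with open(log_path) as log:
            for line in log:
                m = new_size_re.search(line) or reachable_re.search(line)
                if m:
                    best = max(best, int(m.group(1).replace(",", "")))
        return best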
Finding a value for Static non-SN heap
To estimate the static non-SN heap, take the VSIZE reported after a copy (or disk-based) GC and subtract from it the "Done - new size is nnn bytes" value from the line below it. Note that VSIZE is reported in kilobytes while the new size is in bytes, so convert to common units before subtracting. (Values obtained from OTF GC prints are not suitable here.)
MEMORY_DEBUG: process size after GC:
MEMORY_DEBUG: VSIZE = 1990940, RSS = 1792688
Done - new size is 1804478256 bytes.
In this case: 1990940K - 1804478256 bytes = 2,038,722,560 - 1,804,478,256 bytes =~ 223M.
So the maximum VSIZE after copy GC, which is expected to be "static SN heap" + "static non-SN heap", is an estimate of the minimum memory requirement of the environment.
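In Python form (the numbers are taken from the printout above; note the unit conversion):

    vsize_kb = 1990940            # VSIZE after copy GC, reported in KB
    new_size_bytes = 1804478256   # "Done - new size is ..." value, in bytes
    non_sn_bytes = vsize_kb * 1024 - new_size_bytes
    print(non_sn_bytes // 2**20)  # -> ~223 (MB of static non-SN heap)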
Finding a value for Dynamic allocation (Garbage)
There is one more thing to estimate -- the amount of memory used for dynamic allocations that get collected during GC. It depends on the environment and on how fast it allocates "transient" objects. If allocation happens fast, you need a large buffer so that GC is not triggered too often; if there are few dynamic allocations, the buffer can be relatively small. In most cases it is of the same order of magnitude as the static SN heap.
Example Calculation
Let's look at an example of how you could come up with numbers for OPS on a typical simulation run.
As per the above example:
"process size after GC" : VSIZE = 1990940 (~1944M)
"Done - new size is" : 1804478256 bytes (~1721M)
Static non-SN heap = 1944 - 1721 = 223M
Live SN heap = 1721M
Recommended Optimal Process Size = (Static non-SN heap) + 2 X (Live SN heap) + Dynamic allocation
Or
*OPS= (Static non-SN heap) + ~3X (Live SN heap)
OPS= (223) + (3 * 1721) = 5386M
*Since we estimated the dynamic allocation to be of the same order of magnitude as the Live SN heap, we used 3 times the value of the Live SN heap. In most cases the dynamic allocation value will be lower, so we can round down the result. For instance, in the above example:
OPS = 5386M =~ 5G.
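The whole example fits in a few lines of Python (values from above; the factor of 3 stands for 2 X Live SN heap plus roughly one more heap's worth of garbage):

    live_sn_mb = 1721   # Live SN heap: "new size" after copy GC
    non_sn_mb = 223     # static non-SN heap: VSIZE minus new size
    ops_mb = non_sn_mb + 3 * live_sn_mb  # -> 5386M, round down to ~5G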
Notes:
- The recommended OPS will not be reliable if we notice disk-based GC occurrences in the pilot simulation. In that case, set a value higher than the calculated OPS in order to avoid disk-based GC, and once it is successfully avoided, recalculate the OPS in the same manner.
- If you know your environment does not use a lot of dynamic allocations (in such cases you will see that the difference between the values of the Live SN heap before and after GC is small), you can change the above formula to something closer to
OPS= (Static non-SN heap) + 2X (Live SN heap)
and round it up (see the short sketch after these notes). This way you won't end up with a simulation that consumes a lot of memory but performs no GCs.
- If you set the OPS to a very low value, Specman will automatically adjust it. If you set the -notify_gc_settings option to TRUE, Specman will notify you whenever it sets the optimal_process_size to a value other than the one you specified. In that case, you will see a message such as:
auto_gc_settings: Application too big, setting optimal_process_size to
760803328 which is sn_uintptr2ep(760803328)
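As mentioned in the notes above, the low-garbage variant of the formula simply replaces the factor of 3 in the earlier sketch with 2:

    ops_mb = non_sn_mb + 2 * live_sn_mb  # -> 223 + 2 * 1721 = 3665M, round up to ~4G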
Summary:
Calculating the optimal OPS is a bit tricky: you want to limit Specman's memory usage on one hand, while giving it enough space to avoid memory issues on the other. The above example calculation gives you no more than a recommendation based on a previous run. To get a better, more realistic value, run several pilot simulations and perform the above analysis on the average of those runs.
* From SPMN 12.10s4 onward, this calculation is done automatically when you apply the config memory -print_process_size option.
Avi Farjoun
Muffadal Laila