Dump Improvements
Abstract
In z/VM 6.4, APAR VM65989 changed dumping so that the system operator can reduce the size of the dump by omitting data that is usually extraneous. z/VM 7.1 further reduces the size of the dump by dumping the map of real memory in a more efficient manner. z/VM 7.1 also includes CPU efficiency changes that let the dump complete with less CPU time.
In our experiments with a workload running in a 1024 GB LPAR, exploiting all improvements resulted in a 97% reduction in dump elapsed time and a 99% reduction in dump size, compared to exploiting none of them. Customers' results will vary according to system configuration and workload.
Introduction
In PTF UM35132 for APAR VM65989, z/VM 6.4 was changed to use a more efficient channel program for writing dumps. The change in the structure of the channel program improved the I/O performance of dumps. A separate article describes that change.
Further dump improvements include more than just channel program optimization.
- PTF UM35132 also includes a new SNAPDUMP and SET DUMP operand, PGMBKS NONE, that lets the system operator omit page management blocks (PGMBKs) from the dump. PGMBKs, CP data structures that map guest real storage, are seldom useful in diagnosis. Omitting them from the dump decreases dump time and decreases dump size, usually without compromising the usefulness of the dump.
- z/VM 7.1 includes a new SNAPDUMP and SET DUMP operand, FRMTBL NO, that lets the system operator use an alternate method for dumping the information contained in the real frame table. Instead of writing the frame table itself, the new method writes a new data structure, the correlation table. The correlation table expresses the same information as the real frame table, but it is much smaller and so it can be dumped in less space and in less time.
- z/VM 7.1 also reduces the amount of CPU time required for calculating what to dump. It does this by using optimized subroutines and by using Prefetch Data (PFD) to have the processor prefetch real frame table rows, so that by the time they are needed, they are already in cache. These changes are always in play; there is no command operand to invoke them. This fix is not in the z/VM 7.1 base but rather is found in APAR VM66176, available concurrently with the GA of z/VM 7.1.
This article describes the effects of all those enhancements.
Method
A workload was devised to populate storage. The workload was built in such a way that it would populate storage in about the same fashion each time it was run. A snap dump was then taken. After the snap dump was taken, the dump was loaded from spool onto minidisk. The loaded file was then analyzed to calculate how many 4 KB records were written during the dump, how much elapsed time was used, and how much CPU time was used.
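The size comparison described above comes down to simple arithmetic on 4 KB record counts. The following Python sketch illustrates it. The function names and the record counts are hypothetical, chosen only for illustration. They are not measurements from this study and not part of any z/VM tool.

```python
RECORD_BYTES = 4 * 1024  # each dump record is 4 KB


def dump_size_gib(records: int) -> float:
    """Convert a count of 4 KB dump records to GiB."""
    return records * RECORD_BYTES / 2**30


def percent_reduction(before: float, after: float) -> float:
    """Percentage by which 'after' is smaller than 'before'."""
    if before <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical record counts, for illustration only.
    baseline_records = 1_000_000
    enhanced_records = 10_000
    print(f"baseline size: {dump_size_gib(baseline_records):.2f} GiB")
    print(f"enhanced size: {dump_size_gib(enhanced_records):.2f} GiB")
    print(f"reduction: {percent_reduction(baseline_records, enhanced_records):.1f}%")
```

The same percent_reduction calculation applies unchanged to elapsed time and to CPU time.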
Dumps were done using z/VM 6.4 plus VM65989 and also using z/VM 7.1, exploiting increasing levels of the enhancements. To suppress PGMBKs, the SNAPDUMP operand PGMBKS NONE was used. To dump a correlation table instead of the frame table, the SNAPDUMP operand FRMTBL NO was used.
Results and Discussion
Effects of PGMBK Omission and Correlation Table
Table 1 shows the effects of PGMBK suppression and correlation table exploitation on dump size and dump time.
Suppressing PGMBKs reduced the size of the dump by 82% and reduced the dump time by 70%. Changing from the frame table to the correlation table reduced the size of the dump by 95% and reduced the dump time by 83%. The effect of both changes used together was a 99% reduction in dump size and a 95% reduction in dump elapsed time.
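One way to read these figures is as successive steps, with the correlation table percentage taken relative to a dump that already omits PGMBKs. That reading is an assumption, not something the measurements state. Under it, the remaining fractions multiply, and the short Python sketch below (the function name is hypothetical) checks whether the stated combined reductions are consistent with the individual ones.

```python
def combined_reduction(*stage_reductions_pct: float) -> float:
    """Combine successive percentage reductions.

    Assumes each stage's reduction is relative to the output of the
    previous stage, so the remaining fractions multiply.
    """
    remaining = 1.0
    for pct in stage_reductions_pct:
        remaining *= 1.0 - pct / 100.0
    return 100.0 * (1.0 - remaining)


if __name__ == "__main__":
    # Percentages taken from the text: PGMBK suppression, then correlation table.
    size = combined_reduction(82, 95)
    elapsed = combined_reduction(70, 83)
    print(f"combined size reduction: {size:.1f}%")
    print(f"combined elapsed time reduction: {elapsed:.1f}%")
```

Under this assumption the combined values round to 99% for dump size and 95% for dump elapsed time, which agrees with the combined results reported above.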
Effect of CPU Mitigation, 512 GB
Table 2 shows the effect of CPU mitigation on a 512 GB dump.
In our 512 GB measurement, CPU mitigation reduced elapsed time by 41%.
Effects, 1024 GB Workload
Table 3 shows the effects of the various enhancements on our 1024 GB experiment.
Compared to having no enhancements in play, having all enhancements in play reduced the size of the dump by 99% and reduced the dump elapsed time by 97%.
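Percentage reductions this close to 100% can be easier to compare as ratios. The Python sketch below converts a reduction into an approximate "times smaller" or "times faster" factor. The function name is hypothetical. Because the reported percentages are rounded, the resulting factors are rough indications only.

```python
def reduction_to_factor(reduction_pct: float) -> float:
    """Convert a percentage reduction into a before/after ratio."""
    if not 0 <= reduction_pct < 100:
        raise ValueError("reduction must be in the range [0, 100)")
    return 100.0 / (100.0 - reduction_pct)


if __name__ == "__main__":
    # Percentages taken from the text for the 1024 GB experiment.
    print(f"dump size: about {reduction_to_factor(99):.0f}x smaller")
    print(f"elapsed time: about {reduction_to_factor(97):.0f}x shorter")
```

A rounded 99% could stand for anything from about 98.5% to 99.5%, that is, a factor of roughly 67 to 200, so the size ratio in particular should be read as an order of magnitude.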
Summary and Conclusions
In our measurement in our 1024 GB LPAR, the PGMBK suppression operand, the correlation table operand, and the CPU mitigation improvements combined to reduce dump size by 99% and dump elapsed time by 97%. Customer experience will vary by hardware configuration and by workload.