The Queued Direct I/O (QDIO)
Enhanced Buffer State Management (QEBSM) facility provides
an optimized mechanism for
transferring data via QDIO (including FCP, which uses QDIO)
to and from virtual machines.
Before this facility was available, z/VM had
to mediate between the virtual machine
and the OSA-Express or FCP adapter during
QDIO data transfers. With QEBSM, z/VM is not involved
with a typical QDIO data transfer when the guest operating system or
device driver supports the facility.
Starting with z/VM 5.2.0 and the z990/z890 with QEBSM Enablement
applied (refer to
Performance Related APARs
for a list of required maintenance),
a program running in a virtual machine has the
option of using QEBSM when performing QDIO operations.
With QEBSM,
the processor millicode performs the shadow-queue
processing that z/VM would otherwise perform for a QDIO operation. This
eliminates the z/VM and hardware overhead associated with SIE entry and
exit for every QDIO data transfer. The shadow-queue processing still
requires processor time, but much less than when it is done in
software. The net effect is a small increase in virtual CPU time
coupled with a much larger decrease in CP CPU time.
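To make that trade-off concrete, the following minimal sketch adds CP and
virtual CPU time per transaction; the figures are invented for illustration
only and are not taken from these measurements.

    # Hypothetical illustration of the QEBSM CPU-time trade-off.
    # The numbers below are made up; see the tables later in this
    # section for the measured values.

    def total_cpu_per_tx(cp_usec, virt_usec):
        """Total CPU time per transaction is CP time plus virtual time."""
        return cp_usec + virt_usec

    # Without QEBSM, z/VM (CP) does the shadow-queue processing.
    without_qebsm = total_cpu_per_tx(cp_usec=60.0, virt_usec=100.0)
    # With QEBSM, the millicode does it; virtual time rises slightly,
    # CP time drops sharply.
    with_qebsm = total_cpu_per_tx(cp_usec=15.0, virt_usec=105.0)

    print(f"without QEBSM: {without_qebsm:.1f} usec/tx")
    print(f"with QEBSM:    {with_qebsm:.1f} usec/tx")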
This section summarizes measurement results for Linux
communicating over a QDIO connection under z/VM 5.1.0, compared with
equivalent measurements under z/VM 5.2.0
with QEBSM active.
The Application Workload Modeler (AWM) was used to drive the workload
for OSA and HiperSockets. (Refer to
AWM Workload for more information.)
A complete set of runs was done for the RR (request-response) and STR (streaming) workloads.
IOzone was used to drive the workload for native SCSI (FCP) devices.
Refer to Linux IOzone Workload for
details.
The measurements were done on a 2084-324 with two dedicated
processors in each of the two LPARs used.
The guests running under z/VM used an internal Linux driver at level
2.6.14-16 that supports QEBSM.
Two LPARs were used for the OSA and HiperSockets measurements.
The AWM client ran in one LPAR and the AWM server ran in the other
LPAR. Each LPAR had 2GB of main storage and 2GB of expanded storage.
CP Monitor data were captured for one LPAR (client side) during the
measurement and were reduced using Performance Toolkit for VM (Perfkit).
One LPAR was used for the FCP measurements. CP Monitor data
and hardware instrumentation data were captured.
The direct effect of QEBSM is to decrease CPU time.
This, in turn, increases throughput in
cases where throughput had been limited by CPU usage.
This effect is demonstrated in the tables below for all three cases.
The following tables compare the measurements for OSA,
HiperSockets, and FCP. The %diff numbers shown are the
percent increase (or decrease) from the z/VM 5.1.0 measurement
to the z/VM 5.2.0 measurement.
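The %diff values can be read as a simple relative change from the 5.1.0
value to the 5.2.0 value. The sketch below shows that calculation with
made-up example values.

    def pct_diff(v510, v520):
        """Percent increase (positive) or decrease (negative) from the
        z/VM 5.1.0 measurement to the z/VM 5.2.0 measurement."""
        return (v520 - v510) / v510 * 100.0

    # Example with invented values: throughput up, CP CPU time per
    # transaction down.
    print(f"{pct_diff(100.0, 112.0):+.1f}%")  # +12.0%
    print(f"{pct_diff(50.0, 20.0):+.1f}%")    # -60.0%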
The following table shows the results for FCP. Values are
provided for total CPU time per transaction (µsec), CP CPU
time per transaction (µsec), and virtual CPU time per
transaction (µsec).
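Per-transaction CPU figures of this kind are normally obtained by dividing
measured CPU time by the transaction count. The sketch below illustrates
that arithmetic with invented numbers; it is not the report's actual
data-reduction procedure.

    def usec_per_tx(cpu_seconds, transactions):
        """Convert CPU seconds over a run into microseconds per transaction."""
        return cpu_seconds * 1_000_000 / transactions

    # Made-up run: 30 CPU-seconds of CP time and 90 CPU-seconds of
    # virtual time spent over 1,000,000 transactions.
    cp_per_tx = usec_per_tx(30.0, 1_000_000)
    virt_per_tx = usec_per_tx(90.0, 1_000_000)
    print(f"CP:      {cp_per_tx:.1f} usec/tx")
    print(f"virtual: {virt_per_tx:.1f} usec/tx")
    print(f"total:   {cp_per_tx + virt_per_tx:.1f} usec/tx")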