To connect to an external network,
z/VM guests can use a dedicated OSA or a
vswitch.
This chapter compares how that choice
affects the transaction rate of request-response (RR)
workloads and the outbound data rate of
streaming (STR) workloads across a variety of configurations.
Introduction
The Dedicated OSA vs. VSWITCH chapter of
the z/VM 5.2 Performance Report
compared two connectivity options available for guests
running under z/VM: direct connection to OSA and
vswitch.
Here we present
an update of the z/VM 5.2 information.
This refresh
contains a comparison of key measurement points between the two
options and lists some of the reasons for choosing one over the other.
Customer results will vary according to system configuration and workload.
Method
Application Workload Modeler
(AWM), a Linux network benchmarking application,
was used to drive network traffic between
one client Linux guest and one server Linux guest.
Each guest was in its own dedicated LPAR.
Both dedicated OSA configurations and vswitch configurations
were evaluated.
Both request-response (RR) and streaming (STR) workloads were used.
The RR workload consisted of the client sending 200
bytes to the server and the server responding with 1000 bytes.
The STR workload consisted
of the client sending 20 bytes to the server and the server responding
with 20 MB.
Each measurement ran for 600 seconds.
The workloads were run in 12 configurations.
The configurations varied by maximum
transmission unit (MTU) size, SMT mode, and transport mode.
The table below
shows the combination of workloads and configurations used.
Each combination from Table 1
was run three times:
once using one socket connection,
once using 10 concurrent socket connections, and
once using 50 concurrent socket connections.
The measurements were done on a z15 8561-T01 using
two dedicated LPARs.
For SMT-1 runs,
each LPAR used two logical IFL cores.
For SMT-2 runs,
each LPAR used one logical IFL core.
Connectivity between the two
LPARs was over an OSA-Express6 10GbE card.
The software used
included z/VM 7.2 and Linux SLES 12 SP1.
Figure: Use of dedicated OSA to connect the client guest to the server guest.
In both environments, the server Linux guest ran in LPAR 1 and the client Linux guest
ran in LPAR 2. Each LPAR had 512 GB of central storage. CP monitor
data was captured for LPAR 1 (server side) during each
measurement and reduced using Performance Toolkit for VM (Perfkit).
The z/VM 5.2 measurements captured data from
the client side. For this new study,
the data was captured on the server side.
This more closely aligns
with the role typically played by a Linux guest.
Results and Discussion
The following tables contain the average of select metrics for each
run. For RR runs, the focus is on transaction rate. For STR runs, the
focus is on outbound data transmission rate. The tables also
compare the difference in these metrics between
the OSA and vswitch runs.
The % difference (%diff) values express the OSA result relative to
the vswitch result. A positive value means the OSA metric was that
percentage greater than the vswitch metric; a negative value means it was lower.
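The following is a minimal sketch of the %diff calculation. It assumes the conventional relative-difference formula with the vswitch result as the base, because the report does not state the formula explicitly. The function name percent_diff and the sample values are illustrative and are not taken from the measurements.

```python
def percent_diff(osa: float, vswitch: float) -> float:
    """Percent change of the OSA value relative to the vswitch value.

    Assumed formula: (OSA - vswitch) / vswitch * 100.
    """
    if vswitch == 0:
        raise ValueError("vswitch value must be nonzero")
    return (osa - vswitch) / vswitch * 100.0


# Illustrative numbers only, not measurement data.
print(round(percent_diff(150.0, 100.0), 2))  # 50.0: OSA was 50% greater
print(round(percent_diff(80.0, 100.0), 2))   # -20.0: OSA was 20% lower
```

Under this convention the sign has to be read per metric: a positive value is favorable for ETR or data rate, while a negative value is favorable for CPU cost.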
In general, a Linux guest using a dedicated OSA gets higher
throughput and uses less CPU time than a Linux guest connected
through a vswitch. However, this must be balanced against
advantages gained using the vswitch, such as:
- Ease of network design
- Ability to share network resources (OSA card)
- Management of the network, including security and capabilities available to the z/VM guest on the LAN
- Measurement of the network via z/VM monitor records
Total CPU msec/transaction 0.01025 0.00696 0.00475 0.01015 0.00686 0.00464
Emul CPU msec/transaction 0.00927 0.00657 0.00465 0.00920 0.00648 0.00454
CP CPU msec/transaction 0.00098 0.00039 0.00010 0.00095 0.00038 0.00010
% difference
ETR 72.93% 68.54% 78.90% 73.87% 67.39% 80.43%
Total CPU msec/transaction 19.88% 44.70% -6.31% 12.53% 44.73% -6.26%
Emul CPU msec/transaction 57.39% 84.55% 15.67% 49.11% 85.14% 17.01%
CP CPU msec/transaction -63.16% -68.80% -90.48% -66.67% -69.35% -90.65%
Notes:
8561-T01, 2 dedicated IFL cores, 512 GB central storage, OSA-Express6 10GbE card,
z/VM 7.2 of May 7, 2020, Linux SLES 12 SP1.
The ETR of the OSA runs was 67.39% to 80.43%
higher than that of the equivalent vswitch runs when running the RR workload in an SMT-1 configuration with an MTU size of 1492.
The total CPU per transaction of the OSA runs ranged from 6.31% lower to 44.73% higher than that of the equivalent vswitch runs.
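The absolute CPU rows in the table above appear to decompose as total CPU = emulation CPU + CP CPU. The sketch below checks that relationship using the values copied from those rows. The helper name components_match and the rounding tolerance are assumptions made for illustration.

```python
# Values copied from the CPU msec/transaction rows shown above.
emul = [0.00927, 0.00657, 0.00465, 0.00920, 0.00648, 0.00454]
cp = [0.00098, 0.00039, 0.00010, 0.00095, 0.00038, 0.00010]
total = [0.01025, 0.00696, 0.00475, 0.01015, 0.00686, 0.00464]


def components_match(emul, cp, total, tol=5e-6):
    """Return True if emul + cp equals total within the rounding tolerance."""
    return all(abs(e + c - t) <= tol for e, c, t in zip(emul, cp, total))


print(components_match(emul, cp, total))  # True
```

Because the total is the sum of the two components, a large percentage reduction in CP time can be offset by an increase in emulation time. This is why the total CPU difference is much smaller than the CP difference.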
Total CPU msec/transaction 0.01192 0.00802 0.00586 0.01177 0.00782 0.00576
Emul CPU msec/transaction 0.01101 0.00762 0.00573 0.01086 0.00743 0.00564
CP CPU msec/transaction 0.00091 0.00040 0.00013 0.00091 0.00039 0.00012
% difference
ETR 70.39% 72.17% 108.11% 72.26% 69.59% 110.18%
Total CPU msec/transaction 12.56% 40.46% -12.54% -6.29% 39.64% -12.46%
Emul CPU msec/transaction 39.90% 73.97% 7.71% 18.95% 73.60% 8.46%
CP CPU msec/transaction -66.54% -69.92% -90.58% -73.47% -70.45% -91.30%
Notes:
8561-T01, 1 dedicated IFL core, 512 GB central storage, OSA-Express6 10GbE card,
z/VM 7.2 of May 7, 2020, Linux SLES 12 SP1.
The ETR of the OSA runs was 69.59% to 110.18%
higher than the equivalent vswitch runs when running the RR workload in an SMT-2 configuration with an MTU size of 1492.
The total CPU per transaction of the OSA runs ranged from 12.54% lower to 40.46% higher than that of the equivalent vswitch runs.
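The ranges quoted above can be read directly from the % difference rows of the table. As a small sketch, the following takes the minimum and maximum of each row and phrases them as "lower" or "higher". The values are copied from the table; the helper name describe_range is hypothetical.

```python
# % difference rows copied from the table above (RR, SMT-2, MTU 1492).
etr_diff = [70.39, 72.17, 108.11, 72.26, 69.59, 110.18]
total_cpu_diff = [12.56, 40.46, -12.54, -6.29, 39.64, -12.46]


def describe_range(values):
    """Summarize a row of % differences as its smallest and largest value."""
    def side(v):
        # Positive means OSA was higher than vswitch, negative means lower.
        return f"{abs(v):.2f}% {'higher' if v >= 0 else 'lower'}"
    return f"{side(min(values))} to {side(max(values))}"


print(describe_range(etr_diff))        # 69.59% higher to 110.18% higher
print(describe_range(total_cpu_diff))  # 12.54% lower to 40.46% higher
```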
Total CPU msec/Outbound MB -50.88% -36.18% -21.85% -45.89% -20.79% -13.33%
Emul CPU msec/Outbound MB -15.00% -18.51% 2.21% -10.29% 0.15% 11.85%
CP CPU msec/Outbound MB -99.90% -98.52% -96.89% -99.87% -97.48% -98.70%
Notes:
8561-T01, 2 dedicated IFL cores, 512 GB central storage, OSA-Express6 10GbE card,
z/VM 7.2 of May 7, 2020, Linux SLES 12 SP1.
The outbound data rate of the OSA runs was 11.23% to 94.39%
higher than the equivalent vswitch runs when running the STR workload in an SMT-1 configuration with an MTU size of 1492.
The total CPU per outbound MB of the OSA runs was 13.33% to 50.88% lower than that of the equivalent vswitch runs.
Total CPU msec/Outbound MB -52.52% -31.83% -22.28% -49.65% -19.38% -18.65%
Emul CPU msec/Outbound MB -24.07% -12.40% -0.03% -19.21% 3.21% 4.92%
CP CPU msec/Outbound MB -99.88% -98.70% -98.06% -99.88% -96.58% -98.67%
Notes:
8561-T01, 1 dedicated IFL core, 512 GB central storage, OSA-Express6 10GbE card,
z/VM 7.2 of May 7, 2020, Linux SLES 12 SP1.
The outbound data rate of the OSA runs was 22.72% to 100.26%
higher than the equivalent vswitch runs when running the STR workload in an SMT-2 configuration with an MTU size of 1492.
The total CPU per outbound MB of the OSA runs was 18.65% to 52.52% lower than that of the equivalent vswitch runs.
Total CPU msec/Outbound MB -21.83% -12.16% -11.69% -14.17% -3.27% -6.70%
Emul CPU msec/Outbound MB 29.58% 37.20% 38.63% 38.82% 48.17% 40.82%
CP CPU msec/Outbound MB -97.35% -95.99% -94.77% -96.38% -95.71% -94.28%
Notes:
8561-T01, 2 dedicated IFL cores, 512 GB central storage, OSA-Express6 10GbE card,
z/VM 7.2 of May 7, 2020, Linux SLES 12 SP1.
The outbound data rate of the OSA runs was 0.26% lower to 49.00%
higher than the equivalent vswitch runs when running the STR workload in an SMT-1 configuration with an MTU size of 8992.
The total CPU per outbound MB of the OSA runs was 3.27% to 21.83% lower than that of the equivalent vswitch runs.
Total CPU msec/Outbound MB -27.25% -16.69% -13.91% -18.52% -10.99% -14.57%
Emul CPU msec/Outbound MB 16.01% 26.25% 30.99% 28.41% 31.49% 23.64%
CP CPU msec/Outbound MB -97.42% -96.28% -95.05% -96.61% -95.92% -94.98%
Notes:
8561-T01, 1 dedicated IFL core, 512 GB central storage, OSA-Express6 10GbE card,
z/VM 7.2 of May 7, 2020, Linux SLES 12 SP1.
The outbound data rate of the OSA runs was 0.95% lower to 87.79%
higher than the equivalent vswitch runs when running the STR workload in an SMT-2 configuration with an MTU size of 8992.
The total CPU per outbound MB of the OSA runs was 10.99% to 27.25% lower than that of the equivalent vswitch runs.
Summary
The results of the experiments conducted for this report indicate that
for a request-response (RR) workload,
Linux guests using a dedicated OSA
experience a greater ETR
than Linux guests using a vswitch.
Further, for a streaming (STR) workload,
Linux guests using a dedicated OSA
experience a roughly equal or greater outbound data rate
than Linux guests using a vswitch.
The degree of improvement varies depending on
the number of concurrent
connections used between the two guests, especially in the case of
a streaming workload.