Guest Support for FICON CTCA
z/VM (at the appropriate service level) supports FICON Channel-to-Channel communications between an IBM zSeries 900 and another z900 or an S/390 Parallel Enterprise Server G5 or G6. This enables more reliable and higher-bandwidth host-to-host communication than is available with ESCON channels. Note that there are two types of FICON channels, referred to as FICON and FICON Express; the latter has higher throughput and maximum bandwidth capability. We did not have access to FICON Express, so all references to FICON in this section refer to the former.
Methodology: This section presents and discusses measurement results that assess the performance of the FICON adapter using the support included in z/VM 4.2.0 CP with APAR VM62906 applied, comparing it with existing ESCON support.
The workload driver is an internal tool that can simulate bulk data transfers such as FTP, as well as primitive benchmarks such as streaming and request-response. The data are driven from the application layer of the TCP/IP protocol stack, so the entire networking infrastructure, including the adapter and the TCP/IP protocol code, is measured. The driver moves data between client-side memory and server-side memory, eliminating all outside bottlenecks such as DASD or tape.
A client-server pair was used in which the client sent one byte and received 20MB of data (streaming workload), or in which the client sent 200 bytes and received 1000 bytes (request-response workload). Additional client-server pairs were added to determine whether throughput would vary with an increasing number of connections.
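The internal workload driver is not publicly available, but the request-response pattern it simulates can be sketched with ordinary sockets. The following Python sketch is illustrative only (the host, port, and iteration count are hypothetical); the real driver is a separate internal tool:

```python
import socket

def rr_client(host, port, iterations):
    """One request-response connection: send a 200-byte request,
    read the full 1000-byte reply, and repeat.  Returns the number
    of completed transactions."""
    request = b"R" * 200
    done = 0
    with socket.create_connection((host, port)) as conn:
        for _ in range(iterations):
            conn.sendall(request)
            reply = b""
            while len(reply) < 1000:      # read until the whole reply arrives
                chunk = conn.recv(1000 - len(reply))
                if not chunk:
                    raise ConnectionError("server closed the connection")
                reply += chunk
            done += 1
    return done
```

The streaming workload follows the same loop with a one-byte request and a 20MB reply; additional client-server pairs are simply more such connections running concurrently.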
While collecting the performance data, it was determined that optimum streaming workload results were achieved when TCP/IP was configured with DATABUFFERPOOLSIZE set to 32760 and DATABUFFERLIMITS set to 10 for both the outbound buffer limit and the inbound buffer limit. These parameters are used to determine the number and size of buffers that may be allocated for a TCP connection that is using window scaling.
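As a sketch, the corresponding PROFILE TCPIP statements would look like the following. The statement names and values come from the text above, but the operand order is an assumption and should be verified against the TCP/IP Planning and Customization documentation for the service level in use:

```
; TCP/IP configuration used for the streaming measurements (sketch;
; verify operand order against TCP/IP Planning and Customization)
DATABUFFERPOOLSIZE 32760       ; data buffer size
DATABUFFERLIMITS   10 10       ; outbound limit, inbound limit
```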
It should be noted that it is possible for monitor data to not reflect that a device is a FICON device. This can happen if the device goes offline (for example, if the adapter card is pulled) and comes back online without a VARY ONLINE command being issued. If this situation is encountered, issuing VARY OFFLINE followed by VARY ONLINE will correct it.
Each performance run, starting with 1 client-server pair and progressing to 10 client-server pairs, consisted of starting the server(s) on VM_s and then starting the client(s) on VM_c. The client(s) received data for 400 seconds. Monitor data were collected for 330 seconds of that time period. Data were collected only on the client machine.
At least 3 measurement trials were taken for each case, and a representative trial was chosen to show in the results. A complete set of runs was done with the maximum transmission unit (MTU) set to 32760 for streaming, and to 1500 for both streaming and request-response (RR). The CP monitor data for each measurement were reduced by VMPRF.
There are multiple devices associated with a FICON channel, and TCPIP can be configured to use one device or several. Measurements were done with just one device configured for the 1-, 3-, 5-, and 10-client-server-pair runs. For comparison, measurements were then repeated with one device per client-server pair, by specifying a unique device number on each of 10 device statements and associating each with a unique IP address. Note that ESCON does not have the same multiplexing capability that FICON does and therefore does not benefit from this technique.
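A sketch of the one-device-per-connection-pair configuration follows. All device numbers, link names, and IP addresses here are hypothetical, and the DEVICE/LINK/HOME statement syntax should be verified against the TCP/IP Planning and Customization documentation:

```
; One CTC device and IP address per client-server pair (sketch;
; device numbers, link names, and addresses are illustrative)
DEVICE CTC1  CTC 0600          ; first device number on the FICON channel
LINK   CTCL1 CTC 0 CTC1
DEVICE CTC2  CTC 0602          ; second device number
LINK   CTCL2 CTC 0 CTC2
; ... eight more DEVICE/LINK pairs ...
HOME
  10.1.1.1  CTCL1              ; unique IP address per link
  10.1.1.2  CTCL2
```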
The following charts show, for both ESCON and FICON, throughput and CPU time for the streaming and RR workloads. Each chart has a bar for each connectivity/MTU pair measured. Specific details are mentioned for each workload after the charts for that workload.
Figure 2. Throughput - Streaming
For the streaming workload, both ESCON and FICON with a single device maintained, as connections were added, the same throughput rate they achieved with one connection; FICON moved about twice as much data as ESCON. When one device was used per connection, however, FICON throughput was much better for both MTU sizes.
Figure 3. CPU Time - Streaming
The corresponding CPU time generally shows the same pattern, with CPU time increasing with each additional client-server pair. FICON and ESCON used approximately the same CPU msec/MB in the 32K MTU case. The 1500 MTU cases showed higher CPU msec/MB, with the FICON multiple-device case being the most efficient of them.
For the RR workload, throughput in all cases shows the same trend of increasing as additional connections are made. ESCON leads in throughput until 10 connections, where FICON with a single device does better. Note that using multiple devices (one per connection pair) yielded poorer results than either ESCON or FICON with a single device defined.
CPU time decreases slightly as the workload increases and the system becomes more efficient, for both ESCON and FICON with a single device. This was not true for FICON with multiple devices.
Results: The results are summarized in the following tables. MB/sec (megabytes per second) or trans/sec (transactions per second) was supplied by the workload driver and shows the throughput rate. All other values are from CP monitor data or derived from CP monitor data.
Total_cpu_util
This field was obtained from the SYSTEM_SUMMARY_BY_TIME VMPRF report that shows the average of both processors out of 100%.
tcpip_tot_cpu_util
This field is calculated from the USER_RESOURCE_UTILIZATION VMPRF report (CPU seconds, total) for the client stack (tcpip). 100% is the equivalent of one fully utilized processor.
cpu_msec/MB
This field was calculated from total_cpu_util (converted to total CPU milliseconds used per second across both processors) divided by the number of megabytes per second, giving milliseconds of CPU time per megabyte.
cpu_msec/trans
This field was calculated in the same way, divided by the number of transactions per second, giving milliseconds of CPU time per transaction.
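The derivation above can be checked against the tables. Assuming total_cpu_util is the two-processor average reported by VMPRF, the calculation is:

```python
def cpu_msec_per_unit(total_cpu_util, units_per_sec, n_processors=2):
    """CPU milliseconds consumed per unit of work (MB or transaction).

    total_cpu_util is the average utilization of all processors (out
    of 100%), so multiplying by the processor count gives total CPU
    msec used per elapsed second; dividing by the work rate gives
    msec per unit.
    """
    return total_cpu_util / 100.0 * n_processors * 1000.0 / units_per_sec

# Table 1, 1 client: 9.20% average utilization at 12.88 MB/sec
print(round(cpu_msec_per_unit(9.20, 12.88), 2))    # 14.29 cpu_msec/MB
# Table 7, 1 client: 12.20% at 476.89 trans/sec
print(round(cpu_msec_per_unit(12.20, 476.89), 2))  # 0.51 cpu_msec/trans
```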
Table 1. ESCON - Streaming 32K

Number of clients          01        03        05        10
runid                ensf0103  ensf0302  ensf0502  ensf1002
MB/sec                  12.88     12.90     12.90     12.90
elapsed time (sec)     330.00    330.00    330.00    330.00
total_cpu_util           9.20      9.30      9.40      9.80
tcpip_tot_cpu_util      10.60     10.60     10.90     11.20
tcpip_virt_cpu_util      5.20      5.20      5.20      5.50
cpu_msec/MB             14.29     14.42     14.57     15.19
emul_msec/MB             8.85      8.99      9.15      9.46
cp_msec/MB               5.43      5.43      5.43      5.74
tcpip_cpu_msec/MB        8.23      8.22      8.46      8.69
tcpip_vcpu_msec/MB       4.00      3.99      3.99      4.23
tcpip_ccpu_msec/MB       4.23      4.23      4.46      4.46
Table 2. FICON - Streaming 32K - Single Device

Number of clients          01        03        05        10
runid                fnsf0103  fnsf0301  fnsf0502  fnsf1003
MB/sec                  28.23     28.26     28.25     28.00
elapsed time (sec)     330.00    330.00    330.00    330.00
total_cpu_util          20.30     20.80     20.80     21.10
tcpip_tot_cpu_util      23.90     24.20     24.20     24.50
tcpip_virt_cpu_util     11.50     11.80     11.80     11.80
cpu_msec/MB             14.38     14.72     14.73     15.07
emul_msec/MB             9.07      9.20      9.27      9.50
cp_msec/MB               5.31      5.52      5.45      5.57
tcpip_cpu_msec/MB        8.48      8.58      8.58      8.77
tcpip_vcpu_msec/MB       4.08      4.18      4.18      4.22
tcpip_ccpu_msec/MB       4.40      4.40      4.40      4.55
Table 3. FICON - Streaming 32K - Multiple Devices

Number of clients          01        03        05        10
runid                fjsx0102  fjsx0301  fjsx0501  fjsx1003
MB/sec                  28.25     44.93     45.78     47.02
elapsed time (sec)     330.00    330.00    330.00    330.00
total_cpu_util          19.80     32.30     33.40     36.80
tcpip_tot_cpu_util      22.70     37.30     38.80     43.00
tcpip_virt_cpu_util     11.20     18.20     18.80     20.70
cpu_msec/MB             14.02     14.38     14.59     15.65
emul_msec/MB             8.78      9.04      9.17      9.83
cp_msec/MB               5.24      5.34      5.42      5.83
tcpip_cpu_msec/MB        8.05      8.30      8.47      9.15
tcpip_vcpu_msec/MB       3.97      4.05      4.10      4.40
tcpip_ccpu_msec/MB       4.08      4.25      4.37      4.75
FICON with a single device shows more than twice the throughput of ESCON. When multiple devices are defined, FICON shows more than three and a half times the ESCON throughput.
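The comparison can be computed directly from the 10-client-server-pair streaming 32K figures in Tables 1 through 3:

```python
# MB/sec at 10 client-server pairs (streaming, 32K MTU)
escon        = 12.90   # Table 1: ESCON
ficon_single = 28.00   # Table 2: FICON, single device
ficon_multi  = 47.02   # Table 3: FICON, one device per connection

print(round(ficon_single / escon, 2))   # 2.17x ESCON
print(round(ficon_multi / escon, 2))    # 3.64x ESCON
```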
Table 4. ESCON - Streaming 1500

Number of clients          01        03        05        10
runid                essf0103  essf0303  essf0503  essf1001
MB/sec                  11.78     11.78     11.77     11.87
elapsed time (sec)     330.00    330.00    330.00    330.00
total_cpu_util          12.80     14.60     16.30     20.30
tcpip_tot_cpu_util      17.30     18.50     20.00     22.70
tcpip_virt_cpu_util     11.50     12.40     13.00     14.50
cpu_msec/MB             21.73     24.79     27.70     34.20
emul_msec/MB            15.79     18.00     20.05     24.77
cp_msec/MB               5.94      6.79      7.65      9.44
tcpip_cpu_msec/MB       14.66     15.69     16.99     19.15
tcpip_vcpu_msec/MB       9.78     10.55     11.07     12.25
tcpip_ccpu_msec/MB       4.89      5.14      5.92      6.89
Table 5. FICON - Streaming 1500 - Single Device

Number of clients          01        03        05        10
runid                fssf0101  fssf0302  fssf0503  fssf1003
MB/sec                  25.35     25.69     26.07     26.25
elapsed time (sec)     300.00    300.00    330.00    330.00
total_cpu_util          25.00     31.50     35.10     42.10
tcpip_tot_cpu_util      34.30     39.70     42.40     47.90
tcpip_virt_cpu_util     23.70     26.70     28.20     30.90
cpu_msec/MB             19.72     24.52     26.93     32.08
emul_msec/MB            14.44     17.98     19.79     23.39
cp_msec/MB               5.29      6.54      7.13      8.69
tcpip_cpu_msec/MB       13.54     15.45     16.27     18.24
tcpip_vcpu_msec/MB       9.34     10.38     10.81     11.77
tcpip_ccpu_msec/MB       4.21      5.07      5.46      6.46
Table 6. FICON - Streaming 1500 - Multiple Devices

Number of clients          01        03        05        10
runid                fssx0102  fssx0301  fssx0502  fssx1002
MB/sec                  25.44     41.85     41.46     40.92
elapsed time (sec)     330.00    330.00    330.00    330.00
total_cpu_util          26.10     43.50     44.80     46.40
tcpip_tot_cpu_util      35.80     60.00     61.20     62.70
tcpip_virt_cpu_util     24.50     40.70     40.90     41.50
cpu_msec/MB             20.52     20.79     21.61     22.68
emul_msec/MB            14.86     15.01     15.53     16.18
cp_msec/MB               5.66      5.78      6.08      6.50
tcpip_cpu_msec/MB       14.17     14.34     14.76     15.33
tcpip_vcpu_msec/MB       9.65      9.72      9.87     10.15
tcpip_ccpu_msec/MB       4.53      4.62      4.90      5.18
Table 7. ESCON - RR 1500

Number of clients          01        03        05        10
runid                esrf0101  esrf0301  esrf0503  esrf1003
trans/sec              476.89   1096.80   1505.14   1824.48
elapsed time (sec)     330.00    330.00    330.00    330.00
total_cpu_util          12.20     27.20     35.60     42.80
tcpip_tot_cpu_util      12.40     25.80     32.40     37.30
tcpip_virt_cpu_util      6.40     13.00     16.70     19.40
cpu_msec/trans           0.51      0.50      0.47      0.47
emul_msec/trans          0.34      0.33      0.32      0.33
cp_msec/trans            0.18      0.16      0.15      0.14
tcpip_cpu_msec/trans     0.26      0.23      0.22      0.20
tcpip_vcpu_msec/trans    0.13      0.12      0.11      0.11
tcpip_ccpu_msec/trans    0.13      0.12      0.10      0.10
Table 8. FICON - RR 1500 - Single Device

Number of clients          01        03        05        10
runid                fsrf0102  fsrf0301  fsrf0502  fsrf1002
trans/sec              437.71   1042.70   1412.47   1881.99
elapsed time (sec)     330.00    330.00    300.00    330.00
total_cpu_util          11.40     25.30     33.20     43.20
tcpip_tot_cpu_util      11.50     24.20     30.30     37.90
tcpip_virt_cpu_util      5.80     12.40     15.70     19.70
cpu_msec/trans           0.52      0.49      0.47      0.46
emul_msec/trans          0.33      0.33      0.32      0.32
cp_msec/trans            0.19      0.16      0.15      0.14
tcpip_cpu_msec/trans     0.26      0.23      0.21      0.20
tcpip_vcpu_msec/trans    0.13      0.12      0.11      0.10
tcpip_ccpu_msec/trans    0.13      0.11      0.10      0.10
Table 9. FICON - RR 1500 - Multiple Devices

Number of clients          01        03        05        10
runid                fsrx0102  fsrx0301  fsrx0502  fsrx1001
trans/sec              437.07    841.23    956.42    978.30
elapsed time (sec)     300.00    330.00    330.00    330.00
total_cpu_util          11.90     24.30     26.30     27.80
tcpip_tot_cpu_util      11.70     23.60     26.30     27.30
tcpip_virt_cpu_util      6.00     12.80     13.30     13.90
cpu_msec/trans           0.54      0.58      0.55      0.57
emul_msec/trans          0.35      0.39      0.36      0.38
cp_msec/trans            0.20      0.19      0.19      0.19
tcpip_cpu_msec/trans     0.28      0.28      0.28      0.28
tcpip_vcpu_msec/trans    0.14      0.14      0.14      0.14
tcpip_ccpu_msec/trans    0.14      0.14      0.14      0.14
With the smaller amounts of data being transferred, the RR workload favors ESCON, with FICON single device performing similarly. TCPIP uses a 32K buffer when transferring data over CTC, and this is most likely the reason the RR workload did not benefit from using multiple devices.