If you need to test the InfiniBand (IB) network between two nodes, here are
my tests on an X4-2 Exadata.
I. Measuring Latency
On the 2nd node, start the listener program, then go to the 1st node:
[root@ed04dbadm02 ~]# ib_read_lat
On the 1st node, run the test and see the report:
[root@ed04dbadm01 ~]# ib_read_lat -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
TX depth : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
remote address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 1.77 14.12 1.82
------------------------------------------------------------------
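And another run, this time with -a to sweep the message sizes. Note that the sweep stops with an error after 4096 bytes: status 10 corresponds to a remote access error (IBV_WC_REM_ACCESS_ERR in libibverbs). A likely cause is that the listener on the 2nd node was started without -a, so the two sides did not run with the same options: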
[root@ed04dbadm01 ~]# ib_read_lat -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
TX depth : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x008a PSN 0x266b05 OUT 0x10 RKey 0x1f5f800 VAddr 0x007f7a60362000
remote address: LID 0x0d QPN 0x0071 PSN 0xc9915e OUT 0x10 RKey 0x268c100 VAddr 0x000000011ce000
------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 2.34 20.83 2.88
4 1000 2.52 37.00 2.87
8 1000 2.53 34.74 2.88
16 1000 2.33 20.61 2.78
32 1000 2.36 16.59 2.87
64 1000 2.54 17.71 2.89
128 1000 2.59 30.65 2.95
256 1000 2.56 20.28 3.16
512 1000 2.46 25.53 3.45
1024 1000 3.23 30.31 3.75
2048 1000 4.21 28.89 4.56
4096 1000 4.82 20.08 5.40
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=1, ccnt=1
[root@ed04dbadm01 ~]#
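If you want to compare several runs, the result rows are easy to pull out of the report. Below is a minimal sketch in Python, not part of the perftest tools; the function name parse_latency_rows and the sample REPORT (three rows copied from the run above) are my own illustration, and it assumes the five-column layout shown in the report.

```python
import re

# Sample rows copied from the ib_read_lat -a report above.
REPORT = """\
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 2.34 20.83 2.88
512 1000 2.46 25.53 3.45
4096 1000 4.82 20.08 5.40
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
"""

# A data row is: size, iterations, then three decimal latencies.
ROW = re.compile(r"^(\d+)\s+(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)$")


def parse_latency_rows(text):
    """Return one dict per result row; every other line is skipped."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line.strip())
        if match is None:
            continue  # header, separator, address or error line
        size, iters, t_min, t_max, t_typ = match.groups()
        rows.append({
            "bytes": int(size),
            "iterations": int(iters),
            "t_min": float(t_min),
            "t_max": float(t_max),
            "t_typical": float(t_typ),
        })
    return rows


rows = parse_latency_rows(REPORT)
# Row with the largest worst-case latency in this sample.
worst = max(rows, key=lambda r: r["t_max"])
print(worst["bytes"], worst["t_max"])
```

The error lines at the end of the report do not match the row pattern, so an aborted sweep still parses cleanly up to the last completed size.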
And the report from the 2nd node (the QPN and PSN values show it is the listener side of the first run):
[root@ed04dbadm02 ~]# ib_read_lat
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
remote address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
-----------------------------------------------------------------
II. Bandwidth
The next stage is the bandwidth test.
On the 2nd node we run the listener:
[root@ed04dbadm02 ~]# ib_read_bw
And on the 1st node we run the test itself, with -a to sweep the message sizes:
[root@ed04dbadm01 ~]# ib_read_bw -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read BW Test
Number of qps : 1
Connection type : RC
TX depth : 300
CQ Moderation : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
remote address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec]
2 1000 9.51 9.41
4 1000 20.29 20.21
8 1000 40.42 40.25
16 1000 80.85 78.35
32 1000 161.99 161.52
64 1000 309.76 308.93
128 1000 650.33 648.20
256 1000 1291.16 1286.98
512 1000 2610.95 2523.90
1024 1000 3074.05 3070.13
2048 1000 3192.15 3159.41
4096 1000 3220.35 3219.55
8192 1000 3230.01 3229.81
16384 1000 3241.36 3241.14
32768 1000 3245.32 3245.29
65536 1000 3248.53 3248.47
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
The listener on the 2nd node reports:
[root@ed04dbadm02 ~]# ib_read_bw
------------------------------------------------------------------
RDMA_Read BW Test
Number of qps : 1
Connection type : RC
CQ Moderation : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
remote address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
pp_read_keys: Success
Couldn't read remote address
Unable to write to socket/rdam_cm
Failed to close connection between server and client
[root@ed04dbadm02 ~]#
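To relate the MB/sec column to the link speed, it helps to convert it to Gbit/s. The sketch below is only arithmetic on the peak figure reported above. Two things in it are assumptions, not facts from the report: whether the tool counts a MB as 2^20 or 10^6 bytes (both are shown), and the 32 Gbit/s data rate of a 4x QDR link, which is used only as a reference point.

```python
def mb_per_sec_to_gbit(mb_per_sec, bytes_per_mb=2**20):
    """Convert a MB/sec figure to Gbit/s (10^9 bits per second)."""
    return mb_per_sec * bytes_per_mb * 8 / 1e9


peak = 3248.53  # BW peak[MB/sec] at 65536 bytes in the report above

gbit_binary = mb_per_sec_to_gbit(peak)          # assumes 1 MB = 2**20 bytes
gbit_decimal = mb_per_sec_to_gbit(peak, 10**6)  # assumes 1 MB = 10**6 bytes

# Assumed reference: data rate of a 4x QDR link (not stated in the report).
ASSUMED_LINK_GBIT = 32.0
utilisation = gbit_binary / ASSUMED_LINK_GBIT

print(f"{gbit_binary:.2f} Gbit/s (binary MB), {gbit_decimal:.2f} Gbit/s (decimal MB)")
print(f"{utilisation:.0%} of the assumed link data rate")
```

Whichever definition of MB applies, the curve above flattens out from about 4096 bytes, so the larger message sizes are the ones to quote as the link throughput.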
And the bidirectional test:
[root@ed04dbadm01 ~]# ib_read_bw -a -b -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read Bidirectional BW Test
Number of qps : 1
Connection type : RC
TX depth : 300
CQ Moderation : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0091 PSN 0x71081f OUT 0x10 RKey 0x1f94b00 VAddr 0x007f851c57d000
remote address: LID 0x0d QPN 0x0076 PSN 0xae8412 OUT 0x10 RKey 0x26bf600 VAddr 0x007fc7ed0af000
------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec]
2 1000 20.47 20.09
4 1000 40.95 40.82
8 1000 81.74 81.41
16 1000 164.40 163.81
32 1000 328.80 327.14
64 1000 658.83 656.19
128 1000 1320.12 1315.04
256 1000 2482.47 2459.42
512 1000 5260.80 5244.75
1024 1000 6250.11 6232.63
2048 1000 6395.13 6386.76
4096 1000 6466.50 6466.02
8192 1000 6459.10 6449.20
16384 1000 6490.17 6489.88
32768 1000 6503.01 6502.80
65536 1000 6499.86 6499.85
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
[root@ed04dbadm01 ~]#
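A quick way to read the bidirectional numbers is to divide them by the unidirectional ones for the same message size. The sketch below does that for three sizes copied from the two reports above; the helper name scaling is just an illustration.

```python
# BW average[MB/sec] by message size, copied from the two reports above.
unidirectional = {512: 2523.90, 4096: 3219.55, 65536: 3248.47}
bidirectional = {512: 5244.75, 4096: 6466.02, 65536: 6499.85}


def scaling(uni, bi):
    """Return {size: bidirectional / unidirectional} for sizes present in both."""
    return {size: bi[size] / uni[size] for size in sorted(uni) if size in bi}


ratios = scaling(unidirectional, bidirectional)
for size, ratio in ratios.items():
    print(f"{size:>6} bytes: x{ratio:.2f}")
```

A ratio close to 2 at the large sizes means both directions run at nearly the full unidirectional rate at the same time.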
III. Help
[root@ed04dbadm01 ~]# ib_read_lat -h
Usage:
ib_read_lat start a server and wait for connection
ib_read_lat <host> connect to server at <host>
Options:
-p, --port=<port> Listen on/connect to port <port> (default 18515)
-d, --ib-dev=<dev> Use IB device <dev> (default first device found)
-R, --rdma_cm Connect QPs with rdma_cm and run test on those QPs
-z, --com_rdma_cm Communicate with rdma_cm module to exchange data - use regular QPs
-i, --ib-port=<port> Use port <port> of IB device (default 1)
-c, --connection=<RC/UC/UD> Connection type RC/UC/UD (default RC)
-m, --mtu=<mtu> Mtu size : 256 - 4096 (default port mtu)
-s, --size=<size> Size of message to exchange (default 2)
-a, --all Run sizes from 2 till 2^23
-n, --iters=<iters> Number of exchanges (at least 2, default 1000)
-t, --tx-depth=<dep> Size of tx queue (default 50)
-u, --qp-timeout=<timeout> QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
-S, --sl=<sl> SL (default 0)
-x, --gid-index=<index> Test uses GID with GID index (Default : IB - no gid . ETH - 0)
-F, --CPU-freq Do not fail even if cpufreq_ondemand module is loaded
-V, --version Display version number
-I, --inline_size=<size> Max size of message to be sent in inline (default 400)
-e, --events Sleep on CQ events (default poll)
-o, --outs=<num> num of outstanding read/atom(default max of device)
-C, --report-cycles report times in cpu cycle units (default microseconds)
-H, --report-histogram Print out all results (default print summary only)
-U, --report-unsorted (implies -H) print out unsorted results (default sorted)
[root@ed04dbadm01 ~]# ib_read_bw -h
Usage:
ib_read_bw start a server and wait for connection
ib_read_bw <host> connect to server at <host>
Options:
-p, --port=<port> Listen on/connect to port <port> (default 18515)
-d, --ib-dev=<dev> Use IB device <dev> (default first device found)
-R, --rdma_cm Connect QPs with rdma_cm and run test on those QPs
-z, --com_rdma_cm Communicate with rdma_cm module to exchange data - use regular QPs
-i, --ib-port=<port> Use port <port> of IB device (default 1)
-c, --connection=<RC/UC/UD> Connection type RC/UC/UD (default RC)
-m, --mtu=<mtu> Mtu size : 256 - 4096 (default port mtu)
-s, --size=<size> Size of message to exchange (default 65536)
-a, --all Run sizes from 2 till 2^23
-n, --iters=<iters> Number of exchanges (at least 2, default 1000)
-t, --tx-depth=<dep> Size of tx queue (default 300)
-u, --qp-timeout=<timeout> QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
-S, --sl=<sl> SL (default 0)
-x, --gid-index=<index> Test uses GID with GID index (Default : IB - no gid . ETH - 0)
-F, --CPU-freq Do not fail even if cpufreq_ondemand module is loaded
-V, --version Display version number
-I, --inline_size=<size> Max size of message to be sent in inline (default 0)
-b, --bidirectional Measure bidirectional bandwidth (default unidirectional)
-Q, --cq-mod Generate Cqe only after <--cq-mod> completion
-e, --events Sleep on CQ events (default poll)
-N, --no peak-bw Cancel peak-bw calculation (default with peak)
-o, --outs=<num> num of outstanding read/atom(default max of device)
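Two of the options above are easier to read with the arithmetic written out: the sizes covered by -a and the default QP timeout of -u. The sketch below only evaluates the formulas as printed in the help text (sizes from 2 up to 2^23 in powers of two, as seen in the reports, and 4 usec * 2^timeout); the function names are mine.

```python
def sweep_sizes(max_exponent=23):
    """Message sizes for -a per the help text: 2, 4, ... up to 2^23."""
    return [2 ** k for k in range(1, max_exponent + 1)]


def qp_timeout_usec(timeout=14):
    """QP timeout per the help text: 4 usec * 2^timeout."""
    return 4 * 2 ** timeout


sizes = sweep_sizes()
print(len(sizes), "sizes, largest", sizes[-1], "bytes")

# How far the sweeps above got before the error was reported.
lat_done = sizes.index(4096) + 1    # ib_read_lat -a stopped after 4096 bytes
bw_done = sizes.index(65536) + 1    # ib_read_bw -a stopped after 65536 bytes
print(f"latency sweep: {lat_done} of {len(sizes)} sizes")
print(f"bandwidth sweep: {bw_done} of {len(sizes)} sizes")

print(f"default QP timeout: {qp_timeout_usec() / 1000} ms")
```

So a complete -a sweep would print 23 rows, and the runs in this post covered only the first 12 (latency) and 16 (bandwidth) of them.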