Exadata: January 2017

If you need to test the InfiniBand (IB) network between two nodes, take a look at my tests on an X4-2 Exadata.

I. Measuring Latency

On the 2nd node, start the listener, then switch to the 1st node:
[root@ed04dbadm02 ~]# ib_read_lat

On the 1st node run the test and see the report:

[root@ed04dbadm01 ~]# ib_read_lat -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read Latency Test
Number of qps   : 1
Connection type : RC
TX depth        : 50
Mtu             : 2048B
Link type       : IB
Outstand reads : 16
rdma_cm QPs    : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
remote address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
------------------------------------------------------------------
#bytes #iterations    t_min[usec]    t_max[usec] t_typical[usec]
2       1000          1.77           14.12        1.82
------------------------------------------------------------------

And another run, this time with -a to sweep the message sizes:
[root@ed04dbadm01 ~]# ib_read_lat -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read Latency Test
Number of qps   : 1
Connection type : RC
TX depth        : 50
Mtu             : 2048B
Link type       : IB
Outstand reads : 16
rdma_cm QPs    : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x008a PSN 0x266b05 OUT 0x10 RKey 0x1f5f800 VAddr 0x007f7a60362000
remote address: LID 0x0d QPN 0x0071 PSN 0xc9915e OUT 0x10 RKey 0x268c100 VAddr 0x000000011ce000
------------------------------------------------------------------
#bytes #iterations    t_min[usec]    t_max[usec] t_typical[usec]
2       1000          2.34           20.83        2.88
4       1000          2.52           37.00        2.87
8       1000          2.53           34.74        2.88
16      1000          2.33           20.61        2.78
32      1000          2.36           16.59        2.87
64      1000          2.54           17.71        2.89
128     1000          2.59           30.65        2.95
256     1000          2.56           20.28        3.16
512     1000          2.46           25.53        3.45
1024    1000          3.23           30.31        3.75
2048    1000          4.21           28.89        4.56
4096    1000          4.82           20.08        5.40
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=1, ccnt=1
[root@ed04dbadm01 ~]#
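
Every -a run in this post stops with "Failed status 10" before reaching the largest message size. The sketch below decodes that line. It assumes the number printed after "Failed status" is the libibverbs ibv_wc_status value, in which case 10 is IBV_WC_REM_ACCESS_ERR (remote access error). One plausible reading, not confirmed in this test, is that the listener was started without -a and therefore exposed a buffer too small for the larger sizes. The helper name decode_failure is made up for this example.

```python
import re

# First entries of enum ibv_wc_status from libibverbs (verbs.h), in order.
IBV_WC_STATUS = [
    "IBV_WC_SUCCESS",
    "IBV_WC_LOC_LEN_ERR",
    "IBV_WC_LOC_QP_OP_ERR",
    "IBV_WC_LOC_EEC_OP_ERR",
    "IBV_WC_LOC_PROT_ERR",
    "IBV_WC_WR_FLUSH_ERR",
    "IBV_WC_MW_BIND_ERR",
    "IBV_WC_BAD_RESP_ERR",
    "IBV_WC_LOC_ACCESS_ERR",
    "IBV_WC_REM_INV_REQ_ERR",
    "IBV_WC_REM_ACCESS_ERR",
    "IBV_WC_REM_OP_ERR",
    "IBV_WC_RETRY_EXC_ERR",
    "IBV_WC_RNR_RETRY_EXC_ERR",
]

# Matches the failure line exactly as the tool prints it (including "syndrom").
FAILURE_RE = re.compile(
    r"Failed status (\d+): wr_id (\d+) syndrom (0x[0-9a-fA-F]+)"
)


def decode_failure(line):
    """Return (status_name, wr_id, vendor_syndrome), or None if no match."""
    m = FAILURE_RE.search(line)
    if m is None:
        return None
    status = int(m.group(1))
    if status < len(IBV_WC_STATUS):
        name = IBV_WC_STATUS[status]
    else:
        name = "UNKNOWN(%d)" % status
    return name, int(m.group(2)), int(m.group(3), 16)


print(decode_failure("Failed status 10: wr_id 1 syndrom 0x88"))
```

If that reading is right, starting the listener on the 2nd node with -a as well should let the sweep run to the end; I did not verify this here.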

And here is the report from the 2nd node for the first run:

[root@ed04dbadm02 ~]# ib_read_lat
------------------------------------------------------------------
                    RDMA_Read Latency Test
Number of qps   : 1
Connection type : RC
Mtu             : 2048B
Link type       : IB
Outstand reads : 16
rdma_cm QPs    : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
remote address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
-----------------------------------------------------------------

II. Bandwidth

The next stage is the bandwidth test.

On the 2nd node we run the listener:

[root@ed04dbadm02 ~]# ib_read_bw

And on the 1st node we run the test itself:

[root@ed04dbadm01 ~]# ib_read_bw -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read BW Test
Number of qps   : 1
Connection type : RC
TX depth        : 300
CQ Moderation   : 50
Mtu             : 2048B
Link type       : IB
Outstand reads : 16
rdma_cm QPs    : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
remote address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
------------------------------------------------------------------
#bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
2          1000           9.51               9.41
4          1000           20.29              20.21
8          1000           40.42              40.25
16         1000           80.85              78.35
32         1000           161.99             161.52
64         1000           309.76             308.93
128        1000           650.33             648.20
256        1000           1291.16            1286.98
512        1000           2610.95            2523.90
1024       1000           3074.05            3070.13
2048       1000           3192.15            3159.41
4096       1000           3220.35            3219.55
8192       1000           3230.01            3229.81
16384      1000           3241.36            3241.14
32768      1000           3245.32            3245.29
65536      1000           3248.53            3248.47
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
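
To put the peak figure into perspective, the MB/sec column can be converted to Gbit/s. The sketch below assumes that this perftest version counts a MB as 2^20 bytes and that the link is 4x QDR (40 Gbit/s signalling, 32 Gbit/s of data after 8b/10b encoding). Neither is stated in the output above, so treat the result as an estimate.

```python
MIB = 2 ** 20  # assumption: the tool's "MB" means 2^20 bytes


def mb_per_sec_to_gbit_per_sec(mb_per_sec, bytes_per_mb=MIB):
    """Convert a MB/sec reading to Gbit/s (10^9 bits per second)."""
    return mb_per_sec * bytes_per_mb * 8 / 1e9


peak_mb = 3248.53                 # BW peak at 65536 bytes, from the table above
peak_gbit = mb_per_sec_to_gbit_per_sec(peak_mb)

qdr_data_gbit = 40 * 8 / 10       # assumption: 4x QDR with 8b/10b encoding
share = peak_gbit / qdr_data_gbit

print(f"peak: {peak_gbit:.2f} Gbit/s, {share:.0%} of the assumed QDR data rate")
```

Under these assumptions the one-way peak is roughly 27 Gbit/s, or about 85% of the QDR data rate.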

[root@ed04dbadm02 ~]# ib_read_bw
------------------------------------------------------------------
                    RDMA_Read BW Test
Number of qps   : 1
Connection type : RC
CQ Moderation   : 50
Mtu             : 2048B
Link type       : IB
Outstand reads : 16
rdma_cm QPs    : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
remote address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
pp_read_keys: Success
Couldn't read remote address
Unable to write to socket/rdam_cm
Failed to close connection between server and client
[root@ed04dbadm02 ~]#

And the bidirectional test:
[root@ed04dbadm01 ~]# ib_read_bw -a -b -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
                    RDMA_Read Bidirectional BW Test
Number of qps   : 1
Connection type : RC
TX depth        : 300
CQ Moderation   : 50
Mtu             : 2048B
Link type       : IB
Outstand reads : 16
rdma_cm QPs    : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0091 PSN 0x71081f OUT 0x10 RKey 0x1f94b00 VAddr 0x007f851c57d000
remote address: LID 0x0d QPN 0x0076 PSN 0xae8412 OUT 0x10 RKey 0x26bf600 VAddr 0x007fc7ed0af000
------------------------------------------------------------------
#bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]
2          1000           20.47              20.09
4          1000           40.95              40.82
8          1000           81.74              81.41
16         1000           164.40             163.81
32         1000           328.80             327.14
64         1000           658.83             656.19
128        1000           1320.12            1315.04
256        1000           2482.47            2459.42
512        1000           5260.80            5244.75
1024       1000           6250.11            6232.63
2048       1000           6395.13            6386.76
4096       1000           6466.50            6466.02
8192       1000           6459.10            6449.20
16384      1000           6490.17            6489.88
32768      1000           6503.01            6502.80
65536      1000           6499.86            6499.85
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
[root@ed04dbadm01 ~]#
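
Comparing the two bandwidth tables shows how close the bidirectional run comes to twice the one-way rate. A minimal sketch: the rows are copied from the unidirectional and bidirectional reports in this section, and parse_rows is a helper made up for this example.

```python
def parse_rows(text):
    """Map message size in bytes to average bandwidth (the last column)."""
    rows = {}
    for line in text.strip().splitlines():
        fields = line.split()
        rows[int(fields[0])] = float(fields[-1])
    return rows


# Columns: #bytes, #iterations, BW peak[MB/sec], BW average[MB/sec]
UNIDIRECTIONAL = """
512        1000           2610.95            2523.90
4096       1000           3220.35            3219.55
65536      1000           3248.53            3248.47
"""

BIDIRECTIONAL = """
512        1000           5260.80            5244.75
4096       1000           6466.50            6466.02
65536      1000           6499.86            6499.85
"""

uni = parse_rows(UNIDIRECTIONAL)
bi = parse_rows(BIDIRECTIONAL)

# Ratio of bidirectional to unidirectional average bandwidth per size.
ratios = {size: bi[size] / uni[size] for size in uni}

for size, ratio in sorted(ratios.items()):
    print(f"{size:>6} bytes: {ratio:.2f}x")
```

For the large messages the ratio is almost exactly 2, so both directions reach the same rate at once.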

III. HELP

[root@ed04dbadm01 ~]# ib_read_lat -h
Usage:
ib_read_lat            start a server and wait for connection
ib_read_lat <host>     connect to server at <host>

Options:
-p, --port=<port> Listen on/connect to port <port> (default 18515)
-d, --ib-dev=<dev> Use IB device <dev> (default first device found)
-R, --rdma_cm Connect QPs with rdma_cm and run test on those QPs
-z, --com_rdma_cm Communicate with rdma_cm module to exchange data - use regular QPs
-i, --ib-port=<port> Use port <port> of IB device (default 1)
-c, --connection=<RC/UC/UD> Connection type RC/UC/UD (default RC)
-m, --mtu=<mtu> Mtu size : 256 - 4096 (default port mtu)
-s, --size=<size> Size of message to exchange (default 2)
-a, --all Run sizes from 2 till 2^23
-n, --iters=<iters> Number of exchanges (at least 2, default 1000)
-t, --tx-depth=<dep> Size of tx queue (default 50)
-u, --qp-timeout=<timeout> QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
-S, --sl=<sl> SL (default 0)
-x, --gid-index=<index> Test uses GID with GID index (Default : IB - no gid . ETH - 0)
-F, --CPU-freq Do not fail even if cpufreq_ondemand module is loaded
-V, --version Display version number
-I, --inline_size=<size> Max size of message to be sent in inline (default 400)
-e, --events Sleep on CQ events (default poll)
-o, --outs=<num> num of outstanding read/atom(default max of device)
-C, --report-cycles report times in cpu cycle units (default microseconds)
-H, --report-histogram Print out all results (default print summary only)
-U, --report-unsorted (implies -H) print out unsorted results (default sorted)

[root@ed04dbadm01 ~]# ib_read_bw -h
Usage:
ib_read_bw            start a server and wait for connection
ib_read_bw <host>     connect to server at <host>

Options:
-p, --port=<port> Listen on/connect to port <port> (default 18515)
-d, --ib-dev=<dev> Use IB device <dev> (default first device found)
-R, --rdma_cm Connect QPs with rdma_cm and run test on those QPs
-z, --com_rdma_cm Communicate with rdma_cm module to exchange data - use regular QPs
-i, --ib-port=<port> Use port <port> of IB device (default 1)
-c, --connection=<RC/UC/UD> Connection type RC/UC/UD (default RC)
-m, --mtu=<mtu> Mtu size : 256 - 4096 (default port mtu)
-s, --size=<size> Size of message to exchange (default 65536)
-a, --all Run sizes from 2 till 2^23
-n, --iters=<iters> Number of exchanges (at least 2, default 1000)
-t, --tx-depth=<dep> Size of tx queue (default 300)
-u, --qp-timeout=<timeout> QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
-S, --sl=<sl> SL (default 0)
-x, --gid-index=<index> Test uses GID with GID index (Default : IB - no gid . ETH - 0)
-F, --CPU-freq Do not fail even if cpufreq_ondemand module is loaded
-V, --version Display version number
-I, --inline_size=<size> Max size of message to be sent in inline (default 0)
-b, --bidirectional Measure bidirectional bandwidth (default unidirectional)
-Q, --cq-mod Generate Cqe only after <--cq-mod> completion
-e, --events Sleep on CQ events (default poll)
-N, --no peak-bw Cancel peak-bw calculation (default with peak)
-o, --outs=<num> num of outstanding read/atom(default max of device)
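
Two of the options above involve a little arithmetic. Taking the help text at face value, the default -u 14 gives a QP timeout of 4 usec * 2^14, and -a sweeps the powers of two from 2 to 2^23. A quick sketch (the function names are made up for this example):

```python
def qp_timeout_usec(timeout, base_usec=4):
    """QP timeout as described by the -u help line: 4 usec * 2^timeout."""
    return base_usec * 2 ** timeout


def all_sizes(max_exp=23):
    """Message sizes covered by -a according to the help text: 2 .. 2^23."""
    return [2 ** k for k in range(1, max_exp + 1)]


default_timeout = qp_timeout_usec(14)
sizes = all_sizes()

print(f"default QP timeout: {default_timeout} usec")
print(f"-a covers {len(sizes)} sizes, from {sizes[0]} to {sizes[-1]} bytes")
```

In the runs above the sweep ended at 4096 bytes for latency and at 65536 bytes for bandwidth, well short of 2^23.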

After the 2017 New Year holidays, Confluence stopped working.
Its application logs mentioned ORA-01555, and the database alert log showed an unclean shutdown.

However, the alert log contained no other error messages and reported no corrupted blocks.

To check data integrity I ran a full export of the database, and the export reported a bad table:

ORA-31693: Table data object "CONFLU"."BANDANA" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old

The command

set long 9999999
select * from conflu.bandana ;

finished successfully and showed all 297 rows, as if there were no error!

But the export still raised ORA-01555.

The solution is to locate the rows with unreadable LOB data:

-- Collects the rowid of every row whose LOB cannot be read,
-- together with the Oracle error number that was raised.
create table corrupted_lob_data (corrupted_rowid rowid, err_num number);

declare
  error_1578  exception;
  error_1555  exception;
  error_22922 exception;
  pragma exception_init(error_1578, -1578);
  pragma exception_init(error_1555, -1555);
  pragma exception_init(error_22922, -22922);
  n number;
begin
  for cursor_lob in (select rowid r, bandanavalue from conflu.bandana) loop
    begin
      -- Searching for a pattern makes Oracle read through the LOB value.
      n := dbms_lob.instr(cursor_lob.bandanavalue, hextoraw('889911'));
    exception
      when error_1578 then
        insert into corrupted_lob_data values (cursor_lob.r, 1578);
        commit;
      when error_1555 then
        insert into corrupted_lob_data values (cursor_lob.r, 1555);
        commit;
      when error_22922 then
        insert into corrupted_lob_data values (cursor_lob.r, 22922);
        commit;
    end;
  end loop;
end;
/

This script reads the LOB value of every row in the affected table and records the rows that raise an error.

It pointed to one specific bad row:   select * from conflu.bandana where rowid='AAAUnkAAFAAABiEAAG';

The damaged field in this row is "bandanavalue".
I reset this field to an empty CLOB:

update conflu.bandana set   bandanavalue = empty_clob() where rowid='AAAUnkAAFAAABiEAAG';
commit;

Now Confluence works!

Exadata

Thursday, January 12, 2017

End of Hardware Support for Exadata

Wednesday, January 11, 2017

How to measure infiniband latency and bandwidth

Tuesday, January 10, 2017

ORA-01555 after unclear shutdown and How to repair a CONFLUENCE ?

How to disable/setup autostart parameters for specified instance ?