Oracle, Exadata, Crossplatform migration, RAC, Performance, Troubleshooting. The views expressed on this blog are my own and do not necessarily reflect the views of Oracle.
Thursday, January 12, 2017
Wednesday, January 11, 2017
How to measure infiniband latency and bandwidth
If you need to test the InfiniBand (IB) network between two nodes, have a look at my
tests on an X4-2 Exadata.
I. Measuring Latency
On the 2nd node run the listener program, then go to the 1st node:
[root@ed04dbadm02 ~]# ib_read_lat
On the 1st node run the test and see the report:
[root@ed04dbadm01 ~]# ib_read_lat -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
TX depth : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
remote address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 1.77 14.12 1.82
------------------------------------------------------------------
And another run, this time with -a to go through all message sizes:
[root@ed04dbadm01 ~]# ib_read_lat -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
TX depth : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x008a PSN 0x266b05 OUT 0x10 RKey 0x1f5f800 VAddr 0x007f7a60362000
remote address: LID 0x0d QPN 0x0071 PSN 0xc9915e OUT 0x10 RKey 0x268c100 VAddr 0x000000011ce000
------------------------------------------------------------------
#bytes #iterations t_min[usec] t_max[usec] t_typical[usec]
2 1000 2.34 20.83 2.88
4 1000 2.52 37.00 2.87
8 1000 2.53 34.74 2.88
16 1000 2.33 20.61 2.78
32 1000 2.36 16.59 2.87
64 1000 2.54 17.71 2.89
128 1000 2.59 30.65 2.95
256 1000 2.56 20.28 3.16
512 1000 2.46 25.53 3.45
1024 1000 3.23 30.31 3.75
2048 1000 4.21 28.89 4.56
4096 1000 4.82 20.08 5.40
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=1, ccnt=1
[root@ed04dbadm01 ~]#
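To compare the rows of such a report, a small script can help. Below is a minimal Python sketch (not part of perftest) that parses the data rows of the ib_read_lat -a report, pasted as text, and shows how the typical latency grows with message size. The names REPORT and parse_rows are illustrative choices of mine.

```python
# Data rows pasted from the "ib_read_lat -a" report above.
# Columns: #bytes, #iterations, t_min, t_max, t_typical (all times in usec).
REPORT = """\
2 1000 2.34 20.83 2.88
4 1000 2.52 37.00 2.87
8 1000 2.53 34.74 2.88
16 1000 2.33 20.61 2.78
32 1000 2.36 16.59 2.87
64 1000 2.54 17.71 2.89
128 1000 2.59 30.65 2.95
256 1000 2.56 20.28 3.16
512 1000 2.46 25.53 3.45
1024 1000 3.23 30.31 3.75
2048 1000 4.21 28.89 4.56
4096 1000 4.82 20.08 5.40
Completion with error at client
"""


def parse_rows(text):
    """Return (size_in_bytes, t_typical_usec) for every numeric data row."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue
        try:
            rows.append((int(fields[0]), float(fields[4])))
        except ValueError:
            # Five words but not numbers, e.g. the error line at the end.
            continue
    return rows


if __name__ == "__main__":
    rows = parse_rows(REPORT)
    base_size, base_latency = rows[0]
    for size, typical in rows:
        extra = typical - base_latency
        print(f"{size:>5} B  {typical:5.2f} usec  ({extra:+.2f} vs {base_size} B)")
```

In this run the typical latency stays between 2.78 and 2.95 usec up to 128 bytes and reaches 5.40 usec at 4096 bytes.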
And here is the report from the 2nd node:
[root@ed04dbadm02 ~]# ib_read_lat
------------------------------------------------------------------
RDMA_Read Latency Test
Number of qps : 1
Connection type : RC
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0d QPN 0x006e PSN 0x3b5a1e OUT 0x10 RKey 0x2661800 VAddr 0x00000001431000
remote address: LID 0x0c QPN 0x0088 PSN 0x41d82e OUT 0x10 RKey 0x1f31e00 VAddr 0x000000021b5000
-----------------------------------------------------------------
II. Bandwidth
The next stage is the bandwidth test.
On the 2nd node we run the listener:
[root@ed04dbadm02 ~]# ib_read_bw
And on the 1st node we run the test itself:
[root@ed04dbadm01 ~]# ib_read_bw -a -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read BW Test
Number of qps : 1
Connection type : RC
TX depth : 300
CQ Moderation : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
remote address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec]
2 1000 9.51 9.41
4 1000 20.29 20.21
8 1000 40.42 40.25
16 1000 80.85 78.35
32 1000 161.99 161.52
64 1000 309.76 308.93
128 1000 650.33 648.20
256 1000 1291.16 1286.98
512 1000 2610.95 2523.90
1024 1000 3074.05 3070.13
2048 1000 3192.15 3159.41
4096 1000 3220.35 3219.55
8192 1000 3230.01 3229.81
16384 1000 3241.36 3241.14
32768 1000 3245.32 3245.29
65536 1000 3248.53 3248.47
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
And the report from the 2nd node:
[root@ed04dbadm02 ~]# ib_read_bw
------------------------------------------------------------------
RDMA_Read BW Test
Number of qps : 1
Connection type : RC
CQ Moderation : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0d QPN 0x0070 PSN 0x418495 OUT 0x10 RKey 0x267f600 VAddr 0x007fe31d2ab000
remote address: LID 0x0c QPN 0x0089 PSN 0x4f763c OUT 0x10 RKey 0x1f50900 VAddr 0x007fb36137f000
pp_read_keys: Success
Couldn't read remote address
Unable to write to socket/rdam_cm
Failed to close connection between server and client
[root@ed04dbadm02 ~]#
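The BW columns are in MB/sec, while InfiniBand link speeds are usually quoted in Gbit/s. A minimal Python sketch of the conversion follows. Whether this perftest version counts 1 MB as 10^6 or 2^20 bytes is something I have not verified, so the helper (mb_per_sec_to_gbit, a name of my own) takes the factor as a parameter.

```python
def mb_per_sec_to_gbit(mb_per_sec, bytes_per_mb=1_000_000):
    """Convert a bandwidth in MB/sec to Gbit/s.

    bytes_per_mb is an assumption: pass 2**20 if the tool reports MiB/sec.
    """
    return mb_per_sec * bytes_per_mb * 8 / 1e9


if __name__ == "__main__":
    average_bw = 3248.47  # BW average at 65536 bytes, unidirectional run above
    print(f"1 MB = 10^6 bytes: {mb_per_sec_to_gbit(average_bw):.2f} Gbit/s")
    print(f"1 MB = 2^20 bytes: {mb_per_sec_to_gbit(average_bw, 2**20):.2f} Gbit/s")
```

For the 3248.47 MB/sec average of the unidirectional run this gives roughly 26 to 27 Gbit/s, depending on which definition of MB applies.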
And the bidirectional test:
[root@ed04dbadm01 ~]# ib_read_bw -a -b -d mlx4_0 ed04dbadm02
------------------------------------------------------------------
RDMA_Read Bidirectional BW Test
Number of qps : 1
Connection type : RC
TX depth : 300
CQ Moderation : 50
Mtu : 2048B
Link type : IB
Outstand reads : 16
rdma_cm QPs : OFF
Data ex. method : Ethernet
------------------------------------------------------------------
local address: LID 0x0c QPN 0x0091 PSN 0x71081f OUT 0x10 RKey 0x1f94b00 VAddr 0x007f851c57d000
remote address: LID 0x0d QPN 0x0076 PSN 0xae8412 OUT 0x10 RKey 0x26bf600 VAddr 0x007fc7ed0af000
------------------------------------------------------------------
#bytes #iterations BW peak[MB/sec] BW average[MB/sec]
2 1000 20.47 20.09
4 1000 40.95 40.82
8 1000 81.74 81.41
16 1000 164.40 163.81
32 1000 328.80 327.14
64 1000 658.83 656.19
128 1000 1320.12 1315.04
256 1000 2482.47 2459.42
512 1000 5260.80 5244.75
1024 1000 6250.11 6232.63
2048 1000 6395.13 6386.76
4096 1000 6466.50 6466.02
8192 1000 6459.10 6449.20
16384 1000 6490.17 6489.88
32768 1000 6503.01 6502.80
65536 1000 6499.86 6499.85
Completion with error at client
Failed status 10: wr_id 1 syndrom 0x88
scnt=300, ccnt=0
[root@ed04dbadm01 ~]#
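A quick way to read the bidirectional result is to divide it by the unidirectional one at the same message size. Here is a minimal Python sketch with a few BW average values copied from the two client reports above; the dictionary and function names are illustrative only.

```python
# BW average [MB/sec] by message size, copied from the two reports above.
UNIDIRECTIONAL = {2: 9.41, 512: 2523.90, 4096: 3219.55, 65536: 3248.47}
BIDIRECTIONAL = {2: 20.09, 512: 5244.75, 4096: 6466.02, 65536: 6499.85}


def bidirectional_ratio(uni, bidi):
    """Return {size: bidi/uni} for the sizes present in both reports."""
    return {size: bidi[size] / uni[size] for size in sorted(uni) if size in bidi}


if __name__ == "__main__":
    for size, ratio in bidirectional_ratio(UNIDIRECTIONAL, BIDIRECTIONAL).items():
        print(f"{size:>6} B  x{ratio:.2f}")
```

At 65536 bytes the ratio is very close to 2, so in this test the bidirectional run moved about twice as much data per second as the unidirectional one.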
III. Help
[root@ed04dbadm01 ~]# ib_read_lat -h
Usage:
ib_read_lat start a server and wait for connection
ib_read_lat <host> connect to server at <host>
Options:
-p, --port=<port> Listen on/connect to port <port> (default 18515)
-d, --ib-dev=<dev> Use IB device <dev> (default first device found)
-R, --rdma_cm Connect QPs with rdma_cm and run test on those QPs
-z, --com_rdma_cm Communicate with rdma_cm module to exchange data - use regular QPs
-i, --ib-port=<port> Use port <port> of IB device (default 1)
-c, --connection=<RC/UC/UD> Connection type RC/UC/UD (default RC)
-m, --mtu=<mtu> Mtu size : 256 - 4096 (default port mtu)
-s, --size=<size> Size of message to exchange (default 2)
-a, --all Run sizes from 2 till 2^23
-n, --iters=<iters> Number of exchanges (at least 2, default 1000)
-t, --tx-depth=<dep> Size of tx queue (default 50)
-u, --qp-timeout=<timeout> QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
-S, --sl=<sl> SL (default 0)
-x, --gid-index=<index> Test uses GID with GID index (Default : IB - no gid . ETH - 0)
-F, --CPU-freq Do not fail even if cpufreq_ondemand module is loaded
-V, --version Display version number
-I, --inline_size=<size> Max size of message to be sent in inline (default 400)
-e, --events Sleep on CQ events (default poll)
-o, --outs=<num> num of outstanding read/atom(default max of device)
-C, --report-cycles report times in cpu cycle units (default microseconds)
-H, --report-histogram Print out all results (default print summary only)
-U, --report-unsorted (implies -H) print out unsorted results (default sorted)
[root@ed04dbadm01 ~]# ib_read_bw -h
Usage:
ib_read_bw start a server and wait for connection
ib_read_bw <host> connect to server at <host>
Options:
-p, --port=<port> Listen on/connect to port <port> (default 18515)
-d, --ib-dev=<dev> Use IB device <dev> (default first device found)
-R, --rdma_cm Connect QPs with rdma_cm and run test on those QPs
-z, --com_rdma_cm Communicate with rdma_cm module to exchange data - use regular QPs
-i, --ib-port=<port> Use port <port> of IB device (default 1)
-c, --connection=<RC/UC/UD> Connection type RC/UC/UD (default RC)
-m, --mtu=<mtu> Mtu size : 256 - 4096 (default port mtu)
-s, --size=<size> Size of message to exchange (default 65536)
-a, --all Run sizes from 2 till 2^23
-n, --iters=<iters> Number of exchanges (at least 2, default 1000)
-t, --tx-depth=<dep> Size of tx queue (default 300)
-u, --qp-timeout=<timeout> QP timeout, timeout value is 4 usec * 2 ^(timeout), default 14
-S, --sl=<sl> SL (default 0)
-x, --gid-index=<index> Test uses GID with GID index (Default : IB - no gid . ETH - 0)
-F, --CPU-freq Do not fail even if cpufreq_ondemand module is loaded
-V, --version Display version number
-I, --inline_size=<size> Max size of message to be sent in inline (default 0)
-b, --bidirectional Measure bidirectional bandwidth (default unidirectional)
-Q, --cq-mod Generate Cqe only after <--cq-mod> completion
-e, --events Sleep on CQ events (default poll)
-N, --no peak-bw Cancel peak-bw calculation (default with peak)
-o, --outs=<num> num of outstanding read/atom(default max of device)
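Two options in the help text involve a little arithmetic: -u gives the QP timeout as 4 usec * 2^(timeout), and -a sweeps message sizes from 2 to 2^23 bytes. Below is a minimal Python sketch of both, following the formulas exactly as the help prints them. The function names are my own, and the doubling of sizes is taken from the reports above.

```python
def qp_timeout_usec(timeout=14):
    """QP timeout in usec as the help text states it: 4 usec * 2^timeout."""
    return 4 * 2 ** timeout


def all_sizes(max_exponent=23):
    """Message sizes for -a: powers of two from 2 up to 2^max_exponent bytes."""
    return [2 ** k for k in range(1, max_exponent + 1)]


if __name__ == "__main__":
    print(f"default QP timeout: {qp_timeout_usec()} usec")
    sizes = all_sizes()
    print(f"-a covers {len(sizes)} sizes, from {sizes[0]} to {sizes[-1]} bytes")
```

With the default of 14 the timeout is 65536 usec, about 65.5 ms. A full -a sweep would end at 8388608 bytes; the runs above stopped earlier, at 4096 bytes (latency) and 65536 bytes (bandwidth), with a completion error.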
Tuesday, January 10, 2017
ORA-01555 after an unclean shutdown, and how to repair Confluence
After the New Year 2017 holidays, Confluence stopped working.
The application logs mentioned ORA-01555, and the alert log showed the unclean shutdown.
However, there were no other error messages in the database alert log and no corrupted blocks.
To check data integrity I ran a full export of the database, and the export revealed the bad table:
ORA-31693: Table data object "CONFLU"."BANDANA" failed to load/unload and is being skipped due to error:
ORA-02354: error in exporting/importing data
ORA-01555: snapshot too old: rollback segment number with name "" too small
ORA-22924: snapshot too old
The command
set long 9999999
select * from conflu.bandana ;
finished successfully and returned all 297 rows, as if there were no error!
But the export still raised ORA-01555.
The solution is:
-- Collects the rowid and the Oracle error number for each unreadable LOB
create table corrupted_lob_data (corrupted_rowid rowid, err_num number);
declare
  error_1578 exception;
  error_1555 exception;
  error_22922 exception;
  pragma exception_init(error_1578,-1578);
  pragma exception_init(error_1555,-1555);
  pragma exception_init(error_22922,-22922);
  n number;
begin
  for cursor_lob in (select rowid r, bandanavalue from conflu.bandana) loop
    begin
      -- Searching the LOB forces Oracle to read it; a bad LOB raises an error here
      n := dbms_lob.instr (cursor_lob.bandanavalue, hextoraw ('889911')) ;
    exception
      when error_1578 then
        insert into corrupted_lob_data values (cursor_lob.r, 1578);
        commit;
      when error_1555 then
        insert into corrupted_lob_data values (cursor_lob.r, 1555);
        commit;
      when error_22922 then
        insert into corrupted_lob_data values (cursor_lob.r, 22922);
        commit;
    end;
  end loop;
end;
/
This script reads the LOB in every row of the suspect table and records the rowid of each row whose LOB raises an error.
Then I found the specific bad row: select * from conflu.bandana where rowid='AAAUnkAAFAAABiEAAG';
The bad field in this row is "bandanavalue".
Then I updated this field:
update conflu.bandana set bandanavalue = empty_clob() where rowid='AAAUnkAAFAAABiEAAG';
commit;
Now Confluence works!