最近遇到一些慢SQL,其中主要问题是标量子查询的写法,并且主查询和子查询的数据量都很大,处理这种情况的最优解便是改写left join,案例中对SQL进行了简化。
简化后的SQL如下:
--原SQL
select t1.col_1,
nvl((select t3.name
from test3 t3,test4 t4
where t3.col_1 = t4.col_1
and t3.col_2 = t1.col_2
and t3.col_3 = t2.col_3
), t2.col_3 ) name,
(select nvl(sum(col_4),0) from test2 t where t.atype='AA' and t.col_1=t2.col_1) col4,
(select nvl(sum(col_5),0) from test2 t where t.atype='AA' and t.col_1=t2.col_1) col5,
sum(t2.col_4) allsum
from test1 t1,
test2 t2
where t1.id=t2.id
group by t1.col_1,t2.col_1,t1.col_2 ,t2.col_3;
1、t3和t4部分的查询以t3的两个字段分别与t1和t2两个主表同时关联,如果改写左连接此处与主表的相关性不太好处理;
2、test2需要分别按条件对两个字段进行汇总,sum(col_4)和sum(col_5),那么test2与主查询也就需要关联两次;
3、主查询部分t1与t2关联后还需要对整体做一次聚合分组,如果直接改写为左关联test2,如何保证分组字段完整也是个问题;
1、首先针对问题1和3,可以先将t1和t2关联并聚合分组后的结果集作为主查询,并提取出外关联需要的字段,保证分组结果的准确性,再与其他子查询部分进行关联;
2、由于test2的两次sum查询条件一样这里也就可以合并处理,减少表关联次数并降低逻辑读;
--改写后
select a.col_1,
nvl(b.name,a.col_3) name,
nvl(sum4,0),
nvl(sum5,0),
sum_col_4
from (
(select t1.col_1,t2.col_1 t2col1,t1.col_2 ,t2.col_3,sum(t2.col_4) sum_col_4
from test1 t1,
test2 t2
where t1.id=t2.id
group by t1.col_1,t2.col_1,t1.col_2 ,t2.col_3
) a
left join (select col_1,sum(col_4) sum4,sum(col_5) sum5 from test2 where atype='AA' group by col_1) t
on t.col_1=a.t2col1
left join (select t3.name ,t3.col_2,t3.col_3 from test3 t3,test4 t4 where t3.col_1 = t4.col_1) b
on a.col_2 = b.col_2 and a.col_3 = b.col_3
);
再看一下改写前后的执行计划对比,可以看到改写前test2需要按T.ATYPE = 'AA’扫描并关联两次,改写后只需关联一次,逻辑读和执行时间也都有提升。
--改写前
1 #NSET2: [155, 50347->50231, 282]
2 #PIPE2: [155, 50347->50231, 282]
3 #PIPE2: [135, 50347->50231, 282]
4 #PIPE2: [102, 50347->50231, 282]
5 #PIPE2: [61, 50347->50231, 282]
6 #PRJT2: [19, 50347->50231, 282]; exp_num(5), is_atom(FALSE)
7 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0),
8 #SPL2: [41, 50347, 420]; key_num(2), spool_num(2), has_variable(0), sites(-)
9 #PRJT2: [41, 50347->50231, 420]; exp_num(3), is_atom(FALSE)
10 #HAGR2: [41, 50347->50231, 420]; grp_num(1), sfun_num(3), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0,0,0]; slave_empty(0) keys(DMTEMPVIEW_889197127.AUTOID)
11 #HASH RIGHT JOIN2: [34, 50347->50495, 420]; key_num(1); col_num(3); MEM_USED(11950KB), DISK_USED(0KB) KEY(T.COL_1=DMTEMPVIEW_889197127.TMPCOL1)
12 #SLCT2: [7, 9926->9926, 138]; T.ATYPE = 'AA'
13 #CSCN2: [7, 50000->50000, 138]; INDEX33633489(TEST2)
14 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0),
15 #SPL2: [41, 50347, 420]; key_num(2), spool_num(1), has_variable(0), sites(-)
16 #PRJT2: [41, 50347->50231, 420]; exp_num(3), is_atom(FALSE)
17 #HAGR2: [41, 50347->50231, 420]; grp_num(1), sfun_num(3), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0,0,0]; slave_empty(0) keys(DMTEMPVIEW_889197127.AUTOID)
18 #HASH RIGHT JOIN2: [34, 50347->50495, 420]; key_num(1); col_num(3); MEM_USED(11950KB), DISK_USED(0KB) KEY(T.COL_1=DMTEMPVIEW_889197127.TMPCOL1)
19 #SLCT2: [7, 9926->9926, 138]; T.ATYPE = 'AA'
20 #CSCN2: [7, 50000->50000, 138]; INDEX33633489(TEST2)
21 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0),
22 #SPL2: [33, 25794, 522]; key_num(2), spool_num(0), has_variable(0), sites(-)
23 #PRJT2: [33, 25794, 522]; exp_num(2), is_atom(FALSE)
24 #HASH2 INNER JOIN: [33, 25794->0, 522]; KEY_NUM(2), MEM_USED(16046KB), DISK_USED(0KB) KEY(T3.COL_2=DMTEMPVIEW_889197127.TMPCOL2 AND T3.COL_3=DMTEMPVIEW_889197127.TMPCOL3) KEY_NULL_EQU(0, 0)
25 #HASH2 INNER JOIN: [5, 10140->244, 240]; KEY_NUM(1), MEM_USED(15022KB), DISK_USED(0KB) KEY(T4.COL_1=T3.COL_1) KEY_NULL_EQU(0)
26 #CSCN2: [1, 10000->10000, 48]; INDEX33633488(TEST4)
27 #CSCN2: [1, 10000->10000, 192]; INDEX33633487(TEST3)
28 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0),
29 #HEAP TABLE: [19, 50347, 282]; table_no(0) full(FALSE), mpp_full(0) autoid(TRUE) sites(-)
30 #PRJT2: [19, 50347->50231, 282]; exp_num(5), is_atom(FALSE)
31 #HAGR2: [19, 50347->50231, 282]; grp_num(4), sfun_num(1), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0]; slave_empty(0) keys(T1.COL_1, T2.COL_1, T1.COL_2, T2.COL_3)
32 #HASH2 INNER JOIN: [14, 50347->50231, 282]; KEY_NUM(1), MEM_USED(16046KB), DISK_USED(0KB) KEY(T1.ID=T2.ID) KEY_NULL_EQU(0)
33 #CSCN2: [1, 10000->10000, 126]; INDEX33633485(TEST1)
34 #CSCN2: [6, 50000->50000, 156]; INDEX33633489(TEST2)
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
334 logical reads
0 physical reads
0 redo size
2463089 bytes sent to client
923 bytes received from client
6 roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
50231 rows processed
0 io wait time(ms)
487 exec time(ms)
--改写后
1 #NSET2: [50, 50347->50231, 678]
2 #PRJT2: [50, 50347->50231, 678]; exp_num(5), is_atom(FALSE)
3 #HASH RIGHT JOIN2: [50, 50347->50231, 678]; key_num(2); col_num(6); MEM_USED(12462KB), DISK_USED(0KB) KEY(B.COL_2=A.COL_2 AND B.COL_3=A.COL_3)
4 #PRJT2: [5, 10140->244, 240]; exp_num(3), is_atom(FALSE)
5 #HASH2 INNER JOIN: [5, 10140->244, 240]; KEY_NUM(1), MEM_USED(15022KB), DISK_USED(0KB) KEY(T4.COL_1=T3.COL_1) KEY_NULL_EQU(0)
6 #CSCN2: [1, 10000->10000, 48]; INDEX33633488(TEST4)
7 #CSCN2: [1, 10000->10000, 192]; INDEX33633487(TEST3)
8 #HASH RIGHT JOIN2: [35, 50347->50231, 438]; key_num(1); col_num(6); MEM_USED(11950KB), DISK_USED(0KB) KEY(T.COL_1=A.T2COL1)
9 #PRJT2: [8, 9926->9796, 156]; exp_num(3), is_atom(FALSE)
10 #HAGR2: [8, 9926->9796, 156]; grp_num(1), sfun_num(2), MEM_USED(2034KB), DISK_USED(0KB), distinct_flag[0,0]; slave_empty(0) keys(TEST2.COL_1)
11 #SLCT2: [7, 9926->9926, 156]; TEST2.ATYPE = 'AA'
12 #CSCN2: [7, 50000->50000, 156]; INDEX33633489(TEST2)
13 #PRJT2: [19, 50347->50231, 282]; exp_num(5), is_atom(FALSE)
14 #HAGR2: [19, 50347->50231, 282]; grp_num(4), sfun_num(1), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0]; slave_empty(0) keys(T1.COL_1, T2.COL_1, T1.COL_2, T2.COL_3)
15 #HASH2 INNER JOIN: [14, 50347->50231, 282]; KEY_NUM(1), MEM_USED(16046KB), DISK_USED(0KB) KEY(T1.ID=T2.ID) KEY_NULL_EQU(0)
16 #CSCN2: [1, 10000->10000, 126]; INDEX33633485(TEST1)
17 #CSCN2: [6, 50000->50000, 156]; INDEX33633489(TEST2)
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
238 logical reads
0 physical reads
0 redo size
2463107 bytes sent to client
1009 bytes received from client
6 roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
50231 rows processed
0 io wait time(ms)
101 exec time(ms)
1、标量子查询大数据量的情况优先改写left join;
2、主查询部分可以查出需要的字段,作为派生表进行外关联;
3、对于相同条件的子查询可以提取公共条件合并处理;
文章
阅读量
获赞