注册
标量子查询改写——案例一
技术分享/ 文章详情 /

标量子查询改写——案例一

PYZ 2024/08/09 716 2 0

前言:

最近遇到一些慢SQL,其中主要问题是标量子查询的写法,并且主查询和子查询的数据量都很大,处理这种情况的最优解便是改写left join,案例中对SQL进行了简化。

简化后的SQL如下:

--原SQL select t1.col_1, nvl((select t3.name from test3 t3,test4 t4 where t3.col_1 = t4.col_1 and t3.col_2 = t1.col_2 and t3.col_3 = t2.col_3 ), t2.col_3 ) name, (select nvl(sum(col_4),0) from test2 t where t.atype='AA' and t.col_1=t2.col_1) col4, (select nvl(sum(col_5),0) from test2 t where t.atype='AA' and t.col_1=t2.col_1) col5, sum(t2.col_4) allsum from test1 t1, test2 t2 where t1.id=t2.id group by t1.col_1,t2.col_1,t1.col_2 ,t2.col_3;

问题点

1、t3和t4部分的查询以t3的两个字段分别与t1和t2两个主表同时关联,如果改写左连接此处与主表的相关性不太好处理;
2、test2需要分别按条件对两个字段进行汇总,sum(col_4)和sum(col_5),那么test2与主查询也就需要关联两次;
3、主查询部分t1与t2关联后还需要对整体做一次聚合分组,如果直接改写为左关联test2,如何保证分组字段完整也是个问题;

改写逻辑

1、首先针对问题1和3,可以先将t1和t2关联并聚合分组后的结果集作为主查询,并提取出外关联需要的字段,保证分组结果的准确性,再与其他子查询部分进行关联;
2、由于test2的两次sum查询条件一样这里也就可以合并处理,减少表关联次数并降低逻辑读;

--改写后 select a.col_1, nvl(b.name,a.col_3) name, nvl(sum4,0), nvl(sum5,0), sum_col_4 from ( (select t1.col_1,t2.col_1 t2col1,t1.col_2 ,t2.col_3,sum(t2.col_4) sum_col_4 from test1 t1, test2 t2 where t1.id=t2.id group by t1.col_1,t2.col_1,t1.col_2 ,t2.col_3 ) a left join (select col_1,sum(col_4) sum4,sum(col_5) sum5 from test2 where atype='AA' group by col_1) t on t.col_1=a.t2col1 left join (select t3.name ,t3.col_2,t3.col_3 from test3 t3,test4 t4 where t3.col_1 = t4.col_1) b on a.col_2 = b.col_2 and a.col_3 = b.col_3 );

再看一下改写前后的执行计划对比,可以看到改写前test2需要按T.ATYPE = 'AA’扫描并关联两次,改写后只需关联一次,逻辑读和执行时间也都有提升。

--改写前 1 #NSET2: [155, 50347->50231, 282] 2 #PIPE2: [155, 50347->50231, 282] 3 #PIPE2: [135, 50347->50231, 282] 4 #PIPE2: [102, 50347->50231, 282] 5 #PIPE2: [61, 50347->50231, 282] 6 #PRJT2: [19, 50347->50231, 282]; exp_num(5), is_atom(FALSE) 7 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0), 8 #SPL2: [41, 50347, 420]; key_num(2), spool_num(2), has_variable(0), sites(-) 9 #PRJT2: [41, 50347->50231, 420]; exp_num(3), is_atom(FALSE) 10 #HAGR2: [41, 50347->50231, 420]; grp_num(1), sfun_num(3), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0,0,0]; slave_empty(0) keys(DMTEMPVIEW_889197127.AUTOID) 11 #HASH RIGHT JOIN2: [34, 50347->50495, 420]; key_num(1); col_num(3); MEM_USED(11950KB), DISK_USED(0KB) KEY(T.COL_1=DMTEMPVIEW_889197127.TMPCOL1) 12 #SLCT2: [7, 9926->9926, 138]; T.ATYPE = 'AA' 13 #CSCN2: [7, 50000->50000, 138]; INDEX33633489(TEST2) 14 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0), 15 #SPL2: [41, 50347, 420]; key_num(2), spool_num(1), has_variable(0), sites(-) 16 #PRJT2: [41, 50347->50231, 420]; exp_num(3), is_atom(FALSE) 17 #HAGR2: [41, 50347->50231, 420]; grp_num(1), sfun_num(3), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0,0,0]; slave_empty(0) keys(DMTEMPVIEW_889197127.AUTOID) 18 #HASH RIGHT JOIN2: [34, 50347->50495, 420]; key_num(1); col_num(3); MEM_USED(11950KB), DISK_USED(0KB) KEY(T.COL_1=DMTEMPVIEW_889197127.TMPCOL1) 19 #SLCT2: [7, 9926->9926, 138]; T.ATYPE = 'AA' 20 #CSCN2: [7, 50000->50000, 138]; INDEX33633489(TEST2) 21 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0), 22 #SPL2: [33, 25794, 522]; key_num(2), spool_num(0), has_variable(0), sites(-) 23 #PRJT2: [33, 25794, 522]; exp_num(2), is_atom(FALSE) 24 #HASH2 INNER JOIN: [33, 25794->0, 522]; KEY_NUM(2), MEM_USED(16046KB), DISK_USED(0KB) KEY(T3.COL_2=DMTEMPVIEW_889197127.TMPCOL2 AND T3.COL_3=DMTEMPVIEW_889197127.TMPCOL3) KEY_NULL_EQU(0, 0) 25 #HASH2 INNER JOIN: [5, 10140->244, 240]; KEY_NUM(1), MEM_USED(15022KB), DISK_USED(0KB) KEY(T4.COL_1=T3.COL_1) KEY_NULL_EQU(0) 26 #CSCN2: [1, 10000->10000, 48]; INDEX33633488(TEST4) 27 #CSCN2: [1, 10000->10000, 192]; INDEX33633487(TEST3) 28 #HEAP TABLE SCAN: [19, 50347->50231, 282]; table_no(0), 29 #HEAP TABLE: [19, 50347, 282]; table_no(0) full(FALSE), mpp_full(0) autoid(TRUE) sites(-) 30 #PRJT2: [19, 50347->50231, 282]; exp_num(5), is_atom(FALSE) 31 #HAGR2: [19, 50347->50231, 282]; grp_num(4), sfun_num(1), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0]; slave_empty(0) keys(T1.COL_1, T2.COL_1, T1.COL_2, T2.COL_3) 32 #HASH2 INNER JOIN: [14, 50347->50231, 282]; KEY_NUM(1), MEM_USED(16046KB), DISK_USED(0KB) KEY(T1.ID=T2.ID) KEY_NULL_EQU(0) 33 #CSCN2: [1, 10000->10000, 126]; INDEX33633485(TEST1) 34 #CSCN2: [6, 50000->50000, 156]; INDEX33633489(TEST2) Statistics ----------------------------------------------------------------- 0 data pages changed 0 undo pages changed 334 logical reads 0 physical reads 0 redo size 2463089 bytes sent to client 923 bytes received from client 6 roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 50231 rows processed 0 io wait time(ms) 487 exec time(ms) --改写后 1 #NSET2: [50, 50347->50231, 678] 2 #PRJT2: [50, 50347->50231, 678]; exp_num(5), is_atom(FALSE) 3 #HASH RIGHT JOIN2: [50, 50347->50231, 678]; key_num(2); col_num(6); MEM_USED(12462KB), DISK_USED(0KB) KEY(B.COL_2=A.COL_2 AND B.COL_3=A.COL_3) 4 #PRJT2: [5, 10140->244, 240]; exp_num(3), is_atom(FALSE) 5 #HASH2 INNER JOIN: [5, 10140->244, 240]; KEY_NUM(1), MEM_USED(15022KB), DISK_USED(0KB) KEY(T4.COL_1=T3.COL_1) KEY_NULL_EQU(0) 6 #CSCN2: [1, 10000->10000, 48]; INDEX33633488(TEST4) 7 #CSCN2: [1, 10000->10000, 192]; INDEX33633487(TEST3) 8 #HASH RIGHT JOIN2: [35, 50347->50231, 438]; key_num(1); col_num(6); MEM_USED(11950KB), DISK_USED(0KB) KEY(T.COL_1=A.T2COL1) 9 #PRJT2: [8, 9926->9796, 156]; exp_num(3), is_atom(FALSE) 10 #HAGR2: [8, 9926->9796, 156]; grp_num(1), sfun_num(2), MEM_USED(2034KB), DISK_USED(0KB), distinct_flag[0,0]; slave_empty(0) keys(TEST2.COL_1) 11 #SLCT2: [7, 9926->9926, 156]; TEST2.ATYPE = 'AA' 12 #CSCN2: [7, 50000->50000, 156]; INDEX33633489(TEST2) 13 #PRJT2: [19, 50347->50231, 282]; exp_num(5), is_atom(FALSE) 14 #HAGR2: [19, 50347->50231, 282]; grp_num(4), sfun_num(1), MEM_USED(3613KB), DISK_USED(0KB), distinct_flag[0]; slave_empty(0) keys(T1.COL_1, T2.COL_1, T1.COL_2, T2.COL_3) 15 #HASH2 INNER JOIN: [14, 50347->50231, 282]; KEY_NUM(1), MEM_USED(16046KB), DISK_USED(0KB) KEY(T1.ID=T2.ID) KEY_NULL_EQU(0) 16 #CSCN2: [1, 10000->10000, 126]; INDEX33633485(TEST1) 17 #CSCN2: [6, 50000->50000, 156]; INDEX33633489(TEST2) Statistics ----------------------------------------------------------------- 0 data pages changed 0 undo pages changed 238 logical reads 0 physical reads 0 redo size 2463107 bytes sent to client 1009 bytes received from client 6 roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 50231 rows processed 0 io wait time(ms) 101 exec time(ms)

总结

1、标量子查询大数据量的情况优先改写left join;
2、主查询部分可以查出需要的字段,作为派生表进行外关联;
3、对于相同条件的子查询可以提取公共条件合并处理;

评论
后发表回复

作者

文章

阅读量

获赞

扫一扫
联系客服