注册
SQL优化案例——CASE WHEN EXISTS(相关子查询)
技术分享/ 文章详情 /

SQL优化案例——CASE WHEN EXISTS(相关子查询)

PYZ 2025/12/31 159 0 0

问题SQL

--实际SQL写法简化后如下 select case when t1.object_type = 'TABLE' and exists (select 1 from t2 where t2.id = t1.object_id) then '1' else '0' end c1, case when t1.object_type = 'INDEX' and not exists (select 1 from t2 where t2.id = t1.object_id) then '1' else '0' end c2 from t1 where object_name='SYSOBJECTS'; 1 #NSET2: [328, 63->64, 138] 2 #PIPE2: [328, 63->64, 138] 3 #PIPE2: [1, 63->64, 138] 4 #PRJT2: [1, 63->64, 138]; exp_num(3), is_atom(FALSE) 5 #BLKUP2: [1, 63->64, 138]; IDX_T1_OBJECT_NAME(T1); use_clu_addr(0) 6 #SSEK2: [1, 63->64, 138]; scan_type(ASC), IDX_T1_OBJECT_NAME(T1), is_global(0), scan_range['SYSOBJECTS','SYSOBJECTS'] 7 #SPL2: [1, 2048->0, 30]; key_num(1), spool_num(1), has_variable(1), sites(-), result_cache(FALSE) 8 #PRJT2: [1, 2048, 30]; exp_num(1), is_atom(FALSE) 9 #SSEK2: [1, 2048, 30]; scan_type(ASC), IDX_T2_ID(T2), is_global(0), scan_range[var1,var1] 10 #SPL2: [327, 2949120->0, 30]; key_num(1), spool_num(0), has_variable(0), sites(-), result_cache(FALSE) 11 #PRJT2: [327, 2949120->2949120, 30]; exp_num(1), is_atom(FALSE) 12 #SSCN: [327, 2949120->2949120, 30]; IDX_T2_ID(T2); btr_scan(1); is_global(0) Statistics ----------------------------------------------------------------- 0 data pages changed 0 undo pages changed 3667 logical reads 0 physical reads 0 redo size 1688 bytes sent to client 373 bytes received from client 1 roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed 0 io wait time(ms) 1318 exec time(ms)

问题分析

  • 问题SQL对相关子查询的返回结果是否存在进行判断,主查询实际返回64条行记录,被驱动表T2数据量较大,已知T1表的查询条件object_name存在索引,T2表的关联字段id存在索引,通过执行计划可见,not exists判断时可以利用T2关联字段索引将连接条件推入(has_variable=1),但exists判断时无法将连接条件下推(has_variable=0),并没有将主查询的结果推入到子查询做索引范围定位SSEK,主要代价消耗在第12步的索引SSCN全扫。

优化思路

针对这种情况可以有如下两种方案进行优化处理

  1. 改写LEFT JOIN,由于子查询中相关表都是T2,因此可以将两次子查询关联合并为一次左关联,通过判断T2.id字段是否为空,来替换原本的exists/not exists逻辑;
  2. 查询优化参数 REFED_EXISTS_OPT_FLAG 可以对此场景进行优化处理,参数详解如下:
简介:是否将相关 EXISTS 查询转化为非相关 IN 查询,可取值 0、1、2、4 及这些值的组合值
属性:会话级,同时支持 HINT 指定 
缺省值:1
REFED_EXISTS_OPT_FLAG=0 时,不进行优化。
REFED_EXISTS_OPT_FLAG=1 时,则会将相关 EXISTS 查询转化为非相关 IN 查询处理。
但是当查询满足以下任何一条时,则不进行优化:
1. EXISTS 子查询条件为否定,即 NOT EXISTS 时;
2. 查询中存在 GROUP BY 或 HAVING 子句时;
3. 查询中存在相关的层次查询时;
4. EXISTS 子查询引用超过一个上层表时;
5. EXISTS 子查询的关联条件为非等值时。
REFED_EXISTS_OPT_FLAG=2 时,在特殊场景下会将相关 EXISTS 查询转换为非相关 IN 查询处理,通过将相关表+DISTINCT 构造为派生表添加到子查询中去相关性。
此方式可以减少子查询中对象扫描次数,提升性能。
这里的特殊场景必须满足以下优化条件:
1. 子查询仅与一张表、一个列相关;
2. 相关对象只能是基表或实体视图,引用列的不同值比例低于 50%;
3. 子查询不存在 GROUP\HAVING\AFUN\SFUM\ROWNUM\CONNECT BY\TOP 项;
4. 子查询不属于 HAVING 项;
5. 子查询内部不存在相关子查询。
REFED_EXISTS_OPT_FLAG=4 时,将符合条件的相关 EXISTS 子句拉至上层进行处理,避免不必要笛卡尔积的产生。

方案一:改写LEFT JOIN

select case when t1.object_type = 'TABLE' and t2.id is not null then '1' else '0' end c1, case when t1.object_type = 'INDEX' and t2.id is null then '1' else '0' end c2 from t1 left join t2 on t2.id = t1.object_id where t1.object_name='SYSOBJECTS'; 1 #NSET2: [35, 63->64, 126] 2 #PRJT2: [35, 63->64, 126]; exp_num(2), is_atom(FALSE) 3 #INDEX JOIN LEFT JOIN2: [35, 63->64, 126]: col_num(2) ret_null(0) 4 #BLKUP2: [1, 63->64, 126]; IDX_T1_OBJECT_NAME(T1); use_clu_addr(0) 5 #SSEK2: [1, 63->64, 126]; scan_type(ASC), IDX_T1_OBJECT_NAME(T1), is_global(0), scan_range['SYSOBJECTS','SYSOBJECTS'] 6 #SSEK2: [20, 2048->131072, 30]; scan_type(ASC), IDX_T2_ID(T2), is_global(0), scan_range[T1.OBJECT_ID,T1.OBJECT_ID] Statistics ----------------------------------------------------------------- 0 data pages changed 0 undo pages changed 537 logical reads 8 physical reads 0 redo size 3146264 bytes sent to client 702 bytes received from client 7 roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed 7 io wait time(ms) 12 exec time(ms)

方案二:关闭参数refed_exists_opt_flag

  • refed_exists_opt_flag默认1会将相关exists转换为非相关的in,但not exists不进行查询转换处理,因此关闭参数即可保持整体的子查询相关性,将主查询连接条件下推
select /*+ refed_exists_opt_flag(0)*/ case when t1.object_type = 'TABLE' and exists (select 1 from t2 where t2.id = t1.object_id) then '1' else '0' end c1, case when t1.object_type = 'INDEX' and not exists (select 1 from t2 where t2.id = t1.object_id) then '1' else '0' end c2 from t1 where object_name='SYSOBJECTS'; 1 #NSET2: [1, 63->64, 138] 2 #PIPE2: [1, 63->64, 138] 3 #PIPE2: [1, 63->64, 138] 4 #PRJT2: [1, 63->64, 138]; exp_num(3), is_atom(FALSE) 5 #BLKUP2: [1, 63->64, 138]; IDX_T1_OBJECT_NAME(T1); use_clu_addr(0) 6 #SSEK2: [1, 63->64, 138]; scan_type(ASC), IDX_T1_OBJECT_NAME(T1), is_global(0), scan_range['SYSOBJECTS','SYSOBJECTS'] 7 #SPL2: [1, 2048->0, 30]; key_num(1), spool_num(1), has_variable(1), sites(-), result_cache(FALSE) 8 #PRJT2: [1, 2048, 30]; exp_num(1), is_atom(FALSE) 9 #SSEK2: [1, 2048, 30]; scan_type(ASC), IDX_T2_ID(T2), is_global(0), scan_range[var2,var2] 10 #SPL2: [1, 2048->0, 30]; key_num(1), spool_num(0), has_variable(1), sites(-), result_cache(FALSE) 11 #PRJT2: [1, 2048->9600, 30]; exp_num(1), is_atom(FALSE) 12 #SSEK2: [1, 2048->9600, 30]; scan_type(ASC), IDX_T2_ID(T2), is_global(0), scan_range[var1,var1] Statistics ----------------------------------------------------------------- 0 data pages changed 0 undo pages changed 253 logical reads 0 physical reads 0 redo size 1688 bytes sent to client 403 bytes received from client 1 roundtrips to/from client 0 sorts (memory) 0 sorts (disk) 0 rows processed 0 io wait time(ms) 7 exec time(ms)
评论
后发表回复

作者

文章

阅读量

获赞

扫一扫
联系客服