业务SQL长这样,看到SQL时我内心是拒绝的,给条件字段加运算是走不了索引的,还都是不等值条件,这么写只能全表扫描,实际生产环境上要执行13s左右,测试环境也要2s多,savedate字段还是用varchar2类型存储时间数据
select *
from ecm
where state != '0'
and state != '3'
and (? - savedate) < 3000000
and (? - savedate) > 10000
and flag != 'AA'
order by savedate;
首先查看执行计划,执行2.4s,代价基本都在CSCN
--带入一个参数
select *
from ecm
where state != '0'
and state != '3'
and ('20240808000001' - savedate) < 3000000
and ('20240808000001' - savedate) > 10000
and flag != 'AA'
order by savedate;
1 #NSET2: [163, 2256->3928, 234]
2 #PRJT2: [163, 2256->3928, 234]; exp_num(6), is_atom(FALSE)
3 #SORT3: [163, 2256->3928, 234]; key_num(1), is_distinct(FALSE), is_adaptive(0), MEM_USED(37721KB), DISK_USED(0KB)
4 #HASH RIGHT SEMI JOIN2: [162, 2256->3928, 234]; key_num(1) (ANTI), MEM_USED(224KB), DISK_USED(0KB) KEY(DMTEMPVIEW_889193517.colname=ECM.STATE) KEY_NULL_EQU(0)
5 #CONST VALUE LIST: [1, 2->2, 48]; row_num(2), col_num(1),
6 #SLCT2: [162, 2256->6478, 234]; (ECM.FLAG <> 'AA' AND var2 < var3 AND var2 > var4)
7 #CSCN2: [162, 1000000->1000000, 234]; INDEX33633597(ECM)
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
1594 logical reads
0 physical reads
0 redo size
220291 bytes sent to client
304 bytes received from client
2 roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3928 rows processed
0 io wait time(ms)
2422 exec time(ms)
由于是通过传入的绑定变量减去条件字段,这种情况下函数索引也没戏,得想办法改写了
把原本的(? - savedate) < 3000000 and (? - savedate) > 10000
改写为savedate > to_number(?) - 3000000 and savedate < to_number(?) - 10000,逻辑上应该是等价的,之后还要创建一个savedate字段的索引。
1、先创建组合索引(为什么要组合待会再说)
create index idx_ecm_01 on ecm(savedate,state,flag);
2、改写SQL
select *
from ecm
where state != '0'
and state != '3'
and savedate > to_number('20240808000001') - 3000000
and savedate < to_number('20240808000001') - 10000
and flag != 'AA'
order by savedate;
改写后发现执行计划压根没变,执行时间还是2.3s,都把条件字段单独放左边还建了索引怎么还是CSCN,why?
略微思考过后发现遗漏了一个问题——隐式转换,savedate创建为varchar2类型,而传参之后计算出来的值却是number。
1 #NSET2: [163, 1899->3928, 234]
2 #PRJT2: [163, 1899->3928, 234]; exp_num(6), is_atom(FALSE)
3 #SORT3: [163, 1899->3928, 234]; key_num(1), is_distinct(FALSE), is_adaptive(0), MEM_USED(37721KB), DISK_USED(0KB)
4 #HASH RIGHT SEMI JOIN2: [162, 1899->3928, 234]; key_num(1) (ANTI), MEM_USED(224KB), DISK_USED(0KB) KEY(DMTEMPVIEW_889193869.colname=ECM.STATE) KEY_NULL_EQU(0)
5 #CONST VALUE LIST: [1, 2->2, 48]; row_num(2), col_num(1),
6 #SLCT2: [162, 1899->6478, 234]; (ECM.FLAG <> 'AA' AND var1 > var4 AND var1 < var6)
7 #CSCN2: [162, 1000000->1000000, 234]; INDEX33633597(ECM)
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
1619 logical reads
0 physical reads
0 redo size
220291 bytes sent to client
326 bytes received from client
2 roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
3928 rows processed
0 io wait time(ms)
2323 exec time(ms)
通过to_char手动转换后发现可以走索引了,观察现在的执行计划SSEK之后BLKUP回表之后SLCT过滤不等值条件,执行49ms,正常优化到这就够了,但此时还是存在可进一步优化的空间,这就要用到刚刚创建的组合索引。
select *
from ecm
where state != '0'
and state != '3'
and savedate > to_char(to_number('20240808000001') - 3000000)
and savedate < to_char(to_number('20240808000001') - 10000)
and flag != 'AA'
order by savedate;
1 #NSET2: [52, 28497->3928, 234]
2 #PRJT2: [52, 28497->3928, 234]; exp_num(6), is_atom(FALSE)
3 #HASH RIGHT SEMI JOIN2: [49, 28497->3928, 234]; key_num(1) (ANTI), MEM_USED(224KB), DISK_USED(0KB) KEY(DMTEMPVIEW_889193888.colname=ECM.STATE) KEY_NULL_EQU(0)
4 #CONST VALUE LIST: [1, 2->2, 48]; row_num(2), col_num(1),
5 #SLCT2: [49, 28497->6478, 234]; ECM.FLAG <> 'AA'
6 #BLKUP2: [49, 37500->8122, 234]; IDX_ECM_01(ECM)
7 #SSEK2: [49, 37500->8122, 234]; scan_type(ASC), IDX_ECM_01(ECM), scan_range[(exp11,max,min),(exp11,min,max))
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
16646 logical reads
1 physical reads
0 redo size
220291 bytes sent to client
343 bytes received from client
2 roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3928 rows processed
0 io wait time(ms)
49 exec time(ms)
通过Hint调整为先过滤后回表,之前计划是先回表后过滤数据,如果SSEK2检索出的数据量很大的情况,BLKUP的代价也会比较高,而开启enable_index_filter=1,使用索引过滤优化,如果过滤条件涉及的列包含在索引中,那么索引进行SSEK2后就可以使用此过滤条件,先过滤后回表减少中间结果集,降低BLKUP的代价,最后的执行计划中逻辑读和执行时间又有了进一步的提升。
select /*+ enable_index_filter(1) */
*
from ecm
where state != '0'
and state != '3'
and savedate > to_char(to_number('20240808000001') - 3000000)
and savedate < to_char(to_number('20240808000001') - 10000)
and flag != 'AA'
order by savedate;
1 #NSET2: [52, 28497->3928, 234]
2 #PRJT2: [52, 28497->3928, 234]; exp_num(6), is_atom(FALSE)
3 #BLKUP2: [49, 28497->3928, 234]; IDX_ECM_01(ECM)
4 #SLCT2: [49, 28497->3928, 234]; (ECM.FLAG <> 'AA' AND NOT(ECM.STATE IN LIST))
5 #SSEK2: [49, 28497->8122, 234]; scan_type(ASC), IDX_ECM_01(ECM), scan_range[(exp11,max,min),(exp11,min,max))
Statistics
-----------------------------------------------------------------
0 data pages changed
0 undo pages changed
7810 logical reads
0 physical reads
0 redo size
220291 bytes sent to client
352 bytes received from client
2 roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3928 rows processed
0 io wait time(ms)
17 exec time(ms)
1、谓词条件字段上尽量不要加任何运算包括函数,让字段单独放在条件表达式的左边,Leave it alone;
2、这条sql看似是优化问题,实际上却是业务架构设计问题,以字符类型存储时间数据,过滤条件却用数字进行计算,如果直接通过timestamp存储不仅可以利用数据库自带的时间函数,sql逻辑也会更严谨。
最后留一个小问题,为什么走索引之后执行时间提升,但逻辑读反而比全表扫描更高?
我的理解,索引以树形结构存储,那么定位到具体的行记录可能需要进行多次读取操作,每一次读取都算作一个逻辑读。虽然这些读取操作能够迅速定位到所需数据,但总的逻辑读次数可能会比直接扫描整个表要高,欢迎讨论。
文章
阅读量
获赞