1.环境:
SYS> @ ver1
PORT_STRING VERSION BANNER
------------------------------ -------------- --------------------------------------------------------------------------------
x86_64/Linux 2.4.xx 11.2.0.4.0 Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
2.问题分析:
--//上个星期看了生产系统awr报表,发现一些问题.原来防水墙不断调用执行如下:
begin :con := "TASSETACL"."QUERYACL"(:sn, :on); end;
--//该语句已经消失,我问了同事,据说升级过该产品,当时awr的记录如下:
SQL ordered by Executions
%CPU - CPU Time as a percentage of Elapsed Time
%IO - User I/O Time as a percentage of Elapsed Time
Total Executions: 23,199,582
Captured SQL account for 34.2% of Total
Executions Rows Processed Rows per Exec Elapsed Time (s)%CPU %IO SQL Id SQL Module SQL Text
6,522,522 6,521,177 1.00 1,660.15 99.4 0 190q1sz3ywrd7 xxxxxx.eee begin :con := "TASSETACL".QUE...
2,228,287 2,228,115 1.00 107.98 99.5 0 g7ytdh9mxt1s0 xxxxxx.eee select count ( :"SYS_B_0" ) fr...
--//而现在变成了如下:
SQL ordered by Elapsed Time
Resources reported for PL/SQL code includes the resources used by all SQL statements called by the code.
为PL/SQL代码报告的资源包括该代码调用的所有SQL语句所使用的资源。
% Total DB Time is the Elapsed Time of the SQL statement divided into the Total Database Time multiplied by 100
%Total - Elapsed Time as a percentage of Total DB time
%CPU - CPU Time as a percentage of Elapsed Time
%IO - User I/O Time as a percentage of Elapsed Time
Captured SQL account for 65.1% of Total DB Time (s): 33,342
Captured PL/SQL account for 41.7% of Total DB Time (s): 33,342
Elapsed Time (s) Executions Elapsed Time per Exec (s) %Total %CPU %IO SQL Id SQL Module SQL Text
5,156.39 33,682 0.15 15.47 98.35 0.01 3pw59r6rkw9gb oraagent.bin@dm01dbadm01.XXX.com (TNS V1- declare PRIVS_ERROR exception;...
5,153.22 33,683 0.15 15.46 98.34 0.01 0xwxau19hznj8 oraagent.bin@dm01dbadm01.XXX.com (TNS V1- begin dbagent.try_login(); end...
5,089.67 134,713 0.04 15.26 98.34 0.00 8vmu6k690g87k oraagent.bin@dm01dbadm01.XXX.com (TNS V1- SELECT UPPER(NVL(PROGRAM, 'nul...
--//原来的问题消失了,但是消耗的时间却增加了,很明显对方的产品没有经过严格的测试,就给用户使用了.
--//取出sql语句,主要便于查看并且格式化如下:
SYS> @ sql_id 3pw59r6rkw9gb
--SQL_ID = 3pw59r6rkw9gb
declare
PRIVS_ERROR exception; --raise error,if rule exception,will trigger privs_error
pragma exception_init(PRIVS_ERROR, -1031);
begin
execute immediate 'begin dbagent.try_login(); end;';
exception
when PRIVS_ERROR then
raise;
when others then
rollback;
end dbagent_logon;;
--//注意后面多了1个分号,是我写的脚本无法区分PL/SQL语句与sql语句造成的问题.
SYS> @sql_id 0xwxau19hznj8
--SQL_ID = 0xwxau19hznj8
begin dbagent.try_login(); end;;
SYS> @ sql_id 8vmu6k690g87k
--SQL_ID = 8vmu6k690g87k
SELECT UPPER (NVL (PROGRAM, 'null'))
,UPPER (MODULE)
,TYPE
,DECODE
(
NVL (INSTR (PROCESS, ':'), 0)
,0, NVL (PROCESS, 1234)
,SUBSTR (PROCESS, 1, INSTR (PROCESS, ':') - 1)
)
,OSUSER
,MACHINE
,SCHEMANAME
,USERNAME
,SERVICE_NAME
,SID
,SERIAL#
FROM SYS.V_$SESSION
WHERE SID = SYS_CONTEXT ('userenv', 'sid');
--//134713/33682 = 3.99955.平均执行3pw59r6rkw9gb1次,执行8vmu6k690g87k 4次.
--//可以看出调用顺序3pw59r6rkw9gb,0xwxau19hznj8,8vmu6k690g87k 分别执行次数的比例是1:1:4.
--//很明显开发没有写好程序代码.仅仅需要调用1次8vmu6k690g87k就ok了.
--//如果修改代码可以提高效率4倍. 仅仅需要5156/4 = 1289秒.但是我的问题并不再这里,细节很重要,为什么8vmu6k690g87k的执行有点
--//慢.再仔细看sql_id='8vmu6k690g87k'每次执行需要0.04秒=40ms(估计存在四舍五入问题,因为0.04*4=0.16),有点慢,因为where条件
--//查询使用sid,不应该存在这么"慢"的情况.
--//开始以为都是Module=oraagent.bin执行的,后面我发现实际上几乎全部模块都存在相似调用,看代码可以猜出是登陆数据库时触发器
--//调用执行的代码.这时我突然想起我自己写的登陆触发器用于设置cursor_sharing=force也存在类似的调用v$session视图的情况,很
--//明显我写的代码调用次数应该也有33XXX次数.Elapsed Time至少应该 5089/4 = 1272秒,但是并没有出现在awr报表中.另外我想到是
--//否因为select部分消耗cpu资源太多,我马上否定我的猜测,因为仅仅返回1行,无论如何不应该这么"慢".
--//当我看我写的触发器代码马上明白问题在那里.我是分开写的,可以参考链接
--// http://blog.itpub.net/267265/viewspace-2768591/ => [20210418]查询v$视图问题.txt
--// connor-mcdonald.com/2021/04/12/better-performance-when-querying-the-v-views
3.继续:
$ cat a.txt
SELECT UPPER (NVL (PROGRAM, 'null'))
,UPPER (MODULE)
,TYPE
,DECODE
(
NVL (INSTR (PROCESS, ':'), 0)
,0, NVL (PROCESS, 1234)
,SUBSTR (PROCESS, 1, INSTR (PROCESS, ':') - 1)
)
,OSUSER
,MACHINE
,SCHEMANAME
,USERNAME
,SERVICE_NAME
,SID
,SERIAL#
FROM SYS.V_$SESSION
WHERE SID = SYS_CONTEXT ('userenv', 'sid');
--//查看起对应的执行计划:
SYS> @ sl all
alter session set statistics_level = all;
Session altered.
SYS> @ a.txt
SYS> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
---------------------------
Plan hash value: 2422122865
---------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows | A-Time | OMem | 1Mem | Used-Mem |
---------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 1 (100)| 1 |00:00:00.02 | | | |
| 1 | MERGE JOIN CARTESIAN | | 1 | 1 | 62 | 0 (0)| 1 |00:00:00.02 | | | |
| 2 | NESTED LOOPS | | 1 | 1 | 12 | 0 (0)| 1 |00:00:00.01 | | | |
|* 3 | FIXED TABLE FULL | X$KSLWT | 1 | 1 | 8 | 0 (0)| 1 |00:00:00.01 | | | |
|* 4 | FIXED TABLE FIXED INDEX| X$KSLED (ind:2) | 1 | 1 | 4 | 0 (0)| 1 |00:00:00.01 | | | |
| 5 | BUFFER SORT | | 1 | 1 | 50 | 0 (0)| 1 |00:00:00.02 | 2048 | 2048 | 2048 (0)|
|* 6 | FIXED TABLE FULL | X$KSUSE | 1 | 1 | 50 | 0 (0)| 1 |00:00:00.02 | | | |
---------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$88122447
3 - SEL$88122447 / W@SEL$4
4 - SEL$88122447 / E@SEL$4
6 - SEL$88122447 / S@SEL$4
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("W"."KSLWTSID"=TO_NUMBER(SYS_CONTEXT('userenv','sid')))
4 - filter("W"."KSLWTEVT"="E"."INDX")
6 - filter((BITAND("S"."KSSPAFLG",1)<>0 AND BITAND("S"."KSUSEFLG",1)<>0 AND "S"."INDX"=TO_NUMBER(SYS_CONTEXT('userenv','sid'))
AND "S"."INST_ID"=USERENV('INSTANCE')))
--//你可以发现ID=3,全表扫描X$KSLWT,而过滤条件是 filter("W"."KSLWTSID"=TO_NUMBER(SYS_CONTEXT('userenv','sid'))),也就是无法把
--//SYS_CONTEXT ('userenv', 'sid')当作常量处理.
--//而生产库X$KSLWT 记录的数量不是一般测试环境的数量.我的测试环境仅仅返回24条.而生产系统返回很多.
SYS> select count(*) from X$KSLWT;
COUNT(*)
----------
8793
--//访问X$KSUSE也是出现类似的情况.
--//解决很简单,分开写:
--// Get user SID information
SELECT SYS_CONTEXT('userenv', 'sid') INTO v_sid FROM dual;
--// Get Program executable,OSUSER Details,Machine Details for this session
SELECT UPPER(NVL(PROGRAM, 'NULL')),
UPPER(MODULE),
DECODE(NVL(INSTR(PROCESS, ':'), 0), 0, NVL(PROCESS, 1234), SUBSTR(PROCESS, 1, INSTR(PROCESS, ':') - 1))
TYPE, OSUSER, MACHINE, SCHEMANAME, USERNAME, SERVICE_NAME, SID, SERIAL#
FROM SYS.V_$SESSION
where sid = v_sid;
SYS> SELECT SYS_CONTEXT('userenv', 'sid') FROM dual;
SYS_CONTEXT('USERENV','SID')
----------------------------
17017
$ cat a.txt
SELECT UPPER (NVL (PROGRAM, 'null'))
,UPPER (MODULE)
,TYPE
,DECODE
(
NVL (INSTR (PROCESS, ':'), 0)
,0, NVL (PROCESS, 1234)
,SUBSTR (PROCESS, 1, INSTR (PROCESS, ':') - 1)
)
,OSUSER
,MACHINE
,SCHEMANAME
,USERNAME
,SERVICE_NAME
,SID
,SERIAL#
FROM SYS.V_$SESSION
WHERE
SID= 17017
--SID = SYS_CONTEXT ('userenv', 'sid')
;
SYS> @a.txt
..
SYS> @ dpc '' '' ''
PLAN_TABLE_OUTPUT
-------------------------------------
SQL_ID gxdr0gq6nyums, child number 0
-------------------------------------
Plan hash value: 589956979
----------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| A-Rows | A-Time | OMem | 1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 1 (100)| 1 |00:00:00.01 | | | |
| 1 | NESTED LOOPS | | 1 | 1 | 62 | 0 (0)| 1 |00:00:00.01 | | | |
| 2 | MERGE JOIN CARTESIAN | | 1 | 1 | 58 | 0 (0)| 1 |00:00:00.01 | | | |
|* 3 | FIXED TABLE FIXED INDEX | X$KSUSE (ind:1) | 1 | 1 | 50 | 0 (0)| 1 |00:00:00.01 | | | |
| 4 | BUFFER SORT | | 1 | 1 | 8 | 0 (0)| 1 |00:00:00.01 | 2048 | 2048 | 2048 (0)|
|* 5 | FIXED TABLE FIXED INDEX| X$KSLWT (ind:1) | 1 | 1 | 8 | 0 (0)| 1 |00:00:00.01 | | | |
|* 6 | FIXED TABLE FIXED INDEX | X$KSLED (ind:2) | 1 | 1 | 4 | 0 (0)| 1 |00:00:00.01 | | | |
----------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$88122447
3 - SEL$88122447 / S@SEL$4
5 - SEL$88122447 / W@SEL$4
6 - SEL$88122447 / E@SEL$4
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter(("S"."INDX"=17017 AND BITAND("S"."KSSPAFLG",1)<>0 AND BITAND("S"."KSUSEFLG",1)<>0 AND
"S"."INST_ID"=USERENV('INSTANCE')))
5 - filter("W"."KSLWTSID"=17017)
6 - filter("W"."KSLWTEVT"="E"."INDX")
--//connor-mcdonald.com 链接给出了另外的写法,使用with+materialize提示:
$ cat b.txt
with mysid as ( select /*+ materialize */ userenv('SID') n from dual )
SELECT UPPER(NVL(PROGRAM, 'NULL')),
UPPER(MODULE),
DECODE(NVL(INSTR(PROCESS, ':'), 0), 0, NVL(PROCESS, 1234), SUBSTR(PROCESS, 1, INSTR(PROCESS, ':') - 1))
TYPE, OSUSER, MACHINE, SCHEMANAME, USERNAME, SERVICE_NAME, SID, SERIAL#
FROM SYS.V_$SESSION,mysid
WHERE SID = mysid.n;
SYS> @ b.txt
..
Plan hash value: 2230424401
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows |E-Bytes| Cost (%CPU)| E-Time | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | | 4 (100)| | 1 |00:00:00.03 | 15 | 1 | 1 | | | |
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | | | | 1 |00:00:00.03 | 15 | 1 | 1 | | | |
| 2 | LOAD AS SELECT | | 1 | | | | | 0 |00:00:00.02 | 4 | 0 | 1 | 270K| 270K| 270K (0)|
| 3 | FAST DUAL | | 1 | 1 | | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 0 | 0 | 0 | | | |
| 4 | NESTED LOOPS | | 1 | 1 | 75 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 5 | 1 | 0 | | | |
| 5 | NESTED LOOPS | | 1 | 1 | 71 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 5 | 1 | 0 | | | |
| 6 | NESTED LOOPS | | 1 | 1 | 63 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 5 | 1 | 0 | | | |
| 7 | VIEW | | 1 | 1 | 13 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 5 | 1 | 0 | | | |
| 8 | TABLE ACCESS STORAGE FULL| SYS_TEMP_0FD9D6D71_CF54F429 | 1 | 1 | 13 | 2 (0)| 00:00:01 | 1 |00:00:00.01 | 5 | 1 | 0 | 1025K| 1025K| |
|* 9 | FIXED TABLE FIXED INDEX | X$KSUSE (ind:1) | 1 | 1 | 50 | 0 (0)| | 1 |00:00:00.01 | 0 | 0 | 0 | | | |
|* 10 | FIXED TABLE FIXED INDEX | X$KSLWT (ind:1) | 1 | 1 | 8 | 0 (0)| | 1 |00:00:00.01 | 0 | 0 | 0 | | | |
|* 11 | FIXED TABLE FIXED INDEX | X$KSLED (ind:2) | 1 | 1 | 4 | 0 (0)| | 1 |00:00:00.01 | 0 | 0 | 0 | | | |
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Query Block Name / Object Alias (identified by operation id):
-------------------------------------------------------------
1 - SEL$71D7A081
2 - SEL$1
3 - SEL$1 / DUAL@SEL$1
7 - SEL$D67CB2D2 / MYSID@SEL$2
8 - SEL$D67CB2D2 / T1@SEL$D67CB2D2
9 - SEL$71D7A081 / S@SEL$5
10 - SEL$71D7A081 / W@SEL$5
11 - SEL$71D7A081 / E@SEL$5
Predicate Information (identified by operation id):
---------------------------------------------------
9 - filter((BITAND("S"."KSSPAFLG",1)<>0 AND BITAND("S"."KSUSEFLG",1)<>0 AND "S"."INST_ID"=USERENV('INSTANCE') AND "S"."INDX"="MYSID"."N"))
10 - filter("S"."INDX"="W"."KSLWTSID")
11 - filter("W"."KSLWTEVT"="E"."INDX")
4.性能对比:
--//我在toad下测试,多次执行观察
--//调用b.txt 大约需要2Xms.
--//调用a.txt 大约需要5Xms.
--//有时候两者差异不大,感觉这样比较不行,没有考虑网络的往返.另外写一篇测试该语句性能的blog.
--//注:我后面测试with+materialize要产生redo日志,不建议在生产系统应用这样的写法,主要原因是执行频率太高了,累积起来redo很大!!
5.为什么调用1次0xwxau19hznj8,执行4次8vmu6k690g87k.
--//我看了一下源代码,源代码这次没有加密.
--//通过过程collect_app_info调用sql_id=8vmu6k690g87k
--//而try_login代码通过4个函数分别取得对应值
--get login app name
function get_app_name return varchar2 is
begin
collect_app_info;
return l_appname;
end;
--get login app module name
function get_app_module_name return varchar2 is
begin
collect_app_info;
return l_module;
end;
--get login app type
function get_app_type return varchar2 is
begin
collect_app_info;
return l_type;
end;
--get app process number
function get_app_process return number is
begin
collect_app_info;
return l_process;
end;
--//这样就很好解析为什么调用1次执行4次sql_id=8vmu6k690g87k.而实际上
procedure collect_app_info is
begin
if not app_info_collect_status then
select upper(nvl(program, 'null')),
upper(module),
type,
decode(nvl(instr(process, ':'), 0),
0,
nvl(process, 1234),
substr(process, 1, instr(process, ':') - 1)),
osuser,
machine,
SCHEMANAME,
USERNAME,
SERVICE_NAME,
sid,
serial#
into l_appname,
l_module,
l_type,
l_process,
l_osuser,
l_machine,
l_SCHEMANAME,
l_username,
l_service_name,
l_sid,
l_serial#
from sys.v_$session
where sid = sys_context('userenv', 'sid');
end if;
end;
--//已经赋值给对应变量,根本不需要再通过什么函数返回对应值.直接使用对应变量应该可以,当然我没有测试.
--//最简单修改:
procedure collect_app_info is
begin
if not app_info_collect_status then
select userenv('SID') into l_sid from dual;
select upper(nvl(program, 'null')),
upper(module),
type,
decode(nvl(instr(process, ':'), 0),
0,
nvl(process, 1234),
substr(process, 1, instr(process, ':') - 1)),
osuser,
machine,
SCHEMANAME,
USERNAME,
SERVICE_NAME,
sid,
serial#
into l_appname,
l_module,
l_type,
l_process,
l_osuser,
l_machine,
l_SCHEMANAME,
l_username,
l_service_name,
l_sid,
l_serial#
from sys.v_$session
where sid = l_sid;
end if;
end;
--//我的感觉可能分开写效率更高!!因为使用with+materialize会生成一个临时表,会产生日志,我还给测试看看.
6.总结:
--//很明显对方缺乏蛮力测试就把产品做了升级,我曾经对乙方工程师讲过类似的问题,尽量不要让客户先于你发现产品的问题,这样是很
--//丢脸的事情.你内部测试怎么犯错都可以,客户是看不见的,一旦你犯错,对方先发现,对方会........
--//实际上只要大量用户连接数据库,该语句的问题才明显的暴露出来.
内容来源于网络如有侵权请私信删除
文章来源: 博客园
- 还没有人评论,欢迎说说您的想法!