遇到ORA-3136的解决之道

mac2022-06-30  23

今天早上同事打电话说日照业务库第二个节点登录不进去,业务没法使用,weblogic第二个节点测不通。

PLSQL登录就死住,最后是end-of-communcation chanel  ,估计出现了严重的数据库故障。

    telnet进AIX小机,crs_stat状态,发现第二个节点为unkown,用命令crs_start  启动此实例,结果启动失败,crs_stop倒是能

关闭此实例。

   开始动手找原因,在oracle安装目录下admin文件中bdump的警告文件和udump的trace文件中:

 

Tue Jul 18 23:09:22 2006WARNING: inbound connection timed out (ORA-3136)Tue Jul 18 23:09:23 2006WARNING: inbound connection timed out (ORA-3136)Tue Jul 18 23:09:25 2006WARNING: inbound connection timed out (ORA-3136)Tue Jul 18 23:09:30 2006WARNING: inbound connection timed out (ORA-3136)Tue Jul 18 23:12:15 2006WARNING: inbound connection timed out (ORA-3136)

同时在sqlnet.log中记录了如下错误:

Fatal NI connect error 12170.

VERSION INFORMATION:TNS for Linux: Version 10.2.0.2.0 - ProductionOracle Bequeath NT Protocol Adapter for Linux: Version 10.2.0.2.0 - ProductionTCP/IP NT Protocol Adapter for Linux: Version 10.2.0.2.0 - ProductionTime: 19-JUL-2006 11:25:26Tracing not turned on.Tns error struct:ns main err code: 12535TNS-12535: TNS:operation timed outns secondary err code: 12606nt main err code: 0nt secondary err code: 0nt OS err code: 0Client address: (ADDRESS=(PROTOCOL=tcp)(HOST=192.168.1.123)(PORT=58147))

 

在eygle的网站有解决办法:

 

这是和网络连接相关的一个错误,Metalink上给出了如下的解决方案:

1.set INBOUND_CONNECT_TIMEOUT_<listenername>=0 in listener.ora2. set SQLNET.INBOUND_CONNECT_TIMEOUT = 0 in sqlnet.ora of server.3. stop and start both listener and database.4. Now try to connect to DB and observe the behaviour

详细的解决:

修改listener的inbound_connect_timeout参数的方法方法一:[bglbj1][oracle][/home/oracle]>lsnrctlLSNRCTL for Linux: Version 10.2.0.4.0 - Production on 01-SEP-2009 16:28:06Copyright (c) 1991, 2007, Oracle. All rights reserved.Welcome to LSNRCTL, type "help" for information.LSNRCTL> helpThe following operations are availableAn asterisk (*) denotes a modifier or extended command:start              stop               status            services           version            reload            save_config        trace              spawn             change_password    quit               exit              set*               show*             LSNRCTL> showThe following operations are available after showAn asterisk (*) denotes a modifier or extended command:rawmode                    displaymode               rules                      trc_file                  trc_directory              trc_level                 log_file                   log_directory             log_status                 current_listener          inbound_connect_timeout    startup_waittime          snmp_visible               save_config_on_stop       dynamic_registration      LSNRCTL> setThe following operations are available after setAn asterisk (*) denotes a modifier or extended command:password                   rawmode                   displaymode                trc_file                  trc_directory              trc_level                 log_file                   log_directory             log_status                 current_listener          inbound_connect_timeout    startup_waittime          save_config_on_stop        dynamic_registration      LSNRCTL>LSNRCTL> show inbound_connect_timeoutConnecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))LISTENER parameter "inbound_connect_timeout" set to 60The command completed successfullyLSNRCTL> set inbound_connect_timeout 0Connecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))LISTENER parameter "inbound_connect_timeout" set to 0The command completed successfullyLSNRCTL> show inbound_connect_timeoutConnecting to (ADDRESS=(PROTOCOL=tcp)(HOST=)(PORT=1521))LISTENER parameter "inbound_connect_timeout" set to 0The command completed successfullyLSNRCTL> set save_config_on_stop on  #表示修改参数永久生效,否则只是临时生效,下次重启监听又还原为原来的值了LSNRCTL> exit方法二:修改listener.ora文件,加入: INBOUND_CONNECT_TIMEOUT_LISTENER_NAME=0问题解决

修改后依然是启动不了数据库,但在依然启动不了数据库连sqlplus环境都进不去,我查看来最新一个udump 下的

trc文件,其中有Dumping diagnostic information for LCK0:OS pid = 422264loadavg : 0.23 0.25 0.26swap info: free_mem = 17.67M rsv = 128.00M           alloc = 12067.13M avail = 32768.00M swap_free = 20700.87M       F S      UID    PID   PPID   C PRI NI ADDR    SZ    WCHAN    STIME    TTY  TIME CMD  240001 A   oracle 422264      1   0  60 20 a0b36400 106964            Feb 10      -  1:47 ora_lck0_rzywk2open: The file access permissions do not allow the specified action.Warning: executed in non-root modeprocstack cannot verify that /unix matches the running kernel.Kernel symbols might not be validated.422264: ora_lck0_rzywk20x00000001000fc898  sskgpwwait(??, ??, ??, ??, ??) + 0x380x00000001000f9e7c  skgpwwait(??, ??, ??, ??, ??) + 0xbc0x000000010011e42c  kslges(??, ??, ??, ??, ??) + 0x54c0x000000010012225c  kslgetl(??, ??, ??, ??) + 0x33c0x00000001049f0998  ksfglt(??, ??, ??, ??, ??) + 0x1980x00000001045d1b84  kqlmbfre() + 0x1440x00000001045d61c4  kqlmba(??, ??) + 0x240x000000010015efa8  ksbcti(??, ??, ??) + 0x3c80x0000000100169f1c  ksbabs(??) + 0x3fc0x00000001019b5e18  kclabs(??) + 0xd80x0000000100166dd4  ksbrdp() + 0x4b40x000000010430c93c  opirip(??, ??, ??) + 0x3fc0x0000000102d9ae38  opidrv(??, ??, ??) + 0x4580x000000010370b950  sou2o(??, ??, ??, ??) + 0x900x0000000100000870  opimai_real(??, ??) + 0x1500x00000001000006d8  main(??, ??) + 0x980x0000000100000368  __start() + 0x98*** 2011-02-12 05:20:09.050*** 2011-02-12 05:20:19.051Waited for detached process: LCK0 for 310 seconds:*** 2011-02-12 05:20:19.051Dumping diagnostic information for LCK0:OS pid = 422264loadavg : 0.36 0.27 0.27swap info: free_mem = 32.64M rsv = 128.00M           alloc = 12072.14M avail = 32768.00M swap_free = 20695.86M       F S      UID    PID   PPID   C PRI NI ADDR    SZ    WCHAN    STIME    TTY  TIME CMD  240001 A   oracle 422264      1   0  60 20 a0b36400 106964            Feb 10      -  1:47 ora_lck0_rzywk2open: The file access permissions do not allow the specified action.Warning: executed in non-root modeprocstack cannot verify that /unix matches the running kernel.Kernel symbols might not be validated.

 

依据此处和alterSID.log的内容:

ksvcreate: Process(m000) creation failedSat Feb 12 05:19:14 2011Errors in file /oracle/app/oracle/admin/rzywk/bdump/rzywk2_j002_512250.trc:ORA-12012: error on auto execute of job 42791ORA-27468: "EXFSYS.RLM$EVTCLEANUP" is locked by another processSat Feb 12 05:40:42 2011Errors in file /oracle/app/oracle/admin/rzywk/bdump/rzywk2_j002_614620.trc:ORA-12012: error on auto execute of job 42792ORA-27468: "EXFSYS.RLM$SCHDNEGACTION" is locked by another processSat Feb 12 05:45:05 2011Errors in file /oracle/app/oracle/admin/rzywk/bdump/rzywk2_j003_516258.trc:ORA-12012: error on auto execute of job 42791ORA-27468: "EXFSYS.RLM$EVTCLEANUP" is locked by another processSat Feb 12 05:55:20 2011kkjcre1p: unable to spawn jobq slave processSat Feb 12 05:55:20 2011Errors in file /oracle/app/oracle/admin/rzywk/bdump/rzywk2_cjq0_434552.trc:

Sat Feb 12 05:57:36 2011kkjcre1p: unable to spawn jobq slave processSat Feb 12 05:57:36 2011Errors in file /oracle/app/oracle/admin/rzywk/bdump/rzywk2_cjq0_434552.trc:

可以判断是oracle内部的进程锁定的,上面的内容说的很清楚,就是422264      这个进程编号,

kill -9 422264 再进入sqlplus  "/as sysdba"  可以登录了,startup 数据库成功启动。

       看来解决问题还得具体问题具体分析,不过大佬们的文档还是需要借鉴的。

附上解决此问题的其他方案和分析:

  http://www.eygle.com/archives/2006/07/sqlnet_inbound_connect_timeout.html

http://topic.csdn.net/u/20080623/17/4383143c-5bf6-462a-b7d6-ac41e50661e9.html

 

转载于:https://www.cnblogs.com/myitworld/archive/2011/06/16/2214783.html

最新回复(0)