Friday, September 3, 2010

Memory Leaks

The thought for this posting has been in my head for a while and then this week this post came up on the OTN XML DB Forums ... XMLTYPE.createxml - how to free memory?

I knew when working with DOMDocuments that there was a .FreeDocument() but I didn't know of a similar concept for XMLType. From the testing I did given what the poster was seeing, it looks to be a bug only when using a sys_refcursor as input when doing an XMLTYPE conversion. It could impact BFILENAME as well but I need to get around to testing that at some point.

This brought back memories from a few years back of when I stumbled across a memory leak in the code one of our clients was using on 10.2.0.3. We knew we had memory leaks somewhere in the app code because we would need to restart our drivers (sessions that ran 24x7) every few days else the higher processing volume threads would eat up memory and kill the node. This was in a RAC so we got to watch the fail-over work as intended. Not something you want to do on a regular basis in a production system. We put in a process to stop/start those threads every so often to release memory while we tried to figure out where the leak was.

At that time a task came up where I had to do something as a one time job, so I hacked together a script that used a lot of production code, tested it and finally let it loose in Production. I was using DBMS_APPLICATION_INFO to log job progress and I noticed that over time, the job would start slowing down. I could stop and restart the job and it would run fast and then start slowing down again. After a lot of research and time stripping out pieces of code and rerunning and not seeing any progress, I finally figured out the code was missing some dbms_xmldom.FreeDocument statements that the documentation tells you to use.

Ah, the cause of the memory leak. Reviewed the code, plugged those in where missing, go back to testing my script and notice no difference. Huh? I stripped the script down even further till I had something like the following.
DECLARE 
l_dom_doc dbms_xmldom.DOMDocument;

PROCEDURE p_local_open_close IS
BEGIN
l_dom_doc := dbms_xmldom.newDOMDocument;
dbms_xmldom.FreeDocument(l_dom_doc);
END p_local_open_close;

BEGIN
DBMS_APPLICATION_INFO.SET_MODULE(module_name => 'user_domDocument_mem_leak_test',
action_name => 'SLEEPING');
dbms_lock.sleep(20);
/* Sleeps so have time to find OS PID via
select s.sid, s.process, s.serial#, s.module, s.action, p.spid os_pid
from gv$session s,
gv$process p
where s.module = 'user_domDocument_mem_leak_test'
and s.paddr = p.addr;
*/
dbms_application_info.set_action('RUNNING');
FOR i IN 1..700000 -- adjust as needed for your system
LOOP
p_local_open_close;
END LOOP;
dbms_application_info.set_action('COMPLETE');

END;

Running that on both 10.2.0.1 (our Development at the time) and 10.2.0.3 (Test/Production) would result in memory usage growing consistently while the script was running. We had our client open a SR with Oracle and they did confirm it was a bug. At some point later, Oracle provided the patch to fix this bug and all was good. I think this fix was included in the .4 release but I've lost the bug number for reference purposes. I think it stemmed from some of the JAVA used under the covers by dbms_xmldom based on what I learned from Oracle's responses regarding .freeDocument on this SR and another I had open with them regarding side effects.

One other interesting thing we saw is that the Development ran on RedHat Linux and Test/Production on SunOS. When the script finished running in Dev, the amount of memory that was used would be freed up even though the session/thread was still alive. On SunOS, we had to kill the session/thread before the memory would free up. I was always interested in whether that was caused by the .1 and .3 difference or the OS difference but we never had the opportunity to sync the version up until both went to .4

No comments:

Post a Comment