Internals of Querying the Concurrent Requests Queue

Once upon a time I wrote about the Internal Workflow of an E-Business Suite Concurrent Manager Process. Many things have changed since that blog post, from which the most obvious change was the release of Oracle e-Business Suite R12.2. So, I decided to check if the way how the concurrent manager queues were processed by concurrent manager processes were still the same. My main goal was to see if the manager processes still don’t attempt any way of coordination to distribute the requests among them.

This is how I did the testing:

I used the VM templates provided by Oracle to build my R12.2.4 test environment. By the way, I didn’t expect that the process of getting the environment up would be so simple! Downloading the media files from (edelivery.oracle.com) was the most time-consuming step, once done - it took me just 1 hour to un-compress everything, import the Virtual Assembly file and bring up the R12.2.4 environment on my laptop.
3 Standard managers are defined by default
“Sleep seconds” were left as is = 30 seconds
“Cache size” was increased from 1 to 5.
Identified the 3 DB processes that belong to the Standard managers: select sid, serial# from v$session where module='e:FND:cp:STANDARD';
I enabled tracing with binds and waits for each of them like this: exec dbms_monitor.session_trace_enable(sid,serial#,true,true);
Once that was done I submitted one concurrent program - “Active users” and waited for it to complete.
I disabled the tracing like this: exec dbms_monitor.session_trace_disable(sid,serial#);.
Collected the trace files

I found 2 of the trace files to be very interesting. To make things more simple, the manager process “A” will be the one that executed the concurrent request, and process “B” will be the one that didn’t.

Before the “Active Users” request was submitted

No other requests were running at the time I did the testing, so I clearly observed how both Managers A and B queried the FND_CONCURRENT_REQUESTS table BOTH of the trace files displayed the same method of how requests are picked up from the queue. Note, I’m showing only the lines relevant to display the main query only, and I have formatted the query text to make it more readable

PARSING IN CURSOR #139643743645920 len=1149 dep=0 uid=100 oct=3 lid=100 tim=1460211399835915 hv=3722997734 ad='d275f750' sqlid='cd23u4zfyhvz6'
SELECT R.Rowid
FROM Fnd_Concurrent_Requests R
WHERE R.Hold_Flag                             = 'N'
AND R.Status_Code                             = 'I'
AND R.Requested_Start_Date                   <= Sysdate
AND (R.Node_Name1                            IS NULL
OR (R.Node_Name1                             IS NOT NULL
AND FND_DCP.target_node_mgr_chk(R.request_id) = 1))
AND (R.Edition_Name                          IS NULL
OR R.Edition_Name                            <= sys_context('userenv', 'current_edition_name'))
AND EXISTS
  (SELECT NULL
  FROM Fnd_Concurrent_Programs P
  WHERE P.Enabled_Flag         = 'Y'
  AND R.Program_Application_Id = P.Application_Id
  AND R.Concurrent_Program_Id  = P.Concurrent_Program_Id
  AND EXISTS
    (SELECT NULL
    FROM Fnd_Oracle_Userid O
    WHERE R.Oracle_Id = O.Oracle_Id
    AND EXISTS
      (SELECT NULL
      FROM Fnd_Conflicts_Domain C
      WHERE P.Run_Alone_Flag = C.RunAlone_Flag
      AND R.CD_Id            = C.CD_Id
      )
    )
  AND (P.Execution_Method_Code                          != 'S'
  OR (R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) IN ((0,98),(0,100),(0,31721),(0,31722),(0,31757)))
  )
AND ((R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) NOT IN ((510,40032),(510,40033),(510,42156),(510,42157),(530,43793),(530,43794),(535,42626),(535,42627),(535,42628)))
ORDER BY NVL(R.priority, 999999999),
  R.Priority_Request_ID,
  R.Request_ID
END OF STMT
EXEC #139643743645920:c=0,e=33,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211399835910
FETCH #139643743645920:c=0,e=546,p=0,cr=106,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211399836507
WAIT #139643743645920: nam='SQL*Net message to client' ela= 3 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211399836572

*** 2016-04-09 10:17:09.837
WAIT #139643743645920: nam='SQL*Net message from client' ela= 30000367 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211429836965
...
EXEC #139643743645920:c=0,e=59,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211429838767
FETCH #139643743645920:c=0,e=689,p=0,cr=106,cu=0,mis=0,r=0,dep=0,og=1,plh=3984653669,tim=1460211429839587
WAIT #139643743645920: nam='SQL*Net message to client' ela= 4 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211429839652

*** 2016-04-09 10:17:39.840
WAIT #139643743645920: nam='SQL*Net message from client' ela= 30000325 driver id=1952673792 #bytes=1 p3=0 obj#=-1 tim=1460211459840003
...

It’s important to observe that:

All manager’s processes still compete for the same requests. If the query is executed at the same time - the same list of concurrent requests will be retrieved by all processes.
The constants literals used in lines 30-32 mean that the query for checking the queue is still built when the concurrent manager process starts up. These constants are mainly used to implement the specializations rules in the query.
Only rowid for the pending requests’ rows in FND_CONCURRENT_REQUESTS are fetched.
The sleep time is clearly visible on lines 41,42 and 48,50

After the “Active Users” request was submitted - Starting the concurrent request.

The manager process A was the first to pick up the submitted requests and it could be observed by the r=1 (1 row fetched) in the FETCH call for the query we just reviewed: FETCH #139643743645920:c=0,e=437,p=0,cr=113,cu=0,mis=0,r=1,dep=0,og=1,plh=3984653669,tim=1460211519844640

Immediately after that, the manager process A locked the row in FND_CONCURRENT_REQUESTS table, this way, the request got assigned to this process. Notice the similar where predicates used in this query, these are actually required to make sure that the request is still not picked up by another manager process. However the main thing here is the fact that the request row is accessed by the rowid retrieved earlier (row 45, the value of the bind variable :reqname is AAAjnSAA/AAAyn1AAH in this case). Locking of the row is done by the FOR UPDATE OF R.status_code NoWait clause on line 49

PARSING IN CURSOR #139643743640368 len=4530 dep=0 uid=100 oct=3 lid=100 tim=1460211519864113 hv=4239777398 ad='cde86338' sqlid='6ya6bzgybbrmq'
SELECT R.Conc_Login_Id,
  R.Request_Id,
  ... excluded other 156 columns for brevity...
FROM fnd_concurrent_requests R,
  fnd_concurrent_programs P,
  fnd_application A,
  fnd_user U,
  fnd_oracle_userid O,
  fnd_conflicts_domain C,
  fnd_concurrent_queues Q,
  fnd_application A2,
  fnd_executables E,
  fnd_conc_request_arguments X
WHERE R.Status_code             = 'I'
AND (R.Edition_Name            IS NULL
OR R.Edition_Name              <= sys_context('userenv', 'current_edition_name'))
AND R.Request_ID                = X.Request_ID(+)
AND R.Program_Application_Id    = P.Application_Id(+)
AND R.Concurrent_Program_Id     = P.Concurrent_Program_Id(+)
AND R.Program_Application_Id    = A.Application_Id(+)
AND P.Executable_Application_Id = E.Application_Id(+)
AND P.Executable_Id             = E.Executable_Id(+)
AND P.Executable_Application_Id = A2.Application_Id(+)
AND R.Requested_By              = U.User_Id(+)
AND R.Cd_Id                     = C.Cd_Id(+)
AND R.Oracle_Id                 = O.Oracle_Id(+)
AND Q.Application_Id            = :q_applid
AND Q.Concurrent_Queue_Id       = :queue_id
AND (P.Enabled_Flag            IS NULL
OR P.Enabled_Flag               = 'Y')
AND R.Hold_Flag                 = 'N'
AND R.Requested_Start_Date     <= Sysdate
AND ( R.Enforce_Seriality_Flag  = 'N'
OR ( C.RunAlone_Flag            = P.Run_Alone_Flag
AND (P.Run_Alone_Flag           = 'N'
OR NOT EXISTS
  (SELECT NULL
  FROM Fnd_Concurrent_Requests Sr
  WHERE Sr.Status_Code         IN ('R', 'T')
  AND Sr.Enforce_Seriality_Flag = 'Y'
  AND Sr.CD_id                  = C.CD_Id
  ))))
AND Q.Running_Processes                                     <= Q.Max_Processes
AND R.Rowid                                                  = :reqname
AND ((P.Execution_Method_Code                               != 'S'
OR (R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID)       IN ((0,98),(0,100),(0,31721),(0,31722),(0,31757))))
AND ((R.PROGRAM_APPLICATION_ID,R.CONCURRENT_PROGRAM_ID) NOT IN ((510,40032),(510,40033),(510,42156),(510,42157),(530,43793),(530,43794),(535,42626),(535,42627),(535,42628)))
FOR UPDATE OF R.status_code NoWait

The behavior of the manager process B was a little bit more interesting. It too managed to fetch the same rowid from FND_CONCURRENT_PROCESSES table belonging to the submitted “Active Users” processes. However, when it tried to lock the row in FND_CONCURRENT_REQUESTS (By using exactly the same query), this happened:

PARSING IN CURSOR #139690311998256 len=4530 dep=0 uid=100 oct=3 lid=100 tim=1460211519900924 hv=4239777398 ad='cde86338' sqlid='6ya6bzgybbrmq'
...
BINDS #139690311998256:
...
Bind#2
  oacdty=01 mxl=32(18) mxlc=00 mal=00 scl=00 pre=00
  oacflg=20 fl2=1000001 frm=01 csi=873 siz=0 off=64
  kxsbbbfp=7f0c2f713f20  bln=32  avl=18  flg=01
  value='AAAjnSAA/AAAyn1AAH'
EXEC #139690311998256:c=1000,e=1525,p=0,cr=25,cu=1,mis=0,r=0,dep=0,og=1,plh=4044729389,tim=1460211519902727
ERROR #139690311998256:err=54 tim=1460211519902750

The query failed with ORA-00054: resource busy and acquire with NOWAIT specified or timeout expired. This is how the access to pending concurrent requests is serialized to make sure only one of the manager processes can run it. And, I think, relying on the well-tuned and highly efficient locking mechanism of Oracle Database is a very very smart idea.

Conclusions

the coordination between manager processes is still not happening to distribute the requests, but the managers all query the queue the same way and then compete between themselves to lock the requests’ entries on the table 1st. The process that gets the lock also gets to execute the concurrent request.
The cache size variable couldn’t be observed in the trace files, but as far as I remember from my previous research the process would only fetch “cache size”-number of rowids using the 1st query in this post. This could be tested by submitting larger volume of requests simultaneously.
The “sleep seconds” kicks in only when the manager process didn’t fetch any rowids from the queue. After all the cached requests are attempted/executed by the manager process, the queue is checked again immediately without waiting for the “sleep seconds” (Not explained in detail in this post, but it’s revealed in the trace files)
The DMLs used to query the FND_CONCURRENT_REQUESTS and to lock the row are very very similar to Pre-R12.2 releases of e-Business Suite (Another sign that the process hasn’t changed). (One change that I see is the addition of where clause predicates for Checking the Editions)