I'm maintaining some stored procedures in an Oracle 11g RAC environment that are part of the pipeline of steps for data-interchange with other systems. One in particular has been producing duplicate records. Files get loaded into a set of staging-tables by an ETL job, which then calls this stored procedure, which has no parameters (except an OUT parameter that gets ignored by the calling ETL code, but that's another issue). The body of the procedure starts with a statement like
For I in ( SELECT * from SOME_STAGING_TABLE where status = 'NEW' order by X_ID) LOOP
However, the ETL job that calls this is in turn called by another ETL job that finds all "files" that haven't been processed and calls it, once per file, in parallel. I think what's happening is:
- Master ETL job runs, finds two files, invokes two worker jobs
- Worker ETL job A runs, gets through first phase
- Worker ETL job B runs, gets through first phase
- Worker ETL job A invokes stored procedure
- Stored procedure A' gets two files' worth of records to process
- Worker ETL job B invokes stored procedure
- Stored procedure B' gets two files' worth of records to process
One solution might be to enforce serialization in the ETL service. Another might be to change the calling-convention of the stored procedure so it takes a file ID as a parmeter. However, let's say I need to do it in the database. Is there any way to set up a stored procedure so that only one connection can be executing a designated block of code inside it at a time?
(In C or Java this would be called a "mutex" or "synchronized block", and I know MySQL has a "LOCK TABLES" statement that could be used for that purpose; does Oracle 11g have anything like that?)
3 Answers 3
Oracle has several locking mechanisms that you can use to serialize operations:
- You could create a single-row table that is locked at the beginning of the procedure. The lock will be locked until the transaction ends (commit/rollback).
- You could request a lock with
DBMS_LOCK.request
. This lock can be maintained for the duration of the session or until you callDBMS_LOCK.release
. This lock can be maintained across commits.
Example of serialization using DBMS_LOCK
:
SQL> CREATE OR REPLACE PROCEDURE serial IS
2 l_lock_handle VARCHAR2(128 BYTE);
3 l_lock_request INTEGER;
4 BEGIN
5 dbms_lock.allocate_unique(lockname=> 'MY_SERIAL_PROC', lockhandle => l_lock_handle);
6 LOOP
7 l_lock_request := dbms_lock.request(lockhandle => l_lock_handle,
8 timeout => 5,
9 release_on_commit => FALSE);
10 CASE l_lock_request
11 WHEN 0 THEN
12 EXIT; -- success
13 WHEN 1 THEN
14 dbms_output.put_line('lock already reserved, wait...');
15 dbms_lock.sleep(5); -- sleep 5 seconds before retrying
16 ELSE
17 raise_application_error(-20001, 'Lock error: ' || l_lock_request);
18 END CASE; --
19 END LOOP;
20 ----------------------------------------
21 -- serialized block of code --
22 -- (lock will be kept accross commit) --
23 ----------------------------------------
24 dbms_lock.sleep(30);
25 ----------------------------------------
26 -- End of serialized code --
27 ----------------------------------------
28 l_lock_request := dbms_lock.release(lockhandle => l_lock_handle);
29 END;
30 /
Procedure created
Only one session can run this procedure at the same time:
session1> exec serial;
session2> exec serial;
lock already reserved, wait...
lock already reserved, wait...
lock already reserved, wait...
PL/SQL procedure successfully completed
PL/SQL procedure successfully completed.
Looks like the standard way to do this on Oracle is the DBMS_LOCK package, as discussed in a previous question. Full documentation is available from Oracle.
By using the DBMS_LOCK, we cannot guarantee that only one session will pass through. I have observed that when multiple sessions request for a lock simultaneously, there is a greater chance that more than one session will get into the procedure.
One full proof method to allow only one worker to get into is that - in the beginning, run a query on fnd_concurrent_requests to identify the MIN(request_id) for the concurrent_program_id which are currently running, and set an acquire_lock flag for only this minimum request_id.
As all the requests will be triggered sequentially, therefore only one request will have this acquire_lock flag as Y.