Huge performance gains from using arrays and bulk collects.

Consider a simple select which I ran in SQL*Plus connected as SYSTEM:

select * from dba_users;

I ran a trace on the session and did a tkprof on the result.
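
(If you want to reproduce this, a minimal recipe is to switch on SQL trace for the session, run the statement, and then run tkprof on the resulting trace file. The trace file name below is a placeholder; the actual name and location depend on your version and diagnostic settings.)

alter session set sql_trace = true;
select * from dba_users;
alter session set sql_trace = false;

Then at the OS prompt:

$ tkprof my_session_trace.trc my_session_report.txt

The relevant part of the tkprof report: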

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.29       0.76        201        205          0           0
Execute      1      0.00       0.00          0          0          0           0
Fetch     1353      0.99       1.79       1303       4536          0       20276

What do we notice about these stats? The Parse count is 1, which is expected, and it will still be 1 if I run that select again. See my previous post, where I talked about bind variables and why repeated parsing, especially hard parsing, is bad for performance. The Execute count is also 1, which is expected, and you want to keep it at 1 per select as far as possible. I’ve seen applications do exotic selects with rowids in cursors where the Execute column shows the same number as the Fetch column, which is about the most inefficient way to get your data back.

If you had run ‘set arraysize 2’ in SQL*Plus (ARRAY is an accepted abbreviation of ARRAYSIZE), the Fetch stats would have looked like this:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Fetch    10134      1.35       1.88        676      12896          0       20266

Funnily enough, when I then tried ‘set array 1’, Oracle ignored the setting and behaved as if the value were 2. Somewhere in the recesses of the Oracle core there must be an AI intelligent enough to question and compensate for the obvious stupidity of getting back one row per fetch!

By running a simple select at the SQL prompt in SQL*Plus, we had Oracle create an implicit cursor, which you can see by looking at the raw trace file before it is run through tkprof.

PARSING IN CURSOR #18446744071486994888 len=23 dep=0 uid=5 oct=3 lid=5 tim=11879507487109 hv=665675061 ad='47520d368' sqlid='ckyh1gcmuut9p'
select * from dba_users
END OF STMT

Now, why were only 1353 fetches done for the 20276 rows returned? Implicit cursors use the client tool’s array size, and the SQL*Plus default for ARRAYSIZE is 15. The arithmetic bears this out: the numbers suggest SQL*Plus fetches a single row first and then ARRAYSIZE rows at a time, so 1 + ceil(20275/15) = 1353 fetches here, and 1 + ceil(20265/2) = 10134 in the arraysize 2 run above.
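
You can check and change this client-side setting at the SQL*Plus prompt:

SQL> show arraysize
arraysize 15
SQL> set arraysize 100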

Let’s write an explicit cursor via an anonymous block to see what Oracle does.

SET SERVEROUTPUT ON
declare
  cursor c1 is
    select * from v$session;
begin
  for c in c1 loop   -- cursor FOR loop: Oracle opens, fetches and closes implicitly
    null;            -- no processing; we only care about the fetch behaviour
  end loop;
end;
/

Checking the tkprof result:

call     count       cpu    elapsed       disk      query    current        rows
------- ------  -------- ---------- ---------- ---------- ----------  ----------
Parse        1      0.01       0.01          0          0          0           0
Execute      1      0.00       0.00          0          0          0           1
Fetch        0      0.00       0.00          0          0          0           0
------- ------  -------- ---------- ---------- ---------- ----------  ----------

That’s funny… why is there nothing in the Fetch row? Oracle’s documentation states that ‘Fetches are only performed for SELECT statements’. Furthermore, as shown in a previous post, the anonymous block itself is parsed and executed in the session being traced, so it shows values for Parse and Execute; the “select * from v$session” runs on the server as the cursor c1 that we defined.

So let’s go back to our trusty V$SQL, which shows the statements still cached in the shared pool:

select 
  SQL_TEXT, PARSE_CALLS, EXECUTIONS, FETCHES
from V$SQL
where upper(sql_text) like '%V$SESSION%';

SQL_TEXT                  PARSE_CALLS   EXECUTIONS   FETCHES
SELECT * FROM V$SESSION   1             1            7

That’s interesting… Fetches=7. The select returned 696 rows, so either the array size has effectively been set to 100 (seven fetches of up to 100 rows each), or Oracle has performed an implicit bulk collect, as stated by PL/SQL guru Steven Feuerstein here:
Oracle Database automatically optimizes cursor FOR loops to perform similarly to BULK COLLECT queries… and fetch 100 rows at a time.
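
Worth noting: as far as I can tell, this automatic optimization only kicks in when the PL/SQL optimizer level is at its default of 2 or higher, which you can verify with either of these:

show parameter plsql_optimize_level

select name, value from v$parameter where name = 'plsql_optimize_level';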

The Oracle documentation for SET ARRAYSIZE states that “Values over approximately 100 provide little added performance.” So how can we use much larger “arrays” and get the full performance benefit of collecting a large number of rows into a memory structure? This is where BULK COLLECT and FORALL come in.

Consider this code, which shows the basic usage of BULK COLLECT. I am not going to exhaustively list all the bulk options, such as SQL%BULK_EXCEPTIONS, as that is the purpose of the documentation, and I don’t want to sidetrack from the point I am making about huge performance gains. However, the LIMIT clause is worth mentioning: it caps the number of rows returned by each bulk fetch (see the batched variant after the example below).

declare
  TYPE ARRAY IS TABLE OF dba_objects%ROWTYPE;
  obj_data ARRAY;
  CURSOR c1 IS
    SELECT * FROM dba_objects;
begin
  OPEN c1;
  LOOP
    -- with no LIMIT clause, the whole result set arrives in a single bulk fetch
    FETCH c1 BULK COLLECT INTO obj_data;
    EXIT WHEN c1%NOTFOUND;
  END LOOP;
  CLOSE c1;
end;
/
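
Since I mentioned LIMIT: collecting an unbounded result set in one go can consume a lot of PGA memory, so for very large tables a batched variant is safer. Here is a minimal sketch (the batch size of 1000 is an arbitrary choice). Note that the exit test changes: with LIMIT, c1%NOTFOUND becomes true on the final partial batch and would cause it to be skipped, so we test the collection count instead.

declare
  TYPE ARRAY IS TABLE OF dba_objects%ROWTYPE;
  obj_data ARRAY;
  CURSOR c1 IS
    SELECT * FROM dba_objects;
begin
  OPEN c1;
  LOOP
    FETCH c1 BULK COLLECT INTO obj_data LIMIT 1000;  -- at most 1000 rows per fetch
    EXIT WHEN obj_data.COUNT = 0;                    -- safer than c1%NOTFOUND here
    -- process obj_data here
  END LOOP;
  CLOSE c1;
end;
/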

Running the first (no-LIMIT) version, the trace file shows the block itself, but for the SELECT’s statistics we check V$SQL:

select 
  SQL_TEXT, PARSE_CALLS, EXECUTIONS, FETCHES, BUFFER_GETS
from V$SQL
where upper(sql_text) like '%DBA_OBJECTS%';

SQL_TEXT                    PARSE_CALLS   EXECUTIONS   FETCHES   BUFFER_GETS 
SELECT * FROM DBA_OBJECTS   1             1            1         3100

Lo and behold, we get the awesome result of Fetches=1 for a select of over 100,000 rows! Why is this so significant compared to multiple fetches? If the application is client/server, every fetch is an extra network round trip between the client (thin or fat) and the database server. If the SQL runs on the server, every fetch is an extra context switch between the PL/SQL and SQL engines. In both cases, eliminating fetches can bring huge performance benefits on busy production servers that execute thousands or even millions of queries per hour. Have you ever wondered about those innocuous “idle” SQL*Net wait events? Many of them can be the result of not using arrays and bulk collects efficiently: each one represents an extra network round trip, wasting CPU and I/O resources that could have served the next process in the run queue.

Now, could the bulk mechanism be used for DML (insert, update, delete)? Sure: this is where the FORALL statement comes in. In the example above I could have added this code after the FETCH:

FORALL i IN 1..obj_data.COUNT
  INSERT INTO some_table VALUES obj_data(i);

This writes the entire array in one call, and therefore one round trip or context switch, giving the same benefit for DML that BULK COLLECT gave us for the query.
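
Putting it all together, a minimal end-to-end sketch might look like this. Here some_table is a hypothetical staging table with the same shape as dba_objects, created for example with CREATE TABLE some_table AS SELECT * FROM dba_objects WHERE 1=0.

declare
  TYPE ARRAY IS TABLE OF dba_objects%ROWTYPE;
  obj_data ARRAY;
  CURSOR c1 IS
    SELECT * FROM dba_objects;
begin
  OPEN c1;
  LOOP
    FETCH c1 BULK COLLECT INTO obj_data LIMIT 1000;
    EXIT WHEN obj_data.COUNT = 0;
    -- one FORALL statement binds the whole batch to the SQL engine in a single call
    FORALL i IN 1..obj_data.COUNT
      INSERT INTO some_table VALUES obj_data(i);
  END LOOP;
  CLOSE c1;
  COMMIT;
end;
/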

Note that BULK COLLECT can also be used with dynamic SQL in the EXECUTE IMMEDIATE statement, although, as stated in a previous post, it is usually better to limit the use of dynamic SQL in busy OLTP production environments.
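
For completeness, here is a minimal sketch of the dynamic form:

declare
  TYPE ARRAY IS TABLE OF dba_users%ROWTYPE;
  usr_data ARRAY;
begin
  -- the whole result set is bulk-collected in a single call
  EXECUTE IMMEDIATE 'select * from dba_users' BULK COLLECT INTO usr_data;
end;
/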

Array usage in some other development tools

We’ve already seen the use of ARRAYSIZE in SQL*Plus. Array fetching can also be configured in Oracle Forms.

Pro*C can use pre-fetching transparently with:

$ proc .... PREFETCH=200

This can also be set at compilation time with ANSI C and C++ and overrides the usage of CURSIZE within the C code.

Using the property weblogic.oci.cacheRows, WebLogic JDBC implements array fetching through its Row Caching feature.

“Magic” is the RAD tool we’ve been using, and arrays can be set at table level there. Every development language worth its salt should be able to use arrays, and if its Oracle drivers are properly written, the relevant OCI calls can be made for arrays and bulk collects.

Conclusion
If you want to experience a massive performance gain in both your DSS and OLTP production environments, make complete and liberal use of arrays and bulk-collect mechanisms in your code.