SQL Cookbook

Anthony Molinaro

Mentioned 3

A guide to SQL covers such topics as retrieving records, metadata queries, working with strings, data arithmetic, date manipulation, reporting and warehousing, and hierarchical queries.

More on Amazon.com

Mentioned in questions and answers.

Let's say my table structure looks something like this:

CREATE TABLE [dbo].[table1] (
    [id] [int] IDENTITY(1,1) NOT NULL,
    [data] [varchar](255) NOT NULL,
    CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED ([id] ASC)
)

CREATE TABLE [dbo].[table2] (
    [id] [int] IDENTITY(1,1) NOT NULL,
    [table1_id] [int] NOT NULL,
    [data] [varchar](255) NOT NULL,
    CONSTRAINT [PK_table2] PRIMARY KEY CLUSTERED ([id] ASC)
)

The [id] field of the first table corresponds to the [table1_id] field of the second. What I would like to do is insert data into both tables in a single transaction. Now I already know how to do this by doing INSERT-SELECT-INSERT, like this:

BEGIN TRANSACTION;
DECLARE @id [int];
INSERT INTO [table1] ([data]) VALUES ('row 1');
SELECT @id = SCOPE_IDENTITY();
INSERT INTO [table2] ([table1_id], [data]) VALUES (@id, 'more of row 1');
COMMIT TRANSACTION;

That's all good and fine for small cases like that where you're only inserting maybe a handful of rows. But what I need to do is insert a couple hundred thousand rows, or possibly even a million rows, all at once. The data is coming from another table, so if I was only inserting it into a single table, it would be easy, I'd just have to do this:

INSERT INTO [table] ([data])
SELECT [data] FROM [external_table];

But how would I do this and split the data into [table1] and [table2], and still update [table2] with the appropriate [table1_id] as I'm doing it? Is that even possible?

Keep a look out for SQL Server to support the 'INSERT ALL' Statement. Oracle has it already, it looks like this (SQL Cookbook):

insert all
  when loc in ('NEW YORK', 'BOSTON') THEN
   into dept_east(deptno, dname, loc) values(deptno, dname, loc)
  when loc in ('CHICAGO') THEN
   into dept_mid(deptno, dname, loc) values(deptno, dname, loc)
  else
   into dept_west(deptno, dname, loc) values(deptno, dname, loc)
select deptno, dname, loc
  from dept

I am using Oracle 10g and the following paradigm to get a page of 15 results as a time (so that when the user is looking at page 2 of a search result, they see records 16-30).

select * 
  from 
( select rownum rnum, a.*
    from (my_query) a
   where rownum <= 30 )
where rnum > 15;

Right now I'm having to run a separate SQL statement to do a "select count" on "my_query" in order to get the total number of results for my_query (so that I can show it to the user and use it to figure out total number of pages, etc).

Is there any way to get the total number of results without doing this via a second query, i.e. by getting it from above query? I've tried adding "max(rownum)", but it doesn't seem to work (I get an error [ORA-01747] that seems to indicate it doesnt like me having the keyword rownum in the group by).

My rationale for wanting to get this from the original query rather than doing it in a separate SQL statement is that "my_query" is an expensive query so I'd rather not run it twice (once to get the count, and once to get the page of data) if I dont have to; but whatever solution I can come up with to get the number of results from within a single query (and at the same time get the page of data I need) should not add much if any additional overhead, if possible. Please advise.

Here is exactly what I'm trying to do for which I receive an ORA-01747 error because I believe it doesnt like me having ROWNUM in the group by. Note, If there is another solution that doesnt use max(ROWNUM), but something else, that is perfectly fine too. This solution was my first thought as to what might work.

 SELECT * FROM (SELECT r.*, ROWNUM RNUM, max(ROWNUM)
 FROM (SELECT t0.ABC_SEQ_ID AS c0, t0.FIRST_NAME, t0.LAST_NAME, t1.SCORE
 FROM ABC t0, XYZ t1
 WHERE (t0.XYZ_ID = 751) AND 
 t0.XYZ_ID = t1.XYZ_ID 
 ORDER BY t0.RANK ASC) r WHERE ROWNUM <= 30 GROUP BY r.*, ROWNUM) WHERE RNUM > 15

--------- EDIT -------- Note, based on the first comment I tried the following that appears to work. I dont know how well it performs versus other solutions though (I'm looking for the solution that fufills my requirement but performs the best). For example, when I run this it takes 16 seconds. When I take out the COUNT(*) OVER () RESULT_COUNT it takes just 7 seconds:

    SELECT * FROM (SELECT r.*, ROWNUM RNUM, ) 
    FROM (SELECT COUNT(*) OVER () RESULT_COUNT, 
          t0.ABC_SEQ_ID AS c0, t0.FIRST_NAME, t1.SCORE 
    FROM ABC t0, XYZ t1 
    WHERE (t0.XYZ_ID = 751) AND t0.XYZ_ID = t1.XYZ_ID 
    ORDER BY t0.RANK ASC) r WHERE ROWNUM <= 30) WHERE RNUM > 1

The explain plan changes from doing a SORT (ORDER BY STOP KEY) to do a WINDOW (SORT).

Before:

SELECT STATEMENT () 
 COUNT (STOPKEY)    
  VIEW ()   
   SORT (ORDER BY STOPKEY)  
    NESTED LOOPS () 
     TABLE ACCESS (BY INDEX ROWID)  XYZ
      INDEX (UNIQUE SCAN)   XYZ_ID
     TABLE ACCESS (FULL)    ABC

After:

SELECT STATEMENT () 
 COUNT (STOPKEY)    
  VIEW ()   
   WINDOW (SORT)    
    NESTED LOOPS () 
     TABLE ACCESS (BY INDEX ROWID)  XYZ
      INDEX (UNIQUE SCAN)   XYZ_ID
     TABLE ACCESS (FULL)    ABC

I think you have to modify your query to something like this to get all the information you want on a "single" query.

SELECT *
FROM (SELECT r.*, ROWNUM RNUM, COUNT(*) OVER () RESULT_COUNT 
      FROM (SELECT t0.ABC_SEQ_ID AS c0, t0.FIRST_NAME, t1.SCORE
            FROM ABC t0, XYZ t1
            WHERE (t0.XYZ_ID = 751) 
            AND t0.XYZ_ID = t1.XYZ_ID 
            ORDER BY t0.RANK ASC) R)
WHERE RNUM between 1 and 15 

The reason is that the COUNT(*) OVER() window function gets evaluated after the WHERE clause, hence not giving the total count of records but the count of records that satisfy the ROWNUM <= 30 condition.

If you cannot accept the performance ot this query, or of executing 2 separate queries, maybe you should think about a solution like the one proposed by FrustratedWithFormsDesigner in his/her comment about caching the count of records.

If you work with databases on a regular basis I recommend you get a copy of SQL Cookbook. It is an exceptional book with lots of useful tips.

I have such a table:

 name value1 value2 value3
 ---------------------------
 name1 1       1       1
 name2 1       1       2
 name3 2       2       11
 name4 2      12       2
 name5 3       3       8
 name6 3       3       2

what I need is such a result:

 name value1 value2 value3
 ---------------------------
 name2 1       1       2
 name4 2      12       2
 name5 3       3       8

I.e:

  1. Exact 1 enty for each group of value1;
  2. This entry must have the max value of value2 in this group of value1;
  3. This entry must have the max value of value3 in the group of value2.

After having searched in Internet, I have got a solution by using scalar subquery in SELECT list as a single column, but it is very ugly and complex, since the same sub query must be run for each of the column value1, value2, value3.

SQL Cookbook mentions a solution in recipe 14.10 via defining a type as object, but i prefer a solution in a single SELECT statement.

Any easy way?

Analytics are your friend in this case:

SQL> CREATE TABLE t (NAME VARCHAR2(32), v1 INTEGER, v2 INTEGER, v3 INTEGER);

Table created
SQL> INSERT INTO t VALUES ('name1',1,1,1);

1 row inserted
SQL> INSERT INTO t VALUES ('name2',1,1,2);

1 row inserted
SQL> INSERT INTO t VALUES ('name3',2,2,11);

1 row inserted
SQL> INSERT INTO t VALUES ('name4',2,12,2);

1 row inserted
SQL> INSERT INTO t VALUES ('name5',3,3,8);

1 row inserted
SQL> INSERT INTO t VALUES ('name6',3,3,2);

1 row inserted
SQL> SELECT NAME, v1, v2, v3
       FROM (SELECT NAME, v1, v2, v3
                  , MAX(v2) OVER(PARTITION BY v1) mv2
                  , MAX(v3) OVER(PARTITION BY v1,v2) mv3
               FROM t)
      WHERE v2 = mv2
        AND v3 = mv3
      ORDER BY v1;

NAME   V1 V2 V3
------ -- -- --
name2   1  1  2
name4   2 12  2
name5   3  3  8