Thursday, March 31, 2011

SQL statement question

The table Arc(x,y) currently has the following tuples (note there are duplicates):

(1,2),
(1,2),
(2,3),
(3,4),
(3,4),
(4,1),
(4,1),
(4,1),
(4,2)

Compute the result of the query:

SELECT a1.x, a2.y, COUNT(*)
FROM Arc a1, Arc a2
WHERE a1.y = a2.x
GROUP BY a1.x, a2.y;

What are a1 and a2 referring to?

From stackoverflow
  • a1 and a2 are just aliases for the Arc table which is being joined to itself. You could also say,

    Arc As a1, Arc As a2
    

    Is that what you're asking?

    kylex : if they're aliases how does that work in the statement?
    GregD : it works because it's the same as saying Arc = "a1" and Arc = "a2"
  • It's called a self join.

    You may join a table with itself, as in your task, it will result in:

    (1, 2) (2, 3)  // note the join condition, y from the right is equal to x from the left
    (1, 2) (2, 3)
    (2, 3) (3, 4)
    (2, 3) (3, 4)
    ...
    

    etc.

    This will of course shrink after GROUP'ing.

    a1 and a2 refer to the instances of the table. From which part of the result you want the x? Use a1.x for the left part, a2.x for the right part.

  • The answer to the query is:

    create temp table arc(x int, y int);
    insert into arc values(1,2);
    insert into arc values(1,2);
    insert into arc values(2,3);
    insert into arc values(3,4);
    insert into arc values(3,4);
    insert into arc values(4,1);
    insert into arc values(4,1);
    insert into arc values(4,1);
    insert into arc values(4,2);
    
    SELECT a1.x, a2.y, COUNT(*)
    FROM Arc a1, Arc a2
    WHERE a1.y = a2.x
    GROUP BY a1.x, a2.y;
    
          4           3                1
          4           2                6
          2           4                2
          1           3                2
          3           1                6
          3           2                2
    

    This is not necessarily the answer to the question, though. As already noted, 'a1' and 'a2' are table aliases, alternative names for the Arc table so that it can be joined to itself.

    • Think of a1 and a2 as temporary copies of the original table Arc.
    • Now after making the copies you apply the conditions (WHERE clause) which say that you want to select a row in table a1 if it's value for the column y equals the value x from the table a2.
    • after finding all the matches you further group the results which means that you summarize them into groups.
    • These groups are made firstly by the value x of the first table and within that by the value y of the other table.
    • Now the COUNT(*) statement does it's work an tells you how many elements each group contains.

0 comments:

Post a Comment