A join combines records from two or more tables in a relational database. In the Structured Query Language (SQL), there are three types of joins: inner, outer, and cross. Outer joins are subdivided further into left outer joins, right outer joins, and full outer joins.
Mathematically, join is relation composition, the fundamental operation in relational algebra, and generalizing function composition.
| LastName | DepartmentID |
|---|---|
| Smith | 34 |
| Jones | 33 |
| Robinson | 34 |
| Jasper | 36 |
| Steinberg | 33 |
| Rafferty | 31 |
| DepartmentName | DepartmentID |
|---|---|
| Sales | 31 |
| Engineering | 33 |
| Clerical | 34 |
| Marketing | 35 |
While not used very commonly, a cross join is the foundation upon which inner joins are built. A cross join returns the cartesian product of the sets of rows from the joined tables.
The SQL code for a cross join lists the tables to be joined (FROM), but does not include any filtering predicate (WHERE).
Example cross join (ANSI 92 standard syntax): SELECT * FROM employee CROSS JOIN department;
Example cross join (alternative syntax): SELECT * FROM employee ,department;
+
As you can see the cross join does no matching of like records. These joins are occasionally used to generate all possible combinations of records from tables that do not share a common element.
An inner join essentially finds the intersection between the two tables. This is the most common type of join used, and is considered the default join type. The join example below takes all the records from table A (in this case, employee) and finds the matching record(s) from table B (department). If no match is found, the record from A is not included in the results. If multiple results are found in B that match the predicate then one row will be returned for each (the values from A will be repeated).
Special care must be taken when joining tables on columns that can be NULL since NULL values will never match each other. See Left Outer Join or Right Outer Join for a solution.
Example inner join (ANSI 92 standard syntax): SELECT * FROM employee INNER JOIN department ON employee.DepartmentID = department.DepartmentID Example inner join (non-standard syntax): SELECT * FROM employee ,department WHERE employee.DepartmentID = department.DepartmentID
Inner join result : +
A left outer join is very different from an inner join. Instead of limiting results to those in both tables, it limits results to those in the "left" table (A). This means that if the ON clause matches 0 records in B, a row in the result will still be returned—but with NULL values for each column from B.
For example, this allows us to find the employee's departments, but still show the employee even when their department is NULL or does not exist. The example above would have ignored employees in non-existent departments.
Example left outer join (ANSI 92 standard syntax): SELECT * FROM employee LEFT OUTER JOIN department ON employee.DepartmentID = department.DepartmentID
+
A right outer join is much like a left outer join, except that the tables are reversed. Every record from the right side, B or department, will be returned, and NULL values will be returned for those that have no matching record in A.
Example right outer join (ANSI 92 standard syntax): SELECT * FROM employee RIGHT OUTER JOIN department ON employee.DepartmentID = department.DepartmentID
+
A full outer join combines the results of both left and right outer joins. These joins will show records from both tables, and fill in NULLs for missing matches on either side.
Some database systems do not support this functionality, but it can be emulated through the use of left outer joins and unions (see below).
Example full outer join (ANSI 92 standard syntax): SELECT * FROM employee FULL OUTER JOIN department ON employee.DepartmentID = department.DepartmentID
+
The same example, for use on databases that do not support FULL OUTER JOIN: SELECT employee.LastName, employee.DepartmentID, department.DepartmentName, department.DepartmentID FROM employee LEFT JOIN department ON employee.DepartmentID = department.DepartmentID UNION SELECT employee.LastName, employee.DepartmentID, department.DepartmentName, department.DepartmentID FROM department LEFT JOIN employee ON employee.DepartmentID = department.DepartmentID WHERE employee.DepartmentID = NULL
The efficient implementation of joins has been the goal of much work in database systems, because joins are both extremely common and rather difficult to execute efficiently. The difficulty results from the fact that (inner) joins are both commutative and associative. In practice, this means that the user merely supplies the list of tables to be joined and the join conditions to be used, and the database system has the task of determining the most efficient way to perform the operation. Determining how to execute a query containing joins is done by the query optimizer. It has two basic freedoms:
Many join algorithms treat their input tables differently. The input tables are referred to as the outer and inner tables, or left and right, respectively. In the case of nested loops, for example, the entire inner table will be scanned for each row of the outer table .
Query plans involving joins can be classified as:
These names are derived from the appearance of the query if drawn as a tree, with the outer join relation on the left and the inner table on the right (as is the convention).
There are four fundamental algorithms to perform a join operation.
A refinement to this technique is called "block nested loops": for every block in the outer relation, the entire inner relation is scanned. For each match between the current inner tuple and one of the tuples in the current block of the outer relation, a tuple is added to the join result set. This variant means that more computation is done for each tuple of the inner relation, but far fewer scans of the inner relation are required.
This is one reason why many optimizers keep track of the sort order of query nodes — if one or both input relations to a merge join is already sorted on the join attribute, an additional sort is not required. Otherwise, the DBMS will need to perform the sort, usually using an external sort to avoid consuming too much memory.
The technique as it relates to relational databases was first discussed in a 1981 Journal of the ACM article by Philip A. Bernstein and Dah-Ming W. Chiu titled Using semi-joins to solve relational queries.
Join (SQL) | Relationale Algebra | Jointure | Алгоритм соединения (СУБД)
This article is licensed under the GNU Free Documentation License.
It uses material from the
"Join (SQL)".
Home Page • arts • business • computers • games • health • hospitals • home • kids & teens • news • physicians • recreation• reference • regional • science • shopping • society • sports • world