Self Joins In SQL Statements
Self-joins are useful in SQL select statements and are often used to identify duplicate entries or entries linked in other ways such as customers who share the same telephone number or patients who share the same address.
A self-join is an internal table (or view) join. The self-join is between fields in a table and fields within a virtual copy of the table. For instance in a hospital database the patients’ address and postcode details are kept in the table tblAddress with the identifier of the patient in the field PatientID.
The following SQL statement can be run in Query Analyzer in order for the hospital administrators to list the patients who share an address and patients who have been erroneously registered twice under different PatientID’s:
SQL statement
|
SELECT
|
A1.PatientID
|
AS
|
PatientID1,
|
A1.Address
|
AS
|
Address1,
|
A1.Postcode
|
AS
|
Postcode1
|
|
|
|
|
Alias
|
|
|
Alias
|
|
|
Alias
|
|
FROM
|
tblAddress
|
AS
|
A1
|
INNER JOIN
|
tblAddress
|
AS
|
A2
|
ON
|
A1.Postcode =A2.Postcode
|
|
|
|
Alias |
|
|
|
Alias |
|
|
|
WHERE LEFT(A1.Address,3)=LEFT(A2.Address,3) AND A1.PatientID <> A2.PatientID
|
|
ORDER BY A1.Postcode
|
Note: aliases are used to distinctly name fields or tables with the same name to clarify ambiguities.
This returns the following results set:
| PatientID1 | Address1 | Postcode1 |
| 231234 | 3 Montgomery Lane, Hartshead | AT12 7DS |
| 432122 | 3 Mongomery La, Rushington,Hartshead | AT12 7DS |
| 27923 | 21 Cystine Road, Sefton, Merseyside | L87 9ZZ |
| 3987 | 21 Cystine Rd, Liverpool | L87 9ZZ |
The Patients corresponding to these ID pairs can then be investigated to test whether they are duplicates registrations or different patients, possibly relatives, with the same address.
Self-joins can also be written as outer joins.
Pat Brownbill
|