Posts Tagged ‘query optimisation’

Inner And Outer Joins In SQL Statements

When joining two tables in a SQL database it is necessary to consider what type of join is required.

There are 4 possible ways in which these tables can be joined; Inner Join, Left Join, Right Join and Full Join (all of which may give different outcomes). Left, Right and Full Joins are all outer Joins. The mystery of outer joins often puts people off them but this article seeks to explain left and right joins with a set of simple examples.

The following examples use two tables in a hospital database. The patient table tblPatient (11 records) contains a code (PatientDisease) for their disease and the codelist or lookup table tlkpDisease (12 records) contains a list of disease codes (DiseaseID) and their meaning in words (DiseaseText).

1. Inner Join

This is where only those records from the table on the Left of the Join that match those from the table on the Right of the Join are returned.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient INNER JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

Note: The word Inner may be omitted as Inner joins are assumed by default.

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
835001 32299 32299 Meningitis
831001 49399 49399 Asthma

Note: Only those 9 patients who have a patientdisease which is in the disease codelist table are returned in this results set.

2. Left Join

This is where all records from the table on the Left of the Join are returned and only those that match from the table on the Right of the Join.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient LEFT JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
845001 57 NULL NULL
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
835001 32299 32299 Meningitis
836001 33699 NULL NULL
831001 49399 49399 Asthma

Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.

3. Right Join

This is where all records from the table on the Right of the Join are returned and only those that match from the table on the Left of the Join are returned.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient RIGHT JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
NULL NULL 29909 Infantile autism
835001 32299 32299 Meningitis
NULL NULL 35109 Bell's palsy
831001 49399 49399 Asthma
NULL NULL 74921 Cleft palate

Note: All 12 diseases from the disease codelist table are returned in the results set whether or not there is a patient with that disease in the patient table.

4. Full Join

This is where all records from the table on the Left of the Join are returned and all those from the table on the Right of the Join are returned.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient FULL JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
845001 57 NULL NULL
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
NULL NULL 29909 Infantile autism
835001 32299 32299 Meningitis
836001 33699 NULL NULL
NULL NULL 35109 Bell's palsy
831001 49399 49399 Asthma
NULL NULL 74921 Cleft palate

Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.

Summary

The above SQL statement returned four different results sets using the same join criteria on the same tables with different join types.

Microsoft Access query optimisation

Everyone wants the performance of their database to be optimal. In particular, there is often a requirement for a specific query or object that is query based, to run faster.

The performance of a query is affected by the tables or queries that underly the query and by the complexity of the query. Table based forms are faster than query based forms and attached tables are slower than integral tables. Sometimes it may be wise to import rather than to attach frequently accessed foreign tables.

Access uses Rushmore techniques borrowed from FoxPro, to automatically optimise queries that contain two or more indexes. (Rushmore supports some, but not all foreign formats of attached table. For example, it supports FoxPro but not Btrieve. As Rushmore is an automatic process there is no need to understand it except to remember to index two or more query fields for all but the simplest of queries.)

Here are some of the methods that Advent IT use to optimise the speed of Access queries:

  • Display the minimum number of fields in a query. Set criteria dependant fields that are not required in the dynaset to "not shown".
  • Index all restriction based fields, all fields included in expressions, all sorted fields and all join fields.
  • Use primary keys or unique indexes wherever possible.
  • Use numeric rather than text primary keys.
  • Use non blank unique fields.
  • Avoid the use of IIf() function in queries.
  • Avoid domain aggregate functions such as Dlookup().
  • Make careful use of Between and Equal to, rather than > or < speeds up queries.
  • Use fixed column headings in Crosstab queries.
  • For reports based on queries use Portrait view in preference to Landscape and select Fast Laser Printing to Yes (View,Options,Other Properties).
  • Use Make table queries for running reports on static data. These are called snapshot reports.
  • Use Count (*) rather than Count(Column).
  • When creating restrictions on a joined column in one-to-many relationships, test out the comparative performance when placing the restriction on the "one" side or the "many" side. The "one" side is not always the fastest — the "many" may have markedly fewer records.
  • Short table and field names run faster than long names.
  • Normalise tables — join strategies execute more quickly on smaller tables.
  • Denormalise tables — reduce the number of joins. Get the balance right between normalisation and denormalisation by experiment.
  • Avoid the use of Distinct Row queries — Union queries do not need the distinct row feature as they are automatically return unique fields unless set to Union All.