Inner And Outer Joins In SQL Statements
When joining two tables in a SQL database it is necessary to consider what type of join is required.
There are four ways in which these tables can be joined: Inner Join, Left Join, Right Join and Full Join, each of which may give a different outcome. Left, Right and Full Joins are all outer joins. The mystery of outer joins often puts people off them, but this article seeks to explain each join type with a set of simple examples.
The following examples use two tables in a hospital database. The patient table tblPatient (11 records) contains a code (PatientDisease) for each patient's disease, and the codelist or lookup table tlkpDisease (12 records) contains a list of disease codes (DiseaseID) and their meaning in words (DiseaseText).
1. Inner Join
An inner join returns only those records from the table on the left of the join that have a matching record in the table on the right of the join.
SQL statement
| SELECT | IndexNo, PatientDisease, DiseaseID, DiseaseText | | | |
| FROM | tblPatient | INNER JOIN | tlkpDisease | ON PatientDisease = DiseaseID |
| | (left table) | (join) | (right table) | (join criteria) |
Note: In most SQL dialects the word INNER may be omitted, as a plain JOIN is treated as an inner join by default.
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| 835001 | 32299 | 32299 | Meningitis |
| 831001 | 49399 | 49399 | Asthma |
Note: Only the 9 patient records whose PatientDisease code appears in the disease codelist table are returned in this results set.
2. Left Join
A left join returns all records from the table on the left of the join, together with only those records from the table on the right of the join that match them.
SQL statement
| SELECT | IndexNo, PatientDisease, DiseaseID, DiseaseText | | | |
| FROM | tblPatient | LEFT JOIN | tlkpDisease | ON PatientDisease = DiseaseID |
| | (left table) | (join) | (right table) | (join criteria) |
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 845001 | 57 | NULL | NULL |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| 835001 | 32299 | 32299 | Meningitis |
| 836001 | 33699 | NULL | NULL |
| 831001 | 49399 | 49399 | Asthma |
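Note: All 11 patients from the patient table are returned in the results set whether or not their disease code is in the disease codelist table. The two patients whose codes (57 and 33699) have no match show NULL in the DiseaseID and DiseaseText columns.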
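The left join above can be tried out with a short script. This is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for the hospital database; the rows are transcribed from the results sets in this article, while the column types and the `build_db` helper are assumptions made for illustration. The second query adds a restriction on NULL to list only the patients whose disease code is missing from the codelist.

```python
import sqlite3

# Rows transcribed from the results sets shown in this article.
PATIENTS = [
    (17001, 30), (840001, 37), (841001, 50), (845001, 57),
    (831001, 58), (846001, 60), (840001, 62), (838001, 24399),
    (835001, 32299), (836001, 33699), (831001, 49399),
]
DISEASES = [
    (30, "Leukaemia"), (37, "Breast cancer"), (50, "Carcinoma"),
    (58, "Other neoplasm"), (60, "Epilepsy"), (62, "Allergic condition"),
    (24399, "Thyroid insufficiency"), (29909, "Infantile autism"),
    (32299, "Meningitis"), (35109, "Bell's palsy"), (49399, "Asthma"),
    (74921, "Cleft palate"),
]


def build_db():
    """Create an in-memory database (column types are assumed)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tblPatient (IndexNo INTEGER, PatientDisease INTEGER)"
    )
    conn.execute(
        "CREATE TABLE tlkpDisease "
        "(DiseaseID INTEGER PRIMARY KEY, DiseaseText TEXT)"
    )
    conn.executemany("INSERT INTO tblPatient VALUES (?, ?)", PATIENTS)
    conn.executemany("INSERT INTO tlkpDisease VALUES (?, ?)", DISEASES)
    return conn


conn = build_db()

# Every patient, with disease details where a matching code exists.
left_rows = conn.execute(
    "SELECT IndexNo, PatientDisease, DiseaseID, DiseaseText "
    "FROM tblPatient LEFT JOIN tlkpDisease ON PatientDisease = DiseaseID "
    "ORDER BY PatientDisease"
).fetchall()

# Only the patients whose code has no match in the codelist.
unmatched = conn.execute(
    "SELECT IndexNo, PatientDisease "
    "FROM tblPatient LEFT JOIN tlkpDisease ON PatientDisease = DiseaseID "
    "WHERE DiseaseID IS NULL "
    "ORDER BY PatientDisease"
).fetchall()

print(len(left_rows))  # 11
print(unmatched)       # [(845001, 57), (836001, 33699)]
```

SQLite NULLs come back as `None` in Python, so the unmatched rows of `left_rows` end in `None, None`.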
3. Right Join
A right join returns all records from the table on the right of the join, together with only those records from the table on the left of the join that match them.
SQL statement
| SELECT | IndexNo, PatientDisease, DiseaseID, DiseaseText | | | |
| FROM | tblPatient | RIGHT JOIN | tlkpDisease | ON PatientDisease = DiseaseID |
| | (left table) | (join) | (right table) | (join criteria) |
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| NULL | NULL | 29909 | Infantile autism |
| 835001 | 32299 | 32299 | Meningitis |
| NULL | NULL | 35109 | Bell's palsy |
| 831001 | 49399 | 49399 | Asthma |
| NULL | NULL | 74921 | Cleft palate |
Note: All 12 diseases from the disease codelist table are returned in the results set whether or not there is a patient with that disease in the patient table.
4. Full Join
This is where all records from the table on the Left of the Join are returned and all those from the table on the Right of the Join are returned.
SQL statement
| SELECT | IndexNo, PatientDisease, DiseaseID, DiseaseText | | | |
| FROM | tblPatient | FULL JOIN | tlkpDisease | ON PatientDisease = DiseaseID |
| | (left table) | (join) | (right table) | (join criteria) |
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 845001 | 57 | NULL | NULL |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| NULL | NULL | 29909 | Infantile autism |
| 835001 | 32299 | 32299 | Meningitis |
| 836001 | 33699 | NULL | NULL |
| NULL | NULL | 35109 | Bell's palsy |
| 831001 | 49399 | 49399 | Asthma |
| NULL | NULL | 74921 | Cleft palate |
Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.
Summary
The four SQL statements above returned four different results sets (9, 11, 12 and 14 rows) using the same join criteria on the same tables. Only the join type differed.
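The row counts of the four join types can be checked side by side. The sketch below again assumes Python's sqlite3 module and rows transcribed from this article, with assumed column types. Because older SQLite releases do not accept RIGHT JOIN or FULL JOIN, the right join is written here as a left join with the tables swapped, and the full join as a left join plus the disease-only rows. These rewrites are a substitution for illustration, not the statements used above.

```python
import sqlite3

# Rows transcribed from the results sets shown in this article.
PATIENTS = [
    (17001, 30), (840001, 37), (841001, 50), (845001, 57),
    (831001, 58), (846001, 60), (840001, 62), (838001, 24399),
    (835001, 32299), (836001, 33699), (831001, 49399),
]
DISEASES = [
    (30, "Leukaemia"), (37, "Breast cancer"), (50, "Carcinoma"),
    (58, "Other neoplasm"), (60, "Epilepsy"), (62, "Allergic condition"),
    (24399, "Thyroid insufficiency"), (29909, "Infantile autism"),
    (32299, "Meningitis"), (35109, "Bell's palsy"), (49399, "Asthma"),
    (74921, "Cleft palate"),
]

conn = sqlite3.connect(":memory:")
# Column types are assumptions; the article does not state them.
conn.execute("CREATE TABLE tblPatient (IndexNo INTEGER, PatientDisease INTEGER)")
conn.execute(
    "CREATE TABLE tlkpDisease (DiseaseID INTEGER PRIMARY KEY, DiseaseText TEXT)"
)
conn.executemany("INSERT INTO tblPatient VALUES (?, ?)", PATIENTS)
conn.executemany("INSERT INTO tlkpDisease VALUES (?, ?)", DISEASES)

COLS = "SELECT IndexNo, PatientDisease, DiseaseID, DiseaseText "
ON = "ON PatientDisease = DiseaseID "

QUERIES = {
    "inner": COLS + "FROM tblPatient INNER JOIN tlkpDisease " + ON,
    "left": COLS + "FROM tblPatient LEFT JOIN tlkpDisease " + ON,
    # Right join expressed as a left join with the tables swapped.
    "right": COLS + "FROM tlkpDisease LEFT JOIN tblPatient " + ON,
    # Full join emulated: all patients, plus diseases with no patient.
    "full": (
        COLS + "FROM tblPatient LEFT JOIN tlkpDisease " + ON
        + "UNION ALL "
        + COLS + "FROM tlkpDisease LEFT JOIN tblPatient " + ON
        + "WHERE PatientDisease IS NULL"
    ),
}

row_counts = {
    name: len(conn.execute(sql).fetchall()) for name, sql in QUERIES.items()
}
print(row_counts)  # {'inner': 9, 'left': 11, 'right': 12, 'full': 14}
```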
Microsoft Access query optimisation
Everyone wants the performance of their database to be optimal. In particular, there is often a requirement for a specific query, or an object that is query based, to run faster.
The performance of a query is affected by the tables or queries that underlie the query and by the complexity of the query. Table-based forms are faster than query-based forms, and attached tables are slower than integral tables. Sometimes it may be wise to import rather than to attach frequently accessed foreign tables.
Access uses Rushmore techniques, borrowed from FoxPro, to automatically optimise queries that contain two or more indexes. (Rushmore supports some, but not all, foreign formats of attached table. For example, it supports FoxPro but not Btrieve. As Rushmore is an automatic process, there is no need to understand it except to remember to index two or more query fields for all but the simplest of queries.)
Here are some of the methods that Advent IT use to optimise the speed of Access queries:
- Display the minimum number of fields in a query. Set criteria-dependent fields that are not required in the dynaset to "not shown".
- Index all restriction based fields, all fields included in expressions, all sorted fields and all join fields.
- Use primary keys or unique indexes wherever possible.
- Use numeric rather than text primary keys.
- Use non blank unique fields.
- Avoid the use of the IIf() function in queries.
- Avoid domain aggregate functions such as DLookup().
- Use Between and Equal to (=) rather than > or < where possible, as this speeds up queries.
- Use fixed column headings in Crosstab queries.
- For reports based on queries, use Portrait view in preference to Landscape and set Fast Laser Printing to Yes (View, Options, Other Properties).
- Use Make table queries for running reports on static data. These are called snapshot reports.
- Use Count(*) rather than Count(Column).
- When creating restrictions on a joined column in one-to-many relationships, test out the comparative performance when placing the restriction on the "one" side or the "many" side. The "one" side is not always the fastest — the "many" may have markedly fewer records.
- Short table and field names run faster than long names.
- Normalise tables — join strategies execute more quickly on smaller tables.
- Denormalise tables — reduce the number of joins. Get the balance right between normalisation and denormalisation by experiment.
- Avoid the use of Distinct Row queries. Union queries do not need the distinct row feature, as they automatically return unique records unless set to Union All.
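Two of the tips above change what a query returns as well as how fast it runs, which is worth checking before swapping one form for the other. The sketch below is an illustration only: it assumes Python's sqlite3 module rather than Access, reuses the patient and disease codes from the joins article on this page (with assumed column types and without the DiseaseText column), and shows results rather than timings. Count(*) counts every row, whereas counting a column skips rows where that column is NULL; Union removes duplicate rows, whereas Union All keeps them.

```python
import sqlite3

# Codes transcribed from the joins article on this page.
PATIENTS = [
    (17001, 30), (840001, 37), (841001, 50), (845001, 57),
    (831001, 58), (846001, 60), (840001, 62), (838001, 24399),
    (835001, 32299), (836001, 33699), (831001, 49399),
]
DISEASE_IDS = [30, 37, 50, 58, 60, 62, 24399, 29909, 32299, 35109, 49399, 74921]

conn = sqlite3.connect(":memory:")
# Simplified tables; column types are assumptions.
conn.execute("CREATE TABLE tblPatient (IndexNo INTEGER, PatientDisease INTEGER)")
conn.execute("CREATE TABLE tlkpDisease (DiseaseID INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tblPatient VALUES (?, ?)", PATIENTS)
conn.executemany(
    "INSERT INTO tlkpDisease VALUES (?)", [(code,) for code in DISEASE_IDS]
)

# COUNT(*) counts all joined rows; COUNT(DiseaseID) ignores the NULLs
# produced for patients whose code is not in the codelist.
count_star, count_column = conn.execute(
    "SELECT COUNT(*), COUNT(DiseaseID) "
    "FROM tblPatient LEFT JOIN tlkpDisease ON PatientDisease = DiseaseID"
).fetchone()

# UNION returns each code once; UNION ALL keeps every row from both tables.
union_rows = conn.execute(
    "SELECT PatientDisease FROM tblPatient "
    "UNION SELECT DiseaseID FROM tlkpDisease"
).fetchall()
union_all_rows = conn.execute(
    "SELECT PatientDisease FROM tblPatient "
    "UNION ALL SELECT DiseaseID FROM tlkpDisease"
).fetchall()

print(count_star, count_column)               # 11 9
print(len(union_rows), len(union_all_rows))   # 14 23
```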
