Posts Tagged ‘sql’
Microsoft SQL Server Reporting Services - Case Study 2006
Advent IT have implemented a reporting solution for Momenta to help them manage the DTI's Knowledge Transfer Partnership (KTP) Programme. This solution integrates Microsoft SQL Server Reporting Services with SoftArtisans OfficeWriter to provide a powerful, centralised reporting and mail merge system.
Please read the case study on the Microsoft site for more information.
Self joins in SQL statements
Self-joins are useful in SQL select statements and are often used to identify duplicate entries or entries linked in other ways such as customers who share the same telephone number or patients who share the same address.
A self-join is an internal table (or view) join. The self-join is between fields in a table and fields within a virtual copy of the table. For instance in a hospital database the patients’ address and postcode details are kept in the table tblAddress with the identifier of the patient in the field PatientID.
The following SQL statement can be run in Query Analyzer in order for the hospital administrators to list the patients who share an address and patients who have been erroneously registered twice under different PatientID’s:
SQL statement
| SELECT | A1.PatientID | AS | PatientID1, | A1.Address | AS | Address1, | A1.Postcode | AS | Postcode1 |
| Alias | Alias | Alias |
| FROM | tblAddress | AS | A1 | INNER JOIN | tblAddress | AS | A2 | ON | A1.Postcode =A2.Postcode |
| Alias | Alias |
| WHERE LEFT(A1.Address,3)=LEFT(A2.Address,3) AND A1.PatientID <> A2.PatientID |
| ORDER BY A1.Postcode |
Note: aliases are used to distinctly name fields or tables with the same name to clarify ambiguities.
This returns the following results set:
| PatientID1 | Address1 | Postcode1 |
| 231234 | 3 Montgomery Lane, Hartshead | AT12 7DS |
| 432122 | 3 Mongomery La, Rushington,Hartshead | AT12 7DS |
| 27923 | 21 Cystine Road, Sefton, Merseyside | L87 9ZZ |
| 3987 | 21 Cystine Rd, Liverpool | L87 9ZZ |
The Patients corresponding to these ID pairs can then be investigated to test whether they are duplicates registrations or different patients, possibly relatives, with the same address.
Self-joins can also be written as outer joins.
Inner And Outer Joins In SQL Statements
When joining two tables in a SQL database it is necessary to consider what type of join is required.
There are 4 possible ways in which these tables can be joined; Inner Join, Left Join, Right Join and Full Join (all of which may give different outcomes). Left, Right and Full Joins are all outer Joins. The mystery of outer joins often puts people off them but this article seeks to explain left and right joins with a set of simple examples.
The following examples use two tables in a hospital database. The patient table tblPatient (11 records) contains a code (PatientDisease) for their disease and the codelist or lookup table tlkpDisease (12 records) contains a list of disease codes (DiseaseID) and their meaning in words (DiseaseText).
1. Inner Join
This is where only those records from the table on the Left of the Join that match those from the table on the Right of the Join are returned.
SQL statement
| SELECT | Indexno, PatientDisease, DiseaseID, DiseaseText | ||||
| FROM | tblPatient | INNER JOIN | tlkpDisease | ON PatientDisease=DiseaseID | |
| (left table) | (join) | (right table) | (join criteria) | ||
Note: The word Inner may be omitted as Inner joins are assumed by default.
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| 835001 | 32299 | 32299 | Meningitis |
| 831001 | 49399 | 49399 | Asthma |
Note: Only those 9 patients who have a patientdisease which is in the disease codelist table are returned in this results set.
2. Left Join
This is where all records from the table on the Left of the Join are returned and only those that match from the table on the Right of the Join.
SQL statement
| SELECT | Indexno, PatientDisease, DiseaseID, DiseaseText | ||||
| FROM | tblPatient | LEFT JOIN | tlkpDisease | ON PatientDisease=DiseaseID | |
| (left table) | (join) | (right table) | (join criteria) | ||
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 845001 | 57 | NULL | NULL |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| 835001 | 32299 | 32299 | Meningitis |
| 836001 | 33699 | NULL | NULL |
| 831001 | 49399 | 49399 | Asthma |
Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.
3. Right Join
This is where all records from the table on the Right of the Join are returned and only those that match from the table on the Left of the Join are returned.
SQL statement
| SELECT | Indexno, PatientDisease, DiseaseID, DiseaseText | ||||
| FROM | tblPatient | RIGHT JOIN | tlkpDisease | ON PatientDisease=DiseaseID | |
| (left table) | (join) | (right table) | (join criteria) | ||
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| NULL | NULL | 29909 | Infantile autism |
| 835001 | 32299 | 32299 | Meningitis |
| NULL | NULL | 35109 | Bell's palsy |
| 831001 | 49399 | 49399 | Asthma |
| NULL | NULL | 74921 | Cleft palate |
Note: All 12 diseases from the disease codelist table are returned in the results set whether or not there is a patient with that disease in the patient table.
4. Full Join
This is where all records from the table on the Left of the Join are returned and all those from the table on the Right of the Join are returned.
SQL statement
| SELECT | Indexno, PatientDisease, DiseaseID, DiseaseText | ||||
| FROM | tblPatient | FULL JOIN | tlkpDisease | ON PatientDisease=DiseaseID | |
| (left table) | (join) | (right table) | (join criteria) | ||
This returns the following results set:
| IndexNo | PatientDisease | DiseaseID | DiseaseText |
| 17001 | 30 | 30 | Leukaemia |
| 840001 | 37 | 37 | Breast cancer |
| 841001 | 50 | 50 | Carcinoma |
| 845001 | 57 | NULL | NULL |
| 831001 | 58 | 58 | Other neoplasm |
| 846001 | 60 | 60 | Epilepsy |
| 840001 | 62 | 62 | Allergic condition |
| 838001 | 24399 | 24399 | Thyroid insufficiency |
| NULL | NULL | 29909 | Infantile autism |
| 835001 | 32299 | 32299 | Meningitis |
| 836001 | 33699 | NULL | NULL |
| NULL | NULL | 35109 | Bell's palsy |
| 831001 | 49399 | 49399 | Asthma |
| NULL | NULL | 74921 | Cleft palate |
Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.
Summary
The above SQL statement returned four different results sets using the same join criteria on the same tables with different join types.
Microsoft Access query optimisation
Everyone wants the performance of their database to be optimal. In particular, there is often a requirement for a specific query or object that is query based, to run faster.
The performance of a query is affected by the tables or queries that underly the query and by the complexity of the query. Table based forms are faster than query based forms and attached tables are slower than integral tables. Sometimes it may be wise to import rather than to attach frequently accessed foreign tables.
Access uses Rushmore techniques borrowed from FoxPro, to automatically optimise queries that contain two or more indexes. (Rushmore supports some, but not all foreign formats of attached table. For example, it supports FoxPro but not Btrieve. As Rushmore is an automatic process there is no need to understand it except to remember to index two or more query fields for all but the simplest of queries.)
Here are some of the methods that Advent IT use to optimise the speed of Access queries:
- Display the minimum number of fields in a query. Set criteria dependant fields that are not required in the dynaset to "not shown".
- Index all restriction based fields, all fields included in expressions, all sorted fields and all join fields.
- Use primary keys or unique indexes wherever possible.
- Use numeric rather than text primary keys.
- Use non blank unique fields.
- Avoid the use of IIf() function in queries.
- Avoid domain aggregate functions such as Dlookup().
- Make careful use of Between and Equal to, rather than > or < speeds up queries.
- Use fixed column headings in Crosstab queries.
- For reports based on queries use Portrait view in preference to Landscape and select Fast Laser Printing to Yes (View,Options,Other Properties).
- Use Make table queries for running reports on static data. These are called snapshot reports.
- Use Count (*) rather than Count(Column).
- When creating restrictions on a joined column in one-to-many relationships, test out the comparative performance when placing the restriction on the "one" side or the "many" side. The "one" side is not always the fastest — the "many" may have markedly fewer records.
- Short table and field names run faster than long names.
- Normalise tables — join strategies execute more quickly on smaller tables.
- Denormalise tables — reduce the number of joins. Get the balance right between normalisation and denormalisation by experiment.
- Avoid the use of Distinct Row queries — Union queries do not need the distinct row feature as they are automatically return unique fields unless set to Union All.
