Posts Tagged ‘data organisation’

Inner And Outer Joins In SQL Statements

When joining two tables in a SQL database it is necessary to consider what type of join is required.

There are 4 possible ways in which these tables can be joined; Inner Join, Left Join, Right Join and Full Join (all of which may give different outcomes). Left, Right and Full Joins are all outer Joins. The mystery of outer joins often puts people off them but this article seeks to explain left and right joins with a set of simple examples.

The following examples use two tables in a hospital database. The patient table tblPatient (11 records) contains a code (PatientDisease) for their disease and the codelist or lookup table tlkpDisease (12 records) contains a list of disease codes (DiseaseID) and their meaning in words (DiseaseText).

1. Inner Join

This is where only those records from the table on the Left of the Join that match those from the table on the Right of the Join are returned.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient INNER JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

Note: The word Inner may be omitted as Inner joins are assumed by default.

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
835001 32299 32299 Meningitis
831001 49399 49399 Asthma

Note: Only those 9 patients who have a patientdisease which is in the disease codelist table are returned in this results set.

2. Left Join

This is where all records from the table on the Left of the Join are returned and only those that match from the table on the Right of the Join.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient LEFT JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
845001 57 NULL NULL
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
835001 32299 32299 Meningitis
836001 33699 NULL NULL
831001 49399 49399 Asthma

Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.

3. Right Join

This is where all records from the table on the Right of the Join are returned and only those that match from the table on the Left of the Join are returned.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient RIGHT JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
NULL NULL 29909 Infantile autism
835001 32299 32299 Meningitis
NULL NULL 35109 Bell's palsy
831001 49399 49399 Asthma
NULL NULL 74921 Cleft palate

Note: All 12 diseases from the disease codelist table are returned in the results set whether or not there is a patient with that disease in the patient table.

4. Full Join

This is where all records from the table on the Left of the Join are returned and all those from the table on the Right of the Join are returned.

SQL statement

SELECT Indexno, PatientDisease, DiseaseID, DiseaseText
FROM tblPatient FULL JOIN tlkpDisease ON PatientDisease=DiseaseID
(left table) (join) (right table) (join criteria)

This returns the following results set:

IndexNo PatientDisease DiseaseID DiseaseText
17001 30 30 Leukaemia
840001 37 37 Breast cancer
841001 50 50 Carcinoma
845001 57 NULL NULL
831001 58 58 Other neoplasm
846001 60 60 Epilepsy
840001 62 62 Allergic condition
838001 24399 24399 Thyroid insufficiency
NULL NULL 29909 Infantile autism
835001 32299 32299 Meningitis
836001 33699 NULL NULL
NULL NULL 35109 Bell's palsy
831001 49399 49399 Asthma
NULL NULL 74921 Cleft palate

Note: All 11 patients from the patients table and all 12 diseases from the disease codelist table are returned in the results set.

Summary

The above SQL statement returned four different results sets using the same join criteria on the same tables with different join types.

What are databases and what are their benefits?

What is a database?

At the heart of all businesses, whether they are scientific, medical, educational, research-based or commercial, are data. Companies acquire data from their clients, patients, subscribers, customers and suppliers and from their own products and services. Businesses often require data from other sources as reference data for their own products or services. This reference data includes information such as: population and demographic data; stocks and share prices; consumer profiles; medical code lists; telephone and email directories. The key to a company’s success is how they organise their data.

When business data is kept in several disparate formats, manual intervention is needed in order to cross reference data or link one or more datasets to each other. When a company has lots of separate datasets it is challenging for any one individual to fully understand the company’s business. By analysing the company’s business with a view to bringing datasets together it is possible to document and display all the facets of a company’s data. Design methods are then used to arrange the different datasets in a manner that any grouping, search or calculation can be made across all a company’s datasets and the power of the data can be harnessed and fully exploited. This designed arrangement of data is called a database design. Once the design is agreed a database can be developed to provide an integral business solution.

The benefits of databases

Implementing a database solution for a business will enhance productivity and accuracy. Duplicated processes can be eradicated; data can be entered by several people simultaneously, data can be validated on data-entry; data-entry can be reduced by providing pre-typed lookup lists and duplicate records can be exposed and removed. Data can be displayed dynamically to anyone who has permitted access in a wealth of different formats and reports. When web based solutions are used, data can be securely, entered and viewed anywhere in the world. A company’s web site can be kept up-to-date from their database facilitating marketing and sales.

Interrogating databases empowers organizations to achieve their targets. Knowing which customers are also suppliers; calculating the effect of international exchange rates on product or component prices; deriving population based incidence rates or calculating the potential size of a company’s client base can be keys to marketing success; better sales margins; achieving research goals or business plan acceptance.

Who needs databases?

In short all businesses need to store their data in fully relational databases which will enable them to utilise and exploit their data. For some business models there is excellent ‘off the shelf’ software which will be perfect for their purposes, while for specialist and research businesses it will be necessary to request customised software developed by specialists such as Advent IT.