Everything About Sub-Queries In SQL

0
75
sub-queries-in-sql

Sub Query is an intermediate-level concept in SQL. And at an intermediate level, this is the toughest & most confusing topic too. In this blog, we will try to make this topic super easy for everyone. We will proceed step by step and will give all the relevant examples to explain the concept in the best possible manner. Plus, for the practical part, the SQL Server platform has been used. So, if you are using any other platform then some of the commands might differ.

What is a Sub-Query?

A sub-query is a query within another SQL query embedded with WHERE, HAVING & FROM clauses. The output of the inner query becomes the input for the outer or main query. In other words, the output of the inner query is used as a condition for the outer query filtering the result for the outer query. A Sub-Query is also called an inner query or nested query.

Types of Sub-Queries

  • Single Row Sub-Query

A single-row sub-query returns either 0 or 1 row to the outer query. Let’s create a table & understand it.

Create Table Students(

RN Varchar(10) Primary Key,

SName Varchar(20),

Marks Int,

City Varchar(15))

Insert into Students

Values(1, ‘Vineet’, 80, ‘Delhi’),

(2, ‘Pradeep’, 70, ‘Jaipur’),

(3, ‘Raman’, 90, ‘Lucknow’),

(4, ‘Dinesh’, 65, ‘Delhi’)

Select * From Students

RNSNameMarksCity
1Vineet80Delhi
2Pradeep70Jaipur
3Raman90Lucknow
4Dinesh65Delhi

Now, for example, you need to find out all the students who scored more than RN 2. Then you need the concept of sub-query.

Select * from Students

Where Marks> (Select Marks from Students Where RN=2)

RNSNameMarksCity
1Vineet80Delhi
3Raman90Lucknow

The outcome of the inner query is a single row which is going as input for the outer query, this is the reason this is called Single Row Sub-Query.

  • Multi-Row Sub-Query

A multi-row sub-query returns more than 1 row to the outer query. For example, you need to find out the maximum marks holders from each city.

Select * from Students

RNSNameMarksCity
1Vineet80Delhi
2Pradeep70Jaipur
3Raman90Lucknow
4Dinesh65Delhi

Select * from Students

Where Marks IN (Select Max(Marks) from Students Group by City)

RNSNameMarksCity
1Vineet80Delhi
2Pradeep70Jaipur
3Raman90Lucknow

Here, the inner query is returning 3 rows as output being a multi-row sub-query.

  • Multi Column Sub-Query

So far, when we were using single & multi-row sub-queries, the inner query was returning a single column. If the inner query is returning more than one column, then it is called a multi-column sub-query.

Multi-column subqueries help you to combine duplicate WHERE conditions into a single WHERE clause as you can compare multiple columns with a single WHERE clause. This concept does not work in SQL Server. In the SQL server, the same case can be handled with Joins as a replacement.

  • Correlated Sub-Query

A correlated subquery is evaluated once for each row processed by the parent statement. It is also known as a synchronized subquery or a repeating subquery. It is used whenever an inner query must return a different result or set of results for each candidate row of the outer query.

Let’s create a new table Details:

Create table Details(

Product_id varchar(255),

Product_Name varchar(255),

category varchar(255),

List_Price int)

Insert into Details

Values(‘P1’, ‘Pencil’,’1′, 200),

(‘P2’, ‘Pen’,’2′, 300),

(‘P3’, ‘Erazer’,’3′, 250),

(‘P4’, ‘Sharpener’,’1′, 400),

(‘P5’, ‘Notebook’,’2′, 380),

(‘P6’, ‘Pen’,’3′, 350)

Select * from Details

Product_idProduct_nameCategoryList_Price
P1Pencil1200
P2Pen2300
P3Erazer3250
P4Sharpener1400
P5Notebook2380
P6Pen3350

Now, we need to find the products whose list price is equal to the highest list price of the products within the same category.

Select Product_Name, List_Price from Details WHERE List_Price in (Select Max(List_Price) from Details group by Category)

Product_NameList_Price
Sharpener400
Notebook380
Pen350

This is the example of a Correlated sub-query where the output of the inner query depends upon the value coming from the outer query for the comparison. So, the execution of the inner query is dependent upon the outer query. So, this query is a multi-row sub-query and a correlated query.

Rules to be followed in Sub-Query

  • The inner query is enclosed within parentheses.
  • We cannot use the ORDER BY clause in the inner query but can use the same in the outer query.
  • The BETWEEN operator cannot be used to connect the outer & inner query but can be used within the subquery.

Operators that can be used with Sub-Query

  • Single Row Operators (=, >=, <=, >, <)

These operators are used only with single-row sub-queries. Refer to the example of the single-row sub-query above.

  • Multi-Row Operators (IN, NOT IN, ANY, SOME, ALL, EXISTS, NOT EXISTS)

These operators are used when the inner query is returning more than 1 row. Hence, they are used in multi-row subqueries. These operators help us to compare a value with a list of values coming out from the execution of the inner query. We will give the example of each operator here one by one.

1. IN

The IN operator helps you to specify multiple values in a WHERE clause. The IN operator is a way or shorthand for multiple OR conditions.

Refer to the Multi-Row Sub-Query example above.

2. NOT IN

Refer to the Multi-Row Sub-Query example above. The same example can be used with NOT IN for the reverse results.

3. ANY

ANY is a logical operator that compares a value with a group of values returned by an inner query. It must be preceded by a comparison operator (>, >=, <, <=, =, <>) & then followed by an inner query.

The values in the column of the outer query must match one or more values (any value) in the group of values to evaluate to true.

Select Product_Name, List_Price from Details WHERE List_Price = ANY (Select Max(List_Price) from Details group by Category)

Product_NameList_Price
Sharpener400
Notebook380
Pen350

4. SOME

SOME is a logical operator that compares a value with a group of values returned by an inner query. It must be preceded by a comparison operator (>, >=, <, <=, =, <>) & then followed by an inner query.

The values in the column of the outer query must match one or more values (at least some value) in the group of values to evaluate to true.

Select Product_Name, List_Price from Details WHERE List_Price = SOME (Select Max(List_Price) from Details group by Category)

Product_NameList_Price
Sharpener400
Notebook380
Pen350

5. ALL

ALL is a logical operator that compares a value with a group of values returned by an inner query. It must be preceded by a comparison operator (>, >=, <, <=, =, <>) & then followed by an inner query.

The values in the column of the outer query must match all the values in the group of values to evaluate to true.

Select Product_Name, List_Price from Details WHERE List_Price = ALL (Select Max(List_Price) from Details group by Category)

Product_NameList_Price

The output is blank because all the values did not match.

6. EXISTS

The EXISTS operator is used to test for the presence of any record in an inner query. The EXISTS operator returns TRUE if the inner query returns one or more records.

Select * from Students

WHERE exists (Select * from Students WHERE City=’Mumbai’)

RNSNameMarksCity

As there is no student from the city Mumbai so the output has no records.

7. NOT EXISTS

The NOT EXISTS operator is used to test for the absence of any record in an inner query. The NOT EXISTS operator returns TRUE if the inner query returns zero records.

Select * from Students

WHERE not exists (Select * from Students WHERE City=’Mumbai’)

RNSNameMarksCity
1Vineet80Delhi
2Pradeep70Jaipur
3Raman90Lucknow
4Dinesh65Delhi

As the city Mumbai is absent in the data, the output has all the records.

Statements that can be used with Sub-Query

  • Select

Select is a Data Query Language (DQL) statement that is used to query the database.  In the blog, whatever tables we create, we will use a Select statement to query the database and get the output. When we use Select in the sub-query concept then it comes twice. One in the inner query & one in the outer query.

Select * from Students

Where Marks> (Select Marks from Students Where RN=2)

This query above shows the use of the Select statement in the sub-query.

  • Insert

Let’s understand how we can use insert statement in sub-query. Let’s create another empty table with the same column names & data types.

Create Table Students1(

RN Varchar(10) Primary Key,

SName Varchar(20),

Marks Int,

City Varchar(15))

Insert into Students1

Select * from Students

Select * From Students1

RNSNameMarksCity
1Vineet80Delhi
2Pradeep70Jaipur
3Raman90Lucknow
4Dinesh65Delhi

Here, we have used Insert in the outer query & Select in the inner query. Now, this sub-query has inserted all the data from table Students to Students1. Using the WHERE clause, the values can be inserted into Students1 based on certain criteria also.

  • Delete

Now, for example, you need to delete all the students who scored more than RN 2. Then you need the concept of sub-query. We will implement this concept in the Students1 table here.

Select * From Students1

RNSNameMarksCity
1Vineet80Delhi
2Pradeep70Jaipur
3Raman90Lucknow
4Dinesh65Delhi

Delete from Students1

Where Marks> (Select Marks from Students Where RN=2)

Select * From Students1

RNSNameMarksCity
2Pradeep70Jaipur
4Dinesh65Delhi
  • Update

Let’s add a new column Grade to the table Students.

Alter table Students

add Grade Varchar(5)

The table now looks as follows:

Select * From Students

RNSNameMarksCityGrade
1Vineet80DelhiNULL
2Pradeep70JaipurNULL
3Raman90LucknowNULL
4Dinesh65DelhiNULL

Now, we need to update this table as per the following criteria.

Students who scored more than the highest marks in Delhi deserve A+ and the rest A.

Update Students

SET Grade=’A+’

Where Marks> (Select Max(Marks) from Students Where City=’Delhi’)

Select * From Students 

RNSNameMarksCityGrade
1Vineet80DelhiNULL
2Pradeep70JaipurNULL
3Raman90LucknowA+
4Dinesh65DelhiNULL

 

Update Students

SET Grade=’A’

Where Grade is NULL

Select * From Students 

RNSNameMarksCityGrade
1Vineet80DelhiA
2Pradeep70JaipurA
3Raman90LucknowA+
4Dinesh65DelhiA

 

Final Words

We can use sub-queries on a single table as well as on multiple tables. In a way, Sub-query is an alternative to the Joins concept in SQL. But there are cases where Sub-queries are the only option and even Joins do not help.

Sub-queries are advantageous because they divide the whole statement into different queries and are more readable, but we should prefer Joins over Sub-queries due to better optimization. Joins reduce the processing burden on the database by replacing multiple queries with one query.

Want to learn 

Previous articleComprehensive Guide On DAX Using Power BI
Next article6 Essential Skills To Become A Successful Entrepreneur
A learner, dreamer, and passionate about modern tools and technology like Excel 365, Power BI, SQL, HR Analytics, Six Sigma, WordPress and Visuals to name a few. Total Experience 15+ years including more than a decade of experience in S&P Global and ongoing 4+ years of experience with fast-growing startups & few MNCs. Clients served: Startups, MSMEs to a few of the Big 4 Consulting firms. Trained more than 2500 professionals in the last 4 years on various skills (Analytics, Content, WordPress, Finance, Entrepreneurship, etc). Last but the most important - mother of an awesome kid!

LEAVE A REPLY

Please enter your comment!
Please enter your name here