Sub Query is an intermediate-level concept in SQL. And at an intermediate level, this is the toughest & most confusing topic too. In this blog, we will try to make this topic super easy for everyone. We will proceed step by step and will give all the relevant examples to explain the concept in the best possible manner. Plus, for the practical part, the SQL Server platform has been used. So, if you are using any other platform then some of the commands might differ.
What is a Sub-Query?
A sub-query is a query within another SQL query embedded with WHERE, HAVING & FROM clauses. The output of the inner query becomes the input for the outer or main query. In other words, the output of the inner query is used as a condition for the outer query filtering the result for the outer query. A Sub-Query is also called an inner query or nested query.
Types of Sub-Queries
Single Row Sub-Query
A single-row sub-query returns either 0 or 1 row to the outer query. Let’s create a table & understand it.
Create Table Students(
RN Varchar(10) Primary Key,
SName Varchar(20),
Marks Int,
City Varchar(15))
Insert into Students
Values(1, ‘Vineet’, 80, ‘Delhi’),
(2, ‘Pradeep’, 70, ‘Jaipur’),
(3, ‘Raman’, 90, ‘Lucknow’),
(4, ‘Dinesh’, 65, ‘Delhi’)
Select * From Students
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
2 | Pradeep | 70 | Jaipur |
3 | Raman | 90 | Lucknow |
4 | Dinesh | 65 | Delhi |
Now, for example, you need to find out all the students who scored more than RN 2. Then you need the concept of sub-query.
Select * from Students
Where Marks> (Select Marks from Students Where RN=2)
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
3 | Raman | 90 | Lucknow |
The outcome of the inner query is a single row which is going as input for the outer query, this is the reason this is called Single Row Sub-Query.
Multi-Row Sub-Query
A multi-row sub-query returns more than 1 row to the outer query. For example, you need to find out the maximum marks holders from each city.
Select * from Students
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
2 | Pradeep | 70 | Jaipur |
3 | Raman | 90 | Lucknow |
4 | Dinesh | 65 | Delhi |
Select * from Students
Where Marks IN (Select Max(Marks) from Students Group by City)
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
2 | Pradeep | 70 | Jaipur |
3 | Raman | 90 | Lucknow |
Here, the inner query is returning 3 rows as output being a multi-row sub-query.
Multi Column Sub-Query
So far, when we were using single & multi-row sub-queries, the inner query was returning a single column. If the inner query is returning more than one column, then it is called a multi-column sub-query.
Multi-column subqueries help you to combine duplicate WHERE conditions into a single WHERE clause as you can compare multiple columns with a single WHERE clause. This concept does not work in SQL Server. In the SQL server, the same case can be handled with Joins as a replacement.
A correlated subquery is evaluated once for each row processed by the parent statement. It is also known as a synchronized subquery or a repeating subquery. It is used whenever an inner query must return a different result or set of results for each candidate row of the outer query.
Let’s create a new table Details:
Create table Details(
Product_id varchar(255),
Product_Name varchar(255),
category varchar(255),
List_Price int)
Insert into Details
Values(‘P1’, ‘Pencil’,’1′, 200),
(‘P2’, ‘Pen’,’2′, 300),
(‘P3’, ‘Erazer’,’3′, 250),
(‘P4’, ‘Sharpener’,’1′, 400),
(‘P5’, ‘Notebook’,’2′, 380),
(‘P6’, ‘Pen’,’3′, 350)
Select * from Details
Product_id | Product_name | Category | List_Price |
P1 | Pencil | 1 | 200 |
P2 | Pen | 2 | 300 |
P3 | Erazer | 3 | 250 |
P4 | Sharpener | 1 | 400 |
P5 | Notebook | 2 | 380 |
P6 | Pen | 3 | 350 |
Now, we need to find the products whose list price is equal to the highest list price of the products within the same category.
Select Product_Name, List_Price from Details WHERE List_Price in (Select Max(List_Price) from Details group by Category)
Product_Name | List_Price |
Sharpener | 400 |
Notebook | 380 |
Pen | 350 |
This is the example of a Correlated sub-query where the output of the inner query depends upon the value coming from the outer query for the comparison. So, the execution of the inner query is dependent upon the outer query. So, this query is a multi-row sub-query and a correlated query.
Rules to be followed in Sub-Query
- The inner query is enclosed within parentheses.
- We cannot use the ORDER BY clause in the inner query but can use the same in the outer query.
- The BETWEEN operator cannot be used to connect the outer & inner query but can be used within the subquery.
Operators that can be used with Sub-Query
Single Row Operators (=, >=, <=, >, <)
These operators are used only with single-row sub-queries. Refer to the example of the single-row sub-query above.
Multi-Row Operators (IN, NOT IN, ANY, SOME, ALL, EXISTS, NOT EXISTS)
These operators are used when the inner query is returning more than 1 row. Hence, they are used in multi-row subqueries. These operators help us to compare a value with a list of values coming out from the execution of the inner query. We will give the example of each operator here one by one.
1. IN
The IN operator helps you to specify multiple values in a WHERE clause. The IN operator is a way or shorthand for multiple OR conditions.
Refer to the Multi-Row Sub-Query example above.
2. NOT IN
Refer to the Multi-Row Sub-Query example above. The same example can be used with NOT IN for the reverse results.
3. ANY
ANY is a logical operator that compares a value with a group of values returned by an inner query. It must be preceded by a comparison operator (>, >=, <, <=, =, <>) & then followed by an inner query.
The values in the column of the outer query must match one or more values (any value) in the group of values to evaluate to true.
Select Product_Name, List_Price from Details WHERE List_Price = ANY (Select Max(List_Price) from Details group by Category)
Product_Name | List_Price |
Sharpener | 400 |
Notebook | 380 |
Pen | 350 |
4. SOME
SOME is a logical operator that compares a value with a group of values returned by an inner query. It must be preceded by a comparison operator (>, >=, <, <=, =, <>) & then followed by an inner query.
The values in the column of the outer query must match one or more values (at least some value) in the group of values to evaluate to true.
Select Product_Name, List_Price from Details WHERE List_Price = SOME (Select Max(List_Price) from Details group by Category)
Product_Name | List_Price |
Sharpener | 400 |
Notebook | 380 |
Pen | 350 |
5. ALL
ALL is a logical operator that compares a value with a group of values returned by an inner query. It must be preceded by a comparison operator (>, >=, <, <=, =, <>) & then followed by an inner query.
The values in the column of the outer query must match all the values in the group of values to evaluate to true.
Select Product_Name, List_Price from Details WHERE List_Price = ALL (Select Max(List_Price) from Details group by Category)
Product_Name | List_Price |
The output is blank because all the values did not match.
6. EXISTS
The EXISTS operator is used to test for the presence of any record in an inner query. The EXISTS operator returns TRUE if the inner query returns one or more records.
Select * from Students
WHERE exists (Select * from Students WHERE City=’Mumbai’)
RN | SName | Marks | City |
As there is no student from the city Mumbai so the output has no records.
7. NOT EXISTS
The NOT EXISTS operator is used to test for the absence of any record in an inner query. The NOT EXISTS operator returns TRUE if the inner query returns zero records.
Select * from Students
WHERE not exists (Select * from Students WHERE City=’Mumbai’)
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
2 | Pradeep | 70 | Jaipur |
3 | Raman | 90 | Lucknow |
4 | Dinesh | 65 | Delhi |
As the city Mumbai is absent in the data, the output has all the records.
Statements that can be used with Sub-Query
Select
Select is a Data Query Language (DQL) statement that is used to query the database. In the blog, whatever tables we create, we will use a Select statement to query the database and get the output. When we use Select in the sub-query concept then it comes twice. One in the inner query & one in the outer query.
Select * from Students
Where Marks> (Select Marks from Students Where RN=2)
This query above shows the use of the Select statement in the sub-query.
Insert
Let’s understand how we can use insert statement in sub-query. Let’s create another empty table with the same column names & data types.
Create Table Students1(
RN Varchar(10) Primary Key,
SName Varchar(20),
Marks Int,
City Varchar(15))
Insert into Students1
Select * from Students
Select * From Students1
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
2 | Pradeep | 70 | Jaipur |
3 | Raman | 90 | Lucknow |
4 | Dinesh | 65 | Delhi |
Here, we have used Insert in the outer query & Select in the inner query. Now, this sub-query has inserted all the data from table Students to Students1. Using the WHERE clause, the values can be inserted into Students1 based on certain criteria also.
Delete
Now, for example, you need to delete all the students who scored more than RN 2. Then you need the concept of sub-query. We will implement this concept in the Students1 table here.
Select * From Students1
RN | SName | Marks | City |
1 | Vineet | 80 | Delhi |
2 | Pradeep | 70 | Jaipur |
3 | Raman | 90 | Lucknow |
4 | Dinesh | 65 | Delhi |
Delete from Students1
Where Marks> (Select Marks from Students Where RN=2)
Select * From Students1
RN | SName | Marks | City |
2 | Pradeep | 70 | Jaipur |
4 | Dinesh | 65 | Delhi |
Update
Let’s add a new column Grade to the table Students.
Alter table Students
add Grade Varchar(5)
The table now looks as follows:
Select * From Students
RN | SName | Marks | City | Grade |
1 | Vineet | 80 | Delhi | NULL |
2 | Pradeep | 70 | Jaipur | NULL |
3 | Raman | 90 | Lucknow | NULL |
4 | Dinesh | 65 | Delhi | NULL |
Now, we need to update this table as per the following criteria.
Students who scored more than the highest marks in Delhi deserve A+ and the rest A.
Update Students
SET Grade=’A+’
Where Marks> (Select Max(Marks) from Students Where City=’Delhi’)
Select * From Students
RN | SName | Marks | City | Grade |
1 | Vineet | 80 | Delhi | NULL |
2 | Pradeep | 70 | Jaipur | NULL |
3 | Raman | 90 | Lucknow | A+ |
4 | Dinesh | 65 | Delhi | NULL |
Update Students
SET Grade=’A’
Where Grade is NULL
Select * From Students
RN | SName | Marks | City | Grade |
1 | Vineet | 80 | Delhi | A |
2 | Pradeep | 70 | Jaipur | A |
3 | Raman | 90 | Lucknow | A+ |
4 | Dinesh | 65 | Delhi | A |
Final Words
We can use sub-queries on a single table as well as on multiple tables. In a way, Sub-query is an alternative to the Joins concept in SQL. But there are cases where Sub-queries are the only option and even Joins do not help.
Sub-queries are advantageous because they divide the whole statement into different queries and are more readable, but we should prefer Joins over Sub-queries due to better optimization. Joins reduce the processing burden on the database by replacing multiple queries with one query.
Want to learn