Remove duplicate rows from the SQL table.

Different ways to remove or delete duplicate rows from a SQL Table.

Today we will see different solutions for remove or delete duplicates row from our tables. This is the most popular interview question. Don’t worry I will saw you all the possible ways to remove duplicate rows from the table.

We create the Employee table and insert some dummy data.

CREATE TABLE [dbo].[Employee]
(
       EmpId INT IDENTITY(1,1) PRIMARY KEY,
       EmpName VARCHAR(20),
       ManagerId INT NULL,
       Salary MONEY,
       DeptId INT
);
GO
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rajesh',NULL, 60000,1);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rohan',1, 50000,1);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Janak',NULL, 70000,2);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Mohan',3, 50000,2);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rahul',NULL, 40000,3);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Ronak',NULL, 40000,4);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Ronak',NULL, 40000,4);
INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rahul',NULL, 40000,3);

Hope you can successfully able to create both the above tables in your database now we go for example.

The actual employee data in the employee table is as below.

SELECT * FROM Employee

1) Remove duplicate rows using Group By and Having Clause query.

SELECT
    EmpName,
    Salary,
    COUNT(*) AS CNT,
MAX(EmpId) AS EmpId
FROM
       Employee
GROUP BY
       EmpName,
       Salary
HAVING COUNT(*) > 1;

The above query returns the duplicate rows based on the employee name and salary from the employee table and the result is like below.

Now, we delete above duplicate rows from our table using Group By and Having Clause.

DELETE FROM
Employee
WHERE
EmpId NOT IN

(
SELECT
       MAX(EmpId) AS EmpId
FROM
       Employee
GROUP BY
       EmpName,
       Salary
);

The above query successfully deletes duplicate rows from our table and now our result is as below.

SELECT * FROM Employee

2) Remove duplicate rows using CTE (Common Table Expression) query.

;WITH CTE AS(
SELECT *,ROW_NUMBER() OVER (PARTITION BY EmpName,Salary ORDER BY EmpId) AS DuplicateRowCount FROM Employee
)

SELECT * FROM CTE

The result of the above query is as below.

Now, we delete duplicate rows based on the column DuplicateRowCount using the below query.

;WITH CTE AS(
SELECT *,ROW_NUMBER() OVER (PARTITION BY EmpName,Salary ORDER BY EmpId) AS DuplicateRowCount FROM Employee
)

DELETE FROM CTE WHERE DuplicateRowCount>1

The above query successfully deletes duplicate rows from the employee table and now our result is as below.

SELECT * FROM Employee

I hope you have liked this article for the understanding of Remove duplicate rows from the SQL table.

Please feel free to comment if you have found anything incorrect, or you want to share more information about the topic discussed above.

Please Share this article so it will help to another one because “Helping millions grow better”.

Follow my blog https://mydotnetguru.blogspot.com to read the more interesting technical articles.

Next Recommended Article:

100+ .Net Interview Questions & Answers | Technology News.

Menu

Monday, September 28, 2020

Remove duplicate rows from the SQL table.

Different ways to remove or delete duplicate rows from a SQL Table.

HAVING COUNT(*) > 1;

);

SELECT * FROM CTE

DELETE FROM CTE WHERE DuplicateRowCount>1

No comments:

Post a Comment