Monday, September 28, 2020

Remove duplicate rows from the SQL table.

Different ways to remove or delete duplicate rows from a SQL Table.



Today we will see different solutions for remove or delete duplicates row from our tables. This is the most popular interview question. Don’t worry I will saw you all the possible ways to remove duplicate rows from the table.


[Top 100+ popular SQL Interview Questions and Answers Blog.]


We create the Employee table and insert some dummy data.

CREATE TABLE [dbo].[Employee]

(

       EmpId INT IDENTITY(1,1) PRIMARY KEY,

       EmpName VARCHAR(20),

       ManagerId INT NULL,

       Salary MONEY,

       DeptId INT

);

GO

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rajesh',NULL, 60000,1);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rohan',1, 50000,1);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Janak',NULL, 70000,2);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Mohan',3, 50000,2);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rahul',NULL, 40000,3);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Ronak',NULL, 40000,4);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Ronak',NULL, 40000,4);

INSERT INTO [dbo].[Employee] (EmpName,ManagerId,Salary,DeptId) VALUES ('Rahul',NULL, 40000,3);


Hope you can successfully able to create both the above tables in your database now we go for example.

The actual employee data in the employee table is as below.

SELECT * FROM Employee



1) Remove duplicate rows using Group By and Having Clause query.

SELECT

    EmpName,

    Salary,

    COUNT(*) AS CNT,

    MAX(EmpId) AS EmpId

FROM

       Employee

GROUP BY

       EmpName,

       Salary

HAVING COUNT(*) > 1;


The above query returns the duplicate rows based on the employee name and salary from the employee table and the result is like below.



Now, we delete above duplicate rows from our table using Group By and Having Clause.

DELETE FROM

       Employee

WHERE

       EmpId NOT IN 

(

SELECT

       MAX(EmpId) AS EmpId

FROM

       Employee

GROUP BY

       EmpName,

       Salary

);

The above query successfully deletes duplicate rows from our table and now our result is as below.

SELECT * FROM Employee


2) Remove duplicate rows using CTE (Common Table Expression) query.

;WITH CTE AS(

 SELECT *,ROW_NUMBER() OVER (PARTITION BY EmpName,Salary ORDER BY EmpId) AS DuplicateRowCount FROM Employee

)

SELECT * FROM CTE

The result of the above query is as below.



Now, we delete duplicate rows based on the column DuplicateRowCount using the below query.

;WITH CTE AS(

 SELECT *,ROW_NUMBER() OVER (PARTITION BY EmpName,Salary ORDER BY EmpId) AS DuplicateRowCount FROM Employee

)

DELETE FROM CTE WHERE DuplicateRowCount>1


The above query successfully deletes duplicate rows from the employee table and now our result is as below.

SELECT * FROM Employee 



I hope you have liked this article for the understanding of Remove duplicate rows from the SQL table.

Please feel free to comment if you have found anything incorrect, or you want to share more information about the topic discussed above.

Please Share this article so it will help to another one because “Helping millions grow better”.

Follow my blog https://mydotnetguru.blogspot.com to read the more interesting technical articles.

Next Recommended Article:

  1. Optimized top 2 ways to find 2nd or Nth highest salary in SQL Server.
  2. When to use abstract class and interface in C#?
  3. Async and Await in C#
  4. Top 100+ popular ASP.Net MVC Interview Questions and Answers.
  5. Top 100+ popular SQL Interview Questions and Answers Blog.
  6. Top 100+ popular C# Interview Questions and Answers.
  7. @Input, @Output decorator and EventEmitter class in Angular.
  8. Dependency Injection and types of dependency injection.