In Oracle there are many ways to delete duplicate records. Note
that below example are described to just explain the different possibilities.
Consider the EMP
table with below rows
create table emp(
EMPNNO integer,
EMPNAME varchar2(20),
SALARY number);
EMPNNO integer,
EMPNAME varchar2(20),
SALARY number);
10
Bill 2000
11 Bill 2000
12 Mark 3000
12 Mark 3000
12 Mark 3000
13 Tom 4000
14 Tom 5000
15 Susan 5000
11 Bill 2000
12 Mark 3000
12 Mark 3000
12 Mark 3000
13 Tom 4000
14 Tom 5000
15 Susan 5000
1. Using rowid
SQL > delete from
emp
where rowid not in
(select max(rowid) from emp group by empno);
where rowid not in
(select max(rowid) from emp group by empno);
This technique can
be applied to almost scenarios. Group by operation should be on the columns
which identify the duplicates.
2. Using self-join
SQL > delete
from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );
3. Using row_number()
SQL > delete
from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);
This is another
efficient way to delete duplicates
4. Using dense_rank()
SQL > delete
from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid) rn
from emp
)
where rn > 1
);
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid) rn
from emp
)
where rn > 1
);
Here you can use
both rank() and dens_rank() since both will give unique records when order by
rowid.
5. Using group by
Consider the EMP
table with below rows
10
Bill 2000
11 Bill 2000
12 Mark 3000
13 Mark 3000
11 Bill 2000
12 Mark 3000
13 Mark 3000
SQL > delete
from emp where
(empno,empname,salary) in
(
select max(empno),empname,salary from emp
group by empname,salary
);
(empno,empname,salary) in
(
select max(empno),empname,salary from emp
group by empname,salary
);
This technique is
only applicable in few scenarios.
Always take extra caution while deleting records.
1. First identify
the duplicates using select.
2. Double verify
those are actual ‘duplicates’ or not
3. Take backup
if necessary
4. Apply commit only
if you are sure.
Did you find above
post useful ? Your comments are highly valuable.
EmoticonEmoticon