4 Ways to Delete Duplicate Records in Oracle
Introduction
Duplicate records can cause a variety of issues in your Oracle database, such as skewing query results or taking up unnecessary storage space. In this article, we will explore four different methods to identify and delete duplicate records in an Oracle database.
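Before deleting anything, it helps to confirm which rows are actually duplicated. The query below is a minimal sketch that assumes a duplicate is defined by two columns, using the placeholder names `table_name`, `column1`, and `column2`; substitute your own table and columns.

```sql
-- List each combination of values that occurs more than once,
-- together with the number of copies found.
SELECT column1, column2, COUNT(*) AS copies
FROM table_name
GROUP BY column1, column2
HAVING COUNT(*) > 1
ORDER BY copies DESC;
```

If this returns no rows, there is nothing to clean up for that choice of columns.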
1. RowID-Based Deletion
The first method uses Oracle's `ROWID` pseudocolumn, which holds the address of each row. Because every row in an ordinary table has its own `ROWID`, you can use it to tell otherwise identical rows apart and keep exactly one of them.
```sql
DELETE FROM table_name
WHERE rowid NOT IN (
  SELECT MIN(rowid)
  FROM table_name
  GROUP BY column1, column2, …, columnN
);
```
Replace `table_name` with your table name and `column1`, `column2`, etc., with the columns that define a duplicate record. The statement keeps the row with the smallest `ROWID` in each group and deletes the rest.
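As a worked illustration, the following sketch runs the same pattern against a small throwaway table. The table `dup_demo` and its columns `name` and `email` are hypothetical and exist only for this example.

```sql
-- Hypothetical demo table; names are for illustration only.
CREATE TABLE dup_demo (
  name  VARCHAR2(20),
  email VARCHAR2(50)
);

INSERT INTO dup_demo VALUES ('Ann', 'ann@example.com');
INSERT INTO dup_demo VALUES ('Ann', 'ann@example.com');  -- duplicate of the row above
INSERT INTO dup_demo VALUES ('Bob', 'bob@example.com');

-- Keep the row with the smallest ROWID per (name, email) group.
DELETE FROM dup_demo
WHERE rowid NOT IN (
  SELECT MIN(rowid)
  FROM dup_demo
  GROUP BY name, email
);

SELECT COUNT(*) FROM dup_demo;  -- expected: 2

DROP TABLE dup_demo;
```

One copy of the Ann row is removed, leaving one row each for Ann and Bob.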
2. Self-Join Technique
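The self-join technique compares the table against itself using the criteria that define a duplicate. A row is deleted when another row with the same values exists and has a smaller `ROWID`. Oracle generally rejects a `DELETE` issued directly against a self-join subquery, since no table in it is key-preserved, so the comparison is written as a correlated `EXISTS` against a second alias of the same table.
```sql
DELETE FROM table_name t1
WHERE EXISTS (
  SELECT 1
  FROM table_name t2
  WHERE t1.column1 = t2.column1
    AND t1.column2 = t2.column2
    AND …
    AND t1.rowid > t2.rowid
);
```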
Replace `table_name` with your desired table name and fill in the appropriate column names for `column1`, `column2`, etc.
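To preview which rows this approach would remove, the same comparison can be written as an explicit self-join in a `SELECT`. This is a sketch that assumes two duplicate-defining columns, `column1` and `column2`; the alias `rid` is introduced here for illustration.

```sql
-- Each returned row has at least one matching row with a smaller ROWID,
-- so it is one of the copies that would be deleted.
-- DISTINCT is needed because a row with several earlier copies
-- would otherwise be listed once per match.
SELECT DISTINCT t1.rowid AS rid, t1.column1, t1.column2
FROM table_name t1
JOIN table_name t2
  ON  t1.column1 = t2.column1
  AND t1.column2 = t2.column2
  AND t1.rowid > t2.rowid;
```

Note that `=` never matches `NULL` to `NULL`, so rows with a `NULL` in any compared column are not treated as duplicates by this technique.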
3. DISTINCT Keyword with Temporary Table
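Create a staging table that contains only the distinct records, then truncate the original table and reload it from the staging table. `SELECT DISTINCT *` removes only rows that are identical in every column, and `TRUNCATE` cannot be rolled back, so make sure the staging table is complete before truncating.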
```sql
CREATE TABLE temp_table AS
  SELECT DISTINCT *
  FROM table_name;

TRUNCATE TABLE table_name;

INSERT INTO table_name
  SELECT * FROM temp_table;

DROP TABLE temp_table;
```
Replace `table_name` with your table name, and choose a name for `temp_table` that does not clash with an existing table.
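Because the truncate step is irreversible, a quick sanity check between creating the staging table and truncating the original can be worthwhile. The sketch below assumes `temp_table` has already been created as shown above; the column aliases are illustrative.

```sql
-- Compare row counts before truncating: distinct_rows should be
-- less than or equal to original_rows, and greater than zero
-- if the original table was not empty.
SELECT (SELECT COUNT(*) FROM table_name) AS original_rows,
       (SELECT COUNT(*) FROM temp_table) AS distinct_rows
FROM dual;
```

The difference between the two counts is the number of duplicate rows that will be dropped.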
4. Analytic Functions (RANK, DENSE_RANK or ROW_NUMBER)
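Oracle supports analytic functions like `RANK()`, `DENSE_RANK()`, and `ROW_NUMBER()`, which can tag duplicate records so they can be removed. `ROW_NUMBER()` numbers the rows within each group of duplicates 1, 2, 3, and so on, so every row numbered above 1 is a surplus copy.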
```sql
DELETE FROM table_name
WHERE rowid IN (
  SELECT rid
  FROM (
    SELECT rowid AS rid,
           ROW_NUMBER() OVER (PARTITION BY column1, column2, … ORDER BY rowid) AS rn
    FROM table_name
  )
  WHERE rn > 1
);
```
Replace `table_name` with your table name and list the columns that define a duplicate in the `PARTITION BY` clause.
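One advantage of the analytic approach is that the `ORDER BY` inside the window decides which copy survives. The sketch below assumes the table has a timestamp column, here given the hypothetical name `updated_at`, and keeps the most recently updated row in each group; `column1` and `column2` stand in for the duplicate-defining columns.

```sql
-- Keep the newest row per (column1, column2) group.
-- updated_at is a hypothetical column; rowid is added as a tie-breaker
-- so the numbering is deterministic when timestamps are equal.
DELETE FROM table_name
WHERE rowid IN (
  SELECT rid
  FROM (
    SELECT rowid AS rid,
           ROW_NUMBER() OVER (
             PARTITION BY column1, column2
             ORDER BY updated_at DESC, rowid
           ) AS rn
    FROM table_name
  )
  WHERE rn > 1
);
```

Swapping `DESC` for `ASC` would keep the oldest row instead.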
Conclusion
In this article, we explored four different ways to delete duplicate records in an Oracle database: RowID-based deletion, the self-join technique, a staging table with the DISTINCT keyword, and Oracle’s analytic functions. Always test your queries in a test environment before running them on production data to prevent accidental data loss.
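For the `DELETE`-based methods, one way to rehearse a cleanup is to run it inside a transaction and roll it back after inspecting the result. The following is a sketch using the placeholder names `table_name`, `column1`, and `column2`. It does not apply to the staging-table method, because `CREATE TABLE` and `TRUNCATE` commit implicitly.

```sql
-- Run the delete, but do not commit yet.
DELETE FROM table_name
WHERE rowid NOT IN (
  SELECT MIN(rowid)
  FROM table_name
  GROUP BY column1, column2
);

-- Inspect what would be left.
SELECT COUNT(*) FROM table_name;

-- Undo the delete; issue COMMIT instead once the result looks right.
ROLLBACK;
```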