Presentation is loading. Please wait.

Presentation is loading. Please wait.

Deleting LOTS of Data Author: Randy Cunningham, OCP.

Similar presentations


Presentation on theme: "Deleting LOTS of Data Author: Randy Cunningham, OCP."— Presentation transcript:

1 Deleting LOTS of Data Author: Randy Cunningham, OCP

2 Why Use DELETE? Retention policies; good data hygiene Data de-duplication Corporate mergers & acquisitions Purge and archival processing Cleaning out temporary & work tables Reclaim storage space Obtain better performance?…

3 What Are The Issues? DELETE never finishes DELETE takes too long Locking Impact on application performance Desired space not released by DELETE UNDO space & errors

4 What Happens During a DELETE 4 X

5 5Redo X X 1 X 23 MV Log 6 4Undo Insert…

6 Alternatives to DELETE TRUNCATE TABLE CTAS, truncate and reinsert data to keep Use partitioning: Truncate partition Drop partition Exchange partition DELETE confined to one, or a few, partitions

7 TRUNCATE TABLE Upside: typically many times faster than using an unqualified DELETE operation Downsides: Does not operate in transactional context (it’s DDL) You can’t rollback a TRUNCATE Cannot truncate parent rows in a R.I. Constraint It is DDL and requires DROP TABLE privileges

8 Delegating TRUNCATE rights CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS BEGIN EXECUTE IMMEDIATE 'TRUNCATE TABLE '||fqtn; END trunc_do; GRANT EXECUTE ON trunc_do TO PUBLIC;

9 Delegating TRUNCATE rights CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS BEGIN EXECUTE IMMEDIATE 'TRUNCATE TABLE '||fqtn; END trunc_do; GRANT EXECUTE ON trunc_do TO PUBLIC;

10 Delegating TRUNCATE rights CREATE PROCEDURE trunc_do (fqtn VARCHAR2) IS BEGIN EXECUTE IMMEDIATE 'TRUNCATE TABLE '||fqtn; END trunc_do; CREATE PROCEDURE trunc (owner IN VARCHAR2 DEFAULT USER, table_name IN VARCHAR) AUTHID CURRENT_USER IS fqtn VARCHAR2(80) := owner || '.' || table_name; BEGIN EXECUTE IMMEDIATE -- Ensure DELETE rights 'DELETE FROM ' || fqtn || ' WHERE 1=2'; trunc_do (fqtn); END trunc; GRANT EXECUTE ON trunc TO PUBLIC;

11 Create Table as Select… Use where most of the data are deleted Good for sidestepping UNDO issues Provides quick & simple fault resilience Downside: requires intermediate storage

12 Create Table as Select: Example CREATE TABLE student_temp AS SELECT * FROM student WHERE status_flag IN ('A','N','T') OR enroll_date >= to_date (’02/15/2006’, ‘MM/DD/YYYY’); TRUNCATE TABLE student; INSERT /*+APPEND*/ INTO student SELECT * FROM student_temp; DROP TABLE student_temp;

13 Create Table as Select: Observations Run steps manually, or provide thorough error handling All limitations of TRUNCATE TABLE apply Ensure there is enough space for the CTAS temporary holding table Be sure to use APPEND hint in the INSERT Does not work for tables with LONG data type (use SQL*Plus COPY command instead)

14 Table Partitioning DROP PARTITION – ideal for temporal data TRUNCATE PARTITION – for cyclic data EXCHANGE PARTITION – for archival DELETE … [PARTITION] – a full scan of the partition is quicker than a full table scan

15 Dropping a Table Partition Quickly eliminates an entire range or list Example: ALTER TABLE enrollment DROP PARTITION Year2001 UPDATE GLOBAL INDEXES; 20022003200420052006

16 Truncating a Table Partition Quick, easy way to manage cyclic partitions Example: ALTER TABLE enrollment TRUNCATE PARTITION mar_data UPDATE GLOBAL INDEXES; Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec X

17 Exchanging a Table Partition Quick way to excise data from the table, but retain it for archival or backup: CREATE TABLE enrolled03_2005 AS SELECT * FROM enrollment WHERE 1=2; ALTER TABLE enrollment EXCHANGE PARTITION mar_data WITH TABLE enrolled03_2005 UPDATE GLOBAL INDEXES;

18 DELETE on a Table Partition If the predicate filters the partition key, then only specific partitions are scanned. In this case, it is not necessary to specify PARTITION in the DELETE command. Look for these in the EXPLAIN PLAN: PARTITION RANGE SINGLE PARTITION RANGE ITERATOR

19 If DELETE You Must… Check the EXPLAIN PLAN for the DELETE! DELETE in batches Get indexes out of the way Use ROWID to minimize interference Use ROWID to implement parallelism

20 Check EXPLAIN PLAN for DELETE If most rows are being deleted, you will want to see a full table scan. If most blocks are being visited, you will want to see a full table scan. Optimize subquery form DELETE to perform well as a stand-alone SELECT.

21 DELETE in batches Delete a million (or so) rows at a time: DELETE FROM evt WHERE status = 'X' AND rownum <= 1000000; Delete an unrelated partition at a time: DELETE FROM evt PARTITION div162 WHERE status = 'X';

22 Get Indexes Out of the Way Old way: Drop the indexes and build them from scratch following deletion New way: Alter the indexes unusable and rebuild them following deletion: ALTER INDEX u_name_ix UNUSABLE ; DELETE FROM names WHERE … ALTER INDEX u_name_ix REBUILD ;

23 Before You Tamper With Indexes… Be certain an index isn’t needed to facilitate the access path for your DELETE Ensure that a significant proportion of the rows in the table are affected Test to be sure that it is worthwhile Be sure that other database operations are not relying on the index (best done during a scheduled maintenance window)

24 ROWID is your friend Isolates query operation from DELETE Minimizes number of block changes Is blazingly fast and efficient Overcomes key preserved table restrictions Can facilitate DELETE workload mgmt: Restartability Home-brew parallelization Batching

25 ROWID Case Study 1 We have identified extraneous rows: SELECT * FROM ( SELECT domain_owner, latest_uce, MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest FROM spammers) WHERE latest_uce < very_latest

26 ROWID Case Study 1 DELETE FROM ( SELECT * FROM ( SELECT domain_owner, latest_uce, MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest FROM spammers) WHERE latest_uce < very_latest)

27 ROWID Case Study 1 Result: ORA-01752: cannot delete from view without exactly one key- preserved table

28 ROWID Case Study 1 DELETE FROM spammers WHERE ROWID IN ( SELECT ROWID FROM ( SELECT domain_owner, latest_uce, MAX(latest_uce) OVER (PARTITION BY domain_owner) very_latest FROM spammers) WHERE latest_uce < very_latest)

29 ROWID Case Study 2 - Deduplication DELETE FROM Customers WHERE ROWID IN ( SELECT DISTINCT ROWID FROM (SELECT ROWID, MIN (ROWID) OVER (PARTITION BY Cust_Last_Name, Cust_First_name) Best FROM Customers) WHERE ROWID <> Best);

30 ROWID Case Study 3 - Staging CREATE GLOBAL TEMPORARY TABLE Rowid_Tbl (Xrowid ROWID) ON COMMIT PRESERVE ROWS; INSERT INTO Rowid_Tbl SELECT DISTINCT ROWID FROM (SELECT ROWID, … FROM Huge_Table … ); DELETE FROM Huge_Table WHERE ROWID IN (SELECT Xrowid FROM Rowid_Tbl) ;

31 ROWID Case Study 4 – Workload DEFINE N=8 -- # buckets desired CREATE TABLE Delete_Driver (rid ROWID, pctile NUMBER, PRIMARY KEY (rid)); INSERT /*+APPEND*/ INTO Delete_Driver SELECT ROWID rid, ntile(&N) OVER (ORDER BY ROWID) pctile FROM Warranty WHERE Warranty_Expiration < TO_DATE('01/01/2002','MM/DD/YYYY');

32 ROWID Case Study 4 - Workload Each batch, whether run in serial or parallel, operates this way: Define Job=4 -- set between 1 and 8 DELETE FROM (SELECT NULL FROM Warranty, Delete_Driver WHERE Warranty.ROWID = Delete_Driver.rid AND pctile=&Job); COMMIT;

33 ROWID Caveats Not for use in portable SQL… it is specific to Oracle Don’t use stale ROWIDs… don’t keep them around permanently Not for use where rows migrate constantly

34 Questions?

35 Questions 35


Download ppt "Deleting LOTS of Data Author: Randy Cunningham, OCP."

Similar presentations


Ads by Google