Published by Cameron Barton. Modified over 8 years ago.
1
Database Refactoring By: Chris Hoover
2
2 Agenda 1. Introduction 2. The Problem 3. The Solution 4. Demo 5. Q&A
3
3 Who Am I? ● Senior database administrator at AWeber Communications. ● DBA since 2000. ● 5 years of PostgreSQL experience.
4
4 Who is AWeber? ● AWeber serves the opt-in email marketing software needs of small businesses worldwide. ● 58,000+ current SMB clients and staff of 43 ● Been in business since 1998 (11 Years)
5
5 The Problem
6
6 The Problem ● Databases with suboptimal design
7
7 The Problem ● Databases with suboptimal design ● Non-normalized relations
8
8 The Problem ● Databases with suboptimal design ● Non-normalized relations ● Inefficient relations
9
9 The Problem ● Databases with suboptimal design ● Management issues
10
10 The Problem ● Databases with suboptimal design ● Management issues ● Return on Investment (ROI)
11
11 The Problem ● Databases with suboptimal design ● Management issues ● Return on Investment (ROI) ● Loss of revenue due to downtime
12
12 The Problem ● Databases with suboptimal design ● Management issues ● Return on Investment (ROI) ● Loss of revenue due to downtime ● Stopping/Slowing new development
13
13 The Problem ● Databases with suboptimal design ● Management issues ● Return on Investment (ROI) ● Loss of revenue due to downtime ● Stopping/Slowing new development ● No “Bells & Whistles” to sell customers on
14
14 The Problem ● Databases with suboptimal design ● Management issues ● Application issues
15
15 The Problem ● Databases with suboptimal design ● Management issues ● Application issues ● Tight coupling of application to database
16
16 The Problem ● Databases with suboptimal design ● Management issues ● Application issues ● Tight coupling of application to database – Changes to relations break the application
17
17 The Problem ● Databases with suboptimal design ● Management issues ● Application issues ● Tight coupling of application to database – Changes to relations break application ● Schema changes require lots of regression testing
18
18 The Problem ● Databases with suboptimal design ● Management issues ● Application issues ● Tight coupling of application to database – Changes to relations break application ● Schema changes require lots of regression testing ● Time requirements for migration of existing data
19
19 The Solution
20
20 The Solution ● Isolate the application from the database! ● HOW???
21
21 The Solution Typical application to database connectivity: Application mailto_addresses
22
22 The Solution Original Table:
CREATE TABLE mailto_addresses (
    user_id integer NOT NULL PRIMARY KEY,
    mailto_1_first_name text,
    mailto_1_middle_name text,
    mailto_1_last_org_name text,
    mailto_1_address_1 text,
    mailto_1_address_2 text,
    mailto_1_city text,
    mailto_1_state char(2),
    mailto_1_zipcode text,
    ...
    mailto_10_state char(2),
    mailto_10_zipcode text
);
23
23 The Solution ● Isolate the relation(s) from direct queries
24
24 The Solution ● Isolate the relation(s) from direct queries ● Rename the table ● ALTER TABLE mailto_addresses RENAME TO mailto_addresses_orig;
25
25 The Solution ● Isolate the relation(s) from direct queries ● Rename the table ● Create a view named as the table
26
26 The Solution ● Isolate the relation(s) from direct queries ● Rename the table ● Create a view named as the table CREATE VIEW mailto_addresses AS SELECT * FROM mailto_addresses_orig;
27
27 The Solution ● Isolate the relation(s) from direct queries ● Rename the table ● Create a view named as the table ● Create the rules needed to handle inserts, updates, and deletes
28
28 The Solution Insert rule:
-- The function must exist before the rule that calls it is created.
CREATE FUNCTION insert_mailto_addresses(...)
RETURNS VOID LANGUAGE plpgsql AS $BODY$
BEGIN
    INSERT INTO mailto_addresses_orig VALUES (in_new.*);
    RETURN;
END;
$BODY$;

CREATE RULE mailto_addresses_insert_rl AS
    ON INSERT TO public.mailto_addresses
    DO INSTEAD SELECT insert_mailto_addresses(new.*);
29
29 The Solution Update rule:
-- The function must exist before the rule that calls it is created.
CREATE FUNCTION update_mailto_addresses(...)
RETURNS VOID LANGUAGE plpgsql AS $BODY$
BEGIN
    -- "(columns)" stands for the full column list of the table.
    UPDATE mailto_addresses_orig
       SET (columns) = (in_new.*)
     WHERE user_id = in_old.user_id;
    RETURN;
END;
$BODY$;

CREATE RULE mailto_addresses_update_rl AS
    ON UPDATE TO public.mailto_addresses
    DO INSTEAD SELECT update_mailto_addresses(new.*, old.*);
30
30 The Solution Delete rule:
-- The function must exist before the rule that calls it is created.
CREATE FUNCTION delete_mailto_addresses(...)
RETURNS VOID LANGUAGE plpgsql AS $BODY$
BEGIN
    DELETE FROM mailto_addresses_orig
     WHERE user_id = in_old.user_id;
    RETURN;
END;
$BODY$;

CREATE RULE mailto_addresses_delete_rl AS
    ON DELETE TO public.mailto_addresses
    DO INSTEAD SELECT delete_mailto_addresses(old.*);
31
31 The Solution ● Isolate the relation(s) from direct queries ● Rename the table ● Create a view named as the table ● Create the rules needed to handle inserts, updates, and deletes ● Perform in a single transaction!
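The steps on this slide can be sketched as one transaction, so the application never sees a moment when mailto_addresses is missing. This is a minimal sketch rather than the presenter's script: the parameter declaration `in_new mailto_addresses` (the view's row type) is an assumption, because the slides elide the function signatures, and only the insert path is shown.

```sql
BEGIN;

-- 1. Move the real table out of the way.
ALTER TABLE mailto_addresses RENAME TO mailto_addresses_orig;

-- 2. Put a view with the old name in its place.
CREATE VIEW mailto_addresses AS
    SELECT * FROM mailto_addresses_orig;

-- 3. Function behind the insert rule.
--    Assumption: it takes one argument of the view's row type.
CREATE FUNCTION insert_mailto_addresses(in_new mailto_addresses)
RETURNS void LANGUAGE plpgsql AS $BODY$
BEGIN
    INSERT INTO mailto_addresses_orig VALUES (in_new.*);
    RETURN;
END;
$BODY$;

-- 4. Redirect inserts on the view to the function.
CREATE RULE mailto_addresses_insert_rl AS
    ON INSERT TO mailto_addresses
    DO INSTEAD SELECT insert_mailto_addresses(new.*);

-- The update and delete functions and rules follow the same pattern.

COMMIT;
```

DDL is transactional in PostgreSQL, so if any statement fails the rename is rolled back along with everything else.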
32
32 The Solution Application mailto_addresses mailto_addresses_orig
33
33 The Solution ● Isolate the relation(s) from direct queries ● Create new schema
34
34 The Solution
CREATE TABLE mailto_addresses_orig (
    user_id integer NOT NULL PRIMARY KEY,
    mailto_1_first_name text,
    mailto_1_middle_name text,
    mailto_1_last_org_name text,
    mailto_1_address_1 text,
    mailto_1_address_2 text,
    mailto_1_city text,
    mailto_1_state char(2),
    mailto_1_zipcode text,
    ...
    mailto_10_state char(2),
    mailto_10_zipcode text
);
CREATE TABLE customer_mailing_addresses (
    user_id integer NOT NULL,
    first_name text,
    middle_name text,
    last_org_name text,
    address_line_1 text,
    address_line_2 text,
    city text,
    state char(2),
    zipcode text,
    sort_key smallint CHECK (sort_key BETWEEN 1 AND 10),
    PRIMARY KEY (user_id, sort_key)
);
35
35 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Update the primary key to use the sort_key!
37
37 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Update the primary key to use the sort_key! ● Constrain sort_key to only allow the max number of rows.
39
39 The Solution Application mailto_addresses mailto_addresses_orig customer_mailing_addresses
40
40 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Alter original relations
41
41 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Alter original relations ● Add migration status column ● ALTER TABLE mailto_addresses_orig ADD COLUMN is_migrated boolean;
42
42 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Alter original relations ● Add migration status column ● is_migrated tells us which row is authoritative. ● NO DEFAULT
43
43 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Alter original relations ● Add migration status column ● is_migrated tells us which row is authoritative. ● NO DEFAULT ● Create index on is_migrated ● CREATE INDEX CONCURRENTLY mao_is_migrated_index on mailto_addresses_orig (is_migrated);
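Because the column is added with no default, every existing row starts with is_migrated set to NULL, and NULL does not compare as "not equal to true". A small sketch of the consequence, using only the table and column named on this slide:

```sql
-- Comparing NULL with <> yields NULL, which a WHERE clause treats as false;
-- IS NOT TRUE treats NULL the same way as false.
SELECT (NULL::boolean <> true)     AS with_not_equal,
       (NULL::boolean IS NOT TRUE) AS with_is_not_true;

-- Rows still served from the old table (flag is NULL or false).
SELECT count(*) AS rows_not_yet_migrated
FROM mailto_addresses_orig
WHERE is_migrated IS NOT TRUE;
```

The first query returns NULL in the first column and true in the second, which is why tests on the flag are safer written with IS NOT TRUE than with <> true.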
44
44 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Alter original relations ● Update functions
45
45 The Solution Update insert_mailto_addresses function: ● Insert data for address 1 into customer_mailing_addresses (required for the new view). ● If data exists for addresses 2-10, insert each one into customer_mailing_addresses. ● If there is no data for a given address (2-10), don't insert it.
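A possible shape for the updated insert function, as a sketch under assumptions: the signature `in_new mailto_addresses` is not given in the slides, and only address 1 is written out because the column names for the other addresses are elided in the source.

```sql
CREATE OR REPLACE FUNCTION insert_mailto_addresses(in_new mailto_addresses)
RETURNS void LANGUAGE plpgsql AS $BODY$
BEGIN
    -- Address 1 is always inserted: the new view is driven by sort_key = 1.
    INSERT INTO customer_mailing_addresses
        (user_id, first_name, middle_name, last_org_name,
         address_line_1, address_line_2, city, state, zipcode, sort_key)
    VALUES
        (in_new.user_id,
         in_new.mailto_1_first_name,
         in_new.mailto_1_middle_name,
         in_new.mailto_1_last_org_name,
         in_new.mailto_1_address_1,
         in_new.mailto_1_address_2,
         in_new.mailto_1_city,
         in_new.mailto_1_state,
         in_new.mailto_1_zipcode,
         1);

    -- Addresses 2-10 would repeat the same INSERT with their own columns and
    -- sort_key, each guarded by a check that the address actually has data.
    RETURN;
END;
$BODY$;
```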
46
46 The Solution Update update_mailto_addresses function: ● Update mailto_addresses_orig, setting is_migrated = true where is_migrated IS NOT TRUE (a plain <> true test would skip rows where the flag is still NULL). ● If a row was found, call insert_mailto_addresses(new.*). ● If not found, check each set of addresses for differences between old and new; where they differ, upsert the new record.
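The update logic might look roughly like this. It is a sketch under assumptions: the two-argument signature mirrors the rule's call order (new, then old) but is not shown in the slides, the new table is taken to be customer_mailing_addresses as defined earlier, and only address 1 is handled in the already-migrated branch.

```sql
CREATE OR REPLACE FUNCTION update_mailto_addresses(
    in_new mailto_addresses,
    in_old mailto_addresses)
RETURNS void LANGUAGE plpgsql AS $BODY$
BEGIN
    -- Try to claim the row for migration; FOUND tells us whether we did.
    UPDATE mailto_addresses_orig
       SET is_migrated = true
     WHERE user_id = in_old.user_id
       AND is_migrated IS NOT TRUE;

    IF FOUND THEN
        -- First touch of this row: write the new values into the new table.
        PERFORM insert_mailto_addresses(in_new);
    ELSIF ROW(in_new.mailto_1_first_name, in_new.mailto_1_middle_name,
              in_new.mailto_1_last_org_name, in_new.mailto_1_address_1,
              in_new.mailto_1_address_2, in_new.mailto_1_city,
              in_new.mailto_1_state, in_new.mailto_1_zipcode)
          IS DISTINCT FROM
          ROW(in_old.mailto_1_first_name, in_old.mailto_1_middle_name,
              in_old.mailto_1_last_org_name, in_old.mailto_1_address_1,
              in_old.mailto_1_address_2, in_old.mailto_1_city,
              in_old.mailto_1_state, in_old.mailto_1_zipcode)
    THEN
        -- Already migrated and address 1 changed: update it in place.
        UPDATE customer_mailing_addresses
           SET first_name     = in_new.mailto_1_first_name,
               middle_name    = in_new.mailto_1_middle_name,
               last_org_name  = in_new.mailto_1_last_org_name,
               address_line_1 = in_new.mailto_1_address_1,
               address_line_2 = in_new.mailto_1_address_2,
               city           = in_new.mailto_1_city,
               state          = in_new.mailto_1_state,
               zipcode        = in_new.mailto_1_zipcode
         WHERE user_id = in_old.user_id
           AND sort_key = 1;
    END IF;
    -- Addresses 2-10 need the same comparison, with an upsert instead of a
    -- plain UPDATE because their rows may not exist yet.
    RETURN;
END;
$BODY$;
```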
47
47 The Solution Update delete_mailto_addresses: ● Delete from mailto_addresses_orig ● Delete from customer_mailing_addresses
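A sketch of the updated delete function, assuming the signature `in_old mailto_addresses` (the slides elide it) and the new table name customer_mailing_addresses from the schema slide:

```sql
CREATE OR REPLACE FUNCTION delete_mailto_addresses(in_old mailto_addresses)
RETURNS void LANGUAGE plpgsql AS $BODY$
BEGIN
    -- Remove the legacy wide row, whether or not it was migrated.
    DELETE FROM mailto_addresses_orig
     WHERE user_id = in_old.user_id;

    -- Remove every normalized address row (sort_key 1-10) for the user.
    DELETE FROM customer_mailing_addresses
     WHERE user_id = in_old.user_id;
    RETURN;
END;
$BODY$;
```

Deleting from both tables keeps a stale copy in the old table from reappearing through the view.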
48
48 The Solution ● Isolate the relation(s) from direct queries ● Create new schema ● Alter original relations ● Update functions ● Create triggers on mailto_addresses_orig
49
49 The Solution Create update trigger on mailto_addresses_orig ● Check that the row has not already been migrated (is_migrated IS NOT TRUE). ● Raise an exception on any attempt to update a row in the old table that has already been migrated.
50
50 The Solution
CREATE FUNCTION update_trigger_func() RETURNS trigger AS $BODY$
BEGIN
    -- Test OLD, not NEW: the update that sets is_migrated = true must still succeed.
    IF old.is_migrated THEN
        RAISE EXCEPTION 'Row migrated.';
    END IF;
    RETURN NEW;
END;
$BODY$ LANGUAGE plpgsql;

CREATE TRIGGER update_trigger
    BEFORE UPDATE ON mailto_addresses_orig
    FOR EACH ROW EXECUTE PROCEDURE update_trigger_func();
51
51 The Solution Remove insert privileges from mailto_addresses_orig!
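Removing the privilege could be done as follows. The role name app_user is hypothetical and stands for whatever role the application connects as.

```sql
-- app_user is a placeholder for the application's database role.
REVOKE INSERT ON mailto_addresses_orig FROM app_user;

-- Check the result; this should now return false for that role,
-- unless INSERT is still granted through PUBLIC or another role.
SELECT has_table_privilege('app_user', 'mailto_addresses_orig', 'INSERT');
```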
52
52 The Solution ● Update the mailto_addresses view to query from both tables.
53
53 The Solution
CREATE OR REPLACE VIEW mailto_addresses AS
SELECT m1.user_id,
       m1.first_name AS mailto_1_first_name,
       m1.middle_name AS mailto_1_middle_name,
       m1.last_org_name AS mailto_1_last_org_name,
       m1.address_line_1 AS mailto_1_address_1,
       m1.address_line_2 AS mailto_1_address_2,
       m1.city AS mailto_1_city,
       m1.state AS mailto_1_state,
       m1.zipcode AS mailto_1_zipcode,
       m2.first_name AS mailto_2_first_name,
       ...
       m10.zipcode AS mailto_10_zipcode
FROM customer_mailing_addresses m1
LEFT JOIN customer_mailing_addresses m2 ON m1.user_id = m2.user_id AND m2.sort_key = 2
LEFT JOIN customer_mailing_addresses m3 ON m1.user_id = m3.user_id AND m3.sort_key = 3
LEFT JOIN customer_mailing_addresses m4 ON m1.user_id = m4.user_id AND m4.sort_key = 4
LEFT JOIN customer_mailing_addresses m5 ON m1.user_id = m5.user_id AND m5.sort_key = 5
LEFT JOIN customer_mailing_addresses m6 ON m1.user_id = m6.user_id AND m6.sort_key = 6
LEFT JOIN customer_mailing_addresses m7 ON m1.user_id = m7.user_id AND m7.sort_key = 7
LEFT JOIN customer_mailing_addresses m8 ON m1.user_id = m8.user_id AND m8.sort_key = 8
LEFT JOIN customer_mailing_addresses m9 ON m1.user_id = m9.user_id AND m9.sort_key = 9
LEFT JOIN customer_mailing_addresses m10 ON m1.user_id = m10.user_id AND m10.sort_key = 10
WHERE m1.sort_key = 1
UNION
-- List the original columns explicitly: SELECT * would also return is_migrated.
SELECT user_id,
       mailto_1_first_name,
       ...
       mailto_10_zipcode
FROM mailto_addresses_orig
WHERE NOT is_migrated OR is_migrated IS NULL;
54
54 The Solution The view now allows us to pull appropriate records from either the old table or the new table. Insert rule normalizes the data and puts it into the new table(s). Update rule migrates the data from the old table and inserts it into the new table, if needed. Otherwise, it just updates the new table(s). Delete rule removes data from both tables.
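One way to check that each user is served from exactly one side of the view is to look for users that are still flagged as unmigrated in the old table but already have an address 1 row in the new table. This is a sketch that uses only the tables and columns from the earlier slides.

```sql
-- Any row returned here would appear twice in the mailto_addresses view:
-- once from the old table and once from the new one.
SELECT o.user_id
FROM mailto_addresses_orig o
JOIN customer_mailing_addresses c
  ON c.user_id = o.user_id
 AND c.sort_key = 1
WHERE o.is_migrated IS NOT TRUE;
```

With the rules and functions working as described, this query is expected to return no rows.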
55
55 The Solution Application mailto_addresses mailto_addresses_orig customer_mailing_addresses
56
56 The Solution Benefits No application changes required
57
57 The Solution Benefits No application changes required Complete data migration not required
58
58 The Solution Benefits No application changes required Complete data migration not required Can be done with system online
59
59 The Solution Benefits No application changes required Complete data migration not required Can be done with system online Problem database areas can be corrected
60
60 The Solution Benefits No application changes required Complete data migration not required Can be done with system online Problem database areas can be corrected Data “self” migrates, although a migration program can be created to speed up the process.
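A migration program could be as simple as a batch of no-op updates issued through the view, so that the update rule does the work. This is a sketch, assuming the update rule and function described earlier are in place; the batch size of 1000 is an arbitrary illustration.

```sql
-- Touch up to 1000 unmigrated users through the view. The update rule calls
-- update_mailto_addresses for each row, which flags the old row as migrated
-- and writes the normalized rows.
UPDATE mailto_addresses
   SET user_id = user_id
 WHERE user_id IN (
        SELECT user_id
        FROM mailto_addresses_orig
        WHERE is_migrated IS NOT TRUE
        ORDER BY user_id
        LIMIT 1000);
```

Repeating the statement until no unmigrated rows remain spreads the extra load over time instead of migrating everything at once.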
61
61 The Solution Caveats/Gotchas Increased CPU/IO load from functions Possibly slower overall query time
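The query-time cost can be measured rather than guessed by comparing the plan for the view with the plan for the old table. A sketch, where user_id 42 is a made-up value:

```sql
-- Lookup through the view (joins on the new table plus the UNION branch).
EXPLAIN ANALYZE
SELECT * FROM mailto_addresses WHERE user_id = 42;

-- The same lookup straight against the old table, for comparison.
EXPLAIN ANALYZE
SELECT * FROM mailto_addresses_orig WHERE user_id = 42;
```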
62
62 DEMO
63
63 Questions?
64
64 Thanks for attending. Contact Information: Chris Hoover chrish@aweber.com www.aweber.com