Database Design for Student Loan Limited


1 Database Design for Student Loan Limited
Chapter 13: Database Design for Student Loan Limited
Moderate-size case in which to apply the knowledge and skills of Parts 1 to 3:
- Based on the loan processing system of a large student loan processor
- Simplified to fit into one chapter: essential ideas are kept, complicating details are omitted
- The real database has more than 150 tables
Objectives:
- Perform view modeling and view integration for a moderate-size case
- Perform schema conversion and normalization
- Estimate workload
- Perform index selection
- Specify the data requirements for applications in a comparable case
For details about the case that could not be shown in slides, see the associated Visio files (ERDs) and the file Chapter13Tables.

2 Outline
Case description:
- Workflow
- Description of forms and reports
Conceptual data modeling:
- View modeling
- Incremental view integration
Logical database design:
- Schema conversion
- Normalization
Physical database design:
- Workload estimation
- Index selection
- Application development: data requirements

3 Case Overview
Guaranteed Student Loans Environment: Lender, Service Provider, Guarantor, Department of Education
Replace existing information system
Guaranteed student loans (GSL):
- Subsidized and unsubsidized: interest is paid during school on subsidized loans
- Multiple GSL loans during school
- Repayment begins after separation from school
Student: receives loan from a lender
Lender: makes loan with repayment guarantee from Department of Education
Service provider: Student Loan Limited
- Usually separate from lender
- Collects payments
- Tracks student status
- Calculates repayment schedules
Guarantor: audits work of the service provider
Department of Education:
- Makes repayment guarantee
- Notifies lenders of loan terms and amounts

4 Loan Processing Workflow
Workflow diagram steps: Apply, Approve loan, Originate loan, Separate from school, Send bill, Make payment, Miss payments, Claim
Apply: students apply to banks
Approve loan: performed by the lender
Loan origination: sent from lender to service provider (Student Loan Limited)
Separation from school: student notified of repayment schedule
Billing:
- Begins about 6 months after separation
- Normally a bill-payment cycle until repayment is complete
- Other processing if payments are missed: eventual claim if repayment is not made

5 Major Documents
Loan origination form:
- Submitted in batch from lenders
- Triggers involvement of service provider
- Student data: address, educational institution
- Loan data: note value and interest rate; lender data
- Disbursement data: dates and amounts
Disclosure letter:
- Sent after separation
- Repayment schedule: amount of payment, first date, number of payments
Statement of account:
- Bills sent every payment period
- Amount and date due
- Balances for each loan
Loan activity report:
- Sent at the end of year
- Summarizes principal and interest payments made

6 Loan Origination Form
Hierarchical form:
- Parent node (main form): student and loan data
- PK of the parent node: LoanNo
- Child node (subform): disbursement data
- Local key in the child node: date

7 Loan Origination ERD
Initial form to analyze because loan origination begins the processing cycle
ERD shows only PKs: see Chapter 13 for other attributes; the complete ERD is available as a Visio file
Loan:
- Center of ERD
- PK of form is the PK of Loan
- Simplify ERD: make Guarantor, Institution, and Lender attributes in Loan
Student:
- Attributes come from parent node
- Requirements state that a student can apply for multiple loans
DisburseLine:
- Derived from child node (subform)
- PK: combination of LoanNo and Date
- Identification dependent on Loan through the Sent relationship

8 Disclosure Letter Structure
- Simple structure: no child nodes
- PK: combination of LoanNo and DateSent
- Some columns can be computed: for example, LastPayDate can be computed from NumPayments and FirstPayDate
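The computed-column remark can be illustrated with a short sketch. It assumes one payment per month, which the slides do not state, and the function name `last_pay_date` is hypothetical.

```python
from datetime import date
import calendar


def last_pay_date(first_pay_date: date, num_payments: int) -> date:
    """Return the date of the final payment.

    Assumption (not from the slides): payments are due once per month.
    """
    if num_payments < 1:
        raise ValueError("num_payments must be at least 1")
    # The first payment falls on first_pay_date, so the last payment
    # is (num_payments - 1) months later.
    months = first_pay_date.month - 1 + (num_payments - 1)
    year = first_pay_date.year + months // 12
    month = months % 12 + 1
    # Clamp the day for short months (for example, the 31st in February).
    day = min(first_pay_date.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


ten_year_end = last_pay_date(date(2024, 1, 15), 120)
```

Because LastPayDate follows from the other two columns, storing it is a design choice rather than a necessity.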

9 Disclosure Letter ERD
Second document to analyze: sent after loan origination and student separation
ERD shows only PKs: see Chapter 13 for other attributes
Complete ERD available as a Visio file

10 Statement Structure
Parent node (main form):
- PK of the parent node: StatementNo
- Contains statement and student data
Child node:
- Local key in the child node: LoanNo
- Contains data about a loan (LoanNo, balance, and rate)

11 Statement ERD
ERD differs slightly from Figure 13.13:
- This figure was drawn with Visio Professional 2000, which does not support M-N relationships
- Weak entity and identifying relationships are used instead of an M-N relationship
- Only PKs are shown to reduce clutter
- Complete ERD available in a Visio document
ERD details:
- Incremental integration after adding the Statement
- Weak entity and identifying relationships capture the subform
- Loan can appear on many statements
- Statement must have at least one loan
- Principal and interest are attributes of StatementLoan

12 Loan Activity Structure
Loan activity report:
- Summarizes all outstanding loans
- One report per student per year
Parent node:
- StudentNo: PK
- Date and other student data as fields
Child node:
- LoanNo: local key
- Summary attributes for the loan as non-key attributes

13 Loan Activity ERD
Incremental integration after adding the LoanActivity report
ERD differs slightly from Figure 13.15:
- This figure was drawn with Visio Professional 2000, which does not support M-N relationships
- Weak entity and identifying relationships are used instead of an M-N relationship
- Only PKs are shown to reduce clutter
- Complete ERD available in a Visio document
Design changes:
- Add LoanActivity entity type
- ReportNo: added as the PK
- Uses an image to store the details of a loan activity report, for archival purposes

14 Schema Conversion Rules
- Entity type rule
- 1-M relationship rule
- M-N relationship rule
- Identification dependency rule
The conversion can be performed using the first four rules (Chapter 6) as depicted in Table 5. The optional 1-M relationship rule (Rule 5) could be applied to the Guarantees relationship. However, the number of loans without guarantors appears small, so the 1-M relationship rule is used instead.
- Use the M-N relationship rule on the Applied relationship in Figure 13.15
- Use the identification dependency rule on the AppliedTo and StatementsApplied relationships in slide 13

15 Schema Conversion Result

16 Normalization
- Student not in BCNF because of Zip FD: Zip → State
- Loan not in BCNF because of RouteNo FD: RouteNo → DisBank
- Institution not in BCNF because of Zip FDs: Zip → City, State
Because most FDs involve a primary key on the left-hand side, there is not much normalization work. However, the Loan, Student, and Institution tables violate BCNF because they have determinants that are not candidate keys. The DiscLetter and LoanReport tables do not violate BCNF because all of their determinants are candidate keys. For the tables violating BCNF, the revised design on the next slide adds tables to remove the redundancies.
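The BCNF test used above (every determinant must be a candidate key) can be sketched with an attribute-closure check. This is a minimal illustration, not the chapter's procedure; the Student column list is a simplified, assumed subset, and only the FD Zip → State comes from the slide.

```python
def closure(attrs, fds):
    """Return the closure of a set of attributes under a list of FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD when its left side is covered and it adds something.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def bcnf_violations(columns, fds):
    """List nontrivial FDs whose determinant does not determine every column."""
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs)                    # skip trivial FDs
        and not set(columns) <= closure(lhs, fds)      # determinant is not a key
    ]


# Assumed, simplified Student table; the real table has more columns.
student_columns = ["StdNo", "Name", "Address", "City", "State", "Zip"]
student_fds = [
    (("StdNo",), ("Name", "Address", "City", "State", "Zip")),
    (("Zip",), ("State",)),
]
violations = bcnf_violations(student_columns, student_fds)
```

Here `violations` contains only the Zip FD, which matches the reason the slide gives for Student not being in BCNF.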

17 Normalized Table Design
In the revised table design above, the ZipCode table and the Bank table are added to remove redundancies.
For most foreign keys, deletions are restricted because the tables on the one and M sides of the relationship are not closely related. For example, deletions are restricted for the foreign key Loan.InstID because the Institution and Loan tables are not closely related. In contrast, deletions cascade for the foreign key DisburseLine.LoanNo because disbursement lines are identification dependent on the related loan. Deletions also cascade for the foreign key Applied.StatementNo because applied rows represent statement lines that have no meaning without the statement.
The update action of most foreign keys was set to cascade to allow easy changing of primary key values.
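The difference between restricted and cascading deletions can be tried out with a small sketch. It uses Python's `sqlite3` module purely for illustration; the case itself does not name a DBMS, and the column lists (including `DisburseDate` and `Amount`) are assumed simplifications of the real tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
CREATE TABLE Institution (InstID TEXT PRIMARY KEY, Name TEXT);
CREATE TABLE Loan (
    LoanNo TEXT PRIMARY KEY,
    -- Institution and Loan are not closely related: restrict deletions.
    InstID TEXT NOT NULL REFERENCES Institution (InstID)
        ON DELETE RESTRICT ON UPDATE CASCADE
);
CREATE TABLE DisburseLine (
    -- Disbursement lines are identification dependent on Loan: cascade.
    LoanNo TEXT NOT NULL REFERENCES Loan (LoanNo)
        ON DELETE CASCADE ON UPDATE CASCADE,
    DisburseDate TEXT NOT NULL,   -- assumed column name
    Amount REAL NOT NULL,         -- assumed column name
    PRIMARY KEY (LoanNo, DisburseDate)
);
INSERT INTO Institution VALUES ('I1', 'Example University');
INSERT INTO Loan VALUES ('L1', 'I1');
INSERT INTO DisburseLine VALUES ('L1', '2024-08-15', 5000);
INSERT INTO DisburseLine VALUES ('L1', '2025-01-15', 5000);
""")


def try_delete_institution(inst_id):
    """Return True if the institution was deleted, False if the FK blocked it."""
    try:
        conn.execute("DELETE FROM Institution WHERE InstID = ?", (inst_id,))
        return True
    except sqlite3.IntegrityError:
        return False


institution_deleted = try_delete_institution("I1")   # blocked by Loan.InstID
conn.execute("DELETE FROM Loan WHERE LoanNo = 'L1'")  # cascades to DisburseLine
lines_left = conn.execute("SELECT COUNT(*) FROM DisburseLine").fetchone()[0]
```

Deleting the institution is rejected while a loan references it, whereas deleting the loan silently removes its disbursement lines.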

18 Physical Database Design
- Application profiles: tables, conditions, parameter values, and frequencies
- Table profiles: estimated number of rows and distribution of values
- Index selection: clustering and nonclustering indexes
- Derived data and denormalization
- Other implementation considerations
Use application and table profiles to make decisions about index selection, derived data, and denormalization.
Other considerations: transition to a new system

19 Application Profiles
Shows profiles for the loan origination form: see the chapter for other details
Three separate applications are associated with the Loan Origination Form. Verifying data involves retrievals to ensure that the student, lender, institution, and guarantor exist. If a student does not exist, a new row is added. Creating a loan involves inserting a row in the Loan table and multiple rows in the DisburseLine table.

20 Application Frequencies
Subset of application frequencies: see chapter 11 for full details
For more detail, define the distribution over time of application frequencies
To make physical database design decisions, the relative importance of applications must be specified. The frequencies in the above table assume 100,000 new loans per year and 100,000 students in repayment per year. The loan origination applications and the statement of account applications dominate the workload.
The coarse frequencies (per year) are sufficient to indicate the relative importance of applications. A finer specification (e.g., by month or day) may be needed to schedule work, for example to arrange batch processing instead of on-line processing. Applications involving loan origination forms, for instance, may be processed in batch instead of on-line.

21 Table Profiles
A subset of table profiles is shown: for complete details see Chapter 11
More detail: describe distribution of column values and joint distributions
The volume of modification activity (inserts, updates, deletes) can help in the estimation of table profiles. In addition, you should use statistics from existing systems and interviews with key application personnel to help make the estimates. The table provides an overview of the profiles. More detail about column distributions and relationship distributions can be added after the system is partially populated.

22 Index Selections
Rules refer to Chapter 10
Subset of index selection decisions: see Table 11.11
- Indexes on the primary keys of the Student, Lender, Guarantor, Institution, DiscLetter, LoanActivity, Statement, and Bank tables support the verify loan, display disclosure letter, display activity report, and display statement of account applications.
- A nonclustering index on student name may be a good choice to support retrieval of statements of account and the loan activity reports.
- To support joins, nonclustering indexes on foreign keys Loan.StdNo, Statement.StdNo, Applied.LoanNo, and Applied.StatementNo may be useful. For example, an index on Loan.StdNo facilitates joining the Student and the Loan tables when given a specific StdNo value.
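The Loan.StdNo example can be checked with a query plan. The sketch below uses Python's `sqlite3` module as a stand-in DBMS, which is an assumption; the index name `LoanStdNoIdx` and the trimmed column lists are hypothetical. SQLite does not distinguish clustering from nonclustering indexes the way the chapter does, so this only shows that the join can use the foreign key index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Student (StdNo TEXT PRIMARY KEY, Name TEXT);
CREATE TABLE Loan (
    LoanNo TEXT PRIMARY KEY,
    StdNo TEXT REFERENCES Student (StdNo),
    Balance REAL
);
-- Secondary index on the foreign key to support the Student-Loan join.
CREATE INDEX LoanStdNoIdx ON Loan (StdNo);
""")

query = """
SELECT Student.Name, Loan.LoanNo, Loan.Balance
FROM Student JOIN Loan ON Loan.StdNo = Student.StdNo
WHERE Student.StdNo = ?
"""

# Each plan row ends with a human-readable description of one access step.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("S1",)).fetchall()
plan_text = " | ".join(row[3] for row in plan)
print(plan_text)
```

The exact plan wording varies by SQLite version, but the step for Loan should name the index rather than a full table scan.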

23 Derived Data and Denormalization Decisions
Derived data:
- Loan.NoteValue
- DiscLetter and LoanActivity tables have derived data in the image columns.
Denormalization:
- LenderNo and Lender.Name in the Loan table violates BCNF, but it may reduce joins between the Loan and the Lender tables
There are some derived data in the revised table design. The NoteValue column in the Loan table can be derived from columns in related rows of the DisburseLine table. The DiscLetter and the LoanActivity tables have lots of derived data in the Image columns. In all of these cases, the derived data seem justified because of the difficulty of computing it.
The usage of the database should be monitored carefully to determine whether the Loan table should be denormalized by adding name columns in addition to the LenderNo, GuarantorNo, InstID, and RouteNo columns. If performance can be significantly improved, denormalization is a good idea because the Lender, Guarantor, Institution, and Bank tables are relatively static.
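Stored derived data should be checked against its source from time to time. The sketch below assumes, as a simplification not stated in the slides, that NoteValue equals the sum of the disbursement amounts; the `Amount` and `DisburseDate` column names and the sample rows are hypothetical. It uses Python's `sqlite3` module for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Loan (LoanNo TEXT PRIMARY KEY, NoteValue REAL);
CREATE TABLE DisburseLine (
    LoanNo TEXT REFERENCES Loan (LoanNo),
    DisburseDate TEXT,
    Amount REAL,
    PRIMARY KEY (LoanNo, DisburseDate)
);
INSERT INTO Loan VALUES ('L1', 10000), ('L2', 5000);
INSERT INTO DisburseLine VALUES
    ('L1', '2024-08-15', 5000),
    ('L1', '2025-01-15', 5000),
    ('L2', '2024-08-15', 4000);
""")

# Recompute the derived value and report loans whose stored value disagrees.
mismatches = conn.execute("""
SELECT Loan.LoanNo, Loan.NoteValue,
       COALESCE(SUM(DisburseLine.Amount), 0) AS DerivedValue
FROM Loan LEFT JOIN DisburseLine ON DisburseLine.LoanNo = Loan.LoanNo
GROUP BY Loan.LoanNo, Loan.NoteValue
HAVING Loan.NoteValue <> COALESCE(SUM(DisburseLine.Amount), 0)
""").fetchall()
```

In this sample data only loan L2 is reported, because its stored NoteValue of 5000 does not match the 4000 disbursed.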

24 Other Implementation Issues
- Processing volumes in a new system can be much larger than in the old system
- Poor quality of old data may cause many rejections in the conversion process
- Size of image data
There are many implementation issues in converting from an old system to a new system.
- Smooth conversion from the old system to the new system is an important issue. One impediment to smooth conversion is processing volumes. Sometimes processing volumes in a new system can be much larger than in the old system. One way to alleviate potential performance problems is to execute the old and the new systems in parallel with more work shifted to the new system over time.
- An important part of the conversion process involves the old data. Converting the old data to the new format is not usually difficult except for data quality concerns. Sometimes, the poor quality of old data causes many rejections in the conversion process. The conversion process needs to be sensitive to rejecting poor-quality data because rejections can require extensive manual corrections.
- The size of the image data (loan activity reports and disclosure letters) can impact the performance of the database. Archival of the image data can improve performance for images that are infrequently retrieved.

25 Application Development Notes
- Provides cross check on quality of database design
- Data requirements for forms and reports: loan origination form, loan activity report
- Derived data maintenance: AFTER ROW trigger for Loan.Balance
For details, see Chapter 13 and the file Chapter13Tables
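A row-level trigger for maintaining Loan.Balance might look like the following sketch. The chapter's actual trigger is not shown in the slides, so everything here is an assumption: that the trigger fires on inserts into the Applied table, that it reduces the balance by the principal applied, and the trigger name `AppliedAfterInsert`. It is written in SQLite syntax through Python's `sqlite3` module, which differs from the trigger syntax of other DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Loan (LoanNo TEXT PRIMARY KEY, Balance REAL NOT NULL);
CREATE TABLE Applied (
    StatementNo TEXT NOT NULL,
    LoanNo TEXT NOT NULL REFERENCES Loan (LoanNo),
    Principal REAL NOT NULL,
    Interest REAL NOT NULL,
    PRIMARY KEY (StatementNo, LoanNo)
);
-- Hypothetical trigger: after each applied payment row is inserted,
-- reduce the stored balance of the related loan by the principal paid.
CREATE TRIGGER AppliedAfterInsert
AFTER INSERT ON Applied
FOR EACH ROW
BEGIN
    UPDATE Loan SET Balance = Balance - NEW.Principal
    WHERE LoanNo = NEW.LoanNo;
END;
INSERT INTO Loan VALUES ('L1', 10000);
""")

conn.execute("INSERT INTO Applied VALUES ('S1', 'L1', 150, 50)")
balance = conn.execute(
    "SELECT Balance FROM Loan WHERE LoanNo = 'L1'"
).fetchone()[0]
```

Only the principal portion changes the balance in this sketch; the interest portion is recorded in Applied but does not reduce what is owed.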

26 Summary
- Case includes a significant subset of student loan processing.
- Solution depicts models for database development phases.
- Next step: database development for a real organization
- Open-ended, unclear, and changing requirements are challenges.
This chapter presented a moderate-size case study as a capstone of the database development process. The Student Loan Limited case described a significant subset of commercial student loan processing including accepting loans from lenders, notifying students of repayment, billing and processing payments, and reporting loan status. The solution depicted models and documentation produced in the conceptual modeling, logical database design, and physical database design phases.
This case, although presenting you with a larger, more integrated problem than the other chapters in Part 2, is still not comparable to performing database development for a real organization. For a real organization, requirements are often open-ended, unclear, and changing. Deciding on the database boundary and modifying the database design in response to requirement changes are crucial to long-term success. Monitoring the operation of the database allows you to improve performance as dictated by database usage. These challenges make database development a stimulating intellectual activity.

