CSE255 Midterm Exam Statistics (Fa07)


1 CSE255 Midterm Exam Statistics (Fa07)
Notes:
Final Exam 120 points
MT - range from 40-50%; FE - range from 60-50%
Track Performance from MT to Final Exam
Differential from MT to Final:
If Increase - Weighted Average
If Stay Same - that is your Performance
If Decrease - that is your Performance
Consider P1 points if Close to next range
Worry
Worry a lot!

2 Problem 1a - Relational Algebra (Fa07)
List the name and address of all vendors that provide both hardware and software.
BOTHV = π_{HVendorID, SVendorID}(σ_{HWFlag=T and SWFlag=T}(Vendor))
ANS = π_{HVName, HVAddr}(HardwareVendor *_{HVendorID} BOTHV) ∪ π_{SVName, SVAddr}(SoftwareVendor *_{SVendorID} BOTHV)

3 Problem 1b - Relational Algebra (Fa07)
List the versions and vendor names of all C++ software installed on computers made by the vendor DEC.
COMPV = π_{VendorID, CInventNum}(Computer *_{CInventNum=InventNum} Inventory)
HWVEND = π_{HVendorID, CInventNum}(COMPV *_{VendorID} Vendor)
DEC = π_{CInventNum}(σ_{HVName=DEC}(HWVEND *_{HVendorID} HardwareVendor))
DECSW = π_{SInventNum}(DEC *_{CInventNum} InstalledSW)
DEC++SW = π_{SVendorID, SWVersion}(σ_{SWName=C++}(DECSW *_{SInventNum} Software))
ANS = π_{SWName, SWVersion}(DEC++SW *_{SVendorID} SoftwareVendor)

4 Problem 1c - Relational Algebra (Fa07)
List all purchase orders from the vendor Dell that have been ordered but not delivered.
DELL = π_{VendorID}(σ_{HVName=DELL}(Vendor *_{HVendorID} HardwareVendor))
ANS = π_{PONum, PODate, POCost}(σ_{DeliveredDate=Null}(DELL *_{VendorID} Inventory))
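The final selection tests DeliveredDate against Null. As a minimal sketch of how the same query might look in SQL, the Python snippet below runs it against an in-memory SQLite database. The table layouts follow the schemas quoted in Problems 3 and 4 of this exam, cut down to the columns the query needs; the sample rows and the function name `undelivered_orders` are invented for illustration. In SQL the test has to be written `IS NULL`, because `= NULL` never evaluates to true.

```python
import sqlite3

def undelivered_orders(vendor_name):
    """Return (PONum, PODate, POCost) for orders from vendor_name with no delivery date."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Vendor(VendorID, HWFlag, HWVendorID, SWFlag, SWVendorID);
        CREATE TABLE HardwareVendor(HVendorID, HVName);
        CREATE TABLE Inventory(InvenNum, PONum, PODate, DeliveredDate, POCost, VendorID);
        -- Hypothetical sample data, not taken from the exam.
        INSERT INTO Vendor VALUES (1, 'T', 'H1', 'F', NULL), (2, 'T', 'H2', 'F', NULL);
        INSERT INTO HardwareVendor VALUES ('H1', 'DELL'), ('H2', 'DEC');
        INSERT INTO Inventory VALUES
            (100, 'PO-1', '2007-09-01', '2007-09-15', 900, 1),
            (101, 'PO-2', '2007-10-01', NULL, 1200, 1),
            (102, 'PO-3', '2007-10-02', NULL, 700, 2);
    """)
    rows = con.execute("""
        SELECT i.PONum, i.PODate, i.POCost
        FROM Vendor AS v
        JOIN HardwareVendor AS h ON h.HVendorID = v.HWVendorID
        JOIN Inventory AS i ON i.VendorID = v.VendorID
        WHERE h.HVName = ? AND i.DeliveredDate IS NULL
        ORDER BY i.PONum
    """, (vendor_name,)).fetchall()
    con.close()
    return rows
```

Calling `undelivered_orders("DELL")` on this sample data returns only the one Dell order whose DeliveredDate is still null.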

5 Problem 2 - ER Specialization (Fa07)
Each Specialization was worth 5 pts – deductions for:
Not indicating Total
Not supplying attributes
Most Popular Answers:
Disjoint: Inventory with Computer, Accessory, and Software as Children
Overlap: Vendor with HWVendor and SWVendor as Children
Union: DeployedPC as the union (u) of Computer, Accessory, and Software [diagram with 1 and m cardinality labels]

6 Problem 3- Functional Dependencies (Fa07)
Computer(CInventNum, ComputerName, ComputerType, AccID);
CInventNum → ComputerName, ComputerType
CInventNum, AccID → ComputerName, ComputerType
Inventory(InvenNum, SerialNum, PONum, PODate, DeliveredDate, POCost, VendorID);
PONum → InvenNum, SerialNum, PODate, DeliveredDate, etc.
InvenNum → SerialNum (or some other one not involving PO)
SoftwareVendor(SVendorID, SVName, SVAddr, SWName, SWVersion, SWDesc);
SVendorID → SVName, SVAddr
SVendorID, SVName, SWName, SWVersion → SWDesc, SVAddr
SVendorID   SWName (first two basically equivalent)
SVName   SWName
SWName   SWVersion
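Dependencies like the ones listed for Computer can be checked mechanically with the standard attribute-closure algorithm. The sketch below is illustrative only: the function names `closure` and `is_superkey` are made up here, and the FDs are the two Computer dependencies from this slide.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) attribute sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire an FD when its left side is covered and it adds something new.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

def is_superkey(attrs, relation, fds):
    """True when attrs functionally determines every attribute of relation."""
    return closure(attrs, fds) == set(relation)

COMPUTER = {"CInventNum", "ComputerName", "ComputerType", "AccID"}
COMPUTER_FDS = [
    ({"CInventNum"}, {"ComputerName", "ComputerType"}),
    ({"CInventNum", "AccID"}, {"ComputerName", "ComputerType"}),
]
```

With these FDs, `is_superkey({"CInventNum", "AccID"}, COMPUTER, COMPUTER_FDS)` is true while `is_superkey({"CInventNum"}, COMPUTER, COMPUTER_FDS)` is false, since nothing determines AccID. That is why AccID ends up as part of the key.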

7 Prob 4 – Relational Schema Analysis (Fa07)
Computer(CInventNum, ComputerName, ComputerType, AccID);
Insert - For a new computer, you must always insert an accessory (since it is part of the key). If there are N accessories, there are N rows for each computer.
Update - If you change the Name or Type, you must change all N tuples.
Delete - No obvious delete anomalies.
Conclusion: Computer represents two different entities (Guideline 1), the computer and its accessories, and as a result violates Guideline 2 in regard to insert anomalies. A better design would separate accessories in a similar manner to the InstalledSoftware table.
Accessory(AccID, AInventNum, HVendorID, AccName, AccType, AccSize);
From an inventory control perspective, there is no way to track the total number of each accessory that has been purchased. You may have 10 USB 120 Gig external hard drives, and each one would have its own AccID and AInventNum. The other problem is related to Guideline 3 due to null values for AccSize (limited problem).
Insert, Delete, and Update: No obvious anomalies.
Conclusion: The table is OK, but it could be improved by separating out the different types of accessories (that have been purchased). It may also make sense not to track these in gory detail at all; many companies (UConn included) don’t track equipment that costs less than $1000, and many of these accessories fit into that category.

8 Prob 4 – Relational Schema Analysis (Fa07)
Software(SInventNum, SVendorID, SWName, SWVersion);
The only real problem in this table is that SVendorID, SWName, and SWVersion are foreign keys into the SoftwareVendor table, and as a result, this information is replicated in both tables.
Conclusion: There may be a better way to design the Software, InstalledSoftware, and SoftwareVendor tables, particularly in regard to reducing the key size (and hence the foreign key linkages).
Inventory(InvenNum, SerialNum, PONum, PODate, DeliveredDate, POCost, VendorID);
This table violates two guidelines: Guideline 1 in regard to representing two different entities (inventory and purchase orders), and Guideline 3 in regard to an excessive number of null values.
Insert, Delete, and Update: No obvious anomalies.
Conclusion: Split into two different tables, Inventory(InvenNum, SerialNum, PONum) and PurchaseOrder(PONum, PODate, DeliveredDate, POCost, VendorID). This addresses Guideline 1 and does not create an Inventory tuple until the item is actually received. DeliveredDate will still be null for all outstanding orders.

9 Prob 4 – Relational Schema Analysis (Fa07)
InstalledSoftware(CInventNum, SWInventNum);
Vendor(VendorID, HWFlag, HWVendorID, SWFlag, SWVendorID);
InstalledSoftware holds two foreign key references, to the Computer and Software tables respectively. Vendor allows us to unify the different ID tracking systems for software and hardware vendors. The only problem with Vendor is that there are potentially null values for companies that sell either hardware or software but not both. You could also argue that the flags are not needed in Vendor, since the null (or not-null) values carry this information.
Conclusion: In the case of Vendor (and VendorID, SVendorID, HVendorID), this may be a poor design, and if the database has not been deployed, it may make sense to totally redesign this identifier so that there is a single identifier. This would allow the current Vendor table to be eliminated and the common vendor information to be placed in a single Vendor(VendorID, VName, VAddr). This would eliminate the null value (Guideline 3) problem of Vendor. Thus, changes to Vendor would impact both the HardwareVendor and SoftwareVendor tables.

10 Prob 4 – Relational Schema Analysis (Fa07)
HardwareVendor(HVendorID, HVName, HVAddr, ModelNum, ModelName, ModelDescr);
SoftwareVendor(SVendorID, SVName, SVAddr, SWName, SWVersion, SWDesc);
Both tables have insert anomalies (can’t insert HWV or SWV without inserting a product), delete anomalies (if you delete the last item for a vendor you delete the vendor), and update anomalies (if you change an address or a name, as in a buyout, you need to change multiple tuples). Guideline 2 is a real issue in this regard.
Both tables represent two different entities: the contact information for a vendor (id, name, and address) and the vendor's products. This violates Guideline 1, which asks that a relation represent only a single entity. As a result, the keys are convoluted: you need a ModelNum for HardwareVendor and a compound key for SoftwareVendor, which makes the foreign key references more complicated.
Conclusion: As mentioned, redesign the tables Vendor, HWVendor, and SWVendor to pull out their commonalities and unify the identifier. Use Vendor as defined on the previous slide, and use VendorID in the HWVendor and SWVendor tables while dropping Name and Addr from those tables.

11 Problem 5- Normalization (Fa07)
INVOICE(OrderID, OrderDate, CustID, CustName, CustAddr, ProdID, ProdDesc, UnitPrice, OrderedQuantity)
A. FULL: {OrderID, ProdID} → OrderedQuantity
B. PART: OrderID → {OrderDate, CustID, CustName, CustAddr}
C. TRANS: CustID → {CustName, CustAddr}
D. PART: ProdID → {ProdDesc, UnitPrice}
Remove Partial Dependencies:
ORDER_LINE(OrderID, ProdID, OrderedQuantity)
PRODUCT(ProdID, ProdDesc, UnitPrice)
CUST_ORDER(OrderID, OrderDate, CustID, CustName, CustAddr)
Remove Transitive Dependency in CUST_ORDER:
ORDER_LINE(OrderID, ProdID, OrderedQuantity)
PRODUCT(ProdID, ProdDesc, UnitPrice)
ORDER(OrderID, OrderDate, CustID)
CUSTOMER(CustID, CustName, CustAddr)
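One way to convince yourself that this decomposition loses no information is to project a sample INVOICE table onto the four final relations and join them back. The sketch below does that with Python's built-in sqlite3 module. The sample rows are invented, the function name `decompose_and_rejoin` is illustrative, and the ORDER relation is named CUST_ORDER in the code only because ORDER is a reserved word in SQL.

```python
import sqlite3

COLUMNS = ("OrderID, OrderDate, CustID, CustName, CustAddr, "
           "ProdID, ProdDesc, UnitPrice, OrderedQuantity")

# Hypothetical INVOICE rows; every value is invented for illustration.
INVOICE_ROWS = [
    (1, "2007-10-01", 10, "Ann", "1 Elm St", 100, "Widget", 2.5, 4),
    (1, "2007-10-01", 10, "Ann", "1 Elm St", 101, "Gadget", 7.0, 1),
    (2, "2007-10-03", 11, "Bob", "9 Oak Ave", 100, "Widget", 2.5, 2),
    (3, "2007-10-05", 10, "Ann", "1 Elm St", 101, "Gadget", 7.0, 5),
]

def decompose_and_rejoin(rows):
    """Project INVOICE onto the 3NF relations, then join them back together."""
    con = sqlite3.connect(":memory:")
    con.execute(f"CREATE TABLE INVOICE({COLUMNS})")
    con.executemany("INSERT INTO INVOICE VALUES (?,?,?,?,?,?,?,?,?)", rows)
    con.executescript("""
        CREATE TABLE ORDER_LINE AS
            SELECT DISTINCT OrderID, ProdID, OrderedQuantity FROM INVOICE;
        CREATE TABLE PRODUCT AS
            SELECT DISTINCT ProdID, ProdDesc, UnitPrice FROM INVOICE;
        CREATE TABLE CUST_ORDER AS
            SELECT DISTINCT OrderID, OrderDate, CustID FROM INVOICE;
        CREATE TABLE CUSTOMER AS
            SELECT DISTINCT CustID, CustName, CustAddr FROM INVOICE;
    """)
    tables = ("ORDER_LINE", "PRODUCT", "CUST_ORDER", "CUSTOMER")
    sizes = {t: con.execute(f"SELECT COUNT(*) FROM {t}").fetchone()[0]
             for t in tables}
    rejoined = con.execute("""
        SELECT o.OrderID, o.OrderDate, o.CustID, c.CustName, c.CustAddr,
               l.ProdID, p.ProdDesc, p.UnitPrice, l.OrderedQuantity
        FROM ORDER_LINE AS l
        JOIN CUST_ORDER AS o ON o.OrderID = l.OrderID
        JOIN CUSTOMER AS c ON c.CustID = o.CustID
        JOIN PRODUCT AS p ON p.ProdID = l.ProdID
    """).fetchall()
    con.close()
    return sorted(rejoined), sizes
```

On the sample data the rejoined rows equal the original INVOICE rows, while customer and product details are each stored once per customer or product instead of once per order line.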

12 CSE255 Midterm Exam Statistics (Fa04)
Notes:
Final Exam 120 points
MT - range from 40-50%; FE - range from 60-50%
Track Performance from MT to Final Exam
Differential from MT to Final:
If Increase - Weighted Average
If Stay Same - that is your Performance
If Decrease - that is your Performance
Worry
Worry a lot!

13 Problem 1- Relational to EER (Fa04)
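[EER diagram. Recoverable labels: Person (PersonID, Lname, Fname, StartYear) with an overlapping (o) specialization into Player (NumYears, UniformNum) and Coach (EndYear); Record (Wins, Losses) with a disjoint (d) specialization into RSRecord and PORecord; Team (TeamID, Year, Squad); relationships ROSTERS, STATISTICS (PPG, RPG, APG), and TITLES (TitleType); cardinality labels 1, m, n, k.]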

14 Problem 2 - EER to Relational (Fa04)
Step 1 - Strong Entities
Item(UPC, Name, Wcost, Rcost, …)
DailySales(Date, Total)
CustomerOrder(AcctNum, Total)
Step 8A - Option A
DeliItem(UPC, Weight, Costlb, Increment)
Step 7 - m-n-k
Sales(ItemUPC, DeliItemUPC, Date)
Step 7 - m-n-k - collapse DeliOrder and Order
Order(AcctNum, ItemUPC, DeliItemUPC)

15 Problem 3a- Functional Dependencies (Fa04)
BOOKAUTHORS(BookId, SSN, LastName, FirstName, […]);
BookID, SSN → Lname, Fname, […]
SSN → Lname, Fname, […]
[…] → Lname, FName
BORROWER(CardNo, SSN, LastName, FirstName, Address, Phone);
CardNo → Lname, Fname, etc.
SSN → Lname, Fname, etc.
CardNo, SSN → Lname, Fname, etc.
LIBRARYBRANCH(BranchId, BranchName, Address);
BranchID → Bname, Addr
Other Single FDs

16 Problem 3a - Functional Dependencies(Fa04)
BOOKS(BookId, Title, PubName, PubAddress, PubPhone); Title   BookID PubName   Title BookID Others Possible BOOKLOANS(BookId, BranchId, CardNo, DateOut, DueDate); BookID   BranchID BranchID   BookID CardNo   BookID CardNo   Dateout, DueDate BookID   CardNo Others Possible BOOKCOPIES(BookId, BranchId, NoOfCopies); BookID   NoCopies BranchID   BookID BookID   BranchID

17 Problem 4a - Relational Algebra (Fa04)
Find the names of all TV directors (first and last) who were (are) also movie actors, who directed a TVShow after their first movie role.
TVD = π_{PersonID, ShowID, StartYear}(TVDirectors *_{ShowID} TVShows)
MVA = π_{PersonID, ShowID, Year}(MovieRoles *_{ShowID} Movies)
TVDMVA = π_{PersonID}(σ_{StartYear>Year}(TVD *_{PersonID} MVA))
ANS4a = π_{LName, FName}(Person *_{PersonID} TVDMVA)

18 Problem 4b - Relational Algebra (Fa04)
Find the Movie names and gross revenues, and names of all Movie directors (first and last) who had roles in at least 3 TV episodes.
X = π_{M.PersonID, TVR.PersonID, EpisodeID}(MovieDirs *_{PersonID} TVRoles)
MVD = π_{PersonID}(σ_{MD.PID=TVD.PID and Count(EpisodeID(*))>2}(X))
DIRS = π_{ShowID, LName, FName}(Person *_{PersonID} (MVD *_{PersonID} MovieDirectors))
ANS4b = π_{MovieName, Gross, LName, FName}(DIRS *_{ShowID} Movies)

19 Problem 5a - SQL Queries (Fa04)
Find the names (first and last) and role names (first and last) for all actors who had their first movie role before their first TV role.
LOTS OF WAYS TO SOLVE THIS PROBLEM. I LOOKED FOR:
1. Use of Person, Roles, TVShows, TVRoles, Movies, MovieRoles Tables.
2. Join Conditions to Link all of the Tables
3. Checking FirstRole = True
4. Comparing StartYear to Year
I WAS PRETTY GENEROUS IN GRADING THIS PROBLEM.

20 Problem 5b - SQL Queries (Fa04)
Find the show names and names of all TV directors (first and last) who have directed at least 4 different TV episodes for which they have won emmys, sorted and grouped by TV show name.
SELECT ShowName, LName, FName
FROM Person AS P, TVShows AS TVS, TVDirectors AS TVD
WHERE TVD.EmmyFlag = True
AND TVS.ShowID = TVD.ShowID
AND TVD.PersonID = P.PersonID
GROUP BY ShowName, P.PersonID, LName, FName
HAVING COUNT(DISTINCT EpisodeID) >= 4
ORDER BY ShowName;
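The "at least 4 different episodes" condition is a per-group filter: rows are grouped by show and director, and HAVING keeps only the groups with enough distinct Emmy-winning episodes. Here is a minimal runnable sketch using SQLite from Python. The column layout of TVDirectors, the sample people and shows, and the function name `emmy_directors` are all assumptions made for the example.

```python
import sqlite3

def emmy_directors(min_episodes=4):
    """Return (ShowName, LName, FName) for directors with enough Emmy-winning episodes."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Person(PersonID, LName, FName);
        CREATE TABLE TVShows(ShowID, ShowName);
        CREATE TABLE TVDirectors(PersonID, ShowID, EpisodeID, EmmyFlag);
        -- Hypothetical sample data.
        INSERT INTO Person VALUES (1, 'Doe', 'Jane'), (2, 'Smith', 'Pat');
        INSERT INTO TVShows VALUES ('S1', 'ShowA'), ('S2', 'ShowB');
        INSERT INTO TVDirectors VALUES
            (1, 'S1', 'E1', 1), (1, 'S1', 'E2', 1), (1, 'S1', 'E3', 1), (1, 'S1', 'E4', 1),
            (1, 'S2', 'E1', 1),
            (2, 'S2', 'E1', 1), (2, 'S2', 'E2', 1), (2, 'S2', 'E3', 1), (2, 'S2', 'E4', 0);
    """)
    rows = con.execute("""
        SELECT s.ShowName, p.LName, p.FName
        FROM Person AS p
        JOIN TVDirectors AS d ON d.PersonID = p.PersonID
        JOIN TVShows AS s ON s.ShowID = d.ShowID
        WHERE d.EmmyFlag = 1
        GROUP BY s.ShowName, p.PersonID, p.LName, p.FName
        HAVING COUNT(DISTINCT d.EpisodeID) >= ?
        ORDER BY s.ShowName
    """, (min_episodes,)).fetchall()
    con.close()
    return rows
```

In the sample data Pat Smith directed four episodes of ShowB but won for only three, so the WHERE clause removes the fourth row before grouping and the director is excluded at the default threshold.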

21 CSE255 Midterm Exam Statistics (Fa03)
Note: I have Included but Have Hidden Project Grades. Find Yourself Based on HW and Exam Grades. Worry

22 Problems 1-3 - Relational Tables (Fa03)

23 Problem 1 - ER Specialization (Fa03)
Each Specialization was worth 5 pts:
3 pts for the entities
1 pt for having attributes
1 pt for having assumptions
Basically, the answers were good for disjoint and overlapping, and were somewhat lacking (absent) for unions/categories.
Some unions were:
Directors (TV and Movie)
Productions (TVShows and Movies)

24 Problem 2a - Relational Algebra (Fa03)
List the full name of an actor that has been a Star in a Movie and a Guest-Star in a TVShow.
STAR = π_{PersonID}(MovieRoles *_{RoleID, ShowID} (σ_{RoleType=Star}(Roles)))
GSTAR = π_{PersonID}(TVRoles *_{RoleID, ShowID} (σ_{RoleType=Guest-Star}(Roles)))
ANS2a = π_{LName, FName}(Person *_{PersonID} (STAR * GSTAR))

25 Problem 2b - Relational Algebra (Fa03)
List the full name and the movie name for all directors who have won an Oscar for a movie that grossed less money than it cost to make.
MVS = π_{MovieName, ShowID}(σ_{Gross < Cost}(Movies))
DIRS = MVS *_{ShowID} (σ_{OscarFlag=True}(MovieDirectors))
ANS2b = π_{MovieName, LName, FName}(DIRS *_{PersonID} Person)

26 Problem 3a - SQL Queries (Fa03)
List the full name of all actors who have never been directors.
First SELECT - All Actors
NOT IN SELECT - All Directors
SELECT DISTINCT LName, FName
FROM Person AS P, TVRoles AS TR, MovieRoles AS MR
WHERE ((TR.PersonID = P.PersonID) OR (MR.PersonID = P.PersonID))
AND P.PersonID NOT IN
(SELECT PersonID FROM TVDirectors
UNION
SELECT PersonID FROM MovieDirectors)
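The "all actors minus all directors" idea can be tried out on a tiny database. The sketch below is one possible formulation, not the graded answer: it builds each side with a UNION over the TV and movie tables and filters with IN / NOT IN. The single-column table layouts, the sample people, and the function name `actors_never_directors` are assumptions made for the example.

```python
import sqlite3

def actors_never_directors():
    """Return (LName, FName) of people with an acting role and no directing credit."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Person(PersonID, LName, FName);
        CREATE TABLE TVRoles(PersonID);
        CREATE TABLE MovieRoles(PersonID);
        CREATE TABLE TVDirectors(PersonID);
        CREATE TABLE MovieDirectors(PersonID);
        -- Hypothetical sample data.
        INSERT INTO Person VALUES
            (1, 'Adams', 'Amy'), (2, 'Brown', 'Bob'), (3, 'Clark', 'Cy'), (4, 'Diaz', 'Dee');
        INSERT INTO TVRoles VALUES (1), (3);
        INSERT INTO MovieRoles VALUES (2), (3);
        INSERT INTO TVDirectors VALUES (4);
        INSERT INTO MovieDirectors VALUES (2);
    """)
    rows = con.execute("""
        SELECT LName, FName
        FROM Person
        WHERE PersonID IN (SELECT PersonID FROM TVRoles
                           UNION
                           SELECT PersonID FROM MovieRoles)
          AND PersonID NOT IN (SELECT PersonID FROM TVDirectors
                               UNION
                               SELECT PersonID FROM MovieDirectors)
        ORDER BY LName
    """).fetchall()
    con.close()
    return rows
```

One caveat: if either director table could contain a NULL PersonID, NOT IN would return no rows at all; a correlated NOT EXISTS avoids that problem.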

27 Problem 3b - SQL Queries (Fa03)
List the TV show name and StartYear for all TV shows that have been on TV for at least 5 years and that have the episode name "The Pilot".
SELECT ShowName, StartYear
FROM Episodes AS E, TVShows AS T
WHERE E.Ename = 'The Pilot'
AND T.NumSeasons >= 5
AND E.ShowID = T.ShowID

28 Problem 3c - SQL Queries (Fa03)
List the full name of Actors that have won at least one Oscar and at least one Emmy.
Version 1:
SELECT LName, FName
FROM Person
WHERE PersonID IN
(SELECT T.PersonID
FROM TVRoles AS T, MovieRoles AS M
WHERE T.PersonID = M.PersonID
AND T.EmmyFlag = True AND M.OscarFlag = True)
Version 2:
SELECT DISTINCT LName, FName
FROM Person AS P, TVRoles AS T, MovieRoles AS M
WHERE T.PersonID = M.PersonID
AND P.PersonID = M.PersonID
AND T.EmmyFlag = True AND M.OscarFlag = True

29 Problem 3d - SQL Queries (Fa03)
List the role name (first and last) of all Movies and TV Shows that "Steve Martin" has had a role in.
Version 1:
SELECT RLName, RFName
FROM Roles
WHERE RoleID IN
(SELECT T.RoleID
FROM TVRoles AS T, Person AS P
WHERE P.LName = 'Martin' AND P.FName = 'Steve'
AND P.PersonID = T.PersonID
UNION
SELECT M.RoleID
FROM MovieRoles AS M, Person AS P
WHERE P.LName = 'Martin' AND P.FName = 'Steve'
AND P.PersonID = M.PersonID)
Version 2:
SELECT DISTINCT RLName, RFName
FROM Roles AS R, TVRoles AS T, MovieRoles AS M, Person AS P
WHERE P.LName = 'Martin' AND P.FName = 'Steve'
AND ((P.PersonID = T.PersonID AND R.RoleID = T.RoleID)
OR (P.PersonID = M.PersonID AND R.RoleID = M.RoleID))

30 Problem 4 - Update Anomalies (Fa03)
No anomalies in either table - they each represent a single concept with some nulls and no repetitive values.
Insert - For a new episode, the role type for a given RLName/RFName/ShowID must be consistent with the earlier value. When a new episode is inserted, all data must be replicated exactly for all roles.
Update - When changing the role name (first/last), RoleType, PersonID, etc., multiple tuples must be modified. Values of TVFlag and EmmyFlag, and MFlag/OscarFlag, must all be consistent with one another whenever there is a change.
Delete - No obvious delete anomalies.

31 Problem 4 - Update Anomalies (Fa03)
Insert - For new episodes, exact values for ShowName, StartYear, NumSeasons, ShowID must be consistent with earlier episodes of the same show. Can’t insert a Show without an Episode, nor an Episode without a Show.
Update - Changing ShowName/StartYear, NumSeasons, NumEmmy, or ShowID affects all tuples for all Episodes.
Delete - If you delete the last episode, you remove the information on the show.
Insert - No strongly obvious anomalies. However, since EpisodeID is “faked” for movies (set to E1), this is a potential problem.
Update - Values of TVFlag and EmmyFlag, and MFlag/OscarFlag, must all be consistent with one another whenever there is a change. For movies, updates that set EpisodeID to any value except E1 must be prohibited.
Delete - No obvious anomalies.

32 Problem 5a- Functional Dependencies (Fa03)
RLName, RFName, ShowID, EpisodeID, TVFlag → EmmyFlag
RLName, RFName, ShowID, MFlag → OscarFlag
RoleID, EpisodeID, TVFlag → EmmyFlag; RoleID, EpisodeID → OscarFlag
RLName, RFName, ShowID, EpisodeID, RoleID, PersonID → RoleType, TVFlag, MFlag
EpisodeID, RoleID → PersonID, RoleType
RoleID → RLName, RFName, ShowID, EpisodeID, RoleID, PersonID, RoleType, TVFlag, MFlag
RLName, RFName, ShowID → RoleID
ShowName, StartYear → NumSeasons, ShowID, NumEmmy
ShowName, StartYear → ShowID
ShowID → ShowName, StartYear
ShowID, EpisodeID → EName, Edescr
or ShowName, StartYear, EpisodeID → EName, Edescr

33 Problem 5a - Functional Dependencies(Fa03)
PersonID, ShowID, EpisodeId, TVFlag → EmmyFlag
PersonID, ShowID, MFlag → OscarFlag
PersonID, ShowID, EpisodeId → TVFlag, MFlag
or
PersonID, ShowID, EpisodeId → TVFlag
PersonID, ShowID → MFlag

34 Problem 5b - Functional Dependencies(Fa03)
Showname, Startyear  EpisodID, ShowId EName  ShowId ShowID  EpisodeID ShowName  Numseasons, StartYear, ShowID, NumEmmy EpisodeID  Ename, Edescr ShowName  EpisodeID, Ename, Edesc PersonID  ShowID PersondID  EpisodeID PersonID, ShowID  EpisodeID PersonID  ShowID, EpisodeID (combo of first two)

35 CSE255 Midterm Exam Statistics (Sp03)
Out of 75: All Problems are Counted and Problem 4’s Bonus
Worry
Out of 65: Only 1a,b and 2a,b with Problem 4’s Bonus
Worry
I will take your maximum percentage of both versions

36 CSE255 Midterm Exam Solution (Sp03)
Problems 1 and 2 - Relational Tables

37 Problem 1a - Relational Algebra (Sp03)
List the full names of all mens players that have/had the uniform number 5.
PL = π_{PLName, PFName}(σ_{UniformNumber=5}(PLAYER))
MT = π_{TeamID}(σ_{Squad='Mens'}(TEAM))
ANS1a = π_{PLName, PFName}(PL * (MT * ROSTERS))

38 Problem 1b - Relational Algebra (Sp03)
List the full names and points per game of all mens players that averaged more than 20 points per game and at least 5 rebounds per game in one year on teams with winning regular season records.
MT = π_{TeamID}(σ_{Squad='Mens'}(TEAM))
WR = π_{TeamID}(σ_{Wins > Losses}(RSRECORD * MT))
PL = σ_{PPG > 20 and RPG ≥ 5}(STATISTICS * WR)
ANS1b = π_{PLName, PFName, PPG}(PL * PLAYER)

39 Problem 1c - Relational Algebra (Sp03)
List the full names and uniform numbers of all womens players that played for Geno Auriemma in 1996.
WT = π_{TeamID}(σ_{Squad='Womens' and Year=1996}(TEAM))
GAP = σ_{CLName='Auriemma'}(ROSTERS * WT)
ANS1c = π_{PLName, PFName, UniformNumber}(PLAYER * GAP)

40 Version 1 - Problem 2a - SQL Queries (Sp03)
Find the coaches (full names) of all teams that won a NCAA title without winning a BigEastRS (Regular Season) title.
SELECT C.CLName, CFName
FROM COACH AS C, ROSTERS AS R
WHERE C.CLName = R.CLName
AND EXISTS
(SELECT * FROM TITLES AS T
WHERE TitleType = 'NCAA' AND T.TeamID = R.TeamID)
AND NOT EXISTS
(SELECT * FROM TITLES AS T
WHERE TitleType = 'BigEastRS' AND T.TeamID = R.TeamID)

41 Version 2 - Problem 2a - SQL Queries (Sp03)
Find the coaches (full names) of all teams that won a NCAA title without winning a BigEastRS (Regular Season) title.
SELECT C.CLName, CFName
FROM COACH AS C, ROSTERS AS R, TITLES AS T
WHERE C.CLName = R.CLName
AND T.TeamID = R.TeamID
AND T.TitleType = 'NCAA'
AND T.TeamID NOT IN
(SELECT TeamID FROM TITLES AS S
WHERE S.TitleType = 'BigEastRS')

42 Version 3 - Problem 2a - SQL Queries (Sp03)
Find the coaches (full names) of all teams that won a NCAA title without winning a BigEastRS (Regular Season) title.
SELECT C.CLName, CFName
FROM COACH AS C, ROSTERS AS R, TITLES AS T
WHERE C.CLName = R.CLName
AND T.TeamID = R.TeamID
AND T.TitleType = 'NCAA'
AND NOT EXISTS
(SELECT * FROM TITLES AS S
WHERE S.TitleType = 'BigEastRS' AND S.TeamID = T.TeamID)
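The correlated NOT EXISTS form can be exercised on a small example. The sketch below is a hedged illustration using SQLite from Python: the column layouts of COACH, ROSTERS, and TITLES are assumptions (ROSTERS is taken to hold one row per player on a team), the teams and coaches are invented, and `ncaa_without_bigeast` is a made-up function name. DISTINCT is added because each roster row would otherwise repeat the coach.

```python
import sqlite3

def ncaa_without_bigeast():
    """Return coaches of teams with an NCAA title and no BigEastRS title."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE COACH(CLName, CFName);
        CREATE TABLE ROSTERS(TeamID, PLName, CLName);
        CREATE TABLE TITLES(TeamID, TitleType);
        -- Hypothetical sample data.
        INSERT INTO COACH VALUES ('Alpha', 'Al'), ('Beta', 'Bea'), ('Gamma', 'Gus');
        INSERT INTO ROSTERS VALUES
            ('T1', 'P1', 'Alpha'), ('T1', 'P2', 'Alpha'),
            ('T2', 'P3', 'Beta'), ('T3', 'P4', 'Gamma');
        INSERT INTO TITLES VALUES
            ('T1', 'NCAA'),
            ('T2', 'NCAA'), ('T2', 'BigEastRS'),
            ('T3', 'BigEastRS');
    """)
    rows = con.execute("""
        SELECT DISTINCT c.CLName, c.CFName
        FROM COACH AS c, ROSTERS AS r, TITLES AS t
        WHERE c.CLName = r.CLName
          AND t.TeamID = r.TeamID
          AND t.TitleType = 'NCAA'
          AND NOT EXISTS (SELECT * FROM TITLES AS s
                          WHERE s.TitleType = 'BigEastRS'
                            AND s.TeamID = t.TeamID)
        ORDER BY c.CLName
    """).fetchall()
    con.close()
    return rows
```

Only team T1 has an NCAA title and no BigEastRS title, so only its coach is returned, and only once despite the two roster rows.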

43 Problem 2b - SQL Queries (Sp03)
Find the average PPG, RPG, and APG for their entire careers, for all mens players (full names) who started their careers during the 1980s.
SELECT P.PLName, PFName, AVG(PPG), AVG(RPG), AVG(APG)
FROM TEAM AS T, PLAYER AS P, STATISTICS AS S
WHERE P.StartYear > 1979 AND P.StartYear < 1990
AND P.PLName = S.PLName
AND T.Squad = 'Mens'
AND T.TeamID = S.TeamID
GROUP BY P.PLName, PFName

44 Problem 2c - SQL Queries (Sp03)
Find the total wins of all coaches (full names) for their entire careers.
SELECT C.CLName, CFName, SUM(X.Wins + Y.Wins)
FROM COACH AS C, ROSTERS AS R, RSRECORD AS X, PORECORD AS Y
WHERE C.CLName = R.CLName
AND R.TeamID = X.TeamID
AND R.TeamID = Y.TeamID
GROUP BY C.CLName, CFName
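A career total is a SUM over all of a coach's teams, so the rows have to be grouped per coach. The sketch below tries this on a small SQLite database from Python. The column layouts are assumptions; in particular ROSTERS is assumed to hold one row per player, so the sketch first reduces it to distinct (TeamID, CLName) pairs to avoid counting a team's wins once per player. The data and the function name `career_wins` are invented.

```python
import sqlite3

def career_wins():
    """Return (CLName, CFName, total wins) summed over regular season and playoffs."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE COACH(CLName, CFName);
        CREATE TABLE ROSTERS(TeamID, PLName, CLName);
        CREATE TABLE RSRECORD(TeamID, Wins, Losses);
        CREATE TABLE PORECORD(TeamID, Wins, Losses);
        -- Hypothetical sample data.
        INSERT INTO COACH VALUES ('Alpha', 'Al'), ('Beta', 'Bea');
        INSERT INTO ROSTERS VALUES
            ('T1', 'P1', 'Alpha'), ('T1', 'P2', 'Alpha'),
            ('T2', 'P1', 'Alpha'), ('T3', 'P3', 'Beta');
        INSERT INTO RSRECORD VALUES ('T1', 20, 10), ('T2', 25, 5), ('T3', 10, 20);
        INSERT INTO PORECORD VALUES ('T1', 5, 1), ('T2', 6, 0), ('T3', 0, 1);
    """)
    rows = con.execute("""
        SELECT c.CLName, c.CFName, SUM(x.Wins + y.Wins)
        FROM COACH AS c
        JOIN (SELECT DISTINCT TeamID, CLName FROM ROSTERS) AS r
             ON r.CLName = c.CLName
        JOIN RSRECORD AS x ON x.TeamID = r.TeamID
        JOIN PORECORD AS y ON y.TeamID = r.TeamID
        GROUP BY c.CLName, c.CFName
        ORDER BY c.CLName
    """).fetchall()
    con.close()
    return rows
```

Coach Alpha's total is (20 + 5) + (25 + 6) = 56; without the DISTINCT step, team T1 would be counted twice because it has two roster rows.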

45 Problem 3 - Update Anomalies (Sp03)
No Modify Anomaly - only one entry per player/coach (unique LNames).
Insert Anomaly - A player in the past can’t be a coach in the future. If a player or coach leaves for 1 (or more) years and then returns, there is no way to store his/her return.
No Delete Anomaly - only one entry per player/coach (unique LNames).
Note: Lots of null values due to capturing two types of people.

46 Problem 3 - Update Anomalies (Sp03)
There are numerous problems for this table, since many values are replicated. For example, teams that win multiple titles (NCAA and BigEastRS) must have each player and coach listed twice to capture this data. If you insert a player for a past team (that was omitted), you would have to make sure you inserted the player for all Titles of TeamID. Specifically:
Insert: Can’t have a Team without having a player. Can’t have a TitleType unless there is a Team with that title.
Delete: Delete the last Player on a team and you lose the team.
Modify: Changing PLName impacts all TeamIDs for the player.
Basically - I looked for reasoning and a solid argument for this table.

47 Problem 3 - Update Anomalies (Sp03)
Modify Anomaly - Whenever RSWins or RSLosses is modified (for a win or a loss), TTLWins or TTLLosses must be incremented.
No Insert Anomaly - only one entry per team.
No Delete Anomaly - no information is lost on anything but the team.
No Modify Anomaly - PPG, RPG, and APG are independent of one another and no values in common across different players.
No Insert Anomaly - only one entry per player.
No Delete Anomaly - no information is lost on anything but the player.

48 Problem 4 - Functional Dependencies (Sp03)
LName  FName, StartYear LName, PFlag  NumYears, UniformNumber LName, CFlag  EndYear Year, Squad  CLName CLName  TitleType TeamID  Year, Squad PLName  TitleType CLName  TeamId, Year, Squad TeamID, Year, Squad  TitleType TeamID, Year, Squad  PLName

49 Problem 4 - Functional Dependencies (Sp03)
TeamID  RSWins, RSLosses, TTLWins, TTLLosses PLName, TeamID  PPG, RPG, APG PLName   PPG, RPG, APG PLName   TeamID TeamID   PLName

50 Problem 5 (Sp03)
[ER diagram with labels: WRITES, Author, SSN, FirstName, LastName, Person]

