CpSc 3220 File and Database Processing
Lecture 1: Course Overview and File Storage Basics
Course Introduction
- Syllabus and tentative schedule are on Blackboard
- The course covers two main areas:
  - File Processing
  - Database Processing
Course Outcomes
1. Implement file reading and writing programs using PHP.
2. Identify file access schemes, including:
   - sequential file access
   - direct file access
   - indexed sequential file access
3. Describe file-sorting and file-searching techniques.
4. Describe data compression and encryption techniques.
5. Design a relational database using E-R modeling techniques.
6. Build a relational database.
7. Write database queries using SQL.
8. Implement a web-based relational database using MySQL.
File Processing Concepts
- A study of how data is stored and maintained on secondary storage
- Different ways of categorizing files:
  - Data files: numeric and character data
  - Text
  - Binary
  - Graphics / Audio / Video
  - Unstructured / Structured
- We will focus on structured data files
File Structures
- File structures are persistent data structures
- Files are composed of records
- Records are composed of fields
- Files can be viewed as tables:
  - File -> Table
  - Record -> Row
  - Field -> Column
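The file/record/field view can be sketched in a few lines of Python. This is only an illustration: the sample rows are hypothetical, the comma-separated layout is an assumption, and `read_records` is a made-up helper name.

```python
import csv
import io

# Hypothetical sample data (not from the course files): the first line names
# the fields, and every following line is one record.
SAMPLE = "ID,Name,DeptName,Salary\n1,Ada,Math,50000\n2,Bob,Art,40000\n"

def read_records(text_file):
    """Read a comma-separated file into a list of records (dicts)."""
    return list(csv.DictReader(text_file))

records = read_records(io.StringIO(SAMPLE))   # file   -> table
for record in records:                        # record -> row
    print(record["Name"], record["Salary"])   # field  -> column
```

Each record comes back as a dictionary keyed by field name, which mirrors the row/column view of a table.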
Instructor File
ID     Name        DeptName    Salary
10101  Srinivasan  Comp. Sci.
       Wu          Finance
       Mozart      Music
       Einstein    Physics
       El Said     History
       Gold        Physics
       Katz        Comp. Sci.
       Califieri   History
       Singh       Finance
       Crick       Biology
       Brandt      Comp. Sci.
       Kim         Elec. Eng.  80000
Department File
DeptName    Building  Budget
Biology     Watson
Comp. Sci.  Taylor
Elec. Eng.  Taylor
Finance     Painter
History     Painter
Music       Packard
Physics     Watson    70000
The CRUD paradigm
- Open the current version of a file
- Process it using the CRUD operations:
  - Create records
  - Retrieve records
  - Update records
  - Delete records
- Output and close the new version of the file
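A minimal sketch of the CRUD cycle, assuming records live in a comma-separated file keyed by an ID field. The field names follow the Instructor file; the function names and the CSV layout are illustrative assumptions, not the course's prescribed approach.

```python
import csv

# Field names borrowed from the Instructor file; the CSV layout is an assumption.
FIELDS = ["ID", "Name", "DeptName", "Salary"]

def load(path):
    """Open the current version of the file and index its records by ID."""
    with open(path, newline="") as f:
        return {row["ID"]: row for row in csv.DictReader(f)}

def save(path, records):
    """Write out the new version of the file."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(records.values())

def create(records, record):
    """Add a new record; refuse to overwrite an existing ID."""
    if record["ID"] in records:
        raise KeyError("duplicate ID: " + record["ID"])
    records[record["ID"]] = record

def retrieve(records, rec_id):
    """Return the record with this ID, or None if it does not exist."""
    return records.get(rec_id)

def update(records, rec_id, **changes):
    """Change one or more fields of an existing record."""
    records[rec_id].update(changes)

def delete(records, rec_id):
    """Remove the record with this ID."""
    del records[rec_id]
```

A typical session would call `load`, apply any mix of the four operations to the in-memory dictionary, and finish with `save`. Holding the whole file in memory is a simplification that only suits small files.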
Aside: A Process for Generating Acronyms
Step 1. Choose a group of words or phrases that identify your process and let their first letters become the acronym:
- Create records
- Retrieve records
- Update records
- Delete records
If that doesn’t give an acceptable acronym, go to step 2
Generating Acronyms
Step 2. Re-order the words so that their first letters make a better acronym:
- Create records
- Update records
- Retrieve records
- Delete records
If that doesn’t work, go to step 3
Generating Acronyms
Step 3. Find synonyms for one or more of the words so that their first letters will make a good acronym. For example:
- Update becomes Change records
- Delete becomes Remove records
- Retrieve becomes Access records
- Create becomes Produce new records
If that doesn’t work, go to step 4
Generating Acronyms Step 4 Give up and get back to serious work
Physical Storage Media
- Speed
- Cost
- Reliability
- Type:
  - volatile storage
  - non-volatile storage
Physical Storage Media
- Cache: fastest and most costly form of storage; volatile
- Main memory: fast access (10s to 100s of nanoseconds); expensive; volatile
- Flash memory: moderately fast; cheap; non-volatile
- Magnetic disk: slow; cheap; non-volatile
- Optical storage: slower; cheaper; non-volatile
- Tape storage: slow access/fast transfer; cheap; non-volatile
Storage Hierarchy
- Primary storage: fastest media but volatile (cache, main memory)
- Secondary storage: non-volatile, moderately fast access time; also called on-line storage (flash memory, magnetic disks)
- Tertiary storage: non-volatile, slow access time; also called off-line storage (magnetic tape, optical storage)
Magnetic Hard Disk Mechanism
Performance Measures
- Access time: the time it takes from when a read or write request is issued to when data transfer begins
  - Seek time: time to reposition the arm over the correct track; 4 to 10 milliseconds on typical disks
  - Rotational latency: time for the addressed sector to appear under the head; 4 to 11 milliseconds on typical disks (5400 to rpm)
- Data-transfer rate: the rate at which data can be retrieved from or stored to the disk; 25 to 100 MB per second max rate
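These measures combine into a rough estimate of how long one read takes. The sketch below is a simplified model, not a vendor formula: the function names are made up, 1 MB is assumed to be 10^6 bytes, and the average rotational latency is taken as half a revolution. One full revolution at 5400 rpm takes about 11.1 ms, which is in line with the upper latency figure on the slide.

```python
def rotation_time_ms(rpm):
    """Time for one full revolution of the platter, in milliseconds."""
    return 60_000 / rpm

def average_rotational_latency_ms(rpm):
    """On average the addressed sector is half a revolution away."""
    return rotation_time_ms(rpm) / 2

def transfer_time_ms(num_bytes, rate_mb_per_s):
    """Time to move num_bytes at the given rate (assumes 1 MB = 10**6 bytes)."""
    return num_bytes / (rate_mb_per_s * 1_000_000) * 1000

def estimated_read_ms(num_bytes, seek_ms, rpm, rate_mb_per_s):
    """Access time (seek + rotational latency) plus transfer time."""
    return (seek_ms
            + average_rotational_latency_ms(rpm)
            + transfer_time_ms(num_bytes, rate_mb_per_s))

# Example with values inside the ranges quoted on the slide.
print(estimated_read_ms(4096, seek_ms=8, rpm=5400, rate_mb_per_s=50))
```

For a small read, seek time and rotational latency dominate; the transfer itself accounts for well under a millisecond.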
Disk-Block Access
- Block: a contiguous sequence of sectors from a single track
  - the smallest amount of data that can be accessed
  - sizes range from 512 bytes to several kilobytes
(Figure labels: inner track, outer track)
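Block-oriented access can be imitated at the application level by treating a file as a sequence of fixed-size blocks. In this sketch `BLOCK_SIZE`, `blocks_needed`, and `read_block` are illustrative names, and 512 bytes is simply the smallest block size mentioned above; the operating system, not this code, performs the real disk-block I/O.

```python
import io

BLOCK_SIZE = 512  # bytes; an assumed value, the smallest size mentioned above

def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of whole blocks required to hold file_size bytes."""
    return -(-file_size // block_size)  # ceiling division

def read_block(f, block_number, block_size=BLOCK_SIZE):
    """Jump directly to block block_number (0-based) and read it."""
    f.seek(block_number * block_size)
    return f.read(block_size)

# Usage on an in-memory "file" of 1280 bytes: two full blocks plus a partial one.
data = bytes(range(256)) * 5
f = io.BytesIO(data)
print(blocks_needed(len(data)))   # number of blocks occupied
print(len(read_block(f, 2)))      # the last block is only partly filled
```

Because a block is the smallest accessible unit, even a one-byte file occupies a whole block.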
Optimization of Disk Block Access
- Optimize block access time by organizing the blocks to correspond to how data will be accessed
  - Store related information on the same or nearby cylinders
- Files may get fragmented over time
  - Systems have utilities to defragment the file system, in order to speed up file access
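A rough cost model shows why block placement matters. It makes strong simplifying assumptions: a contiguous file pays for one seek and one rotational delay, while a fully fragmented file pays for both on every block. The function names and the numbers in the example are illustrative only.

```python
def contiguous_read_ms(num_blocks, seek_ms, latency_ms, transfer_ms_per_block):
    """One seek and one rotational delay, then the blocks stream in order."""
    return seek_ms + latency_ms + num_blocks * transfer_ms_per_block

def fragmented_read_ms(num_blocks, seek_ms, latency_ms, transfer_ms_per_block):
    """Worst case: every block needs its own seek and rotational delay."""
    return num_blocks * (seek_ms + latency_ms + transfer_ms_per_block)

# Example: 100 blocks, 8 ms seek, 5 ms latency, 0.1 ms transfer per block.
print(contiguous_read_ms(100, 8, 5, 0.1))
print(fragmented_read_ms(100, 8, 5, 0.1))
```

Under these assumptions the fragmented read is slower by well over an order of magnitude, which is the gap a defragmentation utility tries to close.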
Murach's PHP and MySQL, C1 © 2010, Mike Murach & Associates, Inc. Slide 20
Summary
- File processing allows persistent data structures
- Most languages include libraries for file handling
- File processing is a large and complicated subject
- File storage devices can be grouped in three classes
- Magnetic disks are the most common storage device for file processing
For Next Time Read Chapter 1 of PHP and MySQL book