Data Structures: Data Types & File Size Calculations 09/12/2018
Learning Objectives
Explain the concepts of files, records and fields.
Explain the function of key fields.
Calculate an estimated file size given the number of records.

Data
Data stored in computers is normally connected in some way. For example, data about 20 students is connected because it all refers to the same set of people. Each person will have the same information stored about them, for instance their name, address, telephone number, exam grades…

Fields & Records
Each column is a field. Each row is a record.

Account Number   Surname   Forename   Balance
45278            Smith     Sally      €10.00
1208             Jones     John       €20.00
3217             Cain      Shazad     €100.00
4310             White     Peter      €250.00

Key Field
Some fields may contain the same item of data in more than one record, e.g. there may be two people with the same name or the same balance.
The computer must be able to identify individual records, and it can only do this if one of the fields is guaranteed to contain different data in every record.
The key field is unique and is used to identify the record. In our example the key field would be the account number.

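The idea of a key field can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the field names (`account_number`, `surname`, ...) and the helper `build_index` are made up for the example, and the data is the table from the Fields & Records slide.

```python
# Records from the example table; each dictionary is one record (row),
# each dictionary key is one field (column). Field names are illustrative.
records = [
    {"account_number": 45278, "surname": "Smith", "forename": "Sally", "balance": 10.00},
    {"account_number": 1208, "surname": "Jones", "forename": "John", "balance": 20.00},
    {"account_number": 3217, "surname": "Cain", "forename": "Shazad", "balance": 100.00},
    {"account_number": 4310, "surname": "White", "forename": "Peter", "balance": 250.00},
]


def build_index(rows, key_field):
    """Map each key-field value to its record, rejecting duplicate keys."""
    index = {}
    for row in rows:
        key = row[key_field]
        if key in index:
            # A key field must be unique, so a repeat is an error.
            raise ValueError(f"duplicate key: {key!r}")
        index[key] = row
    return index


accounts = build_index(records, "account_number")
print(accounts[3217]["surname"])  # Cain
```

Trying the same thing with a field that can repeat (for example a surname shared by two customers) would raise the error, which is why such a field cannot serve as the key field.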
Main Data Types
Each entry gives the data type, its range (or, for fractional real numbers, its precision in digits) and its storage requirement in bytes.

Text (Characters)
  String (in VB): any characters; 1 byte per character
  Char: any character; 1 byte

Integer (Numeric, whole numbers, no fractions)
  Byte: 0 - 255; 1 byte
  Integer: in Access stored in 2 bytes so +/- 32,768; in VB stored in 4 bytes so approx. +/- 2 billion; 2 - 4 bytes
  Long (Integer): in Access stored in 4 bytes so approx. +/- 2 billion; in VB stored in 8 bytes so approx. +/- 9.2...E+18; 4 - 8 bytes

Floating Point (Fractional Real Numbers)
  Single: precision 7; 4 bytes
  Double: precision 15; 8 bytes
  Decimal: precision 28; 12 bytes

Boolean (Y/N, True/False): often 1 byte is reserved
Date/Time

Summary of Data Types Storage Requirements

Data type                                      Storage requirement
Text                                           1 byte per character
Char                                           1 byte
Byte (0 - 255)                                 1 byte
Integer (> 255)                                4 bytes
Decimal                                        8 bytes
Boolean (True/False, Yes/No; 2 answers only)   1 byte
Date / Time                                    8 bytes

Note: Use this summary in exam questions. The previous slide is for general interest only.

All numbers, dates etc. of the same data type take up the same memory space
It does not matter how big or small a number is; it will always take up the number of bytes specified for its data type.
e.g. A variable declared as Byte: the numbers 0, 1, 2 or 255 will ALL take up 1 byte each. The size of the number does not matter.
e.g. A variable declared as Integer: the numbers 0, 1, 2 or 100000 will ALL take up 4 bytes each.
e.g. A variable declared as Decimal: the numbers 3.1, 3.141 and 1000.453278 will ALL take up 8 bytes each. The size of the number or the length of the decimal part does not matter.
Also note that you cannot pick and choose, e.g. a small number cannot be held as a Byte and a big number as an Integer in the same variable. The data type has to be chosen at the point of declaration, and it is chosen according to how big the number might become, not what it actually is at any one time.

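The fixed-size behaviour can be checked with Python's standard `struct` module, which packs values into a fixed number of bytes. This is a rough sketch rather than the Access or VB storage format: the format codes `<B`, `<i` and `<d` are stand-ins chosen here for the 1-byte Byte, 4-byte Integer and 8-byte fractional number from the summary slide, and `packed_size` is a helper invented for the example.

```python
import struct


def packed_size(fmt, value):
    """Return how many bytes `value` occupies when stored with format `fmt`."""
    return len(struct.pack(fmt, value))


# "<" selects standard sizes with no padding, so the sizes below do not
# depend on the machine the code runs on.
BYTE = "<B"     # unsigned whole number, 0 to 255, 1 byte
INTEGER = "<i"  # signed whole number, 4 bytes
REAL = "<d"     # 8-byte floating-point number (stand-in for the 8-byte Decimal)

for value in (0, 1, 2, 255):
    print("Byte", value, packed_size(BYTE, value))

for value in (0, 1, 2, 100000):
    print("Integer", value, packed_size(INTEGER, value))

for value in (3.1, 3.141, 1000.453278):
    print("Real", value, packed_size(REAL, value))
```

Every Byte line reports 1, every Integer line reports 4 and every Real line reports 8, whatever the value. Trying to pack 256 with `<B` raises `struct.error`, which mirrors the point that the type must be chosen for the largest value the field might ever hold.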
Text Fields must be “Fixed Lengths”
Because text takes 1 byte per character, text values of different lengths take up different amounts of space, which makes the size of a text field impossible to estimate. So we must fix each text field to an appropriate maximum length.
I suggest you do this in 10’s: e.g. 10 / 20 / 30 / 40 / 50 / … / 100 / …
Exams won’t require you to do this, but my quizzes will, and it is more sensible anyway.
For example, the length of someone’s name should be fixed at 10 (or 20) characters so, as each character is 1 byte, its size would be 10 bytes (or 20 bytes). 30 characters or more is too long and would be considered incorrect in an exam (and in my quizzes).
Note that this means all names take up this size, irrespective of their real length: names shorter than 10 characters would have spaces added to the end, and names longer than this would be cut short.

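A minimal Python sketch of the padding and cutting described above. The function name `to_fixed_width` and the sample names are made up for illustration, and the 1 byte per character figure assumes plain ASCII text.

```python
def to_fixed_width(text, width=10):
    """Cut `text` short or pad it with trailing spaces to exactly `width` characters."""
    return text[:width].ljust(width)


short_name = to_fixed_width("Sally")
long_name = to_fixed_width("Christopherson")

print(repr(short_name))  # 'Sally     '
print(repr(long_name))   # 'Christophe'

# Both values now occupy the same 10 bytes when stored as ASCII text.
print(len(short_name.encode("ascii")), len(long_name.encode("ascii")))  # 10 10
```

The cut-off name shows the cost of choosing a width that is too small, and the padded one shows the cost of choosing one that is too large, which is why the width should be a sensible maximum.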
1. A library stores details of the books that are available.
Apart from title and author, state 3 other fields that it would be sensible for the library to store in this file, giving a reason why each of your chosen fields would be necessary. (6)
One mark for each of three sensible fields, with an extra mark for an explanation of the need for that field, e.g.
ISBN / to identify the book
Shelf number / to allow for ease of search for the book
Fiction or reference or children's (some form of category) / to decide whereabouts in the library it should go

A library stores details of the books that are available.
State which field would be used as the key field of the record and explain why a key field is necessary. (2)
Book number (ISBN), because it is unique to that record and hence can be used as an identifier.
Name a suitable data type for this field.
String / Integer. Why?

2. A stock file in a warehouse has the following fields in each record.
(i) State data types suitable for each of the fields.
Name of item: Text / String
Date of last delivery: Date
Price of item: Currency / Decimal
Whether or not an order is outstanding: Boolean
Number of that item left in stock: Integer

(ii) Given that there are approximately 10000 different items in the warehouse, estimate the size of the stock file. You should clearly show all the stages in the calculation.
Name of item: 10 bytes
Date of last delivery: 8 bytes
Price of item: 8 bytes
Whether or not an order is outstanding: 1 byte
Number of that item left in stock: 4 bytes

(ii) Given that there are approximately 10000 different items in the warehouse, estimate the size of the stock file. You should clearly show all the stages in the calculation.
Size of one record:
10 (Name of item) + 8 (Last delivery date) + 8 (Price of item) + 1 (Outstanding?) + 4 (Number in stock) = 31 bytes
10000 items = 10000 rows (records)
31 * 10000 = 310000 bytes

(ii) Calculation
Size of one record: 31 bytes
Size of 10000 records: 31 x 10000 = 310000 bytes
Overhead: 10% of 310000 = 310000 * 0.1 = 31000 bytes
Total: 310000 + 31000 = 341000 bytes
341000 / 1024 ~ 341 Kbytes
Notes:
You must write /1024 because computers do not work in base 10 as we do, but in base 2 (see the Data Representation presentation later). However, exams allow you to divide by 1000 to make the calculation easier, as long as you write /1024, then approximate (~) and actually divide by 1000. (Dividing by exactly 1024 would give about 333 Kbytes.)
If you get answers in the millions you will need to convert them into Megabytes (MB), which means /1048576 (~1000000 bytes, or 1 million bytes).
You could add the 10% directly by multiplying by 1.1 (as long as you are allowed to use a calculator, which only happens in paper 2, not paper 1).
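The whole estimate can be reproduced with a short Python sketch. The dictionary keys and the function name `estimate_file_size` are illustrative choices, not taken from the slides; the byte sizes, the 10000 records and the 10% overhead are the figures used in the worked answer.

```python
# Field sizes in bytes, taken from the worked answer for the stock file.
FIELD_SIZES = {
    "name_of_item": 10,         # text fixed at 10 characters
    "date_of_last_delivery": 8,
    "price_of_item": 8,
    "order_outstanding": 1,     # Boolean
    "number_in_stock": 4,       # Integer
}


def estimate_file_size(field_sizes, record_count, overhead_percent=10):
    """Return (record size, data size, total size including overhead), all in bytes."""
    record_size = sum(field_sizes.values())
    data_bytes = record_size * record_count
    # Integer arithmetic keeps the overhead exact.
    overhead_bytes = data_bytes * overhead_percent // 100
    return record_size, data_bytes, data_bytes + overhead_bytes


record_size, data_bytes, total_bytes = estimate_file_size(FIELD_SIZES, 10000)
print(record_size)   # 31
print(data_bytes)    # 310000
print(total_bytes)   # 341000

# The exam shortcut divides by 1000; the exact conversion divides by 1024.
print(total_bytes / 1000)  # 341.0
print(round(total_bytes / 1024))  # 333
```

Keeping the field sizes in one place makes it easy to see how the estimate changes if, say, the item name is fixed at 20 characters instead of 10.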