Data Structures.

Assessment Outcomes
4a - Explain the purpose of files in data processing.
4b - Define a file in terms of records and fields.
4d - Explain fixed and variable length fields and records and give examples of the appropriate use of each type.
4e - Design files and records appropriate for a particular application.

What is a data structure?
A data structure is a collection of data items stored in memory, together with a number of operations provided by the software to manipulate those items. Using a data structure implies that there is a relationship of some kind between the data items. Exactly what the relationships are determines what type of data structure is being used.
Typical operations on a data structure are: ADD, INSERT, DELETE or REMOVE, FIRST, NEXT, LAST, LOCATE.

Why is data valuable?
Data is valuable for a number of reasons:
It takes a long time to compile.
It takes time to input the data into the computer.
Recompiling data or re-entering it into a computer is expensive because you have to pay someone to do it, when they could be doing something far more productive for your company.
You need information about an order placed with your company so that you can process the order and then be paid for it. That is how your company makes a profit.
You need to know when to pay your bills and taxes so that you don't get taken to court.
You need to be able to chase up people who haven't paid you so that you can pay your bills and keep trading.

Analysing Data
If you were running a supermarket and you lost the data in your stock control system, you wouldn't know what you had on the shelves. You wouldn't know when you needed to re-order stock, and you would lose money while you sorted out the problem. If you kept historical data on your supermarket's computer and you lost that, you would lose valuable information about product trends: for example, what sells well, at what times, and what doesn't sell very well. This would reduce your business's potential to maximise its profits.

Files and File Structures
When you collect records together into a file, with each record holding the same kind of information as the other records, you are creating a file structure that is very easy for computers to analyse and manipulate. Programs are very good at doing repetitive tasks quickly, especially if everything is organised in the same way. It is a relatively easy task, for example, for a computer program to search through ten million records and display those that have the surname 'Cooper' if all the records are structured in the same way. If the records weren't structured in the same way, a program could still be written to find 'Cooper' in the records, but it would be a far more complex task and would be likely to take much longer.
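As a rough illustration of why uniform records are easy to process, the sketch below gives every record the same fields and filters on the surname. The field names (`surname`, `forename`) and the sample records are made up for this example.

```python
# Every record has the same fields, so one simple loop can handle them all.
# The records below are invented sample data.
records = [
    {"surname": "Cooper", "forename": "Ann"},
    {"surname": "Patel", "forename": "Raj"},
    {"surname": "Cooper", "forename": "Ben"},
]


def find_by_surname(all_records, surname):
    """Return every record whose 'surname' field matches exactly."""
    return [record for record in all_records if record["surname"] == surname]


matches = find_by_surname(records, "Cooper")
```

The same function works unchanged whether the file holds three records or ten million, because it relies only on every record having a `surname` field.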

Dynamic and static data structures
With a static data structure, the size of the structure is fixed. Static data structures are very good for storing a well-defined number of data items. For example, a programmer might be coding an 'Undo' function where the last 10 user actions are kept in case the user wants to undo them. In this case the maximum allowed is 10 steps, so the programmer decides to use a 10-item data structure.
There are many situations where the number of items to be stored is not known beforehand. In this case the programmer will consider using a dynamic data structure. This means the data structure is allowed to grow and shrink as the demand for storage changes. The programmer should also set a maximum size to help avoid memory collisions. For example, a programmer coding a print spooler will have to maintain a data structure to store print jobs, but cannot know beforehand how many jobs there will be.
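The two cases above might be sketched in Python as follows. Python has no built-in fixed-size array, so a bounded `collections.deque` stands in for the static 10-item structure. The names `MAX_UNDO_STEPS`, `MAX_JOBS` and `add_job`, and the limit of 100 jobs, are assumptions for illustration.

```python
from collections import deque

# Fixed-size case: at most 10 undo steps are ever kept.
MAX_UNDO_STEPS = 10
undo_history = deque(maxlen=MAX_UNDO_STEPS)

for step in range(1, 13):            # the user performs 12 actions
    undo_history.append(f"action {step}")
# The two oldest actions have been discarded to stay within 10 items.

# Dynamic case: the print job list grows and shrinks on demand,
# but an assumed maximum size stops it growing without limit.
MAX_JOBS = 100
print_jobs = []


def add_job(jobs, job):
    """Add a job unless the assumed maximum size has been reached."""
    if len(jobs) >= MAX_JOBS:
        raise OverflowError("print queue is full")
    jobs.append(job)


add_job(print_jobs, "report.pdf")
```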

Entities, Attributes, Tables, and Records
Database structure
A database is organised using a set of key components. These include:
entities - each recorded item
attributes - details about the entity
field - columns used to capture attributes
record - one row of details about an entity
table - a set of fields and records
primary key - unique number for an entity
This is an example table of a flat-file database. The entities are films and the attributes are details about the films:

The table contains all of the fields and the records for one type of entity. A database may contain more than one table.
Records
Records contain a collection of data for each entity, usually recorded as a row in the table.
Fields
The column headings are called the fields. Each field holds a different attribute. For every entity, a unit of data is entered into each field. Each column might require a different data type. For example, the 'Title' column will require data entered as text and the 'Certificate' column will need data entered as numbers.
Unit of data
Each individual piece of data entered is a unit of data. These units are also called data elements.
Primary key
The primary key contains a unique identifier for each record. To make each record in a database unique, we normally assign it a primary key. Even if a record is deleted from the database, its primary key will not be used again. The primary key can be generated automatically and will normally be a unique number or a mix of numbers and letters.
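A minimal sketch of these ideas in Python is shown below. It models a table of film records with an automatically generated primary key that is never reused. The class names `Film` and `FilmTable`, the field name `film_id`, and the sample titles are assumptions for illustration; only the 'Title' and 'Certificate' fields come from the text above.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Film:                # one record (row) describing one entity
    film_id: int           # primary key
    title: str             # 'Title' field, stored as text
    certificate: int       # 'Certificate' field, stored as a number


class FilmTable:
    """A table: a set of records that all share the same fields."""

    def __init__(self):
        self._next_id = count(1)   # keys only ever count upwards
        self.records = {}          # primary key -> record

    def add(self, title, certificate):
        film = Film(next(self._next_id), title, certificate)
        self.records[film.film_id] = film
        return film

    def delete(self, film_id):
        del self.records[film_id]


table = FilmTable()
first = table.add("Example Film A", 12)    # made-up sample data
table.delete(first.film_id)
second = table.add("Example Film B", 15)   # gets key 2; key 1 is not reused
```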

Fixed and Variable Length Fields
Fixed length means that an element of data must be a set number of characters. For example, the Date of Birth field may be forced into DD-MM-YYYY format, which is 10 characters long. Any data that is too short won't be accepted: a date typed with fewer than 10 characters would be too short. Some data structures apply a constraint that fills in the rest, so a short entry is padded out to the full length in a way that depends on the constraint.
Variable length fields, however, allow elements of data to be different sizes. Take the 'Name' field: a name could be 'John' or 'Paul', which are both 4 characters long, or it could be 'Francessco', which is 10. Variable length fields are used when the exact number of characters is unknown. They usually have a maximum limit set on them to stop the amount of data escalating.
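The difference might be sketched as below. The function names, the 50-character name limit, and the choice of padding with spaces are assumptions for illustration. A real system could apply a different constraint.

```python
DOB_LENGTH = 10          # DD-MM-YYYY is always 10 characters
NAME_MAX_LENGTH = 50     # assumed maximum for the variable-length field


def check_dob(value):
    """Accept a fixed-length field only if it is exactly the right size."""
    if len(value) != DOB_LENGTH:
        raise ValueError(f"date of birth must be {DOB_LENGTH} characters")
    return value


def pad_fixed(value, length, fill=" "):
    """One possible constraint: pad short data out to the fixed length."""
    if len(value) > length:
        raise ValueError("value is too long for the field")
    return value.ljust(length, fill)


def check_name(value):
    """A variable-length field accepts any size up to its maximum."""
    if len(value) > NAME_MAX_LENGTH:
        raise ValueError("name is too long")
    return value
```

Here `check_name("John")` and `check_name("Francessco")` are both accepted as they are, whereas `check_dob` rejects anything that is not exactly 10 characters long.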

Decide on the data type and size to allow
If you have ever set up a database, then you will have told your application what data type to use for each field in a file, and how much space to allow for each field. This information is brought together in a document called a 'Data Dictionary'. For example, in the dog file, you may have decided on the following:
ID. This field is not a number but an ID code, so we will use data type text rather than Integer. We will assume that the maximum number of dogs that will ever be in this file is 5000, so an ID code 4 characters long will be fine. We will allow 4 bytes for the ID code, one byte for each character.
Name. We know that some people give their dogs very long names. It is difficult to judge what to allow, so we will allow plenty of room for error: 50 bytes, data type text.
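One way to picture this layout is with Python's `struct` module, which packs values into a fixed number of bytes. The sketch below assumes one-byte ASCII characters and uses made-up sample values. The function names are hypothetical.

```python
import struct

# Fixed-length record layout from the data dictionary above:
# 4 bytes for the ID code, then 50 bytes for the name.
DOG_RECORD = struct.Struct("4s50s")


def pack_dog(dog_id, name):
    """Encode one dog record as exactly 54 bytes."""
    # struct pads short values with zero bytes and truncates long ones.
    return DOG_RECORD.pack(dog_id.encode("ascii"), name.encode("ascii"))


def unpack_dog(data):
    """Decode a 54-byte record back into (ID, name)."""
    raw_id, raw_name = DOG_RECORD.unpack(data)
    return raw_id.decode("ascii"), raw_name.rstrip(b"\x00").decode("ascii")


record = pack_dog("0042", "Rex")   # made-up sample values
```

Because every record is the same size, the position of the n-th record in a file can be calculated directly as n multiplied by the record size.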

Mutable and Non Mutable Data Structures
Mutable and immutable are English words meaning "can change" and "cannot change" respectively. The meaning of the words is the same in the IT context; i.e. a mutable data structure can be changed, and an immutable data structure cannot be changed. Strings and other concrete objects are typically expressed as immutable objects to improve readability and runtime efficiency in object-oriented programming.
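A short Python sketch of the distinction follows. In Python, lists are mutable while tuples and strings are immutable. The variable names are chosen for illustration only.

```python
# A list is mutable: it can be changed in place.
scores = [10, 20, 30]
scores[0] = 99

# A tuple is immutable: any attempt to change it raises an error.
fixed_scores = (10, 20, 30)
try:
    fixed_scores[0] = 99
    tuple_changed = True
except TypeError:
    tuple_changed = False

# Strings are immutable too, so "changing" one builds a new string
# and leaves the original untouched.
name = "paul"
capitalised = name.capitalize()
```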

Arrays
A conventional array is static: its size is fixed when it is created. However, it is possible to make a static array "appear" dynamic. Space is reserved for potential new items within the array. The more space that is reserved, the more new records can be added, but the less efficiently the data is stored.
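The idea of reserving space can be sketched as follows. A Python list pre-filled with `None` stands in for a fixed-size array, and the class name `ReservedArray` and its methods are assumptions for illustration.

```python
class ReservedArray:
    """A fixed block of slots, only some of which are in use."""

    def __init__(self, capacity):
        self.slots = [None] * capacity   # space reserved up front
        self.count = 0                   # slots currently holding data

    def add(self, item):
        if self.count == len(self.slots):
            raise OverflowError("no reserved space left")
        self.slots[self.count] = item
        self.count += 1

    def items(self):
        """Return only the slots that are actually in use."""
        return self.slots[:self.count]


name_array = ReservedArray(capacity=5)
name_array.add("Alan")
name_array.add("Henry")
```

To the caller the array appears to grow as items are added, but three of the five reserved slots are still unused, which is the storage cost mentioned above.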

Linked Lists and Stacks
The basic list is a data structure holding a number of items stored in the order in which they were originally added. A list can be allocated a fixed length, in which case it is a 'static data structure'; if the list is allowed to grow or shrink, then it is a 'dynamic data structure'. An example of a simple list is the 'array', which can hold a number of data items, or 'elements' as they are sometimes called. If the size of the array is fixed at compile time, then it is a 'static array'. If the array is allowed to vary in length, then that is an example of a dynamic list.
A 'linked list' is one where each data item points to its neighbour. A linked list is a good general storage structure because it is simple to insert and delete items and to find the first and last item.

Typical operations that can be carried out on a list are:
ADD (or INSERT): adds an item to the list
DELETE (or REMOVE): removes an item from the list
FIRST: identifies the first item in the list
NEXT: identifies the next item in the list
LAST: identifies the end of the list
LOCATE: identifies the location of a specific item within the list
PUSH: adds an item to the start of a list
APPEND: adds an item to the end of a list

Addressing a linked list
A linked list keeps track of the memory location of each item in the list by using a series of 'pointers' within the data structure. A number of pointers are required:
The 'start pointer' points to the first node in the list.
Each node has a pointer providing the location of the next node in the list.
The last node in the list has a 'null pointer' (which is an empty pointer).

Example: Storing an alphabetic list of names in a linked list
A list data structure is to be used to store some names. The list is kept in alphabetical order.
(Figures: the bare list and the completed list.)
In the completed list, the first node is 'Alan', as this name comes first in alphabetical order. The Alan node points to Henry, which points to Paul, which points to Sam. Sam is the last node in the list and so has a null pointer. As each item is added to the list, the software must work out its relative position and adjust the pointers to suit.
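The example above might be sketched in Python as follows. The class and method names are assumptions for illustration, and `None` plays the role of the null pointer. The names are added out of order to show the pointers being adjusted.

```python
class Node:
    def __init__(self, data):
        self.data = data
        self.next = None          # null pointer until the node is linked


class SortedLinkedList:
    def __init__(self):
        self.start = None         # start pointer

    def add(self, data):
        """Insert data at its correct alphabetical position."""
        new_node = Node(data)
        # New first node: empty list, or data sorts before the current first.
        if self.start is None or data < self.start.data:
            new_node.next = self.start
            self.start = new_node
            return
        # Walk along until the next node sorts after the new data.
        current = self.start
        while current.next is not None and current.next.data <= data:
            current = current.next
        new_node.next = current.next
        current.next = new_node

    def to_list(self):
        """Follow the pointers from the start and collect each item."""
        items, current = [], self.start
        while current is not None:
            items.append(current.data)
            current = current.next
        return items


linked_names = SortedLinkedList()
for person in ["Paul", "Alan", "Sam", "Henry"]:
    linked_names.add(person)
```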

Queues
The QUEUE is another extremely common type of list data structure. A queue is a First-In, First-Out list. A queue maintains two pointers:
A 'front of queue' pointer
An 'end of queue' pointer

Uses of a queue
A queue data structure is used whenever there are a number of items waiting for a resource to become available. If three items were added to a queue in the order 1. Dog, 2. Cat, 3. Horse, the queue would hold, from front to rear: Dog, Cat, Horse. The front pointer locates 'Dog' and the rear pointer locates 'Horse'.
The operations associated with a queue are:
ADD: adds an item to the back of the queue
REMOVE: removes the item at the front of the queue
FRONT: identifies the item at the front of the queue, but does not remove it

Printer queue
A printer queue contains a number of jobs waiting to be processed. The next one to be processed is the first one that went into the queue.
Process queue
Another very common use of a queue is to handle processes running on the CPU. Only one process can run on the CPU at any given time, but there will be many processes wanting to run. So the operating system adds each process to a queue and allows the one at the front of the queue to get the next turn at running on the CPU.

Adding and deleting from queues
To add an item to the queue
Is the queue full?
Case 1: Queue is full
Return an error message to the calling function
End of task
Case 2: Queue is not full
Allocate memory to the new node
Adjust the rear pointer to locate the new node

To remove an item from the queue
Is the queue empty?
Case 1: Queue is empty
Return an error message to the calling function
End of task
Case 2: Queue is not empty
Adjust the front pointer to locate the next node
Mark the memory the node used as free once more
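The two procedures can be sketched in Python roughly as follows. Because a queue is first-in, first-out, removal takes the node at the front and moves the front pointer on to the next node. The class names and the `max_size` limit are assumptions for illustration. Python frees the removed node's memory automatically once nothing refers to it.

```python
class QueueNode:
    def __init__(self, data):
        self.data = data
        self.next = None


class Queue:
    def __init__(self, max_size):
        self.max_size = max_size   # assumed limit, so "full" is meaningful
        self.size = 0
        self.front = None          # front of queue pointer
        self.rear = None           # end of queue pointer

    def add(self, data):
        if self.size == self.max_size:
            raise OverflowError("queue is full")
        node = QueueNode(data)     # allocate the new node
        if self.rear is None:
            self.front = node      # first item is also the front
        else:
            self.rear.next = node
        self.rear = node           # rear pointer now locates the new node
        self.size += 1

    def remove(self):
        if self.front is None:
            raise IndexError("queue is empty")
        node = self.front
        self.front = node.next     # front pointer moves to the next node
        if self.front is None:
            self.rear = None       # the queue is now empty
        self.size -= 1
        return node.data


animals = Queue(max_size=3)
for animal in ["Dog", "Cat", "Horse"]:
    animals.add(animal)
served = animals.remove()
```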

Circular Queues
This is a particular kind of queue where new items are added to the rear of the queue as items are read off the front of the queue, so there is a constant stream of data flowing into and out of the queue. Another name for it is 'circular buffer'. Items enter the queue at the rear whilst the item at the front of the queue is read (and removed). Normally, one process is writing items into the queue and a different process is reading items from the queue.
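A minimal sketch of a circular queue built on a fixed-size block of slots is shown below. The class and attribute names are assumptions for illustration. The key step is the modulo arithmetic, which wraps the front and rear positions back to slot 0 when they pass the end of the block.

```python
class CircularQueue:
    def __init__(self, capacity):
        self.buffer = [None] * capacity   # fixed block of slots
        self.front = 0                    # index of the next item to read
        self.count = 0                    # number of items currently stored

    def write(self, item):
        if self.count == len(self.buffer):
            raise OverflowError("buffer is full")
        rear = (self.front + self.count) % len(self.buffer)
        self.buffer[rear] = item
        self.count += 1

    def read(self):
        if self.count == 0:
            raise IndexError("buffer is empty")
        item = self.buffer[self.front]
        self.buffer[self.front] = None    # slot is free to be reused
        self.front = (self.front + 1) % len(self.buffer)
        self.count -= 1
        return item


ring = CircularQueue(capacity=3)
for block in ["a", "b", "c"]:
    ring.write(block)
first_read = ring.read()   # frees slot 0
ring.write("d")            # wraps round and reuses slot 0
```

Reusing freed slots in this way is what lets the structure handle a continuous stream of data with a fixed amount of memory.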

Circular Queues Example
When burning a DVD it is essential that the laser beam burning pits onto the surface is constantly fed with data, otherwise the burn fails. Most leading DVD burning applications make use of a circular buffer to stream data from the hard disk onto the DVD. The first part, the 'writing process', fills up a circular buffer with data; then the 'burning process' begins to read from the buffer as the laser beam burns pits onto the surface of the disc. If the buffer starts to become empty, the application carefully slows down the burn rate and speeds up the writing rate to avoid 'buffer underflow', i.e. an empty buffer.

Advantages of a circular queue
It is a structure that allows data to be passed from one process to another whilst making the most efficient use of memory.

Trees
The QUEUE and the STACK are linear lists. This means each data item only points to the one before it and the one after it. They have the idea of order built into them, such as 'last' or 'first', but they do not imply that there is any relationship between the data items themselves. The TREE, on the other hand, is designed to represent the relationship between data items, just like a family tree.

Each data item within a tree is called a 'node'.
The highest data item in the tree is called the 'root' or root node. Below the root lie a number of other nodes. The root is the 'parent' of the nodes immediately linked to it, and these are the 'children' of the parent node. If nodes share a common parent, then they are 'sibling' nodes, just like a family. The link joining one node to another is called a 'branch'. The structure described so far is a general tree, but there is also a more specific form of tree, the binary tree.

The TREE is a general data structure that describes the relationship between data items or 'nodes'. In a binary tree, each parent node has at most two child nodes.

The typical operations that can be carried out on a binary tree are:
CREATE: creates a new tree
LEFT: gets the left sub-tree
RIGHT: gets the right sub-tree
ITEM: gets the data item within the root node
EmptyTree: informs whether or not the tree is empty
INSERT: places a node at a specific location within the tree
DELETE: removes a node from a specific location within the tree

Searching Binary Trees
One of the most powerful uses of the TREE data structure is to sort and manipulate data items. Most databases use the tree concept as the basis for storing, searching and sorting their records. The binary search tree holds data items in sorted order by following a simple rule.
Rule: the LEFT sub-tree always contains values that come before the root node, and the RIGHT sub-tree always contains values that come after the root node.
For numbers, this means the left sub-tree contains numbers less than the root and the right sub-tree contains numbers greater than the root. For words, as in a sorted dictionary, the order is alphabetic.
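The rule can be sketched in Python as follows. The names `TreeNode`, `insert` and `contains`, and the sample numbers, are assumptions for illustration. Duplicate values are simply ignored in this sketch.

```python
class TreeNode:
    def __init__(self, item):
        self.item = item
        self.left = None     # sub-tree of values that come before item
        self.right = None    # sub-tree of values that come after item


def insert(root, item):
    """Insert item and return the root of the (possibly new) sub-tree."""
    if root is None:
        return TreeNode(item)
    if item < root.item:
        root.left = insert(root.left, item)
    elif item > root.item:
        root.right = insert(root.right, item)
    return root              # equal values are ignored


def contains(root, item):
    """Search by going left or right at each node, following the rule."""
    while root is not None:
        if item == root.item:
            return True
        root = root.left if item < root.item else root.right
    return False


number_tree = None
for number in [50, 30, 70, 20, 40]:
    number_tree = insert(number_tree, number)
```

Each comparison discards a whole sub-tree, so a search only needs to follow one path down from the root rather than visit every node.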

Helpful trick
If you put a dot on every node of the tree diagram, in the position that corresponds to the type of traversal, and then trace a line around the tree, marking off each dot as you pass it, this gives you the output of the tree traversal. For instance, for a pre-order traversal the dot is placed on the left-hand side of each node. The output would therefore be: A, B, D, H, I, E, J, C, F, K, L, G
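A pre-order traversal visits the node itself first, then its left sub-tree, then its right sub-tree. The sketch below reproduces the output quoted above. The tree shape is an assumption reconstructed from that output, since the original diagram is not available. In particular, placing J as the left child of E is a guess.

```python
class TreeNode:
    def __init__(self, item, left=None, right=None):
        self.item = item
        self.left = left
        self.right = right


def pre_order(node):
    """Visit the node, then its left sub-tree, then its right sub-tree."""
    if node is None:
        return []
    return [node.item] + pre_order(node.left) + pre_order(node.right)


# Assumed tree shape, chosen to be consistent with the quoted output.
letter_tree = TreeNode(
    "A",
    TreeNode(
        "B",
        TreeNode("D", TreeNode("H"), TreeNode("I")),
        TreeNode("E", TreeNode("J")),
    ),
    TreeNode(
        "C",
        TreeNode("F", TreeNode("K"), TreeNode("L")),
        TreeNode("G"),
    ),
)

visited = pre_order(letter_tree)
```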

