 Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.  persistent data.  The smallest.

 Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.  persistent data.  The smallest data item that computers support is called a bit  short for “binary digit”—a digit that can assume one of two values  Digits, letters and special symbols are referred to as characters  Bytes are composed of 8 bits.  C# uses the Unicode ® character set (www.unicode.org) in which characters are composed of 2 bytes.www.unicode.org  Just as characters are composed of bits, fields are composed of characters.  A field is a group of characters that conveys meaning.  Typically, a record is composed of several related fields.  A file is a group of related records.  To facilitate the retrieval of specific records from a file, at least one field in each record is chosen as a record key, which uniquely identifies a record.  A common file organization is called a sequential file, in which records typically are stored in order by a record-key field.

A file can be seen as 1.A stream of bytes (no structure), or 2.A collection of records with fields

 To access a specific record without having to retrieve all records before it  File structures allowing this: ◦ indexed files ◦ hashed files  To access a record in a file randomly, we need to know the address of the record

Can have more than one index, each with a different key. For example, an employee file can be retrieved based on either social security number or last name. This type of indexed file is usually called an inverted file.

 A hashed file uses a mathematical function to accomplish this mapping. The user gives the key, the function maps the key to the address and passes it to the operating system, and the record is retrieved

Random files: hashing methods Many several hashing methods Example: Direct hashing The key is the data file address without any algorithmic manipulation. The file must therefore contain a record for every possible key. Although situations suitable for direct hashing are limited, it can be very powerful, because it guarantees that there are no synonyms or collisions, as with other methods.

Random files: hashing methods cont Example: Modulo division (%) hashing (division remainder hashing) divides the key by the file size and uses the remainder plus 1 for the address. This gives the simple hashing algorithm that follows, where list_size is the number of elements in the file. The reason for adding a 1 to the mod operation result is that the list starts with 1 instead of 0.

 Records can only be accessed one after another from beginning to end.  Records are stored one after another in auxiliary storage ◦ disk ◦ tape  EOF (end-of-file) marker after the last record. ◦ The operating system has no information about the record addresses, it only knows where the whole file is stored. ◦ The operating system knows that records are sequential.

Pseudo code While Not EOF { //Read the next record //Process the record } Record key  Identifies a record to facilitate the retrieval of specific records from a file

 that need to access all records from beginning to end ◦ Ex.: Personal information  Because each record is processed, sequential access is more efficient and easier than random access

Comparing random versus sequential operations is one way of assessing application efficiency in terms of disk use. Accessing data sequentially is much faster than accessing it randomly because of the way in which the disk hardware works. The seek operation, which occurs when the disk head positions itself at the right disk cylinder to access data requested, takes more time than any other part of the I/O process. Because reading randomly involves a higher number of seek operations than does sequential reading, random reads deliver a lower rate of throughput. The same is true for random writing. You might find it useful to examine your workload to determine whether it accesses data randomly or sequentially. If you find disk access is predominantly random, you might want to pay particular attention to the activities being done and monitor for the emergence of a bottleneck. For workloads of either random or sequential I/O, use drives with faster rotational speeds. For workloads that are predominantly random I/O, use a drive with faster seek time.

 Must be updated periodically to reflect changes in information  all of the records need to be checked and updated (if necessary) sequentially: ◦ new Master File ◦ old Master File ◦ transaction File (contains changes to be applied to the master file)  add  delete  change ◦ Report Message OLD MASTER TRANSACTION NEW MASTER ERROR MESSAGES UPDATE PROGRAM

ERROR MESSAGES: NO MATCH500000000 DUPLICATE ADDITION888888888 UPDATE OLD MASTER FILE: 111111111ADAMS015000 NEW YORK 222222222BAKER025000 NEW YORK 333333333ZIDROW008000 NEW YORK 444444444MILGROM040000 BOSTON 555555555BENJAMIN100000 CHICAGO 666666666SHERRY007500 CHICAGO 777777777BOROW017500 BOSTON 888888888JAMES050000 NEW YORK 999999999RENAZEV030000 NEW YORK TRANSACTION FILE: 222222222028000 C 222222222 BOSTON C 400000000NEW EMPLOYEE016000 BOSTON A 500000000020000 C 610000000NEW EMPLOYEE II018000 CHICAGO A 610000000 NEW YORK C 666666666SHERRY D 777777777055000 C 888888888JAMES017500 NEW YORK A NEW MASTER FILE: 111111111ADAMS015000 NEW YORK 222222222BAKER028000 BOSTON 333333333ZIDROW008000 NEW YORK 400000000NEW EMPLOYEE016000 BOSTON 444444444MILGROM040000 BOSTON 555555555BENJAMIN100000 CHICAGO 610000000NEW EMPLOYEE II018000 NEW YORK 777777777BOROW055000 NEW YORK 888888888JAMES050000 NEW YORK 999999999RENAZEV030000 NEW YORK

A text file is a file of characters. It cannot contain integers, floating-point numbers, or any other data structures in their internal memory format. To store these data types, they must be converted to their character equivalent formats. Some files can only use character data types. Most notable are file streams (input/output objects in some object-oriented language like C#, C++, Java) for keyboards, monitors and printers. This is why we need special functions to format data that is input from or output to these devices. Text files

Binary files A binary file is a collection of data stored in the internal format of the computer. In this definition, data can be an integer (including other data types represented as unsigned integers, such as image, audio, or video), a floating-point number or any other structured data (except a file). Unlike text files, binary files contain data that is meaningful only if it is properly interpreted by a program. If the data is textual, one byte is used to represent one character (in ASCII encoding). But if the data is numeric, two or more bytes are considered a data item.

 C# views each file as a sequential stream of bytes  When a console application executes, the runtime environment creates the Console.Out, Console.In and Console.Error streams ◦ Console.In refers to the standard input stream object, which enables a program to input data from the keyboard. ◦ Console.Out refers to the standard output stream object, which enables a program to output data to the screen. ◦ Console.Error refers to the standard error stream object, which enables a program to output error messages to the screen.  Console methods Write and WriteLine use Console.Out to perform output  Console methods Read and ReadLine use Console.In to perform input

Class Directory provides capabilities for manipulating directories The DirectoryInfo object returned by method CreateDirectory contains information about a directory !

 Defines a scope, outside of which an object or objects will be disposed.  C#, through the.NET Framework common language runtime (CLR), automatically releases the memory used to store objects that are no longer required. The release of memory is non-deterministic; memory is released whenever the CLR decides to perform garbage collection. However, it is usually best to release limited resources such as file handles and network connections as quickly as possible.  The using statement allows the programmer to specify when objects that use resources should release them. The object provided to the using statement must implement the IDisposable interface. This interface provides the Dispose method, which should release the object's resources.  A using statement can be exited either when the end of the using statement is reached or if an exception is thrown and control leaves the statement block before the end of the statement.

class IntroToLINQ { static void Main() { // The Three Parts of a LINQ Query: // 1. Data source. int[] numbers = newint[7] { 0, 1, 2, 3, 4, 5, 6 }; // 2. Query creation.// numQuery is an IEnumerable var numQuery = from num in numbers where (num % 2) == 0 select num; // 3. Query execution. foreach (int num in numQuery) { Console.Write("{0,1} ", num); }

 In LINQ the execution of the query is distinct from the query itself  No data retrieving by creating a query variable Create a data source from an XML document // using System.Xml.Linq; XElement contacts = XElement.Load(@"c:\myContactList.xml"); Northwnd db=new Northwnd(@"c:\northwnd.mdf"); // Query for customers in London. IQueryable custQuery = from cust in db.Customers where cust.City == "London" select cust;

Queries that perform aggregation functions over a range of source elements must first iterate over those elements. Examples of such queries are Count, Max, Average, and First. These execute without an explicit foreach statement because the query itself must use foreach in order to return a result. Note also that these types of queries return a single value, not an IEnumerable collection. var evenNumQuery = from num in numbers where (num % 2) == 0 select num; int evenNumCount = evenNumQuery.Count();

To force immediate execution of any query and cache its results, call the ToList or ToArray methods. List numQuery2 = (from num in numbers where (num % 2) == 0 select num).ToList(); or int[]var numQuery3 = (from num in numbers where (num % 2) == 0 select num).ToArray();

Step 1. Build REUSABLE dll in BankLibrary.sln. Record simulator

 Class SaveFileDialog is used for selecting files.  The constant FileMode.OpenOrCreate indicates that the FileStream should open the file if it exists or create the file if it does not.  To preserve the original contents of a file, use FileMode.Append.  The constant FileAccess.Write indicates that the program can perform only write operations with the FileStream object.  There are two other FileAccess constants—FileAccess.Read for read-only access and FileAccess.ReadWrite for both read and write access.  An IOException is thrown if there is a problem opening the file or creating the StreamWriter.  StreamWriter method WriteLine writes a sequence of characters to a file.  The StreamWriter object is constructed with a FileStream argument that specifies the file to which the StreamWriter will output text.  Method Close throws an IOException if the file or stream cannot be closed properly.

Same as “”

Must be a Key

Select where to save first, than enter data 

Clients.txt 100,Nancy,Brown,-25.54 200,Stacey,Dunn,314.33 300,Doug,Barker,0.00 400,Dave,Smith,258.34 500,Sam,Stone,34.98

 OpenFileDialog is used to open a file.  The behavior and GUI for the Save and Open dialog types are identical, except that Save is replaced by Open.  Specify read-only access to a file by passing constant FileAccess.Read as the third argument to the FileStream constructor.  StreamReader method ReadLine reads the next line from the file.

A FileStream object can reposition its file-position pointer to any position in the file. When a FileStream object is opened, its file-position pointer is set to byte position 0. You can use StreamReader property BaseStream to invoke the Seek method of the underlying FileStream to reset the file-position pointer back to the beginning of the file. Exercise: Add a Start/Beginning button to the ReadSequentialAccessFileForm program to go to the very first record.

filter

 Class BinaryFormatter enables entire objects to be written to or read from a stream.  BinaryFormatter method Serialize writes an object’s representation to a file.  BinaryFormatter method Deserialize reads this representation from a file and reconstructs the original object.  Both methods throw a SerializationException if an error occurs during serialization or deserialization.

Method Serialize takes the FileStream object as the first argument so that the BinaryFormatter can write its second argument to the correct file. Remember that we are now using binary files, which are not human readable.

Deserialize returns a reference of type object. If an error occurs during deserialization, a SerializationException is thrown.

 Clients.bin  100,Nancy,Brown,-25.54  200,Stacey,Dunn,314.33  300,Doug,Barker,0.00  400,Dave,Smith,258.34  500,Sam,Stone,34.98

 Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.  persistent data.  The smallest.

Similar presentations

Presentation on theme: " Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.  persistent data.  The smallest."— Presentation transcript:

Similar presentations

About project

Feedback

Log in

Auth with social network:

 Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.  persistent data.  The smallest.

Similar presentations

Presentation on theme: " Files are used for long-term retention of large amounts of data, even after the program that created the data terminates.  persistent data.  The smallest."— Presentation transcript:

Similar presentations

About project

Feedback