Published by Cornelius Hicks. Modified over 6 years ago.
1
Abstract Data Types

Data abstraction, or abstract data types (ADTs), is a programming methodology where one defines not only the data structure to be used, but also the processes to manipulate the structure.
  - like process abstraction, ADTs can be supported directly by programming languages
  - to support ADTs, a language needs mechanisms for
      - defining data structures
      - encapsulation of data structures and the routines that manipulate them into one unit (by placing all definitions in one unit, it can be compiled at one time)
      - information hiding, to protect the data structure from outside interference or manipulation: the data structure should only be accessible from code encapsulated with it, so that the structure is hidden and protected from the outside
  - objects are one way to implement ADTs, but because objects have additional properties, we defer discussion of them until the next chapter

Programmers, and computer science in general, greatly improved programming methodologies over time as they learned about good and bad programming habits and about which language constructs were needed to support good habits. One major change in programming emphasis arose in the late 70s/early 80s as a response to programmers creating data structures as needed, rather than in a principled manner. The solution, supporting the need for data abstraction, is known as abstract data types. Since you have already studied abstract data types in CSC 360, we will skip over the example illustrated in the textbook in section 11.2 and concentrate on how languages support them.

Early languages had no support for ADTs. Early FORTRAN had no structures other than arrays. COBOL allowed one to define a structure but had no mechanisms for encapsulation or information hiding. PL/I included all of the various data structures as part of the language so that, while you could declare a variable to be of a specific data structure type and access it through built-in processes, you could not define your own.
Simula-67 was the first to offer the ability to define your own data structures, but this idea was not popularized until ALGOL-68. Its two primary successors, Pascal and C, popularized the notion of programmer defined structures, which has been provided in nearly every language since. However, neither of these languages has mechanisms for information hiding, and they only have weak support for encapsulation (encapsulation is not mandatory).
2
ADT Design Issues

  - Encapsulation: it must be possible to define a unit that contains a data structure and the subprograms that access (manipulate) it
      - design issue: will ADT access be restricted through pointers?
      - design issue: can ADTs be parameterized (size and/or type of data being stored)?
  - Information hiding: controlling access to the data structure through some form of interface so that it cannot be directly manipulated by external code
      - often implemented via two sections of code
      - the public part (interface) constitutes those elements that can be accessed externally (often limited to subprograms and constants)
      - the private part remains secure because it is only accessible by subprograms of the ADT itself

Simula 67 was the first language to provide a facility for user-defined classes. Objects were dynamically allocated, which was unusual at the time, and the class construct combined the data structure definition with the subprograms that operate on its variables, making Simula 67 the first language to offer encapsulation. However, Simula 67 did not offer a mechanism for information hiding, so the Simula 67 class fails as an ADT. ALGOL 68 also permitted the user to define data structures but had no encapsulation or information hiding. The form of a class in Simula 67 is

    class class_name;
    begin
      variable declarations here
      subprogram definitions of operators here
      code section here
    end class_name;
3
Modula-2 ADTs

  - The unit for encapsulation is called a module
      - modules can be combined to form libraries of ADTs
  - To define a module:
      - definition module: the interface, containing a partial or complete type definition (data structure) and the subprogram headers and parameters
      - implementation module: the portion of the data structure that is to be hidden, along with all operation subprograms
  - If the complete type declaration is given in the definition module, the type is "transparent"; otherwise it is "opaque"
      - opaque types represent true ADTs and must be accessed through pointers
      - this restriction allows the ADT to be entirely hidden from user programs, since the user program need only define a pointer

By restricting access to an ADT to be via pointer, one can modify the ADT code and recompile that definition without having to recompile any user code. For instance, suppose I define an ADT in a file and compile it, and then you write a program that uses my ADT. Later, I modify the structure of my ADT (without modifying the interface), recompile it, and make the new object file available to you. You will not have to recompile your code if it accesses the ADT through a pointer, because the size of a pointer has not changed. However, if your code declares a variable of the ADT structure itself, then your code MUST be recompiled because I have changed the storage requirements for the ADT.
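The opaque-type idea carries over to C and C++ as an incomplete type that clients reach only through a pointer. The following is a minimal sketch in C++, not Modula-2; the names `Stack`, `stack_create`, `stack_push`, and so on are hypothetical, and the three parts would normally live in separate files (a header, the client, and the implementation) rather than in one listing.

```cpp
#include <cassert>

// ---- interface (plays the role of the definition module) ----
struct Stack;                        // opaque: clients never see the layout
Stack* stack_create();
void   stack_push(Stack* s, int value);
int    stack_top(const Stack* s);
bool   stack_empty(const Stack* s);
void   stack_destroy(Stack* s);

// ---- client code: holds only a pointer to the ADT ----
int client_demo() {
    Stack* s = stack_create();
    stack_push(s, 7);
    stack_push(s, 42);
    int result = stack_top(s);       // most recently pushed value
    stack_destroy(s);
    return result;
}

// ---- implementation (plays the role of the implementation module) ----
// The layout below can change without touching client_demo, because the
// client only ever stored a pointer, whose size does not change.
struct Stack {
    int items[100];
    int count;
};

Stack* stack_create() { return new Stack{{}, 0}; }

void stack_push(Stack* s, int value) {
    if (s->count < 100) s->items[s->count++] = value;   // silently ignore overflow
}

int stack_top(const Stack* s) {
    assert(s->count > 0);            // precondition: stack is not empty
    return s->items[s->count - 1];
}

bool stack_empty(const Stack* s) { return s->count == 0; }

void stack_destroy(Stack* s) { delete s; }
```

Note that `client_demo` compiles before the layout of `Stack` is known, which is exactly why a change to that layout would not force the client to be recompiled.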
4
ADTs in Ada

  - The encapsulation construct is the package
  - Packages consist of two parts:
      - specification package (the public or interface part)
      - body package (the hidden or private part)
  - The two packages can be compiled separately, but only if the specification package is compiled first
  - The specification package must include details of the data structure itself
      - to preserve information hiding, the data structure's definition can follow the word private, denoting that what follows is hidden
  - Ada offers three forms of ADTs
      - those without information hiding, which are thus not true ADTs
      - those that preserve information hiding by specifying that the data structure is private
      - those that specify that the data structure is limited private
      - all ADTs have built-in operations for assignment and equality, except for limited private ADTs, which have no built-in operations at all

In most languages, ADTs are implemented by pointers, so assignment merely copies a pointer value, leaving two pointers to the same ADT rather than copying the data structure itself. In Ada, assignment means "copy the data structure into a new memory location". Similarly, equality in most languages tests two pointers to see if they point at the same memory location, but in Ada, equality tests whether two data structures have the same data values. While assignment and equality may be less efficient in Ada than in other languages, they are more flexible in that the operations are more meaningful.
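The difference between copying a structure and copying a pointer can be seen in any language that offers both. The sketch below uses C++ rather than Ada; `ValueStack`, `push`, and the two demo functions are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>

// A record-like stack held by value, comparable to the Ada record approach.
struct ValueStack {
    std::array<int, 100> list{};
    int topsub = 0;
};

// Value equality: compares the data, not the addresses.
bool operator==(const ValueStack& a, const ValueStack& b) {
    return a.topsub == b.topsub && a.list == b.list;
}

void push(ValueStack& s, int v) {
    if (s.topsub < 100) s.list[s.topsub++] = v;
}

// Assignment of the structure itself makes an independent copy.
bool value_copy_is_independent() {
    ValueStack a;
    push(a, 1);
    ValueStack b = a;                 // copies the whole structure
    bool equal_after_copy = (a == b);
    push(b, 2);                       // changes b only
    return equal_after_copy && !(a == b) && a.topsub == 1;
}

// Assignment of a pointer copies only the address: both names alias one stack.
bool pointer_copy_is_alias() {
    ValueStack a;
    ValueStack* p = &a;
    ValueStack* q = p;                // copies the pointer, not the stack
    push(*q, 5);                      // visible through p as well
    return p == q && p->topsub == 1;
}
```

Both functions return true: the copied structure diverges from the original after a push, while the copied pointer keeps referring to the same stack.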
5
Example Part I

    package Stack_Pack is
      type Stack_Type is limited private;
      Max_Size : constant := 100;
      function Empty(Stk : in Stack_Type) return Boolean;
      procedure Push(Stk : in out Stack_Type; Element : in Integer);
      procedure Pop(Stk : in out Stack_Type);
      function Top(Stk : in Stack_Type) return Integer;
    private
      type List_Type is array (1..Max_Size) of Integer;
      type Stack_Type is record
        List : List_Type;
        Topsub : Integer range 0..Max_Size := 0;
      end record;
    end Stack_Pack;

  - The specification package for a stack ADT (see the next slide for the body package)
  - The actual ADT definition must appear either in the open section (i.e., the public part) or in the private section

NOTE: because we are using List_Type in our Stack_Type definition, List_Type must be defined first, and therefore it is defined prior to our actual Stack ADT, which is a record with a List and a Topsub. List_Type does not have to be defined in the private section, or even in this package, as long as it is defined somewhere, but it makes the most sense to define it where it is because it should also be hidden.

By making the ADT definition above a pointer to a stack record, and defining the actual data structure in the body package, our definition becomes a little cleaner: instead of defining the data structure in one place and the code elsewhere, we define the interface in one place and both the structure and the code in the body package. The private part of the specification package would look like this instead:

    private
      type Stack_Type;
      type Stack_Ptr is access Stack_Type;

Stack_Type itself (along with List_Type) is then defined in the body package. While this is cleaner, the approach has a drawback: a user program can declare a Stack_Ptr and manipulate it without having it point at a Stack_Type, which leads to run-time errors. In addition, equality and assignment now operate on pointers and therefore do not compare or copy the data structure as Ada intends.
An alternative implementation to this approach is to define a pointer in the private section of this package and define the actual Stack_Type ADT in the body package. This is discussed in more detail in the notes section of this slide.
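C++ has no limited private keyword, but a rough analog of "no built-in assignment or equality" can be obtained by deleting the copy operations and simply not defining `operator==`. This is only a sketch of that analogy under stated assumptions; `LimitedStack` and its members are hypothetical names, not a translation of the Ada package.

```cpp
#include <cassert>
#include <type_traits>

class LimitedStack {
public:
    LimitedStack() = default;

    // No built-in copying, loosely mirroring Ada's "limited".
    LimitedStack(const LimitedStack&) = delete;
    LimitedStack& operator=(const LimitedStack&) = delete;

    bool empty() const { return topsub_ == 0; }

    void push(int element) {
        if (topsub_ < kMaxSize) list_[topsub_++] = element;  // ignore overflow
    }

    void pop() {
        if (topsub_ > 0) --topsub_;
    }

    int top() const {
        assert(topsub_ > 0);          // precondition: stack is not empty
        return list_[topsub_ - 1];
    }

private:
    // Hidden representation, loosely mirroring the Ada private part.
    static constexpr int kMaxSize = 100;
    int list_[kMaxSize] = {};
    int topsub_ = 0;
};

// Compile-time checks that the type really cannot be copied.
static_assert(!std::is_copy_constructible<LimitedStack>::value,
              "LimitedStack must not be copy-constructible");
static_assert(!std::is_copy_assignable<LimitedStack>::value,
              "LimitedStack must not be copy-assignable");
```

Client code can still create, push to, and inspect a `LimitedStack`, but an attempt to assign one to another is rejected by the compiler.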
6
Example Part II

    with Ada.Text_IO; use Ada.Text_IO;
    package body Stack_Pack is
      function Empty(Stk : in Stack_Type) return Boolean is
      begin
        return Stk.Topsub = 0;
      end Empty;
      procedure Push(Stk : in out Stack_Type; Element : in Integer) is
      begin
        if Stk.Topsub >= Max_Size then
          Put_Line("ERROR - Stack overflow");
        else
          Stk.Topsub := Stk.Topsub + 1;
          Stk.List(Stk.Topsub) := Element;
        end if;
      end Push;
      procedure Pop(Stk : in out Stack_Type) is begin … end Pop;
      function Top(Stk : in Stack_Type) return Integer is begin … end Top;
    end Stack_Pack;

  - Notice that the actual ADT definition is omitted in the body package because it had already been defined in the specification package
  - The rest of the implementation is omitted

The author expresses concern regarding the use of pointers to ADTs. In this Ada example, the data structure is a record (like a struct), not a pointer to a record. There are advantages and disadvantages to this approach. The primary advantages are that we do not have to dereference a pointer every time we want to access the data structure and that we avoid the typical pointer concerns (aliases, dangling pointers, lost objects). The disadvantages are that assignment requires copying items between two data structures and equality means testing items between two data structures. The main advantage of using a pointer to a data structure, however, is to avoid the need to recompile user code, as discussed in the notes on Modula-2. The Modula-2 approach, where all data structures are pointed to, is cleaner than the Ada approach, where the programmer has a choice. The author really seems to like the Ada approach better (as mentioned in older editions of the book).
7
C++ ADTs

  - C++ offers two mechanisms for building data structures: the struct and the class
      - a C-style struct (data members only, all of them public) has no mechanism for information hiding, and any encapsulation built around it is merely available, not enforced
      - in C++ itself, a struct is a class whose members are public by default, so for a true ADT we normally use a C++ class with private members
  - C++ classes contain both visible (public) and hidden (private) components (as well as protected for inheritance)
  - C++ instances can be static, heap-dynamic, or stack-dynamic
      - the lifetime of a stack-dynamic instance ends when execution reaches the end of the scope in which it was declared
      - a stack-dynamic object may have heap-dynamic data, so parts of the object may continue to exist even though the instance is deallocated (unless a destructor releases them)
  - We defer most of our discussion of objects in C++ to the next chapter, but we will see an example next

Note that C structs do not support encapsulation at all (they cannot contain functions); however, you can use structs and build your own encapsulation through the use of a header file.
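The point about a stack-dynamic object owning heap-dynamic data can be sketched as follows. The class `Buffer` and the counter `live_buffers` are hypothetical, introduced only to make the lifetimes observable; the destructor is what keeps the heap data from outliving the object.

```cpp
#include <cassert>

int live_buffers = 0;    // hypothetical counter of heap arrays currently allocated

class Buffer {
public:
    explicit Buffer(int n) : data_(new int[n]) { ++live_buffers; }

    // Without this destructor the array would remain allocated after the
    // Buffer object itself is gone.
    ~Buffer() {
        delete[] data_;
        --live_buffers;
    }

    Buffer(const Buffer&) = delete;              // avoid double deletion
    Buffer& operator=(const Buffer&) = delete;

private:
    int* data_;          // heap-dynamic data owned by the object
};

// b is stack-dynamic: created on entry, destroyed when the scope ends.
int buffers_alive_inside_scope() {
    Buffer b(10);
    return live_buffers;     // read while b is still alive
}

// A heap-dynamic instance lives until it is explicitly deleted.
int buffers_alive_before_delete() {
    Buffer* p = new Buffer(10);
    int alive = live_buffers;
    delete p;                // lifetime ends here, not at end of scope
    return alive;
}
```

After either function returns, `live_buffers` is back to zero, since the destructor ran at the end of the scope in the first case and at the `delete` in the second.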
8
C++ Example

    #include <iostream>
    class stack {
      private:
        int *stackPtr;
        int max;
        int topPtr;
      public:
        stack( ) {                          // constructor
          stackPtr = new int [100];
          max = 99;
          topPtr = -1;
        }
        ~stack( ) {delete [ ] stackPtr;}    // destructor
        void push(int number) {…}           // details omitted
        void pop( ) {…}
        int top( ) {…}
        int empty( ) {…}
    };

  - Unlike the Ada example, in C++ the entire definition is encapsulated in one location
  - Information hiding is preserved through the use of a private part, with the interface being defined in the public part
  - Any methods that are to be defined in this class but not accessible outside of the class would also be defined in the private section

We explore classes in chapter 12, so we won't cover the above example in any detail here.
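For readers who want something that compiles, here is one possible completion of the class above. The method bodies are assumptions, since the slide omits them; the class is renamed `IntStack`, the error messages are invented for illustration, and copying is disabled because the class owns a raw array.

```cpp
#include <cassert>
#include <iostream>

class IntStack {
private:
    int* stackPtr;   // heap-allocated storage
    int  max;        // index of the last usable slot
    int  topPtr;     // index of the current top, -1 when empty

public:
    IntStack() : stackPtr(new int[100]), max(99), topPtr(-1) {}
    ~IntStack() { delete[] stackPtr; }

    // The class owns a raw array, so forbid copies (assumption: not in the slide).
    IntStack(const IntStack&) = delete;
    IntStack& operator=(const IntStack&) = delete;

    void push(int number) {
        if (topPtr == max)
            std::cerr << "Error in push: stack is full\n";
        else
            stackPtr[++topPtr] = number;
    }

    void pop() {
        if (topPtr == -1)
            std::cerr << "Error in pop: stack is empty\n";
        else
            --topPtr;
    }

    int top() const {
        assert(topPtr >= 0);     // precondition: stack is not empty
        return stackPtr[topPtr];
    }

    bool empty() const { return topPtr == -1; }
};
```

Client code can only go through `push`, `pop`, `top`, and `empty`; the three data members are unreachable from outside the class, which is the information hiding the slide describes.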
9
Java, C# and Ruby ADTs

  - All three languages support ADTs through classes
  - Java permits no stand-alone functions, only methods defined in class definitions, and, unlike C++, all objects are accessed through reference variables (pointers)
      - therefore, in Java, every data structure is an ADT
      - it is up to the programmer whether information hiding is enforced or not
  - C# borrows from both C++ and Java, but primarily from Java
      - all objects are heap-dynamic
      - modifiers are private, public, and protected, but C# also offers internal and protected internal modifiers, which relate access to the enclosing assembly (a compiled .NET unit such as a DLL or executable)
      - C# also offers properties, which can serve as both accessors and mutators (see the example in section 11.4)
  - Ruby requires that all instance variables be private, and all methods default to being public (but the programmer can change this)
      - instance variables do not have to be explicitly declared in Ruby; see the example in section 11.4
      - we look at Ruby in more detail in chapter 12
10
Parameterized ADTs

  - The ability to define an ADT where the type and/or size is specified generically so that a specific version can be generated later
      - a stack defined without specifying the element type (integer vs. string vs. float, etc.)
      - a stack defined without a restriction on the size of the stack
  - Ada, C++, Java and C# all have this capability
  - The approach is to replace the type definition with a placeholder that is filled in later
  - In Ada:

        generic
          Max_Size : Positive;
          type Element_Type is private;
        … rest of ADT as before, except that Element_Type replaces Integer and Max_Size as a constant is removed

  - Now we instantiate our ADT:

        package Integer_Stack is new Generic_Stack(100, Integer);
11
Parameterized ADTs Continued

  - In C++, parameterized ADTs are implemented as templated classes
      - to change the stack class' size, only change the constructor to receive the size as a parameter, which is used to establish the size of the array
      - to change the stack's type, the class is now defined as a template using template <class Type>, where Type is the placeholder to be filled in by a specific type at compile time
  - In both Ada and C++, the parameterized ADT definition is generated at compile time
      - in Ada, the word new in an instantiation signals that a new package should be generated by the compiler
      - in C++, if two definitions ask for the same type of ADT, only one copy of the code is generated; in Ada, the same code is generated twice!
  - In Java and C#, parameterized ADTs are implemented as generic classes (you should have covered this in 360 for Java, so we skip it here)

Note about Java and generics: prior to Java 1.5, generics were not available. You could, however, simulate them through the use of polymorphism. Recall that if you had some class Parent and subclass Child, and a method called foo in Parent, then an object of either class could call upon foo, so foo becomes a generic method. If you extend this concept by making the ADT store data of type Object, then, since all object types descend from Object, the ADT can implement methods that operate on the type Object but still permit you to store a specific type in the ADT (for instance, if the ADT is a stack, you could store Strings or Colors there). To store a primitive type, you would use the appropriate wrapper class (such as Integer to store int values). So you could make a generic ADT in Java through this technique. The only problem is that obtaining a specific item from the ADT requires casting the Object back to its actual type. You can see a brief example of this in the textbook, where the ADT is used to store Integers. Java 1.5 cleans this up by permitting true generic classes for parameterized ADTs.
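A stack parameterized by both element type and size can be sketched with a C++ class template. Note one deliberate difference from the slide: the size here is a template parameter (closer to the Ada generic's Max_Size) rather than a constructor argument. `GenericStack`, `IntegerStack`, and `NameStack` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Type is the element-type placeholder; MaxSize is the capacity placeholder.
// Both are filled in at compile time, when the template is instantiated.
template <class Type, std::size_t MaxSize>
class GenericStack {
public:
    bool empty() const { return count_ == 0; }

    // Returns false instead of storing the element when the stack is full.
    bool push(const Type& element) {
        if (count_ == MaxSize) return false;
        list_[count_++] = element;
        return true;
    }

    void pop() {
        if (count_ > 0) --count_;
    }

    const Type& top() const {
        assert(count_ > 0);          // precondition: stack is not empty
        return list_[count_ - 1];
    }

private:
    Type list_[MaxSize] = {};
    std::size_t count_ = 0;
};

// Two instantiations, comparable to
//   package Integer_Stack is new Generic_Stack(100, Integer);
using IntegerStack = GenericStack<int, 100>;
using NameStack    = GenericStack<std::string, 10>;
```

Each distinct combination of arguments, such as `GenericStack<int, 100>` and `GenericStack<std::string, 10>`, is a separate type generated by the compiler from the one template definition.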
12
Encapsulation Constructs

  - For large programs, to avoid having to recompile all code when one section changes, code can be grouped into logically related chunks called encapsulations
      - using nested subprograms, we can place logically related subprograms inside of the subprograms that commonly call them (an approach not available in the C-based languages)
      - C and C++ use header files to place logically related functions' prototypes in the same header; C++ also adds the "friend" (we look at this in chapter 12)
      - Ada packages (which can be compiled separately) can include any number of data and subprogram declarations, so that they can contain interfaces for multiple ADTs
      - C# assemblies can share code with other software written in the .NET environment
  - Each language has some technique for then using the named encapsulation, sometimes called a namespace

The idea of a namespace is that it allows units of a program (e.g., functions) to have the same name but be different sets of code: a namespace is a container (encapsulation) in which a given name can be recognized. Here, we look at how namespaces are specified in some of the more common languages.

C++ uses the keyword namespace to specify a namespace. To access elements of a namespace, you use the :: operator, known as the scope resolution operator. The :: notation is needed when two libraries have different definitions that share the same name. An example might be having a namespace MyStack and referencing a variable of that encapsulation using MyStack::topPtr.

Java uses packages, which combine one or more class definitions. Within a package, special access can be granted between classes (we cover this in chapter 12). We use import to import an entire package or a specific class from a package:

    import java.io.*;                   imports all classes in the package
    import javax.swing.JOptionPane;     imports only the selected class

Since Java is OO, the use of the namespace is governed by interaction with objects. Two classes imported by name into the same file cannot share a simple name (one of them must be written with its fully qualified name), but the methods, variables, and constants of different classes can share names. You would address a shared item by referencing the object or class, as in aStack.topPtr or aStack.pop();

Ada also uses packages. In Ada, the with clause is used to include a package, and the use clause makes the declarations in that package directly visible so that they need not be qualified with the package name. For example:

    with Ada.Text_IO; use Ada.Text_IO;

Without the use clause, an item from the package is named in full, as in Ada.Text_IO.Put.

Ruby uses modules, which are collections of methods and constants. Modules differ from classes in that a module is not a definition that can be instantiated or extended.
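The C++ namespace mechanism described above can be sketched as follows. `MyStack` and `topPtr` come from the text; `MyQueue`, `capacity`, and `capacity_difference` are hypothetical additions used to show two encapsulations sharing the same names.

```cpp
#include <cassert>

namespace MyStack {
    int topPtr = -1;                  // index of the top element, -1 when empty
    int capacity() { return 100; }
}

namespace MyQueue {
    int topPtr = 0;                   // same name, unrelated variable
    int capacity() { return 25; }     // same name, different code
}

// The scope resolution operator selects which definition is meant.
int capacity_difference() {
    return MyStack::capacity() - MyQueue::capacity();
}
```

Because each name is qualified with its namespace, `MyStack::topPtr` and `MyQueue::topPtr` coexist without conflict, and the compiler rejects an unqualified `topPtr` outside both namespaces.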