Programming Languages Concepts Chapter 1: Programming Languages Concepts Lecture # 4
Chapter 1: Preliminaries. Topics: Reasons for Studying Concepts of Programming Languages; Programming Domains; Language Evaluation Criteria; Influences on Language Design; Language Categories; Language Design Tradeoffs; Implementation Methods; Programming Environments.
Language Implementation. A programming language is characterized by its lexical elements, syntax, and semantics.
Language Implementation: Lexical Elements. The lexical elements of a language consist of: the character set (the set of all characters allowed in the text of a program); the rules for grouping characters into words, called lexemes (for example, the rules for constructing identifiers, integer constants, and string constants); the use of reserved words and keywords; and how comments are written.
Language Implementation: Lexical Elements (cont.). Lexemes are grouped into categories called tokens. The rules for forming tokens may be specified using formal systems such as regular expressions or context-free grammars (CFG). Example: index = 2*count + 17; Here Identifier, Integer, Operator, and Sign are the tokens:
Token        Lexemes
Identifier   index, count
Integer      2, 17
Operator     =, *, +
Sign         ;
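As a rough illustration of how regular expressions can specify tokens, the following Python sketch splits the example statement into (token, lexeme) pairs. The four token names follow the example above; the patterns and the function name tokenize are assumptions made for this illustration, not the rules of any particular language.

```python
import re

# Hypothetical token rules, one regular expression per token category.
TOKEN_SPEC = [
    ("Integer",    r"\d+"),
    ("Identifier", r"[A-Za-z_][A-Za-z_0-9]*"),
    ("Operator",   r"[=*+]"),
    ("Sign",       r";"),
    ("Skip",       r"\s+"),  # whitespace separates lexemes but is not a token
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(text):
    """Return a list of (token, lexeme) pairs; raise on an illegal character."""
    pairs = []
    pos = 0
    while pos < len(text):
        match = MASTER.match(text, pos)
        if match is None:
            raise SyntaxError(f"illegal character {text[pos]!r} at position {pos}")
        if match.lastgroup != "Skip":
            pairs.append((match.lastgroup, match.group()))
        pos = match.end()
    return pairs


for token, lexeme in tokenize("index = 2*count + 17;"):
    print(f"{token:<10} {lexeme}")
```

Each lexeme is reported with the name of the rule that matched it, so index and count both come out as Identifier, and 2 and 17 as Integer.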
Language Implementation: Lexical Elements (cont.). Note: each token can be one of the following:
Keyword: such as if, for, while, class, ...
Identifier: such as count, index, average, ...
Operator: such as +, -, *, /, %, ==, >, >=, <, <=, ...
Constant: such as 17, 3.14, ...
Literal string: such as "Hello friend", "Enter name:", ...
Sign: such as ( ) [ ] , ; " ' ...
Language Implementation: Syntax. Syntax describes the correct form of the syntactic units of a programming language, such as arithmetic expressions, assignment statements, programs, procedures, and functions. It is specified by giving the rules for constructing syntactic units from tokens and other syntactic units. These rules are written using context-free grammars (CFG), Backus-Naur Form (BNF), or syntax graphs.
Language Implementation: Syntax (cont.) (for reading) The syntactic units of this programming language may be specified using CFG as: BEGIN END identifier = identifier constant +
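To make the idea of grammar rules concrete, here is a minimal Python sketch of a recursive-descent recognizer. It assumes a hypothetical grammar that uses the same terminals as the slide (BEGIN, END, identifier, =, constant, +); the production rules shown in the comments are an assumption for illustration and are not taken from the lecture.

```python
def recognize(tokens):
    """Return True if the token list is a valid program under the assumed grammar.

    Assumed (hypothetical) grammar:
        <program>   -> BEGIN { <statement> } END
        <statement> -> identifier = <operand> { + <operand> }
        <operand>   -> identifier | constant
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def expect(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind} at token {pos}, found {peek()}")
        pos += 1

    def operand():
        if peek() == "constant":
            expect("constant")
        else:
            expect("identifier")

    def statement():
        expect("identifier")
        expect("=")
        operand()
        while peek() == "+":
            expect("+")
            operand()

    expect("BEGIN")
    while peek() == "identifier":
        statement()
    expect("END")
    if peek() is not None:
        raise SyntaxError("unexpected tokens after END")
    return True


print(recognize(["BEGIN", "identifier", "=", "identifier", "+", "constant", "END"]))
```

Each function corresponds to one non-terminal of the assumed grammar, which is why this style of parser maps so directly onto CFG or BNF rules.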
Language Implementation: Semantics. Semantics describes the meanings attached to the syntactic units of a language. There are two types of semantics. Static semantics: the rules that describe certain constraints of the language, for example that all identifiers must be declared, that the data types of operands and operators are compatible, and that functions are called with the proper number of arguments. Dynamic (run-time) semantics: the rules that specify what each syntactic unit does and how it should be translated.
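One static-semantics rule mentioned above, that all identifiers must be declared, can be checked without running the program. The Python sketch below assumes a hypothetical program representation (a list of "declare" and "assign" tuples, with strings as identifiers and integers as constants); both the representation and the function name check_declared are invented for this example.

```python
def check_declared(statements):
    """Return error messages for identifiers that are used before being declared."""
    declared = set()
    errors = []
    for number, statement in enumerate(statements, start=1):
        if statement[0] == "declare":
            declared.add(statement[1])
        elif statement[0] == "assign":
            _, target, operands = statement
            for name in [target, *operands]:
                # Strings are identifiers; integers are constants and need no declaration.
                if isinstance(name, str) and name not in declared:
                    errors.append(f"statement {number}: '{name}' is not declared")
    return errors


program = [
    ("declare", "index"),
    ("assign", "index", [2, "count", 17]),  # count was never declared
]
print(check_declared(program))
```

The check inspects only the program text, not its run-time behavior, which is what distinguishes static from dynamic semantics.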
Language Implementation: Semantics (cont.). Notes: Static semantics are specified using attribute grammars. The major formalisms used to specify dynamic semantics are operational semantics, axiomatic semantics, and denotational (mathematical) semantics.
Language Implementation Methods. There are three major methods for implementing programming languages: compiler implementation, pure interpretation, and hybrid implementation.
Language Implementation Methods: Compiler implementation. A high-level language program is translated into a machine language program, which can be executed directly on the computer. Examples are C/C++, Pascal, and COBOL.
Language Implementation Methods: Pure interpretation. A program called an interpreter takes as input the high-level language program and its input data; it executes the statements of this program on the computer and produces the result(s) of the program. An example is BASIC.
Language Implementation Methods: Pure interpretation (cont.). Notes: Programs are easier to implement (run-time errors can be displayed easily and immediately). Execution is slower (10 to 100 times slower than compiled programs). Often requires more space. Now rare for traditional high-level languages. Has made a significant comeback with some Web scripting languages (e.g., JavaScript).
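Pure interpretation can be sketched with a toy interpreter that works directly on source text. The language below (assignments made of identifiers and integer constants joined by +, plus a print statement) is hypothetical and exists only for this illustration.

```python
def interpret(source, inputs):
    """Execute a tiny hypothetical language straight from its source text."""
    variables = dict(inputs)  # the program's input data
    output = []
    for line in source.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith("print "):
            output.append(variables[line[len("print "):].strip()])
        else:
            # Every statement is decoded from text each time it is executed;
            # nothing is translated ahead of time.
            target, expression = line.split("=", 1)
            total = 0
            for term in expression.split("+"):
                term = term.strip()
                total += int(term) if term.isdigit() else variables[term]
            variables[target.strip()] = total
    return output


print(interpret("index = count + 17\nprint index", {"count": 2}))
```

Because each statement is re-decoded whenever it runs, a loop body would pay the decoding cost on every iteration, which is one source of the slowdown noted above. An undefined variable surfaces immediately as a KeyError at the offending statement.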
Language Implementation Methods: Hybrid implementation. The high-level language program is translated into an intermediate language designed to allow easy interpretation. The intermediate code program is then executed by an interpreter. Examples are Perl, Java, and Lisp.
Language Implementation Methods: Hybrid implementation (cont.). A compromise between compilers and pure interpreters. Faster than pure interpretation. Example: Java applets. The intermediate form is called byte code. Applets are downloaded in byte code form and then interpreted by a byte code interpreter.
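The two stages of a hybrid implementation can be sketched in a few lines of Python: a translator that turns an expression into a simple intermediate code, and an interpreter that executes that code. The stack-machine instruction set (PUSH, LOAD, MUL, ADD) and the function names are assumptions for illustration; real byte code formats such as Java's are far richer.

```python
def translate(expression):
    """Translate a sum of products (no parentheses) into stack-machine code."""
    code = []
    for i, term in enumerate(expression.split("+")):
        for j, factor in enumerate(term.split("*")):
            factor = factor.strip()
            if factor.isdigit():
                code.append(("PUSH", int(factor)))
            else:
                code.append(("LOAD", factor))
            if j > 0:
                code.append(("MUL", None))
        if i > 0:
            code.append(("ADD", None))
    return code


def run(code, variables):
    """Interpret the intermediate code using an operand stack."""
    stack = []
    for opcode, argument in code:
        if opcode == "PUSH":
            stack.append(argument)
        elif opcode == "LOAD":
            stack.append(variables[argument])
        elif opcode == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif opcode == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
    return stack.pop()


intermediate = translate("2*count + 17")
print(intermediate)
print(run(intermediate, {"count": 5}))
```

The source text is analyzed only once, in translate; run then works on instructions that are already decoded, which is why this approach is faster than pure interpretation.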
Just-in-Time Implementation Systems (JIT systems). Initially translate programs to an intermediate language. Then compile the intermediate language of a subprogram into machine code when it is called. The machine code version is kept for subsequent calls. JIT systems are widely used for Java programs. .NET languages are implemented with a JIT system.
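The "compile on first use, then keep the result" idea behind JIT systems can be imitated in Python. This is only an analogy under stated assumptions: Python's built-in compile() produces a CPython code object rather than machine code, and the class name JitCache is invented for this sketch.

```python
class JitCache:
    """Compile an expression the first time it is used and reuse the result afterwards."""

    def __init__(self):
        self.compiled = {}     # expression text -> compiled code object
        self.compilations = 0  # how many times compilation actually happened

    def evaluate(self, expression, variables):
        code = self.compiled.get(expression)
        if code is None:
            # First call: pay the translation cost once (stand-in for machine code).
            code = compile(expression, "<jit>", "eval")
            self.compiled[expression] = code
            self.compilations += 1
        # Later calls skip straight to execution of the cached version.
        return eval(code, {"__builtins__": {}}, dict(variables))


jit = JitCache()
print(jit.evaluate("2*count + 17", {"count": 5}))
print(jit.evaluate("2*count + 17", {"count": 1}))
print(jit.compilations)
```

The second call reuses the cached code object, so the compilation counter stays at one no matter how many times the expression is evaluated.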
Preprocessors. Preprocessor macros (instructions) are commonly used to specify that code from another file is to be included. A preprocessor processes a program immediately before it is compiled, expanding the embedded preprocessor macros. A well-known example is the C preprocessor, which expands #include, #define, and similar macros.
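Macro expansion can be sketched as a text-to-text pass that runs before compilation. The Python function below is a deliberately simplified stand-in, not the C preprocessor: it handles only object-like #define macros with whole-word replacement, and it ignores #include, function-like macros, and string literals. The name preprocess is an assumption for this example.

```python
import re


def preprocess(source):
    """Expand simple '#define NAME body' macros and drop the directive lines."""
    macros = {}
    output = []
    for line in source.splitlines():
        match = re.match(r"\s*#define\s+(\w+)\s+(.*)", line)
        if match:
            macros[match.group(1)] = match.group(2).strip()
            continue  # the directive itself does not reach the compiler
        for name, body in macros.items():
            # \b restricts the replacement to whole identifiers, so MAXIMUM is untouched.
            line = re.sub(rf"\b{re.escape(name)}\b", lambda _m, body=body: body, line)
        output.append(line)
    return "\n".join(output)


print(preprocess("#define MAX 100\nint a[MAX];\nint MAXIMUM = MAX;"))
```

The compiler proper would see only the expanded text, with MAX already replaced by 100.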
Programming Environments. A programming environment is the collection of tools used in software development. UNIX: an older operating system and tool collection, nowadays often used through a graphical user interface (GUI), such as CDE, KDE, or GNOME, that runs on top of UNIX. Borland JBuilder: an integrated development environment (IDE) for Java. Microsoft Visual Studio .NET: a large, complex visual environment used to program in C#, Visual Basic .NET, JScript, J#, or C++.