Concepts of Programming Languages


Concepts of Programming Languages Dr. Mohamed Yehia Dahab

Pattern Matching (Regular Expressions)
A regular expression (regex for short) is a special text string for describing a search pattern.
You can think of regular expressions as a more powerful form of wildcards, such as the "*.txt" notation used to find all text files in a file manager.
A regular expression is written in a formal language that can be interpreted by a regular expression processor.

Regular Expressions (Cont'd)
Metacharacter, description, and example:

character
    Any literal letter, number, or punctuation character (other than those that follow) matches itself.
    Example: apple matches apple.
(pattern)
    Patterns can be grouped together using parentheses so that they can be treated as a unit.
    Example: see following.
.
    Match a single character (except linefeed).
    Example: s.t matches sat, sit, sQt, s3t, s&t, s t,...
?
    Match zero or one of the previous character/expression. (When immediately following ?, +, *, or {min,max} it prevents the expression from using "greedy" evaluation.)
    Example: colou?r matches color, colour.
+
    Match one or more of the previous character/expression.
    Example: a+rgh! matches argh!, aargh!, aaargh!,...
*
    Match zero or more of the previous character/expression.
    Example: b(an)*a matches ba, bana, banana, bananana,...
{number}
    Match exactly number copies of the previous character/expression.
    Example: .o{2}n matches noon, moon, loon,...
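The examples in the table above can be checked directly in Python. This is a minimal sketch: the helper name all_match and the dictionary of test strings are illustrative choices, not part of the slides. re.fullmatch() is used so that a pattern only counts as matching when it covers the whole string.

```python
import re

# Patterns from the table, each paired with strings it is expected to match.
examples = {
    r"s.t": ["sat", "sit", "s3t", "s t"],
    r"colou?r": ["color", "colour"],
    r"a+rgh!": ["argh!", "aaargh!"],
    r"b(an)*a": ["ba", "bana", "banana"],
    r".o{2}n": ["noon", "moon", "loon"],
}


def all_match(pattern, strings):
    """Return True if every string is matched in full by the pattern."""
    return all(re.fullmatch(pattern, s) is not None for s in strings)


for pattern, strings in examples.items():
    print(pattern, all_match(pattern, strings))
```

Using re.search() instead would also accept strings that merely contain a match, for example "colours".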

Regular Expressions (Cont'd)
Metacharacter, description, and example:

{min,max}
    Match between min and max copies (inclusive) of the previous character/expression.
    Example: kabo{2,4}m matches kaboom, kabooom, kaboooom.
[set]
    Match a single character in set (list and/or range). Most characters that have special meanings in regular expressions do not have to be backslash-escaped in character sets.
    Examples: J[aio]b matches Jab, Jib, Job. [A-Z][0-9]{3} matches an uppercase letter followed by three digits, such as A123.
[^set]
    Match a single character not in set (list and/or range). Most characters that have special meanings in regular expressions do not have to be backslash-escaped in character sets.
    Example: q[^u] matches very few English words (Iraqi? qoph? qintar?).
|
    Match either expression that it separates.
    Example: (Mi|U)nix matches Minix and Unix.
^
    Match the start of a line.
    Example: ^# matches lines that begin with #.
$
    Match the end of a line.
    Example: ^$ matches an empty line.
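The set, alternation, and anchor entries above can be tried out as follows. This is a sketch under a few assumptions: the sample strings are made up for illustration, and the anchors are applied one line at a time rather than with a multi-line flag.

```python
import re

# {min,max}: between two and four copies of "o".
three_os = re.fullmatch(r"kabo{2,4}m", "kabooom") is not None
one_o = re.fullmatch(r"kabo{2,4}m", "kabom") is not None

# [set] and [^set]: one character in, or not in, the listed set.
jobs = re.findall(r"J[aio]b", "Jab Jib Job Jub")
q_not_u = re.findall(r"q[^u]", "Iraqi queen")

# |: alternation. finditer() is used because findall() would return only
# the text captured by the parentheses, not the whole match.
unix_like = [m.group(0) for m in re.finditer(r"(Mi|U)nix", "Minix Unix Linux")]

# ^ and $: start and end of a line, checked line by line.
lines = "# comment\ncode\n\n# another".splitlines()
comment_lines = [ln for ln in lines if re.search(r"^#", ln)]
empty_count = sum(1 for ln in lines if re.search(r"^$", ln))

print(three_os, one_o)
print(jobs, q_not_u, unix_like)
print(comment_lines, empty_count)
```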

Regular Expressions (Cont'd)
Metacharacter, description, and example:

\
    Interpret the next metacharacter literally, or introduce a special escaped combination (see following).
    Example: \* matches a literal asterisk.
\n
    Match a single newline (line feed) character.
    Example: Hello\nWorld matches Hello and World on two consecutive lines.
\t
    Match a single tab character.
    Example: Hello\tWorld matches Hello and World separated by a tab.
\s
    Match a single whitespace character.
    Example: Hello\s+World matches Hello World, Hello  World, Hello   World,...
\S
    Match a single non-whitespace character.
    Example: \S\S\S matches AAA, The, 5-9,...
\d
    Match a single digit character.
    Example: \d\d\d matches 123, 409, 982,...
\D
    Match a single non-digit character.
    Example: \D\D matches It, as, &!,...
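The escape sequences above behave as follows in Python. The sample strings are illustrative assumptions; the patterns are written as raw strings (r'...') so that the backslashes reach the regular expression processor unchanged.

```python
import re

# \* matches a literal asterisk instead of acting as a repetition operator.
stars = re.findall(r"\*", "2 * 3 * 4")

# \n, \t and \s match a newline, a tab, and any whitespace character.
newline_ok = re.fullmatch(r"Hello\nWorld", "Hello\nWorld") is not None
tab_ok = re.fullmatch(r"Hello\tWorld", "Hello\tWorld") is not None
spaces_ok = re.fullmatch(r"Hello\s+World", "Hello   World") is not None

# \d, \D and \S match a digit, a non-digit, and a non-whitespace character.
three_digits = re.findall(r"\d\d\d", "123 409 98")
two_non_digits = re.findall(r"\D\D", "It&!")
three_non_space = re.findall(r"\S\S\S", "The 5-9 ab")

print(stars, newline_ok, tab_ok, spaces_ok)
print(three_digits, two_non_digits, three_non_space)
```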

Regular Expressions in Python
Before you can use regular expressions in your program, you must import the library using "import re".
    import re
    line = 'This email from someone'
    if re.search('from', line):
        print('Found')

Regular Expressions in Python (Cont'd)
re.search() returns a match object if the regular expression matches anywhere in the string, and None otherwise, so its result can be tested as true or false.
    import re
    line = '30/5/1999'
    if re.search(r'\d+', line):
        print('Found')
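In Python, the value that re.search() hands back is a match object or None, which is why it works in an if test. A short sketch of what the match object carries, with variable names chosen only for illustration:

```python
import re

line = '30/5/1999'

# A successful search yields a match object describing the first match.
m = re.search(r'\d+', line)
if m:
    print(m.group())   # the matched text
    print(m.span())    # (start, end) positions of the match in the string

# An unsuccessful search yields None, which is false in an if test.
no_match = re.search(r'\d+', 'no digits here')
print(no_match)
```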

Regular Expressions in Python (Cont'd)
If we want the matching strings, we use re.findall().
    import re
    line = '30/5/1999'
    y = re.findall(r'\d+', line)
    print(y)
Output: ['30', '5', '1999']

Regular Expressions in Python (Cont'd)
Finding proper names: a capital letter followed by one or more lowercase letters.
    import re
    line = 'I saw Ahmed and Ali'
    y = re.findall('[A-Z][a-z]+', line)
    print(y)
Output: ['Ahmed', 'Ali']

Regular Expressions in Python (Cont'd)
If no part of the string matches the regular expression, findall() returns an empty list.
    import re
    line = 'I saw Ahmed and Ali'
    y = re.findall('[A-Z][0-9]+', line)
    print(y)
Output: []

Regular Expressions in Python (Cont'd)
Greedy matching matches the longest possible string.
    import re
    line = 'Ahmed found 300$ while Ali found 350$'
    y = re.findall(r'[0-9].+\$', line)
    print(y)
Output: ['300$ while Ali found 350$']

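Regular Expressions in Python (Cont'd)
Non-greedy matching matches the shortest possible string: adding ? after + makes the repetition stop at the first $.
    import re
    line = 'Ahmed found 300$ while Ali found 350$'
    y = re.findall(r'[0-9].+?\$', line)
    print(y)
Output: ['300$', '350$']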

Regular Expressions in Python (Cont'd)
Greedy matching for HTML: the greedy .* runs on to the last </H1>, so the whole line is returned as a single match.
    import re
    line = '<H1>Some text </H1><H1>Some text again </H1>'
    y = re.findall('<H1>.*</H1>', line)
    print(y)
Output: ['<H1>Some text </H1><H1>Some text again </H1>']

Regular Expressions in Python (Cont'd)
Parentheses are not part of the match, but they tell findall() where the extracted text starts and stops. The non-greedy .*? stops at the first </H1>.
    import re
    line = '<H1>Some text </H1><H1>Some text again </H1>'
    y = re.findall('<H1>(.*?)</H1>', line)
    print(y)
Output: ['Some text ', 'Some text again ']

Regular Expressions in Python (Cont'd)
Extracting a host name. You can see this code on https://ideone.com/aF3pDE
    import re
    x = 'From mohamed.dahab@gmail.com Fri Jan 5 09:14:16 2016'
    y = re.findall(r'@(.*?)\s', x)
    print(y)
Output: ['gmail.com']
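The same address line can also be split into user and host with two groups and re.search(). This is an alternative sketch, not the code behind the link above; the pattern (\S+)@(\S+) and the variable names are assumptions made for illustration.

```python
import re

x = 'From mohamed.dahab@gmail.com Fri Jan 5 09:14:16 2016'

# Two groups: non-whitespace before the @, and non-whitespace after it.
m = re.search(r'(\S+)@(\S+)', x)
if m:
    user = m.group(1)   # text captured by the first pair of parentheses
    host = m.group(2)   # text captured by the second pair
    print(user, host)
```

group(0) would give the whole address, while group(1) and group(2) give only the parenthesized parts.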

Report 1
Perform all previous examples using two different programming languages.
Suggested programming languages: Java, VB.NET.
You can use https://ideone.com or any other online compiler site.
Write your code on the site, save it there, copy the link, and write the link beside the corresponding example (take care not to overwrite the previous code).

Character String Implementation
Static length: compile-time descriptor.
Limited dynamic length: may need a run-time descriptor for length (but not in C and C++).
Dynamic length: needs a run-time descriptor; allocation/de-allocation is the biggest implementation problem.

Compile- and Run-Time Descriptors
Compile-time descriptor for static strings.
Run-time descriptor for limited dynamic strings.
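A run-time descriptor for a limited dynamic string can be pictured as a small record that travels with the string. The sketch below is only an illustration in Python: the class name, the field names, and the choice of maximum length, current length, and storage as the descriptor contents are assumptions, since the slide's figures are not reproduced in this transcript.

```python
from dataclasses import dataclass


@dataclass
class LimitedDynamicString:
    """Hypothetical run-time descriptor for a limited dynamic length string."""

    max_length: int   # fixed when the variable is declared
    data: str = ""    # stands in for the address of the character storage

    @property
    def current_length(self) -> int:
        # The current length changes at run time, so it must be tracked.
        return len(self.data)

    def assign(self, value: str) -> None:
        # The length may vary, but never beyond the declared maximum.
        if len(value) > self.max_length:
            raise ValueError(f"string longer than {self.max_length} characters")
        self.data = value


name = LimitedDynamicString(max_length=10)
name.assign("Ahmed")
print(name.current_length, name.max_length)
```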

User-Defined Ordinal Types
An ordinal type is one in which the range of possible values can be easily associated with the set of positive integers.
Examples of primitive ordinal types in Java: integer, char, boolean.

Enumeration Types
All possible values, which are named constants, are provided in the definition.
C# example:
    enum days {mon, tue, wed, thu, fri, sat, sun};
Design issues:
Is an enumeration constant allowed to appear in more than one type definition, and if so, how is the type of an occurrence of that constant checked?
Are enumeration values coerced to integer?

Evaluation of Enumerated Type
Aid to readability, e.g., no need to code a color as a number.
Aid to reliability, e.g., the compiler can check:
operations (don't allow colors to be added);
assignments (no enumeration variable can be assigned a value outside its defined range).
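Python's standard enum module can be used to try out both reliability points and the coercion question. This is a sketch with an illustrative Color type; note that Python reports these errors at run time, whereas the slides describe checks made by a compiler.

```python
from enum import Enum


class Color(Enum):
    RED = 1
    GREEN = 2
    BLUE = 3


c = Color.RED
print(c.name)              # readable name instead of a bare number

# Plain Enum members are not coerced to integers.
equals_int = (Color.RED == 1)

# Adding two colors is rejected.
try:
    Color.RED + Color.GREEN
    add_allowed = True
except TypeError:
    add_allowed = False

# A value outside the defined range cannot be turned into a Color.
try:
    Color(7)
    out_of_range_allowed = True
except ValueError:
    out_of_range_allowed = False

print(equals_int, add_allowed, out_of_range_allowed)
```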

Example: Enumeration in C++
    #include <iostream>

    int main() {
        enum Color { RED, GREEN, BLUE };
        Color r = RED;
        switch (r) {
            case RED:   std::cout << "red\n";   break;
            case GREEN: std::cout << "green\n"; break;
            case BLUE:  std::cout << "blue\n";  break;
        }
        return 0;
    }
http://codepad.org/eYCzkdcx

Subrange Types
An ordered contiguous subsequence of an ordinal type.
Example: 12..18 is a subrange of integer type.
Ada's design:
    type Days is (mon, tue, wed, thu, fri, sat, sun);
    subtype Weekdays is Days range mon..fri;
    subtype Index is Integer range 1..100;
    Day1: Days;
    Day2: Weekdays;
    Day2 := Day1;  -- checked at run time: an error if Day1 is sat or sun

Subrange Evaluation
Aid to readability: makes it clear to readers that variables of a subrange type can store only a certain range of values.
Reliability: assigning a value to a subrange variable that is outside the specified range is detected as an error.
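Python has no built-in subrange types, but the range check can be sketched by hand. The Subrange class below is a hypothetical helper written for illustration, loosely modelled on an Ada subtype such as Integer range 1..100; it checks values at run time only.

```python
class Subrange:
    """Hypothetical range-checked integer type, loosely modelled on an Ada subtype."""

    def __init__(self, low: int, high: int) -> None:
        if low > high:
            raise ValueError("low must not exceed high")
        self.low = low
        self.high = high

    def check(self, value: int) -> int:
        """Return value unchanged if it lies in low..high, otherwise raise."""
        if not self.low <= value <= self.high:
            raise ValueError(f"{value} is outside {self.low}..{self.high}")
        return value


index = Subrange(1, 100)       # like: subtype Index is Integer range 1..100
i = index.check(42)            # accepted
try:
    i = index.check(101)       # rejected, so i keeps its previous value
except ValueError as err:
    print("range error:", err)
print(i)
```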