Martin Kruliš, 29. 2. 2016 (v1.1)


• Dynamic Nature of PHP
  ◦ Values
    ▪ Exist in a managed memory space
    ▪ Created as literals, results of expressions, or by internal constructions and functions
    ▪ Explicit data type: boolean, integer, float, string, array, object, resource, NULL
  ◦ Memory Management
    ▪ Uses copy-on-write and reference counting
    ▪ Values are not always copied on assignment
    ▪ Once a value has zero references, it is garbage collected

• Dynamic Nature of PHP
  ◦ Variables
    ▪ Mnemonic references to values
    ▪ No declarations, created on the first assignment
    ▪ In the global or local scope
    ▪ Globals can be mapped into local context (global keyword)
    ▪ No explicit type (type is determined by current value)
    ▪ Type can be changed with a new assignment
    ▪ Existence can be tested (isset) and terminated (unset)
  ◦ Arrays
    ▪ An array key behaves in many ways like a variable
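The points about variables can be tried out with a short script. This is only a sketch; the variable names ($value, $config) are made up for illustration.

```php
<?php
// Variables are created on first assignment; the type follows the current value.
$value = 42;
$typeBefore = gettype($value);   // "integer"
$value = 'forty-two';
$typeAfter = gettype($value);    // "string"

// isset() tests existence, unset() terminates the variable.
$existsBefore = isset($value);   // true
unset($value);
$existsAfter = isset($value);    // false

// An array key can be tested and removed much like a variable.
$config = ['debug' => true, 'level' => 3];
unset($config['debug']);
$keyExists = isset($config['debug']);   // false
$remaining = array_keys($config);       // ['level']
```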

• Implications
  ◦ Large data handling
    ▪ Shared reading works fine thanks to copy-on-write (CoW)
    ▪ Explicit unset() may release data that are no longer needed
  ◦ There are no pointers…
    ▪ Some data structures depend on pointers/references
    ▪ Instead of pointers
      - The arrays are flexible enough
      - Objects are passed by reference (like in C#/Java)
      - Variable variables
      - Explicit references

• Indirect Access to Values
  ◦ Name of one variable is stored in another variable
      $a = 'b';
      $$a = 42;          // the same as $b = 42;

      $a = 'b'; $b = 'c'; $c = 'd';
      $$$$a = 'hello';   // the same as $d = 'hello';
  ◦ The {, } can be used to avoid ambiguous situations
  ◦ Can be used with members, functions, classes, …
      $obj = new $className();
      $obj->$varName = 42;
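A small sketch of how braces resolve the ambiguity mentioned above. All variable names here are hypothetical and chosen only for the demonstration.

```php
<?php
$prefix = 'user';
$user_name = 'alice';

// Braces delimit the expression that produces the variable name.
$viaBraces = ${$prefix . '_name'};   // reads $user_name

// With arrays, braces decide what the index applies to.
$names    = ['first', 'second'];
$first    = 'value of $first';
$list     = ['x', 'y'];
$listName = 'list';

$elementAsName = ${$names[0]};      // variable named by $names[0], i.e. $first
$indexOfNamed  = ${$listName}[1];   // element 1 of the variable named by $listName, i.e. $list[1]
```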

• References
  ◦ Similar to Unix hard-links in FS
  ◦ Multiple variables attached to the same data
    ▪ Reference is taken by the & operator
  ◦ Independent of object references
    ▪ A reference to an object can be created
      $a = 1;
      $b = &$a;
      $b++;
      echo $a;   // prints 2
  (Figure: $a and $b are attached to the same value, int (1) before and int (2) after $b++)
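The difference between a PHP reference and an object handle can be sketched as follows. The example assumes nothing beyond the built-in stdClass; the variable names are illustrative.

```php
<?php
$a = 1;
$b = &$a;      // $b and $a now name the same value
$b++;
$shared = $a;  // 2

// Plain assignment copies the value instead of sharing it.
$c = $a;
$c++;
$afterCopy = $a;   // still 2

// Objects: assignment copies the handle, so both names reach one object...
$o1 = new stdClass();
$o1->n = 1;
$o2 = $o1;
$o2->n = 5;
$viaHandle = $o1->n;   // 5

// ...but the two variables are still separate unless tied together with &.
$o2 = null;
$stillObject = ($o1 instanceof stdClass);   // true

$o3 = &$o1;    // a reference to the variable that holds the object
$o3 = null;
$nowNull = ($o1 === null);   // true
```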

• Arguments as References
  ◦ Similar usage as var keyword in Pascal
      function inc(&$x) { $x++; }
• Returning References
      function &findIt($what) {
          global $myArray;
          return $myArray[$what];   // no & here; the & in the declaration is sufficient
      }
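A sketch of how a returned reference is used on the caller's side. findIn() is a hypothetical variant of findIt() that receives the array as a parameter instead of reading a global, so that the example is self-contained.

```php
<?php
function inc(&$x) { $x++; }

$n = 1;
inc($n);   // $n is now 2

// Hypothetical variant of findIt() that takes the array as a parameter.
function &findIn(array &$haystack, $key) {
    return $haystack[$key];
}

$prices = ['tea' => 10, 'coffee' => 20];

// The caller must also use & to bind to the returned reference.
$ref = &findIn($prices, 'tea');
$ref = 15;                       // writes into $prices['tea']

// Without & the result is an ordinary copy.
$copy = findIn($prices, 'coffee');
$copy = 99;                      // $prices['coffee'] is unchanged
```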

• References vs. Pointers
      function foo(&$var) { $var = &$GLOBALS['bar']; }

      $x = 42;
      foo($x);   // How is $x affected?
• The unset() Function
  ◦ Does not remove data, only the variable
  ◦ Data are removed when not referenced
• The global Declaration
      global $a;   is equivalent to   $a = &$GLOBALS['a'];
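The question on the slide can be explored with a sketch like the one below. rebind() and overwrite() are hypothetical helpers; rebind() mirrors foo() but re-attaches the parameter to a local variable rather than to $GLOBALS['bar'], which should behave the same way from the caller's point of view.

```php
<?php
// Hypothetical helper: re-attaches its parameter to a different variable.
function rebind(&$var) {
    $other = 'something else';
    $var = &$other;     // only the local name $var is re-attached
    $var = 'changed';   // writes to $other, not to the caller's variable
}

// Hypothetical helper: assigns through the reference instead of rebinding it.
function overwrite(&$var) {
    $var = 'changed';   // writes to the caller's variable
}

$x = 42;
rebind($x);
$afterRebind = $x;      // still 42

overwrite($x);
$afterOverwrite = $x;   // 'changed'

// unset() removes only the name; the data stay while another name refers to them.
$a = 'data';
$b = &$a;
unset($a);
$survivor = $b;         // 'data'
$aExists  = isset($a);  // false
```

In other words, assigning a new reference to a by-reference parameter changes only the binding inside the function, which is why references cannot be used like pointers.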

• Declaration
  ◦ Keyword function followed by the identifier
      function foo([args, …]) { … body … }
  ◦ Function body
    ▪ Pretty much anything (even a function/class declaration)
    ▪ Nested functions/classes are declared when the enclosing function is called for the first time
    ▪ Functions are 2nd level citizens and identifier space is flat
  ◦ Results
    ▪ Optional argument of the return construct
    ▪ Only one value, but it can be an array or an object
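The behaviour of nested declarations can be observed with a sketch like this one. The names outer() and inner() are hypothetical.

```php
<?php
function outer() {
    // inner() is declared only when outer() runs; calling outer() a second
    // time would attempt to declare it again and fail.
    function inner() {
        return 'inner result';
    }
    return ['first', 'second'];   // a single return value may be an array
}

$beforeCall = function_exists('inner');   // false
list($x, $y) = outer();
$afterCall  = function_exists('inner');   // true

// The identifier space is flat, so inner() is callable from the outside.
$innerValue = inner();
```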

• Argument Declarations
  ◦ Implicit (default) values may be provided
      function foo($x, $y = 1, $z = 2) { … }
    ▪ Arguments with implicit values are aligned to the right
    ▪ Note that PHP functions do not support overloading
• Variable Number of Arguments
  ◦ Any user-defined function can be called with more arguments than formally declared
  ◦ Functions func_num_args(), func_get_arg(), and func_get_args() provide access to such arguments
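A brief sketch of both mechanisms. volume() and sumAll() are illustrative names, not part of the original slides.

```php
<?php
// Parameters with default values are placed at the right end of the list.
function volume($x, $y = 1, $z = 2) {
    return $x * $y * $z;
}

$oneArg  = volume(5);         // 5 * 1 * 2 = 10
$twoArgs = volume(5, 3);      // 5 * 3 * 2 = 30
$allArgs = volume(5, 3, 4);   // 60

// A user-defined function may receive more arguments than it declares.
function sumAll() {
    $total = 0;
    foreach (func_get_args() as $arg) {
        $total += $arg;
    }
    return [func_num_args(), $total];
}

list($count, $sum) = sumAll(1, 2, 3, 4);   // 4 arguments, total 10
```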

• Indirect Calling
  ◦ Calling a function by its name stored in a variable
      function foo($x, $y) { … }
      $funcName = 'foo';
      $funcName(42, 54);   // the same as foo(42, 54);
• Similar Constructs
  ◦ Using specialized invocation functions
    ▪ call_user_func('foo', 42, 54);
    ▪ call_user_func_array('foo', array(42, 54));
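The three ways of calling can be compared side by side. add() and mul() are hypothetical functions; the is_callable() guard is an addition that is often useful when the name comes from data.

```php
<?php
function add($x, $y) { return $x + $y; }
function mul($x, $y) { return $x * $y; }

$funcName = 'add';
$direct = $funcName(42, 54);   // the same as add(42, 54)

// The same kind of call through the invocation helpers.
$viaCall  = call_user_func('mul', 6, 7);
$viaArray = call_user_func_array('mul', array(6, 7));

// Checking is_callable() first avoids an error when the name is unknown.
$unknown = 'noSuchFunction';
$safe = is_callable($unknown) ? $unknown(1, 2) : null;
```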

• Testing Function Existence
  ◦ function_exists() – tests whether the given function exists
  ◦ get_defined_functions() – list of all function names
• Cleanup Functions
  ◦ register_shutdown_function() – registers a function, which is executed when the script finishes
• Special Case of Left-side Function
  ◦ Function list() is used at the left side of assignments
    ▪ Reverse logic – it fills its arguments
      list($a, $b, $c) = array(1, 2, 3);
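A sketch that exercises the functions listed above. cleanup_helper() is a hypothetical user function, and the shutdown message is arbitrary.

```php
<?php
$hasStrlen  = function_exists('strlen');             // true: built-in function
$hasMissing = function_exists('no_such_function');   // false

// get_defined_functions() groups the names under 'internal' and 'user'.
function cleanup_helper() {}
$defined = get_defined_functions();
$userHasHelper = in_array('cleanup_helper', $defined['user']);

// list() fills its arguments from the array on the right-hand side.
list($a, $b, $c) = array(1, 2, 3);

// The registered function runs after the script finishes.
register_shutdown_function(function () {
    echo "cleanup done\n";
});
```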

• Lambda (Nameless) Functions
  ◦ A unique name is generated automatically
  ◦ Function create_function() (deprecated since PHP 7.2, removed in PHP 8.0)
    ▪ Gets the arguments and the body as strings
    ▪ Returns an identifier of the newly created function
    ▪ Creates a string identifier, which cannot collide with regular identifiers
  ◦ Useful in many situations
    ▪ One-call functions
    ▪ Call-back functions
      $mul = create_function('$x, $y', 'return $x * $y;');
      echo $mul(3, 4);   // prints out '12'

• Anonymous Functions
  ◦ A better way to implement nameless functions
      $fnc = function($args) { …body… };
  ◦ The anonymous function is an instance of Closure
    ▪ It can be passed on like an object
  ◦ The visible variables must be explicitly stated
      $fnc = function(…) use ($var, &$refvar) { … };
    ▪ These variables are captured in the closure
    ▪ Variables passed by reference can be modified
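Capturing by value and by reference can be contrasted in a short sketch. The names $factor, $calls, and $scale are illustrative only.

```php
<?php
$factor = 3;
$calls  = 0;

// $factor is captured by value at definition time, $calls by reference.
$scale = function ($x) use ($factor, &$calls) {
    $calls++;
    return $x * $factor;
};

$factor = 100;   // does not affect the closure; it holds its own copy of 3

$first  = $scale(2);   // 6
$second = $scale(5);   // 15

$isClosure = $scale instanceof Closure;

// Closures can be passed around like any other value, e.g. as callbacks.
$scaled = array_map($scale, [1, 2, 3]);   // [3, 6, 9]
```

After the script runs, $calls holds the total number of invocations, because the closure modifies the original variable through the reference.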


Charsets, text processing, and regular expressions

• One Charset to Rule Them All
  ◦ HTML, PHP, database (connection), text files, …
  ◦ Determined by the language(s) used
    ▪ Unicode covers almost every language
  ◦ Convert incoming data as early as possible and outgoing data as late as possible
• Charset in Meta-data
  ◦ Must be in HTTP headers
      header('Content-Type: text/html; charset=utf-8');
  ◦ Do not use HTML meta element with http-equiv
    ▪ Except special cases (like saving HTML file locally)

• Multibyte Character Encoding
  ◦ Some charsets (e.g., UTF-8, UTF-16, …) use more than one byte per character
  ◦ Standard string functions are ANSI based
    ▪ They treat each byte as a char
• Multibyte String Functions Library
  ◦ Standard library, often present in PHP
  ◦ Duplicates most of the standard string functions, but with prefix mb_ (mb_strlen, mb_strpos, …)
  ◦ Encoding conversions: mb_convert_encoding()
  ◦ mb_internal_encoding() – specifies the internal encoding used in PHP
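The difference between byte-based and multibyte functions can be sketched as follows, assuming the mbstring extension is available and PHP 7 or newer (for the \u{...} escape). The sample string is only an illustration.

```php
<?php
// "Kruliš": the last letter is written as a Unicode escape, so the example
// does not depend on the encoding of this source file.
$name = "Kruli\u{0161}";

$bytes = strlen($name);               // counts bytes
$chars = mb_strlen($name, 'UTF-8');   // counts characters

// Byte-based functions can cut a multibyte character in half.
$lastByte = substr($name, -1);
$lastByteIsValid = mb_check_encoding($lastByte, 'UTF-8');
$lastChar = mb_substr($name, -1, null, 'UTF-8');

// Converting to a single-byte encoding changes the byte length, not the text.
$latin2 = mb_convert_encoding($name, 'ISO-8859-2', 'UTF-8');
$latin2Bytes = strlen($latin2);
```

In UTF-8 the letter š occupies two bytes, so strlen() reports one more than mb_strlen() for this string.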

• Encoding Input Data from HTTP
  ◦ Usually done transparently
    ▪ Check “mbstring” section of php.ini
  ◦ Can be done manually: mb_parse_str()
• Databases
  ◦ The database or the database connection usually needs to be configured
  ◦ An example for MySQL database
    ▪ mysqli_set_charset()

• Lexicographical Comparison of Strings
  ◦ Best done elsewhere (in DBMS for instance)
  ◦ The strcmp() function is binary safe
  ◦ The locale must be set correctly (setlocale())
• Iconv Library
  ◦ An alternative to Multibyte String Functions
  ◦ Fewer functions
  ◦ Easier for encoding conversions
    ▪ Can deal with missing mappings and replacements

• What to Verify or Sanitize
  ◦ Everything that possibly comes from users: $_GET, $_POST, $_COOKIE, …
  ◦ Data that comes from external sources (database, text files, …)
• When to Verify or Sanitize
  ◦ On input – verify correctness
    ▪ Before you start using data in $_GET, $_POST, …
  ◦ On output – sanitize to prevent injections
    ▪ When data are inserted into HTML, SQL queries, …

• How to Verify
  ◦ Regular expressions
  ◦ Filter functions
    ▪ filter_input(), filter_var(), …
    ▪ Useful for special validations (e-mail, URL, IP, …)
• How to Sanitize
  ◦ String and filter functions, regular expressions
  ◦ htmlspecialchars() – encoding for HTML
  ◦ urlencode() – encoding for URL
  ◦ DBMS-specific functions (mysqli_escape_string())
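A minimal sketch of verification on input and sanitization on output, using only built-in functions. The sample values and the search.php URL are made up for illustration.

```php
<?php
// Verification on input: filter_var() returns the value on success, false otherwise.
$goodMail = filter_var('user@example.com', FILTER_VALIDATE_EMAIL);
$badMail  = filter_var('not an e-mail', FILTER_VALIDATE_EMAIL);
$goodIp   = filter_var('192.168.0.1', FILTER_VALIDATE_IP);
$badUrl   = filter_var('not a url', FILTER_VALIDATE_URL);
$age      = filter_var('42', FILTER_VALIDATE_INT);   // converted to an integer

// Sanitization on output: encode for the context the data are inserted into.
$comment = '<script>alert("x")</script>';
$forHtml = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

$query  = 'tea & coffee';
$forUrl = 'search.php?q=' . urlencode($query);
```

Note that the same value may need different encodings depending on where it ends up, which is why sanitization is best applied at output time.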

• String Search Patterns
  ◦ Special syntax that encodes a program (language) for a finite-state automaton
  ◦ Simple to use
    ▪ Encoding is (mostly) human readable
  ◦ POSIX and Perl Standards
• Usage
  ◦ Searching strings, listing matches
  ◦ Find and replace
  ◦ Splitting a string into an array of strings

• Expression
  ◦ <separator>expr<separator>modifiers
  ◦ Separator is a single character (usually /, #, %, …)
  ◦ Pattern modifiers are flags that affect the evaluation
• Base Syntax
  ◦ Sequence of atoms
  ◦ Atom could be
    ▪ Simple (non-meta) character (letter, number, …)
    ▪ Dot (.) represents any character (by default except a newline)
    ▪ A list of characters in [] ([abc], [0-9a-z_], …)

• Important Meta-characters
  ◦ \ – an escaping character for other meta-characters
  ◦ Anchors ^, $ marking start/end of a string/line
    ▪ ^ in character class definition inverts the set
  ◦ [, ] – character class definition
  ◦ {, } – min/max quantifier atom{n}, atom{min,max}
    ▪ [0-9]{8} (8-digit number), .{1,9} (1-9 chars)
  ◦ (, ) – subpattern (treated like an atom)
  ◦ *, +, ? – repetitions, shorthand notations of {0,}, {1,}, and {0,1} respectively
  ◦ | – branches (ptrn1|ptrn2)
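The meta-characters listed above can be tried with preg_match(). The patterns and subjects below are illustrative choices, not taken from the slides.

```php
<?php
// Each pattern is written as <separator>expr<separator>; here the separator is /.
$eightDigits = preg_match('/^[0-9]{8}$/', '12345678');   // exactly eight digits
$tooShort    = preg_match('/^[0-9]{8}$/', '1234');

// ^ inside a character class inverts the set: anything but a digit.
$hasNonDigit = preg_match('/[^0-9]/', '12a4');

// ? makes the preceding atom optional ({0,1}).
$british  = preg_match('/^colou?r$/', 'colour');
$american = preg_match('/^colou?r$/', 'color');

// | selects between branches; ( ) groups them into one atom.
$branch = preg_match('/^(cat|dog)s?$/', 'dogs');

// \ escapes a meta-character so that it matches literally.
$literalDot = preg_match('/^3\.14$/', '3x14');   // the dot must be a real dot
```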

• Character Classes
  ◦ Pre-defined classes identified by names [:name:]
    ▪ For example [ab[:digit:]] matches a, b, and 0-9
  ◦ alpha – letters
  ◦ digit – decimal digits
  ◦ alnum – letters and digits
  ◦ blank – horizontal whitespace (space and tab)
  ◦ space – any whitespace (including line breaks)
  ◦ lower, upper – lowercase/uppercase letters
  ◦ cntrl – control characters
  ◦ xdigit – hexadecimal digits
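A few of the named classes in action. This is a sketch with made-up subjects; it uses the Perl-compatible preg_ functions, which accept the [:name:] notation inside a bracket expression.

```php
<?php
// Named classes are written inside a bracket expression.
$mixed = preg_match('/^[ab[:digit:]]+$/', 'a1b22');   // only a, b, and digits
$other = preg_match('/^[ab[:digit:]]+$/', 'a1c');     // c is not in the set

$hex = preg_match('/^[[:xdigit:]]+$/', 'DEADbeef42');

// blank covers space and tab only; space also covers line breaks.
$blankOnNewline = preg_match('/[[:blank:]]/', "a\nb");
$spaceOnNewline = preg_match('/[[:space:]]/', "a\nb");

// Collapse every run of whitespace into a single space.
$tidy = preg_replace('/[[:space:]]+/', ' ', "one \t two\n\nthree");
```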

• Modifiers
  ◦ i – case insensitive
  ◦ m – multiline mode (^, $ match start/end of a line)
  ◦ s – '.' matches also a newline character
  ◦ x – ignore whitespace in regex (except in character class constructs)
  ◦ S – more extensive performance optimizations
  ◦ U – switch to not greedy evaluation
    ▪ Greedy evaluation means that patterns with *, +, or ? try to match as many characters as possible
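The effect of the individual modifiers can be observed by running the same pattern with and without them. The subjects below are illustrative.

```php
<?php
$text = "First line\nsecond LINE";

$caseSensitive   = preg_match('/line$/', $text);    // the text ends with "LINE"
$caseInsensitive = preg_match('/line$/i', $text);

// Without m, ^ matches only at the very start of the subject.
$withoutM = preg_match('/^second/', $text);
$withM    = preg_match('/^second/m', $text);

// Without s, the dot stops at a newline.
$withoutS = preg_match('/First.*LINE/', $text);
$withS    = preg_match('/First.*LINE/s', $text);

// x lets the pattern be spread out with whitespace for readability.
$withX = preg_match('/ ^ [0-9]{3} - [0-9]{4} $ /x', '555-1234');

// Greedy vs. not greedy (U): how much does .* take?
$html = '<b>one</b> and <b>two</b>';
preg_match('/<b>.*<\/b>/',  $html, $greedy);   // takes everything up to the last </b>
preg_match('/<b>.*<\/b>/U', $html, $lazy);     // stops at the first </b>
```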

• Subpatterns
  ◦ To ensure correct operation precedence
      (one|two|three){1,3}
  ◦ To add modifiers to only a part of the expression
      (?modifiers:ptrn)
  ◦ To mark important parts of the expression
    ▪ Used to retrieve parts of a string after matching
    ▪ Named subpatterns (?<name>ptrn), or (?'name'ptrn)
    ▪ Unnamed subpatterns (no capturing in matching) (?:ptrn)
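Each use of subpatterns can be sketched with a short preg_match() call. The patterns and subjects are made up for the demonstration.

```php
<?php
// Grouping for precedence: the quantifier applies to the whole alternation.
$grouped = preg_match('/^(one|two|three){1,3}$/', 'onetwothree');

// A modifier limited to a part of the expression: only "php" is case insensitive.
$partial  = preg_match('/^(?i:php) rocks$/', 'PHP rocks');
$partial2 = preg_match('/^(?i:php) rocks$/', 'PHP ROCKS');

// Named subpatterns are available under their names (and also by number).
preg_match('/(?<year>[0-9]{4})-(?<month>[0-9]{2})/', 'Released 2016-02', $m);
$year  = $m['year'];
$month = $m['month'];

// (?: ) groups without capturing, so it does not occupy an index.
preg_match('/(?:Mr|Ms)\. ([A-Z][a-z]+)/', 'Dear Ms. Smith', $n);
$surname    = $n[1];
$groupCount = count($n);   // whole match + one captured subpattern
```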

• E-mail Verification (RFC 2822)
  (a single expression, wrapped here for readability)
      (?:[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*
      |"(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21\x23-\x5b\x5d-\x7f]
      |\\[\x01-\x09\x0b\x0c\x0e-\x7f])*")
      @(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
      |\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
      (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:
      (?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]
      |\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\])

• preg_match($ptrn, $subj [,&$matches])
  ◦ Searches given string by a regex
  ◦ Returns 1 if the pattern matches the subject, 0 if it does not (and false on error)
  ◦ The $matches array gathers the matched substrings of subject with respect to the expression and subpatterns
    ▪ Subpatterns are indexed from 1
    ▪ At index 0 is the text matched by the entire expression
    ▪ Named patterns are indexed associatively by their names
  ◦ Example: matching /[[:digit:]]+/ against "6 eggs, 3 spoons of oil, 250g of flour" yields
      array(1) { [0] => string("6") }
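The recipe example can be extended into a runnable sketch. The second pattern and the use of preg_match_all() are additions for illustration; preg_match_all() is the companion function that lists every match rather than only the first.

```php
<?php
$recipe = '6 eggs, 3 spoons of oil, 250g of flour';

// preg_match() stops at the first match.
$found = preg_match('/[[:digit:]]+/', $recipe, $first);

// With subpatterns: index 0 is the whole match, subpatterns start at 1,
// and a named subpattern also appears under its name.
preg_match('/([0-9]+)(?<unit>g) of ([a-z]+)/', $recipe, $parts);

// preg_match_all() collects every match instead of only the first one.
$count = preg_match_all('/[[:digit:]]+/', $recipe, $all);

// The result is an integer (or false on error), so compare strictly.
$noMatch = preg_match('/[[:upper:]]/', $recipe);
```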

• preg_replace($ptrn, $repl, $str)
  ◦ Search and replace substrings in a string
    ▪ Each match of the pattern is replaced
    ▪ Replacement may contain references to subpatterns
• preg_split($ptrn, $str [,$limit])
  ◦ Similar to explode() function
  ◦ Split a string into an array of strings
  ◦ The pattern is used to match delimiters
    ▪ Delimiters are not part of the result
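A short sketch of both functions. The input strings are invented for the example.

```php
<?php
// References to subpatterns in the replacement are written as $1, $2, ...
$iso = '2016-02-29';
$czech = preg_replace('/^([0-9]{4})-([0-9]{2})-([0-9]{2})$/', '$3. $2. $1', $iso);

// Every match is replaced, not just the first one.
$masked = preg_replace('/[0-9]/', '#', 'PIN 1234');

// preg_split(): the pattern describes the delimiters, which are dropped.
$words = preg_split('/[\s,;]+/', 'eggs, oil;flour   sugar');

// The optional limit caps the number of pieces; the rest stays in the last one.
$limited = preg_split('/,\s*/', 'a, b, c, d', 2);
```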

• Differences (POSIX functions compared to the preg functions)
  ◦ The expression is not enclosed by separators
    ▪ No modifiers can be added
  ◦ Only simple subpatterns
  ◦ Only a few escape sequences
• Functions
  ◦ ereg(), ereg_replace(), split()
  ◦ Each function has a case-insensitive version with an i suffix
    ▪ eregi() – case insensitive version of ereg()
  ◦ Deprecated since PHP 5.3
