HTML
Darby Tien-Hao Chang
Department of Electrical Engineering, National Cheng Kung University
HTML introduction
HTML stands for Hyper Text Markup Language.
An HTML file is a text file containing small markup tags.
The markup tags tell the Web browser how to display the page.
An HTML file must have an htm or html file extension.
An HTML file can be created using a simple text editor.
Sample HTML
<html>
<head>
<title>Title of page</title>
</head>
<body>
This is my first homepage. <b>This text is bold</b>
</body>
</html>
HTML elements
HTML tags are used to mark up HTML elements.
HTML tags are surrounded by the two characters < and >.
The surrounding characters are called angle brackets.
HTML tags normally come in pairs like <b> and </b>.
The first tag in a pair is the start tag, the second tag is the end tag.
The text between the start and end tags is the element content.
HTML tags are not case sensitive: <b> means the same as <B>.
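The start tag, element content, and end tag described above can be observed with a parser. The following is a minimal sketch in Python (the slides themselves use Perl), assuming the standard library's `html.parser` module; the class name `TagLogger` and the helper `parse_events` are made up for this illustration.

```python
from html.parser import HTMLParser


class TagLogger(HTMLParser):
    """Record start tags, element content, and end tags in document order."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # The parser reports tag names converted to lower case.
        self.events.append(("start", tag))

    def handle_endtag(self, tag):
        self.events.append(("end", tag))

    def handle_data(self, data):
        self.events.append(("content", data))


def parse_events(markup):
    """Return the list of (kind, value) events found in the markup."""
    parser = TagLogger()
    parser.feed(markup)
    parser.close()
    return parser.events


print(parse_events("<b>This text is bold</b>"))
print(parse_events("<B>This text is bold</B>"))
```

Both calls report the same events, which matches the statement that tag names are not case sensitive.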
Sample HTML
<b>This text is bold</b>
Start tag: <b>. Content: This text is bold. End tag: </b>.
This is my first homepage. This text is bold
Tag attribute
Basic HTML tags
<html>          Defines an HTML document
<body>          Defines the document's body
<h1> to <h6>    Defines header 1 to header 6
<p>             Defines a paragraph
<br>            Inserts a single line break
<hr>            Defines a horizontal rule
<!-- -->        Defines a comment
Sample HTML
<h1>This is heading 1</h1>
<h2>This is heading 2</h2>
<h3>This is heading 3</h3>
<h4>This is heading 4</h4>
<h5>This is heading 5</h5>
<h6>This is heading 6</h6>
Sample HTML
This paragraph contains a lot of lines in the source code, but the browser ignores the line breaks.
To break lines in a paragraph, use the <br> tag.
Sample HTML This is heading 1 This is heading 2
More HTML tags
<b>            Defines bold text
<big>          Defines big text
<em>           Defines emphasized text
<i>            Defines italic text
<small>        Defines small text
<strong>       Defines strong text
<sub>          Defines subscripted text
<sup>          Defines superscripted text
<ins>          Defines inserted text
<del>          Defines deleted text
<code>         Defines computer code text
<kbd>          Defines keyboard text
<samp>         Defines sample computer code
<tt>           Defines teletype text
<var>          Defines a variable
<pre>          Defines preformatted text
<abbr>         Defines an abbreviation
<acronym>      Defines an acronym
<address>      Defines an address element
<bdo>          Defines the text direction
<blockquote>   Defines a long quotation
<q>            Defines a short quotation
<cite>         Defines a citation
<dfn>          Defines a definition term
Haha
s/<[^>]*>//g
Powerful regular expression
s/<[^>]*>//g
s        substitute
<        left angle bracket
[^>]     any character except a right angle bracket
[^>]*    all the characters that form the tag (name and attributes)
>        right angle bracket
g        replace globally, i.e. all occurrences
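The same substitution can be sketched in Python (the slide uses Perl's `s///g`), assuming the standard `re` module. The pattern is the one from the slide; the function name `strip_tags` and the sample string are made up for this illustration.

```python
import re

# Same pattern as the slide: "<", any run of non-">" characters, then ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(markup):
    """Remove every tag; re.sub replaces all occurrences, like Perl's /g."""
    return TAG_PATTERN.sub("", markup)


sample = '<p>This is my <b class="x">first</b> homepage.</p>'
print(strip_tags(sample))
```

A pattern like this treats any `>` as the end of a tag, so it is only a quick way to get at the text, not a full HTML parser.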
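Are semantics important?
Yes, sometimes.
To extract the heading of a news article from its HTML:
發票案/李慧芬週五前 返澳 近日將與李碧君對質 (roughly: "Receipt case: Li Hui-fen to return to Australia before Friday, will confront Li Bi-jun in the coming days")
/^ (.*) \n$/
print $1, "\n";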
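Extracting a heading depends on knowing which tag carries it. The sketch below is in Python rather than the slide's Perl, and it assumes the heading sits inside a `<title>` element; that tag choice, the function name `extract_title`, and the sample page are assumptions, since the tag names in the slide's pattern are not visible.

```python
import re

# Assumption: the heading of interest is the content of the <title> element.
TITLE_PATTERN = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(markup):
    """Return the text inside the first <title> element, or None if absent."""
    match = TITLE_PATTERN.search(markup)
    return match.group(1).strip() if match else None


page = "<html><head><TITLE> Title of page </TITLE></head><body>Hello</body></html>"
print(extract_title(page))
```

Unlike stripping every tag, this keeps the meaning attached to one particular tag, which is the point of the slide.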
How to display a less than sign (<) in a browser?
Character Entities
A character entity has three parts: an ampersand (&), an entity name or a # and an entity number, and finally a semicolon (;).
To display a less than sign in an HTML document we must write: &lt; or &#60;
The most common character entities
Result   Description          Entity name   Entity number
         non-breaking space   &nbsp;        &#160;
<        less than            &lt;          &#60;
>        greater than         &gt;          &#62;
&        ampersand            &amp;         &#38;
"        quotation mark       &quot;        &#34;
'        apostrophe           &apos;        &#39;
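Entity encoding and decoding can be tried out with a short sketch in Python (not the slides' Perl), assuming the standard `html` module, whose `escape` and `unescape` functions convert between special characters and entities. The sample strings are made up for this illustration.

```python
import html

# Encode the characters that have special meaning in HTML.
escaped = html.escape("1 < 2 & 3 > 2")
print(escaped)

# Decode both forms of an entity: by name and by number.
by_name = html.unescape("&lt;")
by_number = html.unescape("&#60;")
print(by_name, by_number)
```

Both the named and the numbered form decode to the same less than sign.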
HTML links
This text is a link to a page on this Web site.
This text is a link to a page on the World Wide Web.
HTML frames
HTML tables
row 1, cell 1    row 1, cell 2
row 2, cell 1    row 2, cell 2
HTML tables
Cell that spans two columns:
Name           Telephone
Bill Gates
Cell that spans two rows:
First Name:    Bill Gates
Telephone:
HTML lists
An Unordered List:
  Coffee
  Tea
An Ordered List:
  1. Coffee
  2. Tea
A Definition List:
  Coffee
    Black hot drink
  Milk
    White cold drink
HTML forms description: description description 1 description 2 default text
Form's action attribute and submit button
Username:
Methods GET and POST in HTML forms: what's the difference?
The difference between GET and POST is primarily defined in terms of form data encoding: the former means that the form data is encoded (by the browser) into a URL, while the latter means that the form data appears within the message body.
If the processing of a form is idempotent (i.e. it has no lasting observable effect on the state of the world), then the form method should be GET.
If the service associated with the processing of a form has side effects (for example, modification of a database or subscription to a service), the method should be POST.
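The two encodings can be compared without sending anything over the network. This is a minimal sketch in Python (not the slides' Perl), assuming the standard `urllib` modules; the `example.com` URLs and the form fields are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical form fields, encoded the way a browser encodes form data.
form_data = {"username": "darby", "lang": "perl"}
encoded = urlencode(form_data)

# GET: the encoded form data becomes part of the URL.
get_request = Request("http://example.com/search?" + encoded)

# POST: the encoded form data travels in the message body instead.
post_request = Request("http://example.com/subscribe", data=encoded.encode("ascii"))

print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.data)
```

Neither request is actually sent here; the objects only show where the form data ends up for each method.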
Exercise
Resolution, number of units, EC no. and so on for a given PDB ID
Today's headings
Comics
use LWP::Simple;
$web = get($url);
Exercise hints $web =~ /Title\s* \s*[^>]*>\s*([^\n]+)/
Javascript – a case study
A review of dirtycomi
Encoding (Big5, GB2312, UTF-8)
Retrieve HTML code with the GET method
Traverse multiple pages
Trace the Javascript code and re-implement it in Perl
Fully impersonate a human using a browser
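The encoding item can be illustrated with a short sketch in Python (the review itself concerns a Perl program), assuming Python's built-in `big5`, `gb2312`, and `utf-8` codecs. The sample text and the helper name `decode_page` are made up for this illustration.

```python
def decode_page(raw_bytes, declared_encoding):
    """Decode downloaded bytes with the encoding the page declares."""
    return raw_bytes.decode(declared_encoding, errors="replace")


text = "中文"

# The same two characters become different byte sequences in each encoding.
encoded_pages = {codec: text.encode(codec) for codec in ("big5", "gb2312", "utf-8")}
for codec, raw in encoded_pages.items():
    print(codec, len(raw), "bytes")

# Decoding with the right codec restores the text; the wrong one garbles it.
restored = decode_page(encoded_pages["big5"], "big5")
garbled = decode_page(encoded_pages["big5"], "gb2312")
print(restored == text, garbled == text)
```

A scraper therefore has to know, or detect, the encoding of each page before applying any regular expression to its text.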