1
CIT 383: Administrative Scripting
HTML
2
Topics
  Evolution of HTML
  HTML Structure
  Regular Expressions vs. Parsing
  Hpricot
  XPath
3
Evolution of HTML
  1991      HTML created (only 22 tags)
  1995      HTML 2.0
  1996      Tables added to HTML 2.0
  Jan 1997  HTML 3.2 published by W3C
  Dec 1997  HTML 4.0
  2000      XHTML 1.0
  2008      HTML 5.0 working draft published
4
HTML Structure
  <html>
    <head>
      <title>My title</title>
    </head>
    <body>
      <a href="...">My link</a>
      <h1>My header</h1>
    </body>
  </html>
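The nesting of tags in the markup above forms a tree, which is what a parser exposes. The following is a minimal sketch of walking that tree, assuming REXML (distributed with Ruby) as the parser, which requires well-formed markup. The `outline` helper, the `<head>` wrapper, and the `http://example.com/` link target are illustrative assumptions, not part of the slide.

```ruby
require "rexml/document"

# Sample page modelled on the slide; the href value is a placeholder.
page = <<~PAGE
  <html>
    <head><title>My title</title></head>
    <body>
      <a href="http://example.com/">My link</a>
      <h1>My header</h1>
    </body>
  </html>
PAGE

# Hypothetical helper: returns one indented line per element, depth-first.
def outline(element, depth = 0)
  lines = ["#{'  ' * depth}#{element.name}"]
  element.elements.each { |child| lines.concat(outline(child, depth + 1)) }
  lines
end

doc = REXML::Document.new(page)
puts outline(doc.root)
```

Each level of indentation in the printed outline corresponds to one level of tag nesting in the page.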
5
HTML Structure
[Image: source attribution not preserved]
6
Why Not Regular Expressions?
Angle-bracket tags are difficult to deal with.
  Tag regexp: <\w+\s+[^>]*>
  Matches: <img alt="ruby" src="rb.png">
  Fails on: <img alt="ruby>" src="rb.png"> (the match stops at the > inside the alt value)
Solution: check for > in attributes.
  Have to match every form of attribute:
    name="value"
    name='value'
    name=value
    name
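The failure described above can be reproduced directly. This sketch first runs the slide's tag pattern against both inputs, then tries a quote-aware variant that accepts the four attribute forms listed. The names `tag_re`, `attr_value`, `attr`, and `better_re` are illustrative, and the second pattern is only one possible attempt, not a complete solution.

```ruby
# Tag pattern from the slide: name, whitespace, then everything up to the first ">".
tag_re = /<\w+\s+[^>]*>/

plain  = '<img alt="ruby" src="rb.png">'
tricky = '<img alt="ruby>" src="rb.png">'

plain_match  = plain[tag_re]   # the whole tag
tricky_match = tricky[tag_re]  # cut short at the ">" inside the alt value

puts plain_match
puts tricky_match

# Quote-aware attempt: a value is double-quoted, single-quoted, or bare.
attr_value = /"[^"]*"|'[^']*'|[^\s"'>]+/
# An attribute is a name with an optional =value part.
attr       = /\s+[-\w]+(?:\s*=\s*(?:#{attr_value}))?/
better_re  = /<\w+(?:#{attr})*\s*\/?>/

better_match = tricky[better_re]
puts better_match
```

Even the longer pattern ignores comments, script bodies, and malformed markup, which is the argument for using a parser instead.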
7
Hpricot
  h = Hpricot(html_string)
    Creates a new Hpricot::Doc object.
  el = h.at(string)
    Finds the first matching element (an Hpricot::Elem).
  els = h.search(string or XPath expression)
    Returns an array-like Hpricot::Elements collection of matching objects.
  el.inner_html
    Returns the HTML enclosed in the element.
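The same three operations (first match, all matches, enclosed markup) can be sketched without the Hpricot gem, which may not install on a current Ruby. The example below swaps in REXML, distributed with Ruby, as a stand-in; it is not Hpricot's API, and unlike Hpricot it accepts only well-formed markup. The sample page and variable names are assumptions for illustration.

```ruby
require "rexml/document"

page = <<~PAGE
  <html>
    <body>
      <h1>My <em>header</em></h1>
      <p>First</p>
      <p>Second</p>
    </body>
  </html>
PAGE

doc = REXML::Document.new(page)

# Rough analogue of h.at("h1"): the first matching element, or nil.
first_h1 = REXML::XPath.first(doc, "//h1")

# Rough analogue of h.search("p"): an array of every matching element.
paragraphs = REXML::XPath.match(doc, "//p")

# Rough analogue of el.inner_html: serialize the element's child nodes.
inner = first_h1.children.map(&:to_s).join

puts inner
puts paragraphs.map(&:text).inspect
```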
8
XPath Searches
  h.search("p")
    Finds all paragraph tags in the document.
  h.search("/html/body//p")
    Finds all paragraph tags within the body tag.
  Find all anchor tags with an href attribute.
  Find all a tags with an href attribute of google.com.
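The four searches above can be written as XPath expressions and tried out. This sketch evaluates them with REXML (distributed with Ruby) rather than Hpricot, so the calls differ from the slide's `search` method. The attribute tests use `href`, the attribute in which anchors carry their target, and the sample page and URLs are assumptions.

```ruby
require "rexml/document"

page = <<~PAGE
  <html>
    <head><title>Links</title></head>
    <body>
      <p>Intro <a href="http://google.com">search</a></p>
      <div><p>Nested <a href="http://example.com">other</a></p></div>
      <a name="top">anchor without href</a>
    </body>
  </html>
PAGE

doc = REXML::Document.new(page)

# All paragraph elements anywhere in the document.
all_p = REXML::XPath.match(doc, "//p")

# Paragraph elements at any depth below /html/body.
body_p = REXML::XPath.match(doc, "/html/body//p")

# Anchors that carry an href attribute (the name-only anchor is skipped).
linked = REXML::XPath.match(doc, "//a[@href]")

# Anchors whose href equals one exact value.
google = REXML::XPath.match(doc, "//a[@href='http://google.com']")

puts all_p.length
puts body_p.length
puts linked.map { |a| a.attributes["href"] }.inspect
puts google.first.text
```

The predicate in square brackets filters the matched elements: `[@href]` tests that the attribute exists, while `[@href='...']` compares its value exactly.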