Libwww, the W3C protocol library
libwww - The W3C Protocol Library
„Großes Schwerpunktseminar WI“
University of Applied Sciences Gießen-Friedberg
Stefan Sabatzki

Contents
1. Introduction
2. Structure of libwww
3. Programming with libwww
4. Conclusion

Contents
1. Introduction
   – What is libwww?
   – Why libwww?
2. Structure of libwww
3. Programming with libwww
4. Conclusion

What is libwww?
Generic framework for building web applications
Written in C
Pluggable, modular design
Provides the most common Internet access methods
Transmits data in many different media formats
Handles data flow to and from the server

What is libwww? (2)
1992: first version implemented by Tim Berners-Lee; development at CERN
1994: libwww moved from CERN to W3C
1998: released as open source
September 2003: W3C stopped work on libwww
January 2004: libwww officially belongs to the "open source community"

Why libwww?
Experimenting and prototyping
Performance, modularity and extensibility
Free and open source code
Mailing lists and active community

Contents
1. Introduction
2. Structure of libwww
   – Design Model
   – Request/Response Paradigm
   – Data Flow
   – Threads, Event Loops and Filters
   – Modules as State Machines
3. Programming with libwww
4. Conclusion

Design Model
Layering as the design model

Design Model (2)
A more illustrative view [figure]

Request/Response Paradigm
The application issues a request
libwww fulfills the request
The result is presented to the application on arrival
Simultaneous requests are handled by the library core

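The request/response paradigm above can be pictured with a small stand-alone sketch. It is not libwww code: `Core`, `core_submit`, `core_poll` and `record_completion` are hypothetical names invented for this illustration, and network progress is simulated by a tick counter instead of real sockets. The point is only the control flow: the application hands over a request together with a callback, and the core delivers the result whenever that request happens to finish, even if several requests are in flight.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 4

/* Callback the application supplies; invoked when its request completes. */
typedef void (*OnComplete)(int request_id, int status, void *context);

typedef struct {
    int id;
    int remaining_ticks;   /* simulated work left; stands in for network I/O */
    OnComplete done;
    void *context;
    int active;
} Pending;

/* Hypothetical "library core": keeps track of all requests in flight. */
typedef struct {
    Pending slots[MAX_PENDING];
    int next_id;
} Core;

/* Issue a request. Returns its id, or -1 if no slot is free. */
int core_submit(Core *core, int ticks, OnComplete done, void *context)
{
    assert(core != NULL && done != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        Pending *p = &core->slots[i];
        if (!p->active) {
            p->id = ++core->next_id;
            p->remaining_ticks = ticks;
            p->done = done;
            p->context = context;
            p->active = 1;
            return p->id;
        }
    }
    return -1;
}

/* Advance every pending request by one tick and notify the application
   about those that finished. Returns the number still pending. */
int core_poll(Core *core)
{
    int pending = 0;
    assert(core != NULL);
    for (size_t i = 0; i < MAX_PENDING; i++) {
        Pending *p = &core->slots[i];
        if (!p->active) continue;
        if (--p->remaining_ticks <= 0) {
            p->active = 0;
            p->done(p->id, 200, p->context);   /* 200 is a made-up status */
        } else {
            pending++;
        }
    }
    return pending;
}

/* Example application callback: records the order of completion. */
typedef struct {
    int ids[MAX_PENDING];
    size_t count;
} Completed;

void record_completion(int request_id, int status, void *context)
{
    Completed *completed = context;
    (void)status;
    if (completed->count < MAX_PENDING)
        completed->ids[completed->count++] = request_id;
}
```

An application would submit its requests and then call `core_poll` in a loop until it returns 0. A request submitted later can complete earlier, which is why the result is delivered through a callback rather than as a return value.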
Data Flow
Streams are used to transport data
All stream types are derived from a generic stream
   – Protocol streams
   – Converters
   – Presenters
   – I/O streams
   – Basic streams

Data Flow (2)
Structured streams
   – Derived from the generic stream
   – Accept a structured document
   – Ordered, tree-structured arrangement of data
   – Each instance is associated with an SGML parser
   – Each instance is associated with the corresponding DTD

Data Flow (3)
Cascaded streams
   – Stream chains
   – Setup before data arrives

Data Flow (4)
Cascaded streams
   – Setup after data arrives

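The idea of a generic stream with derived stream types that are cascaded into a chain can be sketched in plain C as follows. This is an illustration under stated assumptions, not the libwww stream interface: `Stream`, `BufferSink`, `upper_stream_init` and `stream_write` are hypothetical names, and the only "conversion" performed is upper-casing. The chain here is set up before any data arrives; setting it up after the first data has been seen would work the same way, only later.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical generic stream: one operation plus a link to the next stream. */
typedef struct Stream Stream;
struct Stream {
    int (*put_block)(Stream *me, const char *buf, size_t len);
    Stream *target;            /* next stream in the chain, NULL for a sink */
};

/* Sink stream: collects everything it receives in a fixed buffer. */
typedef struct {
    Stream base;               /* first member, so Stream* and BufferSink* can be cast */
    char data[256];
    size_t used;
} BufferSink;

static int sink_put_block(Stream *me, const char *buf, size_t len)
{
    BufferSink *sink = (BufferSink *)me;
    if (len > sizeof sink->data - 1 - sink->used)
        return -1;             /* would overflow the buffer */
    memcpy(sink->data + sink->used, buf, len);
    sink->used += len;
    sink->data[sink->used] = '\0';
    return 0;
}

void buffer_sink_init(BufferSink *sink)
{
    assert(sink != NULL);
    sink->base.put_block = sink_put_block;
    sink->base.target = NULL;
    sink->data[0] = '\0';
    sink->used = 0;
}

/* Converter stream: upper-cases each block and forwards it to its target. */
static int upper_put_block(Stream *me, const char *buf, size_t len)
{
    char tmp[64];
    while (len > 0) {
        size_t n = len < sizeof tmp ? len : sizeof tmp;
        for (size_t i = 0; i < n; i++)
            tmp[i] = (char)toupper((unsigned char)buf[i]);
        int rc = me->target->put_block(me->target, tmp, n);
        if (rc != 0)
            return rc;
        buf += n;
        len -= n;
    }
    return 0;
}

void upper_stream_init(Stream *me, Stream *target)
{
    assert(me != NULL && target != NULL);
    me->put_block = upper_put_block;
    me->target = target;
}

/* Callers only ever see the generic interface. */
int stream_write(Stream *s, const char *buf, size_t len)
{
    assert(s != NULL);
    return s->put_block(s, buf, len);
}
```

Writing to the converter pushes the data through the whole chain: the caller never needs to know how many streams sit between it and the final sink.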
Threads, Event Loops and Filters
Not thread-safe
Implements a pseudo-thread model
   – Uses non-blocking sockets
   – Based on callback functions
Before/after filters
   – Global and local filters
   – Registered at runtime

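How before and after filters wrap a request can be shown with a small sketch. All names here (`FilterList`, `filters_add_before`, `request_run`, `check_url`, `fake_load`, `log_result`) are hypothetical and do not correspond to libwww functions. The sketch keeps a single filter list and assumes that a before filter may stop the request; the distinction between global and local filters is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8
enum { FILTER_OK = 0, FILTER_STOP = 1 };

typedef struct {
    const char *url;
    int status;        /* set by the handler that "loads" the request */
    char trace[16];    /* records which callbacks ran, in order */
} Request;

typedef int (*Filter)(Request *req);    /* returns FILTER_OK or FILTER_STOP */
typedef int (*Handler)(Request *req);   /* returns a status code */

typedef struct {
    Filter before[MAX_FILTERS];
    size_t n_before;
    Filter after[MAX_FILTERS];
    size_t n_after;
} FilterList;

/* Filters are registered at runtime; returns 0 on success, -1 if full. */
int filters_add_before(FilterList *fl, Filter f)
{
    assert(fl != NULL && f != NULL);
    if (fl->n_before == MAX_FILTERS) return -1;
    fl->before[fl->n_before++] = f;
    return 0;
}

int filters_add_after(FilterList *fl, Filter f)
{
    assert(fl != NULL && f != NULL);
    if (fl->n_after == MAX_FILTERS) return -1;
    fl->after[fl->n_after++] = f;
    return 0;
}

/* Append one character to the request's trace. */
void request_mark(Request *req, char tag)
{
    size_t len = strlen(req->trace);
    if (len + 1 < sizeof req->trace) {
        req->trace[len] = tag;
        req->trace[len + 1] = '\0';
    }
}

/* Run before filters, then the handler, then after filters. */
int request_run(const FilterList *fl, Request *req, Handler handler)
{
    assert(fl != NULL && req != NULL && handler != NULL);
    for (size_t i = 0; i < fl->n_before; i++)
        if (fl->before[i](req) != FILTER_OK)
            return FILTER_STOP;        /* in this sketch a before filter can veto */
    req->status = handler(req);
    for (size_t i = 0; i < fl->n_after; i++)
        fl->after[i](req);
    return FILTER_OK;
}

/* Example callbacks used to exercise the sketch. */
int check_url(Request *req)
{
    request_mark(req, 'B');
    return req->url != NULL ? FILTER_OK : FILTER_STOP;
}

int fake_load(Request *req)
{
    request_mark(req, 'H');
    return 200;                        /* made-up status, no network involved */
}

int log_result(Request *req)
{
    request_mark(req, 'A');
    return FILTER_OK;
}
```

With `check_url` registered as a before filter and `log_result` as an after filter, a valid request leaves the trace `BHA`, while a request without a URL is stopped after `B` and never reaches the handler.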
Threads, Event Loops and Filters (2)
[figure]

Modules as State Machines
Since libwww 3.0, protocol modules are implemented as state machines
Part of the thread model
Keep track of the current state in the communication interface

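Why a state machine fits a pseudo-thread model built on non-blocking sockets can be seen in a minimal sketch. The states, events and the `connection_step` function below are invented for illustration and are not taken from any libwww protocol module. The key property is that the module returns to the event loop whenever an operation would block, and picks up exactly where it left off on the next call because the current state is stored in the connection object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states of a simple request/response exchange. */
typedef enum {
    ST_BEGIN,
    ST_CONNECT,
    ST_SEND_REQUEST,
    ST_READ_RESPONSE,
    ST_DONE,
    ST_ERROR
} State;

/* What the (simulated) socket layer reports when the module is called. */
typedef enum { EV_READY, EV_WOULD_BLOCK, EV_FAILED } Event;

typedef struct {
    State state;    /* survives between calls; this is the whole "thread" context */
} Connection;

/* Advance the connection by at most one transition and return the new state. */
State connection_step(Connection *c, Event ev)
{
    assert(c != NULL);

    /* Terminal states never change. */
    if (c->state == ST_DONE || c->state == ST_ERROR)
        return c->state;

    if (ev == EV_FAILED) {
        c->state = ST_ERROR;
        return c->state;
    }
    if (ev == EV_WOULD_BLOCK)
        return c->state;    /* give control back to the event loop, resume later */

    switch (c->state) {
    case ST_BEGIN:         c->state = ST_CONNECT;       break;
    case ST_CONNECT:       c->state = ST_SEND_REQUEST;  break;
    case ST_SEND_REQUEST:  c->state = ST_READ_RESPONSE; break;
    case ST_READ_RESPONSE: c->state = ST_DONE;          break;
    default:               break;
    }
    return c->state;
}
```

An event loop would call `connection_step` each time the socket becomes readable or writable. Several connections can be interleaved this way in a single thread, since each one carries its own state.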
Modules as State Machines (2)
[figure]

Contents
1. Introduction
2. Structure of libwww
3. Programming with libwww
   – C++ Simulation
   – APIs and Library Interfaces
   – Simple Example
   – More Complex Example
4. Conclusion

C++ Simulation
Construction/destruction
   – *_new / *_delete (e.g. HTRequest_new / HTRequest_delete)
Data hiding
Inheritance
   – Explicit pointer casting
PRIVATE and PUBLIC macros

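The object-oriented conventions listed above can be imitated in plain C roughly as follows. The `Resource` and `RemoteResource` types are hypothetical and exist only for this sketch. The definitions of `PUBLIC` and `PRIVATE` shown here (empty and `static`) are an assumption about how such macros are typically defined, not a quotation from the libwww headers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed definitions: PUBLIC symbols are exported, PRIVATE ones are file-local. */
#define PUBLIC
#define PRIVATE static

/* Data hiding: a header would expose only this opaque typedef ... */
typedef struct Resource Resource;

/* ... while the layout stays in the implementation file. */
struct Resource {
    char *name;
};

PRIVATE char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Constructor, destructor and accessor in the *_new / *_delete style. */
PUBLIC Resource *Resource_new(const char *name)
{
    Resource *me;
    assert(name != NULL);
    me = calloc(1, sizeof *me);
    if (me != NULL && (me->name = copy_string(name)) == NULL) {
        free(me);
        me = NULL;
    }
    return me;
}

PUBLIC void Resource_delete(Resource *me)
{
    if (me != NULL) {
        free(me->name);
        free(me);
    }
}

PUBLIC const char *Resource_name(const Resource *me)
{
    assert(me != NULL);
    return me->name;
}

/* "Inheritance": the base object is the first member, so a pointer to the
   derived object can be cast explicitly to a pointer to the base. */
typedef struct {
    Resource base;
    int port;
} RemoteResource;

PUBLIC RemoteResource *RemoteResource_new(const char *name, int port)
{
    RemoteResource *me;
    assert(name != NULL);
    me = calloc(1, sizeof *me);
    if (me != NULL && (me->base.name = copy_string(name)) == NULL) {
        free(me);
        me = NULL;
    }
    if (me != NULL)
        me->port = port;
    return me;
}

PUBLIC void RemoteResource_delete(RemoteResource *me)
{
    /* The base destructor releases the name and the whole allocation. */
    Resource_delete((Resource *)me);
}
```

A caller treats a `RemoteResource` as a `Resource` by casting, for example `Resource_name((Resource *)r)`. Unlike C++, nothing checks that cast, which is one reason this style needs discipline.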
APIs and Library Interfaces
The library is a set of APIs called packages
   – Win32: DLLs
   – Unix: separate static libraries
Each package interface is exported via a single include file: WWW*.h
Some important packages
   – Basic Utility Packages
   – Core Packages
   – Initialization Packages
   – Transport Packages
   – Protocol Packages
   – Parser Packages

Simple Example
Displays all links in a document
Applicable to text, HTML/XML tags, etc.

    // snippet
    ...
    HText_registerLinkCallback(foundLink);   // called for every link the parser finds
    ...
    HTEventList_loop(request);               // enter the event loop
    ...
    foundLink (...) {
        HTAnchor * dest = HTAnchor_followMainLink(...);
        char * address = HTAnchor_address(dest);
        HTPrint("Found link `%s\'\n", address);
        HT_FREE(address);                    // the address string is owned by the caller
    }

More Complex Example
Rudimentary command-line browser
See project

Contents
1. Introduction
2. Structure of libwww
3. Programming with libwww
4. Conclusion
   – What's missing?
   – Facts about libwww
   – Personal Opinion

What's missing?
Thread safety
A cookie jar (only cookie parsing/generation is provided)
Consistent usage of regular expressions
A C++ representation

Facts about libwww
Who uses libwww? No one?
Sample applications on the project homepage
No reviews, benchmarks or comparisons
Not "bug free"
"Competitors" (mostly UNIX)
   – WinInet
   – libghttp
   – libcurl
   – libhttp
   – neon

Personal Opinion
Typical open source project
Tricky installation
"Feels" old, and IS old
Desperate attempt at OOP
Non-trivial to use, but very flexible and powerful

Thank you for your attention
Questions?