WebOOB (Web Outside Of Browser)


WebOOB (Web Outside Of Browser)
Use the Web from your shell!
revol@free.fr

Who needs browsers anyway?
“The Web is about transmitting information to everyone regardless the platform” (Tim Berners-Lee)
Browsers need to load 2 or 3 MB of images and JS even when you just need the data itself.
JS runs untrusted, non-free code on your machine.
You can't easily pipe the browser into grep, cut or sed 😞

WebOOB, a Web client for your shell
A Python framework for web scraping
Several capabilities (video, bank, message…)
CLI & GUI applications using capabilities
  To search and collect data, submit forms…
Modules implementing some capabilities
  Youtube, Europarl (video), PhpBB (message)…

WebOOB framework
A set of Python classes
Browser functions
HTTP[S] engine
HTML parser…
Settings for application and backends
Module discovery…

Applications
Command line
  Interactive (FTP-like commands) or not
  Formatters for CSV, JSON, HTML, plain text…
GUI (PyQt)
  Simple GUI for a single task
“There's an OOB for that!”

(Some) Applications
[Q]Boobmsg, [Q]Cineoob, [Q]Cookboob, [Q]HaveDate, [Q]Videoob, Boobank, Boobill, Boobtracker, Comparoob, Pastoob…

Modules
Support one or more capabilities for a website
Instantiated for a specific website = backend

    [vimeo]
    _enabled = 1
    _module = vimeo

    [redminedemo]
    _module = redmine
    url = http://demo.redmine.org/
    username = import
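To make the module/backend distinction concrete, here is a minimal sketch that reads the two backend definitions from the slide with Python's standard configparser. It is not WebOOB's own loader: the function name load_backends, the treatment of a missing _enabled key as "enabled", and the rule that keys without a leading underscore are module parameters are all assumptions made for illustration.

```python
import configparser

# Backend definitions copied from the slide; plain INI syntax is assumed.
BACKENDS_INI = """
[vimeo]
_enabled = 1
_module = vimeo

[redminedemo]
_module = redmine
url = http://demo.redmine.org/
username = import
"""


def load_backends(text):
    """Return {backend_name: {"module", "enabled", "params"}} (hypothetical helper)."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    backends = {}
    for name in parser.sections():
        section = parser[name]
        backends[name] = {
            "module": section.get("_module"),
            # Assumption: a backend without an _enabled key counts as enabled.
            "enabled": section.get("_enabled", "1") != "0",
            # Assumption: keys without a leading underscore are module parameters.
            "params": {k: v for k, v in section.items() if not k.startswith("_")},
        }
    return backends


if __name__ == "__main__":
    for name, backend in load_backends(BACKENDS_INI).items():
        print(name, backend)
```

Each section name is a backend, and its _module key names the module it instantiates, which is how one module could serve several sites or accounts.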

(Some) Modules
Redmine
Github (tickets)
FreeMobile (bills)
Many (French) banks
Chronopost
Colissimo…
Youtube
Europarl (videos)
Vimeo
Dailymotion…
RMLL \o/ (videos)

Development status
Not all modules support all wanted capabilities
  Some video modules lack a search function…
Browser2 class makes writing modules easier
  Some modules still need rewriting from the old Browser class
Used professionally for banking websites

*nix commands composition
Now you can:
  Redirect stdout to the Web
  Redirect stdin from it as well
  Automate things with your shell of choice
  Support new sites without changing the workflow

Creating 200 tickets from a CSV?
Configure a backend with the redmine or github account
Parse the CSV, generate one mbox-like file per line
  Properties as headers
  Description as body

    for f in *.txt; do boobtracker -d post $account < "$f"; done

Profit!
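The "parse the CSV" step could be sketched as below. This is an illustration, not the script used in the talk: the column names (title, category, description), the function names and the ticket-NNN.txt file names are hypothetical, and the header names boobtracker actually accepts are not given in the slides, so treat them as placeholders.

```python
import csv
import io


def ticket_to_message(row, body_field="description"):
    """Render one CSV row as 'Header: value' lines, a blank line, then the body."""
    headers = [
        f"{key}: {value}"
        for key, value in row.items()
        if key != body_field and value
    ]
    body = row.get(body_field) or ""
    return "\n".join(headers) + "\n\n" + body + "\n"


def csv_to_messages(csv_text):
    """Yield one mbox-like message per CSV data line."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        yield ticket_to_message(row)


if __name__ == "__main__":
    # Hypothetical sample data; real column names depend on your CSV export.
    sample = (
        "title,category,description\n"
        "Fix login,Bug,Login fails with a long password\n"
    )
    for index, message in enumerate(csv_to_messages(sample), start=1):
        # One file per ticket, so the shell loop over *.txt can post each one.
        with open(f"ticket-{index:03d}.txt", "w", encoding="utf-8") as handle:
            handle.write(message)
```

Writing one file per ticket keeps the posting step a plain shell loop, so the same files could be fed to a different tracker backend without changing the workflow.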

Converting forum posts to slides

    boobmsg -q -b phpbb formatter json ';' export_thread 36.1681 > talks.json

Some Python to generate HTML slide templates:

    python gen-desc.py talks.json

Convert them to PDF:

    lowriter talks.html
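The slides do not show gen-desc.py itself, so the following is only a guess at what such a script might look like. It assumes the exported JSON is a list of objects with "title" and "content" fields; the real field names in boobmsg's JSON output may differ, and render_slides is a hypothetical helper.

```python
import html
import json
import sys


def render_slides(posts):
    """Build one HTML document with a heading and a paragraph per post."""
    parts = ["<html><body>"]
    for post in posts:
        # Assumption: each post is a dict with "title" and "content" keys.
        title = html.escape(str(post.get("title", "")))
        content = html.escape(str(post.get("content", "")))
        parts.append(f"<h1>{title}</h1>\n<p>{content}</p>")
    parts.append("</body></html>")
    return "\n".join(parts)


def main(argv):
    if len(argv) != 2:
        print("usage: gen-desc.py talks.json")
        return
    with open(argv[1], encoding="utf-8") as handle:
        posts = json.load(handle)
    # talks.html matches the file name used in the conversion step.
    with open("talks.html", "w", encoding="utf-8") as out:
        out.write(render_slides(posts))


if __name__ == "__main__":
    main(sys.argv)
```

Escaping the post text with html.escape keeps forum content from breaking the generated markup.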

Forum posts to slides

References
http://weboob.org/
http://git.symlink.me/?p=weboob/devel.git
http://people.symlink.me/~rom1/blog/weboob/

Conclusion
There are other ways to browse the web
WebOOB puts it in a (nut)shell.
Scraping can be fragile (depends on HTML)
But sometimes it's the only solution
And it saves a lot of time!
Questions?