Scraping Facebook via API in R

Presentation transcript:

Scraping Facebook via API in R. Shashank Hebbar, Ph.D. Student, Analytics and Data Science, Kennesaw State University

What is an API? API is short for Application Programming Interface. It is a way of accessing the functionality of one program from inside another program. Instead of performing an action through an interface made for humans, such as a point-and-click GUI, an API lets a program perform that action automatically. Today, the term API usually refers to web APIs built on HTTP, the World Wide Web protocol that web servers and browsers also use to exchange data.

API identification and authorization. An API key (also called a token) identifies the user and lets the provider track and control how the API is being used, which guards against malicious use. A key is often obtained by supplying basic information (e.g. name, email) to the organization, which in return gives you a multi-digit key. OAuth is an authorization framework that provides credentials as proof of permission to access certain information. Many APIs are open to the public and only require an API key; however, some APIs require authorization to access account data (think personal Facebook and Twitter accounts). R has an extensive list of packages that hook API data feeds into R. You can find many of them in the CRAN Web Technologies task view: https://cran.r-project.org/web/views/WebTechnologies.html
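As a rough illustration of how an API key typically travels with a request, the sketch below builds a request URL with the key passed as a query parameter, using base R only. The endpoint `https://api.example.com/v1/series`, the parameter names `id` and `api_key`, the environment variable `EXAMPLE_API_KEY`, and the helper `build_request_url` are all hypothetical; each real API documents its own names, and some expect the key in a header instead.

```r
# Minimal sketch: attach an API key to a request as a query parameter.
# The endpoint and parameter names are hypothetical, for illustration only.
build_request_url <- function(base_url, params) {
  # Percent-encode each value so spaces and symbols are safe inside a URL
  encoded <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  paste0(base_url, "?", paste(names(params), encoded, sep = "=", collapse = "&"))
}

# Read the key from an environment variable rather than hard-coding it
# (EXAMPLE_API_KEY is an assumed name; "demo-key" is a placeholder).
api_key <- Sys.getenv("EXAMPLE_API_KEY", unset = "demo-key")

url <- build_request_url(
  "https://api.example.com/v1/series",
  list(id = "ABC 123", api_key = api_key)
)
print(url)
```

Keeping the key in an environment variable means it does not end up in scripts that are shared or committed to version control.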

Facebook API: register a new application. From the Facebook Developer site, click Apps at the top of the page to go to the application dashboard. Click the button for creating a new app near the top. Once you have completed the verification process, your application is created. Note down the App ID and App Secret.

Create an OAuth token for the Facebook R session. fbOAuth creates a long-lived OAuth access token that enables R to make authenticated calls to the Facebook API.
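A minimal sketch of this step, assuming the Rfacebook package is installed and the session is interactive (fbOAuth opens a browser for the authorization handshake). The wrapper `make_fb_token`, the environment variable names `FB_APP_ID` and `FB_APP_SECRET`, and the file name `fb_oauth.rds` are illustrative choices, not part of the package.

```r
# Sketch: create a Facebook OAuth token once and save it for later sessions.
# Assumes the App ID and App Secret noted during app registration are stored
# in environment variables (the variable names here are hypothetical).
make_fb_token <- function(app_id = Sys.getenv("FB_APP_ID"),
                          app_secret = Sys.getenv("FB_APP_SECRET"),
                          path = "fb_oauth.rds") {
  if (!nzchar(app_id) || !nzchar(app_secret)) {
    stop("Provide the App ID and App Secret from the Facebook app dashboard.")
  }
  # fbOAuth runs the OAuth handshake in the browser and returns the token
  token <- Rfacebook::fbOAuth(app_id = app_id, app_secret = app_secret)
  # Save the token so the handshake does not have to be repeated
  saveRDS(token, path)
  invisible(token)
}

# Interactive usage (not run here):
# fb_token <- make_fb_token()
# fb_token <- readRDS("fb_oauth.rds")  # in a later session
```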

Functions from the Rfacebook package
getLikes(user, n = n, token): extracts the list of pages liked by a Facebook user, with page IDs. Arguments: user, a user name or ID; n, the number of liked pages to return for the user.
searchPages(string, token, n = n): searches for pages whose name contains a string or keyword. Arguments: string, any search string; n, the number of pages to return.
getPage(page, token, n = n): extracts the list of posts from a public Facebook page.
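The three functions above could be combined roughly as follows. This is a sketch, assuming Rfacebook is installed and `token` was created with fbOAuth; the wrapper `fetch_page_overview`, the page name, and the keyword are illustrative. Depending on the Graph API version and the permissions granted to your app, some of these calls may be rejected by Facebook.

```r
# Sketch: call getPage, searchPages and getLikes with one token.
# The default page name, keyword and user are placeholders for illustration.
fetch_page_overview <- function(token,
                                page = "humansofnewyork",
                                keyword = "data science",
                                n = 50) {
  list(
    # Posts from a public page, one row per post
    posts = Rfacebook::getPage(page = page, token = token, n = n),
    # Pages whose name matches the keyword
    pages = Rfacebook::searchPages(string = keyword, token = token, n = n),
    # Pages liked by the authenticated user ("me")
    likes = Rfacebook::getLikes(user = "me", n = n, token = token)
  )
}

# Interactive usage (not run here; needs a valid token and network access):
# overview <- fetch_page_overview(fb_token)
# head(overview$posts)
```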

Analyzing data from a Facebook page. For example, assume that we are interested in learning how the Facebook page Humans of New York became popular and what type of audience it has. The first step is to retrieve a data frame with information about all of its posts. Using this data frame, it is relatively straightforward to visualize how the popularity of Humans of New York has grown exponentially over time.
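One way to look at popularity over time is to average a per-post metric by month. The sketch below assumes the posts data frame has a `created_time` column beginning with `YYYY-MM-DD` and a numeric `likes_count` column, which is the layout getPage is expected to return; the helper `monthly_metric` is hypothetical and the three rows of data are made up purely so the example runs offline.

```r
# Sketch: aggregate a per-post metric by calendar month.
monthly_metric <- function(posts, metric = "likes_count", fun = mean) {
  # Truncate each timestamp to the first day of its month
  month <- as.Date(paste0(substr(posts$created_time, 1, 7), "-01"))
  out <- aggregate(posts[[metric]], by = list(month = month), FUN = fun)
  names(out)[2] <- metric
  out
}

# Made-up stand-in for the data frame of posts (not real page data)
posts <- data.frame(
  created_time = c("2013-01-05T10:00:00+0000",
                   "2013-01-20T10:00:00+0000",
                   "2013-02-03T10:00:00+0000"),
  likes_count = c(100, 300, 1000),
  stringsAsFactors = FALSE
)

monthly <- monthly_metric(posts)
print(monthly)

# A log-scaled y axis turns exponential growth into a roughly straight line
if (interactive()) {
  plot(monthly$month, monthly$likes_count, type = "b", log = "y",
       xlab = "Month", ylab = "Average likes per post")
}
```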

Other API packages in R
Some of the popular packages are:
blsAPI, for pulling U.S. Bureau of Labor Statistics data
rnoaa, for pulling NOAA climate data
rtimes, for pulling data from multiple APIs offered by the New York Times
The rnoaa package allows users to request climate data from multiple data sets through the National Climatic Data Center API. Unlike blsAPI, the rnoaa package requires you to have an API key. To request a key, go to https://www.ncdc.noaa.gov/cdo-web/token and provide your email; a key will immediately be emailed to you.
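A minimal sketch of a keyed rnoaa request, assuming the rnoaa package is installed and the emailed key is stored in an environment variable (the name `NOAA_KEY` is an assumption here). The wrapper `fetch_daily_climate` is hypothetical, and the dataset ID, station ID, and dates in the usage comment are placeholder values to be replaced with ones from the NOAA catalogue.

```r
# Sketch: request daily climate records through rnoaa with an API key.
fetch_daily_climate <- function(station_id, start_date, end_date,
                                token = Sys.getenv("NOAA_KEY")) {
  if (!nzchar(token)) {
    stop("No NOAA key found; request one and store it in NOAA_KEY.")
  }
  # "GHCND" is used here as the daily summaries dataset identifier
  rnoaa::ncdc(
    datasetid = "GHCND",
    stationid = station_id,
    startdate = start_date,
    enddate   = end_date,
    token     = token
  )
}

# Usage (not run here; needs a valid key and network access):
# res <- fetch_daily_climate("GHCND:USW00014895", "2013-10-01", "2013-12-01")
# head(res$data)
```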

What if there is no package for that API? Although numerous R API packages are available and cover a wide range of data, you may eventually want to use an organization's API for which no R package exists. This is where httr comes in. httr was developed by Hadley Wickham to make working with web APIs easy. One of its most commonly used functions is GET(). We use GET() to access an API, pass it some request parameters, and receive an output. httr is designed to map closely to the underlying HTTP protocol. There are two important parts to HTTP: the request, which is the data sent to the server, and the response, which is the data sent back from the server.
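The request and response cycle can be sketched with httr as follows. This assumes the httr and jsonlite packages are installed and that the API returns JSON; jsonlite is not mentioned in the slides and is used here only for parsing. The helper `get_json` and the URL and parameters in the usage comment are hypothetical.

```r
# Sketch: send a GET request with query parameters and parse a JSON response.
get_json <- function(url, query = list()) {
  # The request: the URL plus query parameters sent to the server
  resp <- httr::GET(url, query = query)
  # Turn HTTP error statuses (4xx, 5xx) into R errors
  httr::stop_for_status(resp)
  # The response: extract the body as text, then parse it as JSON
  body <- httr::content(resp, as = "text", encoding = "UTF-8")
  jsonlite::fromJSON(body)
}

# Usage (not run here; the endpoint and parameters are placeholders):
# result <- get_json("https://api.example.com/v1/items",
#                    query = list(q = "climate", api_key = "your-key"))
# str(result)
```

Checking the status before parsing keeps a failed request from surfacing later as a confusing parse error.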