How to crawl Facebook data


How to scrape Facebook data

Extract all sorts of data from Facebook with our web crawling tools.

No bandwidth limits. Crawl and scrape as many web pages as necessary

Protect your web crawler against proxy failures, IP leaks, and CAPTCHAs

Achieve maximum scraping efficiency

Your first 1,000 requests are free of charge!

Create a free account and then apply from the dashboard.

Facebook is currently the world’s biggest social media platform. With almost 3 billion monthly active users, it undoubtedly holds a vast amount of data that can be useful for SEO monitoring, brand campaigns, or marketing plans.

However, retrieving Facebook data is no easy task, and you will need proxies to avoid getting blocked and to bypass CAPTCHAs. With the help of Crawlbase, you can easily overcome these issues by using our crawling and scraping products.

Optimized for speed and reliability

Scrape every Facebook page you want. You can crawl anything from news feeds and search results to public groups. With an average response time of 4 to 10 seconds, your projects stay efficient and acquire only fresh data.

Need help? Contact us

Stay secure while crawling millions of Facebook pages

Our APIs are built on top of thousands of residential and data center proxies worldwide combined with Artificial Intelligence so you can anonymously scrape various Facebook pages. Crawlbase can effortlessly avoid CAPTCHAs and has the best protection against blocked requests.

Get data for your projects without worrying about setting up proxies or infrastructure, so you can focus on what matters most: growing your business.

An easy-to-use API for everyone

For beginners and experts, for small and big projects, for casual users and developers. Our API is so easy to use you can start scraping Facebook in minutes.

Get your token now by signing up and try your first API call with just one simple cURL request:
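As a rough Python counterpart to such a cURL call, the sketch below builds the request URL and fetches the page HTML. The endpoint `https://api.crawlbase.com/` and the `token` and `url` query parameters are assumptions here and should be checked against the Crawlbase documentation; the helper names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm it in the Crawlbase documentation.
API_ENDPOINT = "https://api.crawlbase.com/"


def build_request_url(token, page_url):
    """Return the API URL with the token and the percent-encoded target URL."""
    query = urlencode([("token", token), ("url", page_url)])
    return API_ENDPOINT + "?" + query


def fetch_html(token, page_url, timeout=90):
    """Send the GET request and return the response body as text."""
    with urlopen(build_request_url(token, page_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")
```

Calling `fetch_html("YOUR_TOKEN", "https://www.facebook.com/...")` would then return the crawled HTML, provided the token is valid and the assumed parameters are correct.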

The all-in-one solution for your data collection needs

Use our Crawling API to get the full HTML code and scrape any content that you want.

Use our Screenshot API to take a screenshot of an entire Facebook page at any screen resolution and easily keep track of any changes.

Send your crawled pages straight to the cloud using Crawlbase’s Cloud Storage.

For huge projects, you can use the Crawler with asynchronous callbacks to save cost, retries, and bandwidth.

Start using Crawlbase now

We are loved by thousands of individuals and companies around the world. Our goal is to provide the internet data freedom you deserve.

Test for free

Your first 1,000 requests are free of charge. Sign up now

Simple pricing

Pay-as-you-go pricing with no hidden fees.

No long-term contracts

It is your account, and you decide when to stop. You can cancel at any time.

Need more help?

You can check our FAQ section or ask a question by clicking Contact us

Customers & Clients

Used by the world’s most innovative businesses – big and small

Supporting all kinds of crawling projects

Create Free Account!

Start crawling and scraping the web today

Start crawling in minutes

Crawling your Facebook friends’ data | by Ali Raza

Making Data Science Fun

Write a crawler to get your Facebook Friends’ online activity

Crawling Facebook is against Facebook’s policy unless you have express written permission. The following post is meant for educational purposes only. I do not encourage anyone to crawl Facebook.

There are tons of information publicly available on social networks, some of which we forget even exists. In this tutorial we will learn how to get some of the publicly available information on the Facebook platform, allowing us to perform exploratory data analysis. That includes (but is not limited to) detecting a user’s sleep patterns, online activity patterns, social network graphs, chat history, and different associations in the data.

This post is divided into two sections:

Facebook keeps tons of information on us, which goes all the way back to when we first started to use the social network. You can download your own archive of this information if you want to see what Facebook knows and for getting insights from the data.

Getting a copy of your personal Facebook data is very easy. Here’s how you can do that:

You can see all of your private and public information, your message threads, and a lot of other stuff in the downloaded file.

Facebook Graph API endpoints related to friends’ data were closed permanently because of changes to its privacy policy.

Getting friends’ data is not very straightforward anymore. In light of its promises to protect its users’ data, Facebook has been continuously revising its Graph API, especially after the Cambridge Analytica debacle.

As a result, several Graph API endpoints have been closed completely including the endpoints related to friends’ information.

So does no Graph API access mean no access at all? Not exactly. We can still access the (publicly available) data using web browsers, which means we can employ crawling tools to extract the information we need.

Let’s get straight to the job without further boring explanations!

1. Install Required Tools for Crawling

We will be using the following tools/technologies for the purpose:

Install Virtual Environment (optional)

This step is optional, but I suggest creating a Python virtual environment for our project so that we don’t pollute the global Python environment. If you have Anaconda installed on your Ubuntu/Linux machine, you can use the conda command to create a Python virtual environment:

$ conda create -n py35_fb python=3.5
$ source activate py35_fb # activate conda environment

There are a few other ways to create a Python virtual environment, but explaining them is beyond the scope of this post.

Install Selenium

In our virtual environment, we can install selenium using pip:

pip install selenium==3.141.0

Download Chrome Driver

Because we want to run headless Selenium (without a GUI), we need to download Chrome Driver. Let’s download it, unzip it, and place the chromedriver file from the extracted folder in the working directory. We will use it later to initialize the headless version of Selenium.

Please make sure that you download the Chrome Driver version that is compatible with your installed Chrome browser.

2. Crawling Friend’s Online Activity

We can find out the list of online friends either from Messenger’s web interface or from Facebook Mobile’s buddy page. In this tutorial, we will be using Facebook Mobile interface.

If you are new to crawling, I suggest reading an introductory article about crawling like this one. The following image shows what the buddy page looks like. The pictures and names are blurred for privacy reasons.

Figure 1: Facebook Mobile’s Buddy Page showing a list of Online Friends

Let’s see the HTML source of the page and find out where our required information exists in the page.

The following is the structure of the HTML source of the page. For the sake of understanding, I have excluded the unnecessary content of the page.

<html>
<body>
...
<div>
  ...
  <div class="content">
    ...
    <strong>
    John Doe
    </strong>
    ...
  </div>
  ...
</div>
<div>
  ...
  <div class="content">
    ...
    <strong>
    Jane Doe
    </strong>
    ...
  </div>
  ...
</div>
...
</body>
</html>

We can see from the structure of the document that the strong tag inside the div tag (with class ‘content’) contains the information we need.

Tip: You can use Chrome Dev Tools to analyze the HTML source of any page in Chrome.

That’s all the information we need for our purpose, now let’s get to the interesting stuff.

Let’s initialize the Selenium browser using the following piece of code:
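A minimal sketch of this initialization, assuming Selenium 3.141.0 as installed earlier and a chromedriver file in the working directory. The function names and the window size are illustrative choices, not taken from the original post.

```python
import os

# Assumed location: the chromedriver file sits in the working directory.
chrome_driver_ex = os.path.join(os.getcwd(), "chromedriver")


def build_chrome_arguments(window_size="1920,1080"):
    """Return the command-line flags for a headless Chrome session."""
    return ["--headless", "--window-size={}".format(window_size)]


def init_browser(driver_path):
    """Start headless Chrome through Selenium 3.x and return the driver."""
    # Imported lazily so the helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    chrome_options = Options()
    for argument in build_chrome_arguments():
        chrome_options.add_argument(argument)
    return webdriver.Chrome(executable_path=driver_path, options=chrome_options)
```

With this in place, `browser = init_browser(chrome_driver_ex)` would start the headless session used in the rest of the tutorial.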

Don’t forget to replace the value of the chrome_driver_ex variable with the correct path to the chromedriver file. If the above code runs without errors, a headless (without GUI) Selenium browser should start successfully.

Note: Facebook blocks your account if there are several unsuccessful login attempts from an unknown location.

Now we need to load the buddy list page and extract the required information. But before that, we need to log in to our Facebook account. There are a few very important steps we need to take care of:

It is of course possible to use Selenium to automate the authorization step required for logins from unrecognized locations, but that is more complicated and out of the scope of this tutorial.

The next step is to analyze the structure of the buddy page to find out how we can fill in the login form. I have already analyzed the structure of the login page for you, so we can skip to the code for login:
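A hedged sketch of the login step with the Selenium 3 API. The buddy list URL and the form field names (`email`, `pass`, `login`) are assumptions and may not match the current page; the function expects an already initialized browser object.

```python
import time

# Assumed values, not confirmed by the post: the mobile buddy list URL and
# the name attributes of the login form elements.
BUDDY_LIST_URL = "https://mbasic.facebook.com/buddylist.php"


def login(browser, email, password, wait_seconds=5,
          email_field="email", password_field="pass", login_button="login",
          screenshot_path="logged_in.png"):
    """Fill in and submit the login form, then save a screenshot."""
    # Assumption: requesting the buddy page while logged out shows the form.
    browser.get(BUDDY_LIST_URL)
    browser.find_element_by_name(email_field).send_keys(email)
    browser.find_element_by_name(password_field).send_keys(password)
    browser.find_element_by_name(login_button).click()
    # Give the page time to reload before capturing the result.
    time.sleep(wait_seconds)
    browser.save_screenshot(screenshot_path)
```

The screenshot is saved whether or not the login succeeded, so it doubles as a quick diagnostic when something goes wrong.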

If the above code works without errors, you can head into your working directory to see the logged_in.png image showing a successful login.

But it’s quite possible that you will encounter problems that result in a failed login. Let’s list some of the possible causes.

  1. Facebook blocked the login attempt because of an unknown location or two-factor authentication, as discussed previously in the tutorial, or for some other security reason. Looking at logged_in.png should give you an idea of what the problem is.
  2. If your internet connection is slow and you did not wait long enough for the login page to reload, the required elements cannot be found.
  3. If Facebook changes the source of the page (e.g. by changing the name attribute of the login button), the required elements cannot be found. You will need to find the correct elements by analyzing the structure of the HTML document.
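For the slow-connection case, one option is to retry the element lookup for a while instead of relying on a single fixed wait. The helper below is a hypothetical, generic sketch; Selenium's own `WebDriverWait` class serves the same purpose.

```python
import time


def wait_for(find, timeout=10.0, interval=0.5):
    """Call find() repeatedly until it succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            return find()
        except Exception as error:  # e.g. Selenium's NoSuchElementException
            if time.monotonic() >= deadline:
                raise TimeoutError("lookup did not succeed in time") from error
            time.sleep(interval)
```

For example, `wait_for(lambda: browser.find_element_by_name("email"))` would keep looking for the field for up to ten seconds, where `email` is an assumed field name.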

Assuming that the login was successful, we can load the buddy list page and get a list of online friends. We can use XPath to get the required elements of the document using the following code:
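A minimal sketch of this extraction, assuming a logged-in Selenium 3 browser object and the page structure described earlier (names in strong tags inside div tags with class ‘content’). The function name is hypothetical, and the buddy list URL must be supplied by the caller.

```python
# XPath for <strong> elements nested inside <div class="content">.
NAME_XPATH = "//div[@class='content']//strong"


def get_online_friends(browser, buddy_list_url):
    """Return the names shown on the buddy list page of a logged-in session."""
    browser.get(buddy_list_url)
    elements = browser.find_elements_by_xpath(NAME_XPATH)
    # Strip surrounding whitespace from each element's visible text.
    return [element.text.strip() for element in elements]
```

The result can be stored as `names = get_online_friends(browser, url)`, where `url` is the address of the buddy page.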

That’s all there is to it. “names” is a list of the friends who were online at the time of crawling. We can reload the page every half an hour (or more frequently), store the list of online friends together with the time, and use the data later for exploratory data analysis.
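One simple way to store these periodic snapshots is to append timestamped rows to a CSV file. This is only a sketch; the file layout and the function name are assumptions rather than the author's method.

```python
import csv
from datetime import datetime


def record_snapshot(names, csv_path, timestamp=None):
    """Append one (timestamp, name) row per online friend to a CSV file."""
    timestamp = timestamp or datetime.now()
    with open(csv_path, "a", newline="") as handle:
        writer = csv.writer(handle)
        for name in names:
            writer.writerow([timestamp.isoformat(), name])
    # Report how many rows were written for this snapshot.
    return len(names)
```

Calling `record_snapshot(names, "online_activity.csv")` inside a loop that sleeps for 30 minutes between iterations would build up the activity log over time.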

A user’s online status is just a tiny bit of information compared to what’s available online, and we can use the same crawling techniques to scrape other information. This article is also a reminder of how much of our information is available online.

I hope this post was helpful for your work and research. If you have any questions, feel free to write a comment below.

Read related stories:

  1. Crawling your Facebook friends’ data
  2. Analyzing Online Activity and Sleep Patterns
  3. [Coming Soon] — Analyzing my Social Network Graph

How to save and transfer a profile from Facebook and Instagram to another social network

You can transfer data from Facebook to four platforms in semi-automatic mode. Here, a platform is the service to which you want to transfer your Facebook information:

VKontakte, at the request of users, has added the ability to save materials from Instagram: vk.com/instagram_manager.

To transfer pictures, just enter your Instagram username or a link to your account. The pictures will be added to a private album called "Instagram", but you can change its privacy setting at any time.

In the future, it will also be possible to use the archive. Videos will then be saved as well (in the "Video" section), along with Reels (in "Clips"). All materials will be visible only to you. So make sure you download the file with all your information from Instagram in time.

Before moving on, let's determine what information is on the social network

What information Facebook stores

The View Your Information tool lets you see all your profile data in one place. We also offer a range of tools and resources to help you verify and control your information on Facebook.

Note. If you would like to download a copy of your Facebook information, check out the Download Your Information tool.

What can you learn about your profile using the View Your Information tool?

The View Your Info tool provides a summary of your Facebook profile that you can see in one place at any time. In the Your Information section, we've divided this information into categories to make it easier for you to find the information you're looking for.