
Scraping Facebook Profiles with Python & Playwright 📊

Introduction 🎓

Web scraping is an incredibly useful technique for extracting data from websites, and in this article, we’re diving into how to scrape Facebook profile data. Using Python, we’ll leverage two powerful libraries: Playwright and BeautifulSoup.

  • Playwright: A cutting-edge browser automation library that makes it easy to navigate web pages and interact with dynamic content. It’s perfect for scraping pages like Facebook, which use JavaScript to load elements.

  • BeautifulSoup: One of Python’s most popular libraries for parsing HTML. It allows us to extract key pieces of information from the web page structure with ease.

In this tutorial, we’ll scrape several Facebook profiles to collect details such as cover images, likes, followers, and profile photos. By the end, you’ll have a script that can automate this process across multiple profiles and output the data in a structured format like JSON.

Why Scrape Facebook Profiles? 🤔

Facebook profiles often contain valuable information for marketing, research, or analysis. With Playwright and BeautifulSoup, you can easily extract these details, whether you need to analyze social media trends, track competitor activity, or simply automate data collection from Facebook profiles.

⚠️ Note: While scraping can be a powerful tool, it’s important to respect the website’s terms of service and avoid violating any legal restrictions. Always check the terms before scraping.

Let’s jump in and explore how to automate Facebook scraping in Python! 🚀

If you’d like, you can watch the video tutorial alongside this blog post.

🛠️ Step 1: Importing Libraries

				
from playwright.sync_api import sync_playwright
import time
from bs4 import BeautifulSoup
import json, re

  • 💡 Explanation:

    First things first, we need to import the necessary Python libraries:

    • Playwright: A powerful library to automate web browsers like Chromium. This is great for scraping because many modern sites (like Facebook) are dynamic and load elements using JavaScript. Playwright helps us load these pages just like a real browser! 🌐
    • time: We’ll use this for adding small pauses in our script to ensure pages load fully before we start scraping data.
    • BeautifulSoup: This is our HTML parser. Once the page is loaded, BeautifulSoup will help us easily navigate the HTML structure and find the specific data we need. 🥣
    • json: We’ll use this to store the scraped data in JSON format, which is ideal for structured storage and later use. 💾
    • re: This module lets us use regular expressions (patterns) to find specific text within the HTML. It’s useful for finding numbers like “followers” or “likes” that follow certain word patterns (see the short standalone demo right after this list). 🔍
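
To see how those trailing-word patterns behave, here’s a quick standalone demo of the kind of matching we’ll rely on later. The sample strings are made up purely for illustration:

import re

# Hypothetical link texts, just for illustration
samples = ["5.1M followers", "2,431 likes", "Photos"]

followers_pattern = re.compile(r" followers$")  # text must end with " followers"
for text in samples:
    if followers_pattern.search(text):
        print(f"matched: {text}")  # -> matched: 5.1M followers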

🌐 Step 2: Defining the URLs to Scrape

				
urls = [
    "https://www.facebook.com/adidas/",
    "https://www.facebook.com/Cristiano",
    "https://www.facebook.com/nasaearth"
]

  • 💡 Explanation:

    Here, we’re defining a list of URLs that we want to scrape. In this case, we are scraping the Facebook profiles for Adidas, Cristiano Ronaldo, and NASA Earth. 🚀 This list can be expanded or modified to scrape other profiles as well. We’re storing them in a Python list so that we can loop through them later in our script.

🚀 Step 3: Launching the Playwright Browser

We’re launching a headless browser using Playwright, which means it runs in the background without opening a visible window. Let’s break this down:

  • sync_playwright(): This starts Playwright and makes sure that once the scraping is done, everything is cleaned up properly.
  • p.chromium.launch(headless=True): This launches a Chromium browser in headless mode (invisible). If you set headless=False, the browser would pop up visibly while scraping! 👀
  • context = browser.new_context(): This creates a new browser context, like opening a new private tab in your browser. Each context is isolated, so cookies, cache, and other session details don’t carry over between different tabs. 🔒
  • all_data = []: Before the browser starts, we also create an empty list that will collect one dictionary of results per profile; it gets filled in Step 8 and appears at the top of the complete code below.
				
all_data = []  # will hold one dictionary of results per profile

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()
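
If you want to watch the browser work while debugging, or present a more realistic browser fingerprint, Playwright’s launch and context options can help. Here’s a minimal sketch; the viewport size and user-agent string are just example values, not requirements:

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # headless=False opens a visible window, handy while debugging selectors
    browser = p.chromium.launch(headless=False)
    context = browser.new_context(
        viewport={"width": 1280, "height": 800},  # example viewport size
        user_agent="Mozilla/5.0 (Windows NT 10.0; Win64; x64)",  # example UA string
    )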

				
			

🔄 Step 4: Looping Through Each URL

Now we’re looping through each URL in our list to scrape data from each profile:

  • page = context.new_page(): Opens a new page (or tab) in our browser.
  • print(f"Going to page: {url}"): This prints a message so we know which URL is being scraped.
  • page.goto(url): Navigates to the given Facebook profile page.
  • time.sleep(3): We wait for 3 seconds to allow the page to fully load. Without this, some elements might not load in time, causing our scraping to fail! 🕒
				
for url in urls:
    page = context.new_page()
    print(f"Going to page: {url}")
    page.goto(url)
    time.sleep(3)
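
A fixed time.sleep(3) works, but it’s fragile on slow connections and wastes time on fast ones. Playwright can instead wait for the page, or for a specific element, to be ready. A sketch of that idea, assuming the cover-photo attribute used later in this tutorial is still present in Facebook’s markup:

# Wait until network activity quiets down instead of sleeping a fixed time
page.goto(url, wait_until="networkidle")

# Or wait for a specific element we know we'll need (the selector is an
# assumption based on the attribute used in Step 6)
page.wait_for_selector("img[data-imgperflogname='profileCoverPhoto']", timeout=10000)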

				
			

🍜 Step 5: Parsing the Page with BeautifulSoup

Once the page is loaded, we pass its HTML content into BeautifulSoup for parsing. The html.parser is a built-in parser that works well with BeautifulSoup. This allows us to navigate and search through the HTML tree to find the data we need. 🕸️

				
soup = BeautifulSoup(page.content(), 'html.parser')
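
html.parser ships with Python, so there’s nothing extra to install. If you have the lxml package available, you can swap it in as the parser for a small speed boost; this is purely optional:

# Requires: pip install lxml
soup = BeautifulSoup(page.content(), 'lxml')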

				
			

📸 Step 6: Scraping the Profile Data

				
cover_image = soup.find('img', {'data-imgperflogname': 'profileCoverPhoto'})['src']
logo_image = soup.select_one('g image')['xlink:href']

Here, we’re extracting two key images from the Facebook profile:

  • cover_image: This finds the profile’s cover image by looking for the HTML tag <img> that has the attribute data-imgperflogname="profileCoverPhoto". The src attribute contains the image URL. 📷
  • logo_image: This grabs the logo image (usually the profile picture) using an SVG selector. The xlink:href attribute gives us the link to the image (a more defensive version of both lookups is sketched right after this list). 🎨
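
Both lines above index straight into the result with ['src'] and ['xlink:href'], which raises a TypeError if Facebook changes its markup and the tag isn’t found. A more defensive variant, offered as a sketch rather than a guarantee about Facebook’s HTML:

cover_tag = soup.find('img', {'data-imgperflogname': 'profileCoverPhoto'})
cover_image = cover_tag['src'] if cover_tag else "none"

logo_tag = soup.select_one('g image')
logo_image = logo_tag.get('xlink:href', "none") if logo_tag else "none"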

Next, we extract the numbers of likes, followers, and following:

				
num_like_tag = soup.find('a', string=re.compile(r' likes$'))
num_like = num_like_tag.get_text() if num_like_tag else "none"

num_follower_tag = soup.find('a', string=re.compile(r' followers$'))
num_follower = num_follower_tag.get_text() if num_follower_tag else "none"

num_following_tag = soup.find('a', string=re.compile(r' following$'))
num_following = num_following_tag.get_text() if num_following_tag else "none"

  • num_like: Finds the number of likes by searching for a link (<a>) whose text ends with the word “likes”. If the link exists, we extract its text; otherwise we fall back to “none”.
  • num_follower: Similarly, we find the number of followers by searching for a link whose text ends with “followers”.
  • num_following: This extracts the number of people the profile is following by searching for a link ending with “following”. (If you need these display strings as plain numbers, a small helper is sketched right after this list.)
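
The values returned here are display strings like “5.1M followers” rather than numbers. If you later need them as integers, a small helper along these lines can convert them; the K/M abbreviation rules are an assumption about how Facebook formats its counts, and the helper reuses the re module imported in Step 1:

def parse_count(text):
    """Convert strings like '5.1M followers' or '2,431 likes' to an int."""
    match = re.search(r'([\d.,]+)\s*([KM]?)', text)
    if not match:
        return None
    number = float(match.group(1).replace(',', ''))
    multiplier = {'K': 1_000, 'M': 1_000_000}.get(match.group(2), 1)
    return int(number * multiplier)

print(parse_count("5.1M followers"))  # 5100000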

🖼️ Step 7: Scraping Additional Profile Photos

This part collects additional images posted on the profile:

  • photo: Selects the second <div> with the class name 'x1yztbdb' (that’s what the [1] index does); at the time of writing, this container held the post photos.
  • photos: We extract the URLs of the photos from the <img> tags inside that container (a more defensive version is sketched after the code below). 📸
				
photo = soup.find_all('div', class_='x1yztbdb')[1]
photos = [img['src'] for img in photo.select('img')]
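
Class names like 'x1yztbdb' are auto-generated by Facebook and change over time, and find_all(...)[1] raises an IndexError if fewer than two matching containers exist. A slightly more defensive version, sketched under those assumptions:

containers = soup.find_all('div', class_='x1yztbdb')
photos = []
if len(containers) > 1:
    # The second matching container held the post photos at the time of writing
    photos = [img['src'] for img in containers[1].select('img') if img.get('src')]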

				
			

💾 Step 8: Storing the Data

Now, we gather all the scraped information into a dictionary called data:

  • We store the URL, images, numbers of likes, followers, and following, the profile description (detail, taken from the page’s <meta name="description"> tag), and the post photos in this dictionary.
  • all_data.append(data): We then append this dictionary to the all_data list (created back in Step 3), which will store the data for all the profiles we scrape.
				
detail = soup.find('meta', {'name': 'description'})['content'] if soup.find('meta', {'name': 'description'}) else "none"

data = {
    'url': url,
    'cover_image': cover_image,
    'logo_image': logo_image,
    'num_like': num_like,
    'num_follower': num_follower,
    'num_following': num_following,
    'detail': detail,
    'post_photos': photos
}
all_data.append(data)

🗂️ Step 9: Saving the Data to a JSON File

After scraping all the profiles, we save the data into a JSON file:

  • json.dump(all_data, f, ensure_ascii=False, indent=4): This writes the all_data list to a JSON file called out_all_urls.json. We ensure that non-ASCII characters (like emoji or special symbols) are properly encoded, and the indent=4 option makes the file human-readable. 🗂️
				
with open('out_all_urls.json', 'w', encoding='utf-8') as f:
    json.dump(all_data, f, ensure_ascii=False, indent=4)
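
Once the file exists, you can load it back in a later script or notebook for analysis. A quick usage example, reading the same file we just wrote:

import json

with open('out_all_urls.json', 'r', encoding='utf-8') as f:
    profiles = json.load(f)

for profile in profiles:
    print(profile['url'], profile['num_follower'])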

				
			

✅ Step 10: Wrapping Up

				
					print("Scraping completed for all URLs.")
browser.close()

				
			

We print a message indicating that the scraping process is complete and then close the browser:

  • print(): Outputs a success message so you know the process is done.
  • browser.close(): This closes the Chromium browser and frees up system resources. It’s always a good practice to close your browser after automation is complete. 🖥️

Complete Code

Here’s the full code for you to copy and use:

				
from playwright.sync_api import sync_playwright
import time
from bs4 import BeautifulSoup
import json, re

urls = [
    "https://www.facebook.com/adidas/",
    "https://www.facebook.com/Cristiano",
    "https://www.facebook.com/nasaearth"
]

all_data = []

with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context()

    for url in urls:
        page = context.new_page()
        print(f"Going to page: {url}")
        page.goto(url)
        time.sleep(3)  

        soup = BeautifulSoup(page.content(), 'html.parser')

        cover_image = soup.find('img', {'data-imgperflogname': 'profileCoverPhoto'})['src'] 
        logo_image = soup.select_one('g image')['xlink:href'] 

        num_like_tag = soup.find('a', string=re.compile(r' likes$'))
        num_like = num_like_tag.get_text() if num_like_tag else "none"

        num_follower_tag = soup.find('a', string=re.compile(r' followers$'))
        num_follower = num_follower_tag.get_text() if num_follower_tag else "none"

        num_following_tag  = soup.find('a', string=re.compile(r' following$'))
        num_following = num_following_tag.get_text() if num_following_tag else "none"

        photo = soup.find_all('div', class_='x1yztbdb')[1]
        photos = [img['src'] for img in photo.select('img')]
        
        detail = soup.find('meta', {'name': 'description'})['content'] if soup.find('meta', {'name': 'description'}) else "none"

        data = {
            'url': url,
            'cover_image': cover_image,
            'logo_image': logo_image,
            'num_like': num_like,
            'num_follower': num_follower,
            'num_following': num_following,
            'detail': detail,
            'post_photos': photos
        }

        all_data.append(data)

        print(f"Finished scraping {url}")
        page.close()

    browser.close()

with open('out_all_urls.json', 'w', encoding='utf-8') as f:
    json.dump(all_data, f, ensure_ascii=False, indent=4)

print("Scraping completed for all URLs.")

				
			

✨ Want to dive deeper into web scraping? Check out my free web scraping course here and unlock the secrets of extracting data from the web like a pro! 🌐
