Have you ever wanted to scrape comments from a TikTok video to analyze audience reactions or gather data for a project? In this tutorial, I’ll guide you step by step through the process of scraping TikTok comments using Python and the requests library. We’ll break down the code, explain its components, and ensure that you understand everything, even if you’re new to web scraping. Let’s dive in!
If you’d like, you can watch the video tutorial along with this blog.
Let’s kick things off by importing the necessary libraries and setting up our base URL. This URL will serve as the foundation for our API requests:
import requests
import json
We use the requests library to handle HTTP requests and json to work with JSON data (TikTok’s response format). These are essential for fetching the data from the TikTok server.
post_url = 'https://www.tiktok.com/@pythonblyat5/video/7347374025221393696'
post_id = post_url.split('/')[-1]
The post_id is extracted from the URL using split('/')[-1], which fetches the last part of the URL (the video ID). This post_id will be used to query TikTok’s API.

When making web requests, it’s important to mimic a real browser. This is done using HTTP headers. The headers dictionary contains several key-value pairs that make your request look like it’s coming from a regular browser (specifically Google Chrome on Windows in this case). This helps to prevent the server from blocking the request.
headers = {
    'accept': '*/*',
    'accept-language': 'en-US,en;q=0.9,fa;q=0.8',
    'cache-control': 'no-cache',
    'pragma': 'no-cache',
    'priority': 'u=1, i',
    'referer': 'https://www.tiktok.com/explore',
    'sec-ch-ua': '"Google Chrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36'
}
comments = []
comments.append({'post_url': post_url})
curs = 0
comments: We initialize an empty list to store the comments we’ll scrape. The first item in the list is a dictionary containing the post URL for reference.
curs: This variable keeps track of the pagination cursor, which allows us to load more comments in subsequent requests.
def parser(data):
    comment = data['comments']
    for cm in comment:
        com = cm['share_info']['desc']
        if com == "":
            com = cm['text']
        comments.append(com)
        print(com)
    return data
The parser function extracts the comments from the JSON data returned by the TikTok API. data['comments'] accesses the list of comments in the response. For each comment, we read the desc (description) field inside share_info; if it is empty, we fall back to the text field. Every extracted comment is appended to the comments list and printed.
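Since TikTok’s response schema can change without notice, a more defensive variant of this parser is worth sketching. The snippet below is an illustration, not the tutorial’s exact code: parse_comments and the sample payload are hypothetical, and .get() lookups replace direct indexing so a missing share_info or desc key doesn’t raise a KeyError.

```python
def parse_comments(data, out):
    # .get() lookups avoid KeyError if TikTok omits 'share_info' or 'desc'.
    for cm in data.get('comments') or []:
        text = cm.get('share_info', {}).get('desc') or cm.get('text', '')
        if text:
            out.append(text)
    return data

# Hypothetical payload shaped like the fields the tutorial relies on.
sample = {'comments': [
    {'share_info': {'desc': ''}, 'text': 'fallback works'},
    {'share_info': {'desc': 'primary field'}},
]}

collected = []
parse_comments(sample, collected)
print(collected)  # → ['fallback works', 'primary field']
```

The `or []` guard also means an error response without a comments list simply yields nothing instead of crashing the scraper mid-run.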
def req(post_id, curs):
    url = f"https://www.tiktok.com/api/comment/list/?WebIdLastTime=1727614026&aid=1988&...&cursor={curs}&aweme_id={post_id}"
    response = requests.get(url, headers=headers)
    info = response.text
    raw_data = json.loads(info)
    return raw_data
The req function sends a GET request to TikTok’s API to fetch the comments. It takes two arguments: post_id (the video ID) and curs (the cursor for pagination). response.text contains the raw JSON response from the server, and json.loads(info) converts that JSON string into a Python dictionary.
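As an aside, the long query string can also be assembled with urllib.parse.urlencode instead of a hand-written f-string. The sketch below is hypothetical (build_comment_url is not part of the tutorial) and includes only a few of the many parameters the real endpoint expects; session-bound values such as msToken and X-Bogus are omitted, so it illustrates the structure rather than a working call:

```python
from urllib.parse import urlencode

BASE = 'https://www.tiktok.com/api/comment/list/'

def build_comment_url(post_id, cursor, count=20):
    # A small subset of the real endpoint's query parameters; tokens like
    # msToken and X-Bogus (required by the live API) are deliberately omitted.
    params = {'aid': 1988, 'aweme_id': post_id, 'count': count, 'cursor': cursor}
    return BASE + '?' + urlencode(params)

url = build_comment_url('7347374025221393696', 0)
print(url)
# requests.get(url, headers=headers) would then fetch this page of comments.
```

Building the URL from a dict makes it easier to tweak count or cursor without editing one very long string.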
while True:
    data = req(post_id, curs)
    same_data = parser(data)
    if same_data['has_more'] == 1:
        curs += 20
        print('moving to the next cursor')
    else:
        break
We use a while loop to continuously scrape comments: the req function fetches the data for the current cursor position, and the parser function extracts the comments. If the has_more field in the response is 1, there are more comments to fetch, so we increment the cursor by 20 to move to the next set of comments.
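The loop’s control flow can be sketched against canned data. Here fetch_page is a hypothetical stand-in for req, so the pagination logic runs without touching TikTok’s API:

```python
def fetch_page(cursor):
    # Stand-in for req(): returns canned pages keyed by cursor position,
    # mimicking the 'comments' and 'has_more' fields the real API returns.
    pages = {
        0:  {'comments': ['first', 'second'], 'has_more': 1},
        20: {'comments': ['third'], 'has_more': 0},
    }
    return pages[cursor]

all_comments = []
cursor = 0
while True:
    page = fetch_page(cursor)
    all_comments.extend(page['comments'])
    if page.get('has_more') == 1:
        cursor += 20  # same step size as the tutorial's curs += 20
    else:
        break

print(all_comments)  # → ['first', 'second', 'third']
```

The loop terminates as soon as has_more stops being 1, which is exactly how the scraper knows it has reached the last page of comments.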
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(comments, f, ensure_ascii=False, indent=4)
Once all the comments are scraped, we save them to a JSON file (output.json). indent=4 keeps the file readable, and ensure_ascii=False writes non-ASCII characters (e.g., emojis or special characters) as-is instead of escaping them to \uXXXX sequences.
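A quick standalone comparison shows exactly what ensure_ascii controls:

```python
import json

sample = ['Great video! 🔥']

# Default (ensure_ascii=True): non-ASCII characters become \uXXXX escapes.
print(json.dumps(sample))                      # ["Great video! \ud83d\udd25"]

# ensure_ascii=False keeps the characters readable in the output file.
print(json.dumps(sample, ensure_ascii=False))  # ["Great video! 🔥"]
```

Both forms are valid JSON; the second is simply easier to read when you open output.json in an editor.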
In this tutorial, you learned how to scrape TikTok comments using Python and the requests library. Here’s a quick recap: we used requests for HTTP requests and json to handle TikTok’s JSON response. The full script is below:
import requests
import json
post_url = 'https://www.tiktok.com/@pythonblyat5/video/7347374025221393696'
post_id = post_url.split('/')[-1]
headers = {
    'accept': '*/*',
    'accept-language': 'en-US,en;q=0.9,fa;q=0.8',
    'cache-control': 'no-cache',
    'pragma': 'no-cache',
    'priority': 'u=1, i',
    'referer': 'https://www.tiktok.com/explore',
    'sec-ch-ua': '"Google Chrome";v="129", "Not=A?Brand";v="8", "Chromium";v="129"',
    'sec-ch-ua-mobile': '?0',
    'sec-ch-ua-platform': '"Windows"',
    'sec-fetch-dest': 'empty',
    'sec-fetch-mode': 'cors',
    'sec-fetch-site': 'same-origin',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/129.0.0.0 Safari/537.36'
}
comments = []
comments.append({'post_url':post_url})
curs = 0
def parser(data):
    comment = data['comments']
    for cm in comment:
        com = cm['share_info']['desc']
        if com == "":
            com = cm['text']
        comments.append(com)
        print(com)
    return data
def req(post_id, curs):
    url = f"https://www.tiktok.com/api/comment/list/?WebIdLastTime=1727614026&aid=1988&app_language=en&app_name=tiktok_web&aweme_id={post_id}&browser_language=en-US&browser_name=Mozilla&browser_online=true&browser_platform=Win32&browser_version=5.0%20%28Windows%20NT%2010.0%3B%20Win64%3B%20x64%29%20AppleWebKit%2F537.36%20%28KHTML%2C%20like%20Gecko%29%20Chrome%2F129.0.0.0%20Safari%2F537.36&channel=tiktok_web&cookie_enabled=true&count=20&cursor={curs}&data_collection_enabled=true&device_id=7420045675172267552&device_platform=web_pc&focus_state=true&from_page=video&history_len=9&is_fullscreen=false&is_page_visible=true&odinId=7420045705102722080&os=windows&priority_region=&referer=&region=GB&screen_height=864&screen_width=1536&tz_name=Asia%2FTehran&user_is_login=false&webcast_language=en&msToken=LACQZNSB6gTZGDBhIqrVKgoO19z0F5AUS9BgaOy6Y0PI5qokvf8CrE01SHTkGr296aSEazAVJNBRMo3E3MrB2Dz4dEIMfNjmfFdz1Gra7Kb3BZpNrZesv2bwtBjLEQqf572Gz3xrA-7ju9L7y-p1z0vmCRc=&X-Bogus=DFSzswVOGPUANtN1t69wzMSwXQM1&_signature=_02B4Z6wo00001xqLMzgAAIDDkLHDRCQFA1saizeAAKBlda"
    response = requests.get(url, headers=headers)
    info = response.text
    raw_data = json.loads(info)
    return raw_data
while True:
    data = req(post_id, curs)
    same_data = parser(data)
    if same_data['has_more'] == 1:
        curs += 20
        print('moving to the next cursor')
    else:
        break
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(comments, f, ensure_ascii=False, indent=4)
print("\ndata has been saved to output.json")