Reddit Post Scraper: Scrape Posts from Subreddits Using Python

Reddit is one of the largest platforms for discussions, memes, and news. If you've ever wanted to scrape posts from a specific subreddit for data analysis, automation, or research, this tutorial is for you!

In this post, we’ll build a Reddit Post Scraper using Python with the PRAW (Python Reddit API Wrapper) library.

What is a Reddit Post Scraper?

A Reddit Post Scraper extracts posts from subreddits and collects information like:

Post Titles – The heading of each post.
Upvotes & Comments – To analyze engagement.
Post URLs – Direct links to the posts.
Author & Date – Metadata about the post.

This data is useful for content analysis, sentiment analysis, and automation projects.
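As a rough sketch of how the fields listed above map onto a PRAW submission, the hypothetical helper below (post_to_row is not part of PRAW) reads the title, score, num_comments, permalink, author, and created_utc attributes. The example object is a stand-in with made-up values, so the snippet runs without credentials.

```python
from datetime import datetime, timezone
from types import SimpleNamespace


def post_to_row(post):
    """Collect the fields listed above from a PRAW-style submission object."""
    # post.author is None when the account has been deleted
    author = str(post.author) if post.author is not None else "[deleted]"
    # created_utc is a Unix timestamp in seconds
    created = datetime.fromtimestamp(post.created_utc, tz=timezone.utc)
    return {
        "title": post.title,
        "upvotes": post.score,
        "comments": post.num_comments,
        # permalink is a site-relative path, so prefix the domain
        "url": "https://www.reddit.com" + post.permalink,
        "author": author,
        "date": created.strftime("%Y-%m-%d %H:%M"),
    }


# Stand-in object with made-up values, used only to show the shape of the output
example = SimpleNamespace(
    title="Example post", score=42, num_comments=7,
    permalink="/r/Python/comments/abc123/example_post/",
    author=None, created_utc=0,
)
print(post_to_row(example))
```

With a real PRAW submission, the same function could be called on each post inside the scraping loop.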

Features of Our Reddit Scraper

Extracts Latest or Trending Posts
Fetches Titles, Upvotes, and URLs
Filters Posts by Keywords
Saves Data in CSV Format
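Keyword filtering can be done in plain Python once the posts are collected. The sketch below assumes each row is a [title, upvotes, url] list and uses a simple case-insensitive substring match; the function names are illustrative, not part of PRAW.

```python
def matches_keywords(title, keywords):
    """Return True if the title contains any keyword, ignoring case."""
    lowered = title.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def filter_rows(rows, keywords):
    """Keep only [title, upvotes, url] rows whose title matches a keyword."""
    if not keywords:
        # No keywords means no filtering
        return list(rows)
    return [row for row in rows if matches_keywords(row[0], keywords)]
```

For example, filter_rows(posts_data, ["python", "pandas"]) would keep only the rows whose titles mention either word.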

Step-by-Step Guide to Scraping Reddit Posts

Step 1: Install Required Libraries

We will use PRAW (Python Reddit API Wrapper) to interact with Reddit’s API. Install it using:


pip install praw

Step 2: Set Up Reddit API Credentials

To access Reddit’s API, you need to create an API client:

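1️⃣ Go to Reddit Apps (https://www.reddit.com/prefs/apps).
2️⃣ Click "Create an App" and select script as the app type.
3️⃣ Note down your Client ID and Client Secret. You will also need a user agent, which is a short descriptive string you choose to identify your script.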
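To keep the credentials out of the script itself, one option is to read them from environment variables. This is a minimal sketch; the variable names REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, and REDDIT_USER_AGENT and the helper load_credentials are assumptions chosen for illustration.

```python
import os


def load_credentials(env=os.environ):
    """Build the keyword arguments for praw.Reddit from environment variables."""
    # Hypothetical variable names; pick whatever fits your setup
    names = {
        "client_id": "REDDIT_CLIENT_ID",
        "client_secret": "REDDIT_CLIENT_SECRET",
        "user_agent": "REDDIT_USER_AGENT",
    }
    missing = [var for var in names.values() if not env.get(var)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {key: env[var] for key, var in names.items()}
```

The result can then be passed along as praw.Reddit(**load_credentials()).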

Step 3: Write the Python Script

Here’s a simple Python script to scrape posts from a specific subreddit:


import praw
import csv

# Reddit API Credentials
reddit = praw.Reddit(
    client_id="your_client_id",
    client_secret="your_client_secret",
    user_agent="your_user_agent"
)

def scrape_reddit(subreddit_name, num_posts=10):
    subreddit = reddit.subreddit(subreddit_name)
    
    posts_data = []
    
    for post in subreddit.hot(limit=num_posts):  # Fetch 'hot' posts
        posts_data.append([post.title, post.score, post.url])
    
    # Save data to a CSV file
    with open(f"{subreddit_name}_posts.csv", "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow(["Title", "Upvotes", "URL"])
        writer.writerows(posts_data)

    # Report the number actually fetched, which can be lower than requested
    print(f"Scraped {len(posts_data)} posts from r/{subreddit_name} and saved to {subreddit_name}_posts.csv")

# Run the scraper
if __name__ == "__main__":
    subreddit_name = input("Enter subreddit name: ")
    num_posts = int(input("Enter number of posts to scrape: "))
    scrape_reddit(subreddit_name, num_posts)


Code Explanation

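  • praw.Reddit() – Creates a read-only client for Reddit’s API using your credentials.
  • subreddit.hot(limit=num_posts) – Fetches up to num_posts posts from the subreddit's "hot" listing.
  • post.title, post.score, post.url – Extracts the title, the score (net upvotes), and the URL the post links to. For text posts, this is the post's own page.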
  • CSV File Output – Saves the scraped data into a structured file.
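The script uses the "hot" listing, but PRAW subreddits also expose new, top, and rising in the same way. The sketch below shows one way to make the listing selectable; fetch_posts and ALLOWED_LISTINGS are hypothetical names introduced here, not part of PRAW.

```python
ALLOWED_LISTINGS = ("hot", "new", "top", "rising")


def fetch_posts(subreddit, listing="hot", limit=10):
    """Return posts from the chosen listing of a PRAW subreddit object."""
    if listing not in ALLOWED_LISTINGS:
        raise ValueError(f"listing must be one of {ALLOWED_LISTINGS}")
    # For example, listing="new" calls subreddit.new(limit=limit)
    listing_method = getattr(subreddit, listing)
    return list(listing_method(limit=limit))
```

Calling fetch_posts(reddit.subreddit("Python"), listing="new", limit=20) would then return the latest posts instead of the trending ones.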

Step 4: Running the Reddit Scraper

1️⃣ Save the script as reddit_scraper.py.
2️⃣ Run it using:

python reddit_scraper.py

3️⃣ Enter a subreddit name (e.g., Python, technology, memes).
4️⃣ Enter the number of posts to scrape.
5️⃣ The scraped data will be saved in CSV format.
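Once the CSV exists, it can be read back for a quick look at the results. This sketch assumes the Title, Upvotes, and URL columns written by the script; load_posts and top_posts are illustrative helper names.

```python
import csv


def load_posts(csv_path):
    """Read the scraper's CSV back into a list of dicts, with upvotes as int."""
    with open(csv_path, newline="", encoding="utf-8") as file:
        rows = list(csv.DictReader(file))
    for row in rows:
        # csv returns every field as a string, so convert the score
        row["Upvotes"] = int(row["Upvotes"])
    return rows


def top_posts(rows, count=3):
    """Return the highest-scoring rows first."""
    return sorted(rows, key=lambda row: row["Upvotes"], reverse=True)[:count]
```

For instance, top_posts(load_posts("Python_posts.csv")) would list the three highest-scoring posts from a run against r/Python.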

Conclusion

With just a few lines of Python, we created a Reddit Post Scraper that extracts valuable data from subreddits. You can extend this script to analyze post trends, find viral content, or automate research. 🚀



If you have any questions, feel free to ask in the comments below. Happy coding! 😊