
Product Hunt Scraper 2026: Extract Launches, Upvotes & Comments

April 27, 2026  ·  6 min read

Product Hunt is the go-to launch platform for indie makers, SaaS startups, and other tech product teams. It surfaces hundreds of new launches every week and generates some of the most concentrated early-adopter signal available anywhere on the internet. For startup researchers, VCs, competitive intelligence teams, and growth practitioners, Product Hunt data (launch upvotes, comment volume, maker backgrounds, and launch timing patterns) provides a real-time window into what is getting traction in the tech product market.

Product Hunt does offer an API, but access is approval-gated and rate-limited: it has historically been restricted to approved partners, and the rate limits make bulk historical collection impractical. Getting structured launch data at scale, particularly for competitive monitoring or trend analysis, requires a different approach.

Why people scrape Product Hunt

What makes Product Hunt hard to scrape

Product Hunt is a React single-page application backed by a GraphQL API. Raw HTTP requests to product pages return a minimal HTML shell with no launch data; all content is hydrated client-side via JavaScript. This rules out simple HTML parsing and requires either browser rendering or reverse-engineering the GraphQL schema.

The pagination and historical data problem: Product Hunt’s public site only shows the most recent launches in each category by default. Accessing historical launch data — everything from a specific date range, or all launches in a specific tag — requires navigating the site’s internal pagination state, which is managed through GraphQL cursor-based pagination rather than URL parameters. A scraper that only handles the visible homepage misses the majority of the dataset that makes trend analysis valuable.
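The cursor-based pagination described above can be sketched as a small loop. This is a minimal illustration, not Product Hunt's actual schema: the `fetch_page` callable and the `nodes`, `endCursor`, and `hasNextPage` field names are assumptions borrowed from the common GraphQL connection convention, and `fake_fetch` is a stand-in for a real request.

```python
from typing import Callable, Iterator, Optional


def iter_all(fetch_page: Callable[[Optional[str]], dict],
             max_items: int = 500) -> Iterator[dict]:
    """Follow cursors page by page until the source is exhausted or max_items is hit."""
    cursor = None  # None means "start from the first page"
    seen = 0
    while True:
        page = fetch_page(cursor)
        for node in page["nodes"]:
            if seen >= max_items:
                return
            yield node
            seen += 1
        if not page["hasNextPage"]:
            return
        cursor = page["endCursor"]  # opaque token handed back on the next call


def fake_fetch(cursor: Optional[str]) -> dict:
    """Hypothetical paged source: 7 items served 3 at a time."""
    data = [{"name": f"product-{i}"} for i in range(7)]
    start = int(cursor) if cursor else 0
    end = min(start + 3, len(data))
    return {
        "nodes": data[start:end],
        "endCursor": str(end),
        "hasNextPage": end < len(data),
    }


names = [node["name"] for node in iter_all(fake_fetch)]
print(len(names))
```

The point of the pattern is that the cursor is opaque state returned by the previous response, so a scraper cannot jump to an arbitrary page by editing a URL parameter.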

Comment data adds another layer of complexity. Product Hunt discussion threads use nested replies, and loading full threads requires additional API calls beyond the initial product page load. For products with active launch-day discussion (100+ comments), extracting the complete conversation means managing multiple levels of pagination within each thread.
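Once a nested thread has been fully loaded, it usually needs flattening into rows before analysis. The sketch below assumes each comment carries its children under a `replies` key; that key, along with the `parentId` and `depth` fields added to each row, is hypothetical and used only for illustration.

```python
def flatten_thread(comments, depth=0, parent_id=None):
    """Depth-first walk that turns nested replies into a flat list of rows."""
    rows = []
    for comment in comments:
        rows.append({
            "commentId": comment["commentId"],
            "parentId": parent_id,  # None for top-level comments
            "depth": depth,
            "text": comment["text"],
        })
        # Recurse into replies, if any, recording this comment as their parent.
        rows.extend(flatten_thread(comment.get("replies", []), depth + 1, comment["commentId"]))
    return rows


# Made-up thread for illustration only.
thread = [
    {"commentId": "c1", "text": "Congrats on the launch", "replies": [
        {"commentId": "c2", "text": "Thanks!", "replies": [
            {"commentId": "c3", "text": "Any API plans?"},
        ]},
    ]},
    {"commentId": "c4", "text": "What does pricing look like?"},
]

for row in flatten_thread(thread):
    print("  " * row["depth"] + row["text"])
```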

Product Hunt also uses Cloudflare on its infrastructure, meaning non-browser requests from data center IPs are frequently blocked or return challenge pages before any product data is served. The combination of SPA rendering requirements and Cloudflare protection makes this a target that breaks most simple scraping setups.

How to use the Product Hunt Scraper

We maintain a Product Hunt Scraper on Apify that handles JavaScript rendering, GraphQL pagination, and comment thread extraction. You provide date ranges, topic tags, or specific product URLs; it returns structured launch and engagement data.

Input

Scrape today’s top launches:

{
  "mode": "daily",
  "date": "2026-04-27",
  "maxProducts": 50,
  "includeComments": true
}

Or search by topic tag over a date range:

{
  "mode": "topic",
  "topic": "artificial-intelligence",
  "dateFrom": "2026-01-01",
  "dateTo": "2026-04-27",
  "maxProducts": 500
}

Or scrape a specific product URL:

{
  "mode": "urls",
  "productUrls": [
    "https://www.producthunt.com/posts/example-product"
  ],
  "includeComments": true,
  "maxComments": 200
}
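Before submitting a run, it can help to sanity-check the payload locally. The following is a rough sketch: the set of required keys per mode is inferred from the three example payloads above and is an assumption, not the actor's documented schema, and `validate_input` is a hypothetical helper.

```python
from datetime import date

# Required keys per mode, inferred from the example payloads (an assumption).
REQUIRED_KEYS = {
    "daily": {"date"},
    "topic": {"topic", "dateFrom", "dateTo"},
    "urls": {"productUrls"},
}
DATE_KEYS = ("date", "dateFrom", "dateTo")


def validate_input(payload: dict) -> list:
    """Return a list of human-readable problems; an empty list means the payload looks OK."""
    mode = payload.get("mode")
    if not isinstance(mode, str) or mode not in REQUIRED_KEYS:
        return [f"unknown mode: {mode!r}"]
    errors = [f"missing field: {key}" for key in sorted(REQUIRED_KEYS[mode] - set(payload))]
    parsed = {}
    for key in DATE_KEYS:
        if key in payload:
            try:
                parsed[key] = date.fromisoformat(payload[key])
            except (TypeError, ValueError):
                errors.append(f"{key} is not an ISO date: {payload[key]!r}")
    if "dateFrom" in parsed and "dateTo" in parsed and parsed["dateFrom"] > parsed["dateTo"]:
        errors.append("dateFrom is after dateTo")
    return errors


print(validate_input({
    "mode": "topic",
    "topic": "artificial-intelligence",
    "dateFrom": "2026-01-01",
    "dateTo": "2026-04-27",
    "maxProducts": 500,
}))
```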

Output

Each product launch returns a structured object:

{
  "productId": "ph_123456",
  "name": "DataSift AI",
  "tagline": "Turn any website into structured data in seconds",
  "description": "DataSift AI uses vision models to extract structured data from any webpage...",
  "url": "https://www.producthunt.com/posts/datasift-ai",
  "websiteUrl": "https://datasift.ai",
  "upvotes": 847,
  "commentsCount": 134,
  "rank": 3,
  "launchDate": "2026-04-27",
  "topics": ["Artificial Intelligence", "Developer Tools", "Productivity"],
  "makers": [
    {
      "name": "Alex Chen",
      "username": "alexchen_builds",
      "profileUrl": "https://www.producthunt.com/@alexchen_builds"
    }
  ],
  "thumbnail": "https://ph-files.imgix.net/...",
  "pricingType": "freemium",
  "featured": true,
  "goldenKitty": false
}

Fields returned per launch

Field          Type     Description
name           string   Product name
tagline        string   One-line product description
upvotes        integer  Total upvote count
commentsCount  integer  Number of comments on the launch
rank           integer  Daily ranking position (1 = #1 Product of the Day)
launchDate     string   Date the product was launched
topics         array    Product Hunt topic tags
makers         array    Maker profiles with names and PH usernames
pricingType    string   free, freemium, paid, or open_source
featured       boolean  Whether editorially featured by PH team
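As one example of working with these fields, the sketch below totals upvotes per topic across a batch of launch objects. The first record reuses values from the sample output above; the second is invented purely to make the aggregation visible, and `upvotes_by_topic` is a hypothetical helper.

```python
from collections import defaultdict


def upvotes_by_topic(launches):
    """Sum upvotes for every topic tag, highest total first."""
    totals = defaultdict(int)
    for launch in launches:
        for topic in launch.get("topics", []):
            totals[topic] += launch.get("upvotes", 0)
    return dict(sorted(totals.items(), key=lambda item: item[1], reverse=True))


launches = [
    {
        "name": "DataSift AI",
        "upvotes": 847,
        "topics": ["Artificial Intelligence", "Developer Tools", "Productivity"],
    },
    # Invented record, for illustration only.
    {"name": "Example B", "upvotes": 100, "topics": ["Developer Tools"]},
]

for topic, total in upvotes_by_topic(launches).items():
    print(topic, total)
```

A launch with several tags contributes its full upvote count to each of them, so the totals are useful for comparing topics, not for summing across them.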

Comment output format

When comment extraction is enabled:

{
  "commentId": "cm_789012",
  "productId": "ph_123456",
  "text": "This solves exactly the problem I have with building data pipelines. The vision-based approach is clever...",
  "authorName": "Sarah K.",
  "authorUsername": "sarahk_dev",
  "upvotes": 42,
  "isFounderReply": false,
  "publishedAt": "2026-04-27T11:30:00Z"
}
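The `isFounderReply` flag makes it straightforward to measure how actively makers engage in their own launch threads. This is a small sketch over a list of comment objects in the format above; the sample rows are made up for illustration and `founder_reply_share` is a hypothetical helper.

```python
def founder_reply_share(comments):
    """Fraction of comments per product that are founder replies."""
    counts = {}  # productId -> (total comments, founder replies)
    for comment in comments:
        total, founder = counts.get(comment["productId"], (0, 0))
        counts[comment["productId"]] = (total + 1, founder + int(comment["isFounderReply"]))
    return {pid: founder / total for pid, (total, founder) in counts.items()}


# Made-up rows; only the fields used by the function are included.
comments = [
    {"productId": "ph_123456", "isFounderReply": False},
    {"productId": "ph_123456", "isFounderReply": True},
    {"productId": "ph_123456", "isFounderReply": False},
    {"productId": "ph_123456", "isFounderReply": False},
]

print(founder_reply_share(comments))
```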

Output is available as JSON, CSV, or XLSX. Runs can be scheduled daily to capture launches automatically and build a time-series dataset of Product Hunt activity in your target categories.
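If daily runs are exported as JSON, the snapshots can be folded into a per-product upvote history. The sketch below assumes each run yields a list of launch objects with `productId` and `upvotes`, as in the output format above; `merge_snapshots` and the in-memory `history` dictionary are hypothetical and stand in for whatever storage you actually use.

```python
def merge_snapshots(history, snapshot_date, launches):
    """Record each product's upvote count under the date the snapshot was taken."""
    for launch in launches:
        series = history.setdefault(launch["productId"], {})
        series[snapshot_date] = launch["upvotes"]  # re-running a day overwrites that day
    return history


history = {}
# Invented counts for two consecutive daily runs.
merge_snapshots(history, "2026-04-27", [{"productId": "ph_123456", "upvotes": 847}])
merge_snapshots(history, "2026-04-28", [{"productId": "ph_123456", "upvotes": 910}])

print(history["ph_123456"])
```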

Pricing

The actor uses Pay Per Event pricing at $0.005 per product and $0.001 per comment.

Volume                                       Cost
Daily top 50 launches (no comments)          $0.25
Daily top 50 + 100 comments each             $5.25
1 month of daily launches (1,500 products)   $7.50
Full topic history (500 products, AI tag)    $2.50
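The figures in the table follow directly from the two per-event prices. Here is a small estimator for other volumes, assuming every product yields the same number of comments; `estimate_cost` is a hypothetical helper, not part of the actor.

```python
from decimal import Decimal

PRICE_PER_PRODUCT = Decimal("0.005")  # USD, from the pricing above
PRICE_PER_COMMENT = Decimal("0.001")  # USD, from the pricing above


def estimate_cost(products: int, comments_per_product: int = 0) -> Decimal:
    """Estimated run cost in USD for a given number of products and comments per product."""
    product_cost = products * PRICE_PER_PRODUCT
    comment_cost = products * comments_per_product * PRICE_PER_COMMENT
    return product_cost + comment_cost


# The four scenarios from the table.
for products, comments in [(50, 0), (50, 100), (1500, 0), (500, 0)]:
    print(products, comments, f"${estimate_cost(products, comments):.2f}")
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.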

Try it

Product Hunt Scraper on Apify →

Apify has a free tier for testing. Sign up here if you do not have an account. The actor connects to Apify’s scheduling and webhook APIs so you can run daily launch monitoring pipelines and trigger alerts when a competitor or relevant product in your space launches.