
Product Hunt Scraper 2026: Extract Launches, Upvotes & Comments

April 27, 2026 · 6 min read

Product Hunt is the go-to launch platform for indie makers, SaaS startups, and tech products, surfacing hundreds of new launches every week and generating some of the most concentrated early-adopter signal available anywhere on the internet. For startup researchers, VCs, competitive intelligence teams, and growth practitioners, Product Hunt data — launch upvotes, comment volume, maker backgrounds, and launch timing patterns — provides a real-time window into what is getting traction in the tech product market.

Product Hunt does offer an API, but access has historically been restricted to approved partners, and its rate limits make bulk historical collection impractical. Getting structured launch data at scale, particularly for competitive monitoring or trend analysis, requires a different approach.

Why people scrape Product Hunt

Competitive intelligence — Track when competitors launch new products or features on Product Hunt. Monitor upvote trajectory in the first 24 hours as an early signal of market reception before reviews and press coverage catch up.
Market trend research — Identify emerging product categories by analyzing which tags and problem areas are accumulating the most launches and upvotes over time. Spot rising trends in AI tooling, developer infrastructure, or consumer apps months before mainstream coverage.
Investor deal flow — VCs and angels use Product Hunt launch performance as a signal layer for early-stage traction. A product with 500+ upvotes on launch day with positive comments from technical users is worth investigating further.
Maker and founder research — Identify prolific makers who have multiple successful launches. Useful for recruiting, partnership outreach, or understanding who the most productive builders are in a specific niche.
Launch strategy analysis — Analyze which launch day, time, and post structure correlate with higher upvote counts. Build a dataset of successful launches to inform your own launch strategy.
B2B lead generation — Companies that launch on Product Hunt are, by definition, actively building and shipping. Their makers are warm prospects for developer tools, SaaS infrastructure, and B2B services targeting early-stage startups.

What makes Product Hunt hard to scrape

Product Hunt is a React single-page application backed by a GraphQL API. Raw HTTP requests to product pages return a minimal HTML shell with no launch data; all content is hydrated client-side via JavaScript. This rules out simple HTML parsing and requires either browser rendering or reverse-engineering the GraphQL schema.

The pagination and historical data problem: Product Hunt’s public site only shows the most recent launches in each category by default. Accessing historical launch data — everything from a specific date range, or all launches in a specific tag — requires navigating the site’s internal pagination state, which is managed through GraphQL cursor-based pagination rather than URL parameters. A scraper that only handles the visible homepage misses the majority of the dataset that makes trend analysis valuable.
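To make the cursor-based pattern concrete, here is a minimal sketch of a pagination loop. It does not call Product Hunt: `fake_fetch` is a hypothetical in-memory stand-in for a GraphQL request, and the `edges`, `node`, `pageInfo`, `endCursor`, and `hasNextPage` names follow the common Relay connection convention, which is an assumption about the response shape rather than a documented schema.

```python
from typing import Callable, Iterator, Optional


def paginate(fetch_page: Callable[[Optional[str]], dict], max_items: int) -> Iterator[dict]:
    """Yield nodes page by page until the connection is exhausted or max_items is hit."""
    cursor: Optional[str] = None  # None requests the first page
    yielded = 0
    while yielded < max_items:
        page = fetch_page(cursor)
        for edge in page["edges"]:
            if yielded >= max_items:
                return
            yield edge["node"]
            yielded += 1
        info = page["pageInfo"]
        if not info["hasNextPage"]:
            return
        # The server-issued cursor, not a page number, drives the next request
        cursor = info["endCursor"]


# Hypothetical data source standing in for a real GraphQL endpoint
FAKE_POSTS = [{"id": i, "name": f"product-{i}"} for i in range(7)]


def fake_fetch(cursor: Optional[str], page_size: int = 3) -> dict:
    start = 0 if cursor is None else int(cursor)
    chunk = FAKE_POSTS[start:start + page_size]
    end = start + len(chunk)
    return {
        "edges": [{"node": post} for post in chunk],
        "pageInfo": {"endCursor": str(end), "hasNextPage": end < len(FAKE_POSTS)},
    }


all_posts = list(paginate(fake_fetch, max_items=100))
first_five = list(paginate(fake_fetch, max_items=5))
```

Because the cursor lives in the response rather than the URL, a scraper has to carry it from one request to the next, which is why visiting a fixed list of page URLs is not enough here.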

Comment data adds another layer of complexity. Product Hunt discussion threads use nested replies, and loading full threads requires additional API calls beyond the initial product page load. For products with active launch day discussion (100+ comments), extracting the complete conversation requires managing multiple levels of pagination within each thread.
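Once a nested thread has been fetched, it usually needs to be flattened into rows for analysis. The sketch below walks a thread recursively and records each comment's depth and parent. The `replies` key and the added `depth` and `parentId` fields are assumptions made for illustration; they are not part of any documented Product Hunt or actor schema.

```python
from typing import Iterator, Optional


def flatten_thread(comments: list, depth: int = 0,
                   parent_id: Optional[str] = None) -> Iterator[dict]:
    """Depth-first walk over nested comments, yielding one flat row per comment."""
    for comment in comments:
        # Copy every field except the nested replies themselves
        row = {key: value for key, value in comment.items() if key != "replies"}
        row["depth"] = depth
        row["parentId"] = parent_id
        yield row
        yield from flatten_thread(comment.get("replies", []), depth + 1, comment["commentId"])


# Made-up thread used only to exercise the function
thread = [
    {"commentId": "c1", "text": "Congrats on the launch!", "replies": [
        {"commentId": "c2", "text": "Thank you!", "replies": [
            {"commentId": "c3", "text": "Any plans for an API?"},
        ]},
    ]},
    {"commentId": "c4", "text": "How does pricing work?"},
]

rows = list(flatten_thread(thread))
```

Keeping `parentId` on each row means the original tree can be rebuilt later, so nothing is lost by storing comments in a flat table or CSV.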

Product Hunt also uses Cloudflare on its infrastructure, meaning non-browser requests from data center IPs are frequently blocked or return challenge pages before any product data is served. The combination of SPA rendering requirements and Cloudflare protection makes this a target that breaks most simple scraping setups.

How to use the Product Hunt Scraper

We maintain a Product Hunt Scraper on Apify that handles JavaScript rendering, GraphQL pagination, and comment thread extraction. You provide date ranges, topic tags, or specific product URLs; it returns structured launch and engagement data.

Input

Scrape today’s top launches:

{
  "mode": "daily",
  "date": "2026-04-27",
  "maxProducts": 50,
  "includeComments": true
}

Or search by topic tag over a date range:

{
  "mode": "topic",
  "topic": "artificial-intelligence",
  "dateFrom": "2026-01-01",
  "dateTo": "2026-04-27",
  "maxProducts": 500
}

Or scrape one or more specific product URLs:

{
  "mode": "urls",
  "productUrls": [
    "https://www.producthunt.com/posts/example-product"
  ],
  "includeComments": true,
  "maxComments": 200
}
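If you generate these inputs programmatically, it can help to validate them before starting a run. The following is a small sketch for the topic mode: the field names come from the example above, while the helper name `build_topic_input` and the validation rules are hypothetical and may be stricter or looser than what the actor itself enforces.

```python
import json
from datetime import date


def build_topic_input(topic: str, date_from: str, date_to: str,
                      max_products: int = 500) -> dict:
    """Build a topic-mode input dict, rejecting obviously invalid values early."""
    # date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD
    start = date.fromisoformat(date_from)
    end = date.fromisoformat(date_to)
    if start > end:
        raise ValueError("dateFrom must not be after dateTo")
    if max_products <= 0:
        raise ValueError("maxProducts must be a positive integer")
    return {
        "mode": "topic",
        "topic": topic,
        "dateFrom": date_from,
        "dateTo": date_to,
        "maxProducts": max_products,
    }


run_input = build_topic_input("artificial-intelligence", "2026-01-01", "2026-04-27")
run_input_json = json.dumps(run_input, indent=2)
```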

Output

Each product launch returns a structured object:

{
  "productId": "ph_123456",
  "name": "DataSift AI",
  "tagline": "Turn any website into structured data in seconds",
  "description": "DataSift AI uses vision models to extract structured data from any webpage...",
  "url": "https://www.producthunt.com/posts/datasift-ai",
  "websiteUrl": "https://datasift.ai",
  "upvotes": 847,
  "commentsCount": 134,
  "rank": 3,
  "launchDate": "2026-04-27",
  "topics": ["Artificial Intelligence", "Developer Tools", "Productivity"],
  "makers": [
    {
      "name": "Alex Chen",
      "username": "alexchen_builds",
      "profileUrl": "https://www.producthunt.com/@alexchen_builds"
    }
  ],
  "thumbnail": "https://ph-files.imgix.net/...",
  "pricingType": "freemium",
  "featured": true,
  "goldenKitty": false
}

Fields returned per launch

Field	Type	Description
`name`	string	Product name
`tagline`	string	One-line product description
`upvotes`	integer	Total upvote count
`commentsCount`	integer	Number of comments on the launch
`rank`	integer	Daily ranking position (1 = #1 Product of the Day)
`launchDate`	string	Date the product was launched
`topics`	array	Product Hunt topic tags
`makers`	array	Maker profiles with names and PH usernames
`pricingType`	string	free, freemium, paid, or open_source
`featured`	boolean	Whether editorially featured by PH team

Comment output format

When comment extraction is enabled:

{
  "commentId": "cm_789012",
  "productId": "ph_123456",
  "text": "This solves exactly the problem I have with building data pipelines. The vision-based approach is clever...",
  "authorName": "Sarah K.",
  "authorUsername": "sarahk_dev",
  "upvotes": 42,
  "isFounderReply": false,
  "publishedAt": "2026-04-27T11:30:00Z"
}

Output is available as JSON, CSV, or XLSX. Runs can be scheduled daily to capture launches automatically and build a time-series dataset of Product Hunt activity in your target categories.
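As a rough illustration of what you can do with the exported JSON, the sketch below totals launches and upvotes per topic. The first record reuses values from the sample output above; the other two are made-up placeholders, and the helper name `upvotes_by_topic` is hypothetical.

```python
from collections import defaultdict

# First record mirrors the sample output; the rest are invented for illustration
launches = [
    {"name": "DataSift AI", "upvotes": 847,
     "topics": ["Artificial Intelligence", "Developer Tools", "Productivity"]},
    {"name": "Example Notes", "upvotes": 120, "topics": ["Productivity"]},
    {"name": "Example CLI", "upvotes": 300, "topics": ["Developer Tools"]},
]


def upvotes_by_topic(records: list) -> dict:
    """Count launches and sum upvotes for every topic tag in the records."""
    totals = defaultdict(lambda: {"launches": 0, "upvotes": 0})
    for record in records:
        for topic in record.get("topics", []):
            totals[topic]["launches"] += 1
            totals[topic]["upvotes"] += record.get("upvotes", 0)
    return dict(totals)


summary = upvotes_by_topic(launches)
# Topics ordered from most to fewest total upvotes
ranked = sorted(summary.items(), key=lambda item: item[1]["upvotes"], reverse=True)
```

Note that a launch with several topics contributes its full upvote count to each of them, so topic totals overlap and should not be summed across topics.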

Pricing

The actor uses Pay Per Event pricing at $0.005 per product and $0.001 per comment.

Volume	Cost
Daily top 50 launches (no comments)	$0.25
Daily top 50 + 100 comments each	$5.25
1 month of daily launches (1,500 products)	$7.50
Full topic history (500 products, AI tag)	$2.50
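The figures in the table follow directly from the two per-event prices. A minimal estimator, assuming those are the only charges that apply (the function name `estimate_cost` is ours, and any platform usage fees are not modeled):

```python
from decimal import Decimal

# Per-event prices quoted above; Decimal avoids float rounding noise
PRICE_PER_PRODUCT = Decimal("0.005")
PRICE_PER_COMMENT = Decimal("0.001")


def estimate_cost(products: int, comments_per_product: int = 0) -> Decimal:
    """Estimated run cost in USD for a number of products and comments per product."""
    total_comments = products * comments_per_product
    return products * PRICE_PER_PRODUCT + total_comments * PRICE_PER_COMMENT


daily_no_comments = estimate_cost(50)
daily_with_comments = estimate_cost(50, comments_per_product=100)
monthly = estimate_cost(50 * 30)
topic_history = estimate_cost(500)
```

Comments dominate the bill quickly: at 100 comments per product, the comments cost 20 times as much as the products themselves, so `maxComments` is the main lever for controlling spend.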

Try it

Product Hunt Scraper on Apify →

Apify has a free tier for testing. Sign up here if you do not have an account. The actor connects to Apify’s scheduling and webhook APIs so you can run daily launch monitoring pipelines and trigger alerts when a competitor or relevant product in your space launches.