
Scraping G2 Reviews at Scale (2026 Guide)

April 24, 2026 · Web Data Labs
TABLE OF CONTENTS

Why people scrape G2
Why G2 is hard to scrape in 2026
What data you can extract
Actor input schema
Actor output example
Use cases
Pricing
Conclusion

Why people scrape G2

G2 is the largest B2B software review platform in the world, with over 2 million verified reviews across 150,000+ products. Every review includes detailed structured data: star ratings broken into sub-categories, reviewer company size, job function, industry, what they liked, what they disliked, and what problem the software solved. That granularity makes G2 review data uniquely useful for competitive intelligence, product research, and voice-of-customer analysis.

G2 does offer an API — but access is gated behind a G2 partnership, typically reserved for vendors who pay for profile management. There is no self-serve API access for third-party data extraction, which means most teams that need G2 data in bulk have to find another path.

Competitive intelligence

Product and marketing teams at SaaS companies monitor G2 data on competitors as a continuous signal. What are customers consistently praising? What are they complaining about? Which weaknesses appear repeatedly in reviews written in the last 90 days? This is qualitative competitive intelligence that no internal survey can replicate, because it comes directly from users who chose the competitor over you — or vice versa.

Voice of customer for product development

Product managers use G2 review text as a proxy for unfiltered customer feedback. Reviews written by real users about real workflows surface pain points, missing features, and integration gaps that internal NPS surveys miss. Aggregating hundreds of reviews and running thematic analysis on the "What do you dislike?" responses gives a product roadmap signal that is hard to replicate from other sources.

Sales intelligence

Sales teams use G2 category rankings and review metadata to identify prospects: companies that have reviewed competitor software are demonstrably in the market for solutions in that category. Review metadata — company size, industry, job function — lets you filter for exactly the buyer profile you want to reach.

Why G2 is hard to scrape in 2026

G2 protects its review data aggressively, and extracting it reliably requires overcoming several layers of defense:

Cloudflare protection and JS challenges

G2 runs Cloudflare's enterprise-tier WAF with JavaScript challenge pages that block standard HTTP clients and basic headless browser configurations. The challenge verification requires executing JavaScript in a complete browser environment with a matching fingerprint. This is the first gating layer, and it eliminates the majority of naive scraping attempts before a single review renders.

Review pagination with dynamic cursors

G2 does not use simple page number pagination. Reviews are loaded dynamically with cursor-based continuation tokens that are tied to session state. Iterating through hundreds of reviews for a single product requires maintaining session continuity across many requests while pacing to avoid triggering rate limits — a non-trivial engineering problem.
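The cursor loop described above can be sketched generically. This is a minimal illustration, not G2's actual protocol: the page shape (`reviews`, `next_cursor`) and the `fetch_page` callable are assumptions, and the caller is expected to keep one session alive inside `fetch_page`.

```python
import time
from typing import Callable, Iterator, Optional

# Hypothetical page shape: {"reviews": [...], "next_cursor": "abc" or None}
Page = dict


def iter_reviews(
    fetch_page: Callable[[Optional[str]], Page],
    max_reviews: int = 500,
    delay_seconds: float = 2.0,
) -> Iterator[dict]:
    """Yield reviews page by page, following continuation cursors."""
    cursor: Optional[str] = None
    yielded = 0
    while yielded < max_reviews:
        # fetch_page should reuse the same session, since cursors are session-bound
        page = fetch_page(cursor)
        for review in page.get("reviews", []):
            yield review
            yielded += 1
            if yielded >= max_reviews:
                return
        cursor = page.get("next_cursor")
        if not cursor:
            # No continuation token: the last page has been reached
            return
        # Pace requests so the loop stays under rate limits
        time.sleep(delay_seconds)
```

Passing the fetcher in as a callable keeps the pacing and stop conditions testable without any network access.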

Structured data embedded in JavaScript state

Much of G2's review data is not in visible HTML but embedded in JavaScript application state objects. Extracting it requires parsing these state objects correctly across multiple page types (product page, category grid, comparison view), each with a different data structure. Changes in G2's frontend can break state key paths without warning.
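As a rough sketch of state extraction, the snippet below pulls a JSON blob out of a script tag and walks a key path defensively. The tag id `__STATE__` and the key path used in the usage note are invented for illustration; G2's real markup is not documented here and will differ.

```python
import json
import re
from typing import Any

# Assumed markup: <script id="__STATE__" type="application/json">{...}</script>
STATE_RE = re.compile(r'<script[^>]*id="__STATE__"[^>]*>(.*?)</script>', re.DOTALL)


def extract_state(html: str) -> dict:
    """Return the embedded application state as a dict."""
    match = STATE_RE.search(html)
    if match is None:
        raise ValueError("state script tag not found; the page layout may have changed")
    return json.loads(match.group(1))


def dig(state: Any, path: list, default: Any = None) -> Any:
    """Walk a key path so a renamed or missing key yields `default` instead of a crash."""
    current = state
    for key in path:
        if isinstance(current, dict) and key in current:
            current = current[key]
        elif (
            isinstance(current, list)
            and isinstance(key, int)
            and -len(current) <= key < len(current)
        ):
            current = current[key]
        else:
            return default
    return current
```

For example, `dig(state, ["product", "reviews", 0, "title"])` returns `None` rather than raising when a frontend change moves the key, which makes breakage easier to detect and log.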

Login requirements for full review text

G2 increasingly limits full review access to logged-in users. Anonymous visitors see truncated reviews, partial ratings breakdowns, and obfuscated reviewer metadata. A scraper that only handles the unauthenticated layer returns degraded data for the most valuable use cases.

Bottom line: A production-grade G2 scraper in 2026 requires handling Cloudflare JS challenges, cursor-based pagination, JS state extraction, and authenticated sessions. For teams that need this data reliably, the maintenance burden of a custom solution typically outweighs the one-time build cost within a quarter.

What data you can extract

A well-built G2 scraper returns structured records for each review, with metadata about the reviewer and product. Core fields:

| Field | Type | Description |
| --- | --- | --- |
| productName | string | Software product being reviewed |
| overallRating | number | Overall star rating (1–5) |
| easeOfUse | number | Ease of use sub-rating |
| customerSupport | number | Customer support sub-rating |
| valueForMoney | number | Value for money sub-rating |
| reviewTitle | string | Review headline |
| whatLiked | string | Free-text "What do you like best?" response |
| whatDisliked | string | Free-text "What do you dislike?" response |
| recommendations | string | Recommendations to others |
| reviewerRole | string | Reviewer's job function |
| companySize | string | Reviewer's company size (e.g., "51-200 employees") |
| industry | string | Reviewer's industry |
| reviewDate | string | ISO date the review was submitted |
| verifiedUser | boolean | Whether review is G2 verified |

Actor input

The G2 Reviews Scraper accepts product URLs or product names directly. No G2 account or partner access needed.

{
  "productUrls": [
    "https://www.g2.com/products/salesforce/reviews",
    "https://www.g2.com/products/hubspot-crm/reviews"
  ],
  "maxReviewsPerProduct": 500,
  "filterByRating": null,
  "sortBy": "recent"
}

Key input parameters, as shown in the example above: productUrls, maxReviewsPerProduct, filterByRating, and sortBy.

Supplying multiple productUrls in one run is the most efficient way to do competitive sweeps — you can compare 5–10 products in a single actor invocation and receive all results in a unified dataset.
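A multi-product run could be assembled and launched roughly as follows. This is a sketch under assumptions: it uses the third-party `apify-client` package, the `actor_id` argument is a placeholder you would replace with the actor's real identifier, and the URL check is an illustrative guard rather than something the actor requires.

```python
from typing import Optional
from urllib.parse import urlparse


def build_run_input(
    product_urls: list,
    max_reviews_per_product: int = 500,
    filter_by_rating: Optional[int] = None,
    sort_by: str = "recent",
) -> dict:
    """Build the input object shown above, rejecting non-G2 URLs early."""
    for url in product_urls:
        parsed = urlparse(url)
        if parsed.scheme != "https" or parsed.netloc not in ("www.g2.com", "g2.com"):
            raise ValueError(f"not a G2 URL: {url}")
    return {
        "productUrls": list(product_urls),
        "maxReviewsPerProduct": max_reviews_per_product,
        "filterByRating": filter_by_rating,
        "sortBy": sort_by,
    }


def run_actor(token: str, actor_id: str, run_input: dict) -> list:
    """Start a run and collect its dataset items (needs `pip install apify-client`)."""
    from apify_client import ApifyClient  # imported lazily; third-party dependency

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating the input locally before calling the actor means a typo in one URL fails immediately instead of partway through a sweep.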

Actor output

Each result is a flat JSON record representing a single review. Here is a representative example:

{
  "productName": "HubSpot CRM",
  "overallRating": 4.5,
  "easeOfUse": 4.5,
  "customerSupport": 4.0,
  "valueForMoney": 4.0,
  "reviewTitle": "Best free CRM for small teams, but enterprise features need work",
  "whatLiked": "The free tier is genuinely useful and the UI is intuitive enough that our sales team adopted it without training. Contact timeline view is excellent. Integration with Gmail is seamless.",
  "whatDisliked": "Reporting customization is limited on the lower tiers. We hit walls quickly when trying to build anything more complex than the default dashboards. Advanced automation requires a significant plan upgrade.",
  "recommendations": "Great for teams under 50 users who want something they can set up in a day. If you need complex multi-step workflows out of the box, budget for the higher tier.",
  "reviewerRole": "Sales Manager",
  "companySize": "51-200 employees",
  "industry": "Computer Software",
  "reviewDate": "2026-04-10",
  "verifiedUser": true
}
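Downstream code is usually easier to maintain when records like the one above are parsed into a typed structure first. The sketch below is one possible shape, not part of the actor: the `Review` class and `parse_review` helper are hypothetical, and only a subset of fields is mapped.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Review:
    product_name: str
    overall_rating: float
    what_liked: str
    what_disliked: str
    reviewer_role: str
    company_size: str
    industry: str
    review_date: date
    verified_user: bool


def parse_review(record: dict) -> Review:
    """Convert one flat JSON record into a typed Review."""
    return Review(
        product_name=record["productName"],
        overall_rating=float(record["overallRating"]),
        what_liked=record.get("whatLiked", ""),
        what_disliked=record.get("whatDisliked", ""),
        reviewer_role=record.get("reviewerRole", ""),
        company_size=record.get("companySize", ""),
        industry=record.get("industry", ""),
        # reviewDate is an ISO date string such as "2026-04-10"
        review_date=date.fromisoformat(record["reviewDate"]),
        verified_user=bool(record.get("verifiedUser", False)),
    )
```

Parsing `reviewDate` into a real `date` up front makes later filtering by recency a simple comparison.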

Use cases

1. Competitive product intelligence

SaaS product teams run G2 scraping pipelines to monitor competitors continuously. The whatDisliked field ("What do you dislike?") is the highest-signal source: it surfaces specific feature gaps, pricing frustrations, onboarding failures, and support issues that users are willing to put on record publicly. Aggregating 500 recent negative reviews for three competitors and running keyword frequency analysis on the dislike text gives a ranked list of competitor weaknesses that feeds directly into roadmap prioritization and sales battlecard development.
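The keyword frequency step can be approximated in a few lines. This is a deliberately simple sketch: the stopword list is a tiny placeholder, the rating cutoff of 3.0 for "negative" is an assumption, and each keyword is counted once per review so a single long rant does not dominate.

```python
import re
from collections import Counter

# Minimal placeholder list; a real pipeline would use a fuller stopword set
STOPWORDS = {"the", "and", "is", "are", "to", "of", "it", "in", "on", "for",
             "we", "be", "could", "when", "with", "that", "this", "than"}


def dislike_keywords(reviews: list, max_rating: float = 3.0, top_n: int = 10) -> list:
    """Rank words by how many low-rated reviews mention them in whatDisliked."""
    counts: Counter = Counter()
    for review in reviews:
        if review.get("overallRating", 5) > max_rating:
            continue  # keep only negative reviews
        words = re.findall(r"[a-z][a-z'-]+", review.get("whatDisliked", "").lower())
        # Use a set so each word counts once per review
        counts.update({w for w in words if w not in STOPWORDS})
    return counts.most_common(top_n)
```

Running this per competitor and comparing the top terms side by side gives a first-pass view of where each product draws the most complaints.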

2. Review-driven lead generation

Reviewer metadata — company size, industry, job function — makes G2 review data a structured lead source. Companies that have left reviews for competing products in your category are demonstrably using solutions like yours. Their review text tells you what they care about and what frustrates them. Combining this with company lookup data gives sales teams a pre-qualified, insight-rich prospect list that outperforms cold lists.
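Filtering on reviewer metadata might look like the sketch below. The function name and the exact string values are illustrative; the size labels are assumed to follow the format shown in the output example, such as "51-200 employees".

```python
from typing import Iterable, Optional


def filter_prospects(
    reviews: Iterable[dict],
    company_sizes: Iterable[str],
    industries: Optional[Iterable[str]] = None,
) -> list:
    """Keep reviews whose reviewer metadata matches the target buyer profile."""
    wanted_sizes = set(company_sizes)
    wanted_industries = set(industries) if industries else None
    matches = []
    for review in reviews:
        if review.get("companySize") not in wanted_sizes:
            continue
        if wanted_industries is not None and review.get("industry") not in wanted_industries:
            continue
        matches.append(review)
    return matches
```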

3. Voice of customer for positioning

Marketing teams and messaging consultants use G2 review language to extract the exact vocabulary that customers use to describe problems, outcomes, and feature value. This language — pulled verbatim from verified buyers — is more reliable than internal customer interviews for writing landing page copy, ad creative, and sales scripts. The patterns that repeat across hundreds of reviews reflect what actually resonates, not what sounds good internally.

4. Analyst and investor research

Market research firms and investment analysts use G2 review trends as a product-quality signal for software companies. A product with a rapidly improving rating trajectory and increasing review volume over 12 months is a signal that the company is iterating effectively. Declining ratings alongside growing review volume is an early warning sign worth tracking ahead of earnings. This data is public and verifiable in ways that qualitative analyst reports are not.
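The trajectory signal can be computed from review dates and ratings alone. The sketch below buckets reviews by month and flags the "declining ratings alongside growing volume" pattern; comparing only the first and last month is a simplifying assumption, and a real analysis would likely smooth over several months.

```python
from collections import defaultdict
from statistics import mean


def monthly_trend(reviews: list) -> list:
    """Return (month, review_count, average_rating) tuples in chronological order."""
    buckets = defaultdict(list)
    for review in reviews:
        month = review["reviewDate"][:7]  # "2026-04-10" -> "2026-04"
        buckets[month].append(float(review["overallRating"]))
    return [(m, len(r), round(mean(r), 2)) for m, r in sorted(buckets.items())]


def is_warning(trend: list) -> bool:
    """True when volume rose while the average rating fell, first month vs last."""
    if len(trend) < 2:
        return False
    _, first_count, first_avg = trend[0]
    _, last_count, last_avg = trend[-1]
    return last_count > first_count and last_avg < first_avg
```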

Pricing

The G2 Reviews Scraper uses pay-per-result pricing on Apify. You pay only for reviews successfully extracted — no charges for failed requests, retries, or idle compute time.

| Volume | Estimated Cost | Use case |
| --- | --- | --- |
| 100 reviews | ~$0.50–$1.00 | Single product spot check |
| 1,000 reviews | ~$5–$10 | Full product competitive sweep (5–10 products) |
| 10,000 reviews | ~$50–$100 | Category-wide analysis or recurring monitoring pipeline |

Note: Apify's free tier includes $5/month in compute credits, which is sufficient to extract several hundred reviews and verify field coverage for your specific analysis. For proxy-heavy platforms like G2, a service like ScraperAPI can handle raw HTTP fallback for simpler endpoints in complementary workflows.
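The estimates in the pricing table work out to roughly $0.005 to $0.01 per review. That per-review range is inferred from the table rather than quoted from a price list, so treat the helper below as a budgeting sketch only.

```python
def estimate_cost(
    review_count: int,
    low_per_review: float = 0.005,   # inferred: $0.50 / 100 reviews
    high_per_review: float = 0.01,   # inferred: $1.00 / 100 reviews
) -> tuple:
    """Return a (low, high) cost estimate in USD for a given review volume."""
    if review_count < 0:
        raise ValueError("review_count must be non-negative")
    return (
        round(review_count * low_per_review, 2),
        round(review_count * high_per_review, 2),
    )
```

For a ten-product sweep at 500 reviews each, `estimate_cost(10 * 500)` gives the range to budget for a single run.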

Conclusion

G2 holds some of the richest publicly available B2B software feedback data — but extracting it reliably in 2026 requires navigating Cloudflare JS challenges, cursor-based pagination, and session-authenticated requests that make a self-built solution a significant ongoing engineering commitment.

For product teams, sales orgs, and market researchers that need clean, structured G2 review data without building and maintaining that infrastructure, the G2 Reviews Scraper handles it and returns ready-to-use JSON records with full review text, sub-ratings, reviewer metadata, and verification status.

If you are building competitive intelligence systems, review monitoring pipelines, or lead generation tools that rely on G2 data at scale, it is a practical starting point.

Start a free run on Apify →