How to Scrape Shopify Store Data in 2026 (Products, Prices, Inventory)

April 27, 2026

Shopify powers over 4.5 million online stores worldwide, from solo DTC brands to enterprise retailers doing hundreds of millions in annual revenue. For competitive intelligence teams, price monitoring tools, dropshipping operators, and market researchers, Shopify store data — product catalogs, pricing, variant availability, and inventory signals — represents one of the most valuable e-commerce datasets available outside of Amazon.

Unlike Amazon or Walmart, Shopify does not have a centralized marketplace API. Each store is an independent deployment. While Shopify’s platform exposes some data through JSON endpoints, these are inconsistently enabled across stores, rate-limited, and often incomplete. Getting reliable, structured product data across dozens or hundreds of Shopify stores requires a different approach.

Why people scrape Shopify stores

What makes Shopify stores hard to scrape at scale

On the surface, many Shopify stores look accessible. The platform’s JSON product endpoints (/products.json) are widely known and work on any store that has not disabled or blocked them. But this apparent openness hides significant operational complexity when you scale beyond a handful of stores.

The consistency problem: Shopify stores are not a single target — they are millions of independently configured deployments. Some expose full JSON feeds; others disable them, add Cloudflare, or implement custom bot detection at the CDN layer. A scraper that works reliably on one store may fail entirely on another store built on the same platform. Handling this heterogeneity at scale requires per-store detection logic and fallback strategies that most teams do not have time to build.
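As a rough illustration of the per-store detection described above, the sketch below maps the response from a /products.json probe to a strategy label. The `ProbeResult` type, the label names, and the status-code heuristics are assumptions made for this example; they are not taken from the actor's implementation, and real stores may need more signals than these.

```python
from dataclasses import dataclass


@dataclass
class ProbeResult:
    """Hypothetical summary of one GET request to <store>/products.json."""
    status: int
    content_type: str
    body: str


def classify_probe(result: ProbeResult) -> str:
    """Pick a scraping strategy label for a store (labels are illustrative)."""
    content_type = result.content_type.lower()
    looks_like_json = result.body.lstrip().startswith("{")

    if result.status == 200 and "json" in content_type and looks_like_json:
        return "json_feed"          # feed is open: paginate it directly
    if result.status == 429:
        return "backoff"            # rate limited: retry later, more slowly
    if result.status in (403, 503):
        return "browser_fallback"   # likely bot protection in front of the store
    if result.status == 200 and "html" in content_type:
        return "browser_fallback"   # challenge page served with a 200 status
    return "html_fallback"          # feed disabled (e.g. 404): parse storefront pages
```

Keeping the classification separate from the HTTP call makes it easy to re-run the decision on stored responses when a store changes its configuration.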

Beyond access inconsistency, Shopify’s JSON endpoints are paginated with a 250-product limit per page and no standardized pagination token across store versions. Large catalogs with thousands of SKUs require careful pagination management and deduplication. Variant data — sizes, colors, material options — is nested and not always cleanly separated from parent product records, requiring normalization logic that differs across store themes and catalog structures.
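The pagination and deduplication work described above can be sketched as follows. This is a minimal example, assuming the store accepts `limit` and `page` query parameters on /products.json and returns a top-level `products` array with an `id` per product; the function names, the User-Agent string, and the `max_pages` safety cap are choices made for this illustration.

```python
import json
from typing import Callable, Iterator
from urllib.request import Request, urlopen

PAGE_SIZE = 250  # per-page product limit mentioned above


def http_fetch(url: str) -> dict:
    """Fetch a URL and decode the JSON body (no retries or proxy handling)."""
    request = Request(url, headers={"User-Agent": "catalog-monitor/0.1"})
    with urlopen(request, timeout=30) as response:
        return json.load(response)


def iter_products(
    store_url: str,
    fetch: Callable[[str], dict] = http_fetch,
    max_pages: int = 100,
) -> Iterator[dict]:
    """Yield each product once, walking pages until one comes back short or empty."""
    base = store_url.rstrip("/")
    seen_ids = set()
    for page in range(1, max_pages + 1):
        data = fetch(f"{base}/products.json?limit={PAGE_SIZE}&page={page}")
        products = data.get("products", [])
        for product in products:
            # Catalogs can shift between requests, so the same product may
            # appear on two pages; keep only the first occurrence.
            if product["id"] not in seen_ids:
                seen_ids.add(product["id"])
                yield product
        if len(products) < PAGE_SIZE:
            break  # a short or empty page means the catalog is exhausted
```

Passing the fetch function as a parameter keeps the paging logic testable without network access and leaves room to swap in a browser-based fetcher for protected stores.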

Inventory data adds another layer of complexity. Shopify stores can configure inventory tracking per variant, and the available quantity field is often omitted or set to a non-meaningful value for stores using third-party fulfillment. Reading inventory signals correctly requires understanding which stores have configured tracking and which have not.
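One way to avoid misreading quantities is to label each variant's inventory signal before using it. The sketch below assumes variant records carry an optional quantity and an availability flag, as in the output format later in this post; the three labels and the rules behind them are illustrative assumptions, not Shopify-defined states.

```python
from typing import Optional


def inventory_signal(quantity: Optional[int], available: bool) -> str:
    """Label how much trust to place in a variant's quantity field."""
    if quantity is None:
        return "untracked"    # store omits the field: use availability only
    if available and quantity <= 0:
        return "unreliable"   # sellable despite a zero or negative count
    if not available and quantity > 0:
        return "unreliable"   # stock reported but variant is not purchasable
    return "tracked"          # quantity and availability agree
```

Downstream alerts can then be restricted to variants labeled "tracked", falling back to the availability flag for everything else.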

Anti-bot measures have also increased across high-value Shopify stores. Major DTC brands that know their pricing data is commercially sensitive have added Cloudflare Bot Management, behavioral fingerprinting challenges, and IP reputation scoring on top of the standard Shopify stack. These stores effectively close the JSON endpoints to non-browser traffic.

How to use the Shopify Store Scraper

We maintain a Shopify Store Scraper on Apify that handles store-level detection, JSON fallback logic, pagination, and data normalization. You provide store URLs; it returns structured product catalogs with variant-level detail.

Input

Scrape one or more Shopify stores by URL:

{
  "storeUrls": [
    "https://gymshark.com",
    "https://allbirds.com",
    "https://chubbiesshorts.com"
  ],
  "maxProductsPerStore": 500,
  "includeVariants": true,
  "includeInventory": true
}

Or target specific product categories within a store:

{
  "storeUrls": ["https://gymshark.com"],
  "collectionPath": "/collections/mens-t-shirts",
  "maxProductsPerStore": 200
}
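For programmatic runs, the input above can be built and submitted from Python. The sketch below is one possible approach: `build_run_input` and `run_scraper` are helper names invented for this example, the actor ID is left as a parameter because it is not given in this post, and the run call relies on the third-party `apify-client` package, which must be installed separately.

```python
from typing import List, Optional
from urllib.parse import urlparse


def build_run_input(
    store_urls: List[str],
    max_products: int = 500,
    collection_path: Optional[str] = None,
    include_variants: bool = True,
    include_inventory: bool = True,
) -> dict:
    """Assemble the actor input shown above, rejecting malformed values early."""
    for url in store_urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute store URL: {url!r}")
    run_input = {
        "storeUrls": list(store_urls),
        "maxProductsPerStore": max_products,
        "includeVariants": include_variants,
        "includeInventory": include_inventory,
    }
    if collection_path is not None:
        if not collection_path.startswith("/collections/"):
            raise ValueError("collectionPath should start with /collections/")
        run_input["collectionPath"] = collection_path
    return run_input


def run_scraper(token: str, actor_id: str, run_input: dict) -> list:
    """Start the actor, wait for it to finish, and return the dataset items."""
    from apify_client import ApifyClient  # third-party: pip install apify-client

    client = ApifyClient(token)
    run = client.actor(actor_id).call(run_input=run_input)
    return list(client.dataset(run["defaultDatasetId"]).iterate_items())
```

Validating the input locally catches typos such as a missing URL scheme before a paid run is started.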

Output

Each product returns a structured object:

{
  "store": "gymshark.com",
  "productId": "6789012345678",
  "title": "Vital Seamless 2.0 T-Shirt",
  "handle": "vital-seamless-2-0-t-shirt-mens",
  "vendor": "Gymshark",
  "productType": "T-Shirts",
  "tags": ["mens", "seamless", "training"],
  "price": 40.00,
  "compareAtPrice": null,
  "currency": "USD",
  "availableVariants": 18,
  "totalVariants": 24,
  "variants": [
    {
      "variantId": "39876543210123",
      "title": "Small / Black",
      "price": 40.00,
      "sku": "GS-VST-S-BLK",
      "inventoryQuantity": 143,
      "available": true
    },
    {
      "variantId": "39876543210456",
      "title": "Medium / Black",
      "price": 40.00,
      "sku": "GS-VST-M-BLK",
      "inventoryQuantity": 0,
      "available": false
    }
  ],
  "images": ["https://cdn.shopify.com/s/files/..."],
  "publishedAt": "2025-09-14T10:00:00Z",
  "updatedAt": "2026-04-22T08:30:00Z",
  "url": "https://gymshark.com/products/vital-seamless-2-0-t-shirt-mens"
}
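Because variants are nested inside each product, spreadsheet-style analysis usually needs one row per variant. The sketch below flattens records shaped like the example above into CSV using only the standard library; the column selection and the `productTitle` and `variantTitle` column names are choices made for this illustration.

```python
import csv
import io
from typing import Iterable, Iterator

COLUMNS = [
    "store", "productId", "productTitle", "variantId",
    "variantTitle", "sku", "price", "inventoryQuantity", "available",
]


def variant_rows(product: dict) -> Iterator[dict]:
    """Yield one flat row per variant, repeating the parent product fields."""
    for variant in product.get("variants", []):
        yield {
            "store": product["store"],
            "productId": product["productId"],
            "productTitle": product["title"],
            "variantId": variant["variantId"],
            "variantTitle": variant["title"],
            "sku": variant.get("sku"),
            "price": variant["price"],
            "inventoryQuantity": variant.get("inventoryQuantity"),
            "available": variant["available"],
        }


def to_csv(products: Iterable[dict]) -> str:
    """Render all variants of all products as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=COLUMNS)
    writer.writeheader()
    for product in products:
        writer.writerows(variant_rows(product))
    return buffer.getvalue()
```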

Fields returned per product

| Field | Type | Description |
| --- | --- | --- |
| store | string | Source Shopify store domain |
| productId | string | Shopify internal product ID |
| title | string | Product name |
| vendor | string | Brand or manufacturer name |
| price | float | Current selling price |
| compareAtPrice | float | Original price if on sale, else null |
| availableVariants | integer | Variant count currently in stock |
| variants | array | All size/color variants with price and inventory |
| tags | array | Store-assigned product tags |
| publishedAt | string | When product was first listed |
| updatedAt | string | Last catalog update timestamp |

Output is available as JSON, CSV, or XLSX. Runs can be scheduled on Apify to monitor specific stores on a daily or hourly cadence, enabling automated price change detection and inventory alerts.

Common use case: price change monitoring

A typical workflow is to run the scraper daily against a list of competitor stores, then diff the results against the previous run. Any product where price changed or availableVariants dropped to zero triggers a downstream alert — a Slack message, a webhook to your pricing tool, or a row in a Google Sheet.

Apify’s scheduling and webhook APIs make this straightforward to automate without managing infrastructure. You configure the actor run, set a daily schedule, and point the output to your destination. No servers required.
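The diff step in the workflow above can be sketched as follows. It assumes both runs have been loaded as lists of product objects in the output format shown earlier; the alert dictionaries and the `price_change` and `sold_out` type names are hypothetical, and delivery to Slack, a webhook, or a sheet is left out.

```python
from typing import Dict, List


def index_by_id(products: List[dict]) -> Dict[str, dict]:
    """Map productId to product so two runs can be compared directly."""
    return {product["productId"]: product for product in products}


def detect_changes(previous: List[dict], current: List[dict]) -> List[dict]:
    """Return one alert per price change or newly sold-out product."""
    previous_by_id = index_by_id(previous)
    alerts = []
    for product in current:
        old = previous_by_id.get(product["productId"])
        if old is None:
            continue  # new listing: nothing to compare against yet
        if product["price"] != old["price"]:
            alerts.append({
                "productId": product["productId"],
                "type": "price_change",
                "old": old["price"],
                "new": product["price"],
            })
        # Alert only on the transition to zero, not on every later run.
        if product["availableVariants"] == 0 and old["availableVariants"] > 0:
            alerts.append({"productId": product["productId"], "type": "sold_out"})
    return alerts
```

Comparing against the previous run rather than a fixed baseline means each change is reported once, on the day it happens.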

Pricing

The actor uses Pay Per Event pricing at $0.005 per product.

| Volume | Cost |
| --- | --- |
| 100 products | $0.50 |
| 500 products | $2.50 |
| Full store (1,000 SKUs) | $5.00 |
| Daily monitor (5 stores × 200 SKUs) × 30 days | $150/month |
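To budget for other volumes, the same arithmetic can be wrapped in a small helper. This sketch assumes the flat $0.005 per product rate stated above is the only charge; `estimate_cost` is a name made up for this example, and any platform fees on the Apify side are not modeled.

```python
from decimal import Decimal

PRICE_PER_PRODUCT = Decimal("0.005")  # USD per product, from the rate above


def estimate_cost(products_per_run: int, runs: int = 1) -> Decimal:
    """Estimated cost in USD for a number of runs at the per-product rate."""
    return PRICE_PER_PRODUCT * products_per_run * runs


# One-off scrape of a 1,000-SKU store:
single_store = estimate_cost(1000)

# Daily monitor of 5 stores with 200 SKUs each, over 30 days
# (30,000 billed products in total):
monthly_monitor = estimate_cost(5 * 200, runs=30)
```

Using `Decimal` rather than floats keeps the fractional-cent rate exact when it is multiplied across thousands of products.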

Try it

Shopify Store Scraper on Apify →

Apify has a free tier for testing; sign up if you do not have an account. The actor integrates with Apify’s scheduling, webhook, and dataset APIs so you can run automated competitor monitoring pipelines without managing any infrastructure.