How to Get NASA Astronomy Picture of the Day Data Programmatically (APOD API Guide)

May 3, 2026 · 5 min read

NASA's Astronomy Picture of the Day has been running since June 16, 1995. Every day for more than three decades, an astronomer has hand-picked an image of the universe (a galaxy, a comet, a Mars rover panorama, a Hubble deep field) and written a short explanation aimed at a curious general audience. There are now more than 11,000 entries in the archive, and the project is still going. As a free source of high-quality, well-curated, clearly credited astronomy media, APOD has no real equivalent.

If you're building anything around space or science education, or you simply need a beautiful image and a paragraph of context per day, APOD is the obvious data source. The catch is the same as with most government datasets: it exists, it's free, and the moment you try to use it programmatically at any scale you start running into edge cases. This post walks through what APOD data actually looks like, what people use it for, and how to pull it cleanly without writing or maintaining the integration yourself.

What is APOD and what's in each entry

Each APOD entry is a structured record. The interesting fields are the picture URL (url), an HD version (hdurl), a title, a longer explanation written by the curating astronomer, the date, a media_type tag, and an optional copyright credit when the image isn't public domain. About 5–10% of entries are videos rather than images: YouTube embeds, Vimeo clips, or interactive panoramas. Anything you build needs to handle both.
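To make the image-versus-video point concrete, here is a minimal Python sketch of how a renderer might branch on an entry. It assumes the field names shown in the sample record later in this post (media_type, url, hdurl); the render_target helper and its "image"/"embed" labels are made up for illustration and are not part of any API.

```python
from typing import Optional, TypedDict


class RenderTarget(TypedDict):
    kind: str            # "image" or "embed" (labels invented for this sketch)
    src: Optional[str]   # URL to render, or None if the record has none


def render_target(record: dict) -> RenderTarget:
    """Decide how to display one APOD record.

    Images prefer the HD URL and fall back to the standard one.
    Anything that is not tagged as an image is treated as an embed.
    """
    if record.get("media_type") == "image":
        return {"kind": "image", "src": record.get("hdurl") or record.get("url")}
    return {"kind": "embed", "src": record.get("url")}
```

A widget would then emit an image tag for "image" targets and an iframe or plain link for "embed" targets, so a video day never produces a broken picture.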

The dataset is small enough to walk in full (a single request per day since 1995) but large enough that you don't want to babysit a backfill. The image hosting also moves around occasionally, and older entries sometimes have inconsistent metadata — missing copyright fields, broken thumbnail URLs, slightly different HTML formatting in the explanation. Anyone who has tried to write their own APOD integration has hit at least one of these.

Who actually uses this data

Space and science dashboards — Show today's APOD image plus a rotating set of recent ones on a homepage or status board. Common in classrooms, planetariums, and "smart mirror" style projects.
Wallpaper and lock-screen apps — A daily refresh of the HD image is one of the simplest, most-loved features you can ship in a wallpaper app. The copyright field decides whether you can redistribute it or just link out.
Newsletter and bot automation — Daily emails, Telegram channels, Bluesky bots, Discord servers. Pull the day's title, explanation, and HD URL and you have a finished post with zero writing.
Educational tools and curriculum builders — Teachers and edtech products use APOD to source authoritative images for lesson plans. The fact that each entry comes with a written explanation by a working astronomer is the killer feature here — you can't easily replicate that with stock photography.
Datasets for ML and search — A clean archive of 11,000+ captioned astronomy images is a useful starting point for image classification, captioning, and semantic search projects.
Media and content sites — Space news sites, science aggregators, and "today in space" widgets all benefit from a structured feed they can render however they want.

Why pulling it yourself is more work than it looks

The official APOD endpoint is rate-limited per API key, returns slightly different shapes depending on the parameters you send, and gets cranky on long date ranges. Image URLs change hosting occasionally. Some entries return a video URL where you'd expect an image. Copyright fields are inconsistent. Backfilling thirty years of history without rate-limit retries and schema normalization is a weekend you're not getting back.
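For a sense of what the do-it-yourself version involves, here is a rough Python sketch of the two pieces most backfills end up needing: splitting a long date range into smaller windows, and retrying with exponential backoff when a request is rate-limited. The fetch callable, the RateLimited exception, the 30-day window, and the retry settings are all assumptions of this sketch, not behavior documented by NASA; plug in whatever HTTP call you actually use.

```python
import time
from datetime import date, timedelta
from typing import Callable, Iterator, List, Tuple

Fetcher = Callable[[date, date], List[dict]]


class RateLimited(Exception):
    """Hypothetical error a fetcher raises when the server answers HTTP 429."""


def date_windows(start: date, end: date, days: int = 30) -> Iterator[Tuple[date, date]]:
    """Split [start, end] into consecutive inclusive windows of at most `days` days."""
    cursor = start
    while cursor <= end:
        window_end = min(cursor + timedelta(days=days - 1), end)
        yield cursor, window_end
        cursor = window_end + timedelta(days=1)


def fetch_with_backoff(
    fetch: Fetcher,
    start: date,
    end: date,
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Call fetch(start, end), doubling the wait after each rate-limit error."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fetch(start, end)
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide
            sleep(base_delay * 2 ** attempt)
    return []  # not reached; keeps type checkers happy


def backfill(fetch: Fetcher, start: date, end: date) -> List[dict]:
    """Walk the whole range window by window and collect every record."""
    records: List[dict] = []
    for window_start, window_end in date_windows(start, end):
        records.extend(fetch_with_backoff(fetch, window_start, window_end))
    return records
```

This still leaves schema normalization, video handling, and resumable progress to write, which is where the weekend goes.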

Most production builds also put residential proxies in front of any large historical fetch. That's not because APOD is hostile, but because shared cloud IPs tend to eat 429s long before per-key quotas do.

The easier path: a managed actor

We built and maintain cryptosignals/nasa-apod-scraper on Apify so you don't have to. It handles the rate limits, the schema normalization, and the video-vs-image handling. You give it a small JSON input describing what you want, and it returns clean structured records. Pay-per-result, no API key juggling.

Get today's picture

{
  "action": "today"
}

Returns:

[
  {
    "date": "2026-05-03",
    "title": "M31: The Andromeda Galaxy",
    "explanation": "The most distant object easily visible to the unaided eye...",
    "url": "https://apod.nasa.gov/apod/image/2605/M31_HubbleSpitzerGendler_960.jpg",
    "hdurl": "https://apod.nasa.gov/apod/image/2605/M31_HubbleSpitzerGendler_4096.jpg",
    "media_type": "image",
    "copyright": "Robert Gendler"
  }
]
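If you would rather trigger the actor from code than from the Apify console, a sketch along these lines should work with the apify-client Python package. The run_apod_actor helper is our own illustrative name, and reading the token from an APIFY_TOKEN environment variable is an assumption; the client is passed in as a parameter so the function is easy to exercise with a stub.

```python
from typing import Any, Dict, List

ACTOR_ID = "cryptosignals/nasa-apod-scraper"


def run_apod_actor(client: Any, run_input: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Run the actor once and return its dataset items as a list of dicts.

    `client` is expected to behave like apify_client.ApifyClient.
    """
    run = client.actor(ACTOR_ID).call(run_input=run_input)
    if run is None:
        raise RuntimeError("actor run did not return a result")
    dataset_id = run["defaultDatasetId"]
    return list(client.dataset(dataset_id).iterate_items())


# Example usage (needs `pip install apify-client` and a valid token):
#
#   import os
#   from apify_client import ApifyClient
#
#   client = ApifyClient(os.environ["APIFY_TOKEN"])
#   for record in run_apod_actor(client, {"action": "today"}):
#       print(record["date"], record["title"])
```

The same helper works for the other inputs in this post; only the run_input dict changes.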

Get a random batch (great for wallpaper apps)

{
  "action": "random",
  "count": 5
}
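For the wallpaper case, a random batch usually needs one more pass before it reaches a user's screen: drop the video entries and decide what to do with credited images. The Python sketch below assumes the record fields shown in the sample output above. The wallpaper_candidates helper and its link_only flag are illustrative, and treating every credited image as link-out only is a deliberately conservative choice of this sketch rather than a rule from NASA or the actor.

```python
from typing import Dict, Iterable, List


def wallpaper_candidates(records: Iterable[Dict]) -> List[Dict]:
    """Keep only records that can be shown as a still image.

    Each result carries the best available image URL plus a `link_only`
    flag that is True whenever the record has a copyright credit.
    """
    candidates = []
    for record in records:
        if record.get("media_type") != "image":
            continue  # skip videos and other embeds
        src = record.get("hdurl") or record.get("url")
        if not src:
            continue  # nothing to display
        credit = record.get("copyright")
        candidates.append(
            {
                "date": record.get("date"),
                "title": record.get("title"),
                "src": src,
                "credit": credit,
                "link_only": bool(credit),
            }
        )
    return candidates
```

Asking for a slightly larger count than you need gives the filter room to discard the occasional video without leaving a gap in the rotation.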

Backfill a date range

{
  "action": "range",
  "start_date": "2025-01-01",
  "end_date": "2025-12-31"
}

The output is one normalized record per day, with consistent field names, the media type clearly tagged, and broken/missing fields handled. Whatever you do with the result — render a widget, send an email, fill a database, train a model — you start from clean data instead of a pile of edge cases.
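As one example of the "fill a database" path, here is a small Python sketch that stores records in SQLite, keyed by date so that re-running a backfill overwrites rows instead of duplicating them. The table name, the column list (taken from the sample record above), and the store_records helper are assumptions of this sketch.

```python
import sqlite3
from typing import Dict, Iterable

FIELDS = ("date", "title", "explanation", "url", "hdurl", "media_type", "copyright")

SCHEMA = """
CREATE TABLE IF NOT EXISTS apod (
    date        TEXT PRIMARY KEY,
    title       TEXT,
    explanation TEXT,
    url         TEXT,
    hdurl       TEXT,
    media_type  TEXT,
    copyright   TEXT
)
"""


def store_records(conn: sqlite3.Connection, records: Iterable[Dict]) -> int:
    """Insert or overwrite one row per record and return how many were written.

    Fields missing from a record are stored as NULL.
    """
    rows = [tuple(record.get(field) for field in FIELDS) for record in records]
    placeholders = ", ".join("?" for _ in FIELDS)
    sql = f"INSERT OR REPLACE INTO apod ({', '.join(FIELDS)}) VALUES ({placeholders})"
    with conn:  # commits on success, rolls back on error
        conn.execute(SCHEMA)
        conn.executemany(sql, rows)
    return len(rows)
```

With sqlite3.connect("apod.db") as the connection, running the range input once a year and the today input daily keeps the table current without any deduplication logic.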

If APOD is just one input in a larger space or science product, the actor pattern composes well: fetch APOD daily, store the records, and combine with whatever other feeds you want (launches, telescope schedules, near-Earth object data) without each integration becoming its own maintenance burden.

Try it

Run cryptosignals/nasa-apod-scraper on Apify. The free credits you get on signup cover thousands of records, which is more than enough to power the first version of whatever you're building.