CosmiQ Works and AI.Reverie Release Largest High-Resolution Dataset of Real and Synthetic Overhead Imagery for Open Use

RarePlanes dataset experiments show synthetic data can train robust computer vision algorithms

Example of the real and synthetic datasets present in RarePlanes. The top two rows feature the real Maxar WorldView-3 satellite imagery and the bottom two rows show the AI.Reverie synthetic data. The dataset features variable weather conditions, biomes, and ground surface types. (Photo: Business Wire)

NEW YORK--(BUSINESS WIRE)--CosmiQ Works and AI.Reverie today released RarePlanes, now the largest openly available, very-high-resolution dataset that can test the value of synthetic data from an overhead perspective. The complexity of overhead datasets like this provides one of the best accelerants for new artificial intelligence methods, specifically computer vision, that can advance both aerial and on-the-ground applications like autonomous driving.

CosmiQ Works, an applied research initiative within IQT Labs, and AI.Reverie, a leader in synthetic data technology, leveraged RarePlanes in a newly released case study that showed how synthetic data can be used to bootstrap algorithms when real data is insufficient. Most impressively, the tests found that fine-tuning a model trained on synthetic data with just 10% of the real (observed) dataset achieved performance on par with a model trained on 100% of the real data.

Significantly reducing the need for and reliance on real data, which is slow, expensive, and often difficult to procure, could open the floodgates to widespread adoption of computer vision across industries and applications.

“We open this dataset to empower great minds everywhere to test the value of synthetic data for themselves,” said AI.Reverie co-founder and CEO Daeil Kim. “We’re inspired to see these results in our own experiments with our partners at CosmiQ.”

RarePlanes consists of 253 Maxar WorldView satellite scenes with 14,700 hand annotations and 50,000 synthetic satellite images with 630,000 annotations of various aircraft types. The high number of labeled attributes enables creation of up to 110 custom classes for research. The dataset is available for free download through Amazon Web Services’ Open Data Program.

“RarePlanes helps tackle some of the unique challenges overhead imagery presents with a new class of data generated by AI.Reverie’s novel simulation platform,” said CosmiQ Works Research Scientist Jake Shermeyer. “Our intent is to provide a launchpad for new techniques and applications with this expansive open dataset.”

RarePlanes research authors Shermeyer and Thomas Hossler, AI.Reverie Computer Vision Scientist, will lead a discussion of the study on Wednesday, July 8, at 9 a.m. PDT. Register here:

About AI.Reverie

AI.Reverie is a simulation platform that trains AI to understand the world. It offers a suite of synthetic data and vision APIs to help businesses across different industries train their machine learning algorithms and improve their AI applications, along with benchmarking services to measure the impact.

About CosmiQ Works

Founded in 2015 as an applied research initiative within IQT Labs, CosmiQ Works is focused on developing, prototyping, and evaluating emerging open source artificial intelligence capabilities for geospatial use cases. CosmiQ Works helps accelerate development and adoption of these technologies into deployable products and ongoing research and development initiatives.


Perrin Lawrence
Head of Communications

