# RapidCrops

## Data coverage

<img src="https://data.source.coop/planet/rapidcrops/docs/RapidCrops_coverage.png" alt="Banner image showing geographical extent of the dataset." width="50%"/>

Dense temporal coverage (2018-2022) across seven countries:
- Austria
- Denmark
- Spain (partial coverage in Catalonia only)
- France
- Germany (partial coverage in Brandenburg, North Rhine-Westphalia & Lower Saxony)
- Netherlands
- Portugal

Additional years of coverage are included for:
- Germany (2023 & 2024)
- France (2023)
- Netherlands (2023 & 2024)

Datasets are published in two forms:
1. "National": National datasets are provided individually for each year of coverage.
2. "Combined": For each year, all national datasets are combined into a single dataset partitioned by country.

## Data structure

Because all data lies within Europe, all boundary geometries are provided in `EPSG:3035` (ETRS89-extended / LAEA Europe).

Below is an example GeoParquet schema for the France 2020 dataset:

<img src="https://data.source.coop/planet/rapidcrops/docs/RapidCrops_table_example.png" alt="Table schema for France 2020 dataset." width="100%"/>

## Usability and efficiency

Redundant vertices were removed from the boundary polygons using a tolerance of 20 cm, small enough to preserve the integrity of each parcel.
This reduced the number of vertices by more than 25%, allowing for more efficient spatial queries.
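
As a rough illustration of tolerance-based vertex removal, the sketch below applies the Douglas-Peucker algorithm to one section of a boundary with a 0.2 m tolerance. The algorithm actually used to produce RapidCrops is not stated here, so Douglas-Peucker is an assumption, and the function names and coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification of an open sequence of vertices."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the end points.
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        # Every intermediate vertex is within tolerance: drop them all.
        return [points[0], points[-1]]
    left = simplify(points[: index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right


# Hypothetical boundary section in metres (EPSG:3035 units are metres).
section = [(0.0, 0.0), (10.0, 0.05), (20.0, -0.1), (30.0, 0.0), (30.0, 25.0)]
simplified = simplify(section, tolerance=0.2)
print(len(section), "->", len(simplified), "vertices")
```

The two vertices that deviate by 10 cm or less from the straight edge are dropped, while the corner is kept.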

## Assessing suitability for combining parcels with EO products

Whether clean pixels can be extracted from satellite imagery for a parcel depends on both the size of the parcel and the regularity of its shape.
RapidCrops provides attributes to help quantify these properties:
- `parcel_width`: The length of the shorter side of the parcel boundary's minimum rotated rectangle
- `area_ratio_hull`: The ratio of the parcel's area to the area of its convex hull
- `max_contained_radius`: The radius of the parcel boundary's maximum inscribed circle

These attributes can help to filter out parcels that are too small or irregular for meaningful analysis, or to explain variation in performance across parcels.
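
To make the definitions concrete, here is a minimal pure-Python sketch that computes two of these metrics, `parcel_width` and `area_ratio_hull`, for a hypothetical L-shaped parcel. It is not the code used to build RapidCrops: the helper names and coordinates are made up, and the maximum inscribed circle is left out because it is better computed with a geometry library such as Shapely.

```python
import math


def polygon_area(ring):
    """Area of a simple polygon from its vertices (shoelace formula)."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Convex hull in counter-clockwise order (Andrew's monotone chain)."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def min_rotated_rectangle_sides(hull):
    """Side lengths of the minimum-area rectangle enclosing the hull.

    The minimum-area rectangle always shares a direction with one hull edge,
    so it is enough to test one candidate rectangle per edge.
    """
    best = None
    for (x1, y1), (x2, y2) in zip(hull, hull[1:] + hull[:1]):
        length = math.hypot(x2 - x1, y2 - y1)
        ux, uy = (x2 - x1) / length, (y2 - y1) / length
        along = [x * ux + y * uy for x, y in hull]
        across = [-x * uy + y * ux for x, y in hull]
        sides = (max(along) - min(along), max(across) - min(across))
        if best is None or sides[0] * sides[1] < best[0] * best[1]:
            best = sides
    return best


# Hypothetical L-shaped parcel, coordinates in metres.
parcel = [(0.0, 0.0), (40.0, 0.0), (40.0, 10.0), (10.0, 10.0), (10.0, 30.0), (0.0, 30.0)]
hull = convex_hull(parcel)

parcel_width = min(min_rotated_rectangle_sides(hull))
area_ratio_hull = polygon_area(parcel) / polygon_area(hull)
print(f"parcel_width={parcel_width:.1f} m, area_ratio_hull={area_ratio_hull:.2f}")
```

A compact, rectangular parcel has an `area_ratio_hull` close to 1, while the concave L-shape above scores noticeably lower.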

These metrics can be used to identify suitable parcels for analysis by a given EO sensor, as highlighted below:

<img src="https://data.source.coop/planet/rapidcrops/docs/eo_suitability_metrics.png" alt="Parcel suitability metrics for analysis with EO sensors." width="80%"/>
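
A filter based on these attributes might look like the sketch below. The thresholds (three pixels across, an inscribed radius of at least one pixel, a hull ratio of at least 0.8) are illustrative assumptions rather than recommendations from the dataset authors, the attribute values are assumed to be in metres, and the parcel records are invented.

```python
def is_suitable(parcel, pixel_size_m, min_pixels_across=3.0, min_hull_ratio=0.8):
    """Return True if a parcel looks large and regular enough for a pixel size.

    All thresholds are hypothetical defaults; tune them for the sensor and task.
    """
    wide_enough = parcel["parcel_width"] >= min_pixels_across * pixel_size_m
    roomy_enough = parcel["max_contained_radius"] >= pixel_size_m
    regular_enough = parcel["area_ratio_hull"] >= min_hull_ratio
    return wide_enough and roomy_enough and regular_enough


# Invented example records carrying the three RapidCrops shape attributes.
parcels = [
    {"id": "a", "parcel_width": 120.0, "area_ratio_hull": 0.95, "max_contained_radius": 55.0},
    {"id": "b", "parcel_width": 25.0, "area_ratio_hull": 0.90, "max_contained_radius": 11.0},
    {"id": "c", "parcel_width": 80.0, "area_ratio_hull": 0.55, "max_contained_radius": 20.0},
]

# Compare a 10 m sensor with a 3 m sensor.
suitable_10m = [p["id"] for p in parcels if is_suitable(p, pixel_size_m=10.0)]
suitable_3m = [p["id"] for p in parcels if is_suitable(p, pixel_size_m=3.0)]
print(suitable_10m, suitable_3m)
```

The same comparisons can be applied as boolean masks on the corresponding columns of a GeoDataFrame.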

## Facilitating spatial stratification

Grid cell IDs from two popular spatial gridding systems (Quadkey & H3) are included for each boundary to allow for spatially-stratified sampling. Both gridding systems are hierarchical, so coarser grid resolutions can be derived from the values provided according to the user’s need.

Below is the quadkey spatial gridding at the resolution provided:

<img src="https://data.source.coop/planet/rapidcrops/docs/spatial_stratification_fine.png" alt="Quadbin keys at fine resolution." width="30%"/>


Coarser-resolution gridding can be derived by keeping only the leftmost characters of the key:

<img src="https://data.source.coop/planet/rapidcrops/docs/spatial_stratification_coarse.png" alt="Quadbin keys at coarser resolution." width="30%"/>
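
The truncation described above can be sketched as follows, together with a simple per-cell sample. The `quadkey` field name, the key values, and the helper functions are hypothetical; check the dataset schema for the actual column name and key length.

```python
import random
from collections import defaultdict


def coarsen_quadkey(quadkey, level):
    """Return the ancestor cell at `level` by keeping the leftmost characters."""
    if not 1 <= level <= len(quadkey):
        raise ValueError("level must be between 1 and the length of the key")
    return quadkey[:level]


def stratified_sample(records, level, per_cell, seed=0):
    """Draw up to `per_cell` records from each coarse quadkey cell."""
    rng = random.Random(seed)
    cells = defaultdict(list)
    for record in records:
        cells[coarsen_quadkey(record["quadkey"], level)].append(record)
    sample = []
    for cell in sorted(cells):
        members = cells[cell]
        sample.extend(rng.sample(members, min(per_cell, len(members))))
    return sample


# Invented parcel records with made-up quadkeys.
records = [
    {"id": 1, "quadkey": "1202102332"},
    {"id": 2, "quadkey": "1202102333"},
    {"id": 3, "quadkey": "1202102301"},
    {"id": 4, "quadkey": "1202130012"},
    {"id": 5, "quadkey": "1202130013"},
]

sample = stratified_sample(records, level=6, per_cell=1)
print([(r["id"], coarsen_quadkey(r["quadkey"], 6)) for r in sample])
```

Each quadkey character encodes one level of the hierarchy, so a key truncated to `n` characters identifies the containing cell at level `n`.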

## Accessing the data

```python
import geopandas as gpd

url = "s3://planet/rapidcrops/national/AUT/rapidcrops_2018_aut.geo.parquet"

# Retrieve a full dataset
df = gpd.read_parquet(url)

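# OR, filter the dataset on a bbox for faster retrieval (requires GeoPandas 1.0 or later).
# bbox is (minx, miny, maxx, maxy) in the dataset CRS (EPSG:3035); the values below are illustrative only.
bbox = (4780000, 2800000, 4800000, 2820000)
df = gpd.read_parquet(url, bbox=bbox)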
```

## Underlying data licenses

The data is made available subject to the licenses associated with the underlying source datasets. No additional limitations or conditions are applied. See below for links to the underlying license information for each source dataset:

| Source dataset        | Dataset license       |
|-----------------------|-----------------------|
| Austria               | [INSPIRE public access license](http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations) & [CC-BY-AT 4.0](https://creativecommons.org/licenses/by/4.0/)                     |
| Denmark               |   [INSPIRE public access license](http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations) & [INSPIRE no conditions](http://inspire.ec.europa.eu/metadata-codelist/ConditionsApplyingToAccessAndUse/noConditionsApply) & [CC0 1.0 Universal](https://creativecommons.org/publicdomain/zero/1.0/deed.en)                   |
| France                |   [Custom open license](https://www.etalab.gouv.fr/wp-content/uploads/2014/05/Open_Licence.pdf)                    |
| Germany               |   Custom open licenses: [NRW](http://dcat-ap.de/def/licenses/dl-by-de/2.0), [Brandenburg](https://www.govdata.de/dl-de/by-2-0), [LS](http://dcat-ap.de/def/licenses/dl-by-de/2.0)                    |
| Netherlands           |   [INSPIRE public access license](http://inspire.ec.europa.eu/metadata-codelist/LimitationsOnPublicAccess/noLimitations) & [INSPIRE no conditions](http://inspire.ec.europa.eu/metadata-codelist/ConditionsApplyingToAccessAndUse/noConditionsApply) & [Dutch creative commons license](http://creativecommons.org/publicdomain/mark/1.0/deed.nl)                    |
| Portugal              |  [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)                     |
| Spain                 |  [Custom open license](https://administraciodigital.gencat.cat/ca/dades/dades-obertes/informacio-practica/llicencies/#limitacio-de-responsabilitat)                     |

## Suggested citation

```
Holden, P., Davis, T., Holmes, C., Senaras, C., & Wania, A. (2025). RapidCrops: A pan-European label dataset for large-scale crop classification (v1.0) [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15166359
```

In BibTeX format:

```
@dataset{rapidcrops,
  title = {RapidCrops: A pan-European label dataset for large-scale crop classification},
  publisher = {Zenodo},
  doi = {10.5281/zenodo.15166359},
  year = {2025},
  urldate = {<date of access, ISO>},
  author = {
    Holden, Piers and
    Davis, Timothy and
    Holmes, Christopher and
    Senaras, Caglar and
    Wania, Annett
  },
  version = {1.0},
}
```