39. SPW GeoParquet¶
The Standard Planting Window (SPW) dataset provides daily planting recommendations and related indicators for each farm/grid.
Data is stored as partitioned GeoParquet files in the GAP bucket.
39.1 Prerequisites¶
Python 3.9+
DuckDB with extensions:
httpfs(S3/HTTP access)spatial(GEOMETRY support)
Read-only GAP object storage credentials
39.1.1 Environment variables¶
export S3_ENDPOINT_URL="https://fra1.digitaloceanspaces.com"
export S3_ACCESS_KEY_ID="YOUR_READ_ONLY_KEY"
export S3_SECRET_ACCESS_KEY="YOUR_READ_ONLY_SECRET"
export S3_BUCKET_NAME="bucket-name"
39.2 Path layout¶
SPW is partitioned by year and month:
An example would be:
Always pass hive_partitioning=true to include partition columns.
39.3 Column descriptions¶

| Column | Type | Description |
|---|---|---|
date |
DATE | Record date |
year |
INT | Partition year |
farm_id |
BIGINT | Internal numeric farm id |
farm_unique_id |
VARCHAR | External farm identifier |
country |
VARCHAR | Country name |
farm_group |
VARCHAR | Farm group (e.g. “KALRO 2025 A”) |
farm_group_id |
BIGINT | Group identifier |
grid_id |
BIGINT | Numeric grid identifier |
grid_unique_id |
VARCHAR | Unique grid identifier |
geometry |
GEOMETRY | Farm/grid geometry |
signal |
VARCHAR | Decision signal (e.g. “Do NOT plant, DRY Tier 4b”) |
last_2_days |
DOUBLE | Rainfall/indicator over last 2 days |
last_4_days |
DOUBLE | Rainfall/indicator over last 4 days |
today_tomorrow |
DOUBLE | Forecast indicator for today + tomorrow |
too_wet_indicator |
VARCHAR | Text category (e.g. “Not too wet to plant”) |
39.4 Example queries¶
39.4.1 Filter by date¶
from datetime import date
import os, duckdb
conn = duckdb.connect()
conn.install_extension("httpfs"); conn.load_extension("httpfs")
conn.install_extension("spatial"); conn.load_extension("spatial")
spw_date = date.fromisoformat("2025-09-10")
spw_path = (
f"s3://{os.environ['S3_BUCKET_NAME']}/"
f"staging/spw_geoparquet/year={spw_date.year}/month={spw_date.month}.parquet"
)
q = f"""
SELECT *
FROM read_parquet('{spw_path}', hive_partitioning=true)
WHERE date = '{spw_date}'
LIMIT 10;
"""
conn.sql(q).show()

39.4.2 Filter by farm_unique_id¶
farm_unique_id = '4023361'
conn.sql(f"""
CREATE OR REPLACE TABLE spw_geoparquet AS
SELECT *
FROM read_parquet('{spw_path}', hive_partitioning=true)
WHERE date BETWEEN '2025-09-05' AND '2025-09-10'
""")
conn.sql(f"""
SELECT date, farm_unique_id, farm_group, signal, too_wet_indicator
FROM spw_geoparquet
WHERE farm_unique_id = '{farm_unique_id}'
ORDER BY date DESC
""").show()
39.4.3 Export to CSV¶
import os
os.makedirs("output_csv", exist_ok=True)
export_q = f"""
SELECT *
FROM spw_geoparquet
WHERE farm_unique_id = '{farm_unique_id}'
"""
conn.sql(f"COPY ({export_q}) TO 'output_csv/spw_geoparquet_{farm_unique_id}.csv' (HEADER, DELIMITER ',');")
