Yield Prediction Using Satellite Data and Machine Learning

October 26, 2023

Efficient water resource management in agriculture is a critical challenge, especially in the face of climate change and increasing freshwater demand. This article explores a system designed to optimise agricultural practices by leveraging satellite data and machine learning to predict crop yields.

System Architecture Overview

The system is built around the integration of satellite data, environmental indices, and machine learning models. At its core, it uses Google Earth Engine (GEE) for processing satellite imagery and extracting key environmental indices related to soil moisture, vegetation health, and other factors that influence crop yield.

The backend is developed in Node.js, acting as the intermediary between the satellite data processing pipeline and the app interface. The Flutter-based app provides real-time analytics and visualisations, enabling informed decision-making in agricultural management.

Leveraging Google Earth Engine for Soil Information

GEE is a powerful cloud-based platform for geospatial analysis with access to a vast repository of satellite imagery. In this project it was used to compute three key environmental indices:

1. Normalized Difference Vegetation Index (NDVI)

NDVI assesses vegetation health using near-infrared and red light reflectance:

NDVI = (NIR - Red) / (NIR + Red)

NDVI ranges from -1 to 1. Higher values indicate denser, healthier vegetation, which makes the index a key input for yield prediction.
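As a minimal sketch of the NDVI formula for a single pixel, the function below takes near-infrared and red reflectance and returns the index. The reflectance values in the example are illustrative only, and returning 0 for a pixel with no signal is an assumption made here, not something the project specifies.

```python
def ndvi(nir: float, red: float) -> float:
    """Return NDVI for one pixel from near-infrared and red reflectance."""
    denom = nir + red
    if denom == 0:
        # Assumption: treat a pixel with no signal as 0 instead of raising.
        return 0.0
    return (nir - red) / denom


# Illustrative reflectance values, not measurements from the project.
dense_canopy = ndvi(nir=0.50, red=0.08)
bare_soil = ndvi(nir=0.30, red=0.25)
print(f"dense canopy: {dense_canopy:.2f}, bare soil: {bare_soil:.2f}")
```

Vegetation reflects strongly in the near-infrared and absorbs red light, so the canopy pixel scores well above the bare-soil pixel.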

2. Normalized Difference Water Index (NDWI)

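NDWI is a normalized difference built from the green and near-infrared bands:

NDWI = (Green - NIR) / (Green + NIR)

In this green/NIR form (McFeeters' definition) the index mainly highlights open surface water and very wet surfaces. A separate index with the same name (Gao's NDWI) uses near-infrared and shortwave infrared, (NIR - SWIR) / (NIR + SWIR), and is the variant more commonly used for vegetation water content. In this system the index serves as a moisture indicator for irrigation planning.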
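The green/NIR formula above can be sketched for a single pixel as follows. The reflectance values are illustrative, and the zero-denominator handling is an assumption rather than part of the original pipeline.

```python
def ndwi(green: float, nir: float) -> float:
    """Return the green/NIR NDWI for one pixel."""
    denom = green + nir
    if denom == 0:
        # Assumption: treat a pixel with no signal as 0 instead of raising.
        return 0.0
    return (green - nir) / denom


# Illustrative reflectance values, not measurements from the project.
open_water = ndwi(green=0.10, nir=0.03)   # water absorbs NIR: positive value
vegetation = ndwi(green=0.08, nir=0.50)   # canopy reflects NIR: negative value
print(f"open water: {open_water:.2f}, vegetation: {vegetation:.2f}")
```

With this band pair, water surfaces come out positive and vegetated surfaces negative, so the sign is a quick first check when reading a map of the index.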

3. Enhanced Vegetation Index (EVI)

EVI improves on NDVI by reducing atmospheric influences and accounting for canopy background signals:

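EVI = G × (NIR - Red) / (NIR + C₁×Red - C₂×Blue + L)

Here G is a gain factor, C₁ and C₂ are aerosol-resistance coefficients that use the blue band to correct the red band, and L is the canopy background adjustment.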
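A minimal sketch of the EVI formula, assuming the commonly used coefficient values G = 2.5, C₁ = 6, C₂ = 7.5 and L = 1 and surface reflectance inputs scaled to 0-1. The article does not state which coefficients the project used, so these defaults are an assumption, and the sample reflectances are illustrative.

```python
def evi(nir: float, red: float, blue: float,
        gain: float = 2.5, c1: float = 6.0, c2: float = 7.5,
        canopy_l: float = 1.0) -> float:
    """Return EVI for one pixel; inputs are reflectances in the range 0-1."""
    denom = nir + c1 * red - c2 * blue + canopy_l
    if denom == 0:
        # Assumption: treat a degenerate denominator as 0 instead of raising.
        return 0.0
    return gain * (nir - red) / denom


# Illustrative reflectance values, not measurements from the project.
canopy_evi = evi(nir=0.50, red=0.08, blue=0.04)
print(f"EVI: {canopy_evi:.3f}")
```

If the bands are stored as scaled integers rather than 0-1 reflectance, they need to be rescaled first, because the constant L only makes sense on the reflectance scale.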

These indices were computed from Sentinel-2 and Landsat imagery. The processing pipeline in GEE filtered images by cloud cover, selected relevant spectral bands, and applied the formulas above.
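The three pipeline steps (cloud filtering, band selection, index calculation) can be sketched in plain Python without calling GEE. This is not the project's Earth Engine code: the `Scene` class, the `process_scenes` function and the 20% cloud threshold are hypothetical, and each scene is reduced to one mean reflectance per band to keep the example small. The band names follow the Sentinel-2 and Landsat 8 conventions.

```python
from dataclasses import dataclass
from typing import Dict, List

# Band names for the blue, green, red and near-infrared bands of each sensor.
BAND_MAP = {
    "sentinel2": {"blue": "B2", "green": "B3", "red": "B4", "nir": "B8"},
    "landsat8": {"blue": "B2", "green": "B3", "red": "B4", "nir": "B5"},
}


@dataclass
class Scene:
    """Hypothetical stand-in for one image: sensor, cloud cover, band means."""
    sensor: str
    cloud_pct: float
    bands: Dict[str, float]  # band name -> mean surface reflectance (0-1)


def normalized_difference(a: float, b: float) -> float:
    """Return (a - b) / (a + b), or 0 when the denominator is 0."""
    return (a - b) / (a + b) if (a + b) != 0 else 0.0


def process_scenes(scenes: List[Scene], max_cloud_pct: float = 20.0) -> List[dict]:
    """Drop cloudy scenes, pick the needed bands, and compute NDVI and NDWI."""
    results = []
    for scene in scenes:
        if scene.cloud_pct >= max_cloud_pct:
            continue  # step 1: filter by cloud cover
        names = BAND_MAP[scene.sensor]
        b = {role: scene.bands[name] for role, name in names.items()}  # step 2
        results.append({  # step 3: apply the index formulas
            "sensor": scene.sensor,
            "ndvi": normalized_difference(b["nir"], b["red"]),
            "ndwi": normalized_difference(b["green"], b["nir"]),
        })
    return results


# Illustrative scenes; the second is dropped by the cloud filter.
scenes = [
    Scene("sentinel2", 5.0, {"B2": 0.04, "B3": 0.08, "B4": 0.08, "B8": 0.50}),
    Scene("landsat8", 60.0, {"B2": 0.05, "B3": 0.09, "B4": 0.10, "B5": 0.45}),
]
results = process_scenes(scenes)
print(results)
```

Keeping the band mapping in one table means the index formulas stay identical across sensors, which matters when Sentinel-2 and Landsat scenes are mixed in one time series.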

Data Processing and Analysis

The extracted indices were further processed to generate insights:

  • Temporal Analysis: Data was analysed over time to identify trends in vegetation health and soil moisture, helping understand seasonal variations and predict yields.
  • Spatial Analysis: Spatial patterns were mapped to identify areas requiring intervention, which is critical for optimising irrigation and resource allocation.
  • Machine Learning Integration: Processed data was fed into ML models trained on the ML4EARTH HACKATHON dataset. Feature engineering, dimensionality reduction, and hyperparameter tuning were used to improve accuracy.
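To make the temporal analysis and feature engineering steps concrete, the sketch below smooths an NDVI time series and reduces it to a few per-season features that a model could consume. The article does not list the features or models that were actually used, so the feature set (peak, mean, trend), the function names and the sample series are all assumptions for illustration.

```python
from statistics import mean
from typing import Dict, List


def moving_average(values: List[float], window: int = 3) -> List[float]:
    """Trailing moving average; early points use the values available so far."""
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(mean(values[start:i + 1]))
    return out


def linear_trend(values: List[float]) -> float:
    """Least-squares slope of the series per time step."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def season_features(ndvi_series: List[float]) -> Dict[str, float]:
    """Hypothetical feature set: peak, mean and trend of the smoothed series."""
    smoothed = moving_average(ndvi_series)
    return {
        "peak": max(smoothed),
        "mean": mean(smoothed),
        "trend": linear_trend(smoothed),
    }


# Illustrative NDVI observations over one growing season.
features = season_features([0.21, 0.35, 0.52, 0.68, 0.74, 0.70, 0.55])
print(features)
```

Smoothing before extracting features reduces the effect of single noisy observations, such as a scene with residual cloud, on the peak and trend values.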

Backend and App Integration

The Node.js backend interfaced with GEE to retrieve satellite data, computed indices, stored results, and served APIs to the Flutter app. Key app features included:

  • Interactive maps displaying soil moisture and vegetation health.
  • Graphs showing temporal trends in environmental indices.
  • Recommendations for irrigation and resource management based on predictive analytics.
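The features above suggest the kind of data an API response might carry to the app. The sketch below shows one possible JSON shape. The real backend is written in Node.js and its schema is not described in the article, so every field name and value here is hypothetical and serves only to illustrate how index readings, a prediction and a recommendation could travel together.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class IndexReading:
    """One dated set of index values for a field (hypothetical schema)."""
    date: str
    ndvi: float
    ndwi: float
    evi: float


@dataclass
class FieldSummary:
    """Hypothetical response body combining readings with model output."""
    field_id: str
    readings: List[IndexReading]
    predicted_yield_t_per_ha: float
    recommendation: str


def to_payload(summary: FieldSummary) -> str:
    """Serialise a summary, including nested readings, to a JSON string."""
    return json.dumps(asdict(summary), indent=2)


# Placeholder values for illustration, not results from the project.
summary = FieldSummary(
    field_id="field-001",
    readings=[IndexReading(date="2023-07-01", ndvi=0.72, ndwi=-0.41, evi=0.62)],
    predicted_yield_t_per_ha=4.2,
    recommendation="Moisture indicator is low; consider scheduling irrigation.",
)
payload = to_payload(summary)
print(payload)
```

A list of dated readings maps directly onto the trend graphs, while the single prediction and recommendation fields map onto the summary view.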

Conclusion

This system takes a data-driven approach to agricultural management by integrating satellite data, environmental indices, and machine learning. GEE provides scalable, cloud-based analysis of the imagery, while the backend and app give end-users a single interface to the results. Together, these technologies offer a practical path toward optimising water resource use and enhancing crop yields.