- Introduction
The increasing availability of satellite earth observation (EO) data at higher spatial, temporal and spectral resolutions, the use of UAVs and smartphone cameras to supplement or complement satellite data, and simultaneous advances in using these data in crop modelling and in machine learning (ML) and AI based analytics have great potential in a wide range of agricultural applications. Potential applications include crop monitoring for drought management, on-farm crop production management, crop surveys for yield prediction, and crop insurance, among others. Effectively leveraging these technological advances in crop management across scales (from farm to village to regional and national, for crop insurance payouts at village level, regional drought management or other contingency planning, food security policies, etc.) requires understanding their capabilities and limitations in two contexts: (i) how they help to capture crop health and yield at local farm/village scale, and (ii) how to organize and process the growing variety and massive volume (several petabytes) of EO data, which far exceeds the memory, storage and processing capabilities of traditional personal computers, for easy access by users for applications from local to regional to national scales in a systematic, reliable, consistent and traceable way.
In this review, we identify and assess the state of the art in: (i) satellite remote sensing technologies for crop monitoring, (ii) earth observation and big data analytics platforms to store and process massive EO data, and (iii) global crop monitoring platforms.
- Satellite remote sensing data sources and access
Over the last five decades, government and private space agencies have launched numerous earth observation (EO) satellites, enabling progressively increasing access to data at higher spatial, temporal and spectral resolutions (Table 1). Among the most important are the Landsat missions (since the 1970s), AVHRR (since 1981) and MODIS (since 1999), which have provided continuous time series data spanning up to five decades. These data are largely responsible for the development of remote sensing science that enabled new applications in agricultural and economic development, including agricultural insurance.
Table 1: Indicative list of selected satellites/sensors and gridded data products of relevant indicators available in public and private domains for crop monitoring and insurance
No. | Satellite/Sensor | Launch date | Ongoing / end date | Spatial resolution | Temperature sensor spatial resolution | Temporal resolution (days) | Category | Imagery products available for crop monitoring * | Agency | Public/private |
1. | AVHRR | 08/24/1981 | ongoing | 1.1 km | 1.0,0.5 | Biophysical | Vegetation Indices | NOAA | public | |
2. | MODIS -Terra | 12/18/1999 | ongoing | 250m,500m,1km | 1km | 1.0 | Biophysical | Temperature, LST | NASA | public |
3. | MODIS-Terra | 12/18/1999 | ongoing | 250m,500m,1km | 1km | 1.0 | Biophysical | Vegetation Indices | NASA | public |
4. | VIIRS | 11/28/2011 | ongoing | 500m | 1.0 | Biophysical | Vegetation Indices | NASA/NOAA | public | |
5. | VIIRS_T | 11/28/2011 | ongoing | 750m | 1.0 | Weather | Temperature, LST | NASA/NOAA | public | |
6. | ECOSTRESS | 06/29/2018 | ongoing | 30m, 60m | 60m | 1-7 days | Biophysical | Plant Temperature, Evaporative Stress Index | NASA | public |
7. | Landsat-1 MSS | 07/23/1972 | 01/06/1978 | 30m, 60m | 16 | Biophysical | Vegetation Indices | NASA | public | |
8. | Landsat-2 MSS | 01/22/1975 | 02/25/1982 | 30m, 60m | 16 | Biophysical | Vegetation Indices | NASA | public | |
9. | Landsat-3 MSS | 03/05/1978 | 03/31/1983 | 30m, 60m | 16 | Biophysical | Vegetation Indices | NASA | public | |
10. | Landsat-4 MSS | 07/16/1982 | 12/14/1993 | 30m, 60m | 16 | Biophysical | Vegetation Indices | NASA | public | |
11. | Landsat-5 TM | 03/01/1984 | 01/15/2013 | 30m, 60m | 120m | 16 | Biophysical | Vegetation Indices | NASA | public |
12. | Landsat-7 ETM+ | 04/15/1999 | ongoing | 30m, 60m | 60m | 16 | Biophysical | Vegetation Indices | NASA | public |
13. | Landsat-8 OLI/TIRS | 02/11/2013 | ongoing | 30m, 60m | 100m | 16 | Biophysical | Vegetation Indices | NASA | public |
14. | ASTER | Dec 1999 | ongoing | 15m, 30m, 90m | 15m | 1 | Biophysical | Elevation, Surface reflectance, Emissivity, LST | NASA, METI(Japan) | public |
15. | Sentinel-1A SAR | 04/03/2014 | ongoing | 10m | Biophysical | Soil Moisture, ET | ESA | public | ||
16. | Sentinel-1B SAR | 04/25/2016 | ongoing | 10m | Biophysical | Soil Moisture, ET | ESA | public | ||
17. | Sentinel-2A MSI | 06/23/2015 | ongoing | 10m,20m,60m | 10 (5) | Biophysical | Vegetation Indices, FAPAR, LAI | ESA | public | |
18. | Sentinel-2B MSI | 03/07/2017 | ongoing | 10m,20m,60m | 10 (5) | Biophysical | Vegetation Indices | ESA | public | |
19. | Sentinel-3 SLSTR | 02/16/2016 | ongoing | 1km | Weather | Temperature, LST | ESA | public | ||
20. | Sentinel-3A | 02/16/2016 | ongoing | 300m | Biophysical | Vegetation Indices | ESA | public | ||
21. | Sentinel-3B | 04/25/2018 | ongoing | 300m | Biophysical | Vegetation Indices | ESA | public | ||
22. | Sentinel-3B | 04/25/2018 | ongoing | 1km | Weather | Temperature, LST | ESA | public | ||
23. | Sentinel-5P TROPOMI | 10/13/2017 | ongoing | 7km | 17 | Biophysical | Air quality, Ozone, GHGs | ESA | public | |
24. | SPOT-1 | 02/22/1986 | 12/31/1990 | 20m | Biophysical | Vegetation Indices | CNES | public | ||
25. | SPOT-2 | 01/22/1990 | 07/31/2009 | 20m | Biophysical | Vegetation Indices | CNES | public | ||
26. | SPOT-3 | 09/26/1993 | 11/14/1997 | 20m | Biophysical | Vegetation Indices | CNES | public | ||
27. | SPOT-4 | 03/24/1998 | 07/31/2013 | 20m | Biophysical | Vegetation Indices | CNES | public | ||
28. | SPOT-5 | 05/04/2002 | 03/31/2015 | 10m,20m | Biophysical | Vegetation Indices | CNES | public | ||
29. | SPOT-6 | 09/11/2012 | ongoing | 6m | Biophysical | Vegetation Indices | CNES | public | ||
30. | SPOT-7 | 06/30/2014 | ongoing | 6m | Biophysical | Vegetation Indices | CNES | public | ||
31. | Pleiades 1A HiRI | 12/17/2011 | ongoing | 2.8 m | 1 | Biophysical | Vegetation Indices | CNES | public | |
32. | Pleiades 1B HiRI | 12/02/2012 | ongoing | 2.8m | 1 | Biophysical | Vegetation Indices | CNES | public | |
33. | RADARSAT-1 | 11/04/1995 | 12/14/2007 | 10-100m | 7 | Biophysical | Vegetation structure, Soil Moisture, ET | CSA | public | |
34. | RADARSAT-2 | 12/14/2007 | ongoing | 3-100m | 7 | Biophysical | Vegetation structure, soil moisture, ET | CSA | public | |
35. | RADARSAT-Constellation | 06/12/2019 | ongoing | 3-100 | 4 | Vegetation structure, Soil Moisture, ET | CSA | public | ||
36. | ALOS 1,2,3-AVNIR | 2006, 2014, 2019 | ongoing | 10m | 5 | Biophysical | Vegetation indices | JAXA (Japan) | public | |
37. | ALOS 1,2,3-PALSAR | 2006, 2014,2019 | ongoing | 10m | 3 | Biophysical | Soil moisture | |||
38. | ALOS 1,2,3-PRISM | 2006, 2014, 2019 | ongoing | 2.5m (ALOS 3: 0.8m) | 5 | Biophysical | Elevation | |||
39. | Resourcesat-1 LISS-3 | 10/17/2003 | ongoing | 23.5m | Biophysical | Vegetation Indices | ISRO | public | ||
40. | Resourcesat-1 LISS-4 | 10/18/2003 | ongoing | 5.8m | Biophysical | Vegetation Indices | ISRO | public | ||
41. | Resourcesat-1 AWIFS | 10/19/2003 | ongoing | 56m | Biophysical | Vegetation Indices | ISRO | public | ||
42. | Resourcesat-2 LISS-3 | 4/20/2011 | ongoing | 23.5m | Biophysical | Vegetation Indices | ISRO | public | ||
43. | Resourcesat-2 LISS-4 | 4/21/2011 | ongoing | 5.8m | Biophysical | Vegetation Indices | ISRO | public | ||
44. | Resourcesat-2 AWIFS | 4/22/2011 | ongoing | 56m | Biophysical | Vegetation Indices | ISRO | public | ||
45. | OCO-2 | 07/02/2014 | ongoing | 3km | 16 | Biophysical | Solar induced fluorescence | NASA | public | |
46. | OCO-3 | 05/04/2019 | ongoing | 3km | Biophysical | Solar induced fluorescence | NASA | public | ||
47. | CBERS 1 | 10/14/1999 | 8/1/2003 | 20m, 80m | 160m | 3, 26 | Biophysical | Vegetation indices | CNSA / INPE | public |
48. | CBERS 2 | 10/21/2003 | 12/31/2009 | 20m, 80m,260m | 160m | 3,5,5 | Biophysical | Vegetation indices | CNSA / INPE | public |
49. | CBERS 3 | 12/1/2013 | 12/1/2013 | 10m, 20m, 40m | 80m | 3,5,26 | Biophysical | Vegetation indices | CNSA / INPE | public |
50. | CBERS 4 | 12/1/2014 | ongoing | 10m, 20m, 40m, 64m | 80 | 3,5,26 | Biophysical | Vegetation indices | CNSA / INPE | public |
51. | CBERS 4A | 12/20/2019 | ongoing | 8m, 16m, 55m | 5,31 | Biophysical | Vegetation indices | CNSA / INPE | public | |
52. | GOSAT | 2009 | ongoing | 10.5 km | 3 | Biophysical | Greenhouse gases (GHG) | JAXA | public | |
0.5, 1.5 km | 3 | Biophysical | Clouds and aerosols | |||||||
53. | AMSRE | |||||||||
54. | SMAP | 01/31/2015 | ongoing | 36km | 2 to 3 days | Biophysical | Soil Moisture | NASA | ||
55. | SMOS | 11/02/2009 | ongoing | 40km | 2 days | Biophysical | Soil Moisture | ESA | public | |
56. | TRMM | 11/27/1997 | 2014 | 5km | 1 day | Weather | Precipitation | JAXA/NASA | public | |
57. | GMI (GPM) | 2/27/2014 | ongoing | 11km | Weather | Precipitation | NASA/JAXA | public | ||
58. | GOES-1-17 | 10/16/1975 | ongoing | 1km | 1day | Weather | Soil Moisture, ET | NOAA/NASA | public | |
59. | Meteosat-1-7 | 11/23/1977 | 1984 | 1km | 30 min | Weather | Precipitation, Soil Moisture, ET | EUMETSAT/E | public | |
60. | Meteosat-8-11 | 08/28/2002 | ongoing | 1km | 5, 15 min | Weather | Precipitation, Soil Moisture, ET | EUMETSAT/E | public | |
61. | CHIRPS (Gridded Data products) | 01/01/1981 | present | 5km | 1.0 | Weather | Precipitation | UCSB | public | |
62. | CHIRTS(Gridded Data products) | 01/01/1983 | present | 5km | 1.0 | Weather | Temperature | UCSB | public | |
63. | GRIDMET (Gridded Data products) | 1979 | present | 4km | Rainfall, Relative Humidity, Tmin, Tmax, wind speed, vapour pressure deficit, reference evapotranspiration | UC Merced | public | |||
64. | ||||||||||
65. | VanderSat | 06/01/2002 | ongoing | 100m | Biophysical | Soil Moisture, ET | Vandersat | private | ||
66. | IKONOS | 1999 | 2015 | 4m | 3 | Biophysical | Digital Globe | private | ||
67. | WorldView-1 | 09/18/2007 | ongoing | 0.5m-2m | 1.7 | Biophysical | Vegetation Indices | Maxar (Digital Globe) | private | |
68. | WorldView-2 | 10/08/2009 | ongoing | 0.5m-2m | 1.1 | Biophysical | Vegetation Indices | Maxar | private | |
69. | WorldView-3 | 08/23/2014 | ongoing | 0.3m-1.24m | Biophysical | Vegetation Indices | Maxar | private | ||
70. | WorldView-4 | 11/11/2016 | ongoing | 0.3m-1.24m | Biophysical | Vegetation Indices | Maxar | private | ||
71. | RapidEye -Planet | 08/29/2008 | 03/31/2020 | 5m | Biophysical | Vegetation Indices | Planet | private | ||
72. | PlanetScope | 06/22/2016 | ongoing | 3m,5m | Biophysical | Vegetation Indices | Planet | private | ||
73. | SkySat (Planet) | 11/21/2013 | ongoing | 0.72m | Biophysical | Vegetation Indices | Planet | private | ||
74. | aWhere | 01/01/2006 | ongoing | 9km | Weather | Precipitation, Temperature | aWhere | private | ||
75. | ICEYE-X1 to X10; SAR | 01/01/2018 | ongoing | 0.25 to 5m, 5 to 20m | hourly to daily | Weather | Soil Moisture, flood mapping | ICEYE | private |
*Algorithms are available in the respective toolkits of several satellites/sensors (AVHRR, MODIS, VIIRS, Landsat, Sentinel) to derive vegetation status indicators that represent biophysically important vegetation properties of crops for crop condition assessment and yield estimation, such as LAI (leaf area index), FAPAR (fraction of absorbed photosynthetically active radiation), FVC (fractional vegetative cover), and others.
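As an illustration of the simplest of the vegetation indices listed in Table 1, the sketch below computes the Normalized Difference Vegetation Index (NDVI) from near-infrared and red reflectance using its standard definition, (NIR - Red) / (NIR + Red). The function name and the reflectance values are hypothetical and chosen only for illustration; operational products apply the same arithmetic per pixel to calibrated surface reflectance.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for a single pixel.

    nir and red are reflectance values, assumed here to lie in 0..1.
    Returns None when the denominator is zero (no usable signal).
    """
    total = nir + red
    if total == 0:
        return None
    return (nir - red) / total


# Hypothetical (NIR, red) reflectance pairs for three pixels.
pixels = {
    "dense_crop": (0.50, 0.05),
    "sparse_crop": (0.30, 0.15),
    "bare_soil": (0.25, 0.20),
}

# Higher NDVI indicates denser, healthier green vegetation.
ndvi_by_pixel = {name: ndvi(nir, red) for name, (nir, red) in pixels.items()}
```

With these illustrative inputs the dense crop pixel scores highest and the bare soil pixel lowest, which is the ordering a crop condition indicator is expected to produce.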
Emerging Paradigm: Earth Observation Big Data and Analysis Cloud Platforms
Access to free, global, moderate to high resolution time series of satellite remote sensing data has progressively increased over the last four decades (Table 1), starting with AVHRR (1981 to present) and MODIS (1999 to present), followed by the USGS opening of the global Landsat (30m resolution) data archive (1972 to present) in 2008/2009 under the US open data initiative, and by the European Space Agency's release of high resolution Sentinel data (10m, 20m, and microwave; 2015 onwards). A host of other public and private satellites now also provide access to high resolution (up to sub-meter) time series EO data, making the current EO data pool vastly different from that of a decade ago. These developments have enabled scientists, businesses, and policy makers in various domains, including agriculture, to visualize the enormous potential of time series EO data in addressing a wide range of important environmental, economic and social problems at local, regional, and global scales. However, the growing variety and volume (several petabytes) of EO data far exceed the memory, storage and processing capabilities of the traditional paradigm, in which remote sensing data are stored, distributed, and processed locally on personal computers. It is also not technically feasible to carry out the mandatory pre-processing of long time series of raw EO data in the traditional paradigm. These factors have limited EO data use for application development in the traditional paradigm to only very small portions of the available data.
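The scale argument can be made concrete with a back-of-the-envelope calculation. The sketch below estimates the uncompressed size of one year of imagery for a region. The 10m resolution and 5-day revisit follow the Sentinel-2 entries in Table 1; the region size, band count, and bytes per sample are assumptions chosen only for illustration, and the function name is hypothetical.

```python
def estimate_archive_bytes(area_km2, resolution_m, bands, bytes_per_sample,
                           revisit_days, years):
    """Rough uncompressed size, in bytes, of a gridded image time series."""
    pixels_per_km2 = (1000 / resolution_m) ** 2       # pixels in one square km
    pixels = area_km2 * pixels_per_km2                # pixels in one acquisition
    acquisitions = (365 // revisit_days) * years      # acquisitions in the period
    return int(pixels * bands * bytes_per_sample * acquisitions)


# Assumed inputs: a 1,000,000 km2 region, 4 bands, 16-bit samples, 1 year.
size_bytes = estimate_archive_bytes(
    area_km2=1_000_000,
    resolution_m=10,
    bands=4,
    bytes_per_sample=2,
    revisit_days=5,
    years=1,
)
size_tb = size_bytes / 1e12  # about 5.84 TB under these assumptions
```

Even under these modest assumptions, a single sensor produces several terabytes per year for one region before any derived products are stored, which is why multi-sensor, multi-decade archives reach the petabyte range.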
The major hurdle in working with global scale time series EO data from diverse sources is providing the proper connections between data, applications, and users. Overcoming this hurdle necessitates a paradigm shift in EO data retrieval, storage and analysis, away from local processing on personal computers and towards: (i) adoption of next generation infrastructures based on cloud platforms and big data technologies for data storage and processing, and (ii) automation of the pre-processing stages that turn raw EO data into Analysis Ready Data (ARD). Analysis Ready Data are time series stacks of satellite imagery that are ready for a user to analyze with minimal or no additional pre-processing. They are a packaged product created by passing raw EO data through multiple standard pre-processing stages, which include: searching and downloading data from various providers; fusing images; clipping the data to cover only the area of interest; correcting for geometry, sensors, radiometry, and atmosphere; identifying pixels shadowed by clouds or with poor quality data; and lining up the images pixel for pixel by geospatially co-registering and resampling the data. Developing long term, continuous ARD sets that are consistent (over time for the same sensor, and across multiple sensors) is a work in progress that is evolving with new developments in both sensor technologies and algorithms.
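Two of the pre-processing stages described above, flagging poor quality pixels and combining co-registered images into a usable stack, can be sketched in a few lines. The example below is a minimal illustration in plain Python, assuming the images are already co-registered on the same grid and that a boolean cloud mask is supplied with each image; the function names and pixel values are hypothetical, and production pipelines use array libraries and sensor-specific quality bands instead.

```python
from statistics import median


def mask_clouds(image, cloud_mask):
    """Replace flagged pixels with None. Inputs are 2-D lists of equal shape."""
    return [
        [None if cloudy else value for value, cloudy in zip(row, mask_row)]
        for row, mask_row in zip(image, cloud_mask)
    ]


def median_composite(stack):
    """Per-pixel median over a co-registered stack, ignoring masked pixels."""
    rows, cols = len(stack[0]), len(stack[0][0])
    composite = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            valid = [img[r][c] for img in stack if img[r][c] is not None]
            out_row.append(median(valid) if valid else None)
        composite.append(out_row)
    return composite


# Three hypothetical 2x2 acquisitions of the same area of interest.
images = [
    [[0.20, 0.30], [0.40, 0.50]],
    [[0.90, 0.32], [0.42, 0.95]],  # the two bright values stand in for cloud
    [[0.22, 0.34], [0.44, 0.52]],
]
masks = [
    [[False, False], [False, False]],
    [[True, False], [False, True]],
    [[False, False], [False, False]],
]

stack = [mask_clouds(img, m) for img, m in zip(images, masks)]
composite = median_composite(stack)
```

The median is used here only because it is robust to residual outliers; it is one common compositing choice, not the method prescribed by any particular ARD specification.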
The idea of ARD has also shifted the burden of pre-processing EO data from individual users to data providers, and lowered technical barriers for users to fully utilize EO data. Usually ARD are provided as tiled interoperable, georegistered stacks of both Top of Atmosphere (TOA) reflectance and atmospherically corrected surface reflectance products, and with explicit quality assessment information and appropriate metadata for traceability. Users then work directly with data that are pre-processed and arranged in a coherent time series stack for their area of interest, instead of a bunch of randomly placed overlapping images. In the new paradigm, users also can significantly increase the scope and limits of their analysis by working with powerful cloud-computing platforms (instead of personal computers) and advanced analyses with time series, ML, AI or other models, to address complex problems.
Operational EO big data platforms for agricultural applications
In recent years, several approaches to EO big data infrastructures for storage and processing on cloud platforms have evolved to enable and accelerate applications in different domains. Their common goals include helping users to achieve: (i) easier access to EO data, (ii) easier use (storage and processing) of EO data in the cloud, (iii) easier EO big data analytics in the cloud, and (iv) better usability through tailored imagery web applications. Google was among the first to enable the shift towards EO big data cloud platforms when it introduced Google Earth Engine (GEE) in 2010 to enhance the use of satellite imagery for large scale and time series applications. GEE set a benchmark in enabling universal access to its high power cloud computing resources for fast retrieval and processing of time series ARD from diverse sensors (nearly all sensors and gridded products in Table 1). In addition to ARD from multiple satellites, the platform provides: (i) a large repository of other geospatial data, including environmental variables, weather and climate forecasts, land cover, topography, and socioeconomic data, and (ii) a portfolio of statistical, ML and AI tools for a wide array of applications (Gorelick et al., 2017). Its library comprises 600+ analysis ready EO datasets and 1000+ analytical tools. Each data source available on GEE has its own time series of EO/ARD data organized into a stack called an Image Collection. Users can access GEE with only an active internet connection and a Google account (250 GB free quota). With regular updates of EO data, tools, and features, the platform adapts to a wide range of user requirements and expertise. Users can also analyze their private data on the GEE platform with the help of its backend data and analytical tools. A summary of the features of GEE and some other currently available EO data cloud platforms for potential use in crop monitoring and insurance is given in Table 2.
Table 2: Cloud based platforms for EO data access and analysis
Platform | Satellite Datasets | Access | Special features |
Google Earth Engine (GEE), 2010 (operated by Google) | Near real time and archived global ARD datasets with corresponding cloud masks; derived time series products of major biophysical variables (NDVI, EVI, etc.); gridded weather data products (nearly all sensors and gridded data sets of Table 1 are included on the platform) | Free and unlimited access to all public sensor time series data and storage for research, education, and non-profit use; Limited access to some private satellite datasets (Planet; MAXAR) | Users can leverage storage and computing resources of GEE platform; allows scaling to large regional and global analysis of time series data on the platform; open source code for extracting data and processing with a range of statistical and ML/AI algorithms; analysis results can be displayed on the fly in GEE browser and can also be extracted to user systems for integration with other data; users can contribute own datasets and algorithms and develop mobile apps with GEE platform data and analysis tools; sufficiently user friendly for non-specialists in RS or ML/AI tools; Closed source software, so cannot guarantee reproducibility as source code can change |
Geospatial Big Data Platform GBDX (Maxar), 2018 | ARD of MAXAR (WorldView) data; and open data of Sentinel and Landsat | Access based on purchase of subscription; Also offers a free Community Edition called GBDX Notebooks that gives free access to open data (Landsat/ Sentinel) and some MAXAR data | Leverages Amazon Web Services (AWS) to deliver scalable storage and computing resources that can be used for geospatial analytics and AI machine learning applications; Does not allow export of derivatives or the open images except some limited extraction in the Community edition ( 6 GB instance and 20 GB of drive space). |
Radiant Earth (Open source platform of Radiant Earth Foundation, 2016) | Library of ARD of Sentinel-2 and Landsat data, ML algorithms and training data sets | Free access to data, algorithms, and training data sets for applications of ML/AI to support decisions in critical areas like agriculture, forests, disaster management, for sustainable global development; Free access to available crop land data to develop crop masks | Target audience: Global Development Community (NGOs, academics, entrepreneurs); Not designed to scale to large regional analyses; Focus on localized studies for ML applications; Repository of training data sets for ML algorithms; Allows maintaining own personal projects and bringing in additional EO and secondary data; Allows sharing of data, training data sets, and algorithms with community. |
Sentinel Hub Playground (Sinergise) | ARD of Sentinel, Landsat, MODIS and DEM | Flexible pricing from free to basic and enterprise level uses; Free access to explore and download satellite imagery for non-commercial/research use; Paid access through specific protocols and API, data processing, mobile application data access, higher access limits | Between GBDX and GEE in function; Limited variety of EO data; Closer to GEE in terms of free access to data, analysis tools and direct display of results in the browser; Allows customized analysis scripts but not sharing of scripts among users; Closed source code, so cannot guarantee reproducibility; presents the best balance between the analyzed capacities. The drawback of the ODC solution is mainly the lack of support for reproducibility of science, which is not found in the others either. On the other hand, the other capacities evaluated are at least partially met. |
SEPAL (System for Earth Observation Data Access, Processing and Analysis for Land Monitoring), 2018 (FAO platform for forest and land monitoring) | Open source ARD from GEE and directly from other sources (including Sentinel, Pleiades, WorldView, etc.) | Free access, largely meant for developing countries with limited access to satellite data resources. | More focused on infrastructure management and provision of tools for EO data analyses, so big data challenges are not directly addressed; Combination of GEE (for EO data) and open source software ORFEO Toolbox (open source remote sensing data processing software for multiple sources), R and others for analysis; GEE is used for data retrieval and the Amazon Web Services Cloud (AWS) is used for data storage and infrastructure for computing analyses. |
Open Data Cube (ODC); first developed as Australian Geoscience Data Cube (AGDC, 2017); modified to allow diverse users, datasets and national or regional use options | Data cubes (time series stacks) of ARD of Landsat-5/7/8, Sentinel-1/2, MODIS, ALOS-1/2, ASTER DEM and others | Available under Apache 2.0 license as a suite of applications; These repositories include . | Generic framework composed of a series of data structures and tools for organization and analysis of massive EO data sets; Open source code distributed through GitHub repositories which include web interface modules for data visualization, data statistics extraction tools as well as Jupyter notebooks with examples of access and use of indexed data in ODC; Does not allow sharing of applications and data; Different national ODC implementations are operational in Australia, Switzerland, Kenya, UK, Taiwan, and many other countries |
High-resolution, high-frequency, consistent, and detailed crop monitoring based on time series satellite data is needed over extended periods for effective implementation of nationwide crop insurance schemes, including India's Pradhan Mantri Fasal Bima Yojana (PMFBY). The data management and analysis challenges arising from the huge time series data volumes can be overcome only with new cloud computing infrastructures, technologies and data architectures, such as those listed in Table 2. Among these, GEE is currently the most developed, with access to more data and analysis resources than the others. However, since it is a closed source platform, it cannot guarantee reproducibility, because its source code can change. Sentinel Hub has relatively limited EO data resources and shares the same reproducibility drawback as GEE. SEPAL is more focused on infrastructure management and the provision of tools for the retrieval and analysis of EO data; it does not directly address the big data challenges of EO data storage and processing. Radiant Earth is focused more on small area studies and machine learning tools. ODC provides a toolkit to facilitate application development by the user. ODC, SEPAL and Radiant Earth are open source platforms, and their code is available in open repositories. ODC is the only platform that gives a user direct access to data and to its infrastructure and data processing capabilities. It provides public documents on the platform governance process and on how to create or incorporate new features into the platform, including new software tools. It allows high replicability, and provides high scalability for storage and processing and the maximum opportunity for data access interoperability. ODC also uses distributed data storage to minimize data movement during processing, so that processing occurs where the data is stored.
While GEE is the most useful ready-to-use platform for users, with its multiple-source ARD and library of processing tools, the transient nature of its code and the uncertainty about its long-term availability raise questions about its reliability for use in nationally important, long-term public agricultural schemes. For such schemes ODC may be the preferred platform, despite the considerable initial effort involved in building it, because of its generic, open source, scalable framework, distributed data storage, data and storage scalability, and interoperability of data sources.
In India, most research and application development with Indian satellite data (Resourcesat series) has so far been in the traditional paradigm, that is, processing only a small portion of available data for relatively few periods on personal computers. Some early attempts at developing ARD time series stacks for public access, and building a national ODC have only recently been initiated by ISRO.