Nitendra Gautam

Big Data techniques applicable in Agriculture

Technology use in agriculture has grown steadily since the early 20th century, when the industry shifted from horse-drawn plows to mechanized tractors. Advances in genetics, chemical inputs, and, more recently, guidance systems have transformed agriculture into a technology-intensive and data-rich field. Agriculture today is more technology oriented than labor oriented. As the size of farms has changed, so have farming concepts, techniques, the risks associated with farming, and the volume of data involved. The result is an increase in the data produced by the tools and machinery used on farms.

According to a report by the Global Harvest Initiative, the human population will exceed 9 billion by 2050. This will increase the demand for food and fuel, and if the production rate remains the same, producing and distributing enough food for this population will be a major challenge. That is why the use of big data techniques in agriculture is being widely researched, both in academia and in commercial products.

Why Big Data in Agriculture?

Within the last 20 years, the agricultural industry has increased its ability to generate, capture, and store large volumes of data through the use of mobile technology and data management software. Current agricultural practice is supported by biotechnology and emerging digital technologies such as remote sensing, cloud computing, and the Internet of Things (IoT), leading to the notion of Smart Farming or Precision Agriculture.

Modern agriculture is often called Precision Agriculture, in which smart farming techniques are used to collect and analyze data about the fields. The aim of Precision Farming, or Precision Agriculture, is to increase productivity, decrease production cost, and minimize the environmental impact of farming. Smart farming is important because it helps tackle the challenges of agricultural production in terms of productivity, environmental impact, food security, and sustainability.

Big data techniques help to store and process data related to crops, weather, terrain, and geographic conditions. Farmers can use this technology to learn how various crops perform in diverse geographic areas and to predict the impact of natural conditions on their crops, which ultimately points to ways of increasing productivity in the field. They can also use it proactively to increase crop yield, determine seed and fertilizer application rates, analyze soil, and obtain weather reports. [1,2]

Sources of Big Data and Techniques for Big Data Analysis

The table below lists different agricultural areas along with the sources of the data produced in each and the techniques used to analyze it.

Agricultural Area | Big Data Sources | Techniques for Big Data Analysis
Weather and climate change | Weather stations, surveys, static historical information (weather and climate data, earth observation data), remote sensing (satellites), geospatial data. | Machine learning (support vector machines), statistical analysis, modeling, cloud platforms, MapReduce analytics, GIS geospatial analysis.
Land | Remote sensing (satellites, synthetic aperture radar, airplanes), geospatial data, historical datasets (land characterization and crop phenology, rainfall and temperature, elevation, global tree cover maps), camera sensors (multispectral imaging), weather stations. | Machine learning (support vector machines, K-means clustering, random forests, extremely randomized trees), NDVI vegetation indices, wavelet-based filtering, image processing, statistical analysis, spectral matching techniques, reflectance and surface temperature calculations.
Animal research | Historical information about soils and animals (physiological characteristics), ground sensors (grazing activity, feed intake, weight, heat, milk production of individual cows, sound), camera sensors (multispectral and optical). | Machine learning (decision trees, neural networks, support vector machines).
Crops | Ground sensors (metabolites), remote sensing (satellite), historical datasets (land use, national land information, statistical data on yields). | Machine learning (support vector machines, K-means clustering), wavelet-based filtering, Fourier transform, NDVI vegetation indices.
Soil | Ground sensors (salinity, electrical conductivity, moisture), cameras (optical), historical databases (e.g. AGRIC soils). | Machine learning (K-means clustering, Farthest First clustering algorithm).
Weeds | Remote sensing (airplane, drones), historical information (digital library of images of plants and weeds, plant-specific data). | Machine learning (neural networks, logistic regression), image processing, NDVI vegetation indices.
Food availability and security | Surveys, historical information and databases (e.g. CIALCA, ENAR, rice crop growth datasets), GIS geospatial data, statistical data, remote sensing (synthetic aperture radar). | Machine learning (neural networks), statistical analysis, modeling, simulation, network-based analysis, GIS geospatial analysis, image processing.
Biodiversity | GIS geospatial data, historical information and databases (SER database of wildlife species). | Statistics (Bayesian belief networks).
Farmers' decision making | Static historical information and datasets (e.g. US government survey data), remote sensing (satellites, drones), weather stations, humans as sensors, web-based data, GIS geospatial data, feeds from social media. | Cloud platforms, web services, mobile applications, statistical analysis, modeling, simulation, benchmarking, big data storage, message-oriented middleware.
Farmers' insurance and finance | Web-based data, historical information, weather stations, humans as sensors (crops, yields, financial transactions data). | Cloud platforms, web services, mobile applications.
Remote sensing | Remote sensing (satellite, airplane, drones), historical information and datasets (e.g. MODIS surface reflectance datasets, earth land surface dataset of images, WMO weather datasets, reservoir heights derived from radar altimetry), web-based data, geospatial data (imaging, maps). | Cloud platforms, statistical analysis, GIS geospatial analysis, image processing, NDVI vegetation indices, decision support systems, big data storage, web and community portals, MapReduce analytics, mobile applications, computer vision, artificial intelligence.
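
NDVI vegetation indices appear in several rows of the table above. As a rough sketch of what that index computes, the standard formula is NDVI = (NIR - Red) / (NIR + Red), where NIR and Red are the near-infrared and red reflectance of a pixel. The reflectance values, the function names, and the 0.3 vegetation threshold below are illustrative assumptions, not figures taken from the cited studies.

```python
def ndvi(nir, red):
    """Return NDVI for one pixel from near-infrared and red reflectance."""
    total = nir + red
    if total == 0:
        # No signal in either band; returning 0.0 is an assumption made here
        # to avoid dividing by zero.
        return 0.0
    return (nir - red) / total


def ndvi_grid(nir_band, red_band):
    """Apply NDVI pixel by pixel to two equally sized 2D lists."""
    return [
        [ndvi(n, r) for n, r in zip(nir_row, red_row)]
        for nir_row, red_row in zip(nir_band, red_band)
    ]


def classify(value, threshold=0.3):
    """Label a pixel using a hypothetical NDVI threshold."""
    return "vegetation" if value > threshold else "bare/other"


# Hypothetical 2x2 reflectance tiles (values between 0 and 1).
nir_band = [[0.50, 0.40], [0.30, 0.10]]
red_band = [[0.10, 0.10], [0.30, 0.30]]

grid = ndvi_grid(nir_band, red_band)
labels = [[classify(v) for v in row] for row in grid]
print(grid)
print(labels)
```

NDVI always falls between -1 and 1, and higher values generally indicate denser green vegetation, which is why the index is useful for crop and weed mapping from satellite or drone imagery.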


Software tools used for big data analysis in agriculture

Category | Software tools
Image processing tools | IM toolkit, VTK toolkit, OpenCV library
Machine learning (ML) tools | Google TensorFlow, R, Weka, Flavia, scikit-learn, SHOGUN, mlpy, mlpack, Apache Mahout, MLlib, and Oryx
Cloud-based and Big Data storage and analytic platforms | Cloudera, Hortonworks, and MapR-based Hadoop platforms, EMC Corporation, IBM InfoSphere BigInsights, IBM PureData system for analytics, Aster SQL MapReduce, Pivotal GemFire, Pivotal Greenplum, Apache Pig, Apache Spark, Apache Storm
GIS systems | ArcGIS, Autodesk, MapInfo, MiraMon, GRASS GIS
Big databases | Apache Hive, HAWQ, Cassandra, HBase, HadoopDB, MongoDB, ElasticSearch, Google BigTable, Rasdaman, MonetDB/SciQL, PostGIS, Oracle GeoRaster, SciDB
Messaging/publish-subscribe systems | MQTT, RabbitMQ, Apache Kafka
Modeling and simulation | AgClimate, GLEAMS, LINTUL, MODAM, OpenATK
Statistical tools | Norsys Netica, R, Weka, Python ML libraries
Time-series analysis | Stata, RATS, MATLAB, BFAST
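
The Hadoop-based platforms listed above are built around the MapReduce model, in which records are mapped to key/value pairs, grouped by key, and then reduced to one result per key. The following is a minimal single-machine sketch of that pattern in plain Python, not code for any of the listed products. The station names, rainfall values, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical weather-station records: (station_id, rainfall_mm).
readings = [
    ("station_a", 12.0),
    ("station_b", 3.5),
    ("station_a", 8.0),
    ("station_b", 4.5),
    ("station_c", 0.0),
]


def map_phase(records):
    """Emit one (key, value) pair per input record."""
    for station_id, rainfall_mm in records:
        yield station_id, rainfall_mm


def shuffle_phase(pairs):
    """Group all values that share a key, as the framework would do."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each group of values into a single mean per key."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


mean_rainfall = reduce_phase(shuffle_phase(map_phase(readings)))
print(mean_rainfall)
```

On a real cluster the map and reduce steps run in parallel across many machines and the framework handles the grouping, which is what allows the same logic to scale to years of sensor data.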



[1] Stubbs, Megan. “Big Data in US Agriculture.” Congressional Research Service, January 6, 2016.

[2] Kamilaris, A.; Kartakoullis, A.; Prenafeta-Boldú, F. A review on the practice of big data analysis in agriculture. Comput. Electron. Agric. 2017, 143, 23–37.