Pests and Diseases Forewarning System

Amrender Kumar
Scientist
Indian Agricultural Statistics Research Institute,
Library Avenue, New Delhi, INDIA
[email protected]

Crop – Pests – Weather Relationship

  • Diseases and pests are major causes of reduced crop yields.
  • If the time and severity of an outbreak are known in advance, timely control measures can be taken to reduce losses.
  • Weather plays an important role in pest and disease development.
  • Weather-based models can therefore be an effective scientific tool for forewarning of diseases and pests.
Why pests and disease forewarning
  • Forewarning / assessment of disease is important for crop production management
    • for timely plant protection measures
      • it is enough to know whether the disease status is expected to be below or above the threshold level, so models based on qualitative data can be used (qualitative models)
    • for loss assessment
      • the actual intensity has to be forewarned, which needs a quantitative model
Variables of interest
  • Maximum pest population or disease severity.
  • Pest population / disease severity at the most damaging stage, i.e. egg, larva, pupa or adult.
  • Pest population or disease severity at different stages of crop growth or at various standard weeks.
  • Time of first appearance of pests and diseases.
  • Time of maximum population/severity of pests and diseases.
  • Weekly monitoring of pests and diseases progress.
  • Occurrence/non-occurrence of pests & diseases.
  • Extent of damage.
Data Structure

  • Historical data at periodic intervals for 10-15 years
  • Historical data for 10-15 years at one point of time
    • overall status
    • disease intensity
    • crop damage
  • Data for 5-6 years at periodic intervals
    • data points inadequate for week-wise models
    • combined model for the whole data, in two steps
  • Data at one point of time for 5-6 years
    • Model development not possible
  • Qualitative data for 10-15 years
    • Qualitative forewarning
      • Occurrence / non-occurrence of disease
  • Mixed data – conversion to qualitative categories
  • Data collected at periodic intervals for one year
    • Within year growth model
Choice of explanatory variables
  • Relevant weather variables
    • appropriate lag periods depending on life cycle
  • Crop stage / age
  • Natural enemies
  • Starting / previous year’s last population of pathogen
Forecast Models
  • Between year models
    • These models are developed using previous years’ data.
    • The forecast for pests and diseases is obtained by substituting the current year's data into a model developed from previous years' data.
  • Within year models
    • Sometimes past data are not available, but the pest and disease status at different points of time during the current crop season is available.
    • In such situations, a within-year growth model can be used, provided there are 10-12 data points between the time of first appearance of pests and diseases and the maximum or most damaging stage.
    • The methodology consists of fitting an appropriate growth pattern to the partial pest and disease data.
Thumb rules
    • Most common
    • Extensively used
    • Judgment based on past experience, with little or no mathematical background


A day is favourable for potato late blight if

  • the average temperature over the last 5 days is < 25.5 °C
  • the total rainfall over the last 10 days is > 3.0 cm
  • the minimum temperature on that day is > 7.2 °C

Trivedi et al. (1999)
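
The thumb rule above can be written as a small check. This is a minimal sketch: the function name, the argument layout and the sample readings are assumptions made for illustration, and the "temperature average" is taken to mean the average of daily mean temperatures.

```python
from statistics import mean

def is_blight_favourable(mean_temps_c, rainfall_cm, min_temp_today_c):
    """Apply the three-part thumb rule to one day.

    mean_temps_c: daily mean temperatures (deg C), oldest first, ending today
    rainfall_cm: daily rainfall totals (cm), oldest first, ending today
    min_temp_today_c: today's minimum temperature (deg C)
    """
    if len(mean_temps_c) < 5 or len(rainfall_cm) < 10:
        raise ValueError("need at least 5 days of temperature and 10 days of rainfall")
    avg_temp_5d = mean(mean_temps_c[-5:])   # average over the last 5 days
    rain_10d = sum(rainfall_cm[-10:])       # total rainfall over the last 10 days
    return avg_temp_5d < 25.5 and rain_10d > 3.0 and min_temp_today_c > 7.2

# Hypothetical readings, invented for illustration only
temps = [22.0, 23.5, 24.0, 21.5, 20.0]
rain = [0.0, 0.5, 1.2, 0.0, 0.0, 0.8, 0.0, 0.4, 0.3, 0.0]
favourable = is_blight_favourable(temps, rain, min_temp_today_c=12.0)
```

All three conditions must hold on the same day, so a dry spell alone is enough to make the day unfavourable.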

Regression models
  • Relationship between two or more quantitative variables
  • The model is of the form

$Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_p X_p + e$


    • βi's are regression coefficients
    • Xi's are independent variables
    • Y is the variable to forecast
    • e is the random error
  • Variables can be used as such or after suitable transformations
  • % incidence of bacterial blight (Akola) – weekly models (42nd to 44th standard meteorological week, SMW)
  • Data used: 1993-1999 on MAXTemp, MINTemp, RH1 (morning), RH2 (afternoon) and RF (X1 to X5), lagged by 2 to 4 weeks
  • Model for 44th SMW

Y = 133.18 - 3.09 RH2L4 + 1.68 RFL4 (R² = 0.78)

  • Potato aphid is an abundant potato pest and a vector of potato leaf-roll virus, potato virus Y, PVA, etc.
  • Potato aphid population – Pantnagar (weekly models)
  • Data used: 1974-96 on MAXT, MINT and RH (X1 to X3), lagged by 2 weeks
  • Model for December 3rd week

$Y = 80.25 + 40.25 \cos(2.70\, X_{12} - 14.82) + 35.78 \cos(6.81\, X_{22} + 8.03)$

GDD approach

GDD = Σ (mean temperature − base temperature)

  • Requires decisions on
    • base temperature
    • initial time
  • Not much work has been done on base temperatures for various diseases
  • Base temperature is normally taken as 5 °C
  • Under Indian conditions, mean temperature is seldom below 5 °C
  • GDD and simple accumulation of mean temperature will therefore give similar results in statistical models
  • Work is needed on base temperature and the initial time of calculation
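
The point about GDD and simple accumulation can be checked numerically. In this sketch the function name and the daily temperatures are assumptions for illustration; the formula is the one stated above, with a 5 °C base.

```python
def gdd(mean_temps_c, base_temp_c=5.0):
    """Sum of (daily mean temperature - base temperature)."""
    return sum(t - base_temp_c for t in mean_temps_c)

# Hypothetical daily mean temperatures (deg C), all above the 5 deg C base
daily_means = [18.0, 20.5, 22.0, 19.5, 21.0]

gdd_total = gdd(daily_means)          # growing degree days
simple_total = sum(daily_means)       # simple accumulation of mean temperature
offset = simple_total - gdd_total     # base temperature times number of days
```

When every daily mean exceeds the base temperature, the two totals differ only by the constant `base × number of days`, so over a fixed accumulation period they carry the same information in a regression. Many GDD definitions also set negative daily contributions to zero; the formula above has no such clipping, and it would not matter under the stated conditions.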

GDD approach

  • Under Indian conditions, other variables are also important
  • Models using simple accumulations were not found appropriate
  • Models based on weighted weather indices



where

  • Y : variable to forecast
  • x_iw : value of the i-th weather variable in the w-th period
  • r_iw : weight given to the i-th weather variable in the w-th period
  • r_ii'w : weight given to the product of x_i and x_i' in the w-th period
  • p : number of weather variables
  • n1 and n2 : initial and final periods for which weather variables are included in the model
  • e : error term
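
A form of the weighted-index model consistent with these definitions can be written as follows. This is a sketch under assumptions: the coefficient names $a_0$, $a_i$, $a_{ii'}$ and the index names $Z_i$, $Z_{ii'}$ are introduced here for illustration and are not taken from the source.

```latex
Y = a_0 + \sum_{i=1}^{p} a_i Z_i + \sum_{i \ne i'} a_{ii'} Z_{ii'} + e,
\qquad
Z_i = \sum_{w=n_1}^{n_2} r_{iw}\, x_{iw},
\qquad
Z_{ii'} = \sum_{w=n_1}^{n_2} r_{ii'w}\, x_{iw}\, x_{i'w}
```

Each weather variable is thus reduced to one weighted total over the periods $n_1$ to $n_2$, and each pair of variables to one weighted total of their products, which keeps the number of regressors small.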

Experience based weights
  • Subjective weights based on experience.
    • Weather variable not favourable : weight = 0
    • Weather variable favourable : weight = ½
    • Weather variable very favourable : weight = 1

Example :

  • Favourable relative humidity ≥ 92%
  • Most favourable relative humidity ≥ 98%

Weather data (relative humidity, %)

  Year    Week 1   Week 2   Week 3   Week 4   Week 5   Week 6
  1993     88.7     90.1     94.4     98.3     98.0     95.0
           94.0     93.3     94.9     93.3     92.0     88.1
           90.3     91.9     90.4     87.9     86.4     89.7
  …

Weighted Index

  • 0 × 88.7 + 0 × 90.1 + 0.5 × 94.4 + 1 × 98.3 + 1 × 98.0 + 0.5 × 95.0 = 291.0
  • 0.5 × 94.0 + 0.5 × 93.3 + 0.5 × 94.9 + 0.5 × 93.3 + 0.5 × 92.0 + 0 × 88.1 = 233.75
  • 0 × 90.3 + 0 × 91.9 + 0 × 90.4 + 0 × 87.9 + 0 × 86.4 + 0 × 89.7 = 0.0
  • …
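
The weighted index in this example can be reproduced in a few lines. The sketch below takes "favourable" as 92% or more and "most favourable" as 98% or more, which is the reading consistent with the weights used in the worked rows; the function names are assumptions for illustration.

```python
def humidity_weight(rh_percent):
    """Experience-based weight for one weekly relative humidity value."""
    if rh_percent >= 98.0:
        return 1.0      # very favourable
    if rh_percent >= 92.0:
        return 0.5      # favourable
    return 0.0          # not favourable

def weighted_index(weekly_values, weight_fn):
    """Sum of weight * value over the weeks."""
    return sum(weight_fn(v) * v for v in weekly_values)

# The three rows of weekly relative humidity from the example
rows = [
    [88.7, 90.1, 94.4, 98.3, 98.0, 95.0],
    [94.0, 93.3, 94.9, 93.3, 92.0, 88.1],
    [90.3, 91.9, 90.4, 87.9, 86.4, 89.7],
]
indices = [round(weighted_index(r, humidity_weight), 2) for r in rows]
```

Passing the weight rule in as a function keeps the index calculation the same for any weather variable; only the thresholds change.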

Interaction :

  • Both variables not favourable : weight = 0
  • One variable not favourable, one variable favourable : weight = 1/8
  • One variable not favourable, one variable highly favourable : weight = 1/4
  • Both variables favourable : weight = 1/2
  • One variable favourable, one variable highly favourable : weight = 3/4
  • Both variables highly favourable : weight = 1
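
The interaction weights form a small symmetric lookup table. In this sketch the numeric coding of the favourability levels (0, 1, 2) and the names are assumptions for illustration; the weights are those listed above.

```python
from fractions import Fraction

# Favourability levels: 0 = not favourable, 1 = favourable, 2 = highly favourable
INTERACTION_WEIGHT = {
    (0, 0): Fraction(0),
    (0, 1): Fraction(1, 8),
    (0, 2): Fraction(1, 4),
    (1, 1): Fraction(1, 2),
    (1, 2): Fraction(3, 4),
    (2, 2): Fraction(1),
}

def interaction_weight(level_a, level_b):
    """Weight for the product of two weather variables; order does not matter."""
    key = (min(level_a, level_b), max(level_a, level_b))
    return INTERACTION_WEIGHT[key]

w = interaction_weight(2, 0)   # one highly favourable, one not favourable
```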


Correlation based weights

  • r_iw : correlation coefficient between Y and the i-th weather variable in the w-th period
  • r_ii'w : correlation coefficient between Y and the product of x_i and x_i' in the w-th period


Modified model

  • Model using both weighted and unweighted indices
  • For each weather variable, two types of indices have been developed
      • Simple total of the values of the weather variable in different periods
      • Weighted total, the weights being correlation coefficients between the variable to forecast and the weather variable in the respective periods
  • The first index represents the total amount of the weather variable received by the crop during the period under consideration
  • The second takes care of the distribution of the weather variable, according to its importance in different periods in relation to the variable to forecast
  • On similar lines, composite indices were computed with products of weather variables (taken two at a time) for joint effects.
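
The two kinds of index described above can be computed as follows. This is a minimal sketch, not the authors' program: the function names and the small data set are invented for illustration, and the weights are plain Pearson correlations computed over years for each week.

```python
from math import sqrt

def pearson(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)

def weather_indices(x, y):
    """Return (unweighted, weighted) indices, one value per year.

    x: x[year][week] values of one weather variable (or of a product of two)
    y: y[year] variable to forecast
    """
    n_weeks = len(x[0])
    # weight for week w = correlation over years between y and the week-w values
    r = [pearson([row[w] for row in x], y) for w in range(n_weeks)]
    unweighted = [sum(row) for row in x]
    weighted = [sum(rw * v for rw, v in zip(r, row)) for row in x]
    return unweighted, weighted

def product_series(x1, x2):
    """Week-by-week product of two weather variables, for joint-effect indices."""
    return [[a * b for a, b in zip(r1, r2)] for r1, r2 in zip(x1, x2)]

# Hypothetical data: 4 years x 3 weeks, values invented for illustration only
rh = [[88.0, 92.0, 95.0], [90.0, 91.0, 97.0], [85.0, 94.0, 93.0], [91.0, 96.0, 98.0]]
tmax = [[30.0, 29.0, 27.0], [31.0, 28.5, 26.0], [32.0, 29.5, 28.0], [29.0, 27.0, 25.5]]
incidence = [12.0, 15.0, 9.0, 20.0]

z_rh0, z_rh1 = weather_indices(rh, incidence)
z_joint0, z_joint1 = weather_indices(product_series(rh, tmax), incidence)
```

The resulting indices are then used as regressors for the variable to forecast. In the fitted models quoted later, a name such as Z121 appears to denote the weighted (last digit 1) index of the product of X1 and X2, and Z10 the unweighted (last digit 0) index of X1; that reading is inferred from the notation rather than stated in the slides.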
Pigeon pea

Phytophthora blight (Kanpur)

  • Average percent incidence of phytophthora blight at one point of time
  • Data used : 1985-86 to 1999-2000 on MAXT, MINT, RH1, RH2 and RF (X1- X5) from 28th to 33rd SMW

$Y = 330.77 + 0.12\, Z_{121}$ … (R² = 0.77)

Sterility Mosaic
  • Average percent incidence of sterility mosaic
  • Data used : 1983-84 to 1999-2000 for MAXT, MINT, RH1, RH2 and RF (X1- X5) from 20th to 32nd SMW

$Y = -180.41 + 0.09\, Z_{121}$ … (R² = 0.84)


Late Leaf Spot & Rust – Tirupathi

  • Disease indices at one point of time
  • Data used : MAXT, MINT, RH1, RH2, RF and WS (X1-X6) from
    • 10th to 14th SMW (Rabi or post-rainy season)
    • 41st to 46th SMW (Kharif or rainy season)

Principal component regression
  • Used when the independent variables are many and correlated
  • Independent variables are transformed to principal components
  • The first few principal components explaining the desired variation are selected
  • Regression model is fitted using the principal components as regressors
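
The steps above can be sketched in plain Python. To stay short, this example keeps only the first principal component, finds it by power iteration and then fits a simple regression on its scores; the function names and the data are assumptions for illustration, and a real analysis would normally use a statistics package and retain as many components as needed.

```python
from math import sqrt

def standardize(col):
    """Centre a column and scale it to unit (population) standard deviation."""
    m = sum(col) / len(col)
    s = sqrt(sum((v - m) ** 2 for v in col) / len(col))
    return [(v - m) / s for v in col]

def first_component(columns, iterations=200):
    """Loadings of the first principal component, found by power iteration."""
    n, k = len(columns[0]), len(columns)
    # correlation matrix of the standardized columns
    corr = [[sum(a * b for a, b in zip(columns[i], columns[j])) / n
             for j in range(k)] for i in range(k)]
    v = [1.0 / sqrt(k)] * k   # start vector; must not be orthogonal to the answer
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return v

def simple_regression(x, y):
    """Least-squares intercept and slope of y on a single regressor x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

# Hypothetical, strongly correlated regressors and response (illustration only)
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2 = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
y = [5.2, 6.9, 9.1, 10.8, 13.2, 14.9]

z = [standardize(x1), standardize(x2)]
loadings = first_component(z)
scores = [sum(l * col[t] for l, col in zip(loadings, z)) for t in range(len(y))]
intercept, slope = simple_regression(scores, y)
fitted = [intercept + slope * s for s in scores]
y_mean = sum(y) / len(y)
r_squared = 1 - sum((a - b) ** 2 for a, b in zip(y, fitted)) / sum((a - y_mean) ** 2 for a in y)
```

Because the two regressors are almost collinear, one component carries nearly all of their variation, and regressing on its scores avoids the unstable coefficients that the raw variables would give.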
Discriminant function analysis
  • Years are grouped into categories (low, medium, high) based on disease status
  • A linear / quadratic discriminant function is fitted using weather data in these categories
  • A discriminant score of weather is obtained for each year
  • A regression model is fitted with disease data as the dependent variable and the discriminant scores of weather as independent variables.
  • More data are required than for the other approaches.
  • Can also be used if disease data are qualitative
  • Johnson et al. (1996) used discriminant analysis for forecasting potato late blight.
Deviation method
  • Useful when only 5-6 years of data are available for different periods
  • Week-wise data are not adequate for modelling
  • A combined model is fitted using the complete data.
  • Used so far in pest forewarning, not in disease forewarning
  • Assumption : pest population / disease incidence in a particular year at a given point of time is composed of two components.
    • Natural growth pattern
    • Weather fluctuations
  • The natural pattern is identified using data in different periods averaged over years.
  • Deviations of individual years in different periods from the predicted natural pattern are related to deviations of weather.
  • Mango fruitfly – Lucknow (weekly models)
  • Data used: 1993-94 to 1998-99 on MAXT, MINT and RH – [X1 to X3]
  • Model for natural pattern

t = Week no.

Yt = Fruitfly population count at week t

Forecast model

$Y = -125.766 + 0.665\, Y_2 + 0.115\, (1/X_{22}^{2}) + 10.658\, X_{21}^{2} + 0.0013\, Y_2^{3} + 31.788\, (1/Y_3) - 21.317\, X_{12} - 2.149\, (1/X_{23}^{3}) - 1.746\, (1/X_{23}^{4})$

  • Y = deviation of fruitfly population from the natural cycle
  • Y_i = fruitfly population in the i-th lag week
  • X_ij = deviation from average of the i-th weather variable (i = 1, 2, 3 corresponds to maximum temperature, minimum temperature and relative humidity) in the j-th lag week
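
The first step of the deviation method, separating each year's values into a common pattern and a deviation from it, can be sketched as below. As a simplification, the raw week-wise averages over years stand in for the fitted natural-pattern curve; the function names and all numbers are assumptions for illustration.

```python
def natural_pattern(values_by_year):
    """Average value for each week across years."""
    n_years = len(values_by_year)
    return [sum(week) / n_years for week in zip(*values_by_year)]

def deviations(values_by_year, pattern):
    """Deviation of each year's weekly values from the across-year pattern."""
    return [[v - p for v, p in zip(year, pattern)] for year in values_by_year]

# Hypothetical counts and maximum temperatures: 3 years x 4 weeks (illustration only)
counts = [[10, 20, 40, 30], [14, 26, 44, 36], [12, 23, 45, 33]]
max_temp = [[30.0, 32.0, 34.0, 33.0], [31.0, 33.0, 35.0, 34.0], [32.0, 34.0, 36.0, 35.0]]

pattern = natural_pattern(counts)
count_dev = deviations(counts, pattern)                       # response deviations
temp_dev = deviations(max_temp, natural_pattern(max_temp))    # weather deviations
```

Pooling the deviations from all weeks and years gives many more data points than any single week provides, which is what allows one combined regression of population deviations on weather deviations.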



With the development of computer hardware and software and the rapid computerization of business, huge amounts of data have been collected and stored in centralized or distributed databases.

The data are heterogeneous (a mixture of text, symbolic, numeric, texture and image data), huge (in both dimension and size) and scattered.

The volume of stored data is growing at a phenomenal rate.

As a result, traditional statistical techniques and data management tools are no longer adequate for analyzing this vast collection of data.


One application of information technology that has drawn the attention of researchers is data mining. Pattern recognition, image processing and machine intelligence (the development of algorithms and techniques that allow a system to "learn") are directly related to it.

  • Data mining involves
    • Statistics : provides the background for the algorithms.
    • Artificial intelligence : provides the heuristics the system needs for learning.
    • Data management : provides the platform for storage and retrieval of raw and summary data.

Pattern recognition and machine learning principles are applied to a very large (in both size and dimension) heterogeneous database for knowledge discovery.

Knowledge discovery is the process of identifying valid, novel, potentially useful and ultimately understandable patterns in data. Patterns may include associations, correlations, trends, anomalies, statistically significant structures, etc.

Without "soft computing", machine intelligence and data mining may remain incomplete.

Soft Computing

Soft Computing is a multidisciplinary field proposed by Dr. Lotfi Zadeh, whose goal was to construct a new generation of Artificial Intelligence, known as Computational Intelligence.

The concept of Soft Computing has evolved. Dr. Zadeh defined Soft Computing in its latest incarnation as the fusion of fuzzy logic, neural networks (neuro-computing), evolutionary and genetic computing, and probabilistic computing into one multidisciplinary system.

Soft Computing is the fusion of methodologies designed to model and solve real-world problems that have not been modelled mathematically or are too difficult to model. These problems are typically associated with fuzzy, complex and dynamical systems with uncertain parameters.

These systems are the ones that model the real world and are of most interest to modern science.

The main goal of Soft Computing is to develop intelligent systems and to solve nonlinear and mathematically unmodelled system problems [Zadeh 1993, 1996, and 1999].

The applications of Soft Computing have two main advantages.

First, it made it possible to solve nonlinear problems for which mathematical models are not available.

Second, it introduced human knowledge such as cognition, recognition, understanding and learning into the field of computing.

This made it possible to construct intelligent systems such as autonomous self-tuning systems and automatically designed systems.

Soft computing tools

Soft computing tools include

  • Fuzzy sets
    • Fuzzy sets provide a natural framework for dealing with uncertainty
  • Artificial neural networks
    • Neural networks are widely used for modelling complex functions and provide learning and generalization capabilities
  • Genetic algorithms
    • Genetic algorithms are an efficient search and optimization tool
  • Rough set theory
    • Rough sets help in granular computation and knowledge discovery
Why Neural Networks are desirable

  • The human brain can generalize from the abstract
  • Recognize patterns in the presence of noise
  • Recall memories
  • Make decisions for current problems based on prior experience

Why Desirable in Statistics

  • Prediction of future events based on past experience
  • Ability to classify patterns in memory
  • Prediction of latent variables that are not easily measured
  • Non-linear regression problems

Application of ANNs


  • medical diagnosis
  • signature verification
  • character recognition
  • voice recognition
  • image recognition
  • face recognition
  • loan risk evaluation
  • data mining

Modelling and Control

  • control systems
  • system identification
  • composing music

  • economic indicators
  • energy requirements
  • medical outcomes
  • crop forecasts
  • environmental risks

Neural networks are being successfully applied across an extraordinary range of problem domains, in areas as diverse as finance, medicine, engineering, geology, biology, physics and agriculture.

From a statistical perspective neural networks are interesting because of their potential use in prediction and classification problems.

A very important feature of these networks is their adaptive nature, where “Learning by Example” replaces “Programming” in solving problems.

Basic capability of neural networks is to learn patterns from examples

Types of neural network models

Two types of neural network models

  • Multilayer perceptron (MLP) with different numbers of hidden layers and nodes
  • Radial basis function (RBF)

Neural network based model

Steps in developing a neural network model

  • Forming training, testing and validation sets
  • Neural network model
    • No. of input nodes
    • No. of hidden layers
    • No. of hidden nodes
    • No. of output nodes
    • Activation function
  • Model building
  • Sensitivity analysis

Data sets

The data available is divided into three data sets

Training set represents the input- output mapping, which is used to modify the weights.

Validation set is required only to decide when to stop training the network, and not for weight update.

Test set is the part of collected data that is set aside to test how well a trained neural network generalizes.
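
A three-way split of this kind might be done as follows. The 60/20/20 proportions, the random shuffle and the function name are assumptions for illustration; the slides do not state how the sets were formed.

```python
import random

def split_data(records, train_frac=0.6, valid_frac=0.2, seed=0):
    """Shuffle records and split them into training, validation and test sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)   # fixed seed so the split is repeatable
    n_train = round(len(shuffled) * train_frac)
    n_valid = round(len(shuffled) * valid_frac)
    train = shuffled[:n_train]                    # used to update the weights
    valid = shuffled[n_train:n_train + n_valid]   # used only to decide when to stop
    test = shuffled[n_train + n_valid:]           # held back to check generalization
    return train, valid, test

train, valid, test = split_data(range(20))
```

For data ordered in time, holding out the most recent years instead of a random sample is often the more realistic test of a forewarning model.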

  • No. of input nodes : more than one
  • No. of hidden layers : one or two
  • No. of hidden nodes : decided by various rules
  • No. of output nodes : one
  • Activation function : hyperbolic tangent (tanh)

Activation function:

Activation functions determine the output of a processing node. Non-linear functions such as the logistic and tanh functions have been used as activation functions.

Sigmoid-type activation functions are commonly used because they are non-linear and continuously differentiable, both of which are desirable for network learning.

Logistic activation functions are mainly used for classification problems, which involve learning about average behaviour.

Hyperbolic tangent functions are used when the problem involves learning about deviations from the average, as in forecasting.

Therefore, in the present study, the hyperbolic tangent (tanh) function has been used as the activation function for neural network models based on the MLP architecture.
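
The two activation functions differ mainly in their output range, which the short sketch below makes concrete. The function names and sample points are assumptions for illustration.

```python
from math import exp, tanh

def logistic(x):
    """Logistic (sigmoid) activation; output lies in (0, 1)."""
    return 1.0 / (1.0 + exp(-x))

def tanh_from_logistic(x):
    """tanh written via the logistic function: tanh(x) = 2 * logistic(2x) - 1."""
    return 2.0 * logistic(2.0 * x) - 1.0

samples = [-2.0, -0.5, 0.0, 0.5, 2.0]
logistic_out = [logistic(x) for x in samples]   # strictly between 0 and 1
tanh_out = [tanh(x) for x in samples]           # between -1 and 1, zero at x = 0
```

tanh is a rescaled and recentred logistic curve: its output is centred on zero, which suits targets expressed as deviations above or below an average.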




Learning of ANNs

The most significant property of a neural network is that it can learn from its environment and improve its performance through learning.

Learning is the process of modifying the weights of the network.

The network becomes more knowledgeable about its environment after each iteration of the learning process.

There are two main learning paradigms

  • Supervised learning
  • Unsupervised learning

A learning cycle in the MLP (Backpropagation Learning Algorithm)

[Figure: learning cycle. Diagram labels: Input vector, ANN model, Output vector, Target vector, Adjust weights]
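
One way to make the learning cycle concrete is a very small MLP trained by backpropagation. This is a sketch under assumptions, not the network used in the study: one hidden layer of tanh nodes, a single linear output node, per-sample weight updates, and invented training data; the learning rate, number of hidden nodes and number of epochs are arbitrary choices.

```python
import math
import random

def train_mlp(inputs, targets, n_hidden=3, rate=0.1, epochs=500, seed=1):
    """Train a one-hidden-layer MLP (tanh hidden nodes, linear output).

    Returns (predict, history); history holds the mean squared error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(inputs[0])
    # each weight list ends with a bias term
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(x):
        hidden = [math.tanh(sum(w * v for w, v in zip(ws, x)) + ws[-1]) for ws in w_hidden]
        output = sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1]
        return hidden, output

    history = []
    for _ in range(epochs):
        total = 0.0
        for x, target in zip(inputs, targets):
            hidden, output = forward(x)      # input vector -> output
            error = output - target          # compare output with target
            total += error ** 2
            # adjust weights: pass the error back through the network
            for j, h in enumerate(hidden):
                delta_h = error * w_out[j] * (1.0 - h * h)   # tanh derivative
                w_out[j] -= rate * error * h
                for i, v in enumerate(x):
                    w_hidden[j][i] -= rate * delta_h * v
                w_hidden[j][-1] -= rate * delta_h
            w_out[-1] -= rate * error
        history.append(total / len(inputs))
    return (lambda x: forward(x)[1]), history

# Hypothetical training data: one input, target sin(x) on [-1, 1]
inputs = [[x / 10.0] for x in range(-10, 11, 2)]
targets = [math.sin(v[0]) for v in inputs]
predict, history = train_mlp(inputs, targets)
```

Each pass through the inner loop is one cycle of the figure: present an input vector, compute the output, compare it with the target, and adjust the weights in proportion to the error.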


  • Alternaria blight (Varuna, Rohini & Binoy)
    • Bharatpur (Raj)
    • Berhampur (WB)
    • Dholi (Bihar)
  • Powdery mildew (Varuna and GM2)
  • Variable to forewarn
    • crop age at first appearance of disease
    • crop age at peak severity of disease
    • maximum severity of disease
  • Bacterial blight (% of disease incidence) - Akola

Pests / diseases forewarning-Mustard

Data were taken from the Mission Mode Project under the National Agricultural Technology Project entitled "Development of weather based forewarning system for crop pests and diseases", at CRIDA, Hyderabad.

Models were developed for forecasting different aspects of Alternaria Blight (AB) and Powdery Mildew (PM) in the mustard crop.

The field trials were sown on 10 dates at weekly intervals (01, 08, 15, 22, 29 October; 05, 12, 19, 26 November; and 03 December) at each of the locations, viz. Bharatpur, Dholi and Berhampur for Alternaria Blight and S.K. Nagar for Powdery Mildew.

Data for different dates of sowing were taken together for model development.

Weekly data on weather variables from the week of sowing up to six weeks of crop growth were considered.

Forewarning models were developed for two varieties of mustard at each location:

  • Alternaria Blight on leaf and pod (Varuna and Rohini at Bharatpur, Varuna and Binoy at Berhampur, Varuna and Pusa Bold at Dholi)
  • Powdery Mildew on leaf (Varuna and GM2 at S.K. Nagar)

Models were validated using data from subsequent years not included in model development.


Mean Absolute Percentage Error of various models at Bharatpur in different varieties in mustard crop for Alternaria blight (AB) - 2006-07
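
Mean absolute percentage error, the measure named in this caption, can be computed as below. The function name and the sample values are assumptions for illustration; they are not results from the study.

```python
def mape(observed, predicted):
    """Mean absolute percentage error, in percent; observed values must be non-zero."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    total = sum(abs((o - p) / o) for o, p in zip(observed, predicted))
    return 100.0 * total / len(observed)

# Hypothetical observed and predicted severities (illustration only)
error_pct = mape([20.0, 40.0, 50.0], [18.0, 44.0, 50.0])
```

Because each error is divided by the observed value, MAPE is undefined when an observation is zero and grows large when observations are close to zero.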

Neural networks, with their remarkable ability to derive meaning from complicated or imprecise data, can be used to extract patterns and classifications

Neural networks do not perform miracles. But if used sensibly they can produce some amazing results

Model for qualitative data
  • Data in categories
  • Occurrence / non-occurrence, low / medium / high, etc.
  • Classified as 0 / 1 (2 categories); 0,1,2 (three categories)
  • Quantitative data / mixed data can be converted to categories

Logistic Regression model

$P = \dfrac{1}{1 + \exp(-L)} + e$

  • where L = β0 + β1x1 + β2x2 + … + βnxn
  • x1, x2, …, xn are weather variables / weather indices
  • e = random error
  • Forecast / Prediction rule
    • If P < 0.5, the probability of epidemic occurrence is minimal
    • If P ≥ 0.5, an epidemic is more likely to occur
  • Leaf blast severity (%) - Palampur, at one point of time
  • Data used: 1991-92 to 1998-99 on MAXT, MINT, RH1, RH2, BSH & RF (X1 to X6) from 23rd to 31st SMW.
  • Model :

$L = 394.8 - 0.0520\, Z_{351} - 1.5414\, Z_{10}$
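
The prediction rule can be sketched as follows, assuming the usual logistic form P = 1 / (1 + exp(−L)). The function names and the index values passed in at the end are hypothetical; only the coefficients come from the Palampur model quoted above.

```python
from math import exp

def epidemic_probability(l_value):
    """Logistic transform of the linear predictor L, written to avoid overflow."""
    if l_value >= 0:
        return 1.0 / (1.0 + exp(-l_value))
    z = exp(l_value)
    return z / (1.0 + z)

def forewarn(l_value, cutoff=0.5):
    """Return (P, flag); flag is True when P >= cutoff, i.e. an epidemic is likely."""
    p = epidemic_probability(l_value)
    return p, p >= cutoff

def leaf_blast_l(z351, z10):
    """Linear predictor of the leaf blast model: L = 394.8 - 0.0520 Z351 - 1.5414 Z10."""
    return 394.8 - 0.0520 * z351 - 1.5414 * z10

# Hypothetical index values, invented for illustration only
p, likely = forewarn(leaf_blast_l(z351=100.0, z10=250.0))
```

Since P ≥ 0.5 exactly when L ≥ 0, the rule amounts to checking the sign of the linear predictor; the probability adds a measure of how far the season is from the borderline.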

  • Validation for subsequent years :

Alternaria blight and White rust

  • Data used: 1987-88 to 1998-99 on MAXT, MINT, RH1, RH2 and BSH (X1 to X5) from week of sowing (n1) to 50th SMW (n2)

Model for Alternaria blight

$L = -8.8347 + 0.0163\, Z_{120} - 0.00037\, Z_{130} - 0.00472\, Z_{450}$

Model for White rust

$L = 5.8570 - 0.0293\, Z_{40} + 0.00264\, Z_{230}$

  • Forecasts of subsequent years are
Within year model
  • Model using only one year’s data
    • Data availability for several dates of sowing
    • If adequate dates of sowing, models similar to between-year models could be developed
  • Use for forewarning subsequent years (?)
  • Model for single date of sowing
    • Forewarning of maximum disease severity
    • Applicable when there are 10-12 observations between first disease appearance and maximum disease severity
    • Non-linear growth model fitted to the disease development pattern using partial data


  • Alternaria blight cv. Varuna (% disease severity) - Kumarganj
  • Data used: 1999-2000
  • Model :
  • $Y_t = A \exp(B/t)$
  • Y_t = percent disease severity (PDS) at time t; A and B are parameters
  • t = week after sowing (1, 2, …)
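
One simple way to fit this growth model is to take logarithms, which turns it into a straight line: ln Y_t = ln A + B (1/t). The sketch below uses that log-linear least-squares fit as a stand-in, since the slides do not say which estimation method was used; the function name and the data, generated from known parameters, are assumptions for illustration.

```python
from math import exp, log

def fit_growth(weeks, pds):
    """Fit Y_t = A * exp(B / t) by least squares on ln Y_t = ln A + B * (1 / t)."""
    x = [1.0 / t for t in weeks]
    y = [log(v) for v in pds]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    b = sum((a - mx) * (c - my) for a, c in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    a = exp(my - b * mx)
    return a, b

# Hypothetical partial-season data generated from A = 60, B = -8 (illustration only)
weeks = [4, 5, 6, 7, 8]
pds = [60.0 * exp(-8.0 / t) for t in weeks]

a_hat, b_hat = fit_growth(weeks, pds)
forecast_week_12 = a_hat * exp(b_hat / 12)   # severity forecast for a later week
```

With B negative, Y_t rises towards A as t increases, so the fitted A acts as an upper bound on the forecast severity.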

Observed, predicted and forecasts of max. percent disease severity (PDS)

  • Reliable forecast of max. pds could be obtained for 2 weeks in advance
Models developed at IASRI
  • Sugarcane
    • Pyrilla
    • Early shoot borer &
    • Top borer
  • Pigeon pea
    • Pod fly
    • Pod borer
    • Sterility Mosaic
    • Phytophthora Blight
  • Rice
    • BPH
    • Gall midge
  • Mango
    • Powdery Mildew
    • hoppers
    • fruit-fly
  • Mustard
    • Alternaria Blight
    • White Rust
    • Powdery Mildew
    • Aphid
  • Cotton
    • American boll worm
    • Pink boll worm
    • Spotted boll worm
    • Whitefly
  • Groundnut
    • Spodoptera litura
    • Late leaf spot
    • Rust
  • Onion
    • Thrips
References

  • Agrawal, Ranjana, Jain, R.C. and Jha, M.P. (1983). Joint effects of weather variables on rice yields. Mausam, 34(2), 177-181.
  • Agrawal, Ranjana, Jain, R.C. and Jha, M.P. (1986). Models for studying rice crop weather relationship. Mausam, 37(1), 67-70.
  • Agrawal, Ranjana, Mehta, S.C., Kumar, Amrender and Bhar, L.M. (2004). Development of weather based forewarning system for crop pests and diseases. Report from IASRI, Mission mode project under NATP, PI: Dr. Y.S. Ramakrishna, CRIDA, Hyderabad.
  • Denton, J.W. (1995). How good are neural networks for causal forecasting? Journal of Business Forecasting, 14(2), 17-20.
  • Desai, A.G., Chattopadhyay, C., Agrawal, Ranjana, Kumar, A., Meena, R.L., Meena, P.D., Sharma, K.C., Rao, M. Srinivasa, Prasad, Y.G. and Ramakrishna, Y.S. (2004). Brassica juncea powdery mildew epidemiology and weather-based forecasting models for India - a case study. Journal of Plant Diseases and Protection, 111(5), 429-438.
  • Gaudart, J., Giusiano, B. and Huiart, L. (2004). Comparison of the performance of multi-layer perceptron and linear regression for epidemiological data. Computational Statistics & Data Analysis, 44, 547-570.
  • Hebb, D.O. (1949). The Organization of Behavior: A Neuropsychological Theory. Wiley, New York.
  • Hopfield, J.J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences (USA), 79, 2554-2558.
  • Kaastra, I. and Boyd, M. (1996). Designing a neural network for forecasting financial and economic time series. Neurocomputing, 10(3), 215-236.
  • Masters, T. (1993). Practical Neural Network Recipes in C++. Academic Press, San Diego.
  • Rosenblatt, F. (1958). The perceptron: A probabilistic model for information storage and organization in the brain. Psychological Review, 65, 386-408.
  • Rumelhart, D.E., Hinton, G.E. and Williams, R.J. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536.
  • Sanzogni, Louis and Kerr, Don (2001). Milk production estimates using feed forward artificial neural networks. Computers and Electronics in Agriculture, 32, 21-30.
  • Warner, B. and Misra, M. (1996). Understanding neural networks as statistical tools. American Statistician, 50, 284-293.
  • Widrow, B. and Hoff, M.E. (1960). Adaptive switching circuits. IRE WESCON Convention Record, 4, 96-104.
  • Zhang, G., Patuwo, B.E. and Hu, M.Y. (1998). Forecasting with artificial neural networks: The state of the art. International Journal of Forecasting, 14, 35-62.