
| Year | Outbreak | Max temp | Rel humidity |
|------|----------|----------|--------------|
| 1987 | Yes | 30.14 | 82.86 |
| 1988 | No | 30.66 | 79.57 |
| 1989 | No | 26.31 | 89.14 |
| 1990 | Yes | 28.43 | 91 |
| 1991 | No | 29.57 | 80.57 |
| 1992 | Yes | 31.25 | 67.82 |
| 1993 | No | 30.35 | 61.76 |
| 1994 | Yes | 30.71 | 81.14 |
| 1995 | No | 30.71 | 61.57 |
| 1996 | Yes | 33.07 | 59.76 |
| 1997 | No | 31.5 | 68.29 |
| 2000 | No | 29.5 | 79.14 |

1. In order for the model to serve as a forewarning system for farmers, what requirements must be satisfied regarding data availability?

   Maximum temperature and relative humidity, the two predictors, must be available at the time the forecast is made.

2. Write an equation for the model fitted by the researchers in the form of equation (8.1). Use predictor names instead of x notation.

   $\text{Odds} = \dfrac{p}{1 - p}$

   $\log(\text{odds}) = \beta_0 + \beta_1 \cdot (\text{Max temperature}) + \beta_2 \cdot (\text{Relative humidity})$

3. Create a scatter plot of the two predictors, using different hue for epidemic and non-epidemic markers. Does there appear to be a relationship between epidemic status and the two predictors?

   There are 5 outbreaks in the data. Of the four years with the highest relative humidity, an outbreak occurred in 3. Of the three years with the highest maximum temperature, an outbreak occurred in 2. This suggests that outbreaks tend to occur at high temperatures and high relative humidity, although with only 12 observations the evidence is weak.

   Logistic regression fitted to the training period using the two predictors, where $p$ is the probability of an outbreak:

   $\log\left(\dfrac{p}{1 - p}\right) = -56.1543 + 1.3849 \cdot (\text{MaxTemp}) + 0.1877 \cdot (\text{RelHumidity})$

4. Compute naive forecasts of epidemic status for years 1995-1997 using next-year forecasts ($F_{t+1} = F_t$). What is the naive forecast for year 2000? Summarize the results for these four years in a classification matrix.

   Use a roll-forward naive forecast. The naive forecasts are: 1995: Yes, 1996: No, 1997: Yes, 1998: No. (e1071 had to be installed to complete this step.) The naive forecasts were correct for only 25% of the years. The forecast for year 2000 is No outbreak.

5. Partition the data into training and validation periods, so that years 1987-1994 are the training period. Fit a logistic regression to the training period using the two predictors, and report the outbreak probability as well as a forecast for year 1995 (use a threshold of 0.5).

6. Generate outbreak forecasts for years 1996, 1997, and 2000 by repeatedly moving the training period forward. For example, to forecast year 1996, partition the data so that years 1987-1995 are the training period. Then fit the logistic regression model and use it to generate a forecast (use a threshold of 0.5).

7. Summarize the logistic regression's predictive accuracy for these four years (1995-1997, 2000) in a classification matrix.

8. Does the logistic regression outperform the naive forecasts?

   The forecasts from the logistic regression model are an improvement over the roll-forward naive forecasts: accuracy went up from 25% to 75%. The logistic regression model therefore outperforms the naive forecasts on these four years.

9. For year 1997, there is some uncertainty regarding the data quality of the outbreak status. According to the logistic regression model, is it more likely that an outbreak occurred or not?

10. If we fit a logistic regression with a lagged outbreak predictor, such as $\log(\text{odds})_t = \beta_0 + \beta_1 \cdot (\text{Outbreak})_{t-1}$, to years 1987-1997, how can this model be used to forecast an outbreak in year 2000?
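
The roll-forward naive forecasts from question 4, and the probability implied by the fitted equation quoted earlier, can be checked by hand with a short Python sketch. This is an illustration rather than the original analysis (which appears to have been done in R): the function names are hypothetical, the status values are copied from the table, and the coefficients are the ones reported in the text. The text does not say which training period produced those coefficients, so the 1997 probability below is only indicative for question 9.

```python
import math

# Outbreak status by year, copied from the table (True = outbreak).
outbreak = {
    1987: True, 1988: False, 1989: False, 1990: True,
    1991: False, 1992: True, 1993: False, 1994: True,
    1995: False, 1996: True, 1997: False, 2000: False,
}


def naive_forecasts(series, target_years):
    """Roll-forward naive forecast: use the latest earlier observed status."""
    forecasts = {}
    for target in target_years:
        previous = max(year for year in series if year < target)
        forecasts[target] = series[previous]
    return forecasts


def classification_matrix(actual, predicted):
    """Count (actual, forecast) pairs over the forecast years."""
    counts = {(a, f): 0 for a in (True, False) for f in (True, False)}
    for year, forecast in predicted.items():
        counts[(actual[year], forecast)] += 1
    return counts


def outbreak_probability(max_temp, rel_humidity):
    """Probability from the coefficients reported in the text (assumed to apply)."""
    log_odds = -56.1543 + 1.3849 * max_temp + 0.1877 * rel_humidity
    return 1.0 / (1.0 + math.exp(-log_odds))


validation_years = [1995, 1996, 1997, 2000]
forecasts = naive_forecasts(outbreak, validation_years)
matrix = classification_matrix(outbreak, forecasts)
accuracy = (matrix[(True, True)] + matrix[(False, False)]) / len(validation_years)

print(forecasts)  # 1995: True, 1996: False, 1997: True, 2000: False
print(accuracy)   # 0.25

# 1997 predictors from the table: max temp 31.5, relative humidity 68.29.
p_1997 = outbreak_probability(31.5, 68.29)
print(round(p_1997, 2))  # about 0.57, i.e. above the 0.5 threshold
```

Because 1998 and 1999 are missing from the data, the forecast for 2000 simply carries the 1997 status forward. Under the stated assumption about the coefficients, the model puts the 1997 outbreak probability slightly above 0.5, so it leans toward an outbreak having occurred, though not strongly.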