1. Which of the following statistical measures is affected by outliers in the data?
(a) the standard deviation.
(b) the mean.
(c) the correlation coefficient.
(d) the median.
(e) all of the above.
(f) only (a), (b), and (c).
Explain why you did, or did not, pick (a).
Explain why you did, or did not, pick (b).
Explain why you did, or did not, pick (c).
Explain why you did, or did not, pick (d).

2. (10 points) Suppose a social scientist, investigating death certificates, records the age at which fifteen people died. The ages are displayed in the following stemplot:

leaf unit = 1    N = 15

(a) Describe the shape [mode, skewness/symmetry, and outliers] of the distribution.
(b) Find the median and the mean. Show your calculations to derive the mean.

3. (10 points) The histogram and some descriptive statistics for a dataset are shown:

Descriptive Statistics: C1
N    Mean    Standard Deviation  Minimum  Q1      Median  Q3      Maximum
520  1.2975  1.6834              0.00189  0.3187  0.7326  1.5733  8.0000

(a) Report (or compute as necessary) the value of the most appropriate measure of center, and the value of the most appropriate measure of spread. If you compute, show your calculations.
(b) There are some data values that are clear outliers in the plot. If these outlying values were removed, the effect on mean and standard deviation would be which one of the following:
(i) Mean and standard deviation would both increase.
(ii) Mean would increase, but standard deviation would decrease.
(iii) Mean and standard deviation would both decrease.
(iv) Mean would decrease, but standard deviation would increase.
(v) Neither mean nor standard deviation would be affected.
Explain why you did, or did not, pick (i).
Explain why you did, or did not, pick (ii).
Explain why you did, or did not, pick (iii).
Explain why you did, or did not, pick (iv).
Explain why you did, or did not, pick (v).

4.
(10 points) The boxplot below displays the delivery time for 1,000 deliveries from a certain pizza shop.
(a) 50% of the delivery times were (choose the correct answer and EXPLAIN why you DID pick that answer and DID NOT pick the other possible answers, for a total of 5 explanations):
(i) Between 5 and 50 minutes.
(ii) Greater than 50 minutes.
(iii) Between 10 and 25 minutes.
(iv) Below 10 minutes.
(v) Greater than 15 minutes.
(b) 25% of the delivery times were (choose the BEST answer and EXPLAIN why you DID pick that answer and DID NOT pick the other possible answers, for a total of 5 explanations):
(i) Below 5 minutes.
(ii) Between 15 and 30 minutes.
(iii) Between 30 and 45 minutes.
(iv) Greater than 50 minutes.
(v) Greater than 35 minutes.
(c) Based on the plot, if THERE WERE an outlier at 60, the mean delivery time of the 1,000 deliveries is most likely (choose the correct answer):
(i) 15 minutes.
(ii) Greater than 15 minutes.
(iii) Less than 15 minutes.
(iv) 15/1000 minutes.
Explain why you chose the answer you picked.

5. (5 points) Suppose that the distribution of lengths of time for connection between a student’s dorm computer and the remotely-located University server is normal in shape, with mean of 5 seconds and standard deviation of 1.2 seconds. The middle 95% of all connections will occur between what times?

6. (10 points) What are the effects of exposure to an advertising message? The answer may depend both on the length of the ad and on how often it is repeated.
A study investigated this question using 80 undergraduate students as subjects. In order to make sure that the sample is balanced in terms of gender, the researchers randomly selected 40 male students, and randomly selected 40 female students.
All the students saw a 40-minute television program that included ads for a digital camera.
The length of the commercial was either 30 seconds or 90 seconds, and it was repeated either 1, 3, or 5 times during the program.
The subjects were randomized to the different treatments, and at the end of the show rated their intention of purchasing the camera on a scale from 1 to 10.
(a) This study is: (pick one) controlled experiment / observational study
(b) What are the explanatory and response variables?
(c) How many treatments are there altogether in this study?
(d) Can you draw causal conclusions from this study? If not, explain why. If yes, what feature of this study allows you to do so? (one sentence is enough!!)
(e) The sampling method that was used in this study is: (indicate the correct answer)
Simple random sampling / stratified sampling / convenience sampling / voluntary response / cluster sampling.
Explain why you picked the answer you picked.

7. (10 points) A random sample of 26 people was selected. Each person was asked about their daily consumption of olive oil, and their cholesterol value. The results are represented in the following plot:
(a) For the plot shown, choose the most reasonable correlation coefficient; r is (pick one):
(i) -2.1
(ii) -0.8
(iii) -0.03
(iv) 0
(v) 0.5
(vi) -1
Explain why you chose the answer you picked.

Suppose the equation of the least-squares regression line for predicting cholesterol point value from grams of oil consumed is:
cholesterol = 202 - 4.5 oil
(b) Complete the following:
For each extra gram of olive oil consumed, cholesterol value is predicted to increase / decrease (pick one), by _____________ points (fill in the blank).
(c) What is the predicted cholesterol level for a person who consumes 3 grams of olive oil daily?
(d) True or false: We can conclude from this study that consuming olive oil causes changes in cholesterol level. Briefly explain.

8. (10 points) The question whether juvenile delinquency is related to birth order was examined in a large study.
A total of 1,060 boys attending public school were given a questionnaire that measures delinquent behavior, and each boy also indicated his birth order. The results are summarized in the following two-way table:

             Delinquent   Not Delinquent
Oldest       77           56
In-Between   59           37
Youngest     52           53
334

(a) What is the explanatory variable, and what is the response variable?
(b) Based on your answer to (a), use the empty table below to supplement the two-way table with the appropriate conditional percentages.

             Delinquent   Not Delinquent
Oldest
In-Between
Youngest

(c) Interpret your results in the context of the question. That is, based on part (b), write a brief description of the relationship between birth order and juvenile delinquency as it appears from the data.
(d) Complete the following sentence:
Since this is _________________________ (choose: a randomized experiment or an observational study), we ______________________ (choose: can or cannot) conclude that birth order is the cause for delinquent behavior.

9. (5 points) When conducting a survey, it is important to use a random sample in order to:
(a) get a sample that represents the population well.
(b) reduce bias resulting from poorly worded questions.
(c) reduce bias resulting from poorly ordered questions.
(d) reduce bias resulting from sensitive questions.
(e) None of the above.
Explain why you chose the answer you picked.

10. (10 points) Consider the following types of displays:
(1) histogram (2) pie-chart (3) scatterplot (4) two-way table (5) side-by-side boxplots
In each of the following situations data is recorded. Indicate which of the five choices above is the most appropriate display for the data and why it is the most appropriate.
(a) A social scientist studying racial bias in the court system records the race and guilty verdict for 300 people on trial.
[At least some people were found guilty and some people not guilty.]
(b) A health scientist investigates how well we can predict an athlete’s maximum bench press weight (a measure of his/her strength) from knowing the number of 60-pound bench presses that he/she can perform (before fatigue). The relevant data was collected from 125 athletes.
(c) A USA Today poll asked a sample of single men (ages 18-44) the following question:
If I had an “X-rated” bachelor party, I’d…
The possible answers were: (i) tell fiancé all (ii) edit details (iii) say nothing. Data was recorded.
(d) Does cell phone use while driving impair reaction times? A recent experiment compared the reaction times (in milliseconds) of drivers who were engaged in a conversation on a cell phone to drivers who were not.
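
Note on Question 7(c): the prediction asked for there is a direct substitution into the stated least-squares line, cholesterol = 202 - 4.5 oil. The short Python sketch below shows one way to check that arithmetic. Only the coefficients 202 and 4.5 come from the question; the function name predict_cholesterol and its parameter name are illustrative assumptions, not part of the exam.

```python
# Coefficients taken from the line stated in Question 7:
#   cholesterol = 202 - 4.5 * oil
INTERCEPT = 202.0  # predicted cholesterol value at 0 grams of oil
SLOPE = -4.5       # predicted change in cholesterol per extra gram of oil


def predict_cholesterol(oil_grams: float) -> float:
    """Hypothetical helper: substitute a daily oil amount (grams) into the line."""
    return INTERCEPT + SLOPE * oil_grams


if __name__ == "__main__":
    # Question 7(c) asks about a person who consumes 3 grams daily.
    print(predict_cholesterol(3))
```

A prediction of this kind is only a sketch of what the fitted line says; it is most trustworthy for oil amounts inside the range seen in the sample of 26 people.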