Miami University Simple and Multiple Logistic Regression Models Discussion

Question Description

Directions: Complete the following questions. The questions have been separated into 4 parts of similar material. Parts 1, 2, and 3 will only use the corona_train data, while Part 4 will use the corona_test data. Use the Markdown starter file hw7_starter.Rmd.

Part 1 – Odds

1. Using the training dataset, compute the odds that a county has reported a Coronavirus-related death. (2pts)

2. Do the odds of a Coronavirus-related death vary by Census region? Compute the odds that a county has reported a Coronavirus-related death for each Census region within the United States. Compare these values to address the question. (3pts)
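As a rough illustration of the arithmetic behind Part 1, the sketch below computes odds from counts. It is written in plain Python for illustration only (the assignment itself points to xtabs() in R), and the region names and counts are made up, not taken from corona_train.

```python
def odds(events, non_events):
    """Odds = P(event) / P(no event), which reduces to events / non_events."""
    if non_events == 0:
        raise ValueError("odds are undefined when there are no non-events")
    return events / non_events

# Hypothetical counts (NOT from corona_train):
# region -> (counties with a reported death, counties without)
counts = {
    "Region A": (30, 70),
    "Region B": (60, 40),
}

# Overall odds pool all regions together.
total_with = sum(with_death for with_death, _ in counts.values())
total_without = sum(without for _, without in counts.values())
overall_odds = odds(total_with, total_without)

# Odds within each region, and an odds ratio to compare two regions.
region_odds = {region: odds(w, wo) for region, (w, wo) in counts.items()}
odds_ratio_b_vs_a = region_odds["Region B"] / region_odds["Region A"]

print(f"overall odds: {overall_odds:.3f}")
for region, value in region_odds.items():
    print(f"{region}: odds = {value:.3f}")
print(f"odds ratio (B vs A): {odds_ratio_b_vs_a:.2f}")
```

Odds above 1 mean the event is more likely than not. A ratio of two regions' odds is one simple way to compare them.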

Part 2 – Simple Logistic Regression

3. Build a plot (or plots) to explore how the logarithm of the population density predicts whether a county has recorded a coronavirus-related death. Briefly discuss the results of your plot. (2pts)

4. Build a simple logistic model to statistically determine if the logarithm of the population density predicts the probability a county has reported a Coronavirus-related death. Support your findings with an appropriate hypothesis test. (3pts)
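To make the idea of a simple logistic fit and its hypothesis test concrete, here is a minimal pure-Python sketch. It is a stand-in for what glm() with a binomial family does in R, not the assignment's code. It fits one predictor by Newton-Raphson and then runs a likelihood ratio test against the intercept-only model. The data are a tiny made-up sample, and the function names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_likelihood(b0, b1, xs, ys):
    """Bernoulli log-likelihood of the model logit(p) = b0 + b1 * x."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = sigmoid(b0 + b1 * x)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return total

def fit_simple_logistic(xs, ys, iterations=25):
    """Newton-Raphson for one predictor plus an intercept (hypothetical helper)."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = sigmoid(b0 + b1 * x)
            w = p * (1 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of X'WX
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Made-up data (NOT corona_train): x plays the role of log population density.
xs = [0, 1, 2, 3, 4, 5, 6, 7]
ys = [0, 0, 1, 0, 1, 0, 1, 1]

b0, b1 = fit_simple_logistic(xs, ys)
ll_full = log_likelihood(b0, b1, xs, ys)

# Intercept-only model: every observation gets the overall event proportion.
p_bar = sum(ys) / len(ys)
ll_null = sum(y * math.log(p_bar) + (1 - y) * math.log(1 - p_bar) for y in ys)

# Likelihood ratio statistic, compared to a chi-square with 1 degree of freedom.
lr_stat = 2 * (ll_full - ll_null)
p_value = math.erfc(math.sqrt(lr_stat / 2))  # upper tail of chi-square(1)

print(f"slope = {b1:.3f}, LR statistic = {lr_stat:.3f}, p-value = {p_value:.3f}")
```

A small p-value would suggest the predictor improves on the intercept-only model. With only eight made-up observations, the result here carries no real meaning.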

Part 3 – Multiple Logistic Regression Models

5. Fit a multiple logistic regression model with the census region, the logarithm of population density, the cumulative coronavirus rate, the median county age, the median income, the percent of the county that are U.S. citizens, the percent with a college degree, the percent of the population that are veterans of the U.S. armed services, the percent with healthcare and the percent that voted for President Trump in the 2016 general election to predict the probability a county has reported a Coronavirus-related death. Conduct an appropriate test to determine whether this model significantly predicts the probability a county has reported a Coronavirus-related death. (3pts)

6. Perform a backward selection procedure on the model from question 5. Which variable(s) has/have been removed from the model? (2pts)

7. We will now continue a backward selection procedure, but this time using a likelihood ratio test. Using the drop1() function to determine which predictors are significant, iteratively remove all insignificant predictors from the model in question 6. That is, look at the drop1() output from the model in question 6, refit the model after removing all insignificant terms, look at the drop1() output, refit the model after removing all insignificant terms… Continue this process until all predictors are significant. What predictor variables remain in the model? (4pts)

8. The starter file contains some code to help you along on this problem. Build a table to compare the AIC, BIC and a Pseudo-R-squared for the models fit in questions 5, 6 and 7. Which model is best with respect to each metric? (3pts)

9. Code was supplied for a Pseudo-R-squared calculation in question 8. Explain how this value mimics that of the traditional R-squared value used in multiple linear regression. (2pts)

10. For the model with the best BIC, of those fit in questions 5, 6, or 7, interpret the coefficient regionWest. Be sure to explain this coefficient in terms of odds (not log-odds, which do not provide a nice interpretation). How does this compare to the results in question 2? Why might they be similar/different? (3pts)
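The comparison metrics and the odds interpretation in Part 3 come down to a few formulas, sketched below in plain Python. Several things here are assumptions: the Pseudo-R-squared is taken to be McFadden's version (the starter file may define it differently), the log-likelihoods, parameter counts, and sample size are invented, and the coefficient value is a placeholder rather than a fitted regionWest estimate.

```python
import math

def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 * log-likelihood (lower is better)."""
    return 2 * k - 2 * log_lik

def bic(log_lik, k, n):
    """Bayesian information criterion: k * ln(n) - 2 * log-likelihood (lower is better)."""
    return k * math.log(n) - 2 * log_lik

def mcfadden_r2(log_lik_model, log_lik_null):
    """McFadden pseudo-R-squared (assumed definition): 1 - LL_model / LL_null.
    It has the same shape as 1 - SSE/SST in linear regression, with
    log-likelihoods standing in for sums of squares."""
    return 1 - log_lik_model / log_lik_null

# Hypothetical values (NOT from the assignment's models).
n = 1000            # number of observations
ll_null = -150.0    # intercept-only log-likelihood
models = {          # name -> (log-likelihood, number of estimated parameters)
    "larger": (-100.0, 5),
    "smaller": (-101.0, 3),
}

table = {
    name: {
        "AIC": aic(ll, k),
        "BIC": bic(ll, k, n),
        "pseudo_R2": mcfadden_r2(ll, ll_null),
    }
    for name, (ll, k) in models.items()
}
for name, row in table.items():
    print(name, {metric: round(value, 3) for metric, value in row.items()})

# Turning a log-odds coefficient into an odds statement.
hypothetical_coef = 0.7                      # placeholder, not a fitted value
odds_multiplier = math.exp(hypothetical_coef)
print(f"odds are multiplied by about {odds_multiplier:.2f} versus the reference level")
```

In this made-up table the metrics disagree: the pseudo-R-squared favors the larger model, while AIC and BIC, which penalize extra parameters, favor the smaller one. Exponentiating a coefficient gives the multiplicative change in odds relative to the reference level, holding the other predictors fixed.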

Part 4 – Prediction

11. We will use three fitted models built above to predict whether a county in the testing dataset will have a Coronavirus-related death. Some code is supplied in the starter file; edit and replicate it so that it makes predictions using all three models. Briefly describe what this code is doing. (2pts)

12. Calculate and discuss the accuracy, sensitivity and specificity for all three models to predict if a county has reported a Coronavirus-related death. Which model appears to be the best model at predicting if a county has a Coronavirus-related death? Code is provided for the confusion matrix of the first model. Replicate this code to generate the confusion matrices for the other two models. (6pts)

13. Using the best model from the previous question, compute the sensitivity and specificity if the probability threshold (the 0.5 provided in the code for question 11) were 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and 0.9. Use these values to complete the table in the starter file. Which threshold appears to be the best choice? (5pts)

NOTE: the ideas of sensitivity and specificity are VERY relevant in today’s society as scientists develop tests for the COVID-19 Coronavirus, both for antibodies and for detection of the disease. We felt it prudent to introduce these topics under the current circumstances.
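The following plain-Python sketch shows how accuracy, sensitivity, and specificity follow from a confusion matrix, and how they shift as the probability threshold moves. The predicted probabilities and outcomes are invented, and the rule "predict a death when the probability is at least the threshold" is an assumption; the starter file's code may use a different convention.

```python
def confusion_counts(probs, actual, threshold):
    """Count true/false positives and negatives at a given threshold.
    Assumes a positive prediction when prob >= threshold."""
    tp = fp = tn = fn = 0
    for p, y in zip(probs, actual):
        predicted = 1 if p >= threshold else 0
        if predicted == 1 and y == 1:
            tp += 1
        elif predicted == 1 and y == 0:
            fp += 1
        elif predicted == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def metrics(probs, actual, threshold):
    tp, fp, tn, fn = confusion_counts(probs, actual, threshold)
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # share of actual positives that are caught
        "specificity": tn / (tn + fp),   # share of actual negatives correctly cleared
    }

# Hypothetical predictions and outcomes (NOT from corona_test).
probs = [0.05, 0.15, 0.25, 0.35, 0.45, 0.55, 0.65, 0.75, 0.85, 0.95]
actual = [0, 0, 0, 1, 0, 1, 0, 1, 1, 1]

at_half = metrics(probs, actual, 0.5)
print("threshold 0.5:", at_half)

# Sweep the thresholds 0.1, 0.2, ..., 0.9 and tabulate the trade-off.
sweep = {t / 10: metrics(probs, actual, t / 10) for t in range(1, 10)}
for threshold, row in sweep.items():
    print(f"{threshold:.1f}  sens={row['sensitivity']:.2f}  spec={row['specificity']:.2f}")
```

Lowering the threshold raises sensitivity at the cost of specificity, and raising it does the reverse. Which threshold is "best" depends on the relative cost of the two kinds of error. One common summary, not prescribed by the assignment, is to look for the threshold that maximizes sensitivity + specificity - 1.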

Some Coding hints

We have covered a lot this semester… In an effort to help you with some of the necessary coding, we provide the following hints, but note that additional code is needed for everything to work:

  • xtabs() can be used in questions 1, 2, and 12
  • ggplot() is needed in question 3
  • glm(), drop1() and/or anova() are needed in questions 4, 5, 7 and 8
  • stats::step() is needed in question 6
  • summary() will provide output with model coefficients; you can also use coef()

I need the .Rmd and .html files in the end.
