# Statistics Assignment

This final exam is composed of three parts. The first part consists of 10 items/questions based on hypothetical data from Covid-19 patients admitted to the hospital. The second part consists of 3 items/questions addressing standardized scores, percentile rankings, and score distributions. The third part consists of 2 items/questions in which you will interpret King County COVID-19 data.

Part I.

In the first part of this final exam, you will use a hypothetical dataset, ‘Anti-inflammation drug and Covid-19 trial data Wi21 FE.xlsx’, which compares the effectiveness of three anti-inflammation drugs/treatments in helping patients recover from Covid-19 (i.e., how long it takes to recover and be discharged from the hospital). Recovery is operationalized as length of stay in the hospital, that is, the number of days until discharge. In other words, we want to see which of these three drugs/treatments helps patients recover fastest from Covid-19, using length of hospital stay as a proxy. (As we witnessed over the past year or so, hospitals have been overwhelmed with Covid-19 patients, and speedy recovery is critical.)

The independent/predictor variable is Treatment, with three randomly assigned groups of Covid-19 patients: Group #1 received the standard treatment, which includes the drug Dexamethasone; Group #2 received a treatment that includes the new drug Tocilizumab; and Group #3 received a treatment that includes the new drug Sarilumab.

The dependent/outcome variable is recovery time, i.e., the number of days to be discharged from the hospital.

This dataset includes three additional variables that may make a difference in recovery time regardless of the drug/treatment patients receive:

1. Age (Age)
2. Sex (Sex) (1=Female, 2=Male)
3. Race/Ethnicity (1=Asian, 2=American Indian/Alaskan Native, 3=Black/African American, 4=Hispanic, 5=Native Hawaiian, 6=White, 7=Mixed race)

Orientation and General Instructions:

Please differentiate your answers/responses in some way. You can use boldface, a different text color, or highlighting.

In this hypothetical dataset, the variable Days to discharge has 61 subjects/cases with missing data. These patients are assumed to have died from Covid-19.

1-5. Create a bar chart for each categorical variable. Create a histogram and a descriptive table (including measures of central tendency, variation, skewness, and kurtosis) for each continuous variable. Copy and paste all bar charts, histograms, and descriptive tables onto the answer sheet. Then describe the trends, patterns, and exceptions displayed in each graph. (5 points)
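For reference, the measures named above can be sketched outside SPSS. The following is a minimal Python illustration, not the required workflow: the `describe` helper and the sample values are hypothetical, and skewness and kurtosis are computed with simple moment-based formulas. SPSS applies a small-sample adjustment, so its values may differ slightly.

```python
import math
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD (n - 1 in the denominator)
    # Population central moments, used for the moment-based shape measures.
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "sd": sd,
        "range": max(values) - min(values),
        "skewness": m3 / m2 ** 1.5,          # > 0 means a longer right tail
        "excess_kurtosis": m4 / m2 ** 2 - 3,  # 0 for a normal distribution
    }

# Hypothetical days-to-discharge values (NOT from the exam dataset);
# None marks a case with missing data, which is excluded before summarizing.
days = [5, 7, 7, 8, None, 10, 12, 21]
observed = [d for d in days if d is not None]
stats = describe(observed)

for name, value in stats.items():
    print(f"{name}: {value}")
```

Dropping the missing cases before summarizing mirrors how the cases with missing Days to discharge would be left out of a descriptive table for that variable.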

6-9. Given the research questions below, state the independent and dependent variables (or Variable 1 and Variable 2) and their level of measurement. Write the null and research/alternative hypotheses. Then conduct the appropriate statistical analysis. State the tests used, report and interpret key statistics, and explain your findings, including the likelihood that you are committing a Type I or Type II error given your decision regarding the null hypothesis. Refer back to your Assignments and Handouts to thoroughly address all elements in your reporting, interpretation, and conclusion. Be sure to add a conclusion in lay language. The conclusion is the most important aspect of items 6-9 and is worth ½ of the points associated with each of these items. Do NOT include tables and graphs from SPSS output. Just report and interpret your findings, and write a thorough summary and conclusion.

Please use .01 as your threshold in determining statistical significance for items #6 to #9.

6. How effective are these three drugs/treatments in treating inflammation and thereby helping patients recover from Covid-19? In other words, which drug/treatment group takes the longest, and which the shortest, to be discharged from the hospital? (Is there a statistically significant difference in the number of days Covid-19 patients take to be discharged from the hospital based on drug/treatment? If so, how do the groups differ and by how much?) (4 points)
7. How, and how well, does patient age predict days to discharge from the hospital? (Is there a statistically significant relationship between age and days to discharge? If so, what is the nature and degree of the relationship?) (4 points)
8. Do male and female patients differ in how long they take to be discharged from the hospital? If so, how do they differ and by how much? (Is there a statistically significant difference in days to discharge between male and female patients?) (4 points)
9. Do patients differ in days to discharge from the hospital based on their racial and ethnic background? If so, how do they differ and by how much? (Is there a statistically significant difference in days to discharge among patients from various racial and ethnic groups?) (4 points)
10. What are your overall findings and conclusions from #6, #7, #8, and #9? (1 point extra credit)

PLEASE USE THE ANSWER SHEET BEGINNING ON PAGE 4 TO COMPLETE THIS FINAL EXAM.

Copy and paste each univariate graph and descriptive table under the corresponding variable listed below. Then describe the trends, patterns, and exceptions depicted in each graph and/or table. (5 points)

1. Treatment
2. Age
3. Sex
4. Race/Ethnicity
5. Recovery time/Days to discharge from the hospital

6. How effective are these three drugs/treatments in treating inflammation and thereby helping patients recover from Covid-19? In other words, which drug/treatment group takes the longest, and which the shortest, to be discharged from the hospital? (Is there a statistically significant difference in the number of days Covid-19 patients take to be discharged from the hospital based on drug/treatment? If so, how do the groups differ and by how much?) (4 points)

Independent variable:

Dependent variable:

Ho:

H1:

Statistical analysis used:

Key statistics:

Assumptions:

Accept or reject the Ho:

Summary and conclusion:

Given your decision to accept or reject the Ho, which of the two errors are you likely to be committing, and why?

7. How, and how well, does patient age predict days to discharge from the hospital? (Is there a statistically significant relationship between age and days to discharge? If so, what is the nature and degree of the relationship?) (4 points)

Variable 1:

Variable 2:

Ho:

H1:

Statistical analysis used:

Key statistics:

Assumptions:

Accept or reject the Ho:

Summary and conclusion:

Given your decision to accept or reject the Ho, which of the two errors are you likely to be committing, and why?

8. Do male and female patients differ in how long they take to be discharged from the hospital? If so, how do they differ and by how much? (Is there a statistically significant difference in days to discharge between male and female patients?) (4 points)

Independent variable:

Dependent variable:

Ho:

H1:

Statistical analysis used:

Key statistics:

Assumptions:

Accept or reject the Ho:

Summary and conclusion:

Given your decision to accept or reject the Ho, which of the two errors are you likely to be committing, and why?

9. Do patients differ in days to discharge from the hospital based on their racial and ethnic background? If so, how do they differ and by how much? (Is there a statistically significant difference in days to discharge among patients from various racial and ethnic groups?) (4 points)

Independent variable:

Dependent variable:

Ho:

H1:

Statistical analysis used:

Key statistics:

Assumptions:

Accept or reject the Ho:

Summary and conclusion:

Given your decision to accept or reject the Ho, which of the two errors are you likely to be committing, and why?

10. What are your overall findings and conclusions from #6, #7, #8, and #9? (1 point extra credit)

Part II. Standardized score, percentile ranking and score distributions (5 points)

Obtain your current grade in this class and calculate your percentile ranking. You will need to convert your grade to a z-score in order to find your percentile ranking. Currently, the average grade for this class is 82.9 and the standard deviation is 7.73. Please show your work/calculation steps.

1. State your current grade in raw/observed score form (i.e., as it appears in the Gradebook) as well as in z-score form. (1 point)
2. Assuming a normal distribution, calculate your percentile ranking in this class, and explain what your percentile ranking represents/means in relation to your peers. (2 points)
3. Suppose our class grade scores had a skewness of 1.65 and a kurtosis of -3.19. Is this a normal distribution? If so, please explain why. If not, how would you characterize this distribution relative to a normal distribution? (2 points)
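The conversion from a raw grade to a z-score and then to a percentile ranking can be sketched as follows. This is a minimal Python illustration, assuming a normal distribution as in item 2: the class mean and standard deviation come from the text above, while the grade of 90 and the helper names `z_score` and `percentile_rank` are hypothetical. A hand calculation with a z-table should give approximately the same result.

```python
from statistics import NormalDist

CLASS_MEAN = 82.9  # average grade, from the assignment text
CLASS_SD = 7.73    # standard deviation, from the assignment text

def z_score(raw, mean=CLASS_MEAN, sd=CLASS_SD):
    """Number of standard deviations a raw score lies from the mean."""
    return (raw - mean) / sd

def percentile_rank(raw, mean=CLASS_MEAN, sd=CLASS_SD):
    """Percentage of a normal distribution falling below the raw score."""
    return 100 * NormalDist().cdf(z_score(raw, mean, sd))

example_grade = 90.0  # hypothetical grade; substitute your own
z = z_score(example_grade)
pct = percentile_rank(example_grade)
print(f"z = {z:.2f}, percentile rank = {pct:.1f}")
```

A grade exactly equal to the class mean has a z-score of 0 and therefore sits at the 50th percentile, which is a quick sanity check on any hand calculation.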

Part III. King County COVID-19 Data (4 points)

Please go to the following website to answer the questions below. King County Covid-19 cases and deaths: https://www.kingcounty.gov/depts/health/covid-19/data/race-ethnicity.aspx

1. What is the overall message/story that these graphs collectively depict about the cases and deaths from COVID-19 in King County? (2 points)
2. In both the COVID-19 cases data and the deaths data for King County, which racial/ethnic group has the most accurate estimate, and which has the least accurate estimate? (2 points)

Instructions to navigate and use the website:

The link above will take you to the webpage below. Click on ‘Rates of cases’.

Then use these two options to view Cases and Deaths data.

Hover over the bars to find/view numbers.

EXTRA CREDIT (3 points)

A pharmaceutical company has developed a new drug intended to help lower cholesterol and plans to test its effectiveness on patients with high cholesterol. The test subjects (150 patients with high cholesterol) were randomly assigned to three groups: 50 to take the new drug (experimental group), 50 to take a cholesterol drug already on the market (control group), and 50 to take a placebo pill. The test subjects’ cholesterol level will be measured once before and once after the treatment. (Use Andy Field’s test identifier if needed.)

1. What are the independent and dependent variables and their level of measurement? (1 point)
2. Which statistical test, and how many of them, will be used to explore the difference in average cholesterol between the groups before the treatment and after the treatment? (1 point)
3. Which statistical test, and how many of them, will be used to explore the difference in average cholesterol within each group before and after the treatment (i.e., how much did the cholesterol level change within each group as a result of the treatment)? (1 point)

