Health surveys are commonly conducted to assess overall patterns in health decisions and trends. The data collected are often paired with other existing data to look for relationships that were not specifically studied or for which data were not specifically collected.
The Centers for Disease Control and Prevention (CDC) collects thousands of data elements from a variety of health settings and surveys. The attached data set, “Case Study 3 – States.csv,” includes 6 variables:
- Food hardship rate – the reported rate of persons who were unable to purchase the food they needed at least once in the last 12 months;
- Obesity rate – the rate of obese persons, where obesity is defined as having a BMI of 30 or greater;
- Adult cigarette use – the proportion of persons 16 or older who have smoked more than 100 cigarettes in their lifetime and who continue to smoke;
- Child cigarette use – the proportion of persons under 16 who have smoked more than 100 cigarettes in their lifetime and who continue to smoke;
- Tax – the number of cents of state tax levied on a pack of 20 cigarettes; and
- Location – the geographical region of the US to which each state and DC belongs.
1) Create a scatterplot of the data for Food Hardship Rate and Obesity Rate. Copy the scatterplot, with axis and chart titles, into this document and interpret what it tells us about the relationship between the two variables.
2) For the food hardship and obesity variables, determine the strength of any correlation, and determine whether it is significant at the 0.05 level. Interpret the meaning of the correlation coefficient. A quick p-value calculator for Pearson correlation can be found on the Social Science Statistics website (https://www.socscistatistics.com/pvalues/pearsondistribution.aspx). Write a journal entry for the results.
3) For the adult smoking and child smoking variables, determine the strength of any correlation, and determine whether it is significant at the 0.05 level. Interpret the meaning of the correlation coefficient. A quick p-value calculator for Pearson correlation can be found on the Social Science Statistics website (https://www.socscistatistics.com/pvalues/pearsondistribution.aspx). Write a journal entry for the results.
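The two correlation tasks above rest on the same calculation, which can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names `pearson_r` and `t_statistic` are hypothetical helpers, and the numbers are made-up placeholders rather than values from the data set.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    if abs(r) >= 1:
        return math.copysign(math.inf, r)  # perfect correlation
    return r * math.sqrt((n - 2) / (1 - r ** 2))

# Hypothetical placeholder values, NOT taken from "Case Study 3 – States.csv".
food_hardship = [12.0, 15.5, 18.2, 20.1, 22.4, 16.3]
obesity = [24.0, 27.5, 29.0, 31.2, 33.1, 26.8]

r = pearson_r(food_hardship, obesity)
n = len(food_hardship)
print(f"r = {r:.3f}, n = {n}, t = {t_statistic(r, n):.3f}, df = {n - 2}")
```

The resulting r and n are the two inputs the online p-value calculator asks for; the t statistic is shown only to make the link between r and the significance test explicit.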
4) One method of decreasing the smoking rate is to increase the tax rate on a pack of cigarettes. Using the mean of the adult and child smoking rates for each state and DC, consider predicting the smoking rate from the tax rate.
- Calculate the mean of the adult and child smoking rates for each state and DC – insert this in an Excel column.
- With tax rate as the predictor variable, conduct a simple linear regression analysis on the data for mean adult and child smoking rate. In the analysis, you would typically report:
- A regression plot analysis
- A statement and interpretation of the significance
- The regression equation and its meaning
- A statement and interpretation of R²
- Practical statement of meaningfulness
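The regression task above (averaging the two smoking rates, then fitting a line with tax as the predictor) can be sketched as follows. This is a minimal illustration, assuming ordinary least squares; `simple_linear_regression` is a hypothetical helper name, and all numbers are made-up placeholders rather than values from the data set.

```python
def simple_linear_regression(x, y):
    """Least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need two equal-length sequences with at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In simple linear regression, R^2 equals the squared Pearson correlation.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical placeholder values, NOT taken from "Case Study 3 – States.csv".
tax_cents = [30, 60, 100, 150, 200]
adult_rate = [24.0, 22.0, 20.5, 18.0, 16.5]
child_rate = [10.0, 9.0, 8.5, 7.0, 6.5]

# Mean of the adult and child smoking rates for each row.
mean_rate = [(a + c) / 2 for a, c in zip(adult_rate, child_rate)]

slope, intercept, r2 = simple_linear_regression(tax_cents, mean_rate)
print(f"predicted rate = {intercept:.3f} + ({slope:.4f}) * tax")
print(f"R^2 = {r2:.3f}")
```

The slope is read as the change in the mean smoking rate for each additional cent of tax, and R² as the share of variation in the mean smoking rate accounted for by the tax rate.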
5) Researchers also look for relationships between variables that are not scale in nature. In this case, we have 6 categories of “Location.” A bar chart showing the number of smokers in a random sample of 1,000 residents in each region is shown below.
- Determine the number of nonsmokers in each 1,000-resident sample.
- Using the chi-square test for independence, determine whether smoking status is independent of region.
- Indicate the null and alternative hypotheses.
- Use the calculator at http://turner.faculty.swau.edu/mathematics/math241/materials/contablecalc/entry.php to conduct the chi-square test at the 0.05 level.
- Interpret the results and write a journal entry for the results.
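The chi-square steps above can be sketched as follows. This is a rough illustration only: `chi_square_independence` is a hypothetical helper name, and the smoker counts are made-up placeholders because the bar chart values are not reproduced here. The critical value 11.070 is the standard chi-square cutoff for 5 degrees of freedom at the 0.05 level, which applies to a table of 2 rows (smoker, nonsmoker) by 6 regions.

```python
def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table.

    `table` is a list of rows, each a list of observed counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

SAMPLE_SIZE = 1000  # residents sampled per region, as stated in the task

# Hypothetical smoker counts for the 6 regions, NOT read from the bar chart.
smokers = [180, 210, 195, 230, 170, 205]
nonsmokers = [SAMPLE_SIZE - s for s in smokers]

statistic, df = chi_square_independence([smokers, nonsmokers])
CRITICAL_VALUE = 11.070  # chi-square critical value for df = 5, alpha = 0.05

print(f"chi-square = {statistic:.3f}, df = {df}")
if statistic > CRITICAL_VALUE:
    print("Reject H0: smoking status appears to depend on region.")
else:
    print("Fail to reject H0: no evidence that smoking status depends on region.")
```

Here the null hypothesis is that smoking status and region are independent, and the alternative is that they are not; the same observed counts can be entered into the online calculator to obtain the p-value.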