# Relationship Between Weight And Exercise - Regression Model - Cancer - Variables - Statistical Regression - Mathematics Assignment Help

Curtin University of Technology

This case study application uses a random sub-sample (Weight 2019S2QM.dta) drawn from a larger study. The dataset contains weight, exercise level and gender measured for 72 participants: one continuous variable Y (weight) and two categorical variables (exercise and gender), as shown in Table 1. The researcher who conducted the study would like your help in answering the following questions.
1) Is gender an effect modifier of the association between exercise and weight, i.e., does gender modify the effect of exercise on weight?
2) Is the mean weight loss due to different exercise levels the same for males and females?

Questions

1. The researcher believes that investigating the relationship between weight and exercise, accounting for gender as a potential effect modifier, would answer the research questions. Do you agree?

2. First, the researcher asks for your help in testing the null hypothesis that the population mean weight is the same regardless of exercise level, i.e., she would like to compare the population mean weight between the two exercise levels.
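
The comparison of two group means described above is often carried out with an equal-variance two-sample t-test. The sketch below only computes the test statistic and its degrees of freedom; the function name and the weights are made up for illustration and are not taken from Weight 2019S2QM.dta. Whether equal variances can be assumed is itself something to check in the real data.

```python
import math
import statistics


def pooled_t_statistic(group_a, group_b):
    """Return (t, df) for an equal-variance two-sample t-test."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    se = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / se, df


# Hypothetical weights in kg, for illustration only
low_exercise = [82.0, 79.5, 88.0, 91.0]
high_exercise = [75.0, 72.5, 80.0, 78.5]
t_stat, df = pooled_t_statistic(low_exercise, high_exercise)
print(f"t = {t_stat:.3f} on {df} degrees of freedom")
```

The p-value would then be read from a t distribution with `df` degrees of freedom, which statistical software reports directly.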

3. Having compared the population mean weight between the two exercise levels, the researcher would like to perform a multiple regression analysis to compare the difference in the population mean weight between the two exercise levels, accounting for gender in the model as a potential effect modifier.

i. Which of the multiple regression models you learnt in our unit is appropriate for the researcher’s analysis? Why?
ii. The researcher plotted the following mean plots to check whether there is an interaction effect.

a) Which mean plot (Plot 1 or 2) indicates that gender probably modifies the effect of exercise on weight? Why?
b) Based on the mean plot you chose, justify whether the interaction term between exercise and gender should be included and assessed in the multiple regression model.

iii. Fit the multiple regression model you recommended (in Question 3.i) for the researcher.

iv. Based on the output of your regression model (Question 3.iii), test the hypothesis that there is no interaction in the population between exercise and gender, and answer the question “Does gender modify the effect of exercise on weight?”

v. The researcher believes that any variable with a p value > 0.05 needs to be removed from the multiple regression model, and that a further simplified model is necessary. Do you agree with her? Comment on whether a further model is necessary by selecting an answer below:

a) Yes, I agree: the non-significant variable (list it here_______________) should be removed from the model, and hence I can have a further, simpler model. Briefly justify your answer.

b) No, I disagree: there is no need to remove any variable from the model, and hence no need for a further model. Briefly justify your answer.
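
One common way to write a model that allows for effect modification, together with the hypothesis of no interaction, is sketched below. It assumes exercise and gender are each coded as 0/1 indicators; the coding actually used in the dataset is not stated here, so the symbols are illustrative.

$$
\text{weight} = \beta_0 + \beta_1\,\text{exercise} + \beta_2\,\text{gender} + \beta_3\,(\text{exercise} \times \text{gender}) + \varepsilon
$$

$$
H_0: \beta_3 = 0 \quad \text{versus} \quad H_1: \beta_3 \neq 0
$$

Under this coding, the difference in mean weight between exercise levels is $\beta_1$ when gender = 0 and $\beta_1 + \beta_3$ when gender = 1, so $\beta_3$ measures how far gender modifies the effect of exercise.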

4. Based on your final model obtained in Q3.v:
a. What is the prediction equation (estimated regression coefficients rounded to 3 decimal places)?

b. The researcher believes the constant in your final model has no meaningful interpretation.
c. Calculate the predicted mean weight loss for low and high exercise levels for each gender, based on the regression equation obtained in Q4.a.

d. The researcher interpreted the regression coefficient ‘7.733’ as ‘People who participated in low level exercises were heavier by 7.733 kg on average than those who participated in high level exercises, regardless of their gender’. Do you think this interpretation is correct?
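
Predicted means for each exercise-by-gender combination follow from plugging 0/1 indicator values into a prediction equation. The sketch below assumes a model with an interaction term and 0/1 coding for both variables; the function name, the dictionary keys and every coefficient value are placeholders, not estimates from the dataset.

```python
def predicted_mean_weight(coef, exercise, gender):
    """Predicted mean from b0 + b1*exercise + b2*gender + b3*exercise*gender."""
    return (
        coef["const"]
        + coef["exercise"] * exercise
        + coef["gender"] * gender
        + coef["interaction"] * exercise * gender
    )


# Placeholder coefficients for illustration only (NOT fitted values)
coef = {"const": 70.0, "exercise": 5.0, "gender": 10.0, "interaction": -3.0}

for gender in (0, 1):
    for exercise in (0, 1):
        mean = predicted_mean_weight(coef, exercise, gender)
        print(f"gender={gender}, exercise={exercise}: {mean:.3f}")
```

If the final model has no interaction term, setting `coef["interaction"]` to 0 gives the same exercise difference for both genders. With a non-zero interaction, the exercise coefficient on its own applies only to the reference gender.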

5. Using the information you obtained from the final model in Q4, write a detailed conclusion with supporting evidence to help the researcher answer the research questions.

Case study application TWO

A researcher would like to investigate the association between exposure to a special chemical and a specific type of cancer using a random sample (Cancer 2019S2QM.dta) from a practical study. He also collected other variables, such as the number of cigarettes smoked per day and living distance from the site of exposure, which will be accounted for in his analysis.

The main research question of this study is
1) “Does exposure to a special chemical increase the risk of having the cancer?”

The other research questions are
2) “Is living close to the site of exposure associated with a higher risk of having the cancer?”
3) “Is a greater number of cigarettes smoked per day associated with a higher risk of having the cancer?”

Questions

1. Given these data, to answer the research questions, you need to help the researcher identify:

1.1. Which variable is the dependent variable?
1.2. Which variable is the main independent variable of interest?
1.3. Which other variables should be controlled in your analysis?
1.4. Given the type of the dependent variable, which kind of regression analysis should you use, and why?

2. The literature shows that the number of cigarettes smoked per day may confound the association between exposure to a special chemical and the risk of having the specific type of cancer. You need to help the researcher assess the possible confounding effect, using the steps covered in lectures/labs.

2.1 Find the crude odds ratio (and its 95% confidence interval (CI)) of having the Cancer, comparing individuals who were exposed to the special chemical with individuals who were not.

2.2 Assess whether NumberCigarette confounds the effect of Exposure on the risk of having the Cancer. Assess this possible confounding effect by calculating the relevant statistic from both epidemiological and statistical perspectives. Attach Stata output here.
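
A crude odds ratio and an approximate 95% CI can be computed by hand from a 2x2 table, and a simple confounding check compares the crude with the adjusted odds ratio. The sketch below uses the standard log-based (Woolf) interval; the function names and the counts are hypothetical, and the relative change is measured against the crude value here, although conventions differ. The threshold for calling the change important (10% is a commonly quoted rule of thumb) should be the one taught in the unit.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude OR and Woolf CI from a 2x2 table.

    a = exposed cases, b = exposed non-cases,
    c = unexposed cases, d = unexposed non-cases.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def percent_change(crude_or, adjusted_or):
    """Relative change of the adjusted OR from the crude OR, in percent."""
    return 100 * (adjusted_or - crude_or) / crude_or


# Hypothetical counts for illustration only (not from Cancer 2019S2QM.dta)
crude, lo, hi = odds_ratio_ci(20, 10, 10, 20)
print(f"crude OR = {crude:.3f}, 95% CI {lo:.3f} to {hi:.3f}")
# Hypothetical adjusted OR, as would come from a model including NumberCigarette
print(f"change after adjustment = {percent_change(crude, 3.0):.1f}%")
```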

3. Given that living distance from the site of exposure and exposure to the special chemical may have a combined or multiplicative effect on the risk of having the Cancer, the researcher would like to test whether there is an interaction effect between Distance (continuous) and Exposure on the risk of having the Cancer. Help the researcher answer the following questions with evidence from your analysis.
a) Is the effect of exposure to the special chemical modified by the living distance from the site of exposure? You need to answer this question with evidence from your analysis.

b) Can you have a simpler model by removing this interaction term? You need to answer this question with at least two reasons.
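
As a sketch of what such an interaction means on the odds scale, a logistic model with an Exposure-by-Distance product term can be written as follows. Here $E$ is Exposure coded 0/1 and $D$ is Distance in km; the coding is an assumption, and any other covariates in the model are omitted for brevity.

$$
\operatorname{logit}(P) = \beta_0 + \beta_1 E + \beta_2 D + \beta_3\,(E \times D)
$$

$$
\text{OR}_{E \mid D = d} = \exp(\beta_1 + \beta_3 d)
$$

The odds ratio for Exposure therefore depends on distance $d$ unless $\beta_3 = 0$, which is the hypothesis tested when assessing the interaction term.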

4. The researcher wants to predict the probability of having the cancer for individuals with different personal characteristics. He would like to perform the multiple regression analysis you recommended in Q1, including Exposure, NumberCigarette, and Distance (continuous) in his model without any interactions between the independent variables.

Attach Stata output (e.g., parameter estimation table with Coefficients) here
4.1 What is the regression equation (coefficients rounded to 3 decimal places)? P is the probability of having the Cancer.

4.2 Now help the researcher calculate the predicted probability of having the cancer for an individual who
a) was not exposed to the special chemical, smoked 50 cigarettes a day and lived 12 km from the site of exposure. Make a brief comment.

b) was exposed to the special chemical, smoked 20 cigarettes a day and lived 0.1 km from the site of exposure. Make a brief comment.
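
Assuming the recommended model is a logistic regression, a predicted probability is obtained by evaluating the linear predictor and applying the inverse logit. The sketch below does this for the two individuals described above; the function name and all coefficient values are placeholders, not estimates from Cancer 2019S2QM.dta, and Exposure is assumed to be coded 0/1.

```python
import math


def predicted_probability(coef, exposure, cigarettes, distance_km):
    """Inverse logit of b0 + b1*Exposure + b2*NumberCigarette + b3*Distance."""
    logit = (
        coef["const"]
        + coef["Exposure"] * exposure
        + coef["NumberCigarette"] * cigarettes
        + coef["Distance"] * distance_km
    )
    return 1 / (1 + math.exp(-logit))


# Placeholder coefficients for illustration only (NOT fitted values)
coef = {"const": -2.0, "Exposure": 1.0, "NumberCigarette": 0.05, "Distance": -0.1}

p_a = predicted_probability(coef, exposure=0, cigarettes=50, distance_km=12)
p_b = predicted_probability(coef, exposure=1, cigarettes=20, distance_km=0.1)
print(f"individual a: P = {p_a:.3f}")
print(f"individual b: P = {p_b:.3f}")
```

When commenting on a prediction, it is worth checking whether the covariate values lie within the range observed in the sample, since predictions outside that range are extrapolations.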

5. To help the researcher answer the research questions, you need to perform the multiple regression analysis you recommended in Q1, including Exposure, NumberCigarette, and Distance (continuous) in your model without any interactions between the independent variables.
Attach Stata output (e.g., parameter estimation table with Odds Ratios) here

1) Does exposure to a special chemical increase the risk of having the cancer? Answer this question by interpreting the information provided by the adjusted odds ratio (and 95% CI) for Exposure.
2) Is living close to the site of exposure associated with a higher risk of having the cancer? Answer this question by interpreting the information provided by the adjusted odds ratio (and 95% CI) for Distance.

3) Is a greater number of cigarettes smoked per day associated with a higher risk of having the cancer? Answer this question by interpreting the information provided by the adjusted odds ratio (and 95% CI) for NumberCigarette.

6. Given the categorized distance, the researcher would like to know from which living distance category the risk of having the cancer is significantly decreased. You need to perform another regression analysis including Exposure, NumberCigarette, and DistanceCat (categorical) in your model without any interactions between the independent variables.
Attach Stata outputs (e.g., parameter estimation table with Odds Ratios) here
a) Do you think DistanceCat is, overall, a significant predictor of having the Cancer? Why?

b) From which living distance category is the risk of having the cancer significantly decreased? By what percentage? By how much do the population odds decrease, with 95% confidence?

c) Compare the multivariable model with continuous Distance (Q5) and the model with categorical DistanceCat (Q6). If you were the researcher, which model would you use to explain the association between the risk of cancer and living distance from the site of exposure? Justify your choice.
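
An odds ratio and its confidence limits can be restated as a percentage change in the odds, which is the form the percentage question above asks for. The sketch below uses a hypothetical odds ratio and interval; none of the numbers come from the dataset, and the function name is illustrative.

```python
def percent_change_in_odds(odds_ratio):
    """Percentage change in odds relative to the reference category."""
    return (odds_ratio - 1) * 100


# Hypothetical adjusted OR and 95% CI for one distance category
or_hat, ci_low, ci_high = 0.40, 0.20, 0.80

print(f"point estimate: {percent_change_in_odds(or_hat):.0f}%")
print(
    f"95% CI: {percent_change_in_odds(ci_low):.0f}% "
    f"to {percent_change_in_odds(ci_high):.0f}%"
)
```

A negative value indicates lower odds than the reference category. An interval that excludes 0% (equivalently, an odds ratio interval that excludes 1) corresponds to a statistically significant difference at the 5% level.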

Case study application THREE

This case study application uses information from a published paper, “Survival of neonates and predictors of their mortality in Tigray region, Northern Ethiopia: prospective cohort study”.

1. Briefly describe the study to answer the following questions:
i. What were the study design and objectives?

ii. How were the participants recruited: where, when, how many?
iii. What are the main limitations (list two) of the study?

1) How did the authors define the main outcome variable, Neonatal Mortality (NM)? What is the event of interest, how did the authors calculate survival time (i.e., time-to-event), and how did they define censored neonates?

2) If you were the author, complete the table below for 10 neonates with the different conditions given, where censoring status is coded “1” for event and “0” for censored.
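
As a generic illustration of the 1/0 coding described above, the sketch below assigns a survival time and status to a neonate. It assumes follow-up runs from birth to the end of the neonatal period (28 days) and that times are measured in days. These are assumptions for illustration, not the paper's own definitions, which should be taken from its methods section; the function and parameter names are hypothetical.

```python
def code_survival(age_at_death_days=None, last_seen_days=None, follow_up_end=28):
    """Return (time, status): status 1 = event (death), 0 = censored."""
    # Death observed within the follow-up window counts as the event
    if age_at_death_days is not None and age_at_death_days <= follow_up_end:
        return age_at_death_days, 1
    # Otherwise censor at last contact or at the end of follow-up
    if last_seen_days is None:
        return follow_up_end, 0
    return min(last_seen_days, follow_up_end), 0


print(code_survival(age_at_death_days=3))   # died on day 3
print(code_survival(last_seen_days=10))     # lost to follow-up on day 10
print(code_survival())                      # alive at end of follow-up
```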

3. Read Data Management and Analysis section and Table 6, answer the following questions:

i. Which statistical regression analysis, and which regression strategy, were used for Table 6? Which method did the authors use to assess the proportional-hazards assumption?
ii. How many statistically significant factors associated with neonatal mortality were obtained from the multivariable Cox proportional-hazards model in Table 6? List them all. How do you know they are statistically significant?
iii. Choose ONE significant factor from the multivariable Cox proportional-hazards model in Table 6, and interpret its effect and 95% CI on neonatal mortality in your own words.

• Uploaded By : Mitchell Lee
• Posted on : October 21st, 2018

