POS 3713 - Linear Regression Model - Scatterplot - Coefficients - Statistics Assessment Answer
1. For all models in this problem set, report the constants, coefficients, standard errors, p-values, N, and R2 values in one table in your writeup. Label them Model 1, Model 2, etc., as in the example table provided below. When you are finished, you should have 5 models in your table. Use the format of Table 1 as a guide for creating this table in your Word document. In the left-hand column, list your variable names. Name each variable so that a reader would know what it is, but DO NOT USE THE R CODE NAMES. In the columns to the right of the variable names, insert the different models you will make for this problem set. Each cell should contain the coefficient and its standard error (in parentheses) for a variable included in that model. Since the first model is bivariate, the only cells that will have numbers will be the primary independent variable (the coefficient with the SE in parentheses underneath), the constant (the coefficient with the SE in parentheses underneath), and the N and R2. Fill the subsequent columns as the problem set instructs.
2. Load the data; it should show you two data frames. Use the one named c for now. View the data frame and the variables. Make sure you know what each variable is and what it measures.
3. Report the descriptive statistics for all the above variables. This includes the type and level of measurement. For categorical variables, provide a properly labeled frequency graph using the freq() command, and include the frequency and percentage of each category in your write-up. For continuous variables, provide a histogram using the hist() command and report the n, median, mean, and standard deviation in your write-up. For the set of categorical variables that indicate regional differences, you will not need to make a graph, but you should treat them here as one category and report their number and frequency. Label each response as 3a-h.
4. Your primary interest is the effect that attractiveness has on the percent margin of victory. Create a scatterplot of these two variables using the plot() command. The syntax for that command is plot(x,y). Use the label options you used for your histograms and frequency graphs to set the main graph label (main=""), the x-axis label (xlab=""), and the y-axis label (ylab=""). Put your scatterplot in your Word document and spend a few sentences discussing whether you can distinguish the direction of the relationship between the variables. Be sure to answer the following questions: Is a relationship apparent? What factors might explain why the scatterplot looks the way it does? Are there any concerns with outliers or high-leverage observations? Note: Look in the book to figure out how to identify outliers and leverage.
5. Create a bivariate regression model of the percent margin of victory on attractiveness using the lm() command. Report the β coefficient for attractiveness, along with its standard error and p-value, in a table, and label the results Model 1. In your Word document, report the statistical and substantive significance. Explain the relationship as you might to a person who is not familiar with statistics, but in such a way that a statistician would recognize what you’ve done and would appreciate your work.
6. Write the R code necessary to find the R2 for the bivariate model you just made using the following equation. Report the value, and interpret what this particular R2 value means as you would to someone not familiar with statistics.
7. We may be missing important confounding variables. Perceived candidate attractiveness can be correlated with a variety of other variables that could affect the percent margin of victory. Therefore, we need to include the necessary variables. Run a model that includes all of the above variables, and report the estimated values in your table. In the table, label this model Model 2. Also in your writeup, interpret the results of every variable in the model, both statistically (with p-values), and substantively (with size and directions of coefficients). Explain every relationship as you might to a person who is not familiar with statistics, but in such a way that a statistician would recognize what you’ve done and would appreciate your work. You should use at least 2-3 sentences to properly explain the results of each variable. Label this 7a-h.
8. If you followed instructions on the last part, R refused to include one of your variables. In a few sentences, identify which one and explain why it got dropped.
9. There are two particular control variables that may not have a linear relationship with the margin of victory. Identify the most likely candidate and create a new variable equal to that variable squared. Run a new regression model including this variable, and include it as Model 3 in your table. Explain the relationship that this variable has with the dependent variable.
10. From number 3 above, you may have identified former Rep. Henry Waxman as an outlier in our sample. Unfortunately, for reasons completely beyond his control, Mr. Waxman may have affected our data. Mr. Waxman is observation 87. Report his attractiveness (for your own research, Google his image) and his percent margin of victory scores. In a few sentences, explain what impact he might have on our results.
11. We may need to remove Mr. Waxman from our sample. We will use a variant of the call function. For example, to remove the fifth observation from a data frame named x, you would write out: x<-x[-5,]. In a few sentences, provide a meaningful justification for removing Mr. Waxman.
12. Re-run the same code you used for Model 2, and include it in the table as Model 4. What impact did Mr. Waxman’s removal have on our primary variable of interest? Use examples and be specific.
13. Our research is catching some attention. Using some recent grant money we received because of the above results, we have expanded our sample to include 1,000 observations. Using the data frame named t, re-run the same variables we did in Model 2 and report the results in your table as Model 5. Reinterpret the results for your main independent variable both statistically (with p-values) and substantively (with size and directions of coefficients). Explain the relationship as you might to a person who is not familiar with statistics, but in such a way that a statistician would recognize what you’ve done and would appreciate your work. What are the specific impacts of increasing the sample size? Why do they occur? Answer both of these latter questions with at least 3-4 sentences. Label each variable interpretation as 13a-h, and continue through the alphabet for these last parts of the question (e.g. 13i and 13j).
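For questions 3 and 4, the descriptive statistics and the scatterplot can be sketched roughly as follows. This is only an illustration on simulated data: the data frame demo and the columns attract, margin, and incumbent are hypothetical stand-ins for whatever the real data frame c contains. Because freq() is not part of base R and the text does not say which package supplies it, the sketch uses the base functions table() and barplot() for the categorical variable instead.

```r
# Simulated stand-in data (hypothetical names; the assignment uses data frame `c`)
set.seed(3)
n <- 100
demo <- data.frame(
  attract   = rnorm(n, mean = 5, sd = 1.5),
  incumbent = factor(sample(c("Challenger", "Incumbent"), n, replace = TRUE))
)
demo$margin <- 10 + 2 * demo$attract + rnorm(n, sd = 8)

# Continuous variable: n, median, mean, and standard deviation, plus a histogram
cont_summary <- c(
  n      = sum(!is.na(demo$margin)),
  median = median(demo$margin),
  mean   = mean(demo$margin),
  sd     = sd(demo$margin)
)
print(round(cont_summary, 2))
hist(demo$margin,
     main = "Percent Margin of Victory",
     xlab = "Margin of victory (%)")

# Categorical variable: frequency and percentage of each category, plus a bar graph
counts <- table(demo$incumbent)
pct    <- round(100 * prop.table(counts), 1)
print(rbind(frequency = counts, percent = pct))
barplot(counts,
        main = "Candidate Status",
        xlab = "Status", ylab = "Frequency")

# Scatterplot: independent variable on x, dependent variable on y
plot(demo$attract, demo$margin,
     main = "Attractiveness and Margin of Victory",
     xlab = "Perceived attractiveness",
     ylab = "Margin of victory (%)")
```

When run non-interactively, R writes the three graphs to a file rather than to the screen.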
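For questions 5 and 6, a minimal sketch of the bivariate model and a hand-computed R2 might look like the block below. The equation referred to in question 6 is not reproduced in the text, so the sketch assumes the usual definition, R2 = 1 - RSS/TSS (residual sum of squares over total sum of squares). The data are simulated and the names demo, attract, and margin are hypothetical.

```r
# Simulated stand-in data (hypothetical names; the assignment uses data frame `c`)
set.seed(42)
n <- 100
demo <- data.frame(attract = rnorm(n, mean = 5, sd = 1.5))
demo$margin <- 10 + 2 * demo$attract + rnorm(n, sd = 8)

# Model 1: regress percent margin of victory on attractiveness
m1 <- lm(margin ~ attract, data = demo)

# Coefficient table with columns: Estimate, Std. Error, t value, Pr(>|t|)
coef_table <- summary(m1)$coefficients
print(round(coef_table, 3))

# R-squared by hand, assuming the definition 1 - RSS / TSS
rss <- sum(residuals(m1)^2)                       # unexplained variation
tss <- sum((demo$margin - mean(demo$margin))^2)   # total variation in the outcome
r2_manual <- 1 - rss / tss

# Compare with the value that summary() reports
print(c(manual = r2_manual, reported = summary(m1)$r.squared))
```

The Estimate and Std. Error columns are the numbers that go in the Model 1 column of the results table, and nobs(m1) gives the N.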
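Questions 8 and 9 touch on two mechanics that are easier to see in code: what lm() does when a full set of indicator variables is entered alongside the intercept, and how to add a squared term. The sketch below uses simulated data. The four region dummies and the age variable are hypothetical illustrations, not the variables in the actual data set, and the choice of age as the curved variable is purely an assumption for the demonstration.

```r
# Simulated stand-in data (all variable names are hypothetical)
set.seed(1)
n <- 120
region <- sample(c("Northeast", "South", "Midwest", "West"), n, replace = TRUE)
demo <- data.frame(
  attract   = rnorm(n, mean = 5, sd = 1.5),
  age       = runif(n, min = 30, max = 80),
  northeast = as.numeric(region == "Northeast"),
  south     = as.numeric(region == "South"),
  midwest   = as.numeric(region == "Midwest"),
  west      = as.numeric(region == "West")
)
demo$margin <- 5 + 2 * demo$attract + 1.2 * demo$age -
  0.01 * demo$age^2 + rnorm(n, sd = 6)

# Entering every region dummy plus the intercept makes the predictors
# perfectly collinear (the dummies sum to 1 in every row), so lm()
# cannot estimate all of them and reports NA for an aliased term.
m_all <- lm(margin ~ attract + age + northeast + south + midwest + west,
            data = demo)
dropped <- names(which(is.na(coef(m_all))))
print(dropped)

# Squared term: create the new variable, then leave one region out
# as the reference category so every coefficient is estimable.
demo$age_sq <- demo$age^2
m_sq <- lm(margin ~ attract + age + age_sq + northeast + south + midwest,
           data = demo)
print(round(summary(m_sq)$coefficients, 3))
```

With a squared term in the model, the linear and squared coefficients have to be read together: the sign of the squared coefficient indicates whether the curve bends downward or upward.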
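For questions 10 through 13, the sketch below illustrates the x<-x[-5,] idiom from question 11 on simulated data, along with one way to see what a larger sample does to the standard error. Everything here is hypothetical: the helper sim(), the data frame demo, and the unusual case planted at row 87 are inventions for the demonstration and say nothing about the real observation 87. The functions hatvalues() and cooks.distance() are standard R diagnostics for leverage and influence; whether the textbook uses them is not stated in the text.

```r
# Hypothetical helper that simulates a bivariate data set of size n
sim <- function(n) {
  d <- data.frame(attract = rnorm(n, mean = 5, sd = 1.5))
  d$margin <- 10 + 2 * d$attract + rnorm(n, sd = 8)
  d
}

set.seed(7)
demo <- sim(100)

# Plant an unusual case at row 87: low attractiveness, very large margin
demo$attract[87] <- 1
demo$margin[87]  <- 95

full <- lm(margin ~ attract, data = demo)

# Leverage and influence of row 87 in the full model
print(c(leverage = unname(hatvalues(full)[87]),
        cooks_d  = unname(cooks.distance(full)[87])))

# Remove row 87 with negative indexing, then refit the same model
demo_trimmed <- demo[-87, ]
trimmed <- lm(margin ~ attract, data = demo_trimmed)
print(rbind(full = coef(full), trimmed = coef(trimmed)))

# Sample size: the same model on 100 versus 1000 simulated observations
se_of <- function(d) {
  summary(lm(margin ~ attract, data = d))$coefficients["attract", "Std. Error"]
}
se_small <- se_of(sim(100))
se_large <- se_of(sim(1000))
print(c(n100 = se_small, n1000 = se_large))
```

Assigning the result to a new name, as with demo_trimmed, keeps the original data frame intact, which makes it easier to compare the models with and without the removed observation.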