# G12SMM: Statistical Models and Methods - R Studio Assignment

Internal Code: MAS434

## R Studio Assignment:

The Data:

Data are available on the recommended prices of used cars in the United States. All cars are the same age, but have covered different mileages and have different specifications. You have recently been employed by a used car dealership to build models describing how recommended prices depend on potential explanatory variables, so that these models can be used to price your own used cars.

The data, which come in two parts, are available on Moodle:

TrainData.txt: training data, which will be used to build models.

TestData.txt: test data, which will be used to assess predictions from the models built.

Both files can be read into R once they have been saved in your working directory.

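
A minimal sketch of the read-in step, assuming the files are whitespace-delimited text with a header row (the file layout is an assumption, as is the `Mileage` column name). So that the sketch runs without the Moodle files, it first writes a tiny made-up stand-in file to a temporary directory; the calls for the real files are shown in the closing comments.

```r
# Stand-in file with made-up values so the sketch runs anywhere;
# the real TrainData.txt and TestData.txt come from Moodle.
demo_file <- file.path(tempdir(), "TrainData_demo.txt")
writeLines(c("Price Mileage Cylinder",
             "17314 8221 6",
             "15802 16342 4",
             "29142 19945 8"), demo_file)

# Assumption: whitespace-delimited columns with a header row.
Train <- read.table(demo_file, header = TRUE)
str(Train)

# With the real files saved in the working directory, the calls would be:
# Train <- read.table("TrainData.txt", header = TRUE)
# Test  <- read.table("TestData.txt",  header = TRUE)
```

If the real files turn out to be comma- or tab-separated, the `sep` argument of `read.table()` would need to be set accordingly.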
After reading in the data, you can inspect its structure (number of observations, variable types, etc.) using the `str()` command, e.g. `str(Train)`. For both data sets, you should treat the covariates Cylinder, Doors, Cruise, Sound and Leather as factors (they are read as integers by default). This can be done using, for example, `Train$Cylinder = factor(Train$Cylinder)`.
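
The conversion has to be repeated for five covariates in two data sets, so it may be convenient to do it in one pass. The sketch below uses small made-up data frames as stand-ins for `Train` and `Test`; the helper name `factor_cols` is hypothetical.

```r
# Toy stand-ins for Train and Test (values are made up for illustration).
Train <- data.frame(Price    = c(17314, 15802, 29142),
                    Cylinder = c(6L, 4L, 8L),
                    Doors    = c(4L, 2L, 4L),
                    Cruise   = c(1L, 0L, 1L),
                    Sound    = c(1L, 1L, 0L),
                    Leather  = c(1L, 0L, 1L))
Test <- Train[1:2, ]

# Covariates named in the assignment that should be treated as factors.
factor_cols <- c("Cylinder", "Doors", "Cruise", "Sound", "Leather")

# Convert each listed column to a factor in both data sets.
Train[factor_cols] <- lapply(Train[factor_cols], factor)
Test[factor_cols]  <- lapply(Test[factor_cols], factor)

str(Train)
```

Note that `predict()` on a fitted `lm` will fail if a factor in the test data contains a level that never appeared in the training data, so it is worth comparing the levels of each factor across the two data sets.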

Question:

(a) Using the TRAINING data, investigate models to explain the relationship between Price and the other variables. That is, Price (or transformations of it) is to be the response variable, and all other variables are potential explanatory variables.
(b) Use your fitted model(s) from (a) to predict the responses for the observations in the TEST data set. That is, for each of the observations in the Test data, use the values of the explanatory variables as input to your model(s) from (a) to obtain fitted/predicted responses for these observations. Compare your predicted responses with the known observed responses from the observations in the Test data, using suitable plots/numerical summaries.
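
One possible workflow for parts (a) and (b) is sketched below. It is an illustration under stated assumptions, not a model answer: the data are simulated by a hypothetical helper `make_cars()`, the `Mileage` column name is assumed, and the choice of a log transformation and of `step()` for variable selection is just one of several reasonable options.

```r
set.seed(1)

# Hypothetical simulator standing in for TrainData.txt / TestData.txt.
# Mileage is an assumed column name; all values are made up.
make_cars <- function(n) {
  d <- data.frame(
    Mileage  = round(runif(n, 5000, 40000)),
    Cylinder = factor(sample(c(4, 6, 8), n, replace = TRUE)),
    Leather  = factor(sample(0:1, n, replace = TRUE))
  )
  d$Price <- exp(10 - 1e-5 * d$Mileage + 0.2 * (d$Cylinder == "8") +
                 0.1 * (d$Leather == "1") + rnorm(n, sd = 0.1))
  d
}
Train <- make_cars(80)
Test  <- make_cars(40)

# (a) Candidate models fitted to the training data only.
fit_raw  <- lm(Price ~ ., data = Train)        # untransformed response
fit_log  <- lm(log(Price) ~ ., data = Train)   # log-transformed response
fit_step <- step(fit_log, trace = 0)           # AIC-based backward selection
summary(fit_step)

# (b) Predict the test responses; exp() maps log-scale predictions
# back to the price scale.
pred <- exp(predict(fit_step, newdata = Test))

# Numerical summaries of prediction error on the test data.
err  <- Test$Price - pred
rmse <- sqrt(mean(err^2))
mae  <- mean(abs(err))
print(c(RMSE = rmse, MAE = mae, cor = cor(Test$Price, pred)))

# Diagnostic and observed-vs-predicted plots (interactive sessions only).
if (interactive()) {
  par(mfrow = c(2, 2)); plot(fit_step); par(mfrow = c(1, 1))
  plot(pred, Test$Price, xlab = "Predicted price", ylab = "Observed price")
  abline(0, 1)  # points near this line indicate accurate predictions
}
```

The residual plots from `plot(fit_step)` help judge whether a transformation of Price is needed. When the response is modelled on the log scale with normal errors, `exp()` of the prediction estimates the median rather than the mean price, which is worth bearing in mind when comparing against observed values.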