# STA303/1002 – Abalone data – Ships data

Abalone data for Q1

In this problem, we analyze a data set of 4177 abalone with 8 variables, and try to predict whether the number of rings of an abalone is greater than 9. The complete data set description can be found

The variables in the data set are:

• Sex: nominal variable – takes levels of M, F, and I (infant).

• Length: continuous variable (mm) – longest shell measurement

• Diameter: continuous variable (mm) – perpendicular to length

• Height: continuous variable (mm) – with meat in shell

• Whole weight: continuous variable (grams) – whole abalone

• Shucked weight: continuous variable (grams) – weight of meat

• Viscera weight: continuous variable (grams) – gut weight (after bleeding)

• Shell weight: continuous variable (grams) – after being dried

• Rings: integer

We are interested in predicting whether the Rings variable is greater than 9, so you need to create the binary response from it.
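As a minimal sketch of this step, assuming the ring counts have already been read into a Python list, the response can be coded as 1 when Rings is greater than 9 and 0 otherwise. The values below are made up for illustration, and the helper name `binary_response` is hypothetical.

```python
# Hypothetical Rings values, made up for illustration (not taken from the data).
rings = [15, 7, 9, 10, 8]


def binary_response(values, threshold=9):
    """Return 1 for each value strictly greater than threshold, else 0."""
    return [int(v > threshold) for v in values]


y = binary_response(rings)
print(y)  # [1, 0, 0, 1, 0]
```

The inequality is strict, so an abalone with exactly 9 rings falls in the 0 group.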

Ships data for Q2

We are interested in the number of accidents per month for a sample of ships (a classic example given by McCullagh & Nelder, 1989). The data can be found in the file “ships.csv”, which contains 40 observations and 14 variables. The response variable is called ACC. The explanatory variables are:

• TYPE: there are 5 ship types, labelled 1 to 5. TYPE is a categorical variable, coded by the 5 dummy variables TA, TB, TC, TD, TE.
• CONSTRUCTION YEAR: each ship was built in one of four periods, coded by the dummy variables T6064, T6569, T7074, T7579.
• MONTHS: a measure of the number of service months that the ship has already completed.
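
The dummy coding described above can be illustrated with a short sketch. It assumes that type labels 1 to 5 correspond to TA to TE in that order, and it uses a hypothetical `PERIOD` index (0 to 3) to pick the construction period, since the handout does not say how the period is stored. The ship record is made up for illustration. The sketch also computes the accident rate per service month and log(MONTHS), which is the usual offset when a Poisson model with log link is used for a rate; whether that model is the intended one here is an assumption.

```python
import math

TYPE_DUMMIES = ["TA", "TB", "TC", "TD", "TE"]
PERIOD_DUMMIES = ["T6064", "T6569", "T7074", "T7579"]


def one_hot(level_index, names):
    """Return {name: 0 or 1} with a single 1 at position level_index (0-based)."""
    return {name: int(i == level_index) for i, name in enumerate(names)}


# Hypothetical ship record, made up for illustration (not a row of ships.csv):
# type 2, second construction period, 1095 service months, 3 accidents.
ship = {"TYPE": 2, "PERIOD": 1, "MONTHS": 1095, "ACC": 3}

row = {}
row.update(one_hot(ship["TYPE"] - 1, TYPE_DUMMIES))   # assumes 1..5 -> TA..TE
row.update(one_hot(ship["PERIOD"], PERIOD_DUMMIES))   # assumes 0..3 -> T6064..T7579
row["LOG_MONTHS"] = math.log(ship["MONTHS"])          # candidate offset term
row["RATE"] = ship["ACC"] / ship["MONTHS"]            # accidents per service month

print(row)
```

Because exactly one dummy in each group equals 1, a model that includes an intercept would normally drop one dummy per group as the baseline level.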