
Biased Bootstrap - LDA Question - Statistics Assignment Help

Added on: 2020-09-15 09:39:00
Order Code: MAS33472
Question Task Id: 185753
Assignment Task

1. Biased bootstrap

In this question, we will look at the effect on bootstrap estimation if we have a point in our original sample that is not representative of the population. Consider a random sample X1, X2, ..., Xn such that

Equation.JPG

bootstrap Sample.JPG
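
The exact distribution and the quantities to calculate are only given in the images, so the following is a minimal sketch under assumed values rather than a solution to the question. It assumes n = 20, with n - 1 draws from a standard normal and one unrepresentative point fixed at 10 (both choices are hypothetical), and it is written in Python using only the standard library. It shows how often a bootstrap resample misses the unusual point entirely and how that point shifts the bootstrap distribution of the mean.

```python
import random
import statistics

rng = random.Random(1)
n = 20
# Assumed setup: n - 1 draws from N(0, 1) plus one unrepresentative value.
clean = [rng.gauss(0.0, 1.0) for _ in range(n - 1)]
sample = clean + [10.0]          # 10.0 is an arbitrary illustrative outlier
outlier_index = n - 1

n_boot = 2000
boot_means, without_outlier = [], 0
for _ in range(n_boot):
    idx = rng.choices(range(n), k=n)      # resample indices with replacement
    boot_means.append(statistics.fmean(sample[i] for i in idx))
    without_outlier += outlier_index not in idx

share_without = without_outlier / n_boot
p_without = (1 - 1 / n) ** n              # chance a resample never picks the outlier

print(f"mean without the outlier: {statistics.fmean(clean):.3f}")
print(f"mean of the full sample:  {statistics.fmean(sample):.3f}")
print(f"mean of bootstrap means:  {statistics.fmean(boot_means):.3f}")
print(f"resamples missing the outlier: {share_without:.3f} (theory {p_without:.3f})")
```

For n = 20 the theoretical share of resamples that omit the unusual point is about 0.36, so the bootstrap distribution mixes resamples with zero, one, or several copies of it.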

LDA question.JPG

Calculate.JPG

3. Tuning lasso-regression

In this question, you are required to build a lasso regression model that predicts a track’s popularity on Spotify. The dataset can be obtained from TidyTuesday.

Analysis

(a) Find the data on TidyTuesday and read it into R.
(b) Select the variables track popularity, playlist genre, playlist subgenre, and every column from danceability to duration.
(c) Produce a joy plot (ridges plot) of track popularity for each sub-genre:
• Order the sub-genres by mean popularity,
• Colour the ridges by genre.
(d) Calculate the sub-genre that has the highest mean popularity.
(e) Remove the variable genre.
(f) Split the data into training and testing.
(g) Convert all categorical predictors into dummy variables.
(h) Normalise all predictors.
(i) Fit a lasso regression.
(j) Tune the model to find the best penalty using cross-validation.
(k) Produce a variable importance (VIP) plot for just the sub-genre predictors, using colour to indicate the sign of each predictor's coefficient.
(l) Calculate the final RMSE for the test dataset.
(m) Produce a scatterplot of the predicted value against the true popularity for the test dataset.
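
Steps (f) to (l) can be sketched as follows. The assignment itself must be done in R; this is only a language-neutral illustration in Python's standard library, run on synthetic data, so that the mechanics of dummy coding, normalising, fitting, tuning the penalty by cross-validation, and computing the test RMSE are visible. The sub-genre labels, effect sizes, penalty grid, and function names are all assumptions made for the example, and the coordinate-descent fit is a simple stand-in for whatever lasso implementation is used in practice.

```python
import math
import random
import statistics

def standardise(X, means, sds):
    """Centre and scale each column with the supplied training statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in X]

def soft_threshold(z, lam):
    """Shrink z towards zero by lam; values within lam of zero become zero."""
    return math.copysign(max(abs(z) - lam, 0.0), z)

def fit_lasso(X, y, lam, n_iter=50):
    """Coordinate descent for RSS / (2n) + lam * sum(|beta_j|) on standardised predictors."""
    n, p = len(X), len(X[0])
    cols = list(zip(*X))
    means = [statistics.fmean(c) for c in cols]
    sds = [statistics.pstdev(c) or 1.0 for c in cols]  # guard against constant columns
    Z = list(zip(*standardise(X, means, sds)))          # standardised columns
    b0, beta = statistics.fmean(y), [0.0] * p
    resid = [yi - b0 for yi in y]
    for _ in range(n_iter):
        for j, z in enumerate(Z):
            # Each standardised column has mean square 1, so no rescaling is needed.
            rho = sum(zi * ri for zi, ri in zip(z, resid)) / n + beta[j]
            new = soft_threshold(rho, lam)
            resid = [ri - (new - beta[j]) * zi for zi, ri in zip(z, resid)]
            beta[j] = new

    def predict(X_new):
        return [b0 + sum(b * v for b, v in zip(beta, row))
                for row in standardise(X_new, means, sds)]
    return beta, predict

def rmse(y_true, y_pred):
    return math.sqrt(statistics.fmean((a - b) ** 2 for a, b in zip(y_true, y_pred)))

def cv_rmse(X, y, lam, k=5):
    """Mean validation RMSE over k folds; scaling is refitted inside each fold."""
    n, scores = len(X), []
    for fold in range(k):
        held = set(range(fold, n, k))
        X_tr = [X[i] for i in range(n) if i not in held]
        y_tr = [y[i] for i in range(n) if i not in held]
        X_va = [X[i] for i in sorted(held)]
        y_va = [y[i] for i in sorted(held)]
        _, predict = fit_lasso(X_tr, y_tr, lam)
        scores.append(rmse(y_va, predict(X_va)))
    return statistics.fmean(scores)

# Synthetic stand-in for the Spotify data (all names and effect sizes are made up).
rng = random.Random(42)
effects = {"a": 0.0, "b": 5.0, "c": -5.0}   # hypothetical sub-genre effects
rows = []
for _ in range(300):
    sub, dance, other = rng.choice("abc"), rng.random(), rng.random()
    rows.append((sub, dance, other, 40 + effects[sub] + 20 * dance + rng.gauss(0, 5)))

# Dummy-code the categorical predictor, dropping level "a" as the reference.
X = [[float(s == "b"), float(s == "c"), d, o] for s, d, o, _ in rows]
y = [r[3] for r in rows]
X_train, y_train, X_test, y_test = X[:225], y[:225], X[225:], y[225:]

penalties = [0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0]
best_lam = min(penalties, key=lambda lam: cv_rmse(X_train, y_train, lam))
beta, predict = fit_lasso(X_train, y_train, best_lam)
test_rmse = rmse(y_test, predict(X_test))
baseline_rmse = rmse(y_test, [statistics.fmean(y_train)] * len(y_test))
print(f"best penalty={best_lam}, test RMSE={test_rmse:.2f}, mean-only RMSE={baseline_rmse:.2f}")
```

Comparing the test RMSE with the RMSE of a mean-only prediction is one simple way to judge whether the model is any good. The scaling statistics are computed on the training rows only, so no information from the test set leaks into the fit.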
Questions to answer
(a) Which sub-genre has the highest mean popularity?
(b) Why did we remove the variable genre?
(c) What is the optimal penalty?
(d) Is the model any good?

Submission instructions

To practice the procedure for your practical exam, we are going to use the following protocol.

(a) Create a project for your analysis.
(b) Include in your project the data and a single Rmarkdown file that performs the analysis and includes the answers.
(c) Zip your entire project and upload the resultant zip file.
(d) The marker will unzip your file and knit your Rmarkdown file.
Markscheme
• Code [5 marks]: All code is given, and is correct and well commented. Code is also given to justify any conclusions.
• Rmarkdown [5 marks]: Rmarkdown file knits.
• Plots [4 marks]: All plots are given, as well as the code to produce them. All figures are captioned.
• Answers [3 marks]: All the questions are answered correctly with references to the code, table or plot where applicable.
• Discussion [2 marks]: A reasonable discussion of the model and its usefulness.

4. MCU movie figure
This question is for postgrads only. Undergraduates may attempt it for giggles, kicks and bonus marks.
Tasks
(a) Go to the Wikipedia page https://en.wikipedia.org/wiki/Marvel_Cinematic_Universe
(b) Read the table for the movie information into R - I suggest using web scraping.
(c) Clean the table as necessary.
(d) Produce a single figure to represent the following information:
• Film title,
• Phase,
• Release date, and
• Director.
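
Tasks (b) and (c) come down to turning an HTML table into clean records. The task asks for R, so the following is only a rough illustration of the idea in Python's standard library. It parses a small inline table that is a simplified stand-in for the real page (the live Wikipedia table has a more complex layout, and the column names here are assumptions), strips footnote markers such as [1], and converts the release date to a proper date. The class name `TableRows` is hypothetical.

```python
import re
from datetime import date, datetime
from html.parser import HTMLParser

class TableRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            text = re.sub(r"\[\d+\]", "", "".join(self._cell))  # drop footnote markers
            self._row.append(text.strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

# Simplified stand-in for the scraped table; the real markup differs.
html = """
<table>
  <tr><th>Film</th><th>Release date</th><th>Director</th><th>Phase</th></tr>
  <tr><td><i>Iron Man</i></td><td>May 2, 2008<sup>[1]</sup></td><td>Jon Favreau</td><td>One</td></tr>
  <tr><td><i>The Avengers</i></td><td>May 4, 2012</td><td>Joss Whedon</td><td>One</td></tr>
</table>
"""

parser = TableRows()
parser.feed(html)
header, *records = parser.rows
films = [dict(zip(header, record)) for record in records]
for film in films:
    # Parse dates such as "May 2, 2008" so they can be placed on a time axis.
    film["Release date"] = datetime.strptime(film["Release date"], "%B %d, %Y").date()

for film in films:
    print(film)
```

Once each film is a record with a real date, phase and director can be mapped to visual properties such as colour or label in the single figure.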
Submission instructions
For your submission, we just need your figure as a pdf.
Markscheme

• Figure [10 marks]: A single figure as a pdf, that includes all the movies, and represents the title, release date, phase and director.
• Bonus marks [5 marks]: The best figure as decided by Jono, Matt, and Jack will be awarded an extra 5 bonus marks.

 


  • Uploaded By : Alon 
  • Posted on : September 15th, 2018
  • Downloads : 0
