SIT720 - Machine Learning - Digital literacy - Critical Thinking - Problem Solving - Computer Science - Assessment Answer

Order Code: MAS20592
Question Task Id: 51760


Task:

Perform unsupervised learning of data such as clustering and dimensionality reduction

Learning Outcome

GLO 1: Discipline knowledge and capabilities
GLO 3: Digital literacy
GLO 4: Critical thinking
GLO 5: Problem-solving

Purpose

This is an individual assessment task of a maximum of 20 pages, including all relevant material, graphs, images, and tables. Students will be required to provide responses to a series of problem situations related to their analysis techniques. They are also required to provide evidence through the articulation of the scenario and the application of Python programming skills and analysis techniques, and to provide a rationale for their response. In this assignment, you need to demonstrate your skills for data clustering and dimensionality reduction. There are two parts to this assignment.

Part-1 Clustering:

Instructions: There are five different files, each containing a different number and type of digit images. Each file name ends with a digit between 0 and 4. Compute the modulus operation (fID = SID % 5), where SID is your own student ID number, then select the data file whose name ends with the same fID value. For example, if your student ID is 218201419, you should compute fID = 218201419 % 5. The result is fID = 4, so in this case you should work with the file named "digitData4.csv". If the result were fID = 2, you would work with the file named "digitData2.csv".
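The file selection rule above can be sketched in a few lines of Python. The function name `select_data_file` is hypothetical and only serves the illustration; the file name pattern follows the "digitData4.csv" example given in the instructions.

```python
def select_data_file(student_id: int, n_files: int = 5) -> str:
    """Return the data file name whose trailing digit equals student_id % n_files."""
    f_id = student_id % n_files  # fID = SID % 5 in the instructions
    return f"digitData{f_id}.csv"


# The worked example from the instructions: 218201419 % 5 = 4
print(select_data_file(218201419))  # digitData4.csv
```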

1- Read the downloaded file into a matrix M (m × n). Create an empty numpy array X with m rows and n-1 columns. Assign all m rows and the first n-1 columns of M to X. Create a numpy vector of true labels and assign the n-th column of M to it. Print the dimensions of M, X and the true labels.
2- Next, perform K-means clustering with 5 clusters using Euclidean distance as the similarity measure. Evaluate the clustering performance using the adjusted Rand index (ARI) and adjusted mutual information (AMI). Report the clustering performance averaged over 50 random initializations of K-means.
3- If a single run of K-means clustering with 'k-means++' initialization gives an ARI value of 0.7 for a data set, what will the value of the ARI averaged over 20 repetitions be? Explain why.
4- Repeat K-means clustering with 5 clusters using a similarity measure other than Euclidean distance (you are free to use other libraries). Evaluate the clustering performance over 50 random initializations of K-means using the adjusted Rand index and adjusted mutual information. Report the clustering performance and compare it with the results obtained in step 2.
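As a rough illustration of steps 1, 2 and 4, the sketch below uses scikit-learn on synthetic data. Several things here are assumptions and not part of the task: the `make_blobs` data stands in for the provided CSV file, the helper `average_scores` is a hypothetical name, and row normalisation is only one possible way to approximate a cosine-based similarity (on unit-length rows, squared Euclidean distance equals 2 - 2·cosine similarity, but the centroids are not renormalised, so this is not exact spherical K-means).

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_mutual_info_score, adjusted_rand_score

# Stand-in for the provided file (assumption): 300 rows, 8 features, 5 groups.
# With the real data, M could be read with e.g. np.loadtxt(path, delimiter=",").
features, labels = make_blobs(n_samples=300, n_features=8, centers=5, random_state=0)
M = np.column_stack([features, labels])

# Step 1: the first n-1 columns are features, the n-th column holds the labels.
m, n = M.shape
X = np.empty((m, n - 1))
X[:, :] = M[:, : n - 1]
true_labels = M[:, n - 1].astype(int)
print(M.shape, X.shape, true_labels.shape)


def average_scores(data, y, n_clusters=5, n_runs=50):
    """Average ARI and AMI over n_runs randomly initialised K-means fits."""
    ari, ami = [], []
    for seed in range(n_runs):
        km = KMeans(n_clusters=n_clusters, init="random", n_init=1, random_state=seed)
        pred = km.fit_predict(data)
        ari.append(adjusted_rand_score(y, pred))
        ami.append(adjusted_mutual_info_score(y, pred))
    return float(np.mean(ari)), float(np.mean(ami))


# Step 2: standard (Euclidean) K-means, averaged over 50 random initialisations.
mean_ari, mean_ami = average_scores(X, true_labels)
print(f"Euclidean:   ARI={mean_ari:.3f}  AMI={mean_ami:.3f}")

# Step 4 (one option among several): scale rows to unit length so that
# Euclidean K-means behaves approximately like a cosine-similarity clustering.
# Assumes no all-zero rows, which would cause a division by zero.
X_unit = X / np.linalg.norm(X, axis=1, keepdims=True)
cos_ari, cos_ami = average_scores(X_unit, true_labels)
print(f"Cosine-like: ARI={cos_ari:.3f}  AMI={cos_ami:.3f}")
```

Setting `n_init=1` with a different `random_state` per run makes each fit a single random initialisation, so the reported mean reflects the spread across initialisations instead of the best of several restarts.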

 

Part-2 Dimensionality Reduction using PCA/SVD: For the provided digits dataset:
1- Perform PCA. Plot the captured variance with respect to increasing latent dimensionality. What is the minimum dimension that captures at least 95% variance? 
2- Create a scatter plot with each row of X projected onto the first two principal components. In other words, the horizontal axis should be v1, the vertical axis should be v2, and each individual row should be projected onto the subspace spanned by v1 and v2. Your plot must use a different color for each digit and include a legend.
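The two PCA steps can be sketched with plain NumPy via the SVD of the centred data. The random matrix below is an assumption that stands in for the provided digits data, and the variable names (`k95`, `projected`) are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data (assumption): 200 rows, 10 features with decreasing scale.
X = rng.normal(size=(200, 10)) * np.linspace(5.0, 0.5, 10)

# PCA via SVD: centre the columns, then decompose Xc = U @ diag(S) @ Vt.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

# The variance captured by each component is proportional to S**2.
explained_ratio = S**2 / np.sum(S**2)
cumulative = np.cumsum(explained_ratio)

# Smallest number of components whose cumulative variance reaches 95%.
k95 = int(np.argmax(cumulative >= 0.95)) + 1
print("components needed for 95% variance:", k95)

# Projection of every row onto the first two principal directions v1, v2.
projected = Xc @ Vt[:2].T
print(projected.shape)
```

Plotting `cumulative` against the component index gives the captured-variance curve, and the two columns of `projected` are the v1 and v2 coordinates that a scatter plot, colored by digit label, would use.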

  • Uploaded By : ethan
  • Posted on : December 16th, 2018
  • Downloads : 1
