Coursera: Machine Learning (Week 8) Quiz - Unsupervised Learning | Andrew NG

by Akshay Daga (APDaga) - November 29, 2019

▸ Unsupervised Learning:

For which of the following tasks might K-means clustering be a suitable algorithm?
Select all that apply.
- Given a set of news articles from many different news websites, find out what are the main topics covered.
  K-means can cluster the articles, and then we can inspect the clusters or use other methods to infer what topic each one represents.
- Given historical weather records, predict if tomorrow’s weather will be sunny or rainy.
- From the user usage patterns on a website, figure out what different groups of users exist.
  We can cluster the users with K-means to find different, distinct groups.
- Given many emails, you want to determine if they are Spam or Non-Spam emails.
- Given a database of information about your users, automatically group them into different market segments.
  You can use K-means to cluster the database entries, and each cluster will correspond to a different market segment.
- Given sales data from a large number of products in a supermarket, figure out which products tend to form coherent groups (say are frequently purchased together) and thus should be put on the same shelf.
  If you cluster the sales data with K-means, each cluster should correspond to a coherent group of items.
- Given sales data from a large number of products in a supermarket, estimate future sales for each of these products.

Suppose we have three cluster centroids $\mu_1 = \begin{bmatrix} 1\\ 2 \end{bmatrix}$ , $\mu_2 = \begin{bmatrix} -3\\ 0 \end{bmatrix}$ and $\mu_3 = \begin{bmatrix} 4\\ 2 \end{bmatrix}$ .
Furthermore, we have a training example $x^{(i)} = \begin{bmatrix} -1\\ 2 \end{bmatrix}$ . After a cluster assignment step, what will $C^{(i)}$ be?
- $C^{(i)}$ = 1
  $x^{(i)}$ is closest to $\mu_1$ , so $C^{(i)}$ = 1.
  (Calculate the Euclidean distance to each centroid and choose the smallest one. The squared distances to $\mu_1$ , $\mu_2$ and $\mu_3$ are 4, 8 and 25 respectively.)
- $C^{(i)}$ is not assigned
- $C^{(i)}$ = 2
- $C^{(i)}$ = 3
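
The cluster assignment for this question can be checked with a few lines of Python. This is only a minimal sketch: the helper names `squared_distance` and `assign_cluster` are made up for illustration and do not come from the course material.

```python
# Centroids mu_1, mu_2, mu_3 and the training example x from the question.
centroids = [(1, 2), (-3, 0), (4, 2)]
x = (-1, 2)

def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def assign_cluster(point, centroids):
    """Return the 1-based index of the centroid closest to `point`."""
    distances = [squared_distance(point, mu) for mu in centroids]
    return distances.index(min(distances)) + 1

squared_distances = [squared_distance(x, mu) for mu in centroids]
c_i = assign_cluster(x, centroids)

print(squared_distances)  # [4, 8, 25]
print(c_i)                # 1
```

Comparing squared distances gives the same winner as comparing the distances themselves, so the square root can be skipped.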

K-means is an iterative algorithm, and two of the following steps are repeatedly carried out in its inner-loop. Which two?
- Move the cluster centroids, where the centroids $\mu_k$ are updated.
  The cluster update is the second step of the K-means loop.
- The cluster assignment step, where the parameters $C^{(i)}$ are updated.
  This is the correct first step of the K-means loop.
- Using the elbow method to choose K.
- Feature scaling, to ensure each feature is on a comparable scale to the others.
- The cluster centroid assignment step, where each cluster centroid $\mu_i$ is assigned (by setting $C^{(i)}$ ) to the closest training example $x^{(i)}$ .
- Move each cluster centroid $\mu_k$ , by setting it to be equal to the closest training example $x^{(i)}$ .
- Test on the cross-validation set.
- Randomly initialize the cluster centroids.
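
The two inner-loop steps can be sketched in plain Python as follows. The toy data, the starting centroids, the function names, and the choice to leave an empty cluster's centroid where it is are all assumptions made for this illustration. Cluster indices are 0-based here, unlike the 1-based $C^{(i)}$ in the quiz.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def assign_clusters(points, centroids):
    """Cluster assignment step: c[i] is the index of the centroid closest to points[i]."""
    return [
        min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
        for p in points
    ]

def move_centroids(points, assignments, centroids):
    """Move-centroid step: mu_k becomes the mean of the points assigned to cluster k."""
    moved = []
    for k, old_mu in enumerate(centroids):
        members = [p for p, c in zip(points, assignments) if c == k]
        if not members:
            moved.append(old_mu)  # assumption: an empty cluster keeps its centroid
        else:
            moved.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
    return moved

# Toy data and starting centroids, chosen only for this example.
points = [(1.0, 1.0), (1.0, 2.0), (5.0, 5.0), (6.0, 5.0)]
centroids = [(0.0, 0.0), (6.0, 6.0)]

for _ in range(5):  # inner loop: the two steps, repeated
    assignments = assign_clusters(points, centroids)
    centroids = move_centroids(points, assignments, centroids)

print(assignments)  # [0, 0, 1, 1]
print(centroids)    # [(1.0, 1.5), (5.5, 5.0)]
```

Initialization, feature scaling, and choosing K all happen outside this loop, which is why they are not among the two repeated steps.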

Suppose you have an unlabeled dataset $\{x^{(1)}, ... , x^{(m)}\}$ . You run K-means with 50 different random initializations, and obtain 50 different clusterings of the data.

What is the recommended way for choosing which one of these 50 clusterings to use?
- Use the elbow method.
- Plot the data and the cluster centroids, and pick the clustering that gives the most “coherent” cluster centroids.
- Manually examine the clusterings, and pick the best one.
- Compute the distortion function $J(C^{(1)}, ... , C^{(m)}, \mu_1, ... , \mu_k)$ , and pick the one that minimizes this.
  A lower value for the distortion function implies a better clustering, so you should choose the clustering with the smallest value for the distortion function.
- The only way to do so is if we also have labels $y^{(i)}$ for our data.
- Always pick the final (50th) clustering found, since by that time it is more likely to have converged to a good solution.
- The answer is ambiguous, and there is no good way of choosing.
- For each of the clusterings, compute $\frac{1}{m} \sum_{i=1}^m \left \| x^{(i)} - \mu_{c^{(i)}} \right \|^2$ , and pick the one that minimizes this.
  This function is the distortion function. Since a lower value for the distortion function implies a better clustering, you should choose the clustering with the smallest value for the distortion function.
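
Picking the clustering with the lowest distortion can be sketched as below. To keep the result reproducible, the two initializations are hard-coded instead of drawn at random, and the four-point dataset is invented for the example. The names `run_k_means` and `distortion` are illustrative, not part of any library.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def run_k_means(points, centroids, iterations=10):
    """Run K-means from the given initial centroids; return (assignments, centroids)."""
    for _ in range(iterations):
        # Cluster assignment step.
        assignments = [
            min(range(len(centroids)), key=lambda k: squared_distance(p, centroids[k]))
            for p in points
        ]
        # Move-centroid step (an empty cluster keeps its old centroid).
        new_centroids = []
        for k, old_mu in enumerate(centroids):
            members = [p for p, c in zip(points, assignments) if c == k]
            if members:
                new_centroids.append(tuple(sum(coord) / len(members) for coord in zip(*members)))
            else:
                new_centroids.append(old_mu)
        centroids = new_centroids
    return assignments, centroids

def distortion(points, assignments, centroids):
    """Average squared distance from each point to its assigned centroid."""
    total = sum(squared_distance(p, centroids[c]) for p, c in zip(points, assignments))
    return total / len(points)

# Four corners of a wide rectangle: the natural clustering is left vs. right.
points = [(0.0, 0.0), (0.0, 1.0), (4.0, 0.0), (4.0, 1.0)]

# Stand-ins for random initializations (each picks K = 2 training examples).
initializations = [
    [(0.0, 0.0), (0.0, 1.0)],  # converges to a poor bottom/top split
    [(0.0, 0.0), (4.0, 0.0)],  # converges to the left/right split
]

results = [run_k_means(points, init) for init in initializations]
distortions = [distortion(points, a, mu) for a, mu in results]
best = distortions.index(min(distortions))
best_assignments, best_centroids = results[best]

print(distortions)       # [4.0, 0.25]
print(best_assignments)  # [0, 0, 1, 1]
```

The first initialization gets stuck in a local optimum with distortion 4.0, which is exactly the situation that running many initializations and keeping the lowest-distortion result is meant to handle.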

Which of the following statements are true? Select all that apply.
- On every iteration of K-means, the cost function $J(C^{(1)}, ... , C^{(m)}, \mu_1, ... , \mu_k)$ (the distortion function) should either stay the same or decrease; in particular, it should not increase.
  Neither the cluster assignment step nor the cluster update step can increase the cost / distortion function, so it should never increase after an iteration of K-means.
- A good way to initialize K-means is to select K (distinct) examples from the training set and set the cluster centroids equal to these selected examples.
  This is the recommended method of initialization.
- K-Means will always give the same results regardless of the initialization of the centroids.
- Once an example has been assigned to a particular centroid, it will never be reassigned to another different centroid.
- For some datasets, the “right” or “correct” value of K (the number of clusters) can be ambiguous, and hard even for a human expert looking carefully at the data to decide.
  In many datasets, different choices of K will give different clusterings which appear quite reasonable. With no labels on the data, we cannot say one is better than the other.
- The standard way of initializing K-means is setting $\mu_1 = ... = \mu_k$ to be equal to a vector of zeros.
- If we are worried about K-means getting stuck in bad local optima, one way to ameliorate (reduce) this problem is if we try using multiple random initializations.
  Since each run of K-means is independent, multiple runs can find different optima, and some should avoid bad local optima.
- Since K-Means is an unsupervised learning algorithm, it cannot overfit the data, and thus it is always better to have as large a number of clusters as is computationally feasible.
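
The claim that the distortion never increases can be justified in a few lines. This is a sketch of the standard argument, writing $C$ and $\mu$ for all assignments and all centroids before an iteration, and $C'$ and $\mu'$ (notation introduced here) for their values after it:

$$
J(C, \mu) = \frac{1}{m} \sum_{i=1}^m \left\| x^{(i)} - \mu_{C^{(i)}} \right\|^2
$$

The cluster assignment step holds $\mu$ fixed and picks, for every example, the closest centroid, so no term of the sum can grow:

$$
C'^{(i)} = \arg\min_k \left\| x^{(i)} - \mu_k \right\|^2 \quad \Rightarrow \quad J(C', \mu) \le J(C, \mu)
$$

The move-centroid step holds $C'$ fixed and sets each $\mu'_k$ to the mean of the examples assigned to cluster $k$, which is the point that minimizes the sum of squared distances to those examples:

$$
J(C', \mu') \le J(C', \mu) \le J(C, \mu)
$$

The same cost also shows why a larger K is not automatically better: with K = m and every centroid placed on its own training example, $J = 0$, yet the clustering says nothing useful about the data.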

Feel free to ask questions in the comment section. I will try my best to answer them.
If you find this helpful in any way, like, comment, and share the post.
This is the simplest way to encourage me to keep doing such work.

Thanks & Regards,
- APDaga DumpBox

3 Comments

Unknown, 23 May 2021 at 16:11
The 2nd is wrong. The answer should be the 3rd option, $C^{(i)}$ = 2.

Unknown, 16 April 2022 at 22:57
If you calculate the Euclidean distance between x and the 3 centroids you get 4, 8 & 25 respectively, not 4, 20 and 25! The rest is true, thanks for your efforts!
