Implementing K-Means, K-Modes, and K-Prototype with Python
Clustering
In principle, clustering methods are useful across many scientific fields, for example:
- Object identification (recognition)
Clustering is commonly used in image processing, computer vision, robot vision, and related areas.
- Decision support systems and data mining
Here clustering is commonly used for market segmentation, regional mapping, marketing management, and similar tasks.
K-Means Clustering
In general, K-Means clustering groups data with the following algorithm:
- Choose the number of clusters, k.
- Allocate the data to clusters at random as a starting point.
- Compute the centroid (mean) of each cluster.
- Compute the distance from each data point to every centroid, and allocate the point to the nearest cluster.
- Repeat the third and fourth steps until the centroids no longer change.
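The steps above can be sketched with NumPy as follows. This is only a minimal illustration, not the scikit-learn implementation used later in this article; the function name `kmeans`, the `max_iter` and `seed` parameters, and the rule that an empty cluster keeps its previous centroid are assumptions of this sketch.

```python
import numpy as np


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster the rows of `points` into k groups (illustrative sketch)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Step 2: random initial allocation. Shuffling evenly spread labels
    # ensures every cluster starts non-empty (assumes len(points) >= k).
    labels = rng.permutation(np.arange(len(points)) % k)
    centroids = np.zeros((k, points.shape[1]))
    for _ in range(max_iter):
        # Step 3: centroid = mean of the points currently in each cluster.
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:  # assumption: an empty cluster keeps its old centroid
                centroids[j] = members.mean(axis=0)
        # Step 4: Euclidean distance to every centroid, then nearest cluster.
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        new_labels = distances.argmin(axis=1)
        # Step 5: stop once the allocation, and hence the centroids, is stable.
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels, centroids


# Two well-separated groups of three points each.
sample = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
labels, centroids = kmeans(sample, k=2)
print(labels)
print(centroids)
```

Which group receives label 0 and which receives label 1 depends on the random starting allocation, so only the grouping itself is meaningful.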
K-Modes Clustering
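As a rough illustration, K-Modes is commonly described as the categorical counterpart of K-Means: the cluster mean is replaced by the most frequent value of each attribute (the mode), and distance is replaced by the number of attributes on which two records differ. The sketch below assumes that formulation; the helper names `matching_dissimilarity` and `cluster_mode` are hypothetical and are not part of the kmodes library.

```python
from collections import Counter


def matching_dissimilarity(a, b):
    """Count the attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def cluster_mode(records):
    """Return the most frequent value of each attribute within one cluster."""
    return tuple(Counter(column).most_common(1)[0][0] for column in zip(*records))


# Three illustrative records (color, size, act).
records = [
    ("YELLOW", "SMALL", "STRETCH"),
    ("YELLOW", "LARGE", "STRETCH"),
    ("PURPLE", "SMALL", "STRETCH"),
]
mode = cluster_mode(records)
print(mode)                                      # ('YELLOW', 'SMALL', 'STRETCH')
print(matching_dissimilarity(records[1], mode))  # 1
```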
K-Prototype Clustering
K-Prototype clustering is a method that combines K-Means and K-Modes. It is used to cluster data that has both numeric and categorical attributes.
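One common way to combine the two, offered here as an assumption about what the combination means rather than as the exact method of any library, is to add the squared Euclidean distance over the numeric attributes to a weighted count of mismatches over the categorical attributes. The function name `mixed_dissimilarity` and the weight parameter `gamma` are hypothetical names chosen for this sketch.

```python
import numpy as np


def mixed_dissimilarity(num_a, num_b, cat_a, cat_b, gamma=1.0):
    """Numeric part (K-Means style) plus weighted categorical part (K-Modes style)."""
    diff = np.asarray(num_a, dtype=float) - np.asarray(num_b, dtype=float)
    numeric_part = float(np.sum(diff ** 2))
    categorical_part = sum(x != y for x, y in zip(cat_a, cat_b))
    # gamma balances the influence of categorical mismatches against numeric distance.
    return numeric_part + gamma * categorical_part


# (1 - 2)^2 + (2 - 4)^2 = 5 for the numeric part, one mismatch weighted by 0.5.
print(mixed_dissimilarity([1.0, 2.0], [2.0, 4.0], ["a", "b"], ["a", "c"], gamma=0.5))  # 5.5
```

A larger `gamma` makes categorical disagreement count for more relative to numeric distance, so its value would normally be chosen with the scale of the numeric columns in mind.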
Implementation with Python
K-Means
For the K-Means implementation we can use the glass data, which can be downloaded from this link, and display it as a data frame. The first step is to load the data with the pandas library.

import pandas as pd

# The file uses ';' as the field separator and ',' as the decimal mark.
data = pd.read_csv('glass.csv', delimiter=';', decimal=',')
df = pd.DataFrame(data)
# hide(axis="index") replaces the older hide_index(), which recent pandas versions no longer provide.
df.style.hide(axis="index")

The data is displayed as follows:
ID | refractive index | Sodium | Magnesium | Aluminum | Silicon | Potassium | Calcium | Barium | Iron |
---|---|---|---|---|---|---|---|---|---|
1 | 1.52101 | 13.64 | 4.49 | 1.1 | 71.78 | 0.06 | 8.75 | 0 | 0.001 |
2 | 1.51761 | 13.89 | 3.6 | 1.36 | 72.73 | 0.48 | 7.83 | 0 | 0 |
3 | 1.51618 | 13.53 | 3.55 | 1.54 | 72.99 | 0.39 | 7.78 | 0 | 0 |
4 | 1.51766 | 13.21 | 3.69 | 1.29 | 72.61 | 0.57 | 8.22 | 0 | 0 |
5 | 1.51742 | 13.27 | 3.62 | 1.24 | 73.08 | 0.55 | 8.07 | 0 | 0 |
6 | 1.51596 | 12.79 | 3.61 | 1.62 | 72.97 | 0.64 | 8.07 | 0 | 0.26 |
7 | 1.51743 | 13.3 | 3.6 | 1.14 | 73.09 | 0.58 | 8.17 | 0 | 0 |
8 | 1.51756 | 13.15 | 3.61 | 1.05 | 73.24 | 0.57 | 8.24 | 0 | 0 |
9 | 1.51918 | 14.04 | 3.58 | 1.37 | 72.08 | 0.56 | 8.3 | 0 | 0 |
10 | 1.51755 | 13 | 3.6 | 1.36 | 72.99 | 0.57 | 8.4 | 0 | 0.11 |
11 | 1.51571 | 12.72 | 3.46 | 1.56 | 73.2 | 0.67 | 8.09 | 0 | 0.24 |
12 | 1.51763 | 12.8 | 3.66 | 1.27 | 73.01 | 0.6 | 8.56 | 0 | 0 |
13 | 1.51589 | 12.88 | 3.43 | 1.4 | 73.28 | 0.69 | 8.05 | 0 | 0.24 |
14 | 1.51748 | 12.86 | 3.56 | 1.27 | 73.21 | 0.54 | 8.38 | 0 | 0.17 |
15 | 1.51763 | 12.61 | 3.59 | 1.31 | 73.29 | 0.58 | 8.5 | 0 | 0 |
16 | 1.51761 | 12.81 | 3.54 | 1.23 | 73.24 | 0.58 | 8.39 | 0 | 0 |
17 | 1.51784 | 12.68 | 3.67 | 1.16 | 73.11 | 0.61 | 8.7 | 0 | 0 |
18 | 1.52196 | 14.36 | 3.85 | 0.89 | 71.36 | 0.15 | 9.15 | 0 | 0 |
19 | 1.51911 | 13.9 | 3.73 | 1.18 | 72.12 | 0.06 | 8.89 | 0 | 0 |
20 | 1.51735 | 13.02 | 3.54 | 1.69 | 72.73 | 0.54 | 8.44 | 0 | 0.07 |
21 | 1.5175 | 12.82 | 3.55 | 1.49 | 72.75 | 0.54 | 8.52 | 0 | 0.19 |
22 | 1.51966 | 14.77 | 3.75 | 0.29 | 72.02 | 0.03 | 9 | 0 | 0 |
23 | 1.51736 | 12.78 | 3.62 | 1.29 | 72.79 | 0.59 | 8.7 | 0 | 0 |
24 | 1.51751 | 12.81 | 3.57 | 1.35 | 73.02 | 0.62 | 8.59 | 0 | 0 |
25 | 1.5172 | 13.38 | 3.5 | 1.15 | 72.85 | 0.5 | 8.43 | 0 | 0 |
26 | 1.51764 | 12.98 | 3.54 | 1.21 | 73 | 0.65 | 8.53 | 0 | 0 |
27 | 1.51793 | 13.21 | 3.48 | 1.41 | 72.64 | 0.59 | 8.43 | 0 | 0 |
28 | 1.51721 | 12.87 | 3.48 | 1.33 | 73.04 | 0.56 | 8.43 | 0 | 0 |
29 | 1.51768 | 12.56 | 3.52 | 1.43 | 73.15 | 0.57 | 8.54 | 0 | 0 |
30 | 1.51784 | 13.08 | 3.49 | 1.28 | 72.86 | 0.6 | 8.49 | 0 | 0 |
31 | 1.51768 | 12.65 | 3.56 | 1.3 | 73.08 | 0.61 | 8.69 | 0 | 0.14 |
32 | 1.51747 | 12.84 | 3.5 | 1.14 | 73.27 | 0.56 | 8.55 | 0 | 0 |
33 | 1.51775 | 12.85 | 3.48 | 1.23 | 72.97 | 0.61 | 8.56 | 0.09 | 0.22 |
34 | 1.51753 | 12.57 | 3.47 | 1.38 | 73.39 | 0.6 | 8.55 | 0 | 0.06 |
35 | 1.51783 | 12.69 | 3.54 | 1.34 | 72.95 | 0.57 | 8.75 | 0 | 0 |
36 | 1.51567 | 13.29 | 3.45 | 1.21 | 72.74 | 0.56 | 8.57 | 0 | 0 |
37 | 1.51909 | 13.89 | 3.53 | 1.32 | 71.81 | 0.51 | 8.78 | 0.11 | 0 |
38 | 1.51797 | 12.74 | 3.48 | 1.35 | 72.96 | 0.64 | 8.68 | 0 | 0 |
39 | 1.52213 | 14.21 | 3.82 | 0.47 | 71.77 | 0.11 | 9.57 | 0 | 0 |
40 | 1.52213 | 14.21 | 3.82 | 0.47 | 71.77 | 0.11 | 9.57 | 0 | 0 |
41 | 1.51793 | 12.79 | 3.5 | 1.12 | 73.03 | 0.64 | 8.77 | 0 | 0 |
42 | 1.51755 | 12.71 | 3.42 | 1.2 | 73.2 | 0.59 | 8.64 | 0 | 0 |
43 | 1.51779 | 13.21 | 3.39 | 1.33 | 72.76 | 0.59 | 8.59 | 0 | 0 |
44 | 1.5221 | 13.73 | 3.84 | 0.72 | 71.76 | 0.17 | 9.74 | 0 | 0 |
45 | 1.51786 | 12.73 | 3.43 | 1.19 | 72.95 | 0.62 | 8.76 | 0 | 0.3 |
46 | 1.519 | 13.49 | 3.48 | 1.35 | 71.95 | 0.55 | 9 | 0 | 0 |
47 | 1.51869 | 13.19 | 3.37 | 1.18 | 72.72 | 0.57 | 8.83 | 0 | 0.16 |
48 | 1.52667 | 13.99 | 3.7 | 0.71 | 71.57 | 0.02 | 9.82 | 0 | 0.1 |
49 | 1.52223 | 13.21 | 3.77 | 0.79 | 71.99 | 0.13 | 10.02 | 0 | 0 |
50 | 1.51898 | 13.58 | 3.35 | 1.23 | 72.08 | 0.59 | 8.91 | 0 | 0 |
51 | 1.5232 | 13.72 | 3.72 | 0.51 | 71.75 | 0.09 | 10.06 | 0 | 0.16 |
52 | 1.51926 | 13.2 | 3.33 | 1.28 | 72.36 | 0.6 | 9.14 | 0 | 0.11 |
53 | 1.51808 | 13.43 | 2.87 | 1.19 | 72.84 | 0.55 | 9.03 | 0 | 0 |
54 | 1.51837 | 13.14 | 2.84 | 1.28 | 72.85 | 0.55 | 9.07 | 0 | 0 |
55 | 1.51778 | 13.21 | 2.81 | 1.29 | 72.98 | 0.51 | 9.02 | 0 | 0.09 |
56 | 1.51769 | 12.45 | 2.71 | 1.29 | 73.7 | 0.56 | 9.06 | 0 | 0.24 |
57 | 1.51215 | 12.99 | 3.47 | 1.12 | 72.98 | 0.62 | 8.35 | 0 | 0.31 |
58 | 1.51824 | 12.87 | 3.48 | 1.29 | 72.95 | 0.6 | 8.43 | 0 | 0 |
59 | 1.51754 | 13.48 | 3.74 | 1.17 | 72.99 | 0.59 | 8.03 | 0 | 0 |
60 | 1.51754 | 13.39 | 3.66 | 1.19 | 72.79 | 0.57 | 8.27 | 0 | 0.11 |
61 | 1.51905 | 13.6 | 3.62 | 1.11 | 72.64 | 0.14 | 8.76 | 0 | 0 |
62 | 1.51977 | 13.81 | 3.58 | 1.32 | 71.72 | 0.12 | 8.67 | 0.69 | 0 |
63 | 1.52172 | 13.51 | 3.86 | 0.88 | 71.79 | 0.23 | 9.54 | 0 | 0.11 |
64 | 1.52227 | 14.17 | 3.81 | 0.78 | 71.35 | 0 | 9.69 | 0 | 0 |
65 | 1.52172 | 13.48 | 3.74 | 0.9 | 72.01 | 0.18 | 9.61 | 0 | 0.07 |
66 | 1.52099 | 13.69 | 3.59 | 1.12 | 71.96 | 0.09 | 9.4 | 0 | 0 |
67 | 1.52152 | 13.05 | 3.65 | 0.87 | 72.22 | 0.19 | 9.85 | 0 | 0.17 |
68 | 1.52152 | 13.05 | 3.65 | 0.87 | 72.32 | 0.19 | 9.85 | 0 | 0.17 |
69 | 1.52152 | 13.12 | 3.58 | 0.9 | 72.2 | 0.23 | 9.82 | 0 | 0.16 |
70 | 1.523 | 13.31 | 3.58 | 0.82 | 71.99 | 0.12 | 10.17 | 0 | 0.03 |
71 | 1.51574 | 14.86 | 3.67 | 1.74 | 71.87 | 0.16 | 7.36 | 0 | 0.12 |
72 | 1.51848 | 13.64 | 3.87 | 1.27 | 71.96 | 0.54 | 8.32 | 0 | 0.32 |
73 | 1.51593 | 13.09 | 3.59 | 1.52 | 73.1 | 0.67 | 7.83 | 0 | 0 |
74 | 1.51631 | 13.34 | 3.57 | 1.57 | 72.87 | 0.61 | 7.89 | 0 | 0 |
75 | 1.51596 | 13.02 | 3.56 | 1.54 | 73.11 | 0.72 | 7.9 | 0 | 0 |
76 | 1.5159 | 13.02 | 3.58 | 1.51 | 73.12 | 0.69 | 7.96 | 0 | 0 |
77 | 1.51645 | 13.44 | 3.61 | 1.54 | 72.39 | 0.66 | 8.03 | 0 | 0 |
78 | 1.51627 | 13 | 3.58 | 1.54 | 72.83 | 0.61 | 8.04 | 0 | 0 |
79 | 1.51613 | 13.92 | 3.52 | 1.25 | 72.88 | 0.37 | 7.94 | 0 | 0.14 |
80 | 1.5159 | 12.82 | 3.52 | 1.9 | 72.86 | 0.69 | 7.97 | 0 | 0 |
81 | 1.51592 | 12.86 | 3.52 | 2.12 | 72.66 | 0.69 | 7.97 | 0 | 0 |
82 | 1.51593 | 13.25 | 3.45 | 1.43 | 73.17 | 0.61 | 7.86 | 0 | 0 |
83 | 1.51646 | 13.41 | 3.55 | 1.25 | 72.81 | 0.68 | 8.1 | 0 | 0 |
84 | 1.51594 | 13.09 | 3.52 | 1.55 | 72.87 | 0.68 | 8.05 | 0 | 0.09 |
85 | 1.51409 | 14.25 | 3.09 | 2.08 | 72.28 | 1.1 | 7.08 | 0 | 0 |
86 | 1.51625 | 13.36 | 3.58 | 1.49 | 72.72 | 0.45 | 8.21 | 0 | 0 |
87 | 1.51569 | 13.24 | 3.49 | 1.47 | 73.25 | 0.38 | 8.03 | 0 | 0 |
88 | 1.51645 | 13.4 | 3.49 | 1.52 | 72.65 | 0.67 | 8.08 | 0 | 0.1 |
89 | 1.51618 | 13.01 | 3.5 | 1.48 | 72.89 | 0.6 | 8.12 | 0 | 0 |
90 | 1.5164 | 12.55 | 3.48 | 1.87 | 73.23 | 0.63 | 8.08 | 0 | 0.09 |
91 | 1.51841 | 12.93 | 3.74 | 1.11 | 72.28 | 0.64 | 8.96 | 0 | 0.22 |
92 | 1.51605 | 12.9 | 3.44 | 1.45 | 73.06 | 0.44 | 8.27 | 0 | 0 |
93 | 1.51588 | 13.12 | 3.41 | 1.58 | 73.26 | 0.07 | 8.39 | 0 | 0.19 |
94 | 1.5159 | 13.24 | 3.34 | 1.47 | 73.1 | 0.39 | 8.22 | 0 | 0 |
95 | 1.51629 | 12.71 | 3.33 | 1.49 | 73.28 | 0.67 | 8.24 | 0 | 0 |
96 | 1.5186 | 13.36 | 3.43 | 1.43 | 72.26 | 0.51 | 8.6 | 0 | 0 |
97 | 1.51841 | 13.02 | 3.62 | 1.06 | 72.34 | 0.64 | 9.13 | 0 | 0.15 |
98 | 1.51743 | 12.2 | 3.25 | 1.16 | 73.55 | 0.62 | 8.9 | 0 | 0.24 |
99 | 1.51689 | 12.67 | 2.88 | 1.71 | 73.21 | 0.73 | 8.54 | 0 | 0 |
100 | 1.51811 | 12.96 | 2.96 | 1.43 | 72.92 | 0.6 | 8.79 | 0.14 | 0 |
101 | 1.51655 | 12.75 | 2.85 | 1.44 | 73.27 | 0.57 | 8.79 | 0.11 | 0.22 |
102 | 1.5173 | 12.35 | 2.72 | 1.63 | 72.87 | 0.7 | 9.23 | 0 | 0 |
103 | 1.5182 | 12.62 | 2.76 | 0.83 | 73.81 | 0.35 | 9.42 | 0 | 0.2 |
104 | 1.52725 | 13.8 | 3.15 | 0.66 | 70.57 | 0.08 | 11.64 | 0 | 0 |
105 | 1.5241 | 13.83 | 2.9 | 1.17 | 71.15 | 0.08 | 10.79 | 0 | 0 |
106 | 1.52475 | 11.45 | 0 | 1.88 | 72.19 | 0.81 | 13.24 | 0 | 0.34 |
107 | 1.53125 | 10.73 | 0 | 2.1 | 69.81 | 0.58 | 13.3 | 3.15 | 0.28 |
108 | 1.53393 | 12.3 | 0 | 1 | 70.16 | 0.12 | 16.19 | 0 | 0.24 |
109 | 1.52222 | 14.43 | 0 | 1 | 72.67 | 0.1 | 11.52 | 0 | 0.08 |
110 | 1.51818 | 13.72 | 0 | 0.56 | 74.45 | 0 | 10.99 | 0 | 0 |
111 | 1.52664 | 11.23 | 0 | 0.77 | 73.21 | 0 | 14.68 | 0 | 0 |
112 | 1.52739 | 11.02 | 0 | 0.75 | 73.08 | 0 | 14.96 | 0 | 0 |
113 | 1.52777 | 12.64 | 0 | 0.67 | 72.02 | 0.06 | 14.4 | 0 | 0 |
114 | 1.51892 | 13.46 | 3.83 | 1.26 | 72.55 | 0.57 | 8.21 | 0 | 0.14 |
115 | 1.51847 | 13.1 | 3.97 | 1.19 | 72.44 | 0.6 | 8.43 | 0 | 0 |
116 | 1.51846 | 13.41 | 3.89 | 1.33 | 72.38 | 0.51 | 8.28 | 0 | 0 |
117 | 1.51829 | 13.24 | 3.9 | 1.41 | 72.33 | 0.55 | 8.31 | 0 | 0.1 |
118 | 1.51708 | 13.72 | 3.68 | 1.81 | 72.06 | 0.64 | 7.88 | 0 | 0 |
119 | 1.51673 | 13.3 | 3.64 | 1.53 | 72.53 | 0.65 | 8.03 | 0 | 0.29 |
120 | 1.51652 | 13.56 | 3.57 | 1.47 | 72.45 | 0.64 | 7.96 | 0 | 0 |
121 | 1.51844 | 13.25 | 3.76 | 1.32 | 72.4 | 0.58 | 8.42 | 0 | 0 |
122 | 1.51663 | 12.93 | 3.54 | 1.62 | 72.96 | 0.64 | 8.03 | 0 | 0.21 |
123 | 1.51687 | 13.23 | 3.54 | 1.48 | 72.84 | 0.56 | 8.1 | 0 | 0 |
124 | 1.51707 | 13.48 | 3.48 | 1.71 | 72.52 | 0.62 | 7.99 | 0 | 0 |
125 | 1.52177 | 13.2 | 3.68 | 1.15 | 72.75 | 0.54 | 8.52 | 0 | 0 |
126 | 1.51872 | 12.93 | 3.66 | 1.56 | 72.51 | 0.58 | 8.55 | 0 | 0.12 |
127 | 1.51667 | 12.94 | 3.61 | 1.26 | 72.75 | 0.56 | 8.6 | 0 | 0 |
128 | 1.52081 | 13.78 | 2.28 | 1.43 | 71.99 | 0.49 | 9.85 | 0 | 0.17 |
129 | 1.52068 | 13.55 | 2.09 | 1.67 | 72.18 | 0.53 | 9.57 | 0.27 | 0.17 |
130 | 1.5202 | 13.98 | 1.35 | 1.63 | 71.76 | 0.39 | 10.56 | 0 | 0.18 |
131 | 1.52177 | 13.75 | 1.01 | 1.36 | 72.19 | 0.33 | 11.14 | 0 | 0 |
132 | 1.52614 | 13.7 | 0 | 1.36 | 71.24 | 0.19 | 13.44 | 0 | 0.1 |
133 | 1.51813 | 13.43 | 3.98 | 1.18 | 72.49 | 0.58 | 8.15 | 0 | 0 |
134 | 1.518 | 13.71 | 3.93 | 1.54 | 71.81 | 0.54 | 8.21 | 0 | 0.15 |
135 | 1.51811 | 13.33 | 3.85 | 1.25 | 72.78 | 0.52 | 8.12 | 0 | 0 |
136 | 1.51789 | 13.19 | 3.9 | 1.3 | 72.33 | 0.55 | 8.44 | 0 | 0.28 |
137 | 1.51806 | 13 | 3.8 | 1.08 | 73.07 | 0.56 | 8.38 | 0 | 0.12 |
138 | 1.51711 | 12.89 | 3.62 | 1.57 | 72.96 | 0.61 | 8.11 | 0 | 0 |
139 | 1.51674 | 12.79 | 3.52 | 1.54 | 73.36 | 0.66 | 7.9 | 0 | 0 |
140 | 1.51674 | 12.87 | 3.56 | 1.64 | 73.14 | 0.65 | 7.99 | 0 | 0 |
141 | 1.5169 | 13.33 | 3.54 | 1.61 | 72.54 | 0.68 | 8.11 | 0 | 0 |
142 | 1.51851 | 13.2 | 3.63 | 1.07 | 72.83 | 0.57 | 8.41 | 0.09 | 0.17 |
143 | 1.51662 | 12.85 | 3.51 | 1.44 | 73.01 | 0.68 | 8.23 | 0.06 | 0.25 |
144 | 1.51709 | 13 | 3.47 | 1.79 | 72.72 | 0.66 | 8.18 | 0 | 0 |
145 | 1.5166 | 12.99 | 3.18 | 1.23 | 72.97 | 0.58 | 8.81 | 0 | 0.24 |
146 | 1.51839 | 12.85 | 3.67 | 1.24 | 72.57 | 0.62 | 8.68 | 0 | 0.35 |
147 | 1.51769 | 13.65 | 3.66 | 1.11 | 72.77 | 0.11 | 8.6 | 0 | 0 |
148 | 1.5161 | 13.33 | 3.53 | 1.34 | 72.67 | 0.56 | 8.33 | 0 | 0 |
149 | 1.5167 | 13.24 | 3.57 | 1.38 | 72.7 | 0.56 | 8.44 | 0 | 0.1 |
150 | 1.51643 | 12.16 | 3.52 | 1.35 | 72.89 | 0.57 | 8.53 | 0 | 0 |
151 | 1.51665 | 13.14 | 3.45 | 1.76 | 72.48 | 0.6 | 8.38 | 0 | 0.17 |
152 | 1.52127 | 14.32 | 3.9 | 0.83 | 71.5 | 0 | 9.49 | 0 | 0 |
153 | 1.51779 | 13.64 | 3.65 | 0.65 | 73 | 0.06 | 8.93 | 0 | 0 |
154 | 1.5161 | 13.42 | 3.4 | 1.22 | 72.69 | 0.59 | 8.32 | 0 | 0 |
155 | 1.51694 | 12.86 | 3.58 | 1.31 | 72.61 | 0.61 | 8.79 | 0 | 0 |
156 | 1.51646 | 13.04 | 3.4 | 1.26 | 73.01 | 0.52 | 8.58 | 0 | 0 |
157 | 1.51655 | 13.41 | 3.39 | 1.28 | 72.64 | 0.52 | 8.65 | 0 | 0 |
158 | 1.52121 | 14.03 | 3.76 | 0.58 | 71.79 | 0.11 | 9.65 | 0 | 0 |
159 | 1.51776 | 13.53 | 3.41 | 1.52 | 72.04 | 0.58 | 8.79 | 0 | 0 |
160 | 1.51796 | 13.5 | 3.36 | 1.63 | 71.94 | 0.57 | 8.81 | 0 | 0.09 |
161 | 1.51832 | 13.33 | 3.34 | 1.54 | 72.14 | 0.56 | 8.99 | 0 | 0 |
162 | 1.51934 | 13.64 | 3.54 | 0.75 | 72.65 | 0.16 | 8.89 | 0.15 | 0.24 |
163 | 1.52211 | 14.19 | 3.78 | 0.91 | 71.36 | 0.23 | 9.14 | 0 | 0.37 |
164 | 1.51514 | 14.01 | 2.68 | 3.5 | 69.89 | 1.68 | 5.87 | 2.2 | 0 |
165 | 1.51915 | 12.73 | 1.85 | 1.86 | 72.69 | 0.6 | 10.09 | 0 | 0 |
166 | 1.52171 | 11.56 | 1.88 | 1.56 | 72.86 | 0.47 | 11.41 | 0 | 0 |
167 | 1.52151 | 11.03 | 1.71 | 1.56 | 73.44 | 0.58 | 11.62 | 0 | 0 |
168 | 1.51969 | 12.64 | 0 | 1.65 | 73.75 | 0.38 | 11.53 | 0 | 0 |
169 | 1.51666 | 12.86 | 0 | 1.83 | 73.88 | 0.97 | 10.17 | 0 | 0 |
170 | 1.51994 | 13.27 | 0 | 1.76 | 73.03 | 0.47 | 11.32 | 0 | 0 |
171 | 1.52369 | 13.44 | 0 | 1.58 | 72.22 | 0.32 | 12.24 | 0 | 0 |
172 | 1.51316 | 13.02 | 0 | 3.04 | 70.48 | 6.21 | 6.96 | 0 | 0 |
173 | 1.51321 | 13 | 0 | 3.02 | 70.7 | 6.21 | 6.93 | 0 | 0 |
174 | 1.52043 | 13.38 | 0 | 1.4 | 72.25 | 0.33 | 12.5 | 0 | 0 |
175 | 1.52058 | 12.85 | 1.61 | 2.17 | 72.18 | 0.76 | 9.7 | 0.24 | 0.51 |
176 | 1.52119 | 12.97 | 0.33 | 1.51 | 73.39 | 0.13 | 11.27 | 0 | 0.28 |
177 | 1.51905 | 14 | 2.39 | 1.56 | 72.37 | 0 | 9.57 | 0 | 0 |
178 | 1.51937 | 13.79 | 2.41 | 1.19 | 72.76 | 0 | 9.77 | 0 | 0 |
179 | 1.51829 | 14.46 | 2.24 | 1.62 | 72.38 | 0 | 9.26 | 0 | 0 |
180 | 1.51852 | 14.09 | 2.19 | 1.66 | 72.67 | 0 | 9.32 | 0 | 0 |
181 | 1.51299 | 14.4 | 1.74 | 1.54 | 74.55 | 0 | 7.59 | 0 | 0 |
182 | 1.51888 | 14.99 | 0.78 | 1.74 | 72.5 | 0 | 9.95 | 0 | 0 |
183 | 1.51916 | 14.15 | 0 | 2.09 | 72.74 | 0 | 10.88 | 0 | 0 |
184 | 1.51969 | 14.56 | 0 | 0.56 | 73.48 | 0 | 11.22 | 0 | 0 |
185 | 1.51115 | 17.38 | 0 | 0.34 | 75.41 | 0 | 6.65 | 0 | 0 |
186 | 1.51131 | 13.69 | 3.2 | 1.81 | 72.81 | 1.76 | 5.43 | 1.19 | 0 |
187 | 1.51838 | 14.32 | 3.26 | 2.22 | 71.25 | 1.46 | 5.79 | 1.63 | 0 |
188 | 1.52315 | 13.44 | 3.34 | 1.23 | 72.38 | 0.6 | 8.83 | 0 | 0 |
189 | 1.52247 | 14.86 | 2.2 | 2.06 | 70.26 | 0.76 | 9.76 | 0 | 0 |
190 | 1.52365 | 15.79 | 1.83 | 1.31 | 70.43 | 0.31 | 8.61 | 1.68 | 0 |
191 | 1.51613 | 13.88 | 1.78 | 1.79 | 73.1 | 0 | 8.67 | 0.76 | 0 |
192 | 1.51602 | 14.85 | 0 | 2.38 | 73.28 | 0 | 8.76 | 0.64 | 0.09 |
193 | 1.51623 | 14.2 | 0 | 2.79 | 73.46 | 0.04 | 9.04 | 0.4 | 0.09 |
194 | 1.51719 | 14.75 | 0 | 2 | 73.02 | 0 | 8.53 | 1.59 | 0.08 |
195 | 1.51683 | 14.56 | 0 | 1.98 | 73.29 | 0 | 8.52 | 1.57 | 0.07 |
196 | 1.51545 | 14.14 | 0 | 2.68 | 73.39 | 0.08 | 9.07 | 0.61 | 0.05 |
197 | 1.51556 | 13.87 | 0 | 2.54 | 73.23 | 0.14 | 9.41 | 0.81 | 0.01 |
198 | 1.51727 | 14.7 | 0 | 2.34 | 73.28 | 0 | 8.95 | 0.66 | 0 |
199 | 1.51531 | 14.38 | 0 | 2.66 | 73.1 | 0.04 | 9.08 | 0.64 | 0 |
200 | 1.51609 | 15.01 | 0 | 2.51 | 73.05 | 0.05 | 8.83 | 0.53 | 0 |
201 | 1.51508 | 15.15 | 0 | 2.25 | 73.5 | 0 | 8.34 | 0.63 | 0 |
202 | 1.51653 | 11.95 | 0 | 1.19 | 75.18 | 2.7 | 8.93 | 0 | 0 |
203 | 1.51514 | 14.85 | 0 | 2.42 | 73.72 | 0 | 8.39 | 0.56 | 0 |
204 | 1.51658 | 14.8 | 0 | 1.99 | 73.11 | 0 | 8.28 | 1.71 | 0 |
205 | 1.51617 | 14.95 | 0 | 2.27 | 73.3 | 0 | 8.71 | 0.67 | 0 |
206 | 1.51732 | 14.95 | 0 | 1.8 | 72.99 | 0 | 8.61 | 1.55 | 0 |
207 | 1.51645 | 14.94 | 0 | 1.87 | 73.11 | 0 | 8.67 | 1.38 | 0 |
208 | 1.51831 | 14.39 | 0 | 1.82 | 72.86 | 1.41 | 6.47 | 2.88 | 0 |
209 | 1.5164 | 14.37 | 0 | 2.74 | 72.85 | 0 | 9.45 | 0.54 | 0 |
210 | 1.51623 | 14.14 | 0 | 2.88 | 72.61 | 0.08 | 9.18 | 1.06 | 0 |
211 | 1.51685 | 14.92 | 0 | 1.99 | 73.06 | 0 | 8.4 | 1.59 | 0 |
212 | 1.52065 | 14.36 | 0 | 2.02 | 73.42 | 0 | 8.44 | 1.64 | 0 |
213 | 1.51651 | 14.38 | 0 | 1.94 | 73.61 | 0 | 8.48 | 1.57 | 0 |
214 | 1.51711 | 14.23 | 0 | 2.08 | 73.36 | 0 | 8.62 | 1.67 | 0 |
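We can write a function that displays the members of each cluster as a table.

def show_cluster(data, k):
    # Collect, for each cluster label, the first-column IDs of its members.
    cluster = {}
    for i in range(k):
        cluster['Cluster ' + str(i)] = data[data["Cluster"].isin([i])].iloc[:, 0].values
    # One row per cluster, then transpose so that each cluster becomes a column.
    dframe = pd.DataFrame.from_dict(cluster, orient='index')
    dframe = dframe.transpose()
    dframe = dframe.fillna("")
    # hide(axis="index") replaces the older hide_index().
    return dframe.style.hide(axis="index")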
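To see the layout that show_cluster() produces without rendering a styled table, the same arrangement can be rebuilt as a plain DataFrame on a toy frame. This is only a sketch: the helper name `cluster_table` and the toy values are illustrative and do not come from the glass data.

```python
import pandas as pd


def cluster_table(data, k):
    """Return one column per cluster, listing the first-column IDs of its members."""
    members = [data.loc[data["Cluster"] == i].iloc[:, 0].tolist() for i in range(k)]
    height = max(len(m) for m in members)
    # Pad shorter clusters with empty strings so that all columns have equal length.
    padded = {"Cluster " + str(i): m + [""] * (height - len(m)) for i, m in enumerate(members)}
    return pd.DataFrame(padded)


toy = pd.DataFrame({"ID": [1, 2, 3, 4, 5], "Cluster": [0, 1, 0, 0, 1]})
table = cluster_table(toy, 2)
print(table)
```

Cluster 0 lists IDs 1, 3, and 4, while Cluster 1 lists IDs 2 and 5 followed by one blank cell, which is the same top-aligned layout seen in the cluster tables of this article.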
We can use KMeans from the sklearn library to cluster numeric data. This example uses k=5 to group the data into 5 clusters. The clustering result can then be merged with the existing data by adding a Cluster attribute, so that every row carries its own cluster label.

from sklearn.cluster import KMeans

k = 5
# All columns are numeric, so get_dummies() leaves them unchanged.
df_dummy = pd.get_dummies(df)
# Cluster on the measurement columns only; the ID column is excluded.
data_set = df.iloc[:, 1:].values
kmeans = KMeans(n_clusters=k)
cluster = kmeans.fit(data_set)
# Attach each row's cluster label to the original data.
data['Cluster'] = cluster.labels_
show_cluster(data, k)

We can call the show_cluster() function defined earlier to display each cluster together with the IDs of its members.
Cluster 0 | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
---|---|---|---|---|
2 | 106 | 169 | 1 | 164 |
3 | 107 | 181 | 18 | 172 |
4 | 108 | 182 | 19 | 173 |
5 | 109 | 185 | 22 | 186 |
6 | 110 | 191 | 37 | 187 |
7 | 111 | 192 | 39 | |
8 | 112 | 193 | 40 | |
9 | 113 | 194 | 44 | |
10 | 131 | 195 | 46 | |
11 | 132 | 196 | 48 | |
12 | 166 | 197 | 49 | |
13 | 167 | 198 | 50 | |
14 | 168 | 199 | 51 | |
15 | 170 | 200 | 62 | |
16 | 171 | 201 | 63 | |
17 | 174 | 202 | 64 | |
20 | 176 | 203 | 65 | |
21 | 183 | 204 | 66 | |
23 | 184 | 205 | 67 | |
24 | | 206 | 68 | |
25 | | 207 | 69 | |
26 | | 208 | 70 | |
27 | | 209 | 104 | |
28 | | 210 | 105 | |
29 | | 211 | 128 | |
30 | | 212 | 129 | |
31 | | 213 | 130 | |
32 | | 214 | 152 | |
33 | | | 158 | |
34 | | | 163 | |
35 | | | 165 | |
36 | | | 175 | |
38 | | | 177 | |
41 | | | 178 | |
42 | | | 179 | |
43 | | | 180 | |
45 | | | 189 | |
47 | | | 190 | |
52 | | | | |
53 | | | | |
54 | | | | |
55 | | | | |
56 | | | | |
57 | | | | |
58 | | | | |
59 | | | | |
60 | | | | |
61 | | | | |
71 | | | | |
72 | | | | |
73 | | | | |
74 | | | | |
75 | | | | |
76 | | | | |
77 | | | | |
78 | | | | |
79 | | | | |
80 | | | | |
81 | | | | |
82 | | | | |
83 | | | | |
84 | | | | |
85 | | | | |
86 | | | | |
87 | | | | |
88 | | | | |
89 | | | | |
90 | | | | |
91 | | | | |
92 | | | | |
93 | | | | |
94 | | | | |
95 | | | | |
96 | | | | |
97 | | | | |
98 | | | | |
99 | | | | |
100 | | | | |
101 | | | | |
102 | | | | |
103 | | | | |
114 | | | | |
115 | | | | |
116 | | | | |
117 | | | | |
118 | | | | |
119 | | | | |
120 | | | | |
121 | | | | |
122 | | | | |
123 | | | | |
124 | | | | |
125 | | | | |
126 | | | | |
127 | | | | |
133 | | | | |
134 | | | | |
135 | | | | |
136 | | | | |
137 | | | | |
138 | | | | |
139 | | | | |
140 | | | | |
141 | | | | |
142 | | | | |
143 | | | | |
144 | | | | |
145 | | | | |
146 | | | | |
147 | | | | |
148 | | | | |
149 | | | | |
150 | | | | |
151 | | | | |
153 | | | | |
154 | | | | |
155 | | | | |
156 | | | | |
157 | | | | |
159 | | | | |
160 | | | | |
161 | | | | |
162 | | | | |
188 | | | | |
The clustering result can be visualized as a plot using the matplotlib library.

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Project the nine measurement columns (ID excluded) onto two principal components.
pca = PCA(2)
plot_columns = pca.fit_transform(df.iloc[:, 1:10])
plt.title("K-Means Clustering Result")
plt.scatter(x=plot_columns[:, 1], y=plot_columns[:, 0], c=data["Cluster"], s=30)
plt.show()

This displays a scatter plot in which each cluster has a different color.
K-Modes
For the K-Modes example we can use the data that can be downloaded from this link. The data can be displayed as a data frame.

import pandas as pd

data = pd.read_csv('data_balloons.csv', delimiter=';')
df = pd.DataFrame(data)
# hide(axis="index") replaces the older hide_index().
df.style.hide(axis="index")

Once loaded, the data is displayed as follows:
ID | color | size | act | age | inflated |
---|---|---|---|---|---|
1 | YELLOW | SMALL | STRETCH | ADULT | T |
2 | YELLOW | SMALL | STRETCH | ADULT | T |
3 | YELLOW | SMALL | STRETCH | ADULT | T |
4 | YELLOW | SMALL | STRETCH | ADULT | T |
5 | YELLOW | SMALL | STRETCH | ADULT | T |
6 | YELLOW | SMALL | STRETCH | ADULT | T |
7 | YELLOW | SMALL | STRETCH | ADULT | T |
8 | YELLOW | SMALL | STRETCH | ADULT | T |
9 | YELLOW | SMALL | STRETCH | ADULT | T |
10 | YELLOW | SMALL | STRETCH | ADULT | T |
11 | YELLOW | SMALL | STRETCH | CHILD | F |
12 | YELLOW | SMALL | STRETCH | CHILD | F |
13 | YELLOW | SMALL | STRETCH | CHILD | F |
14 | YELLOW | SMALL | STRETCH | CHILD | F |
15 | YELLOW | SMALL | STRETCH | CHILD | F |
16 | YELLOW | SMALL | DIP | ADULT | F |
17 | YELLOW | SMALL | DIP | ADULT | F |
18 | YELLOW | SMALL | DIP | ADULT | F |
19 | YELLOW | SMALL | DIP | ADULT | F |
20 | YELLOW | SMALL | DIP | ADULT | F |
21 | YELLOW | SMALL | DIP | CHILD | F |
22 | YELLOW | SMALL | DIP | CHILD | F |
23 | YELLOW | SMALL | DIP | CHILD | F |
24 | YELLOW | SMALL | DIP | CHILD | F |
25 | YELLOW | SMALL | DIP | CHILD | F |
26 | YELLOW | LARGE | STRETCH | ADULT | T |
27 | YELLOW | LARGE | STRETCH | ADULT | T |
28 | YELLOW | LARGE | STRETCH | ADULT | T |
29 | YELLOW | LARGE | STRETCH | ADULT | T |
30 | YELLOW | LARGE | STRETCH | ADULT | T |
31 | YELLOW | LARGE | STRETCH | ADULT | T |
32 | YELLOW | LARGE | STRETCH | ADULT | T |
33 | YELLOW | LARGE | STRETCH | ADULT | T |
34 | YELLOW | LARGE | STRETCH | ADULT | T |
35 | YELLOW | LARGE | STRETCH | ADULT | T |
36 | YELLOW | LARGE | STRETCH | CHILD | F |
37 | YELLOW | LARGE | STRETCH | CHILD | F |
38 | YELLOW | LARGE | STRETCH | CHILD | F |
39 | YELLOW | LARGE | STRETCH | CHILD | F |
40 | YELLOW | LARGE | STRETCH | CHILD | F |
41 | YELLOW | LARGE | DIP | ADULT | F |
42 | YELLOW | LARGE | DIP | ADULT | F |
43 | YELLOW | LARGE | DIP | ADULT | F |
44 | YELLOW | LARGE | DIP | ADULT | F |
45 | YELLOW | LARGE | DIP | ADULT | F |
46 | YELLOW | LARGE | DIP | CHILD | F |
47 | YELLOW | LARGE | DIP | CHILD | F |
48 | YELLOW | LARGE | DIP | CHILD | F |
49 | YELLOW | LARGE | DIP | CHILD | F |
50 | YELLOW | LARGE | DIP | CHILD | F |
51 | PURPLE | SMALL | STRETCH | ADULT | T |
52 | PURPLE | SMALL | STRETCH | ADULT | T |
53 | PURPLE | SMALL | STRETCH | ADULT | T |
54 | PURPLE | SMALL | STRETCH | ADULT | T |
55 | PURPLE | SMALL | STRETCH | ADULT | T |
56 | PURPLE | SMALL | STRETCH | ADULT | T |
57 | PURPLE | SMALL | STRETCH | ADULT | T |
58 | PURPLE | SMALL | STRETCH | ADULT | T |
59 | PURPLE | SMALL | STRETCH | ADULT | T |
60 | PURPLE | SMALL | STRETCH | ADULT | T |
61 | PURPLE | SMALL | STRETCH | CHILD | F |
62 | PURPLE | SMALL | STRETCH | CHILD | F |
63 | PURPLE | SMALL | STRETCH | CHILD | F |
64 | PURPLE | SMALL | STRETCH | CHILD | F |
65 | PURPLE | SMALL | STRETCH | CHILD | F |
66 | PURPLE | SMALL | DIP | ADULT | F |
67 | PURPLE | SMALL | DIP | ADULT | F |
68 | PURPLE | SMALL | DIP | ADULT | F |
69 | PURPLE | SMALL | DIP | ADULT | F |
70 | PURPLE | SMALL | DIP | ADULT | F |
71 | PURPLE | SMALL | DIP | CHILD | F |
72 | PURPLE | SMALL | DIP | CHILD | F |
73 | PURPLE | SMALL | DIP | CHILD | F |
74 | PURPLE | SMALL | DIP | CHILD | F |
75 | PURPLE | SMALL | DIP | CHILD | F |
76 | PURPLE | LARGE | STRETCH | ADULT | T |
77 | PURPLE | LARGE | STRETCH | ADULT | T |
78 | PURPLE | LARGE | STRETCH | ADULT | T |
79 | PURPLE | LARGE | STRETCH | ADULT | T |
80 | PURPLE | LARGE | STRETCH | ADULT | T |
81 | PURPLE | LARGE | STRETCH | ADULT | T |
82 | PURPLE | LARGE | STRETCH | ADULT | T |
83 | PURPLE | LARGE | STRETCH | ADULT | T |
84 | PURPLE | LARGE | STRETCH | ADULT | T |
85 | PURPLE | LARGE | STRETCH | ADULT | T |
86 | PURPLE | LARGE | STRETCH | CHILD | F |
87 | PURPLE | LARGE | STRETCH | CHILD | F |
88 | PURPLE | LARGE | STRETCH | CHILD | F |
89 | PURPLE | LARGE | STRETCH | CHILD | F |
90 | PURPLE | LARGE | STRETCH | CHILD | F |
91 | PURPLE | LARGE | DIP | ADULT | F |
92 | PURPLE | LARGE | DIP | ADULT | F |
93 | PURPLE | LARGE | DIP | ADULT | F |
94 | PURPLE | LARGE | DIP | ADULT | F |
95 | PURPLE | LARGE | DIP | ADULT | F |
96 | PURPLE | LARGE | DIP | CHILD | F |
97 | PURPLE | LARGE | DIP | CHILD | F |
98 | PURPLE | LARGE | DIP | CHILD | F |
99 | PURPLE | LARGE | DIP | CHILD | F |
100 | PURPLE | LARGE | DIP | CHILD | F |
We can use KModes from the kmodes library to cluster categorical data. This example uses k=3 to group the data into 3 clusters. The clustering result can then be merged with the existing data by adding a Cluster attribute, so that every row carries its own cluster label.

from kmodes.kmodes import KModes

k = 3
# One-hot encode the categorical columns before clustering.
df_dummy = pd.get_dummies(df)
data_set = df_dummy.reset_index().values
# init='Cao' selects the Cao initialization method for the starting modes.
kmodes_cao = KModes(n_clusters=k, init='Cao', verbose=1)
cluster = kmodes_cao.fit(data_set)
# Attach each row's cluster label to the original data.
data['Cluster'] = cluster.labels_
show_cluster(data, k)

We can call the show_cluster() function defined earlier to display each cluster together with the IDs of its members.
Cluster 0 | Cluster 1 | Cluster 2 |
---|---|---|
1 | 41 | 11 |
2 | 42 | 12 |
3 | 43 | 13 |
4 | 44 | 14 |
5 | 45 | 15 |
6 | 46 | 21 |
7 | 47 | 22 |
8 | 48 | 23 |
9 | 49 | 24 |
10 | 50 | 25 |
16 | 66 | 36 |
17 | 67 | 37 |
18 | 68 | 38 |
19 | 69 | 39 |
20 | 70 | 40 |
26 | 71 | 61 |
27 | 72 | 62 |
28 | 73 | 63 |
29 | 74 | 64 |
30 | 75 | 65 |
31 | 86 | |
32 | 87 | |
33 | 88 | |
34 | 89 | |
35 | 90 | |
51 | 91 | |
52 | 92 | |
53 | 93 | |
54 | 94 | |
55 | 95 | |
56 | 96 | |
57 | 97 | |
58 | 98 | |
59 | 99 | |
60 | 100 | |
76 | | |
77 | | |
78 | | |
79 | | |
80 | | |
81 | | |
82 | | |
83 | | |
84 | | |
85 | | |
The clustering result can be visualized as a plot using the matplotlib library.

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Project the one-hot columns (ID excluded) onto two principal components.
pca = PCA(2)
plot_columns = pca.fit_transform(df_dummy.iloc[:, 1:])
plt.title("K-Modes Clustering Result")
# The labels are stored in data; df_dummy has no Cluster column.
plt.scatter(x=plot_columns[:, 1], y=plot_columns[:, 0], c=data["Cluster"], s=30)
plt.show()

This displays a scatter plot in which each cluster has a different color.
K-Prototype
For the K-Prototype implementation we can use the data that can be downloaded from this link. The data can be loaded and displayed as a data frame.

import pandas as pd

data = pd.read_csv('tae_data.csv', delimiter=';')
df = pd.DataFrame(data)
# hide(axis="index") replaces the older hide_index().
df.style.hide(axis="index")

Once loaded, the data is displayed as a data frame.
We can use KPrototypes from the kmodes library to cluster mixed data with both numeric and categorical attributes. This example uses k=5 to group the data into 5 clusters.
from kmodes.kprototypes import KPrototypes

k = 5
df_dummy = pd.get_dummies(df)
data_set = df_dummy.reset_index().values
kproto = KPrototypes(n_clusters=k, init='Cao', verbose=2)
# `categorical` lists the positions of the categorical columns in data_set.
cluster = kproto.fit(data_set, categorical=[0, 1, 2, 3, 5])
# Attach each row's cluster label to the original data.
data['Cluster'] = cluster.labels_
show_cluster(data, k)

We can call the show_cluster() function defined earlier to display each cluster together with the IDs of its members.
Cluster 0 | Cluster 1 | Cluster 2 | Cluster 3 | Cluster 4 |
---|---|---|---|---|
4 | 1 | 3 | 14 | 11 |
8 | 2 | 5 | 15 | 17 |
12 | 6 | 9 | 16 | 19 |
13 | 7 | 18 | 20 | 26 |
21 | 10 | 31 | 25 | 34 |
23 | 22 | 38 | 28 | 50 |
30 | 24 | 42 | 29 | 56 |
33 | 27 | 44 | 35 | 58 |
43 | 32 | 48 | 36 | 65 |
47 | 37 | 57 | 39 | 73 |
51 | 40 | 70 | 53 | 81 |
52 | 41 | 77 | 54 | 83 |
60 | 45 | 85 | 55 | 99 |
62 | 46 | 117 | 59 | 105 |
69 | 49 | 126 | 64 | 107 |
72 | 61 | 149 | 67 | 108 |
93 | 63 | 150 | 68 | 118 |
94 | 66 | | 74 | 121 |
95 | 71 | | 75 | 123 |
98 | 76 | | 78 | 129 |
106 | 79 | | 82 | 137 |
110 | 80 | | 89 | 141 |
112 | 84 | | 101 | 143 |
113 | 86 | | 104 | 145 |
114 | 87 | | 109 | 148 |
115 | 88 | | 125 | |
119 | 90 | | 128 | |
127 | 91 | | 132 | |
139 | 92 | | 138 | |
142 | 96 | | 144 | |
147 | 97 | | | |
151 | 100 | | | |
| | 102 | | | |
| | 103 | | | |
| | 111 | | | |
| | 116 | | | |
| | 120 | | | |
| | 122 | | | |
| | 124 | | | |
| | 130 | | | |
| | 131 | | | |
| | 133 | | | |
| | 134 | | | |
| | 135 | | | |
| | 136 | | | |
| | 140 | | | |
| | 146 | | | |
The clustering result can be visualized as a plot using the matplotlib library.

import matplotlib.pyplot as plt
from sklearn.decomposition import PCA

# Project every column except the first (ID) onto two principal components.
pca = PCA(2)
plot_columns = pca.fit_transform(df_dummy.iloc[:, 1:])
plt.title("K-Prototype Clustering Result")
# The labels are stored in data; df_dummy has no Cluster column.
plt.scatter(x=plot_columns[:, 1], y=plot_columns[:, 0], c=data["Cluster"], s=30)
plt.show()

This displays a scatter plot in which each cluster has a different color.