Enhance K-Means Clustering Performance with PCA!

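K-Means clustering is a popular choice, but it can struggle with high-dimensional datasets, where distances between points become less informative. PCA can be a valuable tool in such cases: it projects the data onto the few directions that capture the most variance, which helps us focus on the most important factors and makes the algorithm cheaper to run.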

As the figure above shows, K-Means wasn't able to group the data points in my dataset effectively. To address this, I applied Principal Component Analysis (PCA) as a preprocessing step.

PCA

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Scaling: PCA is sensitive to feature scale, so standardize first
scaler = StandardScaler()
dataPCA = data.drop('Classes  ', axis=1)  # your dataframe
data_scaled = scaler.fit_transform(dataPCA)

# PCA: keep the first two principal components
pca = PCA(n_components=2)
components = pca.fit_transform(data_scaled)
df_pca = pd.DataFrame(components)
x_pca = df_pca.values
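
Before settling on n_components=2, it can help to check how much of the variance two components actually keep. The sketch below is only an illustration: it uses synthetic data from make_blobs as a stand-in for the dataframe above (the sample count, feature count and random seed are my assumptions, not values from my dataset) and prints the cumulative explained variance per component.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataframe (assumption: 200 rows, 8 features)
X, _ = make_blobs(n_samples=200, centers=3, n_features=8, random_state=42)
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA with all components to inspect how the variance is distributed
pca_full = PCA().fit(X_scaled)
cumulative = np.cumsum(pca_full.explained_variance_ratio_)

for n, share in enumerate(cumulative, start=1):
    print(f"{n} component(s): {share:.1%} of the variance")
```

If the first two components cover only a small share of the variance, a larger n_components may be worth trying before clustering.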

After the PCA preprocessing

Finally it works! PCA did the trick!
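
The clustering step itself is not shown above, so here is a minimal sketch of how K-Means could be run with and without the PCA step. Everything specific in it is an assumption for illustration: the data is synthetic (make_blobs), the number of clusters is set to 3, and cluster_and_score is a hypothetical helper, not part of scikit-learn.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.decomposition import PCA
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the real dataframe (assumption: 300 rows, 10 features)
X, _ = make_blobs(n_samples=300, centers=3, n_features=10, random_state=0)

# Same preprocessing as in the post: scale, then reduce to 2 components
X_scaled = StandardScaler().fit_transform(X)
x_pca = PCA(n_components=2).fit_transform(X_scaled)

def cluster_and_score(features, n_clusters=3):
    """Hypothetical helper: fit K-Means and return labels plus silhouette score."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0)
    labels = km.fit_predict(features)
    return labels, silhouette_score(features, labels)

labels_raw, score_raw = cluster_and_score(X_scaled)
labels_pca, score_pca = cluster_and_score(x_pca)

print(f"silhouette without PCA: {score_raw:.3f}")
print(f"silhouette with PCA:    {score_pca:.3f}")
```

The two silhouette scores are computed in different feature spaces, so treat them as a rough indication rather than a strict comparison. Plotting x_pca colored by labels_pca is usually the quickest sanity check.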
