Image by mostafa meraji

IRIS SPECIES

Silas Liu - Nov. 12, 2021

R, Machine Learning Classification, PCA

The Iris data set is a well-known benchmark in machine learning, first presented by Ronald Fisher in 1936 to illustrate Linear Discriminant Analysis (LDA). It contains 150 flowers, 50 from each of 3 iris species (setosa, versicolor and virginica), described by 4 features: sepal length, sepal width, petal length and petal width.

Here we apply machine learning techniques such as LDA and Quadratic Discriminant Analysis (QDA), but the focus of this study is Principal Component Analysis (PCA) on the dataset, showcasing dimensionality reduction.
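
As a rough illustration of this workflow, the sketch below runs PCA on R's built-in `iris` data with `prcomp`, keeps the first two principal components, and fits LDA and QDA on that reduced representation using the `MASS` package. It is not the original analysis code: scaling the features, keeping exactly two components, and measuring accuracy on the training data are all assumptions made for this example.

```r
# Illustrative sketch only; not the author's original code.
library(MASS)  # provides lda() and qda()

data(iris)                 # built-in dataset: 4 measurements + Species
features <- iris[, 1:4]

# PCA on centered, unit-variance features (scaling is an assumption here)
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Proportion of total variance captured by each principal component
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Keep the first two components as a reduced 2-D representation
reduced <- data.frame(pca$x[, 1:2], Species = iris$Species)

# Fit LDA and QDA on the reduced data (assumed setup: no train/test split)
lda_fit <- lda(Species ~ PC1 + PC2, data = reduced)
qda_fit <- qda(Species ~ PC1 + PC2, data = reduced)

# Training accuracy of each classifier on the two retained components
lda_acc <- mean(predict(lda_fit, reduced)$class == reduced$Species)
qda_acc <- mean(predict(qda_fit, reduced)$class == reduced$Species)
print(c(lda = lda_acc, qda = qda_acc))
```

Training accuracy is optimistic. A held-out test set or cross-validation would give a fairer comparison between the two classifiers.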

For a complementary view of PCA, see the analysis of Singular Value Decomposition (SVD) applied to the MNIST digits dataset.
