IRIS SPECIES
Silas Liu - Nov. 12, 2021
R, Machine Learning Classification, PCA
The Iris dataset is a well-known machine learning benchmark, first presented in 1936 to illustrate Linear Discriminant Analysis (LDA). It records 4 features of the flowers (sepal length, sepal width, petal length and petal width) for 3 iris species: setosa, versicolor and virginica.
Here we apply classification techniques such as LDA and Quadratic Discriminant Analysis (QDA), but the focus of this study is Principal Component Analysis (PCA), used to showcase dimension reduction on the dataset.
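As a rough illustration of the dimension reduction described above, the following minimal sketch runs PCA on the `iris` data that ships with base R, using `prcomp`. It is not necessarily the code used in the study: the choice to center and scale the features, and the variable names `features`, `pca`, `var_explained` and `scores`, are assumptions made for this example.

```r
# Built-in iris data: 150 rows, 4 numeric features and a Species factor.
data(iris)
features <- iris[, 1:4]

# PCA on centered, unit-variance features (scaling is an assumption here).
pca <- prcomp(features, center = TRUE, scale. = TRUE)

# Proportion of total variance explained by each principal component.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
print(round(var_explained, 3))

# Keep the first two components as a reduced 2-D representation,
# alongside the species labels for later plotting or classification.
scores <- data.frame(pca$x[, 1:2], Species = iris$Species)
head(scores)
```

The `scores` data frame could then be passed to a classifier such as LDA or QDA in place of the original four features, which is one way to see how much class information the leading components retain.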
For a complementary view of PCA, see the analysis of Singular Value Decomposition (SVD) applied to the MNIST digits dataset.