Author: Yong-Han Hank Cheng
This package provides functions for:
1. Preprocessing data.
2. Clustering with K-means or hierarchical clustering.
3. Classification with random forest.
The main functions for clustering are:
The main functions for classification are:
RandomForestAutomaticMtryAndNtree(): This uses the randomForest::randomForest function to create a random forest classifier, with additional code for optimizing mtry and ntree. Importantly, the default values for mtry and ntree are sensible and do not actually need to be adjusted. For a more rigorous treatment of hyperparameter tuning, see this paper and the R package provided by its authors: https://doi.org/10.1002/widm.1301, “Hyperparameters and tuning strategies for random forest” by Probst et al., 2019.
RandomForestClassificationPercentileMatrixForPheatmap(): This creates a random forest model on each of several smaller subsets of a larger data set and outputs the results in the form of a pheatmap, so that random forest performance, in terms of prediction accuracy and feature importance, can be compared across subsets.
CVPredictionsRandomForest(): This performs random forest classification with default hyperparameters and cross-validation. The number of cross-validation folds can be specified by the user. The model's performance (predictions and feature importance) is output.
CVRandomForestClassificationMatrixForPheatmap(): This runs CVPredictionsRandomForest() on each of several smaller subsets of a larger data set and outputs the results in the form of a pheatmap, so that random forest performance, in terms of prediction accuracy and feature importance, can be compared across subsets.
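The cross-validated workflow described for CVPredictionsRandomForest() can be sketched roughly as follows. This is only an illustration that calls randomForest::randomForest directly with its default hyperparameters. It is not the package's implementation and does not use the package's function signatures. The iris data set, the fold count k, and the variable names (fold_id, predictions, importance_per_fold) are assumptions chosen for the example.

```r
# Minimal sketch of k-fold cross-validated random forest classification.
# Assumption: this calls randomForest directly; it is NOT machinelearnr's code.
library(randomForest)

set.seed(1)
data <- iris   # built-in example data; Species is the class label
k <- 5         # number of cross-validation folds (user-chosen)

# Randomly assign each row to one of k folds of near-equal size.
fold_id <- sample(rep(seq_len(k), length.out = nrow(data)))

# Containers for held-out predictions and per-fold feature importance.
predictions <- factor(rep(NA, nrow(data)), levels = levels(data$Species))
importance_per_fold <- vector("list", k)

for (fold in seq_len(k)) {
  test_rows <- which(fold_id == fold)

  # Fit on the other k - 1 folds using default mtry and ntree.
  fit <- randomForest(Species ~ ., data = data[-test_rows, ])

  # Predict the held-out fold and record importance (mean decrease in Gini).
  predictions[test_rows] <- predict(fit, newdata = data[test_rows, ])
  importance_per_fold[[fold]] <- importance(fit)[, "MeanDecreaseGini"]
}

# Overall accuracy on held-out rows and importance averaged across folds.
cv_accuracy <- mean(predictions == data$Species)
mean_importance <- rowMeans(do.call(cbind, importance_per_fold))

print(cv_accuracy)
print(sort(mean_importance, decreasing = TRUE))
```

Running the same loop on each subset of a larger data set and collecting cv_accuracy and mean_importance into a matrix would give the kind of table that could then be passed to pheatmap::pheatmap for comparison across subsets.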
# Install the package from GitHub
devtools::install_github("yhhc2/machinelearnr")
# Load package
library("machinelearnr")
Source code: https://github.com/yhhc2/machinelearnr
Visit the package’s website: https://yhhc2.github.io/machinelearnr/
Function reference is located here: https://yhhc2.github.io/machinelearnr/reference/index.html
See this vignette for example usage and output of each function: https://yhhc2.github.io/machinelearnr/articles/Examples.html
The machinelearnr package is licensed under the GPL (>=3) license. The logo is licensed under the CC BY 4.0 license.