GeneratePC1andPC2PlotsWithAndWithoutOutliers.Rd
Generate PC1 vs PC2 plots to visualize data with and without outliers. Also outputs the dataset with outliers removed.
GeneratePC1andPC2PlotsWithAndWithoutOutliers( inputted.data, columns.to.do.PCA.on, scale.PCA, p.value.for.outliers )
| Argument | Description |
|---|---|
| inputted.data | A dataframe. |
| columns.to.do.PCA.on | A vector of strings specifying the names of the columns to use for PCA. |
| scale.PCA | Boolean specifying whether to scale columns before doing PCA. |
| p.value.for.outliers | The p-value threshold used to define outliers. Outliers are defined as samples with either PC1 or PC2 values that have a standard deviation value that meets a specified p-value threshold. |
A list with two objects:
Data after removing outliers for PC1 and PC2.
Data from outliers.
Plots will also be displayed.
Outliers are defined as samples with either PC1 or PC2 values that have a standard deviation value that meets a specified p-value threshold.
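The outlier rule above can be illustrated with a minimal base R sketch. This is not the package's implementation. The helper name `flag_pc_outliers`, its argument names, and the specific rule (convert the p-value to a two-sided z-score cutoff with `qnorm`, then flag samples whose standardized PC1 or PC2 score exceeds it) are assumptions made for illustration only.

```r
# Illustrative sketch only: NOT the package's implementation.
# Assumption: a sample is an outlier if its PC1 or PC2 score lies more than
# qnorm(1 - p/2) standard deviations from the mean of that component.
flag_pc_outliers <- function(data, columns, scale_pca = TRUE, p_value = 0.05) {
  # Run PCA on the selected columns only.
  pca <- prcomp(data[, columns, drop = FALSE], center = TRUE, scale. = scale_pca)

  # Keep the first two principal component scores and z-score each of them.
  scores <- pca$x[, 1:2, drop = FALSE]
  z <- scale(scores)

  # Two-sided z-score cutoff corresponding to the p-value (assumed rule).
  cutoff <- qnorm(1 - p_value / 2)
  is_outlier <- abs(z[, 1]) > cutoff | abs(z[, 2]) > cutoff

  # Mirror the documented return shape: kept data first, outliers second.
  list(
    without_outliers = data[!is_outlier, , drop = FALSE],
    outliers = data[is_outlier, , drop = FALSE]
  )
}

# Hypothetical data: 50 samples with one planted extreme sample in row 1.
set.seed(1)
example_data <- data.frame(a = rnorm(50), b = rnorm(50), c = rnorm(50))
example_data[1, ] <- c(25, 25, 25)

result <- flag_pc_outliers(example_data, c("a", "b", "c"),
                           scale_pca = TRUE, p_value = 0.01)
```

Here `result$outliers` contains the planted row, and `result$without_outliers` holds the remaining samples. The real function also displays plots and may apply a different threshold rule, so treat the cutoff logic as a placeholder.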
Other Preprocessing functions:
AddColBinnedToBinary(),
AddColBinnedToQuartiles(),
AddPCsToEnd(),
ConvertDataToPercentiles(),
CorAssoTestMultipleWithErrorHandling(),
DownSampleDataframe(),
GenerateElbowPlotPCA(),
Log2TargetDensityPlotComparison(),
LookAtPCFeatureLoadings(),
MultipleColumnsNormalCheckThenBoxCox(),
NormalCheckThenBoxCoxTransform(),
RanomlySelectOneRowForEach(),
RecodeIdentifier(),
RemoveColWithAllZeros(),
RemoveRowsBasedOnCol(),
RemoveSamplesWithInstability(),
SplitIntoTrainTest(),
StabilityTestingAcrossVisits(),
SubsetDataByContinuousCol(),
TwoSampleTTest(),
ZScoreChallengeOutliers(),
captureSessionInfo(),
correlation.association.test(),
describeNumericalColumnsWithLevels(),
describeNumericalColumns(),
generate.descriptive.plots.save.pdf(),
generate.descriptive.plots()