If the Z-score of a sample in the selected column corresponds to a p-value less than 0.05, the sample is considered an outlier and its row is removed.
ZScoreChallengeOutliers(inputted.data, column.to.perform.outlier.analysis)
| Argument | Description |
|---|---|
| inputted.data | A dataframe |
| column.to.perform.outlier.analysis | Name of column in dataframe to evaluate for outliers. The column should contain continuous data. |
A dataframe with outlier rows removed.
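
The rule described above can be sketched in a few lines of base R. This is an illustration only, not the package's source: the helper name `zscore_outlier_sketch`, the `alpha` argument, the use of the sample standard deviation, and the two-sided p-value from a standard normal distribution are all assumptions made for the sketch.

```r
# Hypothetical re-implementation for illustration; not the package's source.
zscore_outlier_sketch <- function(data, column, alpha = 0.05) {
  # Coerce via character so factor or character columns become numbers.
  x <- as.numeric(as.character(data[[column]]))
  # Z-score: distance from the column mean in units of the sample SD.
  z <- (x - mean(x, na.rm = TRUE)) / sd(x, na.rm = TRUE)
  # Assumed two-sided p-value under a standard normal distribution.
  p <- 2 * pnorm(-abs(z))
  # Keep rows whose p-value is at least alpha; rows with missing values are kept.
  keep <- is.na(p) | p >= alpha
  data[keep, , drop = FALSE]
}

toy <- data.frame(
  identifier.col = c("a", "a", "a", "b", "b", "b", "c"),
  value.col = c(1, 2, 3, 1, 1, 1, 100)
)
kept <- zscore_outlier_sketch(toy, "value.col")
print(kept)
```

Under these assumptions a p-value below 0.05 is equivalent to an absolute Z-score above roughly 1.96, so only the row with value 100 is dropped from the toy data. A one-sided test would use a different cutoff; which variant the package uses is not stated here.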
Other Preprocessing functions:
AddColBinnedToBinary(),
AddColBinnedToQuartiles(),
AddPCsToEnd(),
ConvertDataToPercentiles(),
CorAssoTestMultipleWithErrorHandling(),
DownSampleDataframe(),
GenerateElbowPlotPCA(),
GeneratePC1andPC2PlotsWithAndWithoutOutliers(),
Log2TargetDensityPlotComparison(),
LookAtPCFeatureLoadings(),
MultipleColumnsNormalCheckThenBoxCox(),
NormalCheckThenBoxCoxTransform(),
RanomlySelectOneRowForEach(),
RecodeIdentifier(),
RemoveColWithAllZeros(),
RemoveRowsBasedOnCol(),
RemoveSamplesWithInstability(),
SplitIntoTrainTest(),
StabilityTestingAcrossVisits(),
SubsetDataByContinuousCol(),
TwoSampleTTest(),
captureSessionInfo(),
correlation.association.test(),
describeNumericalColumnsWithLevels(),
describeNumericalColumns(),
generate.descriptive.plots.save.pdf(),
generate.descriptive.plots()
identifier.col <- c("a", "a", "a", "b", "b", "b", "c")
value.col <- c(1, 2, 3, 1, 1, 1, 100)
input.data.frame <- as.data.frame(cbind(identifier.col, value.col))
results <- ZScoreChallengeOutliers(input.data.frame, "value.col")
results
#>   identifier.col value.col
#> 1              a         1
#> 2              a         2
#> 3              a         3
#> 4              b         1
#> 5              b         1
#> 6              b         1