Remove outliers based on the Z-score of a particular variable

A sample is considered an outlier, and its row is removed, if the Z-score of its value in the selected column corresponds to a p-value less than 0.05.

ZScoreChallengeOutliers(inputted.data, column.to.perform.outlier.analysis)

Arguments

inputted.data	A dataframe.
column.to.perform.outlier.analysis	Name of the column in the dataframe to evaluate for outliers. The column should contain continuous data.

Value

A dataframe with outlier rows removed.

See also

Other Preprocessing functions: AddColBinnedToBinary(), AddColBinnedToQuartiles(), AddPCsToEnd(), ConvertDataToPercentiles(), CorAssoTestMultipleWithErrorHandling(), DownSampleDataframe(), GenerateElbowPlotPCA(), GeneratePC1andPC2PlotsWithAndWithoutOutliers(), Log2TargetDensityPlotComparison(), LookAtPCFeatureLoadings(), MultipleColumnsNormalCheckThenBoxCox(), NormalCheckThenBoxCoxTransform(), RanomlySelectOneRowForEach(), RecodeIdentifier(), RemoveColWithAllZeros(), RemoveRowsBasedOnCol(), RemoveSamplesWithInstability(), SplitIntoTrainTest(), StabilityTestingAcrossVisits(), SubsetDataByContinuousCol(), TwoSampleTTest(), captureSessionInfo(), correlation.association.test(), describeNumericalColumnsWithLevels(), describeNumericalColumns(), generate.descriptive.plots.save.pdf(), generate.descriptive.plots()

Examples



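# Seven samples: six typical values and one extreme value (100).
identifier.col <- c("a", "a", "a", "b", "b", "b", "c")
value.col <- c(1, 2, 3, 1, 1, 1, 100)
# cbind() on a character and a numeric vector yields a character matrix,
# so value.col is not stored as a numeric column in the resulting data frame.
input.data.frame <- as.data.frame(cbind(identifier.col, value.col))

# Remove rows that are flagged as outliers in value.col.
results <- ZScoreChallengeOutliers(input.data.frame, "value.col")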

results
#>   identifier.col value.col
#> 1              a         1
#> 2              a         2
#> 3              a         3
#> 4              b         1
#> 5              b         1
#> 6              b         1
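
The rule stated in the description (Z-score, then p-value, then removal below 0.05) can be sketched in base R. This is only an illustration, not the package's implementation. The helper name zscore_outlier_sketch and its alpha argument are hypothetical, and the sketch assumes a two-sided p-value from the standard normal distribution, the sample standard deviation, and a column with no missing values. The actual function may differ on any of these points.

```r
# Hypothetical helper, not part of the package: a sketch of the rule in the
# description, assuming a two-sided normal p-value and the sample SD.
zscore_outlier_sketch <- function(data, column, alpha = 0.05) {
  # Go through character so text or factor columns are read as numbers.
  values <- as.numeric(as.character(data[[column]]))
  # Z-score: distance from the column mean in standard deviations.
  z <- (values - mean(values)) / sd(values)
  # Two-sided p-value under a standard normal distribution (assumption).
  p <- 2 * pnorm(-abs(z))
  # Keep only the rows whose p-value is not below the threshold.
  data[p >= alpha, , drop = FALSE]
}

example.data <- data.frame(
  identifier.col = c("a", "a", "a", "b", "b", "b", "c"),
  value.col = c(1, 2, 3, 1, 1, 1, 100)
)

sketch.results <- zscore_outlier_sketch(example.data, "value.col")
```

Under these assumptions the value 100 has a Z-score of about 2.27 and a two-sided p-value of about 0.023, so only that row is dropped and the six remaining rows are kept.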
