RemoveSamplesWithInstability.Rd
This function runs StabilityTestingAcrossVisits() and uses the results to subset the input data so that only rows from stable samples remain.
RemoveSamplesWithInstability( inputted.data, col.name.of.unique.identifier, value.to.evaluate, standard.deviation.threshold )
Argument | Description |
---|---|
inputted.data | A dataframe. |
col.name.of.unique.identifier | A string that specifies the name of the column in inputted.data containing unique identifiers. |
value.to.evaluate | A string that specifies the name of the column in inputted.data to check for stability of values. |
standard.deviation.threshold | A numeric value that specifies the standard deviation above which the visits for a single sample are considered too unstable. |
A dataframe where only rows from stable samples remain.
Samples with only a single visit are removed. Samples whose values vary too much across visits (standard deviation greater than standard.deviation.threshold) are also removed.
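The filtering rule described here can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name sketch_remove_unstable and its argument names are hypothetical, and it assumes the per-sample spread is measured with sd() on the value column.

```r
# Hypothetical helper illustrating the rule; NOT the package's actual code.
sketch_remove_unstable <- function(df, id.col, value.col, sd.threshold) {
  # Coerce defensively in case the columns arrive as character or factor.
  values <- as.numeric(as.character(df[[value.col]]))
  ids <- as.character(df[[id.col]])

  # Standard deviation of each sample, repeated on every row of that sample.
  # sd() returns NA for a sample with a single visit.
  sd.per.row <- ave(values, ids, FUN = sd)

  # Keep rows whose sample has more than one visit (non-NA sd)
  # and whose sd does not exceed the threshold.
  keep <- !is.na(sd.per.row) & sd.per.row <= sd.threshold
  df[keep, , drop = FALSE]
}

example.df <- data.frame(
  identifier.col = c("a", "a", "a", "b", "b", "b", "c"),
  value.col = c(1, 2, 3, 1, 1, 1, 5)
)
sketch_remove_unstable(example.df, "identifier.col", "value.col", 0.5)
```

With these inputs, sample "a" has a standard deviation of 1 and is dropped, sample "c" has a single visit and is dropped, and only the three "b" rows remain.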
Other Preprocessing functions:
AddColBinnedToBinary(), AddColBinnedToQuartiles(), AddPCsToEnd(), ConvertDataToPercentiles(), CorAssoTestMultipleWithErrorHandling(), DownSampleDataframe(), GenerateElbowPlotPCA(), GeneratePC1andPC2PlotsWithAndWithoutOutliers(), Log2TargetDensityPlotComparison(), LookAtPCFeatureLoadings(), MultipleColumnsNormalCheckThenBoxCox(), NormalCheckThenBoxCoxTransform(), RanomlySelectOneRowForEach(), RecodeIdentifier(), RemoveColWithAllZeros(), RemoveRowsBasedOnCol(), SplitIntoTrainTest(), StabilityTestingAcrossVisits(), SubsetDataByContinuousCol(), TwoSampleTTest(), ZScoreChallengeOutliers(), captureSessionInfo(), correlation.association.test(), describeNumericalColumnsWithLevels(), describeNumericalColumns(), generate.descriptive.plots.save.pdf(), generate.descriptive.plots()
# Sample "a" varies across visits, "b" is constant, "c" has a single visit.
identifier.col <- c("a", "a", "a", "b", "b", "b", "c")
value.col <- c(1, 2, 3, 1, 1, 1, 5)
input.data.frame <- as.data.frame(cbind(identifier.col, value.col))

# Drop samples with a single visit or a standard deviation above 0.5.
results <- RemoveSamplesWithInstability(input.data.frame, "identifier.col", "value.col", 0.5)
results
#>   identifier.col value.col
#> 4              b         1
#> 5              b         1
#> 6              b         1