outlierImputerComponent
Description
outlierImputerComponent is a pipeline component that performs
outlier imputation. The pipeline component uses the functionality of the filloutliers function during the learn phase to identify and impute outlier
values for a set of observations. During the run phase, the component uses the values learned
during the learn phase to impute outlier values in a new data set.
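The split between the learn phase and the run phase can be sketched with filloutliers alone. This is only a conceptual analogue, assuming the component stores the thresholds and center value returned by filloutliers and reuses them on new data; the variable names (train, newData, imputed) are illustrative and are not part of the component API.

```matlab
% Learn phase (sketch): derive the thresholds and center from training data.
train = [1 2 3 4 5 100]';
[~,~,lower,upper,center] = filloutliers(train,"center","median");

% Run phase (sketch): reuse the stored values on new data, without
% recomputing them from the new observations.
newData = [2 50 4]';
isOut = newData < lower | newData > upper;
imputed = newData;
imputed(isOut) = center;   % the value 50 is replaced by the learned center
```

Here the new value 50 is judged against thresholds learned from train, so the result does not depend on the distribution of newData.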
Creation
Description
component = outlierImputerComponent creates a pipeline component for imputing outlier values.
component = outlierImputerComponent(Name=Value) sets writable Properties using one or more name-value arguments. For example, you can specify the outlier detection method by using the FindMethod name-value argument.
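As a brief sketch of the two ways to set writable properties, the following combines name-value arguments at creation with dot notation afterward. It assumes a release that provides outlierImputerComponent (R2026a or later) and that these particular properties can be combined in one call; the specific values are illustrative.

```matlab
% Set properties at creation by using name-value arguments.
c = outlierImputerComponent(FindMethod="quartiles",ImputationMethod="clip");

% Modify a learn parameter with dot notation before calling learn.
c.ThresholdFactor = 2;
```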
Properties
Learn Parameters
The software sets learn parameters when you create the component. You can modify learn
parameters using dot notation any time before you use the learn object
function. Any unset learn parameters use the corresponding default values.
FindMethod
Outlier detection method, specified as one of the following values.
| Value | Description |
|---|---|
| "gesd" | For each variable, find outliers by using the generalized extreme Studentized deviate test for outliers. Use ThresholdFactor to specify the alpha value for the test. |
| "grubbs" | For each variable, find outliers by using Grubbs' test, which removes one outlier per iteration based on hypothesis testing. Use ThresholdFactor to specify the alpha value for the test. |
| "mean" | For each variable, outliers are values more than a certain number of standard deviations from the mean. Use ThresholdFactor to specify the number of standard deviations. |
| "median" | For each variable, outliers are values more than a certain number of scaled median absolute deviations (MAD) from the median. Use ThresholdFactor to specify the number of scaled MAD. |
| "percentiles" | For each variable, outliers are values below the lower threshold or above the upper threshold, as specified by Threshold. |
| "quartiles" | For each variable, outliers are values more than a certain number of interquartile ranges below the lower quartile (25 percent) or above the upper quartile (75 percent). Use ThresholdFactor to specify the number of interquartile ranges. |
For more information, see the findmethod argument of filloutliers.
Example: c = outlierImputerComponent(FindMethod="mean")
Example: c.FindMethod = "quartiles"
Data Types: char | string
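To see how the choice of detection method changes which values are flagged, the following sketch applies the base MATLAB function isoutlier to a small vector. It assumes the component flags values in the same way as isoutlier with the same method name, which the description above suggests but this example does not verify; the vector x is made up for illustration.

```matlab
x = [1 2 3 4 5 100];

tfMedian    = isoutlier(x,"median");     % more than 3 scaled MAD from the median
tfMean      = isoutlier(x,"mean");       % more than 3 standard deviations from the mean
tfQuartiles = isoutlier(x,"quartiles");  % more than 1.5 IQR outside the quartiles
```

In this sample, "median" and "quartiles" flag only the value 100, while "mean" flags nothing, because the extreme value inflates the standard deviation enough to hide itself.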
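ImputationMethod
Outlier imputation method, specified as "center" or "clip".
If you specify "center", the software replaces each outlier with the center value, which it determines using the outlier detection method (FindMethod).
If you specify "clip", the software replaces outliers below the lower threshold with the lower threshold and outliers above the upper threshold with the upper threshold. The software determines both thresholds using the outlier detection method (FindMethod).
Example: c = outlierImputerComponent(ImputationMethod="clip")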
Example: c.ImputationMethod = "center"
Data Types: char | string
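The difference between the two imputation methods can be illustrated with filloutliers directly. This is a sketch on a made-up vector, assuming the component imputes in the same way as filloutliers with the matching fill method.

```matlab
x = [1 2 3 4 5 100];

% "center": the outlier is replaced by the center value (here, the median).
[xCenter,~,~,upper,center] = filloutliers(x,"center","median");

% "clip": the outlier is replaced by the threshold it exceeds.
xClip = filloutliers(x,"clip","median");
```

With "center", the last element becomes the median of x. With "clip", it becomes the upper threshold, so it stays the largest value in the data.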
ThresholdFactor
Outlier detection threshold factor, specified as a nonnegative scalar.
When FindMethod is "median", the outlier detection threshold factor is the number of scaled MAD, which is 3 by default.
When FindMethod is "mean", the outlier detection threshold factor is the number of standard deviations from the mean, which is 3 by default.
When FindMethod is "grubbs" or "gesd", the outlier detection threshold factor is a scalar in the interval (0,1), which represents the alpha value of the hypothesis test. Values close to 0 result in a smaller number of outliers, and values close to 1 result in a larger number of outliers. The default value is 0.05.
When FindMethod is "quartiles", the outlier detection threshold factor is the number of interquartile ranges, which is 1.5 by default.
You cannot specify ThresholdFactor when the outlier detection
method is "percentiles".
Example: c = outlierImputerComponent(ThresholdFactor=2.5)
Example: c.ThresholdFactor = 0.01
Data Types: single | double
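The effect of the threshold factor can be sketched with isoutlier, whose ThresholdFactor name-value argument plays the same role. The vector below is illustrative, and the equivalence with the component is an assumption based on the description above.

```matlab
x = [1 2 3 4 5 9];

tfDefault = isoutlier(x,"median");                     % default factor of 3 scaled MAD
tfStrict  = isoutlier(x,"median",ThresholdFactor=2);   % tighter bound of 2 scaled MAD
```

The value 9 lies 5.5 from the median, and the scaled MAD is about 2.22. The default bound of about 6.67 leaves 9 unflagged, while the tighter bound of about 4.45 flags it.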
Threshold
Lower and upper percentile thresholds, specified as a nonnegative vector with two elements in the interval [0,100]. The first element indicates the lower percentile threshold, and the second element indicates the upper percentile threshold. The first element must be less than the second element.
You must specify Threshold when the outlier detection method (FindMethod) is "percentiles". You cannot specify Threshold for any other outlier detection method.
Example: c = outlierImputerComponent(Threshold=[10 90])
Example: c.Threshold=[5 95]
Data Types: single | double
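As a rough illustration of percentile thresholds, the following sketch uses filloutliers with the "percentiles" method on a made-up vector. It assumes the component interprets Threshold in the same way as the threshold argument of filloutliers.

```matlab
x = [1:9 100];

% Values below the 10th percentile or above the 90th percentile are outliers;
% "clip" moves them to the corresponding percentile value.
[xClip,tf] = filloutliers(x,"clip","percentiles",[10 90]);
```

The value 100 lies above the 90th percentile, so it is flagged and clipped, while values in the middle of the range, such as 5, are left unchanged.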
MaxNumOutliers
Maximum number of outliers to impute, specified as a positive integer scalar.
If you do not specify the MaxNumOutliers value, the software uses the integer nearest to 10 percent of n, where n is the number of observations in the data argument of learn.
You can specify MaxNumOutliers only when the outlier detection method (FindMethod) is "gesd".
Example: c = outlierImputerComponent(MaxNumOutliers=20)
Example: c.MaxNumOutliers = 5
Data Types: single | double
Component Properties
The software sets component properties when you create the component. You can modify the
component properties (excluding HasLearnables and
HasLearned) using dot notation at any time. You cannot modify the
HasLearnables and HasLearned properties
directly.
Name
Component identifier, specified as a character vector or string scalar.
Example: c = outlierImputerComponent(Name="OutlierImputation")
Example: c.Name = "Imputation"
Data Types: char | string
Inputs
Names of the input ports, specified as a character vector, string array, or cell array of character vectors.
Example: c = outlierImputerComponent(Inputs="X")
Example: c.Inputs = "X1"
Data Types: char | string | cell
Outputs
Names of the output ports, specified as a character vector, string array, or cell array of character vectors.
Example: c = outlierImputerComponent(Outputs=["newX","indices"])
Example: c.Outputs = ["X1","Idx"]
Data Types: char | string | cell
InputTags
Tags that enable the automatic connection of the component inputs with other components or pipelines, specified as a nonnegative integer vector. If you specify InputTags, the number of tags must match the number of inputs in Inputs.
Example: c = outlierImputerComponent(InputTags=0)
Example: c.InputTags = 1
Data Types: single | double
OutputTags
Tags that enable the automatic connection of the component outputs with other components or pipelines, specified as a nonnegative integer vector. If you specify OutputTags, the number of tags must match the number of outputs in Outputs.
Example: c = outlierImputerComponent(OutputTags=[0 0])
Example: c.OutputTags = [1 0]
Data Types: single | double
HasLearnables
This property is read-only.
Indicator for the learnables, returned as 1 (true). A value of 1 indicates that the component contains Learnables.
Data Types: logical
HasLearned
This property is read-only.
Indicator showing the learning status of the component, returned as 0 (false) or 1 (true). A value of 1 indicates that the learn object function has been applied to the component, and the Learnables are nonempty.
Data Types: logical
Learnables
The software sets learnables when you use the learn object
function. You cannot modify learnables directly.
LowerThreshold
This property is read-only.
Lower threshold for identifying outliers, returned as a table. Each value corresponds to a variable in VariablesWithOutliers.
UpperThreshold
This property is read-only.
Upper threshold for identifying outliers, returned as a table. Each value corresponds to a variable in VariablesWithOutliers.
Center
This property is read-only.
Center value for imputing outliers, returned as a table. Each value corresponds to a variable in VariablesWithOutliers.
VariablesWithOutliers
This property is read-only.
Names of the variables used by the component to derive the LowerThreshold, UpperThreshold, and Center values, returned as a string array. The variables correspond to columns in the data argument of learn.
Object Functions
| Function | Description |
|---|---|
| learn | Initialize and evaluate pipeline or component |
| run | Execute pipeline or component for inference after learning |
| reset | Reset pipeline or component |
| series | Connect components in series to create pipeline |
| parallel | Connect components or pipelines in parallel to create pipeline |
| view | View diagram of pipeline inputs, outputs, components, and connections |
Examples
Create a pipeline component that imputes outlier values in observations.
    component = outlierImputerComponent

    component =
      outlierImputerComponent with properties:
                         Name: "OutlierImputer"
                       Inputs: "DataIn"
                    InputTags: 1
                      Outputs: ["DataOut" "IsOutlier"]
                   OutputTags: [1 0]
      Learnables (HasLearned = false)
               LowerThreshold: []
               UpperThreshold: []
                       Center: []
        VariablesWithOutliers: []
      Show all parameters

component is an outlierImputerComponent object that contains four learnables: LowerThreshold, UpperThreshold, Center, and VariablesWithOutliers. The properties remain empty until you pass data to the component during the learn phase.
Load the carbig data set. Create a table containing the variables Acceleration, Displacement, and Horsepower.

    load carbig
    cars = table(Acceleration,Displacement,Horsepower);

Use the learn object function to find and impute outlier values in cars.

    [component,newcars] = learn(component,cars);
    component

    component =
      outlierImputerComponent with properties:
                         Name: "OutlierImputer"
                       Inputs: "DataIn"
                    InputTags: 1
                      Outputs: ["DataOut" "IsOutlier"]
                   OutputTags: [1 0]
      Learnables (HasLearned = true)
               LowerThreshold: [1×3 table]
               UpperThreshold: [1×3 table]
                       Center: [1×3 table]
        VariablesWithOutliers: ["Acceleration" "Displacement" "Horsepower"]
      Show all parameters

The LowerThreshold, UpperThreshold, Center, and VariablesWithOutliers properties are nonempty, and the HasLearned property is set to true.
The newcars data set contains the same observations as cars, but has imputed outlier values. For example, the sixth observation in cars, which has outlier values for Displacement and Horsepower, uses imputed values in newcars.

    observation = cars(6,:)
    newObservation = newcars(6,:)

    observation =
      1×3 table
        Acceleration    Displacement    Horsepower
        ____________    ____________    __________
             10             429            198

    newObservation =
      1×3 table
        Acceleration    Displacement    Horsepower
        ____________    ____________    __________
             10             151             95

Version History
Introduced in R2026a