Regression

Create Regression model object for exposure at default

Description

Create and analyze a Regression model object to calculate the exposure at default (EAD) using this workflow:

  1. Use fitEADModel to create a Regression model object.

  2. Use predict to predict the EAD.

  3. Use modelDiscrimination to return AUROC and ROC data. You can plot the results using modelDiscriminationPlot.

  4. Use modelAccuracy to return the R-square, RMSE, correlation, and sample mean error of the predicted and observed EAD data. You can plot the results using modelAccuracyPlot.

Creation

Description

RegressionEADModel = fitEADModel(data,ModelType) creates a Regression EAD model object.

RegressionEADModel = fitEADModel(___,Name,Value) specifies options using one or more name-value pair arguments in addition to the input arguments in the previous syntax. The optional name-value pair arguments set model object properties. For example, eadModel = fitEADModel(EADData,ModelType,'PredictorVars',{'UtilizationRate','Age','Marriage'},'ConversionMeasure',"ccf",'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD') creates an eadModel object using a Regression model type.

Input Arguments

Data for exposure at default, specified as a table.

Data Types: table

Model type, specified as a string with the value of "Regression" or a character vector with the value of 'Regression'.

Data Types: char | string

Regression Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside quotes. You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: eadModel = fitEADModel(EADData,ModelType,'PredictorVars',{'UtilizationRate','Age','Marriage'},'ConversionMeasure',ConversionMeasure,'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD')

User-defined model ID, specified as the comma-separated pair consisting of 'ModelID' and a string or character vector. The software uses the ModelID text to format outputs, so the text is expected to be short.

Data Types: string | char

User-defined description for model, specified as the comma-separated pair consisting of 'Description' and a string or character vector.

Data Types: string | char

Predictor variables, specified as the comma-separated pair consisting of 'PredictorVars' and a string array or cell array of character vectors. PredictorVars indicates which columns in the data input contain the predictor information. By default, PredictorVars is set to all the columns in the data input except for the ResponseVar.

Data Types: string | cell

Response variable, specified as the comma-separated pair consisting of 'ResponseVar' and a string or character vector. The response variable contains the EAD data and must be a numeric variable. By default, the ResponseVar is set to the last column of data.

Data Types: string | char

Boundary tolerance, specified as the comma-separated pair consisting of 'BoundaryTolerance' and a positive scalar numeric. The BoundaryTolerance value perturbs the EAD response values away from 0 and 1, before applying a response transformation.

Data Types: double
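The effect of a boundary tolerance can be illustrated with a minimal sketch. It assumes that the tolerance is applied to conversion values in the interval [0,1] by clipping them into [tol, 1 - tol] before a logit transformation. The clipping rule and the values in lcf are illustrative assumptions, not the documented internal behavior of fitEADModel.

```matlab
% Tolerance value as shown in the example display on this page.
tol = 1e-7;

% Hypothetical conversion values, including both boundaries.
lcf = [0 0.25 0.5 1];

% Assumed perturbation: clip the values into [tol, 1 - tol].
lcfAdj = min(max(lcf, tol), 1 - tol);

% The logit is infinite at 0 and 1 but finite after the adjustment.
rawLogit = log(lcf ./ (1 - lcf));
adjLogit = log(lcfAdj ./ (1 - lcfAdj));

disp(isfinite(rawLogit));
disp(isfinite(adjLogit));
```

Without the adjustment, observations at exactly 0 or 1 map to -Inf or Inf and cannot be used as a response in a linear regression.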

Limit variable, specified as the comma-separated pair consisting of 'LimitVar' and a string or character vector. LimitVar indicates which column in the data contains the limit amount. LimitVar is required when ConversionMeasure is 'ccf' or 'lcf'.

Data Types: string | char

Drawn variable, specified as the comma-separated pair consisting of 'DrawnVar' and a string or character vector. DrawnVar indicates which column in the data contains the drawn amount. DrawnVar is required when ConversionMeasure is 'ccf'.

Data Types: string | char

Conversion measure, specified as the comma-separated pair consisting of 'ConversionMeasure' and a character vector or string with one of these values:

  • "ccf" — Credit conversion factor (CCF) is the portion of the undrawn amount that will be converted into credit. The undrawn amount is the limit minus the drawn amount. The EAD is therefore the drawn amount plus the CCF times the undrawn amount (EAD = Drawn + CCF*(Limit - Drawn)).

  • "lcf" — Limit conversion factor (LCF) is a fraction of the limit representing the total exposure. The EAD is then defined as the LCF times the limit (EAD = LCF*Limit).

Data Types: string | char
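The two conversion formulas can be checked by hand. The following sketch uses the Limit, Drawn, and EAD values from the first row of the EADData preview in the Examples section, computes the implied CCF and LCF, and maps each one back to the EAD scale. The variable names are illustrative only.

```matlab
% Values taken from the first row of the EADData preview on this page.
Limit = 44776;
Drawn = 10907;
EAD   = 44740;

% Conversion measures implied by the observed EAD.
LCF = EAD / Limit;                       % limit conversion factor
CCF = (EAD - Drawn) / (Limit - Drawn);   % credit conversion factor

% Map each conversion measure back to the EAD scale.
EADFromLCF = LCF * Limit;
EADFromCCF = Drawn + CCF * (Limit - Drawn);

fprintf('LCF = %.4f, CCF = %.4f\n', LCF, CCF);
fprintf('EAD from LCF = %.1f, EAD from CCF = %.1f\n', EADFromLCF, EADFromCCF);
```

Note that the CCF is undefined when the drawn amount equals the limit, because the undrawn amount in the denominator is zero.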

Properties

User-defined model ID, returned as a string.

Data Types: string

User-defined description, returned as a string.

Data Types: string

Underlying statistical model, returned as a compact linear model object. The compact version of the underlying regression model is an instance of the classreg.regr.CompactLinearModel class. For more information, see fitlm and CompactLinearModel.

Data Types: CompactLinearModel

Predictor variables, returned as a string array.

Data Types: string

Response variable, returned as a scalar string.

Data Types: string

Limit variable, returned as a string.

Data Types: string

Drawn variable, returned as a string.

Data Types: string

This property is read-only.

Boundary tolerance, returned as a scalar positive numeric.

Data Types: double

Conversion measure, returned as a string.

Data Types: string

This property is read-only.

Conversion transform, returned as a string that is "complog" if ConversionMeasure is "ccf" and "logit" when ConversionMeasure is "lcf".

Data Types: string

Object Functions

predict                    Predict exposure at default
modelDiscrimination        Compute AUROC and ROC data
modelDiscriminationPlot    Plot ROC curve
modelAccuracy              Compute R-square, RMSE, correlation, and sample mean error of predicted and observed EADs
modelAccuracyPlot          Scatter plot of predicted and observed EADs

Examples

This example shows how to use fitEADModel to create a Regression model for exposure at default (EAD).

Load EAD Data

Load the EAD data.

load EADData.mat
head(EADData)
ans=8×6 table
    UtilizationRate    Age     Marriage        Limit         Drawn          EAD    
    _______________    ___    ___________    __________    __________    __________

        0.24359        25     not married         44776         10907         44740
        0.96946        44     not married    2.1405e+05    2.0751e+05         40678
              0        40     married        1.6581e+05             0    1.6567e+05
        0.53242        38     not married    1.7375e+05         92506        1593.5
         0.2583        30     not married         26258        6782.5        54.175
        0.17039        54     married        1.7357e+05         29575        576.69
        0.18586        27     not married         19590          3641        998.49
        0.85372        42     not married    2.0712e+05    1.7682e+05    1.6454e+05

rng('default');
NumObs = height(EADData);
c = cvpartition(NumObs,'HoldOut',0.4);
TrainingInd = training(c);
TestInd = test(c);

Select Model Type

Select a model type, either Regression or Tobit. This example uses Regression.

ModelType = "Regression";

Select Conversion Measure

Select a conversion measure for the EAD response values.

ConversionMeasure = "LCF";

Create Regression EAD Model

Use fitEADModel to create a Regression model using EADData.

eadModel = fitEADModel(EADData,ModelType,'PredictorVars',{'UtilizationRate','Age','Marriage'}, ...
    'ConversionMeasure',ConversionMeasure,'DrawnVar','Drawn','LimitVar','Limit','ResponseVar','EAD');
disp(eadModel);
  Regression with properties:

    ConversionTransform: "logit"
      BoundaryTolerance: 1.0000e-07
                ModelID: "Regression"
            Description: ""
        UnderlyingModel: [1x1 classreg.regr.CompactLinearModel]
          PredictorVars: ["UtilizationRate"    "Age"    "Marriage"]
            ResponseVar: "EAD"
               LimitVar: "Limit"
               DrawnVar: "Drawn"
      ConversionMeasure: "lcf"

Display the underlying model. The underlying model's response variable is the transformation of the EAD response data. Use the 'BoundaryTolerance', 'LimitVar', and 'DrawnVar' name-value arguments to modify the transformation.

disp(eadModel.UnderlyingModel);
Compact linear regression model:
    EAD_lcf_logit ~ 1 + UtilizationRate + Age + Marriage

Estimated Coefficients:
                            Estimate        SE         tStat       pValue  
                            _________    _________    _______    __________

    (Intercept)               -2.4745      0.29892    -8.2781    1.6448e-16
    UtilizationRate            6.0045      0.19901     30.172    7.703e-182
    Age                     -0.020095    0.0073019     -2.752     0.0059471
    Marriage_not married     -0.03509      0.13935    -0.2518        0.8012


Number of observations: 4378, Error degrees of freedom: 4374
Root Mean Squared Error: 4.48
R-squared: 0.173,  Adjusted R-Squared: 0.173
F-statistic vs. constant model: 305, p-value = 5.7e-180

Predict EAD

EAD prediction operates on the underlying compact statistical model and then transforms the predicted values back to the EAD scale. You can call the predict function with different options for the 'ModelLevel' name-value argument.

predictedEAD = predict(eadModel, EADData(TestInd,:), 'ModelLevel', 'ead');
predictedConversion = predict(eadModel, EADData(TestInd,:), 'ModelLevel', 'ConversionMeasure');
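The back-transformation can be sketched by hand for a single observation. The sketch below takes the coefficients from the Estimated Coefficients display above and the predictor values from the first row of the EADData preview. It assumes that the prediction is the plain inverse logit of the linear predictor, multiplied by the limit for the "lcf" conversion measure, with no further adjustment. That assumption and the variable names are illustrative, so the result need not match the output of predict exactly.

```matlab
% Coefficients copied from the Estimated Coefficients display above.
b0          = -2.4745;
bUtil       =  6.0045;
bAge        = -0.020095;
bNotMarried = -0.03509;

% First row of the EADData preview: not married, Age 25.
UtilizationRate = 0.24359;
Age             = 25;
notMarried      = 1;       % dummy variable for Marriage_not married
Limit           = 44776;

% Linear predictor on the transformed (logit of LCF) scale.
z = b0 + bUtil*UtilizationRate + bAge*Age + bNotMarried*notMarried;

% Assumed back-transformation: inverse logit, then EAD = LCF*Limit.
lcfHat = 1 / (1 + exp(-z));
eadHat = lcfHat * Limit;

fprintf('z = %.4f, predicted LCF = %.4f, predicted EAD = %.1f\n', z, lcfHat, eadHat);
```

In this sketch, lcfHat corresponds to the 'ConversionMeasure' model level and eadHat to the 'ead' model level.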

Validate EAD Model

For model validation, use modelDiscrimination, modelDiscriminationPlot, modelAccuracy, and modelAccuracyPlot.

Use modelDiscrimination and then modelDiscriminationPlot to plot the ROC curve.

ModelLevel = "ConversionMeasure";
[DiscMeasure1, DiscData1] = modelDiscrimination(eadModel, EADData(TestInd,:), 'ModelLevel', ModelLevel);
modelDiscriminationPlot(eadModel, EADData(TestInd, :), 'ModelLevel', ModelLevel,'SegmentBy','Marriage');

Figure contains an axes object. The axes object with title EAD_lcf ROC Segmented by Marriage contains 2 objects of type line. These objects represent Regression, married, AUROC = 0.70813, Regression, not married, AUROC = 0.70921.
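Because the response is continuous, an AUROC calculation needs a binary target. The following sketch assumes that observed values at or above the sample mean are treated as the positive class, and it computes the AUROC as the fraction of positive-negative pairs that the predictions rank correctly. The dichotomization rule and the toy vectors are assumptions for illustration, not the documented behavior of modelDiscrimination.

```matlab
% Hypothetical observed and predicted conversion values.
observed  = [0.10 0.80 0.35 0.95 0.20 0.60];
predicted = [0.15 0.70 0.40 0.65 0.30 0.25];

% Assumed dichotomization: observed values at or above the mean are "high".
isHigh = observed >= mean(observed);

posScores = predicted(isHigh);
negScores = predicted(~isHigh);

% Pairwise differences between every positive and every negative score
% (uses implicit expansion, available in R2016b and later).
diffs = posScores(:) - negScores(:)';

% AUROC as the share of correctly ordered pairs; ties count as one half.
auroc = mean(diffs(:) > 0) + 0.5*mean(diffs(:) == 0);

fprintf('AUROC = %.4f\n', auroc);
```

An AUROC near 0.5 indicates that the predictions rank observations no better than chance, and a value near 1 indicates nearly perfect ranking.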

Use modelAccuracy and then modelAccuracyPlot to show a scatter plot of the predictions.

YData = "Observed";

[AccMeasure1, AccData1] = modelAccuracy(eadModel, EADData(TestInd,:), 'ModelLevel', ModelLevel);
modelAccuracyPlot(eadModel, EADData(TestInd,:), 'ModelLevel', ModelLevel, 'YData', YData);

Figure contains an axes object. The axes object with title Scatter Regression, R-Squared: 0.16148 contains 2 objects of type scatter, line. These objects represent Data, Fit.
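The accuracy metrics can be sketched in base MATLAB on a pair of toy vectors. The definitions below are common conventions and are assumptions here: R-square is taken as the squared correlation between observed and predicted values, and the sample mean error is taken as the mean observed value minus the mean predicted value. The metrics returned by modelAccuracy may follow different conventions.

```matlab
% Hypothetical observed and predicted values.
observed  = [0.10 0.80 0.35 0.95 0.20 0.60];
predicted = [0.15 0.70 0.40 0.65 0.30 0.25];

% Correlation between observed and predicted values.
R = corrcoef(observed, predicted);
corrValue = R(1,2);

% Assumed definition: R-square as the squared correlation.
rSquared = corrValue^2;

% Root mean squared error of the predictions.
rmse = sqrt(mean((observed - predicted).^2));

% Assumed sign convention: mean observed minus mean predicted.
sampleMeanError = mean(observed) - mean(predicted);

fprintf('R-square = %.4f, RMSE = %.4f\n', rSquared, rmse);
fprintf('Correlation = %.4f, Sample mean error = %.4f\n', corrValue, sampleMeanError);
```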

Plot overlaid histograms of the observed and predicted values at the selected model level.

figure;
histogram(AccData1.Observed);
hold on;
histogram(AccData1.(('Predicted_' + ModelType)));
legend('Observed', 'Predicted');

Figure contains an axes object. The axes object contains 2 objects of type histogram. These objects represent Observed, Predicted.

Introduced in R2021b