# Split Table Data Variables and Apply Functions

This example shows how to split power outage data from a table into groups by region and cause of the power outages. Then it shows how to apply functions to calculate statistics for each group and collect the results in a table.

The sample file, `outages.csv`, contains data representing electric utility outages in the United States. The file contains six columns: `Region`, `OutageTime`, `Loss`, `Customers`, `RestorationTime`, and `Cause`. Read `outages.csv` into a table.

`T = readtable('outages.csv');`

Convert `Region` and `Cause` to categorical arrays, and `OutageTime` and `RestorationTime` to `datetime` arrays. Display the first five rows.

```T.Region = categorical(T.Region); T.Cause = categorical(T.Cause); T.OutageTime = datetime(T.OutageTime); T.RestorationTime = datetime(T.RestorationTime); T(1:5,:)```
```ans=5×6 table Region OutageTime Loss Customers RestorationTime Cause _________ ________________ ______ __________ ________________ _______________ SouthWest 2002-02-01 12:18 458.98 1.8202e+06 2002-02-07 16:50 winter storm SouthEast 2003-01-23 00:49 530.14 2.1204e+05 NaT winter storm SouthEast 2003-02-07 21:15 289.4 1.4294e+05 2003-02-17 08:14 winter storm West 2004-04-06 05:44 434.81 3.4037e+05 2004-04-06 06:10 equipment fault MidWest 2002-03-16 06:18 186.44 2.1275e+05 2002-03-18 23:23 severe storm ```

### Calculate Maximum Power Loss

Determine the greatest power loss due to a power outage in each region. The `findgroups` function returns `G`, a vector of group numbers created from `T.Region`. The `splitapply` function uses `G` to split `T.Loss` into five groups, corresponding to the five regions. `splitapply` applies the `max` function to each group and concatenates the maximum power losses into a vector.

```G = findgroups(T.Region); maxLoss = splitapply(@max,T.Loss,G)```
```maxLoss = 5×1 104 × 2.3141 2.3418 0.8767 0.2796 1.6659 ```

Calculate the maximum power loss due to a power outage by cause. To specify that `Cause` is the grouping variable, use table indexing. Create a table that contains the maximum power losses and their causes.

```T1 = T(:,'Cause'); [G,powerLosses] = findgroups(T1); powerLosses.maxLoss = splitapply(@max,T.Loss,G)```
```powerLosses=10×2 table Cause maxLoss ________________ _______ attack 582.63 earthquake 258.18 energy emergency 11638 equipment fault 16659 fire 872.96 severe storm 8767.3 thunder storm 23418 unknown 23141 wind 2796 winter storm 2883.7 ```

`powerLosses` is a table because `T1` is a table. You can append the maximum losses as another table variable.

Calculate the maximum power loss by cause in each region. To specify that `Region` and `Cause` are the grouping variables, use table indexing. Create a table that contains the maximum power losses and display the first 15 rows.

```T1 = T(:,{'Region','Cause'}); [G,powerLosses] = findgroups(T1); powerLosses.maxLoss = splitapply(@max,T.Loss,G); powerLosses(1:15,:)```
```ans=15×3 table Region Cause maxLoss _________ ________________ _______ MidWest attack 0 MidWest energy emergency 2378.7 MidWest equipment fault 903.28 MidWest severe storm 6808.7 MidWest thunder storm 15128 MidWest unknown 23141 MidWest wind 2053.8 MidWest winter storm 669.25 NorthEast attack 405.62 NorthEast earthquake 0 NorthEast energy emergency 11638 NorthEast equipment fault 794.36 NorthEast fire 872.96 NorthEast severe storm 6002.4 NorthEast thunder storm 23418 ```

### Calculate Number of Customers Impacted

Determine power-outage impact on customers by cause and region. Because `T.Loss` contains `NaN` values, wrap `sum` in an anonymous function to use the `'omitnan'` input argument.

```osumFcn = @(x)(sum(x,'omitnan')); powerLosses.totalCustomers = splitapply(osumFcn,T.Customers,G); powerLosses(1:15,:)```
```ans=15×4 table Region Cause maxLoss totalCustomers _________ ________________ _______ ______________ MidWest attack 0 0 MidWest energy emergency 2378.7 6.3363e+05 MidWest equipment fault 903.28 1.7822e+05 MidWest severe storm 6808.7 1.3511e+07 MidWest thunder storm 15128 4.2563e+06 MidWest unknown 23141 3.9505e+06 MidWest wind 2053.8 1.8796e+06 MidWest winter storm 669.25 4.8887e+06 NorthEast attack 405.62 2181.8 NorthEast earthquake 0 0 NorthEast energy emergency 11638 1.4391e+05 NorthEast equipment fault 794.36 3.9961e+05 NorthEast fire 872.96 6.1292e+05 NorthEast severe storm 6002.4 2.7905e+07 NorthEast thunder storm 23418 2.1885e+07 ```

### Calculate Mean Durations of Power Outages

Determine the mean durations of all U.S. power outages in hours. Add the mean durations of power outages to `powerLosses`. Because `T.RestorationTime` has `NaT` values, omit the resulting `NaN` values when calculating the mean durations.

```D = T.RestorationTime - T.OutageTime; H = hours(D); omeanFcn = @(x)(mean(x,'omitnan')); powerLosses.meanOutage = splitapply(omeanFcn,H,G); powerLosses(1:15,:)```
```ans=15×5 table Region Cause maxLoss totalCustomers meanOutage _________ ________________ _______ ______________ __________ MidWest attack 0 0 335.02 MidWest energy emergency 2378.7 6.3363e+05 5339.3 MidWest equipment fault 903.28 1.7822e+05 17.863 MidWest severe storm 6808.7 1.3511e+07 78.906 MidWest thunder storm 15128 4.2563e+06 51.245 MidWest unknown 23141 3.9505e+06 30.892 MidWest wind 2053.8 1.8796e+06 73.761 MidWest winter storm 669.25 4.8887e+06 127.58 NorthEast attack 405.62 2181.8 5.5117 NorthEast earthquake 0 0 0 NorthEast energy emergency 11638 1.4391e+05 77.345 NorthEast equipment fault 794.36 3.9961e+05 87.204 NorthEast fire 872.96 6.1292e+05 4.0267 NorthEast severe storm 6002.4 2.7905e+07 2163.5 NorthEast thunder storm 23418 2.1885e+07 46.098 ```