Main Content

Select Times in Timetable

A timetable is a type of table that associates a time with each row. You can select time-based subsets of its data in several ways:

  • Find times within a certain range using the timerange or withtol functions.

  • Match recurring units of time, such as days or months, using the components of datetime arrays.

  • Resample or group data with the retime function.

For example, read a sample file outages.csv, containing data representing electric utility outages in the United States from 2002–2014. The vector of row times, OutageTime, indicates when the outages occurred. The readtimetable function imports it as a datetime array. Display the first five rows.

TT = readtimetable('outages.csv');
head(TT,5)
       OutageTime          Region         Loss     Customers     RestorationTime            Cause       
    ________________    _____________    ______    __________    ________________    ___________________

    2002-02-01 12:18    {'SouthWest'}    458.98    1.8202e+06    2002-02-07 16:50    {'winter storm'   }
    2003-01-23 00:49    {'SouthEast'}    530.14    2.1204e+05                 NaT    {'winter storm'   }
    2003-02-07 21:15    {'SouthEast'}     289.4    1.4294e+05    2003-02-17 08:14    {'winter storm'   }
    2004-04-06 05:44    {'West'     }    434.81    3.4037e+05    2004-04-06 06:10    {'equipment fault'}
    2002-03-16 06:18    {'MidWest'  }    186.44    2.1275e+05    2002-03-18 23:23    {'severe storm'   }

Before R2019a, read tabular data with readtable and convert it to a timetable using table2timetable.

Select Time Range

To find data in a specific range, you can use the timerange function, which defines time-based subscripts for indexing. For instance, define a range for the summer of 2008, which started on June 20 and ended on September 21. By default, timerange defines a half-open interval that is closed on the left and open on the right, so specify the end date as September 22.

TR = timerange("2008-06-20","2008-09-22")
TR = 
	timetable timerange subscript:

		Select timetable rows with times in the half-open interval:
		  Starting at, including:   20-Jun-2008 00:00:00
		  Ending at, but excluding: 22-Sep-2008 00:00:00

Find the outages that occurred in that range, and then plot the number of customers affected over time.

summer08 = TT(TR,:);
stem(summer08.OutageTime,summer08.Customers)
ylabel("Customers")

Several outages during that time range had high customer impact. Expand the range to a time period that spans the entire year of 2008 and look for similarly high numbers.

TR = timerange("2008","years");
all08 = TT(TR,:);
high08 = all08(all08.Customers > 500000,:);

stem(high08.OutageTime,high08.Customers)
ylabel('Customers')

The timerange function is also helpful for selecting specific dates. Selecting times by comparing datetime values can give misleading results because all datetime values include both date and time components. However, when you specify only the date component of a datetime value, the time component is set to midnight. Therefore, although there is data from June 26, a comparison like this one returns no results.

any(summer08.OutageTime == datetime("2008-06-26"))
ans = logical
   0

Instead, you can use timerange.

TR = timerange("2008-06-26","days");
june26 = summer08(TR,:)
june26=1×5 timetable
       OutageTime          Region         Loss     Customers    RestorationTime           Cause      
    ________________    _____________    ______    _________    ________________    _________________

    2008-06-26 22:36    {'NorthEast'}    425.21      93612      2008-06-27 06:53    {'thunder storm'}

Another way to define a range is to specify a tolerance around a time using withtol. For example, find rows from the summer of 2008 where OutageTime is within three days of Labor Day, September 1.

WT = withtol("2008-09-01",days(3));
nearSep1 = summer08(WT,:)
nearSep1=4×5 timetable
       OutageTime          Region         Loss     Customers    RestorationTime            Cause       
    ________________    _____________    ______    _________    ________________    ___________________

    2008-09-01 23:35    {'SouthEast'}    206.27     2.27e+05                 NaT    {'equipment fault'}
    2008-09-01 00:18    {'MidWest'  }    510.05        74213    2008-09-01 14:07    {'thunder storm'  }
    2008-09-02 19:01    {'MidWest'  }       NaN    2.215e+05    2008-09-03 02:58    {'severe storm'   }
    2008-08-29 20:25    {'West'     }       NaN        31624    2008-09-01 01:51    {'wind'           }

Match Units of Time

You also can use units of datetime values, such as hours or days, to identify rows for logical indexing. This method can be useful for specifying periodic intervals.

For example, find the values of OutageTime whose month components have values of 3 or less, corresponding to January, February, and March of each year. Use the resulting logical array to index into TT.

TR = (month(TT.OutageTime) <= 3);
winterTT = TT(TR,:);

head(winterTT,5)
       OutageTime          Region         Loss     Customers     RestorationTime          Cause      
    ________________    _____________    ______    __________    ________________    ________________

    2002-02-01 12:18    {'SouthWest'}    458.98    1.8202e+06    2002-02-07 16:50    {'winter storm'}
    2003-01-23 00:49    {'SouthEast'}    530.14    2.1204e+05                 NaT    {'winter storm'}
    2003-02-07 21:15    {'SouthEast'}     289.4    1.4294e+05    2003-02-17 08:14    {'winter storm'}
    2002-03-16 06:18    {'MidWest'  }    186.44    2.1275e+05    2002-03-18 23:23    {'severe storm'}
    2005-02-04 08:18    {'MidWest'  }       NaN           NaN    2005-02-04 19:51    {'attack'      }

Create a pie chart of the wintertime causes. The pie function accepts only numeric or categorical inputs, so first convert Cause to categorical.

winterTT.Cause = categorical(winterTT.Cause);
pie(winterTT.Cause)
title("Causes of Outages, January to March");

Group by Time Period

The retime function adjusts row times to create specified intervals, either by resampling or grouping values. Its pre-defined intervals range from seconds to years, and you can specify how to handle missing or multiple values for the intervals. For instance, you can select the first observation from each week, or count observations in a quarter.

For the outage data, you can use retime to find totals for each year. First, create a timetable with only numeric variables. Then, call retime and specify a yearly interval, combining multiple values using a sum. The output has one row for each year, containing the total losses and total customers affected during that year.

numTT = TT(:,vartype("numeric"));
numTT = retime(numTT,"yearly","sum");
head(numTT,5)
       OutageTime       Loss     Customers 
    ________________    _____    __________

    2002-01-01 00:00    81335    1.3052e+07
    2003-01-01 00:00    58036     1.396e+07
    2004-01-01 00:00    51014    1.5523e+07
    2005-01-01 00:00    33980    8.7334e+06
    2006-01-01 00:00    35129    2.5729e+07

Create a bar chart of the number of customers affected each year.

bar(numTT.OutageTime,numTT.Customers)
xlabel("Year")
ylabel("Customers")

Calculate Durations Using Row Times

You can use the row times of a timetable with other datetime or duration values to perform calculations. For example, calculate the durations of the power outages listed in the outage data. Then calculate the monthly medians of the outage durations and plot them.

First add the outage durations to TT by subtracting the row times (which are the starts of power outages) from RestorationTime (which are the ends of the power outages). Change the format of OutageDuration to display the durations of the outages in days. Display the first five rows of TT.

TT.OutageDuration = TT.RestorationTime - TT.OutageTime;
TT.OutageDuration.Format = 'd';
head(TT,5)
       OutageTime          Region         Loss     Customers     RestorationTime            Cause           OutageDuration
    ________________    _____________    ______    __________    ________________    ___________________    ______________

    2002-02-01 12:18    {'SouthWest'}    458.98    1.8202e+06    2002-02-07 16:50    {'winter storm'   }      6.1889 days 
    2003-01-23 00:49    {'SouthEast'}    530.14    2.1204e+05                 NaT    {'winter storm'   }         NaN days 
    2003-02-07 21:15    {'SouthEast'}     289.4    1.4294e+05    2003-02-17 08:14    {'winter storm'   }      9.4576 days 
    2004-04-06 05:44    {'West'     }    434.81    3.4037e+05    2004-04-06 06:10    {'equipment fault'}    0.018056 days 
    2002-03-16 06:18    {'MidWest'  }    186.44    2.1275e+05    2002-03-18 23:23    {'severe storm'   }      2.7118 days 

Create a timetable that has only the outage durations. Some rows of TT have missing values, NaT, for the restoration times, leading to NaN values in OutageDuration. To remove the NaN values from medianTT, use the rmmissing function. Then use retime to calculate the monthly median outage duration. Display the first five rows of medianTT.

medianTT = TT(:,"OutageDuration");
medianTT = rmmissing(medianTT); 
medianTT = retime(medianTT,'monthly',@median);
head(medianTT,5)
       OutageTime       OutageDuration
    ________________    ______________

    2002-02-01 00:00      6.1889 days 
    2002-03-01 00:00      2.7472 days 
    2002-04-01 00:00         NaN days 
    2002-05-01 00:00     0.72917 days 
    2002-06-01 00:00     0.22431 days 

Create a stairstep chart of the monthly median outage durations.

stairs(medianTT.OutageTime,medianTT.OutageDuration)
xlabel("Year")
ylabel("Median Duration (days)")

See Also

| | | | | | | | | | |

Related Topics