# Convert Text in Table Variables to Categorical

This example shows how to convert a variable in a table from a cell array of character vectors to a categorical array.

### Load Sample Data and Create a Table

Load sample data gathered from 100 patients.

```
load patients
whos
```
```
  Name                            Size            Bytes  Class      Attributes

  Age                           100x1               800  double
  Diastolic                     100x1               800  double
  Gender                        100x1             11412  cell
  Height                        100x1               800  double
  LastName                      100x1             11616  cell
  Location                      100x1             14208  cell
  SelfAssessedHealthStatus      100x1             11540  cell
  Smoker                        100x1               100  logical
  Systolic                      100x1               800  double
  Weight                        100x1               800  double
```

Store the patient data from `Age`, `Gender`, `Height`, `Weight`, `SelfAssessedHealthStatus`, and `Location` in a table. Use the unique identifiers in the variable `LastName` as row names.

```
T = table(Age,Gender,Height,Weight,...
    SelfAssessedHealthStatus,Location,...
    'RowNames',LastName);
```

### Convert Table Variables from Cell Arrays of Character Vectors to Categorical Arrays

The cell arrays of character vectors, `Gender` and `Location`, contain discrete sets of unique values.

Convert `Gender` and `Location` to categorical arrays.

```
T.Gender = categorical(T.Gender);
T.Location = categorical(T.Location);
```
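To see what the conversion produces, here is a minimal sketch on a small, hypothetical cell array (`c` is made up for illustration and is not part of the patients data). With no value set specified, `categorical` takes the categories from the unique values of the input, in sorted order.

```matlab
% Hypothetical sample data, not taken from the patients data set.
c = {'Male';'Female';'Female';'Male';'Female'};

g = categorical(c);        % categories come from the sorted unique values of c
cats = categories(g);      % column cell array of category names
isCat = iscategorical(g);  % true when g is a categorical array
```

Here `cats` lists `Female` before `Male`, because the default category order is alphabetical rather than the order of first appearance.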

The variable, `SelfAssessedHealthStatus`, contains four unique values: `Excellent`, `Fair`, `Good`, and `Poor`.

Convert `SelfAssessedHealthStatus` to an ordinal categorical array, such that the categories have the mathematical ordering `Poor < Fair < Good < Excellent`.

```
T.SelfAssessedHealthStatus = categorical(T.SelfAssessedHealthStatus,...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);
```
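The effect of the value set can be checked on a small example. The sketch below uses hypothetical responses (`s` and `valueset` are illustrative names, not part of the patients data) and confirms that the category order follows the order given in the value set instead of alphabetical order.

```matlab
% Hypothetical responses, not taken from the patients data set.
s = {'Good';'Poor';'Excellent';'Fair';'Good'};
valueset = {'Poor','Fair','Good','Excellent'};   % lowest to highest

h = categorical(s,valueset,'Ordinal',true);
cats = categories(h);   % same order as valueset, returned as a column
tf = isordinal(h);      % true, because 'Ordinal' was set to true
```

Without the value set, the categories would be sorted alphabetically (`Excellent`, `Fair`, `Good`, `Poor`), which is not a meaningful ordering for health status.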

### Print a Summary

View the data type, description, units, and other descriptive statistics for each variable by using `summary` to summarize the table.

```
format compact
summary(T)
```
```
Variables:
    Age: 100x1 double
        Values:
            Min        25
            Median     39
            Max        50
    Gender: 100x1 categorical
        Values:
            Female     53
            Male       47
    Height: 100x1 double
        Values:
            Min        60
            Median     67
            Max        72
    Weight: 100x1 double
        Values:
            Min        111
            Median     142.5
            Max        202
    SelfAssessedHealthStatus: 100x1 ordinal categorical
        Values:
            Poor          11
            Fair          15
            Good          40
            Excellent     34
    Location: 100x1 categorical
        Values:
            County General Hospital      39
            St. Mary's Medical Center    24
            VA Hospital                  37
```

The table variables `Gender`, `SelfAssessedHealthStatus`, and `Location` are categorical arrays. The summary contains the counts of the number of elements in each category. For example, the summary indicates that 53 of the 100 patients are female and 47 are male.
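If you need the per-category counts as data rather than as printed output, one option is `countcats`. The following is a small sketch on a hypothetical categorical array (`g` is made up for illustration); the counts are returned in the same order as the names from `categories`.

```matlab
% Hypothetical sample data, not taken from the patients data set.
g = categorical({'Female';'Male';'Female';'Female'});

names = categories(g);   % category names, in category order
counts = countcats(g);   % number of elements in each category
```

For this input, `counts` pairs with `names` element by element: three elements in `Female` and one in `Male`.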

### Select Data Based on Categories

Create a subtable, `T1`, containing the age, height, and weight of all female patients who were observed at County General Hospital. You can easily create a logical vector based on the values in the categorical arrays `Gender` and `Location`.

`rows = T.Location=='County General Hospital' & T.Gender=='Female';`

`rows` is a 100-by-1 logical vector with logical `true` (`1`) for the table rows where the gender is female and the location is County General Hospital.
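The same pattern can be tried on a few hypothetical rows (the `Location` and `Gender` values below are made up for illustration). The sketch also shows `ismember`, which is one way to test membership in several categories at once instead of chaining `==` comparisons with `|`.

```matlab
% Hypothetical sample data, not taken from the patients data set.
Location = categorical({'VA Hospital';'County General Hospital'; ...
    'County General Hospital';'St. Mary''s Medical Center'});
Gender = categorical({'Female';'Female';'Male';'Female'});

% Rows that satisfy both conditions.
rows = Location=='County General Hospital' & Gender=='Female';

% Rows whose location is any one of the listed categories.
inEither = ismember(Location,{'VA Hospital','County General Hospital'});

n = nnz(rows);   % number of rows selected by the combined condition
```

Only the second row satisfies both conditions, so `rows` is `true` there and `false` elsewhere.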

Define the subset of variables.

`vars = {'Age','Height','Weight'};`

Use parentheses to create the subtable, `T1`.

`T1 = T(rows,vars)`
```
T1=19×3 table
                  Age    Height    Weight
                  ___    ______    ______

    Brown         49       64       119
    Taylor        31       66       132
    Anderson      45       68       128
    Lee           44       66       146
    Walker        28       65       123
    Young         25       63       114
    Campbell      37       65       135
    Evans         39       62       121
    Morris        43       64       135
    Rivera        29       63       130
    Richardson    30       67       141
    Cox           28       66       111
    Torres        45       70       137
    Peterson      32       60       136
    Ramirez       48       64       137
    Bennett       35       64       131
      ⋮
```

`T1` is a 19-by-3 table.

Since ordinal categorical arrays have a mathematical ordering for their categories, you can perform element-wise comparisons of them with relational operations, such as greater than and less than.
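As a minimal sketch of such comparisons, consider a small hypothetical ordinal array (`h` is made up for illustration and is not part of the patients data). Comparing against a category name returns a logical array, and functions such as `min` also respect the category order.

```matlab
% Hypothetical responses, not taken from the patients data set.
h = categorical({'Good';'Poor';'Excellent';'Fair'}, ...
    {'Poor','Fair','Good','Excellent'},'Ordinal',true);

atMostFair = h <= 'Fair';   % true for elements in Poor or Fair
aboveFair  = h > 'Fair';    % true for elements in Good or Excellent
lowest = min(h);            % smallest element under the category order
```

These relational operators require an ordinal array; on a nonordinal categorical array, only equality tests such as `==` and `~=` are meaningful.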

Create a subtable, `T2`, of the gender, age, height, and weight of all patients who assessed their health status as poor or fair.

First, define the subset of rows to include in table `T2`.

`rows = T.SelfAssessedHealthStatus<='Fair';`

Then, define the subset of variables to include in table `T2`.

`vars = {'Gender','Age','Height','Weight'};`

Use parentheses to create the subtable `T2`.

`T2 = T(rows,vars)`
```
T2=26×4 table
                 Gender    Age    Height    Weight
                 ______    ___    ______    ______

    Johnson      Male      43       69       163
    Jones        Female    40       67       133
    Thomas       Female    42       66       137
    Jackson      Male      25       71       174
    Garcia       Female    27       69       131
    Rodriguez    Female    39       64       117
    Lewis        Female    41       62       137
    Lee          Female    44       66       146
    Hall         Male      25       70       189
    Hernandez    Male      36       68       166
    Lopez        Female    40       66       137
    Gonzalez     Female    35       66       118
    Mitchell     Male      39       71       164
    Campbell     Female    37       65       135
    Parker       Male      30       68       182
    Stewart      Male      49       68       170
      ⋮
```

`T2` is a 26-by-4 table.