Read and Analyze MAT-File with Key-Value Data
This example shows how to create a datastore for key-value pair data in a MAT-file that is the output of mapreduce. Then, the example shows how to read all the data in the datastore and sort it. This example assumes that the data in the MAT-file fits in memory.
Create a datastore from the sample file, mapredout.mat, using the datastore function. The sample file contains unique keys representing airline carrier codes and corresponding values that represent the number of flights operated by that carrier.
ds = datastore('mapredout.mat');
datastore returns a KeyValueDatastore. The datastore function automatically determines the appropriate type of datastore to create.
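If you prefer not to rely on automatic detection, the datastore type can also be requested explicitly. The following is a minimal sketch, assuming the same sample file is on the MATLAB path; the 'Type' name-value argument with the value 'keyvalue' asks for a KeyValueDatastore.

```matlab
% Sketch: request a KeyValueDatastore explicitly instead of relying on
% automatic type detection. Assumes mapredout.mat is on the path.
ds = datastore('mapredout.mat','Type','keyvalue');

% Confirm the kind of datastore that was created.
isKV = isa(ds,'matlab.io.datastore.KeyValueDatastore');
```

This can be useful when a file could plausibly be read by more than one datastore type.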
Preview the data using the preview function. This function does not affect the state of the datastore.
preview(ds)
ans=1×2 table
Key Value
______ _________
{'AA'} {[14930]}
Read all of the data in ds using the readall function. The readall function returns a table with two columns, Key and Value.
T = readall(ds)
T=29×2 table
Key Value
__________ _________
{'AA' } {[14930]}
{'AS' } {[ 2910]}
{'CO' } {[ 8138]}
{'DL' } {[16578]}
{'EA' } {[ 920]}
{'HP' } {[ 3660]}
{'ML (1)'} {[ 69]}
{'NW' } {[10349]}
{'PA (1)'} {[ 318]}
{'PI' } {[ 871]}
{'PS' } {[ 83]}
{'TW' } {[ 3805]}
{'UA' } {[13286]}
{'US' } {[13997]}
{'WN' } {[15931]}
{'AQ' } {[ 154]}
⋮
T contains all the airline and flight data from the datastore in the same order in which the data was read. The table variables, Key and Value, are cell arrays.
Convert Value to a numeric array.
T.Value = cell2mat(T.Value)
T=29×2 table
Key Value
__________ _____
{'AA' } 14930
{'AS' } 2910
{'CO' } 8138
{'DL' } 16578
{'EA' } 920
{'HP' } 3660
{'ML (1)'} 69
{'NW' } 10349
{'PA (1)'} 318
{'PI' } 871
{'PS' } 83
{'TW' } 3805
{'UA' } 13286
{'US' } 13997
{'WN' } 15931
{'AQ' } 154
⋮
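The conversion above works because every value in this file is a numeric scalar. As a hedged sketch, the check below guards the call to cell2mat for key-value output where that might not hold. The small table is hypothetical and only mimics the shape that readall returns.

```matlab
% Hypothetical key-value table with the same shape as the readall output.
T = table({'AA';'AS'}, {14930; 2910}, 'VariableNames', {'Key','Value'});

% cell2mat needs compatible cell contents, so verify that each value
% is a numeric scalar before converting.
isScalarNum = cellfun(@(v) isnumeric(v) && isscalar(v), T.Value);
if all(isScalarNum)
    T.Value = cell2mat(T.Value);   % Value becomes a numeric column vector
else
    warning('Some values are not numeric scalars; leaving Value as a cell array.');
end
```

If the check fails, Value stays a cell array and each value would need to be handled individually.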
Assign new names to the table variables.
T.Properties.VariableNames = {'Airline','NumFlights'};
Sort the data in T by the number of flights.
T = sortrows(T,'NumFlights','descend')
T=29×2 table
Airline NumFlights
_______ __________
{'DL'} 16578
{'WN'} 15931
{'AA'} 14930
{'US'} 13997
{'UA'} 13286
{'NW'} 10349
{'CO'} 8138
{'MQ'} 3962
{'TW'} 3805
{'HP'} 3660
{'OO'} 3090
{'AS'} 2910
{'XE'} 2357
{'EV'} 1699
{'OH'} 1457
{'FL'} 1263
⋮
View a summary of the sorted table.
summary(T)
T: 29x2 table

Variables:

    Airline: cell array of character vectors
    NumFlights: double

Statistics for applicable variables:

                  NumMissing    Min    Median     Max        Mean          Std
    Airline           0
    NumFlights        0         69      1457     16578    4.2594e+03    5.5065e+03
Reset the datastore to allow rereading of the data.
reset(ds)
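After a reset, reading starts again from the beginning of the datastore. This example assumes the data fits in memory, but the same datastore can also be read in smaller pieces. The following is a minimal sketch, assuming the same sample file; the chunk size of 10 and the variable names total and chunk are arbitrary choices for illustration.

```matlab
% Sketch: read the key-value pairs in chunks instead of all at once.
ds = datastore('mapredout.mat');   % same sample file as above
ds.ReadSize = 10;                  % maximum number of key-value pairs per read

total = 0;
while hasdata(ds)
    chunk = read(ds);                            % table with Key and Value
    total = total + sum(cell2mat(chunk.Value));  % running total of flights
end

reset(ds)   % return to the start so the data can be read again
```

Here total accumulates the number of flights across all carriers without holding the full table in memory at once.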
See Also
datastore | KeyValueDatastore | tall | mapreduce