How do I use indexing to create discrete column vectors for a large data set?

I have different parameters for about 600 patients in a single spreadsheet giving me about 400K rows of data with 30 columns. The data is organized such that the patients are stacked on top of each other and each identifier is in the first column. Thus, the first 700ish rows belong to patient 1, the second 600ish rose belong to patient 2 so and so forth. I would like to automate the separation of this data so that I have specific arrays and/or column vectors that belong to only 1 patient.
For example current data:
Patient 1 Time 1 Heart Rate 1
Patient 1 Time 2 Heart Rate 2
Patient 2 Time 1 Heart Rate 1
Patient 2 Time 2 Heart Rate 2
I would like to be able to refer to each variable for each patient in its own column vector. This will allow me to plot the different patient information. For example, variable "T1" returns all the time values for patient 1. Variable "HeartRate1" returns all the heart rate values for patient 1. I would then plot the time and heart rate for patient 1.
Thanks for any help.
Chris

답변 (2개)

It sounds like perhaps you would like table() for some of what you are doing. Perhaps a vector of tables.
To do the grouping, read the data into a cell array, and then
[uniquepatients, ~, uidx] = unique(FullData(:,1), 'stable');
numpatients = length(uniquepatients);
PerPatientData = cell(numpatients, 1);
PatientIDs = cell(numpatients, 1);
for K = 1 : numpatients
PatientIDs = uniquepatients(K);
PerPatientData(K) = FullData(uidx==K, 2:end); %presuming cell array!
end

댓글 수: 3

I think this is what I'm looking for. However, I keep getting this error message: Error using cell/unique (line85). Input A must be a cell array of strings.
The data in column 1 is numerical so this is somewhat confusing to me.
Here is an example of some of the data:
1096412\t5G ICU\t4.1762e+04\t-80424\t
1096412\t5G ICU\t4.1762e+04\t-24\t
1096412\t5G ICU\t4.1762e+04\t-9.2167\t234
1096412\t5G ICU\t4.1762e+04\t-7.5000\t
1096412\t5G ICU\t4.1762e+04\t-7.3333\t239
Note: I edited your sample data to make the tabs obvious, as the forum runs tab-separated parts together.
[uniquepatients, ~, uidx] = unique( cell2mat(FullData(:,1)), 'stable');
Christopher, your sample data is different than your original example. I'm gonna go with the first (mostly because I don't really understand the sample data). If you have a spreadsheet file, read it into MATLAB using readtable. Then you could do one of two things.
  • If the vectors of times are the same for all the patients, use unstack to turn the (600-ish*700-ish)-by-3 table into a (700-ish)-by-(600-ish+1) table, with one row for each time, one (common) Time variable, and 600-ish HeartRate variables.
  • If the time vectors are not all the same, you can create one table for each patient using the following:
Some example data:
>> t = table(Patient,Time,HeartRate)
t =
Patient Time HeartRate
_______ ____ _________
111 1.1 0.83083
111 2.1 0.58526
111 3.1 0.54972
222 1.2 0.91719
222 2.2 0.28584
222 3.2 0.7572
333 1.3 0.75373
333 2.3 0.38045
333 3.3 0.56782
Now turn that into three tables, each in its own cell of a cell array (there are several ways to do this, here's one):
>> splitter = @(Time,HeartRate)table(Time,HeartRate);
>> patients = rowfun(splitter, t, ...
'GroupingVariable','Patient', 'OutputFormat','cell')
patients =
[3x2 table]
[3x2 table]
[3x2 table]
And then, for example,
plot(patients{1}.Time,patients{1}.HeartRate)
This assumes you have R2013b or later. Some tweaks you might consider:
  • Convert the patient variable in the table to a categorical vector. Of course, if you're just gonna throw that tall table away, may that's not needed.
  • If you have R2014b or later, use a datetime or duration vector for the Time variable.
Hope this helps.

이 질문은 마감되었습니다.

태그

질문:

2015년 6월 2일

마감:

2021년 8월 20일

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by