Appending dataset of varying length

Braden on 17 Apr 2011
I have a collection of monthly files of 1-minute averaged data that I would like to import, append, and process. I would like to handle the data in MATLAB using a dataset array:
wind = dataset('file','Halkirk1_12_2010_average1min.csv','delimiter',',','format',['%s' repmat(' %f',1,72)]);
Due to the nature of the files, they have varying lengths, so vertcat does not work. I will use datevec to pick up the next file to append. Is there a function or method that will append two datasets of varying lengths? Any tips or thoughts would be appreciated.
  2 comments
Oleg Komarov on 17 Apr 2011
Do you mean that the files have a different number of columns?
Braden on 18 Apr 2011
No, the CSV files have a different number of rows. For example, January's file contains more 1-minute averages than February's because January has 31 days and February has 28.
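As a rough check on how far apart the row counts are, the expected number of 1-minute records per month can be computed from the number of days. This is only a sketch: it assumes one record per minute with no gaps, and the year 2011 is an assumption, since the thread does not say which year the January and February files cover.

```matlab
% Expected rows per monthly file, assuming one record per minute and no gaps.
% The year 2011 is an assumption made for illustration only.
minutesPerDay = 24*60;
rowsJan = eomday(2011,1)*minutesPerDay;   % 31 days in January
rowsFeb = eomday(2011,2)*minutesPerDay;   % 28 days in February 2011
fprintf('January: %d rows, February: %d rows\n', rowsJan, rowsFeb);
```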


Accepted Answer

Oleg Komarov on 18 Apr 2011
The number of rows is not a problem:
A = dataset({rand(10,1),'col1'});
B = dataset({rand(20,1),'col1'});
C = [A;B]
If you get an error, please post the full error message.
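If the concatenation does fail, one possible cause is that the two dataset arrays do not share the same variable names. The sketch below compares the names before appending. The strict `isequal` check is a conservative choice made here for illustration, not something stated in the thread.

```matlab
% Two dataset arrays with the same variable name but different row counts
A = dataset({rand(10,1),'col1'});
B = dataset({rand(20,1),'col1'});

% Compare variable names before appending (strict, order-sensitive check)
namesA = A.Properties.VarNames;
namesB = B.Properties.VarNames;
if isequal(namesA, namesB)
    C = [A; B];                      % rows of B are appended below rows of A
    fprintf('Appended: %d rows\n', size(C,1));
else
    % Show the names that appear in only one of the two arrays
    disp(setxor(namesA, namesB));
end
```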

More Answers (2)

Laura Proctor on 18 Apr 2011
You can merge datasets using the JOIN function.
  4 comments
Oleg Komarov on 19 Apr 2011
@Laura: which is not what the OP wants if she wants to append.
Braden on 19 Apr 2011
This is correct, Oleg. 'He' wants to append.
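The difference between the two operations can be sketched with small made-up dataset arrays: `join` matches rows on a key and adds variables side by side, while vertical concatenation stacks rows. The variable names `id`, `x`, and `y` are hypothetical.

```matlab
% Made-up data: A and B share the key variable 'id'
A  = dataset({[1;2;3],'id'}, {[10;20;30],'x'});
B  = dataset({[1;2;3],'id'}, {[5;6;7],'y'});

% join merges by key: same rows, more variables (id, x, y)
J = join(A, B, 'Keys', 'id');

% vertcat appends: same variables, more rows
A2 = dataset({[4;5],'id'}, {[40;50],'x'});
V  = [A; A2];

disp(size(J));   % rows by variables after joining
disp(size(V));   % rows by variables after appending
```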



Richard Willey on 19 Apr 2011
Hi Oleg
This strikes me as more of a data representation issue than a question of MATLAB syntax. Your eventual solution will depend on how you want to describe "time". You seem to be assuming a "wide" format in which the observations for each month are stored as separate variables and each row represents a separate one-minute average (the first one-minute average in the month, the second one-minute average in the month, ...). This format will work fine; however, you might need to use some NaNs to pad out some of the months.
You might find it easier to create a variable labeled "Time" and use it to index all of your observations. You could then create separate variables that track which month each time value corresponds to, what day of the week it is, whether it's a holiday, and so on.
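A minimal sketch of that "long" layout follows. The timestamp strings, their format, and the `Speed` values are all made up for illustration, since the thread does not show what the first column of the real files looks like.

```matlab
% Hypothetical timestamps and values; the real files may use another format
stamps = {'2011-01-31 23:58:00'; '2011-01-31 23:59:00'; '2011-02-01 00:00:00'};
speed  = [7.1; 6.8; 7.4];
wind   = dataset({stamps,'Stamp'}, {speed,'Speed'});

% One Time variable (serial date number) measures every observation
wind.Time = datenum(wind.Stamp, 'yyyy-mm-dd HH:MM:SS');

% Derived variables: calendar month and day of week (1 = Sunday)
v = datevec(wind.Time);
wind.Month   = v(:,2);
wind.Weekday = weekday(wind.Time);
disp(wind);
```

With this layout, months of different lengths need no NaN padding, because each file simply contributes more or fewer rows.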
I'm attaching some code that I wrote a while back that grabs data from xls files and automatically creates nominal variables based on the file name.
Hope that this proves helpful
%% Loading Data into MATLAB
clear all
clc
% This script assumes that we have a set of XLS files.
% Each XLS file contains a separate spark sweep.
% We're interested in combining all these files into a dataset array,
% after which we're going to identify the minimum BSFC for each spark
% sweep.
% Identify where to search for files
Location = 'H:\Documents\MATLAB\BSFC\';
% Store the directory listing of all .xls files in the struct array D
D = dir([Location, '*.xls']);
% Create a dataset array from the file that is the first element in D.
% D(1).name has no folder part, so build the full path with fullfile.
name = D(1).name;
Engine = dataset('xlsfile', fullfile(Location, name));
% Use the name of the file as a nominal variable
% The nominal variable can be used to note that all these rows came from
% the file with name = "name"
% Start by stripping off the ".xls" extension
name = name(1:end-4);
% Write the name to the dataset array and convert to a nominal
Engine.Name = repmat(name, size(Engine,1), 1);
Engine.Name = nominal(Engine.Name);
% Repeat for all the rest of the .xls files in the "Location".
% Each new file will be vertically concatenated with the
% original dataset array. Because parfor treats Engine as a reduction
% variable, the row blocks may not end up in file order; use a plain
% for loop if the order matters.
f = @(x,y) vertcat(x,y);
parfor i = 2 : length(D)
    name = D(i).name;
    Engine2 = dataset('xlsfile', fullfile(Location, name));
    name = name(1:end-4);
    Engine2.Name = repmat(name, size(Engine2,1), 1);
    Engine2.Name = nominal(Engine2.Name);
    Engine = f(Engine, Engine2);
end
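An alternative to growing the array inside the loop is to collect each piece in a cell array and concatenate once at the end, which also keeps the pieces in a known order. The sketch below uses synthetic data in place of the files so that it runs on its own. The names `sweep_a`, `sweep_b`, `sweep_c`, the row counts, and the `BSFC` values are all made up.

```matlab
% Synthetic stand-ins for the files; names and row counts are hypothetical
names  = {'sweep_a','sweep_b','sweep_c'};
nRows  = [4 2 5];
pieces = cell(numel(names),1);

for k = 1:numel(names)
    % In real use this would be a dataset read from the k-th file
    ds = dataset({rand(nRows(k),1),'BSFC'});
    % Tag every row with the name of the source it came from
    ds.Name = nominal(repmat(names(k), nRows(k), 1));
    pieces{k} = ds;
end

% One concatenation over all pieces, regardless of their lengths
Engine = vertcat(pieces{:});
fprintf('%d rows from %d sources\n', size(Engine,1), numel(getlabels(Engine.Name)));
```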
  10 comments
Teja Muppirala on 6 May 2011
A very common mistake.
zeros(size(wind,1)) <--- Out of memory. zeros(n) creates an n-by-n matrix, which is not what you meant.
zeros(size(wind,1),1) <--- This is what you meant to write: an n-by-1 column vector.
Fix all of those lines to be:
wind.vhub = zeros(size(wind,1),1);
wind.newWinV = zeros(size(wind,1),1);
wind.newWinVMax = zeros(size(wind,1),1);
wind.newWinVMin = zeros(size(wind,1),1);
wind.newSonWinV = zeros(size(wind,1),1);
wind.newSonWinVMax = zeros(size(wind,1),1);
wind.newSonWinVMin = zeros(size(wind,1),1);
wind.phub = zeros(size(wind,1),1);
wind.rho = zeros(size(wind,1),1);
wind.Cp = zeros(size(wind,1),1);
wind.normpow = zeros(size(wind,1),1);
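The size difference can be seen on a small example, and the memory that the square matrix would need at full scale can be estimated without allocating it. The row count of 44640 used in the estimate is an assumption (a 31-day month of 1-minute data with no gaps).

```matlab
% Small demonstration of the two call forms
n = 1000;
a = zeros(n);      % n-by-n matrix
b = zeros(n,1);    % n-by-1 column vector
fprintf('zeros(n):   %d-by-%d\n', size(a,1), size(a,2));
fprintf('zeros(n,1): %d-by-%d\n', size(b,1), size(b,2));

% Estimated memory for doubles (8 bytes each), without allocating anything.
% nRows is an assumed row count for one 31-day month of 1-minute data.
nRows = 31*24*60;
fprintf('zeros(nRows) would need about %.1f GB\n', nRows^2*8/1e9);
fprintf('zeros(nRows,1) needs about %.2f MB\n', nRows*8/1e6);
```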
Braden on 6 May 2011
Ah yes! Thanks for picking up on that! Your help is much appreciated.
