Interpolating Multivariate time series

Andrew on 29 Apr 2011
Hi all,
I'm trying to test a multivariate time series dataset that has 2536 instances and 73 attributes, with missing values (represented by ?) in some rows. I looked for ways to interpolate the time series, but everything I found only handles 2-3 attributes.
Can someone help me interpolate this dataset? The dataset is in .data format.
Andrew

Answers (3)

Andrew on 29 Apr 2011
To be clear, the dataset looks something like this:
1/1/1998,0.8,1.8,2.4,2.1,10330,-55,0,0.
1/2/1998,2.8,3.2,3.3,2.7,10275,-55,0,0.
. . .
1/5/1998,2.6,2.1,1.6,1.4,?,?,?,0.58,0.
. . .
1/22/1998,2.8,3.6,?,?,4.6,10090,-40,0,0.
  Comments: 4
Andrew on 29 Apr 2011
@Oleg
Not really... all the rows have the same number of columns, with 73 attributes.
This is the dataset I'm talking about:
http://archive.ics.uci.edu/ml/machine-learning-databases/ozone/onehr.data
It has 75 columns in total: 1 date + 73 attributes + 1 result column that says whether it's an ozone day or not.
Andrew on 29 Apr 2011
@andrei
I'm not sure how to use TriScatteredInterp. Would you mind helping with code that does the interpolation and saves the filled-in values back to the .data file? I need to use that data to test the algorithm.
Thanks

Richard Willey on 29 Apr 2011
Handling missing data is a very complicated topic.
There are a number of different approaches that you can use, including listwise deletion, substitution models, multiple imputation, and so on. Each approach has its own advantages and disadvantages.
For example, an approach based on substitution (regression substitution, interpolation, what have you) will give you a complete data set to work with; however, this new data set is going to be biased. (As a simple example, suppose that you use a regression substitution model to estimate plausible values for your missing data points. Later on, you fit a regression model to your [complete] data set and report an R^2...)
Alternatively, an approach based on listwise deletion won't [necessarily] run into the same problems with bias; however, you will have issues with loss of statistical power.
I took a quick look at the data set in question. Two observations:
1. You are missing large blocks of data. This is going to cause some real problems for interpolation-based techniques.
2. Your data doesn't appear to be Missing Completely At Random or even Missing At Random.
Personally, I would start with listwise deletion...
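For reference, listwise deletion is short to write in MATLAB once the missing entries are stored as NaN. The following is only a minimal sketch: the matrix X is made-up sample data standing in for the numeric attribute columns, and the variable names are illustrative, not taken from the thread.

```matlab
% Made-up sample matrix standing in for the numeric attribute columns.
% NaN marks a missing value (the "?" entries in the raw file).
X = [0.8 1.8 2.4;
     2.8 NaN 3.3;
     2.6 2.1 NaN;
     2.8 3.6 4.6];

% Listwise deletion: keep only the rows with no missing value in any column.
rowHasMissing = any(isnan(X), 2);
Xcomplete = X(~rowHasMissing, :);

% Fraction of rows dropped, a rough indicator of how much data
% (and therefore statistical power) the deletion costs.
fractionDropped = mean(rowHasMissing);
```

Checking fractionDropped first gives a quick sense of whether deletion leaves enough rows to work with.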

Andrew on 30 Apr 2011
I guess I can't delete the missing values.
How do we interpolate that with interp1? Can I use it to interpolate the dataset above?
I read somewhere in the MathWorks documentation that yi = interp1(Y,xi) assumes that x = 1:N, where N is the length of Y for vector Y, or size(Y,1) for matrix Y.
yi = interp1(x,Y,xi,method) interpolates using alternative methods.
But then, how does it know what dataset to use? When I load the dataset using "load onehr.data", it says unknown value '?'...
Can someone help me?
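One possible way to approach both questions is sketched below, under stated assumptions: the four rows in raw are made-up sample data in the same comma-separated layout (one date plus numeric columns), not the real file, and the names raw, Y, t and known are illustrative. textscan can be told to treat "?" as an empty numeric field, which it then stores as NaN, and interp1 is then applied one column at a time using the row index as x.

```matlab
% Made-up sample in the same layout as the file: date, then numeric columns.
raw = sprintf(['1/1/1998,0.8,1.8,10330\n' ...
               '1/2/1998,2.8,?,10275\n'   ...
               '1/3/1998,2.6,2.2,?\n'     ...
               '1/4/1998,2.8,3.6,10090']);

% Read the date as text and the rest as numbers; "?" becomes NaN.
C = textscan(raw, '%s%f%f%f', 'Delimiter', ',', ...
             'TreatAsEmpty', '?', 'CollectOutput', true);
dates = C{1};   % cell array of date strings
Y     = C{2};   % numeric matrix, one column per attribute

% Linear interpolation column by column, using the row index as x.
t = (1:size(Y, 1))';
for j = 1:size(Y, 2)
    known = ~isnan(Y(:, j));
    if nnz(known) >= 2 && any(~known)
        % Gaps before the first or after the last known value stay NaN,
        % because interp1 does not extrapolate for 'linear' by default.
        Y(~known, j) = interp1(t(known), Y(known, j), t(~known), 'linear');
    end
end
```

For the real file, the same call should work with a file identifier from fopen in place of raw and a format string built as ['%s' repmat('%f', 1, 74)], assuming the 1 date + 73 attributes + 1 result layout described above. Note that linear interpolation across the large missing blocks mentioned in the earlier answer will just draw straight lines through them.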
