Is it possible to do multidimensional interpolation on a set of scattered data?

I’m completely stumped by a seemingly simple problem, and have been for a couple of years. Can any of you smart matlabbers help?
I want to do a multidimensional interpolation of a set of data. So, use interpn, right? (I tried using griddatan and I’ve been waiting for hours while it’s “busy”, trying just one interpolation (not the thousands I need).)
My data are stored in a large 2D matrix. They are not completely ordered, like in the example below. My actual problem has 9 dimensions of independent data, but in the example below, assume the first four columns are independent variables and the fifth column is the dependent variable that I seek an interpolation value for.
To use interpn, I need to supply the function with a multidimensional array – with the number of dimensions matching the number of independent variables (right?). So for this example, I’d need a 4D array. Example:
X1 X2 X3 X4 Y
24 1 8 30 0.73
24 2 9 40 0.04
24 3 9 50 0.18
24 4 10 60 0.88
24 1 11 70 0.18
24 2 12 80 0.94
24 3 8 90 0.36
24 4 9 100 0.58
24 1 10 110 0.86
24 2 10 120 0.49
26 3 10 130 0.03
26 4 11 140 0.90
26 1 12 150 0.03
26 2 8 160 0.12
26 3 9 170 0.85
26 4 10 180 0.35
26 1 11 190 0.51
26 2 12 200 0.05
26 3 12 210 0.49
26 4 13 220 0.45
How do I put them in a 4D array? Ignoring X4 for the moment, and looking at a slice of the array where X3 = 8, we would have this array:
X1\X2    1       2       3       4
24       0.73    -       0.36    -
26       -       0.12    -       -
There are empty cells. Is that OK? Otherwise how could we make a multidimensional array from scattered data?
Building that array by hand is impractical with a large array. Is there some automated way to do it? Am I making this harder than it should be? Or is it so hard that I should do something else?
Thanks!!!
  Comments: 2
Canoe Commuter on 1 Nov 2013
Sure. Here's an Excel file, with a header showing the variable names (X1 to X9 and Y) and a column with unique IDs for each of the 3217 points.
There's also a .m file with the data entered as a 3217 x 10 array (can't upload a .mat file to this site). Columns 1 to 9 are the independent variables (X1 to X9) and column 10 is the dependent variable (Y).
Thanks for looking at it!


Accepted Answer

Canoe Commuter on 27 Jan 2014
Edited: Canoe Commuter on 27 Jan 2014
After some consideration, I eventually decided to take a different approach. Instead of trying to do a multidimensional interpolation within my large data set, I decided to create neural network models from the data set and use them to model any points that I need. There are advantages and disadvantages to this approach, but both approaches introduce a small amount of error, and the neural net approach is doable, whereas I never succeeded with the interpolation approach, despite the excellent guidance from Jeremy, Matt J, and Kelly Kearney.
If you are trying to figure out multidimensional interpolation, there are some good points and tools below.
  Comments: 1
Matthias on 25 Jan 2018
Hi, I am working on a similar problem. Do you perhaps have some literature suggestion on the neural network approach that you have used?


More Answers (3)

Kelly Kearney on 1 Nov 2013
Does this function do approximately what you're looking for: vec2grid.m? Right now, I've only written it to deal with 2D and 3D arrays, but it could easily be expanded to include an arbitrary number of dimensions.
data = [...
24 1 8 30 0.73
24 2 9 40 0.04
24 3 9 50 0.18
24 4 10 60 0.88
24 1 11 70 0.18
24 2 12 80 0.94
24 3 8 90 0.36
24 4 9 100 0.58
24 1 10 110 0.86
24 2 10 120 0.49
26 3 10 130 0.03
26 4 11 140 0.90
26 1 12 150 0.03
26 2 8 160 0.12
26 3 9 170 0.85
26 4 10 180 0.35
26 1 11 190 0.51
26 2 12 200 0.05
26 3 12 210 0.49
26 4 13 220 0.45];
[x1, x2, x3, y] = vec2grid(data(:,1), data(:,2), data(:,3), data(:,5))
x1 =
24
26
x2 =
1
2
3
4
x3 =
8
9
10
11
12
13
y(:,:,1) =
0.73 NaN 0.36 NaN
NaN 0.04 0.18 0.58
0.86 0.49 NaN 0.88
0.18 NaN NaN NaN
NaN 0.94 NaN NaN
NaN NaN NaN NaN
y(:,:,2) =
NaN 0.12 NaN NaN
NaN NaN 0.85 NaN
NaN NaN 0.03 0.35
0.51 NaN NaN 0.9
0.03 0.05 0.49 NaN
NaN NaN NaN 0.45
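The "arbitrary number of dimensions" idea can be sketched with unique and sub2ind. This is only an illustration, not the vec2grid.m referred to above: the variable names (axesVals, subs, Y) are made up, the data are a few rows from the example, and dimension k of Y follows column k of X (the ndgrid convention), which may differ from the ordering in the output shown above.

```matlab
% Hypothetical sketch (not vec2grid.m): place scattered rows onto an
% N-D grid, leaving NaN wherever no sample exists. Assumes >= 2 columns in X.
X = [24 1 8; 24 3 8; 26 2 8; 24 2 9];   % independent variables, one row per point
y = [0.73; 0.36; 0.12; 0.04];           % dependent variable

nd       = size(X, 2);
axesVals = cell(1, nd);                 % unique coordinate values per dimension
subs     = zeros(size(X));              % grid subscripts of each row
for k = 1:nd
    [axesVals{k}, ~, subs(:, k)] = unique(X(:, k));
end

sz       = cellfun(@numel, axesVals);   % grid size, one entry per dimension
subsCell = num2cell(subs, 1);           % one cell per dimension for sub2ind
idx      = sub2ind(sz, subsCell{:});    % linear index of each sample

Y = NaN(sz);                            % full grid, NaN = no data
Y(idx) = y;                             % if coordinates repeat, the last row wins
```

Note that the full grid has prod(sz) elements, so with nine independent variables and many distinct values per variable it can become far too large to store, and nearly all of it would be NaN.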
  Comments: 6
Canoe Commuter on 4 Nov 2013
Edited: Canoe Commuter on 4 Nov 2013
Thanks again, Kelly!
What I'm after is a way to interpolate my data sets. Each set has nine independent variables. I'd like to be able to arbitrarily choose a 9D location, and find an interpolated value at that location. It seems like interpn is intended for this purpose, but the input data need to be in a 9D array.
I'm starting to think that this is too computationally intensive. Part of the rationale for doing this was an assumption that interpolation is faster than modeling new data, but perhaps this assumption fails when the dimensionality increases. Can anyone confirm this?
Kelly Kearney on 4 Nov 2013
The interpn function isn't going to help you here; it will incorporate the NaNs into its interpolation, so your resulting interpolated values are going to be NaN for most of the coordinate space.
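A tiny 2-D toy case (made-up grid and values, not the poster's data) illustrates the NaN problem: any query whose surrounding grid cell touches a missing node comes back NaN, while a query in a cell with complete data is fine.

```matlab
% Toy 4-by-4 grid with one missing sample, to show how NaN spreads.
[X1, X2] = ndgrid(1:4, 1:4);
V = X1 + 10*X2;                            % known values on the grid
V(1, 1) = NaN;                             % pretend this sample is missing

vq_near = interpn(X1, X2, V, 1.5, 1.5);    % cell includes the NaN node
vq_far  = interpn(X1, X2, V, 3.5, 3.5);    % cell has all four corners
```

With a grid that is mostly NaN, as in the 3-D output above, almost every cell has at least one missing corner.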
Do you want a nearest-neighbor interpolation or something else (linear, etc)? If the former, there could be several shortcuts around using griddatan.
If the latter, it might help to manipulate your data a little, and remove some of the dimensions you're interpolating over. For example, it looks like your first 3 variables are pretty grid-like, while the remaining dimensions are more scattered. Perhaps you could aggregate your data into dim1 x dim2 x dim3 groups, then interpolate separately within each of those "slices."
Whether any of this will be faster than rerunning your model is entirely dependent on that model.
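For the nearest-neighbor case, one possible shortcut is a brute-force distance search, which needs no triangulation at all. The sketch below uses made-up points and a made-up query; rescaling each variable to [0, 1] is an assumption on my part so that no single variable dominates the distance, and bsxfun is used so it also runs on older releases.

```matlab
% Made-up scattered samples: 5 points in 4-D (rows) and their values.
P = [24 1  8  30;
     24 2  9  40;
     26 3 10 130;
     26 4 11 140;
     26 1 12 150];
v = [0.73; 0.04; 0.03; 0.90; 0.03];

q = [25.5 3 10 128];                       % query location (assumed)

% Scale each column to [0, 1] so no variable dominates the distance.
lo   = min(P, [], 1);
span = max(P, [], 1) - lo;
span(span == 0) = 1;                       % guard against constant columns
Pn = bsxfun(@rdivide, bsxfun(@minus, P, lo), span);
qn = (q - lo) ./ span;

d2 = sum(bsxfun(@minus, Pn, qn).^2, 2);    % squared Euclidean distances
[~, iNearest] = min(d2);                   % index of the closest sample
vq = v(iNearest);                          % nearest-neighbor "interpolated" value
```

The cost is one pass over the data per query, so thousands of queries against a few thousand points should be cheap.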



Matt J on 4 Nov 2013
Edited: Matt J on 4 Nov 2013
"So, use interpn, right?"
No, not if your data is scattered. And rather than griddatan, scatteredInterpolant() is probably what would be recommended as the latest and greatest, if you have a sufficiently recent MATLAB release.
If your data can always be viewed as gridded data with missing elements, and the idea is to fill the missing data with something, you could try this FEX file.
I haven't used it myself, but it's getting favorable reviews. However, you would probably need a lot less missing data for this to work than in your example. In the example, your data essentially forms a line in 4-dimensional space. That's not a lot to extrapolate from.
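For reference, a minimal griddatan call on truly scattered points looks roughly like this. The data are random and made up (a linear test function in 4-D), so it only shows the calling pattern. griddatan builds a Delaunay triangulation of the points, and the cost of that step grows steeply with the number of dimensions, which is probably why a 9-D call can appear to hang.

```matlab
rng(0);                                    % reproducible made-up data
n  = 200;
X  = rand(n, 4);                           % 200 scattered points in 4-D, one per row
v  = X * [1; 2; 3; 4] + 5;                 % values of a linear test function

xq = [0.50 0.50 0.50 0.50;                 % query points, one per row
      0.40 0.60 0.50 0.45];
vq = griddatan(X, v, xq);                  % linear method by default; NaN outside the convex hull

expected = xq * [1; 2; 3; 4] + 5;          % exact values, for comparison with vq
```

Because the default method is linear, a linear test function should be reproduced almost exactly for queries inside the convex hull of the points.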
  Comments: 1
Canoe Commuter on 7 Nov 2013
Thanks, Matt. I'm using R2013a, which includes scatteredInterpolant, but it looks like that can only be used for 2D or 3D scattered data sets. I haven't yet checked out the FEX file.
Based on the wisdom of the commenters in this thread, I am currently working on reshaping my problem to be several 4D problems (dimensions 4 to 9 vary one-at-a-time, so this makes sense). As Kelly points out, dimensions 1 to 3 are not scattered. I'll update when I make progress...
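A stripped-down version of this "split into slices" idea might look like the following. It is only a sketch with made-up data: two grid-like columns and one scattered column stand in for the real three and six, and the column layout and names (keys, grp, slice) are assumptions.

```matlab
% Made-up data: columns are [g1 g2 t y]; g1 and g2 are grid-like, t is scattered.
data = [24 1  30 0.73;
        24 1  70 0.18;
        24 1 110 0.86;
        26 2 160 0.12;
        26 2 200 0.05];

[keys, ~, grp] = unique(data(:, 1:2), 'rows');    % one group per (g1, g2) pair

query = [24 1 50];                                % [g1 g2 t]; g1, g2 must match a group exactly
[found, k] = ismember(query(1:2), keys, 'rows');
if ~found
    error('No slice for this (g1, g2) combination.');
end

slice = data(grp == k, :);                        % rows belonging to that slice
yq = interp1(slice(:, 3), slice(:, 4), query(3)); % 1-D linear interpolation inside the slice
```

In the real problem, the interpolation inside each slice would run over the remaining scattered dimensions rather than a single column.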



Jeremy on 5 Nov 2013
Edited: Jeremy on 5 Nov 2013
For 2 dimensional data (surface of scattered points), I've found gridfit to be an all-star. Not sure how well that will translate to your 9 dimensional problem...
  Comments: 1
Canoe Commuter on 7 Nov 2013
Thanks, Jeremy. gridfit looks like a great tool, but as you suggested, it's only for 2D data sets.

