MATLAB Answers


Collate Data in Grid

Asked by Si on 14 May 2014
Latest activity: commented on by Image Analyst on 14 Jun 2018

I have a list of data points in 2 dimensions (x and y coordinates). They fall within a certain range, from -R to +R. A third (variable) property is associated with each data point, i.e., the list contains 3 columns: the first two give the position and the third gives the speed. There are approximately 1,000,000 rows of data.

1. I want to set up an 'imaginary' grid and count the number of points that fall in each grid cell, based on their (x, y) location. The size of the grid should be adjustable. I would also like to plot this data as a hot-spot map: x, y position, with color indicating the count in each cell.

2. I also want to be able to sort the data within each grid cell (or within any specified cell): go through the data belonging to an individual cell and count the number of points that have the same or different speeds. I would then like to plot this for a specified cell: count versus speed.

Any ideas would be welcome on how to implement this.


How can I do this? Any ideas?


  7 Comments

From his last comment to me, I assumed he's got it all solved now. If I'm wrong, say so.
I have the same problem: I want to put point data (an x, y, z test file) into 2D gridded cells and find the minimum z value, and the (x, y) point where it occurs, inside each cell. Please let me know a solution for this. I have used the meshgrid and griddata functions, but I could not recover the data. I need to find a proper indexing method to index the cells. Thank you.
Show what code you did, what you got, and what you want, in a new discussion all your own.


3 Answers

Answer by Cedric Wannaz on 17 May 2014
Edited by Cedric Wannaz on 17 May 2014
Accepted Answer

In addition to Image Analyst's hints, you could also look at
Alternatively, here is a short example which illustrates how to grid data "by hand" on a regular grid. It can be adapted to irregular grids, to produce what is sometimes named zonal statistics in GIS contexts.
R = 4 ;
% - Define dummy data.
n = 15 ;
x = -R + 2*R*rand( n, 1 ) ;
y = -R + 2*R*rand( n, 1 ) ;
v1 = randi( 20, n, 1 ) ; % A series of data associated with points.
v2 = randi( 20, n, 1 ) ; % Another series.
% - Build grid.
nBinsX = 3 ;
nBinsY = 2 ;
xg = linspace( -R, R, nBinsX+1 ) ;
yg = linspace( -R, R, nBinsY+1 ) ;
nCells = nBinsX * nBinsY ;
% - Build figure.
figure(1) ; clf ; hold on ;
set( gcf, 'Color', 'w', 'Units', 'Normalized', ...
'Position', [0.1,0.1,0.6,0.6] ) ;
% - Plot grid.
plot( [xg;xg], repmat( [-R;R], 1, numel( xg )), 'Color', 0.8*[1,1,1] ) ;
plot( repmat( [-R;R], 1, numel( yg )), [yg;yg], 'Color', 0.8*[1,1,1] ) ;
xlim( 1.5*[-R,R] ) ; ylim( 1.5*[-R,R] ) ;
% - Build set of unique IDs for cells.
xId = sum( bsxfun( @ge, x, xg(1:end-1) ), 2 ) ;
yId = sum( bsxfun( @ge, y, yg(1:end-1) ), 2 ) ;
cellId = nBinsY * (xId - 1) + yId ;
% - Plot cell IDs.
labels = arrayfun( @(k)sprintf( '%d', k ), 1:nCells, 'UniformOutput', false ) ;
[X,Y] = meshgrid( (xg(1:end-1)+xg(2:end))/2, (yg(1:end-1)+yg(2:end))/2 ) ;
text( X(:), Y(:), labels, 'Color', 'b', 'FontSize', 14 ) ;
% - Plot data points with labels.
plot( x, y, 'rx', 'LineWidth', 2, 'MarkerSize', 8 ) ;
labels = arrayfun( @(k)sprintf( 'P%d\\in%d | %d,%d', k, cellId(k), ...
v1(k), v2(k) ), 1:n, 'UniformOutput', false ) ;
text( x, y+R/100, labels, 'Color', 'k', 'FontSize', 9, ...
'HorizontalAlignment', 'Center', 'VerticalAlignment', 'Bottom' ) ;
% - Compute some stat (sum, mean) per block on v1 and v2.
blockSum_v1 = accumarray( cellId, v1, [nCells, 1] ) ;
blockMean_v2 = accumarray( cellId, v2, [nCells, 1], @mean ) ;
fprintf( '\nBlock sum v1 =\n' ) ;
disp( blockSum_v1 ) ;
fprintf( '\nBlock mean v2 =\n' ) ;
disp( blockMean_v2 ) ;
This outputs
Block sum v1 =
Block mean v2 =
Note that if you eliminate all the "demo" code, the approach reduces to almost a one-liner for defining cellIDs and then a one-liner per stat.
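As a rough illustration of that reduced form, here is a minimal sketch that keeps only the binning and produces the count-per-cell heat map asked for in the question. It assumes all points lie within [-R, R]; the dummy data, the bin counts and the names countMap, xc and yc are placeholders chosen for this example, not part of the original answer.

```matlab
% Minimal sketch: count points per cell and show the counts as a heat map.
% Assumption: all points satisfy -R <= x, y <= R. Dummy data stands in
% for the real 3-column list.
R      = 4 ;
n      = 1e5 ;
x      = -R + 2*R*rand( n, 1 ) ;
y      = -R + 2*R*rand( n, 1 ) ;
nBinsX = 30 ;
nBinsY = 20 ;
xg     = linspace( -R, R, nBinsX+1 ) ;
yg     = linspace( -R, R, nBinsY+1 ) ;
% Bin index = number of lower edges that each point reaches.
xId    = sum( bsxfun( @ge, x, xg(1:end-1) ), 2 ) ;
yId    = sum( bsxfun( @ge, y, yg(1:end-1) ), 2 ) ;
% 2D count map: rows follow y, columns follow x.
countMap = accumarray( [yId, xId], 1, [nBinsY, nBinsX] ) ;
% Display at cell centres, with y increasing upwards.
xc = (xg(1:end-1) + xg(2:end)) / 2 ;
yc = (yg(1:end-1) + yg(2:end)) / 2 ;
figure ; imagesc( xc, yc, countMap ) ;
axis xy ; axis equal tight ; colorbar ;
xlabel( 'x' ) ; ylabel( 'y' ) ; title( 'Number of points per cell' ) ;
```

The bsxfun step builds an n-by-nBins logical array, so with about 1,000,000 rows and a fine grid it may be worth processing the data in chunks or using a histogram-based bin lookup instead.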

  4 Comments

I may have misunderstood the problem that you are trying to solve. In my setup, x and y are coordinates (e.g. spatial/geographical) which define points/locations in a 2D space, and v1, v2 are two data sets associated with these points. So y is not a count at all. How does your setup differ from my description?
PS: in my setup, if each point was the location of a particulate and v1 was the speed of particulates, I would get a total count of particulates per block with:
blockCount = accumarray( cellId, ones( size( cellId )), [nCells, 1] ) ;
A block count of particulates whose speed is e.g. in the range 2 <= v1 < 4.3 as follows:
isRelevant = v1 >= 2 & v1 < 4.3 ;
blockCount_v1inRange = accumarray( cellId(isRelevant), ...
ones( nnz(isRelevant), 1 ), [nCells, 1] ) ;
A block average speed of particulates whose speed is in the relevant range with:
blockMean_v1inRange = accumarray( cellId(isRelevant), v1(isRelevant), ...
[nCells, 1], @mean ) ;
Si 17 May 2014
Thanks Cedric. It's my fault, my description is not the best. Let me clarify. Yes, x and y are coordinates which define position in 2D space. There is one other data set/variable associated with each point, and this is velocity. Let's just call it v1. When I mentioned y, it was a bit confusing: I meant the 'y axis' (i.e., the ordinate/vertical axis), not the y variable. Sorry about that. What I was poorly trying to say is that I then want to treat every cell like a separate data set, i.e., to collate the data in each individual cell in terms of v1. This would make it possible, for each individual cell, to plot count versus v1. So, for example, using your solution (attached png file), I can plot the frequency (count) of the velocities (v1) for the points in grid 5.
Ok. Yes, then it is easy to isolate a block with basic indexing:
isSelected = cellId == 5 ;
blockData = v1(isSelected) ;
or simply
blockData = v1(cellId==5) ;
which is, I guess, what you did. You can also put all blocks of data in a cell array:
>> v1byCell = accumarray( cellId, v1, [nCells, 1], @(x){x} )
v1byCell =
[3x1 double]
[4x1 double]
[2x1 double]
[2x1 double]
[3x1 double]
[ 1]
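To get from the isolated block to the count-versus-speed plot mentioned above, one possible sketch is to count how often each distinct speed occurs in the chosen cell. The short cellId and v1 vectors below are made-up stand-ins so that the snippet runs on its own; selectedCell, speeds and counts are names introduced only for this example.

```matlab
% Sketch: count versus speed for one chosen cell.
% Hypothetical stand-ins for the cellId and v1 vectors built earlier.
cellId = [5; 2; 5; 5; 1; 5; 2; 5] ;
v1     = [3; 7; 3; 9; 4; 9; 7; 3] ;
selectedCell = 5 ;
blockData = v1(cellId == selectedCell) ;   % speeds in the chosen cell
% Distinct speeds and how often each one occurs.
[speeds, ~, idx] = unique( blockData ) ;
counts = accumarray( idx(:), 1 ) ;
figure ; bar( speeds, counts ) ;
xlabel( 'Speed v1' ) ; ylabel( 'Count' ) ;
title( sprintf( 'Cell %d', selectedCell ) ) ;
```

This works well for integer-valued speeds such as the randi data in the demo. For real-valued speeds, exact repeats are rare, so binning blockData into speed intervals with hist or histc is probably the more useful variant.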


Answer by Image Analyst on 15 May 2014

1. Try griddedInterpolant, TriScatteredInterp, or griddata, depending on the version of MATLAB you're using.
2. Try sort() and hist() or histc().
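As a small sketch of how histc might be used for the gridding step, its second output gives the bin index of each value directly. The coordinates and edges below are made up for illustration, and folding the last-edge bin into the last real bin is a choice made for this example rather than something stated in the answer.

```matlab
% Sketch: bin indices along x from histc.
R  = 4 ;
x  = [-3.5; -0.2; 0.1; 3.9; 4] ;   % hypothetical coordinates
xg = linspace( -R, R, 4 ) ;        % 3 bins, edges at -4, -4/3, 4/3, 4
[~, xId] = histc( x, xg ) ;        % second output: bin index per value
% histc places values equal to the last edge in an extra bin;
% fold that bin into the last real one.
xId( xId == numel( xg ) ) = numel( xg ) - 1 ;
disp( xId.' ) ;
```

The same call on y gives yId, and the two indices can then be combined into a cell ID or passed to accumarray. Newer MATLAB releases recommend histcounts or discretize in place of histc.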

  4 Comments

Attach your data file and your coding attempt (the m-file) if you get stuck.
Si 17 May 2014
Hi, just a quick one regarding hist3 and colorbars. 1) I use hist3(X) with the default 10x10 grid and plot it as a 2D heat map, then call colorbar. The colorbar generated is accurate and presentable. This is all well and good. 2) I use hist3 with edges to increase the number of bins and to match the data set values. I then call colorbar and get values all over the place that do not represent the data and are not presentable. Why is this? I was expecting a colorbar similar to the one in 1). Any ideas why this happens and how to overcome it? Thanks


Thank you so much Cedric.
I have some particle data on which I placed a grid using your suggested code, with minor corrections. I got the figure attached below.
Q: Instead of a rectangular grid, is it possible to place a circular grid on it? As you can see, the grid is rectangular, and because of this some grid cells fall outside the circle. These cells have no data points. If I can place a circular grid, these empty cells won't be there (that's what I want).
Q: If it is not possible to place a circular grid, can you please suggest another way to remove these empty cells?
I tried using 'nan' to remove them, but the problem is that it also removes the empty cells which are inside the circle. I just want to remove the empty cells which are 'outside' the circular particle data (as you can see, I have marked the outside ones), not the inside ones.
Any help will be highly appreciated. Thank you so much.
My particle data lies in a circular plane only.
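One way this might be approached, sketched below, is to keep the rectangular grid and blank only the cells whose centres lie outside the circle, regardless of whether they are empty. It assumes the particle data fills a circle of radius R centred at the origin; countMap is a placeholder for the real per-cell counts, and isOutside and maskedMap are names introduced only for this example.

```matlab
% Sketch: blank only the cells whose centres lie outside a circle.
% Assumption: the data fills a circle of radius R centred at (0, 0).
R      = 4 ;
nBinsX = 8 ;
nBinsY = 8 ;
xg = linspace( -R, R, nBinsX+1 ) ;
yg = linspace( -R, R, nBinsY+1 ) ;
xc = (xg(1:end-1) + xg(2:end)) / 2 ;     % cell centres along x
yc = (yg(1:end-1) + yg(2:end)) / 2 ;     % cell centres along y
[Xc, Yc] = meshgrid( xc, yc ) ;
countMap = zeros( nBinsY, nBinsX ) ;     % placeholder for the real counts
isOutside = hypot( Xc, Yc ) > R ;        % true where the centre is outside
maskedMap = countMap ;
maskedMap(isOutside) = NaN ;             % empty cells inside stay at 0
figure ;
h = imagesc( xc, yc, maskedMap ) ;
set( h, 'AlphaData', ~isOutside ) ;      % outside cells become transparent
axis xy ; axis equal tight ; colorbar ;
```

Because the mask depends on the cell position and not on the count, empty cells inside the circle are kept. Cells that straddle the boundary are kept whenever their centre is inside; a stricter or looser criterion could be used instead.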

  0 Comments

