How to remove multiple outlier data in a rectangular maze?

์กฐํšŒ ์ˆ˜: 2 (์ตœ๊ทผ 30์ผ)
Atanu
Atanu 2022๋…„ 6์›” 10์ผ
๋Œ“๊ธ€: Atanu 2022๋…„ 6์›” 16์ผ
I have this rodent x,y coordinates as described in the attached picture. There are some outliers in the data, I need to remove them. I have to dynamically define a zone based on the extreme coordinates in the quardrants.
For example, let's say I need to remove the outlier data circled in red. The datapoint is in Maze4. I have attached the data for Maze4. I want to remove the bins where histcounts2 is < 2. I also need the 'xcoordinates2' and 'ycoordinates2' array after cleaning the outliers. I tried this so far.
h4 = histogram2(Maze4.xcoordinates2, Maze4.ycoordinates2, ...
nbins,'DisplayStyle','tile','ShowEmptyBins','on');
counts4 = histcounts2(Maze4.xcoordinates2, Maze4.ycoordinates2, 25);
index4 = h4.Values(counts4<2);
But the index4 gives me 1-D array. How do I solve this?

์ฑ„ํƒ๋œ ๋‹ต๋ณ€

Image Analyst
Image Analyst 2022๋…„ 6์›” 10์ผ
First try to avoid the problem by not letting your rodents escape from your maze! ๐Ÿญ๐Ÿน๐Ÿ๐Ÿ€๐Ÿคฃ
Then if you know your x coordinates must be between -80 and 80, use masking to extract only those good indexes:
% Find out which indexes are outside the -80 to 80 range.
mask = abs(x <= 80);
% Extract only those good indexes.
x = x(mask);
y = y(mask);
  ๋Œ“๊ธ€ ์ˆ˜: 5
Image Analyst
Image Analyst 2022๋…„ 6์›” 14์ผ
How about this, where you take all x data between the 5% and 95% points?
s = load('maze4.mat')
% Extract the table
t = s.Maze4
xOriginal = t.xcoordinates2;
yOriginal = t.ycoordinates2;
subplot(2, 2, 1)
plot(xOriginal, yOriginal, 'r.', 'MarkerSize', 8)
grid on
title('Original Data')
subplot(2, 2, 2)
[counts, edges] = histcounts(xOriginal, 100)
bar(edges(1:end-1), counts, 1)
grid on;
title('Histogram of X values')
theCDF = rescale(cumsum(counts) / sum(counts), 0, 100);
subplot(2, 2, 4)
plot(edges(1:end-1), theCDF, 'b-', 'LineWidth', 2)
title('CDF of Histogram')
grid on;
% Find the index of the 5% and 95% points
index1 = find(theCDF > 5, 1, 'first')
x1 = edges(index1)
xline(edges(index1), 'LineWidth', 2, 'Color', 'r')
% Find the index of the 5% and 95% points
index2 = find(theCDF > 95, 1, 'first')
x2 = edges(index2)
xline(edges(index2), 'LineWidth', 2, 'Color', 'r')
% Get a mask for the x values we want to exclude
indexesToKeep = t.xcoordinates2 >= x1 & t.xcoordinates2 <= x2;
xCleaned = xOriginal(indexesToKeep);
yCleaned = yOriginal(indexesToKeep);
subplot(2, 2, 3)
plot(xCleaned, yCleaned, 'r.', 'MarkerSize', 8)
grid on
title('Cleaned Data')
Or else you can use a clustering algorithm like dbscan. Demo attached.
Atanu
Atanu 2022๋…„ 6์›” 16์ผ
Very elegant! Thank you so much!

๋Œ“๊ธ€์„ ๋‹ฌ๋ ค๋ฉด ๋กœ๊ทธ์ธํ•˜์‹ญ์‹œ์˜ค.

์ถ”๊ฐ€ ๋‹ต๋ณ€ (0๊ฐœ)

์นดํ…Œ๊ณ ๋ฆฌ

Help Center ๋ฐ File Exchange์—์„œ Loops and Conditional Statements์— ๋Œ€ํ•ด ์ž์„ธํžˆ ์•Œ์•„๋ณด๊ธฐ

์ œํ’ˆ


๋ฆด๋ฆฌ์Šค

R2022a

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by