Comparing and removing rows of an array that are within 5% of each other

Michael Costa on 26 Aug 2021
Answered: Image Analyst on 26 Aug 2021
I have an array that is roughly 30 million x 14, sorted in ascending order by the first element of each row. I want to compare each row with the previous row and remove it if all 14 values are within 5% of the previous row's corresponding values. The idea is that a row within 5% of the previous row can be treated as a duplicate, and I don't want duplicates in my final data set. Since the array is large, I would prefer to use logical indexing if possible, but I am also willing to use a for loop if necessary.

Answers (1)

Image Analyst on 26 Aug 2021
Try this:
data = 10 + rand(6, 4) % Sample data
% Find the relative difference between each element and the one above it,
% measured against the previous row's value.
percentDifferences = abs(diff(data, 1) ./ data(1:end-1, :))
% Find out which rows have all values within 5% of the previous row.
% The first row has no previous row, so it is never flagged.
rowsToDelete = [false; all(percentDifferences <= 0.05, 2)]
% Do the deletions.
data(rowsToDelete, :) = []
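
One thing to watch for: the code above compares each row with its immediate neighbour in the original array, so a slow drift (each row within 5% of the one before it) can delete a whole run of rows even when the last one is far from the first. If what you want is to compare against the last row that was actually kept, I think a for loop is needed. Here is a minimal sketch of that variant, not a tuned solution. The sample values and the names tolerance, keep, lastKept and filtered are made up for illustration.

```matlab
% Sketch: compare each row with the last row that was KEPT,
% rather than with its immediate predecessor.
data = [100 200; 103 204; 107 209; 150 300];   % made-up sample data
tolerance = 0.05;                               % 5% relative tolerance
keep = false(size(data, 1), 1);
keep(1) = true;              % the first row has nothing to be compared against
lastKept = data(1, :);
for k = 2:size(data, 1)
    % Relative difference of row k from the last retained row.
    relDiff = abs(data(k, :) - lastKept) ./ abs(lastKept);
    if ~all(relDiff <= tolerance)
        % At least one value differs by more than 5%, so keep this row.
        keep(k) = true;
        lastKept = data(k, :);
    end
end
filtered = data(keep, :);    % a single logical-indexing step at the end
```

With this sample data, row 3 is within 5% of row 2 but not of row 1, so the loop keeps it, whereas the neighbour-to-neighbour comparison would delete it. Building the logical vector keep and indexing once at the end avoids repeatedly resizing a 30 million row array inside the loop.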

Release

R2019b
