Comparing and removing rows of an array that are within 5% of each other
I have an array which is ~30 million x 14. It is sorted in ascending order of the first element of each row. I am trying to compare each row in the array to the previous row, and remove it if all 14 values are within 5% of the previous row's 14 values. The idea is that, if a row is within 5% of the previous row, I can treat the two as duplicates, and I don't want to include the duplicate in my final data set. Since the array is large, I would prefer to use logical indexing if possible, but I am also willing to use a for loop if necessary.
Answers (1)
Image Analyst
26 Aug 2021
Try this:
data = 10 + rand(6, 4) % Sample data
[rows, columns] = size(data);
% Find the relative differences between each element and the one above it,
% measured against the previous row. The first row has no predecessor,
% so it gets Inf and is never deleted.
percentDifferences = [inf(1, columns); abs(diff(data, 1, 1) ./ data(1:end-1, :))]
% Find out which rows have all differences within 5% of the previous row.
rowsToDelete = all(percentDifferences <= 0.05, 2)
% Do the deletions.
data(rowsToDelete, :) = []
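Note that the vectorized code above compares each row with the row immediately above it in the original array. A long run of slowly drifting rows can therefore be removed entirely, even when the last row of the run differs from the first by much more than 5%. If the comparison should instead be against the last row that was kept, a loop seems to be needed. Here is a minimal sketch under that assumption; the names tol, keep, lastKept, and result, and the sample values, are made up for illustration and are not from the question.

```matlab
% Hypothetical sample data: rows 2 and 3 each drift 3% from the row above.
data = [100 200; 103 206; 106.09 212.18; 150 300];
tol = 0.05;                        % 5% tolerance, assumed relative to the kept row
keep = true(size(data, 1), 1);     % logical mask of rows to retain
lastKept = data(1, :);             % the first row is always kept
for k = 2:size(data, 1)
    relDiff = abs(data(k, :) - lastKept) ./ abs(lastKept);
    if all(relDiff <= tol)
        keep(k) = false;           % near-duplicate of the last kept row
    else
        lastKept = data(k, :);     % this row becomes the new reference
    end
end
result = data(keep, :);            % one logical-indexing step at the end
```

With this sample, rows 1, 3, and 4 are kept, whereas the adjacent-row comparison would also drop row 3. Building a logical mask and indexing once at the end avoids resizing data inside the loop, which should matter for an array with ~30 million rows.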