Identify Duplicate values in an array and replace with Nan

조회 수: 18 (최근 30일)
Hello: I have a array as below of dimension 40,000x3. Third columns often contain duplicate values. I need to identify duplicate values and replace with Nan. Kindly help.
2.39300000000000 0 6.16800000000000
2.38720000000000 0 6.16800000000000
2.38480000000000 0 6.16800000000000
2.37380000000000 0 6.16800000000000
2.37410000000000 0 6.16800000000000
2.37020000000000 0 6.16800000000000
2.36880000000000 0 6.16800000000000
2.36350000000000 0 6.16800000000000

채택된 답변

Mrutyunjaya Hiremath
Mrutyunjaya Hiremath 2023년 9월 7일
If you want to replace only the duplicates with 'NaN' and keep one occurrence of each value intact, here is the code:
% Sample array (Replace this with your 40,000x3 array)
array = [
2.3930, 0, 6.1680;
2.3872, 0, 6.1680;
2.3848, 0, 6.1780;
2.3738, 0, 6.1680;
2.3741, 0, 6.1690;
2.3702, 0, 6.1780;
2.3688, 0, 6.1690;
2.3635, 0, 6.1780;
];
% Extract the third column
third_col = array(:, 3);
% Find unique values and their first occurrence index
[unique_vals, ~, ic] = unique(third_col);
% Count the occurrence of each unique value
counts = accumarray(ic, 1);
% Identify values that occur more than once (duplicates)
duplicate_vals = unique_vals(counts > 1);
% Replace only duplicates with NaN, keep one occurrence of each value
for val = duplicate_vals'
idx = find(third_col == val);
third_col(idx(2:end)) = NaN; % Keep the first occurrence, replace the rest with NaN
end
% Update the third column in the original array
array(:, 3) = third_col;
% Display the updated array
disp(array);
2.3930 0 6.1680 2.3872 0 NaN 2.3848 0 6.1780 2.3738 0 NaN 2.3741 0 6.1690 2.3702 0 NaN 2.3688 0 NaN 2.3635 0 NaN
In this code, for each duplicate value, find its indices in the third column using 'find'. Then, keep the first occurrence (index idx(1)) and replace the rest (idx(2:end)) with NaN. This will leave one instance of each value in the third column and replace only the duplicates with 'NaN'.

추가 답변 (1개)

Dyuman Joshi
Dyuman Joshi 2023년 9월 7일
편집: Dyuman Joshi 2023년 9월 7일
Here's a much faster and simpler approach -
array = [2.3930, 0, 6.1680;
2.3872, 0, 6.1680;
2.3848, 0, 6.1780;
2.3738, 0, 6.1680;
2.3741, 0, 6.1690;
2.3702, 0, 6.1780;
2.3688, 0, 6.1690;
2.3635, 0, 6.1780];
%Get the unique values and the indices corresponding to their 1st occurence
%in order they appear in the array
[val,first_idx] = unique(array(:,3),'stable')
val = 3×1
6.1680 6.1780 6.1690
first_idx = 3×1
1 3 5
%Convert all the values of column 3 to NaN
array(:,3) = NaN;
%Re-assign the values according to the indices
array(first_idx,3) = val
array = 8×3
2.3930 0 6.1680 2.3872 0 NaN 2.3848 0 6.1780 2.3738 0 NaN 2.3741 0 6.1690 2.3702 0 NaN 2.3688 0 NaN 2.3635 0 NaN

카테고리

Help CenterFile Exchange에서 Matrices and Arrays에 대해 자세히 알아보기

태그

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by