How can I find duplicate first four columns?
이 질문을 팔로우합니다.
- 팔로우하는 게시물 피드에서 업데이트를 확인할 수 있습니다.
- 정보 수신 기본 설정에 따라 이메일을 받을 수 있습니다.
오류 발생
페이지가 변경되었기 때문에 동작을 완료할 수 없습니다. 업데이트된 상태를 보려면 페이지를 다시 불러오십시오.
이전 댓글 표시
0 개 추천
Hello!
My excel file format is as follows:
- A column: Time
- B column: Year
- C column: Day of the year
- D column: Hour
There are many duplicate values at the first four column in this excel file. For example, 4101 and 4102 row are same values in the first four columns. 4114 and 4115, 4127 and 4128 and so forth...And I'd like to remove upper row such as 4101, 4114, 4127 etc...
How can I find duplicate A column and just remove upper row?
채택된 답변
Image Analyst
2018년 8월 20일
Not sure if it's the most compact code, but I think it's simple and intuitive, and I think this will work:
[numbers, strings, raw] = xlsread('test.xlsx');
col1 = numbers(:, 1);
[C, ia, ic] = unique(col1);
fprintf('Length of column 1 = %d.\nNumber of unique numbers in column 1 = %d.\n', ...
length(col1), length(C));
% Scan down array and keep track of the first time we encounter anumber.
rowsToDelete = zeros(1, length(C));
for k = 1 : length(C)
% Find out how many times this unique number occurs.
indexes = (col1 == C(k));
numOccurrences = sum(indexes);
fprintf('For k = %d, found %d occurrences of %d\n', k, numOccurrences, C(k));
% More than 2? Then delete the first one.
if numOccurrences >= 2
% If there are two or more, delete the first one ONLY.
rowsToDelete(k) = find(indexes, 1, 'first');
end
end
% Remove zeros
rowsToDelete(rowsToDelete <= 0) = [];
col1(rowsToDelete) = [];
fprintf('Deleted these rows: ');
fprintf('%d ', rowsToDelete);
fprintf('\nAfter removing first numbers we now have %d numbers.\n', ...
length(col1));
댓글 수: 10
Suat YAZICI
2018년 8월 20일
편집: Suat YAZICI
2018년 8월 20일
I checked many duplicated values. Your coding works very well for first column.
I just try compherending this code.
Is edited column A 'C variable' now?
col1 contains the vector with the first duplicate rows deleted. You can delete the same rows from the other columns if you want.
This code removes duplicated A-th column. How to remove with respect to duplicated variables name A-th row like A3685:I3685, A4114:I4114 and so on?
I don't understand. I thought I gave you what you want. So if A = [2,4,6,6,6,7,8,8] the output will be [2,4,6,6,7,8] which is the same thing but with the first element (only) of duplicated runs being removed. Please give a small example, not one with tens of thousands of rows, and say which rows specifically are to be removed.
For example,
- test(28x10) matrix
In addition to my question, I can see deleted these rows are 10th(10,1), 14th(14,1) and 27th(27,1) when I use your coding but delete these rows should be like (10, :), (14, :) and (27, :). I mean all lines about duplicated these rows.
Sorry, I don't understand your grammar. You say my code deletes rows 10, 14, and 27, and it should delete rows 10, 14, and 27 (the same). So it's working, right?
Sorry for misunderstanding. I tried to say that it deleted as rows 10,14,27 and columns 1. For example, I'd like to delete A10:J10 (rows 10, colums :), A14:J14, A27:J27 for this file.
Just say
numbers(rowsToDelete) = []
to delete those rows in all columns.
Suat YAZICI
2018년 8월 20일
편집: Suat YAZICI
2018년 8월 20일
Sorry but it doesn't work. Why didn't I get edited 25x10 matrix instead of 25x1 matrix because of col1 = numbers(:, 1) ?
I was just using col1 as a convenience to figure out what rows need to be deleted. Once I figured that out, I applied it to all columns in "numbers" to delete those rows from every column. If numbers is 10 columns, then it should still be 10 columns after deleting some rows. It just won't have as many rows. If you didn't get a new height for numbers, then you didn't use the code from my last comment where I deleted the rows from numbers.
추가 답변 (0개)
카테고리
도움말 센터 및 File Exchange에서 Loops and Conditional Statements에 대해 자세히 알아보기
태그
참고 항목
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!웹사이트 선택
번역된 콘텐츠를 보고 지역별 이벤트와 혜택을 살펴보려면 웹사이트를 선택하십시오. 현재 계신 지역에 따라 다음 웹사이트를 권장합니다:
또한 다음 목록에서 웹사이트를 선택하실 수도 있습니다.
사이트 성능 최적화 방법
최고의 사이트 성능을 위해 중국 사이트(중국어 또는 영어)를 선택하십시오. 현재 계신 지역에서는 다른 국가의 MathWorks 사이트 방문이 최적화되지 않았습니다.
미주
- América Latina (Español)
- Canada (English)
- United States (English)
유럽
- Belgium (English)
- Denmark (English)
- Deutschland (Deutsch)
- España (Español)
- Finland (English)
- France (Français)
- Ireland (English)
- Italia (Italiano)
- Luxembourg (English)
- Netherlands (English)
- Norway (English)
- Österreich (Deutsch)
- Portugal (English)
- Sweden (English)
- Switzerland
- United Kingdom (English)
