Remove duplicate rows in CSV file

mohammad Alsajri on 23 Jul 2019
Commented: mohammad Alsajri on 25 Jul 2019
Hello dear MathWorkers,
I have a dataset consisting of approximately 4 million records, and I want to remove the duplicate rows. Can anyone help me with how to do this? I am using MATLAB R2018a. Thanks in advance.
  Comments: 7
madhan ravi on 24 Jul 2019
Mohammed: Alex's solution should have solved your problem.
mohammad Alsajri on 25 Jul 2019
Thanks for the help, guys.


Accepted Answer

Alex Mcaulley on 23 Jul 2019
Since all of the data is numeric, you can use:
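data = xlsread('kdd.xlsx');      % read the numeric data from the spreadsheet
datanew = unique(data,'rows');   % keep one copy of each distinct row (result is sorted)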
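As a minimal sketch of what this call does, assuming a small made-up matrix in place of the spreadsheet (the values below are hypothetical, not taken from kdd.xlsx): by default unique(...,'rows') returns the distinct rows in sorted order, while adding the 'stable' option keeps them in the order in which they first appear.

```matlab
% Hypothetical sample data: rows 1 and 3 are identical
data = [3 4; 1 2; 3 4; 5 6];

% Default behavior: distinct rows, returned in sorted order
sortedRows = unique(data, 'rows');

% 'stable' option: distinct rows, in order of first appearance
stableRows = unique(data, 'rows', 'stable');
```

If the original row order matters for later processing, the 'stable' variant is probably the one to use; otherwise the default is fine.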
  Comments: 2
Shameer Parmar on 23 Jul 2019
This is not working, because none of the data is similar. I don't find any duplicate entries in the sheet provided by Mohammad Alsajri.
Using your command, 'data' and 'datanew' come out exactly the same.
Alex Mcaulley on 23 Jul 2019
This code works!
I guess the Excel file provided by Mohammad is just a small portion of the dataset (4 million rows).
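One way to check whether a given file actually contains duplicates is to compare the row counts before and after the call. A small sketch, again using hypothetical data in place of the real file (the variable names numRemoved, counts, and repeated are only illustrative):

```matlab
% Hypothetical sample: [3 4] occurs three times, [1 2] twice, [5 6] once
data = [3 4; 1 2; 3 4; 5 6; 1 2; 3 4];

% idx maps every row of data to its row in datanew
[datanew, ~, idx] = unique(data, 'rows');

% Number of rows dropped as duplicates (0 means the file had none)
numRemoved = size(data, 1) - size(datanew, 1);

% How many times each distinct row occurs in data
counts = accumarray(idx, 1);

% The distinct rows that appeared more than once
repeated = datanew(counts > 1, :);
```

If numRemoved is 0, the input simply has no duplicate rows, which would be consistent with a small sample file that happens to contain only distinct records.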


More Answers (0)
