Why are some csv files imported incorrectly into my cell array?

lil brain on 13 Dec 2022
Edited: Stephen23 on 14 Dec 2022
Hi,
I have a cell array called alldata which contains the contents of 24 csv files. However, the last five files (for example 5422_task.csv) have been imported incorrectly: the first column contains two values (separated by a comma) with an apostrophe in front.
alldata{1, 24}
ans =
1216×3 cell array
{'media_open; media_play; medi…'} {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.Ut…'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value,-100' } {1×1 missing } {1×1 missing }
{'Maximum Value,100' } {1×1 missing } {1×1 missing }
{'Number of Steps,9' } {1×1 missing } {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{'10.5,96.09' } {1×1 missing } {1×1 missing }
{'10.75,96.09' } {1×1 missing } {1×1 missing }
{'11,96.09' } {1×1 missing } {1×1 missing }
{'11.25,96.16375' } {1×1 missing } {1×1 missing }
{'11.5,96.45875' } {1×1 missing } {1×1 missing }
On the other hand, all the other csv files (for example 1311_task.csv) have been imported correctly, so that the two values that were separated by a comma appear in the first two columns.
alldata{1, 1}
ans =
682×3 cell array
{'media_open; media_play; medi…'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Q…'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
Any idea why this might be the case?
Thank you!

Accepted Answer

Voss on 13 Dec 2022
"Any idea why this might be the case?"
It's because the different files have commas and semicolons in different places, e.g. line 10 of 1311_task.csv looks like this:
1;0.78;
but line 10 of 5422_task.csv looks like this:
10.5,96.09;;
So in one file you've got a semicolon after each number, and in the other file a comma in between the numbers and two semicolons at the end of the line.
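One way to confirm this kind of mismatch is to count the delimiter characters on the raw lines before importing anything. The following is a minimal sketch, not part of the original answer: it writes the two lines quoted above into temporary files (an assumption made only so the example runs on its own) and counts commas and semicolons with readlines and count. The variable names commaCounts and semiCounts are illustrative.

```matlab
% Sample lines copied from the two files discussed above.
samples = ["1;0.78;", "10.5,96.09;;"];
commaCounts = zeros(1, numel(samples));
semiCounts  = zeros(1, numel(samples));

for k = 1:numel(samples)
    % Write the sample line to a temporary file (stand-in for a real csv file).
    fname = [tempname '.csv'];
    fid = fopen(fname, 'w');
    fprintf(fid, '%s\n', samples(k));
    fclose(fid);

    % Read the raw text, one string per line (readlines needs R2020b or later).
    lines = readlines(fname);
    lines = lines(strlength(lines) > 0);   % drop the trailing empty line

    % Count each candidate delimiter on the first line.
    commaCounts(k) = count(lines(1), ",");
    semiCounts(k)  = count(lines(1), ";");

    delete(fname);
end

disp([commaCounts; semiCounts])   % row 1: commas, row 2: semicolons
```

To check the real files, the temporary file can be replaced by a call such as readlines('5422_task.csv') and the counts inspected for line 10.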
I don't know what function(s) you're using to import the files, but here's an attempt to handle both of those situations with one piece of code:
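files = {'1311_task.csv' '5422_task.csv'};
C = cell(1,numel(files));
for ii = 1:numel(files)
    % treat both ',' and ';' as delimiters and merge consecutive delimiters into one
    C{ii} = readcell(files{ii},'Delimiter',{',' ';'},'ConsecutiveDelimitersRule','join');
end
C{:} % display both imported cell arrays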
ans = 682×3 cell array
{'media_open; media_play; media_end'} {'2022/09/19 14:42:27:371' } {' 2022/09/19 14:54:07:167"'}
{'Multimedia File' } {'com.UtrechtUniversity.XRPS_Quest-20220919-135434.mkv'} {1×1 missing }
{'Lower Label' } {'Negative Affect' } {1×1 missing }
{'Upper Label' } {'Positive Affect' } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second' } {'Rating' } {1×1 missing }
{'%%%%%%' } {'%%%%%%' } {1×1 missing }
{[ 1]} {[ 0.7800]} {1×1 missing }
{[ 2]} {[ 0.8975]} {1×1 missing }
{[ 3]} {[ 0.7800]} {1×1 missing }
{[ 4]} {[ 0.7800]} {1×1 missing }
{[ 5]} {[ 0.8385]} {1×1 missing }
{[ 6]} {[ 0.7800]} {1×1 missing }
{[ 7]} {[ 0.7800]} {1×1 missing }
ans = 1216×3 cell array
{'media_open; media_play; media_end,"2022/09/23 15:06:11:215' } {'2022/09/23 15:06:18:984'} {' 2022/09/23 15:11:37:652"'}
{'Multimedia File,"task_com.UtrechtUniversity.XRPS_Quest-20220923-142855.mkv"'} {1×1 missing } {1×1 missing }
{'Lower Label,"Weak Presence"' } {1×1 missing } {1×1 missing }
{'Upper Label,"Strong Presence"' } {1×1 missing } {1×1 missing }
{'Minimum Value' } {[ -100]} {1×1 missing }
{'Maximum Value' } {[ 100]} {1×1 missing }
{'Number of Steps' } {[ 9]} {1×1 missing }
{'Second,"Rating"' } {1×1 missing } {1×1 missing }
{'%%%%%%,"%%%%%%"' } {1×1 missing } {1×1 missing }
{[ 10.5000]} {[ 96.0900]} {1×1 missing }
{[ 10.7500]} {[ 96.0900]} {1×1 missing }
{[ 11]} {[ 96.0900]} {1×1 missing }
{[ 11.2500]} {[ 96.1637]} {1×1 missing }
{[ 11.5000]} {[ 96.4587]} {1×1 missing }
{[ 11.7500]} {[ 96.0900]} {1×1 missing }
{[ 12]} {[ 96.0900]} {1×1 missing }
As you can see there, the header info (lines 1-9) is not parsed the same between the two files, but the data section (lines 10-end) is, so maybe that's good enough?
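If only the data section matters, the numeric rows can be separated from the inconsistently parsed header. The sketch below is one possible approach, not something from the original answer: it uses a small hand-built cell array c as a stand-in for one element of C (a few rows copied from the output above) and keeps the rows whose first two cells are numeric scalars. The names c, isDataRow and data are illustrative.

```matlab
% Stand-in for one imported file: two header rows, then three data rows.
c = {'Second,"Rating"', missing, missing;
     '%%%%%%,"%%%%%%"', missing, missing;
     10.5,  96.09, missing;
     10.75, 96.09, missing;
     11,    96.09, missing};

% A row counts as data when both of its first two cells hold numeric scalars.
% Header text (char) and missing values both fail the isnumeric test.
isNum = cellfun(@(x) isnumeric(x) && isscalar(x), c(:, 1:2));
isDataRow = all(isNum, 2);

% Collect the data rows into an N-by-2 double matrix: [second, rating].
data = cell2mat(c(isDataRow, 1:2));
disp(data)
```

Applied to the real import, c would be replaced by C{ii}. This assumes that no header row has numbers in both of its first two columns, which holds for the header rows shown above.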
Comments: 5
lil brain on 14 Dec 2022
It seems that this error appears no matter what files I select. It is always the first file in the list though.
Stephen23 on 14 Dec 2022
"Why is that?"
Forgot the path, see fixed code.
