split a row into 2 rows

chocho on 16 Feb 2017
Commented: chocho on 22 Feb 2017
cg00008493 0.987979722052904 "COX8C;KIAA1409" 14 93813777 0.986128428295584 "COX8C;KIAA1409" 14 93813777
cg00031162 0.378288688845672 "TNFSF12;TNFSF12-TNFSF13" 17 7453377 0.362510745266914 "TNFSF12;TNFSF12-TNFSF13" 17 7453377
Here are 2 lines, each with 9 columns. I want to split every line whose gene column holds 2 names, such as "COX8C;KIAA1409", into 2 rows and delete the duplicated columns. The output should look like this:
cg00008493 0.987979722052904 COX8C 0.986128428295584
cg00008493 0.987979722052904 KIAA1409 0.986128428295584
cg00031162 0.378288688845672 TNFSF12 0.362510745266914
cg00031162 0.378288688845672 TNFSF12-TNFSF13 0.362510745266914
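fid = fopen('COADREAD_methylation.txt','r');
data = {};
while ~feof(fid)
    l = fgetl(fid);
    % keep only the lines that do not contain 'NA'
    if isempty(strfind(l,'NA')), data = [data;{l}]; end
    % my attempt to split on ',' and '"' (reshape does not accept these arguments)
    a = reshape(l, ',','""', [])';
end
fid = fclose(fid);
Note: I test for 'NA' to skip the lines that contain NA.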

Accepted Answer

Stephen23 on 16 Feb 2017
Edited: Stephen23 on 17 Feb 2017
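opt = {'CollectOutput',true};      % merge the four text fields into one cell array
inp = '%s%s%q%*d%*d%s%*q%*d%*d';   % keep fields 1, 2, 3 and 6; skip the rest
out = '%s\t%s\t%s\t%s\n';          % tab-separated output row
f1d = fopen('temp1.txt','rt'); % the original file
f2d = fopen('temp2.txt','wt'); % the new file
while ~feof(f1d)
    C = textscan(f1d,inp,1,opt{:}); % read one line
    C = [C{:}];                     % 1x4 cell of char vectors
    D = regexp(C{3},';','split');   % split the gene names at ';'
    for k = 1:numel(D)
        % one output row per gene name
        fprintf(f2d,out,C{1:2},D{k},C{4});
    end
end
fclose(f1d);
fclose(f2d);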
Produces this output file:
cg00008493 0.987979722052904 COX8C 0.986128428295584
cg00008493 0.987979722052904 KIAA1409 0.986128428295584
cg00031162 0.378288688845672 TNFSF12 0.362510745266914
cg00031162 0.378288688845672 TNFSF12-TNFSF13 0.362510745266914
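To see what the format string in the code above does, it may help to run textscan on a single line held in a character vector instead of a file. The following is only a minimal sketch: the variable names (txt, fmt, probe, value1, value2, genes) are made up for illustration, and the input is the first line from the question.

```matlab
% One input line, copied from the question.
txt = ['cg00008493 0.987979722052904 "COX8C;KIAA1409" 14 93813777 ' ...
       '0.986128428295584 "COX8C;KIAA1409" 14 93813777'];

% %s  -> whitespace-delimited text (probe ID, values kept as text)
% %q  -> double-quoted text, returned without the quotes
% %*d -> integer that is read but not returned
% %*q -> double-quoted text that is read but not returned
fmt = '%s%s%q%*d%*d%s%*q%*d%*d';

C = textscan(txt, fmt);                    % 1x4 cell, one entry per kept field
probe  = C{1}{1};
value1 = C{2}{1};
genes  = regexp(C{3}{1}, ';', 'split');    % cell array of gene names
value2 = C{4}{1};

% Print one tab-separated row per gene name.
for k = 1:numel(genes)
    fprintf('%s\t%s\t%s\t%s\n', probe, value1, genes{k}, value2);
end
```

Reading the two numeric values with %s keeps them as text, so they are written out with exactly the digits that were in the input.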
Tested on this input file:
Comments: 18
Stephen23 on 22 Feb 2017
If textscan has an empty output then you probably need to check the format string.
chocho on 22 Feb 2017
Could you tell me how to write the format for this line? cg00000292 0.511852232819811 ATP2A1 0.787687855895422 0.51208122605745 0.599610258157912 0.568034757766559
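One possible format for a line like this, offered as a sketch and not as a reply from the thread: it assumes that every line has exactly seven whitespace-separated fields, that the third field is a single unquoted gene name, and that the other values are plain numbers. The variable names are hypothetical.

```matlab
% The example line from the comment above.
txt = ['cg00000292 0.511852232819811 ATP2A1 0.787687855895422 ' ...
       '0.51208122605745 0.599610258157912 0.568034757766559'];

% %s -> text field (probe ID, gene name); %f -> floating-point number
fmt = '%s%f%s%f%f%f%f';

C = textscan(txt, fmt);      % 1x7 cell, one entry per field
probe = C{1}{1};             % 'cg00000292'
gene  = C{3}{1};             % 'ATP2A1'
vals  = [C{[2 4:7]}];        % the five numeric values as a 1x5 double
```

If the numbers should be written back out unchanged, reading them with %s instead of %f, as in the accepted answer, avoids any change in the printed digits.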
