split a row into 2 rows
cg00008493 0.987979722052904 "COX8C;KIAA1409" 14 93813777 0.986128428295584 "COX8C;KIAA1409" 14 93813777
cg00031162 0.378288688845672 "TNFSF12;TNFSF12-TNFSF13" 17 7453377 0.362510745266914 "TNFSF12;TNFSF12-TNFSF13" 17 7453377
Here are 2 lines, each with 9 columns. I want to split every line whose gene column holds two entries, such as "COX8C;KIAA1409", into 2 rows and delete the duplicated columns. The output should look like this:
cg00008493 0.987979722052904 COX8C 0.986128428295584
cg00008493 0.987979722052904 KIAA1409 0.986128428295584
cg00031162 0.378288688845672 TNFSF12 0.362510745266914
cg00031162 0.378288688845672 TNFSF12-TNFSF13 0.362510745266914
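fid = fopen('COADREAD_methylation.txt','r');
data = {};
while ~feof(fid)
    l = fgetl(fid);
    % keep only the lines that do not contain 'NA'
    if isempty(strfind(l,'NA')), data=[data;{l}]; end
    % attempted split of the line; this call is not valid reshape syntax
    a = reshape(l, ',','""', [])';
end
fid = fclose(fid);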
Note: I test for 'NA' to skip the lines that contain NA values.
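One thing to watch with the `strfind(l,'NA')` test is that it matches the two letters anywhere in the line, so a line with a gene name containing "NA" would also be dropped. A minimal sketch of a stricter test follows, assuming missing values appear as the standalone whitespace-delimited token `NA`. The sample lines and the helper name `isMissing` are hypothetical and only serve the illustration.

```matlab
% Hypothetical sample lines: the first has a gene name that contains "NA",
% the second has NA as a standalone value.
lines = { ...
    'cg00000001 0.5 "DNAJB1" 19 100 0.4 "DNAJB1" 19 100', ...
    'cg00000002 NA "COX8C" 14 200 NA "COX8C" 14 200'};

% Substring test, as in the loop above: flags both lines.
hasSubstring = ~cellfun(@isempty, strfind(lines, 'NA'));

% Assumption: a missing value is the token NA surrounded by whitespace
% or by the start/end of the line.
isMissing = @(s) ~isempty(regexp(s, '(^|\s)NA(\s|$)', 'once'));

% Keep only the lines without a standalone NA token.
keep = lines(~cellfun(isMissing, lines));
```

With these two lines, `hasSubstring` flags both, while `keep` retains only the first one.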
Accepted Answer
Stephen23
16 Feb 2017
Edited: Stephen23
17 Feb 2017
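opt = {'CollectOutput',true};
% read ID, value and quoted gene list, skip two integers, read the second
% value, then skip the duplicated gene list and the last two integers
inp = '%s%s%q%*d%*d%s%*q%*d%*d';
out = '%s\t%s\t%s\t%s\n';
f1d = fopen('temp1.txt','rt'); % the original file
f2d = fopen('temp2.txt','wt'); % the new file
while ~feof(f1d)
    C = textscan(f1d,inp,1,opt{:}); % parse one line at a time
    C = [C{:}];                     % flatten to a 1x4 cell array
    D = regexp(C{3},';','split');   % split the gene list at each semicolon
    for k = 1:numel(D)
        % write one output row per gene
        fprintf(f2d,out,C{1:2},D{k},C{4});
    end
end
fclose(f1d);
fclose(f2d);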
Produces this output file:
cg00008493 0.987979722052904 COX8C 0.986128428295584
cg00008493 0.987979722052904 KIAA1409 0.986128428295584
cg00031162 0.378288688845672 TNFSF12 0.362510745266914
cg00031162 0.378288688845672 TNFSF12-TNFSF13 0.362510745266914
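To see what each part of the format string does, the same parsing can be tried on a single line held in a char vector, which `textscan` also accepts in place of a file identifier. This is only a sketch for experimenting with the format; the variable names `txt` and `genes` are not from the answer.

```matlab
% One sample line from the question, held in a char vector (no file needed).
txt = 'cg00008493 0.987979722052904 "COX8C;KIAA1409" 14 93813777 0.986128428295584 "COX8C;KIAA1409" 14 93813777';

% %s        probe ID
% %s        first value, kept as text so no digits are lost
% %q        quoted gene list (the surrounding quotes are dropped)
% %*d%*d    two integer columns, skipped
% %s        second value, kept as text
% %*q%*d%*d duplicated gene list and two integers, skipped
inp = '%s%s%q%*d%*d%s%*q%*d%*d';

C = textscan(txt, inp, 1, 'CollectOutput', true);
C = [C{:}];                          % 1x4 cell array of char vectors
genes = regexp(C{3}, ';', 'split');  % one cell per gene name

% Print one tab-separated row per gene.
for k = 1:numel(genes)
    fprintf('%s\t%s\t%s\t%s\n', C{1:2}, genes{k}, C{4});
end
```

Run on this line, the loop should print the two cg00008493 rows listed in the output above.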
Tested on this input file:
Comments: 18
Stephen23
22 Feb 2017
If textscan has an empty output then you probably need to check the format string.
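One way to make an empty result visible instead of hitting an indexing error on `C{3}` is to test the parsed cell before using it. The sketch below is a variation on the loop from the answer, not the original code: the temporary file names, the `nRows` counter and the guard are additions for illustration, and the input is the two sample lines from the question followed by a blank line.

```matlab
% Hypothetical temporary files, used only for this illustration.
inFile  = [tempname '.txt'];
outFile = [tempname '.txt'];

% Write the two sample lines from the question, then a blank line.
fid = fopen(inFile, 'wt');
fprintf(fid, '%s\n', 'cg00008493 0.987979722052904 "COX8C;KIAA1409" 14 93813777 0.986128428295584 "COX8C;KIAA1409" 14 93813777');
fprintf(fid, '%s\n', 'cg00031162 0.378288688845672 "TNFSF12;TNFSF12-TNFSF13" 17 7453377 0.362510745266914 "TNFSF12;TNFSF12-TNFSF13" 17 7453377');
fprintf(fid, '\n');
fclose(fid);

inp = '%s%s%q%*d%*d%s%*q%*d%*d';
out = '%s\t%s\t%s\t%s\n';
f1d = fopen(inFile, 'rt');
f2d = fopen(outFile, 'wt');
nRows = 0;                            % number of rows written
while ~feof(f1d)
    C = textscan(f1d, inp, 1, 'CollectOutput', true);
    C = [C{:}];
    % Guard: stop if nothing was parsed, for example at a trailing blank
    % line or when the format string does not match the line.
    if isempty(C) || isempty(C{1})
        break
    end
    D = regexp(C{3}, ';', 'split');
    for k = 1:numel(D)
        fprintf(f2d, out, C{1:2}, D{k}, C{4});
        nRows = nRows + 1;
    end
end
fclose(f1d);
fclose(f2d);

result = fileread(outFile);           % contents of the new file
delete(inFile);
delete(outFile);
```

If the guard triggers on the very first line of a real file, that points to the format string (or the delimiter) not matching the data, so comparing `inp` against one line of the file is the next thing to check.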