How to replace parts of the text using regexprep

Konstantinos Tsitsilonis on 5 Nov 2018
Edited: Stephen23 on 5 Nov 2018
Hi all,
I have a very large text file, which I imported as a char vector, and which has the following pattern:
text=NEW SCOMPONENT /JAFHB0099
DESC 'FLANGE F7805 SLIP-ON 10K FF 900A'
GTYP FLAN
PARA 900 1095 56 FBIA BWD 13
END
NEW SCOMPONENT /JAFHB00aa
DESC 'FLANGE F7805 SLIP-ON 10K FF 1100A'
GTYP FLAN
PARA 1100 1225 18 FBIA BWD 14
END
I want to replace the parts after DESC and PARA with some of my own values, e.g.
nDESC = {'Description 1'; 'Description 2'} ;
nPARA = {'1500 15300 20 FBDIA BWD 14' ; '1600 1623 20 FBDIA SWM 13'} ;
For this I wrote the following code, with help from the MATLAB community, who pointed me to the regexp function:
%Extracts what lies after the word PARA in the PARA line & Replaces it with the nPARA
newtext = regexprep(text, 'PARA\s+(\d+\.?\d*\s+\d+\.?\d*\s+\d+\.?\d*\s+\w*\s+\w*\s+\d+\.?\d*)', nPARA) ;
I follow similar logic for DESC.
However, two problems occur.
1. The parentheses do not restrict the match to the part after the word PARA: instead of 1100 1225 18 FBIA BWD 14, I get PARA 1100 1225 18 FBIA BWD 14. I can work around this, so it is not a big deal, but I must be missing something.
2. The result does not replace each matched line with the corresponding string in the cell array. Instead, every matched line is replaced with the last cell of the cell array.
  1 Comment
Stephen23 on 5 Nov 2018
Edited: Stephen23 on 5 Nov 2018
1) regexprep does not replace the tokens; it replaces the entire matched substring, so what you see is the expected behavior. Tokens are entirely optional and can be used in dynamic operations, but it is always the whole match that gets replaced. You could resolve this using a look-around operation.
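To illustrate the look-around suggestion, here is a minimal sketch. The sample text `txt` and the placeholder replacement `'X'` are made up for the demonstration, and the pattern assumes exactly one space after PARA and `\n` line endings.

```matlab
% Hypothetical three-line sample, not the asker's full file.
txt = sprintf('GTYP FLAN\nPARA 900 1095 56 FBIA BWD 13\nEND');

% Plain pattern: 'PARA ' is part of the match, so it is replaced too.
a = regexprep(txt, 'PARA [^\n]*', 'X');

% Look-behind: 'PARA ' must precede the match but is not part of it,
% so only the rest of the line is replaced.
b = regexprep(txt, '(?<=PARA )[^\n]*', 'X');
```

Here `a` loses the PARA keyword while `b` keeps it, which is the difference between replacing the match and replacing only what follows the keyword.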


Accepted Answer

Stephen23 on 5 Nov 2018
Edited: Stephen23 on 5 Nov 2018
This takes a slightly different approach, using regexp and strncmp, and assumes that each command is on its own line. You did not supply an example file, so I created one (attached).
>> nDESC = {'Description 1'; 'Description 2'};
>> nPARA = {'1500 15300 20 FBDIA BWD 14' ; '1600 1623 20 FBDIA SWM 13'};
>> S = fileread('temp1.txt')
S =
NEW SCOMPONENT /JAFHB0099
DESC 'FLANGE F7805 SLIP-ON 10K FF 900A'
GTYP FLAN
PARA 900 1095 56 FBIA BWD 13
END
NEW SCOMPONENT /JAFHB00aa
DESC 'FLANGE F7805 SLIP-ON 10K FF 1100A'
GTYP FLAN
PARA 1100 1225 18 FBIA BWD 14
END
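>> % Split every line into two tokens: the leading keyword and the rest of the line.
>> C = regexp(S,'^\s*([A-Z]+\s*)(.*)$','tokens','dotexceptnewline','lineanchors');
>> C = vertcat(C{:}).'; % 2-by-N cell: row 1 = keywords, row 2 = remainders
>> C(2,strncmp(C(1,:),'DESC',4)) = nDESC; % overwrite the text after each DESC
>> C(2,strncmp(C(1,:),'PARA',4)) = nPARA; % overwrite the text after each PARA
>> Z = sprintf('\n%s%s',C{:}); % rejoin, prefixing each line with a newline
>> Z = Z(2:end) % drop the leading newline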
Z =
NEW SCOMPONENT /JAFHB0099
DESC Description 1
GTYP FLAN
PARA 1500 15300 20 FBDIA BWD 14
END
NEW SCOMPONENT /JAFHB00aa
DESC Description 2
GTYP FLAN
PARA 1600 1623 20 FBDIA SWM 13
END
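
As an alternative sketch (a different technique from the code above, not a replacement for it), the one-replacement-per-match behavior can also be obtained by splitting the text around the matches and interleaving the new values. The shortened sample `txt` is hypothetical, and the pattern assumes exactly one space after PARA and `\n` line endings.

```matlab
% Hypothetical shortened sample with two PARA lines.
txt = sprintf('%s\n', ...
    'PARA 900 1095 56 FBIA BWD 13', 'END', ...
    'PARA 1100 1225 18 FBIA BWD 14', 'END');
nPARA = {'1500 15300 20 FBDIA BWD 14'; '1600 1623 20 FBDIA SWM 13'};

% 'match' returns the text after each PARA; 'split' returns the text
% between the matches (one more piece than there are matches).
[old, rest] = regexp(txt, '(?<=PARA )[^\n]*', 'match', 'split');
assert(numel(old) == numel(nPARA), 'Need one replacement per PARA line.');

% Interleave rest{1}, nPARA{1}, rest{2}, nPARA{2}, ..., rest{end}.
parts = [rest; [nPARA(:).', {''}]];
newtxt = [parts{:}];
```

The same steps could be repeated with a DESC pattern. The element-count check guards against a mismatch between the number of PARA lines and the number of replacement strings.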
