필터 지우기
필터 지우기

how to remove punctuation from Arabic text file

조회 수: 2 (최근 30일)
Fateme Jalali
Fateme Jalali 2016년 7월 31일
편집: Thorsten 2016년 8월 1일
Hello,I have a Arabic string and want to discard all punctuations. I want to keep only text and white space between words.For example this is my string: str='سلام. دوست خوب من!'. can I change codes below to do it?
str= fileread('D:/docc111.txt');
str1 = regexprep(str,'\s+',' ');%replace enter with white space
%or str1 = regexprep(str,'[\n\r]+',' ')
%str1 = 'Hello, I need 1 MATLAB code to discard all punctuation, and signs from 9 text files.'
Lstr1=length(str1);
str_space='\s'; %String of characters
str_caps='[A-Z]';
str_ch='[a-z]';
str_nums='[0-9]';
ind_space=regexp(str1,str_space);%Match regular expression
ind_caps=regexp(str1,str_caps);
ind_chrs=regexp(str1,str_ch);
ind_nums=regexp(str1,str_nums);
mask=[ind_space ind_caps ind_chrs ind_nums];
num_str2=1:1:Lstr1;
num_str2(mask)=[];
str3=str1;
str3(num_str2)=[];
chars = [str3];
%insert space after first index and after last index in chars
charsWithWhitespace = [' ', chars(1:end), ' '];
newTest = sprintf(strrep(charsWithWhitespace, '\n', ' '));
fid = fopen('myySE1.txt','w');
fprintf(fid, '%s',charsWithWhitespace);
fclose(fid);

답변 (1개)

Walter Roberson
Walter Roberson 2016년 7월 31일

카테고리

Help CenterFile Exchange에서 Characters and Strings에 대해 자세히 알아보기

Community Treasure Hunt

Find the treasures in MATLAB Central and discover how the community can help you!

Start Hunting!

Translated by