No changes when using function erasePunctuation to remove digits.

Ismat Mohd Sulaiman on 16 Mar 2021
Commented: Ismat Mohd Sulaiman on 5 Jul 2021
I'm trying to remove the digits from a document that has already been tokenized.
However, after calling the erasePunctuation function, the updated document is unchanged: no digits were removed. I've checked the token types, and the tokenizer does recognize these tokens as digits. Please help. Thanks.
The output:

Answers (1)

Cris LaPierre on 16 Mar 2021
Edited: Cris LaPierre on 16 Mar 2021
erasePunctuation only erases punctuation, not numbers. The 'digits' specification tells it which type of token to remove punctuation from; it does not delete the digit tokens themselves. See the description here.
You could try to remove digits using the following.
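% Table of token details; the Type column labels each token (e.g. digits)
tkD = tokenDetails(cleanDoc);
% Tables need a variable subscript, so select the Token column explicitly
cleanDoc = removeWords(cleanDoc,tkD.Token(tkD.Type=="digits"));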
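For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. It assumes Text Analytics Toolbox is installed; the sample sentence and the variable names (doc, digitTokens, tkAfter) are made up for illustration and are not from the question.

```matlab
% Hypothetical sample text containing two numeric tokens
doc = tokenizedDocument("I bought 3 apples and 12 pears");

% tokenDetails returns a table; Type is a categorical label per token
tkD = tokenDetails(doc);

% Collect the tokens that the tokenizer classified as digits
digitTokens = tkD.Token(tkD.Type == "digits");

% removeWords deletes every occurrence of the listed tokens
cleanDoc = removeWords(doc, digitTokens);

% Check: no tokens of type "digits" should remain
tkAfter = tokenDetails(cleanDoc);
disp(any(tkAfter.Type == "digits"))
```

Because removeWords matches on the token text, a digit string is removed everywhere it appears as a token, which is usually what you want when stripping numbers from a corpus.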

Release

R2020b
