I have a PDF file, "EHP.pdf", and I want to count the total number of words in it. The file has many sections, and I want to exclude the last section from the count. Any suggestions?

Comments: 2

KALYAN ACHARJYA on 20 Dec 2018
Using MATLAB?
Ahmed Alsaadi on 20 Dec 2018
Yes, using MATLAB.


Accepted Answer

Omer Yasin Birey on 20 Dec 2018
Edited: Omer Yasin Birey on 21 Dec 2018


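Hi Ahmed, you can use extractFileText. Choose a start word and an end word: counting begins at the first occurrence of the start word and stops just before the first occurrence of the end word. The end word must therefore be distinctive, meaning it must not appear anywhere earlier in the document. To exclude the last section, pick a word that marks where that section begins, such as its heading. You can then count the words between the two markers: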
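str = extractFileText("EHP.pdf");              % needs Text Analytics Toolbox
startIdx = strfind(str,"firstWord");           % replace with the first word of your PDF
endIdx = strfind(str,"lastWord");              % replace with a distinctive word where counting should stop
start = startIdx(1);                           % first occurrence of the start word
fin = endIdx(1);                               % first occurrence of the end word
extracted = extractBetween(str,start,fin-1);   % text from the start word up to, not including, the end word
uniqueWordNumbers = wordCloudCounts(extracted); % table of words and their counts
counter = uniqueWordNumbers(:,2);              % count column, still a table
counterArray = table2array(counter);           % convert to a numeric array so sum accepts it
totalWords = sum(counterArray);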
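As a possible alternative, here is a minimal sketch that counts whitespace-separated tokens directly instead of summing word-cloud counts. As far as I know, wordCloudCounts preprocesses text for word clouds (for example by dropping very common words), so its total may come out lower than a plain word count. The sample text and the "References" marker below are assumptions for illustration; in practice str would come from extractFileText("EHP.pdf") and the marker would be the heading of the section to exclude.

```matlab
% Hypothetical sample text standing in for extractFileText("EHP.pdf").
str = "Introduction This is the first part. Methods Here is more text. References Smith 2018";
endMarker = "References";              % assumed heading of the section to exclude

idx = strfind(str, endMarker);         % character positions of every match
if isempty(idx)
    error("Marker not found: %s", endMarker);
end

% Keep everything before the last match. This assumes the marker
% does not occur again inside the excluded section itself.
kept = extractBefore(str, idx(end));

words = split(strtrim(kept));          % split on whitespace
words = words(strlength(words) > 0);   % drop any empty tokens
totalWords = numel(words);
```

This treats every whitespace-separated token as a word, so punctuation attached to a word does not change the count, but stray symbols or page numbers extracted from the PDF would be counted too.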

Comments: 3

Ahmed Alsaadi on 20 Dec 2018
Hi Omer,
I get this error message when MATLAB executes the last line of the code: "Error using sum. Invalid data type. First argument must be numeric or logical."
Omer Yasin Birey
Ah, you are right, Ahmed. I made a typo and also forgot a line there. Try this instead:
counter = uniqueWordNumbers(:,2);
counterArray = table2array(counter);
totalWords = sum(counterArray);
Add the table2array line and pass its output to sum.
Ahmed Alsaadi on 20 Dec 2018
It works now, thank you very much, Omer.
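To illustrate why the extra conversion was needed, here is a small sketch using a hand-built table in place of the real wordCloudCounts output. The words, the counts, and the column names Word and Count are made up for the example. Indexing a table with parentheses returns another table, which sum rejected in the error reported above, while table2array or dot indexing yields a numeric array.

```matlab
% Hand-built table standing in for the output of wordCloudCounts (hypothetical data).
T = table(["alpha"; "beta"; "gamma"], [3; 2; 5], 'VariableNames', {'Word', 'Count'});

sub = T(:, 2);                    % parentheses return a one-column table
col = T.Count;                    % dot indexing returns the numeric column itself

totalA = sum(table2array(sub));   % convert the table first, as in the fix above
totalB = sum(col);                % or sum the numeric column directly
```

Both totals are the same, so dot indexing could replace the two intermediate lines if the count column is known by name.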



Asked: 20 Dec 2018

Edited: 21 Dec 2018
