Selecting specific data from a PDF

Nathaniel Porter on 23 Feb 2022
Answered: Riya on 15 Sep 2023
% I am trying to obtain every value between 48 and 64 in the third column,
% together with the corresponding value in the column to its right.
% For example, the first such line has 58 in the third column, and I would
% also like to obtain the 100 next to it.
% I extracted the text from the PDF first, but I am unsure where to go from here.
clear;
pages = [1:18];
str = extractFileText("data-01.pdf",'Pages',pages);

Answers (1)

Riya on 15 Sep 2023
Hello Nathaniel Porter,
As I understand it, you want to extract specific values from a PDF file: the values between 48 and 64 in a particular column, together with the corresponding values in the column to the right.
You can follow the steps below:
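% Split the extracted text into individual lines
lines = splitlines(str);
% Collect [value, correspondingValue] pairs, one row per matching line
result = [];
% Match two leading integer columns, then a third-column value in 48-64,
% then the value to its right. A character class such as [48-64] cannot
% express a numeric range, so the range is written as alternatives:
% 48-49, 50-59 and 60-64. This assumes the first two columns are integers;
% adjust the pattern to the actual layout of the PDF.
pattern = '^\s*\d+\s+\d+\s+(4[89]|5\d|6[0-4])\s+(\d+)';
% Iterate over the lines
for i = 1:numel(lines)
    thisLine = lines{i};
    match = regexp(thisLine, pattern, 'tokens');
    % If a match is found, convert both captured tokens to numbers
    if ~isempty(match)
        value = str2double(match{1}{1});
        correspondingValue = str2double(match{1}{2});
        % Append the pair as a new row of the result
        result = [result; value, correspondingValue];
    end
end
% Display the result
disp(result);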
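As an alternative to encoding the 48-64 range in a regular expression, each line can be split into fields and the third field compared numerically. The following is only a minimal sketch under stated assumptions: `sampleLines` is made-up data standing in for the output of `splitlines(str)`, since the real layout of `data-01.pdf` is not known, and the columns are assumed to be separated by whitespace.

```matlab
% Hypothetical sample lines; replace with splitlines(str) for the real PDF text.
sampleLines = ["1 10 58 100"; "2 11 47 90"; "3 12 64 75"; "4 13 65 80"; "header text"];

% One row per matching line: [thirdColumnValue, valueToItsRight]
result = zeros(0, 2);
for k = 1:numel(sampleLines)
    % Split on whitespace and convert; non-numeric fields become NaN.
    fields = str2double(split(strtrim(sampleLines(k))));
    % Keep the line only if columns 3 and 4 exist, are numeric,
    % and column 3 lies in the closed range [48, 64].
    if numel(fields) >= 4 && ~any(isnan(fields(3:4))) ...
            && fields(3) >= 48 && fields(3) <= 64
        result(end+1, :) = [fields(3), fields(4)];
    end
end
disp(result);
```

Comparing numbers directly makes it easy to change the bounds later or to handle non-integer values, which a digit-based pattern would not cover.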
For more information about regexp, refer to its page in the MATLAB documentation.
I hope it helps!
