Conversion of PDF data to Spreadsheet
Hi, I have several reports in PDF format and would like to write an m-script that captures their data into a spreadsheet. I thought the best method would be to add all the headers to an array, write each PDF page's data to a different sheet in Excel, and then populate the fields with the values corresponding to the headers. Is there a better way to achieve this?
Answers (1)
Guillaume
24 May 2017
Well, your first hurdle will be capturing the data from the PDF. There is no built-in tool for this in MATLAB, and depending on the structure of the PDF it will be either a fair amount of work (the data is actually stored as continuous text in the file) or extremely hard (the data is stored as text but scattered through the file, or it is just an image of the text, which will require OCR).
PDF is not really designed to transfer structured data to a computer. It is mostly meant to be read by a human.
Comments: 2
Guillaume
25 May 2017
Shaili Bulusu's comment posted as an answer moved here:
I understand the difficulties, but I already have a script that reads the data from the PDF for me. My question is about the approach of sorting the headers as an array, and whether there is a better way to capture the data into a spreadsheet.
Guillaume
25 May 2017
More details on what "the approach of sorting the headers as an array" means would be required to answer your question. What form do the inputs come in, and what form of output do you want?
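To make the discussion concrete, here is a minimal sketch of one possible reading of the approach described in the question. It assumes (none of this comes from the original posts) that the extraction script returns one struct per PDF page with a cell array of header names and a matching cell array of values. The struct layout, the sample data, the file name `reports.xlsx`, and the sheet names are all hypothetical. It builds a master header list, aligns each page's values to it, and writes each page to its own sheet.

```matlab
% Hypothetical input: one struct per PDF page, as an extraction script might
% return it. Field names and sample values are made up for illustration.
pages(1).headers = {'Name', 'Date', 'Total'};
pages(1).values  = {'Report A', '2017-05-01', 12.5};
pages(2).headers = {'Name', 'Total', 'Comment'};
pages(2).values  = {'Report B', 7, 'late'};

% Master header list: every header seen on any page, in first-seen order.
allHeaders = unique([pages.headers], 'stable');
allHeaders = reshape(allHeaders, 1, []);   % force a row for concatenation

outFile = 'reports.xlsx';                  % assumed output file name
sheets  = cell(1, numel(pages));
for p = 1:numel(pages)
    % Start with an empty value for every header, then fill in the ones
    % this page actually provides.
    row = repmat({''}, 1, numel(allHeaders));
    [found, idx] = ismember(pages(p).headers, allHeaders);
    row(idx(found)) = pages(p).values(found);

    % Header row on top, value row below.
    sheets{p} = [allHeaders; row];

    % writecell needs R2019a or later; xlswrite is the older alternative.
    writecell(sheets{p}, outFile, 'Sheet', sprintf('Page%d', p));
end
```

Because every sheet shares the same header order, the pages can later be stacked into a single table if one sheet per page turns out to be inconvenient. Headers missing from a page are simply left empty.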