Extract bookmarks from PDF files using Matlab?

Wei Sun on 15 Oct 2023
Commented: dpb on 18 Oct 2023

How can I extract all the bookmarks from a PDF file? These bookmarks are usually the headings of the PDF. Thank you.

  Comments: 4
Christopher Creutzig on 18 Oct 2023
Those “bookmarks” form a tree structure, with chapters and sections, and include (hyperlink) targets in the PDF. What kind of output would be useful for what you are trying to do, a flat vector of strings, or do you need the nesting information? Do you need the link targets?
(Not saying I have a solution for any of these, but it would help in trying to answer your question.)
dpb on 18 Oct 2023
@Christopher Creutzig, if you're particularly interested in or knowledgeable about PDF file interaction, you might find <another recent question> of some interest.


Answers (1)

dpb on 15 Oct 2023
Moved: dpb on 15 Oct 2023
High-level MATLAB functions, including extractFileText, pdfinfo and readPDFFormData in the <Text Analytics Toolbox>, don't return the bookmarks; you'll need a third-party PDF toolset to do that. There are some, like <itext bookmark example>, that provide their code in a DLL, which you would have to call through MEX code written in your language of choice.
All you can do with high-level MATLAB will be to <search for known strings or patterns>.
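As a rough sketch of that pattern-search fallback: the snippet below pulls the page text with extractFileText and keeps the lines that look like numbered headings. The file name `report.pdf`, the variable names, and the heading pattern (a section number such as "2" or "3.1.4" followed by a title) are assumptions for illustration only. It does not read the PDF outline itself, so it recovers no nesting or link targets, and body lines that happen to start with a number will be picked up as false positives.

```matlab
% Hypothetical file name; replace with your own PDF.
pdfFile = "report.pdf";

% Assumed heading style: optional leading spaces, a section number
% ("2", "3.1", "3.1.4", optionally with a trailing dot), whitespace, a title.
headingExpr = '^\s*\d+(\.\d+)*\.?\s+\S';

% Keep only the elements of a string array that match the heading pattern.
% regexp on a cell array returns one cell per line; an empty cell means no match.
findHeadings = @(lines) lines(~cellfun(@isempty, ...
    regexp(cellstr(lines), headingExpr, 'once')));

if isfile(pdfFile)
    txt = extractFileText(pdfFile);              % requires Text Analytics Toolbox
    headings = strtrim(findHeadings(splitlines(txt)));
    disp(headings)
end
```

If the headings in your document follow a different convention (for example "Chapter 1" or all-caps titles), only `headingExpr` needs to change.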
  Comments: 2
Wei Sun on 17 Oct 2023
Edited: Wei Sun on 17 Oct 2023
Thanks for your answer. I very much hope that an official function for this will be provided in Text Analytics Toolbox.
dpb on 17 Oct 2023
You can always submit an enhancement request to TMW...
