Is there a way to read a PowerPoint file and get data?

Hoon Jeong on 23 Apr 2018
Commented: gwoo on 1 Feb 2023
I have a PowerPoint file containing many slides, with numbers written inside the shapes (or objects).
So, I navigated around the website to see if I could read this PowerPoint slide file, and get some data from each slide.
Here is how far I could get.
h = actxserver('PowerPoint.Application');
h.Presentations.Open('C:\Temp\TLM template.pptx');
firstPres = h.Presentations.Item(1); % MATLAB assignment; "Set" is VBA syntax
I am trying to get data from the second object in the first slide.
But I don't even know how to "read" this file, and get some data.
Can you give some guidance?
  Comments: 2
SR on 3 May 2019
Did you ever figure out how to read data from PowerPoint?
I am having a similar issue, I want to:
  1. automatically create a presentation with textbox content via matlab (done)
  2. close it and clear all variables in session / quit matlab (done)
  3. manually open presentation to edit textbox content and save presentation (done)
  4. open manually updated presentation via matlab (done)
  5. read/import updated content in specific edited textbox into matlab <-- not done
Although I can navigate to the specific textbox object I am interested in reading from, the contents load as empty. I can only "read" data from textboxes if I filled them via MATLAB in the same session (not really "reading", since the variable is already loaded).

Answers (2)

Guillaume on 23 Apr 2018
I've never automated PowerPoint, but as with any other Office application it shouldn't be too hard. The PowerPoint object model is documented by Microsoft. You just have to figure out how to navigate it.
A quick glance shows that you'll have to access the Slides collection to get your Slide, then probably the Shapes collection of your slide.
Something like:
powerpoint = actxserver('Powerpoint.Application');
presentation = powerpoint.Presentations.Open('C:\Temp\TLM template.pptx');
slide = presentation.Slides.Item(1); %to get the first slide
shape = slide.Shapes.Item(2); %to get the 2nd shape
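Building on the snippet above, a minimal sketch of actually reading the text might look like the following. It assumes the PowerPoint object model properties HasTextFrame, TextFrame.HasText and TextFrame.TextRange.Text. The function names readShapeTexts, collectTexts and isMsoTrue are made up for illustration. MATLAB may report MsoTriState values either as the text 'msoTrue' or as the number -1, so the helper accepts both; that is an assumption worth checking on your installation.

```matlab
function texts = readShapeTexts(pptxPath)
% readShapeTexts (hypothetical helper) returns an N-by-3 cell array
% {slideIndex, shapeIndex, text} for every shape that contains text.
% Requires Windows with PowerPoint installed.
powerpoint = actxserver('PowerPoint.Application');
presentation = powerpoint.Presentations.Open(pptxPath);
try
    texts = collectTexts(presentation);
catch err
    presentation.Close;   % close the file even if reading failed
    rethrow(err);
end
presentation.Close;
% powerpoint.Quit;  % uncomment to also close PowerPoint itself
end

function texts = collectTexts(presentation)
% Walk Slides -> Shapes and keep the text of shapes that have any.
texts = cell(0, 3);
slides = presentation.Slides;
for s = 1:slides.Count
    slide = slides.Item(s);
    shapes = slide.Shapes;
    for k = 1:shapes.Count
        shape = shapes.Item(k);
        if ~isMsoTrue(shape.HasTextFrame)
            continue  % e.g. pictures or tables: no text frame to read
        end
        frame = shape.TextFrame;
        if isMsoTrue(frame.HasText)
            texts(end+1, :) = {s, k, frame.TextRange.Text}; %#ok<AGROW>
        end
    end
end
end

function tf = isMsoTrue(value)
% Assumption: MsoTriState arrives as 'msoTrue' (text) or -1 (number).
tf = isequal(value, 'msoTrue') || isequal(value, -1);
end
```

For example, texts = readShapeTexts('C:\Temp\TLM template.pptx') followed by str2double(strtrim(texts{1, 3})) would turn a purely numeric shape text into a number; str2double returns NaN for anything else.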

gwoo on 26 Jan 2023
Rather than using the ActiveX COM connection, use the .NET API. Then you can rely on the well-documented and up-to-date .NET documentation for MS Office. This connection also gives you access to the Microsoft.Office.Core namespace, which holds universal enums such as MsoTriState, used for matching TRUE or FALSE property values. You can also access the PowerPoint enums in the Microsoft.Office.Interop.PowerPoint namespace, which are useful for matching or setting property values, such as whether a paragraph is bulleted.
applicationName = "PowerPoint";
AppClass = connectNETApplication(applicationName);
msoFalse = Microsoft.Office.Core.MsoTriState.msoFalse;
ppBulletNone = Microsoft.Office.Interop.PowerPoint.PpBulletType.ppBulletNone;

function AppClass = connectNETApplication(applicationName)
% The arguments block needs R2019b or later; remove it on older releases.
arguments
    applicationName (1,1) string {mustBeMember(applicationName, ["PowerPoint", "Word", "Excel"])}
end
NET.addAssembly("Microsoft.Office.Interop." + applicationName);
AppClass = Microsoft.Office.Interop.(applicationName).ApplicationClass;
end

function disconnectNETApplication(AppClass)
% The body is missing from the post as captured here; quitting the
% application is an assumed minimal cleanup.
AppClass.Quit();
end
For @SR's case (hopefully solved by now), you can get and set text from "shapes", which is just the generic word for the elements on a slide, like this:
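% folder and filename reuse the path from the question; rowcol is an example value
folder = "C:\Temp";
filename = "TLM template.pptx";
rowcol = {1, 1}; % {row, column} of the table cell to read
% PowerPoint.Presentations.Open(FileName, ReadOnly, Untitled, WithWindow)
Document = AppClass.Presentations.Open(fullfile(folder, filename), msoFalse, msoFalse, msoFalse);
% Shape 2 on slide 1 must be a table for .Table to work
Cell = Document.Slides.Item(1).Shapes.Item(2).Table.Cell(rowcol{:});
txt = string(Cell.Shape.TextFrame.TextRange.Paragraphs(1).Text); % Get Text
Cell.Shape.TextFrame.TextRange.Paragraphs(1).Text = "test"; % Set Text as string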
Once the presentation is loaded as Document, select the slide to work in with the Item method of the Slides collection, which returns that slide. The same goes for shapes, which is just the term for the elements on a slide: loop through the Shapes collection with the Item method and inspect the properties of each shape to find the one you want to work with.
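As a rough illustration of that loop with the .NET connection, the sketch below lists every shape so you can pick the one you need. listShapes is a hypothetical helper name. It assumes Document came from AppClass.Presentations.Open(...) and that msoTrue is Microsoft.Office.Core.MsoTriState.msoTrue, and it relies on the Name, HasTable and HasTextFrame shape properties from the PowerPoint object model.

```matlab
function info = listShapes(Document, msoTrue)
% listShapes (hypothetical helper) returns a struct array with one entry
% per shape: slide index, shape index, name, table flag, and text.
% Document : presentation returned by AppClass.Presentations.Open(...)
% msoTrue  : Microsoft.Office.Core.MsoTriState.msoTrue
info = struct('slide', {}, 'shape', {}, 'name', {}, 'hasTable', {}, 'text', {});
slides = Document.Slides;
for s = 1:double(slides.Count)
    slide = slides.Item(s);
    shapes = slide.Shapes;
    for k = 1:double(shapes.Count)
        shape = shapes.Item(k);
        entry = struct('slide', s, 'shape', k, ...
            'name', string(shape.Name), ...
            'hasTable', shape.HasTable == msoTrue, ...
            'text', "");
        % Only shapes with a text frame expose TextFrame.TextRange
        if shape.HasTextFrame == msoTrue
            entry.text = string(shape.TextFrame.TextRange.Text);
        end
        info(end+1) = entry; %#ok<AGROW>
    end
end
end
```

With the result in hand, something like find([info.hasTable]) gives the entries that are tables, whose cells can then be read through Table.Cell as in the snippet above.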
  Comments: 3
gwoo on 1 Feb 2023
Yes, that's true about dotnetenv; it isn't needed. What I wrote is actually redundant (I like to say explicit), because NET.addAssembly automatically uses the "framework" environment. So you are right: it's not strictly needed, and it doesn't even exist prior to 2022.
Yes, I have compiled applications that use the .NET API with other programs, just not MS Office applications. I don't know why it would matter, though.
