Extract text from MS Word file using OOo API

The Application Programming Interface and the OASIS Open Document Format


Post by shekhar.kotekar » Mon Mar 28, 2011 1:01 pm

Hi,

I would like to know whether it is possible to use the OpenOffice API to extract text and other information from an MS Word document.

Until now I have used the MS Office Interop API (it is too slow because it is COM-based) and Apache POI (it has poor documentation, no support, and no forums). Both have their disadvantages, so I am thinking about using the OpenOffice API instead.

Please enlighten me!
OpenOffice 3.1 on Windows Vista
shekhar.kotekar
 
Posts: 2
Joined: Mon Mar 28, 2011 12:45 pm

Re: Extract text from MS word file using Open Office API

Post by rudolfo » Mon Mar 28, 2011 1:14 pm

shekhar.kotekar wrote: Until now I have used the MS Office Interop API (it is too slow because it is COM-based)

OpenOffice first has to convert an MS Word document into its internal XML or DOM tree format before it can do anything with it. If COM automation is too slow for you, I doubt that OOo would be fast enough either.
The key point with OpenOffice is that it is built around an open XML format. If the office suite itself is too slow for you, you can always write your own XML application to manipulate the documents. Of course, this way around the performance issue only works for documents that are already in the ODF XML format.
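
To illustrate the "code your own XML application" route, here is a minimal Python sketch that reads text straight out of an ODF text document without starting the office suite. It relies on two facts about the format: an .odt file is a ZIP archive, and the document body lives in a member named content.xml. The function name extract_paragraphs is made up for this example, and the sketch only handles documents that are already ODF, so a .doc file would still have to be converted first.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace that ODF uses for text elements such as <text:p> and <text:h>.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def extract_paragraphs(odt_source):
    """Return the text of every heading and paragraph in an ODF text document.

    odt_source may be a file path or a binary file-like object.
    extract_paragraphs is a hypothetical helper name, not part of any API.
    """
    # An .odt file is a ZIP archive; the document body is in content.xml.
    with zipfile.ZipFile(odt_source) as archive:
        root = ET.fromstring(archive.read("content.xml"))

    wanted = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
    paragraphs = []
    for element in root.iter():
        if element.tag in wanted:
            # itertext() also collects text inside nested spans and links.
            # Simplification: <text:s/> and <text:tab/> are not expanded to
            # whitespace, and paragraphs nested inside other paragraphs
            # (for example in text frames) are reported twice.
            paragraphs.append("".join(element.itertext()))
    return paragraphs
```

Calling extract_paragraphs("example.odt") (the file name is only a placeholder) returns a list of strings, one per heading or paragraph, in document order.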
OpenOffice 3.1.1 (2.4.3 until October 2009) and LibreOffice 3.3.2 on Windows 2000, AOO 3.4.1 on Windows 7
There are several macro languages in OOo, but none of them is called Visual Basic or VB(A)! Please call it OOo Basic, Star Basic or simply Basic.
rudolfo
Volunteer
 
Posts: 1488
Joined: Wed Mar 19, 2008 11:34 am
Location: Germany

Re: Extract text from MS word file using Open Office API

Post by shekhar.kotekar » Mon Mar 28, 2011 1:15 pm

@rudolfo,

Thanks a lot for your input.
OpenOffice 3.1 on Windows Vista
shekhar.kotekar
 
Posts: 2
Joined: Mon Mar 28, 2011 12:45 pm

