html to doc

Posted: Wed Dec 26, 2007 2:24 pm
by codewriter
Hi,

I am working on converting HTML to DOC through a Java program. I have an HTML file that is already fully formatted, and I need to export it as a .doc file. Any information on this would be helpful, ideally with a sample implementation.

regards,

Re: html to doc

Posted: Wed Dec 26, 2007 4:39 pm
by floris v
Hey, it's very bad practice not to tell what you want. :evil: Be specific about what you want done and how. Don't leave people guessing - that's wasting their time. They might say: "Open the file in Writer and save it as .doc." :lol:

Re: html to doc

Posted: Wed Dec 26, 2007 6:07 pm
by hol.sten
codewriter wrote:I am working on converting html to doc thru java program.
Try these threads:
- Java: Using the Bootstrap Connection Mechanism: http://user.services.openoffice.org/en/ ... =44&t=1013
- Java: Using the Interprocess Connection Mechanism: http://user.services.openoffice.org/en/ ... =44&t=1014

But use another save filter, like "MS WinWord 6.0", instead of "writer_pdf_Export". Change

Code:

conversionProperties[0].Value = "writer_pdf_Export";
to

Code:

conversionProperties[0].Value = "MS WinWord 6.0";
Or try the filter list, if you need another conversion: http://wiki.services.openoffice.org/wik ... st_OOo_2_1

Re: html to doc

Posted: Sat Dec 29, 2007 7:36 pm
by codewriter
Hi,

I am almost there with the suggestion you gave for converting HTML to DOC, but it is not turning out to be as easy as it seemed.

Here goes.

Note: html2doc is a Java program written to run on a web server.

If the input URL (loadUrl) points to a file stored locally on the hard disk, the program stores the document properly.

But when the input URL is a web URL, e.g. http://xyz.com, my program is not able to store it. I wonder what the problem is.

I will attach the code in the next post for checking.

Any input is highly appreciated.

regards,

Re: html to doc

Posted: Sat Dec 29, 2007 7:39 pm
by codewriter
This works ...
public static void bootstrap() {
    String loadUrl = "file:///c:/dev/netbeans/oootest/viewtopic.php.htm";
    // String loadUrl = "http://www.google.com";
    String storeUrl = "file:///c:/dev/netbeans/oootest/mydocoutputboot.doc";

    try {
        XComponentContext xContext = Bootstrap.bootstrap();
        XMultiComponentFactory xMultiComponentFactory = xContext.getServiceManager();
        XComponentLoader xcomponentloader = (XComponentLoader) UnoRuntime.queryInterface(XComponentLoader.class, xMultiComponentFactory.createInstanceWithContext("com.sun.star.frame.Desktop", xContext));

        PropertyValue[] conversionProperties = new PropertyValue[2];
        conversionProperties[0] = new PropertyValue();
        conversionProperties[0].Name = "FilterName";
        conversionProperties[0].Value = "MS Word 97";

        conversionProperties[1] = new PropertyValue();
        conversionProperties[1].Name = "Hidden";
        conversionProperties[1].Value = Boolean.TRUE;

        // Object objectDocumentToStore = xcomponentloader.loadComponentFromURL(loadUrl, "_blank", 1, new PropertyValue[0]);
        Object objectDocumentToStore = xcomponentloader.loadComponentFromURL(loadUrl, "_blank", 1, conversionProperties);

        XStorable xstorable = (XStorable) UnoRuntime.queryInterface(XStorable.class, objectDocumentToStore);
        // xstorable.storeToURL(storeUrl,conversionProperties);
        xstorable.storeToURL(storeUrl, conversionProperties);
        // Getting the method dispose() for closing the document
        // XComponent xcomponent =
        // ( XComponent ) UnoRuntime.queryInterface( XComponent.class,
        // xstorable );
    }
    catch (java.lang.Exception e) {
        e.printStackTrace();
    }
    finally {
        // Always terminate the JVM, whether or not the conversion succeeded
        System.exit(0);
    }
}


and this does not ...


public static void bootstrap() {
    String loadUrl = "http://www.xyz.com";
    String storeUrl = "file:///c:/dev/netbeans/oootest/mydocoutputboot.doc";

    try {
        XComponentContext xContext = Bootstrap.bootstrap();
        XMultiComponentFactory xMultiComponentFactory = xContext.getServiceManager();
        XComponentLoader xcomponentloader = (XComponentLoader) UnoRuntime.queryInterface(XComponentLoader.class, xMultiComponentFactory.createInstanceWithContext("com.sun.star.frame.Desktop", xContext));

        PropertyValue[] conversionProperties = new PropertyValue[2];
        conversionProperties[0] = new PropertyValue();
        conversionProperties[0].Name = "FilterName";
        conversionProperties[0].Value = "MS Word 97";

        conversionProperties[1] = new PropertyValue();
        conversionProperties[1].Name = "Hidden";
        conversionProperties[1].Value = Boolean.TRUE;

        // Object objectDocumentToStore = xcomponentloader.loadComponentFromURL(loadUrl, "_blank", 1, new PropertyValue[0]);
        Object objectDocumentToStore = xcomponentloader.loadComponentFromURL(loadUrl, "_blank", 1, conversionProperties);

        XStorable xstorable = (XStorable) UnoRuntime.queryInterface(XStorable.class, objectDocumentToStore);
        // xstorable.storeToURL(storeUrl,conversionProperties);
        xstorable.storeToURL(storeUrl, conversionProperties);
        // Getting the method dispose() for closing the document
        // XComponent xcomponent =
        // ( XComponent ) UnoRuntime.queryInterface( XComponent.class,
        // xstorable );
    }
    catch (java.lang.Exception e) {
        e.printStackTrace();
    }
    finally {
        // Always terminate the JVM, whether or not the conversion succeeded
        System.exit(0);
    }
}

Exception thrown ...
com.sun.star.task.ErrorCodeIOException:
at com.sun.star.lib.uno.environments.remote.Job.remoteUnoRequestRaisedException(Job.java:187)
at com.sun.star.lib.uno.environments.remote.Job.execute(Job.java:153)
at com.sun.star.lib.uno.environments.remote.JobQueue.enter(JobQueue.java:349)
at com.sun.star.lib.uno.environments.remote.JobQueue.enter(JobQueue.java:318)
at com.sun.star.lib.uno.environments.remote.JavaThreadPool.enter(JavaThreadPool.java:106)
at com.sun.star.lib.uno.bridges.java_remote.java_remote_bridge.sendRequest(java_remote_bridge.java:657)
at com.sun.star.lib.uno.bridges.java_remote.ProxyFactory$Handler.request(ProxyFactory.java:159)
at com.sun.star.lib.uno.bridges.java_remote.ProxyFactory$Handler.invoke(ProxyFactory.java:141)
at $Proxy5.storeToURL(Unknown Source)
at com.vtech.util.word.InterprocessConnectionOdtToPdfQuickAndDirty.bootstrap(InterprocessConnectionOdtToPdfQuickAndDirty.java:115)
at com.vtech.util.word.InterprocessConnectionOdtToPdfQuickAndDirty.main(InterprocessConnectionOdtToPdfQuickAndDirty.java:134)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:585)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:86)

Re: html to doc

Posted: Sat Jan 05, 2008 10:55 am
by codewriter
Hi,

Has anyone come across the situation described above?

regards,

Re: html to doc

Posted: Sat Jan 05, 2008 10:56 am
by codewriter
codewriter wrote:Hi,

I am working on converting HTML to DOC through a Java program. I have an HTML file that is already fully formatted, and I need to export it as a .doc file. Any information on this would be helpful, ideally with a sample implementation.

regards,

Re: html to doc

Posted: Sat Jan 05, 2008 11:03 am
by DrewJensen
OK - I don't use Java per se, but I do recall seeing a discussion before where the problem ended up being the hidden property - have you tried this with the document window not being hidden?

Re: html to doc

Posted: Sat Jan 05, 2008 11:05 am
by codewriter
Hi, thanks.

Yes, I tried that too, but it doesn't seem to work.

regards,

Re: html to doc

Posted: Sat Jan 05, 2008 2:42 pm
by hol.sten
codewriter wrote:and this does not ...

Code:

...
    String loadUrl = "http://www.xyz.com";
...
            conversionProperties[0] = new PropertyValue();
            conversionProperties[0].Name = "FilterName";
            conversionProperties[0].Value = "MS Word 97";
...
Exception thrown ...
com.sun.star.task.ErrorCodeIOException:
...
Problem: Loading HTML with OOo creates a Writer/Web document and NOT a Writer document. For that reason, you cannot use the com.sun.star.text.TextDocument filters, you have to use the com.sun.star.text.WebDocument filters.

Here is an example of working Java code using the Bootstrap Connection Mechanism to load an HTML page and store it as HTML and PDF:

Code:

package oootest;

import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;

public class BootstrapConnectionWebToHtmlAndPdfQuickAndDirty {
    public static void main(String[] args) {
        
        // Loading HTML creates a Writer/Web document! It does NOT create a Writer document!
        String loadUrl      = "http://www.xyz.com";

        // Store web page as HTML and PDF
        String storeUrlHtml = "file:///c:/dev/netbeans/oootest/xyz.html";
        String storeUrlPdf  = "file:///c:/dev/netbeans/oootest/xyz.pdf";

        try {
            XComponentContext xContext = Bootstrap.bootstrap();
            XMultiComponentFactory xMultiComponentFactory = xContext.getServiceManager();
            XComponentLoader xcomponentloader = (XComponentLoader) UnoRuntime.queryInterface(XComponentLoader.class,xMultiComponentFactory.createInstanceWithContext("com.sun.star.frame.Desktop", xContext));

            Object objectDocumentToStore = xcomponentloader.loadComponentFromURL(loadUrl, "_blank", 0, new PropertyValue[0]);

            // Sometimes loading from the web needs some time.
            // 4000 waits for 4 seconds. Try different settings.
            Thread.sleep(4000);

            XStorable xstorable = (XStorable) UnoRuntime.queryInterface(XStorable.class,objectDocumentToStore);

            // Filter names are listed at http://wiki.services.openoffice.org/wiki/Framework/Article/Filter/FilterList_OOo_2_1
            PropertyValue[] conversionProperties = new PropertyValue[1];
            conversionProperties[0] = new PropertyValue();
            conversionProperties[0].Name = "FilterName";

            // Store Writer/Web document as HTML
            conversionProperties[0].Value = "HTML";
            xstorable.storeToURL(storeUrlHtml,conversionProperties);

            // Store Writer/Web document as PDF
            conversionProperties[0].Value = "writer_web_pdf_Export";
            xstorable.storeToURL(storeUrlPdf,conversionProperties);
        }
        catch (java.lang.Exception e) {
            e.printStackTrace();
        }
        finally {
            System.exit(0);
        }
    }    
}
Please add '[Solved]' to the title of your first post (edit button) if your issue has been fixed.

Regards
hol.sten

Re: html to doc

Posted: Sat Jan 05, 2008 3:39 pm
by hol.sten
Here is an example of working Java code using the Bootstrap Connection Mechanism to load an HTML document, an ODT document and an ODS document and store each as a PDF:

Code:

package oootest;

import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XStorable;
import com.sun.star.io.IOException;
import com.sun.star.lang.IllegalArgumentException;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.lang.XServiceInfo;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;

public class BootstrapConnectionDocumentToPdfQuickAndDirty {
    public static void main(String[] args) {
        // Load documents
        String loadUrlHtml = "file:///c:/dev/netbeans/oootest/my.htm";
        String loadUrlOdt  = "file:///c:/dev/netbeans/oootest/my.odt";
        String loadUrlOds  = "file:///c:/dev/netbeans/oootest/my.ods";

        // Store documents
        String storeUrlHtml = "file:///c:/dev/netbeans/oootest/my.htm.pdf";
        String storeUrlOdt  = "file:///c:/dev/netbeans/oootest/my.odt.pdf";
        String storeUrlOds  = "file:///c:/dev/netbeans/oootest/my.ods.pdf";

        try {
            XComponentContext xContext = Bootstrap.bootstrap();
            XMultiComponentFactory xMultiComponentFactory = xContext.getServiceManager();
            XComponentLoader xcomponentloader = (XComponentLoader) UnoRuntime.queryInterface(XComponentLoader.class,xMultiComponentFactory.createInstanceWithContext("com.sun.star.frame.Desktop", xContext));

            convertDocumentToPdf(xcomponentloader,loadUrlHtml,storeUrlHtml);
            convertDocumentToPdf(xcomponentloader,loadUrlOdt,storeUrlOdt);
            convertDocumentToPdf(xcomponentloader,loadUrlOds,storeUrlOds);
        }
        catch (java.lang.Exception e) {
            e.printStackTrace();
        }
        finally {
            System.exit(0);
        }
    }

    private static void convertDocumentToPdf(XComponentLoader xcomponentloader, String loadUrl, String storeUrl) throws IOException, InterruptedException, IllegalArgumentException {
        Object document = xcomponentloader.loadComponentFromURL(loadUrl, "_blank", 0, new PropertyValue[0]);

        // Sometimes loading needs some time. 4000 waits for 4 seconds. Try different settings if needed.
        Thread.sleep(4000);

        storePDF(document, storeUrl);
    }

    private static void storePDF(Object document,String storeUrl) throws IOException {
        // Determine suitable filter name for PDF export by asking XServiceInfo.
        // Source: OOo Developer's Guide - 7 Office Development - 7.1.5 Handling Documents - Storing Documents
        // http://api.openoffice.org/docs/DevelopersGuide/OfficeDev/OfficeDev.xhtml#1_1_5_3_Storing_Documents
        //
        // Filter names are listed at http://wiki.services.openoffice.org/wiki/Framework/Article/Filter/FilterList_OOo_2_1
        XServiceInfo xInfo = (XServiceInfo) UnoRuntime.queryInterface(XServiceInfo.class,document);
        String storeFilter = null;
        if(xInfo!=null) {
            if(xInfo.supportsService("com.sun.star.text.TextDocument")) {
              storeFilter = "writer_pdf_Export";
            }
            else if(xInfo.supportsService("com.sun.star.text.WebDocument")) {
              storeFilter = "writer_web_pdf_Export";
            }
            else if(xInfo.supportsService("com.sun.star.sheet.SpreadsheetDocument")) {
              storeFilter = "calc_pdf_Export";
            }
        }

        if (storeFilter != null) {
            PropertyValue[] conversionProperties = new PropertyValue[2];
            conversionProperties[0] = new PropertyValue();
            conversionProperties[0].Name = "FilterName";
            conversionProperties[0].Value = storeFilter;
            conversionProperties[1] = new PropertyValue();
            // The property name must be exactly "Overwrite" (no trailing space)
            conversionProperties[1].Name = "Overwrite";
            conversionProperties[1].Value = Boolean.TRUE;

            XStorable xstorable = (XStorable) UnoRuntime.queryInterface(XStorable.class,document);
            xstorable.storeToURL(storeUrl, conversionProperties);
        }
    }
}
Regards
hol.sten
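
The filter selection in storePDF above can also be written as a lookup table, which may be easier to extend when more document types are needed. This is only a sketch: the class name PdfFilterLookup and the method pdfFilterFor are made up for illustration, the Predicate parameter stands in for XServiceInfo.supportsService so the snippet compiles without the OOo jars, and it assumes Java 8 or later. The service names and filter names are the ones used in the code above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

final class PdfFilterLookup {

    // Service name -> PDF export filter, kept in the same order as the
    // if/else chain in storePDF (LinkedHashMap preserves insertion order).
    private static final Map<String, String> PDF_FILTERS = new LinkedHashMap<>();
    static {
        PDF_FILTERS.put("com.sun.star.text.TextDocument", "writer_pdf_Export");
        PDF_FILTERS.put("com.sun.star.text.WebDocument", "writer_web_pdf_Export");
        PDF_FILTERS.put("com.sun.star.sheet.SpreadsheetDocument", "calc_pdf_Export");
    }

    // Hypothetical helper: supportsService stands in for
    // XServiceInfo.supportsService(String). Returns null if nothing matches.
    static String pdfFilterFor(Predicate<String> supportsService) {
        for (Map.Entry<String, String> entry : PDF_FILTERS.entrySet()) {
            if (supportsService.test(entry.getKey())) {
                return entry.getValue();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Simulates a Writer/Web document, which is what loading HTML produces.
        String filter = pdfFilterFor("com.sun.star.text.WebDocument"::equals);
        System.out.println(filter);
    }
}
```

With the OOo classes on the classpath, the call would presumably look like pdfFilterFor(xInfo::supportsService). For other target formats, the filter names have to be taken from the filter list linked above.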

Re: html to doc

Posted: Sat Jan 05, 2008 4:34 pm
by DrewJensen
hol.sten,

Well, I am sure the OP will appreciate that example and I know I do - this will help with a project I have on the drawing board immensely. Thanks.

Re: html to doc

Posted: Tue Jan 08, 2008 4:58 pm
by codewriter
Hi, thanks very much.

I have now used writer_web_StarOffice_XML_Writer to store it as a .doc file.

I still have one issue: the images do not seem to be embedded in the .doc file when the page is loaded from a web URL. Can I somehow embed the images available in the HTML page into the generated .doc file?
The reason is that the generated output file will also be used offline, where no internet connection is available.

regards,

Re: html to doc

Posted: Thu Nov 12, 2009 6:48 am
by sarath

Re: html to doc

Posted: Fri Jan 08, 2016 9:12 am
by kunal14
If the HTML is in a string, like "<html>...</html>", how can it be converted to DOC?
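
One possible approach, as a sketch only: write the string to a temporary .html file and pass the resulting file URL as loadUrl to one of the examples earlier in this thread. The class name HtmlStringToFileUrl and the method writeToTempFileUrl are made up for illustration and are not part of the OOo API. The sketch assumes the HTML is meant to be stored as UTF-8 and that the .html extension helps OOo detect the file type.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class HtmlStringToFileUrl {

    // Hypothetical helper: writes the HTML text to a temporary .html file
    // and returns a file URL that can be used as loadUrl.
    static String writeToTempFileUrl(String html) {
        try {
            Path tempFile = Files.createTempFile("ooo-input-", ".html");
            // Ask the JVM to remove the temporary file on normal exit.
            tempFile.toFile().deleteOnExit();
            Files.write(tempFile, html.getBytes(StandardCharsets.UTF_8));
            return tempFile.toUri().toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String loadUrl = writeToTempFileUrl("<html><body><p>Hello</p></body></html>");
        // Use loadUrl in place of the hard-coded loadUrl in the examples above.
        System.out.println(loadUrl);
    }
}
```

Since loading HTML creates a Writer/Web document, the store filter still has to be one of the Writer/Web filters, as explained earlier in the thread.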