[Issue] Export web page with numbers as text in a column

Discuss the spreadsheet application

[Issue] Export web page with numbers as text in a column

Postby sethron » Mon Apr 07, 2008 9:49 pm

Grettings.
I am using OO Calc version 2.4 on Windows XP. I am a web application developer. I want to be able to open an exported web page to Calc with a column of numbers that ought to be treated as text.
That is, the column has leading zeros on the left side. For Example 0003.
Using that other spreadsheet program, the workaround is to use a CSS value of mso-number-format:\@; and then use that class value with the TD tag.

REFERENCE: http://agoric.com/sources/software/htmltoExcel

The result of this is that the exported web page will convert the cell, or repeated row value as a text field with the leading zeros in the spreadsheet column.

I have searched in the FAQ and DEV areas of the OO wiki, but have not found the equivalent fix for the mso-number-format:\@; technique.

I thought I'd ask this community if you have any similar ways of converting a web page to the .xls format with the text format in tact.

-Seth
Last edited by Hagar Delest on Tue Jun 10, 2008 3:43 pm, edited 2 times in total.
Reason: tagged the thread as issue.
User avatar
sethron
 
Posts: 5
Joined: Mon Apr 07, 2008 9:30 pm

Re: Export web page report with numbers as text in a column.

Postby acknak » Tue Apr 08, 2008 4:02 am

Holy smokes. This is just baffling to me.

If I format some spreadsheet cells as text, and save the file as HTML (Calc), a text cell is represented in the HTML output like so:

<TD HEIGHT=17 ALIGN=LEFT SDNUM="1033;0;@">00102</TD>

Now, that looked promising, as it appears to have all the necessary information.

On opening the HTML file however, the cell does indeed have a format of "Text" (or "@"), but Calc has removed the leading zeroes anyway. The text in the cell is "102".

Ack. Pthht.
AOO4/LO5 • Linux • Fedora 23
User avatar
acknak
Moderator
 
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Export web page report with numbers as text in a column.

Postby sethron » Tue Apr 08, 2008 7:06 pm

Intriguing, isn't it? :!:
Thanks for looking into this.
-Seth
User avatar
sethron
 
Posts: 5
Joined: Mon Apr 07, 2008 9:30 pm

Re: Export web page report with numbers as text in a column.

Postby acknak » Tue Apr 08, 2008 7:37 pm

Yes, well, not to mention that the HTML is loaded with non-standard attributes, like "SDNUM". If it actually worked, it wouldn't look quite so ridiculous.

This makes MS' little hack look like top-notch engineering.
AOO4/LO5 • Linux • Fedora 23
User avatar
acknak
Moderator
 
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Export web page report with numbers as text in a column.

Postby sethron » Tue Apr 08, 2008 10:24 pm

There is the technique of prefixing the field with a single quote.
That seems to treat the entire field as text and keep the leading zeros with or without the SDNUM="1033;0;@".
However, the actual cell value will still contain the prefixed single quote. :roll:
-Seth
User avatar
sethron
 
Posts: 5
Joined: Mon Apr 07, 2008 9:30 pm

Re: Export web page report with numbers as text in a column.

Postby acknak » Wed Apr 09, 2008 4:19 am

I went ahead and filed Issue 88071: text data corrupted on reopen when saved as HTML: leading zeroes are lost.

You can register there and add a vote (up to two) or a comment.

As far as a workaround, from my testing, RTF seems to have the same problem. XML is possible: you can add a custom XSLT-based filter that will allow Calc to open your XML data. I don't know if it will handle numeric text correctly. See [XML Filter] Create XSLT filters for import and export.

Another possibility might be to just generate a .xls file. There are free, open source libraries for generating .xls files.

There are some libraries for generating ODF also, but I expect that's going to be a steeper climb.
AOO4/LO5 • Linux • Fedora 23
User avatar
acknak
Moderator
 
Posts: 22756
Joined: Mon Oct 08, 2007 1:25 am
Location: USA:NJ:E3

Re: Export web page report with numbers as text in a column.

Postby sethron » Wed Apr 09, 2008 5:17 am

Thanks for filing the issue and the XSLT link. I post my votes to it.
Skoal! :)
-Seth
User avatar
sethron
 
Posts: 5
Joined: Mon Apr 07, 2008 9:30 pm

Re: [Issue] Export web page with numbers as text in a column

Postby bspar » Sun Apr 13, 2008 11:40 pm

Operating System XP
OpenOffice 1.3.9238
Hello,
I have a somewhat similar problem in which quite frequently I want to put a table into a spreadsheet. If this is considered a different subject then forgive me they seem related. If this works you may not need to export the web page. I frequently need to copy tables from the web and can't highlight a table on an html page and then past it into a calc spreadsheet. I would prefer not to use excel97 however I haven't found any other solution.

edit.
I will frequently use notepad or wordpad if the table isn't too large and save it as a txt file.
bspar
 
Posts: 1
Joined: Sun Apr 13, 2008 10:09 pm

Re: [Issue] Export web page with numbers as text in a column

Postby sethron » Mon Apr 14, 2008 12:22 am

Hmm. I do not recall exactly how importing a web page into Calc was done in OO version 1.3, but you ought to be able to View | Insert | Link to external data, then type or copy and paste the URL of the web page you want to import into Calc. Are you referring to the preview and selection of all the HTML or the selected tables that other spreadsheet program displays when you tell it to import external data, such as a web page? There seems to be only the option to import all HTML_ALL or HTML_tables in Calc.
-Seth
User avatar
sethron
 
Posts: 5
Joined: Mon Apr 07, 2008 9:30 pm


Return to Calc

Who is online

Users browsing this forum: No registered users and 16 guests