[Solved] Import CSV into OO Calc 3.3.0 vs 3.2.1

Discuss the spreadsheet application
Post Reply
User avatar
prino
Posts: 6
Joined: Sat Sep 18, 2010 11:21 am
Location: Oostende, Belgium

[Solved] Import CSV into OO Calc 3.3.0 vs 3.2.1

Post by prino »

I've got CSV files containing times, the first few lines of one file look like:

Code: Select all

 1, 1,19800616,"07:47:00",+041051.0,19800616,"07:52:00",+041054.2,"00:05:00",+0038.4,"NL" ,"V"   ,"De Bilt"                        ,"afs Amersfoort"
 1, 2,19800616,"08:12:00",+033268.0,19800616,"09:25:00",+033403.0,"01:13:00",+0111.0,"NL" ,"-"   ,"afs Amersfoort"                 ,"Enschede"
 1, 3,19800616,"09:58:00",+078048.0,19800616,"10:28:00",+078060.0,"00:18:00",+0040.0,"NL" ,"-"   ,"Enschede"                       ,"Enschede"
 1, 4,19800616,"10:49:00",+001311.0,19800616,"11:33:00",+001342.0,"00:37:00",+0050.3,"D"  ,"-"   ,"Enschede"                       ,"Nordhorn"
 1, 5,19800616,"12:01:00",+074721.0,19800616,"12:17:00",+074740.0,"00:16:00",+0071.3,"D"  ,"-"   ,"Nordhorn"                       ,"Lingen"
 1, 6,19800616,"12:44:00",+069636.0,19800616,"14:58:00",+069869.0,"02:00:00",+0116.5,"D"  ,"-"   ,"Lingen"                         ,"Rastst„tte Hamburg Stillhorn"
and I only tell OO Calc 3.2.1 that
  • the 19800616 columns contain dates in YMD format,
  • the separator is a comma
  • the data is in ASCII codepage 437 format
It imports correctly, cols 1 & 2 show as numbers (despite the leading space), and the times (cols 4, 7 and 9) show as times.

Then, earlier this week I upgraded to OO 3.3.0, and did the same...

Problems?

Cols 1 & 2 import as text, and so do the time (4, 7 & 9) columns.

A change of ' ' to '' applied to cols 1 & 2 turns them into numbers, nice (but fundamentally wrong), but I've been unable to find any method of turning cols 4,7 & 9 into times.

As a result, I was forced to completely uninstall 3.3.0 followed by a reinstall of 3.2.1. My company uses lots of CSV files generated by other applications (some of them run on IBM's z/OS!), so, to express myself very politely, we are not very happy about this...

Can anyone what I'm doing wrong reading these CSV files?
Last edited by prino on Fri Feb 18, 2011 10:57 pm, edited 1 time in total.
Robert AH Prins
robert.ah.prins @ the 7Gb Google thingy

OpenOffice 3.3.0 on Windows XP Pro SP3
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Import CSV into OO Calc 3.3.0 vs 3.2.1

Post by Villeroy »

This csv has been exported using the wrong character encoding, quoted numerals and unquoted text.
Nevertheless I can easily import the data with comma as column delimiter, a German locale or any other locale where the point is not the decimal delimiter, double-quote as text delimiter, the YMD date in column 3, "Quoted fields as text"=OFF and "Detect special numbers"=ON.
Due to the wrong export encoding it is impossible to get the German umlauts correctly.
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
prino
Posts: 6
Joined: Sat Sep 18, 2010 11:21 am
Location: Oostende, Belgium

Re: Import CSV into OO Calc 3.3.0 vs 3.2.1

Post by prino »

Villeroy wrote:This csv has been exported using the wrong character encoding, quoted numerals and unquoted text.
I don't see quoted numbers, or unquoted text, and the file is created on z/OS and FTP'ed to the PC, resulting in a PC file with ASCII code page 437 - FWIW, the code page is not an issue!
Villeroy wrote:Nevertheless I can easily import the data with comma as column delimiter, a German locale or any other locale where the point is not the decimal delimiter, double-quote as text delimiter, the YMD date in column 3, "Quoted fields as text"=OFF and "Detect special numbers"=ON.
The file imports correctly with OO Calc 3.2.1:

Character set: Western Europe (DOS/OS2-437/US)
Separated by: Comma
Text delimiter: "

And columns 3 and 6 changed to Date (YMD)

That is all!

For what it's worth, the ä in Raststätte is mangled due to the Cut & Paste into the posting window, if I had bet € 1,000 that someone would start talking about it, rather than the real issue, i.e. the fact that OO Calc 3.3.0 does not import the times correctly nor the space-preceded numbers, I would have won my bet...
Villeroy wrote:Due to the wrong export encoding it is impossible to get the German umlauts correctly.
Code page is not the issue!
Robert AH Prins
robert.ah.prins @ the 7Gb Google thingy

OpenOffice 3.3.0 on Windows XP Pro SP3
User avatar
Villeroy
Volunteer
Posts: 31279
Joined: Mon Oct 08, 2007 1:35 am
Location: Germany

Re: Import CSV into OO Calc 3.3.0 vs 3.2.1

Post by Villeroy »

The times are quoted numerals whereas the phone "numbers" should be quoted text, I don't know about the meaning of column J. Is +0038.4 supposed to be the decimal number 38.4?
Another thing is strange with that file: Although it is comma separated, it has fixed widths as well. The city names are space padded. Each row has the same amount of characters.

Your import in 3.2. just happens to work because you use a comma locale. It will fail when your locale is set to something English unless you declare the phone numbers explicitly as text. Try Tools>Options...LanguageSettings>Languages: Locale = English before you import.
The file imports correctly with OO Calc 3.2.1:

Character set: Western Europe (DOS/OS2-437/US)
Separated by: Comma
Text delimiter: "

And columns 3 and 6 changed to Date (YMD)
Using the exact same settings I get the same results in version 3.3 and apart from the individual column settings (YMD) my settings are stored for the next import.
I don't get the unquoted integers in A and B as text unless I declare them as text.

New options make a difference:
- You can specify the locale for the whole import process so you can keep the column settings on "Standard" in most cases but not in this particular case. Dutch(Belgium) is OK since it will not treat the points as decimal separators. Column type "Date(YMD)" works as in older versions. In any version you may use some English locale (point decimals) and mark the columns with points explicitly as column type "Text".
- You can specify if quotes around numerals should be ignored (as in previous versions) or if quoted numerals should be imported as text (which is the purpose of quotes around text). In this case you want the quotes around times to be ignored but for many users this is a huge improvement because they do not need to mark their text columns twice (double-quotes and import setting).
- you can specify if "special numbers" (date, time, currency, percent, scientific, boolean numerals) should be detected or not. This is a major improvement for all those who complained about sport results interpreted as dates. If you want the times to be imported as times you want them to be interpreted.

After formatting the dates, removing all trailing space and substituting the broken "ä" I get this perfectly valid csv:

Code: Select all

1,1,1980-06-16,07:47:00,"+041051.0",1980-06-16,07:52:00,"+041054.2",00:05:00,"+0038.4","NL","V","De Bilt","afs Amersfoort"
1,2,1980-06-16,08:12:00,"+033268.0",1980-06-16,09:25:00,"+033403.0",01:13:00,"+0111.0","NL","-","afs Amersfoort","Enschede"
1,3,1980-06-16,09:58:00,"+078048.0",1980-06-16,10:28:00,"+078060.0",00:18:00,"+0040.0","NL","-","Enschede","Enschede"
1,4,1980-06-16,10:49:00,"+001311.0",1980-06-16,11:33:00,"+001342.0",00:37:00,"+0050.3","D","-","Enschede","Nordhorn"
1,5,1980-06-16,12:01:00,"+074721.0",1980-06-16,12:17:00,"+074740.0",00:16:00,"+0071.3","D","-","Nordhorn","Lingen"
1,6,1980-06-16,12:44:00,"+069636.0",1980-06-16,14:58:00,"+069869.0",02:00:00,"+0116.5","D","-","Lingen","Raststätte Hamburg Stillhorn"
Please, edit this topic's initial post and add "[Solved]" to the subject line if your problem has been solved.
Ubuntu 18.04 with LibreOffice 6.0, latest OpenOffice and LibreOffice
User avatar
prino
Posts: 6
Joined: Sat Sep 18, 2010 11:21 am
Location: Oostende, Belgium

Re: Import CSV into OO Calc 3.3.0 vs 3.2.1

Post by prino »

Villeroy wrote:The times are quoted numerals whereas the phone "numbers" should be quoted text, I don't know about the meaning of column J. Is +0038.4 supposed to be the decimal number 38.4?
Another thing is strange with that file: Although it is comma separated, it has fixed widths as well. The city names are space padded. Each row has the same amount of characters.
That's because it's generated with PL/I on z/OS. Generating fixed-layout records is far easier than shifting everything together. They are also a lot more readable. :mrgreen:
Villeroy wrote:Your import in 3.2. just happens to work because you use a comma locale. It will fail when your locale is set to something English unless you declare the phone numbers explicitly as text. Try Tools>Options...LanguageSettings>Languages: Locale = English before you import.
The file imports correctly with OO Calc 3.2.1:

Character set: Western Europe (DOS/OS2-437/US)
Separated by: Comma
Text delimiter: "

And columns 3 and 6 changed to Date (YMD)
Using the exact same settings I get the same results in version 3.3 and apart from the individual column settings (YMD) my settings are stored for the next import.
I don't get the unquoted integers in A and B as text unless I declare them as text.

New options make a difference:
- You can specify the locale for the whole import process so you can keep the column settings on "Standard" in most cases but not in this particular case. Dutch(Belgium) is OK since it will not treat the points as decimal separators. Column type "Date(YMD)" works as in older versions. In any version you may use some English locale (point decimals) and mark the columns with points explicitly as column type "Text".
- You can specify if quotes around numerals should be ignored (as in previous versions) or if quoted numerals should be imported as text (which is the purpose of quotes around text). In this case you want the quotes around times to be ignored but for many users this is a huge improvement because they do not need to mark their text columns twice (double-quotes and import setting).
- you can specify if "special numbers" (date, time, currency, percent, scientific, boolean numerals) should be detected or not. This is a major improvement for all those who complained about sport results interpreted as dates. If you want the times to be imported as times you want them to be interpreted.
"Detect special numbers"=ON is the key and selecting it solves the problem, nice that I don't have to go through the uninstall/re-install rigmarole... :D
Robert AH Prins
robert.ah.prins @ the 7Gb Google thingy

OpenOffice 3.3.0 on Windows XP Pro SP3
Post Reply