[Solved] Runtime error when importing custom Word dictionary
I have a problem using the macro ImportExportDictionary1-1.sxw to import a Word 2003 CustomWord.dic in Unicode into an OO3.2.1 New User Dictionary called CustomOOo.
The macro runs through a few hundred words and then gives a "BASIC runtime error, Reading exceeds EOF." I then cannot stop the macro, which means I cannot close the file either, because BASIC is still running. Ctrl+Alt+Del gave a "programme not responding" and let me out. After recovering the .odt files I see that nothing was added to the CustomOOo dictionary.
In this forum, “Importing User Dictionary from Microsoft Word” by Avraham in July reported the same problem, but in that case the words were imported despite the error message, while here the words are not imported.
I have near zero programming knowledge and I would appreciate assistance to get the macro to work.
Last edited by gerald682 on Mon Nov 15, 2010 8:39 pm, edited 1 time in total.
Open Office 3.2.1 Windows XP Pro v5.1 SP3
Re: Runtime error when importing custom Word dictionary
Hi,
as far as I can see, there seems to be a limit of words for the macro.
Edit: ...but it's presumably another problem... Now I could easily import a *.dic (text file, more than 2200 words/lines) with that macro.
An easy alternative solution:
Use the "OOoUserDict1" format instead of the macro and the "WBSWG6" format.
(My test with a MS Word 2000 CUSTOM.DIC [> 250 words] works well:
- convert CUSTOM.DIC to UTF-8,
- add the encoding lines [1],
- rename the dictionary, and paste it in OOo 3.2.1 user/wordbook)
[1] e.g.:
Code:
OOoUserDict1
lang: en-US
type: positive
---
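The header lines above could also be written by a small script rather than by hand. What follows is only a minimal Python sketch, assuming the four header lines shown in this post are all the OOoUserDict1 format needs; `build_user_dict` and its parameters are hypothetical names, not part of OpenOffice.

```python
def build_user_dict(words, lang="en-US", dict_type="positive"):
    """Return dictionary text: the four header lines from the post, then one word per line."""
    header = ["OOoUserDict1", f"lang: {lang}", f"type: {dict_type}", "---"]
    # Drop blank entries and surrounding whitespace; keep the original order.
    entries = [w.strip() for w in words if w.strip()]
    return "\n".join(header + entries) + "\n"


text = build_user_dict(["kūkupa", "", " Ūpua "])
```

Writing `text` with `open(path, "w", encoding="utf-8", newline="\n")` produces UTF-8 without a BOM, because Python's plain `utf-8` codec does not emit one.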
See also:
→ Re: Creating and mass populating Custom Dictionary – part (2)
→ Re: Add button in spell checker doesn't work – EDIT: [2010-08-22]
→ Issue 106032: linguistic: make human-readable user-dicts the default format ?
→ Issue 60698: user-dict format re-work ...
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
Thanks Franx.
I followed your first set of instructions without success. Maybe they were too cryptic for a beginner.
Anyway, today I saw your edit saying the macro worked for you with over 2000 entries, which encouraged me to look at my word list as the source of the macro problem.
There were various symbols, like * and [, so I removed them.
And, fantastic, the macro added my 6221 words.
Unfortunately when I tested it on the word kukupa which should be kūkupa it gave an error and no suggestion.
And, when I opened the dictionary it made no sense: I use NZ English and the dictionary was in UTF-8 after editing and saving with NotePad before I ran the macro.
I would appreciate your further advice.
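For anyone wanting to find such symbols before running the macro, here is a minimal Python sketch. It assumes the word list has already been read into a list of strings; `find_suspect_lines` is a hypothetical helper, and the default characters `*` and `[` are simply the two named above. That these characters cause the EOF error is what was observed here, not a confirmed diagnosis.

```python
def find_suspect_lines(lines, suspect_chars="*["):
    """Return (line_number, text) pairs for entries containing any suspect character."""
    return [
        (number, line)
        for number, line in enumerate(lines, start=1)
        if any(ch in line for ch in suspect_chars)
    ]


suspects = find_suspect_lines(["kūkupa", "Ake*", "[note]", "‘Ara"])
```

The glottal character ‘ is deliberately not in the default set, since it is part of the words themselves.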
Open Office 3.2.1 Windows XP Pro v5.1 SP3
Re: Runtime error when importing custom Word dictionary
gerald682 wrote: [...] I use NZ English and the dictionary was in UTF-8 after editing and saving with NotePad before I ran the macro.
I would appreciate your further advice.
That looks like "mojibake". [1]
Could you upload the edited "Word 2003 CustomWord.dic", or send it to me via PM?
I would test it with both,
- the macro (WBSWG6 format)
- and the "alternative" in OOoUserDict1 format.
[1] <http://en.wikipedia.org/wiki/Mojibake>
EDIT:
A similar example (I applied the import with the macro twice):
(1) Text file UTF-8 (2) Text file "ANSI"
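To illustrate what mojibake means here, this is a minimal Python sketch. It assumes the UTF-8 bytes are misread as Windows-1252 (the usual "ANSI" code page on a Western-language Windows XP); whether that is exactly what happened inside the macro is an assumption.

```python
word = "kūkupa"
utf8_bytes = word.encode("utf-8")  # ū becomes the two bytes C5 AB

# Misreading those UTF-8 bytes as Windows-1252 turns ū into two unrelated characters.
garbled = utf8_bytes.decode("cp1252")

# Undoing the wrong step recovers the original text, as long as nothing was lost on the way.
restored = garbled.encode("cp1252").decode("utf-8")
```

This is why a dictionary can "make no sense" when opened even though the source file was saved correctly as UTF-8.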
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
Hi Franx.
I tried to send it via the PM system but .dic, .txt not acceptable and .jpg didn't work because dimensions could not be determined.
I mentioned I was a beginner!
And trying to send it as an attachment with this message - same problems.
How do I send it?
Open Office 3.2.1 Windows XP Pro v5.1 SP3
Re: Runtime error when importing custom Word dictionary
Please see my private message...
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
Hi Franx
Dictionary file attached. .dic renamed to .zip
Attachment: MaoriSciNamesUTF8.zip (66.37 KiB)
Open Office 3.2.1 Windows XP Pro v5.1 SP3
Re: Runtime error when importing custom Word dictionary
Thanks. See you later!
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
First try
OOoUserDict1 format
(1)
- Open MaoriSciNamesUTF8.zip with a text editor (Notepad++)
- Add the four lines at the beginning
Code:
OOoUserDict1
lang: <none>
type: positive
---
‘Ake
‘Ake‘ake
‘Aketa
‘Ange
‘Ano
‘Apūka
‘Ara
‘Arapītia
‘Atukura
...
(2)
- Convert to UTF-8 without BOM (= ANSI as UTF-8)
- Save as Maori_1.dic (or anything.dic)
(3)
- Paste Maori_1.dic in ...user\wordbook
- Open OOo, then Tools > Options > ... > Writing Aids
- activate/tick Maori_1.dic
(4) Try it (and fix the issue)
[Rename Maori_1.zip to Maori_1.dic] ↓
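The "convert to UTF-8 without BOM" step can also be done outside Notepad++. Below is a minimal Python sketch, assuming the source file is either UTF-16 with a BOM (which is how a Word "Unicode" .dic is typically saved) or UTF-8 with or without a BOM; `to_utf8_without_bom` is a hypothetical helper name.

```python
import codecs


def to_utf8_without_bom(data: bytes) -> bytes:
    """Decode UTF-16 (with BOM) or UTF-8 (with or without BOM) and re-encode as plain UTF-8."""
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        text = data.decode("utf-16")     # the BOM selects the byte order and is dropped
    else:
        text = data.decode("utf-8-sig")  # drops a UTF-8 BOM if one is present
    return text.encode("utf-8")          # the plain utf-8 codec never writes a BOM


with_bom = codecs.BOM_UTF8 + "Ūpua".encode("utf-8")
clean = to_utf8_without_bom(with_bom)
```

Reading the file with `open(path, "rb").read()` and writing the result back with `open(new_path, "wb")` keeps the line endings untouched.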
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
↓
(worse results--for comparison only)
Macro (WBSWG6 format)
[Removed]
sorry--wrong attachments...
[New]
Original MaoriSciNamesUTF8.zip (UTF-8) → converted into text file (UTF-8 without BOM)
→ imported with macro: Maori_2ub.dic → renamed to Maori_2ub.zip
Original MaoriSciNamesUTF8.zip (UTF-8) → converted into text file (ANSI)
→ imported with macro: Maori_3a.dic → renamed to Maori_3a.zip
Sample:
Looks good--but doesn't work...
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
Second Try
Maybe a workaround...
Are you sure that you are using the correct character for: ‘ (e.g. in ‘Ake‘ake)?
The problem with these "words" is that this char ‘
is not treated as part of the word.
Maybe it's incorrect--but I've replaced it now with the Unicode char
U+02BB MODIFIER LETTER TURNED COMMA = ʻ
from the Unicode block "Spacing Modifier Letters" (e.g.: ʻAkeʻake).
You can see--with the help of the red wavy lines from spell check--
that this char is now treated as part of the word.
Example 1
Then I've removed the older version of these words from the dictionary (Maori_1.dic).
Afterwards, I've added the new version (with U+02BB) to the dictionary--
via right-click > context menu.
Example 2
The improved(?) dictionary (OOoUserDict1 format):
Download:
→ test_playground.odt
<https://docs.google.com/leaf?id=0B0EPDe ... ist&num=50>
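The replacement described in this post was done by hand; for a long word list it could be scripted. Here is a minimal Python sketch, assuming every U+2018 in the word list stands for the glottal; `use_modifier_letter` is a hypothetical helper name.

```python
LEFT_SINGLE_QUOTE = "\u2018"      # ‘ LEFT SINGLE QUOTATION MARK
MODIFIER_TURNED_COMMA = "\u02bb"  # ʻ MODIFIER LETTER TURNED COMMA


def use_modifier_letter(word: str) -> str:
    """Replace every U+2018 in a dictionary entry with U+02BB."""
    return word.replace(LEFT_SINGLE_QUOTE, MODIFIER_TURNED_COMMA)


converted = use_modifier_letter("‘Ake‘ake")
```

This is safe for a word list only. In running text a blanket replacement would also change genuine opening quotation marks.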
Last edited by franx on Mon Nov 15, 2010 6:34 pm, edited 1 time in total.
LibreOffice 4.0.4 · WinXP
Re: Runtime error when importing custom Word dictionary
Franx, you have done a fantastic job, including the discovery that if the unknown chr (‘) was turned into U+02BB things worked better. Now you have mentioned U+02BB my comments below deal with both your proposed solutions and the U+2018/U+02BB problem in Polynesian languages - maybe this should move to another forum, but I leave that up to you.
I should have realized that this import problem would drag me straight back into the handling of the glottal/hamsah in Polynesian languages. The glottal is used in Hawaiian (called ‘okina), Tahitian, Tongan, Samoan and Cook Islands Maori (what I am dealing with), but is not required in NZ Maori (a.k.a. Maori). By the way, I neither speak nor write CK Maori, but I do strive to record the CK Maori names for plants and animals in a modern orthography.
U+2018 (Left Single Quotation Mark) is widely used for glottals because it is in most common fonts and it is more stable than using an ordinary quotation mark in which 'ava'ava is smart quoted into ‘ava’ava. Its main (only?) problem is that custom dictionaries do not accept it as the first letter of a word and they treat words with it inside as two or more words.
U+02BB (Modifier Letter Turned Comma) is the real Unicode glottal and spellers recognize it as a letter rather than a quotation mark. The problem is that U+02BB is in only a couple of little used fonts with XP. With Vista things improved in that v.5.01 (2006) Arial and Times New Roman both contain U+02BB, although it was still absent from most fonts including the new Calibri etc. And if a file in Vista's Arial with ‘Ava‘ava was copied into an XP system it becomes □Ava□ava.
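The difference between the two characters shows up in their Unicode properties. The following is a minimal Python sketch using the standard `unicodedata` module. That OpenOffice's word-boundary logic relies on these same categories is an assumption, but it is consistent with the behaviour described above.

```python
import unicodedata

# Collect the official name, general category and "is a letter" flag for both characters.
info = {
    f"U+{ord(ch):04X}": (unicodedata.name(ch), unicodedata.category(ch), ch.isalpha())
    for ch in ("\u2018", "\u02bb")
}

for code_point, (name, category, is_letter) in info.items():
    print(code_point, name, category, is_letter)
```

U+2018 has category `Pi` (initial quote punctuation), while U+02BB has category `Lm` (modifier letter), so only the latter counts as a letter.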
Thus my Cook Islands Maori dictionary of names still has the glottal as U+2018, although when U+02BB is more generally available I will change. Generally all systems handle the macron vowels: āēīōū ĀĒĪŌŪ (although, surprisingly, your Maori_3a lost them).
Back to your various solutions. Maori_2ub and Maori_3a as you noticed do not work - the former is a mess and marked everything with a macron or a U+2018 as misspelt; while the Maori_3a handled within 2018s OK, ignored initial 2018s, and as I note above this solution lost all the macrons.
Solution Maori_1 (for text using U+2018).
This was the Word .dic converted to UTF-8 without BOM.
Macrons: responds to missing macrons (within words and on initial letter) and lists the correct word, eg. Upoa lists Ūpua
Within 2018: responds to missing 2018s and lists correct word, eg. akaoa listed Akao‘a
Initial 2018: marks all words with initial 2018 as incorrect, and for missing initial 2018s such as Ārorangi it correctly lists ‘Ārorangi (which, of course, it continues to mark as incorrect). Not 100% but the suggestion list gives good information to reflect upon and will correct missing initial 2018s.
This is the dic I will be using with Open Office until the day I change to U+02BB.
Solution Maori_1b (for text using U+02BB)
Macrons: as for Maori_1
Within 02BB: as with Maori_1
Initial 02BB: this is where this solution shines. All correct words are shown as correct, and (as for Maori_1) when an initial glottal is missing it lists the correct word.
This solution will be a great help to those working in Polynesia who have already changed to chr U+02BB - and it is certainly a good reason for me to think more seriously about abandoning U+2018 and changing to U+02BB.
Meitaki ma‘ata
Open Office 3.2.1 Windows XP Pro v5.1 SP3
Re: Runtime error when importing custom Word dictionary
Hi Gerald,
thanks for your detailed and helpful feedback and the enlightening explanation.
All the best for your work and the Cook Islands Maori dictionary of names--
rā mānea
LibreOffice 4.0.4 · WinXP