[Solved] LibreOffice File format error found at SAXParse

Help with installation and general system troubleshooting questions concerning the office suite LibreOffice.

Postby markhand » Sun Jan 08, 2017 10:21 pm

Hi,

I'm desperately hoping someone can help. My partner needs to submit the attached in the next 12 hours but on her final save the file has corrupted. We'd be so grateful if someone could take a look and repair. It's a docx file... we have now absorbed the advice from previous posts about saving in odt format.

The error is:
File format error found at
SAXParseException: '[word/document.xml line 2]: Attribute w:eastAsiaTheme redefined
', Stream 'word/document.xml', Line 2, Column 11298(row,col).

Thanks in advance,
Mark
Last edited by Hagar Delest on Mon Jan 09, 2017 9:01 am, edited 2 times in total.
Reason: tagged [Solved].
Libre Office 5.2.0.4 Windows 10 Home 1607
markhand
 
Posts: 3
Joined: Sun Jan 08, 2017 10:11 pm

Re: LibreOffice File format error found at SAXParse

Postby Hagar Delest » Sun Jan 08, 2017 10:41 pm

Hi and welcome to the forum!

I opened the document.xml file with XML Copy Editor and deleted a style just after the VOICE 1 string, where the file seemed to be cut off.
Only then could I format the file with an XML schema, and that seems to have recovered the text. I hope only the formatting was lost.
LibreOffice 7.0.3 on Xubuntu 20.10 and LibreOffice 7.1 (portable) on Windows 10.
Hagar Delest
Moderator
 
Posts: 29445
Joined: Sun Oct 07, 2007 9:07 pm
Location: France

Re: [Solved] LibreOffice File format error found at SAXParse

Postby RoryOF » Sun Jan 08, 2017 10:47 pm

Try the attached file and check that the formatting is as desired. I did not have to remove any text, only a duplicate formatting command.
Apache OpenOffice 4.1.9 on Xubuntu 20.04.1 (mostly 64 bit version) and very infrequently on Win2K/XP
RoryOF
Moderator
 
Posts: 32202
Joined: Sat Jan 31, 2009 9:30 pm
Location: Ireland

Re: LibreOffice File format error found at SAXParse

Postby Hagar Delest » Sun Jan 08, 2017 10:50 pm

Rory's file is much better than mine!

Re: LibreOffice File format error found at SAXParse

Postby RoryOF » Sun Jan 08, 2017 10:53 pm

I was able to find the duplicated formatting command and removed it.

Re: LibreOffice File format error found at SAXParse

Postby RoryOF » Sun Jan 08, 2017 10:57 pm

If markhand's partner is writing another play, note that there is a radio play formatting template at
http://extensions.openoffice.org/en/project/screenwrightr-a4-radio-script-formatting-template

Re: LibreOffice File format error found at SAXParse

Postby Hagar Delest » Sun Jan 08, 2017 10:58 pm

Strange. I first tried to format the XML file, but the editor couldn't do it. I then deleted something after the Voice 1 string and it did format the file afterwards, without the error about the duplicate formatting. I assumed it was fine, but your finding was better.

Re: LibreOffice File format error found at SAXParse

Postby RoryOF » Sun Jan 08, 2017 11:07 pm

I opened the archive that is the .docx file and found document.xml (in the internal word folder). I opened this in situ using XML Copy Editor, which reported a format error of duplicate attribute at line 2, column 11950. I deleted that attribute, tested for "well formed" in XML Copy Editor, got another duplicate attribute report at the same location, and deleted that too. This time XML Copy Editor reported that the file was well formed. I saved it and was prompted to update it in the .docx archive. The updated .docx archive was what I uploaded.
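
For anyone without an editor that updates the archive in place, the last step (putting a repaired document.xml back into the .docx) can be scripted. This is only a minimal sketch using Python's standard zipfile module; the function name replace_part and the file names in the usage note are hypothetical, and it assumes you already have the corrected XML as bytes.

```python
import zipfile


def replace_part(src_docx, dst_docx, part, new_bytes):
    """Copy src_docx to dst_docx, substituting new_bytes for one member.

    zipfile cannot overwrite a member in place, so every member is copied
    into a new archive. src_docx and dst_docx must be different paths.
    """
    with zipfile.ZipFile(src_docx) as src, zipfile.ZipFile(dst_docx, "w") as dst:
        for info in src.infolist():
            # Reuse the original ZipInfo so each member keeps its name,
            # timestamp and compression method; only the payload changes.
            data = new_bytes if info.filename == part else src.read(info.filename)
            dst.writestr(info, data)
```

Usage would look like replace_part("play.docx", "play-repaired.docx", "word/document.xml", fixed_xml), which leaves the damaged original untouched in case the repair needs another attempt.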

XML Copy Editor is OK, but its diagnostic messages are not as explicit as they might be. I would welcome a pointer to a more explicit XML analyser that runs on Linux, preferably with a GUI, as this is easier for file repairs.
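
It is not a GUI, but a few lines of Python can give a fairly explicit report, including the text around the fault. The sketch below is one possible approach, not a tested repair tool: it uses the standard zipfile and xml.parsers.expat modules, and the function name diagnose_part and the context parameter are made up for illustration. It stops at the first well-formedness error, so after each fix it would need to be run again (as with the two duplicate attributes above).

```python
import zipfile
from xml.parsers import expat


def diagnose_part(docx_path, part="word/document.xml", context=40):
    """Return None if the part is well formed, else details of the first error."""
    with zipfile.ZipFile(docx_path) as archive:
        data = archive.read(part)

    parser = expat.ParserCreate()
    try:
        parser.Parse(data, True)  # True marks this as the final chunk
    except expat.ExpatError as err:
        # ErrorByteIndex is the byte position of the fault; show the raw
        # bytes around it so the offending attribute can be seen directly.
        pos = parser.ErrorByteIndex
        near = data[max(0, pos - context): pos + context]
        return {
            "message": expat.ErrorString(err.code),
            "line": err.lineno,
            "column": err.offset,
            "near": near.decode("utf-8", "replace"),
        }
    return None
```

Because a .docx usually stores almost the whole of document.xml on one long line, the line number on its own says little; the "near" snippet is the useful part.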

Re: LibreOffice File format error found at SAXParse

Postby markhand » Sun Jan 08, 2017 11:39 pm

Thank you so much Hagar Delest and RoryOF
I have downloaded both files. Tears of joy have been shed and the deadline shall be met.
If it's not too much trouble, I'd be grateful if you could delete the repaired file attachments from your posts.
We'll check out the radio play template.
Many thanks,
Mark

Re: LibreOffice File format error found at SAXParse

Postby RoryOF » Sun Jan 08, 2017 11:45 pm

markhand wrote:Thank you so much Hagar Delest and RoryOF
I have downloaded both files. Tears of joy have been shed and the deadline shall be met.
If it's not too much trouble, I'd be grateful if you could delete the repaired file attachments from your posts.
We'll check out the radio play template.
Many thanks,
Mark


All deleted, including Hagar's.

Re: LibreOffice File format error found at SAXParse

Postby markhand » Mon Jan 09, 2017 12:03 am

Thank you both!

Re: LibreOffice File format error found at SAXParse

Postby John_Ha » Mon Jan 09, 2017 12:19 am

RoryOF wrote:XML Copy Editor is OK, but its diagnostic messages are not as explicit as they might be. I would welcome a pointer to a more explicit XML analyser that ran on linux, preferably running in a GUI as this is easier for file repairs.

There seem to be quite a few when I Google for them. You could try an on-line system like Validate an XML file.

When all else fails, deleting all the XML tags is pretty simple and gets back just the text. Use a regular-expression Find and Replace with search argument <[^>]+> and an empty replace argument. You cannot do it in Writer, as it produces a single paragraph which will often be over the 64k limit, so I use Notepad++ to do it.
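
The same last-resort extraction can be sketched in Python with the regular expression given above. The function name strip_tags and the keep_paragraphs option are hypothetical additions: turning each closing </w:p> tag into a newline before stripping is a small refinement (w:p is the paragraph element in a .docx), so the result is not one enormous paragraph. Character entities such as &amp; are left as they are.

```python
import re

# The search pattern from the post: any run of characters between < and >.
TAG = re.compile(r"<[^>]+>")


def strip_tags(xml_text, keep_paragraphs=True):
    """Remove every XML tag and return only the text between the tags."""
    if keep_paragraphs:
        # End each Word paragraph with a newline before the tags disappear.
        xml_text = xml_text.replace("</w:p>", "\n")
    return TAG.sub("", xml_text)
```

Since this never parses the XML, it also works on a file that is not well formed, which is exactly the situation where it is needed.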
LO 6.4.4.2, Windows 10 Home 64 bit

See the Writer Guide, the Writer FAQ, the Writer Tutorials and Writer for students.

Remember: Always save your Writer files as .odt files. - see here for the many reasons why.
John_Ha
Volunteer
 
Posts: 8219
Joined: Fri Sep 18, 2009 5:51 pm
Location: UK

Re: LibreOffice File format error found at SAXParse

Postby John_Ha » Mon Jan 09, 2017 1:17 am

markhand wrote:we have now absorbed the advice from previous posts about saving in odt format.

That is excellent advice to follow!
