What character encoding does ODT use?

3

1

Whenever I save an ODT file from LibreOffice with default settings and then open it in a plain text editor, I get a bunch of nonsense. There is not a single trace of the text I typed.

In what character encoding is ODT?

stommestack

Posted 2013-07-09T11:44:10.350

Reputation: 411

Answers

6

ODT files are compressed archives (ZIP files) containing several files. If you decompress (unzip) one, you will find its contents.

$ unzip foo.odt
Archive:  foo.odt
extracting: mimetype                
 inflating: Pictures/2000006A00000BF2000000FE05FAFF99.svm  
 inflating: Pictures/200000B200000BFC000002A0046833A0.svm  
 inflating: Pictures/200000EA00000F69000002A0332EA603.svm  
 inflating: Pictures/2000006200000AA500000136B263D3BB.svm  
 inflating: Pictures/2000006A00000AD2000001917B356A9B.svm  
 inflating: Pictures/2000005A00000A5A0000017F19B0D1EE.svm  
 inflating: Pictures/2000009200001036000001309FA6D695.svm  
 inflating: Pictures/2000008E00001C330000034F3E642D92.svm  
 inflating: Pictures/20000092000011AD0000014269F7C132.svm  
 inflating: Pictures/200000890000107E000002A0D7E80C67.svm  
 inflating: Pictures/200000E200001295000002A0E7D552DC.svm  
 inflating: Pictures/200000B2000012AB000001712E0D7F4B.svm  
 inflating: Pictures/200000CF00001D390000034F15D09B76.svm  
 inflating: Pictures/2000013A000019BA0000042370CD253A.svm  
 inflating: Pictures/2000007A00000C6300000136B0155364.svm  
 inflating: meta.xml                
 inflating: settings.xml            
 inflating: content.xml             
extracting: Thumbnails/thumbnail.png  
 inflating: layout-cache            
 inflating: manifest.rdf            
  creating: Configurations2/images/Bitmaps/
  creating: Configurations2/popupmenu/
  creating: Configurations2/toolpanel/
  creating: Configurations2/statusbar/
  creating: Configurations2/progressbar/
  creating: Configurations2/toolbar/
  creating: Configurations2/floater/
  creating: Configurations2/menubar/
 inflating: Configurations2/accelerator/current.xml  
 inflating: styles.xml              
 inflating: META-INF/manifest.xml 

Thus, when you open the ODT file itself in a text editor, it appears as binary data: what you are looking at is the compressed archive, not encoded text!
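
If you would rather check the encoding than guess, here is a minimal Python sketch that pulls content.xml (one of the files in the listing above) out of the archive and reads the encoding named in its XML declaration. The helper names `read_member` and `declared_encoding` are made up for this example. I am assuming the declaration sits at the very start of the file; if there is none, the code falls back to UTF-8, which is the XML default.

```python
import re
import zipfile


def read_member(odt_path, member="content.xml"):
    """Return the raw bytes of one file stored inside the ODT (ZIP) archive."""
    with zipfile.ZipFile(odt_path) as archive:
        return archive.read(member)


def declared_encoding(xml_bytes, default="UTF-8"):
    """Return the encoding named in the XML declaration, else the default."""
    # Only looks at a declaration at the very start of the data (assumption).
    match = re.match(
        rb'<\?xml[^>]*?encoding=["\']([A-Za-z0-9._-]+)["\']', xml_bytes
    )
    return match.group(1).decode("ascii") if match else default
```

Calling `declared_encoding(read_member("foo.odt"))` on your own file tells you what the producer declared. I would expect UTF-8 from LibreOffice, but check rather than take my word for it.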

user235731

Posted 2013-07-09T11:44:10.350

Reputation:

It's worth noting that you can open content.xml, which contains the actual text of the document; I would expect that to be encoded using one of the UTF encodings, most likely UTF-8. Of course, it will have a bunch of other things thrown in as well, but even if you just naively strip out the XML you should come pretty close. – a CVn – 2014-06-05T08:59:04.260
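
A rough sketch of that "strip out the XML" idea in Python, using only the standard library. The function name `odt_paragraphs` is hypothetical, and the approach is deliberately naive: it collects the text inside ODF paragraph (`text:p`) and heading (`text:h`) elements and ignores everything else. That means tabs and runs of spaces stored as separate elements are lost, and a paragraph nested inside another (for example in a footnote) shows up twice.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by ODF for text elements such as text:p and text:h.
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"


def odt_paragraphs(odt_path):
    """Return the plain text of each paragraph and heading in content.xml."""
    with zipfile.ZipFile(odt_path) as archive:
        # Parsing from bytes lets the XML parser honour the declared encoding.
        root = ET.fromstring(archive.read("content.xml"))
    wanted = {f"{{{TEXT_NS}}}p", f"{{{TEXT_NS}}}h"}
    return [
        "".join(element.itertext())
        for element in root.iter()
        if element.tag in wanted
    ]
```

Something like `print("\n".join(odt_paragraphs("foo.odt")))` should get you pretty close to the text you typed, minus the formatting.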