I'm not sure whether I'm doing something wrong or it's a bug.
I want to use the command-line tool odt2txt to convert an odt file, made with LibreOffice Writer, to a text file. However, the line breaks don't seem to be handled correctly. Every single line break is converted to two line breaks, and any run of multiple line breaks is also converted to two line breaks.
If I, for example, save this
This is a test
one line break before this

two line breaks before this


and three line breaks before this
into test.odt with LO Writer, and then do
odt2txt test.odt
I get
This is a test

one line break before this

two line breaks before this

and three line breaks before this
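The result above is consistent with every run of line breaks being collapsed into a single blank-line separator. As a rough illustration only, and not odt2txt's actual code, the following sketch imitates that effect on plain text. The helper name collapse_breaks is hypothetical, and the sample input simply mirrors the test document.

```shell
# Hypothetical helper: imitates the observed effect on plain text.
# Every non-empty line is preceded by one blank line, blank lines in the
# input are dropped, and one blank line is appended at the end.
collapse_breaks() {
    awk 'NF { print ""; print } END { print "" }'
}

# Sample input with one, two and three line breaks between the lines.
printf '%s\n' \
    'This is a test' \
    'one line break before this' \
    '' \
    'two line breaks before this' \
    '' \
    '' \
    'and three line breaks before this' | collapse_breaks
```

Whatever the number of line breaks in the input, the imitation leaves exactly one blank line between the text lines, which is why the original spacing cannot be recovered from such output afterwards.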
Using any of the options hasn't helped me either.
I can't find anything about this on Google, so I wonder if I'm the only one who has this problem.
Update: output from cat -vet output.txt, as requested in a comment
$
This is a test$
$
one line break before this$
$
two line breaks before this$
$
and three line breaks before this$
$
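For comparison, cat -vet marks every line end with $ and would show a carriage return as ^M, so Windows (CRLF) line endings would appear as ^M$. The output above contains only $, which suggests plain Unix line endings. A minimal sketch with made-up sample strings (not real odt2txt output) shows the difference; count_cr_lines is a hypothetical helper for checking a file such as output.txt.

```shell
# Sample text with Windows (CRLF) line endings: each line ends in ^M$.
printf 'first\r\nsecond\r\n' | cat -vet

# The same text with Unix (LF) line endings: each line ends in $ only.
printf 'first\nsecond\n' | cat -vet

# Hypothetical helper: count the lines of a file that contain a
# carriage return. A count of 0 means no Windows line endings.
count_cr_lines() {
    grep -c "$(printf '\r')" "$1"
}
```

Running count_cr_lines output.txt on the converted file would be one way to rule out Windows line endings without reading the cat -vet output by eye.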
Test your output with cat -vet output.txt. If you see ^M$ at the end of each line, either use dos2unix output.txt or look more closely at the documentation for odt2txt to see if there is an option to create Unix/Linux line endings or to turn off Windows processing. It might be good to check your original file for ^M$ too, and then you'll know the source. – shellter – 2017-01-13T22:39:40.573
I have added the output of cat -vet output.txt to the question. I don't think it is Windows processing, though, since it's not just a doubling of every line ending. Everything gets turned into two (i.e. three or more are also turned into two). (Sorry for my late reaction, btw. I was away without internet for the weekend.) – None – 2017-01-16T11:44:37.233
If you use Save As and select Text Document from within Writer itself, the formatting appears to be as you would want, as does selecting all, copying and pasting into a text editor. – AFH – 2017-01-16T12:50:09.180
@AFH, yes I know. But I need to do it from the command line. I want to run a script of commands on the text I'm writing, but I want to keep being able to edit the text in LO. I want to be able to keep using the markup tools (not needed in the final text, but they help me with structuring). And always saving as text or copying into a text file, as opposed to just saving, is inefficient. – Lu Kas – 2017-01-16T13:10:56.630
I have found a work-around by now, though. Now I just save to .docx and use docx2txt. docx2txt seems to give the correct behaviour. So it is not really a problem for me anymore, but it still seems to me that odt2txt has a bug. Or I'm doing something wrong ... – Lu Kas – 2017-01-16T13:15:04.733
After saving as text, you can still revert to editing the original ODF file. But I agree that you have probably found a bug, unless some of the conversion options will change the handling for repeated new-lines. Glad you found a solution. You should submit it as an answer, for the benefit of others. – AFH – 2017-01-16T13:29:48.947
So should I report this somewhere? I don't really know how to. Or is nobody really interested? – Lu Kas – 2017-01-16T14:35:08.937