I am experiencing strange behaviour that I don't know how to solve. Here is the scenario:
- From a Python script I fetch JSON from a simple application hosted on Parse.
- Once I have the text, I extract a sentence from it and save it to a local "txt" file encoded as ISO-8859-15.
- Finally I send it to a text-to-speech processor, which expects ISO-8859-15 input.
The weird thing is that after the Python script has run, if I run
file my_file.txt
The output is:
my_file.txt: ASCII text, with no line terminators
But if I open my_file.txt with vim, remove the last "dot" of the sentence, type it again, and save the file, then run again:
file my_file.txt
now the output is:
my_file.txt: ASCII text
This solves some problems in the voice synthesizer. So how can I force this behaviour automatically, without the vim step? I have also tried iconv many times, with no success.
Any help would be much appreciated.
Edit:
pi@raspberrypi ~/main $ hexdump -C my_file.txt
00000000 73 61 6d 70 6c 65 20 61 6e 73 77 65 72 2e 2e |sample answer..|
0000000f
pi@raspberrypi ~/main $ file my_file.txt
my_file.txt: ASCII text, with no line terminators
pi@raspberrypi ~/main $ vim my_file.txt
pi@raspberrypi ~/main $ file my_file.txt
my_file.txt: ASCII text
pi@raspberrypi ~/main $ hexdump -C my_file.txt
00000000 73 61 6d 70 6c 65 20 61 6e 73 77 65 72 2e 2e 0a |sample answer...|
00000010
Python code:
import json,httplib
from random import randint
import codecs
connection = httplib.HTTPSConnection('api.parse.com', 443)
connection.connect()
connection.request('GET', '/1/classes/XXXX', '', {
    "X-Parse-Application-Id": "xxxx",
    "X-Parse-REST-API-Key": "xxxx"
})
result = json.loads(connection.getresponse().read())
pos = randint(0,len(result['results'])-1)
sentence = result['results'][pos]['sentence'].encode('iso-8859-15')
response = result['results'][pos]['response'].encode('iso-8859-15')
text_file = codecs.open("sentence.txt", "w","ISO-8859-15")
text_file.write("%s" % sentence)
text_file.close()
text_file = open("response.txt","w")
text_file.write("%s" % response)
text_file.close()
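A minimal sketch of one way to get the terminator without vim, assuming the synthesizer only needs the file to end with a 0x0a byte: write the newline explicitly from the script. The helper name write_line and the sample string are hypothetical, and io.open is swapped in for codecs.open here so that the text is encoded only once, at write time.

```python
import io

def write_line(path, text):
    """Write `text` (a unicode string) plus a trailing newline as ISO-8859-15."""
    # newline="\n" keeps the terminator a single 0x0a byte on every platform.
    with io.open(path, "w", encoding="iso-8859-15", newline="\n") as f:
        f.write(text + u"\n")

# Hypothetical stand-in for result['results'][pos]['sentence']
write_line("sentence.txt", u"sample answer..")

# Read the raw bytes back to confirm the file now ends with 0x0a.
with open("sentence.txt", "rb") as f:
    print(f.read().endswith(b"\n"))
```

With the newline written this way, the hexdump of the file should end in 0a, as it does after saving from vim.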
Can you upload the file with no line terminators? I would like to have a look at it. – Nidhoegger – 2015-10-17T09:23:55.673
Is it removing the 'dot', or does any edit fix it? It might be that editing the file adds the end of line marker, rather than the dot causing the problem. – Paul – 2015-10-17T09:24:29.507
So it's a single line in that text file? And does it have a line terminator? And are you sure you're only removing the dot? You can validate using hexdump -C. When typing in vim, lines always seem to end with 0x0a, even though you cannot move the cursor to the next empty line. So I guess vim is indeed adding it when you remove the dot, or make any edit. – Arjan – 2015-10-17T09:24:58.623
Many thanks! Yes, you are all right: just opening and saving the file with vim is enough. – cor – 2015-10-17T09:27:13.567
Thank you @Arjan, I edited the post with the command results. – cor – 2015-10-17T09:40:40.110
@Nidhoegger I uploaded a file. It is in the edited question. Many thanks. – cor – 2015-10-17T10:33:18.673
Please show the Python code for how you get the line and how you write it. I suspect that the newline is stripped when looping over the input and all you need to do is append it when writing the output file. Please make sure to specify whether you are using Python 2 or 3, since Unicode handling has changed a lot between those two versions. – Bram – 2015-10-17T10:42:33.973
Thanks @Bram, there it is. Using Python 2.7.3. Writing to a file in two different ways, with the same result. – cor – 2015-10-17T10:52:22.513
So that specific example even has two dots, right? 0x2e is a dot, and that's in the example twice. But indeed, the 0x0a is added by vim, even when you don't remove anything, as you have already seen. – Arjan – 2015-10-17T11:01:14.357
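For a file that has already been written, what vim does on save can be approximated with a short sketch like the following, which appends 0x0a only when the last byte is something else. The function name ensure_trailing_newline is hypothetical, and leaving an empty file untouched is a choice made for this sketch. Working on raw bytes is safe here because 0x0a is the line feed in both ASCII and ISO-8859-15.

```python
import os

def ensure_trailing_newline(path):
    """Append 0x0a to the file at `path` unless it already ends with it.

    Returns True if a byte was appended, False otherwise.
    """
    with open(path, "rb+") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # empty file: left unchanged
        f.seek(-1, os.SEEK_END)
        if f.read(1) == b"\n":
            return False  # terminator already present
        # Reposition explicitly before switching from reading to writing.
        f.seek(0, os.SEEK_END)
        f.write(b"\n")
        return True
```

Calling ensure_trailing_newline("my_file.txt") after the script has written the file should leave it with the same bytes as the second hexdump in the question, and calling it a second time changes nothing.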