How to remove this symbol "^@" with vim?

62

18

I have some files that are corrupted with this symbol:

^@

It's not part of the string; it's not searchable. How do I substitute this symbol with nothing, or how do I delete this symbol?

Here is an example line from one file:

^@F^@i^@l^@e^@n^@a^@m^@e^@ ^@ ^@ ^@ ^@ ^@ ^@ ^@ ^@ ^@ ^@:^@ ^@^M^@

mrt181

Posted 2009-11-25T10:16:34.010

Reputation: 785

Answers

54

You could try:

  • :%s/<CTRL-2>//g (on a PC keyboard)

  • :%s/<CTRL-SHIFT-2>//g (on a Mac)

Here <CTRL-2> means: hold down CTRL, press 2, then release CTRL.

<CTRL-SHIFT-2> means: hold down Control and Shift together, press 2, then release both keys.

Either way, the command line should end up showing :%s/^@//g. Here ^@ is a single character (a NUL byte, which cannot otherwise be displayed), not ^ followed by @, so you can't just type ^ and @ one after the other in the command above.

This command removes every ^@ in the file.

phresus

Posted 2009-11-25T10:16:34.010

Reputation: 866

4Just stumbled upon this question/answer through a related link: this is actually bad advice and will only work properly in very few cases. It's better to change the encoding rather than remove the null bytes. If you remove the null bytes, you might still have other multibyte characters that show up as garbage. – Mario – 2014-03-07T09:57:30.517

@Mario could you tell us more about the encoding change? Is it something related to jrb's answer below? – George – 2014-03-08T13:11:04.633

See rpyzh's answer further down below. Shows loading the file using the proper encoding as well as saving it with a different one (although the answer could need some more explanation). Jrb's last note is enough if you just want to read it, but not if you want to have it saved without the null bytes using another encoding. – Mario – 2014-03-08T13:37:47.797

50

I don't think your files are corrupted. Your example line looks like it contains regular text with null bytes between each character. This suggests it's a text file that's been encoded in UTF-16 but the byte-order mark is missing from the start of the file. See http://en.wikipedia.org/wiki/Byte-order_mark

Suppose I open Notepad, type the word 'filename', and save as Unicode Big-endian. A hex dump of this file looks like this:

fe ff 00 66 00 69 00 6c 00 65 00 6e 00 61 00 6d 00 65

If I open this file in Vim it looks fine - the 'fe ff' bytes tell Vim how the file is encoded. Now suppose I create a file containing the exact same sequence of bytes, but without the leading 'fe ff'. Vim inserts ^@ (or <00>, depending on your config), in place of the null bytes; Notepad inserts spaces.
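The byte sequences described above can be reproduced outside Vim. The following Python sketch is not part of the original answer, and its variable names are purely illustrative. It builds the same bytes and shows what a decoder sees with and without the BOM:

```python
import codecs

text = "filename"

# UTF-16 big-endian bytes, with the BOM (fe ff) prepended as Notepad does.
with_bom = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
# The same bytes without the leading BOM.
without_bom = text.encode("utf-16-be")

# Space-separated hex dump of the BOM-prefixed bytes (needs Python 3.8+).
hex_dump = with_bom.hex(" ")

# With the BOM present, the generic "utf-16" codec detects the byte order.
decoded_with_bom = with_bom.decode("utf-16")

# Without the BOM, a single-byte decoder such as latin-1 keeps every 00 byte
# as a NUL character; these are the characters Vim displays as ^@.
misread = without_bom.decode("latin-1")

print(hex_dump)
print(repr(decoded_with_bom))
print(repr(misread))
```

The point of the comparison is that the bytes are identical apart from the BOM; only the interpretation changes.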

So rather than remove the nulls, you should really be looking to get Vim to interpret the file correctly. You can get Vim to reload the file with the correct encoding with the command:

:e ++enc=utf16
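The same reinterpretation can be sketched outside Vim. This Python example is an illustration only: the function name is hypothetical, and it assumes the file is BOM-less UTF-16 big-endian (as the question's sample line suggests) and that UTF-8 is the desired target encoding.

```python
def reencode_utf16_to_utf8(raw: bytes, source_encoding: str = "utf-16-be") -> bytes:
    """Decode raw bytes using the assumed source encoding, then encode as UTF-8."""
    text = raw.decode(source_encoding)
    return text.encode("utf-8")


# Bytes resembling the question's sample line: a 00 byte before each character.
sample = "Filename : \r\n".encode("utf-16-be")
converted = reencode_utf16_to_utf8(sample)

print(converted)
```

Because the text is decoded rather than filtered, characters outside ASCII survive the conversion instead of turning into stray bytes.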

jrb

Posted 2009-11-25T10:16:34.010

Reputation: 621

6To remove them, choose another encoding and save the file again: :set fenc=utf-8 – scy – 2010-08-12T09:19:31.100

Yes, the last command made vim interpret the file correctly but does not remove the nullbytes. – mrt181 – 2009-11-26T11:14:38.903

35

This actually worked for me within vim:

:%s/\%x00//g

jriggins

Posted 2009-11-25T10:16:34.010

Reputation: 461

5

This works for me on Linux. '00' is the ASCII hex value, which you can find for any character in vim by placing the cursor over it and typing 'ga' (think "get ascii") in normal mode, or :as / :ascii on the command line. http://vim.wikia.com/wiki/Showing_the_ASCII_value_of_the_current_character – Casey Jones – 2014-09-29T13:42:27.300

^Vx00 also works. You can also enter 16-bit unicode with ^VuXXXX. I tried %uXXXX in a search and that also worked. – Edward Falk – 2016-06-02T01:32:50.130

You will be my beloved man up to the end of time. From the deep of my heart...thank you! – Gonzalo Cao – 2019-01-29T19:37:22.750

5this works with substitute(), but Ctl-VCtl-Shift-2 does not. – dsummersl – 2013-01-18T15:26:49.580

Same problem for me, I couldn't get <Ctrl-V><Ctrl-2> (as well as the one with <Ctrl-Shift-2>) to work either, but this worked. – Jeff B – 2013-07-31T16:12:52.543

12

That 'symbol' represents a NULL character, with ASCII value 000.

It's difficult to remove with vim; try tr instead:

tr -d '\000' < file1 > file2
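For readers who want to see what this byte-level deletion does, here is a rough Python equivalent. It is a sketch rather than part of the original answer, and the function name is hypothetical. It also illustrates the caveat raised in the comments on the accepted answer: deleting 00 bytes only yields clean text when every character is plain ASCII.

```python
def strip_nulls(raw: bytes) -> bytes:
    """Delete every 00 byte, as tr -d '\\000' does."""
    return raw.replace(b"\x00", b"")


# ASCII-only text stored as UTF-16: stripping the nulls happens to work.
ascii_only = strip_nulls("Filename".encode("utf-16-be"))

# The euro sign is U+20AC, stored as the two bytes 20 ac with no 00 byte.
# Stripping nulls leaves those bytes behind as a space plus a stray ac byte.
with_euro = strip_nulls("Fi\u20ac".encode("utf-16-be"))

print(ascii_only)
print(with_euro)
```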

pavium

Posted 2009-11-25T10:16:34.010

Reputation: 5 956

7

As others have noted, those are null bytes (ASCII 00). On Linux, the way to enter ASCII values into vim is to press Ctrl-V followed by the 3-digit decimal value of the character. To replace all null bytes, use:

    :%s/Ctrl-V000//g

(with no spaces).

Likewise, you can search for nulls with:

    /Ctrl-V000

In both cases, it won't show the zeros as you're typing them, but after entering all three, it will display ^@. On color terminals it will show that in blue to indicate that it's a control character.

TheAmigo

Posted 2009-11-25T10:16:34.010

Reputation: 290

6

FWIW, in my case I had to use vim on Cygwin to edit a text file created on a Mac. The accepted solution didn't work for me, but it was close. According to the Vim wiki page about working with Unicode, the big-endian and little-endian variants of the encoding use different BOMs. So I had to explicitly tell vim to use the little-endian variant.

Only after picking the right encoding did I convert the file format (line endings) to dos so I could edit the file in a Windows editor. Trying to reset the file format before specifying the encoding gave me grief. Here is the full list of commands I used:

:e ++enc=utf16le
:w!
:e ++ff=mac
:setlocal ff=dos
:wq
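The effect of these commands can be approximated in Python. This is a loose sketch, not a description of what Vim does internally: the function name is hypothetical, and it assumes the input is UTF-16 little-endian with classic Mac (CR-only) line endings and that the output should stay in UTF-16LE with DOS (CRLF) line endings.

```python
def mac_to_dos_utf16le(raw: bytes) -> bytes:
    """Decode UTF-16LE bytes, convert line endings to CRLF, and re-encode."""
    text = raw.decode("utf-16-le")
    # Normalize every line ending to "\n" first so existing CRLF pairs
    # are not doubled when CR is expanded.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return text.replace("\n", "\r\n").encode("utf-16-le")


sample = "line one\rline two\r".encode("utf-16-le")
converted = mac_to_dos_utf16le(sample)

print(converted.decode("utf-16-le").splitlines())
```

Decoding before touching the line endings mirrors the order in the answer: the line-ending bytes can only be recognized reliably once the encoding is known.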

rpyzh

Posted 2009-11-25T10:16:34.010

Reputation: 163

Precious info. In my case it was the endianness of the BOM byte. – Andre Albuquerque – 2014-04-15T14:00:59.797

3

The accepted solution did not work for me. I made vim pipe the file through tr instead:

:%!tr -d '\000'

This would also work well with visual mode (just type :!tr -d '\000') or on a range of lines:

# Remove nulls from current line:
:.!tr -d '\000'

# Remove nulls from lines 3-5:
:3,5!tr -d '\000'

We Are All Monica

Posted 2009-11-25T10:16:34.010

Reputation: 303

2

^@ is not a bad character if you use the proper encoding, but if you want to remove it, try one of:

  • tr -d '\000'
  • sed 's/\x00//g' (GNU sed)

Your example data also contains a ^M (carriage return) character.

To convert your file to Unix/Linux format before any processing, try:

dos2unix filename (RHEL and other Linux distributions)

dos2ux filename [newfilename] (HP-UX)

user490343

Posted 2009-11-25T10:16:34.010

Reputation: 21

1

In addition to @jrb's answer, in Vim, the character encoding of the file is detected based on the fileencodings option. (note the 's' at end of fileencodings)

For example, on Windows the default value of the fileencodings option is ucs-bom, which means:

Check whether a BOM exists at the beginning of the file.

If a BOM exists, read the character encoding of the file from the BOM.

If no BOM exists (and, in this case, that also means every character encoding listed in the fileencodings option failed to match), read the file using the character encoding given by the encoding option. The default value of the encoding option is latin1. Because latin1 is a single-byte encoding, every byte in the file is a valid latin1 character (even the Nul character ^@ that you're seeing*).

* Strictly speaking, Vim stores a Nul byte internally as a newline character in the buffer text and displays it as ^@.

The proper way to read the file is to specify the character encoding manually as UTF-16 (as it looks like UTF-16 is the proper char encoding in this case).
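The detection order described above can be illustrated with a simplified Python sketch. It is not Vim's actual implementation: the function name, the BOM table, and the latin-1 fallback are assumptions chosen to mirror the description.

```python
import codecs

# BOMs to check. The UTF-32 entries come before UTF-16 because the
# UTF-32 LE BOM (ff fe 00 00) begins with the UTF-16 LE BOM (ff fe).
BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def guess_encoding(raw: bytes, fallback: str = "latin-1") -> str:
    """Return a codec name based on a leading BOM, else the fallback."""
    for bom, name in BOMS:
        if raw.startswith(bom):
            return name
    return fallback


with_bom = codecs.BOM_UTF16_BE + "filename".encode("utf-16-be")
without_bom = "filename".encode("utf-16-be")

print(guess_encoding(with_bom))
print(guess_encoding(without_bom))
# Falling back to latin-1 keeps the 00 bytes as NUL characters.
print(repr(without_bom.decode(guess_encoding(without_bom))))
```

Passing the encoding explicitly, as the answer recommends, amounts to skipping this guess: `without_bom.decode("utf-16-be")` yields the intended text.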

colemik

Posted 2009-11-25T10:16:34.010

Reputation: 1 414

Why do you think it's a newline character? – L29Ah – 2019-12-30T20:05:05.940