First, always use the right tool for the job. Viewing a binary file in a text editor is like driving nails with a knife. Use a hex viewer/editor for such tasks or, better, a tool that understands the internals of the binary file in question. If we are talking about CPU opcodes, something like IDA Pro free or OllyDbg is useful for analyzing the internals of executable files.
Does that mean the opcode f8 is converted into binary data 1111 1000 and stored in the file?
As @Mokubai correctly pointed out, 0xF8 is the same number as 1111 1000: the first is written in hex notation and the second in binary. Both are the number 248 in the decimal system.
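To make that concrete, here is a minimal Python sketch (the variable names are only for illustration) showing that the three spellings denote one and the same value, and that hex or binary is just a matter of how the value is displayed:

```python
# Three spellings of the same integer value.
from_hex = 0xF8
from_binary = 0b11111000
from_decimal = 248

# All three literals produce the identical number.
assert from_hex == from_binary == from_decimal

# Formatting only changes how the number is displayed, not what it is.
print(format(from_decimal, "x"))    # f8
print(format(from_decimal, "08b"))  # 11111000
print(bytes([from_decimal]))        # b'\xf8' (the single byte as stored in a file)
```

Nothing is "converted" when the byte is written to disk; the file simply holds that one 8-bit value.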
If you build executable code by hand from CPU opcodes (or assemble assembler source code), an i386 CPU will recognize 0xF8 (or 0b11111000, or 248; it is all the same value) as the CLC instruction.
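As a rough illustration of what "recognizing" a byte as an instruction means, here is a toy lookup in Python. The table `ONE_BYTE_OPCODES` and the function `decode` are hypothetical names made up for this sketch, and the table covers only a few single-byte x86 opcodes; a real disassembler also has to deal with prefixes, operands and multi-byte instructions.

```python
# A handful of single-byte x86 opcodes (a tiny, illustrative subset).
ONE_BYTE_OPCODES = {
    0x90: "NOP",
    0xC3: "RET",
    0xF8: "CLC",
    0xF9: "STC",
}

def decode(code: bytes) -> list:
    """Map each byte to a mnemonic; bytes not in the table are shown as raw data."""
    return [ONE_BYTE_OPCODES.get(b, f"db 0x{b:02X}") for b in code]

print(decode(bytes([0xF8, 0x90, 0xC3])))  # ['CLC', 'NOP', 'RET']
```

An assembler does the reverse mapping: it reads the mnemonic `clc` and emits the byte 0xF8.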
An Assembly code generated by a Compiler(say) clc has an opcode f8 and I am sure that the Assembler assembling the above mnemonic, substitutes its opcode f8 in it's place.
That's true, except for the part "An Assembly code generated by a Compiler".
I just want to be sure you understand the difference between "assembly code" and opcodes. Opcodes are the exact language that the CPU understands; they are just numbers (and that is how the first computers were programmed, back when a translator from CPU mnemonics, i.e. an assembler, was only a dream).
Nowadays we mostly use "direct" compilation from a high-level programming language straight to executable binaries, with compilers for languages such as C, C++ and Go that produce CPU opcodes.
(Calling it "direct" compilation is not strictly true: under the hood, compilers go through multiple steps before they produce an executable binary. For the end user, though, it is like driving a car without needing to know how gasoline is converted into movement.)
As @sawdust correctly mentioned in a comment, higher-level programming languages can use different strategies to produce CPU opcodes. For example, you can see how the gcc compiler would cook up opcodes by telling it to emit the assembler code that would then be assembled into opcodes (object code):
gcc -S -o myprogram.asm myprogram.c
If that is the case, why am I not able to view the binary contents of
a binary file using a normal text editor(say Notepad) - after all it's
'0's and '1's right?
Notepad speaks another language. It understands its own "opcodes", namely character codes such as ASCII; anything else is Greek to Notepad.
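Here is a small Python sketch of the difference. The sample bytes and the helper names `as_text`, `as_hex` and `as_bits` are made up for illustration, and decoding as Latin-1 is an assumption chosen because it maps every byte to some character; a real editor may guess a different encoding.

```python
# "Hi" followed by bytes that are not meant to be text.
data = bytes([0x48, 0x69, 0xF8, 0x00, 0xC3])

def as_text(raw: bytes) -> str:
    """Roughly what a text editor does: treat every byte as a character code."""
    return raw.decode("latin-1")

def as_hex(raw: bytes) -> str:
    """What a hex viewer does: show each byte's value as two hex digits."""
    return raw.hex(" ")

def as_bits(raw: bytes) -> str:
    """The same bytes written out as '0's and '1's."""
    return " ".join(f"{b:08b}" for b in raw)

print(repr(as_text(data)))  # 'Hi' followed by whatever characters the codes map to
print(as_hex(data))         # 48 69 f8 00 c3
print(as_bits(data))        # 01001000 01101001 11111000 00000000 11000011
```

The bytes are identical in all three cases; only the interpretation differs. A text editor never shows the '0's and '1's because it immediately maps each byte to a glyph.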
"I'm aware of the Linking stage in-between" -- Incorrect, the linking stage would be after assembly. "What exactly happens after this stage" -- Depends on whether the assembly produces relocatable object code (which could be linked with other object files), or absolute object code. "after all it's '0's and '1's right" -- Yes, but a text editor always treats that binary data as codes for text (e.g. ASCII), whereas a disassembler will treat the data as machine code, and display opcodes and operands. – sawdust – 2017-02-16T06:43:54.910
You are missing a key point, f8 doesn't need to be "converted", it already is 1111 1000, they are just different representations of the exact same thing. One is shown as hex, the other as binary. Hex has the benefit of being slightly more human readable and has a neat side effect of splitting binary quads into single digits, in this case f = 1111 and 8 = 1000. The basic unit used by the CPU is binary digits, but humans tend to use the hex representations. – Mokubai – 2017-02-16T07:26:26.263