Pages from any website are just HTML files that are transferred from a remote server to your computer so your browser can render them. (It is more complex than that: pages are often generated on the fly, separate image files are transferred along with the HTML so the browser can place images on the page, JavaScript describes behaviour, and CSS describes appearance. But for the purposes of this answer we can keep the simplified picture above.)
HTML files are just plain text files. They must have specific tags in them (the HTML tags, enclosed by <>), but other than that they are text files just like any .txt (usually encoded as ASCII or UTF-8). If you "view the source" in any browser, you are seeing the contents of the HTML file the browser received, before it rendered them on your screen.
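To make the "plain text with tags" point concrete, here is a minimal sketch using Python's standard html.parser module. The page content and the names PAGE, TextOnly and visible_text are made up for illustration, and the result is only a rough stand-in for what a real browser does (for example, it also collects the title text).

```python
from html.parser import HTMLParser

# A tiny, hypothetical page. Note that it is an ordinary Python string:
# nothing but text, some of which happens to be wrapped in <tags>.
PAGE = """<!DOCTYPE html>
<html>
<head><title>Example</title></head>
<body>
<h1>Hello</h1>
<p>This is <b>plain text</b> with tags.</p>
</body>
</html>
"""


class TextOnly(HTMLParser):
    """Collects the text found between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Skip chunks that are only whitespace (line breaks between tags).
        if data.strip():
            self.chunks.append(data.strip())


def visible_text(html_source):
    """Return the text content of html_source with all tags removed."""
    parser = TextOnly()
    parser.feed(html_source)
    parser.close()
    return " ".join(parser.chunks)


if __name__ == "__main__":
    print(PAGE)                # the "view source" side: tags and all
    print(visible_text(PAGE))  # roughly the text a reader would see
```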
Now, your browser doesn't care where the HTML file comes from. It can come from a website or from a folder on your computer. You can even drag a .html file into a browser window and it will try to render it (it might look broken and weird for lack of images, JavaScript and CSS, but it will show some content at least).
When you view the source, copy it, paste it into Word, and save it as text, you are just creating a new HTML file on your computer. That file will lack all the images, JavaScript and CSS, but other than that it will be a perfectly valid HTML file. What you see on your screen is the browser's best attempt at rendering it properly.
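The same copy, paste and save steps can be sketched in a few lines of Python. This is only an illustration under my own assumptions: save_page, open_in_browser and the file name saved_page.html are hypothetical names, and webbrowser.open simply hands the file to whatever browser your system considers the default.

```python
import webbrowser
from pathlib import Path


def save_page(html_source, folder, name="saved_page.html"):
    """Write html_source to folder/name as plain UTF-8 text and return the path."""
    path = Path(folder) / name
    path.write_text(html_source, encoding="utf-8")
    return path


def open_in_browser(path):
    """Ask the default browser to open a local file through a file:// URL."""
    return webbrowser.open(Path(path).resolve().as_uri())
```

For example, open_in_browser(save_page("<h1>Hello</h1>", ".")) should show a page with a single heading, with no server involved at all.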
To illustrate what I mean, I opened this very page for this question, pasted the code into Notepad, saved it in a folder and opened it. Here is the result (note I don't see a single HTML tag, just text!):
Depending on the browser, if you just change the file extension to .txt it will display the source code of the file, HTML tags and all, instead of rendering it. Firefox 31 on Windows 7 does that, at least.
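As a loose analogy for why the extension matters, Python's standard mimetypes module maps file names to content types in much the same spirit. Browsers use their own rules for local files, so treat this as a sketch of the idea and not as a description of what Firefox does internally; guessed_type is a hypothetical helper name.

```python
import mimetypes


def guessed_type(filename):
    """Return the content type Python's standard table associates with filename."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime


if __name__ == "__main__":
    # Same bytes on disk, different label: one is treated as a page to render,
    # the other as plain text to display as-is.
    print(guessed_type("page.html"))  # text/html
    print(guessed_type("page.txt"))   # text/plain
```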
Note that if you paste the HTML into Word and save it as a .doc or .docx file, and then open it in your browser, all you will see are garbled characters, because browsers aren't meant to render Word files.
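The garbling happens because a Word file is not plain text: a .docx file is a ZIP archive, and the older .doc format is binary as well. Here is a rough sketch of the difference. The helper names are hypothetical, the archive built here is a stand-in and not a valid Word document, and the "is it text?" check is a simple heuristic of my own.

```python
import io
import zipfile


def looks_like_plain_text(data):
    """Heuristic: no NUL bytes and the bytes decode cleanly as UTF-8."""
    if b"\x00" in data:
        return False
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def make_zip_bytes():
    """Build a small ZIP archive in memory (a stand-in, not a real .docx)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", "<w:document/>")
    return buffer.getvalue()


if __name__ == "__main__":
    html_bytes = b"<html><body>Hello</body></html>"
    zip_bytes = make_zip_bytes()
    print(looks_like_plain_text(html_bytes))  # HTML passes the text check
    print(zip_bytes[:4])                      # ZIP data starts with the bytes PK\x03\x04
    print(looks_like_plain_text(zip_bytes))   # the archive fails the text check
```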
It should be viewable as HTML, minus any CSS/scripts that were referenced relative to the original web page. What browser did you use? Maybe the file has the wrong extension? – Logman – 2014-08-23T23:33:37.827
@Tyson You should make this an answer. – krowe – 2014-08-24T02:55:10.487