On the results page when I Google "e-luminate", the 3rd and 4th links seem to point to specific directories deep within the folders that store the images. How can I get rid of these 2 results from Google's search results? How can I get Google to de-index them?

I checked on the server, and the folders did not seem different from other folders, yet these 2 paths seem to get indexed by Google.

Thank you.

weegee
Weng Fai Wong
  • This question may be better suited for [Pro Webmasters Stack Exchange](http://webmasters.stackexchange.com/). – nhinkle Dec 21 '10 at 20:46

3 Answers

First, sign up for a Google Webmaster Tools account. This will allow you to view statistics from Google about how it crawls your site, and lets you request removal of pages from the index (more on that later).

Next, set up a robots.txt file for your site. You do not need to block your entire site from Google to use robots.txt. The major search engines honor robots.txt, so this will also keep crawlers such as Bing's or Yahoo's away from these pages.

To set this up, create robots.txt as a plain text file in the root directory of your site (e.g. http://www.example.com/robots.txt). The syntax is very simple: you specify the user-agent this should apply to, using * as a wild-card for all robots, and you specify where the robots shouldn't crawl. Note that you should not include any pages you want to be completely "secret", as this is a publicly visible file. The syntax for robots.txt is as follows:

User-agent: user agent name
Disallow: directory name
Disallow: another directory
Disallow: (etc)

If you want to block all search engines from indexing data in subdirectories of your images directory, you might do something like this:

User-agent: *
Disallow: /images/foo/bar/
Disallow: /images/foo/baz/

You can even disallow just a specific file:

User-agent: *
Disallow: /images/foo/bar/qux.jpg
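Before uploading the file, it can help to check that the rules block what you expect and nothing more. The following is only a rough sketch using Python's standard `urllib.robotparser` module; the `is_allowed` helper and the example.com URLs are made up for illustration, and real crawlers may interpret edge cases differently than this parser does.

```python
from urllib.robotparser import RobotFileParser

# Same rules as the example above (the paths are placeholders).
ROBOTS_TXT = """\
User-agent: *
Disallow: /images/foo/bar/
Disallow: /images/foo/baz/
"""


def is_allowed(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given rules let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://www.example.com/index.html",
        "http://www.example.com/images/logo.png",
        "http://www.example.com/images/foo/bar/qux.jpg",
    ):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, url) else "blocked"
        print(f"{verdict}: {url}")
```

If a page you want indexed shows up as blocked here, the rule is probably broader than intended.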

Setting up robots.txt will prevent the specified directories and files from being indexed in the future. Over time, these pages will be removed from the search index, but it will not be immediate.

To expedite this process, use your webmaster tools account to submit a request to remove a URL from the index. Click on the website account you want to remove the URL from, then open "Site configuration" on the left. Click on "Crawler access", then open the "Remove URL" tab. Click on "New removal request", type in the URL you want to have removed, and press Enter. The page should ask you to confirm that you've already blocked the URL via robots.txt (which you've just done). Click OK, and it should submit the request.

It will usually take them 1-3 days to process the request. You can check the status of the request by logging into your webmaster tools account at any time.

nhinkle
  • Hi nhinkle, thanks for your response. It seems that the root directory has a robot.txt with the following content.. User-agent: * Disallow: * – Weng Fai Wong Dec 21 '10 at 22:58
  • I just made it so it targets the two troublesome subfolder too. – Weng Fai Wong Dec 21 '10 at 22:59
  • If it already has `Disallow *`, then search bots shouldn't be indexing _anything_ on your site, which probably isn't what you want. If you only want to block bots from accessing those specific subfolders, remove the `Disallow: *` line, but keep the specific lines, so that bots can index the rest of your site, but not those parts. – nhinkle Dec 21 '10 at 23:04
  • I see, would the bot in turn index the rest of the site (which is not the intended outcome either!) except the two specific subfolders? – Weng Fai Wong Dec 21 '10 at 23:08
  • There is a slight improvement now. When I click through the 2 links from the Google results page, it returns "Directory Listing Denied This Virtual Directory does not allow contents to be listed." instead of the directory content. Would the two entries be removed from the Google results page eventually? – Weng Fai Wong Dec 21 '10 at 23:10
  • Also just submitted the two links to be removed from "Google Webmaster Tool -> Crawler access -> Remove URL". So time to pray perhaps. – Weng Fai Wong Dec 21 '10 at 23:13
  • If you have submitted the request, then the entries should be removed eventually. By signing up for a webmaster tools account, Google should now know about your site and index it, but will only index what it finds relevant. Robots.txt only affects what the bot specifically _ignores_. If you want to tell Google about certain pages on your site, you can use a `sitemap.xml` file. – nhinkle Dec 21 '10 at 23:43
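
For the `sitemap.xml` mentioned in the last comment, here is a rough sketch of how a minimal one could be generated with Python's standard library. The `build_sitemap` helper and the example.com URLs are hypothetical; the namespace is the one defined by the sitemaps.org protocol. Treat this as an illustration rather than a complete sitemap generator.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a minimal sitemap document listing the given page URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical pages you do want search engines to know about.
    pages = [
        "http://www.example.com/",
        "http://www.example.com/about.html",
    ]
    print(build_sitemap(pages))
```

The output would be saved as `sitemap.xml` in the site root and can then be submitted through the webmaster tools account.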

Did you try searching first?

I searched for "Remove page from Google index" and got this page: Remove a page or site from Google's search results.

It says you should create a robots.txt file.

After that, you can go to Google Webmaster Tools to request speedy removal.

Mikel
  • Thanks for your prompt response. The issue is that we would still want the main domain name indexed, but not the 2 specific results. Those folders only contain image files, and there is no HTML to add the meta tags to block indexing. Would it work if I add an HTML page in those specific folders with the meta tags to block indexing of that specific folder? Could you elaborate on using Google Webmaster Tools for speedy removal? – Weng Fai Wong Dec 21 '10 at 05:20

Read about robots.txt files and you'll understand: you put a robots.txt file in the root of your site that lists any folder you want completely removed from Google, and after a few hours or days it should not be shown anymore. A robots.txt file can be generated using the webmaster tools in your Google account; try it!

Lane