Questions tagged [search-engine]

Search engines are programs that search documents for specified keywords and return a list of the documents or websites where the keywords were found.

Examples:

  • Google
  • Yahoo
  • Bing
32 questions
5
votes
6 answers

What happens if a website does not have a robots.txt file?

If the robots.txt file is missing from the root directory of a website, how is the site treated: (1) the site is not indexed at all, or (2) the site is indexed without any restrictions? Logically, I think it should be the second. I ask in reference…
Lazer
  • 415
  • 3
  • 7
  • 9
5
votes
6 answers

Blocking yandex.ru bot

I want to block all requests from the yandex.ru search bot. It is very traffic-intensive (2GB/day). I first blocked one class C IP range, but it seems this bot appears from different IP ranges. For example: spider31.yandex.ru ->…
Ross
  • 268
  • 1
  • 3
  • 9
3
votes
1 answer

Locate / Updatedb with archive support? (tar.gz etc.)

I have many backups on my NAS and my dedicated server. Some are copies within the filesystem, some are archived as .zip or .tar.{bz2|gz}. Is there any way to include the filenames within these archives in the updatedb database? Or is there any other…
philipp
  • 101
  • 3
3
votes
3 answers

How to prevent discovery of a secure URL?

If I have a URL that is used for receiving messages and I create it like so: http://www.mydomain.com/somelonghash123456etcetc and this URL allows other services to POST messages to it, is it possible for a search engine robot to find it? I don't want…
lamp_scaler
  • 577
  • 1
  • 5
  • 18
3
votes
3 answers

Are there any good intranet search engines?

I'm searching for an intranet search engine that is capable of spidering our intranet websites and network shares like SMB, NFS and optionally AFP. Even better for us would be a search engine that is extendable via plugins like the Spotlight framework…
DASKAjA
  • 161
  • 2
  • 7
3
votes
2 answers

How to prevent majestic 12 from indexing a site

We are experiencing a lot of traffic and server load on a web server. All I can find is majestic12 accessing pages all the time. I wonder how I can prevent majestic12 from indexing the site. Do they respect any robots.txt entry, and how do I write such…
user12096
  • 917
  • 5
  • 23
  • 39
2
votes
1 answer

DNS zone not working like it should S2012R2

I'm stuck in solving the following problem. We have 2 domain and DNS controllers. Everything works like it should, except for something very weird. On the DNS servers we have 2 extra Primary AD integrated zones to make sure everybody uses Google…
2
votes
3 answers

odd query strings in Googlebot requests

Google's indexing bot (edit: yes, it's Google, IP resolves) seems to be adding arbitrary query strings to our home page. xx.xxx.xx.xxx - - [30/Jun/2009:10:14:37 -0400] "GET /?key=61680 HTTP/1.1" 200 3334 "-" "Mozilla/5.0 (compatible; Googlebot/2.1;…
ceejayoz
  • 32,469
  • 7
  • 81
  • 105
2
votes
1 answer

whoosh_backend module cannot be found

I recently tried to install haystack with a whoosh search engine. This is to work with django 1.3 on a nginx production server. I've followed the installation instructions for each item (both haystack and whoosh). Although when I try and start the…
Neil Hickman
  • 133
  • 10
2
votes
2 answers

Intranet Search engine solution to run on SBS 2008

I'm looking for a web based intranet search solution to index some intranet network shares with PDF / Doc / Textfiles etc. and maybe also an intranet wiki. Microsoft has the Search Server Express, which looks promising but the minimum requirements…
2
votes
1 answer

MediaWiki SearchEngine that behaves like google

Search term foo should match foo and foobar; search term "foo" should only match foo. I tried LuceneSearch and SphinxSearch so far, but I couldn't get either of these to behave like Google. foo will only match foo, and foo* will match foo and foobar.
chris
  • 432
  • 4
  • 9
1
vote
0 answers

How can I add multiple Search Engines to choose from in Chrome Enterprise?

In Chrome Enterprise you can specify the default search engine via GPO, and with the DefaultSearchProviderEnabled option you can prevent the users from overriding or changing that default search engine. Now, is it possible to add another search…
1
vote
0 answers

"Windows Search Service" query results do not show thumbnails but icons

I am having a strange issue here on several Windows Server 2012 R2 machines. Problem description: Client machines searching a mapped network drive get the results shown as icons instead of thumbnails. The search results (like *.doc, *.txt) can not…
Matthias Güntert
  • 2,358
  • 11
  • 38
  • 58
1
vote
2 answers

Amazon EC2 hosted services and their effect on local SERPS

I am thinking of hosting a project on Amazon's EC2. However, I am a little unclear about possible negative impacts this may have on local search results. The project's main customer base is in the UK; however, the closest EC2 region is Ireland. Does…
luxerama
  • 189
  • 5
1
vote
2 answers

Search Engine Bot - Large amount of hits

I've started tracking user-agent strings on a website at the start of each session. Looking at the data for this month so far, I'm seeing one search engine bot that keeps coming up a lot: Mozilla/5.0 (compatible; Baiduspider/2.0;…
Justin808
  • 307
  • 3
  • 11