I have used an Amazon S3 presigned URL to share content.

https://boto3.amazonaws.com/v1/documentation/api/latest/guide/s3-presigned-urls.html

Is Google able to crawl this URL? I'm sharing it with just one client. What about other services? Some of them let you share content with someone by creating a seemingly random URL (or even one built from a hash), like www.somedomain.com/something/15b8b348ea1d895d753d1acb57683bd9. Is that URL crawled by Google or other search engines?

Thanks

1 Answer

A presigned URL is like a temporary password included in the URL. Everybody who knows this URL can access the content as long as the URL has not expired. But web crawlers (there are many more crawlers than just Google's) will be able to access the URL only if they get to know it, for example because it was included on some public website. They would not be able to just guess it and crawl it (and would not even try), because the secret part of the URL is long enough that it cannot be brute-forced during the lifetime of the URL.
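To make the "temporary password in the URL" idea concrete, here is a minimal sketch of an expiring signed URL using only the Python standard library. It is not the AWS Signature Version 4 scheme that S3 presigned URLs actually use; `SECRET_KEY`, `sign_url`, `verify_url`, and the `expires`/`signature` parameter names are all hypothetical and chosen for illustration.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical server-side key; it never appears in the URL itself.
SECRET_KEY = b"server-side-secret"


def sign_url(path: str, expires_at: int) -> str:
    """Append an expiry timestamp and an HMAC-SHA256 signature to a path."""
    message = f"{path}?expires={expires_at}".encode()
    signature = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "signature": signature})
    return f"{path}?{query}"


def verify_url(url: str, now: int) -> bool:
    """Accept the URL only if it is unexpired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["signature"][0]
    except (KeyError, ValueError):
        return False  # missing or malformed parameters
    if now > expires_at:
        return False  # the "temporary password" has expired
    message = f"{parsed.path}?expires={expires_at}".encode()
    expected = hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking timing information.
    return hmac.compare_digest(signature.encode(), expected.encode())
```

Anyone holding the full URL passes `verify_url` until the expiry time, which is why the URL must be kept as private as a password. A crawler that never sees the link has no practical way to produce a valid `signature` value.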

Steffen Ullrich
  • Very clear explanation! Thanks. I'm interested in knowing how these crawlers (Google and others) work under the hood. Can you provide me with some resources, papers, whatever you may have? Thanks again – Markus Bell Sep 22 '19 at 14:23
  • @user218163: *"Can you provide me with some resources, papers, whatever you may have?"* - this is a different question and should therefore not be asked in a comment. It is also off-topic (not clearly security related) and too broad. Also, you can easily start yourself by using a search engine with phrases like [how do web crawlers work](https://www.google.com/search?q=how+do+web+crawlers+work). There is even a [wikipedia entry](https://en.wikipedia.org/wiki/Web_crawler) about this. – Steffen Ullrich Sep 22 '19 at 14:48
  • You are right. Will do. Thanks – Markus Bell Sep 22 '19 at 15:00
  • One last comment. You said that "secret part of the URL is long enough so that it cannot be brute forced during the life time of the URL". This is true for AWS presigned URLs, but not for www.somedomain.com/something/15b8b348ea1d895d753d1acb57683bd9. I was not aware that web crawlers also use brute force. So fixed URLs, like the somedomain.com one, are going to be discovered at some point, right? – Markus Bell Sep 22 '19 at 15:07
  • @user218163: *"I was not aware that web crawlers also use brute force."* - normal web crawlers don't and I don't know why you think they would even try. And even your example *"www.somedomain.com/something/15b8b348ea1d895d753d1acb57683bd9"* (which I assume should mean 16 byte hex at the end) has 128 bits which makes it practically impossible to brute force. – Steffen Ullrich Sep 22 '19 at 15:13
  • "They will not be able to just guess it and crawl it because the secret part of the URL is long enough so that it cannot be brute forced during the life time of the URL." That's why I thought web crawlers use brute force. Sorry if I misunderstood you – Markus Bell Sep 22 '19 at 15:45
  • @user218163: I've made it now more explicit that they don't even try to guess the URL. – Steffen Ullrich Sep 22 '19 at 16:01
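The 128-bit figure from the comments can be checked with a rough back-of-the-envelope calculation. This is a sketch under stated assumptions: the rate of one billion guesses per second is an arbitrary and generous assumption, not a measured crawler or attacker speed, and `expected_years_to_guess` is a hypothetical helper name.

```python
import secrets

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def expected_years_to_guess(bits: int, guesses_per_second: float) -> float:
    """Average time to hit one specific token: half the keyspace is searched."""
    average_guesses = 2 ** bits / 2
    return average_guesses / guesses_per_second / SECONDS_PER_YEAR


# 16 random bytes rendered as 32 hex characters, the same shape as the
# example path segment 15b8b348ea1d895d753d1acb57683bd9 (128 bits).
token = secrets.token_hex(16)

# Assumed rate: 1e9 guesses per second, far beyond what a web server would serve.
years = expected_years_to_guess(128, 1e9)
print(f"token: {token}")
print(f"expected years to guess at 1e9 guesses/s: {years:.2e}")
```

Even at that assumed rate the expected search time is vastly longer than any realistic URL lifetime, so a random 128-bit URL leaks in practice by being published, logged, or shared, not by being guessed.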