This is a proposed architecture for submitting an anonymous request to Googlebot to re-crawl a web page. I came up with the solution given below. I am posting it here to learn about the security loopholes in this architecture and to find out what improvements might benefit it.
Here is the scenario. Suppose a user visits a URL and suspects that the page is cloaked, or for some other reason wants the bot to re-crawl the URL. Google provides the Fetch as Google tool for this. However, when we submit the URL to Google, Google learns our IP address. I want this submission to be anonymous. Please do not get confused by this. Assume that I reach the page with JavaScript disabled.
So here are the steps:
1. The user requests a random number from an authority, which grants the user a random number and gives the same number to the Google server. For the duration from t to t', the same random number is given to all users, and Google also stores that random number for that duration. After that, a new random number comes into use and the previous one is no longer used. (I wanted to minimize key management for the user, so I resorted to this approach.)
2. Once we get the random number, XOR it with the URL and send the encrypted URL to a trusted mediator. This mediator stores the requests of all users (the encrypted URLs) and every 5-10 minutes gives these URLs to the Google server.
3. Also note that a user can send only one URL every 15 minutes.
4. As soon as the dialog with the mediator is over, the user closes the connection.
5. The mediator now sends all the encrypted URLs to the Google server.
6. The Google server knows only the encrypted URLs and sees the mediator as the source, hence the privacy of the user is preserved.
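To make steps 1, 2 and 6 concrete, here is a minimal Python sketch of the XOR step. It is only an illustration under my own assumptions: the names xor_bytes and epoch_key are hypothetical, the example URL is made up, and I assume the random number for the window is at least as long as the URL in bytes (the steps above do not say how long it is).

```python
import secrets


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with key byte by byte (hypothetical helper for step 2)."""
    # Assumption: the random number is at least as long as the URL.
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))


# Step 1: the authority issues one random value for the window t..t'
# and shares the same value with every user and with the Google server.
epoch_key = secrets.token_bytes(64)

# Step 2: the user XORs the URL with the random value before sending it
# to the mediator.
url = b"http://example.com/some/page"
encrypted = xor_bytes(url, epoch_key)

# Steps 5-6: the Google server holds the same random value, so applying
# the XOR again recovers the URL.
recovered = xor_bytes(encrypted, epoch_key)
```

Because XOR is its own inverse, the same function serves the user and the server. Anyone who holds the random value for the current window can undo the XOR in the same way.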
These are the assumptions I made:
a. The mediator allows only one connection per user or client in every 15-minute window.
b. On terminating the connection with the user or client, the mediator keeps no details about the user or client.
c. The random number generator is a true random number generator.
d. The mediator and the random number generator are both fault tolerant (in this context I mean robust to load).
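The mediator's behaviour (one submission per client per 15 minutes, periodic hand-off of the stored URLs) could be sketched roughly as follows. This is an assumption-laden illustration, not a finished design: the Mediator class, its submit and flush methods, and the client_id string are all hypothetical names. Note that in this sketch the mediator has to remember some client identifier until the 15-minute window expires in order to enforce the limit.

```python
import time
from typing import Dict, List, Optional

WINDOW_SECONDS = 15 * 60  # one submission per client per 15 minutes


class Mediator:
    """Hypothetical mediator: rate-limits clients and batches encrypted URLs."""

    def __init__(self) -> None:
        self._last_seen: Dict[str, float] = {}  # client id -> last submit time
        self._queue: List[bytes] = []           # encrypted URLs awaiting hand-off

    def submit(self, client_id: str, encrypted_url: bytes,
               now: Optional[float] = None) -> bool:
        """Accept the URL unless this client already submitted in the window."""
        now = time.time() if now is None else now
        # Forget clients whose window has expired.
        self._last_seen = {c: t for c, t in self._last_seen.items()
                           if now - t < WINDOW_SECONDS}
        if client_id in self._last_seen:
            return False
        self._last_seen[client_id] = now
        self._queue.append(encrypted_url)
        return True

    def flush(self) -> List[bytes]:
        """Return all stored encrypted URLs (to pass to Google) and clear them."""
        batch, self._queue = self._queue, []
        return batch
```

A caller would invoke submit once per user connection and call flush from a timer every 5-10 minutes. How client_id is derived (IP address, token, or something else) is left open here, since the post does not specify it.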
What are the existing flaws? What improvements can be made?
EDIT: Although I have accepted an answer, I still welcome comments, other answers, and improvements, so feel free to let me know about any flaws or improvements.