Questions tagged [robots.txt]
8 questions
4
votes
0 answers
Is automated web site access legal?
Many web sites state in their terms of service that automated access is prohibited. One example is in eBay's robots.txt file:
The use of robots or other automated means to access the eBay site
without the express permission of…
2
votes
2 answers
What software is needed for membership websites and how can they still be indexed by Google
I notice that in some cases paywalled news articles seem to have been indexed by Google, because excerpts from the story appear in the search hit.
However, when I go to these web sites using a Googlebot (robot) identity the information is not there…
Tyler Durden
- 6,333
1
vote
1 answer
Apache won't start, port 80 in use by a system process, found baiduspider
OK, so I have uninstalled IIS on my Windows server and decided to try XAMPP to host my domains. Port 80 is in use, and I have tried all of the fixes that I have come across over the past 2 days. I needed to figure out what is using process ID 4…
David Stoler
- 33
1
vote
1 answer
How can we know which URLs robots.txt allows us to crawl if we don't know which folder a URL belongs to?
I'm going to code a web crawler, but first I want to know what it will be possible to crawl.
Tell me if I'm wrong, but in robots.txt, websites indicate folders, not URLs, that can and can't be crawled, so how can we know to which folder a URL…
DevAb
- 113
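The folder question above can be explored with a short sketch. It assumes the simple prefix matching done by Python's standard-library `urllib.robotparser`: a URL falls under a rule when its path starts with the rule's path, so no separate "folder lookup" is needed. The rules, the `MyCrawler` agent name, and the `example.com` URLs are hypothetical, and real crawlers such as Googlebot may apply extra matching (wildcards, for instance) that this parser does not.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; not taken from any real site.
EXAMPLE_RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(rules_text, user_agent, url):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    # parse() accepts the robots.txt content as an iterable of lines.
    parser.parse(rules_text.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/private/page.html",
        "https://example.com/public/page.html",
    ):
        print(url, is_allowed(EXAMPLE_RULES, "MyCrawler", url))
```

For a live site, `RobotFileParser.set_url()` followed by `read()` would fetch the file instead of parsing an inline string.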
0
votes
0 answers
Index a whole website that has blocked Google?
I tried to do a site:site.com [search terms] on Google but site.com has blocked Google from indexing it via its robots.txt. How can I get around this? Can I download and index the whole site myself somehow and then search my own private index?
d-b
- 956
0
votes
1 answer
Can ROS be run on Windows 10, 64-bit?
I am trying to install ROS (Robot Operating System) on my computer, which runs Windows 10, 64-bit.
Is it possible?
And is there a procedure for doing it?
Bourgou
- 1
-1
votes
1 answer
Googlebot blocked by robots.txt
So recently I tested my site with Google's mobile-friendly test, and the main loading issue was "Googlebot blocked by robots.txt".
My robots.txt does allow Googlebot, I think?
What do you think, guys? What's the problem here?
-1
votes
1 answer
How to prevent Google from indexing
We have set up one of our web sites on a server; the site is built in PHP with the Symfony framework. My requirement is to prevent Google from indexing it. Below is my robots.txt. Can indexing also be prevented using .htaccess?
User-agent: *
Disallow:
So how to prevent it and…
Nullpointer
- 180
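As a rough check of the rules quoted in the question above, the sketch below compares an empty `Disallow:` with `Disallow: /`. Under the robots exclusion protocol an empty `Disallow:` value blocks nothing, so the quoted file permits all crawling. The sketch uses Python's standard-library `urllib.robotparser` as a stand-in for a real crawler; the `can_crawl` helper and the `example.com` URL are hypothetical, and Googlebot's own matcher may differ in details.

```python
from urllib.robotparser import RobotFileParser

# The rules quoted in the question: an empty Disallow value blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
# Assumed alternative for comparison: block crawling of every path.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def can_crawl(robots_txt, url, user_agent="Googlebot"):
    """Return True if user_agent may fetch url under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/page"  # hypothetical URL
    print("empty Disallow:", can_crawl(ALLOW_ALL, url))
    print("Disallow: /   :", can_crawl(BLOCK_ALL, url))
```

Note that robots.txt governs crawling only; a page blocked from crawling can still end up in the index if other sites link to it.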