How does a robots.txt file work for a search engine?

+4 votes
410 views
How does a robots.txt file work for a search engine?
posted Nov 20, 2013 by Salil Agrawal

Before visiting any page on a site, a crawler first fetches robots.txt; from it, the crawler learns which parts of the site it is allowed to access, along with other directives.
Most of this is covered by the FAQ at http://www.robotstxt.org/robotstxt.html
Thanks Sachi. Probably I have not framed the question correctly. I am now in the process of opening QueryHome to search engines, so I need to know the process for adding QueryHome's details to the webmaster tools.

1 Answer

0 votes

Web Robots (also known as Web Wanderers, Crawlers, or Spiders) are programs that traverse the Web automatically. Search engines such as Google use them to index web content, spammers use them to scan for email addresses, and they have many other uses.

It works like this: a robot wants to visit a Web site URL, say http://www.example.com/welcome.html. Before it does so, it first checks for http://www.example.com/robots.txt and follows whatever rules it finds there.

There are two important considerations when using /robots.txt:

Robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities, and email-address harvesters used by spammers, will pay no attention to it.
The /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to use.
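
A well-behaved crawler can check these rules programmatically. Below is a minimal sketch using Python 3's standard urllib.robotparser module; the crawler name and URLs are only placeholders, not anything from this thread:

from urllib import robotparser

rp = robotparser.RobotFileParser()
rp.set_url("http://www.example.com/robots.txt")
rp.read()  # download and parse the site's robots.txt

# Ask whether a given user-agent may fetch a given URL
if rp.can_fetch("MyCrawler", "http://www.example.com/welcome.html"):
    print("robots.txt allows this page")
else:
    print("robots.txt disallows this page")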

answer Sep 8, 2016 by Magento_ocodewire
Similar Questions
+5 votes

This question is related to the QueryHome sitemap: I want to know the meaning of the priority field in sitemap.xml and whether it has any impact on the rank of a page.

+1 vote

Is there any clever solution for serving static files from Flask's application root directory? robots.txt and sitemap.xml are expected to be found at /, so my idea was to create routes for them:

from flask import make_response  # assumes an existing Flask app object named "app"

@app.route('/sitemap.xml', methods=['GET'])
def sitemap():
    # Read sitemap.xml from the application root and serve it as plain text
    with open('sitemap.xml') as f:
        response = make_response(f.read())
    response.headers["Content-Type"] = "text/plain"
    return response

There must be something more convenient :)
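
One more convenient option, sketched below under the assumption that the files are kept in the app's static folder, is to let Flask's send_from_directory serve them and reuse a single view for both paths:

from flask import Flask, request, send_from_directory

app = Flask(__name__)

# Serve robots.txt and sitemap.xml from the static folder at the site root
@app.route('/robots.txt')
@app.route('/sitemap.xml')
def static_from_root():
    # request.path is e.g. '/robots.txt'; drop the leading slash to get the filename
    return send_from_directory(app.static_folder, request.path[1:])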

+3 votes

Does the Google Search API have any limit for a website, like Google Custom Search does? I could not find any documentation.
Please help...

+2 votes

I need to search through a directory of text files for a string. Here is a short program I made in the past to search through a single text file for a line of text.

How can I modify the code to search through a directory of files that have different filenames, but the same extension?

fname = raw_input("Enter file name: ")  # "*.txt"
fh = open(fname)

# Collect every word from the file
biglst = []
for line in fh:
    line = line.rstrip()
    biglst += line.split()

# Keep only the unique words, then sort and print them
final = []
for out in biglst:
    if out not in final:
        final.append(out)
final.sort()
print(final)
...
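
One possible way to extend this to a whole directory, sketched here under the assumption that the files share the .txt extension and sit in a folder the user names, is to loop over glob.glob:

import glob
import os

dirname = raw_input("Enter directory name: ")

# Collect the unique words from every .txt file in the directory
words = set()
for fname in glob.glob(os.path.join(dirname, "*.txt")):
    fh = open(fname)
    for line in fh:
        words.update(line.rstrip().split())
    fh.close()

print(sorted(words))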