Python: script to find out the broken links of a website?

+1 vote

I was hoping to get some tips or advice on scripting a program that would go through the many links on my directory website and print out the ones that are broken or no longer functioning, so that I could fix them or remove them from the site.

posted Sep 23, 2013 by Majula Joshi

Do you know about Beautiful Soup or requests? I think they can help you. You really need to provide more information to get any useful help here.
I'm trying to create a Python script that will go through the URL links on my directory website using a logic statement like:
   if (however I would establish in Python that the URL is not broken):
     then return
   else if (the link is broken):
     then print '%s is broken.' (%s being the name of the link)

I want the program to perform this check on all the links so that I can easily see which links on my website are useless, so that my page doesn't wind up cluttered with broken links and hard to use, like every other crappy directory.

2 Answers

+1 vote

Since it's your own website, the best answer is probably to process the source of that site. Was it written by a Python script?

Otherwise, if the site is reasonably correct (as most are not), then Beautiful Soup is probably the place to start.
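
As a rough illustration of the link-extraction step, here is a minimal sketch. It uses the standard library's html.parser as a dependency-free stand-in for Beautiful Soup, which is a substitution on my part rather than what the answer names. The names LinkCollector, extract_links and SAMPLE_PAGE, and the URL http://mydirectory.example/, are made up for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url  # hypothetical address of the page being parsed
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Skip anchors without an href (for example <a name="top">).
            if name == "href" and value:
                # Turn relative links such as /about.html into absolute URLs.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all links found in the HTML text."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Made-up page content standing in for one page of the directory site.
SAMPLE_PAGE = """
<html><body>
  <a href="http://example.com/">Example</a>
  <a href="/about.html">About</a>
  <a name="top">no href</a>
</body></html>
"""

if __name__ == "__main__":
    for link in extract_links(SAMPLE_PAGE, "http://mydirectory.example/"):
        print(link)
```

With Beautiful Soup installed, the same list could presumably be built by looping over the page's <a> tags instead of subclassing a parser; either way, the resulting URLs still have to be requested one by one to find out which are broken.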

answer Sep 23, 2013 by Jagan Mishra

+1 vote

The easiest solution is probably a non-Python tool like wget. Search the web for "link checker" or "href verifier" or words to that effect and see what you find. This is an extremely common task, and I don't think you need to write any code for it.
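
For comparison, the check that such tools perform can be sketched in a few lines of Python with the standard library's urllib: request each URL and treat an HTTP error or a failed connection as "broken". This is only a sketch under assumptions. The function names is_broken and report_broken, the opener parameter (there so the check can be tried without network access) and the 10-second timeout are all made up for the example.

```python
import http.client
import urllib.error
import urllib.request


def is_broken(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if requesting `url` fails or yields an HTTP error status."""
    try:
        response = opener(url, timeout=timeout)
        status = response.getcode()
        response.close()
        # urlopen itself raises HTTPError for 4xx/5xx responses; this
        # comparison covers custom openers that return them instead.
        return status >= 400
    except (urllib.error.URLError, http.client.HTTPException,
            ValueError, OSError):
        # HTTP errors, unreachable hosts, timeouts and malformed URLs
        # are all counted as broken here.
        return True


def report_broken(urls, opener=urllib.request.urlopen):
    """Print a line for each broken URL and return the broken ones."""
    broken = [url for url in urls if is_broken(url, opener=opener)]
    for url in broken:
        print("%s is broken." % url)
    return broken
```

Calling report_broken(list_of_urls) prints one line per failing link. Some servers refuse requests from scripts or are only temporarily down, so a link reported here is worth rechecking by hand before removing it.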

answer Sep 23, 2013 by Naveena Garg
