This repository was archived on 2025-12-10. You can view files and clone it, but you cannot change its state, for example by pushing or by creating new issues, pull requests, or comments.
  • Perl 79.4%
  • HTML 15.3%
  • CSS 3.9%
  • Shell 1.4%
File                  Last change (UTC)    Last commit message
t                     2002-12-27 13:59:31  Initial version just tests if checkbot can be run.
.cvsignore            2004-04-21 19:31:22  Don't show the blib directory which is an artefact of the build process.
ChangeLog             2009-03-29 17:41:50  Fix from Gerald Pfeifer to make checkbot's --file argument a bit more robust.
checkbot              2010-05-01 13:44:03  Update --url option to better explain how multiple start URLs can be used.
checkbot.css          2003-08-31 17:24:18  Cleaned up CSS.
checkbot2.css         2005-12-18 21:43:12  Add classes and ids to checkbot reports.
index.html            2008-10-15 12:55:00  Version 1.80.
Makefile.PL           2008-10-18 15:15:57  Use PREREQ_PM to check for required modules.
makeweb.sh            2007-01-27 16:58:25  Also add the 2nd stylesheet to the website.
MANIFEST              2003-12-21 11:25:56  MakeMaker now wants us to include the META.yml file.
README                2008-10-15 12:55:00  Version 1.80.
RELEASE-PROCESS       2009-03-29 17:40:56  Update release process based on most recent release.
style.css             2003-05-04 18:32:05  Add style for the H3 as well.
suppression-test.txt  2003-11-29 15:51:08  Initial version suppresses file in test.html
test.html             2006-05-03 20:13:21  Release 1.78
TODO                  2005-07-25 19:41:50  New item for checking javascript: links.

Checkbot -- a WWW link verifier
Checkbot is a Perl 5 script that verifies links within a region of
the World Wide Web. It checks all pages within an identified region,
and all links within that region. After checking all links within the
region, it will also check all links which point outside of the
region, and then stop.
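To make the idea of a "region" concrete, here is a minimal Perl sketch. It assumes the region is described by a regular expression applied to each URL, in the spirit of Checkbot's --match option. The helper classify_link and the example.org URLs are hypothetical and are not taken from Checkbot's source.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Checkbot: a link counts as "internal"
# when its URL matches the region pattern, and as "external" otherwise.
sub classify_link {
    my ($url, $region_re) = @_;
    return $url =~ $region_re ? 'internal' : 'external';
}

# Assumed example region: everything below /docs/ on one server.
my $region = qr{^http://www\.example\.org/docs/};

for my $url (
    'http://www.example.org/docs/index.html',
    'http://other.example.net/page.html',
) {
    printf "%-8s %s\n", classify_link($url, $region), $url;
}
```

In this sketch an internal page would be fetched and scanned for further links, while an external link would only be verified and not followed.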
Checkbot regularly writes reports on its findings, including all
servers found in the region, and all links with problems on those
servers.
Checkbot was originally written to check a number of servers at
once. This shaped some of its design decisions, so keep that in
mind when making suggestions. Before suggesting something, check the
TODO file on the website for things that have already been suggested
for Checkbot.
INSTALLATION
Making and installing Checkbot is easy:
 perl Makefile.PL
 make
 make install
Checkbot requires the following Perl modules to be installed:
 LWP
 URI
 HTML::Parser
 MIME::Base64
 Net::FTP
 Mail::Send (optional, contained in the MailTools package)
 Time::Duration (optional, used for additional info in report)
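Before running perl Makefile.PL, it can help to see which of these modules your perl can already load. The following is a minimal sketch and not part of the Checkbot distribution. The module names come from the list above, and the helper names module_available and missing_modules are made up for this example.

```perl
use strict;
use warnings;

# Module names taken from the README list above.
my @required = qw(LWP URI HTML::Parser MIME::Base64 Net::FTP);
my @optional = qw(Mail::Send Time::Duration);

# Hypothetical helper: true if this perl can load the named module.
sub module_available {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# Hypothetical helper: the subset of the given modules that cannot be loaded.
sub missing_modules {
    return grep { !module_available($_) } @_;
}

my @missing_required = missing_modules(@required);
my @missing_optional = missing_modules(@optional);

print "Missing required modules: ",
    (@missing_required ? "@missing_required" : "none"), "\n";
print "Missing optional modules: ",
    (@missing_optional ? "@missing_optional" : "none"), "\n";
```

The sketch only reports what is missing. Installing the modules is left to your usual method, such as your system's package manager or CPAN.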
WHERE TO FIND IT
Checkbot is distributed at: http://degraaff.org/checkbot/
Problems, bug reports, and feature enhancements are welcome at
http://sourceforge.net/projects/checkbot/
There is an announcement mailing list to which announcements of new
versions are posted. You can sign up for the list at 
https://lists.sourceforge.net/lists/listinfo/checkbot-announce
Hans de Graaff <hans@degraaff.org>
RECENT CHANGES
Changes in version 1.80 (15-Oct-2008)
 * Fix handling of nofollow robots tag.
 * Require newer version of LWP for better handling of character
 encodings.
 * Ignore mms scheme.
 * Minor clarification in output.
Changes in version 1.79 (3-Feb-2007)
 * Correctly parse documents to avoid problems with UTF-8
 documents. This avoids the "Parsing of undecoded UTF-8 will give
 garbage when decoding entities" messages.
 * Allow regular expressions in the suppression file, and complain if
 the suppression file is not a proper file.
 * More robust handling of HTTP and FTP servers that have problems
 responding to HEAD requests.
 * Use the original URL to report problems.
 * Ensure XHTML compliance.
Changes in version 1.78 (3-May-2006)
 * Don't throw errors for links that cannot be expected to be valid
 all the time (e.g. the classid attribute of an object element)
 * Better fallbacks for some cases where the HEAD request does not
 work
 * Add more classes and ids to allow more styling of results pages
 (including example CSS file)
 * Ensure XHTML compliance
 * Better checks for optional dependencies
Changes in version 1.77 (28-Jul-2005)
 * Fix silly build-related problem that prevented checkbot 1.76 from
 running at all.
 * Check for presence of robots meta tag and act on it.
Changes in version 1.76 (25-Jul-2005)
 * Error reports now include the page title for easier identification.
 * javascript: links are now ignored because they cannot be checked.
 * Documentation updates.
Changes in version 1.75 (22-Apr-2004)
 * New --cookies option to accept cookies from servers while checking.
 * New --noproxy option indicates which domains should not be
 passed through the proxy.
 * New error code for unknown domains; only known non-checkable
 schemes are ignored now.
 * Minor bug fixes.
 * Documentation updates.
Changes in version 1.74 (17-Dec-2003)
 * New --suppress option allows Response code/URL combinations not
 to be reported as problems.
 * Checkbot warnings are now handled as pseudo-HTTP status messages
 so that they can make use of all Checkbot features such as
 --dontwarn.
 * Option --allow-simple-hosts is deprecated due to this change.
 * More robust handling of (lack of) status messages.
 * Checkbot now requires LWP 5.70 because of bug fixes in that LWP
 release, although it should still work with older LWP versions.
 * Documentation fixes.
Changes in version 1.73 (31-Aug-2003)
 * Checkbot now tries to produce valid XHTML 1.1
 * URLs matching the --ignore option are now completely ignored;
 they used to be checked but not reported.
 * Proxy support works again, but --proxy now applies to all links
 * Documentation fixes
Changes in version 1.72 (04-May-2003)
 * URLs with query strings are now checked by default; the
 --exclude option can be used to revert to the previous behavior
 * The server results page contains shortcut links to each section
 * Removed warning for unqualified hostnames for news: URLs
 * Handling of signals such as SIGINT
 * Bug and documentation fixes
Changes in version 1.71 (29-Dec-2002)
 * New --filter option allows rewriting of URLs before they will be checked
 * Problematic links are now reported for each page on which they occur
 * New statistics which should work correctly
 * Much simplified storage of information on problem links
 * Duplicate links are now properly detected and not checked twice
 * Rewritten internals for link checking; as a consequence, internal
 and external links are now checked at the same time, not in two
 passes as before
 * Rewritten internals for message output
 * A simple test case for 'make test'
 * Minor cleanups of the code
Version 1.70 was only released for testing purposes
Changes in version 1.69
 * Improved makefile and packaging
 * Better default for --match argument
 * Additional instance of using GET instead of HEAD added
 * Bug fixes in printing of web server feedback
Changes in version 1.68
 * Add --allow-simple-hosts which doesn't check for unqualified hosts
 * Mention --style option in help and added example style file
 * Change --sleep implementation so that fractional seconds can be used
 * Fix a bug with handling <base> tags
 * Tighten checks for http and https schemes
 * Remove harmless warnings