WebPageHTMLAnalyzer HTML Tag analysis tool
The WebPageHTMLAnalyzer HTML tag analysis tool helps webmasters analyze their web pages. This search engine optimization tool analyzes not only the meta tags but also tries to use spider technology similar to that of the search engine spiders themselves.
The WebPageHTMLAnalyzer tool provides full information about the HTTP headers returned by a given web page. This information includes the status, cache control settings, content type, server type and more. The Meta details section lists all the header meta details found in the web page, including the page title, robots, keywords, description and more.
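As a rough illustration of what the Meta details section collects, the following Python sketch pulls the page title and the named meta tags out of an HTML string. It is not the tool's own code (the tool is written in C#); the class name MetaDetailsParser, the function extract_meta_details and the sample page are assumptions made up for this example.

```python
from html.parser import HTMLParser


class MetaDetailsParser(HTMLParser):
    """Collects the <title> text and named <meta> tags from an HTML page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}          # lowercased meta name -> content value
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = attr.get("name")
            # Only keep <meta name="..." content="..."> pairs
            if name and attr.get("content") is not None:
                self.meta[name.lower()] = attr["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def extract_meta_details(page):
    """Return (title, meta dict) for the given HTML source."""
    parser = MetaDetailsParser()
    parser.feed(page)
    parser.close()
    return parser.title.strip(), parser.meta


# Hypothetical sample page, used only for this illustration
SAMPLE_PAGE = """<html><head>
<title>Example page</title>
<meta name="Keywords" content="html, analyzer">
<meta name="description" content="A sample page.">
<meta name="robots" content="index,follow">
</head><body><p>Hello</p></body></html>"""

title, meta = extract_meta_details(SAMPLE_PAGE)
print(title)
for name, content in meta.items():
    print(name, "=", content)
```

Meta names are lowercased here so that "Keywords" and "keywords" land under the same key; that is a design choice of this sketch, not documented behaviour of the tool.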
The Keywords section provides information about the main keywords found in the web page. These keywords are extracted by stripping out the HTML and collecting the user-visible text, which is what most search engine crawlers are interested in. Each keyword is marked with its number of occurrences and its weight (density) compared to other words. This information is useful for webmasters and SEO experts when deciding which keywords to include in the Meta keywords and Meta description sections. The tool also provides information about keywords found in anchor tags. These are the text links on the web page (including the 'alt' text of images inside links and the 'title' attribute). Many search engines treat these as more important.
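The keyword extraction described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: density is taken to mean a word's share of all counted words, expressed as a percentage, and words shorter than three characters are ignored. The tool's actual weighting formula is not documented here, and the names VisibleTextParser and keyword_density are hypothetical.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects text that appears outside <script> and <style> elements."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def keyword_density(page, min_length=3):
    """Return (word, occurrences, density in percent), most frequent first."""
    parser = VisibleTextParser()
    parser.feed(page)
    parser.close()
    text = " ".join(parser.chunks).lower()
    words = [w for w in re.findall(r"[a-z0-9']+", text) if len(w) >= min_length]
    total = len(words)
    if total == 0:
        return []
    counts = Counter(words)
    return [(w, n, 100.0 * n / total) for w, n in counts.most_common()]


# Hypothetical sample markup, used only for this illustration
SAMPLE_BODY = (
    "<body><h1>Tidy HTML</h1>"
    "<p>Tidy cleans HTML. HTML stays readable.</p>"
    "<script>var html = 1;</script></body>"
)

result = keyword_density(SAMPLE_BODY)
for word, occurrences, density in result:
    print(f"{word}: {occurrences} ({density:.1f}%)")
```

The script content is skipped on purpose, so the "html" inside the JavaScript does not inflate the count for the visible word.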
The Preview tab uses an Internet Explorer based browser to preview the web page. End users can simply click on any link in the preview and start analyzing that page.
The HTML source tab provides the exact HTML source code downloaded from the web page. If the source is compressed or contains errors, the Tidy HTML source becomes very useful.
WebPageHTMLAnalyzer uses the TidyLib library for tidying up HTML. Tidy is composed of an HTML parser and an HTML pretty printer. The parser goes to considerable lengths to correct common markup errors. It also provides advice on how to make your pages more accessible to people with disabilities, and can be used to convert HTML content into XML as XHTML.
This is a very useful resource for webmasters who want to check and clean up the HTML source code of their web pages. It is especially useful for finding and correcting errors in deeply nested HTML, or for making grotesque code legible once more.
The WebPageHTMLAnalyzer tool comes in very handy if you are trying to analyze a web page with compressed HTML (white space removed). Simply process the web page with the Tidy indent option set to Auto or Blocks, and it will automatically format the HTML source code so that webmasters can easily read and understand it.
Tidying HTML provides various options to end users
- Add a meta element indicating the document was tidied.
- Suppress optional end tags.
- Text in the body is wrapped in <p>.
- Text in blocks is wrapped in <p>
- Replace i by em and b by strong.
- Replace presentational clutter by style rules.
- Discard presentation tags.
- Discard empty p elements.
- Draconian cleaning for Word 2000.
- Fix comments with adjacent hyphens.
- Fix URLs by replacing \ with /.
- Treat input as XML.
- Add <?xml ?> for XML docs.
- If set to yes, adds the xml:space attribute as needed.
- Make bare HTML.
- Use numeric entities for symbols.
- Output non-breaking space as entity.
- Output naked ampersand as &amp;.
- Output tags in upper not lower case.
- Output attributes in upper not lower case.
- Wrap within attribute values.
- Wrap within section tags.
- Wrap within ASP pseudo elements.
- Wrap within JSTE pseudo elements.
- Wrap within PHP pseudo elements.
- No 'Parsing X', guessed DTD or summary.
- Applies URI encoding if necessary.
- Output BODY content only.
- Hides all (real) comments in output.
- Sets the doctype mode for output.
- Indent the content of appropriate tags.
- Set the output type: XML, XHTML or pure HTML.
The WebPageHTMLAnalyzer tool provides an option to set the user agent. A user agent is the client application used with a particular network protocol; the phrase is most commonly used in reference to those which access the World Wide Web.
When Internet users visit a web site, a text string is generally sent to identify the user agent to the server. This forms part of the HTTP request, prefixed with User-Agent: (case does not matter) and typically includes information such as the application name, version, host operating system, and language. Bots, such as web crawlers, often also include a URL and/or e-mail address so that the webmaster can contact the operator of the bot.
The user-agent string is one of the criteria by which web crawlers can be excluded from certain pages or parts of a website using the "Robots Exclusion Standard" (robots.txt). This allows webmasters who feel that certain parts of their website should not be included in the data gathered by a particular crawler, or that a particular crawler is using up too much bandwidth, to request that crawler not to visit those pages.
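To make the robots.txt mechanism concrete, here is a small Python sketch that uses the standard library's urllib.robotparser to check whether a given user agent may fetch a URL. The robots.txt content, the example.com URLs and the helper name is_allowed are assumptions invented for the illustration; nothing is downloaded.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: Googlebot is kept out of /private/,
# every other crawler is kept out of /tmp/.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /tmp/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


checks = [
    ("Googlebot/2.1", "http://example.com/private/page.html"),
    ("Googlebot/2.1", "http://example.com/index.html"),
    ("Lynx/2.8.5rel.1", "http://example.com/private/page.html"),
    ("Lynx/2.8.5rel.1", "http://example.com/tmp/cache.html"),
]
for agent, url in checks:
    print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

The parser matches on the product name before the slash, so "Googlebot/2.1" is governed by the Googlebot group while the Lynx string falls through to the "*" group.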
The software provides a few common user agent strings to select from, or users can enter their own user agent.
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
- Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
- Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)
- Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30)
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)
- Mozilla/4.0 (compatible; MSIE 5.0; Windows NT 5.1; .NET CLR 1.1.4322)
- Opera/9.20 (Windows NT 6.0; U; en)
- Opera/9.00 (Windows NT 5.1; U; en)
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.50
- Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; en) Opera 8.0
- Mozilla/4.0 (compatible; MSIE 6.0; MSIE 5.5; Windows NT 5.1) Opera 7.02 [en]
- Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20060127 Netscape/8.1
- Googlebot/2.1 ( http://www.googlebot.com/bot.html)
- Googlebot-Image/1.0 ( http://www.googlebot.com/bot.html)
- Mozilla/2.0 (compatible; Ask Jeeves)
- msnbot-Products/1.0 (+http://search.msn.com/msnbot.htm)
- ELinks/0.9.3 (textmode; Linux 2.6.9-kanotix-8 i686; 127x41)
- Links/0.9.1 (Linux 2.4.24; i386;)
- Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/1.0.16
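Selecting one of the strings above amounts to setting the User-Agent header on the outgoing HTTP request. The Python sketch below shows the idea with the standard library; it only builds the request object and does not contact any server. The dictionary labels, the build_request helper and the example.com URL are assumptions for this illustration, not part of the tool.

```python
from urllib.request import Request

# A few of the preset strings listed above, keyed by hypothetical labels
USER_AGENTS = {
    "MSIE 7.0": "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)",
    "Opera 9.20": "Opera/9.20 (Windows NT 6.0; U; en)",
    "Lynx": "Lynx/2.8.5rel.1 libwww-FM/2.14 SSL-MM/1.4.1 GNUTLS/1.0.16",
}


def build_request(url, user_agent):
    """Create a request that identifies itself with the given user agent."""
    return Request(url, headers={"User-Agent": user_agent})


request = build_request("http://example.com/", USER_AGENTS["Opera 9.20"])

# urllib stores header names capitalized as "User-agent"
print(request.get_header("User-agent"))
```

Passing the request to urllib.request.urlopen would send it with that header, which is how a server or a robots.txt rule would see the chosen identity.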
The HTML DOM Node Tree (Document Tree)
The WebPageHTMLAnalyzer tool presents the HTML DOM, a view of an HTML document as a tree structure based on XHTML. The tree structure is called a node tree. The tree starts at the root node and branches out to the text nodes at the lowest level of the tree. The nodes in the node tree have a hierarchical relationship to each other. The terms parent, child, and sibling are used to describe the relationships. Parent nodes have children. Children on the same level are called siblings (brothers or sisters).
Search engines, for now and for the foreseeable future, will continue to prefer plain text for indexing, so end users should pay attention to the plain text content of their web pages. This information helps ensure that a web page has the correct plain text content to be spidered by the search engines, which can in turn help webmasters achieve higher rankings in search results.
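The parent, child and sibling relationships can be illustrated with a short Python sketch that builds a simple node tree from markup. It assumes well-formed, XHTML-style input (for example, the output of Tidy) and is not how the tool itself constructs its tree; the Node and TreeBuilder classes and the sample page are hypothetical.

```python
from html.parser import HTMLParser


class Node:
    """One node in the tree: an element, a text node, or the document root."""

    def __init__(self, name, parent=None, text=None):
        self.name = name
        self.parent = parent
        self.text = text
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def siblings(self):
        """Other children of the same parent."""
        if self.parent is None:
            return []
        return [n for n in self.parent.children if n is not self]


class TreeBuilder(HTMLParser):
    # Elements that never have content, so the builder does not descend into them
    VOID = {"br", "hr", "img", "input", "link", "meta"}

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self._current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self._current)
        if tag not in self.VOID:
            self._current = node

    def handle_endtag(self, tag):
        # Climb to the nearest open element with this name, then step out of it
        node = self._current
        while node is not self.root and node.name != tag:
            node = node.parent
        if node is not self.root:
            self._current = node.parent

    def handle_data(self, data):
        if data.strip():
            Node("#text", self._current, data.strip())


def build_tree(page):
    builder = TreeBuilder()
    builder.feed(page)
    builder.close()
    return builder.root


def dump(node, depth=0):
    """Return the tree as a list of indented lines, one per node."""
    label = node.name if node.text is None else f'{node.name} "{node.text}"'
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(dump(child, depth + 1))
    return lines


# Hypothetical sample page, used only for this illustration
SAMPLE_PAGE = (
    "<html><head><title>Demo</title></head>"
    "<body><h1>Heading</h1><p>Text</p></body></html>"
)

root = build_tree(SAMPLE_PAGE)
print("\n".join(dump(root)))
```

In the sample, body is the parent of h1 and p, which are siblings of each other, and the text nodes sit at the lowest level of the tree.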
Software version and platform information
- Build Year: 2009
- Development Status : Beta
- Operating System : 32-bit MS Windows .Net 2.0
- IDE: Microsoft Visual Studio 2008
- Intended Audience : Webmasters, Web Developers, Search Engine Optimizers, SEO Developers
- Programming Language : C#
- User Interface : GUI (Graphical User Interface)
- Version: 1.0
Download This Webmaster Tools Free Software.
Download materials for this article (Webmaster Tools - Free Software)
File size: 180 KB, File type: zip
Total downloads: 317, Upload date: April 12 - 2009
yoz :: July 13-2009 :: 09:35 AM
Downloading web page...
The operation has timed out!
Administrator :: July 13-2009 :: 12:08 PM
Please try the download again. Our server most likely had a small issue with high traffic at the moment you were downloading. We are sorry for any inconvenience this caused you.