tagsoup - A SAX-compliant HTML parser written in Java
Description:
TagSoup is a SAX-compliant parser written in Java that, instead of parsing well-formed or valid XML, parses HTML as it is found in the wild: nasty and brutish, though quite often far from short. TagSoup is designed for people who have to process this stuff using some semblance of a rational application design. By providing a SAX interface, it allows standard XML tools to be applied to even the worst HTML.
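To illustrate what the SAX interface buys you, here is a minimal sketch of a parser-agnostic handler that collects link targets. The names `LinkCollector`, `collect`, and `jdkReader` are hypothetical and chosen only for this example. So that the sketch runs with nothing but the JDK, it uses the JDK's built-in XML parser on well-formed input. With the TagSoup jar on the classpath, the assumption is that you would pass an instance of `org.ccil.cowan.tagsoup.Parser` (which implements `XMLReader`) to `collect` instead, and the same handler would then accept messy real-world HTML.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class: a SAX ContentHandler that records every <a href="...">.
class LinkCollector extends DefaultHandler {
    private final List<String> links = new ArrayList<>();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        // Namespace-aware readers fill in localName; others only fill in qName.
        String name = localName.isEmpty() ? qName : localName;
        if ("a".equalsIgnoreCase(name)) {
            String href = atts.getValue("href");
            if (href != null) {
                links.add(href);
            }
        }
    }

    List<String> getLinks() {
        return links;
    }

    // Works with any XMLReader. Assumption: with TagSoup installed, the caller
    // would pass new org.ccil.cowan.tagsoup.Parser() here.
    static List<String> collect(XMLReader reader, String markup) throws Exception {
        LinkCollector handler = new LinkCollector();
        reader.setContentHandler(handler);
        reader.parse(new InputSource(new StringReader(markup)));
        return handler.getLinks();
    }

    // Stand-in reader from the JDK; it accepts only well-formed XML.
    static XMLReader jdkReader() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newSAXParser().getXMLReader();
    }

    public static void main(String[] args) throws Exception {
        String page = "<html><body><a href=\"http://example.org/\">home</a></body></html>";
        System.out.println(collect(jdkReader(), page));
    }
}
```

Because the handler depends only on the `org.xml.sax` interfaces, swapping the reader is the only change needed to move from strict XML to HTML as found in the wild.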
Homepage: http://home.ccil.org/~cowan/XML/tagsoup/
License: GPL
Vendor: Fedora Project
Packages
tagsoup-1.0.1-1jpp.1.fc7.i386 [146 KiB]
Changelog by Vivek Lakshmanan (2007-02-12):
- rpmlint fixes
- Use Fedora-approved naming convention
- Fix buildroot to conform to Fedora packaging guidelines
- Add LICENSE to the rpm and label as doc
- Remove Vendor and Distribution tags
- Minor formatting fixes
- Use proper javadoc handling
- Add requires and requires(x) on jpackage-utils
- Add GCJ support
- BR on ant-trax and xalan-j2