| Name: python3-html-text | Distribution: Fedora Project |
| Version: 0.6.2 | Vendor: Fedora Project |
| Release: 1.fc41 | Build date: Fri Oct 25 05:31:48 2024 |
| Group: Unspecified | Build host: buildvm-s390x-09.s390.fedoraproject.org |
| Size: 30591 | Source RPM: python-html-text-0.6.2-1.fc41.src.rpm |
| Packager: Fedora Project | |
| Url: https://github.com/zytedata/html-text | |
| Summary: Extract text from HTML | |
How is html_text different from .xpath('//text()') from lxml
or .get_text() from Beautiful Soup?
- Text extracted with html_text does not contain inline styles,
JavaScript, comments and other text that is not normally visible
to users;
- html_text normalizes whitespace, but in a smarter way than
.xpath('normalize-space()'), adding spaces around inline elements
(which are often used as block elements in HTML markup) and trying
to avoid adding extra spaces around punctuation;
- html_text can add newlines (e.g. after headers or paragraphs), so
that the output text looks more like how it is rendered in browsers.
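
The three points above can be illustrated with a small standard-library sketch. This is not html_text's implementation (the package is built on lxml and handles many more cases); the tag sets, the class name VisibleTextParser and the function name visible_text are assumptions made up for this illustration only.

```python
import re
from html.parser import HTMLParser

# Illustrative tag sets (assumptions, not html_text's actual lists).
INVISIBLE_TAGS = {"script", "style", "head", "noscript", "template"}
NEWLINE_AFTER_TAGS = {"p", "div", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class VisibleTextParser(HTMLParser):
    """Collect only text a user would normally see (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.hidden_depth = 0  # > 0 while inside an invisible element

    def handle_starttag(self, tag, attrs):
        if tag in INVISIBLE_TAGS:
            self.hidden_depth += 1
        elif tag == "br":
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in INVISIBLE_TAGS:
            self.hidden_depth = max(0, self.hidden_depth - 1)
        elif tag in NEWLINE_AFTER_TAGS:
            self.parts.append("\n")

    def handle_data(self, data):
        if self.hidden_depth == 0:
            # Whitespace in the markup (including newlines) becomes one space.
            self.parts.append(re.sub(r"\s+", " ", data))

    # handle_comment is left as the default no-op, so comments are dropped.


def visible_text(html):
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    lines = "".join(parser.parts).split("\n")
    cleaned = [re.sub(r" +", " ", line).strip() for line in lines]
    return "\n".join(line for line in cleaned if line)


SAMPLE = (
    "<html><head><title>T</title><style>p {color: red}</style></head>"
    "<body><h1>Hello</h1><!-- note --><p>world  <b>wide</b>\n web</p>"
    "<script>var x = 1;</script></body></html>"
)
```

Calling visible_text(SAMPLE) keeps the header and paragraph text on separate lines while skipping the style sheet, the script and the comment. With the package itself installed, the project's README documents html_text.extract_text(html) as the entry point for the real behavior.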
License: MIT
* Fri Oct 18 2024 Benson Muite <benson_muite@emailplus.org> - 0.6.2-1 - Initial packaging
/usr/lib/python3.13/site-packages/html_text
/usr/lib/python3.13/site-packages/html_text-0.6.2.dist-info
/usr/lib/python3.13/site-packages/html_text-0.6.2.dist-info/INSTALLER
/usr/lib/python3.13/site-packages/html_text-0.6.2.dist-info/LICENSE
/usr/lib/python3.13/site-packages/html_text-0.6.2.dist-info/METADATA
/usr/lib/python3.13/site-packages/html_text-0.6.2.dist-info/WHEEL
/usr/lib/python3.13/site-packages/html_text-0.6.2.dist-info/top_level.txt
/usr/lib/python3.13/site-packages/html_text/__init__.py
/usr/lib/python3.13/site-packages/html_text/__pycache__
/usr/lib/python3.13/site-packages/html_text/__pycache__/__init__.cpython-313.opt-1.pyc
/usr/lib/python3.13/site-packages/html_text/__pycache__/__init__.cpython-313.pyc
/usr/lib/python3.13/site-packages/html_text/__pycache__/html_text.cpython-313.opt-1.pyc
/usr/lib/python3.13/site-packages/html_text/__pycache__/html_text.cpython-313.pyc
/usr/lib/python3.13/site-packages/html_text/html_text.py
/usr/share/doc/python3-html-text
/usr/share/doc/python3-html-text/README.rst
Generated by rpm2html 1.8.1
Fabrice Bellet, Fri Oct 24 01:09:13 2025