
python-Scrapy-doc-2.12.0-1.3 RPM for noarch

From openSUSE Tumbleweed for noarch

Name: python-Scrapy-doc
Distribution: openSUSE Tumbleweed
Version: 2.12.0
Vendor: openSUSE
Release: 1.3
Build date: Tue Dec 3 09:24:29 2024
Group: Unspecified
Build host: reproducible
Size: 0
Source RPM: python-Scrapy-2.12.0-1.3.src.rpm
Packager: https://bugs.opensuse.org
Url: https://scrapy.org
Summary: Documentation for python-Scrapy
Provides documentation for python-Scrapy.

Provides

Requires

License

BSD-3-Clause

Changelog

* Tue Dec 03 2024 Steve Kowalik <steven.kowalik@suse.com>
  - Update to 2.12.0:
    * Dropped support for Python 3.8, added support for Python 3.13
    * start_requests can now yield items (see the sketch after this entry)
    * Added scrapy.http.JsonResponse
    * Added the CLOSESPIDER_PAGECOUNT_NO_ITEM setting
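
    A minimal sketch of the new start_requests behavior, assuming a
    hypothetical spider name and URL (not taken from the package itself)::

        import scrapy

        class ExampleSpider(scrapy.Spider):
            name = "example"  # hypothetical

            def start_requests(self):
                # Since 2.12.0, items may be yielded here directly,
                # alongside requests.
                yield {"source": "seed"}
                yield scrapy.Request("https://example.com", callback=self.parse)

            def parse(self, response):
                yield {"url": response.url}
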
* Thu Jul 11 2024 Dirk Müller <dmueller@suse.com>
  - Update to 2.11.2 (bsc#1224474, CVE-2024-1968):
    * Redirects to non-HTTP protocols are no longer followed.
      Please see the 23j4-mw76-5v7h security advisory for more
      information. (:issue:`457`)
    * The Authorization header is now dropped on redirects to a
      different scheme (http:// or https://) or port, even if the
      domain is the same. Please see the 4qqq-9vqf-3h3f security
      advisory for more information.
    * When using system proxy settings that are different for
      http:// and https://, redirects to a different URL scheme
      will now also trigger the corresponding change in proxy
      settings for the redirected request. Please see the
      jm3v-qxmh-hxwv security advisory for more information.
      (:issue:`767`)
    * :attr:`Spider.allowed_domains
      <scrapy.Spider.allowed_domains>` is now enforced for all
      requests, and not only requests from spider callbacks.
    * :func:`~scrapy.utils.iterators.xmliter_lxml` no longer
      resolves XML entities.
    * defusedxml is now used to make
      :class:`scrapy.http.request.rpc.XmlRpcRequest` more secure.
    * Restored support for brotlipy_, which had been dropped in
      Scrapy 2.11.1 in favor of brotli. (:issue:`6261`) Note that
      brotlipy is deprecated, both in Scrapy and upstream; use
      brotli instead if you can.
    * Make :setting:`METAREFRESH_IGNORE_TAGS` ["noscript"] by default.
      This prevents
      :class:`~scrapy.downloadermiddlewares.redirect.MetaRefreshMiddleware`
      from following redirects that would not be followed by web
      browsers with JavaScript enabled. (A settings sketch follows
      this entry.)
    * During :ref:`feed export <topics-feed-exports>`, do not close
      the underlying file from :ref:`built-in post-processing
      plugins <builtin-plugins>`.
    * :class:`LinkExtractor
      <scrapy.linkextractors.lxmlhtml.LxmlLinkExtractor>` now
      properly applies the unique and canonicalize parameters.
    * Do not initialize the scheduler disk queue if
      :setting:`JOBDIR` is an empty string.
    * Fix :attr:`Spider.logger <scrapy.Spider.logger>` not logging
      custom extra information.
    * robots.txt files with a non-UTF-8 encoding no longer prevent
      parsing the UTF-8-compatible (e.g. ASCII) parts of the
      document.
    * :meth:`scrapy.http.cookies.WrappedRequest.get_header` no
      longer raises an exception if default is None.
    * :class:`~scrapy.selector.Selector` now uses
      :func:`scrapy.utils.response.get_base_url` to determine the
      base URL of a given :class:`~scrapy.http.Response`.
      (:issue:`6265`)
    * The :meth:`media_to_download` method of :ref:`media pipelines
      <topics-media-pipeline>` now logs exceptions before stripping
      them.
    * When passing a callback to the :command:`parse` command,
      build the callback callable with the right signature.
    * Add a FAQ entry about :ref:`creating blank requests
      <faq-blank-request>`.
    * Document that :attr:`scrapy.selector.Selector.type` can be
      "json".
    * Make builds reproducible.
    * Packaging and test fixes
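
    The METAREFRESH_IGNORE_TAGS change above can be reverted per project;
    a minimal settings sketch, assuming the pre-2.11.2 behavior is wanted::

        # settings.py -- restore the old default so MetaRefreshMiddleware
        # also follows <meta refresh> redirects found inside <noscript>.
        METAREFRESH_IGNORE_TAGS = []
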
* Mon Mar 25 2024 Dirk Müller <dmueller@suse.com>
  - Update to 2.11.1 (bsc#1220514, CVE-2024-1892, bsc#1221986):
    * Addressed `ReDoS vulnerabilities`_ (bsc#1220514, CVE-2024-1892)
    - ``scrapy.utils.iterators.xmliter`` is now deprecated in favor of
      :func:`~scrapy.utils.iterators.xmliter_lxml`, which
      :class:`~scrapy.spiders.XMLFeedSpider` now uses.
      To minimize the impact of this change on existing code,
      :func:`~scrapy.utils.iterators.xmliter_lxml` now supports indicating
      the node namespace with a prefix in the node name, and big files with
      highly nested trees when using libxml2 2.7+. (A migration sketch
      follows this entry.)
    - Fixed regular expressions in the implementation of the
      :func:`~scrapy.utils.response.open_in_browser` function.
      .. _ReDoS vulnerabilities: https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
    * :setting:`DOWNLOAD_MAXSIZE` and :setting:`DOWNLOAD_WARNSIZE` now also apply
      to the decompressed response body. Please see the `7j7m-v7m3-jqm7 security
      advisory`_ for more information. (bsc#1221986)
      .. _7j7m-v7m3-jqm7 security advisory: https://github.com/scrapy/scrapy/security/advisories/GHSA-7j7m-v7m3-jqm7
    * Also in relation with the `7j7m-v7m3-jqm7 security advisory`_, the
      deprecated ``scrapy.downloadermiddlewares.decompression`` module has been
      removed.
    * The ``Authorization`` header is now dropped on redirects to a different
      domain. Please see the `cw9j-q3vf-hrrv security advisory`_ for more
      information.
    * The OS signal handling code was refactored to no longer use private Twisted
      functions. (:issue:`6024`, :issue:`6064`, :issue:`6112`)
    * Improved documentation for :class:`~scrapy.crawler.Crawler` initialization
      changes made in the 2.11.0 release. (:issue:`6057`, :issue:`6147`)
    * Extended documentation for :attr:`Request.meta <scrapy.http.Request.meta>`.
    * Fixed the :reqmeta:`dont_merge_cookies` documentation. (:issue:`5936`)
    * Added a link to Zyte's export guides to the :ref:`feed exports
      <topics-feed-exports>` documentation.
    * Added a missing note about backward-incompatible changes in
      :class:`~scrapy.exporters.PythonItemExporter` to the 2.11.0 release notes.
    * Added a missing note about removing the deprecated
      ``scrapy.utils.boto.is_botocore()`` function to the 2.8.0 release notes.
    * Other documentation improvements. (:issue:`6128`, :issue:`6144`,
      :issue:`6163`, :issue:`6190`, :issue:`6192`)
  - drop twisted-23.8.0-compat.patch (upstream)
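
    A minimal migration sketch for the xmliter deprecation above, assuming
    a hypothetical feed URL and node name::

        import scrapy
        from scrapy.utils.iterators import xmliter_lxml

        class FeedSpider(scrapy.Spider):
            name = "feed_example"  # hypothetical
            start_urls = ["https://example.com/feed.xml"]

            def parse(self, response):
                # xmliter_lxml replaces the deprecated, ReDoS-prone xmliter;
                # it no longer resolves XML entities and yields one Selector
                # per matching node.
                for node in xmliter_lxml(response, "item"):
                    yield {"title": node.xpath("string(./title)").get()}
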
* Wed Jan 10 2024 Daniel Garcia <daniel.garcia@suse.com>
  - Add patch twisted-23.8.0-compat.patch gh#scrapy/scrapy#6064
  - Update to 2.11.0:
    - Spiders can now modify settings in their from_crawler methods,
      e.g. based on spider arguments (see the sketch after this entry).
    - Periodic logging of stats.
    - Bug fixes.
  - 2.10.0:
    - Added Python 3.12 support, dropped Python 3.7 support.
    - The new add-ons framework simplifies configuring 3rd-party
      components that support it.
    - Exceptions to retry can now be configured.
    - Many fixes and improvements for feed exports.
  - 2.9.0:
    - Per-domain download settings.
    - Compatibility with new cryptography and new parsel.
    - JMESPath selectors from the new parsel.
    - Bug fixes.
  - 2.8.0:
    - This is a maintenance release, with minor features, bug fixes, and
      cleanups.
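
    A minimal sketch of modifying settings from from_crawler, assuming a
    hypothetical spider argument and setting choice::

        import scrapy

        class TunedSpider(scrapy.Spider):
            name = "tuned"  # hypothetical

            @classmethod
            def from_crawler(cls, crawler, *args, **kwargs):
                spider = super().from_crawler(crawler, *args, **kwargs)
                # Since 2.11.0, settings can still be changed here,
                # e.g. based on a spider argument such as -a fast=1.
                if kwargs.get("fast"):
                    crawler.settings.set("CONCURRENT_REQUESTS", 32,
                                         priority="spider")
                return spider
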
* Mon Nov 07 2022 Yogalakshmi Arunachalam <yarunachalam@suse.com>
  - Update to v2.7.1
    * Relaxed the restriction introduced in 2.6.2 so that the Proxy-Authorization
      header can again be set explicitly in certain cases, restoring
      compatibility with scrapy-zyte-smartproxy 2.1.0 and older
    * Bug fixes
    * Full changelog: https://docs.scrapy.org/en/latest/news.html#scrapy-2-7-1-2022-11-02
* Thu Oct 27 2022 Yogalakshmi Arunachalam <yarunachalam@suse.com>
  - Update to v2.7.0
    Highlights:
    * Added Python 3.11 support, dropped Python 3.6 support
    * Improved support for :ref:`asynchronous callbacks <topics-coroutines>`
      (see the sketch after this entry)
    * :ref:`Asyncio support <using-asyncio>` is enabled by default on new projects
    * Output names of item fields can now be arbitrary strings
    * Centralized :ref:`request fingerprinting <request-fingerprints>` configuration is now possible
    Modified requirements
    * Python 3.7 or greater is now required; support for Python 3.6 has been dropped. Support for the upcoming Python 3.11 has been added.
      The minimum required version of some dependencies has changed as well:
    - lxml: 3.5.0 → 4.3.0
    - Pillow (:ref:`images pipeline <images-pipeline>`): 4.0.0 → 7.1.0
    - zope.interface: 5.0.0 → 5.1.0
      (:issue:`5512`, :issue:`5514`, :issue:`5524`, :issue:`5563`, :issue:`5664`, :issue:`5670`, :issue:`5678`)
    Deprecations
    - :meth:`ImagesPipeline.thumb_path <scrapy.pipelines.images.ImagesPipeline.thumb_path>` must now accept an item parameter (:issue:`5504`, :issue:`5508`).
    - The scrapy.downloadermiddlewares.decompression module is now deprecated (:issue:`5546`, :issue:`5547`).
    Complete changelog https://github.com/scrapy/scrapy/blob/2.7/docs/news.rst
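
    A minimal sketch of an asynchronous callback, assuming a hypothetical
    spider and URL; with the asyncio reactor enabled (the default on new
    projects since 2.7.0) Scrapy awaits the coroutine itself::

        import scrapy

        class AsyncSpider(scrapy.Spider):
            name = "async_example"  # hypothetical
            start_urls = ["https://example.com"]

            async def parse(self, response):
                # Callbacks defined with async def are supported;
                # arbitrary awaitables can be awaited in the body.
                yield {"url": response.url}
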
* Fri Sep 09 2022 Yogalakshmi Arunachalam <yarunachalam@suse.com>
  - Update to v2.6.2
    Security bug fix:
    * When HttpProxyMiddleware processes a request with proxy metadata, and that proxy metadata includes proxy credentials,
      HttpProxyMiddleware sets the Proxy-Authorization header, but only if that header is not already set.
    * There are third-party proxy-rotation downloader middlewares that set different proxy metadata every time they process a request.
    * Because of request retries and redirects, the same request can be processed by downloader middlewares more than once,
      including both HttpProxyMiddleware and any third-party proxy-rotation downloader middleware.
    * These third-party proxy-rotation downloader middlewares could change the proxy metadata of a request to a new value,
      but fail to remove the Proxy-Authentication header from the previous value of the proxy metadata, causing the credentials of one
      proxy to be sent to a different proxy.
    * To prevent the unintended leaking of proxy credentials, the behavior of HttpProxyMiddleware is now as follows when processing a request:
      + If the request being processed defines proxy metadata that includes credentials, the Proxy-Authorization header is always updated
      to feature those credentials.
      + If the request being processed defines proxy metadata without credentials, the Proxy-Authorization header is removed unless
      it was originally defined for the same proxy URL.
      + To remove proxy credentials while keeping the same proxy URL, remove the Proxy-Authorization header.
      + If the request has no proxy metadata, or that metadata is a falsy value (e.g. None), the Proxy-Authorization header is removed.
      + It is no longer possible to set a proxy URL through the proxy metadata but set the credentials through the Proxy-Authorization header.
      Set proxy credentials through the proxy metadata instead (see the
      sketch after this entry).
    * Also fixes the following regressions introduced in 2.6.0:
      + CrawlerProcess again supports crawling multiple spiders (issue 5435, issue 5436)
      + Installing a Twisted reactor before Scrapy does (e.g. importing twisted.internet.reactor somewhere at the module level)
        no longer prevents Scrapy from starting, as long as a different reactor is not specified in TWISTED_REACTOR (issue 5525, issue 5528)
      + Fixed an exception that was being logged after the spider finished under certain conditions (issue 5437, issue 5440)
      + The --output/-o command-line parameter again supports a value starting with a hyphen (issue 5444, issue 5445)
      + The scrapy parse -h command no longer throws an error (issue 5481, issue 5482)
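
    A minimal sketch of the supported pattern after this change, assuming a
    hypothetical proxy host and credentials::

        import scrapy

        class ProxySpider(scrapy.Spider):
            name = "proxy_example"  # hypothetical

            def start_requests(self):
                # Pass credentials through the proxy metadata (user:pass
                # embedded in the proxy URL); do not set the
                # Proxy-Authorization header directly.
                yield scrapy.Request(
                    "https://example.com",
                    meta={"proxy": "http://user:secret@proxy.example.com:8080"},
                )
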
* Fri Mar 04 2022 Ben Greiner <code@bnavigator.de>
  - Update runtime requirements and test deselections
* Wed Mar 02 2022 Matej Cepl <mcepl@suse.com>
  - Update to v2.6.1
    * Security fixes for cookie handling (CVE-2022-0577 aka
      bsc#1196638, GHSA-mfjm-vh54-3f96)
    * Python 3.10 support
    * asyncio support is no longer considered experimental, and works
      out-of-the-box on Windows regardless of your Python version
    * Feed exports now support pathlib.Path output paths and per-feed
      item filtering and post-processing (see the settings sketch after
      this entry)
  - Remove unnecessary patches:
    - remove-h2-version-restriction.patch
    - add-peak-method-to-queues.patch
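
    A minimal settings sketch for the feed-export features above, assuming
    hypothetical output paths; GzipPlugin is one of the built-in
    post-processing plugins::

        # settings.py
        from pathlib import Path

        FEEDS = {
            Path("exports/items.jsonl.gz"): {
                "format": "jsonlines",
                "postprocessing": ["scrapy.extensions.postprocessing.GzipPlugin"],
            },
        }
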
* Sun Jan 16 2022 Ben Greiner <code@bnavigator.de>
  - Skip a failing test in python310: exception format not recognized

Files

/usr/share/doc/packages/python-Scrapy-doc
/usr/share/doc/packages/python-Scrapy-doc/html

