Here you'll find a contributing guide to get started with development.
For local development, it is required to have Python 3.9 (or a later version) installed.
We use Poetry for project management. Install it and set up your IDE accordingly.
To install this package and its development dependencies, run:
make install-devTo execute all code checking tools together, run:
make check-codeWe utilize ruff for linting, which analyzes code for potential issues and enforces consistent style. Refer to pyproject.toml for configuration details.
To run linting:
make lintOur automated code formatting also leverages ruff, ensuring uniform style and addressing fixable linting issues. Configuration specifics are outlined in pyproject.toml.
To run formatting:
make formatType checking is handled by mypy, verifying code against type annotations. Configuration settings can be found in pyproject.toml.
To run type checking:
make type-checkWe employ pytest as our testing framework, equipped with various plugins. Check pyproject.toml for configuration details and installed plugins.
We use pytest as a testing framework with many plugins. Check pyproject.toml for configuration details and installed plugins.
To run unit tests:
make unit-testsTo run unit tests with HTML coverage report:
make unit-tests-covWe adhere to the Google docstring format for documenting our codebase. Every user-facing class or method is documented. Documentation standards are enforced using ruff.
Our documentation is generated from these docstrings using pydoc-markdown with additional post-processing. Markdown files in the docs/ folder complement the autogenerated content. Final documentation is rendered using Docusaurus and published to GitHub Pages.
To run the documentation locally, you need to have Node.js version 20 or higher installed. Once you have the correct version of Node.js, follow these steps:
Navigate to the website/ directory:
cd website/Enable Corepack, which installs Yarn automatically:
corepack enableInstall the necessary dependencies:
yarnStart the project in development mode with Hot Module Replacement (HMR):
yarn startPublishing new versions to PyPI is automated through GitHub Actions.
- Alpha releases: Alpha releases can be triggered manually via GitHub Actions, typically from a pull request or a dedicated branch. This allows for testing new features or changes before merging into the master branch. Alpha releases use the version number from
pyproject.tomlwith an alpha suffix. - Beta releases: On each commit to the master branch, a new beta release is automatically published. The version number is sourced from
pyproject.toml, with the beta version suffix incremented by 1 from the last beta release on PyPI. - Stable releases: A stable version is published when a new release is created using GitHub Releases. The version number is taken from
pyproject.toml, and the built package assets are automatically uploaded to the GitHub release.
Important notes:
- Ensure the version number in
pyproject.tomlis updated before creating a new release. If a stable version with the same version number already exists on PyPI, the publish process will fail. - The release process also fails if the version is not documented in
CHANGELOG.md. Make sure to describe the changes in the new version there. - After a stable release, ensure to increment the version number in both
pyproject.tomlandcrawlee/__init__.py.
Before merging a pull request or committing directly to master for automatic beta release:
- Ensure
pyproject.tomlversion reflects the latest non-published version. - Describe changes in
CHANGELOG.mdunder the latest non-published version section.
Before creating a GitHub Release for production:
- Confirm successful deployment of the latest beta release with the latest commit.
- Ensure changes are documented in
CHANGELOG.mdsince the last production release. - When drafting a GitHub release:
- Create a new tag like
v1.2.3targeting the master branch. - Use
1.2.3as the release title. - Copy changes from
CHANGELOG.mdinto the release description. - Check "Set as the latest release" option for visibility.
- Create a new tag like
To release a new version manually, follow these steps. Note that manual releases should only be performed if you have a good reason, use the automated release process otherwise.
- Update the version number:
- Modify the
versionfield undertool.poetryinpyproject.toml. - Update the
__version__field incrawlee/__init__.py.
[tool.poetry]
name = "crawlee"
version = "x.z.y"- Generate the distribution archives for the package:
poetry build- Set up the PyPI API token for authentication:
poetry config pypi-token.pypi YOUR_API_TOKEN- Upload the package to PyPI:
poetry publish