Some Best Practices for Developing with Poetry
I love poetry! Poetry is, in my opinion, simply the best way to manage dependencies for python projects.
- No more "just run pip install and hope"!
- No more wondering how on earth you got that package you wrote six months ago to run!
- No more forgetting to activate the virtualenv!
Poetry handles all the annoying, stupid aspects of python dependency and project management: metadata storage (one simple, standardized file: pyproject.toml), package installation, dependency resolution, isolation via virtual environments, etc. Best of all, poetry is designed so that (most of the time) doing things the right way is also the easiest way.
Having said that, poetry is not perfect and there are still lessons to be learned about getting the most out of poetry. To wit:
Check in poetry.lock
When you run poetry install or poetry update, your poetry.lock file will change, often in ways that result in a really big git diff. That's annoying! Who wants that in their repo?
While the large diff can be annoying, having poetry.lock available in the repo itself is a good idea because it gives you confidence that whoever or whatever is installing your project will get the exact same dependency versions. That means, if it works on your local machine, it will work on Johnny Data Scientist's machine, or on the CI machine, or on the client's.
If you're building a library rather than an application, you can rest assured that poetry.lock is ignored when your project is specified as a dependency for another project.
Get in the habit of poetry update-ing
One of my "good habits" is to run a git pull on the master branch of any project I'm collaborating on every morning. When that project is using poetry, I follow it up with a poetry update. This ensures that I keep my local virtual environment in a runnable state, in case anyone upstream has changed the dependencies or metadata. It also handily updates poetry.lock, in case anyone else has forgotten to do so.
I'd also recommend running poetry update straight after a git checkout, for the same reasons.
Use pyenv to manage python version
Poetry is clever, but it's not great at figuring out which python executable to use if your default python is the wrong version.
The best way to avoid this problem is to use pyenv. Pyenv is a python version manager, which lets you easily install and switch between different versions of python.
Suppose you're about to start working on a project using poetry. The first thing you should do is check pyproject.toml for the python version; let's say you find python = "^3.6". You could then run pyenv install 3.6.11 (the latest version of python 3.6 at the time of writing), and then from the root of your project run pyenv local 3.6.11. This will create a .python-version file, which you'll probably want to add to your .gitignore. Finally, when you run poetry install, poetry will pick up the right version of python.
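If you're not sure whether a given interpreter satisfies a constraint like python = "^3.6", here is a rough sketch of the check. The helper name satisfies_caret is made up for illustration, and it only covers the simple case of a non-zero major version, where ^X.Y means "at least X.Y, but below (X+1).0". It is not how poetry itself parses constraints.

```python
import sys


def satisfies_caret(constraint, version_info):
    """Check a version against a caret constraint such as "^3.6".

    Simplified on purpose: assumes a non-zero major version, where
    ^X.Y means >= X.Y and < (X+1).0.
    """
    lower = tuple(int(part) for part in constraint.lstrip("^").split("."))
    upper = (lower[0] + 1,)
    current = tuple(version_info[:3])
    return lower <= current < upper


print(satisfies_caret("^3.6", (3, 6, 11)))  # True
print(satisfies_caret("^3.6", (3, 5, 9)))   # False

# Check whichever interpreter is running this script.
print(satisfies_caret("^3.6", sys.version_info))
```

Running this with the interpreter that pyenv selected is a quick sanity check before poetry install.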
Explicitly declare dependencies
Suppose you already have scikit-learn as a dependency. If you find yourself needing numpy or scipy later, it can be tempting not to add them as dependencies, since they're already covered (transitively) by scikit-learn. This is a bad idea!
Firstly, as stated in the Zen of Python, "explicit is better than implicit". Importing numpy but not declaring it as a dependency is a recipe for confusion.
Secondly, and more seriously, it doesn't let you specify a version of your dependency. You might be relying on a feature in some_package which was added in version 2.1, but your dependencies might only need version 1.5. Any small change might result in the dependency resolver deciding to install version 1.8 instead of 2.2, and then suddenly your application is mysteriously broken and no-one knows why.
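To make that failure mode concrete, here is a toy model of version resolution. It is emphatically not poetry's real resolver: the release list, the resolve function and the constraint format are all invented for illustration. It simply picks the newest release that satisfies every constraint.

```python
def parse(version):
    """Turn "2.1" into (2, 1) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


# Hypothetical releases of some_package.
AVAILABLE = ["1.5", "1.8", "2.1", "2.2"]

OPS = {
    ">=": lambda a, b: a >= b,
    "<": lambda a, b: a < b,
}


def resolve(available, constraints):
    """Return the newest version satisfying every (op, bound) constraint."""
    candidates = [
        v for v in available
        if all(OPS[op](parse(v), parse(bound)) for op, bound in constraints)
    ]
    if not candidates:
        raise ValueError(f"no version satisfies {constraints}")
    return max(candidates, key=parse)


# Only the transitive requirement exists: you happen to get 2.2.
transitive_only = [(">=", "1.5")]
print(resolve(AVAILABLE, transitive_only))  # 2.2

# Some other dependency starts capping the version: you silently get 1.8.
after_small_change = transitive_only + [("<", "2.0")]
print(resolve(AVAILABLE, after_small_change))  # 1.8

# With your own requirement declared, the conflict is reported up front.
explicit = after_small_change + [(">=", "2.1")]
try:
    resolve(AVAILABLE, explicit)
except ValueError as err:
    print("conflict:", err)
```

The point of the last case is that an explicit declaration turns a mysterious runtime breakage into an error at install time.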
Only declare your version once
It's good practice, when developing a python application, to store the package version in a __version__ variable on the root module. However, the version is also defined in pyproject.toml. To avoid defining the version twice, you can add this to your __init__.py:
from importlib.metadata import version
__version__ = version('your package name as defined in pyproject.toml')
Unfortunately this only works for python 3.8 or newer. If you're battling with a legacy python version, you'll need to poetry add importlib_metadata and replace the first line above with:
from importlib_metadata import version