#015 - Python Essentials | 02 - Python Environments Simplified
This is an important topic to let you hit the ground running in your Python engineering projects. Don't do what I did. Be Better.
This article is a beginner-friendly guide on Python environments, exploring their essentials, types, and use cases, designed to equip readers with a thorough understanding of the Python environment landscape and the different tools suitable for a wide range of project needs.
Update Note - January 20th, 2025
For managing my environments, packages and Python version installations, I now use Astral’s uv exclusively, one tool to rule them all. It’s fast, lightweight and simple. I highly recommend it. I write about it in more detail here:
Welcome back to the Python Essentials series. Today, we look at Python Environments. If you have yet to learn what a Python environment is, that's perfectly fine; this article will get you up to speed. Environments are necessary for effective Python use, so this stuff is ‘essential’. Boom.
Introduction
Python is becoming more prevalent as a viable tool for engineers around the world. Whether it's for data analysis, automation, or computational design, Python's role in engineering is undeniable. However, with great power comes great responsibility 🕸, particularly in managing Python environments.
For beginners, understanding the distinction between global and virtual environments and the importance of dependency management is crucial.
This post aims to clarify these concepts, providing a simple, practical guide to managing your Python environments effectively.
What are Python Environments?
A Python environment is a self-contained directory that contains a specific version of Python and various additional packages.
Think of it as a unique workspace for your Python projects, where you can install and manage dependencies without affecting other projects or your system settings. It isolates your project in a confined sandbox.
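If you want to see which workspace you are standing in right now, you can ask the interpreter. This is a minimal sketch using only the standard library. The function name describe_environment is my own invention, and comparing sys.prefix with sys.base_prefix is the check described for environments made with the built-in venv module, so treat the result as a hint and not a guarantee for every tool.

```python
import sys


def describe_environment() -> dict:
    """Summarize the Python environment that is running this script."""
    return {
        # e.g. "3.12.1"; the first token of the longer sys.version string
        "python_version": sys.version.split()[0],
        # Full path to the interpreter currently executing this code
        "executable": sys.executable,
        # Root directory of the active environment
        "prefix": sys.prefix,
        # Inside a venv-style environment these two prefixes differ
        "in_virtual_env": sys.prefix != sys.base_prefix,
    }


info = describe_environment()
for key, value in info.items():
    print(f"{key}: {value}")
```

Run it once from your global install and once from inside a virtual environment and compare the paths it reports.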
Python Environments are a concept that can be tricky to grasp at the beginning of your Python journey. There are many different versions of Python. A new feature release comes out roughly once a year, with bug-fix releases every couple of months in between. This results in an Indiana Jones gauntlet of pitfalls and boobytraps related to managing the versions of both Python and your various packages/libraries and how your different packages and versions of Python relate to or speak to one another.
Sidenote:
Understanding the hierarchical structure of code organization in Python is helpful when considering Python Environments.
Module: This is the simplest unit. A module is a single file containing Python code. It may include functions, classes, and variables, as well as runnable code. Modules are used to organize and reuse code logically. Think of a module as a single book focusing on a specific topic.
Package: A package is a collection of modules organized together. It's like a folder containing several books (modules), each covering different topics but all part of a broader subject (the package's purpose). A package includes a special file (__init__.py) to distinguish it from a directory of scripts.
Library: A library is a broader collection that can include multiple packages or modules. It provides a wide range of tools and functionalities for various tasks. In essence, a library is like an entire bookshelf, containing a series of books (modules) and collections of books (packages) on different but related subjects.
In summary, a library is the broadest of the three terms: it may consist of several packages, while a package is a collection of related modules organized together. Put another way, modules are grouped into packages, and packages are grouped into libraries.
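To make the module and package idea concrete, here is a small sketch that builds a throwaway package in a temporary folder and imports a module from it. Everything named here (beamtools, loads.py, udl_moment) is hypothetical and exists only for illustration. The formula is the familiar w * L^2 / 8 midspan moment for a simply supported beam under a uniform load.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create a tiny package: a folder with __init__.py plus one module."""
    package_dir = root / "beamtools"  # the package (a folder)
    package_dir.mkdir()
    (package_dir / "__init__.py").write_text("")  # marks the folder as a package
    (package_dir / "loads.py").write_text(  # the module (a single file)
        "def udl_moment(w, span):\n"
        "    return w * span ** 2 / 8\n"
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)  # let Python find the new package
    try:
        loads = importlib.import_module("beamtools.loads")
        moment = loads.udl_moment(w=10.0, span=6.0)
    finally:
        sys.path.remove(tmp)  # leave the import path as we found it

print(moment)  # 45.0
```

A library, in this picture, would simply be a larger collection of packages like beamtools that you install in one go.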
This sounds inconvenient, probably because it is. But it's relatively simple to navigate if you understand the ecosystem and how these things work. There is some groundwork in getting your bearings on this topic, hence this article’s inception.
If you ignore this stuff, like I did, you are sowing the seeds for some completely unnecessary self-administered face-punching.
There are two main types of Python environments: global and virtual. The global environment is your system-wide Python installation. When you install Python directly on your operating system, any libraries or packages you install are added to this global environment. This is convenient for quick tasks or when you’re just starting with Python. However, it has downsides, especially when working on multiple projects.
On the other hand, virtual environments are project-specific. They allow you to create isolated Python environments for each project. This means you can have different versions of Python and various libraries for each project, avoiding conflicts and compatibility issues. This is particularly beneficial for engineers when working on complex projects requiring specific versions of NumPy or Pandas.
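As a rough illustration of what "creating a virtual environment" actually does, the sketch below uses Python's built-in venv module, which is one option among several and not necessarily the tool you will settle on. The .venv folder name and the create_project_env helper are my own assumptions. I pass with_pip=False only to keep the demo quick; in a real project you would normally want pip inside the environment.

```python
import tempfile
import venv
from pathlib import Path


def create_project_env(project_dir: Path) -> Path:
    """Create an isolated environment in a .venv folder inside the project."""
    env_dir = project_dir / ".venv"
    # with_pip=False skips installing pip, which keeps this demo fast
    venv.create(env_dir, with_pip=False)
    return env_dir


with tempfile.TemporaryDirectory() as tmp:
    env_dir = create_project_env(Path(tmp))
    # pyvenv.cfg is the small config file that marks a folder as a venv
    has_config = (env_dir / "pyvenv.cfg").is_file()
    top_level = sorted(p.name for p in env_dir.iterdir())

print(has_config)
print(top_level)
```

The point to notice is that an environment is just a folder. Delete the folder and the environment is gone, with no effect on your other projects.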
I use my global environment for most of my work, but there are pitfalls. For example, I wanted to create a simple repository for Flocode for the Linear Regression Example discussed in a previous article; when I tried to create the dependency list, I had hundreds of installed packages that were totally unnecessary because I forgot to cordon off the project in a virtual environment. I ended up listing all of the packages in my global environment. Can you imagine my embarrassment?
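You can check how crowded your own environment is before you generate a dependency list. This sketch uses importlib.metadata from the standard library to list every installed distribution visible to the current interpreter in a name==version format. The function name list_installed_packages is hypothetical.

```python
from importlib import metadata


def list_installed_packages() -> list[str]:
    """Return 'name==version' lines for every package this interpreter can see."""
    lines = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries with damaged or missing metadata
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


installed = list_installed_packages()
print(f"{len(installed)} packages visible to this interpreter")
for line in installed[:5]:  # show only the first few
    print(line)
```

Run from a cluttered global environment, the count can easily reach the hundreds. Run from a fresh virtual environment, it should be close to zero.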
To get lost in this topic, check it out on the official Python Docs: Virtual Environments and Packages.
Bizarrely, there is no standardized Python-wide approach to how you should manage your virtual environments. This has led to spirited debates in the Python community, with anger and frustration reminiscent of the passion and scale of the Battle of Helm's Deep. I hope this gets figured out soon.
Importance of Dependency and Version Management
Managing dependencies and versions in your Python projects means keeping your engineering tools in order.
Dependencies are external libraries or packages that your project needs to function correctly. Over time, these dependencies get updated, sometimes in ways that are not compatible with your project. If you’re working in a global environment, updating a package for one project could break another. This is where virtual environments shine, allowing you to maintain separate dependencies for each project.
Everybody knows the pain of an Excel link breaking, that invasive and relentless Windows alert that gets ignored for eternity because trying to hunt down the issue will lead you beyond the brink of sanity and reason.
Moreover, not keeping track of versions and dependencies can lead to the notorious “it works on my machine” problem. This occurs when a project runs on your computer but fails on another because of different package versions or missing dependencies. This is a significant issue in collaborative engineering projects where consistency and reproducibility are key.
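Here is a small sketch of how a pinned dependency list exposes the "it works on my machine" problem. The requirements text and the colleague's installed versions are made-up illustration data, and parse_pins and find_mismatches are hypothetical helpers. The parser only understands exact name==version pins, which is a deliberate simplification.

```python
def parse_pins(requirements_text: str) -> dict[str, str]:
    """Read exact 'name==version' pins, ignoring comments and blank lines."""
    pins = {}
    for raw_line in requirements_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # drop trailing comments
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip().lower()] = version.strip()
    return pins


def find_mismatches(pins: dict[str, str], installed: dict[str, str]) -> dict:
    """Map each problem package to (wanted version, found version or None)."""
    problems = {}
    for name, wanted in pins.items():
        found = installed.get(name)  # None means the package is missing
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


# Illustration data only: what the project expects...
requirements = """
numpy==1.26.4
pandas==2.2.2  # used for load tables
matplotlib==3.8.4
"""
# ...and what a colleague's machine happens to have installed.
colleague_machine = {"numpy": "1.26.4", "pandas": "1.5.3"}

mismatches = find_mismatches(parse_pins(requirements), colleague_machine)
print(mismatches)
# {'pandas': ('2.2.2', '1.5.3'), 'matplotlib': ('3.8.4', None)}
```

The environment itself does not travel with your code. What travels is the list of pins, and the other person rebuilds a matching environment from it.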
Additionally, avoiding massive requirements files in smaller, simpler projects is important. A large, cluttered requirements file can make your project heavy and slow to set up, especially when many of the listed packages are unnecessary. This is what happens if you don’t set up a virtual environment.
It's like carrying a toolbox filled with tools you don’t need - it's cumbersome and inefficient. Virtual environments allow you to keep a lean, project-specific set of dependencies, ensuring that each project has only what it needs.
In the next section, we'll explore the pros and cons of both global and virtual environments in more detail, helping you choose the right approach for your engineering projects.
Pros and Cons of Global vs Virtual Environments
Global Environments:
Pros:
Simplicity: Ideal for beginners, as it requires minimal setup.
Convenience: Useful for quick, small-scale projects or scripts.
System-wide Access: Any installed package or tool is available across all projects.
Cons:
Dependency Conflicts: Different projects may require different versions of the same package, leading to compatibility issues.
Harder to Replicate: Setting up an identical environment on another machine can be challenging, depending on the system's specific configuration.
Risk of System Issues: Incorrect package installations or updates can potentially affect other unrelated software or system operations.
Virtual Environments:
Pros:
Isolation: Each environment is separate, so there’s no risk of conflicting dependencies.
Customization: Tailor each environment to the specific needs of a project.
Reproducibility: It is easier to replicate the environment across different machines, which is crucial for collaborative engineering work.
Cons:
Initial Learning Curve: Requires some upfront learning to set up and manage.
Maintenance: Each environment must be maintained separately, which can be time-consuming.
Disk Space: Multiple environments can consume significant disk space, especially with large dependencies.
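The dependency conflict listed above under global environments can be sketched in a few lines. Assume two hypothetical projects (the names and version pins below are invented for illustration) that must share one global environment. Any package they pin differently is a conflict, because a single environment can only hold one version of a package at a time.

```python
from collections import defaultdict


def find_conflicts(projects: dict[str, dict[str, str]]) -> dict[str, dict[str, str]]:
    """Return packages that different projects pin to different versions."""
    wanted = defaultdict(dict)
    for project, pins in projects.items():
        for package, version in pins.items():
            wanted[package][project] = version
    # Keep only packages where the projects disagree on the version
    return {
        package: by_project
        for package, by_project in wanted.items()
        if len(set(by_project.values())) > 1
    }


# Hypothetical projects sharing one global environment
projects = {
    "bridge_analysis": {"numpy": "1.26.4", "pandas": "2.2.2"},
    "legacy_report": {"numpy": "1.21.6", "pandas": "2.2.2"},
}

conflicts = find_conflicts(projects)
print(conflicts)
# {'numpy': {'bridge_analysis': '1.26.4', 'legacy_report': '1.21.6'}}
```

With one virtual environment per project, each project simply gets the version it asked for and the conflict never arises.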
Tools for Managing Python Environments
pip: The most basic and widely used package manager. It installs packages from the Python Package Index (PyPI) but does not handle environment management. This is what I use most of the time, but I’ve had headaches along the way.
virtualenv: A tool to create isolated Python environments. It’s simple to use and works by installing packages locally rather than system-wide.
conda: An open-source package and environment management system. Conda is particularly popular in data science and engineering due to its ability to handle packages outside of the Python ecosystem. This is the first env management tool I used. It’s decent, but I found the Anaconda ecosystem very verbose. Many data scientists use it. For a more streamlined version, Miniconda is an excellent alternative.
pipenv: Combines pip and virtualenv into one tool. It automatically creates a virtual environment for your projects and manages your dependencies.
Poetry: A relatively new tool that manages both dependencies and packaging. It simplifies dependency management and resolves dependencies more efficiently. Lately, I’ve been using Poetry instead of pip and virtualenv. I’ll expand on it further in the future once I have some mileage. It seems pretty cool and has some excellent venv features that simplify things. A few aspects that I like:
Easy to Start: Poetry makes setting up a new Python project simple. It does a lot of the setup work for you.
Keeps Track of Packages: Imagine each Python project needing specific tools. Poetry lists these tools (called packages) in a .toml file and remembers which versions your project needs.
Solves Conflicts: Sometimes, different tools don’t work well together. Poetry helps by figuring out which versions of each tool can work side by side without problems.
Consistency Across Computers: Poetry creates a lock file (poetry.lock) that records the exact versions it resolved, so your project uses the same tools no matter where you work on it - whether it’s your computer or someone else’s.
Publishing Made Easy: If you ever want to share your Python project with others, Poetry makes this process straightforward.
Simple Commands: Poetry uses commands that are easy to understand and remember, making your coding journey smoother.
Flexible with Python Versions: Different projects might need different versions of Python. Poetry can handle this, allowing you to work on various projects with different needs.
Each of these tools has unique strengths and is suitable for different scenarios in engineering. For instance, Conda is excellent for data science projects involving heavy numerical computations, while Poetry or Pipenv might be better suited to standard Python development due to their simplicity and elegant dependency resolution.
For those interested in digging deeper into package management, there’s a great post by Anna-Lena Popkes that covers much more ground than I can.
An unbiased evaluation of environment management and packaging tools
Closing
Python environments are fundamental to a structured, efficient approach to engineering projects. Choosing a global or a virtual environment depends on your project's scale, complexity, and collaboration needs. Managing your Python environment and dependencies keeps your project running smoothly, ensuring your solutions are reliable, reproducible, and scalable.
As you grow in your Python journey, exploring and mastering environment management tools will become a standard part of your workflow, helping you harness Python’s full potential.
I know you want to ignore this because it all sounds so boring, but trust me, as you progress in your Python journey, understanding this ecosystem will pay dividends and reduce the risk of throwing your monitor out the window.
At flocode, we’re all about the compound interest and keeping the monitor in one piece.
Keep going.
See you in the next one.
James 🌊
Hi James,
Thank you for this series, it seems like Poetry is your favorite. How do virtual environments actually solve the problem of "it works on my machine"? When someone opens the Python file, does it automatically download the correct version of all the installed packages or something?
I think your series is aimed at civil and structural engineers, so could you name, in your opinion, the best environment managing program for that purpose?
In the opening section about libraries, packages, and modules you have mixed your metaphors. Are libraries the books on a library, not the library itself (they're called libraries after all)? Also you say a module is a book in the packages description, but you call modules chapters of a book in the modules description. Perhaps you could give a more detailed description on how the three things relate or interact with each other.
Thank you again,
Michael
This article was exactly what I needed! I’ve been struggling to understand Python environments, especially with the whole global vs. virtual environments debate. I always ended up mixing my projects and facing issues with dependencies and versions. After reading this, everything just clicked! The clear breakdown of modules, packages, and libraries was super helpful, and I now feel much more confident in managing my environments properly.
I used to feel overwhelmed, but this guide made it much simpler to grasp. I especially appreciated the advice on using virtual environments to avoid messing up my global setup. It saved me from making the same mistakes I did before.
For anyone facing similar issues, I also found this [install miniconda ubuntu](https://docs.vultr.com/how-to-install-miniconda-on-ubuntu-24-08) guide helpful when setting up my environment. Highly recommend this article – it’s a must-read! Thanks for putting this together!