Python modules and packages
A module is a file that contains definitions intended for reuse in a script or an interactive session.
import modulefor the first time does three things: 1) create a new namespace that acts as the global namespace for all objects defined in module, 2) execute the entire module, 3) create a name – identical to the module name – within the caller namespace that references to the module. This can be used to access module objects in the caller namespace as
from module import symbolimports
symbolinto the current namespace. However, the global namespace for
symbol(if it’s a function) always remains the namespace in which it was defined, not the caller’s namespace.
Regardless of what variant of the import statement is being used to import contents from a module, all of the module’s statements will be initialised the first time (and only the first time) the module name is encountered in an import statement (more details here).
from module import *is generally discouraged is that it directly imports all the module’s objects into the caller’s namespace, which is often said to cluter it up. Especially when importing large modules this makes sense, as it’s much cleaner to keep objects defined in imported modules in eponymous namespaces and accessing them via
module.object, which immediately makes clear where object comes from and can help greatly with debugging.
One implication of all the above is that as a developer, you don’t have to worry about clashing variable names between modules, as they are each stored in their own namespace, and accessed via
moduleB.fooin the caller namespace.
When we import a module
foo, the interpreter first searches for a built-in module and, if none is found, searches a list of directories given the variable
sys.pathcontains the directory of the input script, the variable
PYTHONPATH, and installation-dependent defaults. I can manipulate
sys.pathusing standard list operations; to add a directory, use
A common usecase of the above for me is to make a package available to Jupyter Notebooks. By default, a notebook’s
sys.pathcontains the folder the noteook is located in and a bunch of conda managed directories linked to my running Conda environment. To make available a package that lives in the project root directory, just do
sys.path.append('/Users/fgu/dev/projectname/packagename'). I can then reference modules from the package using
from packagename import module.
dir(modulename)to list all names defined in
dir()to list all names that are currently defined.
Running a module as a script
For relative imports to work as described, for instance, here and in Chapter 8 in Python Essential References and in recipees 10.1 and 10.3 in the Python Cookbook, the file into which you import has itself to be a module rather than a top-level script. If it’s the latter, it’s name will be main and it won’t be considered part of a package, regardless of where on the file system it is saved. Generally, for a file to be considered part of a package, it needs to nave a dot (.) in its name, as in package.submodule.modulename.
To import modules into a main script, one (somewhat unideal) solution is to add the absolute path to the package to
- Packages are collections of modules. They help structure Python’s module
namespace by using dotted module names. E.g.
a.brefers to submodule
a. Thus, just as the use of modules alleviates worries about clashing global variable names between modules, using a package alleviates worries about clashing module between multi-module packages.
Example: creating utility package
If you want to publish to PyPI, choose name that doesn’t exist yet.
Create virtual environment
pyenv virtualenv 3.9 futilsand activate venv
pyenv activate futils.
Create project folder with Poetry for nice default setup
poetry new projectnamebut rename
srcbecause of this blog post.
Install required dependencies
poetry add pandas numpy seabornand development dependencies
poetry add --dev ipykernel.
Can reinstall dependencies using
Publish project to private server:
poetry publish -r reponame. – currently not working, getting 403 forbidden error
Automatic git credentials reading when publishing package
Remove pyenv prompt warning message