Python modules and packages introduction

  1. Preface
    This post will go over the basic details about Python modules and packages along with some “related” topics in this area.
  2. Background
    First things first: let’s go over the basic terminology and characteristics of modules and packages in Python.

    1. Python module : “A module is a file containing Python definitions and statements. The file name is the module name with the suffix .py appended. Within a module, the module’s name (as a string) is available via the value of the global variable __name__. A module can contain executable statements as well as function definitions. These statements are intended to initialize the module. They are executed only the first time the module name is encountered in an import statement (they are also run if the file is executed as a script.)”
    2. Let’s see an example. Below is a very simple module called module_2 (i.e. it is written within a Python file called module_2.py).
    3.  # this is a module's global variable
       global_var_1 = __name__ + " global variable"
       # this is a module's "public" function. Once another module imports this module (file), it can use this function
       def func1(arg1):
           func_name = __name__ + "::func1 - "
           print(func_name + "start")
           print(func_name + "got arg1:" + str(arg1))
       # this is a module's "global" (top-level) statement
       print("this is a \"stand alone\" statement within " + __name__ + " module")
    4. NOTES:
      1. The __name__ variable is a built-in attribute of every module (Python file). It holds the module’s fully qualified (dotted) name within the project, or “__main__” when the file is run directly as a script.
      2. Function func1 is (in this case) the only function defined in this module. Any module that imports this module can call it (as shown later on).
      3. The last line, which starts with print, is the only “global statement” within this module: it is not inside any function or class, but sits at the top level of the module. This statement is discussed further later on.
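To see how __name__ changes with the way a file is loaded, here is a minimal sketch. It assumes a throwaway file named demo_module_2.py (a made-up name, written to a temporary folder) and uses the standard-library runpy module to mimic running the file as a script.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Source of a hypothetical one-line module, modeled on module_2 above.
SOURCE = 'global_var_1 = __name__ + " global variable"\n'


def name_when_imported_and_when_run():
    """Load the same file twice: once via import, once as if it were a script."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        script = Path(tmp_dir, "demo_module_2.py")
        script.write_text(SOURCE)
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()
        try:
            # Regular import: __name__ is the module's name.
            imported = importlib.import_module("demo_module_2")
            # Script-style execution: __name__ is "__main__".
            as_script = runpy.run_path(str(script), run_name="__main__")
        finally:
            sys.path.remove(tmp_dir)
            sys.modules.pop("demo_module_2", None)
    return imported.global_var_1, as_script["global_var_1"]


print(name_when_imported_and_when_run())
# ('demo_module_2 global variable', '__main__ global variable')
```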
  3. Namespaces
    Namespaces are the machinery in Python used to uniquely identify all the objects in a Python program (recall that everything in Python is an object).
    There are a few namespace “types” in Python.

    1. Global namespace
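      The top-level namespace of the script being run, where its imports, functions, and top-level variables live. (Strictly speaking, every module has its own global namespace, and Python’s built-in namespace sits outside all of them.)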
    2. Module namespace
      The namespace that encapsulates one module’s definitions. For example, the math standard-library module has its own namespace in which, among many other things, the pi variable is defined:

      >>> from math import pi
      >>> pi
      3.141592653589793
      >>> math.pi
      NameError: name 'math' is not defined

      NOTE: Once the pi variable is imported from its original namespace (the math module’s namespace) into the global namespace, the name math itself is not bound there. The variable must therefore be referred to as pi (and not math.pi).
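The note above can be checked with a small sketch. The helper names_bound_by is made up for this illustration; it runs an import statement inside a fresh, empty namespace and reports which names the statement created there.

```python
def names_bound_by(statement):
    """Execute an import statement in an empty namespace and list the names it binds."""
    namespace = {}
    exec(statement, namespace)
    # exec() adds "__builtins__" on its own, so leave it out of the report.
    return sorted(name for name in namespace if name != "__builtins__")


print(names_bound_by("from math import pi"))  # ['pi']
print(names_bound_by("import math"))          # ['math']
print(names_bound_by("import math as m"))     # ['m']
```

Only the first form makes pi available directly; the other two bind a name for the module, through which pi is reached as math.pi or m.pi.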

    3. Local namespace
      A narrower namespace, such as that of a function:

      def outer_func():
          c_num = 12
          def inner_func():
              d_num = 13
              print(dir(), ' - names in inner_func')
          e_num = 14
          inner_func()  # run the inner function so that it prints its local names
          print(dir(), ' - names in outer_func')
      outer_func()
      # ['d_num']  - names in inner_func
      # ['c_num', 'e_num', 'inner_func']  - names in outer_func
  4.  Modules and the import command

    1. Now that we know (basically) what a module is, let’s see how to reuse it within a Python program (i.e. in other modules and/or the main module).
      1. Importing a module: In order to use any definition (functions, variables, classes, etc.) defined within a module, in this case module_2, another module, module_1 for instance, needs to import module_2. This is typically (though not necessarily) done by adding an import statement, with the fully qualified name of module_2, at the top of the module_1 file.
      2.  import inspect
         from module_2 import module_2
         def modules_usage_examples():
             func_name = inspect.stack()[0][3] + " - "
             print(func_name + "start")
             arg1 = 17
             print(func_name + "about to call func1 of module_2")
             module_2.func1(arg1)
      3. Because module_1 used the import command to import module_2, it is able to call (“use/invoke”) func1, which is defined in module_2.
        NOTE: The top-level (“global”) statements of module_2, the imported module, are executed ONLY ONCE per program run, upon the first time it is imported, no matter how many other modules import module_2 as well. For example, if module_1 is run directly as a script, all the “global” statements within module_2 run only once.
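The “only once” behaviour comes from Python caching imported modules in sys.modules. Here is a minimal sketch of it, assuming a throwaway module named demo_once (a made-up name) whose only top-level statement is a print:

```python
import importlib
import io
import sys
import tempfile
from contextlib import redirect_stdout
from pathlib import Path


def count_top_level_runs(import_count=3):
    """Import the same module several times and count how often its body ran."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        Path(tmp_dir, "demo_once.py").write_text(
            'print("demo_once top-level statement ran")\n'
        )
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()
        captured = io.StringIO()
        try:
            with redirect_stdout(captured):
                modules = [
                    importlib.import_module("demo_once")
                    for _ in range(import_count)
                ]
        finally:
            sys.path.remove(tmp_dir)
            sys.modules.pop("demo_once", None)
    # Every import after the first returns the cached module object.
    same_object = all(module is modules[0] for module in modules)
    return captured.getvalue().count("ran"), same_object


print(count_top_level_runs())  # (1, True)
```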

    2. The main module
      As described earlier, a module is essentially a normal Python script, and can therefore be run directly by the Python interpreter. The question that now arises is: “what happens if we wish to pass command-line arguments to that script?”

      1. Main module: In case we wish to “talk” to the module directly via the Python interpreter (from a Linux shell, for instance), we can check at run time whether the module is the “main module”. This is easily done by adding the following lines to the (updated) module_1 file:
        if __name__ == "__main__":
            import sys
      2. NOTES:
        1. The last three lines check whether this script was invoked as the main module. If so, its __name__ is set to “__main__”, and these lines, which are “global” statements, are run by Python.
        2. Note that the first command-line argument after the script name (i.e. the second element of sys.argv) is passed to the function as sys.argv[1]. This way the script can be invoked, for instance, with the following command:
          python3 17

          --> then sys.argv[1] is "17" (command-line arguments always arrive as strings)
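A complete main-module guard might look like the sketch below. It is an illustration under assumptions, not the post’s actual module_1: the function body is a stand-in, and the main helper with its usage message is added here so the script does not crash when no argument is given.

```python
import sys


def modules_usage_examples(arg1):
    """Stand-in for the function in module_1; it only echoes its argument."""
    return "got arg1:" + str(arg1)


def main(argv):
    """argv[0] is the script name, argv[1] the first real argument."""
    if len(argv) < 2:
        return "usage: python3 <script> <arg1>"
    return modules_usage_examples(argv[1])


# Runs only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    print(main(sys.argv))
```

Keeping the logic in main(argv) rather than directly under the guard makes it possible to call it from other modules, and to test it, without touching the real command line.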

    3. Python built-in modules
      Python comes with a set of “built-in” modules (its standard library), similar, for instance, to the C/C++ standard library. One of the most fundamental is the sys module which, as its (short) name states, provides access to variables and functions that are closely tied to the interpreter and its runtime environment. Within the sys module, one variable particularly worth mentioning is sys.path.

      1. The variable sys.path is a list of strings that determines the interpreter’s search path for modules. It is initialized from the directory containing the script being run (or the current directory in an interactive session), from the PYTHONPATH environment variable if it is set, and from an installation-dependent default.
      2. When running the Python interpreter, one can print the content of the sys.path as follows:
        (robotPidPrinter_env) guya@guy-vm:~/dev/robotpidprinter$ python3
        Python 3.5.2 (default, Oct 8 2019, 13:06:37)
        [GCC 5.4.0 20160609] on linux
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import sys
        >>> print(sys.path)
        ['', '/home/guya/dev/robotpoc', '/usr/local/lib/python3.5/dist-packages/robotframework-3.1.2-py3.5.egg', '/usr/lib/', '/usr/lib/python3.5', '/usr/lib/python3.5/plat-x86_64-linux-gnu', '/usr/lib/python3.5/lib-dynload', '/home/guya/.local/lib/python3.5/site-packages', '/usr/local/lib/python3.5/dist-packages', '/usr/lib/python3/dist-packages']
        1. As you can see, the list of strings that sys.path holds points to the folders where Python (in this case Python 3) looks for modules: the standard library and installed third-party packages, under /usr/lib, /usr/local/lib and /home/guya/.local/lib. The leading empty string stands for the current working directory.
        2. It is worth mentioning that sys.path is an ordinary list and, like any variable in Python, can be changed at run time if needed.
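Here is a minimal sketch of changing sys.path at run time. The module name demo_extra_module and its ANSWER variable are made up for the example. The import fails while the folder is not on the search path and succeeds once it has been appended.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def try_import(name):
    """Return the imported module, or None if it cannot be found."""
    try:
        return importlib.import_module(name)
    except ModuleNotFoundError:
        return None


def demo_sys_path():
    with tempfile.TemporaryDirectory() as extra_dir:
        Path(extra_dir, "demo_extra_module.py").write_text("ANSWER = 42\n")
        importlib.invalidate_caches()
        # The folder is not on sys.path yet, so the module is not found.
        before = try_import("demo_extra_module")
        sys.path.append(extra_dir)
        try:
            after = try_import("demo_extra_module")
        finally:
            sys.path.remove(extra_dir)
            sys.modules.pop("demo_extra_module", None)
    return before is None, after.ANSWER


print(demo_sys_path())  # (True, 42)
```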
    4. Packages
      “Packages are a way of structuring Python’s module namespace by using “dotted module names”. For example, the module name A.B designates a sub module named B in a package named A.” So technically speaking, a package is a collection of one or more Python modules.

      1. __init__.py file:
        In order to let Python know that a folder of Python files is a package and not “just a bunch of Python files”, there is a special file, called __init__.py, that tells Python that all the files within the folder containing it should be treated as modules of a package. It is also the place to put any general package-wide initialization, if needed. Often, though, this file is simply empty. Let’s look at the below project folder structure:

      2. NOTES:
        1. The “package root” folder contains an __init__.py file.
        2. The module_1 & module_2 folders each contain an __init__.py file.
          --> Now Python will know to look for the module_1 & module_2 modules when it encounters an import statement with their name.
        3. There is much MORE to say about packages, modules, and their relationships, but for now this will do.
      3. Python 3.3+
        An important note is that starting with Python 3.3 there is, technically, no need to explicitly create the __init__.py file, though it is still possible. Note, however, that if you deploy the project with a setup.py file (which is a very good practice and is a topic for another tutorial) that calls the find_packages() function, then every package folder MUST contain an __init__.py file in order for find_packages() to locate it. See this Q&A on Stackoverflow. So eventually you will probably have the file in each module folder.
      4. Having an __init__.py file also allows you to syntactically treat a directory as if it were a Python module. See the “adding an import to __init__.py” section in this nice tutorial.
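The sketch below builds a tiny package in a temporary folder and imports it, to show the dotted names and the role of __init__.py. The layout and all names in it (demo_pkg, sub_b, tools, greet) are made up for this illustration and are not the project structure discussed above.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical layout:
#   demo_pkg/__init__.py        re-exports greet, so demo_pkg.greet() works
#   demo_pkg/sub_b/__init__.py  empty, only marks the folder as a package
#   demo_pkg/sub_b/tools.py     defines greet
FILES = {
    "demo_pkg/__init__.py": "from demo_pkg.sub_b.tools import greet\n",
    "demo_pkg/sub_b/__init__.py": "",
    "demo_pkg/sub_b/tools.py": "def greet():\n    return 'hello from ' + __name__\n",
}


def build_and_import():
    """Create the package on disk, import it, and report what was found."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        for relative_path, source in FILES.items():
            target = Path(tmp_dir, relative_path)
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(source)
        sys.path.insert(0, tmp_dir)
        importlib.invalidate_caches()
        try:
            package = importlib.import_module("demo_pkg")
            tools = importlib.import_module("demo_pkg.sub_b.tools")
            # greet is reachable on the package itself thanks to __init__.py.
            return tools.__name__, package.greet()
        finally:
            sys.path.remove(tmp_dir)
            for name in [n for n in sys.modules if n.split(".")[0] == "demo_pkg"]:
                del sys.modules[name]


print(build_and_import())
# ('demo_pkg.sub_b.tools', 'hello from demo_pkg.sub_b.tools')
```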
    5. Excluding sub-package
      In case the project has some files that need not be part of the package distribution (for example, unit test files), yet must remain part of the project, it is possible to exclude them. One approach (that I found easy and straightforward to implement) is to utilize the find_packages() function that is called within the setup() function (in the setup.py file), relying on the fact that setuptools can exclude packages (directories containing __init__.py files).

      1. Create a package_to_exclude folder. In my case, I named it tests (as it will also contain the unit tests for this project, which are essential, but not relevant for distribution). All the files that you wish to exclude should be placed within this folder. Now what is left to do is tell the setup function to “ignore” (exclude) the tests package, by adding this line:
         packages=find_packages(exclude=["tests"])
         That is it.
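To see what the exclude argument does without building a real distribution, here is a small sketch that calls find_packages() on a throwaway project tree. It assumes setuptools is installed; the package names demo_app and demo_app.core are made up for the example.

```python
import tempfile
from pathlib import Path

from setuptools import find_packages  # third-party; assumed to be installed


def discover_packages(exclude=()):
    """Build a throwaway project tree and return the packages found in it."""
    with tempfile.TemporaryDirectory() as project_dir:
        # Hypothetical layout: one package, one sub-package, one tests package.
        for package_dir in ("demo_app", "demo_app/core", "tests"):
            folder = Path(project_dir, package_dir)
            folder.mkdir(parents=True)
            (folder / "__init__.py").write_text("")
        return sorted(find_packages(where=project_dir, exclude=list(exclude)))


print(discover_packages())                   # ['demo_app', 'demo_app.core', 'tests']
print(discover_packages(exclude=["tests"]))  # ['demo_app', 'demo_app.core']
```

Note that exclude=["tests"] matches only the package named exactly tests. If the tests folder had sub-packages of its own, a wildcard pattern such as "tests.*" would be needed as well.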
    6. Resources:
      1. Ignore single Python file (module) or an entire package in the
      2. namespaces in Python
      3. Advanced tips and techniques to import in Python

        The picture:
        Mountains near the city of Mendoza, Argentina.
