Monday, October 27, 2014

Getting version info for Python modules (for use in a super-installer)

The application that I am developing uses a lot of different 3rd-party Python modules. The installer (a Windows program) for this application installs these Python modules if they don't already exist on the target system. So the first thing the installer does is find out which version (if any) of each of these Python modules is currently installed. Below I refer to the application installer as a "super-installer" since it invokes the installers for these 3rd-party modules.

The semi-standard way for a Python module to provide version info is to have an attribute '__version__'.  For example:
>>> import numpy
>>> numpy.__version__
'1.8.2'
But some modules have an attribute 'version'.  For example:
>>> import tornado
>>> tornado.version
'3.1.1'
And some other modules have 'version' as a sub-attribute of 'release'. For example:
>>> import pyreadline
>>> pyreadline.release.version
'1.7.1'
Unfortunately, a few 3rd-party modules don't provide any runtime-queryable version info at all. For those, my super-installer writes the version info into a file in the module's folder after the module's installer has finished. The file name used for this was chosen to be a non-standard one (including the name of the company) so that it won't collide with any version mechanism introduced in future Python or module revisions. The code for reading the version of a Python module looks like this:
import importlib
import os

def getPythonModuleVersion(moduleName):
    """
    Return a string with the version info for the specified
    Python module.
    Return None if the module does not exist (import fails).
    Return 'UnknownVersion' if no version info is available.
    """
    try:
        moduleObj = importlib.import_module(moduleName)
        if moduleObj is None:
            return None
    except Exception:
        return None

    versionStr = getattr(moduleObj, '__version__', None)
    if versionStr is not None:
        return versionStr

    # '.version' works for 'tornado'
    versionStr = getattr(moduleObj, 'version', None)
    if versionStr is not None:
        return versionStr

    # '.release.version' works for 'pyreadline'
    releaseObj = getattr(moduleObj, 'release', None)
    if releaseObj is not None:
        versionStr = getattr(releaseObj, 'version', None)
        if versionStr is not None:
            return versionStr

    # Final try: see if there is a version file in the folder
    # of the module (this is not a standard mechanism but our
    # installer creates such a file if there is no other
    # version info.)
    versionFilePath = getVersionFilePath(moduleObj)
    if (versionFilePath is not None
        and os.path.isfile(versionFilePath)):
        contents = readContentsOfFile(versionFilePath)
        versionStr = contents.strip()
        if versionStr != '':
            return versionStr

    return 'UnknownVersion'
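The three attribute conventions can also be handled with one loop over a list of attribute paths. This is only a minimal sketch of that idea, not the code the super-installer uses: the names `VERSION_ATTR_PATHS` and `findVersionAttr` are made up for illustration, and the stand-in modules exist only so the example runs without tornado or pyreadline installed. It assumes Python 3, where version strings are `str`.

```python
import types

# Attribute paths tried in order; each matches one convention shown above.
VERSION_ATTR_PATHS = ('__version__', 'version', 'release.version')

def findVersionAttr(moduleObj, attrPaths=VERSION_ATTR_PATHS):
    """Return the first string found at one of attrPaths, else None."""
    for attrPath in attrPaths:
        obj = moduleObj
        for name in attrPath.split('.'):
            obj = getattr(obj, name, None)
            if obj is None:
                break
        # Accept only strings: 'version' may also be a submodule or a tuple.
        if isinstance(obj, str):
            return obj
    return None

# Stand-in module objects (hypothetical, for demonstration only).
fakeTornado = types.ModuleType('fakeTornado')
fakeTornado.version = '3.1.1'

fakeReadline = types.ModuleType('fakeReadline')
fakeReadline.release = types.ModuleType('fakeReadline.release')
fakeReadline.release.version = '1.7.1'

print(findVersionAttr(fakeTornado))    # 3.1.1
print(findVersionAttr(fakeReadline))   # 1.7.1
```

Unlike the function above, this variant skips a 'version' attribute that is not a string, which avoids returning a submodule object as the version.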
The function 'getVersionFilePath' looks like this:
def getVersionFilePath(moduleObj):
    """
    Return the path to the version file
    (in the module's folder).
    Note: this file does not necessarily exist.
    Return None if can't get the module's folder.
    """
    moduleFilePath = getattr(moduleObj, '__file__', None)
    if moduleFilePath is None:
        return None
    moduleFolderPath = os.path.dirname(moduleFilePath)
    # VersionFileName is a name chosen to not collide
    versionFilePath = os.path.join(moduleFolderPath,
                                   VersionFileName)
    return versionFilePath
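The helpers 'readContentsOfFile' and 'writeStringToFile' and the constant 'VersionFileName' are used in this post but not listed. A minimal sketch of what they might look like follows. The file name is a placeholder assumption (the real one includes the company name), and the function bodies are guesses based only on how the helpers are called.

```python
import os
import shutil
import tempfile

# Placeholder name (assumption); the real file name is not given in the post.
VersionFileName = 'ExampleCompany_module_version.txt'

def readContentsOfFile(filePath):
    """Return the entire contents of a text file as a string."""
    with open(filePath, 'r') as fileObj:
        return fileObj.read()

def writeStringToFile(filePath, text):
    """Create or overwrite a text file with the given string."""
    with open(filePath, 'w') as fileObj:
        fileObj.write(text)

# Round trip in a temporary folder standing in for a module's folder.
tmpDir = tempfile.mkdtemp()
try:
    versionFilePath = os.path.join(tmpDir, VersionFileName)
    writeStringToFile(versionFilePath, '1.8.2\n')
    print(readContentsOfFile(versionFilePath).strip())   # 1.8.2
finally:
    shutil.rmtree(tmpDir)
```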
The Python script (run from the super-installer) that checks if a particular Python module is installed and gets the version of the currently installed module looks like this:
import sys

if __name__ == '__main__':
    if len(sys.argv) < 2:
        sys.exit("Must supply the name of a Python module")
    moduleName = sys.argv[1]
    versionStrToAddIfNone = None
    if len(sys.argv) > 2:
        versionStrToAddIfNone = sys.argv[2]

    versionStr = getPythonModuleVersion(moduleName)

    # If 'versionStrToAddIfNone' is not None,
    # handle the case where module supplies no version info:
    if (versionStr == 'UnknownVersion'
        and versionStrToAddIfNone is not None):
        status = addVersionFile(moduleName,
                                versionStrToAddIfNone)
        if status:
            versionStr = versionStrToAddIfNone

    # print() with a single argument works in both Python 2 and Python 3
    if versionStr is not None:
        print(versionStr)
    else:
        print("NoSuchModule")
The super-installer runs the above script with just one command-line argument (the module name) to find out the currently installed version of each 3rd-party Python module. If the module is missing, or the installed version is older than what the application needs, the installer for the new version of the Python module is run, and then the above script is run again with a second command-line argument specifying the version of the module that was just installed. If the module doesn't supply any version info, a version file is created via the function 'addVersionFile', which looks like this:
def addVersionFile(moduleName, versionStr):
    """
    Create a version file in the folder of the module.
    Return True iff successful.
    """
    try:
        moduleObj = importlib.import_module(moduleName)
        if moduleObj is None:
            return False
    except Exception:
        return False
    versionFilePath = getVersionFilePath(moduleObj)
    if versionFilePath is not None:
        writeStringToFile(versionFilePath, versionStr)
        return True
    return False
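The super-installer has to decide whether the version string printed by the script is older than the version the application needs. That comparison is not shown in this post (and the super-installer itself is a Windows program, not Python), so the following is only a sketch of the logic under the assumption that versions are dotted numbers such as '1.8.2'. The names `parseVersion` and `isOlderThan` are hypothetical. Comparing the raw strings would not work: as strings, '1.10.0' sorts before '1.8.2'.

```python
import re

def parseVersion(versionStr):
    """
    Turn '1.10.0' into (1, 10, 0).
    Uses the leading digits of each dot-separated part and stops
    at the first part that has none (so 'UnknownVersion' gives ()).
    """
    parts = []
    for part in versionStr.strip().split('.'):
        match = re.match(r'\d+', part)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

def isOlderThan(installedStr, requiredStr):
    """Return True if installedStr is an older version than requiredStr."""
    installed = parseVersion(installedStr)
    required = parseVersion(requiredStr)
    # Pad with zeros so that '1.8' and '1.8.0' compare as equal.
    width = max(len(installed), len(required))
    installed += (0,) * (width - len(installed))
    required += (0,) * (width - len(required))
    return installed < required

print(isOlderThan('1.8.2', '1.10.0'))   # True
print(isOlderThan('1.8', '1.8.0'))      # False
```

Pre-release suffixes such as 'rc1' are ignored by this sketch, and 'UnknownVersion' is treated like version 0.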
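By the way, using 'pkg_resources' to obtain version info is problematic if a Python module was installed with a binary Windows installer. 'pkg_resources' reports the version recorded in the installed distribution metadata, and that can disagree with the version the module itself reports. For example, look at this: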
>>> import pkg_resources
>>> pkg_resources.get_distribution('numpy').version
'1.7.1'
>>> import numpy
>>> numpy.__version__
'1.8.2'
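A mismatch like this can be detected by asking both sources and comparing the answers. The sketch below is an illustration only. It assumes Python 3.8 or later and uses the standard library's `importlib.metadata` instead of 'pkg_resources'; both read the installed distribution metadata. The function names are made up for this example.

```python
import importlib
from importlib import metadata  # standard library, Python 3.8+

def versionSources(distName, moduleName=None):
    """
    Return (metadataVersion, attributeVersion) for a package.
    Either element is None if that source is unavailable.
    """
    if moduleName is None:
        moduleName = distName
    try:
        metadataVersion = metadata.version(distName)
    except metadata.PackageNotFoundError:
        metadataVersion = None
    try:
        moduleObj = importlib.import_module(moduleName)
        attributeVersion = getattr(moduleObj, '__version__', None)
    except Exception:
        attributeVersion = None
    return metadataVersion, attributeVersion

def versionsDisagree(distName, moduleName=None):
    """Return True only if both sources report a version and they differ."""
    metadataVersion, attributeVersion = versionSources(distName, moduleName)
    return (metadataVersion is not None
            and attributeVersion is not None
            and metadataVersion != attributeVersion)

# Prints a pair of version strings, or None for a source that is missing.
print(versionSources('numpy'))
```

For the numpy installation shown above, the two values would be expected to differ, so `versionsDisagree('numpy')` would return True there.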