Saturday 7 November 2015

Python Weekly # 4 : Scoping - arguments and variables.

Python Scoping

Arguments, variables and other things

In this article I want to explore some of the rules which govern Python scoping. Scoping is how Python decides when and where a name is valid and accessible. In Python, names are not the same as objects or data, so even if a name is no longer valid, the object may well still exist (it depends on how many other references there are to the object: in simple terms, how many other names are bound to that object, or how many times the object appears in a dictionary or list).

When a Python program executes, a scope is created for each module that is imported and each class that is defined (although there is a twist to class scopes, explained below). A scope is also created each time a function or method runs. You can think of a scope as a dictionary which maps a name to the object it is bound to. As the program runs, Python keeps track of the various scopes which are created, in a nested fashion, and at any point there are one or more scopes in existence.
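The "scope as a dictionary" picture can be seen directly with the builtin globals() and locals() functions. This is only a rough sketch: the names greeting, show_scope and local_value are made up for the illustration, and the print calls are written in the function form so the snippet runs on both Python 2.7 and Python 3.

```python
# Module scope: names bound at the top level of a module live in the
# dictionary returned by globals().
greeting = "hello"


def show_scope(argument):
    # Function scope: created afresh each time the function runs.
    local_value = argument * 2
    # locals() gives a snapshot of the names bound in this scope.
    return dict(locals())


function_scope = show_scope(21)
print(sorted(function_scope))       # ['argument', 'local_value']
print(globals()["greeting"])        # hello
print("local_value" in globals())   # False
```

Once show_scope returns, its scope is gone; only the snapshot that was copied out of it survives.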

The scoping rules are actually relatively simple : 
  • When a name is bound to an object, it is created in the innermost scope by default. If a function is being executed, then the name will be created in that function's scope, unless the name has been listed in a global statement.
  • When a name is referenced (i.e. not created), it is searched for in the innermost scope first, and then outwards towards the module and builtin scopes.
A name is bound when :
  • It is the name of a module which is imported
  • It is the name of a class which is being defined
  • It is the name of a function which is being defined
  • It is the name of an argument to a function which is being executed.
  • It is a name which appears on the left hand side of an assignment
  • It is a name which appears as the loop variable in a for loop 
  • It is a name which appears as the target in an except clause.
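Each of the binding forms listed above appears once in the sketch below. All of the names (Point, describe, total, item, error, message) are invented for the example. One caveat: Python 3 unbinds the except target at the end of the except clause, so the error text is copied into another name first.

```python
import os                       # import binds the name `os`


class Point(object):            # a class definition binds `Point`
    pass


def describe(thing):            # def binds `describe`; `thing` is bound per call
    return "got %r" % (thing,)


total = 0                       # assignment binds `total`
for item in [1, 2, 3]:          # the loop variable `item` is bound on each pass
    total = total + item

try:
    int("not a number")
except ValueError as error:     # the except clause binds `error`
    message = str(error)        # keep the text; Python 3 unbinds `error` afterwards

print(total)                    # 6
print(item)                     # 3
print(describe(total))          # got 6
```

Note that item is still bound after the loop finishes: a for loop does not create a scope of its own.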
Some examples would be helpful here :
  • If a function is defined in a module : If a name bound in the function matches a name bound in the module, then the function's own version is the one used inside the function (unless the global statement is used for that name). That includes any arguments which are defined for that function.
  • If a function is defined in another function : The inner function can refer to any names defined in the outer function (or of course the module/builtin scope), but the outer function can't refer to names in the inner function. If the inner function rebinds a name used in the outer function, that doesn't change the binding made in the outer function :
     
    
    >>> def outer():
    ...     a = 1
    ...     def inner():
    ...         a = 2  # binds a new name local to inner()
    ...         print "Inner", a
    ...     print "Outer a", a
    ...     inner()
    ...     print "Outer b", a
    ...
    >>> outer()
    Outer a 1
    Inner 2
    Outer b 1
    
     
    
  • If a function is defined in a class (i.e. a method) : The method cannot access anything in the class scope without qualifying the name through either the instance or the class, i.e. using self.name or self.__class__.name (or something similar). This is the class scope twist that was mentioned above. The decision to require a qualified name (rather than just the bare name) means there is a single way for your code to refer to or rebind names in the class scope (contrast that with the nested function example above, where in Python 2 the inner function has no way to rebind the name defined in the outer scope; Python 3 adds the nonlocal statement for that purpose).
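The class scope twist can be sketched as follows. Counter, step and the three method names are hypothetical, chosen only to show a bare name failing and a qualified name succeeding.

```python
class Counter(object):
    step = 5                        # bound in the class scope

    def unqualified(self):
        # The class scope is skipped when a name is looked up from
        # inside a method, so the bare name `step` is not found.
        try:
            return step
        except NameError:
            return "NameError"

    def qualified(self):
        # Qualifying the name through the instance reaches the class attribute.
        return self.step

    def rebind(self, value):
        # Rebinding through the class changes the value seen by every instance.
        self.__class__.step = value


c = Counter()
print(c.unqualified())              # NameError
print(c.qualified())                # 5
c.rebind(7)
print(Counter().qualified())        # 7
```

Writing self.step = value in rebind would instead bind a new attribute on that one instance and leave the class attribute untouched.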
These scoping rules are fairly sensible (the principle of "least surprise" is common in Python, meaning code should do what the developer expects it to do) and don't hold many surprises for people already used to other programming languages. However, unlike in some other languages, there are no compiler or interpreter warnings if you define a name which hides or masks one of the builtin names. In theory that means you could redefine one of the very critical functions, such as open (although this is not recommended unless you really know what you are doing, as you can easily break things).
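As a harmless illustration of masking a builtin, the sketch below hides len inside a single function (shadowed_len is a made-up name). Python accepts it silently, and the builtin remains visible everywhere else.

```python
def shadowed_len():
    # Binding `len` in the function scope hides the builtin for the
    # rest of this function; no warning is raised.
    len = lambda sequence: "not the real len"
    return len([1, 2, 3])


print(shadowed_len())     # not the real len
print(len([1, 2, 3]))     # 3
```

The same masking at module level would affect every function in the module that uses the name, which is where the real trouble tends to start.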

Beware !!!

In Python 2.7 any name defined in a list comprehension is treated exactly as if the name had been defined in a for loop, or similar.
 
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
>>> l = [a for a in range(10)]
>>> a
9
 
This is one example of where the "least surprise" principle isn't actually maintained, especially when a similar generator expression does not do the same - in this case trying to access the generator loop variable does not succeed:
 
Python 2.7.6 (default, Jun 22 2015, 17:58:13)
>>> l = list(a for a in range(10))
>>> a
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'a' is not defined
 
This surprise is due to how Python 2.7 implements the list comprehension in the first example: the comprehension runs in the enclosing scope, so its loop variable is bound there. Python 3 resolves this by giving list comprehensions their own scope, as generator expressions already have. It is definitely not recommended that you write any code that relies on this implementation detail in Python 2.7, as it is simply not clear what the expected value should be, and your code will break when translated from Python 2 to Python 3.
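One way to check the behaviour on whichever interpreter you have is sketched below. The two function names are invented for the example. On Python 2.7 the first function should report a leak and on Python 3 it should not; the generator expression should not leak on either.

```python
import sys


def comprehension_leaks():
    # Build a list with a comprehension, then check whether its loop
    # variable is still bound in the enclosing function scope.
    squares = [n * n for n in range(4)]
    return "n" in locals()


def generator_leaks():
    # The same check for a generator expression.
    squares = list(n * n for n in range(4))
    return "n" in locals()


print("Python major version: %d" % sys.version_info[0])
print("list comprehension leaks: %s" % comprehension_leaks())
print("generator expression leaks: %s" % generator_leaks())
```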
