ECMA-262-5 in detail. Chapter 3.1. Lexical environments: Common Theory.

In this chapter we’ll talk in detail about lexical environments, a mechanism used in some languages to manage static scoping. To understand this concept completely, we’ll also briefly discuss the alternative, dynamic scoping (which isn’t used directly in ECMAScript). We’ll see how environments help to manage lexically nested structures of code and to provide complete support for closures.

Lexical environments were introduced in the ECMA-262-5 specification, though the concept itself is a theoretical one, independent of ECMAScript, and is used in many languages.

Actually, we have already discussed the technical part of this topic (though in alternative terminology) in the ES3 series, when we were talking about variable and activation objects and about the scope chain.

Strictly speaking, lexical environments in this case are just a more theoretically sound and more abstract replacement for the previous ES3 concepts. Since the ES5 era, I recommend using exactly these new definitions in discussions and explanations of ECMAScript. More generic concepts, such as an activation record (which is an activation object in ES3) of a call-stack (which is the execution context stack in ES), may of course also be used, but in discussions at lower abstraction levels.

This chapter is devoted to the common theory of environments and also touches on several aspects of programming language theory (PLT). We’ll consider the topic from several viewpoints and implementations in different languages, to see why lexical environments are needed and how these structures are formed. In fact, if we completely understand the common scope theory, the question of understanding environments and scope in ES specifically will disappear by itself, and the topic will become clear.

As we said, all these concepts used in ES (i.e. an activation object, a scope chain, a lexical environment, etc.) are related to the generic concept of a scope. The ES definitions mentioned are just techniques and local terminology for implementing scope. In order to understand these techniques, let’s recall the theoretical concept of scope itself and its types.

Typically, scope is used to manage the visibility and accessibility of variables from different parts of a program.

Several encapsulating abstractions related to scope (such as namespaces, modules, etc.) were invented to provide better modularity of a system and to avoid variable naming conflicts. In the same respect, there are local variables of functions and local variables of blocks. Such techniques help to increase abstraction and to encapsulate internal data (without bothering a user of the abstraction with implementation details and exact internal variable names).

The concept of a scope lets us use variables with the same name, but possibly with different meanings and values, in one program. From this viewpoint:

Scope is an enclosing context in which a variable is associated with a value.

We may also say that these are the logical boundaries in which a variable (or even an expression) has its meaning. For example, a global variable, a local variable, etc., which generally reflects a logical range of a variable’s lifetime (or extent).

Block and function concepts lead us to one of the major scope properties: the ability to nest other scopes or to be nested. As we’ll see, not all implementations allow nested functions, just as not all implementations (and in particular ES5, the version of ECMAScript discussed here) provide block-level scope.

Consider the following C example:

#include <stdio.h>
#include <stdbool.h>

// global "x"
int x = 10;

void foo(void) {

  // local "x" of "foo" function
  int x = 20;

  if (true) {
    // local "x" of if-block
    int x = 30;
    printf("%d\n", x); // 30
  }

  printf("%d\n", x); // 20

}

int main(void) {
  foo();

  printf("%d\n", x); // 10
  return 0;
}

It can be represented by the following figure:

Figure 1. Nested scopes.


ECMAScript 5 does not support block-level scope; a variable declared with var inside a block belongs to the enclosing function or global scope:

var x = 10;

if (true) {
  var x = 20;
  console.log(x); // 20
}

console.log(x); // 20

However, some implementations (e.g. SpiderMonkey since version 1.7) support an alternative let instruction which provides block-level scope. This instruction was later standardized in ES6 (aka Harmony):

let x = 10;

if (true) {
  let x = 20;
  console.log(x); // 20
}

console.log(x); // 10

The let instruction, though, can be seen as syntactic sugar for a construction which can already be implemented in ES3. The example above roughly desugars to the following code:

var x = 10;

if (true) {
  (function (x) {
    console.log(x); // 20
  })(20);
}

console.log(x); // 10

Another major property of a scope is the method of variable resolution. Indeed, since it is very likely that several programmers in one project use the same variable name (for example, i for a loop counter), we should know how to get the correct value of an identifier when several variables share its name. There are two conceptual ways, and as a consequence two types of scope: static and dynamic. Let’s clarify them.

In static scoping, an identifier refers to its nearest lexical environment. The word “lexical” in this case relates to a property of the program text: the place in the source text where a variable appears (that is, the exact place in the code) determines the scope in which it will be resolved later, at runtime, when the variable is referenced. The word “environment” again implies something that lexically surrounds the definition.

The word “static” relates to the ability to determine the scope of an identifier during the parsing stage of a program. That is, if we can say, just by looking at the code before the program starts, in which scope a variable will be resolved, we are dealing with static scope.

Let’s see an example:

var x = 10;
var y = 20;

function foo() {
  console.log(x, y);
}

foo(); // 10, 20

function bar() {
  var y = 30;
  console.log(x, y); // 10, 30
  foo(); // 10, 20
}

bar();

In the example, the variable x is lexically defined in the global scope, which means that at runtime it is also resolved in the global scope, i.e. to the value 10.

In the case of the y name we have two definitions. But as we said, the nearest lexical scope containing the variable is always considered. A function’s own scope has the highest priority and is considered first. Therefore, in the case of the bar function, the y variable is resolved as 30. The local variable y of the bar function is said to shadow the variable y of the global scope with the same name.

However, the same name y is resolved as 20 in the case of the foo function, even though it is called inside the bar function, which contains another y. I.e. resolution of an identifier is independent of the environment of a caller (in this case bar is the caller of foo, and foo is the callee). And again, this is because at the moment of the foo function’s definition, the nearest lexical context with the y name was the global context.

Today static scope is used in many languages: C, Java, ECMAScript, Python, Ruby, Lua, etc.

We’ll mention the mechanisms of lexical scope implementation a bit later (after all, it’s used in ECMAScript), and will also talk about subtle cases of using it with first-class functions. For now, let’s briefly note the alternative, dynamic scope, in order to see the difference and to understand why dynamic scope cannot be used to support closures. As we’ll see, ECMAScript also has some features of dynamic scope.

In contrast with static scope, dynamic scope assumes that we cannot determine at the parsing stage to which value (in which environment) a variable will be resolved. Usually it means that the variable is resolved not in the lexical environment, but rather in a dynamically formed global stack of variables. Every variable declaration that is encountered just pushes the variable onto the stack. After the scope (lifetime) of the variable is finished, the variable is removed (popped) from the stack.

That means that for a single function the same variable name may be resolved in an unlimited number of ways, depending on the context from which the function is called.

For example, consider a case similar to the one above, but using dynamic scope. We use Pascal-like pseudo-code syntax:

// *pseudo* code - with dynamic scope

y = 20;

procedure foo()
  print(y)
end


// on the stack of the "y" name
// currently only one value 20
// {y: [20]}

foo() // 20, OK

procedure bar()

  // and now on the stack there
  // are two "y" values: {y: [20, 30]};
  // the first found (from the top) is taken

  y = 30

  // therefore:
  foo() // 30!, not 20

end

bar()

We see that the environment of a caller affects variable resolution. Since a function (a callee) may be called from many different locations and with different states, it’s hard to determine the exact environment statically, at the parsing stage. That’s why this type of scope is called dynamic.

That is, a dynamically-scoped variable is resolved in the environment of execution, rather than the environment of definition as we have in the static scope.
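The global stack of bindings described above can be emulated in plain JavaScript. The following is only a minimal sketch under our own assumptions: the helpers declare, release and resolve are hypothetical names introduced for illustration, and a real dynamically scoped language would push and pop the bindings automatically.

```javascript
// Emulation of dynamic scope: one global stack of values per name,
// e.g. {y: [20, 30]}. All helper names here are hypothetical.
var bindings = {};

function declare(name, value) {
  (bindings[name] || (bindings[name] = [])).push(value);
}

function release(name) {
  bindings[name].pop();
}

function resolve(name) {
  var stack = bindings[name];
  if (!stack || stack.length === 0) {
    throw new ReferenceError(name + " is not defined");
  }
  return stack[stack.length - 1]; // the most recent binding wins
}

declare("y", 20);

function foo() {
  return resolve("y");
}

function bar() {
  declare("y", 30);   // pushed for the duration of this call
  var result = foo(); // "foo" sees the caller's "y"
  release("y");       // popped when the scope of "y" ends
  return result;
}

var direct = foo(); // 20
var viaBar = bar(); // 30
var after = foo();  // 20 again

console.log(direct, viaBar, after); // 20 30 20
```

The same foo function returns different values depending only on who called it, which is exactly what makes the result impossible to predict from the source text alone.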

One of the benefits of dynamic scope is the ability to apply the same code to different (mutable over time) states of a system. However, such a benefit requires keeping in mind all possible cases in which the function may be executed.

Obviously, with the dynamic scope it’s not possible to create a closure for a variable.

Today, most modern languages do not use dynamic scope. However, in some languages, notably Perl (and some variations of Lisp), a programmer may choose how to define a variable: with static or dynamic scope.

Take a look at the following Perl example. The keyword my there captures a variable lexically, whereas the keyword local makes the variable dynamically scoped:

# Perl example of static and dynamic scopes

$a = 0;

sub foo {
  return $a;
}

sub staticScope {
  my $a = 1; # lexical (static)
  return foo();
}

print staticScope(); # 0 (from the saved global frame)

$b = 0;

sub bar {
  return $b;
}

sub dynamicScope {
  local $b = 1; # dynamic
  return bar();
}

print dynamicScope(); # 1 (from the caller's frame)

As we said, ECMAScript does not use dynamic scope either (in the form described above). However, a pair of ES instructions are sometimes considered as bringing dynamics to static scope. Therefore, modifications made by such instructions can also be related to dynamic scope. But notice again: not in respect of the global variables stack, as in the standard dynamic scope definition, but in respect of the impossibility of determining at the parsing stage in which environment a variable will be resolved. These instructions are with and eval. The effect they have on ECMAScript’s static scope can be called “runtime scope augmentation”.

Consider the following example:

var x = 10;

var o = {x: 30};
var storage = {};

(function foo(flag) {

  if (flag == 2) {
    eval("var x = 20;");
  }

  if (flag == 3) {
    storage = o;
  }

  with (storage) {

    // "x" may be resolved either
    // in the global scope - 10, or
    // in the local scope of a function - 20
    // (created via "eval" function), or even
    // in the "storage" object - 30

    alert(x); // ? - scope of "x" is undetermined at compile time

  }

  // organize recursion on 3 calls

  if (flag < 3) {
    foo(++flag);
  }

})(1);

As we’ll see shortly, static lexical scoping increases efficiency, whereas with and eval may decrease the performance of lexical environment storage and variable lookup at the implementation level. Therefore, the with statement was completely removed from ES5 strict mode, and the eval function in strict mode may not create variables in the calling context. That is, strict mode provides a completely lexical scope in ES.
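The effect of strict mode on eval and with can be observed directly. This is a small sketch: the variable name createdByEval is ours, and the functions are built with the Function constructor only so that each body has its own strictness, regardless of the code surrounding the snippet.

```javascript
// Non-strict body: "eval" creates the variable in the calling function.
var sloppyEval = new Function(
  'eval("var createdByEval = 20;");' +
  'return typeof createdByEval;'
);

// Strict body: the variable stays inside the eval code itself.
var strictEval = new Function(
  '"use strict";' +
  'eval("var createdByEval = 20;");' +
  'return typeof createdByEval;'
);

var sloppyResult = sloppyEval(); // "number"
var strictResult = strictEval(); // "undefined"

// In strict code the "with" statement is rejected at parsing stage.
var withRejected = false;
try {
  new Function('"use strict"; with ({x: 1}) { x; }');
} catch (e) {
  withRejected = e instanceof SyntaxError; // true
}

console.log(sloppyResult, strictResult, withRejected);
```

In the strict case the set of variables of the calling function is known before the code runs, which is what allows an implementation to treat its scope as completely static.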

Further in this chapter we’ll discuss only lexical (static) scope and details of its implementation. But before that we should briefly consider the concept of a name binding, since it’s actively used with the concept of environments.

In high-level languages we usually operate not with low-level addresses to refer to some data in memory, but rather with convenient variable names (identifiers) which reflect that data.

A name binding is the association of an identifier with an object.

An identifier can be bound or unbound. If an identifier is bound to an object, it references this object. Subsequent use of the identifier results in the object it’s bound to.

Two major operations are related to the concept of bindings (and they often cause confusion in discussions of by-reference or by-value strategies of passing arguments and assignment). These operations are rebinding and mutation.

A rebinding relates to an identifier. This operation unbinds the identifier (if it was previously bound) from an old object and binds it to another one (to another block of memory). Often (and in ECMAScript in particular) rebinding is implemented via a simple operation of assignment.

For example:

// bind "foo" to {x: 10} object
var foo = {x: 10};

console.log(foo.x); // 10

// bind "bar" to the *same* object
// as "foo" identifier is bound

var bar = foo;

console.log(foo === bar); // true
console.log(bar.x); // OK, also 10

// and now rebind "foo"
// to the new object

foo = {x: 20};

console.log(foo.x); // 20

// and "bar" still points
// to the old object

console.log(bar.x); // 10
console.log(foo === bar); // false

Rebinding is often confused with assignment by reference. One might think that after assigning the new object to the foo variable in the example above, the bar variable would also point to the new object. However, as we see, bar still refers to the old object, while foo was rebound to the new memory block. The next figure shows these two actions:

Figure 2. Rebinding.


Think about bindings not as a by-reference, but (from the C viewpoint) as a by-pointer (or sometimes by-sharing) operation. It’s often also described as a special case of by-value where the value is an address. Assignment just changes (rebinds) the pointer’s value (the address) from one memory block to another. And when we assign one variable to another, we just copy the address of the same object to the second variable. The two identifiers are then said to share the one object; hence the name by-sharing.

In contrast with rebinding, the operation of mutation affects the content of the object itself.

Consider the following example:

// bind an array to the "foo" identifier
var foo = [1, 2, 3];

// and here is a *mutation* of
// the array object contents
foo.push(4);

console.log(foo); // 1,2,3,4

// also mutations
foo[4] = 5;
foo[0] = 0;

console.log(foo); // 0,2,3,4,5

This code is presented in the following figure:

Figure 3. Mutation.


You may find additional information about binding and evaluation strategies (by-reference, by-value, by-sharing) in Chapter 8. Evaluation strategy of the ES3 series.
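The difference between rebinding and mutation is the same when an object is passed to a function. A short sketch follows; the functions rebind and mutate and the variables point, alias and fresh are hypothetical names used only for this illustration.

```javascript
function rebind(obj) {
  // rebinds only the local name "obj";
  // the caller's binding is not affected
  obj = {x: 20};
  return obj;
}

function mutate(obj) {
  // changes the shared object itself;
  // visible through every binding to it
  obj.x = 30;
}

var point = {x: 10};
var alias = point;

var fresh = rebind(point);
console.log(point.x, fresh.x); // 10 20

mutate(point);
console.log(point.x, alias.x); // 30 30
console.log(point === alias);  // true
```

The parameter obj receives a copy of the address, so assigning to it inside the function behaves like any other rebinding, while a change made through it reaches the one shared object.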

And now we are ready to discuss in detail the general concept of environments and to see how they are built inside.

In this section we’ll mention the techniques of lexical scope implementation. Also, since we operate with highly abstracted entities and talk about lexical scoping, in the further explanation we’ll mainly use the concept of an environment rather than scope, since exactly this terminology is used in ES5: a global environment, a local environment of a function, etc.

As we mentioned, an environment specifies the meaning of an identifier (of a symbol) in an expression. Indeed, it is meaningless to speak of the value of an expression such as x + 1 without specifying any information about the environment that would provide a meaning for the symbol x (or even for the symbol +, if we treat it as syntactic sugar for a simple function of addition, add(x, 1), though in that case we should provide the meaning of the add name as well).

ECMAScript manages the execution of functions using the model of a call-stack, which is called here the execution context stack. Let’s consider some generic models for storing variables (bindings). What interests us are systems with closures and systems without them.

As long as we have no first-class functions (i.e. functions which may participate as normal data; we’ll talk about them a bit later), or simply do not allow inner functions, the easiest way to store local variables is to use the call-stack itself.

A special data structure of the call-stack which is called an activation record is used as a storage of the environment bindings. Sometimes it’s also called a call-stack frame.

Every time a function is activated, its activation record (with formal parameters and local variables) is pushed onto the call-stack. Thus, if the function calls another function (or itself, recursively), another stack frame is pushed onto the stack. After the context is finished, the activation record is removed (popped) from the stack (which means that all local variables are destroyed). This model is used e.g. in the C programming language.

For example:

void bar(int x) {
  int z = 40;
}

void foo(int x) {
  int y = 20;
  bar(30);
}

int main(void) {
  foo(10);
  return 0;
}

Then the call-stack undergoes the following modifications:

var callStack = [];

// "foo" function activation
// record is pushed onto the stack

callStack.push({
  x: 10,
  y: 20
});

// "bar" function activation
// record is pushed onto the stack

callStack.push({
  x: 30,
  z: 40
});

// callStack at the moment of
// the "bar" activation

console.log(callStack); // [{x: 10, y: 20}, {x: 30, z: 40}]

// "bar" function ends
callStack.pop();

// "foo" function ends
callStack.pop();

In the next figure we see two activation records pushed onto the stack at the moment of the bar function activation:

Figure 4. A call stack with activation records.


Logically, exactly the same approach to function execution is used in ECMAScript, however with some very important differences.

First, as we know and said above, the call-stack concept is represented here by the execution context stack, and the activation record is represented (in ES3) by the activation object.

However, the terminology difference is not so essential in this case. The main difference is that, in contrast with C, ECMAScript does not remove the activation object from memory if there is a closure. The most important case is when this closure is an inner function which uses variables of the parent function in which it’s created, and this inner function is returned upwards to the outside.

That means that the activation object should be stored not in the stack itself, but rather in the heap (dynamically allocated memory; sometimes such languages are called heap-based languages, in contrast with stack-based languages). And it is stored there as long as there are references from closures which use (free) variables from this activation object. Moreover, not just one activation object is saved, but, if needed (in the case of several nested levels), all parent activation objects.

var bar = (function foo() {
  var x = 10;
  var y = 20;
  return function bar() {
    return x + y;
  };
})();

bar(); // 30

The next figure shows an abstract representation of heap-based activation records. We see that if the foo function creates a closure, then after foo is finished, its frame isn’t removed from memory because there is still a reference to it from the closure:

Figure 5. Heap-based call-frames.


One of the terms used in the theory for these activation objects is environment frames (by analogy with call-stack frames). We use this terminology to underline the difference in implementation: environment frames continue to exist if there are references to them from closures. We also use this terminology to underline the more abstract concept: without concentrating on lower-level stack and address structures, we just say that these are environments, and how they are implemented is a derived question.
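That a frame survives the end of its activation, and that every activation gets a frame of its own, can be seen with a small experiment. This is just an illustrative sketch; makeCounter and the other names are hypothetical.

```javascript
function makeCounter() {
  // "count" lives in the frame of this particular activation
  var count = 0;

  return function () {
    count += 1;
    return count;
  };
}

// two activations, two independent frames
var first = makeCounter();
var second = makeCounter();

var a1 = first();  // 1
var a2 = first();  // 2, the frame of the first activation is still alive
var b1 = second(); // 1, the second activation has its own "count"

console.log(a1, a2, b1); // 1 2 1
```

With a purely stack-based model both count variables would have been destroyed as soon as makeCounter returned.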

As was said, in ECMAScript, in contrast with C, we do have inner functions and closures. Moreover, all functions in ES are first-class. Let’s recall the definition of such functions and also other definitions from functional programming. We’ll see that these concepts are closely related to the lexical environments model.

We’ll also clarify that the problem of closures (or the “funarg problem”, as it will be called below) is actually the problem of lexical environments. That’s why in this section we’ll mostly speak about fundamental concepts of functional languages.

A first-class function is one that may participate as normal data, i.e. be created literally at runtime, be passed as an argument, or be returned as a value from another function.

A simple example:

// create a function expression
// dynamically at runtime and
// bind it to "foo" identifier

var foo = function () {
  console.log("foo");
};

// pass it to another function,
// which in turn is also created
// at runtime and called immediately
// right after the creation; result
// of this function is again bound
// to the "foo" identifier

foo = (function (funArg) {

  // activate the "foo" function
  funArg(); // "foo"

  // and return it back as a value
  return funArg;

})(foo);

First-class functions may be stratified into more exact sub-definitions.

When a function is passed as an argument, it’s called a “funarg” — an abbreviation of the functional argument concept.

In turn, a function which accepts a “funarg” is called a higher-order function (HOF) or, closer to mathematics, an operator.

A function which returns another function is called a function with a functional value (or a function-valued function).

As we’ll see below, the so-called “funarg problem” is related to these concepts. And as we’ll also see shortly, the solution to this problem is exactly closures and lexical environments.

In the example above, the foo function is a “funarg” which is passed to the anonymous higher-order function (it accepts the foo “funarg” via the formal parameter named funArg). This anonymous function in turn returns a functional value, again the foo function itself. All these functions fall under the definition of first-class functions.
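The three roles can also be separated into different functions. The names applyTwice, makeAdder and addFive below are hypothetical and serve only as a brief illustration of the definitions.

```javascript
// a higher-order function: it accepts a "funarg"
function applyTwice(funArg, value) {
  return funArg(funArg(value));
}

// a function with a functional value: it returns a function
function makeAdder(x) {
  return function (y) {
    return x + y;
  };
}

// "addFive" is created at runtime and then passed as a "funarg"
var addFive = makeAdder(5);
var result = applyTwice(addFive, 1);

console.log(result); // 11
```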

Another important concept which is related to first-class functions, and which we should recall, is the concept of a free variable.

A free variable is a variable which is used by a function, but is neither a parameter, nor a local variable of the function.

In other words, a free variable is one that is placed not in the function’s own environment, but probably in some surrounding environment. Notice that a free variable may be either bound (i.e. found in some parent environment) or unbound. The latter case causes a ReferenceError in ECMAScript.

Take a look at the example:

// Global environment (GE)

var x = 10;

function foo(y) {

  // environment of "foo" function (E1)

  var z = 30;

  function bar(q) {
    // environment of "bar" function (E2)
    return x + y + z + q;
  }

  // return "bar" to the outside
  return bar;

}

var bar = foo(20);

bar(40); // 100

In this example we have three environments: GE, E1 and E2, which correspond respectively to the global object, foo function and bar function.

Thus, for the bar function, x, y and z variables are free — they are neither formal parameters, nor local variables of bar.

Notice that the foo function does not use free variables. However, since the x variable is used inside the bar function, and because the bar function is created during execution of the foo function, the latter should nevertheless save the bindings of the parent environment, in order to pass the information about the x binding on to the more deeply nested functions (to bar in our case).

The correct and expected result 100 after the bar function activation means that the bar function somehow remembers the environment of the foo function activation (where the inner bar function is created), even though the context of foo is finished. To repeat, this is the difference from the stack-based activation record model used in C.

Obviously, if we allow nested inner functions and want to have static (lexical) scope, and at the same time want all these functions to be first-class, we should save all free variables used by a function at the moment of the function’s creation.

The most straightforward and easiest way to implement such an algorithm is to save the complete parent environment in which the function was created. Later, at the function’s own activation (in this case, at activation of the bar function), we create its own environment, fill it with local variables and parameters, and set the saved environment as its outer environment, in order to find free variables there.

It is possible to use the term environment either for a single bindings object, or for the whole list of all binding objects corresponding to the depth of nesting. In the latter case we may call the binding objects frames of the environment. From this viewpoint:

An environment is a sequence of frames. Each frame is a record (possibly empty) of bindings, which associate variable names with their corresponding values.

Notice that since this is a generic definition, we use the abstract concept of a record without specifying the exact implementation structure: it may be a hash-table placed in the heap, or stack memory, or even registers of a virtual machine, etc.

For example, environment E2 from the example above has three frames: its own (bar), foo and global. Environment E1 contains two frames: foo (own) and the global frame. The global environment GE in turn consists of only one frame, the global one.

Figure 6. An environment with frames.


A single frame may contain at most one binding for any variable. Each frame also has a pointer to its enclosing (or an outer) environment. The outer reference of the global frame is null. The value of a variable with respect to an environment is the value given by the binding of the variable in the first frame in the environment that contains a binding for that variable. If no frame in the sequence specifies a binding for the variable, then the variable is said to be unbound in the environment (the case of a ReferenceError).

var x = 10;

(function foo(y) {
  
  // use of free-bound "x" variable
  console.log(x);

  // own-bound "y" variable
  console.log(y); // 20
  
  // and free-unbound variable "z"
  console.log(z); // ReferenceError: "z" is not defined

})(20);

I.e. returning to the concept of scopes, this sequence of environment frames (or, viewed differently, a linked (chained) list of environments) forms something that we may call a chain of scopes. Not surprisingly, ES3 had exactly this terminology for it: a scope chain.
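The resolution rule over a chain of frames can be sketched in a few lines. Here we assume a hypothetical representation of a frame as an object with a bindings record and an outer link; the function name lookup is also ours. Real engines are free to store frames quite differently.

```javascript
// Walk the frames from the innermost one outwards and
// return the value from the first frame that has the binding.
function lookup(frame, name) {
  while (frame !== null) {
    if (Object.prototype.hasOwnProperty.call(frame.bindings, name)) {
      return frame.bindings[name];
    }
    frame = frame.outer;
  }
  // no frame in the sequence has the binding: unbound variable
  throw new ReferenceError('"' + name + '" is not defined');
}

// frames corresponding to the example with "x", "y" and "z"
var globalFrame = {bindings: {x: 10}, outer: null};
var fooFrame = {bindings: {y: 20}, outer: globalFrame};

var freeBound = lookup(fooFrame, "x"); // 10, found in the global frame
var ownBound = lookup(fooFrame, "y");  // 20, found in the own frame

var unbound = false;
try {
  lookup(fooFrame, "z");
} catch (e) {
  unbound = e instanceof ReferenceError; // true
}

console.log(freeBound, ownBound, unbound); // 10 20 true
```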

Notice that one environment may serve as the enclosing environment for several inner environments:

// Global environment (GE)

var x = 10;

function foo() {

  // "foo" environment (E1)

  var x = 20;
  var y = 30;

  console.log(x + y);

}

function bar() {
  
  // "bar" environment (E2)

  var z = 40;

  console.log(x + z);
}

Thus in pseudo-code:

// global
GE = {
  x: 10,
  outer: null
};

// foo
E1 = {
  x: 20,
  y: 30,
  outer: GE
};

// bar
E2 = {
  z: 40,
  outer: GE
};

The next figure shows these relations:

Figure 7. Common parent environment frame.


That is, the binding x in the environment E1 shadows the binding of the same name in the global frame.

From all this we have the generic rules for creating and applying (calling) functions:

A function is created relative to a given environment. The resulting function object is a pair consisting of the code (function body) and a pointer to the environment in which the function was created.

The code:

// global "x"
var x = 10;

// function "foo" is created relatively
// to the global environment

function foo(y) {
  var z = 30;
  console.log(x + y + z);
}

Corresponds in the pseudo-code to:

// create "foo" function

foo = functionObject {
  code: "console.log(x + y + z);",
  environment: {x: 10, outer: null}
};

This function object is shown on the following figure:

Figure 8. A function.


Note, the function refers to its environment, and one of the environment bindings — the function — refers back to the function object.

A function is called with (or applied to) a set of arguments by constructing a new frame, binding the formal parameters of the function to the arguments of the call, creating bindings for local variables in this frame, and then executing the body of the function in the context of the new environment constructed. The new frame has as its enclosing environment the environment part of the function object being applied.

And the application:

// function "foo" is applied
// to the argument 20

foo(20);

Corresponds to the following pseudo-code:

// create a new frame with formal 
// parameters and local variables

fooFrame = {
  y: 20,
  z: 30,
  outer: foo.environment
};

// and evaluate the code
// of the "foo" function 

execute(foo.code, fooFrame); // 60

The next figure shows the function application using environment:

Figure 9. Function application.
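Both rules, creation and application, fit into a tiny model written in JavaScript itself. This is a sketch under our own assumptions: createFunction, applyFunction and lookup are hypothetical helpers, a frame is modelled as an object with bindings and outer fields, and the function body is an ordinary JavaScript function which receives its frame explicitly.

```javascript
function lookup(frame, name) {
  while (frame !== null) {
    if (Object.prototype.hasOwnProperty.call(frame.bindings, name)) {
      return frame.bindings[name];
    }
    frame = frame.outer;
  }
  throw new ReferenceError('"' + name + '" is not defined');
}

// Creation: pair the code with the environment of creation.
function createFunction(params, body, environment) {
  return {params: params, body: body, environment: environment};
}

// Application: build a new frame, bind parameters to arguments and
// take the outer link from the function object, not from the caller.
function applyFunction(fn, args) {
  var frame = {bindings: {}, outer: fn.environment};
  fn.params.forEach(function (param, index) {
    frame.bindings[param] = args[index];
  });
  return fn.body(frame);
}

var globalEnv = {bindings: {x: 10}, outer: null};

var foo = createFunction(["y"], function (frame) {
  frame.bindings.z = 30; // local variable "z"
  return lookup(frame, "x") + lookup(frame, "y") + lookup(frame, "z");
}, globalEnv);

// the environment refers back to the function object
globalEnv.bindings.foo = foo;

var result = applyFunction(foo, [20]);
console.log(result); // 60
```

Since applyFunction never looks at the environment of the caller, the model is statically scoped by construction.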


The first point from this conclusion directly gives us a definition of a closure.

A closure is a pair consisting of the function code and the environment in which the function is created.

As we mentioned above, closures were invented as a solution to the “funarg problem”. Let’s recall it in order to have a complete understanding.

The funarg problem is divided into two sub-problems which are directly related to the concepts of scope, environments and closures.

The upward funarg problem corresponds to the complexity of returning an inner function to the outside (upward): how can we implement returning the function if it uses free variables of the parent environment in which it’s created?

(function (x) {
  return function (y) {
    return x + y;
  };
})(10)(20); // 30

As we already know, lexical scope with the enclosing frame saved on the heap is the key and the answer. The strategy of storing bindings on the stack (used in C) does not fit anymore. To repeat, this saved code block together with the environment is a closure.

The downward funarg problem corresponds to the ambiguity of variable name resolution when we pass a function which uses free variables as an argument to another function. In which scope should these free variables be resolved: in the scope of the function definition or in the scope of the function execution?

var x = 10;

(function (funArg) {

  var x = 20;
  funArg(); // 10, not 20

})(function () { // create and pass a funarg
  console.log(x);
});

I.e. the downward funarg problem relates to the choice between the static (lexical) and dynamic scopes discussed at the beginning of the chapter. As we already know and said above, lexical (static) scope is again the answer. We should save exactly the lexical variables to avoid such ambiguities. And again, these saved lexical variables together with the code of our function are what is called a closure.

So what do we have in the end? The concepts of first-class functions, closures and lexical environments are very closely related. And lexical environments are exactly the technique used to implement closures and static scope in general.

At this step (though running ahead) let’s mention that ECMAScript uses exactly this model with environment frames. However, we’ll discuss the concrete ES terminology in the appropriate section.

A detailed explanation of closures may be found in Chapter 6. Closures of the ES3 series.

For the completeness let’s also clarify an alternative environments implementation used in some other languages.

Always remember, that in order to understand some concrete technology (e.g. ECMAScript) in-depth, we should always first understand the mechanisms of the common theory and also to see how other languages implement the technology. Then we’ll see that these generic mechanisms are become apparent in many similar languages. Though, the different languages may treat the implementation also differently. This section is devoted to environments in such languages as Python, Ruby and Lua.

An alternative way to save free variables is to create one big environment frame which contains all the needed free variables, and only them, collected from the different enclosing environments.

Obviously, if some variables are not needed for inner functions, there is no need to save them. Consider the following example:

// global environment

var x = 10;
var y = 20;

function foo(z) {

  // environment of "foo" function
  var q = 40;

  function bar() {
    // environment of "bar" function
    return x + z;
  }

  return bar;

}

// creation of "bar"
var bar = foo(30);

// applying of "bar"
bar();

We see that none of the functions uses the global y variable. Therefore, we save it neither in foo’s closure nor in bar’s.

The global variable x is not used by the foo function. However, as we mentioned before, we should save it in foo’s closure, since the deeper inner bar function does use it and should take the information about x from the environment in which it is created (i.e. from the environment of the foo function).

The q variable of the foo function is in the same situation as the global y: no one uses it on deeper levels, so we do not save it in bar’s closure. The variable z is of course saved in bar’s closure.

Thus, we have a single environment frame of the bar function containing all needed free variables:

bar = closure {
  code: <...>,
  environment: {
    x: 10,
    z: 30,
  }
}
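
How such a combined frame could be built can be sketched as follows. The helper captureFreeVariables and the plain-object frames are assumptions made for illustration only. The sketch also copies values for brevity, whereas a real implementation would typically capture shared cells (references to the bindings), so that later assignments stay visible.

```javascript
// Hypothetical helper: collect only the needed free variables
// from a list of enclosing frames (innermost frame first).
function captureFreeVariables(freeNames, frames) {
  var combined = {};

  freeNames.forEach(function (name) {
    for (var i = 0; i < frames.length; i++) {
      if (Object.prototype.hasOwnProperty.call(frames[i], name)) {
        combined[name] = frames[i][name];
        return; // the nearest frame wins
      }
    }
    throw new ReferenceError(name + " is not defined");
  });

  return combined;
}

var globalFrame = { x: 10, y: 20 };
var fooFrame = { z: 30, q: 40 };

// "bar" uses only "x" and "z" as free variables
var barEnvironment = captureFreeVariables(["x", "z"], [fooFrame, globalFrame]);

console.log(barEnvironment); // { x: 10, z: 30 }
```

The list of free names has to be known in advance, which is why this model requires analyzing the function body at creation time.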

A similar model is used, for example, in the Python programming language. There a function has one saved environment frame, exposed directly as __closure__ (a name that reflects the essence of lexical environments). Global variables are not included in this frame, since they can always be found in the global frame. Unused variables are not in __closure__ either. Take a look at the example:

# Python environments example

# global "x"
x = 10

# global "foo" function
def foo(y):

    # local "z"
    z = 40

    # local "bar" function
    def bar():
        return x + y

    return bar

# create "bar"
bar = foo(20)

# execute "bar"
bar() # 30

# Saved environment of the "bar" function;
# it's stored in the __closure__ property;
#
# It contains only {"y": 20};
# "x" is not in the __closure__ since it
# can always be found in the global scope;
# "z" is not saved either since it's not used

barEnvironment = bar.__closure__
print(barEnvironment) # tuple of closure cells

internalY = barEnvironment[0].cell_contents
print(internalY) # 20, "y"

Notice that Python doesn’t save unused variables even when eval is used, i.e. when it is not known in advance whether a variable will be used in the context or not. In the following example the inner baz function captures the free variable x, and the bar function does not:

def foo(x):

    def bar(y):
        print(eval(k))

    def baz(y):
        z = x
        print(eval(k))

    return [bar, baz]

# create "bar" and "baz" functions
[bar, baz] = foo(10)

# "bar" does not close over anything
print(bar.__closure__) # None

# "baz" closes over the "x" variable
print(baz.__closure__) # tuple with one cell, holding "x" (10)

k = "y"

baz(20) # OK, 20
bar(20) # OK, 20

k = "x"

baz(20) # OK, 10 - "x"
bar(20) # error, "x" is not defined

And again, ECMAScript, in contrast, having chained environment frames, handles this case normally:

function foo(x) {

  function bar(y) {
    console.log(eval(k));
  }

  return bar;

}

// create "bar"
var bar = foo(10);

var k = "y";

// execute bar
bar(20); // OK, 20 - "y"

k = "x";

bar(20); // OK, 10 - "x"

For the brief but detailed explanation of closures in Python see this Python code-article.

I.e. the main difference is that the model of chained environment frames (used in ECMAScript) optimizes the moment of function creation. However, at identifier resolution the whole scope chain has to be traversed, frame by frame, until the needed binding is found or a ReferenceError is thrown.

Meanwhile, the model of the single environment frame optimizes execution (all identifiers are resolved in the nearest single frame without a long scope chain lookup), but requires a more complex function creation algorithm: all inner functions have to be parsed to determine which variables should be saved and which should not.

Notice though that this conclusion follows only from the ECMA-262-5 specification. In practice, ES engines may easily optimize the implementation and save only the needed variables. We’ll talk about the ECMAScript implementation in the following chapter 3.2.

Also notice that, strictly speaking, the combined frame need not be the only one. The combined frame is optimized to contain bindings from several parent frames, but the environment may include some additional frames. The same holds in Python: at execution a function has its own activation frame, the combined __closure__ frame, and the global frame.
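
The trade-off between the two models can be made visible with a small sketch that counts how many frames are inspected during a lookup. The function lookupInFrames and the plain-object frames are hypothetical and serve only as an illustration. Real engines are free to resolve identifiers in other ways.

```javascript
// Hypothetical lookup over a list of frames (innermost first) that also
// reports how many frames had to be inspected.
function lookupInFrames(name, frames) {
  for (var i = 0; i < frames.length; i++) {
    if (Object.prototype.hasOwnProperty.call(frames[i], name)) {
      return { value: frames[i][name], framesVisited: i + 1 };
    }
  }
  throw new ReferenceError(name + " is not defined");
}

// Chained model: activation of "bar", frame of "foo", global frame.
var chainedFrames = [{}, { z: 30, q: 40 }, { x: 10, y: 20 }];

// Single-frame model: one combined frame prepared at function creation.
var combinedFrames = [{ x: 10, z: 30 }];

var viaChain = lookupInFrames("x", chainedFrames);
var viaCombined = lookupInFrames("x", combinedFrames);

console.log(viaChain.value, viaChain.framesVisited); // 10 3
console.log(viaCombined.value, viaCombined.framesVisited); // 10 1
```

Both lookups produce the same value. The difference is only in where the work is done: at resolution time for the chain, at creation time for the combined frame.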

Ruby is another language that uses the single frame, capturing all variables, but only those which exist at the moment of the closure creation. In the following Ruby example the variable x is captured by the second closure, but not by the first one:

# Ruby lambda closures example

# closure "foo", which has
# free variable "x"
 
foo = lambda {
  print x
}

# define the "x"
x = 10

# second closure "bar" with
# the same body - it also
# refers free variable "x"

bar = lambda {
  print x
}

bar.call # OK, 10
foo.call # error, "x" is not defined
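
For contrast, here is a sketch of the same shape in ECMAScript (the wrapper function demo is illustrative). A closure there keeps a reference to the whole enclosing frame rather than a snapshot of the names existing at creation time, and var declarations are hoisted, so the binding of x already exists in the frame when the first function is created.

```javascript
function demo() {
  // "foo" is created before "x" gets its value
  var foo = function () {
    return x;
  };

  var x = 10;

  // "bar" is created after the assignment
  var bar = function () {
    return x;
  };

  return [bar(), foo()];
}

var results = demo();

console.log(results); // [ 10, 10 ]
```

Both functions resolve x in the same frame at call time, so the order of their creation relative to the assignment does not matter here.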

However, as mentioned, Ruby saves all existing variables, so in the eval case described above an unused variable is resolved (the same as in ES and in contrast with Python):

k = "y"

foo = lambda { |x|
  lambda { |y|
    eval(k)
  }
}

# create "bar"
bar = foo.call(10)

print(bar.call(20)) # OK, 20 - "y"

k = "x"

print(bar.call(20)) # OK, 10 - "x"

Some languages, e.g. Lua (which also has a single environment frame), allow setting the needed environment of a function dynamically at runtime. Consider the following Lua example (it relies on getfenv and setfenv, which are available in Lua 5.1):

-- Lua environments example:

-- global "x"
x = 10

-- global function "foo"
function foo(y)

  -- local variable "q"
  local q = 40

  -- get environment of the "foo"
  fooEnvironment = getfenv(foo)

  -- {x = 10, globals…}
  print(fooEnvironment) -- table

  -- "x" and "y" are accessible from here,
  -- since "x" is in the environment, and
  -- "y" is a local variable (argument)
  print(x + y) -- 30

  -- and now change the environment of the "foo"
  setfenv(foo, {
    -- use reference to "print" function,
    -- but give it another name
    printValue = print,

    -- reuse "x"
    x = x,

    -- and define a new "z" binding
    -- with value of the "y"
    z = y

  })

  -- use new bindings

  printValue(x) -- OK, 10
  printValue(x + z) -- OK, 30

  -- local variables are still accessible
  printValue(y, q) -- 20, 40

  -- but not other names
  printValue(print) -- nil
  print("test") -- error, "print" name is nil, can’t call

end

foo(20)
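
ECMAScript has no direct counterpart of setfenv, but a loosely similar effect can be sketched with the with statement, which puts an arbitrary object on top of the scope chain. The helper evaluateIn below is hypothetical. It relies on the fact that code created by the Function constructor is non-strict unless it contains its own "use strict" directive, so with is allowed there. Unlike the Lua example, the environment is not replaced: names missing from the object are still looked up in the global scope.

```javascript
// Hypothetical helper: evaluate an expression with "environment"
// placed on top of the scope chain via "with".
function evaluateIn(environment, source) {
  var fn = new Function(
    "environment",
    "with (environment) { return (" + source + "); }"
  );
  return fn(environment);
}

// The same source text resolves "x" and "z" differently,
// depending on the object supplied at runtime.
var sum1 = evaluateIn({ x: 10, z: 20 }, "x + z");
var sum2 = evaluateIn({ x: 1, z: 2 }, "x + z");

console.log(sum1, sum2); // 30 3
```

This is exactly why with is said to bring dynamics into the static scope: which binding a name refers to cannot be determined by looking at the source text alone.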

At this step we complete the consideration of the common theory. The next sub-chapter, 3.2, will be devoted to the ECMAScript implementation itself. We’ll consider such structures as environment records (which correspond to the environment frames of the theory discussed here) and talk about their different types: declarative environment records and object environment records. We’ll also see which structure an execution context has in ES5 and how its different parts are related to the different kinds of functions already known to us: function expressions and function declarations.

The summary which we have for this chapter:

  • Concept of an environment is related with a concept of a scope.
  • In the theory there are two types of scope: dynamic and static.
  • ECMAScript uses static (lexical) scope.
  • However, the with and eval instructions may be considered as bringing dynamics into the static scope.
  • The concepts of scope, environment, activation object, activation record, call-stack frame, environment frame, environment record and even execution context are all near synonyms and may be used interchangeably in discussions. Technically, in ECMAScript some of them are parts of others: e.g. an environment record is a part of the lexical environment, which in turn is a part of the execution context. However, logically, in abstract definitions, they all may be used nearly equally. It’s normal to say: “a global scope”, “a global environment”, “a global context”, etc.
  • ECMAScript uses the model of chained environment frames. In ES3 it was called a scope chain. In ES5, as we’ll see, an environment frame is called an environment record.
  • An environment may enclose several inner environments.
  • Lexical environments are used to implement closures and to solve the funarg problem.
  • All functions in ECMAScript are first-class and closures.

If you have questions, additions or corrections, feel free to discuss them in comments.


Written by: Dmitry A. Soshnikov.
Published on: 2010-12-12.


23 Comments:

  1. John Merge
    15. December 2010 at 23:41

    Amazing stuff, really!
    The best articles I have ever seen – this one too!

    Keep going, man, you rules!


  2. John Merge
    15. December 2010 at 23:57

    Typos:

    resoled
    association of of an identifier
    The following use of the identifier results the object it’s bound too. (to)

    The word “environment” implies again something that lexically surrounds the definition.

    What exactly did you mean?


  3. Dmitry A. Soshnikov
    16. December 2010 at 00:24

    @John Merge

    Thanks, glad to see more people interesting in deep JS.

    What exactly did you mean?

    I meant an analogy with the real meaning of the “environment” word. E.g. ecological environment (surrounding) — something that surrounds us.

    The same is here:

    var x = 10;
    
    function foo() {
    
      var y = 20;
    
      function bar() {
        var z = 30;
      }
    
    }

    The definition of the foo function is surrounded (enclosed) by the global environment. The definition of the bar function is surrounded by the foo‘s environment. And “lexically” means — the exact nearest place in the source code position.

    P.S.: thanks for fixing typos (I let myself to combine them in one message). Please inform me if there are more.

    Dmitry.


  4. John Merge
    16. December 2010 at 00:55

    Typos:

    “Notice, a one environment…”
    “Rules of function creation and application”

    And the most important case, when this closure — is some inner function which uses variables of the parent function in which it’s created, and this inner function is returned upwards to the outside.

    Did you mean:

    “And the most important case is when this closure is some inner function which uses variables of the parent function in which it’s created, and this inner function is returned upwards to the outside.”?

    Also, this part: “and this inner function is returned upwards to the outside” might be improved.


  5. joseanpg
    16. December 2010 at 23:51

    I will be a laconic commenter, Dmitry this is superb!! ;)


  6. Dmitry A. Soshnikov
    17. December 2010 at 12:11

    @John Merge

    Yep, corrected, thanks. Feel free to propose other improvements.

    @joseanpg

    Thanks, Jose ;) Also thanks for proposals sent via mail.


  7. Robert Polovsky
    19. December 2010 at 21:26

    It’s just awesome stuff, Dmitry! Thank you for writing so scientific articles, they are really the best I have ever read on JavaScript.

    Also special thanks for consideration closures in Python, I’m also interested in this language.


  8. Dmitry A. Soshnikov
    20. December 2010 at 12:22

    @Robert Polovsky, yep, thanks, glad it’s useful.

    Also special thanks for consideration closures in Python

    Yes, Python and ECMAScript have many similar design features. But as we saw, having the similar core features they nevertheless differ in some implementation aspects. And exactly this difference helps us to analyze the theoretical topic in-depth.

    Dmitry.


  9. monolithed
    9. January 2011 at 22:41

    Dmitry, while reading your article I came up with several questions:

    1. How should the terms Lexical environments and Lexical scope be translated into Russian? (For myself I translated them as “лексическое окружение/контекст” and “лексическая область видимости”.)

    2. Did I understand correctly that dynamic scope as such is absent in ES?

    3. Where did you get the information that the let statement will appear in ES6? (It would be interesting to read about what may please us in the future.)

    4. And if it’s not too much trouble, could you at least briefly clarify the situation with Variable Environment (since it is not discussed here) and how it differs from Lexical environments?

    Thanks in advance for the answers!


  10. Dmitry A. Soshnikov
    10. January 2011 at 16:26

    @monolithed

    1. How should the terms Lexical environments and Lexical scope be translated into Russian? (For myself I translated them as “лексическое окружение/контекст” and “лексическая область видимости”.)

    Yes, that’s all correct. “Scope” is “область видимости”. As for “environment”, “лексическое окружение” is used more often (i.e. the surroundings which lexically enclose the function declaration in the code), but “лексический контекст” will also do.

    2. Did I understand correctly that dynamic scope as such is absent in ES?

    Yes. But, as noted, with and eval can be considered as constructs that “bring dynamics” into the lexical scope. Not in the classical sense of dynamic scope, though, but in the sense that it is impossible to determine at parse time in which scope a variable will be resolved later, when it is accessed.

    ES5 strict (and, accordingly, ES6 Harmony) removed with, and eval runs in its own “sandbox” and cannot create a variable in the calling context (except for an indirect call of eval). I.e. the features of dynamic scope have been removed from ES5 strict.

    3. Where did you get the information that the let statement will appear in ES6? (It would be interesting to read about what may please us in the future.)

    The main discussion of the ES design takes place on the es-discuss list. From time to time the results of the discussions and proposals are described on the official site http://ecmascript.org. As for let, you can read here, here and here.

    4. And if it’s not too much trouble, could you at least briefly clarify the situation with Variable Environment (since it is not discussed here) and how it differs from Lexical environments?

    This article is the common theory, independent of concrete programming languages (though described with examples from different languages). The assumption is that after reading this article, the specifics should be perceived exactly as specifics which internally use the mechanisms of this common theory.

    But since this series is devoted to ECMAScript, we will of course analyze all the ES specifics in detail in the next article, “Chapter 3.2. Lexical Environments. ECMAScript implementations”, since there are many nuances there too (but, I repeat, the “core” of all those nuances is exactly this common theory described here in 3.1).

    Here I’ll note it briefly. Both components, VariableEnvironment (VE) and LexicalEnvironment (LE), are properties of the execution context. The first serves for instantiating variables on entering the context (i.e. it is that very variable object, VO, of ES3), and the second for resolving variables at runtime.

    Initially, LE is equal to VE. However, during code execution LE may change (for example, be replaced by the environment created by with), while VE never changes throughout the code of the context.

    The main difference is in working with different kinds of functions. An FD saves VE as its [[Scope]], and an FE, accordingly, LE, since unlike an FD it may be created at runtime and, for example, inside with, when the LE of the context is replaced.

    We’ll go into more detail in 3.2.


  11. NekR
    14. March 2011 at 20:28

    Yes, thanks for the good series of articles. The next one will be especially interesting; I’d like to learn in more detail about modern engines and how they handle scopes.
    In the 5th edition, in theory, engines should save only the needed (used) variables in the scope, but in the 3rd edition it’s unclear what engines do: judging by the fact that any variable of the scope can be obtained through eval, all variables of the “parent” scopes are saved when it is “encountered” in the code.

    P.S. Does “Harmony” mean that ES4 (ActionScript) and ES3/ES5 (JavaScript) will be able to move to it without conflicts?


  12. Dmitry A. Soshnikov
    15. March 2011 at 13:15

    @NekR

    The next one will be especially interesting; I’d like to learn in more detail about modern engines and how they handle scopes.

    Yes, ES5 introduces some optimization for handling lexical environments. It distinguishes a declarative environment record and, accordingly, an object environment record. In fact the first, for storing bindings (variables), may use not objects on the heap but directly the registers of the virtual machine. The second is used for with and the global object.

    I can’t say anything about concrete implementations in concrete engines, since I’m not an implementer.

    In the 5th edition, in theory, engines should save only the needed (used) variables in the scope, but in the 3rd edition it’s unclear what engines do: judging by the fact that any variable of the scope can be obtained through eval, all variables of the “parent” scopes are saved when it is “encountered” in the code.

    Actually, the 5th edition also says nothing about not all variables having to be closed over. As in the 3rd edition, the parent environment is saved completely (more precisely, a reference to it). Accordingly, if a function contains other inner functions or eval (or, even more interesting, eval over parent variables not known in advance inside nested functions), there must be a way to get these variables. Here the optimization with virtual machine registers won’t work. However, as we saw, Python, for example, ignores eval and doesn’t save everything, only what is needed.

    P.S. Does “Harmony” mean that ES4 (ActionScript) and ES3/ES5 (JavaScript) will be able to move to it without conflicts?

    Not sure about ES4 ;) it correlates little with Harmony. “Harmony” most likely means that they did eventually agree at that time on which direction to move in.


  13. NekR
    15. March 2011 at 16:49

    It’s good that JS/ES continues to evolve; that’s pleasing and does open new possibilities for programmers.
    Lately I’ve started thinking about how to use closures more correctly … more profitably. That is, whether it’s worth keeping a reference to the parent scope if I only need one variable from it, while there will naturally be other auxiliary ones in it.
    I don’t even know how to explain all the situations simply; I’ll try with a code example, but it’s still not quite the same.
    An example, a.k.a. code from jQuery …

    var objectType = (function(typeByClass){
    	// ES5 forEach method 
    	'Object Array Function RegExp Number String'.split(' ').forEach(function(key/*, i, arr*/){
    		typeByClass['[object '+key+']'] = key.toLowerCase();
    	}); 
    	return function(obj){
    		return typeByClass[Object.prototype.toString.call(obj)];
    	}
    }({}));
    

    In this example the closure is, yes, fully justified, but in other cases there may be many more auxiliary functions/variables which will not be closed over in the final function. Of course it can be done differently, but closures are good exactly for such examples, so saving only the needed variables seems important to me.

    P.S. This is not so much a question as reasoning with a knowledgeable person ;)


  14. Dmitry A. Soshnikov
    15. March 2011 at 17:10

    @NekR

    saving only the needed variables seems important to me.

    Well, Python sees it that way too, and therefore, I repeat, it even ignores eval in inner functions.

    JS (by the specification, and in practice) leaves such a possibility. However, I think engines do make some optimizations if there is no eval in nested functions.

    So, once again: Python optimizes the moment of identifier resolution and saves only what is needed (but the process of function creation is complex, with parsing of inner functions, etc.). And JS optimizes the moment of function creation and simply saves a reference to the parent environment, leaving it completely in memory.


  15. NekR
    15. March 2011 at 18:55

    Well, Python sees it that way too, and therefore, I repeat

    Yes, I read the article and understood about Python, but Python is Python; I’m not going to switch to it because of this ;)
    Well, in general it’s clear: since the reference is most likely always saved, closures can be used without worry.
    Looking forward to the next article :)


  16. everything about java
    17. October 2011 at 02:57

    java programming lesson…

    [...]ECMA-262 » ECMA-262-5 in detail. Chapter 3.1. Lexical environments: Common Theory.[...]…


  17. Patrick Mahoney
    3. February 2012 at 23:03

    Closure

    A closure is a pair of the function code and the saved at creation lexical environment.

    Did you intend “bindings” or “environment” after saved here? Just curious, great article! *subscribes*


  18. Dmitry Soshnikov
    4. February 2012 at 07:51

    @Patrick Mahoney

    Yeah, it turned out a little bit confusing statement. I edited it. So a closure is a pair consisting of the function code the environment in which the function is created.


  19. AMS
    10. April 2012 at 11:07

    Hi Dmitry great series of articles really learned a lot from it. What books and on papers do you recommend for some one to really get the background about how functional and non functional paradigms are implemented.


  20. LR
    18. May 2012 at 07:36

    Is it possible to get an exact reference to the scope chain/environmental record frame? I am trying to create a function at runtime from a string without using eval, but the Function constructor is native code and will only form a closure over the global object’s variables. If there were some way I could pass the entire chain from the activation where the call to create this function takes place, then I could do it, but I’m having trouble finding information about where these properties are stored and how they’re ordered. The JS inspector tools for browsers seem to break it down well.


  21. Dmitry Soshnikov
    25. May 2012 at 20:58

    @LR

    No, unfortunately (or fortunately — it depends), the spec doesn’t allow to get direct access to the lexical environment object. However, in some implementations, such an access was — e.g. __parent__ property of some versions of Rhino.


  22. Joshua Ramirez
    16. November 2012 at 23:13

    You’re a genius. Thank you for writing straightforward explanations to such abstract concepts. It’s a sign of mastery. Much appreciated.


  23. shiva
    29. March 2013 at 12:21

    One of the best article I have seen on the subject – detailed, clear and couldn’t be any simpler.

