in Notes

Pattern Matching

In this article we briefly describe the generic topic of pattern matching in programming languages. We’ll see that this powerful technique is present in our every day programming, even if we may not notice it.

A matching is a process of checking whether one thing has the same or similar representation as another thing.

There are two basic matches:

  1. value to value match
  2. value to pattern match

Let’s start from the simplest case, matching a simple value to itself. For this we’ll be using hypothetical match function:

match(1, 1); // true

We can further extend it and support expressions:

match(2, 1 + 1); // true

Now we can go further, and try to match more complex structures:

match([1, 2], [1, 2]); // true

Notice! In the example above we don’t check identities of two array (i.e we don’t check whether two arrays refer to the same place), but rather whether the representation of one array is similar or identical to another. I.e. we check the shape and all values inside the array to match.

Comparing values to values can be useful sometimes, but the most powerful pattern matching becomes when is used to compare values to patterns.

Let’s extend our match function to work not only with values, but also with some patterns — i.e. instead of using real data, we can use placeholders, that may represent the same value, but the actual exact value is not that important:

match([x, y], [1, 1]); // true
match([_, _], [1, 1]); // true (placeholder names are not essential)
match([x, _, _], [1, 1]); // false! 3 elements vs 2

These placeholders are usually called pattern bindings. We say that x and y are bound to value 1. Or also, simply — x and y variables have value 1. Also, we can use non-existing variables in patterns: if a variable doesn’t exist, it’s created.

As pattern name assumes, it may cover all corresponding real values:

match([x, y], [1, 2]); // true
match([x, y], [3, 4]); // true

Note, we don’t touch topic of immutability here. For example, in Erlang the previous line would be an error and would not bind x and y to 3 and 4, since they were already bound to values 1 and 2. But e.g. in ECMAScript they are normally can be re-bound to new values.

And of course we can go even further and support nested structures as well:

match([1, [2, 3]], [1, [2, 3]]); // true
match([x, [y, z]], [1, [2, 3]]); // true

A type can also be considered as a pattern. For example, we can define a type definition, as just a convenient name to refer to our pattern.

Consider the example:

type MyType = shape([int, int]); // an array pattern with two int elements

match(MyType, [1, 2]); // true

So far we were able to match one value to another, or one pattern to a value. We can go even further, and provide construct to match a value to several alternative values or patterns. Basically, it’s the same switch-statement, which supports not only equality check, but also patterns:

function checkMatch(x) {
  match(x):
    when 10:       print('Matched 10')
    when 20:       print('Matched 20')
    when [e1, e2]: print('Matched an array with two elements')
    when _:        print('Anything else')
}

checkMatch(1); // 'Anything else'
checkMatch(10); // 'Matched 10'
checkMatch([1, 2]); // 'Matched an array with two elements'
checkMatch([1, 2, 3]); // 'Anything else'
checkMatch(20); // 'Matched 20'

Usually exactly this multi-case pattern matching is the most widely spread on practice and when describing the topic.

Notice, the multi-choice construct from above though is just a syntactic sugar for simple if-statments, the same as casual switch. We could write it the manual way:

if (match(10, x)) {
  print('Matched 10');
} else if (match(20, x)) {
  print('Matched 20');
} else if (match([e1, e2], x)) {
  print('Matched an array with two elements');
} else {
  print('Anything else');
}

However, the multi-choice construct is obviously more convenient.

A destructuring is a process of splitting a complex structure to smaller sub-parts.

We have seen it above already:

match([x, y], [1, 2]); // true

print(x, y); // 1, 2

In the example above we not only checked whether the value [1, 2] is matched to the pattern [x, y], but also the pattern allowed us to have smaller sub-parts of the [1, 2] structure. I.e. we destructured it to x and y parts, that we can use further.

The most cool thing, that a destructured sub-part pattern variable name, can be reused to match patterns that contain the same variable name:

// Bind x to 10 at destructuring.
match([x, 20], [10, 20]); // true

// And then already use it as a value.
match([10, _], [x, x]); // true

Notice, that destructuring may present in pattern matching or may not. Sometimes there is a confusion here: one can confuse destructuring and pattern matching topics, assuming that destructuring is pattern matching. It is not as we mentioned, and the main task of pattern matching is exactly to check whether a value is matched to the pattern. And destructuring is just a convenient consequence here. And vice-versa: a destructuring implementation may exist without a strict matching check. Although, almost all implementations of pattern matching today imply destructuring as well.

The regular pattern matchings (or better known as just regular expressions) are usually not directly participate in “classical” pattern matching topics, but have the same roots, and also can be considered as a sub-type of pattern matching.

Let’s look at the example:

'abc'.match(/a(.)c/);

In the example above we matched value 'abc' to the pattern /a(.)c/, and even destructured group that contains "b" when the pattern is applied.

By the way, this is what we referred to when was mentioning at the beginning that pattern matching can present in our every day programming, even if we may not notice it: regular expressions are just a special case of pattern matching.

Let’s take a look at couple of examples in real languages. Often, this match function we used above, is implemented as a simple “assignment” operator (in fact, it’s not an assignment, but rather the pattern matching operator). Consider the following Erlang code:

X = 1,
X = X,
1 = 1,
1 = X.

The first statement binds X to 1, and further three already check matching of the right and left hand sides.

Another Erlang example with case alternative matching:

case X of
  1 -> print('matched1');
  _ -> print('Anything else')
end.

JavaScript (starting since ES6) uses non-strict pattern matching, that allows only to destructure needed complex structures, but doesn’t do actual matching check:

[x, y] = [1, 2];

console.log(x, y); // 1, 2

[x, y] = [1, 2, 3]; // works too, non-strict

console.log(x, y); // 1, 2

Pattern matching and destructuring present in many languages such as Erlang, Haskel, JavaScript (non-strict), CoffeeScript (also just non-strict destructuring), and many others.

On practice, destructuring and pattern matching can be used for elegant solutions to split a complex object (i.e. with one var statement to define three variables):

var point = {x: 1, y: 2, z: 3};

var {x, y, z} = point;

console.log(x, y, z); // 1, 2, 3

When is used with function parameters, it’s even more convenient:

function initPoint({x, y, z}) {
   ...
}

initPoint({x: 1, y: 2, z: 3});

Or, e.g. we received a response from a server, and want to run appropriate handling procedure, based on its format:

case Response of
  {http, Body} -> handle_body(Body);
  {tcp, Stream} -> handle_stream(Stream);
  Raw -> handle_raw(Raw)
end. 

Hope this small article dispelled the mystery around the pattern matching topic, and it became more clear. To recap, regardless which type of matching we use: complex objects with destructuring, simple values, or regular matchings, at the end of the day, it’s reduced to very simple check: whether a value matched to a value, or a value matched to a pattern. There are no other cases here. Everything else is just a syntax of concrete languages.

Written by: Dmitry A. Soshnikov
Published on: Aug 16th, 2014

Write a Comment

Comment

  1. Thanks Dmitry,I hope you blog regularly.Yours is one of the best blog where each article encapsulates a wealth of information in a succinct and clear way.

  2. I am really glad that you are back with your blog entries!
    Everything what I’ve read so far on your site helped me to finally understand all these tiny details of JavaScript.

    You really have a talent to explain difficult stuff. Well done!

  3. A little bug: missing comma at match([x, y] [1, 2]); // true
    It should be: match([x, y], [1, 2]); // true
    Cheers!

  4. @Jarrod Kahn: Yeah, strict pattern matching (similar to Erlang’s or OCaml’s) would be nice to have in ES7.