In this article we briefly describe the generic topic of pattern matching in programming languages. We’ll see that this powerful technique is present in our every day programming, even if we may not notice it.
Matching
A matching is a process of checking whether one thing has the same or similar representation as another thing.
There are two basic matches:
- value to value match
- value to pattern match
Value-to-value match
Let’s start from the simplest case, matching a simple value to itself. For this we’ll be using hypothetical match
function:
match(1, 1); // true
We can further extend it and support expressions:
match(2, 1 + 1); // true
Now we can go further, and try to match more complex structures:
match([1, 2], [1, 2]); // true
Note: in the example above we don’t check identities of two array (i.e we don’t check whether the two arrays point to the same memory location), but rather whether the representation of one array is similar or identical to another. I.e. we check the shape and all values inside the array to match.
Comparing values to values can be useful sometimes, but the most powerful pattern matching becomes when is used to compare values to patterns.
Value-to-pattern match
Let’s extend our match
function to work not only with values, but also with some patterns — i.e. instead of using real data, we can use placeholders, that may represent the same value, but the actual exact value is not that important:
match([x, y], [1, 1]); // true match([_, _], [1, 1]); // true (placeholder names are not essential) match([x, _, _], [1, 1]); // false! 3 elements vs 2
These placeholders are usually called pattern bindings. We say that x
and y
are bound to value 1
. Or also, simply — x
and y
variables have value 1
. Also, we can use non-existing variables in patterns: if a variable doesn’t exist, it’s created.
As pattern name assumes, it may cover all corresponding real values:
match([x, y], [1, 2]); // true match([x, y], [3, 4]); // true
Note, we don’t touch topic of immutability here. For example, in Erlang the previous line would be an error and would not bind x
and y
to 3
and 4
, since they were already bound to values 1
and 2
. But e.g. in ECMAScript they are normally can be re-bound to new values.
And of course we can go even further and support nested structures as well:
match([1, [2, 3]], [1, [2, 3]]); // true match([x, [y, z]], [1, [2, 3]]); // true
Type matching
A type can also be considered as a pattern. For example, we can define a type definition, as just a convenient name to refer to our pattern.
Consider the example:
type MyType = shape([int, int]); // an array pattern with two int elements match(MyType, [1, 2]); // true
Multi-case match
So far we were able to match one value to another, or one pattern to a value. We can go even further, and provide construct to match a value to several alternative values or patterns. Basically, it’s the same switch
-statement, which supports not only equality check, but also patterns:
function checkMatch(x) { match(x): when 10: print('Matched 10') when 20: print('Matched 20') when [e1, e2]: print('Matched an array with two elements') when _: print('Anything else') } checkMatch(1); // 'Anything else' checkMatch(10); // 'Matched 10' checkMatch([1, 2]); // 'Matched an array with two elements' checkMatch([1, 2, 3]); // 'Anything else' checkMatch(20); // 'Matched 20'
Usually exactly this multi-case pattern matching is the most widely spread on practice and when describing the topic.
if
-statments, the same as casual switch
. We could write it the manual way:
if (match(10, x)) { print('Matched 10'); } else if (match(20, x)) { print('Matched 20'); } else if (match([e1, e2], x)) { print('Matched an array with two elements'); } else { print('Anything else'); }
However, the multi-choice construct is obviously more convenient.
Destructuring as a bonus
A destructuring is a process of splitting a complex structure to smaller sub-parts.
We have seen it above already:
match([x, y], [1, 2]); // true print(x, y); // 1, 2
In the example above we not only checked whether the value [1, 2]
is matched to the pattern [x, y]
, but also the pattern allowed us to have smaller sub-parts of the [1, 2]
structure. I.e. we destructured it to x
and y
parts, that we can use further.
The most cool thing, that a destructured sub-part pattern variable name, can be reused to match patterns that contain the same variable name:
// Bind x to 10 at destructuring. match([x, 20], [10, 20]); // true // And then already use it as a value. match([10, _], [x, x]); // true
Notice, that destructuring may present in pattern matching or may not. Sometimes there is a confusion here: one can confuse destructuring and pattern matching topics, assuming that destructuring is pattern matching. It is not as we mentioned, and the main task of pattern matching is exactly to check whether a value is matched to the pattern. And destructuring is just a convenient consequence here. And vice-versa: a destructuring implementation may exist without a strict matching check. Although, almost all implementations of pattern matching today imply destructuring as well.
Regular pattern matchings
The regular pattern matchings (or better known as just regular expressions) are usually not directly participate in “classical” pattern matching topics, but have the same roots, and also can be considered as a sub-type of pattern matching.
Let’s look at the example:
'abc'.match(/a(.)c/);
In the example above we matched value 'abc'
to the pattern /a(.)c/
, and even destructured group that contains "b"
when the pattern is applied.
By the way, this is what we referred to when was mentioning at the beginning that pattern matching can present in our every day programming, even if we may not notice it: regular expressions are just a special case of pattern matching.
In real languages
Let’s take a look at couple of examples in real languages. Often, this match
function we used above, is implemented as a simple “assignment” operator (in fact, it’s not an assignment, but rather the pattern matching operator). Consider the following Erlang code:
X = 1, X = X, 1 = 1, 1 = X.
The first statement binds X
to 1
, and further three already check matching of the right and left hand sides.
Another Erlang example with case
alternative matching:
case X of 1 -> print('matched1'); _ -> print('Anything else') end.
JavaScript (starting since ES6) uses non-strict pattern matching, that allows only to destructure needed complex structures, but doesn’t do actual matching check:
[x, y] = [1, 2]; console.log(x, y); // 1, 2 [x, y] = [1, 2, 3]; // works too, non-strict console.log(x, y); // 1, 2
Pattern matching and destructuring present in many languages such as Erlang, Haskel, JavaScript (non-strict), CoffeeScript (also just non-strict destructuring), and many others.
On practice
On practice, destructuring and pattern matching can be used for elegant solutions to split a complex object (i.e. with one var
statement to define three variables):
var point = {x: 1, y: 2, z: 3}; var {x, y, z} = point; console.log(x, y, z); // 1, 2, 3
When is used with function parameters, it’s even more convenient:
function initPoint({x, y, z}) { ... } initPoint({x: 1, y: 2, z: 3});
Or, e.g. we received a response from a server, and want to run appropriate handling procedure, based on its format:
case Response of {http, Body} -> handle_body(Body); {tcp, Stream} -> handle_stream(Stream); Raw -> handle_raw(Raw) end.
Conclusion
Hope this small article dispelled the mystery around the pattern matching topic, and it became more clear. To recap, regardless which type of matching we use: complex objects with destructuring, simple values, or regular matchings, at the end of the day, it’s reduced to very simple check: whether a value matched to a value, or a value matched to a pattern. There are no other cases here. Everything else is just a syntax of concrete languages.
Written by: Dmitry A. Soshnikov
Published on: Aug 16th, 2014
Thanks Dmitry,I hope you blog regularly.Yours is one of the best blog where each article encapsulates a wealth of information in a succinct and clear way.
@crackerplace, thanks, glad it’s useful.
I am really glad that you are back with your blog entries!
Everything what I’ve read so far on your site helped me to finally understand all these tiny details of JavaScript.
You really have a talent to explain difficult stuff. Well done!
A little bug: missing comma at
match([x, y] [1, 2]); // true
It should be:
match([x, y], [1, 2]); // true
Cheers!
@Miłosz Chmura, thanks for reading, and for the typo — fixed! 🙂
I’ve written a small strawman proposal for pattern matching in ES7+. Check if out here: http://kahnjw.com/posts/6/
@Jarrod Kahn: Yeah, strict pattern matching (similar to Erlang’s or OCaml’s) would be nice to have in ES7.
Wow!!! This is such a superb write up on pattern matching! Thanks a bunch for this article. You do have talent to explain complex things in simple terms. 🙂 You are a good Destructurer 🙂 🙂 Gonna wait for your automata series on YouTube too. 🙂 Your site is like a gem of knowledge – full of diamonds and gold and it makes people like me greedy lol Thanks again for superb write up, especially the match case examples;)
@Viren, thanks for the kind feedback, glad it’s useful, and glad to see more people interested in deeper engineering topics.