I hope you found the first problem set painful. At least a little. I wanted you to want a better way. A way to economize your code. A way to capture the similarity of multiple blocks of code in a single block.
In this chapter, you'll get what you want. At the end, we'll return to the first problem set and do it a better way.
Functions are the better way. Below is our first Python function! How cute it is!
def double_me(a_num):
    double = a_num * 2
    return double
Let's break it down. (This all might seem a little mysterious. But it's best to lay it all out to begin. I promise that, by the time you're done with the chapter, it will all be clear.)
The first line of a function is called its header.
The header must begin with the keyword def. Keywords are predefined words. You don't choose how to spell or use them. Python chooses for you.
After the keyword def comes the name of the function. The name is your choice, though you do have to follow the rules for valid names. Pick a name that describes what the function does.
After the function name comes a set of parentheses. Inside them we place a name or names (which once again are our choice). What are these names? They will become the names of the function's inputs when the function is called. Input names in the header are called parameters.
After the closing parenthesis comes a colon. This is the end of the header.
If a line ends with a colon, the line (perhaps lines) under it must be indented. Python groups lines of code by indentation. Python doesn't care how much you indent. (Four spaces is most common. Here's what the Python Style Guide has to say about that.) Just keep it consistent.
The indented code chunk that comes under the header is called the function body.
On the first line of the body a_num is doubled and the value that results is assigned to double.
The second (and last) line of the body begins with the keyword return. The value of the expression that follows return is the function's output. Thus the value of the variable double is the output of the function.
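To tie the vocabulary to the code, here is the same function once more with each part labeled. The comments are my additions for illustration only; the code itself is unchanged.

```python
# Header: the keyword def, the function name, the parameter in parentheses, a colon.
def double_me(a_num):
    # Body, first line: double the parameter and assign the result to double.
    double = a_num * 2
    # Body, second line: the value of the expression after return is the output.
    return double
```

Everything indented under the header belongs to the body. Lines that begin with # are comments, which Python ignores.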
Type the double_me function into your editor. Exactly as you see it. Your friends tolerate your grammatical improprieties. Python will not. (I assume your double_me function is contained in a file named main. If it's not, you'll use your file name below.)
Now open up a Python console. (Recall that you know you're in the Python console when you get the >>> prompt.) In the console, type this:
>>> from main import double_me
You should find yourself with the Python console prompt >>> again. The console will seem unchanged, but it is not. It now knows your function. Try this at the prompt:
>>> double_me(3)
The console should respond 6. What happened here? You called double_me with the input 3. What happened when you did that? First, the function's one parameter - a_num - was assigned the value 3. That's what always happens to inputs; they become the values of parameters. Next the body of the function was executed. The first line of the body is double = a_num * 2. But as we said, the value of a_num is 3; so double gets the value 6. On the second (and last) line of the body, we return the value of double. So this call to double_me returns 6. Summary: the return value of double_me when called with the argument 3 is 6.
The values sent to a function when it's called, the inputs we called them, are arguments. So we called double_me with the argument 3, and the value of that call (that is, the value returned) is 6.
Let me emphasize a point. We know that the Python console prints the value of an expression. So the value of double_me(3) is 6, which as we know was the value returned. The value of a function call is the value it returns.
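Since a function call has a value, it can sit anywhere a value can sit. Here's a minimal sketch of that idea; the variable name total is my own choice and appears nowhere else in the chapter.

```python
def double_me(a_num):
    double = a_num * 2
    return double

# The call double_me(3) has the value 6, so it can be part of a larger expression.
total = double_me(3) + 1
print(total)  # prints 7
```

Python first evaluates the call, replaces it with the value returned, and then finishes the addition.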
Of course we can call the function as often as we like.
>>> double_me(6)
12
>>> double_me(0)
0
>>> double_me(-2)
-4
Perhaps then you begin to understand why functions are so useful. We don't have to duplicate code to get different values. Instead we can just call the same function again with different inputs. I'll say more about this at the end.
We may also call the function from within a program that contains the function. (I've included line numbers. Don't type them in; they're not part of the program. Likely your IDE shows line numbers automatically.)
1. def double_me(a_num):
2.     double = a_num * 2
3.     return double
4.
5. doubled_num = double_me(3)
6. print(doubled_num)
If you run the program, it will output the value of doubled_num.
A few notes:
The function must be defined before it's called. The function definition begins on line 1.
The function is not actually executed until it's called. The function call occurs on line 5; the argument of that function call is 3.
When the function is called, the parameter a_num takes the value 3. Lines 2 and 3 - the function body - are then executed.
On line 3, the value of double is returned. This value is then assigned to doubled_num on line 5, for as we know, the value of a function call is the value it returns.
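The first note deserves a demonstration. In the sketch below, triple_me is a hypothetical function invented for this illustration, and the try/except lines (which we haven't covered) are there only to keep the program from halting when the early call fails.

```python
# Calling a function before its def has been reached raises a NameError.
try:
    triple_me(3)
    called_too_early = False
except NameError as err:
    called_too_early = True
    print("Too early:", err)


def triple_me(a_num):
    return a_num * 3


# Now that the definition has been executed, the same call succeeds.
print(triple_me(3))  # prints 9
```

The first call fails because, at that point in the program, Python has not yet seen a definition for the name triple_me.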
We have distinguished function from function call from function value. This perhaps all seems a little subtle. Perhaps unnecessary. It is not. These distinctions are crucial to TF! Let me make sure all this is clear.
The function is the block of code that begins with def and ends with return. It is the recipe. It tells us what ingredients we need (the parameters) and what to do to them (the rule, as given in the body of the function). In the program above, the function spans lines 1 - 3.
But of course you can have a recipe you never make! A function call makes the recipe, i.e. it performs the function rule with some input and thereby produces an output. We call a function when we write its name followed by parentheses. Within those parentheses we provide values for its parameters. In the program above the function call occurs on line 5.
The value of a function call is the value it returns. The value of the double_me function for the input 3 is 6.
The flow of control of a program is the order in which the statements within it are executed. In the simple programs you wrote in the last chapter, flow of control proceeded from top to bottom. But in the previous section flow of control was not so simple. Look again at the program of the previous section. Flow of control there is 1-5-2-3-6.
I should walk you through this. On line 1, Python notes that a function named double_me exists but does not yet execute that function. Flow of control thus skips 2 and 3 for now and goes to 5. On line 5 the function double_me is called. Flow of control is now passed to the body of the function and lines 2 and 3 are executed. Once the function body has executed, flow of control passes to line 6, the line under the function call; and with this the program is complete.
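If you'd like to watch flow of control for yourself, one way is to sprinkle print calls through the program. This is only a sketch; the messages are my own, and the extra lines mean the line numbers no longer match the program discussed above.

```python
def double_me(a_num):
    print("inside the body of double_me")
    double = a_num * 2
    return double

print("definition done, body not yet run")
doubled_num = double_me(3)
print("back from the call with", doubled_num)
```

When run, the message from inside the body appears second, not first, even though it sits higher in the file. The body waits until the call.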
In our example above, the expression after return was a variable. This is not necessary. We could if we wish place an expression whose value must be computed. Like this:
def square_me(a_num):
    return a_num ** 2
Python will evaluate the expression after return - in this case a_num ** 2 - and then will return the value that results. Indeed this is precisely what it does when after return we find only a variable. The variable is evaluated and the value that results is returned. (You no doubt recall that the value of a variable is the object that it names.)
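To see that the two styles come to the same thing, compare the pair below. The second function, square_me_named, is a hypothetical variant I wrote for the comparison; it is not used elsewhere.

```python
def square_me(a_num):
    # The expression after return is evaluated, and its value is returned.
    return a_num ** 2

def square_me_named(a_num):
    # Same computation, but the value is given a name before it is returned.
    square = a_num ** 2
    return square

print(square_me(5))        # prints 25
print(square_me_named(5))  # prints 25
```

Which style to use is a matter of readability. A well-chosen name can explain what a value means; a short expression often needs no explanation.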
So far we've seen only functions of a single parameter. Let's have an example of a multi-parameter function.
def square_and_add(x, y):
    return x**2 + y**2
If we call this function, we must pass it two values.
>>> square_and_add(3, 4)
25
As you no doubt realized, order matters here. The 3 was bound to the x and the 4 to the y, since the x and the 3 come first and the y and the 4 come second.
Now of course in the function above, we'd have gotten the same output if the 3 had bound to the y and the 4 to the x. So let's have a non-symmetrical example to prove definitively that inputs are assigned to parameters based on position of input and parameter.
def square_and_divide(x, y):
    return x**2 / y**2
Let's use 3 and 4 again:
>>> square_and_divide(3, 4)
0.5625
>>> square_and_divide(4, 3)
1.7777777777777777
The first function call computed the value of 3**2 / 4**2, the second the value of 4**2 / 3**2.
Functions are like Lego. We can stick them together and so make functions of greater complexity.
One way to stick functions together is composition. In function composition, the output of one function is made the input of a second function. Let's look at an example. We begin with the square root function.
def sq_rt(n):
    return n ** (1/2)
We'll also need a sum of squares function.
def sum_squares(x, y):
    return x**2 + y**2
Now let's use these two blocks to build a distance function. The function below computes the distance from the point with coordinates (x1, y1) to the point with coordinates (x2, y2).
def distance(x1, y1, x2, y2):
    return sq_rt(sum_squares(x2 - x1, y2 - y1))
To call the function, we must send it four values. These give the x- and y-coordinates of two points; (x1, y1) is the first point, (x2, y2) the second.
>>> distance(1, 2, 4, 7)
5.830951894845301
Here's what happened.
In the function call, 1 was bound to x1, 2 to y1, 4 to x2 and 7 to y2.
Now consider that rather complex statement return sq_rt(sum_squares(x2 - x1, y2 - y1)). Begin with the inner piece sum_squares(x2 - x1, y2 - y1). x2 - x1 had the value 3 and y2 - y1 the value 5. So sum_squares(x2 - x1, y2 - y1) returned 3**2 + 5**2, or 34.
sq_rt then took that 34 and returned its square root.
This is the composition of which I spoke above. The output value of sum_squares was made the input value of sq_rt.
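The walk-through above can be written out as code. In this sketch, distance_in_steps is a hypothetical variant of distance that names each intermediate value (inner and outer are my names) so that the order of evaluation is visible.

```python
def sq_rt(n):
    return n ** (1/2)

def sum_squares(x, y):
    return x**2 + y**2

def distance(x1, y1, x2, y2):
    return sq_rt(sum_squares(x2 - x1, y2 - y1))

def distance_in_steps(x1, y1, x2, y2):
    inner = sum_squares(x2 - x1, y2 - y1)  # the inner call is evaluated first
    outer = sq_rt(inner)                   # its output becomes sq_rt's input
    return outer

print(distance(1, 2, 4, 7) == distance_in_steps(1, 2, 4, 7))  # prints True
print(distance(0, 0, 3, 4))  # prints 5.0
```

The nested version and the step-by-step version do the same work in the same order. The nested one is just more compact.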
Just to prove a point, let's have an example of a pair of functions one of which calls the other. You'll do this lots. Indeed you often should. When a problem requires a complex solution, often that solution should be spread across multiple functions. What's the advantage of that? If we break the solution into multiple functions, each will be simpler and thus easier to write and to debug.
The example:
def rect(length, width):
    return length * width

def right_rect_prism(base_length, base_width, height):
    base_area = 2 * rect(base_length, base_width)
    lateral_area = 2 * rect(base_length, height) + 2 * rect(base_width, height)
    return base_area + lateral_area
Here we have two functions, rect and right_rect_prism. The first computes the area of a rectangle given length and width, and the second computes the surface area of a right rectangular prism given base length, base width and height. Most importantly here, the second calls the first. Indeed it does so three times in total. That's totally legit.
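Here is a quick check of the pair, using a prism that measures 2 by 3 by 4. The dimensions are my own choice for the illustration.

```python
def rect(length, width):
    return length * width

def right_rect_prism(base_length, base_width, height):
    base_area = 2 * rect(base_length, base_width)
    lateral_area = 2 * rect(base_length, height) + 2 * rect(base_width, height)
    return base_area + lateral_area

# Two 2x3 faces (12), two 2x4 faces (16) and two 3x4 faces (24).
print(right_rect_prism(2, 3, 4))  # prints 52
```

Notice that if the rule for rectangle area ever needed to change, we would change it in one place only, inside rect.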
Above, when we taught Python how to compute distance, we wrote a multi-function program. Let's take a moment to discuss a certain matter that is sure to arise in such a context. I do apologize, but it's a little complex. But better to have this explained to you now than write buggy code and have no idea why it's buggy!
Let's have a name first. I mean to discuss variable scope.
First type these two functions into your code editor and then run.
def circle_area(r):
    PI = 3.14159
    return PI * r**2

def circumference(r):
    return 2 * PI * r
Now let's call them:
>>> circle_area(3)
28.27431
>>> circumference(3)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in circumference
NameError: name 'PI' is not defined
We have a bug! Indeed we have a crash bug, a kind of bug that from here on I'll often call an "exception". (That's the most common name.) Moreover, Python did its best to explain just why it crashed. You'll find that Python is really quite good at that; it almost always explains the reasons for a crash, and explains them well. Read your error messages! Here at the start of TF!, you often won't understand everything Python says in its error messages; but in most cases, you'll understand as much as you need to diagnose the error. (In a moment, I'll explain what "Traceback" means.)
So, what's the reason for the crash? Python told us that in the second line of the function circumference (the line return 2 * PI * r), the name PI is not defined. But that seems plainly wrong! It was defined, wasn't it? There in circle_area! Why doesn't circumference seem to know it?
The answer is that a function creates its own universe of variables; and once a function call is complete, the variables created within it cease to exist.
The word used for this is scope. The scope of a variable is that part of a program that can access and change its value; and so the scope of a variable created inside a function is that function. In the two functions defined above - circle_area and circumference - PI lies outside the scope of circumference and so circumference cannot access its value. (Indeed by the time we call circumference, PI has ceased to exist. Google "garbage collection".)
Another term used here is local. PI is a local variable, local to circle_area; and as such its value is accessible only inside that function.
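Here's a small sketch that shows just how local PI is. After the call to circle_area finishes, the program itself tries to use PI and fails. The try/except lines (not yet covered) and the names area and pi_visible are my additions; they only keep the program from halting and record what happened.

```python
def circle_area(r):
    PI = 3.14159  # local: exists only while circle_area is running
    return PI * r**2

area = circle_area(3)  # works; PI is used inside its own scope

try:
    print(PI)  # outside the function, the name PI means nothing
    pi_visible = True
except NameError as err:
    pi_visible = False
    print("NameError:", err)
```

The value the function returned survives in area. The local variable that helped compute it does not.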
We can if we wish define a variable outside the scope of any function. If we do, that variable is global; and as such its value is accessible anywhere in the program. Below PI is a global variable.
PI = 3.14159

def circumference(r):
    return 2 * PI * r
If we now call circumference, it doesn't crash as it did before.
>>> circumference(1)
6.28318
Here's an inevitable question: what happens if we define a variable both inside and outside a function? Like this:
PI = 3.14159

def circumference(r):
    PI = 3 # the Indiana definition ;-)
    return 2 * PI * r
Well, let's try it out. If we send circumference the int 3, we get the int 18 back out. So, the local PI overrides the global PI!
Here's a summary of the scope rules:
If a variable definition occurs inside a function (local scope), that value is lost when the function finishes execution. It cannot be seen by another function; it cannot be seen by a later call of the same function.
If a variable definition occurs outside the scope of any function (global scope), that variable's value can be seen inside any function.
If a variable is defined both globally and locally, the local definition overrides the global within the function where it is defined.
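The three rules can be seen together in one short program. This is a sketch that reuses the chapter's names; the only thing I've added is the final print calls.

```python
PI = 3.14159  # global: defined outside any function

def circumference(r):
    PI = 3  # local: hides the global PI, but only inside this function
    return 2 * PI * r

def circle_area(r):
    return PI * r**2  # no local PI here, so the global one is used

print(circumference(3))  # prints 18
print(circle_area(1))    # prints 3.14159
print(PI)                # prints 3.14159
```

The last line is the important one. The assignment inside circumference created a new local variable; it did nothing to the global PI.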
But why? Why should the local definition override the global? Safety is the answer. What is a safe function? One that can affect the code around it only by the value(s) it returns. A function like that can have no unforeseen side effects. If however we let the body of a function alter the value of global variables, it can change the behavior of the code outside in quite unexpected and difficult-to-debug ways.
Watch this:
def sqrt(n):
    return n ** (1/2)

# Imagine lots of code here. Thousands of lines.

def distance(x1, y1, x2, y2):
    sum_squares = (x2 - x1) ** 2 + (y2 - y1) ** 2
    sqrt = sum_squares ** (1/2)
    return sqrt
If we let functions write to global variables, then a call to distance turns sqrt from a name of a function into a name of a number. Imagine the mischief that that might cause! You might reply of course that the programmer should keep track of her variables and not double-define sqrt. But that's a tall order! If your program consists of thousands of lines of code, this becomes difficult to do. Wouldn't it be better not to have to worry about it? Wouldn't it be better to know that your function can't overwrite a global variable?
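Happily, Python behaves the safe way here. The sketch below runs the two functions and then checks on sqrt afterwards; the test values are my own.

```python
def sqrt(n):
    return n ** (1/2)

def distance(x1, y1, x2, y2):
    sum_squares = (x2 - x1) ** 2 + (y2 - y1) ** 2
    sqrt = sum_squares ** (1/2)  # a local name; the global sqrt is untouched
    return sqrt

print(distance(0, 0, 3, 4))  # prints 5.0
print(sqrt(16))              # prints 4.0, so sqrt is still a function
```

Inside distance, sqrt names a number. Outside, it names the function it always did.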
There should be only one way into a function and only one way out. The way in: values of parameters. The way out: return value. Any other way that a function affects global state (state = values of variables at a particular moment in time) is likely to land the programmer in a world of hurt.
Remember the pain of the first problem set? The same code. Over and over. The only difference was the value of a variable. Surely we can do better. Surely we can economize. But how?
Write a function! (I know. The variable names are long. But they are clear. Very clear.)
def years_to_secs(age_in_yrs):
    secs_in_min = 60
    mins_in_hr = 60
    hrs_in_day = 24
    days_in_yr = 365
    age_in_secs = age_in_yrs * days_in_yr * hrs_in_day * mins_in_hr * secs_in_min
    return age_in_secs
Now we can run the function as many times as we want; and each time we can send any value for age_in_yrs we choose. Like this:
>>> years_to_secs(12)
378432000
>>> years_to_secs(144)
4541184000
>>> years_to_secs(1728)
54494208000
Isn't that great? We can use the function again and again but we don't have to write it out again and again!
The function years_to_secs captured the similarity of a set of similar blocks of code. The one point of difference - the value of age_in_yrs - became a parameter. This is a key idea in TF! Functionalize the similarities! Parameterize the differences! Memorize that. Word for word. I'll say nothing more important.
The process wherein we replace multiple similar code chunks with calls to a single function that captures that similarity has a name. It's called abstraction.
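Here is abstraction at small scale. The "before" half is only my guess at the shape of repeated code; the variable names age_in_secs_a and age_in_secs_b are hypothetical, and the function is a condensed version of years_to_secs.

```python
# Before: the same computation written out twice. Only the age differs.
age_in_secs_a = 12 * 365 * 24 * 60 * 60
age_in_secs_b = 144 * 365 * 24 * 60 * 60

# After: one function captures the similarity; the difference is a parameter.
def years_to_secs(age_in_yrs):
    return age_in_yrs * 365 * 24 * 60 * 60

print(years_to_secs(12) == age_in_secs_a)   # prints True
print(years_to_secs(144) == age_in_secs_b)  # prints True
```

The similarity went into the function body. The one difference went into the parentheses.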
I have a little secret. The arithmetical operators we used to build our functions are themselves functions. I know, I know. They don't look like the functions we've seen. But they are indeed functions. Consider addition. It takes two numbers, adds them and then returns the sum. This three-part structure - take, compute, return - makes it a function.
The same is obviously true of -, *, /, //, % and **. All take values. All perform a computation. All return a value. Thus all are functions.
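If you doubt me, Python's standard library has a module named operator that offers each of these operators in ordinary function-call form. A short sketch (the module is real; the particular numbers are my own):

```python
import operator

print(operator.add(2, 3))       # prints 5, the same as 2 + 3
print(operator.sub(2, 3))       # prints -1, the same as 2 - 3
print(operator.mul(2, 3))       # prints 6, the same as 2 * 3
print(operator.truediv(3, 4))   # prints 0.75, the same as 3 / 4
print(operator.floordiv(7, 2))  # prints 3, the same as 7 // 2
print(operator.mod(7, 2))       # prints 1, the same as 7 % 2
print(operator.pow(2, 3))       # prints 8, the same as 2 ** 3
```

Written this way, each operator has a name, takes arguments in parentheses and returns a value, just like double_me.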
So, the function we constructed above - like the double_me function - was built from functions.
Here's another little secret. The print command ... that's a function too! It takes an input - the expression after print contained in parentheses - and it outputs the value of that expression to the screen. It's all functions! Functions, functions, functions! Python gives us functions. We take them and build functions. Perhaps you begin to understand why I call my book "Think Functional!".
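A small sketch to make the point concrete. print is called exactly as our own functions are called: a name, parentheses, and an argument that is evaluated before the function does its work.

```python
def double_me(a_num):
    return a_num * 2

# The argument 2 + 3 is evaluated first; print then shows the value 5.
print(2 + 3)

# The argument may itself be a function call; its value, 6, is what print shows.
print(double_me(3))
```

In the second call, double_me(3) is evaluated first, and the value it returns becomes the input to print.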
I'll end with a little more about bugs.
Bugs, you will recall, are errors in code; and in the previous chapter, we distinguished two types - "stop bugs" and "go bugs" we called them. Stop bugs halt execution of the program at the place the bug was encountered; that is, they "crash" the program as it executes. Moreover, stop bugs are accompanied by error messages; Python will do its best to explain why it had to halt execution. Go bugs, on the other hand, do not crash the program. A program with only go bugs will run through to the end, but it won't in all cases produce the desired output. (As the jokesters often say, the program doesn't do what you wanted it to do. Instead it does precisely what you told it to. That always makes me giggle.)
In Chapter 1, we said that go bugs are often called logical or semantic bugs.
Now let's distinguish two types of stop bugs: these are the syntax errors and the runtime errors. A syntax error is one that violates Python's grammar rules; you've either used a symbol it doesn't know, or you've arranged symbols it does know in some illegal way. Or both, but syntax errors of the second type are by far the most common.
Below are some examples of syntax errors. Below each is a piece of the error message I received and for some a bit of commentary. For each, I simply typed the function into my text editor and then chose "Run in terminal". (I use Visual Studio Code. At replit.com, you type it in and then click the Run button.) I did not call the function!
def spam() # left out the colon
    return 'eggs'
Error message
File "g:\My Drive\Think Functional!\Programs\unit_tests\experiment.py", line 1
def spam() # left out the colon
^^^^^^^^^^^^^^^^^^^^
SyntaxError: expected ':'
Commentary
Please ignore the File "g:\My Drive\Think Functional!\Programs\unit_tests\experiment.py". That's just the file location; yours will be different. But everything after matters. line 1 tells you where the error occurred. SyntaxError: expected ':' is Python's attempt to diagnose the error. It's quite correct of course. Finally, the line of carets - ^^^^^^^^^^^^^^^^^^^^ - points to the place where Python was unable to understand what was meant. Since Python expected a colon after the close parenthesis but didn't find it, it doesn't know what the rest of the line after the close parenthesis means.
def spam():
    retrun 'eggs' # retrun?
Error message
File "g:\My Drive\Think Functional!\Programs\unit_tests\experiment.py", line 2
retrun 'eggs' # retrun?
^^^^^^
SyntaxError: invalid syntax
Commentary
This one isn't as helpful. But you shouldn't really expect more. Line 2 is pure nonsense. It's the name retrun (which to Python will be undefined), then a space, then a string. That means nothing to Python. You'll get invalid syntax when Python can't even begin to guess what you meant.
def spam():
    eggs = ((1 + 3) / (2 + 4) ** (1 / 3)
    return eggs
Error message
File "g:\My Drive\Think Functional!\Programs\unit_tests\experiment.py", line 2
eggs = ((1 + 3) / (2 + 4) ** (1 / 3)
^
SyntaxError: '(' was never closed
Commentary
No commentary necessary other than to say "no commentary necessary".
def spam():
    scrambled = 2
      fried = 3
    return scrambled + fried
Error Message
File "g:\My Drive\Think Functional!\Programs\unit_tests\experiment.py", line 3
fried = 3
IndentationError: unexpected indent
Commentary
There's really not much to say. The number of spaces used to indent the first line in a block of code must be the same as the number of spaces used to indent every line in that block.
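All four of these errors were caught before any of my code ran. Here's a sketch that makes that visible. The built-in compile function asks Python to check and translate the text of a program without executing it. The names source and found_syntax_error are my own, and try/except (not yet covered) is there only to keep the sketch from halting.

```python
# The text of a tiny program, with the colon deliberately left off the header.
source = "def spam()\n    return 'eggs'\n"

try:
    compile(source, "<example>", "exec")  # check and translate only; nothing runs
    found_syntax_error = False
except SyntaxError as err:
    found_syntax_error = True
    print("SyntaxError on line", err.lineno)
```

Python rejects the program while reading it. No function was defined, and certainly none was called.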
Now let's have some examples of runtime errors. We'll have two. Note that in each we don't get the error until we actually call the function.
def spam():
    scrambeld = 2
    fried = 3
    return scrambled + fried
Error Message
As I said, we don't get the error message when we run the file. Instead we have to call the function. I did so in the console and got the error message below.
>>> spam()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "main.py", line 4, in spam
    return scrambled + fried
NameError: name 'scrambled' is not defined. Did you mean: 'scrambeld'?
Commentary
The syntax here is fine; that's why we don't see the error until we call the function. The problem here is logical. A name hasn't been given a value. Please do be careful. Every occurrence of a variable after the first has to be spelled exactly as the first. (I would be remiss here if I didn't point out just how good the error message is. The "Did you mean" part is fantastic. Thanks, Python!)
def spam():
    scrambled = 2
    fried = 0
    return scrambled / fried
Error Message
>>> spam()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "main.py", line 4, in spam
    return scrambled / fried
ZeroDivisionError: division by zero
Commentary
Once again, we had to call the function to get the error; and that function call had us divide by 0. Can't do it in math. Can't do it in Python.
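Here's a sketch that shows the timing of a runtime error. Defining spam causes no complaint at all; only the call does. The try/except lines (not yet covered) and the name crashed are my additions, there only so the sketch can report the error and carry on.

```python
def spam():
    scrambled = 2
    fried = 0
    return scrambled / fried

print("spam is defined; no error yet")

try:
    spam()  # the division happens only now, when the body is executed
    crashed = False
except ZeroDivisionError as err:
    crashed = True
    print("ZeroDivisionError:", err)  # prints ZeroDivisionError: division by zero
```

Compare this with the syntax errors earlier, which Python reported before running a single line.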