What if we wanted to execute a block of code again? What if we wanted to execute it (say) 12 times? What if we wanted the user to specify the number of times to execute?
Of course this could be immensely useful. Want to draw a 12-sided regular polygon? The obvious strategy: draw a segment and turn, draw another segment and turn ... twelve times total. How do we do this? Well, the most inelegant way is to write the code for a segment and a turn and then copy/paste 11 times. Ugly! The better way (if only it were possible) would be to execute the segment/turn code 12 times.
Well, Python's gotcha covered! It provides us with a way to repeat execution of a block and to specify at the start just how many times to repeat. This is called iteration. More precisely, it's called definite iteration. In definite iteration we repeat execution a set number of times. In indefinite iteration, the number of repetition is not set at the start. We'll encounter indefinite iteration in the next chapter.
Before I reveal just how Python lets us iterate, I must introduce a fourth data type, the list type. In Python, iteration and lists are like pb and j. They go great together! Indeed, iteration in Python requires use of a list-type object, and for us the list-type object will just be a list.
Know that here in the Iteration chapter, I'll just give you the list basics. There's lots more that can be said. Indeed a later chapter (Chapter 6) is devoted to lists.
We've seen three data types - float, int and str. Now we add a fourth, the list type. Like str, list is a sequence type. A str is a sequence of characters. Of what is list a sequence? Anything. Anything at all - float, int, str or list. We can mix too. Lists are versatile!
How do we create a list? With square brackets; and within those square brackets, we place the elements of the list separated by commas. This is best illustrated in the console. Here for your inspection are four lists. (Recall that when Python cannot simplify an expression, it simply repeats it back. As I said, we call such an expression a literal.)
>>> [1, 2, 3]
[1, 2, 3]
>>> ['a', 'b', 'c']
['a', 'b', 'c']
>>> [1, 'a', 2, 'b']
[1, 'a', 2, 'b']
>>> [1, 'a', [2, 3], ['b', 'c']]
[1, 'a', [2, 3], ['b', 'c']]
The first list is a list of integers. The second is a list of strings. The third is a mix of integers and strings. The fourth shows just how versatile lists are; here we have a list that contains an integer, a string, a list of integers and a list of strings.
Need I say that we can assign a list to a variable? Well, an object of any type can to assigned to a variable, and so of course lists can be assigned to variables.
>>> mixed_list = [1, 'a', [2, 3], ['b', 'c']]
>>> mixed_list
[1, 'a', [2, 3], ['b', 'c']]
Before I show how lists are used in definite iteration, I should point out that a list with no elements at all is quite possible.
>>> empty_list = []
Python is quite happy to create this list for us. (As the variable name suggests, we often say that such lists are empty.) Now you might well wonder why we'd ever want a list like that. Isn't the point of lists to store multiple values? Well yes, yes it is. But as we'll see in Chapter 6, we often build lists, like say a list of primes or a list of capitalized words. How do we do that? We begin with an empty list and add elements to it! Expect much more about this in Chapter 6.
Now on to definite iteration. We begin with an example.
for item in [2, 3, 5, 7, 11]:
print(item, item ** 2)
If we run it, we'll get this:
2 4
3 9
5 25
7 49
11 121
Apparently the print function was called five times. How does that work? Let's break it down.
We have two new keywords: for and in.
Between the keywords we have a variable. The choice of variable is ours. I chose item . This is the loop variable. (I'll explain why we say "loop" in a moment.)
After in we have a sequence. Any sequence will do. I chose a list of integers. We could also have used a string.
After the sequence we have a colon. As always, this means that we're at the end of the line and that a block of indented code will follow.
The body of the iteration follows the colon. The body will be executed once for each element of the sequence. The first time through, item takes as its value the first element of the sequence. The second time through, item takes as its value the second element of the sequence. Etc. This is what loop variables do - they march through the list. First time through, they take the first item in the list. Second time, they take the second. Etc.
Thus the first time we execute print(item, item ** 2), the value of item is 2 and so we print 2 4. The second time we execute the print(item, item ** 2), the value of item is 3 and so we print 3 9. This continues until item takes its final value, which is 11; and when item is 11, we print 11 121.
We call the iteration definite because we know at the start the number of times we will iterate - once for each element of the sequence. Is this case the body of the loop (that single print command) will be executed 5 times.
The block of intended code under the for - the loop body - is executed again and again. It loops! This is such a good description that I'll often say "for loop" instead of "definite iteration".
Like function calls, for loops redirect flow of control. Let's number lines and then trace flow of control.
1. for n in [1, 2, 3, 4]:
2. s = n ** 2
3. print("the square of", n, "is", s)
Output:
the square of 1 is 1
the square of 2 is 4
the square of 3 is 9
the square of 4 is 16
Flow of control goes 1, 2, 3, 2, 3, 2, 3, 2, 3. Thus the body of the for loop (lines 2 and 3) execute a total of 4 times. Loopy!
Above we iterated over a list. But we can iterate over any sequence type. Like str.
Program:
for char in "Hello, World!":
print(char)
Output:
H
e
l
l
o
,
W
o
r
l
d
!
You understand I'm sure. The loop variable in this case in char (short for "character"). The first time through the loop, the value of char was 'H'. The next time it was 'e'. We continue on until the value of char is the final character in the string - the '!' - and then we are done. So don't be surprised if when we encounter some new sequence data type, we'll be able to iterate with for through the elements of sequences of that type. That's just what for is built to do.
Now, in this chapter I won't actually ask you to iterate through any non-lists. We'll do lots of that in Chapter 5. But I did want you to know that it's possible.
Lists of ints are immensely useful. Definite iteration often uses them. Python provides us with a way to create int lists easily. The tool is the range function. Here's the syntax:
range(start, stop, step)
start, stop and step are integers. The list created begins at start. The difference between successive elements is step. The final element is not stop. Instead the final element of the list created is that integer such that the next after it would equal or exceed stop. Let's have examples. (You'll no doubt notice the use of the list function below. What's up with that? It converts a range-type object into a list-type object. But didn't I say that ranges were lists? Yes, I did; and I'll continue to say it. Strictly it isn't true, but it's near to the truth. Strictly ranges and lists are different types; but in the context in which we'll use ranges - definite iteration - they behave precisely as do lists.)
>>> list(range(1, 11, 1))
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10] # begin at 1, step up by 1, continue until 10
>>> list(range(0, 30, 5))
[0, 5, 10, 15, 20, 25] # begin at 0, step up by 5, end at 25
>>> list(range(0, 10, 3))
[0, 3, 6, 9] # begin at 0, step up by 3, end at 9
>>> list(range(0, -10, -1))
[0, -1, -2, -3, -4, -5, -6, -7, -8, -9] # begin at 0, step by -1, end at -9
I bet the the last one - the one with the negative step value - surprised you. You probably assumed that ranges have to go up. They don't. They can go down too.
Let me emphasize the point made about the stop value. We never reach or exceed it. Instead we go until the next step would give us the stop value or go past it. The list generated by range never contains the stop value or any value beyond it! Note that this was true even when the list generated goes down, like with range(0, -10, -1).
Above we have examples where we sent the range function three integers. But we can also send it either two or just one. If we send three, then as we said the first is start, the second stop and the third step. If we send only two, the first is start, the second is stop and step takes a default value of 1.
>>> list(range(1, 10))
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> list(range(9, 19))
[9, 10, 11, 12, 13, 14, 15, 16, 17, 18]
The first is equivalent to list(range(1, 10, 1)); the second is equivalent to list(range(9, 19, 1)).
If we send one value to range, that value is stop; start defaults to 0 and step defaults to 1.
>>> list(range(10))
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
This is equivalent to list(range(0, 10, 1)).
Note that range(10) thus contains 10 elements - if we begin at 0 and count up to 9, we have 10 total elements. In general, range(n) has precisely n elements.
I'll end with a final note about the range function. Yes, the range function creates lists. But it need not create a list that actually contains anything. Instead, it's very possible for it to create the empty list!
>>> list(range(1, 1, 1))
[]
This is a list whose elements begin at 1, go up by 1, and end before we reach 1. Ask yourself: What are the integers that begin at 1 but are all less than 1? Answer: there are not integers like that! So the list must be empty. Note this carefully. I bet you'll unintentionally write code that creates an empty list and then wonder why it doesn't work as intended. (I've done that many, many times.) Keep in mind that this can happen.
Now let's put what we've learned to work on some real problems. Here's our first task: for a given positive integer n, find the sum of the first n positive integers. If for instance n is 4, we wish to compute 1 + 2 + 3 + 4.
This is a job for iteration! Why? We do the same operation repeatedly! Our sum begins with 1. We add 2 to get 3. We add 3 to get 6. Finally we add 4 to get 10. See the pattern? Get the next integer and add that into the sum already computed. Iteration is the tool when we have a process that occurs repeatedly.
Here's how we code that up. (We're still function writers, so a function we will write.)
def sum_ints(n):
# return the sum of the first n positive integers
curr_sum = 0
for next_int in range(1, n + 1):
curr_sum = curr_sum + next_int
return cur_sum
A few function calls:
>>> sum_ints(0)
0
>>> sum_ints(1)
1
>>> sum_ints(4):
10
That's right! sum_ints(4), for example, should be 1 + 2 + 3 + 4, which indeed is 10.
Let us analyze the code:
The function needs an integer as input (which of course will be stored in the parameter n). First we gave it 0, then we gave it 1 and last we gave it 4.
The variable curr_sum holds the (surprise, surprise) current sum. This sum grows as we iterate; and when the iteration is complete, it holds the value the function will return. Obviously curr_sum should begin at 0 and then should increases as we add in successive integers.
Consider the third function call, when n was 4. We have range(1, 5) which as you know generates the list [1, 2, 3, 4]. So next_int grabs those four ints in succession.
The body of the iteration is curious - curr_sum occurs on both the left right of curr_sum = curr_sum + next_int. How does that work? As variable assignment statements always work! Evaluate the right and then assign the value that results to the variable on the left. When the right is evaluated, we of course pull the current value of curr_sum. We add to that next_int. Finally the value that results is assigned to curr_sum, which means of course that cur_sum is updated.
Let's trace the code when 4 was sent to sum_ints. curr_sum was first initialized to 0 before the loop began. The first time through the loop, we added 1 to that and made the result the new value of curr_sum. So cur_sum became 1. The second time through the loop, we added 2 to curr_sum and it became 1 + 2 = 3. The third time through we added 3 to curr_sum and it became 3 + 3 = 6. The last time through, we added 4 to curr_sum and its final value was then 6 + 4 = 10.
The pattern we saw above - initialize a variable before a loop begins and then update the value of that variable while in the loop - is common. Very common. Very common indeed. So very common indeed. (You get the point. It's common.) Let's give another example.
When mathematicians speak of an iterative process, they mean one in which a value is passed to a function and then the output of the function is made the input of a second call to the function; and (probably) the output of the second call is made the input of a third. And so on. Perhaps to infinity. Consider the example of compound interest. Say we deposit $1000 into an account that returns 5% compounded yearly. Thus once per year (typically at year's end), the amount in the account will be increased by 5%. So at the end of year one, the amount will be 1000 + 1000 * 0.05 = 1050. At then at end of year two, the amount will be 1050 + 1050 * 0.05 = 1102.5. And so on. Notice that in year two, we add the interest not to the original amount deposited (which was $1000) but to the current amount in the account (which is $1050). (If you never knew what was meant by "compound interest", this is it. Each new year, you earn interest not just on the initial amount deposited but also on any interest that's been earned previously.) So we have an initial value for amount; and then we begin to iteratively increase that value. (We iterate once for each year.) Each time we increase, we take the current value of amount and increase it by 5%. This is the initialize/update pattern we saw above.
Let's write some code.
def compound_interest(init_deposit, rate, years):
amount = init_deposit # initialize the amount in the account
for year in range(years):
amount = amount + amount * (rate/100) # rate/100 converts rate to a decimal
return round(amount, 2) # round to two digits after the decimal point
Now let's run:
>>> compound_interest(1000, 5, 1)
1050.0
>>> compound_interest(1000, 5, 2)
1102.5
>>> compound_interest(1000, 5, 10)
1628.89
>>> compound_interest(1000, 5, 50)
11467.4
>>> compound_interest(1000, 10, 50)
117390.85
So $1000 deposited at 5% becomes almost $11,500 after 50 years; and at 10% it comes almost $120,000! Do you understand now why people care so much about what might seem like small differences in rate of return? Double the rate gave over ten times the return.
But I'm not here to give financial advice. I'm here to teach you a little about iteration, and this is just such a good example. We set an initial value. We enter a loop, and each time through the loop we update the value. That's the initialize/update pattern.
I'll end the section with a final point about the compound_interest function. We never actually used the loop variable year in the body of the loop. (Compare to the sum_ints function above. There we did use the loop variable, which was next_int. We did so in the line cur_sum = cur_sum + next_int.) In the compound_interest function, the sole purpose of the loop is to make repeated application of a formula; the value of the loop variable was not needed.
As I said above, a list can be built of lists. Below is a list of lists of integers.
>>> [[1, 2], [3, 4], [7, 6, 8]]
[[1, 2], [3, 4], [7, 6, 8]]
Let's say we have a list of integer lists like the one above and we want to find the sum of all the integers buried inside. We'd need to iterate through the outer list - the list that contains integer lists. But for each integer list, we'd need to iterate through its member integers. So we'd need to iterate inside an iteration. But that's cool! We can iterate inside an iteration, much as before we could call a function inside a function.
Let's code this up. Get this in your editor:
def sum_lst_of_lsts(outer_lst):
sum = 0
for inner_lst in outer_lst:
for num in inner_lst:
sum = sum + num
return sum
Run it and hop over to the console.
>>> sum_lst_of_lsts([[1, 2], [3, 4], [7, 6, 8]])
31
Success! 1 + 2 + 3 + 4 + 7 + 6 + 8 = 31.
How does this work? Let's let Python help answer. Modify the code above (again, line numbers are for reference only - don't type them in):
1. def sum_lst_of_lsts(outer_lst):
2. sum = 0
3. for inner_lst in outer_lst:
4. print("current inner_lst:", inner_lst)
5. for curr_int in inner_lst:
6. print("current int:", curr_int)
7. sum = sum + curr_int
8. return sum
Let's call it.
>>> sum_lst_of_lsts([[1, 2], [3, 4], [7, 6, 8]])
current inner_lst: [1, 2]
current int: 1
current int: 2
current inner_lst: [3, 4]
current int: 3
current int: 4
current inner_lst: [7, 6, 8]
current int: 7
current int: 6
current int: 8
31
Please do read the output. Ask yourself why it is at it is.
Now let's analyze:
The for loop that begins on line 3 - the outer loop - iterates through the lists in outer_lst . Its loop variable inner_lst thus took on the value [1, 2], then [3, 4] , then [7, 6, 8].
The body of the outer loop extends from lines 4 to 7. That's the block of code that executes once for each value of inner_lst.
Part of the body of the outer loop is a second, inner loop. This is the for loop that begins on line 5. The body of the inner loop consists of lines 6 and 7. The loop variable of that inner loop is curr_int.
The inner loop executes once, from start to finish, for each value of outer_lst. So when outer_lst is [1, 2], curr_int is first 1 and then 2; and then when outer_lst is [3, 4], curr_int is first 3 and then 4; and finally when outer_lst is [7, 6, 8], curr_int is first 7, then 6, then 8.
Pay close attention to the indentation. We have four levels: none for the function header, four spaces for the body of the function, eight for the body of the outer loop, twelve for the body of the inner loop.
I have a little question intended to test whether you understand iterated iteration. Consider the short little program below:
s = 0
for i in range(4):
for j in range(9):
s = s + 1
print(s)
What does it print? (Please pause and work it out. When you have an answer, type the program in and run to verify.)
If you're stumped, here's the explanation:
Loops are loopy. We loop through their bodies over and over; and the number of times we loop is the same as the length of the sequence that comes after in. So the outer loop above (the i loop, we'll say, since its loop variable is i) loops 4 times.
But the body of the inner loop (the j loop) is itself a loop! And so each time it's executed, it will loop 9 times.
So we do something four times, and in each of those four, we do 9 things; that is, we do 9 things 4 times over. That's 36 total things! So the value of s is increased by 1 a total of 36 times, which, since it starts at 0, makes its final value 36. It's a bit like making, say, 4 pizzas, where to make one pizza you have to do 9 separate things - like roll out the dough, spread the sauce, etc. (I'll pretend it's 9 things.) So if you've made 4 pizzas, you've really done 36 things!
You see, then, that iterated iteration is closely linked to multiplication. 4 * 9 = 36, which is the value of s printed by the program above. 4 * 9 is "doing" 9 things 4 times over, which is 36 total.
Here's a curious little sequence:
1 / 1 + 2 / (1+4) + 3 / (1+4+9) + 4 / (1+4+9+16) + ...
You see the pattern, don't you? Each denominator is a sum of perfect squares that begins at 1 and ends at the square of the numerator.
This is a sequence, and like all sequences it screams iteration. We have an initial term: 1 / 1. And then we add in another: 2 / (1 + 4). And then another: 3 / (1 + 4 + 9). As long as we want. That's a process repeated with variation. That's what iteration is!
Now consider a denominator. The third one perhaps: 1 + 4 + 9. That's a sequence too, so the screams begins again. Iteration! Iteration! Iteration! So here's what we've got: at each stage of the prior (call it the "outer" iteration) we have a complete "inner" iteration to perform. So for instance when we reach 3 in the outer iteration, we have to iterate through 1, 4 and 9 in the inner. So what we have here is iterated iteration again! A for loop inside a for loop.
So let's write some code. Read it. Carefully. You should be able to follow. (I'm rather proud of the variable names. I think that they're so clear that comments aren't really necessary.)
def curious_little_sequence(n):
# compute 1/1 + 2/(1+4) + ... + n/(1+4+9+...+n**2)
seq_sum = 0
for numer in range(1, n+1):
denom_sum = 0
for denom_term in range(1, numer+1):
denom_sum = denom_sum + denom_term ** 2
seq_sum += numer / denom_sum
return seq_sum
In the curious_little_sequence function of the previous section, we had this line of code: denom_sum = denom_sum + denom_term ** 2. Python allows us to shorten that to: denom_sum += denom_term ** 2. Those two lines mean precisely the same.
This is our first example of syntactic sugar. Syntax like a = a + 1 can be replaced by the shorter a += 1. Sweet, is it not?
It works for the other arithmetical operations too: a -= 1 means the same as a = a - 1, a *= 2 means the same as a = a * 2 and a /= 2 means a = a / 2. I'll use these abbreviations form here on.
In the previous example, the stop value for the inner loop depended on the current value of the outer loop variable. Here was the inner for loop header:
for denom_term in range(1, numer+1):
numer was the outer loop variable. The result was that, each time we return to the inner loop, we iterate through more values than we did before.
Let's have another example of variable length inner loops. In the example below, the start value of the inner loop depend on the value of a outer loop variable. Indeed we'll have a loop inside a loop inside a loop, and each of the two inner loops will have start values that depend of the value of the loop variable of the loop immediately above.
So here's the problem: given some positive integer n, let's count the number of positive integer sequences (a, b, c) where a ≤ b, b ≤ c and c ≤ n.
If n = 1, we have only one sequence, namely (1, 1, 1).
If n = 2, the possible sequences are: (1, 1, 1) (1, 1, 2) (1, 2, 2) (2, 2, 2). That's 4 total.
If n is 3, the possible sequences are:
(1, 1, 1) (1, 1, 2) (1, 1, 3)
(1, 2, 2) (1, 2, 3)
(1, 3, 3)
(2, 2, 2) (2, 2, 3)
(2, 3, 3)
(3, 3, 3)
That's 10 total. (You might wonder why I laid it out like this. Roll with it for now. I'll come back to the question in a bit.)
If n is 4, the possible sequences are:
(1, 1, 1) (1, 1, 2) (1, 1, 3) (1, 1, 4)
(1, 2, 2) (1, 2, 3) (1, 2, 4)
(1, 3, 3) (1, 3, 4)
(1, 4, 4)
(2, 2, 2) (2, 2, 3) (2, 2, 4)
(2, 3, 3), (2, 3, 4)
(2, 4, 4)
(3, 3, 3) (3, 3, 4)
(3, 4, 4)
(4, 4, 4)
That's 20 total.
Our goal is to write a function that will, for any input positive integer n, return the number of possible sequences consistent with our rule. We're asked to construct a triple of values. So it seems natural to have a for loop inside a for loop inside a for loop. Let's agree that the outermost loop variable will be a, the middle loop variable will be b and that the innermost loop variable will be c.
The first or outermost loop will generate the first value in each triple. What should it's range be? 1 to n. The next loop generates the middle value. What should it's range be? Well, that's variable. It doesn't always start at 1. Since b must be greater than or equal to a, it should begin at the current value of a and then of course go to n. What finally should be the range for the innermost loop? As before, that's variable. Since c must be greater than or equal to b, the range should be b to n.
So that yields the code below. Note that we have a count variable that increments by 1 each time we complete a triple, which of course happens in the body of the innermost loop.
def inc_seq_count(n):
count = 0
for a in range(1, n+1):
for b in range(a, n+1):
for c in range(b, n+1):
print("(", a, " ", b, " ", c, ")", sep = "", end = " ")
count +=1
print()
return count
You'll notice that I put a print statement in the body of the innermost loop. That's so we can see, for each increment of count, the current values of a, b and c. The first of the print statements sets the value of end to a single space; that means it won't hop to a new line. The second print statement is just the empty print(); all it does is advance to a new line. So, we don't advance to a new line until the innermost loop has finished up. Below is the result:
>>> inc_seq_count(4)
(1 1 1) (1 1 2) (1 1 3) (1 1 4)
(1 2 2) (1 2 3) (1 2 4)
(1 3 3) (1 3 4)
(1 4 4)
(2 2 2) (2 2 3) (2 2 4)
(2 3 3) (2 3 4)
(2 4 4)
(3 3 3) (3 3 4)
(3 4 4)
(4 4 4)
20
If you look back up on the section, you'll see that that's how I initially reported the possible sequences for n = 4. That's why I did that!
I'll end with a few remarks about print statements in the bodies of your functions.
I have a final task for you before you begin the function set. Get acquainted with the euclid module. I provides us with a most wonderful way to visualize iteration. You'll use it in the function set.