Preparation#

A note on the reading material and preparation exercises: the reading material explains concepts, while the preparation exercises let you try them out. You can start with either. Some people read first; others begin with the exercises and consult the book as needed. If you find programming challenging, read first, do the exercises, and then review the book.

Reading material#

In the Think Python (TP) book, strings are covered in the first 5 sections of Chapter 8 Strings and Regular Expressions.

We also cover f-strings, which the book introduces in the second section of Chapter 13 Files and Databases.

Copy-and-Run#

Prep 7.1: Basics of Strings#

Run the following code to be reminded of what you already know about strings.

s1 = "DTU"
year = 1829
s = s1 + " " + "is founded in " + str(year)
print(s)
print(len(s))

You have also seen string comparison before. Can you predict the output of the following code?

check1 = 'Maja' == 'Maja'
check2 = 'Maja' == 'maja'
check3 = 'Maja' == 'Maja '
check4 = 'Maja' == "Maja"

print(check1, check2, check3, check4)

And lastly, check once more how len() works with strings.

len1 = len('Maja')
len2 = len('Maja ')
len3 = len('')
len4 = len('   ')

print(len1, len2, len3, len4)

Prep 7.2: String Indexing and Slicing#

Now run the following code.

bm_string = "batmand"
print(bm_string[0])
print(bm_string[3])
print(bm_string[0:3])
print(bm_string[3:])
print(bm_string[::2])
print(bm_string[::-1])

As you can see, you can access individual characters in a string by using square brackets [] just as you indexed lists. You can use slicing as well, making it even more important to get familiar with the syntax [start:stop:step] and how default values work. Look at the illustration below and use slicing to access:

  • last three characters of the string (for batmand this would be and),

  • all characters except the first and the last one (for batmand this would be atman),

  • every second character but starting from the second one (for batmand this would be amn),

  • in reversed order every second character starting from the last one (for batmand this would be datb).

image
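To see how the default values in [start:stop:step] behave, here is a small sketch on a different string (the variable names are chosen for illustration only), so the exercise above is left for you to solve.

```python
# A different example string, so the batmand exercise stays unsolved.
word = "python"

first_two = word[:2]      # start defaults to 0
from_two = word[2:]       # stop defaults to the end of the string
every_third = word[::3]   # start and stop use defaults, step is 3
last_two = word[-2:]      # negative indices count from the end
backwards = word[::-1]    # step -1 walks from the last character to the first

print(first_two, from_two, every_third, last_two, backwards)
```

When the step is negative, the defaults for start and stop swap: the slice starts at the last character and runs to the first.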

When working with lists last week, you saw that list indexing returned elements, while list slicing returned lists, and the two could not be concatenated. Let’s see how this works with strings.

my_string = "I am a student at DTU"
my_character = my_string[0]
my_substring = my_string[6:14]

check = my_character + my_substring
print(check)

As you can see, this is different with strings. Individual elements are also strings, and you can concatenate them with other strings.
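The difference can also be checked with type(). The sketch below uses a made-up list and string to compare what indexing and slicing return in each case.

```python
my_list = [10, 20, 30]
my_string = "DTU"

print(type(my_list[0]))       # an element of the list: int
print(type(my_list[0:2]))     # a slice of the list: list
print(type(my_string[0]))     # a single character: str
print(type(my_string[0:2]))   # a slice of the string: str

# Because both pieces are strings, they can be concatenated.
joined = my_string[0] + my_string[1:]
print(joined)
```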

Prep 7.3: Strings Are Immutable#

When working with lists last week, you saw that you can change individual elements of a list. Let’s see how this works with strings.

my_string = "i am a student at DTU."
my_string[0] = "I"
print(my_string)

As you can see, Python tells you that this is not possible. You cannot change individual characters in a string. We say that strings are immutable, which means that you cannot modify a string. Remember, you can always use reassignment, where you use the old string value to construct a new one. Run the following code to see how this can be done.

my_string = "i am a student at DTU."
my_string = "I" + my_string[1:]
print(my_string)
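The same idea works for a character in the middle of a string. This is a minimal sketch, with position as a hypothetical name for the index of the character to swap.

```python
word = "batmand"
position = 3  # hypothetical index of the character we want to replace

# Build a new string from the part before, the new character, and the part after.
new_word = word[:position] + "w" + word[position + 1:]

print(word)      # the original string is unchanged
print(new_word)
```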

Prep 7.4: Traversing Strings#

Look at the following code. By combining your understanding of for-loops and string indexing, you should be able to predict the output of the code. Run the code to check if you were right.

my_string = 'Engineering is fun!'
for i in range(len(my_string)):
    print('Character with index', i, 'is', my_string[i])

Notice that a space and an exclamation mark are also characters, and can be accessed in the same way as any other character. Now try this for-loop.

for char in "Engineering is fun!":
    print(char)

As with lists, you can use the for-loop to directly traverse string elements. Think about what the following piece of code does before running it.

alphabet = "abcdefghijklmnopqrstuvwxyz"
vowels = "aeiou"
consonants = ""
for char in alphabet:
    if char not in vowels:
        consonants = consonants + char
print(consonants)

Prep 7.5: String Methods#

Try running the following code snippet.

title = "02002 Computer Programming"
print(title)
title = title.upper()
print(title)
title = title.lower()
print(title)

Like lists, strings have many built-in methods for common operations. However, since strings are immutable, none of these methods change the string; they return a new value instead. Look back at the code above and confirm that both lines with a method invocation reassign the returned value to the variable. Look back at how you used the sort() method with lists to see what a method invocation looks like without reassignment.
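A short sketch of this contrast, using made-up variable names: calling a string method without reassignment leaves the string as it was, while sort() changes the list itself and returns None.

```python
course = "programming"
course.upper()              # returns a new string, which is thrown away here
print(course)               # still lowercase

shouted = course.upper()    # keep the returned value in a variable
print(shouted)

numbers = [3, 1, 2]
result = numbers.sort()     # sorts the list in place and returns None
print(numbers, result)
```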

Run the code snippets below and experiment with the string methods count() and find().

my_string = "Mississippi River is a river in the United States."

count1 = my_string.count("s")
count2 = my_string.count("S")
count3 = my_string.count("i")
count4 = my_string.count("x")
count5 = my_string.count("iss")

print(count1, count2, count3, count4, count5)

my_string = "One two three four five six seven eight nine ten"

index1 = my_string.find("e")
index2 = my_string.find("two")
index3 = my_string.find("ten")
print(index1, index2, index3)

index4 = my_string.find("f")
index5 = my_string.find("f", index4 + 1)
index6 = my_string.find("f", index5 + 1)

print(index4, index5, index6)

index7 = my_string.find("g")
index8 = my_string.find("g", 0, 30)
print(index7, index8)

In some of the assignments above, we give the find() method one additional argument: an index to start searching from. In the last assignment, we give yet another argument, which ends the search before a certain index. Experiment with these arguments and see how they affect the output.

As you can see, find() works almost exactly like index(), but it returns -1 if the substring is not found instead of raising an error.

Copy the code you just tried and replace find() with index(). Run the code and check which lines give the same output and which lines give an error.
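Because find() returns -1 when nothing is found, its start argument can be used in a loop to collect every position of a substring. The following is a sketch only; positions and target are names introduced for illustration.

```python
text = "One two three four five six seven eight nine ten"
target = "e"

positions = []                      # collected indices of every match
index = text.find(target)           # first occurrence, or -1 if there is none
while index != -1:
    positions.append(index)
    # Continue searching just after the match we found last.
    index = text.find(target, index + 1)

print(positions)
print(len(positions) == text.count(target))
```

The same loop written with index() would stop with an error after the last match, which is why find() is the more convenient choice here.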

Now run this code snippet.

s = "Zinc is a metal"
s = s.replace("Zinc", "Iron")
print(s)

You may be thinking that you could achieve the same behavior as the string methods by using indexing, slicing, and looping. That is correct, so it is not crucial to remember all the string methods. There are many, and you can look them up. But it is good to know that they exist and can simplify your code.
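As one illustration, counting a single character can be written with a for-loop. This sketch only handles a one-character target; count() also handles longer substrings.

```python
text = "Mississippi"
target = "s"

# Count by hand: look at every character and compare it with the target.
manual_count = 0
for char in text:
    if char == target:
        manual_count = manual_count + 1

# The string method gives the same number in a single line.
print(manual_count, text.count(target))
```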

You can find the most useful string methods on the W3Schools page and the full list of string methods in the Python documentation.

Prep 7.6: String Methods split() and join()#

Run this code.

physicists_string = "Bohr Heisenberg Schrödinger Dirac Pauli Fermi"
physicists = physicists_string.split()
print(physicists)

Re-run the code or make small modifications to answer the following questions.

  • What is the type of the variable returned by the split() method? If you are in doubt, run print(type(physicists)) to check.

  • How would the method work if there was more than one space between some of the words?

  • What would happen if there was one or more spaces at the beginning or the end of the string?

Now run this code to see other uses of the split() method.

spaces_string = "many   spaces"
print(spaces_string.split())
print(spaces_string.split(" "))

ratings = "Great, Good, Poor, Okay, Poor, Good, Great, Okay"
print(ratings.split(", "))

question = "How much wood would a woodchuck chuck if a woodchuck could chuck wood?"
print(question.split("chuck"))
print(question.split("wood"))

Now run this code.

physicists = ['Bohr', 'Heisenberg', 'Schrödinger', 'Dirac', 'Pauli', 'Fermi']
separator = " "
together = separator.join(physicists)
print(together)

Now look at the line where the join() method is invoked, and answer the following questions. If needed, make small modifications to the code to check your answers.

  • What is the type of the variable returned by the join() method? If you are in doubt, run print(type(together)) to check.

  • What is the type of the variable given to the join() method as a parameter? That is, what is the type of the variable in the parentheses?

  • What is the type of the variable the method is invoked on? That is, what is the type of the variable before the dot?

Now run this code to see other uses of the join() method.

physicists = ['Bohr', 'Heisenberg', 'Schrödinger', 'Dirac', 'Pauli', 'Fermi']
joint = ", ".join(physicists)
print(joint)
print(" - . - ".join(physicists))
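Used together, split() and join() give a simple way to tidy up spacing. This is a small sketch with a made-up sentence, not a general text-cleaning recipe.

```python
sentence = "  to   be  or not   to be "

words = sentence.split()      # split on any run of whitespace
cleaned = " ".join(words)     # glue the words back with single spaces

print(words)
print(cleaned)
```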

Prep 7.7: The in Operator for Strings#

Run the following code and observe the output.

dna_seq = 'TTAACGCATGCCATAGGACGGTTAGGCTCAGAACCCGCAACCAATACACGTGATTTTCTCGTCCCCTG'
pattern = 'CTCG'
match = pattern in dna_seq
print(match)

And run this code to figure out whether the in operator is case-sensitive.

print("i" in "Mississippi")
print("I" in "Mississippi")
print("kansas" in "arkansas")
print("Kansas" in "arkansas")

Prep 7.8: Everybody Loves F-strings#

In this course, we have used printing to keep track of what our code is doing. Look now at this example.

planet = 'Mars'
distance = 1.5
message = f"{planet} is {distance} AU from the Sun"
print(message)

You have learned how to construct such a message using string concatenation and by casting integers to strings. However, Python has f-strings that make this much easier. Everything inside the curly braces {} is evaluated and converted to a string.
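For comparison, here is a sketch that builds the same message both ways. The variable names with_concatenation, with_f_string, and doubled are introduced only for this illustration.

```python
planet = 'Mars'
distance = 1.5

# Concatenation needs an explicit str() for the number.
with_concatenation = planet + " is " + str(distance) + " AU from the Sun"

# The f-string converts the value inside the braces for us.
with_f_string = f"{planet} is {distance} AU from the Sun"

print(with_concatenation == with_f_string)

# Any expression can go inside the braces, not just a variable name.
doubled = f"Twice that distance is {2 * distance} AU"
print(doubled)
```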

Look how f-strings work in the following code.

for name in ["Anders", "Vedrana"]:
    if name == "Vedrana":
        s = "is"
    else:
        s = "is NOT"
    print(f"{name} {s} teaching this course.")

import math
print(f"Pi to 3 decimal places is {math.pi:.3f}")
print(f"Two thirds is {2/3}")
print(f"Two thirds with 5 decimal places is {2/3:.5f}")

for i in range(1, 11):
    s = ""
    for j in range(1, 11):
        s = s + f'{i*j:4}'
    print(s)

hours = 9
minutes = 5
print(f"Time is {hours:02}:{minutes:02}")

As you can see, using f-strings you can decide how numbers are to be printed: the number of decimal places, the space occupied by the number, and whether to use leading zeros.

Below are some common ways to use format specifiers for int and float values, using 216 and pi = 3.141592653589793 as examples. The letter d formats the value as an integer, and the letter f formats it as a float.

| Data type | Format specifier description | Example | Output |
|---|---|---|---|
| int | - | f"{216}" | 216 |
| int | [int] (d) | f"{216:d}" | 216 |
| int | [width][int] (5d) | f"{216:5d}" | 216 (preceded by two spaces) |
| int | [leading_zeros][width][int] (05d) | f"{216:05d}" | 00216 |
| int | [float] (f) | f"{216:f}" | 216.000000 |
| float | [int] (d) | f"{pi:d}" | ValueError |
| float | - | f"{pi}" | 3.141592653589793 |
| float | [float] (f) | f"{pi:f}" | 3.141593 |
| float | .[precision][float] (.2f) | f"{pi:.2f}" | 3.14 |
| float | [width].[precision][float] (8.2f) | f"{pi:8.2f}" | 3.14 (preceded by four spaces) |
| float | [leading_zeros][width].[precision][float] (08.2f) | f"{pi:08.2f}" | 00003.14 |
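Padding produced by a width is easy to miss in printed output. The sketch below wraps each formatted value in square brackets and prints its length, so the spaces become visible; the list name examples is introduced for illustration.

```python
pi = 3.141592653589793

examples = [
    f"{216:5d}",     # width 5, padded with spaces on the left
    f"{216:05d}",    # width 5, padded with zeros
    f"{216:f}",      # an int formatted as a float gets 6 decimal places
    f"{pi:.2f}",     # 2 decimal places, no padding
    f"{pi:8.2f}",    # width 8, padded with spaces on the left
    f"{pi:08.2f}",   # width 8, padded with zeros
]

for text in examples:
    # The brackets mark where the formatted string starts and ends.
    print("[" + text + "]", len(text))
```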

To show the power of f-strings, consider the following snippets of code, which would be very cumbersome to write without f-strings.

def greatest_common_divisor(a, b):
    divisor = 1
    for i in range(2, min(a, b)+1):
        if a % i == 0 and b % i == 0:
            divisor = i
    return divisor

a = 12
b = 18
gcd = greatest_common_divisor(a, b)
print(f"Simplifying ratio {a}/{b}:\n {a}/{b} = ({a}/{gcd})/({b}/{gcd}) = {a//gcd}/{b//gcd} = {a/b:.3f}")

max_iter = 1000
pi = 0
for i in range(1, max_iter+1):
    pi = pi + 4 * (-1)**(i+1) / (2*i-1)
    if i % 100 == 0 or i in [1, 10, 20, 50]:
        print(f"Approximating pi step {i:04}/{max_iter}: {pi:.5f}")

Prep 7.9: Newline \n and Other Escape Characters#

Run this code.

my_string = "This is the first line.\nThis is the second line."
print(my_string)

The combination \n is called an escape character and is used to represent a special character in a string. The most commonly used escape character is \n, which represents a new line. See how this works in the following code.

message = "Hello" + 12*"\n-" + "\nWorld"
print(message)

What do you think the length of the string message from the code above is? Run print(len(message)). What is the length of a single \n?

Some other escape characters are:

  • \t: Tab

  • \\: Backslash (useful for writing file paths)

  • \': Single quote (if you want a single quote in a string enclosed by double quotes)

  • \": Double quote (if you want a double quote in a string enclosed by single quotes)

Try some of these escape characters in the following code.

print("One\ttwo\tthree\tfour")
print("En\tto\ttre\tfire")
print("C:\\Users\\username\\Documents\\file.txt")
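The two quote escapes from the list above can be tried in the same way. This is a minimal sketch with made-up example sentences.

```python
# A single quote inside a string enclosed by single quotes must be escaped.
single = 'It\'s a string'

# A double quote inside a string enclosed by double quotes must be escaped.
double = "She said \"hi\""

# No escape is needed when the enclosing quotes are of the other kind.
mixed = "It's a string"

print(single)
print(double)
print(single == mixed)
```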

And lastly, run this code to see how you can split a string using the newline escape character.

text = "First line\nSecond line\nThird line"
lines = text.split('\n')
print(lines)

Prep 7.10: Fix String Errors#

Below are snippets of code which cause an error. Try to figure out what the error is and fix it.

print(upper("dtu"))

import math
print(f"The value of pi is {.5f:math.pi}")

last_name = "Hannemose"
print("The last letter of my last name is", last_name[9])

alphabet = "abcdefghijklmnopqrstuvwxyz"
print("first letter of the alphabet:", alphabet(0))

path = "C:\Documents\User\File"

print(f"pi repeated 5 times (for fun): {'3.14 '*5}")
print(f"e repeated 5 times (for fun): {"2.71 " *5}")

Prep 7.11: Predict the Output#

For each of the following snippets of code, try to predict the output before running them.

print("i'm not angry!".upper())

lyrics = "Looks like it's gonna be a great day today"
print(lyrics.split()[6])

print("I am a string".replace("a", "another"))

stans = "kazakhstan, uzbekistan, stanistan, turkmenistan"
print(stans.replace("stan", "land").split(", "))

print(("O"*20+"\n")*10)

print(f"{2002} Computer Programming")
print(f"{2002:5d} Computer Programming")
print(f"{2002:05d} Computer Programming")

fum = 5
bum = 7
print(f"{fum}+{bum}={fum+bum}")

Self quiz#

Question 7.1#

What is the value of apple_pie after the following code is executed?

apple_pie = 'apple' + 'pie'

Question 7.2#

What are values of check1 and check2 after the following code is executed?

check1 = 'Hello My Friend' == 'hello my friend'
check2 = 'No way!' == "No way!"

Question 7.3#

What is printed by the following code?

s = 'abcdefgh'
print(s[::2])

Question 7.4#

What happens when executing the following code?

s = 'Python programming!'
s[-1] = '?'
print(s)    

Question 7.5#

What happens when executing the following code?

s = 'Python programming!'
a = s[:6]
b = s[-1]
print(a + b)    

Question 7.6#

How many lines does the following code print?

s = 'Python!!!'
for char in s:
    print(char)

Question 7.7#

What is printed by the following code?

s = 'Python programming!'
s.upper()
print(s)    

Question 7.8#

What is the value of c after the following code is executed?

s = 'Aabenraa'
c = s.count('a')

Question 7.9#

What is the value of c after the following code is executed?

s = 'Aabenraa'
c = s.find('a')

Question 7.10#

What is the value of c after the following code is executed?

s = 'Aabenraa'
c = s.find('a', 2)

Question 7.11#

What is the value of n after the following code is executed?

text = "He didn't understand. Restless. Confused."
sentences = text.split('.')
n = len(sentences)

Question 7.12#

What is printed by the following code?

words = ['Python', 'is', 'fun']
sentence = ' '.join(words)
print(sentence)

Question 7.13#

What is the value of check after the following code is executed?

check = 'A' in 'The quick brown fox jumps over the lazy dog'

Question 7.14#

What gets printed by the following code?

if 'in' in 'inside':
    print('Yes')
elif 'out' in 'outside':
    print('No')

Question 7.15#

What gets printed by the following code?

month = 6
day = 5
year = 2025
print(f'{day}/{month}-{year}')

Question 7.16#

What is printed by the following code?

string = 'One\nTwo'
print(string)

Question 7.17#

What is the value of string_length after the following code is executed?

string_length = len('One\nTwo')

Question 7.18#

What is the value of c after the following code is executed?

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
c = 0
for day in days:
    if day[0] == 'T':
        c = c + 1

Question 7.19#

You want to print the fraction 1/3 as a decimal number with 2 decimal places. Which of the following code snippets will achieve this?

Question 7.20#

How many lines does the following code print?

beaches = 'Acapulco, Boulders, Copacabana, Navagio, Tulum, Waikiki'
for beach in beaches:
    print(beach)