In-Class#

Coding Practice#

Code 12.1: Powers of 2#

In week 6 you wrote code that created a list of the first n powers of 2. Let’s do the same thing with NumPy arrays.

Write code that, given an integer n, creates a NumPy array of the first n powers of 2. For example, for n=4 the array should contain the values \(2^1\), \(2^2\), \(2^3\), \(2^4\), so the array should be [2 4 8 16]. Test your code with n=1, n=4 and n=10.
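One possible approach, shown here only as a minimal sketch, is to build an array of exponents and let NumPy raise 2 to each of them element-wise. The variable names exponents and powers are illustrative choices, not names required by the exercise.

```python
import numpy as np

n = 4  # number of powers to generate

# Exponents 1, 2, ..., n (np.arange excludes the stop value, hence n + 1).
exponents = np.arange(1, n + 1)

# The ** operator works element-wise on NumPy arrays.
powers = 2 ** exponents

print(powers)  # [ 2  4  8 16]
```

Changing n at the top is enough to try the other test values.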

Code 12.2: All Months with More than d Days#

You are given the following NumPy arrays.

months = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul','Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
days = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

Write code that, given a number d, creates a new NumPy array, months_with_more_than_d_days, containing the names of all months that have more than d days. For example, for d = 30, the array should contain:

['Jan' 'Mar' 'May' 'Jul' 'Aug' 'Oct' 'Dec']

Try it also for d = 31 and confirm that the array is empty.
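A comparison between an array and a number produces a boolean array, which can then be used to select elements from another array of the same length. The following is a sketch of that idea using the two arrays given above; the name mask is an illustrative choice.

```python
import numpy as np

months = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                   'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
days = np.array([31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31])

d = 30

# Element-wise comparison: True where the month has more than d days.
mask = days > d

# Boolean indexing keeps only the entries where mask is True.
months_with_more_than_d_days = months[mask]

print(months_with_more_than_d_days)
```

With d = 31 the mask contains only False values, so the selection is an empty array.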

Code 12.3: Math Functions#

We want to plot the function

\[ f(x) = |1/2-x^3|-\sqrt{\sin(x)+e^x} \]

on n equally spaced points in the interval \([0,1]\).

To solve the problem, you should:

  • Define an integer n, for example n=100.

  • Create a NumPy array x, containing n linearly spaced points between \(0\) and \(1\) (including both \(0\) and \(1\)).

  • Write an expression for the function \(f(x)\), and call it y.

  • Use Matplotlib to plot a line through the points \((x, y)\). Your plot should look like the one shown below.

../_images/71958c0895cda7b988afda7582122fdd6c9dba1784a55222080fe42f20e2777c.png
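The first three steps can be sketched as follows, assuming np.linspace for the grid and NumPy's element-wise math functions for the expression. The plotting step is left out here; passing x and y to Matplotlib's plot function would complete it.

```python
import numpy as np

n = 100

# n equally spaced points from 0 to 1; linspace includes both endpoints by default.
x = np.linspace(0, 1, n)

# Each NumPy function is applied to every element of x at once.
y = np.abs(1 / 2 - x**3) - np.sqrt(np.sin(x) + np.exp(x))

print(x[0], x[-1])  # 0.0 1.0
print(y[0])         # -0.5, since |1/2 - 0| - sqrt(0 + 1) = -0.5
```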

Code 12.4: Counting Integers#

In previous weeks, you have been counting the frequency of different letters and numbers in lists. In this exercise you will use NumPy to calculate the frequency of numbers in a NumPy array.

Start by making a function count_number(array, integer) with two inputs. The first input, array, should be a NumPy array of integers and the second input, integer, should be an integer. Your function count_number should count the number of times integer appears in array.

Test the function by defining array = np.array([3,5,2,3,3,5,6,9]), and checking that count_number(array, integer) gives the correct output for different integers.
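As a hint rather than a full solution, the sketch below shows one way to count matches for a single value: compare the array with the value and sum the resulting boolean array, where True counts as 1. The names matches and count are illustrative.

```python
import numpy as np

array = np.array([3, 5, 2, 3, 3, 5, 6, 9])
integer = 3

# True at every position where the element equals the target value.
matches = array == integer

# Summing booleans counts the True entries.
count = np.sum(matches)

print(count)  # 3
```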

Make another function, array_frequencies, that counts the number of times each integer between 0 and 9 occurs in the array. The function should take as input a NumPy array of integers between 0 and 9, and output a NumPy array of integers of length 10 with the frequencies of each number between 0 and 9 in the input.

Check your function by printing array_frequencies(array). You should get

[0 0 1 3 0 2 1 0 0 1]

To see a very easy way to solve this problem using a single NumPy function, go to the Advanced section 12.4.

Problem Solving#

Problem 12.5: Is the CPR Number Valid?#

Have you ever wondered how websites know if the CPR number (the Danish civil registration number) is valid or not? In this exercise, you will find out how to do this yourself!

The last digit of the CPR number is a check digit, which is calculated from the first nine digits. This can be used to check the validity of a CPR number.

Let’s say you are given the CPR number 1111111111 and you want to know whether it is valid. Here is how you can check it:

  • Multiply the first nine digits by the corresponding value from the “control number” 432765432, and add the results together. In our example, since the first nine digits of the CPR number are 111111111, this should give

\[ 1\cdot 4 + 1\cdot 3 + 1\cdot 2 + 1 \cdot 7 + 1\cdot 6 + 1\cdot 5 + 1\cdot 4 + 1\cdot 3 + 1\cdot 2 = 36 \]
  • Calculate the remainder when dividing the previous result by \(11\). In our case, this is 3 since \(36 = 3\cdot 11 + 3\).

  • Subtract the remainder from 11 to get the check digit. In our case that would be \(11-3=8\).

  • The final digit of the CPR having first nine digits 111111111 should therefore be \(8\). So 1111111118 is a valid CPR number. But you were given 1111111111 which ends with \(1\), and not \(8\). So 1111111111 is not a valid CPR number. That’s it!
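The worked example above can be reproduced with element-wise multiplication. This is only a sketch of the arithmetic for the digits 111111111, not the requested cpr_check function; the variable names are illustrative, and the case where the remainder is 0 is not covered by the description above and is not handled here.

```python
import numpy as np

# Weights from the "control number" 432765432.
control = np.array([4, 3, 2, 7, 6, 5, 4, 3, 2])

# First nine digits of the example CPR number 1111111111.
first_nine = np.array([1, 1, 1, 1, 1, 1, 1, 1, 1])

# Step 1: multiply digit by weight and add the results.
weighted_sum = np.sum(first_nine * control)

# Step 2: remainder when dividing by 11.
remainder = weighted_sum % 11

# Step 3: subtract the remainder from 11 to get the check digit.
check_digit = 11 - remainder

print(weighted_sum, remainder, check_digit)  # 36 3 8
```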

You should write a function cpr_check() that takes a NumPy array of CPR digits as input and returns a boolean value indicating whether the CPR number is valid or not.

cpr_check.py

cpr_check(array)

Checks if CPR number is valid or not

Parameters:

  • array

numpy.ndarray

A numpy array of shape (10,) containing the digits of a CPR number

Returns:

  • bool

A boolean variable which is True if the CPR is valid

You can test the function cpr_check by verifying the following (randomly chosen) cases:

Problem 12.6: Growth Curve of Bacteria#

Download the file bacteria.npy. The file contains the result of 160 experiments, where the bacteria growth was measured every 10 minutes for a total of 120 minutes.

Place the file in your current working directory. If you are unsure what your current working directory is, you can go back to Week 9: Preparation.

Load the data in the file using data = np.load('bacteria.npy').

Each row, data[i,:], records the bacterial growth for one experiment. Each column data[:,j] shows the bacterial growth for all 160 experiments at the \(j\)’th measurement time.

Calculate the mean \(\mu\) and standard deviation \(\sigma\) of the bacterial growth at each time step. You can find the mean and standard deviation of a NumPy array with np.mean() and np.std().

Calculate the lower and upper bounds of the range at each time step as \(\text{r}_\mathrm{lower} = \mu - 2\sigma\) and \(\text{r}_\mathrm{upper} = \mu + 2\sigma\).
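To get one value per time step rather than a single number for the whole array, np.mean() and np.std() can be given axis=0, which reduces over the rows (the experiments). The sketch below uses a small made-up array, demo, in place of the real data; its values are purely illustrative and have nothing to do with bacteria.npy.

```python
import numpy as np

# Hypothetical stand-in for the data: 3 experiments (rows), 4 time steps (columns).
demo = np.array([[1.0, 2.0, 4.0, 8.0],
                 [1.0, 3.0, 5.0, 9.0],
                 [1.0, 4.0, 6.0, 10.0]])

# axis=0 collapses the rows, leaving one value per column (time step).
mu = np.mean(demo, axis=0)
sigma = np.std(demo, axis=0)  # population standard deviation (ddof=0) by default

r_lower = mu - 2 * sigma
r_upper = mu + 2 * sigma

print(mu)  # [1. 3. 5. 9.]
print(r_lower.shape, r_upper.shape)  # (4,) (4,)
```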

Plot the bacterial growth from all the experiments in the same plot using a for-loop. The horizontal axis should have the label Time in minutes.

In the same plot, plot the mean bacterial growth, and the lower and upper range, using different colors and linestyles.

Change the plot options so that the plot looks like the one shown below.

../_images/ceadc07db4f19b8fe5c29685e747bb89d987bc464c077a3cf6630415d0d6ad20.png

Problem 12.7: Smooth Curve#

A closed curve is represented by a sequence of 2D points \(p_i = (x_i, y_i)\), \(i = 1, \ldots, N\), connected by line segments, where the last point is assumed to be connected to the first point. The curve is smoothed by moving every curve point \(p_i\) slightly towards the position midway between its two neighbors \(p_{i-1}\) and \(p_{i+1}\); the first and the last point of the curve need special care so that they are displaced correctly. The new coordinates for the curve points with \(i = 2, \ldots, N-1\) can be computed as

\[ x_i^{\text{new}} = (1-\alpha)x_i + \alpha \frac{x_{i-1} + x_{i+1}}{2}, \quad \quad y_i^{\text{new}} = (1-\alpha)y_i + \alpha \frac{y_{i-1} + y_{i+1}}{2} \]

and for the first and last point we have

\[ x_1^{\text{new}} = (1-\alpha)x_1 + \alpha \frac{x_{N} + x_{2}}{2}, \quad \quad y_1^{\text{new}} = (1-\alpha)y_1 + \alpha \frac{y_{N} + y_{2}}{2}, \]

and

\[ x_N^{\text{new}} = (1-\alpha)x_N + \alpha \frac{x_{N-1} + x_{1}}{2}, \quad \quad y_N^{\text{new}} = (1-\alpha)y_N + \alpha \frac{y_{N-1} + y_{1}}{2}. \]

The parameter \(\alpha\) controls the strength of the smoothing and is usually set between \(0.1\) and \(0.5\).

Create a function smooth_curve that takes as input a 2D NumPy array \(C\) of shape (2,N) containing coordinates of curve points (\(x_i\) in the first row, \(y_i\) in the second row), and a scalar \(\alpha\). The function should return a 2D array of the same shape as \(C\), containing coordinates of the smoothed points.

smooth_curve.py

smooth_curve(C, alpha)

Return a matrix of coordinates for the smoothed curve

Parameters:

  • C

numpy.ndarray

A numpy array of shape (2, N) containing coordinates of curve points, with N >= 3

  • alpha

float

Scalar with the smoothing parameter alpha.

Returns:

  • numpy.ndarray

A numpy array of shape (2, N) containing coordinates of the smoothed curve

As an example, consider a curve, also shown in the illustration in black, given by

\[\begin{split} C = \begin{bmatrix} 24 & 40 & 36 & 44 & 28 & 18 & 12 & 0 & 8 & 4 \\ 12 & 16 & 8 & 4 & 0 & 4 & 0 & 4 & 12 & 16 \end{bmatrix}, \end{split}\]

and a smoothing parameter

\[ \alpha = 0.5. \]

The new coordinates of the first two points can be computed as

\[ x_1^{\text{new}} = (1 - 0.5)24 + 0.5 \frac{40 + 4}{2} = 23, \quad y_1^{\text{new}} = (1 - 0.5)12 + 0.5 \frac{16 + 16}{2} = 14, \]
\[ x_2^{\text{new}} = (1 - 0.5)40 + 0.5 \frac{24 + 36}{2} = 35, \quad y_2^{\text{new}} = (1 - 0.5)16 + 0.5 \frac{12 + 8}{2} = 13, \]

and a similar computation is performed for the remaining coordinates. The smoothed curve, shown in the illustration in red, is

\[\begin{split} S = \begin{bmatrix} 23 & 35 & 39 & 38 & 29.5 & 19 & 10.5 & 5 & 5 & 10 \\ 14 & 13 & 9 & 4 & 2 & 2 & 2 & 5 & 11 & 14 \end{bmatrix} \end{split}\]
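One way to handle the wrap-around at the first and last point without special cases is np.roll, which shifts an array cyclically. The sketch below is one possible approach under that assumption, not necessarily the intended solution; the function name smooth_with_roll is hypothetical. It reproduces the matrix \(S\) above for the example curve.

```python
import numpy as np

def smooth_with_roll(C, alpha):
    """Sketch: smooth a closed curve stored as a (2, N) array using cyclic shifts."""
    # Shifting right by one puts the previous neighbor at each position;
    # the last point wraps around to become the neighbor of the first.
    previous = np.roll(C, 1, axis=1)
    # Shifting left by one puts the next neighbor at each position;
    # the first point wraps around to become the neighbor of the last.
    following = np.roll(C, -1, axis=1)
    return (1 - alpha) * C + alpha * (previous + following) / 2

C = np.array([
    [24, 40, 36, 44, 28, 18, 12, 0, 8, 4],
    [12, 16, 8, 4, 0, 4, 0, 4, 12, 16]
])

S_sketch = smooth_with_roll(C, 0.5)
print(S_sketch)
```

Because the shifts are cyclic, the same expression covers the interior points and both end points.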

Test your function by plotting the original curve and the smoothed curve using the provided code and compare your plot with the illustration below.

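import numpy as np
import matplotlib.pyplot as plt

# smooth_curve is the function you wrote above.
C = np.array([
    [24, 40, 36, 44, 28, 18, 12, 0, 8, 4],
    [12, 16, 8, 4, 0, 4, 0, 4, 12, 16]
])
alpha = 0.5
S = smooth_curve(C, alpha)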

# Plotting the original and smoothed curve.
cycle = np.arange(C.shape[1] + 1)
cycle[-1] = 0 # Close the curve
plt.plot(C[0, cycle], C[1, cycle], label='Original curve')
plt.plot(S[0, cycle], S[1, cycle], label='Smoothed curve')
plt.legend()
plt.show()
../_images/ae37fd62cc0e834aa01d40babd69b2330293b1c72f29638a10b88ac1a0e87def.png

Problem 12.8: Flowering Plants#

The information about the flowering season for the plants in a tropical greenhouse is stored in an \(N \times 2\) matrix of integers between 0 and 12. Each row in this matrix is a pair of numbers representing one plant. The first number is the month of the year in which the plant starts flowering and the second number is the month in which the plant stops flowering. For example, the pair \((11, 2)\) represents a plant with a flowering season starting in November (month 11) and ending in February (month 2). If a plant flowers all year round, it is represented by \((0, 0)\). A plant represented by \((7, 7)\) has a short flowering season which both starts and ends in July.

Create a function flowering_plants which takes as input a 2D NumPy array (a matrix) of greenhouse plants \(G\), and a month \(m\) given by a number between \(1\) and \(12\). The function should return the total number of plants which flower in month \(m\). Plants which start or stop flowering in the month \(m\) should be counted as flowering in month \(m\).

flowering_plants.py

flowering_plants(G, m)

Return the number of plants flowering in a given month

Parameters:

  • G

numpy.ndarray

A numpy array of shape (N,2) of integers between 0 and 12 representing the flowering seasons of plants.

  • m

int

Integer between 1 and 12 representing a month.

Returns:

  • int

Number of plants flowering in the given month.

As an example, consider the input

\[\begin{split} G = \begin{bmatrix} 5 & 9 \\ 9 & 11 \\ 12 & 6 \\ 0 & 0 \end{bmatrix} \quad \text{and} \quad m = 6. \end{split}\]

The first plant has a flowering season represented by \((5, 9)\), which corresponds to months 5, 6, 7, 8, and 9, so this plant flowers in June. The second plant is represented by \((9, 11)\), which corresponds to months 9, 10, and 11, so it does not flower in June. The third plant is \((12, 6)\), corresponding to months 12, 1, 2, 3, 4, 5, and 6, so it flowers in June. The last plant is \((0, 0)\), and it flowers all year round, also in June.

In total, there are 3 plants flowering in June. The function should return \(\mathbf{3}\).
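The reasoning in the example splits into three cases: plants that flower all year, seasons that stay within one calendar year, and seasons that wrap around New Year. The sketch below expresses those cases as boolean arrays; it is one possible approach, the function name count_flowering_sketch is hypothetical, and it assumes that \((0, 0)\) is the only pair containing a zero.

```python
import numpy as np

def count_flowering_sketch(G, m):
    """Sketch: count rows of G whose flowering season includes month m."""
    start = G[:, 0]
    stop = G[:, 1]

    # Case 1: (0, 0) means the plant flowers all year round.
    all_year = (start == 0) & (stop == 0)

    # Case 2: season within one calendar year, e.g. (5, 9) or (7, 7).
    within_year = (start <= stop) & (start <= m) & (m <= stop)

    # Case 3: season wraps past December, e.g. (12, 6) or (11, 2).
    wraps = (start > stop) & ((m >= start) | (m <= stop))

    return int(np.sum(all_year | within_year | wraps))

G = np.array([[5, 9], [9, 11], [12, 6], [0, 0]])
print(count_flowering_sketch(G, 6))  # 3
```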