Functions#
Python comes with many built-in functions, and many more can be loaded from libraries with the import statement. This gives the language enormous reach. Even so, you will often need a custom function, and writing one is a powerful way to automate repetitive data handling and analysis tasks reproducibly. To call a function, write its name followed directly by its arguments in parentheses (). For instance, here is the print function:
dogname = "Phineas" # Define `dogname` variable
print(dogname) # `dogname` is the argument for the function, print
Phineas
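As a small illustration of loading an extra function with import, the sketch below pulls sqrt from the standard-library math module. The choice of math and sqrt is an assumption made for this example only.

```python
# Load one function from the standard-library math module
from math import sqrt

# Call it like any built-in: name, then the argument in parentheses
root = sqrt(16)
print(root)  # 4.0
```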
Now let’s define a function of our own:
# Our first function definition
def double(x):
    """Double x."""
    return 2*x

double(10)
20
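Because double relies only on the * operator, it accepts any value that 2 can multiply, not just whole numbers. A brief sketch, repeating the definition so the block runs on its own:

```python
def double(x):
    """Double x."""
    return 2*x

number_result = double(1.5)   # 3.0
text_result = double("ha")    # "haha": multiplying a string repeats it
list_result = double([1, 2])  # [1, 2, 1, 2]: the same holds for lists
```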
Defining functions#
Here is a very simple function that converts a proportion to a percentage by multiplying it by 100. For example, the value of to_percentage(.5) should be the number 50 (with no percent sign).
A function definition has a few parts.
def#
It always starts with def (short for define):
def
Name#
Next comes the name of the function. Let’s call our function to_percentage.
def to_percentage
Signature#
Next comes something called the signature of the function. This tells Python how many arguments your function takes, and what names you’ll use to refer to those arguments in the function’s code. to_percentage should take one argument, and we’ll call that argument proportion since it should be a proportion.
def to_percentage(proportion)
We put a colon after the signature to tell Python it’s over.
def to_percentage(proportion):
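To see how a signature fixes the number and names of arguments, consider this sketch. The function rectangle_area and its argument names are hypothetical, chosen only for illustration.

```python
def rectangle_area(width, height):
    """Return the area of a width-by-height rectangle."""
    return width * height

# Two arguments, matching the two names in the signature
area = rectangle_area(3, 4)

# Calling with the wrong number of arguments raises a TypeError
try:
    rectangle_area(3)
except TypeError as error:
    message = str(error)
    print(message)
```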
Documentation#
Functions can do complicated things, so you should write an explanation of what your function does. For small functions, this is less important, but it’s a good habit to learn from the start. Conventionally, Python functions are documented by writing a triple-quoted string:
def to_percentage(proportion):
    """Converts a proportion to a percentage."""
Body#
Now we start writing code that runs when the function is called. This is called the body of the function. We can write anything we could write anywhere else. First let’s give a name to the number we multiply a proportion by to get a percentage.
def to_percentage(proportion):
    """Converts a proportion to a percentage."""
    factor = 100
return#
The special instruction return in a function’s body tells Python to make the value of the function call equal to whatever comes right after return. We want the value of to_percentage(.5) to be the proportion .5 times the factor 100, so we write:
def to_percentage(proportion):
    """Converts a proportion to a percentage."""
    factor = 100
    return proportion * factor
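With the definition complete, here is a short sketch of calling the function and reading back its documentation. The definition is repeated so the block runs on its own; the __doc__ attribute is standard Python for viewing a docstring.

```python
def to_percentage(proportion):
    """Converts a proportion to a percentage."""
    factor = 100
    return proportion * factor

half = to_percentage(.5)
print(half)                   # 50.0
print(to_percentage.__doc__)  # the docstring written in the definition
```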
import numpy as np

# Compute the change from y to x as a proportion of y
def per_change(x, y):
    """Takes the ratio of x to y and converts it
    to a proportional change by subtracting 1.
    >>> per_change(20, 16)
    0.25
    """
    return np.round(x/y - 1, 4)

per_change(3.89, 3.69)
0.0542
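For readers without NumPy installed, the same calculation can be sketched with Python's built-in round. The name per_change_plain is hypothetical, and this variant is a stand-in rather than the function used in the rest of this section.

```python
def per_change_plain(x, y):
    """Proportional change from y to x, rounded to 4 decimal places."""
    return round(x / y - 1, 4)

print(per_change_plain(20, 16))      # 0.25
print(per_change_plain(3.89, 3.69))  # 0.0542
```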
Now use apply to compute new Table column#
from datascience import *

data = 'http://www2.census.gov/programs-surveys/popest/datasets/2010-2020/national/asrh/nc-est2020-agesex-res.csv'
full_census_table = Table.read_table(data)
partial_census_table = full_census_table.select('SEX', 'AGE', 'POPESTIMATE2010', 'POPESTIMATE2020')
partial_census_table = partial_census_table.relabeled('POPESTIMATE2010', '2010').relabeled('POPESTIMATE2020', '2020')
partial_census_table
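# Keep rows where column 0 (SEX) equals 0, then rows with AGE below 99
census = partial_census_table.where(0, 0).where('AGE', are.below(99))
# Apply per_change to each row's 2020 and 2010 values and store the result
census = census.with_columns(
    "% change", census.apply(per_change, '2020', '2010')
)
census.set_format('% change', PercentFormatter)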
SEX | AGE | 2010 | 2020 | % change |
---|---|---|---|---|
0 | 0 | 3951495 | 3735010 | -5.48% |
0 | 1 | 3957904 | 3773884 | -4.65% |
0 | 2 | 4090799 | 3853025 | -5.81% |
0 | 3 | 4111869 | 3921526 | -4.63% |
0 | 4 | 4077511 | 4017847 | -1.46% |
0 | 5 | 4064653 | 4054336 | -0.25% |
0 | 6 | 4073031 | 4040169 | -0.81% |
0 | 7 | 4043100 | 4029753 | -0.33% |
0 | 8 | 4025624 | 4034785 | 0.23% |
0 | 9 | 4125413 | 4078668 | -1.13% |
... (89 rows omitted)
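Conceptually, apply calls the function once per row, passing that row's values from the named columns, and collects the results. The sketch below imitates this in plain Python using the first three rows shown above. It illustrates the idea only and is not how the datascience library is implemented; per_change is restated with the built-in round so the block runs without NumPy.

```python
def per_change(x, y):
    """Proportional change from y to x, rounded to 4 decimal places."""
    return round(x / y - 1, 4)

# First three rows of the 2010 and 2020 columns from the table above
pop_2010 = [3951495, 3957904, 4090799]
pop_2020 = [3735010, 3773884, 3853025]

# One call per row, with arguments in the order given to apply: '2020', then '2010'
changes = [per_change(new, old) for new, old in zip(pop_2020, pop_2010)]
print(changes)  # [-0.0548, -0.0465, -0.0581]
```

These values match the first three entries of the % change column once they are displayed as percentages.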