Short Introduction to Programming in Python
Overview
Teaching: 30 min
Exercises: 5 min
Questions
How do I program in Python?
How can I represent my data in Python?
Objectives
Describe the advantages of using programming vs. completing repetitive tasks by hand.
Define the following data types in Python: strings, integers, and floats.
Perform mathematical operations in Python using basic operators.
Define the following as it relates to Python: lists, tuples, and dictionaries.
Interpreter
Python is an interpreted language which can be used in two ways:
- “Interactively”: when you use it as an “advanced calculator” executing
one command at a time. To start Python in this mode, execute python on the command line:
$ python
Python 3.5.1 (default, Oct 23 2015, 18:05:06)
[GCC 4.8.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Chevrons >>> indicate an interactive prompt in Python, meaning that it is waiting for your
input.
2 + 2
4
print("Hello World")
Hello World
- “Scripting” mode: executing a series of “commands” saved in a text file,
usually with a .py extension after the name of your file:
$ python my_script.py
Hello World
Introduction to variables in Python
Assigning values to variables
One of the most basic things we can do in Python is assign values to variables:
text = "Data Carpentry" # An example of assigning a value to a new text variable,
# also known as a string data type in Python
number = 42 # An example of assigning a numeric value, or an integer data type
pi_value = 3.1415 # An example of assigning a floating point value (the float data type)
Here we’ve assigned data to the variables text, number and pi_value,
using the assignment operator =. To review the value of a variable, we
can type the name of the variable into the interpreter and press Return:
text
'Data Carpentry'
Everything in Python has a type. To get the type of something, we can pass it
to the built-in function type:
type(text)
<class 'str'>
type(number)
<class 'int'>
type(pi_value)
<class 'float'>
The variable text is of type str, short for “string”. Strings hold
sequences of characters, which can be letters, numbers, punctuation
or more exotic forms of text (even emoji!).
We can also see the value of something using another built-in function, print:
print(text)
Data Carpentry
print(number)
42
This may seem redundant, but in fact it’s the only way to display output in a script:
example.py
# A Python script file
# Comments in Python start with #
# The next line assigns the string "Data Carpentry" to the variable "text".
text = "Data Carpentry"
# The next line does nothing!
text
# The next line uses the print function to print out the value we assigned to "text"
print(text)
Running the script
$ python example.py
Data Carpentry
Notice that “Data Carpentry” is printed only once.
Tip: print and type are built-in functions in Python. Later in this
lesson, we will introduce methods and user-defined functions. The Python
documentation is excellent for reference on the differences between them.
Types of Data
How information is stored in Python objects affects what we can do with it and the outputs of calculations as well. There are two main types of data that we will explore: numeric and text data types.
Text Data Type
The text data type is known as a string in Python, or object in pandas. Strings can
contain numbers and / or characters. For example, a string might be a word, a
sentence, or several sentences. A pandas object might also be a plot name like
'plot1'. A string can also contain or consist of numbers. For instance, '1234'
could be stored as a string, as could '10.23'. However, strings that contain
numbers cannot be used for mathematical operations until they are converted to a numeric type!
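As a minimal sketch of this point (the variable names below are illustrative and not part of the lesson), a string of digits needs to be converted with the built-in functions int or float before arithmetic treats it as a number:

```python
# Illustrative values; the variable names are assumptions for this sketch
count_text = '1234'
price_text = '10.23'

# Using + on two strings joins them instead of adding numbers
joined = count_text + price_text

# Converting first gives numeric values we can calculate with
count = int(count_text)
price = float(price_text)
total = count + 1

print(joined)
print(total)
print(type(price))
```

Here joined is a longer string, while total is an ordinary integer.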
Numeric Data Types
Numeric data types include integers and floats. A floating point (known as a float) number has decimal points even if that decimal point value is 0. For example: 1.13, 2.0, 1234.345. If we have a column that contains both integers and floating point numbers, pandas will assign the entire column to the float data type so the decimal points are not lost.
An integer will never have a decimal point. Thus if we wanted to store 1.13 as
an integer it would be stored as 1. Similarly, 1234.345 would be stored as 1234. You
will often see the data type Int64 in pandas which stands for 64 bit integer. The 64
refers to the memory allocated to store data in each cell which effectively
relates to how many digits it can store in each “cell”. Allocating space ahead of time
allows computers to optimize storage and processing efficiency.
So we’ve learned that computers store numbers in one of two ways: as integers or as floating-point numbers (or floats). Integers are the numbers we usually count with. Floats have fractional parts (decimal places). Let’s next consider how the data type can impact mathematical operations on our data. Addition, subtraction, division and multiplication work on floats and integers as we’d expect.
print(5+5)
10
print(24-4)
20
If we divide one integer by another, we get a float. The result on Python 3 is different than in Python 2, where the result is an integer (integer division).
print(5/9)
0.5555555555555556
print(10/3)
3.3333333333333335
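If a whole-number result is wanted in Python 3, the floor division operator // can be used instead. The following is a small sketch with illustrative variable names:

```python
# True division always produces a float in Python 3
quotient = 10 / 3
exact = 10 / 5        # still a float, even though the result is whole

# Floor division drops the fractional part; % gives what is left over
whole = 10 // 3
remainder = 10 % 3

print(quotient, exact)
print(whole, remainder)
```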
We can also convert a floating point number to an integer or an integer to floating point number. Notice that Python truncates toward zero when it converts from floating point to integer, so positive values are rounded down.
# Convert a to an integer
a = 7.83
int(a)
7
# Convert b to a float
b = 7
float(b)
7.0
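The behavior of int with negative numbers is easy to misjudge. As a hedged illustration (the variable names are assumptions), this sketch compares int with math.floor and the built-in round:

```python
import math

positive = 7.83
negative = -7.83

# int() discards the fractional part, moving toward zero
truncated_positive = int(positive)
truncated_negative = int(negative)

# math.floor() always moves down to the next lower integer
floored_negative = math.floor(negative)

# round() goes to the nearest integer
rounded_positive = round(positive)

print(truncated_positive, truncated_negative)
print(floored_negative, rounded_positive)
```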
Operators
We can perform mathematical calculations in Python using the basic operators
+, -, /, *, **, %
:
2 + 2 # Addition
4
6 * 7 # Multiplication
42
2 ** 16 # Power
65536
13 % 5 # Modulo
3
We can also use comparison operators:
<, >, ==, !=, <=, >=
and logical operators:
and, or, not
. The data type returned by a comparison is
called a boolean.
3 > 4
False
True and True
True
True or False
True
True and False
False
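Comparisons can be stored in variables and combined with the logical operators. The following is a minimal sketch; the temperature value and variable names are made up for illustration:

```python
# Hypothetical measurement used only for this sketch
temperature = 25

# Each comparison produces a boolean value
is_warm = temperature > 20
is_hot = temperature > 30

# Booleans can be combined with and, or, not
in_comfortable_range = is_warm and not is_hot

print(is_warm, is_hot, in_comfortable_range)
print(type(is_warm))
```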
Sequences: Lists and Tuples
Lists
Lists are a common data structure to hold an ordered sequence of elements. Each element can be accessed by an index. Note that Python indexes start with 0 instead of 1:
numbers = [1, 2, 3]
numbers[0]
1
A for loop can be used to access the elements in a list or other Python data
structure one at a time:
for num in numbers:
    print(num)
1
2
3
Indentation is very important in Python. Note that the second line in the
example above is indented. Indentation is Python’s way of marking a block of code.
Just like three chevrons >>> indicate an interactive prompt in Python, the three
dots ... are Python’s prompt for continuation lines when a block spans
multiple lines. [Note: you do not type >>> or ... yourself.]
To add elements to the end of a list, we can use the append method. Methods
are a way to interact with an object (a list, for example). We can invoke a
method using the dot . followed by the method name and a list of arguments
in parentheses. Let’s look at an example using append:
numbers.append(4)
print(numbers)
[1, 2, 3, 4]
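As a small sketch combining the for loop and append (the list name doubled is an assumption made for illustration), indentation decides which lines belong to the loop body:

```python
numbers = [1, 2, 3, 4]
doubled = []  # hypothetical name, used only in this sketch

for num in numbers:
    # Indented: part of the loop body, runs once per element
    doubled.append(num * 2)

# Not indented: runs once, after the loop has finished
print(doubled)
```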
To find out what methods are available for an
object, we can use the built-in help function:
help(numbers)
Help on list object:
class list(object)
| list() -> new empty list
| list(iterable) -> new list initialized from iterable's items
...
Tuples
A tuple is similar to a list in that it’s an ordered sequence of elements.
However, tuples cannot be changed once created (they are “immutable”). Tuples
are created by placing comma-separated values inside parentheses ().
# Tuples use parentheses
a_tuple = (1, 2, 3)
another_tuple = ('blue', 'green', 'red')
# Note: lists use square brackets
a_list = [1, 2, 3]
Tuples vs. Lists
- What happens when you execute a_list[1] = 5?
- What happens when you execute a_tuple[2] = 5?
- What does type(a_tuple) tell you about a_tuple?
- What information does the built-in function len() provide? Does it provide the same information on both tuples and lists? Does the help() function confirm this?
Dictionaries
A dictionary is a container that holds pairs of objects - keys and values.
translation = {'one': 'first', 'two': 'second'}
translation['one']
'first'
Dictionaries work a lot like lists - except that you index them with keys. You can think about a key as a name or unique identifier for the value it corresponds to.
rev = {'first': 'one', 'second': 'two'}
rev['first']
'one'
To add an item to the dictionary we assign a value to a new key:
rev['third'] = 'three'
rev
{'first': 'one', 'second': 'two', 'third': 'three'}
Using for loops with dictionaries is a little more complicated. We can do
this in two ways:
for key, value in rev.items():
    print(key, '->', value)
first -> one
second -> two
third -> three
or
for key in rev.keys():
    print(key, '->', rev[key])
first -> one
second -> two
third -> three
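Two related dictionary features can be sketched briefly: looping over a dictionary directly yields its keys, and the in operator tests whether a key is present. The names pairs, has_third and has_fourth are assumptions for this example:

```python
rev = {'first': 'one', 'second': 'two', 'third': 'three'}

# Looping over a dictionary directly yields its keys
pairs = []
for key in rev:
    pairs.append(key + ' -> ' + rev[key])

# The in operator checks whether a key exists in the dictionary
has_third = 'third' in rev
has_fourth = 'fourth' in rev

# sorted() is used so the printed order does not depend on the Python version
print(sorted(pairs))
print(has_third, has_fourth)
```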
Changing dictionaries
- First, print the value of the rev dictionary to the screen.
- Reassign the value that corresponds to the key second so that it no longer reads “two” but instead 2.
- Print the value of rev to the screen again to see if the value has changed.
For loops
Loops allow us to repeat a workflow (or series of actions) a given number of times or while some condition is true. We would use a loop to automatically process data that’s stored in multiple files (daily values with one file per year, for example). Loops lighten our work load by performing repeated tasks without our direct involvement and make it less likely that we’ll introduce errors by making mistakes while processing each file by hand.
Let’s write a simple for loop that simulates what a kid might see during a visit to the zoo:
animals = ['lion', 'tiger', 'crocodile', 'vulture', 'hippo']
print(animals)
['lion', 'tiger', 'crocodile', 'vulture', 'hippo']
for creature in animals:
    print(creature)
lion
tiger
crocodile
vulture
hippo
The line defining the loop must start with for and end with a colon, and the
body of the loop must be indented.
In this example, creature is the loop variable that takes the value of the next
entry in animals every time the loop goes around. We can call the loop variable
anything we like. After the loop finishes, the loop variable will still exist
and will have the value of the last entry in the collection:
animals = ['lion', 'tiger', 'crocodile', 'vulture', 'hippo']
for creature in animals:
    pass
print('The loop variable is now: ' + creature)
The loop variable is now: hippo
We are not asking Python to print the value of the loop variable anymore, but
the for loop still runs and the value of creature changes on each pass through
the loop. The statement pass in the body of the loop means “do nothing”.
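The one-file-per-year scenario mentioned at the start of this section can be sketched with the same loop structure. The years and the file name pattern below are hypothetical and chosen only for illustration; no files are actually opened:

```python
# Hypothetical years and file name pattern, assumed for this sketch
years = [2001, 2002, 2003]
filenames = []

for year in years:
    # str() converts the integer year to text so it can be joined with +
    filenames.append('surveys_' + str(year) + '.csv')

print(filenames)
```

In a real workflow, the body of the loop would read and process each file instead of only building its name.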
Challenge - Loops
- What happens if we don’t include the pass statement?
- Rewrite the loop so that the animals are separated by commas, not new lines (Hint: You can concatenate strings using a plus sign. For example, print(string1 + string2) outputs ‘string1string2’).
- Suppose you have a list of numbers xs = [3, 34, 23, 56, 14, 56]. Write a loop to sum the numbers of the list.
If Statements
The body of the test function now has two conditionals (if statements) that
check the values of start_year and end_year. If statements execute a segment
of code when some condition is met. They commonly look something like this:
a = 5
if a<0: # Meets first condition?
    # if a IS less than zero
    print('a is a negative number')
elif a>0: # Did not meet first condition. meets second condition?
    # if a ISN'T less than zero and IS more than zero
    print('a is a positive number')
else: # Met neither condition
    # if a ISN'T less than zero and ISN'T more than zero
    print('a must be zero!')
Which would print:
a is a positive number
Change the value of a to see how this code works. The statement elif
means “else if”, and all of the conditional statements must end in a colon.
The if statements in the function yearly_data_arg_test check whether there is an
object associated with the variable names start_year and end_year. If those
variables are None, the conditions of the if statements evaluate to True and
whatever is in their body is executed. On the other hand, if the variable names
are associated with some value (they got a number in the function call), the
conditions evaluate to False and the bodies do not execute. The opposite
conditional statements, whose conditions would evaluate to True if the variables
were associated with objects (if they had received a value in the function call),
would be if start_year and if end_year.
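The idea of testing arguments against None can be illustrated with a small stand-in function. This is a hypothetical sketch, not the lesson's yearly_data_arg_test: the function name, the default values and the placeholder strings are all assumptions:

```python
def describe_year_range(start_year=None, end_year=None):
    """Hypothetical helper showing how None defaults can be detected."""
    # The condition is True only when no start year was supplied
    if start_year is None:
        start_year = 'earliest year'
    # Same check for the end year
    if end_year is None:
        end_year = 'latest year'
    return str(start_year) + ' to ' + str(end_year)

print(describe_year_range())
print(describe_year_range(1990, 2000))
print(describe_year_range(end_year=2000))
```

When an argument is supplied, the matching condition is False and the supplied value is used unchanged.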
Challenge - Loops + If statements
Suppose you have a list of numbers xs = [3, 34, 23, 56, 14, 56]. Write a loop to sum the even numbers from the list.
Functions
Defining a section of code as a function in Python is done using the def
keyword. For example, a function that takes two arguments and returns their sum
can be defined as:
def add_function(a, b):
    result = a + b
    return result
z = add_function(20, 22)
print(z)
42
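Because a function returns a value, its result can be stored or passed straight into another call. The sketch below reuses add_function; the variable names total and words are assumptions for illustration:

```python
def add_function(a, b):
    result = a + b
    return result

# The returned value of one call can be an argument of another call
total = add_function(add_function(1, 2), 3)

# The + operator also works on strings, so the same function joins text
words = add_function('Data ', 'Carpentry')

print(total)
print(words)
```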
Key Points
Python is an interpreted language which can be used interactively (executing one command at a time) or in scripting mode (executing a series of commands saved in a file).
One can assign a value to a variable in Python. Those variables can be of several types, such as string, integer, floating point and complex numbers.
Lists and tuples are similar in that they are ordered sequences of elements; they differ in that a tuple is immutable (cannot be changed).
Dictionaries are data structures that provide mappings between keys and values.