There are four primitive data types we will discuss in this course. They are called primitives because they are the basic building blocks of all data in Python:
Chapter 2.1.1 - int
Full name: integer
Python keyword: int
Python data type group: numeric
Integers (int) are whole numbers with no fractional parts and can be positive or negative. Math operations work as you would expect with int (and other numeric) data types. For example, you could print the result of adding two int values. The result is also an int.
print(1 + 2)
3
In Python, int is the most basic numeric data type. The newest versions of Python handle most of the details of implementing the built-in int data type, but there are some issues you may have to consider if you use Python 2 (which has been retired since 2020). Since you are unlikely to use Python 2 in your career, we will not examine those issues. When we eventually use more complex Python packages, we will revisit different types of integers that some packages may implement.
There is no minimum or maximum number for integers, per se, but there are some issues we could run into for more complex packages and/or use cases. We will cover these issues later in the course.
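As a quick illustration of the point above, the short sketch below raises 2 to the 100th power. The variable name big_number is made up for this example. The result is far larger than what fits in a fixed 64-bit integer, yet Python's built-in int still stores it exactly.

```python
# Python's built-in int grows as needed, so very large values stay exact.
big_number = 2 ** 100

print(big_number)        # 1267650600228229401496703205376
print(type(big_number))  # <class 'int'>
```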
Mathematical Operations
The most common mathematical operations you will use with int data include (Table 2.1.t1):
Table 2.1.t1 - Common mathematical operations in Python
Operation | Description | Notes |
---|---|---|
1 + 2 | The sum of 1 and 2 | |
1 - 2 | The difference of 1 and 2 | |
1 * 2 | The product of 1 and 2 | |
1 / 2 | The quotient of 1 and 2 | The result is always a float, even when both values are int |
Your turn: Try to modify the following code blocks to see how your answers change.
print("The sum of 1 and 2 is", 1 + 2)
print("The difference of 1 and 2 is", 1 - 2)
print("The product of 1 and 2 is", 1 * 2)
print("The quotient of 1 and 2 is", 1 / 2)
The sum of 1 and 2 is 3
The difference of 1 and 2 is -1
The product of 1 and 2 is 2
The quotient of 1 and 2 is 0.5
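To see the note about division in Table 2.1.t1 in action, the sketch below uses the built-in type function to report the data type of a few results. The variable names are made up for this example. Notice that dividing 4 by 2 still gives a float, even though the answer is a whole number.

```python
sum_value = 1 + 2
half_value = 1 / 2
whole_quotient = 4 / 2

print(sum_value, type(sum_value))            # 3 <class 'int'>
print(half_value, type(half_value))          # 0.5 <class 'float'>
print(whole_quotient, type(whole_quotient))  # 2.0 <class 'float'>
```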
Chapter 2.1.2 - Defining variables
While you can use explicit (actual) numbers for mathematical and other operations, you will almost always be using variables instead. Variables, much like those used in your math courses, are placeholders for values. You might ask “why would we need variables if we can just use the actual numbers?!”. The answer is generalizability.
What if we need to create a program that allows a user to use an equation that could handle any possible integer? We do not have time to code every possible combination of numbers that the user might decide to use. In fact, it might be easier to control what values cannot be used. Instead, we use placeholders that can vary (i.e., variables) depending on the state of the program and/or decisions made by the user.
As was mentioned in chapter 2.0.4, variables retain the values they were assigned by the programmer as long as the program is running. Thus, as long as you define the correct value, it can be used as many times as you need. This idea is similar to how our brains develop the ability to understand object permanence. In other words, we can assume that an object that once existed still exists even if we cannot see the object. In Python terms, if you define a variable “earlier” in the program, you can assume that variable still exists “later” in the program. In other words, the state of the program is remembered and persists as long as the program is running.
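The sketch below illustrates both ideas with made-up names (weeks and days). The same equation is reused with a different starting value, and the variables keep their values between lines for as long as the program runs.

```python
# The equation is written once in terms of a variable, not a fixed number.
weeks = 3
days = weeks * 7
print("Days in", weeks, "weeks:", days)  # Days in 3 weeks: 21

# Changing the value and reusing the same equation gives a new result.
weeks = 10
days = weeks * 7
print("Days in", weeks, "weeks:", days)  # Days in 10 weeks: 70

# Both variables still hold their most recent values later in the program.
print(weeks, days)  # 10 70
```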
The most basic pattern for defining variables is as follows (NOTE: the text and brackets just describe the item at that position; do not type the brackets in your code):
[variable name] = [operation]
The name of the variable can contain any alphanumeric character:
ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789
However, variable names are constrained by some strict limitations and styling guidelines (Table 2.1.t2). A typical Python variable has a simple, lowercase name that splits two or more words using an underscore (_). For example: split_words_using_underscore. Some languages use “Camel Case” for their variable names, where words are separated using capital letters. For example, CamelCase or TempF. This is allowed, but is not preferred in Python scripts.
Table 2.1.t2 - Types of possible variable names to avoid.
Type | Notes | Example |
---|---|---|
keywords | Not allowed and will produce an error. | and = 1 |
built-in constants | Not allowed and will produce an error. | False = 2 |
names that start with a number | Not allowed and will produce an error. | 1abc = 3 |
built-in functions | Allowed (sometimes) but strongly discouraged. Do not use. | len = 5 |
names with *, **, or other operators | Allowed in some cases, but will usually produce an error. Do not use. | **a = 1 |
names with l, O, 0, 1 | Allowed but discouraged. Produces confusion. | l0O1llO0 = "what?" |
uppercase letters | Allowed but discouraged, unless defining a “constant” variable, a class, or other limited situations. | MY_VARIABLE = 3.14 #??? vs PI = 3.14 #OK |
long names exceeding 79 characters | Allowed but discouraged. Produces confusion. | this_is_a_really_long_variable_name = False |
names with one or two letters | Allowed but discouraged, unless counting or is self describing. | x = 9.8066 #??? vs. G = 9.8066 #gravity |
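If you are unsure whether a name falls into one of the "not allowed" rows of Table 2.1.t2, one possible check is sketched below. It uses the standard library keyword module and the string method isidentifier. The list of candidate names and the dictionary usable are made up for this example, and the loop syntax is covered later in the course. Note that this check only catches names that produce an error. It does not flag discouraged names such as len.

```python
import keyword

# Hypothetical candidate names, taken from the examples in Table 2.1.t2.
candidate_names = ["and", "False", "1abc", "len", "month"]

usable = {}
for name in candidate_names:
    # A name must be a valid identifier and must not be a reserved keyword.
    usable[name] = name.isidentifier() and not keyword.iskeyword(name)
    print(name, "-> can be used as a variable name:", usable[name])
```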
Here is a demonstration of defining a variable.
month = 8
You can see on the left side of the =, there is a variable name that adheres to the requirements in Table 2.1.t2. On the right side of the =, there is an int. As you may recall from section 2.0.4, the right side of the = is evaluated before the left side. Thus, the steps going on in the background are (very generally) as follows:
- Python determines if there are any syntax errors (errors in the code). If so, it does not run the code and provides an error message instead.
- Python prepares a memory location that will be represented for the programmer by the variable name month. The specific details are determined by the data type.
- The value on the right is evaluated until it becomes a basic data type (e.g., int). In other words, if there are mathematical or other operations, they will first be processed and reduced to their most basic representation. In this case, it was already as simple as possible. In more advanced situations, you can define variables as functions or objects, but we will discuss this later in the course.
- The memory location represented by the variable name month is assigned the int value of 8.
- Following rule 1 (section 2.0.4), month (and its corresponding memory location) will be equal to 8 until the program changes its value or the program is closed.
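The steps above can be observed with a small variation on the example. In this sketch, the right side is an operation (4 + 4) instead of a plain number. The operation is reduced to the int value 8 before the name month is assigned, and month keeps that value on later lines.

```python
month = 4 + 4       # the right side is reduced to 8, then assigned to month

print(month)        # 8
print(type(month))  # <class 'int'>
print(month)        # still 8, because nothing has reassigned it
```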
As was discussed in class, the “memory location” is within the computer’s Random-Access Memory (RAM), and is not automatically saved to your hard drive. Once the program is closed, or the computer is turned off, the value associated with the variable is lost. This is a very common action (how often do you close programs? or turn your computer off?), so using Python code to “save” the steps to reproduce defining the variable is important for allowing your ideas, thoughts, and workflow to persist through more than one program / computer cycle. People often like to think of code as a recipe--as long as you know how Grandma Goody made her cookies, you can make them as many times as you want. The alternative is to just eat them once and never know how to make them again!
Below are working Python code examples of setting 4 variables with 4 operations, and then displaying (“printing”) the value for each variable.
sum_result = 1 + 2
diff_result = 1 - 2
prod_result = 1 * 2
quot_result = 1 / 2
print("The sum of 1 and 2 is", sum_result)
print("The difference of 1 and 2 is", diff_result)
print("The product of 1 and 2 is", prod_result)
print("The quotient of 1 and 2 is", quot_result)
The sum of 1 and 2 is 3
The difference of 1 and 2 is -1
The product of 1 and 2 is 2
The quotient of 1 and 2 is 0.5
Chapter 2.1.3 - Counting with Integers
Many tasks require you to define and maintain a variable that keeps track of a count. For example, you might want to keep track of how often an EF5 tornado occurs in a list of tornado events.
Imagine that you have a Python program written that can determine if an entry in a tornado dataset is an EF5 tornado or not. The specifics of that program are not important right now, except for the variable that helps you keep track of the number of EF5 tornadoes.
The typical pattern for counting is to start at zero. Why? Logically, we start with the assumption of zero EF5 tornadoes and initialize the counting variable by setting it equal to a value of 0.
ef5_tornado_count = 0
Now, we have code that runs and checks each line in a database. Let’s imagine that there was a detection of an EF5 tornado, and we need to increase the count by 1 (increment). How would we do that? You might say that we just set the counting variable to 1:
ef5_tornado_count = 1
This works for changing the variable from 0 to 1, but what about from 1 to 2?
ef5_tornado_count = 2
Let’s say that there are 1 million events detected. Would we really have to type out something like the following code and every int before this?
ef5_tornado_count = 1000000
Instead, think of the logic associated with each time the count is increased. That specific process only requires that you answer the following two questions:
- What is the current count?
- How do I add 1 to the current count?
We have already explored how we would add 1 to a variable and print the result:
x = 5
print(x + 1)
6
However, we have not covered how to modify the value of a variable. Examine the following code. Can you explain why the code produces the output that it does?
x = 5
print(x + 1)
print(x)
6
5
The print function does exactly what you think it does (OK, it does not print something out of the printer in the hall). It simply processes the code within the parentheses and then displays the result below the Python code. It does not modify the variable in any way because there is no = operator.
Rule 1: values are not changed unless you use an equal sign (=) to set a new value.
Can you explain each line of output below? What Python operations are performed to produce the output?
x = 1
print(x)
x = 2
print(x)
1
2
If we want to keep track of the count and add one to the count, we need to use Rule 1 and Rule 2.
Rule 2: everything to the right of an equal sign (=) is evaluated first.
If the value does not change until we use an equal sign, and the right side is evaluated first, we can redefine a variable using an operation that includes a variable that appears on both sides of the = operator! Since the right side is evaluated first (Rule 2), we just need to add 1 to the current count using a mathematical operation, and then use that result to modify the variable (Rule 1).
The two rules are why the following code is possible and we do not have to manually type out every possible int. All we need to know is how to add 1 to a variable and how to redefine that variable.
ef5_tornado_count = 0
print("Number of EF5 tornadoes:", ef5_tornado_count)
ef5_tornado_count = ef5_tornado_count + 1 # rule 2 + rule 1
print("Number of EF5 tornadoes:", ef5_tornado_count)
ef5_tornado_count = ef5_tornado_count + 1 # rule 2 + rule 1
print("Number of EF5 tornadoes:", ef5_tornado_count)
ef5_tornado_count = ef5_tornado_count + 1 # rule 2 + rule 1
print("Number of EF5 tornadoes:", ef5_tornado_count)
Number of EF5 tornadoes: 0
Number of EF5 tornadoes: 1
Number of EF5 tornadoes: 2
Number of EF5 tornadoes: 3
However, everything you assign to must be a variable name standing on its own to the left of an =. An expression such as a + 1 cannot be assigned to, so the following line fails:
a = a + 1 = a + 1
Cell In[12], line 1
a = a + 1 = a + 1
^
SyntaxError: cannot assign to expression
If you are wondering why this is the case, consider the following “variable” definition. It is not clear what variable is being redefined on the left side. Thus, Python has strictly outlawed this possibility.
a = 1
b = 2
a + b = 3
Cell In[13], line 4
a + b = 3
^
SyntaxError: cannot assign to expression here. Maybe you meant '==' instead of '='?
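Returning to the counting pattern from this section, the sketch below shows how the increment line might sit inside a program that checks many events. The list ef_ratings holds made-up values (not real tornado data), and the for and if statements are covered later in the course. The only line that matters for now is the one that adds 1 to the count.

```python
# Hypothetical EF ratings for six tornado events (made-up values).
ef_ratings = [5, 3, 0, 5, 1, 5]

ef5_tornado_count = 0  # start the count at zero
for rating in ef_ratings:
    if rating == 5:
        # rule 2 (evaluate the right side) + rule 1 (assign the new value)
        ef5_tornado_count = ef5_tornado_count + 1

print("Number of EF5 tornadoes:", ef5_tornado_count)  # Number of EF5 tornadoes: 3
```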
Chapter 2.1.4 - float
Full name: floating-point
Python keyword: float
Python data type group: numeric
Floats (floating point numbers) are numbers with fractional parts and can be positive or negative. They are used whenever decimal precision is needed.
Unlike int, floats need to be able to represent decimals that can have an infinite number of digits. As such, floats are only as accurate as the level of precision that is implemented in the programming language. In Python, floats are 64-bit “doubles” (double precision) by default. This gives you about 15 useful digits, which is more than enough for the activities in this course. Extra precision can be gained by storing a value as a ratio of two int values, which have unlimited precision, for example with fractions.Fraction(1, 3) from the standard library instead of 0.3333333333333333. Note that typing 1 / 3 on its own still produces a float. While floats might appear to be continuous, they actually occur at discrete intervals, and the spacing between neighboring values grows as the magnitude of the number grows.
For example, if we printed out the decimal 0.1, we would get the following:
a = 0.1
print(a)
0.1
Which is what we would expect!
However, this is a truncated version of the actual floating-point value, which is the nearest interval value to 0.1. If we print out the full floating-point number, we see an unintuitive result. The structure of the Python code below will be discussed later in the section on strings. In short, the code asks the computer to “give me the float number with 20 significant digits instead of the shortened version”.
a = 0.1
b = 0.2
c = 0.3
print("a (truncated) =", a, "a (20 digit precision) =", f"{a:.20g}")
print("b (truncated) =", b, "b (20 digit precision) =", f"{b:.20g}")
print("c (truncated) =", c, "c (20 digit precision) =", f"{c:.20g}")
a (truncated) = 0.1 a (20 digit precision) = 0.10000000000000000555
b (truncated) = 0.2 b (20 digit precision) = 0.2000000000000000111
c (truncated) = 0.3 c (20 digit precision) = 0.2999999999999999889
You will notice that this “rounding error” slightly changes depending on the number and can be slightly higher or lower than the expected number. This is due to how decimals are represented: there is a local interval of existing decimals depending on the precision of the float in the programming language (e.g., float16, float32, float64, etc., will all have different intervals), and the actual number is just the closest existing number. However, you usually do not need 20 digits of precision, so these unintuitive “rounding errors” do not impact your workflow, most of the time.
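One way to look at the size of these intervals is the math.ulp function, which reports the gap between a float and the next larger float. This is a sketch that assumes Python 3.9 or newer, where math.ulp was added. The variable names are made up for this example.

```python
import math

gap_small = math.ulp(0.1)   # gap between neighboring floats near 0.1
gap_one = math.ulp(1.0)     # gap near 1.0
gap_large = math.ulp(1e16)  # gap near 10,000,000,000,000,000

print(gap_small)
print(gap_one)
print(gap_large)  # 2.0

# Near 1e16 the gap is 2.0, so adding 1.0 lands back on the same float.
print(1e16 + 1.0 == 1e16)  # True
```

The gaps grow with the size of the number, which is why the “rounding error” is not the same for every value.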
Here is an example of a simulated floating-point interval.
0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1
If a programmer provides the number 0.1001, the float representation would be 0.1. This is a very simplified version, but it gives you an idea of what is happening behind the scenes once you start examining many decimal places of precision.
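The simulated interval can be imitated with the built-in round function. This is only a stand-in for the idea: real floats are spaced in binary steps, not in steps of 0.1. The variable names are made up for this example.

```python
# Snap values to the nearest step of 0.1, like the simulated interval above.
provided = 0.1001
stored = round(provided, 1)
print(provided, "is stored as", stored)  # 0.1001 is stored as 0.1

another = 0.26
print(another, "is stored as", round(another, 1))  # 0.26 is stored as 0.3
```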
There are some considerations that you need to keep in mind, including:
- Because of rounding errors, float numbers that seem to be equal may actually have very small differences. int values can be directly compared (i.e., are these two int exactly equal?), but float numbers must be compared to a given level of precision (i.e., are these two float equal within +/- 0.00001?). For this reason, you should use int for counting and other situations where numbers need to be directly compared (e.g., are there exactly 10 EF5 tornadoes?).
- If several calculations are performed, the errors can begin to accumulate, and two seemingly identical approaches can start to diverge at fewer and fewer decimal places.
Here is an example where the floating-point “rounding errors” from 0.1 accumulate from multiple additions, and 0.3 is rounded down to the nearest floating-point interval, causing a and b to be slightly different, which is an unexpected result!
a = 0.3
b = 0.1 + 0.1 + 0.1
print(f"a = {a:.20g}", f"b = {b:.20g}")
print("is a equal to b?", a == b)
a = 0.2999999999999999889 b = 0.30000000000000004441
is a equal to b? False
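The same accumulation shows up when 0.1 is added ten times. The sketch below uses a loop (covered later in the course) and a made-up variable named total. You might expect the total to be exactly 1.0, but the small errors add up and the direct comparison fails.

```python
total = 0.0
for _ in range(10):
    total = total + 0.1  # each addition carries a tiny rounding error

print(f"total = {total:.20g}")
print("is total equal to 1.0?", total == 1.0)  # False
print("is total within 0.00001 of 1.0?", abs(total - 1.0) < 0.00001)  # True
```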
To compare these types of numbers, you need to use a function like isclose. We will talk about functions in great detail later in the course, but the basic explanation is that a function packages several lines of code into one line of code to make it easier for a programmer to use.
from math import isclose
print(f"a = {a:.20g}", f"b = {b:.20g}")
print("is a almost equal to b?", isclose(a, b))
a = 0.2999999999999999889 b = 0.30000000000000004441
is a almost equal to b? True
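Two possible ways to handle this comparison are sketched below. The first sets the tolerance for isclose explicitly using its abs_tol parameter (by default, isclose uses a relative tolerance of 1e-09). The second avoids the rounding error entirely by storing each value as a ratio of two int values with fractions.Fraction from the standard library. The variable names are made up for this example.

```python
from fractions import Fraction
from math import isclose

float_sum = 0.1 + 0.1 + 0.1

# Option 1: compare floats within a tolerance.
close_default = isclose(float_sum, 0.3)               # default tolerance
close_abs = isclose(float_sum, 0.3, abs_tol=0.00001)  # within +/- 0.00001
not_close = isclose(float_sum, 0.31)                  # clearly different
print(close_default, close_abs, not_close)  # True True False

# Option 2: use exact fractions, which can be compared directly.
exact_sum = Fraction(1, 10) + Fraction(1, 10) + Fraction(1, 10)
print(exact_sum == Fraction(3, 10))  # True
```

Fractions are exact but slower and less convenient than floats, so a tolerance is usually enough for the activities in this course.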
Chapter 2.1.5 - Operations using both float and int
Python 3 will automatically convert from int to float when needed. This means you do not have to worry about the specific details and differences between int and float, except that the result will be a float whenever an operation mixes the two types or uses division (/).
In one common situation, you might have mixed data types. In this case a is an int and b is a float. You can quickly tell the difference due to the lack or presence of decimals when printed. The result is a float, so all of the caveats associated with that data type must be considered.
a = 1
b = 1.5
print("a =", a, "b =", b)
print("a + b =", a + b)
a = 1 b = 1.5
a + b = 2.5
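To confirm which data type each result has, the sketch below uses the built-in type function. The result names are made up for this example. An operation that only involves int values (other than division) stays an int, while bringing in a float produces a float.

```python
a = 1
b = 1.5

int_only = a + a   # int + int
mixed_sum = a + b  # int + float
mixed_prod = a * b # int * float

print(int_only, type(int_only))      # 2 <class 'int'>
print(mixed_sum, type(mixed_sum))    # 2.5 <class 'float'>
print(mixed_prod, type(mixed_prod))  # 1.5 <class 'float'>
```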
The next situation might seem unexpected at first, but makes sense when you think about the end result. You can have an equation that includes no float numbers, but produces a float result. This happens when you divide two or more int numbers. It is possible that the result is not a whole number, and thus, needs to be represented with decimals.
a = 1
b = 3
c = 4
d = 6
e = 0
f = 12
# All integers
my_mean = (a + b + c + d + e + f) / 6
print(my_mean)
4.333333333333333
The result is the same if you use all float numbers:
a = 1.0
b = 3.0
c = 4.0
d = 6.0
e = 0.0
f = 12.0
# All floats
my_mean = (a + b + c + d + e + f) / 6
print(my_mean)
4.333333333333333
It is also the same if you use a mixture of int and float numbers:
a = 1
b = 3.0
c = 4
d = 6.0
e = 0
f = 12.0
# Mixture of integers and floats
my_mean = (a + b + c + d + e + f) / 6
print(my_mean)
4.333333333333333
In the geosciences, you will be working with float numbers often. For example, you may want to convert between temperature scales or test if an earthquake exceeds a particular magnitude. Thus, it is important that you remember the quirks associated with this data type.
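A minimal sketch of both tasks is shown below. All of the values and variable names are made up for illustration, and the comparison on the last lines uses an operator that is covered in the next section.

```python
# Convert a hypothetical temperature from Celsius to Fahrenheit.
temp_c = 21.5
temp_f = temp_c * 9 / 5 + 32
print("Temperature in Fahrenheit:", temp_f)

# Test whether a hypothetical earthquake exceeds a chosen magnitude.
magnitude = 6.1
threshold = 6.0
exceeds_threshold = magnitude > threshold
print("Exceeds threshold?", exceeds_threshold)  # True
```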
Chapter 2.1.6 - bool
Full name: boolean
Python keyword: bool
Python data type group: numeric
Booleans are used for testing conditions and can be one of two values: True or False. In Python, bool is a special subclass of int that associates the value 0 with False and the value 1 with True:
print("Is 1 equal to True?", 1 == True)
print("Is 0 equal to False?", 0 == False)
Is 1 equal to True? True
Is 0 equal to False? True
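The relationship between bool and int can be checked directly with the built-in isinstance and issubclass functions. This sketch also shows that True behaves like the number 1 in arithmetic. The variable names are made up for this example.

```python
true_is_int = isinstance(True, int)    # is True also an int?
bool_from_int = issubclass(bool, int)  # is bool built on top of int?
true_plus_true = True + True           # True acts like 1 in math

print(true_is_int)     # True
print(bool_from_int)   # True
print(true_plus_true)  # 2
print(type(True))      # <class 'bool'>
```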
However, you should not use bool in place of int.
bool values are used for a logical test or combinations of logical tests.
Tests are performed using comparison operators, which produce a bool result.
There are 6 common comparison operators (Table 2.1.t3):
Table 2.1.t3 - Common comparison operators
Operator | Description | Example of True | Example of False |
---|---|---|---|
== | exactly equal to | 1 == 1 | 1 == 2 |
!= | not equal to | 1 != 2 | 1 != 1 |
< | less than | 1 < 2 | 2 < 1 |
<= | less than or equal to | 1 <= 1 | 2 <= 1 |
> | greater than | 2 > 1 | 1 > 2 |
>= | greater than or equal to | 1 >= 1 | 1 >= 2 |
Code example: Is a number larger than another number?
condition = 2 > 1
print(condition)
True
condition = 2 < 1
print(condition)
False
Code examples: Compare two variables
a = 1
b = 2
print(a > b)
print(a >= b)
print(a < b)
print(a <= b)
print(a == b)
print(a != b)
False
False
True
True
False
True
For more complex situations, we can combine multiple boolean tests using bitwise operators:
- & - the “and” operator. This requires the right and left side to be True for the result to be True. Otherwise, the result is False.
- | - the “or” operator. This requires at least one side to be True for the result to be True. Otherwise, the result is False.
When we combine multiple tests, we need to put parentheses around tests in a way that assures comparisons are done in the order we want. The order of tests starts from the “innermost” parentheses like you would expect in a mathematical expression.
We will learn more about tests and combination operators when we discuss if, elif, else statements.
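The sketch below shows why the parentheses matter. Python applies & and | before comparison operators such as ==, so leaving the parentheses out changes which values are compared. The variable names are made up for this example. The last line uses the Python keyword or, which is applied after comparisons and therefore gives the expected answer without parentheses.

```python
a = 1
b = 2

# With parentheses: each test is evaluated first, then combined.
with_parentheses = (a == 1) | (b == 5)  # True | False
print(with_parentheses)  # True

# Without parentheses: Python reads this as a == (1 | b) == 5.
without_parentheses = a == 1 | b == 5
print(without_parentheses)  # False

# The keyword "or" is applied after the comparisons.
with_keyword = a == 1 or b == 5
print(with_keyword)  # True
```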
Ungraded exercise: Run the following code. Identify each individual boolean test and set them equal to the variables d, e, f, and g. Use these new variables as replacements for the boolean tests to reproduce the output. Next, try to modify the values for a, b, and c, as well as the boolean tests and bitwise operators to produce different results.
a = 1
b = 2
c = 3
print((a >= 1) & (b > c))
print((a >= 1) | (b > c))
False
True
Chapter 2.1.7 - Summary
Primitive data types form the foundation of most programming languages, including Python. Becoming comfortable with these data types will help you to better understand more complex topics in Python.