Strings are any text identified in Python code within "" or ''
Full name: string
Python keyword: str
Python data type group: text sequence
Chapter 2.2.1 - String overview
Strings are our first introduction to composite (sometimes referred to as sequence) data types. These data types combine multiple values into an ordered sequence. Depending on the sequence type, those values can include int, float, bool, and other primitive data types.
Unlike statically typed languages such as C and C++, Python is dynamically typed. Values have types in both Python and C++, but in Python a variable can be reassigned to a value of a different type at any time. This can be useful for beginning programmers who might have trouble dealing with the extra static typing syntax and behavior. However, it can also cause hidden issues in your program: you might expect a counter to only ever hold an int, but if at some point during the program you assign it a str, Python will not stop you from doing so. Types in Python are only checked while the program runs, when an operation that cannot handle its values raises an error (e.g., “adding” an int and a str).
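The behavior described above can be sketched in a few lines. This is only an illustration: the variable names are made up, and it uses try/except (not covered in this chapter) simply so the example runs to the end instead of stopping at the error.

```python
# A variable is just a name; it can be reassigned to a value of another type.
counter = 0
print(type(counter).__name__)    # int

counter = "zero"                 # Python does not stop this reassignment
print(type(counter).__name__)    # str

# The type mismatch only surfaces when an operation cannot handle it.
try:
    total = counter + 1          # str + int is not defined
except TypeError as error:
    message = str(error)
    print("TypeError:", message)
```

The error appears on the line that tries to add, not on the line that changed the type, which is why these bugs can be hard to trace.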
Strings are a special case of sequence data types because every item in the sequence must be a character: a letter, digit, punctuation mark, space, or other symbol. We will learn about other sequence data types that do not enforce this requirement later in this chapter.
We call these values “characters” (technically, Unicode code points) to match the data type name in other common languages (e.g., char in C++, Java, etc.). Python has no separate character type; a single character is simply a str of length 1. Strings, therefore, are sequences of characters, where each character occupies a specific position (index) in that sequence. For example, the first character in this sentence is “F”.
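As a minimal sketch of what “position in a sequence” means, the example below looks up characters by index. The variable names are illustrative, and it assumes Python's convention that indexes start at 0.

```python
sentence = "For example, the first character in this sentence is F."

first = sentence[0]          # indexing starts at 0
last = sentence[-1]          # negative indexes count from the end
length = len(sentence)       # number of characters in the sequence

print(first)                 # F
print(last)                  # .
print(type(first).__name__)  # str: a single character is a length-1 string
print(ord(first))            # 70, the Unicode code point for "F"
```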
There are multiple ways to define strings:
a = "double quotes"
a = 'single quotes'
a = """triple quotes"""
A string in single or double quotes must fit on one line, whereas triple quotes let your definition span multiple lines.
a = "double quotes"
b = 'single quotes'
c = """
triple
quotes
work
on
multiple
lines
"""
print(a)
print(b)
print(c)
double quotes
single quotes
triple
quotes
work
on
multiple
lines
Chapter 2.2.2 - String rules
The following rules will help you to effectively use strings in Python:
Rule #1
You must “close” a string using a starting " and ending " (alternatively a starting ' and ending ', or """ and """ for multiple lines). Whatever you choose, you must open and close with the same symbol (e.g., do not start with " and end with ').
Rule #2
You cannot add an int or float to a str without first casting (type converting) those values.
Rule #3
If you need to format your strings or insert values, use an f-string to do so. Python has other formatting tools, but f-strings are the approach used in this chapter.
Related to Rule #2, you may be surprised to learn that we can “add” strings much like int or float values, as long as both variables are strings. Adding two strings joins them end to end (concatenation).
Fix the code below to print out Huskies #1.
string1 = "Huskies #"
string2 = 1
string3 = string1 + string2
print(string3)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[13], line 4
1 string1 = "Huskies #"
2 string2 = 1
----> 4 string3 = string1 + string2
6 print(string3)
TypeError: can only concatenate str (not "int") to str
Simple concatenation only works when both values are strings. When they are not, we typically use one of these approaches:
- casting / type conversion: converting int or float into str
- format string / f-string: defining where and how to insert an int or float into a str (see Rule #3)
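The two approaches in the list above can be sketched side by side. The values here are hypothetical and deliberately different from the exercise, so you can still work that one out yourself.

```python
label = "Room "
number = 204                      # hypothetical values for illustration

# Approach 1: cast the int to str, then concatenate
with_cast = label + str(number)

# Approach 2: let an f-string insert the value
with_fstring = f"{label}{number}"

print(with_cast)                  # Room 204
print(with_fstring)               # Room 204
```

Both lines produce the same text; the f-string tends to be easier to read once several values are involved.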
Chapter 2.2.3 - Defining strings (Rule #1)
This is the first thing you should check if you are debugging / troubleshooting code with strings.
In the following example, I try to set a variable to a string value, but I do not close the end of the string with ". You get a SyntaxError, which means the code breaks Python's basic grammar rules. When you see “EOL while scanning string literal”, the interpreter reached the end of the line (EOL) while still expecting a closing ". Python 3.10 and later report the same problem as “unterminated string literal”. This error stops your code from running at all.
a = "Test
Cell In[4], line 1
a = "Test
^
SyntaxError: EOL while scanning string literal
You can also get an error if you mix up your " and ':
a = "Test'
Cell In[5], line 1
a = "Test'
^
SyntaxError: EOL while scanning string literal
Be consistent!
a = "NIU"
b = 'NIU'
print(a)
print(b)
NIU
NIU
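Being consistent applies to the opening and closing symbols only. As a small sketch (the example text is made up), the other quote character can appear inside the string as ordinary text, which is a common reason to pick one style over the other. The backslash escape in the last definition is not covered in this chapter and is shown only for comparison.

```python
# The other quote character is just ordinary text inside the string.
apostrophe = "Let's go"            # ' inside "..."
quotation = 'She said "hello"'     # " inside '...'

# A backslash escapes a quote that matches the delimiters.
escaped = 'Let\'s go'

print(apostrophe)                  # Let's go
print(quotation)                   # She said "hello"
print(apostrophe == escaped)       # True
```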
Why does the following not work?
a = Test
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[17], line 1
----> 1 a = Test
NameError: name 'Test' is not defined
Chapter 2.2.4 - String conversions (Rule #2)
This is a rule across most programming languages, and is a subset of a more general rule that data types have different purposes and abilities. It is up to the programmer to make sure the right data type is being used. If the correct data type is not in place, casting (type conversion) is required.
In the case of float and int, the conversion is straightforward:
int_example = 1
float_example = 1.5
int_to_str_example = str(int_example)
float_to_str_example = str(float_example)
print(int_to_str_example)
print(float_to_str_example)
1
1.5
The original int and float values are incompatible with strings. If you try to add one to a string without casting, you get a TypeError, which indicates that an operation received a value of the wrong data type.
NOTE: notice how the variables defined in the code cell above are able to be used by the code below. This is true as long as the code cell above is “run” before the code cells below OR if those variables are defined before the examples below.
combined_str = "The answer is " + int_example
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 combined_str = "The answer is " + int_example
TypeError: can only concatenate str (not "int") to str
combined_str = "The answer is " + float_example
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[10], line 1
----> 1 combined_str = "The answer is " + float_example
TypeError: can only concatenate str (not "float") to str
However, after casting, the variables are now also str and can be added to another string:
combined_str = "The answer is " + int_to_str_example
print(combined_str)
The answer is 1
combined_str = "The answer is " + float_to_str_example
print(combined_str)
The answer is 1.5
In some cases, a string can be cast to an int or float. This only works if the string is “obviously” a number, that is, its text reads as a valid number (anything else raises a ValueError):
str_number = "1.452"
str_to_float_number = float(str_number)
print(str_to_float_number)
1.452
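To show what happens when the string is not “obviously” a number, here is a minimal sketch. The helper to_float_or_none is hypothetical (it is not a built-in), and it uses def and try/except, which this chapter has not introduced, only so that every case can run in one cell.

```python
def to_float_or_none(text):
    """Return float(text), or None when text is not a valid number.

    Hypothetical helper for illustration only.
    """
    try:
        return float(text)
    except ValueError:
        return None

print(to_float_or_none("1.452"))   # 1.452
print(to_float_or_none(" 7 "))     # 7.0 (surrounding spaces are ignored)
print(to_float_or_none("1,452"))   # None (commas are not allowed)
print(to_float_or_none("one"))     # None

# int() is stricter than float(): a decimal point is not allowed.
try:
    int("1.452")
except ValueError as error:
    print("ValueError:", error)
```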
Did the following code work as expected? If not, what do you have to do to make it work?
str_number_1 = "1.452"
str_number_2 = "3.134"
result = str_number_1 + str_number_2
print(result)
1.4523.134
Chapter 2.2.5 - Formatting strings (Rule #3)
A basic string is defined as in the examples above:
a = "test"
print(a)
test
If you have multiple variables to print, you can pass them to print() separated by commas; print() puts a space between the values:
a = "Let's"
b = "Go"
c = "Huskies!"
print(a, b, c)
Let's Go Huskies!
This works well with strings, but can produce undesirable output when using other data types, particularly float:
a = 1
b = 3
c = a / b
print(a, "/", b, "=", c)
1 / 3 = 0.3333333333333333
Notice that we have many decimal places.
Often you are required to limit the display to two decimal places.
There is no obvious way to make this work with what we have learned so far.
f-string syntax
f-strings (formatted strings) can take other data types, format them in a predictable way, and then output the result.
There are only two main differences between a normal string and an f-string:
- You must put an f in front of the first ".
- You must place “{ }” in the string where you would like to insert a value.
An example is shown below, where you insert a, b, and c like above into a string and then print the string:
a = 1
b = 3
c = a / b
result = f"{a} / {b} = {c}"
print(result)
1 / 3 = 0.3333333333333333
This is the most basic way we can insert variables into a string. While this is not particularly useful, we will see the utility in subsequent examples.
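One immediate benefit, sketched below, is that an f-string converts the inserted value to text for you, so the explicit cast from Rule #2 is not needed. The variable names reuse the earlier conversion example; the expression inside the braces on the last definition is an extra illustration beyond what the text above covers.

```python
int_example = 1
float_example = 1.5

# Concatenation needs an explicit cast...
with_cast = "The answer is " + str(int_example)

# ...while an f-string converts the value for you.
with_fstring = f"The answer is {int_example}"

# The braces can also hold an expression, not just a variable name.
total = f"{int_example} + {float_example} = {int_example + float_example}"

print(with_cast == with_fstring)   # True
print(total)                       # 1 + 1.5 = 2.5
```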
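Formatting numbers
When inserting numbers (int, float) into a string, you can control how they are displayed by providing up to three pieces of information within the braces ({ }):
- The variable you wish to insert
- A colon followed by the formatting code (e.g., .2 for two decimal places)
- The presentation type for the value (e.g., f to display a float in fixed-point decimal notation)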
The most common formatting task is to take a float and display a fixed number of decimal places. Formatting rounds the displayed text; the value stored in the variable is unchanged.
A basic template for this process can be seen below, where precision is the count of decimal places (try changing the precision variable from 2 to 3):
variable = 1.66666666666
precision = 2
formatted_float = f"{variable:.{precision}f}"
print(formatted_float)
1.67
You can also write the precision directly in the formatting code:
variable = 1.66666666666
formatted_float = f"{variable:.2f}"
print(formatted_float)
1.67
For the a, b, c example above, apply the same formatting code to c:
a = 1
b = 3
c = a / b
result = f"{a} / {b} = {c:.2f}"
print(result)
1 / 3 = 0.33
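The same colon syntax supports other formatting codes. The sketch below shows a few common ones from Python's format specification; the variable names and values are made up for illustration, and this is not a complete list.

```python
population = 1234567.891
ratio = 0.256
count = 42

with_commas = f"{population:,.2f}"   # thousands separators, 2 decimals
as_percent = f"{ratio:.1%}"          # multiply by 100 and append %
padded = f"{count:5d}"               # right-aligned in a field 5 wide
zero_padded = f"{count:05d}"         # same width, padded with zeros

print(with_commas)                   # 1,234,567.89
print(as_percent)                    # 25.6%
print(repr(padded))                  # '   42'
print(zero_padded)                   # 00042
```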
Chapter 2.2.6 - Student Summarization
Use markdown cells to describe the code and use code cells to include code examples from the slideshow that summarize string usage.
# code cell
markdown cell