Composite data types can be indexed to access the primitive data types that they contain.
For example, if we define a string:
a = "text"
print(a)
text
We know that a is actually a composite of 4 characters:
a = "t" + "e" + "x" + "t"
print(a)
text
You can access the different characters in the string by using “indexing”.
Indexing is a way of accessing data within a composite data type.
To do this, you need to first define a composite data type (primitive data types cannot be indexed):
a = "text"
To access one character in the variable a, you need to do the following:
- place “an open square bracket” immediately after the variable name:
a[
- place an integer or a variable representing an integer immediately after the open square bracket:
a[0
- place “a closed square bracket” immediately after the integer:
a[0]
In English, a[0] means “give me the first character in the string variable a”.
In Python and many other programming languages, indexing starts counting at 0, so “first” is equivalent to “index 0”, “second” is equivalent to “index 1”, etc.
Examples
The following code prints the first character in variable a:
a = "text"
print(f"This is the first character in '{a}':", a[0])
This is the first character in 'text': t
The following code prints the second character in a:
print(f"This is the second character in '{a}':", a[1])
This is the second character in 'text': e
The following code prints the third character in a:
print(f"This is the third character in '{a}':", a[2])
This is the third character in 'text': x
The following code prints the fourth character in a:
print(f"This is the fourth character in '{a}':", a[3])
This is the fourth character in 'text': t
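The steps earlier in this section also allow a variable that holds an integer in place of a literal number. A minimal sketch, where the variable names i and third are only illustrative choices:

```python
a = "text"
i = 2  # illustrative index variable; any int variable works the same way

# a[i] is evaluated exactly like a[2]
third = a[i]
print(f"Character at index {i} in '{a}':", third)
```

Changing the value of i changes which character is returned, without editing the indexing expression itself.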
Chapter 2.4.1 - Indexing rules for one item
These three rules will help you to use indexing correctly, and to debug your code when you do not use it correctly:
Rule 1: You must “invoke” an index call (that is, tell Python you plan to do something more complex) by starting with a [ and ending with a ].
Rule 2: You must place an int or int variable within the square brackets. By convention, no whitespace is placed around it.
Rule 3: The int or int variable must be a number that is less than the count of items in the composite data type. Because indexing starts at 0, the highest valid index is the count minus 1.
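Rule 3 can be checked in code with Python's built-in len function, which returns the count of items in a composite data type. A small sketch (the variable names count and last_index are illustrative, not part of the original text):

```python
a = "text"
count = len(a)          # number of characters in a
last_index = count - 1  # indexing starts at 0, so the last valid index is count - 1

print("count of items:", count)
print("last valid index:", last_index)
print("last character:", a[last_index])
```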
Here are some common problems you may run into when using indexing:
- BUG: You forget to close the brackets or do not use the brackets properly:
a[0
Cell In[9], line 1
a[0
^
SyntaxError: unexpected EOF while parsing
a{0}
Cell In[10], line 1
a{0}
^
SyntaxError: invalid syntax
a(0)
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[11], line 1
----> 1 a(0)
TypeError: 'str' object is not callable
a0
---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[12], line 1
----> 1 a0
NameError: name 'a0' is not defined
- BUG: You use a float or other non-int data type as the index:
a[1.0]
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[13], line 1
----> 1 a[1.0]
TypeError: string indices must be integers
a['b']
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[14], line 1
----> 1 a['b']
TypeError: string indices must be integers
- BUG: You use an int, but the index is equal to or larger than the count of items in the composite data type:
For example, variable a is set to “text”, which has 4 characters. Thus, your index cannot be higher than that of the “fourth item”, which is “index 3”.
print(a[3]) #totally fine!
t
print(a[4]) #out of range!
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
Cell In[16], line 1
----> 1 print(a[4])
IndexError: string index out of range
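One possible way to avoid the last two bugs is to convert the index to an int and to compare it with len before indexing. This is only a sketch of that idea; the variable names i and j are illustrative, and other approaches (such as catching the IndexError) would also work:

```python
a = "text"

i = int(1.0)  # int() converts the float 1.0 to the int 1
print(a[i])   # prints e

j = 4  # one past the last valid index (3)
if j < len(a):
    print(a[j])
else:
    print(f"index {j} is out of range for a string of length {len(a)}")
```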
Chapter 2.4.2 - Indexing for more than one item: slicing
Slicing composite data types allows you to access two or more values at the same time. The same indexing rules apply with some additional syntax.
Again, first define a composite data type:
a = "TEXT"
The following code gets the first character in a and assigns it to a variable named b:
b = a[0]
print(b)
T
To get multiple characters, use the following pattern:
variable_name[start:end]
- start is an int or int variable that defines the first index at which you would like to slice the composite data type (inclusive).
- end is an int or int variable that defines the index immediately after the last index you would like to slice (exclusive).
- : “invokes” the slicing functionality built in to composite data types.
Slicing rules add two sub-rules to rule #3:
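Rule 3.a: To select the items you intend, start should be less than the count of items in the composite data type, and end should be no greater than that count. Unlike single-item indexing, Python does not raise an IndexError for out-of-range slice values; it silently clips them to the ends of the data.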
Rule 3.b: All indexes represented by each integer between start and end, except the index equivalent to end, will be included in the slice.
In some math problems, the equivalent range of values would be defined using the square bracket (include this part of the range) and parentheses (do not include this part of the range): [start, end)
You can also think of it this way: if a variable named x represents the sliced indexes, then x takes every integer value that meets both of the following conditions: x >= start and x < end
Nice refresher on domain and range: https://
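The start-inclusive, end-exclusive range can be checked in code. The sketch below assumes the built-in range function, which follows the same convention; the names pieces and x, and the values chosen for start and end, are illustrative:

```python
a = "text"
start = 1
end = 3

# Build the slice one index at a time: range(start, end) yields start, ..., end - 1
pieces = ""
for x in range(start, end):
    pieces = pieces + a[x]

print(pieces)        # ex
print(a[start:end])  # ex
```

Both lines print the same result, and the number of characters in the slice equals end - start.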
If you want to access the first and second characters in a string, you would use the range 0:2, which means all indexes from 0 up to, but not including, 2:
a = "text"
a[0:2]
'te'
Why is a[0] equivalent to a[0:1]?
print("a[0] =", a[0])
print("a[0:1] =", a[0:1])
a[0] = t
a[0:1] = t
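One way to approach this question: Python has no separate character type, so a single character is itself a string of length 1, and both expressions produce the same str value. A sketch of the comparison, plus one case where the two behave differently (an empty string, chosen here only for illustration):

```python
a = "text"
print(a[0] == a[0:1])  # True
print(type(a[0]))      # <class 'str'>
print(type(a[0:1]))    # <class 'str'>

empty = ""
print(repr(empty[0:1]))  # '' : slicing an empty string gives an empty string
# empty[0] would instead raise IndexError: string index out of range
```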
You can start and end in the middle of a composite data type as well:
a = "Go Huskies!"
print(a[3:10])  # index 2 is the space, so start at index 3
Huskies
If you leave out the number on one or both sides of the :, it will result in one of three things:
- a[:end] - start at the first item (index 0) and get all items until (but not including) index end
- a[start:] - start at index start (including start) and go all the way to the end of the composite (including the last item)
- a[:] - access all items in the composite
This is often used in code and is not a “bug” or “accident”, but make sure it is what you need to do for your purposes.
a = "Go Huskies!"
print("a[:3] =", a[:3])
print("a[3:] =", a[3:])
print("a[:] =", a[:])
a[:3] = Go
a[3:] = Huskies!
a[:] = Go Huskies!
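Two properties of these patterns can be verified directly. The sketch below also shows that a slice value beyond the end of the string is clipped rather than raising an error, unlike single-item indexing. The variable n and the value 100 are illustrative choices:

```python
a = "Go Huskies!"
n = 3  # illustrative split point

# a[:n] and a[n:] split the string into two parts with nothing lost or repeated
print(a[:n] + a[n:] == a)  # True

# a[:] is the same as using the explicit values 0 and len(a)
print(a[:] == a[0:len(a)])  # True

# Out-of-range slice values are clipped to the ends of the string
print(a[3:100])  # Huskies!
```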
Chapter 2.4.3 - Practice Indexing and Slicing
Given the following composite variable D, answer the following questions in each cell using composite data type indexing and/or slicing. Do not just answer the question; provide code that outputs the answer using a print statement.
D = "GO HUSKIES"
What is the 3rd character in D?
What is the value at the 6th index in D?
How would I set a variable named t to the 8th character in D?
How would I output the string “HUGO” using the variable D?
How would I set a variable named u to the first, second, and third items in D at once using slicing?