Data Structures in Python

Author

Tony Duan

1 Built-in Data Structures

1.1 Scalar Values

Code
# Assign an integer value to variable a
a=1
# Print the type of variable a
type(a)
int
Code
# Assign a float value to variable a
a=1.3
# Print the type of variable a
type(a)
float
Code
# Assign a string value to variable a
a='hello'
# Print the type of variable a
type(a)
str
Code
# Assign a boolean value to variable a
a= True
# Print the type of variable a
type(a)
bool

1.2 Lists

Lists are ordered, mutable (changeable) sequences that can store items of different data types. They are defined by enclosing elements in square brackets [].

Code
# Define a list named a
a=[1,2,3]

# Print the list
a
[1, 2, 3]
Code
# Print the type of variable a
type(a) 
list
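
As a small illustration of the description above, the sketch below shows that a list can mix data types and can be changed in place. The variable name mixed is made up for this example and is not used elsewhere in this document.

```python
# Hypothetical example list holding several data types
mixed = [1, 2.5, 'three', True]

# Lists are mutable: replace the element at index 0
mixed[0] = 'one'

# Lists are ordered: each element keeps its position
print(mixed)     # ['one', 2.5, 'three', True]
print(mixed[2])  # three
```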
Code
# Define a list of fruits
fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana','apple']

1.2.1 Find the length of the list with len()

Code
# Print the length of the fruits list
len(fruits)
8

1.2.2 First 2 on the list

Code
# Get the first 2 elements of the fruits list
fruits[:2]
['orange', 'apple']

1.2.3 Last 2 on the list

Code
# Get the last 2 elements of the fruits list
fruits[-2:]
['banana', 'apple']

1.2.4 Count occurrences in the list with count()

Code
# Count the number of times 'apple' appears in the fruits list
fruits.count('apple')
3

1.2.5 Find a position in the list with index()

index() returns the position of the first 'apple'. Python list indices start at 0, so a result of 1 means the second element.

Code
# Find the index of the first occurrence of 'apple' in the fruits list
fruits.index('apple')
1

To find every position where 'apple' appears, combine enumerate() with a list comprehension:

Code
# Use a list comprehension to find all indices where the value is 'apple'
[index for index, value in enumerate(fruits) if value == 'apple']
[1, 5, 7]
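
One detail worth knowing: index() raises a ValueError when the value is not in the list. The sketch below wraps the call in a helper that returns -1 instead. The names fruits_demo and safe_index are hypothetical and exist only for this illustration; safe_index is not a built-in.

```python
# Illustrative copy of the fruits list
fruits_demo = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana', 'apple']

def safe_index(items, target):
    """Return the first index of target in items, or -1 if it is absent."""
    try:
        return items.index(target)
    except ValueError:
        # list.index() raises ValueError when the value is missing
        return -1

print(safe_index(fruits_demo, 'apple'))  # 1
print(safe_index(fruits_demo, 'mango'))  # -1
```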

1.2.6 Reverse the list

Code
# Reverse the order of elements in the fruits list in-place
fruits.reverse()
# Print the modified fruits list
fruits
['apple', 'banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange']

1.2.7 Sort the list

Code
# Sort the elements of the fruits list in-place alphabetically
fruits.sort()
# Print the modified fruits list
fruits
['apple', 'apple', 'apple', 'banana', 'banana', 'kiwi', 'orange', 'pear']
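
Both reverse() and sort() change the list in place. When the original order should be kept, the built-in sorted() returns a new list instead. The following is a minimal sketch; fruits_demo is a made-up list for this example.

```python
fruits_demo = ['orange', 'apple', 'pear', 'banana', 'kiwi']

# sorted() returns a new list and leaves the original untouched
alphabetical = sorted(fruits_demo)
# reverse=True sorts in descending order
descending = sorted(fruits_demo, reverse=True)
# key=len sorts by string length; items of equal length keep their original order
by_length = sorted(fruits_demo, key=len)

print(alphabetical)  # ['apple', 'banana', 'kiwi', 'orange', 'pear']
print(descending)    # ['pear', 'orange', 'kiwi', 'banana', 'apple']
print(by_length)     # ['pear', 'kiwi', 'apple', 'orange', 'banana']
print(fruits_demo)   # ['orange', 'apple', 'pear', 'banana', 'kiwi']
```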

1.2.8 Add element to the list

Code
# Add 'grape' to the end of the fruits list
fruits.append('grape')
# Print the modified fruits list
fruits
['apple',
 'apple',
 'apple',
 'banana',
 'banana',
 'kiwi',
 'orange',
 'pear',
 'grape']

1.2.9 Drop last element

Code
# Remove and return the last element from the fruits list
fruits.pop()

# Print the modified fruits list
fruits
['apple', 'apple', 'apple', 'banana', 'banana', 'kiwi', 'orange', 'pear']
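
pop() also returns the element it removes, and it accepts an optional index. The sketch below shows this together with insert() and remove(), two related list methods; fruits_demo is a made-up list for this example.

```python
fruits_demo = ['apple', 'banana', 'kiwi', 'pear']

last = fruits_demo.pop()        # removes and returns the last element
first = fruits_demo.pop(0)      # removes and returns the element at index 0
fruits_demo.insert(0, 'grape')  # inserts 'grape' at index 0
fruits_demo.remove('banana')    # removes the first occurrence of 'banana'

print(last)         # pear
print(first)        # apple
print(fruits_demo)  # ['grape', 'kiwi']
```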

1.2.10 List Comprehensions

Using a loop:

Code
# Initialize an empty list to store squares
squares = []
# Loop from 0 to 9
for x in range(10):
  # Append the square of x to the squares list
  squares.append(x**2)
  
# Print the squares list
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Using a list comprehension:

Code
# Create a list of squares using a list comprehension
squares = [x**2 for x in range(10)]
# Print the squares list
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
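
A list comprehension can also filter or transform conditionally. The sketch below shows both forms; the variable names are chosen for this example only.

```python
# Filter: keep only the squares of even numbers
even_squares = [x**2 for x in range(10) if x % 2 == 0]

# Conditional expression: choose a label for each number
labels = ['even' if x % 2 == 0 else 'odd' for x in range(4)]

print(even_squares)  # [0, 4, 16, 36, 64]
print(labels)        # ['even', 'odd', 'even', 'odd']
```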

1.2.11 List to Tuples

Code
# Convert the squares list to a tuple
tuple(squares)
(0, 1, 4, 9, 16, 25, 36, 49, 64, 81)

1.2.12 List to Set

Code
# Convert the squares list to a set
set(squares)
{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}

1.2.13 List to Dictionary

1.2.13.1 One list to dictionary

Code
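# Define a flat list of alternating keys and values
# (named pairs rather than list, so the built-in list type is not shadowed)
pairs = ['a', 1, 'b', 2, 'c', 3]

# Define a function to convert a flat list to a dictionary
def convert(lst):
    res_dict = {}
    # Step through the list two items at a time: key, then value
    for i in range(0, len(lst), 2):
        res_dict[lst[i]] = lst[i + 1]
    return res_dict

# Call the convert function with the list
convert(pairs)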
{'a': 1, 'b': 2, 'c': 3}
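
As an alternative to the explicit loop, the same result can be sketched with slicing and zip(). This assumes the list has an even number of items, alternating key then value; the name flat is made up for this example.

```python
flat = ['a', 1, 'b', 2, 'c', 3]

# flat[::2] takes the items at even indices (the keys),
# flat[1::2] takes the items at odd indices (the values)
result = dict(zip(flat[::2], flat[1::2]))

print(result)  # {'a': 1, 'b': 2, 'c': 3}
```

Note that zip() stops at the shorter input, so a trailing key without a value is silently dropped rather than raising an error.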

1.2.13.2 Two lists to dictionary

Code
# Define a tuple of keys
keys = ('name', 'age', 'food')

# Define a tuple of values
values = ('Monty', 42, 'spam')

# Create a dictionary by zipping the keys and values
dict(zip(keys, values))
{'name': 'Monty', 'age': 42, 'food': 'spam'}

1.3 Tuples

Tuples are ordered, immutable (unchangeable) sequences. They are defined by enclosing elements in parentheses ().

Code
# Define a tuple named fruits
fruits = ('orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana','apple')

# Print the tuple
fruits
('orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana', 'apple')
Code
# Print the type of the fruits tuple
type(fruits)
tuple

Tuples cannot be modified after creation: assigning to an element, for example fruits[0] = 'grape', raises a TypeError.
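
The sketch below illustrates immutability in practice: item assignment fails, while unpacking and building a new tuple both work. The names point, moved, and message are made up for this example.

```python
point = (3, 4)

# Item assignment is not allowed on a tuple
try:
    point[0] = 10
except TypeError as err:
    message = str(err)

# Unpacking reads the elements without modifying the tuple
x, y = point

# Concatenation creates a new tuple; the original is unchanged
moved = point + (5,)

print(message)  # 'tuple' object does not support item assignment
print(x, y)     # 3 4
print(moved)    # (3, 4, 5)
print(point)    # (3, 4)
```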

1.4 Sets

A set is an unordered collection with no duplicate elements. Sets are useful for mathematical set operations like union, intersection, difference, and symmetric difference. Non-empty sets are defined by enclosing elements in curly braces {}; an empty set must be created with set(), because {} creates an empty dictionary.

Code
# Define a set named basket with duplicate elements
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}

# Print the set (duplicates are automatically removed)
basket
{'apple', 'banana', 'orange', 'pear'}
Code
# Print the type of the basket set
type(basket)
set
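
The set operations mentioned above can be sketched as follows. The two sets basket_a and basket_b are made up for this example, and sorted() is used only to make the printed order predictable, since sets are unordered.

```python
basket_a = {'apple', 'banana', 'pear'}
basket_b = {'banana', 'kiwi', 'pear'}

union = basket_a | basket_b          # elements in either set
intersection = basket_a & basket_b   # elements in both sets
difference = basket_a - basket_b     # elements only in basket_a
symmetric = basket_a ^ basket_b      # elements in exactly one set

print(sorted(union))         # ['apple', 'banana', 'kiwi', 'pear']
print(sorted(intersection))  # ['banana', 'pear']
print(sorted(difference))    # ['apple']
print(sorted(symmetric))     # ['apple', 'kiwi']
```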

1.5 Dictionaries

Dictionaries are collections of key-value pairs. Each key must be unique, and it maps to a value. Since Python 3.7, dictionaries preserve the order in which keys were inserted. They are defined by enclosing key-value pairs in curly braces {}.

Code
# Define a dictionary named tel
tel = {'jack': 4098, 'sape': 4139}

# Print the dictionary
tel
{'jack': 4098, 'sape': 4139}
Code
# Print the type of the tel dictionary
type(tel)
dict
Code
# Access the value associated with the key 'jack' in the tel dictionary
tel['jack']
4098
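
Beyond lookup, the common dictionary operations are adding, updating, and deleting keys. The sketch below uses a separate dictionary, tel_demo, so it does not change tel; the extra entries are made up for this example.

```python
tel_demo = {'jack': 4098, 'sape': 4139}

tel_demo['guido'] = 4127   # add a new key
tel_demo['jack'] = 4100    # update an existing key
del tel_demo['sape']       # delete a key

# get() returns None (or a supplied default) instead of raising KeyError
missing = tel_demo.get('irv')
fallback = tel_demo.get('irv', 0)

print(list(tel_demo.items()))  # [('jack', 4100), ('guido', 4127)]
print(missing)                 # None
print(fallback)                # 0
```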

2 NumPy Data Structures (Matrix in Python)

NumPy is the fundamental package for scientific computing in Python. It provides a multidimensional array object, ndarray, along with fast operations on it.

Python has no built-in matrix type. A list of lists can stand in for a matrix, as below, but it does not support element-wise arithmetic or column slicing the way a NumPy array does.

Code
# Define a list of lists representing a matrix
A = [[1, 4, 5, 12], 
    [-5, 8, 9, 0],
    [-6, 7, 11, 19]]
    
# Print the matrix
A
[[1, 4, 5, 12], [-5, 8, 9, 0], [-6, 7, 11, 19]]
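
With a plain list of lists, matrix-style access has to be written by hand. The following sketch shows a few common operations using only built-ins; A_demo is a copy of the matrix above under a different name.

```python
A_demo = [[1, 4, 5, 12],
          [-5, 8, 9, 0],
          [-6, 7, 11, 19]]

n_rows = len(A_demo)      # number of inner lists
n_cols = len(A_demo[0])   # length of the first row
element = A_demo[1][2]    # second row, third column

# A column needs a comprehension, because rows are separate lists
first_column = [row[0] for row in A_demo]

# zip(*A_demo) groups the i-th items of every row, which transposes the matrix
transposed = [list(col) for col in zip(*A_demo)]

print(n_rows, n_cols)  # 3 4
print(element)         # 9
print(first_column)    # [1, -5, -6]
print(transposed)      # [[1, -5, -6], [4, 8, 7], [5, 9, 11], [12, 0, 19]]
```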

2.1 NumPy Array

Code
# Import the numpy library
import numpy as np

# Create a 2x3 NumPy array
A2 = np.array([[1, 2, 3], [3, 4, 5]])
# Print the array
print(A2)
[[1 2 3]
 [3 4 5]]
Code
# Print the type of the A2 array
type(A2)
numpy.ndarray

2.2 Shape

Code
# Print the shape of the A2 array (rows, columns)
A2.shape
(2, 3)

2.3 Row Number

Code
# Print the number of rows in the A2 array
len(A2)
2

2.4 Total Elements

Code
# Print the total number of elements in the A2 array
A2.size
6

2.5 Dimension

Code
# Print the number of dimensions of the A2 array
A2.ndim
2

2.6 Count NumPy Array Elements

Code
# Import the collections module (NumPy is already imported as np)
import collections
# Create a NumPy array
a = np.array([0, 3, 0, 4])
# Count the occurrences of each element in the array
collections.Counter(a)
Counter({np.int64(0): 2, np.int64(3): 1, np.int64(4): 1})
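
NumPy can also count elements without the collections module, using np.unique with return_counts=True. This is offered as an alternative sketch, not a replacement for the approach above; arr and count_map are names made up for this example.

```python
import numpy as np

arr = np.array([0, 3, 0, 4])

# values holds the sorted unique elements, counts holds how often each occurs
values, counts = np.unique(arr, return_counts=True)

# Convert to plain Python ints for a readable dictionary
count_map = {int(v): int(c) for v, c in zip(values, counts)}

print(values)     # [0 3 4]
print(counts)     # [2 1 1]
print(count_map)  # {0: 2, 3: 1, 4: 1}
```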

2.7 Convert List into NumPy Array

Code
# Define a list of lists
A = [
  [1, 4, 5, 12], 
  [-5, 8, 9, 0],
  [-6, 7, 11, 19],
  [1, 4, 5, 12], 
  [-5, 8, 9, 0],
  [-6, 7, 11, 19],
  [1, 4, 5, 12], 
  [-5, 8, 9, 0],
  [-6, 7, 11, 19]
  ]
    
# Convert the list of lists to a NumPy array
A3 = np.array(A)

# Print the NumPy array
print(A3)
[[ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]]

2.8 Selection

2.8.1 First 5 Rows

Code
# Select the first 5 rows (A is the list of lists; A3[:5] selects the same rows from the array)
A[:5]
[[1, 4, 5, 12], [-5, 8, 9, 0], [-6, 7, 11, 19], [1, 4, 5, 12], [-5, 8, 9, 0]]

2.8.2 Last 5 Rows

Code
# Select the last 5 rows
A[-5:]
[[-5, 8, 9, 0], [-6, 7, 11, 19], [1, 4, 5, 12], [-5, 8, 9, 0], [-6, 7, 11, 19]]

2.8.3 First Row

Code
# Select the first row of the list of lists A (A3[0] works the same way on the array)
A[0]
[1, 4, 5, 12]

2.8.4 First Column

Code
# Select the first column of the A2 array
A2[:,0]
array([1, 3])

2.8.5 First Row and First Column Element

Code
# Select the element at the first row and first column of the A2 array
A2[0,0]
np.int64(1)
Code
# Print the data type of the elements in the A2 array
A2.dtype
dtype('int64')

2.8.6 Second Row and Third Column

Code
# Select the element at the second row (index 1) and third column (index 2) of the A2 array
A2[1,2]
np.int64(5)
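
The selections above generalize to negative indices and column ranges. The sketch below uses a separate array M with the same values as A2; the name is made up for this example.

```python
import numpy as np

M = np.array([[1, 2, 3], [3, 4, 5]])

last_row = M[-1]          # negative index counts from the end
last_column = M[:, -1]    # all rows, last column
first_two_cols = M[:, :2] # all rows, columns 0 and 1
value = int(M[1, 2])      # convert the NumPy scalar to a plain int

print(last_row)        # [3 4 5]
print(last_column)     # [3 5]
print(first_two_cols)
print(value)           # 5
```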

2.9 Filter

2.9.1 Filter All Elements

Code
# Print the A3 array
print(A3)
[[ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]]
Code
# Create a boolean array where True indicates elements greater than 4
A3>4
array([[False, False,  True,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True],
       [False, False,  True,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True],
       [False, False,  True,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True]])
Code
# Select elements from A3 that are greater than 4
A3[A3>4]
array([ 5, 12,  8,  9,  7, 11, 19,  5, 12,  8,  9,  7, 11, 19,  5, 12,  8,
        9,  7, 11, 19])

2.9.2 Filter Row

Code
# Print the A3 array
A3
array([[ 1,  4,  5, 12],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [ 1,  4,  5, 12],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [ 1,  4,  5, 12],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19]])

Keep only the rows whose second column is greater than 5.

Code
# Create a boolean array that is True for rows whose second column (index 1) is greater than 5
filter_val = A3[:, 1] > 5

In each repeated block of three rows, this keeps only the second and third.

Code
# Select rows from A3 based on the filter_val boolean array, and all columns
A3[filter_val,0:]
array([[-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19]])
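
Row filters can be combined with & (and) and | (or). Each condition needs its own parentheses, because these operators bind more tightly than comparisons. The sketch below uses a small array M made up for this example.

```python
import numpy as np

M = np.array([[1, 4, 5, 12],
              [-5, 8, 9, 0],
              [-6, 7, 11, 19]])

# Rows where the second column is > 5 AND the fourth column is > 0
both = M[(M[:, 1] > 5) & (M[:, 3] > 0)]

# Rows where the first column is > 0 OR the fourth column is > 15
either = M[(M[:, 0] > 0) | (M[:, 3] > 15)]

print(both)    # [[-6  7 11 19]]
print(either)
```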

2.10 Create NumPy Array

2.10.1 Create Identity Matrix

Code
# Create a 3x3 identity matrix
np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

2.10.2 Create Zeros Array

Code
# Create a 2x3 array filled with zeros
np.zeros((2,3))
array([[0., 0., 0.],
       [0., 0., 0.]])

2.10.3 Create Ones Array

Code
# Create a 2x3 array filled with ones
np.ones((2,3))
array([[1., 1., 1.],
       [1., 1., 1.]])

2.11 Compare Arrays

Code
# Create two arrays to compare
a = np.array([1,2,3,4]) 
b = np.array([3,2,5,6])
Code
# Compare if elements of array a are greater than elements of array b
np.greater(a, b)
array([False, False, False, False])
Code
# Compare if elements of array a are greater than or equal to elements of array b
a >= b
array([False,  True, False, False])
Code
# Compare if elements of array a are less than elements of array b
np.less(a, b)
array([ True, False,  True,  True])
Code
# Compare if elements of array a are equal to elements of array b
np.equal(a, b)
array([False,  True, False, False])
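
The comparisons above return one boolean per element. To reduce them to a single answer, a sketch along these lines may help; a_demo and b_demo repeat the values of a and b under different names.

```python
import numpy as np

a_demo = np.array([1, 2, 3, 4])
b_demo = np.array([3, 2, 5, 6])

same = np.array_equal(a_demo, b_demo)         # True only if shape and all elements match
any_equal = bool((a_demo == b_demo).any())    # True if at least one pair is equal
all_less = bool((a_demo < b_demo).all())      # True only if every pair satisfies a < b
n_less = int((a_demo < b_demo).sum())         # number of positions where a < b

print(same)       # False
print(any_equal)  # True
print(all_less)   # False
print(n_less)     # 3
```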

2.12 Reshape Array

Code
# Create a 1D array with values from 0 to 8 and reshape it into a 3x3 2D array
a=np.arange(9).reshape(3, 3)

# Print the reshaped array
a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

2.13 Array Calculations

Code
# Perform element-wise multiplication of array a by itself
b=a*a
# Print the result
b
array([[ 0,  1,  4],
       [ 9, 16, 25],
       [36, 49, 64]])
Code
# Perform element-wise addition of array a to itself
b=a+a
# Print the result
b
array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
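
Note that * multiplies element by element. For the matrix product, NumPy provides the @ operator. The sketch below contrasts the two on a separate array m with the same values as a; the name is made up for this example.

```python
import numpy as np

m = np.arange(9).reshape(3, 3)

elementwise = m * m      # each element multiplied by itself
matrix_product = m @ m   # rows of m times columns of m
transposed = m.T         # rows and columns swapped

print(elementwise)
print(matrix_product)
print(transposed)
```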

2.14 NumPy Array to DataFrame

Code
# Import the pandas library
import pandas as pd
# Create a DataFrame from the NumPy array b, with specified column names
df = pd.DataFrame(b, columns=['Column_A', 'Column_B', 'Column_C'])

# Print the DataFrame
df
   Column_A  Column_B  Column_C
0         0         2         4
1         6         8        10
2        12        14        16

3 Reference

https://docs.python.org/3/tutorial/datastructures.html#

https://numpy.org/doc/stable/user/basics.rec.html
