Data structures in Python

Author

Tony Duan

1 Built-in data structures

1.1 single values (int, float, str, bool)

Code
a=1
type(a)
int
Code
a=1.3
type(a)
float
Code
a='hell'
type(a)
str
Code
a= True
type(a)
bool

1.2 list

Code
a=[1,2,3]

a
[1, 2, 3]
Code
type(a) 
list
Code
fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana','apple']

1.2.1 find length of the list with len()

Code
len(fruits)
8

1.2.2 first 2 elements of the list

Code
fruits[:2]
['orange', 'apple']

1.2.3 last 2 elements of the list

Code
fruits[-2:]
['banana', 'apple']

1.2.4 count how many times an element appears with count()

Code
fruits.count('apple')
3

1.2.5 find the location of an element with index()

index() returns the index of the first ‘apple’. Python list indices start at 0.

Code
fruits.index('apple')
1

Indices of every ‘apple’ in the list:

Code
[index for index, value in enumerate(fruits) if value == 'apple']
[1, 5, 7]
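
One caveat: index() raises a ValueError when the value is not in the list. A minimal sketch of a guarded lookup follows; the helper name find_index and the -1 sentinel are hypothetical choices for illustration, not part of the original.

```python
fruits = ['orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana', 'apple']

def find_index(items, value):
    """Return the first index of value in items, or -1 if it is absent.

    find_index is a hypothetical helper; list.index() itself raises ValueError.
    """
    try:
        return items.index(value)
    except ValueError:
        return -1

first_apple = find_index(fruits, 'apple')   # first match, counting from 0
missing = find_index(fruits, 'mango')       # 'mango' is not in the list
```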

1.2.6 reverse the list

Code
fruits.reverse()
fruits
['apple', 'banana', 'apple', 'kiwi', 'banana', 'pear', 'apple', 'orange']

1.2.7 sort the list

Code
fruits.sort()
fruits
['apple', 'apple', 'apple', 'banana', 'banana', 'kiwi', 'orange', 'pear']
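
Both reverse() and sort() change the list in place and return None. When the original order should be kept, a copy can be made instead. A short sketch, using a shortened fruits list as an assumption for brevity:

```python
fruits = ['orange', 'apple', 'pear', 'banana']

# sorted() and slicing build new lists and leave fruits untouched
sorted_copy = sorted(fruits)
reversed_copy = fruits[::-1]
unchanged = list(fruits)

# sort() reorders fruits itself and returns None
result = fruits.sort()
```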

1.2.8 add an element to the list with append()

Code
fruits.append('grape')
fruits
['apple',
 'apple',
 'apple',
 'banana',
 'banana',
 'kiwi',
 'orange',
 'pear',
 'grape']

1.2.9 remove the last element with pop()

Code
fruits.pop()

fruits
['apple', 'apple', 'apple', 'banana', 'banana', 'kiwi', 'orange', 'pear']

1.2.10 List Comprehensions

Using a loop:

Code
squares = []
for x in range(10):
  squares.append(x**2)
  
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

Using a list comprehension:

Code
squares = [x**2 for x in range(10)]
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

1.2.11 list to tuple

Code
tuple(squares)
(0, 1, 4, 9, 16, 25, 36, 49, 64, 81)

1.2.12 list to set

Code
set(squares)
{0, 1, 4, 9, 16, 25, 36, 49, 64, 81}

1.2.13 list to dictionary

1.2.13.1 one list to dictionary

Code
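# Named pairs rather than list, so the built-in list type is not shadowed
pairs=['a', 1, 'b', 2, 'c', 3]

def convert(lst):
   # Items at even positions are keys; each following item is its value
   res_dict = {}
   for i in range(0, len(lst), 2):
       res_dict[lst[i]] = lst[i + 1]
   return res_dict
 
convert(pairs)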
{'a': 1, 'b': 2, 'c': 3}
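
The same conversion can also be sketched without an explicit loop, by slicing the keys and values apart and zipping them together. The variable names flat and pairs_dict are illustrative only.

```python
flat = ['a', 1, 'b', 2, 'c', 3]

keys = flat[::2]      # items at even positions: 'a', 'b', 'c'
values = flat[1::2]   # items at odd positions: 1, 2, 3

# zip() pairs each key with the value at the same position
pairs_dict = dict(zip(keys, values))
```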

1.2.13.2 two lists to dictionary

Code
keys = ['name', 'age', 'food']

values = ['Monty', 42, 'spam']

# zip() pairs each key with the value at the same position
dict(zip(keys, values))
{'name': 'Monty', 'age': 42, 'food': 'spam'}

1.3 Tuples

Code
fruits = ('orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana','apple')

fruits
('orange', 'apple', 'pear', 'banana', 'kiwi', 'apple', 'banana', 'apple')
Code
type(fruits)
tuple

A tuple is immutable: its elements cannot be changed after it is created.
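
A small sketch of what immutability means in practice: assigning to an element raises a TypeError, while building a new tuple from an old one is allowed. The shortened tuple and the names modified and longer are assumptions for illustration.

```python
fruits = ('orange', 'apple', 'pear')

try:
    fruits[0] = 'kiwi'      # item assignment is not supported on tuples
    modified = True
except TypeError:
    modified = False

# Concatenation creates a new tuple; the original is left as it was
longer = fruits + ('banana',)
```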

1.4 Sets

A set is an unordered collection with no duplicate elements.

Code
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}

basket
{'apple', 'banana', 'orange', 'pear'}
Code
type(basket)
set
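
Because a set holds no duplicates, it is well suited to membership tests and to set algebra. A brief sketch; the second set, other, is made up for the example.

```python
basket = {'apple', 'orange', 'apple', 'pear', 'orange', 'banana'}
other = {'banana', 'kiwi'}   # hypothetical second set

has_orange = 'orange' in basket   # membership test
has_grape = 'grape' in basket

common = basket & other        # intersection: in both sets
either = basket | other        # union: in at least one set
only_basket = basket - other   # difference: in basket but not in other
```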

1.5 Dictionaries

Code
tel = {'jack': 4098, 'sape': 4139}

tel
{'jack': 4098, 'sape': 4139}
Code
type(tel)
dict
Code
tel['jack']
4098
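
A dictionary can be changed after creation. The sketch below adds a key, deletes a key, and looks up a key that may be missing; the extra names 'guido' and 'irv' are sample data, not taken from the example above.

```python
tel = {'jack': 4098, 'sape': 4139}

tel['guido'] = 4127      # add a new key/value pair
del tel['sape']          # remove an existing key

# get() returns a default instead of raising KeyError for a missing key
irv_number = tel.get('irv', None)

names = sorted(tel)      # iterating a dict yields its keys
```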

2 NumPy data structures (matrices in Python)

NumPy is the fundamental package for scientific computing in Python. It is a Python library that provides a multidimensional array object.

Python doesn’t have a built-in type for matrices. However, we can treat a list of lists as a matrix:

Code
A = [[1, 4, 5, 12], 
    [-5, 8, 9, 0],
    [-6, 7, 11, 19]]
    
A
[[1, 4, 5, 12], [-5, 8, 9, 0], [-6, 7, 11, 19]]
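
With a list of lists, rows are easy to reach but columns need a loop or zip(). A plain-Python sketch of element access, column extraction and transposition, which also hints at why a dedicated array type is convenient:

```python
A = [[1, 4, 5, 12],
     [-5, 8, 9, 0],
     [-6, 7, 11, 19]]

element = A[1][2]                      # second row, third column
first_column = [row[0] for row in A]   # one value from each row

# zip(*A) groups the i-th items of every row, i.e. the columns
transposed = [list(col) for col in zip(*A)]
```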

2.1 numpy Array

Code
import numpy as np

A2 = np.array([[1, 2, 3], [3, 4, 5]])
print(A2)
[[1 2 3]
 [3 4 5]]
Code
type(A2)
numpy.ndarray

2.2 shape

Code
A2.shape
(2, 3)

2.3 number of rows

Code
len(A2)
2

2.4 total elements

Code
A2.size
6

2.5 number of dimensions

Code
A2.ndim
2

2.6 count values in a numpy.ndarray

Code
import collections, numpy
a = numpy.array([0, 3, 0, 4])
collections.Counter(a)
Counter({np.int64(0): 2, np.int64(3): 1, np.int64(4): 1})
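
As an alternative sketch that stays within NumPy, np.unique with return_counts=True returns the distinct values and how often each occurs. Converting to a plain dict with int keys is an optional step added here for readability.

```python
import numpy as np

a = np.array([0, 3, 0, 4])

# values holds the sorted distinct entries, counts their frequencies
values, counts = np.unique(a, return_counts=True)

# Optional: plain Python ints make the result easier to read
count_by_value = {int(v): int(c) for v, c in zip(values, counts)}
```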

2.6.1 convert list into numpy array

Code
A = [
  [1, 4, 5, 12], 
  [-5, 8, 9, 0],
  [-6, 7, 11, 19],
  [1, 4, 5, 12], 
  [-5, 8, 9, 0],
  [-6, 7, 11, 19],
  [1, 4, 5, 12], 
  [-5, 8, 9, 0],
  [-6, 7, 11, 19]
  ]
    
A3 = np.array(A)

print(A3)
[[ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]]

2.6.2 selection

2.6.2.1 first 5 rows

Code
A[:5]
[[1, 4, 5, 12], [-5, 8, 9, 0], [-6, 7, 11, 19], [1, 4, 5, 12], [-5, 8, 9, 0]]

2.6.2.2 last 5 rows

Code
A[-5:]
[[-5, 8, 9, 0], [-6, 7, 11, 19], [1, 4, 5, 12], [-5, 8, 9, 0], [-6, 7, 11, 19]]

2.6.2.3 first row

Code
A[0]
[1, 4, 5, 12]

2.6.2.4 first column

Code
A2[:,0]
array([1, 3])

2.6.2.5 first row and first column element

Code
A2[0,0]
np.int64(1)
Code
A2.dtype
dtype('int64')

2.6.2.6 second row and third column element

Code
A2[1,2]
np.int64(5)

2.6.2.7 filter

2.6.2.7.1 filter all
Code
print(A3)
[[ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]
 [ 1  4  5 12]
 [-5  8  9  0]
 [-6  7 11 19]]
Code
A3>4
array([[False, False,  True,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True],
       [False, False,  True,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True],
       [False, False,  True,  True],
       [False,  True,  True, False],
       [False,  True,  True,  True]])
Code
A3[A3>4]
array([ 5, 12,  8,  9,  7, 11, 19,  5, 12,  8,  9,  7, 11, 19,  5, 12,  8,
        9,  7, 11, 19])
2.6.2.7.2 filter row
Code
A3
array([[ 1,  4,  5, 12],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [ 1,  4,  5, 12],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [ 1,  4,  5, 12],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19]])

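Build a boolean mask that is True for rows whose third column (index 2) is greater than 5:

Code
filter_val=(A3>5)[:,2]

The mask keeps only the second and third rows of each repeated block of three rows.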

Code
A3[filter_val,0:]
array([[-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19],
       [-5,  8,  9,  0],
       [-6,  7, 11, 19]])
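
The same row filter, sketched on a smaller three-row array so each step is easy to check by eye. The name M and the second condition on the last column are assumptions added for illustration; conditions are combined with & and each needs its own parentheses.

```python
import numpy as np

M = np.array([[1, 4, 5, 12],
              [-5, 8, 9, 0],
              [-6, 7, 11, 19]])

mask = M[:, 2] > 5      # one True/False per row, from the third column
kept = M[mask]          # rows where the mask is True

# Two conditions combined element-wise with &
both = M[(M[:, 2] > 5) & (M[:, 3] > 0)]
```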

2.6.3 create numpy array

2.6.3.1 create an identity matrix with eye()

Code
np.eye(3)
array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

2.6.3.2 create an array of zeros

Code
np.zeros((2,3))
array([[0., 0., 0.],
       [0., 0., 0.]])

2.6.3.3 create an array of ones

Code
np.ones((2,3))
array([[1., 1., 1.],
       [1., 1., 1.]])

2.6.4 compare

Code
# Create two arrays of the same length
a = np.array([1,2,3,4]) 
b = np.array([3,2,5,6])
Code
# Element-wise a > b
np.greater(a, b)
array([False, False, False, False])
Code
# Element-wise a >= b, using the operator form
a >= b
array([False,  True, False, False])
Code
# Element-wise a < b
np.less(a, b)
array([ True, False,  True,  True])
Code
# Element-wise a == b
np.equal(a, b)
array([False,  True, False, False])

2.6.5 reshape

Code
a=np.arange(9).reshape(3, 3)

a
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])

2.6.6 element-wise calculation

Code
b=a*a
b
array([[ 0,  1,  4],
       [ 9, 16, 25],
       [36, 49, 64]])
Code
b=a+a
b
array([[ 0,  2,  4],
       [ 6,  8, 10],
       [12, 14, 16]])
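
Note that * multiplies matching elements; it is not the matrix product. A short sketch contrasting the two on the same 3x3 array, using the @ operator for the matrix product:

```python
import numpy as np

a = np.arange(9).reshape(3, 3)

elementwise = a * a   # each entry squared
matrix_product = a @ a   # rows of a times columns of a
```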

2.6.7 numpy array to dataframe

Code
import pandas as pd
df = pd.DataFrame(b, columns=['Column_A', 'Column_B', 'Column_C'])

df
   Column_A  Column_B  Column_C
0         0         2         4
1         6         8        10
2        12        14        16

3 pandas series

4 Reference

https://docs.python.org/3/tutorial/datastructures.html#

https://numpy.org/doc/stable/user/basics.rec.html
