Python Data Structures

Master Python data structures like lists, tuples, sets, and dictionaries. Essential for efficient data manipulation in AI & Machine Learning.

This document provides a comprehensive overview of Python's fundamental data structures, including lists, tuples, sets, and dictionaries, along with their common methods and key differences.

3.1 Python Lists

A Python list is a versatile, ordered, and mutable sequence of items. Lists can contain elements of different data types.

Key Characteristics:

  • Ordered: Elements maintain their insertion order.

  • Mutable: You can change, add, or remove elements after the list is created.

  • Heterogeneous: Can store items of different data types (integers, strings, floats, other lists, etc.).

  • Dynamic: Can grow or shrink in size.

Creating a List:

my_list = [1, "hello", 3.14, True]
empty_list = []

Accessing Elements:

Elements are accessed using their index, starting from 0.

my_list = [10, 20, 30, 40, 50]
print(my_list[0])  # Output: 10
print(my_list[2])  # Output: 30
print(my_list[-1]) # Output: 50 (accessing the last element)

Slicing Lists:

You can extract a portion of a list using slicing.

my_list = [10, 20, 30, 40, 50]
print(my_list[1:4]) # Output: [20, 30, 40] (elements from index 1 up to, but not including, index 4)
print(my_list[:3])  # Output: [10, 20, 30] (elements from the beginning up to, but not including, index 3)
print(my_list[2:])  # Output: [30, 40, 50] (elements from index 2 to the end)

3.2 Python List Methods

Python lists come with a rich set of built-in methods for manipulation.

  • append(item): Adds an item to the end of the list.

    my_list = [1, 2]
    my_list.append(3)
    print(my_list) # Output: [1, 2, 3]
    
  • extend(iterable): Adds all items from an iterable (like another list) to the end of the current list.

    list1 = [1, 2]
    list2 = [3, 4]
    list1.extend(list2)
    print(list1) # Output: [1, 2, 3, 4]
    
  • insert(index, item): Inserts an item at a specific index.

    my_list = [1, 3]
    my_list.insert(1, 2)
    print(my_list) # Output: [1, 2, 3]
    
  • remove(item): Removes the first occurrence of a specific item. Raises ValueError if the item is not found.

    my_list = [1, 2, 3, 2]
    my_list.remove(2)
    print(my_list) # Output: [1, 3, 2]
    
  • pop([index]): Removes and returns the item at a given index. If no index is specified, it removes and returns the last item.

    my_list = [1, 2, 3]
    last_item = my_list.pop()
    print(last_item) # Output: 3
    print(my_list)   # Output: [1, 2]
    
    item_at_index_0 = my_list.pop(0)
    print(item_at_index_0) # Output: 1
    print(my_list)        # Output: [2]
    
  • clear(): Removes all items from the list.

    my_list = [1, 2, 3]
    my_list.clear()
    print(my_list) # Output: []
    
  • index(item, [start], [end]): Returns the index of the first occurrence of an item. Raises ValueError if the item is not found.

    my_list = [10, 20, 30, 20]
    print(my_list.index(20)) # Output: 1
    print(my_list.index(20, 2)) # Output: 3 (searches from index 2 onwards)
    
  • count(item): Returns the number of times an item appears in the list.

    my_list = [1, 2, 2, 3, 2]
    print(my_list.count(2)) # Output: 3
    
  • sort(key=None, reverse=False): Sorts the items of the list in ascending order (or descending if reverse=True). Modifies the list in-place.

    my_list = [3, 1, 4, 1, 5, 9, 2]
    my_list.sort()
    print(my_list) # Output: [1, 1, 2, 3, 4, 5, 9]
    
    my_list.sort(reverse=True)
    print(my_list) # Output: [9, 5, 4, 3, 2, 1, 1]
    
  • reverse(): Reverses the elements of the list in-place.

    my_list = [1, 2, 3]
    my_list.reverse()
    print(my_list) # Output: [3, 2, 1]
    
  • copy(): Returns a shallow copy of the list.

    original_list = [1, 2, 3]
    copied_list = original_list.copy()
    print(copied_list) # Output: [1, 2, 3]
    print(original_list is copied_list) # Output: False
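
The term "shallow copy" matters once a list contains other mutable objects. The following is a minimal sketch of the difference, using the standard library's copy.deepcopy for comparison; the variable names are illustrative only.

```python
import copy

# A nested list: the outer list holds references to two inner lists.
original = [[1, 2], [3, 4]]

shallow = original.copy()        # new outer list, same inner lists
deep = copy.deepcopy(original)   # new outer list and new inner lists

original[0].append(99)           # mutate an inner list in place

print(original)  # Output: [[1, 2, 99], [3, 4]]
print(shallow)   # Output: [[1, 2, 99], [3, 4]] (shares the inner list)
print(deep)      # Output: [[1, 2], [3, 4]] (fully independent)
```

For flat lists of numbers or strings, a shallow copy is usually all that is needed.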
    

3.3 Python Tuples

A Python tuple is an ordered, immutable sequence of items. Once a tuple is created, its contents cannot be changed.

Key Characteristics:

  • Ordered: Elements maintain their insertion order.

  • Immutable: You cannot change, add, or remove elements after creation.

  • Heterogeneous: Can store items of different data types.

  • Fixed Size: The size is determined at creation.

Creating a Tuple:

my_tuple = (1, "hello", 3.14, True)
single_item_tuple = (5,) # Note the comma for a single-element tuple
empty_tuple = ()

Accessing Elements:

Similar to lists, elements are accessed using their index.

my_tuple = (10, 20, 30, 40, 50)
print(my_tuple[0])  # Output: 10
print(my_tuple[2])  # Output: 30
print(my_tuple[-1]) # Output: 50

Slicing Tuples:

Slicing works the same way as with lists, returning a new tuple.

my_tuple = (10, 20, 30, 40, 50)
print(my_tuple[1:4]) # Output: (20, 30, 40)
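
The immutability described in this section can be sketched as follows. The example assumes nothing beyond built-in behavior; the names point, moved, and record are hypothetical.

```python
point = (10, 20)

try:
    point[0] = 99            # item assignment is not supported on tuples
except TypeError as exc:
    print(f"TypeError: {exc}")

# "Changing" a tuple really means building a new one.
moved = (99,) + point[1:]
print(moved)                 # Output: (99, 20)

# Immutability is shallow: a mutable element can still change in place.
record = ("alice", [1, 2])
record[1].append(3)
print(record)                # Output: ('alice', [1, 2, 3])
```

The tuple itself never changes which objects it refers to, but a list stored inside it remains an ordinary mutable list.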

3.4 Python Tuple Methods

Tuples have fewer built-in methods than lists because they are immutable.

  • count(item): Returns the number of times an item appears in the tuple.

    my_tuple = (1, 2, 2, 3, 2)
    print(my_tuple.count(2)) # Output: 3
    
  • index(item, [start], [end]): Returns the index of the first occurrence of an item. Raises ValueError if the item is not found.

    my_tuple = (10, 20, 30, 20)
    print(my_tuple.index(20)) # Output: 1
    print(my_tuple.index(20, 2)) # Output: 3
    

3.5 Difference between List and Tuple

| Feature | List | Tuple |
| :--- | :--- | :--- |
| Mutability | Mutable (can be changed) | Immutable (cannot be changed) |
| Syntax | [item1, item2, ...] | (item1, item2, ...) |
| Performance | Generally slightly slower due to mutability overhead | Generally slightly faster due to immutability |
| Use Cases | Collections of items that may change, sequences of operations | Fixed collections of items, representing records, used as dictionary keys |
| Memory | Uses more memory | Uses less memory |
| Methods | More methods (append, extend, sort, etc.) | Fewer methods (count, index) |
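
A few rows of the table can be checked directly. This is a rough sketch that assumes CPython, where sys.getsizeof reports the size of the container object itself; the exact byte counts vary by version and platform, so none are shown.

```python
import sys

items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

# Size of the container object only, not of the integers it refers to.
list_size = sys.getsizeof(items_list)
tuple_size = sys.getsizeof(items_tuple)
print(f"list: {list_size} bytes, tuple: {tuple_size} bytes")

# Lists expose mutating methods; tuples do not.
print(hasattr(items_list, "append"))   # Output: True
print(hasattr(items_tuple, "append"))  # Output: False

# Converting between the two creates a new object each time.
frozen = tuple(items_list)
thawed = list(items_tuple)
print(frozen, thawed)                  # Output: (1, 2, 3) [1, 2, 3]
```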

3.6 Python Sets

A Python set is an unordered collection of unique elements. Sets are useful for membership testing, removing duplicates, and mathematical operations like union, intersection, and difference.

Key Characteristics:

  • Unordered: Elements do not have a defined order.

  • Mutable: You can add or remove elements from a set.

  • Unique Elements: Sets cannot contain duplicate values.

  • Hashable Elements: Elements in a set must be hashable (e.g., numbers, strings, and tuples whose own elements are hashable). Lists, dictionaries, and other sets cannot be elements of a set.

Creating a Set:

my_set = {1, 2, 3, 4, 2} # Duplicates are automatically removed
print(my_set)      # Output: {1, 2, 3, 4} (order may vary)

empty_set = set() # Use set() for an empty set, {} creates an empty dictionary

Adding/Removing Elements:

my_set = {1, 2, 3}
my_set.add(4)
print(my_set) # Output: {1, 2, 3, 4} (order may vary)

my_set.update([3, 4, 5]) # Add multiple elements
print(my_set) # Output: {1, 2, 3, 4, 5}

my_set.remove(3) # Raises KeyError if the element is not found
print(my_set) # Output: {1, 2, 4, 5}

my_set.discard(6) # Removes if present, no error if not
print(my_set) # Output: {1, 2, 4, 5}

my_set.pop() # Removes and returns an arbitrary element
print(my_set)

my_set.clear() # Removes all elements
print(my_set) # Output: set()
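
The two everyday uses mentioned at the start of this section, removing duplicates and membership testing, might look like this in practice. The email addresses are made-up sample data.

```python
emails = ["a@example.com", "b@example.com", "a@example.com"]

unique_emails = set(emails)              # duplicates collapse
print(len(unique_emails))                # Output: 2

print("b@example.com" in unique_emails)  # Output: True
print("z@example.com" in unique_emails)  # Output: False

# A set does not keep the original order. If order matters,
# dict.fromkeys() deduplicates while keeping first-seen order.
ordered_unique = list(dict.fromkeys(emails))
print(ordered_unique)  # Output: ['a@example.com', 'b@example.com']
```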

3.7 Python Set Methods

Sets offer powerful methods for set operations:

  • add(item): Adds a single element to the set.

  • update(iterable): Adds elements from an iterable to the set.

  • remove(item): Removes an element. Raises KeyError if the element is not found.

  • discard(item): Removes an element if it is present; does nothing if the element is not found.

  • pop(): Removes and returns an arbitrary element from the set. Raises KeyError if the set is empty.

  • clear(): Removes all elements from the set.

  • union(other_set, ...) or |: Returns a new set with elements from the set and all others.

    set1 = {1, 2, 3}
    set2 = {3, 4, 5}
    print(set1.union(set2)) # Output: {1, 2, 3, 4, 5}
    print(set1 | set2)      # Output: {1, 2, 3, 4, 5}
    
  • intersection(other_set, ...) or &: Returns a new set with elements common to the set and all others.

    set1 = {1, 2, 3}
    set2 = {3, 4, 5}
    print(set1.intersection(set2)) # Output: {3}
    print(set1 & set2)             # Output: {3}
    
  • difference(other_set, ...) or -: Returns a new set with the elements of the set that are not in any of the others.

    set1 = {1, 2, 3}
    set2 = {3, 4, 5}
    print(set1.difference(set2)) # Output: {1, 2}
    print(set1 - set2)           # Output: {1, 2}
    
  • symmetric_difference(other_set) or ^: Returns a new set with elements in either set, but not both.

    set1 = {1, 2, 3}
    set2 = {3, 4, 5}
    print(set1.symmetric_difference(set2)) # Output: {1, 2, 4, 5}
    print(set1 ^ set2)                     # Output: {1, 2, 4, 5}
    
  • issubset(other_set) or <=: Returns True if all elements of the set are contained in the other set.

  • issuperset(other_set) or >=: Returns True if all elements of the other set are contained in the set.

  • isdisjoint(other_set): Returns True if the set has no elements in common with another set.
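
The three comparison methods above have no example yet, so here is a short sketch with illustrative set names.

```python
small = {1, 2}
big = {1, 2, 3, 4}
other = {7, 8}

print(small.issubset(big))      # Output: True
print(small <= big)             # Output: True

print(big.issuperset(small))    # Output: True
print(big >= small)             # Output: True

print(small.isdisjoint(other))  # Output: True (no shared elements)
print(small.isdisjoint(big))    # Output: False (1 and 2 are shared)

# The strict operators < and > test for a proper subset or superset,
# so a set is never a proper subset of itself.
print(small < big)              # Output: True
print(big < big)                # Output: False
```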

3.8 Python Dictionary

A Python dictionary is a mutable collection of key-value pairs. Since Python 3.7, dictionaries preserve insertion order. Each key must be unique and hashable (like strings, numbers, or tuples of hashable items).

Key Characteristics:

  • Insertion-Ordered: Items are stored as key-value pairs and keep their insertion order in Python 3.7 and later. In earlier versions, the order was not guaranteed.

  • Mutable: You can add, remove, or change key-value pairs after creation.

  • Unique Keys: Each key in a dictionary must be unique.

  • Hashable Keys: Keys must be hashable, which in practice means immutable built-in types (e.g., strings, numbers, tuples containing only hashable elements). Values can be of any data type.
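
The hashable-key requirement can be seen with a small experiment. This is a sketch with a hypothetical inventory dictionary, not a pattern to copy as-is.

```python
# Strings, numbers, and tuples of hashable items all work as keys.
inventory = {"apple": 3, 42: "answer", (0, 1): "cell"}
print(inventory[(0, 1)])  # Output: cell

# A list is mutable and therefore unhashable.
try:
    inventory[[0, 1]] = "cell"
except TypeError as exc:
    print(f"TypeError: {exc}")

# A tuple that contains a list is unhashable as well.
try:
    inventory[(0, [1])] = "cell"
except TypeError as exc:
    print(f"TypeError: {exc}")

print(len(inventory))     # Output: 3 (neither failed assignment added a key)
```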

Creating a Dictionary:

my_dict = {"name": "Alice", "age": 30, "city": "New York"}
empty_dict = {}

Accessing Elements:

Values are accessed using their keys.

my_dict = {"name": "Alice", "age": 30}
print(my_dict["name"]) # Output: Alice
print(my_dict.get("age")) # Output: 30

Adding/Modifying Elements:

my_dict = {"name": "Alice"}
my_dict["age"] = 30 # Add a new key-value pair or update an existing one
print(my_dict)    # Output: {'name': 'Alice', 'age': 30}

my_dict.update({"city": "London", "age": 31}) # Update with another dictionary
print(my_dict)    # Output: {'name': 'Alice', 'age': 31, 'city': 'London'}

Removing Elements:

my_dict = {"name": "Alice", "age": 30, "city": "New York"}
del my_dict["city"] # Remove a specific key-value pair
print(my_dict)      # Output: {'name': 'Alice', 'age': 30}

age = my_dict.pop("age") # Remove and return the value for a specific key
print(age)        # Output: 30
print(my_dict)    # Output: {'name': 'Alice'}

# popitem() removes and returns the most recently inserted key-value pair (as a tuple).
# This last-in, first-out order is guaranteed in Python 3.7+; older versions removed an arbitrary pair.
last_item = my_dict.popitem()
print(last_item) # e.g., ('name', 'Alice')
print(my_dict)   # Output: {}

my_dict.clear() # Removes all items
print(my_dict)  # Output: {}

3.9 Python Dictionary Methods

  • keys(): Returns a view object of the dictionary's keys. The view reflects later changes to the dictionary.

    my_dict = {"a": 1, "b": 2}
    print(my_dict.keys()) # Output: dict_keys(['a', 'b'])
    
  • values(): Returns a view object of the dictionary's values. The view reflects later changes to the dictionary.

    my_dict = {"a": 1, "b": 2}
    print(my_dict.values()) # Output: dict_values([1, 2])
    
  • items(): Returns a view object of the dictionary's (key, value) tuple pairs. The view reflects later changes to the dictionary.

    my_dict = {"a": 1, "b": 2}
    print(my_dict.items()) # Output: dict_items([('a', 1), ('b', 2)])
    
  • get(key, default=None): Returns the value for a specified key, or a default value if the key is not found.

    my_dict = {"a": 1}
    print(my_dict.get("a"))    # Output: 1
    print(my_dict.get("b"))    # Output: None
    print(my_dict.get("b", 0)) # Output: 0
    
  • update(other_dict): Updates the dictionary with key-value pairs from another dictionary or an iterable of key-value pairs.

  • copy(): Returns a shallow copy of the dictionary.
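
The last two methods are listed without examples, so the sketch below shows one plausible use of each. The settings and profile dictionaries are hypothetical.

```python
settings = {"theme": "light", "font_size": 12}

settings.update({"theme": "dark"})     # from another dictionary
settings.update([("language", "en")])  # from an iterable of key-value pairs
settings.update(font_size=14)          # from keyword arguments
print(settings)
# Output: {'theme': 'dark', 'font_size': 14, 'language': 'en'}

# copy() gives a new dictionary, so top-level changes stay separate.
snapshot = settings.copy()
snapshot["theme"] = "light"
print(settings["theme"])               # Output: dark

# The copy is shallow: nested mutable values are still shared.
profile = {"tags": ["a"]}
clone = profile.copy()
clone["tags"].append("b")
print(profile["tags"])                 # Output: ['a', 'b']
```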

3.10 Difference between List and Dictionary

| Feature | List | Dictionary |
| :--- | :--- | :--- |
| Structure | Ordered sequence of items | Collection of key-value pairs (insertion-ordered in Python 3.7+) |
| Access | By integer index (position) | By unique keys |
| Keys/Indices | Integer indices (0, 1, 2, ...) | Hashable, unique keys (strings, numbers, tuples) |
| Mutability | Mutable | Mutable |
| Uniqueness | Allows duplicate elements | Keys must be unique, values can be duplicated |
| Use Cases | Collections of similar items, sequences where order matters | Storing related data, lookups, mappings |
| Syntax | [item1, item2] | {"key1": value1, "key2": value2} |
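
The "Access" row is the difference that most often drives the choice between the two. A small sketch, with made-up names and ages:

```python
users_list = [("alice", 30), ("bob", 25), ("carol", 35)]

# A list is addressed by position.
print(users_list[1])        # Output: ('bob', 25)

# Finding an entry by name means scanning the list.
age = None
for name, value in users_list:
    if name == "carol":
        age = value
        break
print(age)                  # Output: 35

# A dictionary is addressed by key, so no scan is needed.
users_dict = dict(users_list)
print(users_dict["carol"])  # Output: 35
```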

3.11 Difference between List, Set, Tuple, and Dictionary

| Feature | List | Tuple | Set | Dictionary |
| :--- | :--- | :--- | :--- | :--- |
| Structure | Ordered, mutable sequence | Ordered, immutable sequence | Unordered, mutable collection of unique items | Mutable key-value pairs (insertion-ordered in Python 3.7+) |
| Mutability | Mutable | Immutable | Mutable | Mutable |
| Duplicates | Allowed | Allowed | Not allowed | Keys must be unique, values can be duplicated |
| Access | By integer index (e.g., my_list[0]) | By integer index (e.g., my_tuple[0]) | No direct indexing, by iteration/membership check | By unique keys (e.g., my_dict["key"]) |
| Keys | Integer indices | Integer indices | N/A (elements themselves) | Hashable, unique keys |
| Use Cases | Dynamic collections, ordered sequences | Fixed collections, data integrity, hashable items | Uniqueness, membership testing, set operations | Mappings, lookups, related data storage |
| Syntax | [ ] | ( ) | {item1, item2} or set() | {key: value} |
| Memory | More than a tuple | Less than a list | Typically more than a list (hash table overhead) | Typically more than a list (hash table overhead) |

3.12 Difference between Sets and Dictionary

While both sets and dictionaries use curly braces {} for literal creation (with set() being used for empty sets), they serve different primary purposes and have key distinctions:

| Feature | Set | Dictionary |
| :--- | :--- | :--- |
| Purpose | Store a collection of unique elements; perform set operations. | Store key-value pairs; efficient lookups and associations. |
| Elements | Individual items (must be hashable). | Pairs of keys and values. Keys must be hashable and unique. |
| Structure | Unordered collection of unique items. | Collection of key-value pairs (insertion-ordered in Python 3.7+). |
| Access | No direct indexing. Elements are accessed via iteration or membership testing. | Accessed via their unique keys. |
| Data | Stores single data points. | Stores associated data (key maps to value). |
| Operations | Union, intersection, difference, subset, etc. | Adding, removing, updating, retrieving key-value pairs. |
| Example | {1, 2, 3} | {"name": "Alice", "age": 30} |
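
Because both types share curly braces, it can help to check what a given literal actually creates. The sketch below also shows that dictionary key views support set operators, which is one place where the two types meet; the variable names are illustrative.

```python
empty_braces = {}
empty_set = set()
print(type(empty_braces).__name__)  # Output: dict
print(type(empty_set).__name__)     # Output: set

print(type({1, 2}).__name__)        # Output: set
print(type({1: "a"}).__name__)      # Output: dict

# Key views behave like sets for operations such as intersection.
left = {"a": 1, "b": 2}
right = {"b": 20, "c": 30}
shared_keys = left.keys() & right.keys()
print(shared_keys)                  # Output: {'b'}
```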