Encryption Fundamentals — Beginner Workbook

Concept Estimated: 60–90 min

Background: How did people encrypt before computers?

Before digital encryption, people protected secrets using substitution ciphers — schemes that replace each letter in a message with a different letter or symbol. Understanding these classical ciphers is essential because they reveal the fundamental tension at the heart of cryptography: complexity vs. breakability.

Key Term

Plaintext is the original readable message. Ciphertext is the scrambled result after encryption. The process of converting between them requires a key.

1.1 The Caesar Cipher

Julius Caesar reportedly used this cipher to protect military messages. The idea is simple: shift every letter in the alphabet by a fixed number of positions.

With a shift of 3: A → D, B → E, Z → C (it wraps around). So the word HELLO becomes KHOOR.

Security Note

The Caesar cipher has only 25 possible keys (shifts 1–25). An attacker can try every one of them within minutes, even by hand. This is called a brute-force attack.

🧪 Try it — Caesar Cipher Demo

Shift (key): 3

Plaintext: HELLO WORLD → Ciphertext: KHOOR ZRUOG

1.2 ROT13 — A Special Case

ROT13 is a Caesar cipher with a shift of exactly 13. Because the English alphabet has 26 letters, ROT13 is its own inverse: applying it twice gets you back to the original. It is used in online forums to hide spoilers, not for actual security.
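
The self-inverse property is easy to check with Python's built-in rot_13 text codec. This is a minimal sketch; the sample string is only an illustration.

```python
import codecs

message = "HELLO WORLD"                  # sample input chosen for illustration
once = codecs.encode(message, "rot_13")  # shift every letter by 13
twice = codecs.encode(once, "rot_13")    # shift by 13 again: 26 in total

print(once)   # URYYB JBEYQ
print(twice)  # HELLO WORLD
```

Two shifts of 13 add up to 26, a full trip around the alphabet, which is why no separate decrypt function is needed.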

🧪 ROT13 Demo — notice that encrypting twice gives back the original

Original

→

After ROT13

Applying ROT13 again:

1.3 The Vigenère Cipher

The Vigenère cipher uses a keyword instead of a single number. Each letter of the keyword determines the shift for the corresponding letter in the message. This makes frequency analysis harder — but not impossible.

For example, with keyword KEY:

Plaintext letter	H	E	L	L	O
Key letter	K (+10)	E (+4)	Y (+24)	K (+10)	E (+4)
Ciphertext	R	I	J	V	S
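
The table above can be reproduced with a short function. This is a sketch under two assumptions that the workbook does not state: the keyword contains only letters, and non-letters in the message are copied through without using up a keyword letter. The name vigenere_encrypt is chosen here for illustration.

```python
def vigenere_encrypt(plaintext: str, keyword: str) -> str:
    """Shift each letter by the alphabet position of the matching keyword letter."""
    keyword = keyword.upper()
    result = []
    key_index = 0  # advances only when a letter is actually encrypted
    for char in plaintext.upper():
        if "A" <= char <= "Z":
            shift = ord(keyword[key_index % len(keyword)]) - ord("A")
            result.append(chr((ord(char) - ord("A") + shift) % 26 + ord("A")))
            key_index += 1
        else:
            result.append(char)  # leave spaces and punctuation unchanged
    return "".join(result)

print(vigenere_encrypt("HELLO", "KEY"))  # RIJVS
```

Note that the two Ls in HELLO become J and V: the same plaintext letter no longer maps to the same ciphertext letter.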

🧪 Vigenère Demo

Message: Keyword (letters only):

Plaintext

→

Ciphertext

1.4 Breaking Ciphers: Frequency Analysis

In any sufficiently long English text, some letters appear much more often than others. The letter E is the most common (~13%), followed by T, A, O, I, N. Attackers exploit this pattern to crack substitution ciphers.
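
For a Caesar cipher, the idea can be sketched in a few lines: count the letters, assume the most frequent one stands for E, and read off the shift. The function name and the sample text are illustrative, and on short messages the guess can easily be wrong.

```python
from collections import Counter

def guess_caesar_shift(ciphertext: str) -> int:
    """Guess the shift by assuming the most frequent ciphertext letter is E."""
    letters = [c for c in ciphertext.upper() if "A" <= c <= "Z"]
    most_common_letter, _count = Counter(letters).most_common(1)[0]
    return (ord(most_common_letter) - ord("E")) % 26

sample = "PHHW PH KHUH"  # "MEET ME HERE" shifted by 3 (illustrative input)
print(Counter(sample.replace(" ", "")).most_common(3))
print(guess_caesar_shift(sample))  # 3
```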

📊 Frequency Analysis Tool — paste any ciphertext and look for patterns

Exercises

EX 1.1 Decrypt by brute force

The following ciphertext was encrypted with a Caesar cipher. Use the demo above to try all 25 shifts and find the one that produces readable English:

Ciphertext

PBZCHGRE FRPHEVGL VF GUR CENPGVPR BS CEBGRPGVAT
FLFGRZF, ARGJBEXF, NAQ CEBTENZF SEBZ NGGNPX.

Question: What is the shift value? What does the plaintext say?

Hint: This is actually ROT13. What does that tell you about the shift?

EX 1.2 Frequency analysis challenge

Paste this longer Caesar-encrypted text into the Frequency Analysis Tool above. Identify the most frequent letter, assume it stands for E, calculate the shift, and decrypt.

Challenge Ciphertext

WKHUH DUH WRPHWKHU HQFUBSWHG PHVVDJHV KLGGHQ
LQ SODLQ VLJKW. WKH DUW RI KLGLQJ PHVVDJHV LV
FDOOHG VWHJDQRJUDSKB. LW LV GLIIHUHQW IURP
HQFUBSWLRQ EHFDXVH LW KLGHV WKH IDFW WKDW D
PHVVDJH HALVWV DW DOO. HQFUBSWLRQ KLGHV WKH
FRQWHQW EXW QRW WKH IDFW WKDW D PHVVDJH ZDV VHQW.

EX 1.3 Partner encryption challenge

1

Write a sentence about cybersecurity (at least 20 characters).

2

Encrypt it using the Vigenère demo with a keyword of your choice (don't share the keyword).

3

Share only the ciphertext with a classmate. See if they can crack it.

4

Discuss: How long did it take? What made it hard or easy?

My ciphertext:

What I noticed about cracking my partner's cipher:

Python Lab: Implement Your Own Caesar Cipher

🐧 Linux Terminal

Save as caesar.py, then run:
python3 caesar.py
No libraries to install — uses only built-in Python.

📓 Google Colab

Paste the entire script into a code cell and press Shift+Enter. No installation needed — all modules are standard library.

Python 3 — no installation required

# caesar.py — Build and break a Caesar cipher from scratch

def caesar_encrypt(text, shift):
    result = ""
    for char in text.upper():
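        if "A" <= char <= "Z":
            # Only the 26 unaccented letters are shifted; isalpha() would also
            # accept letters such as "É", which the arithmetic below cannot handle.
            # ord() converts a letter to its ASCII number
            # A=65, Z=90. We shift within the alphabet using modulo.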
            shifted = (ord(char) - 65 + shift) % 26 + 65
            result += chr(shifted)
        else:
            result += char  # preserve spaces and punctuation
    return result

def caesar_decrypt(ciphertext, shift):
    # Decrypting is just encrypting with the opposite shift
    return caesar_encrypt(ciphertext, 26 - shift)

def brute_force(ciphertext):
    """Try all 25 possible shifts"""
    for shift in range(1, 26):
        attempt = caesar_decrypt(ciphertext, shift)
        print(f"Shift {shift:2d}: {attempt}")

# --- Try it ---
secret = "HELLO WORLD"
encrypted = caesar_encrypt(secret, 7)
print(f"Original:  {secret}")
print(f"Encrypted: {encrypted}")
print(f"Decrypted: {caesar_decrypt(encrypted, 7)}")
print("\n--- Brute Force Attack ---")
brute_force(encrypted)

Extension Challenge

Modify the brute_force function to automatically identify the most likely correct shift by looking for common English words like "THE", "AND", "IS" in each attempt. Print only the most likely result.

✓ Knowledge Check — Lab 1

Q1: Why is the Caesar cipher considered cryptographically weak?

A It uses a keyword that is hard to remember
B It has only 25 possible keys, making brute-force trivial
C It was invented too long ago to still be relevant
D It does not preserve spaces

✔ Correct — Only 25 shifts means an attacker can try every key by hand.

✘ Not quite — try again. Think about how many keys are possible.

Q2: What does frequency analysis exploit?

A The weakness of the random number generator used
B The length of the keyword
C The fact that some letters appear more often than others in natural language
D Repeated use of the same key across multiple messages

✔ Correct — Letter frequency patterns survive simple substitution.

✘ Not quite. Think about how often 'E' appears in English text.

Q3: The Vigenère cipher is stronger than Caesar because…

A Each letter of the message can be shifted by a different amount, obscuring frequency patterns
B It uses a longer alphabet
C It encrypts numbers as well as letters
D It requires a computer to run

✔ Correct — The polyalphabetic substitution flattens the frequency distribution.

✘ Not quite. Focus on what the keyword changes about each letter's shift.

Concept Estimated: 60–90 min

Background: What is hashing, and how does it differ from encryption?

Hashing and encryption are often confused, but they serve different purposes. Encryption is reversible: if you have the key, you can always get the original message back. Hashing is one-way: it converts data into a fixed-length "fingerprint" called a hash or digest. There is no key, and for a well-designed hash function there is no feasible way to reconstruct the original from the hash alone.

Analogy

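Think of a hash function like a meat grinder. You can put a steak in and get ground beef out, but you cannot put ground beef in and get a steak back. The analogy has one limit: because the output has a fixed length, two different inputs can in principle produce the same hash (a collision). A good hash function makes such pairs practically impossible to find.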

2.1 What are hash functions used for?

Password storage: Databases store the hash of your password, not the password itself. When you log in, your input is hashed and compared to the stored hash.
File integrity: Software downloads include a hash. After downloading, you hash the file yourself and compare. If your result matches the hash published by a source you trust, the file arrived unmodified.
Digital signatures: Instead of signing an entire document, you sign its hash (much faster).
Blockchain: Each block contains the hash of the previous block, chaining them together immutably.
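
The password-storage case can be sketched with the standard library. Real systems usually go one step beyond a bare hash: they add a random per-user salt and use a deliberately slow derivation function, which is what this sketch assumes. The function names and the iteration count are illustrative choices, not a recommendation from this workbook.

```python
import hashlib
import hmac
import os

def hash_password(password: str, iterations: int = 600_000) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the password itself is never stored."""
    salt = os.urandom(16)  # fresh random salt for every user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, digest

def verify_password(password: str, salt: bytes, stored: bytes,
                    iterations: int = 600_000) -> bool:
    """Re-hash the login attempt with the stored salt and compare."""
    attempt = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(attempt, stored)  # constant-time comparison

salt, stored = hash_password("correct horse")
print(verify_password("correct horse", salt, stored))  # True
print(verify_password("wrong guess", salt, stored))    # False
```

The database only ever holds the salt and the digest, so a stolen table does not directly reveal anyone's password.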

2.2 Properties a good hash function must have

Property	What it means	Why it matters
Deterministic	Same input → same output, always	You need the same result each time to verify
Fixed length	Output is always the same size regardless of input size	SHA-256 always produces 256 bits, even for a 10 GB file
Pre-image resistant	Given a hash H, it is computationally infeasible to find an input that produces H	Prevents reverse-engineering passwords from their hashes
Collision resistant	Extremely hard to find two inputs with the same hash	MD5 and SHA-1 are broken in this respect; use SHA-256 or newer
Avalanche effect	Tiny change in input → completely different hash	Ensures the hash reveals nothing about the input

2.3 The Avalanche Effect in Action

Change one character in your input and watch how completely the hash changes. This is called the avalanche effect and is a crucial security property.
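
The effect can also be observed from Python with hashlib. This is a small sketch; the two strings are arbitrary examples, and their digests will not match the sha256sum output of a file created with echo, because echo appends a newline.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hex characters."""
    return hashlib.sha256(text.encode()).hexdigest()

a = sha256_hex("This is my test file")
b = sha256_hex("This is my test file!")  # one extra character

# Count the positions where the two 64-character digests disagree
differing = sum(1 for x, y in zip(a, b) if x != y)
print(a)
print(b)
print(f"{differing} of {len(a)} hex characters differ")
```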

🧪 Hash Demo — watch the avalanche effect

Input text:

MD5 & SHA-1 Are Broken

MD5 (1992) and SHA-1 (1995) are still widely seen in legacy systems, but both have known collision attacks — researchers have demonstrated two different files that produce the same hash. Never use MD5 or SHA-1 for security-critical purposes. Use SHA-256 or SHA-3 instead.

2.4 On-Machine Labs: File Integrity Verification

🐧 Linux Terminal

md5sum, sha1sum, and sha256sum are part of GNU coreutils and come pre-installed on virtually every mainstream Linux distro. Run all commands exactly as shown.

📓 Google Colab

Prefix all shell commands with ! — e.g. !sha256sum testfile.txt. The Colab runtime is Linux-based so all tools are available.

EX 2.1 Generate and compare file hashes

1

Create a working directory and your first test file:

Linux Terminal

$ mkdir ~/hash-lab && cd ~/hash-lab

$ echo "This is my test file" > testfile.txt

$ cat testfile.txt

This is my test file

Google Colab equivalent

In a Colab cell, use the ! prefix: !mkdir -p /content/hash-lab && echo "This is my test file" > /content/hash-lab/testfile.txt

2

Generate hashes with three different algorithms and observe the output length:

Linux Terminal

$ md5sum testfile.txt

[32 hex characters: your MD5 hash]  testfile.txt

$ sha1sum testfile.txt

[40 hex characters: your SHA-1 hash]  testfile.txt

$ sha256sum testfile.txt

[64 hex characters: your SHA-256 hash]  testfile.txt

3

Create a second file that differs by only ONE character (the exclamation mark) and re-hash:

Linux Terminal

$ echo "This is my test file!" > testfile2.txt

$ sha256sum testfile.txt testfile2.txt

[your SHA-256 hash from step 2]  testfile.txt

[a completely different hash]  testfile2.txt

4

Use diff to confirm the files really do differ by only one character:

Linux Terminal

$ diff testfile.txt testfile2.txt

1c1

< This is my test file

---

> This is my test file!

Record your results: Copy your actual SHA-256 hashes below. How many hex characters differ between the two hashes?

SHA-256 of testfile.txt:

SHA-256 of testfile2.txt:

Number of hex characters that changed (out of 64 total):

EX 2.2 Verify a software download

Most legitimate software distributions publish their SHA-256 hash alongside the download link. This is how you verify that what you downloaded is authentic and unmodified in transit.

1

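Download the Python 3 source tarball and its detached signature file (.asc) directly from python.org. The .asc file is an OpenPGP signature, not a checksum; this exercise uses the SHA-256 hash from the release page in step 3: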

Linux Terminal

$ cd ~/hash-lab

$ wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz

$ wget https://www.python.org/ftp/python/3.12.0/Python-3.12.0.tgz.asc

... download progress ...

Google Colab equivalent

Use !wget in a code cell — the file saves to /content/ by default. Then run !sha256sum /content/Python-3.12.0.tgz.

2

Compute the SHA-256 hash of your downloaded file:

Linux Terminal

$ sha256sum Python-3.12.0.tgz

[your hash will appear here] Python-3.12.0.tgz

3

Go to https://www.python.org/downloads/release/python-3120/ and find the SHA-256 hash listed next to the Gzipped source tarball link. Compare it manually character by character — or use the automated check below:

Linux Terminal — automated verification

$ echo "PASTE_OFFICIAL_HASH_HERE  Python-3.12.0.tgz" | sha256sum --check

Python-3.12.0.tgz: OK

# If the hash is wrong you would see:

Python-3.12.0.tgz: FAILED

4

Simulate a tampered file: append one byte to the download and re-check:

Linux Terminal

$ cp Python-3.12.0.tgz Python-3.12.0-tampered.tgz

$ printf "X" >> Python-3.12.0-tampered.tgz

$ sha256sum Python-3.12.0.tgz Python-3.12.0-tampered.tgz

[original hash] Python-3.12.0.tgz

[completely different hash] Python-3.12.0-tampered.tgz

Published SHA-256 from python.org:

Your computed SHA-256:

What would it mean for security if these two hashes did not match?

EX 2.3 Python: Build a file integrity checker

Write a Python script that computes and stores hashes of files, then can verify them later to detect tampering. This is a simplified version of tools like tripwire used in real security operations.

🐧 Linux Terminal

Save as integrity_checker.py and run with python3 integrity_checker.py. No pip installs needed — uses only standard library modules.

📓 Google Colab

Paste into a code cell. Change the directory path from ./test_folder to /content/test_folder. No installation needed.

Python 3 — integrity_checker.py

import hashlib
import json
import os
from pathlib import Path

def hash_file(filepath, algorithm="sha256"):
    """Compute the hash of a file, reading in chunks for large files."""
    h = hashlib.new(algorithm)
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            h.update(chunk)
    return h.hexdigest()

def create_baseline(directory, outfile="baseline.json"):
    """Hash all files in a directory and save the results."""
    baseline = {}
    for path in Path(directory).rglob("*"):
        if path.is_file():
            baseline[str(path)] = hash_file(path)
    with open(outfile, "w") as f:
        json.dump(baseline, f, indent=2)
    print(f"Baseline saved: {len(baseline)} files hashed.")

def verify_integrity(directory, baseline_file="baseline.json"):
    """Compare current hashes against the saved baseline."""
    with open(baseline_file) as f:
        baseline = json.load(f)
    problems = 0
    for filepath, stored_hash in baseline.items():
        if not os.path.exists(filepath):
            problems += 1
            print(f"[MISSING] {filepath}")
        elif hash_file(filepath) != stored_hash:
            problems += 1
            print(f"[CHANGED] {filepath}")
    # Files present now but absent from the baseline were added later
    for path in Path(directory).rglob("*"):
        if path.is_file() and str(path) not in baseline:
            problems += 1
            print(f"[NEW] {path}")
    if problems == 0:
        print("✓ All files intact. No changes detected.")

# --- Usage ---
# First run: no baseline.json exists yet, so create one.
# Later runs: compare the folder against the saved baseline
# (re-creating it on every run would hide any changes).
if not os.path.exists("baseline.json"):
    create_baseline("./test_folder")
else:
    verify_integrity("./test_folder")

1

Linux: Create a test folder with sample files:
mkdir -p ~/hash-lab/test_folder
echo "File one content" > ~/hash-lab/test_folder/file1.txt
echo "File two content" > ~/hash-lab/test_folder/file2.txt
Colab: Use !mkdir -p /content/test_folder and write files with Python's open().

2

Run create_baseline("./test_folder") — it will print how many files were hashed and save a baseline.json file.

3

Modify one file: open file1.txt and add or change a single word. Save it.

4

Run verify_integrity("./test_folder") and observe the output — it should flag the changed file as [CHANGED].

What output did you see after modifying the file?

What does this tell you about SHA-256's sensitivity to changes?

✓ Knowledge Check — Lab 2

Q1: Which of the following correctly describes a hash function?

A It encrypts data and can be reversed with a key
B It converts data to a fixed-length fingerprint that cannot feasibly be reversed
C It compresses files so they take less space
D It generates random numbers for cryptographic use

✔ Correct — Hashing is a one-way transformation with no decryption key.

✘ Not correct. Remember: no key, no reversal.

Q2: You download a file and its SHA-256 hash does not match the one published by the developer. What does this mean?

A Your download is probably fine — hashes sometimes differ
B The file is encrypted
C The file may have been corrupted or tampered with — do not run it
D You used the wrong hash algorithm

✔ Correct — Any mismatch must be treated as a potential compromise.

✘ Not quite. Hashes are deterministic — two identical files always produce identical hashes.

Q3: Why should MD5 no longer be used for security applications?

A It is too slow for modern computers
B Researchers have demonstrated collision attacks — two different inputs can produce the same MD5 hash
C It produces hashes that are too short
D It requires a secret key to operate

✔ Correct — Collision resistance is broken in MD5, making it unsafe for signatures and integrity checks.

✘ Not quite. Think about what "collision" means for a hash function.

Concept Estimated: 90–120 min

Background: Modern symmetric encryption

Classical ciphers like Caesar were symmetric: both parties used the same key. Modern symmetric encryption works the same way, but uses algorithms so mathematically complex that brute-force attacks would take longer than the age of the universe.

AES (Advanced Encryption Standard) is the global standard, adopted by NIST in 2001. It operates on 128-bit blocks of data using keys of 128, 192, or 256 bits. The NSA uses AES-256 for top-secret data. Understanding how to use it — and how to use it correctly — is a fundamental skill.

Key Term

A block cipher encrypts fixed-size chunks of data (blocks). AES uses 128-bit (16-byte) blocks. A mode of operation defines how the cipher handles messages longer than one block. The mode you choose dramatically affects security.

3.1 Block Cipher Modes: Why They Matter

Imagine encrypting a long message that is broken into 10 blocks. The simplest approach is to encrypt each block independently — this is ECB (Electronic Codebook) mode. The problem: identical plaintext blocks produce identical ciphertext blocks. Patterns survive.

The ECB Penguin Problem

The most famous demonstration: encrypt a bitmap image using ECB mode and the shape of the original image is still visible in the ciphertext, because identical pixel blocks map to identical encrypted blocks. This is not theoretical — it is a practical attack vector.

🧪 ECB vs CBC Visual Demo — observe pattern leakage

Input pattern (simulating a structured image):

ECB Mode — patterns visible

CBC Mode — patterns hidden

3.2 CBC Mode and the Initialization Vector (IV)

CBC (Cipher Block Chaining) fixes the ECB pattern problem by XOR-ing each plaintext block with the previous ciphertext block before encrypting. This means identical plaintext blocks produce different ciphertext depending on their position and what came before them.

But the very first block has no previous block to XOR with — so we use an Initialization Vector (IV): a random value that starts the chain. The IV must be:

Random — not a constant. A fixed IV breaks CBC security.
Unique per message — reusing an IV with the same key leaks information about the plaintexts.
Not secret — it can be sent alongside the ciphertext. Its job is randomness, not secrecy.
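
The difference between processing blocks independently and chaining them can be sketched without installing anything. The block function below is a hypothetical stand-in for AES built from HMAC-SHA256: it is keyed and deterministic but not invertible, so it cannot decrypt. It is used only to show where patterns survive.

```python
import hashlib
import hmac
import os

BLOCK = 16

def toy_block_encrypt(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for AES: deterministic and keyed, but NOT invertible."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def ecb_style(key: bytes, blocks: list[bytes]) -> list[bytes]:
    # Every block is processed independently of the others
    return [toy_block_encrypt(key, b) for b in blocks]

def cbc_style(key: bytes, iv: bytes, blocks: list[bytes]) -> list[bytes]:
    out, previous = [], iv
    for b in blocks:
        previous = toy_block_encrypt(key, xor(b, previous))  # chain to prior output
        out.append(previous)
    return out

key = os.urandom(16)
blocks = [b"A" * BLOCK, b"A" * BLOCK, b"B" * BLOCK]  # first two blocks identical

ecb = ecb_style(key, blocks)
cbc = cbc_style(key, os.urandom(BLOCK), blocks)
print("ECB blocks 0 and 1 equal:", ecb[0] == ecb[1])  # True
print("CBC blocks 0 and 1 equal:", cbc[0] == cbc[1])  # False
```

Calling cbc_style twice with the same fixed IV returns the same output both times, which is the leak that a fresh random IV prevents.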

Mode	Leaks patterns?	Safe to use?
ECB	Yes: identical blocks → identical ciphertext	Never for real data
CBC	No: each block depends on the previous one	Only with an HMAC; vulnerable to padding-oracle attacks without one
CTR	No: turns AES into a stream cipher	Yes (but no authentication)
GCM	No, and it also provides authentication	Recommended default

3.3 OpenSSL on the Command Line

OpenSSL is a free, open-source toolkit that implements SSL/TLS and a broad set of cryptographic operations. It comes pre-installed on most Linux distributions and on Google Colab.

🐧 Linux Terminal

All commands run exactly as written. Verify OpenSSL is present first: openssl version. If missing: sudo apt install openssl (Ubuntu/Debian) or sudo dnf install openssl (Fedora/RHEL).

📓 Google Colab

Prefix every shell command with ! and use /content/ as your working directory instead of ~/encryption-lab/. OpenSSL is pre-installed — no setup needed.

EX 3.1 Encrypt and decrypt a file with AES-256-CBC

1

Create a working directory and a plaintext file:

Linux Terminal

$ mkdir ~/encryption-lab && cd ~/encryption-lab

$ echo "This is my secret message. Nobody should see this." > secret.txt

$ cat secret.txt

This is my secret message. Nobody should see this.

Google Colab equivalent

In a Colab cell: !mkdir -p /content/encryption-lab && echo "This is my secret message. Nobody should see this." > /content/encryption-lab/secret.txt
Then prefix every subsequent command with ! and use /content/encryption-lab/ as your path.

2

Encrypt it. OpenSSL will ask you to set a passphrase. The -pbkdf2 flag uses a secure key derivation function so your password is not used directly as the key.

Linux Terminal

$ openssl enc -aes-256-cbc -pbkdf2 -in secret.txt -out secret.enc

enter AES-256-CBC encryption password: ****

Verifying - enter AES-256-CBC encryption password: ****

3

Try to read the encrypted file — it will be binary garbage. Then inspect it in hex to see the Salted__ header OpenSSL prepends:

Linux Terminal

$ cat secret.enc

[binary garbage — unreadable]

$ xxd secret.enc | head

00000000: 5361 6c74 6564 5f5f 4a8a 23b1 ... Salted__J.#.

4

Decrypt it back to a new file and verify the contents match the original:

Linux Terminal

$ openssl enc -aes-256-cbc -pbkdf2 -d -in secret.enc -out decrypted.txt

enter AES-256-CBC decryption password: ****

$ cat decrypted.txt

This is my secret message. Nobody should see this.

$ diff secret.txt decrypted.txt && echo "Files are identical"

Files are identical

5

Test what happens with the wrong password — deliberately enter a different one:

Linux Terminal

$ openssl enc -aes-256-cbc -pbkdf2 -d -in secret.enc -out wrong.txt

enter AES-256-CBC decryption password: wrongpassword

bad decrypt

140...error:06065064:digital envelope routines:EVP_DecryptFinal_ex:bad decrypt

What error message did you see when using the wrong password? Why can't OpenSSL just produce garbled output instead of an error?

What does the Salted__ prefix in the hex output tell you about how OpenSSL derives the encryption key from your password?
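
The file layout seen in the hex dump can be explored from Python: the 8-byte Salted__ marker is followed by an 8-byte salt, and the ciphertext comes after that. The sketch below also derives a key and IV the way recent OpenSSL releases are believed to do with -pbkdf2 (PBKDF2-HMAC-SHA256, 10,000 iterations). Treat those defaults as assumptions and check the openssl-enc manual page for your version. The function names are illustrative.

```python
import hashlib
import os

MAGIC = b"Salted__"

def split_openssl_header(data: bytes) -> tuple[bytes, bytes]:
    """Return (salt, ciphertext) from data shaped like a password-encrypted file."""
    if data[:8] != MAGIC:
        raise ValueError("no Salted__ header found")
    return data[8:16], data[16:]

def derive_key_iv(password: bytes, salt: bytes,
                  iterations: int = 10_000) -> tuple[bytes, bytes]:
    # Assumed defaults: PBKDF2-HMAC-SHA256 with 10,000 iterations.
    # 48 bytes of output = 32-byte AES-256 key followed by a 16-byte IV.
    material = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=48)
    return material[:32], material[32:]

# Made-up bytes shaped like the start of secret.enc (not real OpenSSL output)
example = MAGIC + os.urandom(8) + os.urandom(16)
salt, ciphertext = split_openssl_header(example)
key, iv = derive_key_iv(b"example passphrase", salt)
print(salt.hex(), len(key), len(iv))
```

Because the salt is random and stored in the file, the same passphrase produces a different key and IV every time a file is encrypted.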

EX 3.2 Demonstrate ECB pattern leakage on an image

This is the classic "ECB penguin" demonstration. We encrypt a bitmap image using ECB and CBC modes and compare the results visually — ECB leaks the image's structure even after encryption.

Why BMP?

BMP (bitmap) files store raw, uncompressed pixel data. Identical regions of the image produce identical blocks of bytes — which ECB maps to identical ciphertext blocks, revealing shapes. JPEG and PNG compress first, hiding patterns before encryption even begins.

1

Navigate to your lab directory and download the Tux penguin image, then convert it to BMP:

Linux Terminal

$ cd ~/encryption-lab

$ wget -O tux.png https://upload.wikimedia.org/wikipedia/commons/a/af/Tux.png

# Convert PNG to BMP using ImageMagick (install if needed):

$ sudo apt install imagemagick -y

$ convert tux.png -background white -alpha remove -alpha off BMP3:tux.bmp

$ ls -lh tux.bmp

[shows file size — BMP is much larger than PNG because it is uncompressed]

Google Colab equivalent

!wget -O /content/tux.png https://upload.wikimedia.org/wikipedia/commons/a/af/Tux.png
!apt install imagemagick -y -q
!convert /content/tux.png -background white -alpha remove -alpha off BMP3:/content/tux.bmp
Then prefix all subsequent commands with ! and use /content/ paths.

2

Extract the BMP header (first 54 bytes — stores image dimensions and colour format, not pixel data). We preserve this so the output files are still viewable images:

Linux Terminal

$ head -c 54 tux.bmp > header.bin

$ wc -c header.bin

54 header.bin
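
Splitting at byte 54 only works if the pixel data really starts there. The BMP header records that offset itself, so it can be checked. The sketch below reads the relevant fields with struct. It runs on a hand-built header for a hypothetical 4x2 image so that it is self-contained, and bmp_layout is a name chosen for illustration.

```python
import struct

def bmp_layout(header: bytes) -> dict:
    """Read the BMP header fields that matter for this exercise."""
    if header[:2] != b"BM":
        raise ValueError("not a BMP file")
    pixel_offset = struct.unpack_from("<I", header, 10)[0]  # where pixel data starts
    width, height = struct.unpack_from("<ii", header, 18)   # signed 32-bit, little-endian
    bits_per_pixel = struct.unpack_from("<H", header, 28)[0]
    return {"pixel_offset": pixel_offset, "width": width,
            "height": height, "bits_per_pixel": bits_per_pixel}

# Hand-built 54-byte header for a hypothetical 4x2, 24-bit image
# (stands in for header.bin so the sketch runs on its own)
file_header = b"BM" + struct.pack("<IHHI", 54 + 24, 0, 0, 54)
info_header = struct.pack("<IiiHHIIiiII", 40, 4, 2, 1, 24, 0, 24, 2835, 2835, 0, 0)
print(bmp_layout(file_header + info_header))
```

To inspect the real file, pass in open("header.bin", "rb").read(). If pixel_offset is larger than 54, the header is a longer variant and the split point in the following steps needs to move accordingly.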

3

Encrypt only the pixel data using ECB mode. The -nosalt and fixed -K hex key make the result deterministic — identical pixel blocks produce identical ciphertext blocks:

Linux Terminal

$ tail -c +55 tux.bmp \

| openssl enc -aes-128-ecb -nosalt \

-K 0123456789abcdef0123456789abcdef \

> ecb_pixels.bin

$ cat header.bin ecb_pixels.bin > tux_ecb.bmp

4

Do the same with CBC mode — each pixel block is XOR-ed with the previous ciphertext block before encrypting, breaking the pattern:

Linux Terminal

$ tail -c +55 tux.bmp \

| openssl enc -aes-128-cbc -nosalt \

-K 0123456789abcdef0123456789abcdef \

-iv 00000000000000000000000000000000 \

> cbc_pixels.bin

$ cat header.bin cbc_pixels.bin > tux_cbc.bmp

5

Open all three images side by side. On Linux use Eye of GNOME, the Files app, or ImageMagick's display tool:

Linux Terminal

$ eog tux.bmp tux_ecb.bmp tux_cbc.bmp

# alternatives: xdg-open tux_ecb.bmp

# or: display tux_ecb.bmp (ImageMagick)

Google Colab — view images inline in the notebook

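from IPython.display import display
from PIL import Image  # Pillow ships with Colab; IPython's own Image class cannot embed BMP files

for label, path in [("Original:", "/content/tux.bmp"),
                    ("ECB encrypted:", "/content/tux_ecb.bmp"),
                    ("CBC encrypted:", "/content/tux_cbc.bmp")]:
    print(label)
    img = Image.open(path)
    img.thumbnail((200, 200))  # shrink in place, keeping the aspect ratio
    display(img)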

Describe what you see in each of the three images. Can you still make out the penguin in the ECB version?

Why does the CBC-encrypted image look like random noise while ECB does not?

EX 3.3 Python: Encrypt a file with AES-CBC correctly

This script implements AES-CBC encryption properly: random IV per encryption, PKCS7 padding, and IV prepended to ciphertext so the receiver can decrypt without any extra communication.

🐧 Linux Terminal

Install the library first:

pip3 install cryptography

Then run: python3 aes_demo.py

📓 Google Colab

In the first cell run:

!pip install cryptography

Then paste the script into the next cell and run it.

Python 3 — requires: pip install cryptography

import os
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.primitives import padding

def aes_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """
    Encrypt plaintext using AES-128-CBC.
    Returns: IV (16 bytes) + ciphertext (padded to 16-byte boundary)
    """
    # Generate a fresh random IV for every encryption operation
    iv = os.urandom(16)

    # Apply PKCS7 padding so plaintext length is a multiple of 16
    padder = padding.PKCS7(128).padder()
    padded = padder.update(plaintext) + padder.finalize()

    # Encrypt using AES-CBC
    cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
    encryptor = cipher.encryptor()
    ciphertext = encryptor.update(padded) + encryptor.finalize()

    # Prepend the IV to the ciphertext so the receiver can use it
    return iv + ciphertext

def aes_decrypt(data: bytes, key: bytes) -> bytes:
    """
    Decrypt data produced by aes_encrypt.
    Expects: IV (first 16 bytes) + ciphertext
    """
    iv = data[:16]
    ciphertext = data[16:]

    cipher = Cipher(algorithms.AES(key), modes.CBC(iv))
    decryptor = cipher.decryptor()
    padded = decryptor.update(ciphertext) + decryptor.finalize()

    # Remove the padding to get original plaintext back
    unpadder = padding.PKCS7(128).unpadder()
    return unpadder.update(padded) + unpadder.finalize()

# --- Demo ---
key = os.urandom(16)  # AES-128 requires 16-byte key
message = b"This is a secret message!"

encrypted = aes_encrypt(message, key)
decrypted = aes_decrypt(encrypted, key)

print(f"Original:  {message}")
print(f"Encrypted: {encrypted.hex()}")
print(f"Decrypted: {decrypted}")
print(f"\nIV used: {encrypted[:16].hex()}")

# Encrypt the same message twice — different IVs mean different outputs
enc1 = aes_encrypt(message, key)
enc2 = aes_encrypt(message, key)
print(f"\nEncryption 1: {enc1.hex()[:40]}...")
print(f"Encryption 2: {enc2.hex()[:40]}...")
print(f"Are they the same? {enc1 == enc2}")  # Should be False!

1

Linux: Install the library with pip3 install cryptography, save the script as aes_demo.py, and run python3 aes_demo.py.
Colab: Run !pip install cryptography in one cell, paste the script in the next, and press Shift+Enter.

2

Observe that encrypting the same message twice produces completely different ciphertext. The first 32 hex characters of each printed output are the IV (it is prepended to the ciphertext), and they differ each time. This is the random IV at work.

3

Extension exercise: Find the line iv = os.urandom(16) and replace it with iv = b'\x00' * 16 (a fixed all-zero IV). Run the script again twice. What do you notice about the two encrypted outputs now?

What was the output when you used a fixed IV and encrypted the same message twice? How does it differ from using a random IV?

Explain in your own words why a random IV is essential, even though the IV is not kept secret:
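
The PKCS7 padding step used by the script is simple enough to write out by hand. The cryptography library's padding.PKCS7 class does this for you; the version below is a sketch for illustration only, and its function names are made up for this example.

```python
def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Append N bytes, each of value N, so the length is a multiple of block_size."""
    n = block_size - (len(data) % block_size)  # always 1..block_size, never 0
    return data + bytes([n]) * n

def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    """Check and strip the padding added by pkcs7_pad."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("invalid padded length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        raise ValueError("invalid padding")
    return padded[:-n]

message = b"This is a secret message!"   # 25 bytes, as in the script above
padded = pkcs7_pad(message)
print(len(message), len(padded))         # 25 32
print(padded[-7:])                       # seven bytes, each with value 7
```

A message that is already a multiple of 16 bytes still gains a full block of padding, so the receiver can always tell padding from data.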

Going Further: Use AES-GCM Instead

AES-CBC provides confidentiality but not authentication: an attacker can tamper with the ciphertext without you knowing. AES-GCM (Galois/Counter Mode) provides both encryption and a message authentication tag in one step. For any new system, prefer GCM. The switch takes more than swapping the mode: use a 12-byte random nonce in modes.GCM(iv), drop the PKCS7 padding (GCM does not need it), send encryptor.tag along with the ciphertext, and pass it back as modes.GCM(iv, tag) when decrypting so that tampering raises an error.

✓ Knowledge Check — Lab 3

Q1: Why is ECB mode considered insecure for encrypting most data?

A It uses a key that is too short
B It is too slow for large files
C Identical plaintext blocks produce identical ciphertext blocks, revealing patterns
D It does not support keys longer than 128 bits

✔ Correct — The ECB penguin is the canonical demonstration of this structural weakness.

✘ Not quite. Think about what happens when two blocks of plaintext are identical.

Q2: What is the purpose of an Initialization Vector (IV)?

A To store the encryption key securely
B To ensure that encrypting the same plaintext twice with the same key produces different ciphertext
C To authenticate the message
D To compress the plaintext before encryption

✔ Correct — The random IV ensures ciphertext uniqueness even when plaintext and key are the same.

✘ Not quite. The IV does not need to be secret — its purpose is randomness.

Q3: Which AES mode is recommended for new applications that need both confidentiality and integrity?

A ECB
B CBC
C CTR
D GCM

✔ Correct — GCM (Galois/Counter Mode) provides authenticated encryption, protecting against tampering.

✘ Not quite. CBC only provides confidentiality — it cannot detect if ciphertext was tampered with.
