Python for loop — Silent Data Loss from List Modification
47% of flagged records persisted after a for loop cleanup due to in-place list modification. Avoid this with list comprehensions or by iterating over a copy.
20+ years shipping production Python across data and backend systems. Lessons pulled from things that broke in production.
- A for loop visits every item in an iterable and runs the same block of code for each — Python handles counting automatically
- range(n) starts at 0 and excludes n — range(5) gives 0,1,2,3,4, never 5
- enumerate() gives you both index and value — use it instead of a manual counter
- zip() pairs items from two sequences together — use it instead of range(len()) when processing two lists in lockstep
- break exits the loop; continue skips the current iteration; else runs only if no break occurred
- Modifying a list while looping over it silently skips items — loop over a copy or build a new list
- List comprehensions are faster and safer than a for loop that builds a new list — prefer them when the body is a single expression
- Biggest trap: the for...else clause runs when the loop completes WITHOUT break, not when it breaks
A Python for loop is an iterator-based construct that traverses any iterable (list, string, dict, file, generator) by calling __next__() on an internal iterator object until StopIteration is raised. It exists because Python lacks C-style indexed loops; instead, it embraces the 'for each' paradigm, which is simpler and less error-prone for most tasks.
The loop variable gets bound to each element in sequence, and modifying the underlying collection during iteration (e.g., deleting from a list) silently corrupts the iteration state, causing skipped or duplicated elements. This is a classic footgun: the iterator doesn't know the list changed, so it blindly advances to the next index, which now points to a different element.
The fix is to iterate over a copy (for item in lst[:]) or use a while loop with manual index management. For range() loops, the same risk applies if you modify the list by index: range() produces numbers, not references to the list's internal state, so deleting lst[i] shifts all subsequent elements left and the loop skips the next one (and can raise IndexError once i runs past the shortened list).
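The skipping behaviour is easier to believe once you see it. The sketch below uses a made-up records list (the names and the "flagged" prefix are illustrative assumptions, not data from a real incident) and compares the buggy in-place removal with the two fixes described above.

```python
# Hypothetical records; "flagged" marks the ones the cleanup should remove.
records = ["ok-1", "flagged-1", "flagged-2", "ok-2", "flagged-3", "flagged-4"]

# Buggy: removing from the list being iterated shifts later items left,
# so the item right after each removal is never visited.
buggy = list(records)
for record in buggy:
    if record.startswith("flagged"):
        buggy.remove(record)

# Fix 1: iterate over a shallow copy, mutate the original.
via_copy = list(records)
for record in via_copy[:]:
    if record.startswith("flagged"):
        via_copy.remove(record)

# Fix 2: build a new list with a comprehension (no mutation at all).
via_comprehension = [r for r in records if not r.startswith("flagged")]

print("buggy:            ", buggy)
print("copy:             ", via_copy)
print("list comprehension:", via_comprehension)
```

In this run, two of the four flagged records survive the buggy loop, and no exception is raised. Both fixes remove all four.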
The for loop is the right tool when you need to read or transform data without structural mutation; for destructive operations, use a while loop or a list comprehension (which builds a new list). Alternatives include map() for applying functions, filter() for selection, and itertools for advanced iteration patterns, but none of these protect you from mutating the source during iteration.
Imagine you have a bag of 10 apples and you want to check each one for bruises. You don't write a separate rule for apple 1, apple 2, apple 3 — you just say 'for every apple in the bag, do this check.' That's exactly what a for loop does in Python. It lets you write one instruction and automatically repeat it for every item in a collection, without you having to count how many items there are or manually keep track of where you are.
Every real program does repetitive work. A weather app checks temperatures for 365 days. A music app loads every song in your library. An online store applies a discount to every item in your cart.
The for loop solves this elegantly. You write an action once and Python automatically repeats it for every item in a collection — advancing through the sequence, tracking position, and stopping at the end, all without you doing anything extra.
I've reviewed a lot of Python code over the years — across data pipelines, API services, and scrappy internal tools — and for loop misuse shows up more than almost any other category of bug. Usually it's subtle: a list modified mid-iteration, a zip() silently swallowing records, a for...else clause that does the opposite of what the author intended. Most of these issues don't crash — they just produce wrong results quietly, which is worse.
By the end of this article you'll understand the for loop syntax properly, know how to iterate over lists, strings, dictionaries, ranges, and multiple sequences at once with zip(), use enumerate() the way Pythonistas actually use it, write list comprehensions instead of accumulator loops, and recognise the mistakes that trip up nearly every beginner — including a few that experienced developers still walk into on a bad day.
What a for Loop Actually Does — The Core Idea
A for loop says: 'take this collection of things, visit each one in order, and run the same block of code for each.' Python handles all the counting and moving-forward for you. You just describe what to DO on each visit.
The collection you loop over can be a list of names, a string of characters, a range of numbers — almost anything that holds multiple items. Python calls these 'iterables', but you can think of them as 'anything you can step through one piece at a time.'
The basic shape of a for loop is always the same: the word for, then a variable name you choose (this is your 'current item' holder), then in, then the collection, then a colon. Everything indented below that runs once per item. The indentation is not optional — Python uses it to know which lines belong inside the loop and which don't.
Beyond the syntax, it's worth understanding what Python is actually doing under the hood. When you write for planet in planets, Python calls iter(planets) to get an iterator object, then calls next() on it repeatedly until it raises a StopIteration exception — at which point the loop ends cleanly. You can do this manually yourself to see exactly what the loop sees step by step. This also explains why you can loop over strings, files, generators, and custom objects — anything that implements __iter__ and __next__ participates in the same protocol. That's not implementation detail trivia; it's the mental model that makes debugging iterator errors actually tractable.
    # --- Basic for loop ---
    planets = ["Mercury", "Venus", "Earth", "Mars", "Jupiter"]

    for planet in planets:             # 'planet' holds the current item each round
        print(f"Visiting {planet}")    # this line runs once per planet

    print("Tour complete!")            # NOT indented: runs after the loop finishes

    # --- What Python actually does under the hood ---
    # The for loop above is equivalent to this manual iterator protocol:
    print("\n=== Manual iterator (what Python does internally) ===")
    iterator = iter(planets)           # get an iterator from the list
    while True:
        try:
            planet = next(iterator)    # ask for the next item
            print(f"Visiting {planet}")
        except StopIteration:          # no more items: the loop ends cleanly
            break

    # Knowing this explains WHY you can loop over strings, files, generators:
    # any object that implements __iter__ and __next__ is fair game.
    print("\n=== Strings are iterable too ===")
    for char in "Mars":                # a string is just a sequence of characters
        print(char)
- Python calls iter() on the collection to get an iterator object
- Each iteration calls next() to get the next item; StopIteration signals the end and the loop exits cleanly
- The loop variable is assigned fresh on each iteration; it does not carry state between rounds
- Any object that implements __iter__ and __next__ can be looped over, not just lists
- Indentation defines the loop body: no braces needed, Python enforces structure through layout
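To make the protocol concrete, here is a minimal sketch of a custom object that a for loop can consume. The Countdown class is a hypothetical example written for this illustration; it is not part of the standard library.

```python
class Countdown:
    """Hypothetical iterator that counts down from `start` to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration      # tells the for loop to stop
        value = self.current
        self.current -= 1
        return value


collected = [n for n in Countdown(3)]
print(collected)

# Because Countdown is its own iterator, it is single-use:
counter = Countdown(2)
first_pass = list(counter)
second_pass = list(counter)          # already exhausted
print(first_pass, second_pass)
```

The second pass comes back empty because the object keeps its position between loops. The same thing happens with generators and file objects.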
Understanding iter() and next() makes debugging iterator errors straightforward instead of mysterious; it is what separates engineers who debug iterator errors confidently from engineers who just add print statements and hope.
- for item in my_list (direct iteration is cleanest and most readable)
- for _ in range(n) (the underscore signals the variable is intentionally unused)
- for i, item in enumerate(my_list) (not a manual counter variable)
- for key, value in my_dict.items() (direct dict iteration only gives keys)
Looping Over Numbers with range(): Your Most-Used Tool
Most beginners need to repeat something a set number of times: run a countdown from 10, process 100 items, print a multiplication table. For that, Python gives you range(). It generates a sequence of numbers on demand without storing them all in memory at once.
range(5) gives you 0, 1, 2, 3, 4 — five numbers, starting at zero. This trips up almost every beginner at least once: range counts from zero by default, and the number you pass in is NOT included in the output. Think of it as 'up to but not including 5.'
range(start, stop) lets you control where counting begins. range(1, 6) gives 1, 2, 3, 4, 5. range(start, stop, step) adds a step size — range(0, 10, 2) gives you every even number from 0 to 8. You can count backwards with a negative step: range(5, 0, -1) counts down from 5 to 1.
One thing worth internalising early: range() is a lazy object in Python 3. It does not generate all the numbers upfront and store them in a list. It calculates each number on demand as the loop requests it. range(1_000_000) uses a few dozen bytes of memory regardless of how large the range is. This is why you should never write list(range(n)) just to loop over numbers: you're forcing Python to materialise the entire sequence into memory for no reason. The range object already supports indexing, slicing, and membership testing without being converted to a list.
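As a quick illustration of that last point, the sketch below exercises len(), indexing, slicing, and membership directly on a range object. The variable names are arbitrary choices for this example.

```python
evens = range(0, 20, 2)      # lazy sequence: 0, 2, 4, ..., 18

count = len(evens)           # number of values, computed without iterating
third = evens[2]             # indexing works directly on the range
last = evens[-1]             # negative indexes work too
tail = evens[7:]             # slicing returns another range, not a list
has_12 = 12 in evens         # membership test, no list needed
has_13 = 13 in evens

# A range can also be looped over more than once.
first_pass = [n for n in evens]
second_pass = [n for n in evens]

print(count, third, last, list(tail), has_12, has_13)
print(first_pass == second_pass)
```

Unlike a generator, a range is not exhausted after one loop, so repeated passes do not require a list either.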
    import sys

    # --- range(n): repeat N times, starting from 0 ---
    print("=== Countdown prep ===")
    for step_number in range(5):           # generates 0, 1, 2, 3, 4
        print(f"Step {step_number}")

    # --- range(start, stop): control where counting begins ---
    print("\n=== Lap counter ===")
    for lap in range(1, 6):                # generates 1, 2, 3, 4, 5 (NOT 6)
        print(f"Lap {lap} complete")

    # --- range(start, stop, step): skip values ---
    print("\n=== Even floors only ===")
    for floor in range(0, 11, 2):          # 0, 2, 4, 6, 8, 10
        print(f"Elevator stops at floor {floor}")

    # --- Counting backwards ---
    print("\n=== Rocket launch ===")
    for countdown in range(5, 0, -1):      # 5, 4, 3, 2, 1
        print(f"T-minus {countdown}...")
    print("Liftoff!")

    # --- range() is lazy, not a list ---
    lazy_range = range(1_000_000)
    eager_list = list(range(1_000_000))
    print(f"\nrange(1_000_000) memory: {sys.getsizeof(lazy_range)} bytes")
    print(f"list(range(1_000_000)) memory: {sys.getsizeof(eager_list):,} bytes")
    # Never convert range to a list just to iterate: there is no benefit.
    # range already supports len(), indexing, and 'in' checks natively.
The excluded stop value is the classic surprise with range(), and it produces off-by-one errors that are genuinely annoying to track down. range(5) gives you 0, 1, 2, 3, 4: five values, but none of them is 5. If you need to print 1 through 5, use range(1, 6). A good mental model: the stop value is a wall the counter hits but never crosses. The memory comparison in the code above also shows why you should never write list(range(n)) just to loop: range is already lazy and memory-efficient on its own. It supports len(), indexing, slicing, and 'in' membership checks without conversion, so there is almost never a reason to call list() on it.
Use range() directly for looping. Only convert to a list if you genuinely need a real list (for example, to mutate it) or need to serialize the sequence.
Looping Over Strings, Dictionaries, and Using enumerate()
A string in Python is a sequence of characters and you can loop over it exactly the same way you loop over a list. Python hands you one character at a time. This is useful for tasks like counting vowels, reversing text, checking for special characters, or validating a pattern character by character.
Dictionaries work a little differently. When you loop over a dictionary directly, you get its keys — just keys, nothing else. To get both keys and values together in one pass, use .items(). It hands you a (key, value) tuple each round, which you unpack by naming two variables after for. If you only need the values, .values() does that directly. If you only need keys, direct iteration and .keys() are equivalent — .keys() is slightly more explicit and communicates intent to whoever reads the code next.
Here is a pattern change that makes a real difference in code quality: enumerate(). Often you need both the item AND its position number while looping. Without enumerate(), a lot of developers create a separate counter variable before the loop and increment it manually on every iteration: three lines where one would do, and a manual increment that's easy to misplace or forget. enumerate(collection) hands you an (index, item) tuple on each pass. You unpack it with two names after for. The start parameter lets you control the starting number, which is handy when displaying positions to users who expect to start counting at 1.
One pattern worth calling out explicitly because it shows up constantly in code reviews: for i in range(len(my_list)): followed by my_list[i] inside the loop. This is the C and Java index loop imported into Python. It works, but it fights the language. It breaks on iterables that don't have a len(), it introduces manual index arithmetic that can be wrong, and it obscures what the code is actually doing. enumerate() is the direct replacement: it works on any iterable, handles the arithmetic for you, and reads as plain English.
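The "breaks on iterables that don't have a len()" point is easy to check with a generator. In this sketch, read_lines() is a hypothetical stand-in for a file or stream; it is an assumption made for the example, not an API from any library.

```python
def read_lines():
    """Hypothetical generator standing in for a file or network stream."""
    yield "alpha"
    yield "beta"
    yield "gamma"


# range(len(...)) cannot work here: generators have no length.
try:
    len(read_lines())
    has_len = True
except TypeError:
    has_len = False

# enumerate() only needs the iterator protocol, so it works unchanged.
numbered = [(i, line) for i, line in enumerate(read_lines(), start=1)]

print(has_len)
print(numbered)
```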
    # --- Looping over a string ---
    word = "Python"
    print("=== Characters in 'Python' ===")
    for character in word:                 # each character, one at a time
        print(character)

    # --- Looping over a dictionary ---
    student_grades = {"Alice": 92, "Bob": 78, "Carlos": 85}

    print("\n=== Keys only (direct iteration) ===")
    for student_name in student_grades:    # gives keys by default
        print(student_name)

    print("\n=== Keys AND values (.items()) ===")
    for student_name, grade in student_grades.items():
        if grade >= 90:
            remark = "Excellent"
        elif grade >= 80:
            remark = "Good"
        else:
            remark = "Keep practising"
        print(f"{student_name}: {grade} - {remark}")

    print("\n=== Values only (.values()) ===")
    total = sum(student_grades.values())   # .values() passed directly to sum(), no loop needed
    print(f"Class total: {total}")

    # --- Using enumerate() to track position ---
    race_finishers = ["Aisha", "Ben", "Clara", "David"]
    print("\n=== Race Results: enumerate(start=1) ===")
    for position, runner_name in enumerate(race_finishers, start=1):
        print(f"Position {position}: {runner_name}")

    # --- The anti-pattern enumerate() replaces ---
    print("\n=== Avoid this pattern: range(len()) ===")
    # Works but is unidiomatic: breaks on non-sequence iterables, invites off-by-one errors
    for i in range(len(race_finishers)):
        print(f"{i+1}: {race_finishers[i]}")

    print("\n=== Prefer enumerate() instead ===")
    # Clean, works on any iterable, no manual index arithmetic
    for i, runner in enumerate(race_finishers, start=1):
        print(f"{i}: {runner}")
If you find yourself writing counter = 0 before a loop and counter += 1 inside it, stop: that's exactly what enumerate() exists to replace. Similarly, for i in range(len(my_list)): followed by my_list[i] is a signal to switch to enumerate(). It's cleaner, less prone to off-by-one errors, and crucially, it works on any iterable, not just lists that support len(). The start parameter is the detail most people miss: enumerate(items, start=1) gives you 1-based positions without any arithmetic.
Use enumerate() for indexed iteration over any sequence and dict.items() for key-value iteration. Both add negligible overhead over direct iteration.
- enumerate() replaces the range(len()) pattern; always prefer it when you need an index.
- A manual counter is bookkeeping that enumerate() handles for free.
- for item in my_list (simplest and most readable form)
- for i, item in enumerate(my_list) (never use range(len()))
- for key in my_dict (direct iteration gives keys by default)
- for key, value in my_dict.items() (.items() returns (key, value) tuples)
Looping Over Two Sequences at Once with zip()
One of the most common loop patterns in real code is processing two related lists in lockstep: pairing names with scores, pairing database keys with fetched values, pairing timestamps with readings. The wrong way is for i in range(len(names)): name = names[i]; score = scores[i]. The right way is zip().
zip() takes two or more iterables and pairs their items together, one from each, yielding a tuple per step. You unpack the tuple by naming two variables after for. When the shorter iterable runs out, zip() stops: no error, no warning, just stops. This is the right behaviour when you know your lists are the same length, but it becomes a silent data loss bug when they differ. If your two sequences might have different lengths and you need all items from both, use itertools.zip_longest() from the standard library, which fills in a configurable default value for the shorter sequence.
Like range(), zip() is lazy in Python 3. It doesn't build a list of tuples upfront; it generates one pair at a time on demand. zip(list_a, list_b) on two million-item lists uses constant memory because it only ever holds one pair in flight.
zip() also makes code self-documenting in a way that index-based loops can't match. When someone reads for name, score in zip(names, scores), the intent is immediately obvious and the pairing is explicit in the code itself. The index-based equivalent requires the reader to mentally trace that names[i] and scores[i] are intentionally paired, an extra cognitive step that zip() eliminates entirely.
One bonus pattern that comes up often: building a dictionary from two parallel lists. dict(zip(keys, values)) is the idiomatic one-liner, and it reads clearly once you know the pattern.
    import sys
    from itertools import zip_longest

    # --- Basic zip(): pair two lists ---
    names = ["Alice", "Bob", "Carlos", "Diana"]
    scores = [92, 78, 85, 96]

    print("=== Score Report ===")
    for name, score in zip(names, scores):     # no index math, no range(len())
        status = "Pass" if score >= 80 else "Fail"
        print(f"{name}: {score} - {status}")

    # --- zip() stops at the shorter sequence ---
    extra_names = ["Alice", "Bob", "Carlos", "Diana", "Evan"]   # 5 names
    fewer_scores = [92, 78, 85]                                 # 3 scores

    print("\n=== zip() stops early: Diana and Evan are silently dropped ===")
    for name, score in zip(extra_names, fewer_scores):
        print(f"{name}: {score}")
    # Diana and Evan never appear: zip() stopped when fewer_scores ran out

    # --- zip_longest(): process all items, fill gaps with a default ---
    print("\n=== zip_longest(): no items dropped ===")
    for name, score in zip_longest(extra_names, fewer_scores, fillvalue="N/A"):
        print(f"{name}: {score}")

    # --- zip() with three sequences ---
    students = ["Alice", "Bob", "Carlos"]
    grades = ["A", "C", "B"]
    years = [2024, 2025, 2026]

    print("\n=== Three-way zip ===")
    for student, grade, year in zip(students, grades, years):
        print(f"{student} ({year}): {grade}")

    # --- zip() to build a dictionary from two lists ---
    headers = ["name", "score", "year"]
    row = ["Alice", 92, 2026]
    record = dict(zip(headers, row))   # idiomatic one-liner, cleaner than a manual loop
    print(f"\n=== dict from zip: {record}")

    # --- zip() is lazy: constant memory regardless of input size ---
    large_zip = zip(range(1_000_000), range(1_000_000))
    print(f"\nzip object memory: {sys.getsizeof(large_zip)} bytes (not 8MB)")
When the sequences differ in length, zip() stops as soon as the shorter one runs out: no error, no warning, no indication that anything was skipped. The extra items from the longer list are silently gone. This is correct behaviour when you know your lists match, but it becomes a quiet data loss bug when they don't. In data pipelines, this is particularly insidious because the output looks valid; it just has fewer rows than it should. If there is any chance the lengths differ, use itertools.zip_longest() and handle the fill value explicitly. Or assert lengths match at the top of the function before zipping: assert len(a) == len(b), f'Expected equal lengths, got {len(a)} and {len(b)}'.
- Use zip() instead of range(len()) when processing parallel lists.
- Use zip_longest() when lengths may differ.
- zip() is lazy, like range().
- Equal-length sequences: zip() is clean, lazy, and self-documenting. Add an assert if you want to be safe.
- Lengths may differ: use zip_longest() from itertools with an explicit fillvalue, and never silently lose data from a longer sequence.
- Three or more sequences: for name, score, year in zip(a, b, c) works exactly as you'd expect.
Controlling the Loop: break, continue, and the else Clause
Sometimes you don't want to visit every single item. Maybe you're searching a list for a specific value and want to stop the moment you find it — no point checking the remaining 9,000 entries. That's break. It exits the loop immediately, as if you slammed a book shut mid-page.
continue is different — it doesn't stop the loop, it just skips the rest of the current iteration and jumps straight to the next one. Useful when you want to process most items but gracefully skip ones that fail a validation check without adding a layer of nesting.
Python's for loop also has an else clause, and this genuinely surprises most people the first time they encounter it — including some fairly experienced developers. The else block runs only if the loop completed normally, meaning it finished all iterations WITHOUT being interrupted by a break. Think of it as a 'loop completion handler' rather than an alternative condition. It's perfect for search-and-report patterns: loop through items looking for something, break if found, and use else to handle the 'not found' case cleanly without a flag variable.
One important detail that's easy to miss: break and continue only affect the innermost loop they live inside. In a nested loop, break exits the inner loop and returns control to the outer loop — it does not exit both. If you need to exit two levels of nesting at once, the cleanest approach by far is to move the nested logic into a separate function and use return. That eliminates the flag variable pattern, makes the search logic independently testable, and keeps the outer loop readable.
    # --- break: stop the moment you find what you need ---
    passenger_list = ["Tom", "Sara", "Jake", "Mia", "Leo"]
    target_passenger = "Jake"

    print("=== Boarding check ===")
    for passenger in passenger_list:
        if passenger == target_passenger:
            print(f"Found {target_passenger}! Stopping search.")
            break
        print(f"Not {target_passenger}, checked {passenger}")

    # --- continue: skip invalid items, keep going ---
    raw_scores = [88, -5, 76, 0, 95, -1, 60]
    print("\n=== Valid scores only ===")
    for score in raw_scores:
        if score <= 0:          # invalid: skip this iteration entirely
            continue
        print(f"Recording score: {score}")

    # --- for...else: runs ONLY if the loop was NOT broken out of ---
    seating_chart = ["14A", "14B", "15A", "15B"]
    requested_seat = "16C"
    print("\n=== Seat search ===")
    for seat in seating_chart:
        if seat == requested_seat:
            print(f"Seat {requested_seat} found!")
            break
    else:
        # This block runs ONLY because we never hit 'break'
        print(f"Seat {requested_seat} is not available.")

    # --- break only affects the innermost loop ---
    # If you need to exit a nested loop early, use a function + return
    print("\n=== Nested loop: break exits inner only ===")
    found = False
    for row in range(3):
        for col in range(3):
            if row == 1 and col == 1:
                print(f"Found target at row={row}, col={col}")
                found = True
                break           # exits inner loop only; outer loop continues
        if found:
            break               # flag variable lets the outer loop exit too (works but verbose)

    # Cleaner approach: extract the nested search into a function
    print("\n=== Nested loop: function + return is cleaner ===")

    def find_in_grid(grid, target):
        for r, row in enumerate(grid):
            for c, value in enumerate(row):
                if value == target:
                    return r, c     # exits both loops immediately, no flag needed
        return None

    grid = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    result = find_in_grid(grid, 5)
    print(f"Found 5 at: {result}")
Many developers don't realise that for loops can have an else. Interviewers love asking about this because it signals you've gone beyond surface-level syntax knowledge. The key point: else runs when the loop finishes without a break. Think of it as 'completed without interruption.' A correct answer that also mentions the nested loop exit pattern, where break only exits the innermost loop and the clean fix is a function with return, will consistently stand out in a technical interview.
List Comprehensions: When a for Loop Fits on One Line
A list comprehension is a compact way to create a new list by transforming or filtering items from an existing collection. It isn't a separate concept from for loops — it IS a for loop, written in a single expression. Python evaluates it by running the loop internally and collecting the results into a new list.
The anatomy: [expression for item in iterable]. With filtering: [expression for item in iterable if condition]. The result is always a new list. The original iterable is untouched.
This matters because the most common for loop pattern in Python is the accumulator: create an empty list, loop, conditionally append. That's three moving parts, each of which can be written slightly wrong. The list comprehension collapses all three into one expression that's harder to get wrong and easier to read at a glance — once you're familiar with the syntax.
There's a readability threshold to keep in mind, though. A comprehension with a simple filter and transform is genuinely cleaner than the accumulator loop. A comprehension with complex nested logic or a multi-step transformation is not — at that point, write the loop explicitly. The goal is readable code, not compact code for its own sake.
Beyond lists, Python has dict comprehensions ({k: v for k, v in items}), set comprehensions ({x for x in items}), and generator expressions ((x for x in items)). Generator expressions deserve special attention: they produce one item at a time rather than building the entire result in memory. If you're passing the result directly to sum(), max(), any(), or all(), use a generator expression instead of a list comprehension — there's no reason to materialise the entire list just to hand it off and immediately discard it.
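With any() and all() there is a second benefit beyond memory: they stop consuming a generator expression as soon as the answer is known. The sketch below records which values were actually inspected; the is_negative helper and the readings list are hypothetical and exist only to make the evaluation order visible.

```python
checked = []

def is_negative(n):
    """Hypothetical predicate that logs every value it is asked about."""
    checked.append(n)
    return n < 0


readings = [12, 7, -3, 5, 9]

# Generator expression: any() stops at the first True result.
found_lazy = any(is_negative(n) for n in readings)
inspected_lazy = list(checked)

# List comprehension: every item is evaluated before any() even starts.
checked.clear()
found_eager = any([is_negative(n) for n in readings])
inspected_eager = list(checked)

print(found_lazy, inspected_lazy)
print(found_eager, inspected_eager)
```

Both forms return the same answer, but the generator version never looks at the values after -3.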
    import sys

    # --- The accumulator loop pattern (works, but verbose) ---
    raw_scores = [88, -5, 76, 0, 95, -1, 60, 45, 101]

    valid_scores_loop = []
    for score in raw_scores:
        if 0 < score <= 100:
            valid_scores_loop.append(score)
    print(f"Loop result: {valid_scores_loop}")

    # --- The same logic as a list comprehension ---
    valid_scores_comp = [score for score in raw_scores if 0 < score <= 100]
    print(f"Comprehension: {valid_scores_comp}")

    # --- Transformation: apply an operation to every item ---
    temps_celsius = [0, 20, 37, 100]
    temps_fahrenheit = [(c * 9/5) + 32 for c in temps_celsius]
    print(f"\nFahrenheit: {temps_fahrenheit}")

    # --- Filter AND transform in one line ---
    words = ["hello", "world", "python", "for", "loop"]
    # Uppercase only words longer than 4 characters
    result = [w.upper() for w in words if len(w) > 4]
    print(f"\nFiltered + uppercased: {result}")

    # --- Dict comprehension ---
    students = {"Alice": 92, "Bob": 78, "Carlos": 85}
    passing = {name: grade for name, grade in students.items() if grade >= 80}
    print(f"\nPassing students: {passing}")

    # --- Generator expression vs list comprehension: memory comparison ---
    list_comp = [x ** 2 for x in range(10_000)]    # builds entire list in memory upfront
    gen_expr = (x ** 2 for x in range(10_000))     # lazy: one value at a time
    print(f"\nList comprehension size: {sys.getsizeof(list_comp):,} bytes")
    print(f"Generator expression size: {sys.getsizeof(gen_expr)} bytes")

    # When summing, use a generator: no need to build the list first
    list_sum = sum([x ** 2 for x in range(10_000)])   # builds list, then immediately discards it
    gen_sum = sum(x ** 2 for x in range(10_000))      # constant memory: prefer this form
    print(f"\nBoth produce same sum: {list_sum == gen_sum}")
- List comprehension: [expr for item in iterable] builds the full list immediately, supports indexing and len(), and is reusable
- Generator expression: (expr for item in iterable) is lazy, yields one item at a time, is single-use, and cannot be indexed
- When passing to sum(), the outer parentheses already exist: sum(x**2 for x in range(n)) needs no extra brackets
- If the comprehension body needs multiple lines or complex logic, use a regular for loop; readability wins over compactness
- Nested comprehensions (list of lists) work but become hard to read quickly; use named for loops when nesting exceeds one level
A common waste is sum([x for x in large_list if condition]): building a full list in memory just to sum it and throw it away on the next line. sum(x for x in large_list if condition) is a generator expression with constant memory and an identical result. When passing a comprehension to sum(), any(), all(), max(), or min(), drop the square brackets. The outer function call already provides the parentheses you need.
Nested Loops and When to Avoid Them
A nested loop is a loop inside another loop. The outer loop runs once; for each of its iterations, the inner loop runs to completion. If the outer loop has 5 items and the inner loop has 5 items, you get 25 total iterations. If both loops have 1,000 items, you get 1,000,000. This O(n²) growth is the thing to watch — nested loops that perform fine on small development datasets can become genuinely painful on production data at scale.
The canonical correct use for a nested loop is two-dimensional data: a grid, a matrix, a table where you need to visit every cell. Outer loop for rows, inner loop for columns. That's the right tool for that structure.
The anti-pattern is using a nested loop when Python's standard library already has something that does the job more efficiently. The most common case: checking whether any item from list A exists in list B. With nested loops, every item in A is checked against every item in B — O(n²). Converting list B to a set first reduces each membership check to O(1), making the overall algorithm O(n). That difference is irrelevant at 50 items and enormous at 50,000.
For generating every combination of two sequences (every pairing of an item from one with an item from the other), itertools.product() is the clean alternative to nested loops. Same output, less code, and the intent ('I want the cartesian product') is explicit in the function name rather than implied by the loop structure.
    import time
    from itertools import product

    # --- Basic nested loop: multiplication table ---
    print("=== Multiplication Table (3x3) ===")
    for row in range(1, 4):          # outer loop: 3 iterations
        for col in range(1, 4):      # inner loop: 3 iterations per outer iteration = 9 total
            print(f"{row} x {col} = {row * col}")
        print()                      # blank line after each row

    # --- Working with a 2D grid ---
    print("=== Grid traversal ===")
    grid = [
        [1, 2, 3],
        [4, 5, 6],
        [7, 8, 9],
    ]
    for row_index, row in enumerate(grid):
        for col_index, value in enumerate(row):
            print(f"  grid[{row_index}][{col_index}] = {value}")

    # --- The O(n²) anti-pattern: nested loop membership check ---
    list_a = list(range(5_000))
    list_b = list(range(2_500, 7_500))   # overlaps with list_a in the middle

    # BAD: O(n²). 'in list_b' scans the whole list every single time
    start = time.perf_counter()
    bad_overlap = [x for x in list_a if x in list_b]
    bad_time = time.perf_counter() - start

    # GOOD: O(n). Convert list_b to a set once, then each 'in set_b' check is O(1)
    set_b = set(list_b)
    start = time.perf_counter()
    good_overlap = [x for x in list_a if x in set_b]
    good_time = time.perf_counter() - start

    print("\n=== Membership check performance ===")
    print(f"Nested loop (O(n²)): {bad_time*1000:.2f}ms, {len(bad_overlap)} overlapping items")
    print(f"Set lookup (O(n)): {good_time*1000:.2f}ms, {len(good_overlap)} overlapping items")

    # --- itertools.product(): cartesian product without nested loops ---
    colors = ["Red", "Blue"]
    sizes = ["S", "M", "L"]
    print("\n=== All colour/size combinations ===")
    for color, size in product(colors, sizes):   # cleaner than two nested for loops
        print(f"{color} - {size}")
- Nested loops on 100-item lists = 10,000 operations. On 10,000-item lists = 100,000,000 operations.
- If the inner loop body is just an 'in' check, convert the collection being searched to a set before the loop starts
- itertools.product() generates cartesian products with cleaner syntax and explicit intent — prefer it over nested for loops for combination generation
- Deeply nested loops — three levels or more — are almost always a signal the code needs to be broken into named functions
- Profile before optimising — sometimes the O(n²) loop is completely acceptable because n is always small in practice
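As a quick check on the claim that itertools.product() gives the same output as nested loops, here is a minimal sketch. The variable names are illustrative only; the repeat argument is part of the standard itertools.product() signature and pairs an iterable with itself.

```python
from itertools import product

# Nested loops: every (row, col) pair for a 3x3 table
nested_pairs = []
for row in range(1, 4):
    for col in range(1, 4):
        nested_pairs.append((row, col))

# product() with repeat=2 pairs the same range with itself
product_pairs = list(product(range(1, 4), repeat=2))

print(nested_pairs == product_pairs)  # True: same pairs, same order
```

The rightmost iterable advances fastest in product(), which matches the inner loop of the nested version.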
Converting the searched list to a set (set_b = set(list_b)) before the outer loop is one of the highest-impact single-line performance fixes you can make in Python.
Use itertools.product() for cartesian products instead of nested loops: same result, explicit intent.
Use enumerate() at both levels for clean index tracking.
Why Your Loop Might Be Why Your App Crashed: Memory and Performance Traps
That nested for loop you wrote in a hurry? It's now allocating a million intermediate lists. You didn't notice in dev because your test data had 50 rows. Production has 50,000.
Python's for loop is elegant but not free. Every iteration burns CPU, and if you're building unnecessary data structures inside the loop — like appending to a list in a nested loop — you're creating garbage collector pressure that will tank your response times.
The fix: Pre-allocate storage. Use generators instead of lists when you only need to iterate once. Profile before you optimize, but profile after you ship the first version.
The rule is brutal: if you're looping over a collection bigger than 10,000 items and doing anything more complex than a simple read, you need to ask yourself whether this logic belongs in a database query or a background task queue.
# io.thecodeforge
import time

# Slow: appends every result to a list, so all 50,000 values sit in memory
slow_data = []
for i in range(50_000):
    slow_data.append(i * 2)

# Fast: generator expression, no intermediate list
start = time.perf_counter()
fast_sum = sum(x * 2 for x in range(50_000))
elapsed = time.perf_counter() - start
print(f"Generator sum: {fast_sum}, took {elapsed:.4f}s")
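To make the memory difference concrete, the sketch below compares the container size of a list with that of an equivalent generator expression. It assumes CPython; sys.getsizeof() reports only the container itself (not the integers it refers to), and exact byte counts vary by Python version, so none are quoted here.

```python
import sys

numbers_list = [x * 2 for x in range(50_000)]  # all values stored at once
numbers_gen = (x * 2 for x in range(50_000))   # values produced on demand

list_bytes = sys.getsizeof(numbers_list)  # container only, not the int objects
gen_bytes = sys.getsizeof(numbers_gen)
print(f"list: {list_bytes:,} bytes, generator: {gen_bytes:,} bytes")

# Both produce the same total, but a generator can be consumed only once
list_total = sum(numbers_list)
gen_total = sum(numbers_gen)
second_pass = sum(numbers_gen)  # 0: the generator is already exhausted
print(list_total == gen_total, second_pass)
```

The single-use behaviour is the trade-off: choose a list when you need to iterate more than once.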
range(100_000_000) is fine: it's lazy. But list(range(100_000_000)) will eat 800MB+ of RAM and can crash your process.
Generators and range() are lazy. Use them.
Iterating While Mutating: The Quiet Way Your Dictionary Breaks
You're looping over a dictionary and deleting keys because 'it makes the code shorter'. Congratulations, you just triggered a RuntimeError: dictionary changed size during iteration. Python's iteration protocol assumes the collection doesn't change under it.
The WHY: dict iteration uses a hash table cursor. If you add or remove keys, that cursor is now pointing at garbage. The fix is not to catch the exception — it's to iterate over a copy.
Use list(dict.items()) to snapshot the items you need before you start deleting. Or use the keys_to_delete pattern: collect the keys in a list, then loop that list to delete. Same for sets.
Lists are slightly different — you can delete by index if you iterate backwards. But nobody remembers that at 2 AM during a Sev-1. Just copy the list and move on.
The junior mistake: thinking Python's for loop works like C. It doesn't. It uses an iterator object. Respect the iterator, or it will bite you.
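The two alternatives mentioned above, collecting keys before deleting and walking a list backwards by index, can be sketched as follows. The settings dict and scores list are hypothetical data chosen for illustration.

```python
# Hypothetical data, used only for illustration
settings = {"host": "db1", "port": 5432, "secret": "s3cr3t", "token": "abc"}

# Pattern 1: collect the keys first, delete afterwards
keys_to_delete = [key for key in settings if key in ("secret", "token")]
for key in keys_to_delete:
    del settings[key]
print(settings)  # {'host': 'db1', 'port': 5432}

# Pattern 2 (lists only): walk indexes backwards so a deletion
# never shifts an element that has not been visited yet
scores = [5, -1, -2, 8, -3]
for i in range(len(scores) - 1, -1, -1):
    if scores[i] < 0:
        del scores[i]
print(scores)  # [5, 8]
```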
# io.thecodeforge
config = {"host": "db1", "port": 5432, "debug": True, "secret": "s3cr3t"}

# This raises RuntimeError: dictionary changed size during iteration
# for key in config:
#     if key == "secret":
#         del config[key]

# Safe: iterate over a snapshot
for key in list(config.keys()):
    if key == "secret":
        del config[key]

print(config)  # {'host': 'db1', 'port': 5432, 'debug': True}
Iterate over list(collection) if you plan to delete.
Snapshot with list() or collect keys to delete in a separate list.
Silent Data Loss from Modifying a List During Iteration
The cleanup loop used for item in records: and called records.remove(item) inside the loop body when a condition matched. Removing an item shifts all subsequent items left by one, but Python's list iterator keeps advancing its position counter regardless. So the item that slides into the just-vacated slot gets skipped entirely on the next step. Every deletion caused one skip. With roughly half the records matching the delete condition, approximately half of them ended up being skipped, which explains the 47% figure almost exactly.
The immediate fix was for item in records[:]: to iterate over a shallow copy while modifying the original list. The better long-term fix replaced the entire pattern with a list comprehension, records = [r for r in records if not should_delete(r)], which is both safer and expresses intent more clearly. No mutation, no iterator gymnastics, and the code reads as a declarative statement of what you want to keep rather than a procedural description of what to remove.
- Never modify a list while iterating over it directly; use a copy or build a new list
- Silent bugs are worse than crashes — always verify loop results with assertions or counts after the fact
- List comprehensions are both safer and more Pythonic than in-place modification loops
- Add logging that reports expected vs. actual item counts after any cleanup operation — 'success' should mean something verifiable
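The failure and the verified fix can be reproduced in a few lines. This is a minimal sketch, not the original production code: should_delete() and the sample data are hypothetical stand-ins, with negative numbers playing the role of flagged records.

```python
def should_delete(record):
    # Hypothetical rule for illustration: flagged records are negative numbers
    return record < 0

original = [1, -1, -2, 3, -3, -4, 5]

# Buggy: removing while iterating skips the element after each removal
buggy = list(original)
for item in buggy:
    if should_delete(item):
        buggy.remove(item)
print(buggy)  # [1, -2, 3, -4, 5]: two of the four flagged records survived

# Safe: build a new list, then verify the result instead of trusting it
cleaned = [r for r in original if not should_delete(r)]
expected = len(original) - sum(1 for r in original if should_delete(r))
assert len(cleaned) == expected, f"expected {expected} records, got {len(cleaned)}"
assert not any(should_delete(r) for r in cleaned)
print(cleaned)  # [1, 3, 5]
```

The count check at the end is the "verifiable success" idea from the list above: the buggy version would fail it immediately.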
Use for item in list[:] or a list comprehension instead.
Use itertools.zip_longest() if you need to process all items from both sequences.
python -c "a=[1,2,3,4,5]; [a.remove(x) for x in a if x%2==0]; print(a)"
python -c "a=[1,2,3,4,5]; cleaned=[x for x in a if x%2!=0]; print(cleaned)"
Use for item in list[:] or a list comprehension to build a new list; never mutate what you're iterating over.
python -c "for x in [1,2,3]:
    if x==99: break
else:
    print('else ran: no break occurred')"
grep -n 'break' your_file.py | head -20
python -c "a=[1,2,3]; b=[1,2,3,4]; print(list(zip(a,b)))"
python -c "from itertools import zip_longest; a=[1,2,3]; b=[1,2,3,4]; print(list(zip_longest(a,b,fillvalue='N/A')))"
Use itertools.zip_longest() or assert len(a) == len(b) before zipping when data comes from external sources.
python -m cProfile -s cumtime your_script.py
python -c "import time; a=list(range(5000)); b=set(range(2500,7500)); t=time.perf_counter(); [x for x in a if x in b]; print(f'{(time.perf_counter()-t)*1000:.2f}ms')"
python -c "result = None
for x in []:
    result = x
print('result:', result)"
grep -n 'for.*in' your_file.py | grep -A1 'print(.*loop_var'
| Aspect | for Loop | while Loop |
|---|---|---|
| Best use case | Known collection or fixed number of iterations | Repeat until a condition changes — iteration count unknown upfront |
| Risk of infinite loop | Very low — stops automatically when the collection is exhausted | High — easy to forget updating the condition variable |
| Counting built-in | Yes, via range() or enumerate() — no manual counter needed | No — you manage your own counter and increment it yourself |
| Readability | Very readable — the iterable makes intent obvious at a glance | Can be harder to scan quickly, especially with complex conditions |
| Looping over a list | Natural — for item in my_list | Possible but awkward — requires manual index management |
| Typical example | Process every order in a cart; apply a discount to every item | Keep prompting user until valid input received; poll until connection ready |
| Memory efficiency | range() and zip() are lazy — constant memory regardless of iteration count | No built-in lazy generation — you manage memory yourself |
| Parallel iteration | zip() pairs two sequences cleanly and lazily | Requires manual index management for both sequences simultaneously |
Key takeaways
range() is lazy: 48 bytes regardless of size (on 64-bit CPython), and it supports indexing and membership checks without being converted to a list.
Use enumerate() instead of a manual counter variable or range(len()).
Use zip() to iterate over two sequences in lockstep instead of range(len()) on parallel lists. zip() stops at the shorter sequence silently; use itertools.zip_longest() when lengths may differ and silent data loss would be a problem.
When passing a comprehension to sum(), any(), all(), or max(), drop the square brackets and let Python stay lazy.
Common mistakes to avoid
6 patterns
Modifying a list while looping over it directly
Iterate over a copy with for item in my_list[:] or build a filtered version with a list comprehension: cleaned = [item for item in my_list if not should_remove(item)]. The list comprehension is the preferred approach: it's safer, reads as a declarative statement of what you want to keep, and doesn't mutate anything.
Expecting range(n) to include n in its output
Use range(1, n+1) to get 1 through n inclusive. When in doubt, print list(range(...)) to verify before using it in real logic.
Using the loop variable after the loop and trusting its value
Forgetting that for...else runs when the loop does NOT break
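A minimal sketch of this behaviour: the else block runs only when the loop finishes without hitting break, including when the loop body never runs at all. The find_user() function and the usernames are hypothetical.

```python
def find_user(usernames, target):
    """Return a message describing whether target was found."""
    for name in usernames:
        if name == target:
            result = f"found {name}"
            break
    else:
        # Runs only when the loop finished WITHOUT hitting break
        result = f"{target} not found"
    return result

print(find_user(["ana", "bo", "cy"], "bo"))   # found bo
print(find_user(["ana", "bo", "cy"], "dee"))  # dee not found
print(find_user([], "bo"))                    # bo not found (empty loop, no break)
```

Reading else as "no break" rather than "otherwise" is the mental model that avoids the mistake.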
Using range(len(my_list)) instead of enumerate() for indexed iteration
It only works on objects that support len(), and introduces manual index arithmetic that creates off-by-one opportunities on every use.
Use for i, item in enumerate(my_list): it works on any iterable, handles the index automatically, and reads as natural English. The range(len()) pattern is the C or Java index loop ported into Python where it doesn't belong.
Using zip() when the two sequences might have different lengths
Use itertools.zip_longest(a, b, fillvalue=None) when lengths might differ. Add an assertion before zipping when they must be equal: assert len(a) == len(b), f'Length mismatch: {len(a)} vs {len(b)}'. Fail loudly rather than silently losing data.
Interview Questions on This Topic
What is the difference between break and continue in a Python for loop, and can you give a practical example of when you'd use each?
break exits the loop entirely: no more iterations run after it fires. Use it when you've found what you were looking for and there's no point continuing. continue skips only the rest of the current iteration and moves immediately to the next one; the loop keeps going. Use it when you want to process most items but skip ones that fail a condition without adding a nesting level. Example: break when you find a target username in a list; continue to skip negative scores when computing an average of valid entries. One important detail: both break and continue only affect the innermost loop they live inside. In nested loops, break exits the inner loop and returns control to the outer one, not both.
Python's for loop has an else clause: what does it do, and when does the else block actually execute versus when does it get skipped?
If you need both the index and the value when iterating over a list, what are two ways to do it and which is considered more Pythonic — and why?
Two ways: (1) enumerate(my_list), which yields (index, value) tuples; this is the Pythonic way. (2) range(len(my_list)) with my_list[i] accessed manually. enumerate() is preferred because it is cleaner, avoids manual index arithmetic, works on any iterable (not just sequences that support len()), and communicates intent directly. The range(len()) pattern is the C and Java index loop imported into a language that doesn't need it: it works, but it's fighting the language.
What happens if you modify a list while iterating over it with a for loop in Python? Why does this happen, and how do you fix it?
Removing an item shifts the later items left while the iterator keeps advancing its position, so the item that slides into the vacated slot is skipped. Fix: iterate over a copy (for item in my_list[:]) so modifications to the original don't affect the iteration, or build a filtered version with a list comprehension: result = [x for x in my_list if not should_remove(x)]. The list comprehension is generally preferred: it expresses intent clearly and eliminates mutation entirely.
What is zip() and what happens when the two sequences you pass to it have different lengths?
zip() pairs items from two or more iterables position by position, replacing the for i in range(len(a)): index-based parallel loop. When the sequences have different lengths, zip() stops as soon as the shortest one is exhausted; items from the longer sequence are silently dropped with no error or warning. This is the correct behaviour when lengths are known to match, but a silent data loss bug when they might not. Fix: use itertools.zip_longest(a, b, fillvalue=default) to process all items from both sequences, or assert lengths match before zipping when data comes from external sources.
What is the difference between a list comprehension and a generator expression, and when would you choose one over the other?
A list comprehension [expr for item in iterable] builds the entire result list in memory immediately. A generator expression (expr for item in iterable) is lazy: it produces one value at a time without ever storing the full result. Use a list comprehension when you need random access by index, will iterate over the result more than once, or need len(). Use a generator expression when passing the result directly to sum(), max(), any(), all(), or similar single-pass consumers; it uses constant memory regardless of input size. The most common mistake I see in code reviews: sum([x**2 for x in range(1_000_000)]) builds an 8MB list just to sum it and throw it away. The fix is dropping the square brackets: sum(x**2 for x in range(1_000_000)) gives the same result with a small, fixed-size generator object instead of the list.
Frequently Asked Questions
Yes, completely. A string in Python is a sequence of characters, so for char in 'hello' will visit 'h', 'e', 'l', 'l', 'o' one character at a time. This is handy for character-level processing: counting vowels, checking whether a string is a palindrome, scanning for invalid characters in an identifier. Strings are iterable, so they work anywhere a list would work in a for loop.
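Two of the character-level tasks mentioned above can be sketched like this. The helper names count_vowels() and is_palindrome() are made up for the example, and the palindrome check assumes that case, spaces, and punctuation should be ignored.

```python
def count_vowels(text):
    # Iterating a string yields one character at a time
    return sum(1 for char in text if char.lower() in "aeiou")

def is_palindrome(text):
    # Assumption: ignore case and any non-alphanumeric characters
    letters = [char.lower() for char in text if char.isalnum()]
    return letters == letters[::-1]

print(count_vowels("hello"))               # 2
print(is_palindrome("Never odd or even"))  # True
print(is_palindrome("hello"))              # False
```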
Use a for loop when you know in advance what you're iterating over — a list, a range of numbers, a string, a file, a database result set. Use a while loop when you want to keep repeating until some condition changes and you don't know upfront how many iterations that will take — like reading user input until it's valid, retrying a network request until it succeeds, or polling a queue until it's empty. In everyday practice, for loops cover the vast majority of repetition tasks. While loops are the right tool when iteration count is genuinely unknown.
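A short sketch of the distinction, using hypothetical order prices and job names. The for loop walks a collection that is known up front; the while loop drains a queue whose final length is not known in advance because work can be added while it runs.

```python
from collections import deque

# for: the iterable decides when the loop stops
order_prices = [20, 5, 42]
total = 0
for price in order_prices:
    total += price
print(total)  # 67

# while: repeat until a condition changes; the iteration count is
# not known up front because jobs can be added during processing
queue = deque(["job-1", "job-2", "job-3"])
processed = []
while queue:
    job = queue.popleft()
    processed.append(job)
    if job == "job-1":
        queue.append("job-1-retry")  # work added mid-loop
print(processed)  # ['job-1', 'job-2', 'job-3', 'job-1-retry']
```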
Python was designed from the start to be readable by humans. Indentation makes the structure of the code visually obvious without requiring extra punctuation — everything indented at the same level below the for line is part of the loop body, and the moment indentation returns to the previous level, you're outside the loop. This enforces consistent formatting across all Python code and makes the visual structure match the logical structure. Once you're used to it, reading deeply nested code without curly braces is genuinely easier than reading the brace-heavy equivalent.
The underscore is a convention that signals 'I intentionally don't need this variable.' Python still creates it and assigns it on each iteration, but naming it underscore tells human readers (and tools like linters and type checkers) that the loop variable is unused by design. Use it when you just need to repeat something N times and don't care about the counter value: for _ in range(5): do_something()
Use zip(). Write for item_a, item_b in zip(list_a, list_b) to pair items from both lists together, one pair per iteration. zip() is lazy, memory-efficient, and reads clearly: the pairing intent is explicit. If your lists might have different lengths and you can't afford to silently drop items from the longer one, use itertools.zip_longest(list_a, list_b, fillvalue=None) instead.
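The difference shows up as soon as the lists are unequal. In this sketch the names and scores are invented sample data, with one score deliberately missing.

```python
from itertools import zip_longest

names = ["Ana", "Bo", "Cy"]
scores = [90, 85]  # one score missing

paired = list(zip(names, scores))
print(paired)
# [('Ana', 90), ('Bo', 85)]  'Cy' is dropped silently

padded = list(zip_longest(names, scores, fillvalue=None))
print(padded)
# [('Ana', 90), ('Bo', 85), ('Cy', None)]
```

With zip_longest() the missing value becomes visible as None, so downstream code can decide how to handle it instead of never seeing the record.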
A list comprehension is a for loop that produces a new list, written as a single expression: [x * 2 for x in numbers if x > 0]. A regular for loop is more general — it can perform side effects like printing, writing to files, updating databases, or calling external APIs. The practical rule: use a list comprehension when the purpose is to transform or filter a collection into a new list, and the expression fits cleanly on one line. Use a regular for loop when the body has side effects or is complex enough that one-lining it would hurt readability. If you're building a list just to pass it immediately to sum() or a similar function, use a generator expression instead — same result, constant memory.
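The three cases described above, side by side, on a small made-up list of numbers:

```python
numbers = [3, -1, 4, -5, 9]

# Transform and filter into a new list: list comprehension
doubled = [x * 2 for x in numbers if x > 0]
print(doubled)  # [6, 8, 18]

# Side effects per item: regular for loop
for x in numbers:
    if x < 0:
        print(f"skipping negative value {x}")

# Single-pass aggregate: generator expression, no intermediate list
total = sum(x * 2 for x in numbers if x > 0)
print(total)  # 32
```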