Java HashMap — Non-Atomic Resize Causes Infinite Loops
Rate limit counters corrupted as HashMap resize in concurrent put() triggered infinite loops.
20+ years shipping production Java in banking & fintech. Written from production experience, not tutorials.
- HashMap stores key-value pairs using hashing for O(1) average lookups
- Internally uses array of buckets, converts to tree when collisions exceed 8
- Must override hashCode() and equals() for custom keys
- Not thread-safe — ConcurrentHashMap is the production replacement
- load factor 0.75 balances memory and speed; pre-size to avoid rehashing
- Null keys allowed (stored in bucket 0), null values allowed
Java's HashMap is a hash-table-based implementation of the Map interface that stores key-value pairs using an array of buckets, each containing a linked list or tree (since Java 8) to handle collisions. It exists to provide near-constant-time O(1) average performance for put and get operations, making it the default choice for most in-memory key-value storage in Java applications.
The critical problem this article addresses is that HashMap's resize operation is not atomic — when multiple threads trigger a resize concurrently, the internal linked list can form a circular reference during the rehashing process, causing an infinite loop on subsequent get or put calls. This is a production crash scenario that has taken down countless Java services, particularly before ConcurrentHashMap became the standard for multi-threaded access.
In the Java ecosystem, HashMap is the workhorse map implementation used by virtually every application, from Spring Boot microservices to Android apps. Its alternatives include Hashtable (synchronized but obsolete), ConcurrentHashMap (thread-safe with better performance), and sorted/tree-based maps like TreeMap.
You should not use HashMap when you need thread safety without external synchronization, when insertion order matters (use LinkedHashMap), or when you require sorted keys (use TreeMap). The resize crash is specifically a concurrency issue — single-threaded usage is safe, but in production systems where maps are shared across threads, this bug has been a notorious source of hard-to-reproduce outages.
The root cause lies in HashMap's internal mechanics: when the number of entries exceeds the load factor (default 0.75) times the capacity, the map doubles its bucket array and rehashes all entries. During rehash, entries from the same bucket are transferred using head insertion (pre-Java 8), which reverses the linked list order.
If two threads execute this simultaneously, one thread's rehashing can create a cycle in the list, causing infinite loops when iterating or looking up keys. Java 8 mitigated this by switching to tail insertion and treeifying long buckets, but the fundamental non-atomic resize remains — ConcurrentHashMap was designed specifically to solve this with lock striping and atomic operations.
Imagine a massive cloakroom at a concert venue. When you hand over your jacket, the attendant gives you a numbered ticket — say, ticket 42. Later, you hand back ticket 42 and instantly get your jacket, no searching required. A HashMap works exactly like that: you store something under a 'key' (the ticket number), and retrieving it later is near-instant because Java uses that key to jump straight to the right spot. No looping through everything. Just give the key, get the value.
Every real application manages data — user sessions, product prices, word counts, config settings. The moment you need to look something up by a label rather than a position, a plain array or list starts to hurt. You're stuck looping through every element hoping to find a match, and as your data grows, so does your wait time. Java's HashMap is built to eliminate exactly that pain.
What HashMap Actually Is — and Why Its Resize Can Crash You
A HashMap is a hash-table implementation of the Map interface that stores key-value pairs using an array of buckets, each bucket holding a linked list or tree. The core mechanic: a key's hashCode() determines the bucket index; if two keys land in the same bucket, they're chained. Lookup is O(1) on average, O(n) in the worst case when all keys collide. The table dynamically resizes when the load factor (default 0.75) is exceeded — this is where the trouble starts. Resize creates a new, larger array and rehashes every entry into it. In Java 7 and earlier, this rehash uses head-insertion into the new bucket's linked list, which reverses the list order. Under concurrent writes, two threads resizing simultaneously can create a circular linked list. Once that cycle exists, any subsequent get() or put() that traverses that bucket enters an infinite loop — the JVM hangs, CPU pegs at 100%, and the thread never returns. This is not a theoretical bug; it's a proven production killer in high-throughput systems using unsynchronized HashMaps. Use ConcurrentHashMap for shared mutable state, or synchronize externally. Never treat HashMap as thread-safe — its non-atomic resize is the landmine.
HashMap.get() — all waiting on the same bucket's infinite loop.get().How HashMap Actually Stores Data Internally (The Part Most Tutorials Skip)
When you call put(key, value), Java doesn't just dump your data into a list. It calls hashCode() on the key, applies a secondary hash to spread values evenly, then uses the result to pick a 'bucket' — essentially a slot in an internal array. Think of it like sorting mail into pigeonholes: every letter (key-value pair) goes to a specific hole, so retrieval is O(1) on average.
But what if two keys land in the same bucket? That's a hash collision. Java handles this with a linked list inside the bucket. From Java 8 onwards, if one bucket accumulates more than 8 entries, Java converts that list into a red-black tree, dropping worst-case lookup from O(n) to O(log n). This is why your HashMap stays fast even when things get crowded.
Understanding this matters the moment you write a custom class as a key. If your hashCode() is bad — say, always returning 1 — everything piles into a single bucket and your 'fast' map becomes a slow list in disguise.
import java.util.HashMap; import java.util.Map; public class HashMapInternalsDemo { public static void main(String[] args) { // A product price catalogue — a perfect real-world HashMap use case Map<String, Double> productPrices = new HashMap<>(); // put() computes hashCode("Laptop"), finds the right bucket, stores the pair productPrices.put("Laptop", 999.99); productPrices.put("Headphones", 149.49); productPrices.put("USB-C Hub", 39.95); productPrices.put("Webcam", 89.00); // get() uses the same hash logic to jump straight to the value — no loop double laptopPrice = productPrices.get("Laptop"); System.out.println("Laptop price: $" + laptopPrice); // getOrDefault() is safer — avoids NullPointerException if key is missing double micPrice = productPrices.getOrDefault("Microphone", 0.0); System.out.println("Microphone price (not in catalogue): $" + micPrice); // containsKey() is O(1) — use it to check before acting if (productPrices.containsKey("Webcam")) { System.out.println("Webcam is stocked."); } // Iterating over entries — entrySet() is the most efficient way System.out.println("\n--- Full Product Catalogue ---"); for (Map.Entry<String, Double> entry : productPrices.entrySet()) { System.out.printf("%-15s $%.2f%n", entry.getKey(), entry.getValue()); } // HashMap size grows dynamically — default initial capacity is 16, load factor 0.75 System.out.println("\nTotal products: " + productPrices.size()); } }
Objects.hash()String.hashCode() is well-distributed, no custom key neededHashMap.get()The hashCode and equals Contract — Why Breaking It Destroys Your Map
If you ever use a custom object as a HashMap key, you must override both hashCode() and equals(). This is non-negotiable, and it's the most common advanced mistake people make.
Here's the rule: if two objects are equal according to equals(), they MUST return the same hashCode(). If you break this, Java puts the same logical key into different buckets and you end up with duplicate entries or, worse, you can never retrieve your data again.
The reverse is fine — two different objects can share a hash code (collision), but equal objects must share a hash. IDE plugins like IntelliJ and Eclipse can auto-generate correct implementations. Always use them, or use Java's built-in Objects.hash() utility which does the heavy lifting safely.
This contract is also why String and Integer work perfectly as HashMap keys out of the box — their hashCode() and equals() are already correctly implemented in the JDK.
import java.util.HashMap; import java.util.Map; import java.util.Objects; // Imagine a system tracking inventory per warehouse location class WarehouseLocation { private final String city; private final int aisle; public WarehouseLocation(String city, int aisle) {\n this.city = city;\n this.aisle = aisle;\n } // Without overriding equals(), two WarehouseLocation("London", 3) objects // are NOT considered equal — Java uses reference equality by default @Override public boolean equals(Object other) { if (this == other) return true; if (!(other instanceof WarehouseLocation)) return false; WarehouseLocation that = (WarehouseLocation) other; return this.aisle == that.aisle && Objects.equals(this.city, that.city); } // Without overriding hashCode(), equal objects can land in DIFFERENT buckets // Objects.hash() combines fields safely into a well-distributed hash code @Override public int hashCode() { return Objects.hash(city, aisle); } @Override public String toString() { return city + ", Aisle " + aisle; } } public class CustomKeyHashMapDemo {\n\n public static void main(String[] args) {\n\n Map<WarehouseLocation, Integer> stockLevels = new HashMap<>();\n\n WarehouseLocation londonAisle3 = new WarehouseLocation(\"London\", 3);\n stockLevels.put(londonAisle3, 250);\n\n // This is a DIFFERENT object in memory, but logically the same location\n WarehouseLocation sameLocation = new WarehouseLocation(\"London\", 3);\n\n // This works ONLY because we correctly overrode hashCode() and equals()\n Integer stock = stockLevels.get(sameLocation);\n System.out.println(\"Stock at \" + sameLocation + \": \" + stock + \" units\");\n\n // Updating stock — put() with an existing key replaces the old value\n stockLevels.put(londonAisle3, 180);\n System.out.println(\"Updated stock: \" + stockLevels.get(sameLocation) + \" units\");\n\n System.out.println(\"Map size (should be 1, not 2): \" + stockLevels.size());\n }\n}", "output": "Stock at London, Aisle 3: 250 units\nUpdated stock: 180 units\nMap size (should be 1, not 2): 1" }
HashMap Resizing and Load Factor — Why Initial Capacity Matters in Production
HashMaps start with a default initial capacity of 16 buckets and a load factor of 0.75. When the number of entries exceeds capacity × load factor (16 × 0.75 = 12), the map resizes to double its current size (32, then 64, etc.). Resizing is expensive: it allocates a new array and rehashes every existing entry into the new buckets. During this process, all concurrent access to the map can cause corruption if not synchronized.
In production, resizing is often a hidden performance cost. If you're loading 10,000 entries into a HashMap with default settings, it will resize about 9 times (16→32→64→128→256→512→1024→2048→4096→8192→16384 — actually 10 resizes to reach capacity above 10000). Each resize copies and rehashes all existing entries. You can eliminate these resizes by pre-sizing the map.
The formula for initial capacity: expectedEntries / loadFactor. For 10,000 entries and 0.75 load factor, that's 10,000 / 0.75 ≈ 13,333, round up to nearest power of 2 = 16,384. Pass that to the constructor: new HashMap<>(16384).
Resizing is not just slow — it doubles memory usage momentarily because the old array and new array exist simultaneously. This can cause OOM in memory-constrained environments.
import java.util.HashMap; import java.util.Map; public class HashMapResizingDemo { public static void main(String[] args) { // Simulating bulk load of 1 million entries // Without pre-sizing: many resizes long start = System.currentTimeMillis(); Map<Integer, String> map = new HashMap<>(); for (int i = 0; i < 1_000_000; i++) { map.put(i, "value" + i); } System.out.println("Default init: " + (System.currentTimeMillis() - start) + "ms"); // With pre-sizing: expected 1M / 0.75 = 1,333,333 -> next power of 2 = 2,097,152 start = System.currentTimeMillis(); Map<Integer, String> preSized = new HashMap<>(2_097_152); for (int i = 0; i < 1_000_000; i++) { preSized.put(i, "value" + i); } System.out.println("Pre-sized: " + (System.currentTimeMillis() - start) + "ms"); } }
HashMap vs Hashtable: Key Differences and Migration Guide
If you're maintaining a legacy codebase, you've probably seen java.util.Hashtable. It was the first thread-safe map in Java, but it has serious flaws. Every method in Hashtable is synchronized on a single lock, making it a bottleneck under concurrency. It also forbids null keys and null values. HashMap fixes all of that — and then some.
Here's the side-by-side comparison:
| Feature | HashMap | Hashtable |
|---|---|---|
| Thread safety | Not thread-safe (use ConcurrentHashMap) | Thread-safe (synchronized methods) |
| Null keys | Allows one null key | No null keys |
| Null values | Allows multiple null values | No null values |
| Performance (single-threaded) | Excellent | Slow due to synchronization overhead |
| Performance (multi-threaded) | Unsafe; use ConcurrentHashMap | Poor (single lock contention) |
| Iteration behavior | Fail-fast (ConcurrentModificationException) | Fail-fast |
| Introduced | Java 1.2 | Java 1.0 (legacy) |
| Legacy enumeration | No | Yes (elements(), keys()) |
In modern Java, you should never use Hashtable in new code. Replace it with HashMap for single-threaded contexts or ConcurrentHashMap for multi-threaded. The migration is straightforward: change the class and remove any workarounds for null keys (e.g., replacing null with a sentinel value). If you rely on Hashtable's fail-safe behavior from its synchronized methods, note that ConcurrentHashMap gives much better throughput while still being safe.
package io.thecodeforge.hashmap; import java.util.*; public class HashMapVsHashtableDemo { public static void main(String[] args) { // Hashtable — no null keys Hashtable<String, Integer> ht = new Hashtable<>(); try { ht.put(null, 1); // throws NullPointerException } catch (NullPointerException e) { System.out.println("Hashtable forbids null keys"); } // HashMap — allows null key Map<String, Integer> hm = new HashMap<>(); hm.put(null, 1); System.out.println("HashMap null key value: " + hm.get(null)); // Performance comparison (single-threaded) long start = System.nanoTime(); for (int i = 0; i < 100_000; i++) { hm.put("key" + i, i); } long hmTime = System.nanoTime() - start; start = System.nanoTime(); for (int i = 0; i < 100_000; i++) { ht.put("key" + i, i); } long htTime = System.nanoTime() - start; System.out.printf("HashMap time: %d ns\nHashtable time: %d ns\n", hmTime, htTime); } }
get() returning null for missing keys, that still works. But if you relied on Hashtable throwing an exception on null keys, you'll need to add explicit validation.HashMap vs TreeMap vs LinkedHashMap: Choosing the Right Map
HashMap gives you speed. TreeMap gives you order. LinkedHashMap gives you both order and good speed. But each has tradeoffs that matter in production.
Here's a quick comparison:
| Feature | HashMap | LinkedHashMap | TreeMap |
|---|---|---|---|
| Ordering | None (arbitrary) | Insertion order | Natural ordering or Comparator |
| Null keys | Allowed (one) | Allowed (one) | Not allowed (NPE) |
| Null values | Allowed | Allowed | Allowed |
| Performance get/put/remove | O(1) avg | O(1) avg | O(log n) |
| Memory overhead | Low | Medium (doubly-linked list) | High (tree nodes) |
| Best use case | General fast lookup | LRU cache, audit logs | Sorted data, range queries |
| Thread safety | No | No | No |
When to choose each: - HashMap: Default choice when you only need fast key-value access and don't care about order. - LinkedHashMap: Choose when you need predictable iteration order (e.g., display order in a UI). Also perfect for building an LRU cache by overriding removeEldestEntry(). - TreeMap: Choose when you need keys sorted (e.g., alphabetical menu items, or to answer range queries like 'get all users with names between A and M'). The O(log n) performance is still very fast for typical map sizes.
Important: If you need order but still want O(1) performance, LinkedHashMap is the sweet spot. If you need sorted order and are willing to pay O(log n) for it, TreeMap is your friend. Never insert into a HashMap and then sort — that's wasteful.
package io.thecodeforge.hashmap; import java.util.*; public class MapOrderingDemo { public static void main(String[] args) { Map<String, Integer> hashMap = new HashMap<>(); Map<String, Integer> linkedHashMap = new LinkedHashMap<>(); Map<String, Integer> treeMap = new TreeMap<>(); List<String> keys = Arrays.asList("Charlie", "Alpha", "Bravo"); for (String key : keys) { hashMap.put(key, key.length()); linkedHashMap.put(key, key.length()); treeMap.put(key, key.length()); } System.out.println("HashMap (arbitrary): " + hashMap); System.out.println("LinkedHashMap (insertion): " + linkedHashMap); System.out.println("TreeMap (sorted): " + treeMap); // LinkedHashMap as LRU cache LinkedHashMap<String, Integer> lru = new LinkedHashMap<String, Integer>(16, 0.75f, true) { @Override protected boolean removeEldestEntry(Map.Entry<String, Integer> eldest) {\n return size() > 3;\n } }; lru.put("A", 1); lru.put("B", 2); lru.put("C", 3); lru.get("A"); // Access A to make it most recent lru.put("D", 4); // Triggers removal of eldest (B was not accessed recently) System.out.println("LRU cache (access-order): " + lru); } }
HashMap vs ConcurrentHashMap vs Hashtable — The Concurrency Showdown
HashMap is not thread-safe. ConcurrentHashMap is. That's the headline. But there's nuance.
Hashtable (legacy) synchronizes every method with a single lock — essentially a synchronized wrapper around HashMap. Throughput under concurrency is terrible because threads queue for the same lock.
ConcurrentHashMap, from Java 8 onwards, uses a Node array with CAS (Compare-And-Swap) operations for common paths like get() and put(). Internally, it divides the map into bins and only locks individual bins during write operations (Java 8+ uses synchronized on the first Node of each bin, but still fine-grained). Reads are lock-free. This gives much higher throughput.
However, ConcurrentHashMap's size() and isEmpty() can be expensive because they sum per-bucket counts. Also, iteration is weakly consistent — it may or may not reflect the latest concurrent updates. You cannot use ConcurrentHashMap for operations that need a consistent snapshot without external synchronization.
Another critical point: ConcurrentHashMap does not allow null keys or null values. This is by design to prevent ambiguity in multi-threaded contexts (e.g., get returning null could mean absent or null value).
package io.thecodeforge; import java.util.*; import java.util.concurrent.*; public class ConcurrencyComparisonDemo { public static void main(String[] args) throws Exception { // Single-writer, many-reader benchmark (simplified) int threads = 10; int iterations = 100_000; ExecutorService pool = Executors.newFixedThreadPool(threads); // Hashtable Map<Integer, Integer> ht = new Hashtable<>(); long start = System.nanoTime(); testConcurrentMap(ht, threads, iterations, pool); System.out.println("Hashtable: " + (System.nanoTime() - start) / 1_000_000 + "ms"); // ConcurrentHashMap Map<Integer, Integer> chm = new ConcurrentHashMap<>(); start = System.nanoTime(); testConcurrentMap(chm, threads, iterations, pool); System.out.println("ConcurrentHashMap: " + (System.nanoTime() - start) / 1_000_000 + "ms"); // Synchronized HashMap Map<Integer, Integer> shm = Collections.synchronizedMap(new HashMap<>()); start = System.nanoTime(); testConcurrentMap(shm, threads, iterations, pool); System.out.println("Synchronized HashMap: " + (System.nanoTime() - start) / 1_000_000 + "ms"); pool.shutdown(); } static void testConcurrentMap(Map<Integer, Integer> map, int threads, int iterations, ExecutorService pool) throws Exception {\n CountDownLatch latch = new CountDownLatch(threads);\n for (int t = 0; t < threads; t++) {\n pool.submit(() -> {\n for (int i = 0; i < iterations; i++) {\n map.put(i, i);\n map.get(i);\n } latch.countDown(); }); } latch.await(); } }
Collections.unmodifiableMap() with re-creationHashMap in Real-World Patterns — Beyond Basic CRUD
HashMap is the workhorse for many production patterns beyond simple store/retrieve. The most powerful methods to know: computeIfAbsent, merge, and putIfAbsent. These replace the old pattern of "check if key exists, then do something" with atomic operations.
Pattern 1: Frequency Counting. Use merge() to count occurrences. The lambda Integer::sum adds 1 for each occurrence, handling both first and subsequent cases.
Pattern 2: Grouping. computeIfAbsent creates a list for a new key on first access, then returns the existing list for subsequent accesses. This eliminates null checks and temporary variables.
Pattern 3: Default configuration. putIfAbsent sets a value only if the key is absent. Great for loading defaults without overwriting user-provided values.
Pattern 4: Two-level caches. Use Map<Key1, Map<Key2, Value>> with computeIfAbsent on the outer map to lazily create inner maps. This avoids pre-populating all inner maps.
Pattern 5: LRU cache. Use LinkedHashMap with access-order and override removeEldestEntry. This is a simple way to implement a bounded cache without external libraries.
package io.thecodeforge; import java.util.*; import java.util.stream.Collectors; public class HashMapPatternsDemo { public static void main(String[] args) { // ── PATTERN 1: Frequency Counting ────────────────────────────────── List<String> orderCategories = Arrays.asList( "Electronics", "Books", "Electronics", "Clothing", "Books", "Electronics", "Books", "Clothing", "Toys" ); Map<String, Integer> categoryCount = new HashMap<>(); for (String category : orderCategories) { categoryCount.merge(category, 1, Integer::sum); } System.out.println("Order counts by category:"); categoryCount.forEach((c, cnt) -> System.out.println(" " + c + ": " + cnt)); // ── PATTERN 2: computeIfAbsent for Grouping ──────────────────────── Map<String, List<Integer>> ordersByRegion = new HashMap<>(); String[][] rawOrders = { {"North", "1001"}, {\"South\", \"1002\"}, {\"North\", \"1003\"},\n {\"East\", \"1004\"}, {\"South\", \"1005\"}, {\"North\", \"1006\"}\n };\n for (String[] order : rawOrders) {\n ordersByRegion.computeIfAbsent(order[0], r -> new ArrayList<>()).add(Integer.parseInt(order[1]));\n }\n System.out.println(\"\\nOrders grouped by region:\");\n ordersByRegion.forEach((r, ids) -> System.out.println(\" \" + r + \": \" + ids));\n\n // ── PATTERN 3: putIfAbsent for Default Configs ─────────────────────\n Map<String, String> config = new HashMap<>();\n config.put(\"timeout\", \"30s\");\n config.putIfAbsent(\"timeout\", \"60s\"); // ignored\n config.putIfAbsent(\"retries\", \"3\");\n System.out.println(\"\\nConfig:\");\n config.forEach((k,v) -> System.out.println(\" \" + k + \" = \" + v));\n\n // ── PATTERN 4: Two-level cache ─────────────────────────────────────\n Map<String, Map<Integer, String>> twoLevelCache = new HashMap<>();\n // Automatically creates inner map for \"users\" on first access\n twoLevelCache.computeIfAbsent(\"users\", k -> new HashMap<>()).put(1, \"Alice\");\n twoLevelCache.computeIfAbsent(\"users\", k -> new HashMap<>()).put(2, \"Bob\");\n System.out.println(\"Two-level cache: \" + twoLevelCache);\n }\n}", "output": "Order counts by category:\n Books: 3\n Toys: 1\n Clothing: 2\n Electronics: 3\n\nOrders grouped by region:\n South: [1002, 1005]\n North: [1001, 1003, 1006]\n East: [1004]\n\nConfig:\n timeout = 30s\n retries = 3\n\nTwo-level cache: {users={1=Alice, 2=Bob}}" }
Mastering HashMap.merge() for Atomic Accumulation
The merge() method is the cleanest way to combine a new value with an existing one. It's perfect for counting, summing, or building collections. The signature is: merge(K key, V value, BiFunction remappingFunction). If the key is absent, it simply puts the value. If the key is present, it applies the remapping function to the old value and the new value, and stores the result. If the result is null, the key is removed.
This is a game-changer for frequency counting. Before Java 8, you wrote: if (map.containsKey(key)) { map.put(key, map.get(key) + 1); } else { map.put(key, 1); } With merge, it's one line: map.merge(key, 1
package io.thecodeforge.hashmap; import java.util.*; import java.util.concurrent.ConcurrentHashMap; import java.util.concurrent.atomic.AtomicInteger; public class MergeMethodDemo { public static void main(String[] args) { // Word frequency counter — one line per word String[] words = {"apple", "banana", "apple", "cherry", "banana", "apple"}; Map<String, Integer> freq = new HashMap<>(); for (String word : words) { freq.merge(word, 1, Integer::sum); } System.out.println("Word frequencies: " + freq); // Summing scores per player (multiple logs) Map<String, Integer> totalScores = new HashMap<>(); totalScores.merge("Alice", 10, Integer::sum); totalScores.merge("Bob", 20, Integer::sum); totalScores.merge("Alice", 5, Integer::sum); System.out.println("Total scores: " + totalScores); // Concurrent use — thread-safe accumulation Map<String, AtomicInteger> concurrentFreq = new ConcurrentHashMap<>(); // With merge and AtomicInteger (not needed if using Integer and merge) // Actually, merge with Integer is atomic on ConcurrentHashMap // Simulate concurrent updates Runnable task = () -> { for (int i = 0; i < 1000; i++) { concurrentFreq.merge("sharedKey", 1, Integer::sum); } }; Thread t1 = new Thread(task); Thread t2 = new Thread(task); t1.start(); t2.start(); try { t1.join(); t2.join(); } catch (InterruptedException e) {} System.out.println("Concurrent merge result: " + concurrentFreq.get("sharedKey") + " (expected 2000)"); // Removing a key via returning null Map<String, String> map = new HashMap<>(); map.put("temp", "value"); map.merge("temp", "irrelevant", (old, val) -> null); // removes key System.out.println("After merge-to-null: " + map.containsKey("temp")); // false } }
merge() can do it too but is less readable.Advantages and Disadvantages of HashMap
HashMap is the most used collection in Java, but it's not perfect for every situation. Understanding its strengths and weaknesses helps you decide when to use it and when to reach for an alternative.
| Advantages | Disadvantages |
|---|---|
O(1) average time for get() and put() — extremely fast for lookups | Not thread-safe — requires external synchronization or ConcurrentHashMap |
| Allows null keys (one) and null values — flexible for initialization patterns | No ordering guarantee — iteration order is unpredictable and can change after resizing |
| Dynamic resizing — automatically grows as entries are added | Resizing is expensive — O(N) rehash and memory spike; must pre-size for bulk loads |
| Rich API — computeIfAbsent, merge, putIfAbsent simplify common patterns | Poor worst-case performance — if hashCode() is bad, degrades to O(n) (mitigated by treeification in Java 8+) |
| Low memory overhead compared to TreeMap or LinkedHashMap | No range queries — can't efficiently find all keys between two values (use TreeMap) |
| Fail-fast iteration — detects concurrent modification early | Mutable keys cause memory leaks — if you modify a key after insertion, the entry becomes permanently unreachable |
In production, the disadvantages become critical when you ignore them. The not-thread-safe issue is the most common cause of production outages (see the production incident in this article). The high cost of resizing is the second most common performance issue. Always design with these tradeoffs in mind.
Practice Problems: Sharpen Your HashMap Skills
The best way to master HashMap is to use it in real scenarios. Here are five curated problems that cover the most common patterns you'll encounter in interviews and production code.
1. Word Frequency Counter Given a string, return a map of word -> count. Use merge() for a one-liner.
2. Anagram Detector Given two strings, determine if they are anagrams. Use a HashMap to count character frequencies and compare.
3. LRU Cache (Simple) Implement a fixed-size cache that evicts the least recently accessed item. Use LinkedHashMap with access-order and removeEldestEntry.
4. Group Anagrams Given an array of strings, group anagrams together. Use a HashMap where the key is the sorted version of each string, and the value is a list of original strings.
5. Two Sum with Indices Given an array of integers and a target, find two indices whose values sum to the target. Use a HashMap to store value -> index for O(n) solution.
Each problem reinforces a core HashMap technique: merge, computeIfAbsent, iteration, and custom key design. Try solving them without looking at the code first.
package io.thecodeforge.hashmap; import java.util.*; import java.util.stream.*; public class PracticeProblems { public static void main(String[] args) { // Problem 1: Word Frequency Counter String text = "apple banana apple cherry banana apple"; Map<String, Integer> freq = new HashMap<>(); for (String word : text.split(" ")) { freq.merge(word, 1, Integer::sum); } System.out.println("1. Word frequencies: " + freq); // Problem 2: Anagram Detector String s1 = "listen"; String s2 = "silent"; System.out.println("2. Are anagrams? " + areAnagrams(s1, s2)); // Problem 3: LRU Cache LRUCache<String, Integer> cache = new LRUCache<>(3); cache.put("A", 1); cache.put("B", 2); cache.put("C", 3); cache.get("A"); cache.put("D", 4); System.out.println("3. LRU cache: " + cache); // Problem 4: Group Anagrams String[] words = {"eat", "tea", "tan", "ate", "nat", "bat"}; System.out.println("4. Grouped anagrams: " + groupAnagrams(words)); // Problem 5: Two Sum int[] nums = {2, 7, 11, 15}; int target = 9; System.out.println("5. Two sum indices: " + Arrays.toString(twoSum(nums, target))); } static boolean areAnagrams(String a, String b) {\n if (a.length() != b.length()) return false;\n Map<Character, Integer> map = new HashMap<>();\n for (char c : a.toCharArray()) map.merge(c, 1, Integer::sum);\n for (char c : b.toCharArray()) {\n if (!map.containsKey(c)) return false;\n int count = map.get(c) - 1;\n if (count == 0) map.remove(c);\n else map.put(c, count);\n } return map.isEmpty(); } static class LRUCache<K, V> extends LinkedHashMap<K, V> {\n private final int maxSize;\n LRUCache(int maxSize) {\n super(16, 0.75f, true); // access-order\n this.maxSize = maxSize;\n } @Override protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {\n return size() > maxSize;\n } } static List<List<String>> groupAnagrams(String[] strs) { Map<String, List<String>> map = new HashMap<>(); for (String s : strs) { char[] chars = s.toCharArray(); Arrays.sort(chars); String key = new String(chars); map.computeIfAbsent(key, k -> new ArrayList<>()).add(s); } return new ArrayList<>(map.values()); } static int[] twoSum(int[] nums, int target) {\n Map<Integer, Integer> map = new HashMap<>();\n for (int i = 0; i < nums.length; i++) {\n int complement = target - nums[i];\n if (map.containsKey(complement)) {\n return new int[]{map.get(complement), i}; } map.put(nums[i], i); } return new int[]{}; } }
HashMap Constructors: Picking the Wrong One Will Cost You Memory
Most devs just call new HashMap<>() and move on. That’s fine for small maps with < 12 entries. But in high-volume or latency-sensitive code, the default constructor is a liability.
Java gives you four constructors. Only two matter in production: HashMap(int initialCapacity) and HashMap(int initialCapacity, float loadFactor). The no-arg constructor defaults to capacity 16 and load factor 0.75 — meaning a single put() can trigger a resize after 12 entries. If you’re building a map for 1000 items from an API response, that map will resize ~7 times. Each resize copies every entry to a new array. That’s CPU, GC pressure, and wasted time.
Use the capacity constructor. Calculate: expected size / load factor. For 1000 items with 0.75 load factor, initialCapacity = (int)(1000 / 0.75 + 1) = 1334. That avoids rehashing entirely.
The fourth constructor HashMap(Map<? extends K, ? extends V> m) copies the source map including its internal bucket array size. That’s a footgun: if you copy a tiny map into a new HashMap, you inherit its tiny capacity and get resizes on the first few puts. Always wrap it with the capacity constructor.
Pick capacity upfront. Your GC will thank you.
// io.thecodeforge — java tutorial import java.util.HashMap; import java.util.Map; public class HashMapConstructorPitfall { public static void main(String[] args) { // Wrong: default capacity 16, resizes after 12 entries Map<String, String> fragileMap = new HashMap<>(); for (int i = 0; i < 1000; i++) { fragileMap.put("key-" + i, "value-" + i); } // ~7 resizes happened, each copying all entries // Right: pre-allocate for 1000 entries with load factor 0.75 int expectedSize = 1000; int capacity = (int)(expectedSize / 0.75 + 1); Map<String, String> solidMap = new HashMap<>(capacity); for (int i = 0; i < 1000; i++) { solidMap.put("key-" + i, "value-" + i); } // Zero resizes System.out.println("solidMap size: " + solidMap.size()); } }
new HashMap<>(sourceMap) gives you capacity for ~3 entries. Your first real put triggers a resize. Use new HashMap<>(sourceMap.size() * 2) or the capacity formula above.Capacity, Load Factor, and Treeification: The Hidden Performance Knobs
HashMap’s performance isn’t magic — it’s a careful balance of array size and bucket collision probability. Capacity is the number of buckets in the internal array. Load factor is the threshold that triggers a resize. Default: 16 buckets, 0.75 load factor. That means after 12 entries, the array doubles to 32 buckets and everything gets rehashed.
But there’s a third knob most tutorials ignore: the treeification threshold. When a single bucket accumulates 8 or more entries, HashMap converts that bucket’s linked list into a balanced tree (red-black tree). This kicks in when your hash function is bad and many keys collide. Tree lookup is O(log n) vs linked list O(n). That threshold is tunable via TREEIFY_THRESHOLD constant (8), and the untreeify threshold (6) when entries drop below.
In production, if you see a HashMap with >1000 entries and high CPU on , suspect poor hash distribution. Check if keys are custom objects with broken hashCode — that’s where treeification masks the real problem. If your map stays under 8 collisions per bucket, you never hit tree mode, and performance remains O(1).get()
Monitor your map’s bucket distribution using a debugger or custom size metrics. If any bucket has >5 entries, revisit your hashCode implementation. Treeification is not free — it adds memory per node and overhead on every insert.
// io.thecodeforge — java tutorial import java.util.HashMap; import java.util.Map; import java.util.Objects; class BrokenHashKey { final int id; BrokenHashKey(int id) { this.id = id; } @Override public int hashCode() { return 1; } // all collide! @Override public boolean equals(Object o) { if (!(o instanceof BrokenHashKey)) return false; return ((BrokenHashKey)o).id == this.id; } } public class TreeificationThreshold { public static void main(String[] args) { Map<BrokenHashKey, String> badMap = new HashMap<>(); for (int i = 0; i < 20; i++) { badMap.put(new BrokenHashKey(i), "val-" + i); } // After 8 entries in bucket 1, HashMap treeifies that bucket // get() switches from O(n) to O(log n) System.out.println("Retrieved: " + badMap.get(new BrokenHashKey(5))); } }
HashMap Iteration Order: Never Assume Stability, Always Test
HashMap’s Javadoc says it explicitly: no guarantee of order, and order can change over time. Yet production bugs caused by assuming iteration order are legendary. You run a loop, process keys in some sequence, deploy a JVM update — suddenly order shifts, and your batch processor works but produces different output. Or worse: a unit test passes because the test JVM uses one bucket layout, but prod on a different Java version flips order.
Why does order change? HashMap iterates over buckets in array index order. When the array resizes, buckets move to new indices. That means the order you see after a resize is different from before. Even the initial order depends on hash values and bucket index computation — which itself depends on the internal function that spreads bits.hash()
If you rely on insertion order, use LinkedHashMap. If you rely on sorting, use TreeMap. Period. Don’t argue that “HashMap usually preserves insertion order” — it doesn’t. It preserves bucket index order, which is a side effect. A side effect that can break when you upgrade from Java 8 to Java 17 or when you cross a resize boundary.
Write tests that run with large datasets to trigger resizes, then verify iteration order is non-deterministic. If your code assumes order, it’s already broken — just waiting to fail in production.
// io.thecodeforge — java tutorial import java.util.HashMap; import java.util.Map; public class IterationOrderBreakdown { public static void main(String[] args) { Map<String, Integer> orders = new HashMap<>(); orders.put("alpha", 1); orders.put("beta", 2); orders.put("gamma", 3); // First iteration — order may be alpha, beta, gamma or different System.out.print("First: "); orders.keySet().forEach(k -> System.out.print(k + " ")); System.out.println(); // Simulate resize by adding many entries for (int i = 0; i < 1000; i++) { orders.put("x-" + i, i); } // Second iteration — order likely changed! System.out.print("After resize: "); orders.keySet().forEach(k -> System.out.print(k + " ")); System.out.println(); } }
Time and Space Complexity: Why HashMap is Fast Until It Isn't
HashMap gives you O(1) average time for put, get, and remove. That's the headline. But average doesn't mean every operation — when a hash collision happens, you're walking a linked list or a tree. Java 8 treeifies buckets over threshold 8, turning worst-case O(n) into O(log n). Still not constant. If you cram a bad hashCode into a small map, you'll watch your response times spike.
Space complexity is O(n) for stored entries, plus the backing array which is 2^n sized (power of two). Each entry object costs ~32 bytes overhead on 64-bit JVMs. A map with 10 million entries can eat 500 MB before you count the actual data. Resizing doubles capacity and rehashes everything — that's O(n) memory and time in one shot. That's why initial capacity matters: prevent resize spikes in latency-sensitive paths.
Know your thresholds. If your map stays under load factor 0.75, you're fine. Push above 0.9? You'll pay in collisions and treeification cost. Measure it.
// io.thecodeforge — java tutorial import java.util.*; public class HashMapComplexity { public static void main(String[] args) { Map<Integer, String> map = new HashMap<>(1 << 20); // 1M capacity long start = System.nanoTime(); for (int i = 0; i < 1_000_000; i++) { map.put(i, "value-" + i); } long end = System.nanoTime(); System.out.printf("Insert 1M entries: %d ms%n", (end - start) / 1_000_000); start = System.nanoTime(); String val = map.get(500_000); end = System.nanoTime(); System.out.printf("Get entry: %d ns%n", (end - start)); } }
HashMap(Map map) — The Copy Constructor That Copies Your Bugs
You'd think new HashMap<>(existingMap) is a safe clone. It's not. It copies references — not the objects. Your old map and new map share the exact same key and value instances. Mutate a mutable key in either map, and both maps are corrupted. HashMap has no deep copy mechanism built-in. If your keys are mutable StringBuilders or custom objects without defensive copies, you just introduced a heisenbug.
That constructor also sets the load factor to 0.75 (you can't override it here) and computes an initial capacity based on existing map size / 0.75 + 1. Works fine for most cases, but if the source map was resized and full, you inherit that cost. And it rehashes all entries — O(n) time and memory. Not free.
Senior move: use new HashMap<>(existingMap) only when keys and values are immutable or you want shared references. Otherwise, write a copy factory that deep-clones each entry. One line saves six hours debugging.
// io.thecodeforge — java tutorial import java.util.*; public class HashMapCopyConstructor { public static void main(String[] args) { Map<String, Integer> original = new HashMap<>(); original.put("alice", 1); original.put("bob", 2); // Shallow copy — shares references Map<String, Integer> copy = new HashMap<>(original); // Mutate original after copy original.put("alice", 100); original.put("charlie", 3); System.out.println("Original: " + original); System.out.println("Copy: " + copy); // String values are immutable, so copy is unaffected // But with mutable keys (e.g., List), both maps corrupt } }
Map.copyOf() for immutable maps instead of the copy constructor. It gives you an unmodifiable map and fails fast on nulls. For mutable copies, write a builder that deep-clones each entry.The Concurrency Corruption That Took Down a Payment Gateway
put() at the same time triggered a resize and rehash. During rehash, one thread was writing a new bucket while another was reading, resulting in an infinite loop in the linked list (pre-Java 8) or data corruption. The iteration then hit a cycle or null pointer.- Never share a raw HashMap across threads without external synchronization — even reads are unsafe during concurrent writes.
- ConcurrentHashMap is not a drop-in replacement for all cases; its iteration is weakly consistent, but it guarantees structural safety.
- When values need atomic updates, combine ConcurrentHashMap with AtomicLong or use
merge()/compute() methods.
HashMap.get() or put() iteration — indicates an infinite loop due to concurrent modification. Replace with ConcurrentHashMap.equals() and hashCode() are overridden consistently.get(). Use a profiler to find orphaned entries. Consider WeakHashMap for cache scenarios.thread dump via jstack <pid> | grep -A 10 "HashMap"Check if map is shared — add logging of map reference identityCollections.synchronizedMap() as temporary fix, then migrate to ConcurrentHashMapSystem.identityHashCode(key) vs key.hashCode()Verify equals() implementation with a unit testjmap -dump:live,format=b,file=heap.hprof <pid>Open heap dump in Eclipse MAT and run OQL: select * from java.util.HashMap$Node where toString(x) like '%expected%'remove() in a scheduled task| Feature | HashMap | LinkedHashMap | TreeMap | ConcurrentHashMap |
|---|---|---|---|---|
| Key ordering | None (arbitrary) | Insertion order | Natural / Comparator sort | None (arbitrary) |
| Null keys allowed | Yes (one null key) | Yes (one null key) | No — throws NullPointerException | No — throws NullPointerException |
| Thread safety | Not thread-safe | Not thread-safe | Not thread-safe | Thread-safe (segment locks) |
| get/put performance | O(1) average | O(1) average | O(log n) | O(1) average |
| Best use case | General-purpose lookup | LRU cache, audit logs | Sorted data, range queries | High-concurrency environments |
| Memory overhead | Low | Slightly higher (doubly linked list) | Higher (tree nodes) | Moderate (internal segments) |
Key takeaways
put() calls can create a circular linked list, causing infinite loops and 100% CPU usage.equals() correctly, or the map will lose entries.Common mistakes to avoid
5 patternsUsing a mutable object as a key
get() and won't clean it up — memory leak.Calling get() without null-checking the result
HashMap.get() returns null for both 'key is absent' and 'key maps to a null value'. Calling methods on the return value without checking causes a NullPointerException.Iterating over a HashMap while modifying it
remove() method: Iterator<Map.Entry<...>> it = map.entrySet().iterator(); while(it.hasNext()) { if (condition) it.remove(); }Not pre-sizing the HashMap for bulk loads
Sharing a HashMap across threads without synchronization
Collections.synchronizedMap(), but be aware of compound operations.Interview Questions on This Topic
What happens internally when two keys produce the same hashCode() in a Java HashMap? Walk me through how the collision is handled and how this changed in Java 8.
If I use a custom class as a HashMap key but only override equals() and not hashCode(), what specific bug will I see — and why does it happen at the bucket level?
equals()) end up as separate entries in the map. Without overriding hashCode(), the default Object.hashCode() returns a random memory address (typically based on object reference). Two distinct objects that are logically equal will have different hash codes, so they end up in different buckets. When you call get() with one object, it hashes to its own bucket, not the bucket where the equal object was stored — thus returning null. The map will appear to have duplicate entries when iterating. The fix: always override both methods consistently, and ensure that if a.equals(b) then a.hashCode() == b.hashCode().Frequently Asked Questions
No. The infinite loop bug only occurs under concurrent access when multiple threads trigger a resize simultaneously. Single-threaded usage of HashMap is safe.
Yes, though it's harder to trigger. Java 8 switched to tail insertion and treeified long buckets, which reduces the chance of a circular linked list forming, but the resize operation is still non-atomic. ConcurrentHashMap is the correct fix.
Use initialCapacity = (int) Math.ceil(expectedEntries / loadFactor). For the default load factor of 0.75, that's expectedEntries / 0.75, then round up to the nearest power of 2. For 10,000 entries, use new HashMap<>(16384).
ConcurrentHashMap uses lock striping — it locks only the bucket being modified, not the entire map — and performs resizing atomically with CAS (compare-and-swap) operations. This ensures no thread sees a partially resized structure.
20+ years shipping production Java in banking & fintech. Written from production experience, not tutorials.
That's Collections. Mark it forged?
15 min read · try the examples if you haven't