深入解析 Java HashMap 底层原理

1. HashMap 的基本结构

1.1 数据结构组成

HashMap 内部主要依赖以下几部分来存储数据：

数组：作为容器存储所有桶，每个桶的元素是链表或红黑树的头结点。
链表：当多个键映射到同一个桶时，这些键值对会以链表的形式串联在一起。
红黑树：当链表长度超过一定阈值时，链表转换为红黑树，以降低极端情况下的查找复杂度。

大白话:
想象一个文件柜（数组），每个抽屉（桶）里可能放着一串文件（链表），而当文件太多时，会用目录索引（红黑树）快速查找。

1.2 核心成员变量

展示代码图片

基础变量.webp

代码解释:

// 初始容量，默认 16
static final int DEFAULT_INITIAL_CAPACITY = 1 << 4;

// 默认负载因子，控制扩容阈值
static final float DEFAULT_LOAD_FACTOR = 0.75f;

// 链表长度达到8时转换为红黑树
static final int TREEIFY_THRESHOLD = 8;

// 红黑树节点数小于6时退化为链表
static final int UNTREEIFY_THRESHOLD = 6;

// 存储键值对的数组
transient Node<K,V>[] table;

// 当前存储的键值对数量
transient int size;

最佳实践:
当预估数据量较大时，构造时指定合适的初始容量，能有效减少扩容次数，从而提升性能：

Map<String, Integer> map = new HashMap<>(100);

1.3 示例图

table数组
- 表示整个 HashMap 的“table”，即存储桶数组。
数组部分
- 内部由多个小盒子组成，每个盒子代表数组的一个元素（即一个桶）。
- 标号0、1、2、...、n-2、n-1分别表示桶的下标。
桶中的节点
- 每个数组元素内可能存储一个 Node 对象。
- 如果该桶只有一个节点，则直接显示Node；如果有多个节点，则后续节点通过指针链接形成链表。
- 当链表长度较长时，可能转化为红黑树，这里用链表/红黑树来表示可能的情况。
箭头
- 表示从数组的桶指向链表或红黑树的入口（即链表的第一个节点或者红黑树的根节点）。
链表和红黑树的展示
- 链表/红黑树：说明在桶内可能是一个链表结构，也可能在达到一定阈值后被转换为红黑树结构，以提高查找性能。

2. 核心操作详解

2.1 put() 方法 —— 插入与更新

2.1.1 散列算法

为了使 key 均匀分布，HashMap 对 key 的 hashCode() 结果做了额外的混合处理：Java

/* ---------------- Static utilities -------------- */Java

/**
 * Computes key.hashCode() and spreads (XORs) higher bits of hash
 * to lower.  Because the table uses power-of-two masking, sets of
 * hashes that vary only in bits above the current mask will
 * always collide. (Among known examples are sets of Float keys
 * holding consecutive whole numbers in small tables.)  So we
 * apply a transform that spreads the impact of higher bits
 * downward. There is a tradeoff between speed, utility, and
 * quality of bit-spreading. Because many common sets of hashes
 * are already reasonably distributed (so don't benefit from
 * spreading), and because we use trees to handle large sets of
 * collisions in bins, we just XOR some shifted bits in the
 * cheapest possible way to reduce systematic lossage, as well as
 * to incorporate impact of the highest bits that would otherwise
 * never be used in index calculations because of table bounds.
 */
static final int hash(Object key) {
    int h;
// 将 key 的 hashCode 进行高位和低位的混合
    return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

空键处理：若 key 为 null，直接返回 0。
哈希值获取：调用 hashCode() 方法获取 key 的原始哈希值。
右移混合：通过无符号右移 16 位，将高位信息提取出来。
XOR 操作：利用 XOR 将原始哈希值与右移后的值混合，使得最终的哈希值在低位上也包含了高位的信息，从而减小冲突概率。

2.1.2 插入流程详解

    /**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don't change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0) // 1. 若 table 未初始化，则进行初始化扩容
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null) // 2. 计算桶索引： index = (n - 1) & hash
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k)))) // 3. 判断桶的首节点是否与 key 匹配
                e = p;
            else if (p instanceof TreeNode)  // 4. 如果当前桶为红黑树，使用树插入方法
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else { // 5. 否则沿链表查找
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        // 链表长度达到 TREEIFY_THRESHOLD 转换为红黑树
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key  6. 更新存在的节点的值
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        // 7. 插入成功后，更新结构修改次数和元素数量，并检查是否需要扩容
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

展开代码

最佳实践：
避免频繁修改 key 对象的 hashCode() 或 equals() 实现，否则可能导致元素定位异常，引起隐蔽的 bug。

2.1.3 插入过程示意图

2.2 get() 方法 —— 查找过程

HashMap 的查找过程与插入类似，主要步骤为：计算 hash、确定桶索引，然后在桶内查找目标节点。

    /**
     * Returns the value to which the specified key is mapped,
     * or {@code null} if this map contains no mapping for the key.
     *
     * <p>More formally, if this map contains a mapping from a key
     * {@code k} to a value {@code v} such that {@code (key==null ? k==null :
     * key.equals(k))}, then this method returns {@code v}; otherwise
     * it returns {@code null}.  (There can be at most one such mapping.)
     *
     * <p>A return value of {@code null} does not <i>necessarily</i>
     * indicate that the map contains no mapping for the key; it's also
     * possible that the map explicitly maps the key to {@code null}.
     * The {@link #containsKey containsKey} operation may be used to
     * distinguish these two cases.
     *
     * @see #put(Object, Object)Java
     */
    public V get(Object key) {
        Node<K,V> e;
        return (e = getNode(hash(key), key)) == null ? null : e.value;
    }

根据 key 计算哈希值（调用 hash(key)）。
调用 getNode() 方法，根据哈希值和 key 在底层数组中查找对应的节点。
如果找到节点，则返回节点的 value，否则返回 null。

PS:返回 null 不一定表示不存在映射，也可能是该 key 映射到的 value 正好为 null。可通过 containsKey() 进一步确认。

    /**
     * Implements Map.get and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @return the node, or null if none
     */
    final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
        (first = tab[(n - 1) & hash]) != null) {
        // 1. 检查桶中第一个节点是否匹配
        if (first.hash == hash && ((k = first.key) == key || (key != null && key.equals(k))))
            return first;
        // 2. 如果桶中有后续节点，继续查找
        if ((e = first.next) != null) {
            // 如果桶内结构为红黑树，则采用树查找方式
            if (first instanceof TreeNode)
                return ((TreeNode<K,V>)first).getTreeNode(hash, key);
            // 否则，遍历链表
            do {
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    return e;
            } while ((e = e.next) != null);
        }
    }
    return null;
    }

展开代码

获取 table 和桶长度：首先检查 HashMap 的底层数组 table 是否已初始化，并获取其长度。
确定桶索引：通过 (n - 1) & hash 来计算 key 应该存放在数组中的桶位置，然后获取该位置的第一个节点（first）。

直接命中：
检查桶中第一个节点是否匹配。匹配条件为：
- 节点的 hash 与计算的 hash 相等；
- 且节点的 key 与查找的 key 相同（相等判断既支持 == 也支持 equals()）。
如果匹配，则直接返回这个节点。
遍历桶中的链表或红黑树：
如果第一个节点不匹配且桶中存在后续节点（first.next 不为 null）：
- 红黑树查找：如果桶中的结构是红黑树（first instanceof TreeNode），则调用红黑树的查找方法 getTreeNode()。
- 链表遍历：否则，遍历链表中每个节点，依次比较节点的 hash 和 key，直到找到匹配的节点并返回。

如果遍历完桶中所有节点都没有找到符合条件的节点，则返回 null。

2.2.1 查找流程示意图

最佳实践：
尽量避免使用会产生大量哈希碰撞的 key 类型，优先选择不可变对象作为键，确保 hashCode() 分布均匀。

2.3 resize() 方法 —— 动态扩容

扩容是 HashMap 保证性能的关键机制。当元素数量超过阈值时，HashMap 将底层数组扩容，并重新分配元素。

    /**
     * Initializes or doubles table size.  If null, allocates in
     * accord with initial capacity target held in field threshold.
     * Otherwise, because we are using power-of-two expansion, the
     * elements from each bin must either stay at same index, or move
     * with a power of two offset in the new table.
     *
     * @return the table
     */
    final Node<K,V>[] resize() {
        Node<K,V>[] oldTab = table;
        int oldCap = (oldTab == null) ? 0 : oldTab.length;
        int oldThr = threshold;
        int newCap, newThr = 0;
        if (oldCap > 0) {
            if (oldCap >= MAXIMUM_CAPACITY) {
                threshold = Integer.MAX_VALUE;
                return oldTab;
            }
            else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                     oldCap >= DEFAULT_INITIAL_CAPACITY)
                newThr = oldThr << 1; // double threshold
        }
        else if (oldThr > 0) // initial capacity was placed in threshold
            newCap = oldThr;
        else {               // zero initial threshold signifies using defaults
            newCap = DEFAULT_INITIAL_CAPACITY;
            newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
        }
        if (newThr == 0) {
            float ft = (float)newCap * loadFactor;
            newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                      (int)ft : Integer.MAX_VALUE);
        }
        threshold = newThr;
        @SuppressWarnings({"rawtypes","unchecked"})
        Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
        table = newTab;
        if (oldTab != null) {
            for (int j = 0; j < oldCap; ++j) {
                Node<K,V> e;
                if ((e = oldTab[j]) != null) {
                    oldTab[j] = null;
                    if (e.next == null)
                        newTab[e.hash & (newCap - 1)] = e;
                    else if (e instanceof TreeNode)
                        ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                    else { // preserve order
                        Node<K,V> loHead = null, loTail = null;
                        Node<K,V> hiHead = null, hiTail = null;
                        Node<K,V> next;
                        do {
                            next = e.next;
                            if ((e.hash & oldCap) == 0) {
                                if (loTail == null)
                                    loHead = e;
                                else
                                    loTail.next = e;
                                loTail = e;
                            }
                            else {
                                if (hiTail == null)
                                    hiHead = e;
                                else
                                    hiTail.next = e;
                                hiTail = e;
                            }
                        } while ((e = next) != null);
                        if (loTail != null) {
                            loTail.next = null;
                            newTab[j] = loHead;
                        }
                        if (hiTail != null) {
                            hiTail.next = null;
                            newTab[j + oldCap] = hiHead;
                        }
                    }
                }
            }
        }
        return newTab;
    }

展开代码

resize() 方法的主要任务是：

初始化/扩容：如果当前 table 为空，则初始化；否则将数组容量翻倍。
重新计算阈值：根据新的容量和负载因子计算新的扩容阈值。
重新散列：将旧数组中每个桶中的节点重新分布到新数组中，由于容量是原来的两倍，所以每个节点只可能保持在原位置或者移动到 “原位置 + oldCap”。

2.3.1 扩容流程图

最佳实践
为避免频繁扩容，建议在构造 HashMap 时指定初始容量。如果能预估数据量较大，合理设置负载因子也可以在空间与性能之间取得平衡。

Menu

Share

深入解析 Java HashMap 底层原理

1. HashMap 的基本结构

1.1 数据结构组成

1.2 核心成员变量

1.3 示例图

2. 核心操作详解

2.1 put() 方法 —— 插入与更新

2.1.1 散列算法

2.1.2 插入流程详解

2.1.3 插入过程示意图

2.2 get() 方法 —— 查找过程

2.2.1 查找流程示意图

2.3 resize() 方法 —— 动态扩容

2.3.1 扩容流程图

Comment

项目ONE《技术派社区》

阿里云ECS服务器快速搭建

深入解析 Java HashMap 底层原理

拦截过滤器模式

传输对象模式

数据访问对象模式

测试文章

Spring Spring Security 中重要对象汇总