声明

本文采用版本为： go1.17.5
本文仅供自己学习使用，不做商业用途。

map 的结构：

hmap

hmap结构体定义

golang 中 map的实现结构为：哈希表 + 链表。其中链表，作用是当发生hash冲突时，拉链法生成的结点。

// A header for a Go map.
type hmap struct {
    // map中的元素个数
    count     int 
    // map当前的状态， 如： 正在写， 正在遍历
    flags     uint8
    // map中 buckets的对数 log_2， 如 buckets的容量为 2^B
    B         uint8  
    // overflow 的数量
    noverflow  uint16
    // hash种子， 用来生成hash值
    hash0     uint32 // hash seed
    // 指向map的 hash table。 hash table的长度为2^B
    buckets    unsafe.Pointer 
    // 指向扩容时的原 buckets， 新的buckets会申请新的空间。
    // 扩容的时候，buckets 长度会是 oldbuckets 的两倍
    oldbuckets unsafe.Pointer
    // 当前map的扩容进度（扩容搬迁到哪一个cell了）
    nevacuate  uintptr 

    extra *mapextra // optional fields
}

hmap 结构示意图

map 结构图

可以看到， []bmap 是一个hash table，每一个 bmap是我们常说的“桶”。经过hash 函数计算出来相同的hash值，放到相同的桶中。一个 bmap中可以存放 8个元素，如果多出8个，则生成新的结点，尾接到队尾。

bmap

bmap 结构定义

// A bucket for a Go map.
type bmap struct {
   // tophash generally contains the top byte of the hash value
   // for each key in this bucket. If tophash[0]

以上是只是静态文件 src/runtime/map.go 中的定义。 实际上编译期间会给它加料，动态地创建一个新的结构：

type bmap struct {
    topbits  [8]uint8
    keys     [8]keytype
    values   [8]valuetype
    pad      uintptr
    overflow uintptr
}

bmap 结构示意图

bmap

上图就是 bmap的内存模型，HOB Hash 指的就是 top hash。注意到 key 和 value 是各自放在一起的，并不是 key/value/key/value/... 这样的形式。源码里说明这样的好处是在某些情况下可以省略掉 padding 字段，节省内存空间。

每个 bmap设计成最多只能放 8 个 key-value 对，如果有第 9 个 key-value 落入当前的 bmap，那就需要再构建一个 bmap，通过 overflow 指针连接起来。

创建map

map创建方法:

ageMp := make(map[string]int)
// 指定 map 长度
ageMp := make(map[string]int, 8)

// ageMp 为 nil，不能向其添加元素，会直接panic
var ageMp map[string]int

我们实际上是通过调用的 makemap ，来创建map的。实际工作只是初始化了hmap中的各种字段，如：设置B的大小，设置hash 种子 hash 0.

// makemap implements Go map creation for make(map[k]v, hint).
// If the compiler has determined that the map or the first bucket
// can be created on the stack, h and/or bucket may be non-nil.
// If h != nil, the map can be created directly in h.
// If h.buckets != nil, bucket pointed to can be used as the first bucket.
func makemap(t *maptype, hint int, h *hmap) *hmap {
    // 判断是否还有空间可以申请
    mem, overflow := math.MulUintptr(uintptr(hint), t.bucket.size)
    if overflow || mem > maxAlloc {
        hint = 0
    }

    // initialize Hmap
    if h == nil {
        h = new(hmap)
    }
    // 初始化时， 随机生成hash种子
    h.hash0 = fastrand()

    // Find the size parameter B which will hold the requested # of elements.
    // For hint  6.5
    return count > bucketCnt && uintptr(count) > loadFactorNum*(bucketShift(B)/loadFactorDen)
}

注意 :

makemap 返回是*hmap 指针，即 map 是引用对象，对map的操作会影响到结构体内部。

查找map元素

使用方式

package main

import "fmt"

func main() {
    // 初始化 map
    ageMap := make(map[string]int)
    // 插入元素
    ageMap["wxx"] = 24

    // 不带 comma 用法
    age1 := ageMap["a"]
    fmt.Println(age1) // 0

    // 带 comma 用法
    age2, ok := ageMap["b"] // 0， false
    fmt.Println(age2, ok)
}

对应的是下面两种方法

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer
// mapaccess2 和 mapaccess1 功能相同 但是多返回一个是否搜索到的bool值
func mapaccess2(t *maptype, h *hmap, key unsafe.Pointer) (unsafe.Pointer, bool)

hash 函数如何确定

type maptype struct {
    typ    _type
    key    *_type
    elem   *_type
    bucket *_type // internal type representing a hash bucket
    // function for hashing keys (ptr to key, seed) -> hash
    hasher     func(unsafe.Pointer, uintptr) uintptr
    keysize    uint8  // size of key slot
    elemsize   uint8  // size of elem slot
    bucketsize uint16 // size of bucket
    flags      uint32
}

map的key的类型，实现了自己的hash 方式。每种类型实现hash函数方式不一样。

正常定位key

key 经过哈希计算后得到hash值，共 64 个 bit 位。其中后B 个bit位置，用来定位当前元素落在哪一个桶里，高8个bit 为当前 hash 值的top hash。实际上定位key的过程是一个双重循环的过程，外层循环遍历所有的overflow，内层循环遍历当前bmap 中的 8个元素。

使用低 B 为定位是在哪一个 bucket
遍历当前结点中的8个 cell是否有满足top hash的。如果有则定位完成，否则进入第3步
遍历链表的结点，如果定位到了，则返回值。否则返回零值。

举例说明：如果当前 B 的值为 5，那么buckets 的长度为 2^5 = 32。假设有个key 经过hash函数计算后，得到的hash结果为：

10010111 | 000011110110110010001111001010100010010110010101010 │ 00110

第B位 hash & B == 01010 == 6,定位到6号桶
高8位 tophash 10010111 == 151,遍历匹配bmap 中的tophash

外层遍历bucket 中的链表

双层循环遍历key

内层循环遍历 bmap中的8个 cell

keys定位到桶

扩容时定位key

建议先不看此部分内容，看完后续 修改 map中元素 -> 扩容 操作后再回头看此部分内容。

扩容前的数据：

扩容前数据

等量扩容后的数据：

等量扩容后，查找方式和原本相同，不多做赘述。

等量扩容

两倍扩容后的数据

两倍扩容后，oldbuckets 的元素，可能被分配成了两部分。查找顺序如下：

遍历到old buckets的 bucket还没有被搬迁过，正常查询，返回找到的元素。
遍历到old buckets的 bucket 被搬迁过了，则到新的buckets中寻找元素。

两倍扩容

源码：

此处只分析 mapaccess1,。mapaccess2 相比 mapaccess1多添加了是否找到的bool值，有兴趣可自行看一下。

func mapaccess1(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
    // race 竞争检测
    if raceenabled && h != nil {
        callerpc := getcallerpc()
        pc := funcPC(mapaccess1)
        racereadpc(unsafe.Pointer(h), callerpc, pc)
        raceReadObjectPC(t.key, key, callerpc, pc)
    }
    if msanenabled && h != nil {
        msanread(key, t.key.size)
    }
    // 正确性检测
    if h == nil || h.count == 0 {
        if t.hashMightPanic() {
            t.hasher(key, 0) // see issue 23734
        }
        // 返回当前类型的0值
        return unsafe.Pointer(&zeroVal[0])
    }
    // 如果flags为 写 标识， 则同时读写冲突， 抛出panic。非协程安全
    if h.flags&hashWriting != 0 {
        throw("concurrent map read and map write")
    }
    // 依据hash 种子生成 当前key的 hash值
    hash := t.hasher(key, uintptr(h.hash0))
    // m = 2^B - 1, 实际上的作用是为了 只取低B 位， 来找到相对应桶的位置
    m := bucketMask(h.B)
    // 找到桶的位置
    b := (*bmap)(add(h.buckets, (hash&m)*uintptr(t.bucketsize)))
    // 正在扩容
    if c := h.oldbuckets; c != nil {
        // 非等量扩容（双倍扩容）
        if !h.sameSizeGrow() {
            // There used to be half as many buckets; mask down one more power of two.
            // 先 右移一位 得到原来old bucket的 mask的值
            m >>= 1
        }
        // 找到当前桶对应的 old bucket中的桶的位置
        oldb := (*bmap)(add(c, (hash&m)*uintptr(t.bucketsize)))
        // 原来的 oldbucket 还没有搬迁，则在old bucket中找
        if !evacuated(oldb) {
            b = oldb
        }
    }
    top := tophash(hash)
bucketloop:
    // 双重for循环 外层遍历overflow（链表） 内层遍历桶
    for ; b != nil; b = b.overflow(t) {
        for i := uintptr(0); i

修改map元素

使用方式：

package main

import "fmt"

func main() {
    // 初始化 map
    ageMap := make(map[string]int)
    // 插入元素
    ageMap["wxx"] = 24
}

修改和插入元素

步骤如下：

定位key的位置，和 map的查询操作一样。
如果定位过程中发现当前map 正在进行扩容，则帮助进行扩容搬迁。否则当前插入没有意义。
如果当前map中已经存在该key，则修改value的值，返回
如果当前map中不存在该key，则在找到的第一个空的cell上插入该元素的 tophash， key， value。该步骤需要判断是否需要扩容，如果满足扩容条件则进行扩容

扩容条件：

每个bucket 中的数量大于 6.5（map的容量count / 桶的数量 2^B > 6.5）。此时太拥挤了，hashtable 冲突会比较严重
overflow的数量太多了了，查询效率太低。地广人稀，需要聚集当一起居住。
- 当 B = 2^B
- 当 B > 15时， noverflow >= 2^15

// Like mapaccess, but allocates a slot for the key if it is not present in the map.
// mapassign 插入元素 修改元素
func mapassign(t *maptype, h *hmap, key unsafe.Pointer) unsafe.Pointer {
    // 初始化检测
    if h == nil {
        panic(plainError("assignment to entry in nil map"))
    }
    // 竞争检测
    if raceenabled {
        callerpc := getcallerpc()
        pc := funcPC(mapassign)
        racewritepc(unsafe.Pointer(h), callerpc, pc)
        raceReadObjectPC(t.key, key, callerpc, pc)
    }
    if msanenabled {
        msanread(key, t.key.size)
    }
    // 安全检测，如果当前有协程正在写，则panic ， 同时写错误
    if h.flags&hashWriting != 0 {
        throw("concurrent map writes")
    }
    // 依据hash种子，生成hash值
    hash := t.hasher(key, uintptr(h.hash0))

    // Set hashWriting after calling t.hasher, since t.hasher may panic,
    // in which case we have not actually done a write.
    // 将flags 的标志为改变为正在写
    h.flags ^= hashWriting
    // 如果没有 buckets, 则new 一个
    if h.buckets == nil {
        h.buckets = newobject(t.bucket) // newarray(t.bucket, 1)
    }

again:
    // 找到对应的桶 索引为 index = hash & 2^B - 1，第index个桶
    bucket := hash & bucketMask(h.B)
    // 如果正在扩容
    if h.growing() {
        // 帮助扩容
        growWork(t, h, bucket)
    }
    // 找到对应桶的地址
    b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize)))
    // tophash : 取hash值的高八位
    top := tophash(hash)

    // 要插入的cell 的tophash[i] 的地址
    var inserti *uint8
    // 要插入cell 的key 的地址
    var insertk unsafe.Pointer
    // 要插入cell 的value的地址
    var elem unsafe.Pointer
bucketloop:
    // 双重for 循环，寻找对应的cell
    // 外层for循环 遍历overflow
    // 内层for 循环， 遍历bucket
    for {
        for i := uintptr(0); i  6.5）。 此时太拥挤了，hashtable 冲突会比较严重
    //扩容条件 2：overflow的数量太多了， 查询效率太低。地广人稀， 需要聚集当一起居住
    //          当 B = 2^B
    //          当 B > 15时， noverflow >= 2^15
    if !h.growing() && (overLoadFactor(h.count+1, h.B) || tooManyOverflowBuckets(h.noverflow, h.B)) {
        // 标识一下 开始扩容。 hashGrow()只进行分配空间， 具体扩容在growWork()中
        hashGrow(t, h)
        // 扩容后，元素的位置可能会改变，需要重新遍历一次
        goto again // Growing the table invalidates everything, so try again
    }
    // 此时的情况， 遍历完了全部的Overflow的cell， 仍然没有找到合适的位置（前面的位置都满了）， 则新建一个overflow 放置在第一个cell里
    if inserti == nil {
        // The current bucket and all the overflow buckets connected to it are full, allocate a new one.
        // 新建一个overflow
        newb := h.newoverflow(t, b)
        // tophash的 i 的地址
        inserti = &newb.tophash[0]
        // key的地址
        insertk = add(unsafe.Pointer(newb), dataOffset)
        // value的地址
        elem = add(insertk, bucketCnt*uintptr(t.keysize))
    }

    // store new key/elem at insert position
    // 解引用
    if t.indirectkey() {
        kmem := newobject(t.key)
        *(*unsafe.Pointer)(insertk) = kmem
        insertk = kmem
    }
    // 解引用
    if t.indirectelem() {
        vmem := newobject(t.elem)
        *(*unsafe.Pointer)(elem) = vmem
    }
    // 设置key的值
    typedmemmove(t.key, insertk, key)
    // 设置tophash的值
    *inserti = top
    // map容量 + 1
    h.count++

done:
    // 判断是否有修改了 flag的协程，如果有则panic
    if h.flags&hashWriting == 0 {
        throw("concurrent map writes")
    }
    // 恢复flag标识位
    h.flags &^= hashWriting
    // 解引用
    if t.indirectelem() {
        elem = *((*unsafe.Pointer)(elem))
    }
    // 返回value的值
    return elem
}

扩容

源码

扩容的标识：h.oldbuckets ！= nil

// 扩容，hashGrow 函数值是申请了新的内存， 以及将oldbucket上的flags 标识赋值给 新的buckets
func hashGrow(t *maptype, h *hmap) {
    // If we've hit the load factor, get bigger.
    // Otherwise, there are too many overflow buckets,
    // so keep the same number of buckets and "grow" laterally.
    bigger := uint8(1)
    if !overLoadFactor(h.count+1, h.B) {
        bigger = 0
        h.flags |= sameSizeGrow
    }
    oldbuckets := h.buckets
    newbuckets, nextOverflow := makeBucketArray(t, h.B+bigger, nil)

    flags := h.flags &^ (iterator | oldIterator)
    if h.flags&iterator != 0 {
        flags |= oldIterator
    }
    // commit the grow (atomic wrt gc)
    h.B += bigger
    h.flags = flags
    h.oldbuckets = oldbuckets
    h.buckets = newbuckets
    h.nevacuate = 0
    h.noverflow = 0

    if h.extra != nil && h.extra.overflow != nil {
        // Promote current overflow buckets to the old generation.
        if h.extra.oldoverflow != nil {
            throw("oldoverflow is not nil")
        }
        h.extra.oldoverflow = h.extra.overflow
        h.extra.overflow = nil
    }
    if nextOverflow != nil {
        if h.extra == nil {
            h.extra = new(mapextra)
        }
        h.extra.nextOverflow = nextOverflow
    }

    // the actual copying of the hash table data is done incrementally
    // by growWork() and evacuate().
}

// 实际扩容
func growWork(t *maptype, h *hmap, bucket uintptr) {
    // make sure we evacuate the oldbucket corresponding
    // to the bucket we're about to use
    // 搬迁元素， 将老bucket上的元素 搬迁到心bucket上
    evacuate(t, h, bucket&h.oldbucketmask())

    // evacuate one more oldbucket to make progress on growing
    // 如果仍在扩容，则继续帮助扩容
    if h.growing() {
        evacuate(t, h, h.nevacuate)
    }
}


// evacuate 将oldbuckets上的元素 搬迁到 新的buckets上
func evacuate(t *maptype, h *hmap, oldbucket uintptr) {
    // 找到oldbuckets 的 bucket的地址
    b := (*bmap)(add(h.oldbuckets, oldbucket*uintptr(t.bucketsize)))
    // 计算oldbucket 的bucket的数量 2^B  B == 2  1 00. 高位1
    newbit := h.noldbuckets()
    // 当前没有搬迁过
    if !evacuated(b) {
        // TODO: reuse overflow buckets instead of using new ones, if there
        // is no iterator using the old buckets.  (If !oldIterator.)

        // xy contains the x and y (low and high) evacuation destinations.
        // 如果是双倍扩容，则会由两个长为newbit（oldbuckets的长度）的数组， 低位为x 部分， 高位为 y 部分
        // 比如 原本是长度为2， 扩容后的长度为 2 + 2（4） 前一部分的长度为2的数组为 x 部分， 后一部分长度为2的数组为 y 部分
        var xy [2]evacDst
        // 默认只使用 x 部分
        x := &xy[0]
        // x部分的起始位置， 实际上是tophash[] 的起始位置
        x.b = (*bmap)(add(h.buckets, oldbucket*uintptr(t.bucketsize)))
        // x部分的key 的起始位置
        x.k = add(unsafe.Pointer(x.b), dataOffset)
        // x部分的value 的起始位置
        x.e = add(x.k, bucketCnt*uintptr(t.keysize))
        // 双倍扩容才需要使用到 y 部分。 等量扩容只需要使用 x 部分
        if !h.sameSizeGrow() {
            // Only calculate y pointers if we're growing bigger.
            // Otherwise GC can see bad pointers.
            y := &xy[1]
            // y 部分 和 x部分的对应的位置相差的距离 为一个 newbit的长度
            y.b = (*bmap)(add(h.buckets, (oldbucket+newbit)*uintptr(t.bucketsize)))
            // y 部分的keys 的起始位置
            y.k = add(unsafe.Pointer(y.b), dataOffset)
            // y 部分的values 的起始位置
            y.e = add(y.k, bucketCnt*uintptr(t.keysize))
        }

        for ; b != nil; b = b.overflow(t) {
            // keys的起始位置
            k := add(unsafe.Pointer(b), dataOffset)
            // values的起始位置
            e := add(k, bucketCnt*uintptr(t.keysize))
            // 遍历bucket中的每一个 cell
            for i := 0; i

搬迁过程

搬迁过程中

搬迁中的查找

假设当前定位到了新的buckets的3号桶中，首先会判断oldbuckets中的对应的桶有没有被搬迁过。如果搬迁过了，不需要看原来的桶了，直接遍历新的buckets的3号桶。

内存对比图

扩容前：

扩容前数据

等量扩容结果

等量扩容

两倍扩容：

双倍扩容会将old buckets上的元素分配到x， y两个部key & 1

两倍扩容

举例说明

注意：当前只对双倍扩容描述，等量扩容只是重新填充了一下元素，相对位置没有改变。

假设当前map 的B == 5，原本元素经过hash函数计算的 hash 值为：

10010111 | 000011110110110010001111001010100010010110010101011 │ 00110

rehash.png

因为双倍扩容之后 B = B + 1，此时B == 6。key & 1

删除map元素

定位key，和查找过程相似。
判断删除元素后当前元素的个数，如果个数为0，则重置hash种子，使得map 和以前的map更加的不同
当前overflow中只有当前元素了，要删除当前元素时，需要考虑一下后面overflow的cell的状态，将后续的状态全部修改

func mapdelete(t *maptype, h *hmap, key unsafe.Pointer) {
    // race 检测
    if raceenabled && h != nil {
        callerpc := getcallerpc()
        pc := funcPC(mapdelete)
        racewritepc(unsafe.Pointer(h), callerpc, pc)
        raceReadObjectPC(t.key, key, callerpc, pc)
    }
    if msanenabled && h != nil {
        msanread(key, t.key.size)
    }
    // map没有初始化 或者 容量位0 直接返回
    if h == nil || h.count == 0 {
        if t.hashMightPanic() {
            t.hasher(key, 0) // see issue 23734
        }
        return
    }
    // 写 标志位检查， 如果有协程正在写map， 则panic 同时写错误
    if h.flags&hashWriting != 0 {
        throw("concurrent map writes")
    }
    // 依据hash种子 生成 hash值
    hash := t.hasher(key, uintptr(h.hash0))

    // Set hashWriting after calling t.hasher, since t.hasher may panic,
    // in which case we have not actually done a write (delete).
    // 置为 写 标志位
    h.flags ^= hashWriting
    // 找到是第几个桶
    bucket := hash & bucketMask(h.B)
    // 正在扩容，则帮助搬迁
    if h.growing() {
        growWork(t, h, bucket)
    }
    // 找到对应桶的 地址
    b := (*bmap)(add(h.buckets, bucket*uintptr(t.bucketsize)))
    bOrig := b
    top := tophash(hash)
search:
    // 双重循环， 外层遍历链表， 内层遍历bucket
    for ; b != nil; b = b.overflow(t) {
        for i := uintptr(0); i

遍历map

使用方式：

package main

import "fmt"

func main() {
 ageMp := make(map[string]int)
 ageMp["wxx"] = 1

 for name, age := range ageMp {
 fmt.Println(name, age)
 }
}

源码

// mapiterinit 初始化迭代器
// mapiternext 迭代器遍历
mapiterinit + mapiternext == 遍历

// mapiterinit initializes the hiter struct used for ranging over maps.
// The hiter struct pointed to by 'it' is allocated on the stack
// by the compilers order pass or on the heap by reflect_mapiterinit.
// Both need to have zeroed hiter since the struct contains pointers.
func mapiterinit(t *maptype, h *hmap, it *hiter) {
    // 竞争检测
    if raceenabled && h != nil {
        callerpc := getcallerpc()
        racereadpc(unsafe.Pointer(h), callerpc, funcPC(mapiterinit))
    }

    if h == nil || h.count == 0 {
        return
    }

    if unsafe.Sizeof(hiter{})/sys.PtrSize != 12 {
        throw("hash_iter size incorrect") // see cmd/compile/internal/reflectdata/reflect.go
    }
    it.t = t
    it.h = h

    // grab snapshot of bucket state
    it.B = h.B
    it.buckets = h.buckets
    if t.bucket.ptrdata == 0 {
        // Allocate the current slice and remember pointers to both current and old.
        // This preserves all relevant overflow buckets alive even if
        // the table grows and/or overflow buckets are added to the table
        // while we are iterating.
        h.createOverflow()
        it.overflow = h.extra.overflow
        it.oldoverflow = h.extra.oldoverflow
    }

    // decide where to start
    // 随机取一个bucket开始
    r := uintptr(fastrand())
    if h.B > 31-bucketCntBits {
        r += uintptr(fastrand()) > h.B & (bucketCnt - 1))

    // iterator state
    it.bucket = it.startBucket

    // Remember we have an iterator.
    // Can run concurrently with another mapiterinit().
    if old := h.flags; old&(iterator|oldIterator) != iterator|oldIterator {
        atomic.Or8(&h.flags, iterator|oldIterator)
    }

    mapiternext(it)
}

可以看到，每一遍历生成迭代器的时候，会随机选取一个bucket 以及一个cell开始。从前往后遍历，再次遍历到起始位置时，遍历完成。

总结

map 返回的结构体是指针，即引用类型
map的内存结构是 hash table + 链表
map的定位实际上是双重for循环，在定位到bucket后，外层遍历overflow，内层遍历8个cell
map 扩容的标识 h.oldbuckets != nil
map 的扩容分成等量扩容和双倍扩容
map的扩容条件为负载因子大于 6.5 或者 overflow 的数量太多（ B = 2^B ； B > 15时， noverflow >= 2^15）。前者对应的是元素分配太过于集中，hash冲突严重。后者对应的是元素分配太过于稀疏，地广人稀，查询效率低。
map 删除元素后，会重新生成hash种子，使得map 和之前更加得不同
map 的遍历是随机的，可能每次的遍历先后顺序不一样。

参考资料

https://www.qcrao.com/2019/05/22/dive-into-go-map/

https://draveness.me/golang/docs/part2-foundation/ch03-datastructure/golang-hashmap/

golang map源码浅析

声明

map 的结构：

hmap

hmap结构体定义

hmap 结构示意图

bmap

bmap 结构定义

bmap 结构示意图

创建map

查找map元素

hash 函数如何确定

正常定位key

扩容时定位key

源码：

修改map元素

修改和插入元素

扩容

源码

搬迁过程

内存对比图

举例说明

删除map元素

遍历map

源码

总结

参考资料

Docker多阶段构建

golang 设置http超时时间