Bencher is available on the Github marketplace!

Map clearing

Introduction

Maps are bread and butter in the Go programming language to very fast storage and retrieval of items. Maps allows for keyed storages and accesses in constant time aka O(1) time.

If we examine the Go specification section about Maps, we can see the formal definition of maps in Go

For example if we’d like to store the ages of various users in Go we can do the following storing book categories mapped by their commonly used ids, per https://play.golang.org/p/2BiEQACvsg2 or inlined below:

package main

func main() {
	ids := make(map[int]string)
	// Fill in the values as they'd come in from the database.
	ids[99] = "Literature"
	ids[212] = "Medicine"
	ids[88] = "Physics"
	ids[47] = "Biology"
	ids[22] = "Dance"
	ids[1902] = "Philosophy"
	ids[88] = "Comedy"

	// Now print out the category stored for id:1902
	printCategory(1902, ids)
	// Use unknown ids.
	printCategory(-1, ids)
	printCategory(100, ids)
}

func printCategory(id int, ids map[int]string) {
	category, ok := ids[id]
	if !ok {
		category = "Unknown"
	}
	println("Category for id", id, "is", category)
}

which prints out

Category for id 1902 is Philosophy
Category for id -1 is Unknown
Category for id 100 is Unknown

Deletion

If we want to correct our map entries, we can simply use the delete built-in function.

delete(ids, id2)

Clearing a map

To clear a map, we should be able to invoke delete with its keys. However, what style should we use?

Collect all keys

We might have collected all the keys somewhere in a slice, and now iterate through them to delete them in the map

for _, key := range keys {
    delete(ids, key)
}

Delete in a loop but do extra work

Perhaps there is extra work that you need to do in the same loop as the deletion, or use the stored values/keys later:

deletedKeys := make([]int, 0, len(ids))
for key := range ids {
    delete(ids, key)
    deletedKeys = append(deletedKeys, key)
}

// Use deleted keys much later on.
_ = deletedKeys

Delete only within the map iteration loop

for key := range ids {
    delete(ids, key)
}

Pop quiz

Which of the 3 styles above is the fastest and most efficient??

Set up benchmarking

To set up our shoot-out for benchmarking and solid data backed by numbers, we need to do the following:

Benchmark the code

To figure out how it performs, let’s start by benchmarking it.

package mapclearing

import "testing"

func returnValuesAndClearMap(m map[int]int) int {
	i := 0
	for key, value := range m {
		delete(m, key)
		i += value
	}
	return i
}

func populateMap(n int) map[int]int {
	m := make(map[int]int, n)
	for i := 0; i < n; i++ {
		m[i] = i
	}
	return m
}

var sink interface{}

func BenchmarkIt10(b *testing.B) {
	benchmarkIt(b, 10)
}

func BenchmarkIt100(b *testing.B) {
	benchmarkIt(b, 100)
}

func BenchmarkIt1000(b *testing.B) {
	benchmarkIt(b, 1000)
}

func BenchmarkIt10000(b *testing.B) {
	benchmarkIt(b, 10000)
}

func BenchmarkIt100000(b *testing.B) {
	benchmarkIt(b, 100000)
}

func BenchmarkIt1000000(b *testing.B) {
	benchmarkIt(b, 1000000)
}

func benchmarkIt(b *testing.B, n int) {
	for i := 0; i < b.N; i++ {
		m := populateMap(n)
		sink = returnValuesAndClearMap(m)
	}

	if sink == nil {
		b.Fatal("Benchmark did not run")
	}
	sink = (interface{})(nil)
}

Commit the result

Having successfully added Bencher to our repository in two clicks, we mailed the first change. That made Bencher spawn the following:

Shoot-out

Add map deletion in a loop

If we apply the map clearing idiom, with NOTHING else except iterating with keys and deleting, like this:

---
 maps/map-deletion/bench_test.go | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/maps/map-deletion/bench_test.go b/maps/map-deletion/bench_test.go
index 41139e0..630903a 100644
--- a/maps/map-deletion/bench_test.go
+++ b/maps/map-deletion/bench_test.go
@@ -4,10 +4,15 @@ import "testing"
 
 func returnValuesAndClearMap(m map[int]int) int {
 	i := 0
-	for key, value := range m {
-		delete(m, key)
+	// 1. Just get the values.
+	for _, value := range m {
 		i += value
 	}
+
+	// 2. Use the map clearing idiom solely now.
+	for key := range m {
+		delete(m, key)
+	}
 	return i
 }

Results

We get the following results per https://dashboard.github.orijtech.com/benchmark/88710d3d52044b8dafbba173b27f28e1

Verdict

For the best performance, please use the map clearing idiom solely

for key := range m {
    delete(m, key)
}

This option is the fastest as the compiler recognizes it for fast deletion and it was requested for improvement by Josh Bleecher Synder per golang/issues/#20138 and implemented by Martin Möhrmann in CL 110055 for Go1.11 in April 2018. We’ve actually used it to speed up customers' code per cosmos-sdk/pr#8719

References

Resource URL
delete builtin: Go standard library https://golang.org/pkg/builtin#delete
Bencher website https://bencher.orijtech.com/
Bencher on the Github marketplace https://github.com/marketplace/gobencher/
Go
Cosmos-SDK store/cachekv, x/bank/types: algorithmically fix pathologically slow code https://github.com/cosmos/cosmos-sdk/pull/8719
cmd/compile: recognize map-clearing range idiom by Josh Bleecher Snyder https://github.com/golang/go/issues/20138
cmd/compile: optimize map-clearing range idiom by Martin Möhrmann https://go-review.googlesource.com/c/go/+/110055/