Map clearing
Introduction
Maps are bread and butter in the Go programming language to very fast storage and retrieval of items. Maps allows for keyed storages and accesses in constant time aka O(1) time.
If we examine the Go specification section about Maps, we can see the formal definition of maps in Go
For example if we’d like to store the ages of various users in Go we can do the following storing book categories mapped by their commonly used ids, per https://play.golang.org/p/2BiEQACvsg2 or inlined below:
package main
func main() {
ids := make(map[int]string)
// Fill in the values as they'd come in from the database.
ids[99] = "Literature"
ids[212] = "Medicine"
ids[88] = "Physics"
ids[47] = "Biology"
ids[22] = "Dance"
ids[1902] = "Philosophy"
ids[88] = "Comedy"
// Now print out the category stored for id:1902
printCategory(1902, ids)
// Use unknown ids.
printCategory(-1, ids)
printCategory(100, ids)
}
func printCategory(id int, ids map[int]string) {
category, ok := ids[id]
if !ok {
category = "Unknown"
}
println("Category for id", id, "is", category)
}
which prints out
Category for id 1902 is Philosophy
Category for id -1 is Unknown
Category for id 100 is Unknown
Deletion
If we want to correct our map entries, we can simply use the delete built-in function.
delete(ids, id2)
Clearing a map
To clear a map, we should be able to invoke delete
with its keys. However, what style should we use?
Collect all keys
We might have collected all the keys somewhere in a slice, and now iterate through them to delete them in the map
for _, key := range keys {
delete(ids, key)
}
Delete in a loop but do extra work
Perhaps there is extra work that you need to do in the same loop as the deletion, or use the stored values/keys later:
deletedKeys := make([]int, 0, len(ids))
for key := range ids {
delete(ids, key)
deletedKeys = append(deletedKeys, key)
}
// Use deleted keys much later on.
_ = deletedKeys
Delete only within the map iteration loop
for key := range ids {
delete(ids, key)
}
Pop quiz
Which of the 3 styles above is the fastest and most efficient??
Set up benchmarking
To set up our shoot-out for benchmarking and solid data backed by numbers, we need to do the following:
Benchmark the code
To figure out how it performs, let’s start by benchmarking it.
package mapclearing
import "testing"
func returnValuesAndClearMap(m map[int]int) int {
i := 0
for key, value := range m {
delete(m, key)
i += value
}
return i
}
func populateMap(n int) map[int]int {
m := make(map[int]int, n)
for i := 0; i < n; i++ {
m[i] = i
}
return m
}
var sink interface{}
func BenchmarkIt10(b *testing.B) {
benchmarkIt(b, 10)
}
func BenchmarkIt100(b *testing.B) {
benchmarkIt(b, 100)
}
func BenchmarkIt1000(b *testing.B) {
benchmarkIt(b, 1000)
}
func BenchmarkIt10000(b *testing.B) {
benchmarkIt(b, 10000)
}
func BenchmarkIt100000(b *testing.B) {
benchmarkIt(b, 100000)
}
func BenchmarkIt1000000(b *testing.B) {
benchmarkIt(b, 1000000)
}
func benchmarkIt(b *testing.B, n int) {
for i := 0; i < b.N; i++ {
m := populateMap(n)
sink = returnValuesAndClearMap(m)
}
if sink == nil {
b.Fatal("Benchmark did not run")
}
sink = (interface{})(nil)
}
Commit the result
Having successfully added Bencher to our repository in two clicks, we mailed the first change. That made Bencher spawn the following:
Shoot-out
Add map deletion in a loop
If we apply the map clearing idiom, with NOTHING else except iterating with keys and deleting, like this:
---
maps/map-deletion/bench_test.go | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/maps/map-deletion/bench_test.go b/maps/map-deletion/bench_test.go
index 41139e0..630903a 100644
--- a/maps/map-deletion/bench_test.go
+++ b/maps/map-deletion/bench_test.go
@@ -4,10 +4,15 @@ import "testing"
func returnValuesAndClearMap(m map[int]int) int {
i := 0
- for key, value := range m {
- delete(m, key)
+ // 1. Just get the values.
+ for _, value := range m {
i += value
}
+
+ // 2. Use the map clearing idiom solely now.
+ for key := range m {
+ delete(m, key)
+ }
return i
}
Results
We get the following results per https://dashboard.github.orijtech.com/benchmark/88710d3d52044b8dafbba173b27f28e1
Verdict
For the best performance, please use the map clearing idiom solely
for key := range m {
delete(m, key)
}
This option is the fastest as the compiler recognizes it for fast deletion and it was requested for improvement by Josh Bleecher Synder per golang/issues/#20138 and implemented by Martin Möhrmann in CL 110055 for Go1.11 in April 2018. We’ve actually used it to speed up customers' code per cosmos-sdk/pr#8719
References
Resource | URL |
---|---|
delete builtin: Go standard library | https://golang.org/pkg/builtin#delete |
Bencher website | https://bencher.orijtech.com/ |
Bencher on the Github marketplace | https://github.com/marketplace/gobencher/ |
Go | |
Cosmos-SDK store/cachekv, x/bank/types: algorithmically fix pathologically slow code | https://github.com/cosmos/cosmos-sdk/pull/8719 |
cmd/compile: recognize map-clearing range idiom by Josh Bleecher Snyder | https://github.com/golang/go/issues/20138 |
cmd/compile: optimize map-clearing range idiom by Martin Möhrmann | https://go-review.googlesource.com/c/go/+/110055/ |