Go Profiling

Useful information about Go profiling.

Running the profiler in Go

There are several ways to run the profiler. Each of them produces a pprof file, which the go tool pprof command can render in interactive or web mode (other output formats are also available). Other tools can render profile files as well, for example the GoLand IDE.

//useful example
go tool pprof -http :8080 file.pprof

In Code

The standard library provides runtime/pprof. There is also a third-party package named profile (github.com/pkg/profile), which is easier to use.

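import "github.com/pkg/profile"

func main() {
    // pkg/profile allows only one active profile per run;
    // a second profile.Start call aborts the program.
    // Creates a pprof file for the CPU profile.
    defer profile.Start(profile.CPUProfile).Stop()
    // For a memory profile, use this line instead:
    // defer profile.Start(profile.MemProfile).Stop()
}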

Other examples are available in the documentation: https://pkg.go.dev/github.com/pkg/profile

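If it is confusing that both Start and Stop appear in a single defer, remember that in a deferred call chain only the last call (Stop) is postponed until the surrounding function returns. The earlier calls in the chain (Start) are evaluated immediately, when the defer statement executes.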
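The evaluation order can be checked with a small stand-alone sketch. The names start, session and deferOrder are hypothetical and only mimic the shape of profile.Start(...).Stop(); they are not part of pkg/profile.

```go
package main

import "fmt"

// session is a hypothetical stand-in for the value returned by profile.Start.
type session struct{ events *[]string }

// start records that it ran and returns a session, like profile.Start does.
func start(events *[]string) *session {
	*events = append(*events, "start")
	return &session{events: events}
}

// Stop records that it ran, like the Stop method of a running profile.
func (s *session) Stop() {
	*s.events = append(*s.events, "stop")
}

// deferOrder returns the order in which start, the function body and Stop ran.
func deferOrder() []string {
	var events []string
	func() {
		// start(...) is evaluated right here; only Stop is postponed.
		defer start(&events).Stop()
		events = append(events, "work")
	}()
	return events
}

func main() {
	fmt.Println(deferOrder()) // prints [start work stop]
}
```

This is why a single defer line at the top of main is enough to wrap the whole program in a profile.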

Benchmarks

Once benchmarks are written, we can run them with profilers enabled.

go test -cpuprofile cpu.pprof -memprofile mem.pprof -bench .

Other profiling flags, such as -blockprofile and -mutexprofile, are described in the Go documentation.
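For readers who do not have a benchmark yet, here is a minimal sketch of what the command above would profile. joinWords and BenchmarkJoinWords are hypothetical names. In a real project the benchmark function lives in a file ending in _test.go and is run by go test; the main function here only exists so that the sketch runs on its own through testing.Benchmark.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// joinWords is a hypothetical function under test.
func joinWords(words []string) string {
	var sb strings.Builder
	for _, w := range words {
		sb.WriteString(w)
	}
	return sb.String()
}

// BenchmarkJoinWords has the signature that `go test -bench` looks for.
func BenchmarkJoinWords(b *testing.B) {
	words := []string{"go", "tool", "pprof"}
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		joinWords(words)
	}
}

// benchmarkIterations runs the benchmark outside of `go test`
// and returns how many iterations the testing package settled on.
func benchmarkIterations() int {
	return testing.Benchmark(BenchmarkJoinWords).N
}

func main() {
	fmt.Println("iterations:", benchmarkIterations())
}
```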

Via HTTP

This is the most common way to capture profiles from whole running programs, including those in production.

import _ "net/http/pprof"

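Or register the handlers on your own mux:

// needs imports: "log", "net/http", "net/http/pprof"
m := http.NewServeMux()
m.HandleFunc("/debug/pprof/", pprof.Index)
m.HandleFunc("/debug/pprof/profile", pprof.Profile)
// the listen address below is an assumption, pick your own
srv := &http.Server{Addr: "localhost:6060", Handler: m}
log.Fatal(srv.ListenAndServe())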

Other handlers or profiler implementations can be registered as well. These endpoints only serve pprof data, which should be viewed with go tool pprof. The URL can be passed to the tool directly: go tool pprof <url>
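To check that a custom mux really serves profiles, the handlers can be exercised in-process. The sketch below is an assumption-laden test setup, not production code: it uses net/http/httptest instead of a real listener, and newPprofMux and fetchHeapProfile are hypothetical helper names.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
	"net/http/httptest"
	"net/http/pprof"
)

// newPprofMux registers the same handlers as the snippet above.
func newPprofMux() *http.ServeMux {
	m := http.NewServeMux()
	m.HandleFunc("/debug/pprof/", pprof.Index)
	m.HandleFunc("/debug/pprof/profile", pprof.Profile)
	return m
}

// fetchHeapProfile starts a temporary test server and downloads the heap
// profile, which pprof.Index serves for the /debug/pprof/heap path.
func fetchHeapProfile() (status int, size int, err error) {
	srv := httptest.NewServer(newPprofMux())
	defer srv.Close()

	resp, err := http.Get(srv.URL + "/debug/pprof/heap")
	if err != nil {
		return 0, 0, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, 0, err
	}
	return resp.StatusCode, len(body), nil
}

func main() {
	status, size, err := fetchHeapProfile()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("status=%d, profile size=%d bytes\n", status, size)
}
```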

Rendering tips

Use the pprof tool directly on a profile served over HTTP:

go tool pprof -http :8080 http://<address>/debug/pprof/heap

Available profilers

Heap

Shows only memory blocks allocated on the heap, not memory allocated on the stack or through custom mmap calls. By default, Go records one sample for roughly every 512 KB of memory allocated on the heap. The rate can be changed through runtime.MemProfileRate. The heap and allocs profiles contain 4 sample value types:

  • alloc_space - total number of bytes allocated on the heap, by location, since the start of the program, including memory already freed by the garbage collector.
  • alloc_objects - total number of memory blocks (objects) allocated since the start of the program, regardless of their size.
  • inuse_space - bytes currently allocated on the heap.
  • inuse_objects - number of memory blocks (objects) currently allocated on the heap.

CPU

The CPU profiler does not return a profile immediately. It must be explicitly started and stopped, and only then is the profile available for diagnosis. Currently the sampling rate is a constant 100 Hz. Sample values:

  • samples - number of samples observed at the location.
  • cpu - CPU time spent at the location.
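With the standard library alone, the start and stop steps look roughly like the sketch below. busyWork and writeCPUProfile are hypothetical names, and the cpu.pprof file name is an assumption.

```go
package main

import (
	"log"
	"os"
	"runtime/pprof"
)

// busyWork is a hypothetical workload that burns some CPU time
// so that the profiler has something to sample.
func busyWork() int {
	sum := 0
	for i := 0; i < 50_000_000; i++ {
		sum += i % 7
	}
	return sum
}

// writeCPUProfile profiles busyWork and writes the result to path.
func writeCPUProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	// Sampling begins here and ends in StopCPUProfile, which also
	// flushes the profile to the file before it is closed.
	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	defer pprof.StopCPUProfile()

	busyWork()
	return nil
}

func main() {
	if err := writeCPUProfile("cpu.pprof"); err != nil {
		log.Fatal(err)
	}
}
```

The resulting file can be opened with go tool pprof -http :8080 cpu.pprof.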

Goroutines

The goroutine profile view contains functions that belong to the runtime's goroutine internals:

  • runtime.gopark - parks a goroutine while it waits for one of the events listed below (e.g. I/O, channel communication).
  • runtime.chanrecv - appears when a goroutine waits for a new value from a channel.
  • runtime.chansend - appears when a goroutine waits to send a value to a channel.
  • runtime.selectgo - appears when a goroutine is waiting on or checking the cases of a select statement.
  • runtime.netpollblock - appears when a goroutine waits for network I/O.
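A parked goroutine can be observed without any tooling by dumping the goroutine profile as text. In the sketch below, dumpBlockedGoroutine is a hypothetical helper. It starts one goroutine that blocks on a channel receive and uses debug level 2 of the goroutine profile, which prints stacks in the same form as a panic trace, including the wait reason (for example "chan receive").

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"strings"
	"time"
)

// dumpBlockedGoroutine starts a goroutine that waits on a channel, then
// returns a text dump of all goroutines and whether a goroutine in the
// "chan receive" state was seen.
func dumpBlockedGoroutine() (dump string, found bool) {
	ch := make(chan int)
	done := make(chan struct{})
	go func() {
		defer close(done)
		<-ch // the goroutine parks here until a value arrives
	}()

	// Poll for up to about two seconds until the goroutine has parked.
	for i := 0; i < 200 && !found; i++ {
		var buf bytes.Buffer
		// Writing to a bytes.Buffer does not fail, so the error is ignored.
		_ = pprof.Lookup("goroutine").WriteTo(&buf, 2)
		dump = buf.String()
		found = strings.Contains(dump, "chan receive")
		if !found {
			time.Sleep(10 * time.Millisecond)
		}
	}

	ch <- 1 // unblock the goroutine so it can exit
	<-done
	return dump, found
}

func main() {
	dump, found := dumpBlockedGoroutine()
	fmt.Println("saw a goroutine blocked on channel receive:", found)
	fmt.Println(dump)
}
```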

Off-CPU Time

This profile is not available in the native profiler, but external profilers such as fgprof provide it. It shows the time during which the program is not using the CPU, e.g. while waiting on disk or network I/O, an external device, or a syscall.

Additional info

  • The heap profile doesn't show which variables are connected to specific memory, so external tools like viewcore (CockroachDB) might be needed.
  • When compiling a binary, we can strip debug and source code information to make the binary much smaller (check the DWARF table and ldflags such as -s and -w).
  • A profile is only an estimate; real allocations might be larger or smaller.
  • Profiles can be combined: go tool pprof can subtract or diff one profile against another (-base, -diff_base) and merge several profiles.
  • The pprof -lines option switches the granularity from functions to source lines for more detailed data.

Continuous profiling

If you need to take profiling a step further, look into continuous profiling. A few tools provide it; one of them is Pyroscope, which supports pull and push modes as well as eBPF. Examples: https://github.com/grafana/pyroscope/tree/main/examples

Solutions
