Golang Series: The pprof Performance Tuning Tool

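Introduction

pprof is a tool for visualizing and analyzing profiling data. It can produce flame graphs, call-stack graphs, memory analysis graphs, and more.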

Usage

Configuring data export

  • Import the dependencies
"log"
"net/http"
_ "net/http/pprof" // registers the /debug/pprof/ handlers
"runtime"
  • Start a goroutine
go func() {
	runtime.SetBlockProfileRate(1)     // enable tracking of blocking operations (block profile)
	runtime.SetMutexProfileFraction(1) // enable tracking of mutex contention (mutex profile)
	log.Println(http.ListenAndServe(":6060", nil))
}()
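
To show where the two fragments above sit in a complete program, here is a minimal sketch. The helper names servePprof and indexStatus are made up for this illustration, and binding the listener before serving is a choice made here so that the self-check cannot race the server start. Rate and fraction values of 1 record every event, which is convenient for a demo but adds overhead.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"net/http"
	_ "net/http/pprof" // side effect: registers /debug/pprof/* on http.DefaultServeMux
	"runtime"
)

// servePprof (hypothetical helper) turns on block and mutex profiling,
// binds addr, and serves the pprof handlers in a background goroutine.
// It returns the bound address so callers can pass port 0.
func servePprof(addr string) (string, error) {
	runtime.SetBlockProfileRate(1)     // record every blocking event
	runtime.SetMutexProfileFraction(1) // record every mutex contention event
	ln, err := net.Listen("tcp", addr)
	if err != nil {
		return "", err
	}
	go func() {
		log.Println(http.Serve(ln, nil)) // nil handler means http.DefaultServeMux
	}()
	return ln.Addr().String(), nil
}

// indexStatus (hypothetical helper) fetches the pprof index page and
// returns its HTTP status code.
func indexStatus(addr string) (int, error) {
	resp, err := http.Get("http://" + addr + "/debug/pprof/")
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()
	return resp.StatusCode, nil
}

func main() {
	addr, err := servePprof("127.0.0.1:6060")
	if err != nil {
		log.Fatal(err)
	}
	code, err := indexStatus(addr)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("pprof index status:", code)
	// A real service would keep running here instead of exiting.
}
```

In a long-running service only servePprof would be needed; the status check is there so the sketch terminates on its own.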

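Exporting data

With the configuration above in place, the data can be accessed in two ways.

1. Open http://127.0.0.1:6060/debug/pprof/ in a browser

The data can be viewed on the web page, though it is not very readable in this form.

Profile details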

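Profile | Description | Details
allocs | Memory allocation | A sampling of all memory allocated since the program started
block | Blocking | Stack traces that led to blocking on synchronization primitives
cmdline | Command line | The command-line invocation of the current program
goroutine | Goroutines | Stack traces of all current goroutines
heap | Heap | A sampling of memory allocations of live objects. You can pass the gc parameter to run a GC before the heap sample is taken
mutex | Mutex | Stack traces of holders of contended mutexes
profile | CPU profile | CPU usage profile. The duration can be set with the seconds parameter in the URL (default 30s). After fetching the profile file, analyze it with the go tool pprof command
threadcreate | Thread creation | Stack traces that led to the creation of new OS threads
trace | Trace | An execution trace of the current program. The duration can be set with the seconds parameter in the URL (default 1s). After fetching the trace file, inspect it with the go tool trace command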
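
Six of the rows above (allocs, block, goroutine, heap, mutex, threadcreate) are named profiles kept by the runtime/pprof package, while cmdline, profile, and trace are served by dedicated handlers in net/http/pprof. As a rough illustration, the following sketch lists the named profiles from inside a program; profileNames is a helper name invented for this example.

```go
package main

import (
	"fmt"
	"runtime/pprof"
)

// profileNames (hypothetical helper) returns the names of all profiles
// currently known to runtime/pprof.
func profileNames() []string {
	var names []string
	for _, p := range pprof.Profiles() {
		names = append(names, p.Name())
	}
	return names
}

func main() {
	for _, name := range profileNames() {
		// Count reports how many execution stacks the profile currently holds.
		fmt.Printf("%-12s %d\n", name, pprof.Lookup(name).Count())
	}
}
```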

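2. Work interactively from the command line (requires Graphviz)

If a dependency fails to install along the way, install that dependency on its own first.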

brew install graphviz

For the CPU profile, data collection lasts 30s by default.

go tool pprof http://127.0.0.1:6060/debug/pprof/xxx

Example

go tool pprof http://127.0.0.1:6060/debug/pprof/allocs

Once inside the interactive shell, type help to see which operations are available.

(pprof) help
Commands:
callgrind Outputs a graph in callgrind format
comments Output all profile comments
disasm Output assembly listings annotated with samples
dot Outputs a graph in DOT format
eog Visualize graph through eog
evince Visualize graph through evince
gif Outputs a graph image in GIF format
gv Visualize graph through gv
kcachegrind Visualize report in KCachegrind
list Output annotated source for functions matching regexp
pdf Outputs a graph in PDF format
peek Output callers/callees of functions matching regexp
png Outputs a graph image in PNG format
proto Outputs the profile in compressed protobuf format
ps Outputs a graph in PS format
raw Outputs a text representation of the raw profile
svg Outputs a graph in SVG format
tags Outputs all tags in the profile
text Outputs top entries in text form
top Outputs top entries in text form
topproto Outputs top entries in compressed protobuf format
traces Outputs all profile samples in text form
tree Outputs a text rendering of call graph
web Visualize graph through web browser
weblist Display annotated source in a web browser
o/options List options and their current values
q/quit/exit/^D Exit pprof

Options:
call_tree Create a context-sensitive call tree
compact_labels Show minimal headers
divide_by Ratio to divide all samples before visualization
drop_negative Ignore negative differences
edgefraction Hide edges below <f>*total
focus Restricts to samples going through a node matching regexp
hide Skips nodes matching regexp
ignore Skips paths going through any nodes matching regexp
intel_syntax Show assembly in Intel syntax
mean Average sample value over first value (count)
nodecount Max number of nodes to show
nodefraction Hide nodes below <f>*total
noinlines Ignore inlines.
normalize Scales profile based on the base profile.
output Output filename for file-based outputs
prune_from Drops any functions below the matched frame.
relative_percentages Show percentages relative to focused subgraph
sample_index Sample value to report (0-based index or name)
show Only show nodes matching regexp
show_from Drops functions above the highest matched frame.
source_path Search path for source files
tagfocus Restricts to samples with tags in range or matched by regexp
taghide Skip tags matching this regexp
tagignore Discard samples with tags in range or matched by regexp
tagleaf Adds pseudo stack frames for labels key/value pairs at the callstack leaf.
tagroot Adds pseudo stack frames for labels key/value pairs at the callstack root.
tagshow Only consider tags matching this regexp
trim Honor nodefraction/edgefraction/nodecount defaults
trim_path Path to trim from source paths before search
unit Measurement units to display

Option groups (only set one per group):
granularity
functions Aggregate at the function level.
filefunctions Aggregate at the function level.
files Aggregate at the file level.
lines Aggregate at the source code line level.
addresses Aggregate at the address level.
sort
cum Sort entries based on cumulative weight
flat Sort entries based on own weight
: Clear focus/ignore/hide/tagfocus/tagignore

type "help <cmd|option>" for more information

Profiles in detail

allocs

Memory allocation: a sampling of all memory allocated since the program started.

1. Enter the interactive console

go tool pprof http://127.0.0.1:6060/debug/pprof/allocs

2. Check the help output first

Type: alloc_space
Time: Dec 18, 2022 at 2:01pm (CST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) help
[Output identical to the help listing shown earlier]

3. Export the graph as a GIF

gif
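
The allocs view only becomes interesting once the program has allocated something. The sketch below is one way to produce such data without an HTTP server: it allocates on purpose and writes the allocs profile through runtime/pprof. The names allocate, allocsProfile, and the output file allocs.pb.gz are assumptions made for this example.

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"os"
	"runtime/pprof"
)

// sink keeps the slices reachable so the allocations are not optimized away.
var sink [][]byte

// allocate (hypothetical helper) performs n heap allocations of size bytes each.
func allocate(n, size int) {
	for i := 0; i < n; i++ {
		sink = append(sink, make([]byte, size))
	}
}

// allocsProfile (hypothetical helper) returns the allocs profile in the
// gzip-compressed protobuf format that go tool pprof reads (debug=0).
func allocsProfile() ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("allocs").WriteTo(&buf, 0); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	allocate(1000, 64<<10)
	data, err := allocsProfile()
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("allocs.pb.gz", data, 0o644); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("wrote %d bytes to allocs.pb.gz\n", len(data))
}
```

The resulting file can be opened with go tool pprof allocs.pb.gz, after which the same interactive commands (top, list, gif, and so on) apply.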

block

Blocking: stack traces that led to blocking on synchronization primitives.

1. Serve the data as a web page through pprof's HTTP server

go tool pprof -http=:8000 http://127.0.0.1:6060/debug/pprof/block

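2. Open the web UI. The drop-down menu switches between views of the data: top (usage), graph, flame graph, peek, and source.

3. Select Top from the drop-down menu to see usage

4. View the flame graph

5. Use Peek to see where time is spent

6. Analyze alongside the source code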
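
The block views stay empty unless the program actually blocks while block profiling is enabled. As a small, hedged illustration of what gets recorded, the sketch below blocks once on a channel receive and then asks runtime/pprof how many stacks the block profile holds. The helper names blockOnce and blockEvents are invented for this example.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"time"
)

// blockOnce (hypothetical helper) blocks the calling goroutine on a
// channel receive for roughly d.
func blockOnce(d time.Duration) {
	done := make(chan struct{})
	go func() {
		time.Sleep(d)
		close(done)
	}()
	<-done // this wait is the blocking event the profile records
}

// blockEvents (hypothetical helper) enables block profiling, blocks once,
// and returns the number of stacks now present in the block profile.
func blockEvents() int {
	runtime.SetBlockProfileRate(1) // record every blocking event
	blockOnce(20 * time.Millisecond)
	return pprof.Lookup("block").Count()
}

func main() {
	fmt.Println("stacks in block profile:", blockEvents())
}
```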

cmdline

Command line: the command-line invocation of the current program.

goroutine

Goroutines: stack traces of all current goroutines.

heap

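Heap: a sampling of memory allocations of live objects. You can pass the gc parameter to run a GC before the heap sample is taken.

1. Open the web visualization UI in the same way as above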

go tool pprof -http=:8000 http://127.0.0.1:6060/debug/pprof/heap

2. It offers the same set of views to choose from

mutex

Mutex: stack traces of holders of contended mutexes.
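
The mutex profile only records an event when one goroutine has to wait for a lock held by another. The following sketch provokes a single contention under that assumption and then reads the profile's stack count; contendOnce and mutexEvents are names invented for this illustration.

```go
package main

import (
	"fmt"
	"runtime"
	"runtime/pprof"
	"sync"
	"time"
)

// contendOnce (hypothetical helper) holds a mutex while a second goroutine
// waits for it, producing one contention event.
func contendOnce(hold time.Duration) {
	var mu sync.Mutex
	var wg sync.WaitGroup
	mu.Lock()
	wg.Add(1)
	go func() {
		defer wg.Done()
		mu.Lock() // waits until the holder unlocks
		mu.Unlock()
	}()
	time.Sleep(hold) // give the waiter time to queue up on the mutex
	mu.Unlock()      // the contention is attributed to this holder's stack
	wg.Wait()
}

// mutexEvents (hypothetical helper) enables mutex profiling, provokes one
// contention, and returns the number of stacks in the mutex profile.
func mutexEvents() int {
	runtime.SetMutexProfileFraction(1) // report every contention event
	contendOnce(50 * time.Millisecond)
	return pprof.Lookup("mutex").Count()
}

func main() {
	fmt.Println("stacks in mutex profile:", mutexEvents())
}
```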

profile

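CPU profile: CPU usage profile. The duration can be set with the seconds parameter in the URL (default 30s). After fetching the profile file, analyze it with the go tool pprof command.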
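
To make the seconds parameter concrete, here is a sketch that downloads a one-second CPU profile over HTTP and saves it to disk. The helpers startLocal and fetch, and the file name cpu.pb.gz, are assumptions for this example; startLocal merely stands in for a service that already exposes /debug/pprof/.

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net"
	"net/http"
	_ "net/http/pprof" // registers the /debug/pprof/* handlers
	"os"
)

// startLocal (hypothetical helper) serves the pprof handlers on a free
// local port and returns the base URL.
func startLocal() (string, error) {
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return "", err
	}
	go http.Serve(ln, nil)
	return "http://" + ln.Addr().String(), nil
}

// fetch (hypothetical helper) downloads /debug/pprof/<name>?seconds=<n>
// from base and returns the raw response body.
func fetch(base, name string, seconds int) ([]byte, error) {
	url := fmt.Sprintf("%s/debug/pprof/%s?seconds=%d", base, name, seconds)
	resp, err := http.Get(url)
	if err != nil {
		return nil, err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return nil, fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	return io.ReadAll(resp.Body)
}

func main() {
	base, err := startLocal()
	if err != nil {
		log.Fatal(err)
	}
	data, err := fetch(base, "profile", 1) // blocks for about one second
	if err != nil {
		log.Fatal(err)
	}
	if err := os.WriteFile("cpu.pb.gz", data, 0o644); err != nil {
		log.Fatal(err)
	}
	fmt.Printf("wrote %d bytes to cpu.pb.gz\n", len(data))
}
```

The saved file can be opened with go tool pprof cpu.pb.gz. Calling the same helper with the name "trace" would download an execution trace instead, which is meant for go tool trace rather than go tool pprof.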

threadcreate

Thread creation: stack traces that led to the creation of new OS threads.

trace

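Trace: an execution trace of the current program. The duration can be set with the seconds parameter in the URL (default 1s). After fetching the trace file, inspect it with the go tool trace command.

Help reference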

go tool pprof
usage:

Produce output in the specified format.

pprof <format> [options] [binary] <source> ...

Omit the format to get an interactive shell whose commands can be used
to generate various views of a profile

pprof [options] [binary] <source> ...

Omit the format and provide the "-http" flag to get an interactive web
interface at the specified host:port that can be used to navigate through
various views of a profile.

pprof -http [host]:[port] [options] [binary] <source> ...

Details:
Output formats (select at most one):
-callgrind Outputs a graph in callgrind format
-comments Output all profile comments
-disasm Output assembly listings annotated with samples
-dot Outputs a graph in DOT format
-eog Visualize graph through eog
-evince Visualize graph through evince
-gif Outputs a graph image in GIF format
-gv Visualize graph through gv
-kcachegrind Visualize report in KCachegrind
-list Output annotated source for functions matching regexp
-pdf Outputs a graph in PDF format
-peek Output callers/callees of functions matching regexp
-png Outputs a graph image in PNG format
-proto Outputs the profile in compressed protobuf format
-ps Outputs a graph in PS format
-raw Outputs a text representation of the raw profile
-svg Outputs a graph in SVG format
-tags Outputs all tags in the profile
-text Outputs top entries in text form
-top Outputs top entries in text form
-topproto Outputs top entries in compressed protobuf format
-traces Outputs all profile samples in text form
-tree Outputs a text rendering of call graph
-web Visualize graph through web browser
-weblist Display annotated source in a web browser

Options:
-call_tree Create a context-sensitive call tree
-compact_labels Show minimal headers
-divide_by Ratio to divide all samples before visualization
-drop_negative Ignore negative differences
-edgefraction Hide edges below <f>*total
-focus Restricts to samples going through a node matching regexp
-hide Skips nodes matching regexp
-ignore Skips paths going through any nodes matching regexp
-intel_syntax Show assembly in Intel syntax
-mean Average sample value over first value (count)
-nodecount Max number of nodes to show
-nodefraction Hide nodes below <f>*total
-noinlines Ignore inlines.
-normalize Scales profile based on the base profile.
-output Output filename for file-based outputs
-prune_from Drops any functions below the matched frame.
-relative_percentages Show percentages relative to focused subgraph
-sample_index Sample value to report (0-based index or name)
-show Only show nodes matching regexp
-show_from Drops functions above the highest matched frame.
-source_path Search path for source files
-tagfocus Restricts to samples with tags in range or matched by regexp
-taghide Skip tags matching this regexp
-tagignore Discard samples with tags in range or matched by regexp
-tagleaf Adds pseudo stack frames for labels key/value pairs at the callstack leaf.
-tagroot Adds pseudo stack frames for labels key/value pairs at the callstack root.
-tagshow Only consider tags matching this regexp
-trim Honor nodefraction/edgefraction/nodecount defaults
-trim_path Path to trim from source paths before search
-unit Measurement units to display

Option groups (only set one per group):
granularity
-functions Aggregate at the function level.
-filefunctions Aggregate at the function level.
-files Aggregate at the file level.
-lines Aggregate at the source code line level.
-addresses Aggregate at the address level.
sort
-cum Sort entries based on cumulative weight
-flat Sort entries based on own weight

Source options:
-seconds Duration for time-based profile collection
-timeout Timeout in seconds for profile collection
-buildid Override build id for main binary
-add_comment Free-form annotation to add to the profile
Displayed on some reports or with pprof -comments
-diff_base source Source of base profile for comparison
-base source Source of base profile for profile subtraction
profile.pb.gz Profile in compressed protobuf format
legacy_profile Profile in legacy pprof format
http://host/profile URL for profile handler to retrieve
-symbolize= Controls source of symbol information
none Do not attempt symbolization
local Examine only local binaries
fastlocal Only get function names from local binaries
remote Do not examine local binaries
force Force re-symbolization
Binary Local path or build id of binary for symbolization
-tls_cert TLS client certificate file for fetching profile and symbols
-tls_key TLS private key file for fetching profile and symbols
-tls_ca TLS CA certs file for fetching profile and symbols

Misc options:
-http Provide web interface at host:port.
Host is optional and 'localhost' by default.
Port is optional and a randomly available port by default.
-no_browser Skip opening a browser for the interactive web UI.
-tools Search path for object tools

Legacy convenience options:
-inuse_space Same as -sample_index=inuse_space
-inuse_objects Same as -sample_index=inuse_objects
-alloc_space Same as -sample_index=alloc_space
-alloc_objects Same as -sample_index=alloc_objects
-total_delay Same as -sample_index=delay
-contentions Same as -sample_index=contentions
-mean_delay Same as -mean -sample_index=delay

Environment Variables:
PPROF_TMPDIR Location for saved profiles (default $HOME/pprof)
PPROF_TOOLS Search path for object-level tools
PPROF_BINARY_PATH Search path for local binary files
default: $HOME/pprof/binaries
searches $name, $path, $buildid/$name, $path/$buildid
* On Windows, %USERPROFILE% is used instead of $HOME
no profile source specified

Case study

todo

References

https://github.com/google/pprof

http://www.graphviz.org/


https://mikeygithub.github.io/2022/12/15/yuque/Golang篇-pprof性能调优工具/
Author: Mikey
Published: December 15, 2022