cmd/compile: pgo causes worse performance on some benchmarks #68815

Open
@qiulaidongfeng

Go version

go version devel go1.24-f428c7b7 Sat Aug 3 05:06:40 2024 +0000 windows/amd64

Output of go env in your module/workspace:

set GO111MODULE=auto
set GOARCH=amd64
set GOBIN=
set GOCACHE=D:\file\go-build
set GOENV=C:\Users\26454\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=D:\file\gofile\pkg\mod
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=D:\file\gofile
set GOPRIVATE=
set GOPROXY=https://goproxy.cn,direct
set GOROOT=C:\Users\26454\.go\current
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLCHAIN=local
set GOTOOLDIR=C:\Users\26454\.go\current\pkg\tool\windows_amd64
set GOVCS=
set GOVERSION=devel go1.24-f428c7b7 Sat Aug 3 05:06:40 2024 +0000
set GODEBUG=
set GOTELEMETRY=on
set GOTELEMETRYDIR=C:\Users\26454\AppData\Roaming\go\telemetry
set GCCGO=gccgo
set GOAMD64=v3
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
go: stripping unprintable or unescapable characters from %"GOMOD"%
set GOMOD=D:\file\gofile\U������\u\u-language\ucom\go.mod
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=C:\Users\26454\AppData\Local\Temp\go-build1658404967=/tmp/go-build -gno-record-gcc-switches

What did you do?

Steps to reproduce:

git clone https://gitee.com/u-language/u-language.git -b pgo2024-8-9
cd u-language
Set the environment variable URoot to the current directory
cd ucom

go test -run=ssss -bench=Mode2 -count=10 -pgo off > new.txt && sleep 20 && go test -run=ssss -bench=Mode2 -count=10 > pgo.txt && benchstat new.txt pgo.txt

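What did you expect to see?

PGO does not cause performance degradation.

What did you see happen?

PGO makes three benchmarks 13% to 15% slower in sec/op (调用malloc和free, 使用选择器取地址与解引用, 使用init函数), although B/op and allocs/op improve slightly on every benchmark: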

goos: windows
goarch: amd64
pkg: gitee.com/u-language/u-language/ucom
cpu: AMD Ryzen 7 7840HS w/ Radeon 780M Graphics
                                                           │   new.txt   │               pgo.txt                │
                                                           │   sec/op    │    sec/op     vs base                │
ComplierToCBuildMode2/输出456-16                               4.831µ ± 1%   4.695µ ± 12%        ~ (p=0.143 n=10)
ComplierToCBuildMode2/for加1,1万次,输出10000-16                   4.264µ ± 1%   4.244µ ± 11%        ~ (p=0.684 n=10)
ComplierToCBuildMode2/for代码块和main代码块都有变量i,输出10000-16         4.881µ ± 0%   4.777µ ±  1%   -2.13% (p=0.000 n=10)
ComplierToCBuildMode2/输出1-10000中偶数的数量,比较"s"=="a"输出false-16   6.770µ ± 0%   6.649µ ±  1%   -1.79% (p=0.001 n=10)
ComplierToCBuildMode2/使用指针循环加一个int变量自增10000次-16              4.346µ ± 1%   4.244µ ±  1%   -2.34% (p=0.000 n=10)
ComplierToCBuildMode2/输出结构体字段,结果字符串11-16                     4.766µ ± 1%   5.032µ ±  8%        ~ (p=0.393 n=10)
ComplierToCBuildMode2/整数和浮点数互相转换-16                          2.953µ ± 1%   2.950µ ±  2%        ~ (p=0.927 n=10)
ComplierToCBuildMode2/调用malloc和free-16                       23.37µ ± 1%   26.95µ ±  1%  +15.29% (p=0.000 n=10)
ComplierToCBuildMode2/使用自操作语句-16                             6.634µ ± 3%   6.391µ ±  2%   -3.66% (p=0.000 n=10)
ComplierToCBuildMode2/使用nil-16                               5.929µ ± 3%   5.758µ ±  1%   -2.88% (p=0.000 n=10)
ComplierToCBuildMode2/使用选择器取地址与解引用-16                        25.35µ ± 1%   28.84µ ±  1%  +13.78% (p=0.000 n=10)
ComplierToCBuildMode2/使用init函数-16                            25.95µ ± 2%   29.33µ ±  1%  +13.02% (p=0.000 n=10)
ComplierToCBuildMode2/switch-16                              7.643µ ± 1%   7.513µ ±  1%   -1.70% (p=0.000 n=10)
ComplierToCBuildMode2/位与,位或,异或,逻辑运算,括号表达式-16                 7.216µ ± 2%   7.102µ ±  1%   -1.58% (p=0.000 n=10)
ComplierToCBuildMode2/数组类型与索引表达式-16                          6.754µ ± 1%   6.451µ ±  1%   -4.49% (p=0.000 n=10)
ComplierToCBuildMode2/goto语句-16                              2.148µ ± 2%   2.074µ ±  3%   -3.45% (p=0.008 n=10)
geomean                                                      6.784µ        6.857µ         +1.07%

                                                           │   new.txt    │                 pgo.txt                 │
                                                           │     B/s      │      B/s       vs base                  │
ComplierToCBuildMode2/输出456-16                               205.1Ki ± 0%   205.1Ki ± 10%        ~ (p=0.211 n=10)
ComplierToCBuildMode2/for加1,1万次,输出10000-16                   224.6Ki ± 4%   229.5Ki ± 11%        ~ (p=0.592 n=10)
ComplierToCBuildMode2/for代码块和main代码块都有变量i,输出10000-16         195.3Ki ± 5%   205.1Ki ±  0%   +5.00% (p=0.003 n=10)
ComplierToCBuildMode2/输出1-10000中偶数的数量,比较"s"=="a"输出false-16   146.5Ki ± 0%   146.5Ki ±  0%        ~ (p=1.000 n=10) ¹
ComplierToCBuildMode2/使用指针循环加一个int变量自增10000次-16              224.6Ki ± 0%   234.4Ki ±  4%   +4.35% (p=0.003 n=10)
ComplierToCBuildMode2/输出结构体字段,结果字符串11-16                     205.1Ki ± 0%   195.3Ki ± 10%        ~ (p=0.180 n=10)
ComplierToCBuildMode2/整数和浮点数互相转换-16                          332.0Ki ± 0%   332.0Ki ±  3%        ~ (p=1.000 n=10)
ComplierToCBuildMode2/调用malloc和free-16                       39.06Ki ± 0%   39.06Ki ±  0%        ~ (p=1.000 n=10) ¹
ComplierToCBuildMode2/使用自操作语句-16                             146.5Ki ± 0%   156.2Ki ±  6%   +6.67% (p=0.001 n=10)
ComplierToCBuildMode2/使用nil-16                               166.0Ki ± 6%   166.0Ki ±  0%        ~ (p=0.263 n=10)
ComplierToCBuildMode2/使用选择器取地址与解引用-16                        39.06Ki ± 0%   29.30Ki ±  0%  -25.00% (p=0.000 n=10)
ComplierToCBuildMode2/使用init函数-16                            39.06Ki ± 0%   29.30Ki ±  0%  -25.00% (p=0.000 n=10)
ComplierToCBuildMode2/switch-16                              127.0Ki ± 0%   127.0Ki ±  0%        ~ (p=1.000 n=10)
ComplierToCBuildMode2/位与,位或,异或,逻辑运算,括号表达式-16                 136.7Ki ± 0%   136.7Ki ±  0%        ~ (p=1.000 n=10) ¹
ComplierToCBuildMode2/数组类型与索引表达式-16                          146.5Ki ± 0%   151.4Ki ±  3%   +3.33% (p=0.033 n=10)
ComplierToCBuildMode2/goto语句-16                              459.0Ki ± 2%   468.8Ki ±  2%   +2.13% (p=0.015 n=10)
geomean                                                      144.0Ki        140.5Ki         -2.43%
¹ all samples are equal

                                                           │   new.txt    │               pgo.txt               │
                                                           │     B/op     │     B/op      vs base               │
ComplierToCBuildMode2/输出456-16                               5.986Ki ± 0%   5.916Ki ± 0%  -1.18% (p=0.000 n=10)
ComplierToCBuildMode2/for加1,1万次,输出10000-16                   5.673Ki ± 0%   5.603Ki ± 0%  -1.24% (p=0.000 n=10)
ComplierToCBuildMode2/for代码块和main代码块都有变量i,输出10000-16         6.135Ki ± 0%   6.064Ki ± 0%  -1.15% (p=0.000 n=10)
ComplierToCBuildMode2/输出1-10000中偶数的数量,比较"s"=="a"输出false-16   7.929Ki ± 0%   7.858Ki ± 0%  -0.89% (p=0.000 n=10)
ComplierToCBuildMode2/使用指针循环加一个int变量自增10000次-16              5.728Ki ± 0%   5.657Ki ± 0%  -1.24% (p=0.000 n=10)
ComplierToCBuildMode2/输出结构体字段,结果字符串11-16                     5.735Ki ± 0%   5.664Ki ± 0%  -1.24% (p=0.000 n=10)
ComplierToCBuildMode2/整数和浮点数互相转换-16                          3.651Ki ± 0%   3.627Ki ± 0%  -0.67% (p=0.000 n=10)
ComplierToCBuildMode2/调用malloc和free-16                       8.631Ki ± 0%   8.561Ki ± 0%  -0.81% (p=0.000 n=10)
ComplierToCBuildMode2/使用自操作语句-16                             7.153Ki ± 0%   7.082Ki ± 0%  -1.00% (p=0.000 n=10)
ComplierToCBuildMode2/使用nil-16                               6.457Ki ± 0%   6.386Ki ± 0%  -1.10% (p=0.000 n=10)
ComplierToCBuildMode2/使用选择器取地址与解引用-16                        9.689Ki ± 0%   9.618Ki ± 0%  -0.73% (p=0.000 n=10)
ComplierToCBuildMode2/使用init函数-16                            10.63Ki ± 0%   10.56Ki ± 0%  -0.67% (p=0.000 n=10)
ComplierToCBuildMode2/switch-16                              8.242Ki ± 0%   8.171Ki ± 0%  -0.86% (p=0.000 n=10)
ComplierToCBuildMode2/位与,位或,异或,逻辑运算,括号表达式-16                 7.670Ki ± 0%   7.646Ki ± 0%  -0.31% (p=0.000 n=10)
ComplierToCBuildMode2/数组类型与索引表达式-16                          7.225Ki ± 0%   7.202Ki ± 0%  -0.32% (p=0.000 n=10)
ComplierToCBuildMode2/goto语句-16                              2.836Ki ± 0%   2.812Ki ± 0%  -0.83% (p=0.000 n=10)
geomean                                                      6.515Ki        6.457Ki       -0.89%

                                                           │  new.txt   │              pgo.txt              │
                                                           │ allocs/op  │ allocs/op   vs base               │
ComplierToCBuildMode2/输出456-16                               82.00 ± 0%   80.00 ± 0%  -2.44% (p=0.000 n=10)
ComplierToCBuildMode2/for加1,1万次,输出10000-16                   70.00 ± 0%   68.00 ± 0%  -2.86% (p=0.000 n=10)
ComplierToCBuildMode2/for代码块和main代码块都有变量i,输出10000-16         78.00 ± 0%   76.00 ± 0%  -2.56% (p=0.000 n=10)
ComplierToCBuildMode2/输出1-10000中偶数的数量,比较"s"=="a"输出false-16   111.0 ± 0%   109.0 ± 0%  -1.80% (p=0.000 n=10)
ComplierToCBuildMode2/使用指针循环加一个int变量自增10000次-16              76.00 ± 0%   74.00 ± 0%  -2.63% (p=0.000 n=10)
ComplierToCBuildMode2/输出结构体字段,结果字符串11-16                     81.00 ± 0%   79.00 ± 0%  -2.47% (p=0.000 n=10)
ComplierToCBuildMode2/整数和浮点数互相转换-16                          50.00 ± 0%   49.00 ± 0%  -2.00% (p=0.000 n=10)
ComplierToCBuildMode2/调用malloc和free-16                       94.00 ± 0%   92.00 ± 0%  -2.13% (p=0.000 n=10)
ComplierToCBuildMode2/使用自操作语句-16                             110.0 ± 0%   108.0 ± 0%  -1.82% (p=0.000 n=10)
ComplierToCBuildMode2/使用nil-16                               108.0 ± 0%   106.0 ± 0%  -1.85% (p=0.000 n=10)
ComplierToCBuildMode2/使用选择器取地址与解引用-16                        127.0 ± 0%   125.0 ± 0%  -1.57% (p=0.000 n=10)
ComplierToCBuildMode2/使用init函数-16                            132.0 ± 0%   130.0 ± 0%  -1.52% (p=0.000 n=10)
ComplierToCBuildMode2/switch-16                              130.0 ± 0%   128.0 ± 0%  -1.54% (p=0.000 n=10)
ComplierToCBuildMode2/位与,位或,异或,逻辑运算,括号表达式-16                 118.0 ± 0%   117.0 ± 0%  -0.85% (p=0.000 n=10)
ComplierToCBuildMode2/数组类型与索引表达式-16                          128.0 ± 0%   127.0 ± 0%  -0.78% (p=0.000 n=10)
ComplierToCBuildMode2/goto语句-16                              39.00 ± 0%   38.00 ± 0%  -2.56% (p=0.000 n=10)
geomean                                                      90.95        89.17       -1.96%
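
The three sec/op regressions can be cross-checked from the medians printed in the first table. The following is a minimal sketch, not part of the original report: `pctChange` is a hypothetical helper, and because the table rounds medians to four significant digits, the results may differ from benchstat's own percentages in the last digit.

```go
package main

import "fmt"

// pctChange returns the relative change from base to cur, in percent.
// (Hypothetical helper for illustration; benchstat computes its deltas
// from unrounded medians.)
func pctChange(base, cur float64) float64 {
	return (cur - base) / base * 100
}

func main() {
	// Median sec/op values (in µs) copied from the first benchstat table.
	rows := []struct {
		name      string
		base, pgo float64
	}{
		{"调用malloc和free", 23.37, 26.95},
		{"使用选择器取地址与解引用", 25.35, 28.84},
		{"使用init函数", 25.95, 29.33},
	}
	for _, r := range rows {
		fmt.Printf("%s: %+.2f%%\n", r.name, pctChange(r.base, r.pgo))
	}
}
```

A positive value means the PGO build is slower for that benchmark.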

Labels: NeedsInvestigation, Performance, compiler/runtime