
benchmark jenkins job? #8157

Closed

Description

@AndreasMadsen
Member
  • Version: master
  • Platform: NA
  • Subsystem: benchmark

A problem with our current benchmark system is that it takes a long time to compare two node versions. While many of the benchmarks likely use too many iterations (and thus time), that is not the case for all of them. See for example #8139: http/simple.js takes 4 hours per node version, and it appears that is necessary in this case.

I personally don't own an extra computer to run benchmarks on, so I think it would be nice if we could create a Jenkins job on ci.nodejs.org that would compare two node versions on a benchmark subset like http/simple.js.

I don't know how the CI setup works, but it would have to be a machine (virtual or physical) that has a constant performance. Otherwise we would get false positive results like we have seen here: #7425 (comment)

I don't think we need a machine for every platform, just those where we prioritize performance (Windows and Linux?).

Activity

Labels added on Aug 18, 2016: meta (Issues and PRs related to the general management of the project), benchmark (Issues and PRs related to the benchmark subsystem).
@AndreasMadsen (Member, Author) commented on Aug 18, 2016

/cc @nodejs/benchmarking @nodejs/future-ci

@jbergstroem (Member) commented on Aug 18, 2016

@AndreasMadsen we have a dedicated bare-bones machine to run benchmarks. The results are published here: https://benchmarking.nodejs.org

The benchmark group could probably look into adding more tests.

@gareth-ellis (Member) commented on Aug 18, 2016

I'm looking into a way of collecting regular results from the benchmarks directory. Due to the sheer volume of data, we need to find a way of summarizing the scores while still exposing the individual numbers.

The other problem to overcome is deciding how we would run the benchmarks. We need a way of changing only one thing (we're interested in the performance of Node across releases, so that is what should change); all other components (the machine, the tests run, etc.) should stay the same.

The benchmarks directory is under constant change, so we need to ensure that we either stick with a particular (back-level) version of the benchmarks and reuse it, or run the latest benchmarks and include an older version of node to compare against.

This could cause us issues where tests have been changed to exercise new functionality that is not backwards compatible.

Some comments on this would be good, so we can come to a decision and get the benchmarks running.

@AndreasMadsen (Member, Author) commented on Aug 18, 2016

@jbergstroem @gareth-ellis I don't want continuous observation of performance (though that is important); I want the ability to say "please compare master and this PR using this benchmark".

Changes in benchmarks shouldn't be an issue; we could just consistently use the benchmarks from master.

Essentially I want it to run:

git checkout master

# build master
./configure
make
mv ./node ./node-master

# build pr
apply-pr $PR_ID
./configure
make
mv ./node ./node-pr

# run benchmark
./node benchmark/compare.js --old ./node-master --new ./node-pr --filter $JOB_FILTER --runs $JOB_RUNS -- $JOB_CATEGORY | tee output.csv
cat output.csv | Rscript benchmark/compare.R

edit: updated bash job to be syntactically correct

@jbergstroem (Member) commented on Aug 18, 2016

@AndreasMadsen I was just saying that the benchmark group is coordinating access/time spent on our only dedicated hardware box which is most suitable for reliable results. I guess I should've been more clear about what I meant with "adding tests", as in, exposing a job or similar that we could run from Jenkins (albeit with limited access).

@mhdawson (Member) commented on Aug 22, 2016

@AndreasMadsen the other consideration is that if we run on the regular machines, tying up one of the machines for 4+ hours could affect the progress of the regular builds, which are much faster.

As @gareth-ellis mentioned consolidating the data down to something consumable is the challenge for regular runs.

When you say you want to run a particular benchmark for a particular PR: is there a key subset of the benchmarks that you find yourself wanting to run, as opposed to all of them? It might be easier to tackle being able to run those first.

@AndreasMadsen (Member, Author) commented on Aug 23, 2016

> @AndreasMadsen the other consideration is that if we run on the regular machines, tying up one of the machines for 4+ hours could affect the progress of the regular builds, which are much faster.

I understand, but it is the same reason why we don't want to do it on our own machines.

> As @gareth-ellis mentioned, consolidating the data down to something consumable is the challenge for regular runs.

In this case we have Rscript ./node benchmark/compare.R, which does that. The job should just output the raw CSV and the statistical summary.

> When you say you want to run a particular benchmark for a particular PR: is there a key subset of the benchmarks that you find yourself wanting to run, as opposed to all of them? It might be easier to tackle being able to run those first.

Yes, it is a subset, but the subset depends on the PR, so I think these should be configurable parameters in the job. Please see the bash job I outlined above for how this is typically accomplished.
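As a rough sketch of what "configurable parameters" could mean here (the JOB_* names are hypothetical, echoed rather than executed; compare.js's --old/--new/--filter/--runs flags come from the job outlined earlier in the thread):

```shell
# Hypothetical job parameters, as they might arrive from Jenkins:
JOB_CATEGORY=http
JOB_FILTER=simple
JOB_RUNS=30

# Build the argument list, passing --filter only when one was supplied.
ARGS="--runs $JOB_RUNS"
if [ -n "$JOB_FILTER" ]; then
  ARGS="$ARGS --filter $JOB_FILTER"
fi

# Echo rather than execute: the real job would run the compare step here.
echo ./node benchmark/compare.js --old ./node-master --new ./node-pr $ARGS -- "$JOB_CATEGORY"
```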

@gareth-ellis (Member) commented on Aug 23, 2016

@AndreasMadsen , do you need to have ./node in the Rscript line?

Rscript ./node benchmark/compare.R

When I run with that, I get
Error: unexpected input in ""
Execution halted

However, removing ./node seems to generate some numbers.

I've prototyped something in my fork of benchmarking:

https://github.com/gareth-ellis/benchmarking/blob/master/experimental/benchmarks/community-benchmark/run.sh

I'll open a PR shortly to get some comments.

For a CI run, we'd need a job where we can supply:

  • BRANCH (mandatory)
  • PULL_ID (mandatory)
  • CATEGORY (mandatory)
  • RUNS (optional)
  • FILTER (optional)

We will need to be careful, if we do implement this, that the machine is still able to continue the regular regression runs. That said, expanding those runs to include at least a subset of the benchmarks from https://github.com/nodejs/node would be beneficial.
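The parameter list above could be handled along these lines (a sketch only; all variable handling is an assumption, and the mandatory parameters are given placeholder defaults here just so the snippet is self-contained):

```shell
# Sketch of the parameter validation/defaults such a Jenkins job might do.
BRANCH=${BRANCH:-master}     # mandatory in the real job
PULL_ID=${PULL_ID:-12345}    # mandatory; hypothetical PR number
CATEGORY=${CATEGORY:-http}   # mandatory
RUNS=${RUNS:-30}             # optional; 30 matches the compare.js default mentioned below
FILTER=${FILTER:-}           # optional; empty means "run everything in CATEGORY"

echo "branch=$BRANCH pr=$PULL_ID category=$CATEGORY runs=$RUNS filter=$FILTER"
```

In a real job the mandatory parameters would use the `${VAR:?message}` form instead, so the build fails fast when they are missing.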

@AndreasMadsen (Member, Author) commented on Aug 23, 2016

No, sorry about that. It is just Rscript benchmark/compare.R.

@AndreasMadsen (Member, Author) commented on Aug 23, 2016

@gareth-ellis also, the default --runs is 30, not 5. I think a different default will be confusing. Perhaps you should just handle it the way you implemented --filter.

@Trott (Member) commented on Jul 10, 2017

@nodejs/build @nodejs/benchmarking Is this something that might realistically happen? Is there consensus that it would be desirable? Any clear prerequisites (like more CI resources)? Is there a path forward on this?

[14 remaining items omitted]



Participants: @apapirovski, @mcollina, @refack, @jbergstroem, @AndreasMadsen


          benchmark jenkins job? · Issue #8157 · nodejs/node