Track external accumulators in tracer instead of using SparkInfo values #10553
Draft
charlesmyu wants to merge 2 commits into master from
Benchmarks
Startup
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 62 metrics, 9 unstable metrics.
Startup time reports for insecure-bank
gantt
title insecure-bank - global startup overhead: candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.072 s) : 0, 1072227
Total [baseline] (8.767 s) : 0, 8767294
Agent [candidate] (1.065 s) : 0, 1065185
Total [candidate] (8.739 s) : 0, 8738592
section iast
Agent [baseline] (1.238 s) : 0, 1238379
Total [baseline] (9.345 s) : 0, 9345468
Agent [candidate] (1.239 s) : 0, 1239318
Total [candidate] (9.413 s) : 0, 9412660
gantt
title insecure-bank - break down per module: candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.194 ms) : 0, 1194
crashtracking [candidate] (1.176 ms) : 0, 1176
BytebuddyAgent [baseline] (633.325 ms) : 0, 633325
BytebuddyAgent [candidate] (629.549 ms) : 0, 629549
AgentMeter [baseline] (29.149 ms) : 0, 29149
AgentMeter [candidate] (29.023 ms) : 0, 29023
GlobalTracer [baseline] (259.509 ms) : 0, 259509
GlobalTracer [candidate] (258.397 ms) : 0, 258397
AppSec [baseline] (32.789 ms) : 0, 32789
AppSec [candidate] (32.747 ms) : 0, 32747
Debugger [baseline] (61.65 ms) : 0, 61650
Debugger [candidate] (62.096 ms) : 0, 62096
Remote Config [baseline] (623.287 µs) : 0, 623
Remote Config [candidate] (625.098 µs) : 0, 625
Telemetry [baseline] (11.495 ms) : 0, 11495
Telemetry [candidate] (10.0 ms) : 0, 10000
Flare Poller [baseline] (7.034 ms) : 0, 7034
Flare Poller [candidate] (6.161 ms) : 0, 6161
section iast
crashtracking [baseline] (1.185 ms) : 0, 1185
crashtracking [candidate] (1.183 ms) : 0, 1183
BytebuddyAgent [baseline] (800.491 ms) : 0, 800491
BytebuddyAgent [candidate] (801.705 ms) : 0, 801705
AgentMeter [baseline] (11.52 ms) : 0, 11520
AgentMeter [candidate] (11.486 ms) : 0, 11486
GlobalTracer [baseline] (249.312 ms) : 0, 249312
GlobalTracer [candidate] (249.964 ms) : 0, 249964
IAST [baseline] (27.284 ms) : 0, 27284
IAST [candidate] (27.056 ms) : 0, 27056
AppSec [baseline] (33.905 ms) : 0, 33905
AppSec [candidate] (31.558 ms) : 0, 31558
Debugger [baseline] (66.634 ms) : 0, 66634
Debugger [candidate] (68.226 ms) : 0, 68226
Remote Config [baseline] (534.68 µs) : 0, 535
Remote Config [candidate] (533.906 µs) : 0, 534
Telemetry [baseline] (8.632 ms) : 0, 8632
Telemetry [candidate] (8.734 ms) : 0, 8734
Flare Poller [baseline] (3.494 ms) : 0, 3494
Flare Poller [candidate] (3.517 ms) : 0, 3517
Startup time reports for petclinic
gantt
title petclinic - global startup overhead: candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section tracing
Agent [baseline] (1.063 s) : 0, 1062741
Total [baseline] (10.87 s) : 0, 10869512
Agent [candidate] (1.064 s) : 0, 1064036
Total [candidate] (11.021 s) : 0, 11020651
section appsec
Agent [baseline] (1.245 s) : 0, 1244876
Total [baseline] (11.027 s) : 0, 11026852
Agent [candidate] (1.246 s) : 0, 1245771
Total [candidate] (10.952 s) : 0, 10952056
section iast
Agent [baseline] (1.233 s) : 0, 1233030
Total [baseline] (11.173 s) : 0, 11173256
Agent [candidate] (1.244 s) : 0, 1243811
Total [candidate] (11.313 s) : 0, 11312928
section profiling
Agent [baseline] (1.19 s) : 0, 1189738
Total [baseline] (10.949 s) : 0, 10949215
Agent [candidate] (1.205 s) : 0, 1204733
Total [candidate] (10.998 s) : 0, 10998143
gantt
title petclinic - break down per module: candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section tracing
crashtracking [baseline] (1.175 ms) : 0, 1175
crashtracking [candidate] (1.183 ms) : 0, 1183
BytebuddyAgent [baseline] (628.023 ms) : 0, 628023
BytebuddyAgent [candidate] (627.909 ms) : 0, 627909
AgentMeter [baseline] (28.891 ms) : 0, 28891
AgentMeter [candidate] (28.904 ms) : 0, 28904
GlobalTracer [baseline] (257.202 ms) : 0, 257202
GlobalTracer [candidate] (257.553 ms) : 0, 257553
AppSec [baseline] (32.75 ms) : 0, 32750
AppSec [candidate] (32.865 ms) : 0, 32865
Debugger [baseline] (62.066 ms) : 0, 62066
Debugger [candidate] (61.957 ms) : 0, 61957
Remote Config [baseline] (627.281 µs) : 0, 627
Remote Config [candidate] (617.003 µs) : 0, 617
Telemetry [baseline] (12.232 ms) : 0, 12232
Telemetry [candidate] (12.346 ms) : 0, 12346
Flare Poller [baseline] (4.558 ms) : 0, 4558
Flare Poller [candidate] (5.456 ms) : 0, 5456
section appsec
crashtracking [baseline] (1.185 ms) : 0, 1185
crashtracking [candidate] (1.179 ms) : 0, 1179
BytebuddyAgent [baseline] (662.187 ms) : 0, 662187
BytebuddyAgent [candidate] (661.938 ms) : 0, 661938
AgentMeter [baseline] (11.989 ms) : 0, 11989
AgentMeter [candidate] (12.028 ms) : 0, 12028
GlobalTracer [baseline] (259.514 ms) : 0, 259514
GlobalTracer [candidate] (260.18 ms) : 0, 260180
IAST [baseline] (25.255 ms) : 0, 25255
IAST [candidate] (25.572 ms) : 0, 25572
AppSec [baseline] (168.629 ms) : 0, 168629
AppSec [candidate] (168.324 ms) : 0, 168324
Debugger [baseline] (67.174 ms) : 0, 67174
Debugger [candidate] (67.623 ms) : 0, 67623
Remote Config [baseline] (665.638 µs) : 0, 666
Remote Config [candidate] (653.109 µs) : 0, 653
Telemetry [baseline] (9.244 ms) : 0, 9244
Telemetry [candidate] (9.181 ms) : 0, 9181
Flare Poller [baseline] (3.677 ms) : 0, 3677
Flare Poller [candidate] (3.669 ms) : 0, 3669
section iast
crashtracking [baseline] (1.181 ms) : 0, 1181
crashtracking [candidate] (1.22 ms) : 0, 1220
BytebuddyAgent [baseline] (794.702 ms) : 0, 794702
BytebuddyAgent [candidate] (804.103 ms) : 0, 804103
AgentMeter [baseline] (11.234 ms) : 0, 11234
AgentMeter [candidate] (11.544 ms) : 0, 11544
GlobalTracer [baseline] (249.208 ms) : 0, 249208
GlobalTracer [candidate] (250.158 ms) : 0, 250158
IAST [baseline] (27.231 ms) : 0, 27231
IAST [candidate] (27.168 ms) : 0, 27168
AppSec [baseline] (33.3 ms) : 0, 33300
AppSec [candidate] (35.737 ms) : 0, 35737
Debugger [baseline] (68.01 ms) : 0, 68010
Debugger [candidate] (65.647 ms) : 0, 65647
Remote Config [baseline] (547.128 µs) : 0, 547
Remote Config [candidate] (548.521 µs) : 0, 549
Telemetry [baseline] (8.796 ms) : 0, 8796
Telemetry [candidate] (8.644 ms) : 0, 8644
Flare Poller [baseline] (3.531 ms) : 0, 3531
Flare Poller [candidate] (3.472 ms) : 0, 3472
section profiling
crashtracking [baseline] (1.217 ms) : 0, 1217
crashtracking [candidate] (1.235 ms) : 0, 1235
BytebuddyAgent [baseline] (681.662 ms) : 0, 681662
BytebuddyAgent [candidate] (691.292 ms) : 0, 691292
AgentMeter [baseline] (8.602 ms) : 0, 8602
AgentMeter [candidate] (8.663 ms) : 0, 8663
GlobalTracer [baseline] (216.063 ms) : 0, 216063
GlobalTracer [candidate] (218.619 ms) : 0, 218619
AppSec [baseline] (32.431 ms) : 0, 32431
AppSec [candidate] (33.024 ms) : 0, 33024
Debugger [baseline] (67.779 ms) : 0, 67779
Debugger [candidate] (68.18 ms) : 0, 68180
Remote Config [baseline] (600.654 µs) : 0, 601
Remote Config [candidate] (600.437 µs) : 0, 600
Telemetry [baseline] (8.799 ms) : 0, 8799
Telemetry [candidate] (8.791 ms) : 0, 8791
Flare Poller [baseline] (3.748 ms) : 0, 3748
Flare Poller [candidate] (3.742 ms) : 0, 3742
ProfilingAgent [baseline] (98.923 ms) : 0, 98923
ProfilingAgent [candidate] (100.117 ms) : 0, 100117
Profiling [baseline] (99.491 ms) : 0, 99491
Profiling [candidate] (100.677 ms) : 0, 100677
Load
Parameters
See matching parameters
Summary
Found 2 performance improvements and 1 performance regressions! Performance is the same for 15 metrics, 18 unstable metrics.
Request duration reports for petclinic
gantt
title petclinic - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section baseline
no_agent (18.182 ms) : 17993, 18371
. : milestone, 18182,
appsec (18.684 ms) : 18494, 18874
. : milestone, 18684,
code_origins (17.611 ms) : 17438, 17785
. : milestone, 17611,
iast (18.947 ms) : 18753, 19141
. : milestone, 18947,
profiling (18.708 ms) : 18521, 18895
. : milestone, 18708,
tracing (17.837 ms) : 17657, 18017
. : milestone, 17837,
section candidate
no_agent (18.093 ms) : 17907, 18280
. : milestone, 18093,
appsec (18.947 ms) : 18758, 19136
. : milestone, 18947,
code_origins (17.831 ms) : 17653, 18009
. : milestone, 17831,
iast (17.678 ms) : 17502, 17853
. : milestone, 17678,
profiling (19.657 ms) : 19456, 19859
. : milestone, 19657,
tracing (17.76 ms) : 17584, 17936
. : milestone, 17760,
Request duration reports for insecure-bank
gantt
title insecure-bank - request duration [CI 0.99] : candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section baseline
no_agent (1.18 ms) : 1169, 1191
. : milestone, 1180,
iast (3.155 ms) : 3110, 3199
. : milestone, 3155,
iast_FULL (5.911 ms) : 5852, 5970
. : milestone, 5911,
iast_GLOBAL (3.389 ms) : 3340, 3437
. : milestone, 3389,
profiling (2.091 ms) : 2072, 2109
. : milestone, 2091,
tracing (1.791 ms) : 1776, 1805
. : milestone, 1791,
section candidate
no_agent (1.199 ms) : 1187, 1211
. : milestone, 1199,
iast (3.13 ms) : 3088, 3173
. : milestone, 3130,
iast_FULL (5.828 ms) : 5769, 5887
. : milestone, 5828,
iast_GLOBAL (3.503 ms) : 3440, 3566
. : milestone, 3503,
profiling (2.074 ms) : 2055, 2093
. : milestone, 2074,
tracing (1.813 ms) : 1797, 1828
. : milestone, 1813,
Dacapo
Parameters
See matching parameters
Summary
Found 0 performance improvements and 0 performance regressions! Performance is the same for 11 metrics, 1 unstable metrics.
Execution time for tomcat
gantt
title tomcat - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section baseline
no_agent (1.474 ms) : 1463, 1486
. : milestone, 1474,
appsec (3.774 ms) : 3552, 3996
. : milestone, 3774,
iast (2.261 ms) : 2191, 2330
. : milestone, 2261,
iast_GLOBAL (2.305 ms) : 2235, 2374
. : milestone, 2305,
profiling (2.073 ms) : 2018, 2127
. : milestone, 2073,
tracing (2.076 ms) : 2022, 2130
. : milestone, 2076,
section candidate
no_agent (1.475 ms) : 1464, 1487
. : milestone, 1475,
appsec (3.794 ms) : 3573, 4015
. : milestone, 3794,
iast (2.259 ms) : 2189, 2328
. : milestone, 2259,
iast_GLOBAL (2.3 ms) : 2230, 2369
. : milestone, 2300,
profiling (2.127 ms) : 2070, 2184
. : milestone, 2127,
tracing (2.067 ms) : 2013, 2121
. : milestone, 2067,
Execution time for biojava
gantt
title biojava - execution time [CI 0.99] : candidate=1.60.0-SNAPSHOT~4adb3afe6c, baseline=1.60.0-SNAPSHOT~d10055d2a1
dateFormat X
axisFormat %s
section baseline
no_agent (15.543 s) : 15543000, 15543000
. : milestone, 15543000,
appsec (14.412 s) : 14412000, 14412000
. : milestone, 14412000,
iast (18.276 s) : 18276000, 18276000
. : milestone, 18276000,
iast_GLOBAL (18.055 s) : 18055000, 18055000
. : milestone, 18055000,
profiling (14.788 s) : 14788000, 14788000
. : milestone, 14788000,
tracing (14.776 s) : 14776000, 14776000
. : milestone, 14776000,
section candidate
no_agent (14.719 s) : 14719000, 14719000
. : milestone, 14719000,
appsec (14.89 s) : 14890000, 14890000
. : milestone, 14890000,
iast (18.335 s) : 18335000, 18335000
. : milestone, 18335000,
iast_GLOBAL (17.906 s) : 17906000, 17906000
. : milestone, 17906000,
profiling (15.596 s) : 15596000, 15596000
. : milestone, 15596000,
tracing (14.607 s) : 14607000, 14607000
. : milestone, 14607000,
4e5bdc7 to ba09c80
4adb3af to 4679b93
4679b93 to b384855
b384855 to 95a8148
What Does This Do
Updates the metrics in the `_dd.spark.sql_plan` meta field to use distributions calculated from individual task metrics, rather than the incorrectly summed metrics provided by the `StageInfo` objects from Spark. `StageInfo` naively sums all accumulators, even though a sum does not make sense for certain Spark SQL metrics (e.g. avg hash probes per key for aggregate operations). Instead, we accumulate those values ourselves into distribution metrics and emit them accordingly.

Currently in the UI, this field is only used in one place (the Spark SQL metrics in the DJM product), so we're not too worried about changing the format here.
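To illustrate why summing is misleading for an averaged metric, here is a minimal sketch of per-accumulator distribution tracking. It is not the code in this PR: the class `AccumulatorDistributions` and its methods are hypothetical names, and `java.util.DoubleSummaryStatistics` stands in for whatever distribution type the tracer actually emits.

```java
import java.util.DoubleSummaryStatistics;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: collects per-task accumulator values into a distribution. */
final class AccumulatorDistributions {
  // Keyed by accumulator ID; one summary per Spark SQL metric.
  private final Map<Long, DoubleSummaryStatistics> byAccumulatorId = new HashMap<>();

  /** Records the value a single task reported for the given accumulator. */
  public void record(long accumulatorId, double taskValue) {
    byAccumulatorId
        .computeIfAbsent(accumulatorId, id -> new DoubleSummaryStatistics())
        .accept(taskValue);
  }

  /** Returns the distribution collected so far, or null if nothing was recorded. */
  public DoubleSummaryStatistics get(long accumulatorId) {
    return byAccumulatorId.get(accumulatorId);
  }

  public static void main(String[] args) {
    AccumulatorDistributions distributions = new AccumulatorDistributions();
    // Three tasks each report "avg hash probes per key" (accumulator ID 42 is made up).
    distributions.record(42L, 1.0);
    distributions.record(42L, 1.5);
    distributions.record(42L, 2.0);

    DoubleSummaryStatistics stats = distributions.get(42L);
    // A naive sum reports 4.5 probes per key, which no task ever observed.
    System.out.println("naive sum = " + stats.getSum());
    // The distribution keeps the values in a meaningful range.
    System.out.println("min = " + stats.getMin()
        + ", avg = " + stats.getAverage()
        + ", max = " + stats.getMax());
  }
}
```

A summary of count, min, max and average is the simplest possible distribution; a sketch-based histogram would be the natural substitute if percentiles are needed.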
Motivation
We'd like accurate metrics for Spark SQL operations.
Additional Notes
We can't get rid of the original map that tracks accumulators to stages, as we still use it to associate Spark SQL operations with stages. However, we no longer need to store the entire accumulator; a simple map of accumulator ID to stage ID is enough. This will be done in a follow-up PR.
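The slimmer mapping described above could look roughly like the following. This is only a sketch of the idea for the follow-up PR: `AccumulatorStageIndex`, `register`, and `stageFor` are hypothetical names, not existing tracer APIs.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical index: remembers which stage an accumulator belongs to, nothing more. */
final class AccumulatorStageIndex {
  // Stores only IDs, so the accumulator objects themselves are not retained.
  private final Map<Long, Integer> stageIdByAccumulatorId = new ConcurrentHashMap<>();

  /** Associates an accumulator with the stage that reported it. */
  public void register(int stageId, long accumulatorId) {
    stageIdByAccumulatorId.put(accumulatorId, stageId);
  }

  /** Returns the stage ID for the accumulator, or null if it was never registered. */
  public Integer stageFor(long accumulatorId) {
    return stageIdByAccumulatorId.get(accumulatorId);
  }
}
```

Keeping only the two IDs is enough to link a Spark SQL operation's metric back to its stage, while the metric values themselves live in the per-task distributions.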
Contributor Checklist
`type:` and (`comp:` or `inst:`) labels in addition to any other useful labels
`close`, `fix`, or any linking keywords when referencing an issue
Use `solves` instead, and assign the PR milestone to the issue
Jira ticket: [PROJ-IDENT]
Note: Once your PR is ready to merge, add it to the merge queue by commenting `/merge`. `/merge -c` cancels the queue request. `/merge -f --reason "reason"` skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.