diff --git a/docs/manpage.rst b/docs/manpage.rst
index 55d3ca795d..1ca9d86ab5 100644
--- a/docs/manpage.rst
+++ b/docs/manpage.rst
@@ -197,6 +197,17 @@ Result storage commands

    The :option:`--filter-expr` can now be passed twice with :option:`--performance-compare`.

+
+.. option:: --term-lhs=NAME
+.. option:: --term-rhs=NAME
+
+   Change the default suffix for columns in performance comparisons.
+
+   These options are relevant only in conjunction with the :option:`--performance-compare` and :option:`--performance-report` options.
+
+   .. versionadded:: 4.9
+
+
 Other commands
 ^^^^^^^^^^^^^^

@@ -1413,14 +1424,21 @@ ReFrame provides several options for querying and inspecting past sessions and t

 All those options follow a common syntax that builds on top of the following elements:

 1. Selection of sessions and test cases
-2. Grouping of test cases and performance aggregations
+2. Grouping of test cases and aggregations
 3. Selection of test case attributes to present

 Throughout the documentation, we use the ``<select>/<aggr>/<cols>`` notation for implicit performance comparisons (see :option:`--performance-report`) or for simple performance aggregations (see :option:`--list-stored-testcases`).

+.. code-block:: bnf
+
+   <spec> ::= (<select> "/")? <select> "/" <aggr> "/" <cols>
+
+The first ``<select>`` element is optional and is only relevant in performance comparisons.
-1. ``<session_uuid>``: An explicit session UUID.
-2. ``<time_period>``: A time period specification (see below for details).
-4. ``<select> ::= <session_uuid> | <time_period> | <session_filter>``
+
+.. code-block:: bnf
+
+   <session_uuid> ::= /* any valid UUID */
+   <session_filter> ::= (<period>)? "?" <expr>
+   <expr> ::= /* any valid Python expression */
+
+Test cases can be practically selected in three ways:
+
+1. By an explicit session UUID, such as ``ae43e247-375f-4b05-8ab5-c7a017d4afc3``.
+2. By a time period, such as ``20251201:now`` (the exact syntax of time periods is explained in :ref:`time-periods`).
+3. By a filter, which can take either the form of a pure Python expression or a Python expression prefixed by a time period.
+   The expression is evaluated over the session information, including any user-specific session extras (see also :option:`--session-extras`).
+   Here are two examples:
+
+   - ``?'tag=="123"'`` will select all stored sessions with ``tag`` set to ``123``.
+   - ``20251201:now?'tag=="123"'`` will select stored sessions from December 2025 with ``tag`` set to ``123``.
+
+   When filtering with an expression, it is a good idea to limit the scope of the query with a time period, as this will significantly reduce query times in large databases.
+
+.. tip::

+   When using session filters to select the test cases, quoting is important.
+   If ``tag=="123"`` was used unquoted in the example above, the shell would remove the double quotes from ``"123"`` and the expression passed to ReFrame would be ``tag==123``.
+   This is a valid expression, but it will always evaluate to false, since ``tag``, like every session attribute, is a string.
+   Single-quoting the expression avoids this and the actual comparison will be ``tag=="123"``, giving the desired outcome.

 .. note::
@@ -1445,63 +1482,99 @@ The syntax for selecting sessions or test cases can take one of the following fo

    Support for scoping the session filter queries by a time period was added.

+.. _time-periods:
+
 Time periods
 ^^^^^^^^^^^^

-The general syntax of time period specification is the following:
+The syntax for defining time periods in past results queries is the following:

-.. code-block:: console
-
-   <period> := <timestamp_start>:<timestamp_end>
-
-``<timestamp_start>`` and ``<timestamp_end>`` are timestamps denoting the start and end of the requested period.
-More specifically, the syntax of each timestamp is the following:
-
-.. code-block:: console
-
-   <timestamp>[+|-<count>w|d|h|m]
-
-The ``<timestamp>`` is an absolute timestamp in one of the following ``strptime``-compatible formats or the special value ``now``: ``%Y%m%d``, ``%Y%m%dT%H%M``, ``%Y%m%dT%H%M%S``, ``%Y%m%dT%H%M%S%z``.
+.. code-block:: bnf
+
+   <period> ::= <timestamp> ":" <timestamp>
+   <timestamp> ::= ("now" | <abs_timestamp>) (("+" | "-") <digits> ("w" | "d" | "h" | "m"))?
+   <abs_timestamp> ::= /* any timestamp of the format `%Y%m%d`, `%Y%m%dT%H%M`, `%Y%m%dT%H%M%S`, `%Y%m%dT%H%M%S%z` */
+   <digits> ::= [0-9]+
+
+A time period is defined as a starting and an ending timestamp separated by a colon.
+A timestamp can have any of the following ``strptime``-compatible formats: ``%Y%m%d``, ``%Y%m%dT%H%M``, ``%Y%m%dT%H%M%S``, ``%Y%m%dT%H%M%S%z``.
+A timestamp can also be the special value ``now``, which denotes the current local time.
 Optionally, a shift argument can be appended with ``+`` or ``-`` signs, followed by the number of weeks (``w``), days (``d``), hours (``h``) or minutes (``m``).
-
 For example, the period of the last 10 days can be specified as ``now-10d:now``.
-Similarly, the period of the week starting on August 5, 2024 will be specified as ``20240805:20240805+1w``.
+Similarly, the period of 3 weeks starting on August 5, 2024 can be specified as ``20240805:20240805+3w``.

 .. _testcase-grouping:

-Grouping test cases and aggregating performance
------------------------------------------------
+Groupings and aggregations
+--------------------------

 The aggregation specification follows the general syntax:

-.. code-block:: console
+.. code-block:: bnf
+
+   <aggr> ::= <aggr_list> ":" <groups>?

-   <aggr> := <fn>:[<groups>]
+Where ``<aggr_list>`` is a list of aggregation specs and ``<groups>`` is a list of attributes by which to group the test cases.
+An aggregation spec has the following general syntax:

-The ``<fn>`` is a symbolic name for a function to aggregate the performance of the grouped test cases.
-It can take one of the following values:
+.. code-block:: bnf
+
+   <aggr_spec> ::= <fn> | <fn_call>
+   <fn_call> ::= <fn> "(" <fn_arg> ")"
+   <fn_arg> ::= <attr>
+
+An aggregation spec can be either a single aggregation function name or an aggregation function name followed by a test attribute in parentheses, e.g., ``max`` or ``max(num_tasks)``.
+A single function name as an aggregation is equivalent to ``fn(pval)``, i.e., the aggregation is applied to the performance value; e.g., ``max`` is equivalent to ``max(pval)``.
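The bare-name/`fn(attr)` equivalence described above can be illustrated with a small sketch. `normalize_aggr_spec` is a hypothetical helper for illustration only, not ReFrame's actual parser:

```python
import re

def normalize_aggr_spec(spec):
    """Split an aggregation spec into a (function, attribute) pair.

    A bare function name such as `max` is treated as `max(pval)`.
    """
    m = re.fullmatch(r'(\w+)(?:\((\w+)\))?', spec)
    if m is None:
        raise ValueError(f'invalid aggregation spec: {spec!r}')

    fn, attr = m.groups()
    return fn, attr or 'pval'

print(normalize_aggr_spec('max'))             # -> ('max', 'pval')
print(normalize_aggr_spec('max(num_tasks)'))  # -> ('max', 'num_tasks')
```

Either form thus resolves to a function applied to a single attribute, which is why the two spellings can be mixed freely in an aggregation list.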
+The following aggregation functions are supported:
+
+- ``first``: return the first element of every group
+- ``last``: return the last element of every group
+- ``max``: return the maximum value of every group
+- ``min``: return the minimum value of every group
+- ``mean``: return the mean value of every group
+- ``std``: return the standard deviation of every group
+- ``sum``: return the sum of every group
+- ``median``: return the median of every group
+- ``p01``: return the 1st percentile of every group
+- ``p05``: return the 5th percentile of every group
+- ``p95``: return the 95th percentile of every group
+- ``p99``: return the 99th percentile of every group

-- ``first``: retrieve the performance data of the first test case only
-- ``last``: retrieve the performance data of the last test case only
-- ``max``: retrieve the maximum of all test cases
-- ``mean``: calculate the mean over all test cases
-- ``median``: retrieve the median of all test cases
-- ``min``: retrieve the minimum of all test cases

+There is also the pseudo-function ``stats``, which is essentially a shortcut for ``min,p01,p05,median,p95,p99,max,mean,std``.
+It can also be applied to any attribute other than ``pval``.

-The test cases are by default grouped by the following attributes:
+When performing aggregations, test cases are grouped by the following attributes by default:

 - The test :attr:`~reframe.core.pipeline.RegressionTest.name`
-- The system name
-- The partition name
-- The environment name
+- A unique combination of the system name, partition name and environment name, called ``sysenv``.
+  ``sysenv`` is equivalent to ``{system}:{partition}+{environ}``.
 - The performance variable name (see :func:`@performance_function <reframe.core.builtins.performance_function>` and :attr:`~reframe.core.pipeline.RegressionTest.perf_variables`)
 - The performance variable unit

-The ``<groups>`` subspec specifies how the test cases will be grouped and can take one of the two following forms:
+Note that if an aggregation is requested on an attribute, this attribute has to be present in the group-by list, otherwise ReFrame will complain.
+For example, the following spec is problematic, as ``num_tasks`` is not in the default group-by list:
+
+.. code-block:: bash
+
+   # `num_tasks` is not in the group-by list, reframe will complain
+   'now-1d:now/mean,mean(num_tasks):/'
+
+The correct spec adds ``num_tasks`` to the group-by list as follows:
+
+.. code-block:: bash

-1. ``+attr1+attr2...``: In this form the test cases will be grouped based on the default group-by attributes plus the user-specified ones (``attr1``, ``attr2`` etc.)
-2. ``attr1,attr2,...``: In this form the test cases will be grouped based on the user-specified attributes only (``attr1``, ``attr2`` etc.).
+   'now-1d:now/mean,mean(num_tasks):/+num_tasks'
+
+
+The ``<groups>`` spec has the following syntax:
+
+.. code-block:: bnf
+
+   <groups> ::= <extra_groups> | <group_list>
+   <extra_groups> ::= ("+" <attr>)+
+   <group_list> ::= <attr> ("," <attr>)*
+
+Users can either add attributes to the default list by following the syntax ``+attr1+attr2...`` or completely override the group-by attributes by providing an explicit list, such as ``attr1,attr2,...``.

 As an attribute for grouping test cases, any loggable test variable or parameter can be selected, as well as the following pseudo-attributes which are extracted or calculated on-the-fly:

@@ -1515,35 +1588,45 @@ As an attribute for grouping test cases, any loggable test variable or parameter

 - ``punit``: the unit of the performance variable
 - ``presult``: the result (``pass`` or ``fail``) for this performance variable.
   The result is ``pass`` if the obtained performance value is within the acceptable bounds.
-- ``pdiff``: the difference as a percentage between the base and target performance values when a performance comparison is attempted.
-  More specifically, ``pdiff = (pval_base - pval_target) / pval_target``.
+- ``pdiff``: the difference as a percentage between the left- and right-hand-side performance values when a performance comparison is attempted.
+  More specifically, ``pdiff = (pval_lhs - pval_rhs) / pval_rhs``.
 - ``psamples``: the number of test cases aggregated.
 - ``sysenv``: The system/partition/environment combination as a single string of the form ``{system}:{partition}+{environ}``

 .. note::

-   For performance comparisons, either implicit or explicit, the aggregation applies to both the base and target test cases.
+   For performance comparisons, either implicit or explicit, the aggregation applies to both the left- and right-hand-side test cases.

 .. note::

   .. versionadded:: 4.8

      The ``presult`` special column was added.

+  .. versionadded:: 4.9
+
+     More aggregation functions were added, multiple aggregations at once are now supported and the ``stats`` shortcut was introduced.
+
+
 Presenting the results
 ----------------------

 The selection of the final columns of the results table is specified by the same syntax as the ``<groups>`` subspec described above.
-However, for performance comparisons, ReFrame will generate two columns for every attribute in the subspec that is not also a group-by attribute, suffixed with ``_A`` and ``_B``.
+However, for performance comparisons, ReFrame will generate two columns for every attribute in the subspec that is not also a group-by attribute, suffixed with ``(lhs)`` and ``(rhs)``.
+These suffixes can be changed using the :option:`--term-lhs` and :option:`--term-rhs` options, respectively.
 These columns contain the aggregated values of the corresponding attributes.
-Note that only the aggregation of ``pval`` (i.e. the test case performance) can be controlled (see :ref:`testcase-grouping`).
-All other attributes are aggregated by joining their unique values.
-It is possible to select only one of the A/B variants of the extra columns by adding the ``_A`` or ``_B`` suffix to the column name in the ``<cols>`` subspec, e.g., ``+result_A`` will only show the result of the first group of test cases.
+Note that any attributes/columns that are not part of the group-by set will be aggregated by joining their unique values.
+
+It is also possible to select only one of the lhs/rhs variants of the extra columns by adding the ``_lhs`` or ``_rhs`` suffix to the column name in the ``<cols>`` subspec, e.g., ``+result_lhs`` will only show the result of the left selection group in the comparison.

 .. versionchanged:: 4.8

-   Support for selecting A/B column variants in performance comparisons.
+   Support for selecting lhs/rhs column variants in performance comparisons.
+
+.. versionchanged:: 4.9
+
+   Multiple aggregations at once are now supported, new aggregation functions were added and the ``A``/``B`` column variants were renamed to ``lhs`` and ``rhs``, respectively.
+

 Examples
 --------

@@ -1560,7 +1643,7 @@ Here are some examples of performance comparison specs:

 .. code-block:: console

-   20240701:20240701+1d/20240705:20240705+1d/max:+job_nodelist/+result
+   20240701:20240702/20240705:20240706/max:+job_nodelist/+result

 Grammar
 -------

@@ -1571,9 +1654,13 @@ Note that parts that have a grammar defined elsewhere (e.g., Python attributes a

 .. code-block:: bnf

    <spec> ::= (<select> "/")? <select> "/" <aggr> "/" <cols>
-   <aggr> ::= <fn> ":" <groups>
-   <fn> ::= "first" | "last" | "max" | "min" | "mean" | "median"
-   <groups> ::= <extra_groups> | <group_list> | E
+   <aggr> ::= <aggr_list> ":" <groups>?
+   <aggr_list> ::= <aggr_spec> ("," <aggr_spec>)* | "stats"
+   <aggr_spec> ::= <fn> | <fn_call>
+   <fn_call> ::= <fn> "(" <fn_arg> ")"
+   <fn_arg> ::= <attr>
+   <fn> ::= "first" | "last" | "max" | "min" | "mean" | "median" | "std" | "sum" | "p01" | "p05" | "p95" | "p99"
+   <groups> ::= <extra_groups> | <group_list>
    <extra_groups> ::= ("+" <attr>)+
    <group_list> ::= <attr> ("," <attr>)*
    <attr> ::= /* any Python attribute */
diff --git a/docs/tutorial.rst b/docs/tutorial.rst
index aa775883c8..a6a04d43eb 100644
--- a/docs/tutorial.rst
+++ b/docs/tutorial.rst
@@ -2155,8 +2155,23 @@ Inspecting past results

 ..
versionadded:: 4.7

-For every session that has run at least one test case, ReFrame stores all its details, including the test cases, in a database.
-Essentially, the stored information is the same as the one found in the :ref:`report file `.
+ReFrame supports storing its detailed run results in a database.
+This can be enabled by setting ``RFM_ENABLE_RESULTS_STORAGE=y`` in the environment.
+The stored information is essentially the same as the one found in the :ref:`report file `.
+
+.. note::
+
+   To run the examples from this section of the tutorial, you need to rerun some of the previous experiments with results storage enabled.
+
+
+To keep the output compact, you can also set the following environment variable:
+
+.. code-block:: bash
+
+   export RFM_TABLE_FORMAT=plain
+
+The default table format is ``pretty``, which also draws lines in the table.
+

 To list the stored sessions use the :option:`--list-stored-sessions` option:

@@ -2170,59 +2185,38 @@ its unique identifier, its start and end time and how many test cases have run:

 ..
code-block:: console - ┍━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━┯━━━━━━━━━━━━━┑ - │ UUID │ Start time │ End time │ Num runs │ Num cases │ - ┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━┿━━━━━━━━━━━━━┥ - │ dbdb5f94-d1b2-4a11-aadc-57591d4a8496 │ 20241105T150144+0000 │ 20241105T150147+0000 │ 1 │ 1 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ eba49e9c-81f2-45b7-8680-34a5c9e08ac2 │ 20241105T150202+0000 │ 20241105T150205+0000 │ 1 │ 1 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 62e6e1e8-dd3a-4e70-a452-5c416a8f4d0b │ 20241105T150216+0000 │ 20241105T150219+0000 │ 1 │ 1 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 4ad75077-f2c5-4331-baf6-564275397f98 │ 20241105T150236+0000 │ 20241105T150237+0000 │ 1 │ 2 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 0507e4a0-f44c-45af-a068-9da842498c1f │ 20241105T150253+0000 │ 20241105T150254+0000 │ 1 │ 4 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ a7c2ffa9-482e-403f-9a78-5727262f6c7f │ 20241105T150304+0000 │ 20241105T150305+0000 │ 1 │ 4 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 47e8d98f-e2b9-4019-9a41-1c44d8a53d1b │ 20241105T150321+0000 │ 20241105T150332+0000 │ 1 │ 26 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ d0aa023b-2ebf-43d4-a0df-e809492434b5 │ 20241105T150352+0000 │ 20241105T150356+0000 │ 1 │ 10 │ - 
├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 8d2f6493-2f5f-4e20-8a8d-1f1b7b1285b0 │ 20241105T150415+0000 │ 20241105T150416+0000 │ 1 │ 10 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 1dd5da33-4e71-484a-b8e6-13ac4d513a66 │ 20241105T150436+0000 │ 20241105T150436+0000 │ 1 │ 4 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 216559ed-be1e-4289-9c88-c9e6b20d2e2e │ 20241105T150447+0000 │ 20241105T150448+0000 │ 1 │ 10 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ b387ee78-a44b-4711-ad81-629ebf578e53 │ 20241105T150448+0000 │ 20241105T150448+0000 │ 1 │ 1 │ - ├──────────────────────────────────────┼──────────────────────┼──────────────────────┼────────────┼─────────────┤ - │ 4bc5ba16-1a4a-4b27-b75c-407f01f1d292 │ 20241105T150503+0000 │ 20241105T150503+0000 │ 1 │ 5 │ - ┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━┷━━━━━━━━━━━━━┙ + UUID Start time End time Num runs Num cases + ------------------------------------ -------------------- -------------------- ---------- ----------- + f3c4c583-ee79-4611-ac99-e2fe6c53a9a0 20251223T182627+0000 20251223T182638+0000 1 1 + fc3cc1eb-e79e-4516-9066-087667b83dde 20251223T182639+0000 20251223T182647+0000 1 1 + be3ae2a0-47e9-4bbf-8501-cc6af40a47ce 20251223T182648+0000 20251223T182656+0000 1 1 + 6c4e2e00-b3c0-4f1b-a3fa-546ab4b60ceb 20251223T182657+0000 20251223T182700+0000 1 2 + 46f6e9fe-2a78-46c9-bafd-74b0d5d3d59d 20251223T182702+0000 20251223T182705+0000 1 4 + 0b48e28c-247b-49fc-a087-7154cc2c704a 20251223T182705+0000 20251223T182708+0000 1 4 + ca0bb17e-dc8b-4acf-b7a1-db884afe8b18 20251223T182709+0000 20251223T182712+0000 1 4 + c1295482-d278-495b-9425-de1ed405e236 20251223T182713+0000 
20251223T182736+0000 1 26 + 84c0cbb8-72d8-4e5d-88b2-fd41bd2957fe 20251223T182737+0000 20251223T182747+0000 1 10 + 32bc4ce7-3e95-4f01-a4f6-33f9e7c6b01c 20251223T182748+0000 20251223T182750+0000 1 10 You can use :option:`--list-stored-testcases` to list the test cases of a specific session or those that have run within a certain period of time. -In the following example, we list the test cases of session ``0507e4a0-f44c-45af-a068-9da842498c1f`` showing the maximum performance for every performance variable. +In the following example, we list the test cases of session ``46f6e9fe-2a78-46c9-bafd-74b0d5d3d59d`` showing the maximum performance for every performance variable. Note that a session may contain multiple runs of the same test. .. code-block:: bash :caption: Run in the single-node container. - reframe --list-stored-testcases=0507e4a0-f44c-45af-a068-9da842498c1f/max:/ + reframe --list-stored-testcases=46f6e9fe-2a78-46c9-bafd-74b0d5d3d59d/max:/ .. code-block:: console - ┍━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┑ - │ name │ sysenv │ pvar │ punit │ pval │ - ┝━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━┥ - │ stream_test │ tutorialsys:default+gnu │ copy_bw │ MB/s │ 45454.4 │ - ├─────────────┼───────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test │ tutorialsys:default+gnu │ triad_bw │ MB/s │ 39979.1 │ - ├─────────────┼───────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test │ tutorialsys:default+clang │ copy_bw │ MB/s │ 43220.8 │ - ├─────────────┼───────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test │ tutorialsys:default+clang │ triad_bw │ MB/s │ 38759.9 │ - ┕━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┙ + name sysenv pvar punit pval (max) + ----------- ------------------------- -------- ------- ------------ + stream_test tutorialsys:default+clang copy_bw MB/s 19538.5 + stream_test tutorialsys:default+clang triad_bw MB/s 
15299.1 + stream_test tutorialsys:default+gnu copy_bw MB/s 25944.8 + stream_test tutorialsys:default+gnu triad_bw MB/s 13701.7 + The grouping of the test cases, the aggregation and the actual columns shown in the final table are fully configurable. The exact syntax and the various posibilities are described in :ref:`querying-past-results`. @@ -2237,27 +2231,33 @@ For example, the following will list the mean performance of all test cases that .. code-block:: console - ┍━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━┑ - │ name │ sysenv │ pvar │ punit │ pval │ - ┝━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━┥ - │ stream_test │ generic:default+builtin │ copy_bw │ MB/s │ 40302.2 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test │ generic:default+builtin │ triad_bw │ MB/s │ 30565.7 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test │ tutorialsys:default+baseline │ copy_bw │ MB/s │ 40386.6 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test │ tutorialsys:default+baseline │ triad_bw │ MB/s │ 30565.5 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ + name sysenv pvar punit pval (mean) + --------------------------------------------------- ---------------------------- -------- ------- ------------- + stream_build_test tutorialsys:default+clang copy_bw MB/s 29203.2 + stream_build_test tutorialsys:default+clang triad_bw MB/s 17112.3 + stream_build_test tutorialsys:default+gnu copy_bw MB/s 27388 + stream_build_test tutorialsys:default+gnu triad_bw MB/s 14969.9 + stream_test generic:default+builtin copy_bw MB/s 
18355.1 + stream_test generic:default+builtin triad_bw MB/s 13377.6 + stream_test tutorialsys:default+baseline copy_bw MB/s 18007.3 + stream_test tutorialsys:default+baseline triad_bw MB/s 13461.9 + stream_test tutorialsys:default+clang copy_bw MB/s 26024 + stream_test tutorialsys:default+clang triad_bw MB/s 15363.5 + stream_test tutorialsys:default+gnu copy_bw MB/s 25785.3 + stream_test tutorialsys:default+gnu triad_bw MB/s 14217.4 + stream_test %num_threads=1 %thread_placement=close tutorialsys:default+clang copy_bw MB/s 22808.3 + stream_test %num_threads=1 %thread_placement=close tutorialsys:default+clang triad_bw MB/s 14398.8 + stream_test %num_threads=1 %thread_placement=close tutorialsys:default+gnu copy_bw MB/s 21593 ... - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test %num_threads=1 %thread_placement=close │ tutorialsys:default+gnu │ copy_bw │ MB/s │ 47490.5 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test %num_threads=1 %thread_placement=close │ tutorialsys:default+gnu │ triad_bw │ MB/s │ 34848.5 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test %num_threads=1 %thread_placement=close │ tutorialsys:default+clang │ copy_bw │ MB/s │ 47618.6 │ - ├─────────────────────────────────────────────────────┼──────────────────────────────┼──────────┼─────────┼─────────┤ - │ stream_test %num_threads=1 %thread_placement=close │ tutorialsys:default+clang │ triad_bw │ MB/s │ 36237.2 │ - ┕━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━┙ + stream_test %num_threads=8 %thread_placement=spread tutorialsys:default+clang copy_bw MB/s 27782.6 + stream_test %num_threads=8 %thread_placement=spread tutorialsys:default+clang triad_bw MB/s 15491.9 + 
stream_test %num_threads=8 %thread_placement=spread tutorialsys:default+gnu   copy_bw  MB/s 28964.9
+ stream_test %num_threads=8 %thread_placement=spread tutorialsys:default+gnu   triad_bw MB/s 15645.3
+ stream_test %num_threads=8 %thread_placement=true   tutorialsys:default+clang copy_bw  MB/s 26886.6
+ stream_test %num_threads=8 %thread_placement=true   tutorialsys:default+clang triad_bw MB/s 16554
+ stream_test %num_threads=8 %thread_placement=true   tutorialsys:default+gnu   copy_bw  MB/s 30292
+ stream_test %num_threads=8 %thread_placement=true   tutorialsys:default+gnu   triad_bw MB/s 15811.4
+
+
 Note that the :option:`--list-stored-testcases` will list only performance tests.
 You can get all the details of stored sessions or a set of test cases using the :option:`--describe-stored-sessions` and :option:`--describe-stored-testcases` options which will return a detailed JSON record.

@@ -2267,35 +2267,31 @@ You can also combine :option:`--list-stored-testcases` and :option:`--describe-s

 .. code-block:: bash
    :caption: Run in the single-node container.

-   reframe --list-stored-testcases=now-1d:now/mean:/ -n 'stream_test %' -E 'num_threads == 2'
+   reframe --list-stored-testcases=now-1d:now/mean:/ -n stream_test% -E 'num_threads == 2'

 Comparing performance of test cases
 -----------------------------------

 ReFrame can be used to compare the performance of the same test cases run in different time periods using the :option:`--performance-compare` option.
-The following will compare the performance of the test cases of the session ``0507e4a0-f44c-45af-a068-9da842498c1f`` with any other same test case that has run the last 24h:
+The following will compare the performance of the test cases of the session ``46f6e9fe-2a78-46c9-bafd-74b0d5d3d59d`` with any other run of the same test cases within the last 24h:

 .. code-block:: bash
    :caption: Run in the single-node container.
- reframe --performance-compare=0507e4a0-f44c-45af-a068-9da842498c1f/now-1d:now/mean:/ + reframe --performance-compare=46f6e9fe-2a78-46c9-bafd-74b0d5d3d59d/now-1d:now/mean:/ .. code-block:: console - ┍━━━━━━━━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━━┯━━━━━━━━━┑ - │ name │ sysenv │ pvar │ punit │ pval_A │ pval_B │ pdiff │ - ┝━━━━━━━━━━━━━┿━━━━━━━━━━━━━━━━━━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━━┿━━━━━━━━━┥ - │ stream_test │ tutorialsys:default+gnu │ copy_bw │ MB/s │ 45454.4 │ 46984.3 │ -3.26% │ - ├─────────────┼───────────────────────────┼──────────┼─────────┼──────────┼──────────┼─────────┤ - │ stream_test │ tutorialsys:default+gnu │ triad_bw │ MB/s │ 39979.1 │ 37726.2 │ +5.97% │ - ├─────────────┼───────────────────────────┼──────────┼─────────┼──────────┼──────────┼─────────┤ - │ stream_test │ tutorialsys:default+clang │ copy_bw │ MB/s │ 43220.8 │ 47949.5 │ -9.86% │ - ├─────────────┼───────────────────────────┼──────────┼─────────┼──────────┼──────────┼─────────┤ - │ stream_test │ tutorialsys:default+clang │ triad_bw │ MB/s │ 38759.9 │ 39916.1 │ -2.90% │ - ┕━━━━━━━━━━━━━┷━━━━━━━━━━━━━━━━━━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━━┷━━━━━━━━━┙ - -Note that the absolute base performance (``pval_A`` column) is listed along with the target performance (``pval_B`` column). + name sysenv pvar punit pval (mean) (lhs) pval (mean) (rhs) pdiff (%) + ----------- ------------------------- -------- ------- ------------------- ------------------- ----------- + stream_test tutorialsys:default+clang copy_bw MB/s 19538.5 26024 -24.92 + stream_test tutorialsys:default+clang triad_bw MB/s 15299.1 15363.5 -0.42 + stream_test tutorialsys:default+gnu copy_bw MB/s 25944.8 25785.3 0.62 + stream_test tutorialsys:default+gnu triad_bw MB/s 13701.7 14217.4 -3.63 + + +Note that the absolute base performance (``pval (lhs)`` column) is listed along with the target performance (``pval (rhs)`` column). 
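The ``pdiff`` values in the comparison table above follow directly from the formula ``pdiff = (pval_lhs - pval_rhs) / pval_rhs``. A quick standalone sanity check (a sketch, not ReFrame code) reproduces the first row:

```python
def pdiff(pval_lhs, pval_rhs):
    """Relative difference of the left-hand side w.r.t. the right-hand side."""
    return (pval_lhs - pval_rhs) / pval_rhs

# First row of the table above: copy_bw with the clang environment
print(f'{pdiff(19538.5, 26024.0):+.2%}')  # -> -24.92%
```

A negative ``pdiff`` thus means that the left-hand-side selection performed worse than the right-hand-side one.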
:option:`--performance-compare` can also be combined with the :option:`-n` and :option:`-E` options in order to restrict the comparison to specific tests only. diff --git a/examples/tutorial/scripts/runall.sh b/examples/tutorial/scripts/runall.sh old mode 100644 new mode 100755 diff --git a/reframe/frontend/cli.py b/reframe/frontend/cli.py index d783e86a40..d9e17ea329 100644 --- a/reframe/frontend/cli.py +++ b/reframe/frontend/cli.py @@ -702,6 +702,14 @@ def main(): help='The delimiter to use when using `--table-format=csv`', envvar='RFM_TABLE_FORMAT_DELIM', configvar='general/table_format_delim' ) + misc_options.add_argument( + '--term-lhs', action='store', + help='LHS term in performance comparisons' + ) + misc_options.add_argument( + '--term-rhs', action='store', + help='RHS term in performance comparisons' + ) misc_options.add_argument( '-v', '--verbose', action='count', help='Increase verbosity level of output', @@ -1127,7 +1135,9 @@ def restrict_logging(): lambda htype: htype != 'stream') with exit_gracefully_on_error('failed to retrieve session data', printer): - printer.info(reporting.session_info(options.describe_stored_sessions)) + printer.info( + reporting.session_info(options.describe_stored_sessions) + ) sys.exit(0) if options.describe_stored_testcases: @@ -1154,8 +1164,9 @@ def restrict_logging(): if options.performance_compare: namepatt = '|'.join(n.replace('%', ' %') for n in options.names) - with exit_gracefully_on_error('failed to generate performance report', - printer): + with exit_gracefully_on_error( + 'failed to generate performance comparison', printer + ): filt = [None, None] if options.filter_expr is not None: if len(options.filter_expr) == 1: @@ -1168,8 +1179,10 @@ def restrict_logging(): sys.exit(1) printer.table( - reporting.performance_compare(options.performance_compare, - None, namepatt, *filt) + reporting.performance_compare( + options.performance_compare, None, namepatt, *filt, + options.term_lhs, options.term_rhs + ) ) sys.exit(0) 
@@ -1769,7 +1782,9 @@ def module_unuse(*paths): try: if rt.get_option('storage/0/enable'): data = reporting.performance_compare( - rt.get_option('general/0/perf_report_spec'), report + rt.get_option('general/0/perf_report_spec'), report, + term_lhs=options.term_lhs, + term_rhs=options.term_rhs ) else: data = report.report_data() diff --git a/reframe/frontend/printer.py b/reframe/frontend/printer.py index a26fd7d4b5..87b46990ee 100644 --- a/reframe/frontend/printer.py +++ b/reframe/frontend/printer.py @@ -286,7 +286,8 @@ def table(self, data, **kwargs): table_format = rt.runtime().get_option('general/0/table_format') if table_format == 'csv': - return self._table_as_csv(data) + self._table_as_csv(data) + return # Map our options to tabulate if table_format == 'plain': diff --git a/reframe/frontend/reporting/__init__.py b/reframe/frontend/reporting/__init__.py index 940595f01f..26638d6e18 100644 --- a/reframe/frontend/reporting/__init__.py +++ b/reframe/frontend/reporting/__init__.py @@ -11,23 +11,22 @@ import lxml.etree as etree import math import os +import polars as pl import re import socket import time import uuid from collections import UserDict -from collections.abc import Hashable import reframe as rfm import reframe.utility.jsonext as jsonext import reframe.utility.osext as osext from reframe.core.exceptions import ReframeError, what, is_severe, reraise_as from reframe.core.logging import getlogger, _format_time_rfc3339, time_function -from reframe.core.runtime import runtime from reframe.core.warnings import suppress_deprecations from reframe.utility import nodelist_abbrev, OrderedSet from .storage import StorageBackend -from .utility import Aggregator, parse_cmp_spec, parse_query_spec +from .utility import parse_cmp_spec, parse_query_spec # The schema data version # Major version bumps are expected to break the validation of previous schemas @@ -564,54 +563,53 @@ class _TCProxy(UserDict): _required_keys = ['name', 'system', 'partition', 'environ'] def 
__init__(self, testcase, include_only=None): + # Define the derived attributes + def _basename(): + return testcase['name'].split()[0] + + def _sysenv(): + return _format_sysenv(testcase['system'], + testcase['partition'], + testcase['environ']) + + def _job_nodelist(): + nodelist = testcase['job_nodelist'] + if isinstance(nodelist, str): + return nodelist + else: + return nodelist_abbrev(testcase['job_nodelist']) + if isinstance(testcase, _TCProxy): testcase = testcase.data if include_only is not None: self.data = {} - for k in include_only + self._required_keys: - if k in testcase: - self.data.setdefault(k, testcase[k]) - else: - self.data = testcase + for key in include_only + self._required_keys: + # Computed attributes + if key == 'basename': + val = _basename() + elif key == 'sysenv': + val = _sysenv() + elif key == 'job_nodelist': + val = _job_nodelist() + else: + val = testcase.get(key) - def __getitem__(self, key): - val = super().__getitem__(key) - if key == 'job_nodelist': - val = nodelist_abbrev(val) - - return val - - def __missing__(self, key): - if key == 'basename': - return self.data['name'].split()[0] - elif key == 'sysenv': - return _format_sysenv(self.data['system'], - self.data['partition'], - self.data['environ']) - elif key == 'pdiff': - return None + self.data.setdefault(key, val) else: - raise KeyError(key) - - -def _group_key(groups, testcase: _TCProxy): - key = [] - for grp in groups: - with reraise_as(ReframeError, (KeyError,), 'no such group'): - val = testcase[grp] - if not isinstance(val, Hashable): - val = str(val) - - key.append(val) - - return tuple(key) + # Include the derived attributes too + testcase.update({ + 'basename': _basename(), + 'sysenv': _sysenv(), + 'job_nodelist': _job_nodelist() + }) + self.data = testcase @time_function -def _group_testcases(testcases, groups, columns): - grouped = {} - record_cols = groups + [c for c in columns if c not in groups] +def _create_dataframe(testcases, groups, columns): + record_cols = 
list(OrderedSet(groups) | OrderedSet(columns)) + data = [] for tc in map(_TCProxy, testcases): for pvar, reftuple in tc['perfvalues'].items(): pvar = pvar.split(':')[-1] @@ -636,140 +634,68 @@ def _group_testcases(testcases, groups, columns): 'punit': punit, 'presult': presult }) - key = _group_key(groups, record) - grouped.setdefault(key, []) - grouped[key].append(record) - - return grouped + data.append(record) - -@time_function -def _aggregate_perf(grouped_testcases, aggr_fn, cols): - # Update delimiter for joining unique values based on the table format - table_format = runtime().get_option('general/0/table_format') - if table_format == 'pretty': - delim = '\n' + if data: + return pl.DataFrame(data) else: - delim = '|' - - other_aggr = Aggregator.create('join_uniq', delim) - count_aggr = Aggregator.create('count') - aggr_data = {} - for key, seq in grouped_testcases.items(): - aggr_data.setdefault(key, {}) - with reraise_as(ReframeError, (KeyError,), 'no such column'): - for c in cols: - if c == 'pval': - fn = aggr_fn - elif c == 'psamples': - fn = count_aggr - else: - fn = other_aggr + return pl.DataFrame(schema=record_cols) - if fn is count_aggr: - aggr_data[key][c] = fn(seq) - else: - aggr_data[key][c] = fn(tc[c] for tc in seq) - return aggr_data +@time_function +def _aggregate_data(testcases, query): + df = _create_dataframe(testcases, query.group_by, query.attributes) + df = df.group_by(query.group_by).agg( + query.aggregation.col_spec(query.aggregated_attributes) + ).sort(query.group_by) + return df @time_function -def compare_testcase_data(base_testcases, target_testcases, base_fn, target_fn, - groups=None, columns=None): - groups = groups or [] - - # Clean up columns and store those for which we want explicitly the A or B - # variants - cols = [] - variants_A = set() - variants_B = set() - for c in columns: - if c.endswith('_A'): - variants_A.add(c[:-2]) - cols.append(c[:-2]) - elif c.endswith('_B'): - variants_B.add(c[:-2]) - cols.append(c[:-2]) - 
else: - variants_A.add(c) - variants_B.add(c) - cols.append(c) - - grouped_base = _group_testcases(base_testcases, groups, cols) - grouped_target = _group_testcases(target_testcases, groups, cols) - pbase = _aggregate_perf(grouped_base, base_fn, cols) - ptarget = _aggregate_perf(grouped_target, target_fn, cols) - - # For visual purposes if `name` is in `groups`, consider also its - # derivative `basename` to be in, so as to avoid duplicate columns - if 'name' in groups: - groups.append('basename') - - # Build the final table data - extra_cols = set(cols) - set(groups) - {'pdiff'} - - # Header line - header = [] - for c in cols: - if c in extra_cols: - if c in variants_A: - header.append(f'{c}_A') - - if c in variants_B: - header.append(f'{c}_B') - else: - header.append(c) - - data = [header] - for key, aggr_data in pbase.items(): - pdiff = None - line = [] - for c in cols: - base = aggr_data.get(c) - try: - target = ptarget[key][c] - except KeyError: - target = None - - if c == 'pval': - line.append('n/a' if base is None else base) - line.append('n/a' if target is None else target) - - # compute diff for later usage - if base is not None and target is not None: - if base == 0 and target == 0: - pdiff = math.nan - elif target == 0: - pdiff = math.inf - else: - pdiff = (base - target) / target - pdiff = '{:+7.2%}'.format(pdiff) - elif c == 'pdiff': - line.append('n/a' if pdiff is None else pdiff) - elif c in extra_cols: - if c in variants_A: - line.append('n/a' if base is None else base) - - if c in variants_B: - line.append('n/a' if target is None else target) - else: - line.append('n/a' if base is None else base) +def compare_testcase_data(base_testcases, target_testcases, query): + df_base = _aggregate_data(base_testcases, query).with_columns( + pl.col(query.aggregated_columns).name.suffix(query.lhs_column_suffix) + ) + df_target = _aggregate_data(target_testcases, query).with_columns( + pl.col(query.aggregated_columns).name.suffix(query.rhs_column_suffix) + ) + 
pval = query.aggregation.column_names('pval')[0] + pval_lhs = f'{pval}{query.lhs_column_suffix}' + pval_rhs = f'{pval}{query.rhs_column_suffix}' + cols = OrderedSet(query.group_by) | OrderedSet(query.aggregated_variants) + if not df_base.is_empty() and not df_target.is_empty(): + cols |= {query.diff_column} + df = df_base.join(df_target, on=query.group_by).with_columns( + (100*(pl.col(pval_lhs) - pl.col(pval_rhs)) / pl.col(pval_rhs)) + .round(2).alias(query.diff_column) + ).select(cols) + elif df_base.is_empty(): + df = pl.DataFrame(schema=list(cols)) + else: + # df_target is empty; add an empty col for all `rhs` variants + df = df_base.select( + pl.col(col) + if col in df_base.columns else pl.lit('').alias(col) + for col in cols + ) - data.append(line) + data = [df.columns] + for row in df.iter_rows(): + data.append(row) return data @time_function def performance_compare(cmp, report=None, namepatt=None, - filterA=None, filterB=None): + filterA=None, filterB=None, + term_lhs=None, term_rhs=None): with reraise_as(ReframeError, (ValueError,), 'could not parse comparison spec'): - match = parse_cmp_spec(cmp) + query = parse_cmp_spec(cmp, term_lhs, term_rhs) backend = StorageBackend.default() - if match.base is None: + if query.lhs is None: if report is None: raise ValueError('report cannot be `None` ' 'for current run comparisons') @@ -785,11 +711,10 @@ def performance_compare(cmp, report=None, namepatt=None, except IndexError: tcs_base = [] else: - tcs_base = backend.fetch_testcases(match.base, namepatt, filterA) + tcs_base = backend.fetch_testcases(query.lhs, namepatt, filterA) - tcs_target = backend.fetch_testcases(match.target, namepatt, filterB) - return compare_testcase_data(tcs_base, tcs_target, match.aggregator, - match.aggregator, match.groups, match.columns) + tcs_target = backend.fetch_testcases(query.rhs, namepatt, filterB) + return compare_testcase_data(tcs_base, tcs_target, query) @time_function @@ -837,22 +762,20 @@ def session_data(query): def 
testcase_data(spec, namepatt=None, test_filter=None): with reraise_as(ReframeError, (ValueError,), 'could not parse comparison spec'): - match = parse_cmp_spec(spec, default_extra_cols=['pval']) + query = parse_cmp_spec(spec) - if match.base is not None: + if query.lhs is not None: raise ReframeError('only one time period or session are allowed: ' 'if you want to compare performance, ' 'use the `--performance-compare` option') storage = StorageBackend.default() - testcases = storage.fetch_testcases(match.target, namepatt, test_filter) - aggregated = _aggregate_perf( - _group_testcases(testcases, match.groups, match.columns), - match.aggregator, match.columns + df = _aggregate_data( + storage.fetch_testcases(query.rhs, namepatt, test_filter), query ) - data = [match.columns] - for aggr_data in aggregated.values(): - data.append([aggr_data[c] for c in match.columns]) + data = [df.columns] + for row in df.iter_rows(): + data.append(row) return data diff --git a/reframe/frontend/reporting/storage.py b/reframe/frontend/reporting/storage.py index 7175744691..e40e5a79ce 100644 --- a/reframe/frontend/reporting/storage.py +++ b/reframe/frontend/reporting/storage.py @@ -17,7 +17,7 @@ from reframe.core.logging import getlogger, time_function, getprofiler from reframe.core.runtime import runtime from reframe.utility import nodelist_abbrev -from ..reporting.utility import QuerySelector +from ..reporting.utility import QuerySelectorTestcase class StorageBackend: @@ -41,7 +41,7 @@ def store(self, report, report_file): '''Store the given report''' @abc.abstractmethod - def fetch_testcases(self, selector: QuerySelector, name_patt=None, + def fetch_testcases(self, selector: QuerySelectorTestcase, name_patt=None, test_filter=None): '''Fetch test cases based on the specified query selector. 
@@ -54,7 +54,7 @@ def fetch_testcases(self, selector: QuerySelector, name_patt=None, ''' @abc.abstractmethod - def fetch_sessions(self, selector: QuerySelector, decode=True): + def fetch_sessions(self, selector: QuerySelectorTestcase, decode=True): '''Fetch sessions based on the specified query selector. :arg selector: an instance of :class:`QuerySelector` that will specify @@ -65,7 +65,7 @@ def fetch_sessions(self, selector: QuerySelector, decode=True): ''' @abc.abstractmethod - def remove_sessions(self, selector: QuerySelector): + def remove_sessions(self, selector: QuerySelectorTestcase): '''Remove sessions based on the specified query selector :arg selector: an instance of :class:`QuerySelector` that will specify @@ -382,7 +382,7 @@ def _fetch_testcases_time_period(self, ts_start, ts_end, name_patt=None, return [*filter(filt_fn, testcases)] @time_function - def fetch_testcases(self, selector: QuerySelector, + def fetch_testcases(self, selector: QuerySelectorTestcase, name_patt=None, test_filter=None): if selector.by_session(): return self._fetch_testcases_from_session( @@ -394,7 +394,7 @@ def fetch_testcases(self, selector: QuerySelector, ) @time_function - def fetch_sessions(self, selector: QuerySelector, decode=True): + def fetch_sessions(self, selector: QuerySelectorTestcase, decode=True): query = 'SELECT uuid, json_blob FROM sessions' if selector.by_time_period(): ts_start, ts_end = selector.time_period @@ -448,7 +448,7 @@ def _do_remove2(self, conn, uuids): return [rec[0] for rec in results] @time_function - def remove_sessions(self, selector: QuerySelector): + def remove_sessions(self, selector: QuerySelectorTestcase): if selector.by_session_uuid(): uuids = [selector.uuid] else: diff --git a/reframe/frontend/reporting/utility.py b/reframe/frontend/reporting/utility.py index 7d127b9409..29b2b2afc2 100644 --- a/reframe/frontend/reporting/utility.py +++ b/reframe/frontend/reporting/utility.py @@ -3,99 +3,115 @@ # # SPDX-License-Identifier: BSD-3-Clause 
-import abc +import polars as pl import re -import statistics -import types -from collections import namedtuple from datetime import datetime, timedelta, timezone from numbers import Number +from typing import Dict, List +from reframe.core.runtime import runtime +from reframe.utility import OrderedSet + + +class Aggregation: + '''Represents a user aggregation''' + + OP_REGEX = re.compile(r'(?P<op>\S+)\((?P<col>\S+)\)|(?P<op2>\S+)') + OP_VALID = {'min', 'max', 'median', 'mean', 'std', 'first', 'last', + 'sum', 'p01', 'p05', 'p95', 'p99', 'stats'} + + def __init__(self, agg_spec: str): + '''Create an Aggregation from an aggregation spec''' + self._aggregations: Dict[str, List[str]] = {} + self._agg_names: Dict[str, str] = {} + for agg in agg_spec.split(','): + m = self.OP_REGEX.match(agg) + if m: + op = m.group('op') or m.group('op2') + col = m.group('col') or 'pval' + if op not in self.OP_VALID: + raise ValueError(f'unknown aggregation: {op}') + + if op == 'stats': + agg_ops = ['min', 'p01', 'p05', 'median', 'p95', 'p99', + 'max', 'mean', 'std'] + else: + agg_ops = [op] + + self._aggregations.setdefault(col, []) + self._aggregations[col] += agg_ops + for op in agg_ops: + self._agg_names[self._fmt_col(col, op)] = col + else: + raise ValueError(f'invalid aggregation spec: {agg}') + + def __repr__(self) -> str: + return f'Aggregation({self._aggregations})' + + def _fmt_col(self, col: str, op: str) -> str: + '''Format the aggregation's column name''' + return f'{col} ({op})' + + def attributes(self) -> List[str]: + '''Return the attributes to be aggregated''' + return list(self._aggregations.keys()) + + def column_names(self, col: str) -> List[str]: + '''Return the aggregation's column names''' + try: + ops = self._aggregations[col] + return [self._fmt_col(col, op) for op in ops] + except KeyError: + return [col] + + def strip_suffix(self, col: str) -> str: + '''Strip aggregation suffix from column''' + return self._agg_names.get(col, col) + + def col_spec(self, extra_cols: List[str]) 
-> List[pl.Expr]: + '''Return a list of polars expressions for this aggregation''' + def _expr_from_op(col, op): + if op == 'min': + return pl.col(col).min().alias(f'{col} (min)') + elif op == 'max': + return pl.col(col).max().alias(f'{col} (max)') + elif op == 'median': + return pl.col(col).median().alias(f'{col} (median)') + elif op == 'mean': + return pl.col(col).mean().alias(f'{col} (mean)') + elif op == 'std': + return pl.col(col).std().alias(f'{col} (stddev)') + elif op == 'first': + return pl.col(col).first().alias(f'{col} (first)') + elif op == 'last': + return pl.col(col).last().alias(f'{col} (last)') + elif op == 'p01': + return pl.col(col).quantile(0.01).alias(f'{col} (p01)') + elif op == 'p05': + return pl.col(col).quantile(0.05).alias(f'{col} (p05)') + elif op == 'p95': + return pl.col(col).quantile(0.95).alias(f'{col} (p95)') + elif op == 'p99': + return pl.col(col).quantile(0.99).alias(f'{col} (p99)') + elif op == 'sum': + return pl.col(col).sum().alias(f'{col} (sum)') + + specs = [] + for col, ops in self._aggregations.items(): + for op in ops: + specs.append(_expr_from_op(col, op)) + + # Add col specs for the extra columns requested + for col in extra_cols: + if col == 'pval': + continue + elif col == 'psamples': + specs.append(pl.len().alias('psamples')) + else: + table_format = runtime().get_option('general/0/table_format') + delim = '\n' if table_format == 'pretty' else '|' + specs.append(pl.col(col).unique().str.join(delim)) - -class Aggregator: - @classmethod - def create(cls, name, *args, **kwargs): - if name == 'first': - return AggrFirst(*args, **kwargs) - elif name == 'last': - return AggrLast(*args, **kwargs) - elif name == 'mean': - return AggrMean(*args, **kwargs) - elif name == 'median': - return AggrMedian(*args, **kwargs) - elif name == 'min': - return AggrMin(*args, **kwargs) - elif name == 'max': - return AggrMax(*args, **kwargs) - elif name == 'count': - return AggrCount(*args, **kwargs) - elif name == 'join_uniq': - return 
AggrJoinUniqueValues(*args, **kwargs) - else: - raise ValueError(f'unknown aggregation function: {name!r}') - - @abc.abstractmethod - def __call__(self, iterable): - pass - - -class AggrFirst(Aggregator): - def __call__(self, iterable): - for i, elem in enumerate(iterable): - if i == 0: - return elem - - -class AggrLast(Aggregator): - def __call__(self, iterable): - if not isinstance(iterable, types.GeneratorType): - return iterable[-1] - - for elem in iterable: - pass - - return elem - - -class AggrMean(Aggregator): - def __call__(self, iterable): - return statistics.mean(iterable) - - -class AggrMedian(Aggregator): - def __call__(self, iterable): - return statistics.median(iterable) - - -class AggrMin(Aggregator): - def __call__(self, iterable): - return min(iterable) - - -class AggrMax(Aggregator): - def __call__(self, iterable): - return max(iterable) - - -class AggrJoinUniqueValues(Aggregator): - def __init__(self, delim): - self.__delim = delim - - def __call__(self, iterable): - unique_vals = {str(elem) for elem in iterable} - return self.__delim.join(unique_vals) - - -class AggrCount(Aggregator): - def __call__(self, iterable): - if hasattr(iterable, '__len__'): - return len(iterable) - - count = 0 - for _ in iterable: - count += 1 - - return count + return specs def _parse_timestamp(s): @@ -153,7 +169,7 @@ def is_uuid(s): return _UUID_PATTERN.match(s) is not None -class QuerySelector: +class QuerySelectorTestcase: '''A class for encapsulating the different session and testcase queries. 
A session or testcase query can be of one of the following kinds: @@ -237,7 +253,7 @@ def _parse_aggregation(s, base_columns=None): except ValueError: raise ValueError(f'invalid aggregate function spec: {s}') from None - return Aggregator.create(op), _parse_columns(group_cols, base_columns) + return Aggregation(op), _parse_columns(group_cols, base_columns) def parse_query_spec(s): @@ -245,29 +262,154 @@ return None if is_uuid(s): - return QuerySelector(uuid=s) + return QuerySelectorTestcase(uuid=s) if '?' in s: time_period, sess_filter = s.split('?', maxsplit=1) if time_period: - return QuerySelector(sess_filter=sess_filter, - time_period=parse_time_period(time_period)) + return QuerySelectorTestcase( + sess_filter=sess_filter, + time_period=parse_time_period(time_period) + ) else: - return QuerySelector(sess_filter=sess_filter) + return QuerySelectorTestcase(sess_filter=sess_filter) + + return QuerySelectorTestcase(time_period=parse_time_period(s)) + + +class _QueryMatch: + '''Class to represent the user's query''' + + def __init__(self, + lhs: QuerySelectorTestcase, + rhs: QuerySelectorTestcase, + aggregation: Aggregation, + groups: List[str], + columns: List[str], + term_lhs: str, term_rhs: str): + self.__lhs: QuerySelectorTestcase = lhs + self.__rhs: QuerySelectorTestcase = rhs + self.__aggregation: Aggregation = aggregation + self.__tc_group_by: List[str] = groups + self.__tc_attrs: List[str] = [] + self.__col_variants: Dict[str, List[str]] = {} + self.__lhs_term: str = term_lhs or 'lhs' + self.__rhs_term: str = term_rhs or 'rhs' + + if self.is_compare() and 'pval' not in columns: + # Always add `pval` if the query is a performance comparison + columns.append('pval') + + for col in columns: + if self.is_compare(): + # This is a comparison; trim any column suffixes and store + # them for later selection + if col.endswith(self.lhs_select_suffix): + col = 
col[:-len(self.lhs_select_suffix)] + self.__col_variants.setdefault(col, []) + self.__col_variants[col].append(self.lhs_column_suffix) + elif col.endswith(self.rhs_select_suffix): + col = col[:-len(self.rhs_select_suffix)] + self.__col_variants.setdefault(col, []) + self.__col_variants[col].append(self.rhs_column_suffix) + else: + self.__col_variants.setdefault(col, []) + self.__col_variants[col].append(self.lhs_column_suffix) + self.__col_variants[col].append(self.rhs_column_suffix) + + self.__tc_attrs.append(col) + + self.__tc_attrs_agg: List[str] = (OrderedSet(self.__tc_attrs) - + OrderedSet(self.__tc_group_by)) + self.__aggregated_cols: List[str] = [] + for col in self.__tc_attrs_agg: + self.__aggregated_cols += self.__aggregation.column_names(col) + + self.__col_variants_agg: List[str] = [] + for col in self.__aggregated_cols: + col_stripped = self.aggregation.strip_suffix(col) + if col_stripped in self.__col_variants: + self.__col_variants_agg += [ + f'{col}{variant}' + for variant in self.__col_variants[col_stripped] + ] + else: + self.__col_variants_agg.append(col) + + def is_compare(self): + '''Check if this query is a performance comparison''' + return self.__lhs is not None - return QuerySelector(time_period=parse_time_period(s)) + @property + def lhs_column_suffix(self): + '''The suffix of the lhs column in a comparison''' + return f' ({self.__lhs_term})' + + @property + def lhs_select_suffix(self): + '''The suffix for selecting the lhs column in a comparison''' + return '_L' + + @property + def rhs_column_suffix(self): + '''The suffix of the rhs column in a comparison''' + return f' ({self.__rhs_term})' + + @property + def rhs_select_suffix(self): + '''The suffix for selecting the rhs column in a comparison''' + return '_R' + + @property + def diff_column(self): + '''The name of the performance difference column''' + return 'pdiff (%)' + + @property + def lhs(self) -> QuerySelectorTestcase: + '''The lhs data sub-query''' + return self.__lhs + + 
@property + def rhs(self) -> QuerySelectorTestcase: + '''The rhs data sub-query''' + return self.__rhs + + @property + def aggregation(self) -> Aggregation: + '''The aggregation of this query''' + return self.__aggregation + + @property + def attributes(self) -> List[str]: + '''Test attributes requested by this query''' + return self.__tc_attrs + + @property + def aggregated_attributes(self) -> List[str]: + '''Test attributes whose values must be aggregated''' + return self.__tc_attrs_agg + + @property + def aggregated_columns(self) -> List[str]: + '''Column names of the aggregated attributes''' + return self.__aggregated_cols + @property + def aggregated_variants(self) -> List[str]: + '''Column names of the aggregated lhs/rhs attributes''' + return self.__col_variants_agg + + @property + def group_by(self) -> List[str]: + '''Test attributes to be grouped''' + return self.__tc_group_by -_Match = namedtuple('_Match', - ['base', 'target', 'aggregator', 'groups', 'columns']) DEFAULT_GROUP_BY = ['name', 'sysenv', 'pvar', 'punit'] -DEFAULT_EXTRA_COLS = ['pval', 'pdiff'] -def parse_cmp_spec(spec, default_group_by=None, default_extra_cols=None): - default_group_by = default_group_by or list(DEFAULT_GROUP_BY) - default_extra_cols = default_extra_cols or list(DEFAULT_EXTRA_COLS) +def parse_cmp_spec(spec, term_lhs=None, term_rhs=None): parts = spec.split('/') if len(parts) == 3: base_spec, target_spec, aggr, cols = None, *parts @@ -278,8 +420,9 @@ def parse_cmp_spec(spec, default_group_by=None, default_extra_cols=None): base = parse_query_spec(base_spec) target = parse_query_spec(target_spec) - aggr_fn, group_cols = _parse_aggregation(aggr, default_group_by) + aggr, group_cols = _parse_aggregation(aggr, DEFAULT_GROUP_BY) # Update base columns for listing - columns = _parse_columns(cols, group_cols + default_extra_cols) - return _Match(base, target, aggr_fn, group_cols, columns) + columns = _parse_columns(cols, group_cols + aggr.attributes()) + return _QueryMatch(base, 
target, aggr, group_cols, columns, + term_lhs, term_rhs) diff --git a/requirements.txt b/requirements.txt index 0edca43782..2df00ec1e4 100644 --- a/requirements.txt +++ b/requirements.txt @@ -6,6 +6,7 @@ fasteners==0.20; python_version >= '3.10' jinja2==3.1.6 jsonschema==3.2.0 lxml==6.0.2 +polars==1.35.1 pytest==8.4.2; python_version == '3.9' pytest==9.0.1; python_version >= '3.10' pytest-forked==1.6.0 diff --git a/setup.cfg b/setup.cfg index d63c748f29..8acab03341 100644 --- a/setup.cfg +++ b/setup.cfg @@ -32,6 +32,7 @@ install_requires = jinja2 jsonschema lxml + polars PyYAML requests semver diff --git a/unittests/test_cli.py b/unittests/test_cli.py index 640be0b94d..e0224a243f 100644 --- a/unittests/test_cli.py +++ b/unittests/test_cli.py @@ -1528,5 +1528,5 @@ def assert_no_crash(returncode, stdout, stderr, exitcode=0): assert_no_crash( *run_reframe2( action='--performance-compare=now-1m:now/now-1d:now/mean:+foo/+bar' - ), exitcode=1 + ) ) diff --git a/unittests/test_reporting.py b/unittests/test_reporting.py index c83a35ca6b..893b0e5557 100644 --- a/unittests/test_reporting.py +++ b/unittests/test_reporting.py @@ -6,6 +6,7 @@ import json import jsonschema import os +import polars as pl import pytest import sys import time @@ -18,14 +19,13 @@ import reframe.frontend.reporting as reporting import reframe.frontend.reporting.storage as report_storage from reframe.frontend.reporting.utility import (parse_cmp_spec, is_uuid, - QuerySelector, - DEFAULT_GROUP_BY, - DEFAULT_EXTRA_COLS) + QuerySelectorTestcase, + DEFAULT_GROUP_BY) from reframe.core.exceptions 
import ReframeError from reframe.frontend.reporting import RunReport -_DEFAULT_BASE_COLS = DEFAULT_GROUP_BY + DEFAULT_EXTRA_COLS +_DEFAULT_BASE_COLS = DEFAULT_GROUP_BY + ['pval'] # NOTE: We could move this to utility @@ -211,7 +211,7 @@ def test_parse_cmp_spec_period(time_period): spec, duration = time_period duration = int(duration) match = parse_cmp_spec(f'{spec}/{spec}/mean:/') - for query in ('base', 'target'): + for query in ('lhs', 'rhs'): assert getattr(match, query).by_time_period() ts_start, ts_end = getattr(match, query).time_period if 'now' in spec: @@ -223,36 +223,74 @@ def test_parse_cmp_spec_period(time_period): # Check variant without base period match = parse_cmp_spec(f'{spec}/mean:/') - assert match.base is None + assert match.lhs is None @pytest.fixture(params=['first', 'last', 'mean', 'median', - 'min', 'max', 'count']) + 'min', 'max', 'std', 'stats', 'sum', + 'p01', 'p05', 'p95', 'p99']) def aggregator(request): return request.param def test_parse_cmp_spec_aggregations(aggregator): match = parse_cmp_spec(f'now-1m:now/now-1d:now/{aggregator}:/') - data = [1, 2, 3, 4, 5] + num_recs = 10 + nodelist = [f'nid{i}' for i in range(num_recs)] + df = pl.DataFrame({ + 'name': ['test' for i in range(num_recs)], + 'pvar': ['time' for i in range(num_recs)], + 'unit': ['s' for i in range(num_recs)], + 'pval': [1 + i/10 for i in range(num_recs)], + 'node': nodelist + }) + agg = df.group_by('name').agg(match.aggregation.col_spec(['node'])) + assert set(agg['node'][0].split('\n')) == set(nodelist) if aggregator == 'first': - match.aggregator(data) == data[0] + assert 'pval (first)' in agg.columns + assert agg['pval (first)'][0] == 1 elif aggregator == 'last': - match.aggregator(data) == data[-1] + assert 'pval (last)' in agg.columns + assert agg['pval (last)'][0] == 1.9 elif aggregator == 'min': - match.aggregator(data) == 1 + assert 'pval (min)' in agg.columns + assert agg['pval (min)'][0] == 1 elif aggregator == 'max': - match.aggregator(data) == 5 + assert 
'pval (max)' in agg.columns + assert agg['pval (max)'][0] == 1.9 elif aggregator == 'median': - match.aggregator(data) == 3 + assert 'pval (median)' in agg.columns + assert agg['pval (median)'][0] == 1.45 elif aggregator == 'mean': - match.aggregator(data) == sum(data) / len(data) - elif aggregator == 'count': - match.aggregator(data) == len(data) + assert 'pval (mean)' in agg.columns + assert agg['pval (mean)'][0] == 1.45 + elif aggregator == 'std': + assert 'pval (stddev)' in agg.columns + elif aggregator == 'stats': + assert 'pval (min)' in agg.columns + assert 'pval (p01)' in agg.columns + assert 'pval (p05)' in agg.columns + assert 'pval (median)' in agg.columns + assert 'pval (p95)' in agg.columns + assert 'pval (p99)' in agg.columns + assert 'pval (max)' in agg.columns + assert 'pval (mean)' in agg.columns + assert 'pval (stddev)' in agg.columns + elif aggregator == 'sum': + assert 'pval (sum)' in agg.columns + assert agg['pval (sum)'][0] == 14.5 + elif aggregator == 'p01': + assert agg['pval (p01)'][0] == 1 + elif aggregator == 'p05': + assert agg['pval (p05)'][0] == 1 + elif aggregator == 'p95': + assert agg['pval (p95)'][0] == 1.9 + elif aggregator == 'p99': + assert agg['pval (p99)'][0] == 1.9 # Check variant without base period match = parse_cmp_spec(f'now-1d:now/{aggregator}:/') - assert match.base is None + assert match.lhs is None @@ -270,11 +308,11 @@ def test_parse_cmp_spec_group_by(group_by_columns): match = parse_cmp_spec( f'now-1m:now/now-1d:now/min:{spec}/' ) - assert match.groups == expected + assert match.group_by == expected # Check variant without base period match = parse_cmp_spec(f'now-1d:now/min:{spec}/') - assert match.base is None + assert match.lhs is None @pytest.fixture(params=[('', _DEFAULT_BASE_COLS), @@ -292,11 +330,17 @@ def test_parse_cmp_spec_extra_cols(columns): match = parse_cmp_spec( f'now-1m:now/now-1d:now/min:/{spec}' ) - assert match.columns == expected + + # `pval` is always 
added in case of comparisons + if spec == 'col1,col2': + assert match.attributes == expected + ['pval'] + else: + assert match.attributes == expected # Check variant without base period match = parse_cmp_spec(f'now-1d:now/min:/{spec}') - assert match.base is None + assert match.lhs is None + assert match.attributes == expected def test_is_uuid(): @@ -340,11 +384,11 @@ def _uuids(s): match = parse_cmp_spec(uuid_spec) base_uuid, target_uuid = _uuids(uuid_spec) - if match.base.by_session_uuid(): - assert match.base.uuid == base_uuid + if match.lhs.by_session_uuid(): + assert match.lhs.uuid == base_uuid - if match.target.by_session_uuid(): - assert match.target.uuid == target_uuid + if match.rhs.by_session_uuid(): + assert match.rhs.uuid == target_uuid @pytest.fixture(params=[ @@ -358,16 +402,16 @@ def sess_filter(request): def test_parse_cmp_spec_with_filter(sess_filter): match = parse_cmp_spec(sess_filter) - if match.base: - assert match.base.by_session_filter() - assert match.base.sess_filter == 'xyz == "123"' + if match.lhs: + assert match.lhs.by_session_filter() + assert match.lhs.sess_filter == 'xyz == "123"' - assert match.target.by_session_filter() - assert match.target.sess_filter == 'xyz == "789"' + assert match.rhs.by_session_filter() + assert match.rhs.sess_filter == 'xyz == "789"' if sess_filter.startswith('now'): - assert match.target.by_time_period() - ts_start, ts_end = match.target.time_period + assert match.rhs.by_time_period() + ts_start, ts_end = match.rhs.time_period assert int(ts_end - ts_start) == 86400 @@ -423,7 +467,6 @@ def test_parse_cmp_spec_invalid_extra_cols(invalid_col_spec): 'now-1m:now/now-1d:now', 'now-1m:now/now-1d:now/mean', 'now-1m:now/now-1d:now/mean:', - 'now-1m:now/now-1d:now/mean:', '/now-1d:now/mean:/', 'now-1m:now//mean:']) def various_invalid_specs(request): @@ -446,13 +489,13 @@ def _count_failed(testcases): return count def from_time_period(ts_start, ts_end): - return QuerySelector(time_period=(ts_start, ts_end)) + return 
QuerySelectorTestcase(time_period=(ts_start, ts_end)) def from_session_uuid(x): - return QuerySelector(uuid=x) + return QuerySelectorTestcase(uuid=x) def from_session_filter(filt, ts_start, ts_end): - return QuerySelector(time_period=(ts_start, ts_end), sess_filter=filt) + return QuerySelectorTestcase(time_period=(ts_start, ts_end), sess_filter=filt) monkeypatch.setenv('HOME', str(tmp_path)) uuids = []