Commit 5b3f082

stefanv, choldgraf, and bsipocz authored
Allow excluding certain usernames from changelog (#128)
* Allow excluding certain usernames from changelog

  Also expand known bot list to include `github-actions[bot]` and `changeset-bot`.

* Document ignoring multiple usernames
* Use sets/lists, depending on whether we need duplicates
* Make work with --all
* Add test for --ignore-contributor
* Apply suggestion from @bsipocz

  Co-authored-by: Brigitta Sipőcz <b.sipocz@gmail.com>

* Allow wildcards
* Sort contributors to each PR
* Reformat
* Fix spelling of commenter
* Sort order: PR author, then others
* Move minimum supported Python to 3.10
* Apply suggestion from @choldgraf

---------

Co-authored-by: Chris Holdgraf <choldgraf@gmail.com>
Co-authored-by: Brigitta Sipőcz <b.sipocz@gmail.com>
1 parent faa4f4e commit 5b3f082

6 files changed: +166 -49 lines changed


.github/workflows/tests.yaml

Lines changed: 1 addition & 1 deletion
@@ -36,7 +36,7 @@ jobs:
         include:
           # Only test oldest supported and latest python version to reduce
           # GitHub API calls, as they can get rate limited
-          - python-version: 3.9
+          - python-version: "3.10"
           - python-version: 3.x

     steps:
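
The quotes around "3.10" matter: an unquoted 3.10 is a YAML float and would be read as 3.1. A minimal illustration, using Python's own float parsing as a stand-in for a YAML loader (an assumption made to keep the sketch dependency-free):

```python
# Stand-in for how a YAML loader treats an unquoted 3.10: it is a float.
unquoted = float("3.10")
quoted = "3.10"  # a quoted scalar stays a string

print(unquoted)  # 3.1
print(quoted)    # 3.10
```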

docs/use.md

Lines changed: 13 additions & 0 deletions
@@ -134,6 +134,19 @@ To include Issues and Pull Requests that were _opened_ in a time period, use the

 (use:token)=

+## Remove bots from the changelog
+
+`github-activity` ships with a known list of bot usernames, but your project may use ones not on our list.
+To ignore additional usernames from the changelog, use the `--ignore-contributor` flag:
+
+```
+github-activity ... --ignore-contributor robot-one --ignore-contributor 'robot-two*'
+```
+
+Wildcards are matched as per [filename matching semantics](https://docs.python.org/3/library/fnmatch.html#fnmatch.fnmatch).
+
+If this is a generic bot username, consider contributing it back to [our list](https://github.com/executablebooks/github-activity/blob/main/github_activity/github_activity.py#L73).
+
 ## Use a GitHub API token

 `github-activity` uses the GitHub API to pull information about a repository's activity.
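
Because patterns follow `fnmatch` rules, `*`, `?` and `[...]` are all special. The sketch below shows the documented flag values at work and one consequence that is easy to miss: square brackets form a character class, so a literal `[bot]` suffix has to be escaped or covered by `*`. The helper name `is_ignored` and the usernames are illustrative, not part of the project.

```python
import fnmatch
import glob

# Patterns as they might be passed via --ignore-contributor
patterns = ["robot-one", "robot-two*"]


def is_ignored(username, patterns):
    """Return True if the username matches any fnmatch-style pattern."""
    return any(fnmatch.fnmatch(username, p) for p in patterns)


print(is_ignored("robot-one", patterns))       # True (exact match)
print(is_ignored("robot-two[bot]", patterns))  # True (trailing wildcard)
print(is_ignored("robot-three", patterns))     # False

# "[bot]" in a pattern is a character class matching one of b, o, t
print(fnmatch.fnmatch("renovate[bot]", "renovate[bot]"))  # False

# glob.escape() wraps the special characters so they match literally
literal = glob.escape("renovate[bot]")
print(fnmatch.fnmatch("renovate[bot]", literal))  # True
```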

github_activity/cli.py

Lines changed: 7 additions & 0 deletions
@@ -21,6 +21,7 @@
     "include-opened": False,
     "strip-brackets": False,
     "all": False,
+    "ignore-contributor": [],
 }

 parser = argparse.ArgumentParser(description=DESCRIPTION)
@@ -130,6 +131,11 @@
     action="store_true",
     help=("""Whether to include all the GitHub tags"""),
 )
+parser.add_argument(
+    "--ignore-contributor",
+    action="append",
+    help="Do not include this GitHub username as a contributor in the changelog",
+)

 # Hidden argument so that target can be optionally passed as a positional argument
 parser.add_argument(
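
With `action="append"`, each occurrence of the flag adds one value to a list, which is how several usernames can be ignored in one run. A small standalone sketch (the parser here is a stripped-down stand-in, not the project's full CLI) also shows that the attribute is `None` when the flag is never given, unless a default is supplied elsewhere:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--ignore-contributor", action="append")

# Repeating the flag accumulates values in order
args = parser.parse_args(
    ["--ignore-contributor", "robot-one", "--ignore-contributor", "robot-two*"]
)
print(args.ignore_contributor)  # ['robot-one', 'robot-two*']

# Without the flag, argparse leaves the attribute as None
no_flag = parser.parse_args([])
print(no_flag.ignore_contributor)  # None
```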
@@ -214,6 +220,7 @@ def main():
         include_opened=bool(args.include_opened),
         strip_brackets=bool(args.strip_brackets),
         branch=args.branch,
+        ignored_contributors=args.ignore_contributor,
     )

     if args.all:

github_activity/github_activity.py

Lines changed: 88 additions & 48 deletions
@@ -1,5 +1,7 @@
 """Use the GraphQL api to grab issues/PRs that match a query."""
+import dataclasses
 import datetime
+import fnmatch
 import os
 import re
 import shlex
@@ -69,25 +71,26 @@
 }

 # exclude known bots from contributor lists
-# TODO: configurable? Everybody's got their own bots.
+# Also see 'ignore-contributor' flag/configuration option.
 BOT_USERS = {
-    "codecov",
-    "codecov-io",
-    "dependabot",
-    "github-actions",
-    "henchbot",
-    "jupyterlab-dev-mode",
-    "lgtm-com",
-    "meeseeksmachine",
-    "names",
-    "now",
-    "pre-commit-ci",
-    "renovate",
-    "review-notebook-app",
-    "support",
-    "stale",
-    "todo",
-    "welcome",
+    "changeset-bot*",
+    "codecov*",
+    "codecov-io*",
+    "dependabot*",
+    "github-actions*",
+    "henchbot*",
+    "jupyterlab-dev-mode*",
+    "lgtm-com*",
+    "meeseeksmachine*",
+    "names*",
+    "now*",
+    "pre-commit-ci*",
+    "renovate*",
+    "review-notebook-app*",
+    "support*",
+    "stale*",
+    "todo*",
+    "welcome*",
 }

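
The trailing `*` lets one entry cover both the plain name and a `[bot]`-suffixed login such as `github-actions[bot]`. It also matches any other username sharing the prefix, which may matter for short entries such as `now*`. A quick check, with `is_bot` as a hypothetical helper and `nowhere-dev` as a made-up username:

```python
import fnmatch

# A few of the patterns from BOT_USERS above
bot_patterns = {"github-actions*", "dependabot*", "now*"}


def is_bot(username):
    """True if the username matches any known-bot pattern."""
    return any(fnmatch.fnmatch(username, p) for p in bot_patterns)


print(is_bot("github-actions"))       # True
print(is_bot("github-actions[bot]"))  # True
print(is_bot("stefanv"))              # False
print(is_bot("nowhere-dev"))          # True: prefix match on "now*"
```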

@@ -225,6 +228,7 @@ def generate_all_activity_md(
     include_opened=False,
     strip_brackets=False,
     branch=None,
+    ignored_contributors: list[str] = None,
 ):
     """Generate a full markdown changelog of GitHub activity of a repo based on release tags.

@@ -259,6 +263,8 @@ def generate_all_activity_md(
         E.g., [MRG], [DOC], etc.
     branch : string | None
         The branch or reference name to filter pull requests by.
+    ignored_contributors : list
+        List of usernames not to include in the changelog.

     Returns
     -------
@@ -322,6 +328,7 @@ def filter(datum):
             include_opened=include_opened,
             strip_brackets=strip_brackets,
             branch=branch,
+            ignored_contributors=ignored_contributors,
         )

         if not md:
@@ -337,6 +344,26 @@ def filter(datum):
     return output


+@dataclasses.dataclass(slots=True)
+class ContributorSet:
+    """This class represents a sorted set of PR contributor usernames.
+
+    The sorting is special, in that the author is placed first.
+    """
+
+    author: str = ""
+    other: set = dataclasses.field(default_factory=set)
+
+    def add(self, contributor):
+        self.other.add(contributor)
+
+    def __iter__(self):
+        if self.author:
+            yield self.author
+        for item in sorted(self.other - {self.author}):
+            yield item
+
+
 def generate_activity_md(
     target,
     since=None,
@@ -349,6 +376,7 @@ def generate_activity_md(
     strip_brackets=False,
     heading_level=1,
     branch=None,
+    ignored_contributors: list[str] = None,
 ):
     """Generate a markdown changelog of GitHub activity within a date window.

@@ -418,30 +446,41 @@ def generate_activity_md(
     comment_response_cutoff = 6  # Comments on a single issue
     comment_others_cutoff = 2  # Comments on issues somebody else has authored
     comment_helpers = []
-    all_contributors = []
+    all_contributors = set()
     # add column for participants in each issue (not just original author)
     data["contributors"] = [[]] * len(data)
+
+    def ignored_user(username):
+        return any(fnmatch.fnmatch(username, bot) for bot in BOT_USERS) or any(
+            fnmatch.fnmatch(username, user) for user in ignored_contributors or []
+        )
+
+    def filter_ignored(userlist):
+        return {user for user in userlist if not ignored_user(user)}
+
     for ix, row in data.iterrows():
-        item_commentors = []
-        item_contributors = []
+        # Track contributors to this PR
+        item_contributors = ContributorSet()
+
+        # This is a list, since we *want* duplicates in here: they
+        # indicate number of times a contributor commented
+        item_commenters = []

         # contributor order:
         # - author
         # - committers
         # - merger
         # - reviewers

-        item_contributors.append(row.author)
+        item_contributors.author = row.author

         if row.kind == "pr":
-            for committer in row.committers:
-                if committer not in row.committers and committer not in BOT_USERS:
-                    item_contributors.append(committer)
+            for committer in filter_ignored(row.committers):
+                item_contributors.add(committer)
             if row.mergedBy and row.mergedBy != row.author:
-                item_contributors.append(row.mergedBy)
-            for reviewer in row.reviewers:
-                if reviewer not in item_contributors:
-                    item_contributors.append(reviewer)
+                item_contributors.add(row.mergedBy)
+            for reviewer in filter_ignored(row.reviewers):
+                item_contributors.add(reviewer)

         for icomment in row["comments"]["edges"]:
             comment_author = icomment["node"]["author"]
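
The two helpers close over `BOT_USERS` and `ignored_contributors`. A standalone sketch of the same logic, with the extra patterns passed in explicitly, may make the behavior easier to test in isolation. The signatures, the `BOT_PATTERNS` subset and the sample names are illustrative and not the module's API; the `or ()` guard is an addition here so that a `None` argument is treated as "no extra patterns".

```python
import fnmatch

BOT_PATTERNS = {"dependabot*", "pre-commit-ci*"}  # subset, for illustration


def ignored_user(username, ignored_contributors=None):
    """True if the username matches a bot pattern or a user-supplied one."""
    patterns = list(BOT_PATTERNS) + list(ignored_contributors or ())
    return any(fnmatch.fnmatch(username, p) for p in patterns)


def filter_ignored(userlist, ignored_contributors=None):
    """Return the set of usernames that are not ignored."""
    return {u for u in userlist if not ignored_user(u, ignored_contributors)}


users = ["stefanv", "dependabot[bot]", "choldgraf", "pre-commit-ci"]
print(sorted(filter_ignored(users)))                 # ['choldgraf', 'stefanv']
print(sorted(filter_ignored(users, ["choldgraf"])))  # ['stefanv']
```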
@@ -451,36 +490,37 @@ def generate_activity_md(
                 continue

             comment_author = comment_author["login"]
-            if comment_author in BOT_USERS:
+            if ignored_user(comment_author):
                 # ignore bots
                 continue

-            # Add to list of commentors on items they didn't author
+            # Add to list of commenters on items they didn't author
             if comment_author != row["author"]:
                 comment_helpers.append(comment_author)

-            # Add to list of commentors for this item so we can see how many times they commented
-            item_commentors.append(comment_author)
+            # Add to list of commenters for this item so we can see how many times they commented
+            item_commenters.append(comment_author)

             # count all comments on a PR as a contributor
-            if comment_author not in item_contributors:
-                item_contributors.append(comment_author)
+            item_contributors.add(comment_author)

-        # Count any commentors that had enough comments on the issue to be a contributor
-        item_commentors_counts = pd.Series(item_commentors).value_counts()
-        item_commentors_counts = item_commentors_counts[
-            item_commentors_counts >= comment_response_cutoff
+        # Count any commenters that had enough comments on the issue to be a contributor
+        item_commenters_counts = pd.Series(item_commenters).value_counts()
+        item_commenters_counts = item_commenters_counts[
+            item_commenters_counts >= comment_response_cutoff
         ].index.tolist()
-        for person in item_commentors_counts:
-            all_contributors.append(person)
+        for person in item_commenters_counts:
+            all_contributors.add(person)

         # record contributor list (ordered, unique)
-        data.at[ix, "contributors"] = item_contributors
+        data.at[ix, "contributors"] = list(item_contributors)

     comment_contributor_counts = pd.Series(comment_helpers).value_counts()
-    all_contributors += comment_contributor_counts[
-        comment_contributor_counts >= comment_others_cutoff
-    ].index.tolist()
+    all_contributors |= set(
+        comment_contributor_counts[
+            comment_contributor_counts >= comment_others_cutoff
+        ].index.tolist()
+    )

     # Filter the PRs by branch (or ref) if given
     if branch is not None:
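
Commenters are kept in a list because repeat entries carry the comment count, whereas the contributor collections only need membership. The thresholding can be sketched with `collections.Counter` standing in for the `pd.Series(...).value_counts()` call used in the diff; the comment data is invented.

```python
from collections import Counter

comment_response_cutoff = 6  # same threshold as in the diff

# One entry per comment; duplicates are meaningful here
item_commenters = ["agoose77"] * 7 + ["rowanc1"] * 2 + ["bsipocz"]

counts = Counter(item_commenters)
frequent = {user for user, n in counts.items() if n >= comment_response_cutoff}

all_contributors = set()
all_contributors |= frequent  # set union keeps entries unique

print(sorted(all_contributors))  # ['agoose77']
```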
@@ -517,7 +557,7 @@ def generate_activity_md(
     closed_prs = closed_prs.query("state != 'CLOSED'")

     # Add any contributors to a merged PR to our contributors list
-    all_contributors += closed_prs["contributors"].explode().unique().tolist()
+    all_contributors |= set(closed_prs["contributors"].explode().unique().tolist())

     # Define categories for a few labels
     if tags is None:
@@ -615,7 +655,7 @@ def generate_activity_md(
                         [
                             f"[@{user}](https://github.com/{user})"
                             for user in irowdata.contributors
-                            if user not in BOT_USERS
+                            if not ignored_user(user)
                         ]
                     )
                     this_md = f"- {ititle} [#{irowdata['number']}]({irowdata['url']}) ({contributor_list})"
@@ -663,7 +703,7 @@ def generate_activity_md(

     # Add a list of author contributions
     all_contributors = sorted(
-        set(all_contributors) - BOT_USERS, key=lambda a: str(a).lower()
+        filter_ignored(all_contributors), key=lambda a: str(a).lower()
     )
     all_contributor_links = []
     for iauthor in all_contributors:

tests/test_cli.py

Lines changed: 25 additions & 0 deletions
@@ -117,3 +117,28 @@ def test_cli_all(tmpdir, file_regression):
     md = path_output.read_text()
     index = md.index("## v0.2.0")
     file_regression.check(md[index:], extension=".md")
+
+
+def test_cli_ignore_user(tmpdir):
+    """Test that an ignored contributor is left out of the changelog"""
+    path_tmp = Path(tmpdir)
+    path_output = path_tmp.joinpath("out.md")
+    cmd = f"github-activity executablebooks/github-activity --ignore-contributor choldgraf -s v1.0.2 -o {path_output}"
+    run(cmd.split(), check=True)
+    md = path_output.read_text()
+    assert "@choldgraf" not in md
+
+
+def test_contributor_sorting(tmpdir, file_regression):
+    """Test that PR author appears first, then rest of contributors, sorted"""
+    path_tmp = Path(tmpdir)
+    path_output = path_tmp.joinpath("out.md")
+
+    org, repo = ("jupyter-book", "mystmd")
+
+    cmd = (
+        f"github-activity {org}/{repo} -s mystmd@1.5.1 -u mystmd@1.6.0 -o {path_output}"
+    )
+    run(cmd.split(), check=True)
+    md = path_output.read_text()
+    file_regression.check(md, extension=".md")
Lines changed: 32 additions & 0 deletions
@@ -0,0 +1,32 @@
+# mystmd@1.5.1...mystmd@1.6.0
+
+([full changelog](https://github.com/jupyter-book/mystmd/compare/mystmd@1.5.1...mystmd@1.6.0))
+
+## Enhancements made
+
+- 🎯 Render static HTML pages to expected server path [#2178](https://github.com/jupyter-book/mystmd/pull/2178) ([@stefanv](https://github.com/stefanv), [@agoose77](https://github.com/agoose77), [@bsipocz](https://github.com/bsipocz), [@choldgraf](https://github.com/choldgraf), [@rowanc1](https://github.com/rowanc1))
+- 🔗 Fix URLs in table of contents directive [#2140](https://github.com/jupyter-book/mystmd/pull/2140) ([@brianhawthorne](https://github.com/brianhawthorne), [@rowanc1](https://github.com/rowanc1), [@stefanv](https://github.com/stefanv))
+
+## Bugs fixed
+
+- 🏷️ Add NPM binary name to whitelabelling [#2175](https://github.com/jupyter-book/mystmd/pull/2175) ([@agoose77](https://github.com/agoose77), [@rowanc1](https://github.com/rowanc1), [@stefanv](https://github.com/stefanv))
+- Add `ipynb` format option in validators [#2159](https://github.com/jupyter-book/mystmd/pull/2159) ([@kp992](https://github.com/kp992), [@agoose77](https://github.com/agoose77))
+
+## Documentation improvements
+
+- 📖 Remove out of date readme note [#2155](https://github.com/jupyter-book/mystmd/pull/2155) ([@rowanc1](https://github.com/rowanc1), [@choldgraf](https://github.com/choldgraf))
+- 📖 A few miscellaneous documentation updates [#2154](https://github.com/jupyter-book/mystmd/pull/2154) ([@rowanc1](https://github.com/rowanc1))
+
+## Other merged PRs
+
+- 🚀 Release [#2180](https://github.com/jupyter-book/mystmd/pull/2180) ([@rowanc1](https://github.com/rowanc1))
+- 🚀 Release [#2130](https://github.com/jupyter-book/mystmd/pull/2130) ([@rowanc1](https://github.com/rowanc1))
+
+## Contributors to this release
+
+The following people contributed discussions, new ideas, code and documentation contributions, and review.
+See [our definition of contributors](https://github-activity.readthedocs.io/en/latest/#how-does-this-tool-define-contributions-in-the-reports).
+
+([GitHub contributors page for this release](https://github.com/jupyter-book/mystmd/graphs/contributors?from=2025-07-05&to=2025-07-21&type=c))
+
+@agoose77 ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Aagoose77+updated%3A2025-07-05..2025-07-21&type=Issues)) | @brian-rose ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Abrian-rose+updated%3A2025-07-05..2025-07-21&type=Issues)) | @brianhawthorne ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Abrianhawthorne+updated%3A2025-07-05..2025-07-21&type=Issues)) | @bsipocz ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Absipocz+updated%3A2025-07-05..2025-07-21&type=Issues)) | @choldgraf ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Acholdgraf+updated%3A2025-07-05..2025-07-21&type=Issues)) | @kp992 ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Akp992+updated%3A2025-07-05..2025-07-21&type=Issues)) | @rowanc1 ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Arowanc1+updated%3A2025-07-05..2025-07-21&type=Issues)) | @stefanv ([activity](https://github.com/search?q=repo%3Ajupyter-book%2Fmystmd+involves%3Astefanv+updated%3A2025-07-05..2025-07-21&type=Issues))
