Dependency Track parser: Store DT uuid into unique_id_from_tool instead of vuln_id_from_tool #14346

Open
AndreVirtimo wants to merge 2 commits into DefectDojo:dev from Virtimo:dev

Conversation

@AndreVirtimo
Contributor

Store the DT UUID in unique_id_from_tool instead of vuln_id_from_tool.
Change the default deduplication algorithm to DEDUPE_ALGO_UNIQUE_ID_FROM_TOOL_OR_HASH_CODE.

Fixes #14345
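
For readers unfamiliar with the algorithm named above, here is a minimal sketch of what "unique id from tool or hash code" matching could look like. It is a simplified reading of the algorithm's name, not DefectDojo's implementation; the `Finding` dataclass and `find_duplicate` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Finding:
    # Hypothetical stand-in for a finding record; not DefectDojo's model.
    title: str
    hash_code: str
    unique_id_from_tool: Optional[str] = None


def find_duplicate(new: Finding, existing: list[Finding]) -> Optional[Finding]:
    """Return the first existing finding that matches on unique id or on hash."""
    for candidate in existing:
        same_unique_id = (
            new.unique_id_from_tool is not None
            and new.unique_id_from_tool == candidate.unique_id_from_tool
        )
        same_hash = new.hash_code == candidate.hash_code
        if same_unique_id or same_hash:
            return candidate
    return None
```

Under this reading, a finding whose hash changed between imports can still be matched as long as the tool reports the same unique id.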

@github-actions bot added the settings_changes (Needs changes to settings.py based on changes in settings.dist.py included in this PR), unittests, and parser labels on Feb 19, 2026
Contributor

@Maffooch left a comment

This is a good change, but I think we should also keep populating vuln_id_from_tool to accommodate folks who have customized the dedupe settings in their local_settings.py files. If the vuln_id_from_tool field is suddenly empty, existing DT findings would no longer be matched.

@AndreVirtimo
Contributor Author

> This is a good change, but I think we should also maintain the value of vuln_id_from_tool to accommodate folks who have customized the dedupe settings in their local_settings.py files. If the vuln_id_from_tool field is suddenly empty, any existing DT findings would not be matched again

I don't agree. There is no deduplication algorithm that uses vuln_id_from_tool. The field description for vuln_id_from_tool is "Non-unique technical id from the source tool associated with the vulnerability type.", which does not fit the UUID from DT.

@valentijnscholten changed the title from "Store DT uuid into unique_id_from_tool instead of vuln_id_from_tool" to "Dependency Track parser: Store DT uuid into unique_id_from_tool instead of vuln_id_from_tool" on Feb 20, 2026
@valentijnscholten
Member

> I don't agree. There is not deduplication algorithm which uses vuln_id_from_tool. The field description for vuln_id_from_tool is "Non-unique technical id from the source tool associated with the vulnerability type." which does not fit to the uuid from DT.

Users can customize the deduplication behaviour via the settings in settings.dist.py. If users have chosen vuln_id_from_tool in there, their hash_code calculations will change, which might break their dedupe for newly imported reports.

In general I don't like vuln_id_from_tool AND unique_id_from_tool being populated with the same unique id, but maybe in this case it's better to do it to avoid a painful upgrade for users using vuln_id_from_tool. Alternatively, we can take a "v2 style" approach as we did with, for example, OpenVAS: https://github.com/DefectDojo/django-DefectDojo/blob/master/dojo/tools/openvas/parser.py
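
The concern about changed hash_code calculations can be illustrated with a small sketch. It assumes a hash built from a configurable list of fields; `compute_hash_code`, the field list, and the sample values are all hypothetical and do not reproduce DefectDojo's actual hashing.

```python
import hashlib


def compute_hash_code(finding: dict, fields: list[str]) -> str:
    """Hash the configured fields in order; missing or None fields count as empty."""
    joined = "|".join(str(finding.get(name) or "") for name in fields)
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


# Hypothetical user customization that includes vuln_id_from_tool.
custom_fields = ["component_name", "component_version", "vuln_id_from_tool"]

# Finding as imported before the parser change (made-up values).
before = {
    "component_name": "example-lib",
    "component_version": "1.0.0",
    "vuln_id_from_tool": "dt-uuid-1234",
}
# Same finding if the UUID is moved to unique_id_from_tool only.
after = {
    "component_name": "example-lib",
    "component_version": "1.0.0",
    "vuln_id_from_tool": None,
    "unique_id_from_tool": "dt-uuid-1234",
}
# Same finding if the parser populates both fields.
both = {**after, "vuln_id_from_tool": "dt-uuid-1234"}

hash_before = compute_hash_code(before, custom_fields)
hash_after = compute_hash_code(after, custom_fields)
hash_both = compute_hash_code(both, custom_fields)
```

In this sketch `hash_after` differs from `hash_before`, while `hash_both` equals it, which is the argument for keeping vuln_id_from_tool populated.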

@valentijnscholten
Member

@AndreVirtimo Do you have aliases enabled in your DT instance? The example vulnerabilities you provided on Slack seem to be aliases of each other. Usually DT exports these in the report that is sent to Dojo, and Dojo imports all aliases as vulnerability_ids. This way both vulnerabilities should result in the same hash_code, become duplicates, and work OK in reimports.

@AndreVirtimo
Contributor Author

> @AndreVirtimo Do you have aliases enabled in your DT instance? The example vulnerabilities you provided on Slack seem to be aliases of eachother. Usually DT exports this in the report that is being sent to Dojo and Dojo will import all aliases as vulnerability_ids. This way both vulnerabilities should result in the same hash_code and become duplicates and work OK in reimports.

We have NVD and GITHUB as sources. The alias option for GITHUB is active.

The finding from NVD was imported first. Four days later the same finding came from GITHUB, and the GHSA id was added to the list of vulnerability_ids. This also alters the hash.
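
A minimal sketch of the effect described here, assuming the hash covers the full list of vulnerability ids. The `hash_with_vulnerability_ids` helper and the placeholder ids are hypothetical.

```python
import hashlib


def hash_with_vulnerability_ids(component: str, vulnerability_ids: list[str]) -> str:
    """Hash a component name together with every known vulnerability id."""
    ids = ",".join(sorted(vulnerability_ids))
    return hashlib.sha256(f"{component}|{ids}".encode("utf-8")).hexdigest()


# First import: only the NVD id is known (placeholder id).
first_import = hash_with_vulnerability_ids("example-lib", ["CVE-0000-0001"])

# Later import: the GITHUB source adds a GHSA alias (placeholder id).
later_import = hash_with_vulnerability_ids(
    "example-lib", ["CVE-0000-0001", "GHSA-xxxx-xxxx-xxxx"]
)
```

The two hashes differ even though both imports describe the same vulnerability, so hash-only matching would treat the later import as a new finding.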

@AndreVirtimo
Contributor Author

> Users can customize the deduplication behaviour via the settings in settings.dist.py. If users have chosen vuln_id_from_tool in there, their hash_code calculations will change which might break their dedupe for newly imported reports.

If backward compatibility is more important than strict usage of the fields, then we should keep vuln_id_from_tool. Maybe this could be changed with DD 3.

@Maffooch
Contributor

With parsers, backward compatibility is paramount. From a data integrity perspective, a situation where new findings are created due to default hash code changes will often restart SLAs. This is not acceptable for many orgs.

@valentijnscholten
Member

In that case, having vulnerability_ids in the hash_code config is a risk for some parsers, including Dependency Track. But that's unrelated to this PR.

@valentijnscholten added this to the 2.56.0 milestone on Feb 20, 2026
@AndreVirtimo
Contributor Author

OK. I have accepted your suggestion to keep the old vuln_id_from_tool.
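
The agreed outcome, populating both fields with the DT UUID, could be sketched as follows. This is an illustration only: `to_finding_fields` is a hypothetical helper, and the input layout (`vulnerability.uuid`, `vulnerability.vulnId`) is an assumption about the DT export, not taken from this thread.

```python
def to_finding_fields(dt_finding: dict) -> dict:
    """Map one DT finding to the two tool-id fields discussed in this PR."""
    vulnerability = dt_finding.get("vulnerability", {})
    dt_uuid = vulnerability.get("uuid")  # assumed key name
    return {
        "title": vulnerability.get("vulnId"),  # assumed key name
        # New in this PR: store the DT UUID as the unique id.
        "unique_id_from_tool": dt_uuid,
        # Kept for backward compatibility with customized hash_code settings.
        "vuln_id_from_tool": dt_uuid,
    }
```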
