@Phaneendra293

This PR updates the Clippy configuration to disallow usage of std::collections::HashMap and std::collections::HashSet across the DataFusion codebase.

The change enforces consistent use of the project-preferred hash map and hash set implementations and prevents accidental use of the standard library types.

What changed

Updated clippy.toml to add std::collections::HashMap and std::collections::HashSet to the list of disallowed types.

Ensures new code follows the project’s hashing and performance conventions.
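
For reference, a `disallowed-types` entry in clippy.toml generally has the shape sketched below. This is an illustration, not the diff from this PR: the `reason` strings are assumptions, and any entries already present in the list are omitted.

```toml
# Sketch only: the reason text is hypothetical, not copied from the PR.
disallowed-types = [
    { path = "std::collections::HashMap", reason = "Use the project's preferred HashMap instead" },
    { path = "std::collections::HashSet", reason = "Use the project's preferred HashSet instead" },
]
```

With entries like these, Clippy's `disallowed_types` lint reports every use of the listed paths, including uses that already exist in the codebase.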

Why

The default Rust HashMap / HashSet use a randomly seeded hasher (RandomState), which may be:

non-deterministic (iteration order can differ between runs),

sub-optimal for performance-critical paths,

inconsistent with DataFusion’s existing design choices.

Enforcing this at the lint level prevents regressions and improves code consistency for contributors.
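
The determinism point can be illustrated with a small standard-library sketch. It assumes Rust 1.71 or later for `BuildHasher::hash_one`; `FixedState`, `hash_with`, `fixed_hashes_agree`, and `demo_map` are hypothetical names introduced only for this example. It is not DataFusion's replacement type, and it would itself trip the new lint because it names `std::collections::HashMap`.

```rust
use std::collections::hash_map::{DefaultHasher, RandomState};
use std::collections::HashMap;
use std::hash::{BuildHasher, BuildHasherDefault};

/// Hypothetical alias: a hash builder whose state is fixed rather than randomly seeded.
type FixedState = BuildHasherDefault<DefaultHasher>;

/// Hash `key` with whatever builder is supplied.
fn hash_with<S: BuildHasher>(state: &S, key: &str) -> u64 {
    state.hash_one(key)
}

/// Two independently constructed fixed-state builders always agree on a key's hash.
fn fixed_hashes_agree(key: &str) -> bool {
    hash_with(&FixedState::default(), key) == hash_with(&FixedState::default(), key)
}

/// Two `RandomState` values carry different seeds, so they will almost
/// certainly disagree; this is what makes default iteration order vary.
fn random_hashes_agree(key: &str) -> bool {
    hash_with(&RandomState::new(), key) == hash_with(&RandomState::new(), key)
}

/// A std map built with the fixed-state hasher (illustration only).
fn demo_map() -> HashMap<&'static str, i32, FixedState> {
    let mut map = HashMap::with_hasher(FixedState::default());
    map.insert("a", 1);
    map.insert("b", 2);
    map
}
```

Calling `fixed_hashes_agree("datafusion")` returns `true` on every run, while `random_hashes_agree` is expected to return `false` in practice because each `RandomState` is seeded differently.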

Testing

No functional code changes.

CI / Clippy will fail if std::collections::HashMap or std::collections::HashSet are newly introduced.

@Phaneendra293
Author

The changes have been completed and pushed.
Please review and let me know if any further modifications are required.

@Jefffrey
Contributor

Thanks for volunteering to pick up this issue. Part of the expectation of the issue is to ensure CI passes, namely that the existing clippy violations are fixed. Could you look into this please?
