Feedback after quick tests #2

@Pierlou

Hi, thanks for this lib! It caught my interest because such cleaning methods could prove useful for an open data platform, such as data.gouv.fr. Quick feedback:

  • you may want to add version restrictions on install (for example, I ran into an error with a fairly old version of pandas, because of a function that was not implemented back then)
  • I didn't get the expected results when testing the trimmer on some quite ugly CSV files (from this dataset and that one). We have also found it tedious to come up with good heuristics for cleaning arbitrary CSV files. You can find many examples of badly structured CSV files on the platform if you want to make this lib even more agnostic
  • since the input is a DataFrame rather than a file path, this lib could also be useful for cleaning Excel (or other tabular) files, for instance from this dataset (the files are really not suitable for any Python analysis as they are)
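
To illustrate the first point: the usual fix is a lower bound on pandas in the package metadata, but a runtime guard can also give a clearer error message. Below is a rough sketch of such a guard using only the standard library. The helper names (`parse_version`, `meets_minimum`, `check_dependency`) are made up for this example, and the `"1.0"` minimum in the usage line is a placeholder, since I have not pinned down which pandas release first provides the missing function.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '1.5.3' or '2.0.0rc1' into a tuple of ints, e.g. (1, 5, 3)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break  # ignore suffixes such as 'rc1'
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if `installed` is at least `minimum`."""
    a, b = parse_version(installed), parse_version(minimum)
    # Pad with zeros so that '1.5' and '1.5.0' compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_dependency(name, minimum):
    """Return an error message, or None if the dependency is recent enough."""
    try:
        installed = version(name)
    except PackageNotFoundError:
        return f"{name} is not installed"
    if not meets_minimum(installed, minimum):
        return f"{name} {installed} is older than the required {minimum}"
    return None


# Hypothetical usage: "1.0" is a placeholder, not the lib's real requirement.
problem = check_dependency("pandas", "1.0")
if problem:
    print(problem)
```

This simplified parser ignores pre-release ordering, so a declared lower bound in the install requirements is still the more robust option.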

For the record, we maintain another CSV-related lib, csv-detective. Maybe there will be some nice crossovers in the future 🤞