-
Notifications
You must be signed in to change notification settings - Fork 0
Add examples #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Add examples #1
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds example code demonstrating three Python packaging scenarios for Deephaven applications: a library-only package, a CLI-only package, and a combined package. The examples show how to structure Python packages using modern packaging standards with pyproject.toml, implement CLI tools with Click, and create reusable Deephaven query functions.
Changes:
- Added three complete example packages (my_dh_library, my_dh_cli, my_dh_toolkit) demonstrating different packaging approaches
- Included sample CSV data files for testing the examples
- Updated main README with comprehensive documentation on package structure, usage patterns, and troubleshooting
Reviewed changes
Copilot reviewed 34 out of 38 changed files in this pull request and generated 29 comments.
Show a summary per file
| File | Description |
|---|---|
| my_dh_library/* | Library-only package with reusable Deephaven query and utility functions |
| my_dh_cli/* | CLI-only package with command-line tools for CSV processing |
| my_dh_toolkit/* | Combined package with both library functions and CLI tools |
| data/* | Sample CSV files for testing the example packages |
| README.md | Comprehensive documentation covering all three packaging scenarios and usage examples |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| def add_computed_columns(table: Table) -> Table: | ||
| """Add commonly used computed columns to a table.""" | ||
| return table.update( | ||
| [ | ||
| "DoubleValue = Value * 2", | ||
| "IsHigh = Value > 100", | ||
| ] | ||
| ) |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function assumes the input table has a column named "Value" without validation. If the table doesn't have this column, it will fail at runtime. Consider adding validation to check if the required column exists, or make the column name a parameter to make the function more flexible and reusable.
| @@ -0,0 +1 @@ | |||
| my_dh_library | |||
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build artifacts from setuptools should not be committed to version control. The .egg-info directory is automatically generated during package installation and should be excluded. Add *.egg-info/ to .gitignore and remove this directory from the repository.
| my_dh_library |
| README.md | ||
| pyproject.toml | ||
| src/my_dh_cli/__init__.py | ||
| src/my_dh_cli/__main__.py | ||
| src/my_dh_cli/cli.py | ||
| src/my_dh_cli.egg-info/PKG-INFO | ||
| src/my_dh_cli.egg-info/SOURCES.txt | ||
| src/my_dh_cli.egg-info/dependency_links.txt | ||
| src/my_dh_cli.egg-info/entry_points.txt | ||
| src/my_dh_cli.egg-info/requires.txt | ||
| src/my_dh_cli.egg-info/top_level.txt No newline at end of file |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build artifacts from setuptools should not be committed to version control. The .egg-info directory is automatically generated during package installation and should be excluded. Add *.egg-info/ to .gitignore and remove this directory from the repository.
| README.md | |
| pyproject.toml | |
| src/my_dh_cli/__init__.py | |
| src/my_dh_cli/__main__.py | |
| src/my_dh_cli/cli.py | |
| src/my_dh_cli.egg-info/PKG-INFO | |
| src/my_dh_cli.egg-info/SOURCES.txt | |
| src/my_dh_cli.egg-info/dependency_links.txt | |
| src/my_dh_cli.egg-info/entry_points.txt | |
| src/my_dh_cli.egg-info/requires.txt | |
| src/my_dh_cli.egg-info/top_level.txt |
| input_path = Path(directory) | ||
| output_path = Path(output_dir) | ||
| output_path.mkdir(exist_ok=True) | ||
|
|
||
| csv_files = list(input_path.glob("*.csv")) |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The function doesn't validate that the input directory exists before attempting to glob for CSV files. While the Click command decorator validates this for CLI usage, the function can be called directly from Python code (as shown in the README examples) without validation. Add a check to verify the directory exists and provide a helpful error message if it doesn't.
| def my_dh_query(input_file: str, verbose: bool = False): | ||
| """Read a CSV file and perform a simple query operation on the data.""" | ||
| from deephaven import read_csv | ||
|
|
||
| if verbose: | ||
| click.echo(f"Processing {input_file}...") | ||
|
|
||
| source = read_csv(input_file) | ||
|
|
||
| result = source.update(formulas=["DoubleScore = Score * 2"]) |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function assumes the input CSV has a column named "Score" without validation. If the CSV doesn't have this column, it will fail at runtime. Consider adding validation to check if the required column exists, or add error handling to provide a helpful error message to users.
| def filter_by_threshold(table: Table, column: str, threshold: float) -> Table: | ||
| """Filter table rows where column value exceeds threshold.""" | ||
| return table.where(f"{column} > {threshold}") |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function has a potential SQL injection vulnerability. The column name is directly interpolated into the query string using an f-string. If the column parameter comes from untrusted input, it could be exploited. Consider validating the column name against the table's actual columns before constructing the query.
| src/my_dh_library/utils.py | ||
| src/my_dh_library.egg-info/PKG-INFO | ||
| src/my_dh_library.egg-info/SOURCES.txt | ||
| src/my_dh_library.egg-info/dependency_links.txt | ||
| src/my_dh_library.egg-info/requires.txt | ||
| src/my_dh_library.egg-info/top_level.txt No newline at end of file |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Build artifacts from setuptools should not be committed to version control. The .egg-info directory is automatically generated during package installation and should be excluded. Add *.egg-info/ to .gitignore and remove this directory from the repository.
| src/my_dh_library/utils.py | |
| src/my_dh_library.egg-info/PKG-INFO | |
| src/my_dh_library.egg-info/SOURCES.txt | |
| src/my_dh_library.egg-info/dependency_links.txt | |
| src/my_dh_library.egg-info/requires.txt | |
| src/my_dh_library.egg-info/top_level.txt | |
| src/my_dh_library/utils.py |
| output_path = Path(output_dir) | ||
| output_path.mkdir(exist_ok=True) |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mkdir operation could fail due to permissions or other I/O errors, but there's no error handling. Consider adding try-except blocks or using mkdir(parents=True, exist_ok=True) to handle cases where parent directories don't exist. Additionally, provide helpful error messages to users when directory creation fails.
| @click.option("--verbose", "-v", is_flag=True, help="Enable verbose output") | ||
| def app(input_file: str, verbose: bool) -> None: | ||
| """Process data with Deephaven.""" | ||
| result = my_dh_query(input_file, verbose) |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable result is not used.
| @click.option("--verbose", "-v", is_flag=True, help="Enable verbose output") | ||
| def app(input_file: str, verbose: bool) -> None: | ||
| """Process data with Deephaven.""" | ||
| result = my_dh_query(input_file, verbose) |
Copilot
AI
Jan 23, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Variable result is not used.
| result = my_dh_query(input_file, verbose) | |
| my_dh_query(input_file, verbose) |
The accompanying doc will be in a deephaven-core PR.