Skip to content

Conversation

@elijahpetty
Copy link
Collaborator

The accompanying doc will be in a deephaven-core PR.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds example code demonstrating three Python packaging scenarios for Deephaven applications: a library-only package, a CLI-only package, and a combined package. The examples show how to structure Python packages using modern packaging standards with pyproject.toml, implement CLI tools with Click, and create reusable Deephaven query functions.

Changes:

  • Added three complete example packages (my_dh_library, my_dh_cli, my_dh_toolkit) demonstrating different packaging approaches
  • Included sample CSV data files for testing the examples
  • Updated main README with comprehensive documentation on package structure, usage patterns, and troubleshooting

Reviewed changes

Copilot reviewed 34 out of 38 changed files in this pull request and generated 29 comments.

Show a summary per file
File Description
my_dh_library/* Library-only package with reusable Deephaven query and utility functions
my_dh_cli/* CLI-only package with command-line tools for CSV processing
my_dh_toolkit/* Combined package with both library functions and CLI tools
data/* Sample CSV files for testing the example packages
README.md Comprehensive documentation covering all three packaging scenarios and usage examples

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +11 to +18
def add_computed_columns(table: Table) -> Table:
"""Add commonly used computed columns to a table."""
return table.update(
[
"DoubleValue = Value * 2",
"IsHigh = Value > 100",
]
)
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function assumes the input table has a column named "Value" without validation. If the table doesn't have this column, it will fail at runtime. Consider adding validation to check if the required column exists, or make the column name a parameter to make the function more flexible and reusable.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1 @@
my_dh_library
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build artifacts from setuptools should not be committed to version control. The .egg-info directory is automatically generated during package installation and should be excluded. Add *.egg-info/ to .gitignore and remove this directory from the repository.

Suggested change
my_dh_library

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +11
README.md
pyproject.toml
src/my_dh_cli/__init__.py
src/my_dh_cli/__main__.py
src/my_dh_cli/cli.py
src/my_dh_cli.egg-info/PKG-INFO
src/my_dh_cli.egg-info/SOURCES.txt
src/my_dh_cli.egg-info/dependency_links.txt
src/my_dh_cli.egg-info/entry_points.txt
src/my_dh_cli.egg-info/requires.txt
src/my_dh_cli.egg-info/top_level.txt No newline at end of file
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build artifacts from setuptools should not be committed to version control. The .egg-info directory is automatically generated during package installation and should be excluded. Add *.egg-info/ to .gitignore and remove this directory from the repository.

Suggested change
README.md
pyproject.toml
src/my_dh_cli/__init__.py
src/my_dh_cli/__main__.py
src/my_dh_cli/cli.py
src/my_dh_cli.egg-info/PKG-INFO
src/my_dh_cli.egg-info/SOURCES.txt
src/my_dh_cli.egg-info/dependency_links.txt
src/my_dh_cli.egg-info/entry_points.txt
src/my_dh_cli.egg-info/requires.txt
src/my_dh_cli.egg-info/top_level.txt

Copilot uses AI. Check for mistakes.
Comment on lines +9 to +13
input_path = Path(directory)
output_path = Path(output_dir)
output_path.mkdir(exist_ok=True)

csv_files = list(input_path.glob("*.csv"))
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The function doesn't validate that the input directory exists before attempting to glob for CSV files. While the Click command decorator validates this for CLI usage, the function can be called directly from Python code (as shown in the README examples) without validation. Add a check to verify the directory exists and provide a helpful error message if it doesn't.

Copilot uses AI. Check for mistakes.
Comment on lines +4 to +13
def my_dh_query(input_file: str, verbose: bool = False):
"""Read a CSV file and perform a simple query operation on the data."""
from deephaven import read_csv

if verbose:
click.echo(f"Processing {input_file}...")

source = read_csv(input_file)

result = source.update(formulas=["DoubleScore = Score * 2"])
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function assumes the input CSV has a column named "Score" without validation. If the CSV doesn't have this column, it will fail at runtime. Consider adding validation to check if the required column exists, or add error handling to provide a helpful error message to users.

Copilot uses AI. Check for mistakes.
Comment on lines +6 to +8
def filter_by_threshold(table: Table, column: str, threshold: float) -> Table:
"""Filter table rows where column value exceeds threshold."""
return table.where(f"{column} > {threshold}")
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function has a potential SQL injection vulnerability. The column name is directly interpolated into the query string using an f-string. If the column parameter comes from untrusted input, it could be exploited. Consider validating the column name against the table's actual columns before constructing the query.

Copilot uses AI. Check for mistakes.
Comment on lines +5 to +10
src/my_dh_library/utils.py
src/my_dh_library.egg-info/PKG-INFO
src/my_dh_library.egg-info/SOURCES.txt
src/my_dh_library.egg-info/dependency_links.txt
src/my_dh_library.egg-info/requires.txt
src/my_dh_library.egg-info/top_level.txt No newline at end of file
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Build artifacts from setuptools should not be committed to version control. The .egg-info directory is automatically generated during package installation and should be excluded. Add *.egg-info/ to .gitignore and remove this directory from the repository.

Suggested change
src/my_dh_library/utils.py
src/my_dh_library.egg-info/PKG-INFO
src/my_dh_library.egg-info/SOURCES.txt
src/my_dh_library.egg-info/dependency_links.txt
src/my_dh_library.egg-info/requires.txt
src/my_dh_library.egg-info/top_level.txt
src/my_dh_library/utils.py

Copilot uses AI. Check for mistakes.
Comment on lines +10 to +11
output_path = Path(output_dir)
output_path.mkdir(exist_ok=True)
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The mkdir operation could fail due to permissions or other I/O errors, but there's no error handling. Consider adding try-except blocks or using mkdir(parents=True, exist_ok=True) to handle cases where parent directories don't exist. Additionally, provide helpful error messages to users when directory creation fails.

Copilot uses AI. Check for mistakes.
@click.option("--verbose", "-v", is_flag=True, help="Enable verbose output")
def app(input_file: str, verbose: bool) -> None:
"""Process data with Deephaven."""
result = my_dh_query(input_file, verbose)
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable result is not used.

Copilot uses AI. Check for mistakes.
@click.option("--verbose", "-v", is_flag=True, help="Enable verbose output")
def app(input_file: str, verbose: bool) -> None:
"""Process data with Deephaven."""
result = my_dh_query(input_file, verbose)
Copy link

Copilot AI Jan 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Variable result is not used.

Suggested change
result = my_dh_query(input_file, verbose)
my_dh_query(input_file, verbose)

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants