Skip to content

Conversation

@junhaoliao
Copy link
Member

@junhaoliao junhaoliao commented Dec 18, 2025

Description

The server_settings_json_updates dictionary in _set_up_env_for_webui was missing several configuration keys that are defined in the WebUI server's settings.json template:

  • SqlDbCompressionJobsTableName
  • ArchiveOutputCompressionLevel
  • ArchiveOutputTargetArchiveSize
  • ArchiveOutputTargetDictionariesSize
  • ArchiveOutputTargetEncodedFileSize
  • ArchiveOutputTargetSegmentSize
  • ClpStorageEngine

Without these settings, user configurations for archive output parameters and storage engine were not being propagated to the WebUI server's settings.json file. This fix ensures the WebUI server receives these values from the CLP package configuration.

Checklist

  • The PR satisfies the contribution guidelines.
  • This is a breaking change and that has been indicated in the PR title, OR this isn't a
    breaking change.
  • Necessary docs have been updated, OR no docs need to be updated.

Validation performed

task # to build the project
cd build/clp-package
./sbin/start-clp.sh

observed the configurations now show up in build/clp-package/var/www/webui/server/dist/settings.json

{
  "SqlDbHost": "database",
  "SqlDbPort": 3306,
  "SqlDbName": "clp-db",
  "SqlDbQueryJobsTableName": "query_jobs",
  "SqlDbCompressionJobsTableName": "compression_jobs",
  "MongoDbHost": "results_cache",
  "MongoDbPort": 27017,
  "MongoDbName": "clp-query-results",
  "MongoDbSearchResultsMetadataCollectionName": "results-metadata",
  "MongoDbStreamFilesCollectionName": "stream-files",
  "ClientDir": "/opt/clp/var/www/webui/client",
  "LogViewerDir": "/opt/clp/var/www/webui/yscope-log-viewer",
  "LogsInputRootDir": "/mnt/logs",
  "StreamFilesDir": "/var/data/streams",
  "StreamTargetUncompressedSize": 134217728,
  "StreamFilesS3Region": null,
  "StreamFilesS3PathPrefix": null,
  "StreamFilesS3Profile": null,
  "ArchiveOutputCompressionLevel": 3,
  "ArchiveOutputTargetArchiveSize": 268435456,
  "ArchiveOutputTargetDictionariesSize": 33554432,
  "ArchiveOutputTargetEncodedFileSize": 268435456,
  "ArchiveOutputTargetSegmentSize": 268435456,
  "ClpQueryEngine": "clp-s",
  "ClpStorageEngine": "clp-s",
  "PrestoHost": null,
  "PrestoPort": null
}

Summary by CodeRabbit

  • New Features
    • Web UI now supports configurable archive compression and storage settings
    • New archive sizing parameters available for tuning performance
    • Storage engine selection now exposed in Web UI configuration
    • Database compression jobs table configuration now accessible through settings

✏️ Tip: You can customize this high-level summary in your review settings.

@junhaoliao junhaoliao requested a review from a team as a code owner December 18, 2025 20:32
@junhaoliao junhaoliao requested a review from davemarco December 18, 2025 20:32
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Dec 18, 2025

Walkthrough

The PR updates the controller module to expose additional archive output configuration fields and storage engine settings in the Web UI client and server configurations, including compression levels, target sizes, and database table identifiers.

Changes

Cohort / File(s) Summary
Controller Configuration Updates
components/clp-package-utils/clp_package_utils/controller.py
Exposes archive output settings (ArchiveOutputCompressionLevel, ArchiveOutputTargetArchiveSize, ArchiveOutputTargetDictionariesSize, ArchiveOutputTargetEncodedFileSize, ArchiveOutputTargetSegmentSize), storage engine configuration (ClpStorageEngine, ClpQueryEngine), and SqlDbCompressionJobsTableName in both client and server settings JSON updates.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related issues

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title accurately describes the main objective: adding missing archive output and storage engine settings to the server configuration, directly matching the primary changes outlined in the objectives and raw summary.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings

📜 Recent review details

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 742f409 and fbafcc8.

📒 Files selected for processing (1)
  • components/clp-package-utils/clp_package_utils/controller.py
🧰 Additional context used
🧠 Learnings (3)
📓 Common learnings
Learnt from: Bill-hbrhbr
Repo: y-scope/clp PR: 1122
File: components/core/src/clp/clp/CMakeLists.txt:175-195
Timestamp: 2025-07-23T09:54:45.185Z
Learning: In the CLP project, when reviewing CMakeLists.txt changes that introduce new compression library dependencies (BZip2, LibLZMA, LZ4, ZLIB), the team prefers to address conditional linking improvements in separate PRs rather than expanding the scope of focused migration PRs like the LibArchive task-based installation migration.
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1152
File: components/clp-package-utils/clp_package_utils/scripts/start_clp.py:613-613
Timestamp: 2025-08-08T06:59:42.436Z
Learning: In components/clp-package-utils/clp_package_utils/scripts/start_clp.py, generic_start_scheduler sets CLP_LOGGING_LEVEL using clp_config.query_scheduler.logging_level for both schedulers; compression scheduler should use its own logging level. Tracking via an issue created from PR #1152 discussion.
📚 Learning: 2025-10-17T19:59:25.596Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1178
File: components/clp-package-utils/clp_package_utils/controller.py:315-315
Timestamp: 2025-10-17T19:59:25.596Z
Learning: In components/clp-package-utils/clp_package_utils/controller.py, worker log directories (compression_worker, query_worker, reducer) created via `mkdir()` do not need `_chown_paths_if_root()` calls because directories are created with the same owner as the script caller. This differs from infrastructure service directories (database, queue, Redis, results cache) which do require explicit ownership changes.

Applied to files:

  • components/clp-package-utils/clp_package_utils/controller.py
📚 Learning: 2025-10-27T07:07:37.901Z
Learnt from: junhaoliao
Repo: y-scope/clp PR: 1501
File: tools/deployment/presto-clp/scripts/init.py:10-13
Timestamp: 2025-10-27T07:07:37.901Z
Learning: In `tools/deployment/presto-clp/scripts/init.py`, the `DATABASE_COMPONENT_NAME` and `DATABASE_DEFAULT_PORT` constants are intentionally duplicated from `clp_py_utils.clp_config` because `clp_py_utils` is not installed in the Presto init script's runtime environment. The two flows are separate and this duplication is documented. There are plans to merge these flows after a future release.

Applied to files:

  • components/clp-package-utils/clp_package_utils/controller.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: package-image
  • GitHub Check: check-generated
  • GitHub Check: lint-check (macos-15)
🔇 Additional comments (3)
components/clp-package-utils/clp_package_utils/controller.py (3)

701-701: LGTM! Consistent table name configuration.

Adding SqlDbCompressionJobsTableName to the server settings makes it consistent with the client settings (line 693), ensuring both have access to the compression jobs table name.


714-722: LGTM! Archive output configuration now properly exposed.

These additions correctly propagate the archive output configuration parameters (compression level and target sizes) from the CLP package config into the WebUI server settings, implementing the core objective of this PR.


724-724: LGTM! Storage engine configuration now complete.

Adding ClpStorageEngine to the server settings completes the propagation of storage engine configuration to the WebUI, ensuring the server has access to the storage engine setting alongside the query engine configuration.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@junhaoliao junhaoliao changed the title fix(webui): Add missing archive output and storage engine settings to… fix(webui): Add missing archive output and storage engine settings to server configuration (fixes #1799). Dec 18, 2025
@junhaoliao junhaoliao changed the title fix(webui): Add missing archive output and storage engine settings to server configuration (fixes #1799). fix(clp-package): Add missing archive output and storage engine settings to server configuration (fixes #1799). Dec 18, 2025
self._clp_config.archive_output.target_encoded_file_size
),
"ArchiveOutputTargetSegmentSize": self._clp_config.archive_output.target_segment_size,
"ClpStorageEngine": self._clp_config.package.storage_engine,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

want to reverse the order of clpQueryEngine, and clpStrorageEngine so matches package config and setting.json

davemarco
davemarco previously approved these changes Dec 19, 2025
Copy link
Contributor

@davemarco davemarco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i left a nit but approve

@junhaoliao junhaoliao requested a review from davemarco January 5, 2026 22:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants