diff --git a/README.md b/README.md
index 81c0aabf..d9dc6ea8 100644
--- a/README.md
+++ b/README.md
@@ -364,6 +364,12 @@ hyp create hyp-pytorch-job \
 | `--accelerator-partition-limit` | INTEGER | No | Limit for the number of accelerator partitions (minimum: 1) |
 | `--preferred-topology` | TEXT | No | Preferred topology annotation for scheduling |
 | `--required-topology` | TEXT | No | Required topology annotation for scheduling |
+| `--max-node-count` | INTEGER | No | Maximum number of nodes|
+| `--elastic-replica-increment-step` | INTEGER | No | Scaling step size for elastic training. Provide either this or elastic-replica-discrete-values|
+| `--elastic-graceful-shutdown-timeout-in-seconds` | INTEGER | No | Graceful shutdown timeout in seconds for elastic scaling operations|
+| `--elastic-scaling-timeout-in-seconds` | INTEGER | No | Scaling timeout for elastic training|
+| `--elastic-scale-up-snooze-time-in-seconds` | INTEGER | No | Timeout period after job restart during which no scale up/workload admission is allowed|
+| `--elastic-replica-discrete-values` | ARRAY | No | Alternative to elastic-replica-increment-step. Provides exact values for total replicas count (array of integers)|
 | `--debug` | FLAG | No | Enable debug mode (default: false) |
 
 #### List Available Accelerator Partition Types
diff --git a/doc/cli/training/cli_training.md b/doc/cli/training/cli_training.md
index 905ec54b..8f5bffe1 100644
--- a/doc/cli/training/cli_training.md
+++ b/doc/cli/training/cli_training.md
@@ -206,6 +206,12 @@ hyp create hyp-pytorch-job [OPTIONS]
 | `--memory-limit` | FLOAT | No | Limit for the amount of memory in GiB |
 | `--preferred-topology` | TEXT | No | Preferred topology annotation for scheduling |
 | `--required-topology` | TEXT | No | Required topology annotation for scheduling |
+| `--max-node-count` | INTEGER | No | Maximum number of nodes|
+| `--elastic-replica-increment-step` | INTEGER | No | Scaling step size for elastic training. Provide either this or elastic-replica-discrete-values|
+| `--elastic-graceful-shutdown-timeout-in-seconds` | INTEGER | No | Graceful shutdown timeout in seconds for elastic scaling operations|
+| `--elastic-scaling-timeout-in-seconds` | INTEGER | No | Scaling timeout for elastic training|
+| `--elastic-scale-up-snooze-time-in-seconds` | INTEGER | No | Timeout period after job restart during which no scale up/workload admission is allowed|
+| `--elastic-replica-discrete-values` | ARRAY | No | Alternative to elastic-replica-increment-step. Provides exact values for total replicas count (array of integers)|
 | `--debug` | FLAG | No | Enable debug mode (default: false) |
 
 ### Volume Configuration
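
Reviewer note: the new rows say to provide either `--elastic-replica-increment-step` or `--elastic-replica-discrete-values`, but the tables do not show how a caller might enforce that. The sketch below is one possible way to assemble the elastic arguments while rejecting the case where both are set. The helper name `build_elastic_args` is hypothetical and is not part of the `hyp` CLI, and the JSON-style list encoding for the ARRAY option is an assumption, since the diff does not specify the accepted syntax.

```python
import json
from typing import List, Optional, Sequence


def build_elastic_args(
    max_node_count: Optional[int] = None,
    increment_step: Optional[int] = None,
    discrete_values: Optional[Sequence[int]] = None,
) -> List[str]:
    """Build the elastic-training flags as an argument list (hypothetical helper)."""
    # The option tables describe the two replica options as alternatives,
    # so this sketch refuses to emit both at once.
    if increment_step is not None and discrete_values is not None:
        raise ValueError(
            "use either increment_step or discrete_values, not both"
        )

    args: List[str] = []
    if max_node_count is not None:
        args += ["--max-node-count", str(max_node_count)]
    if increment_step is not None:
        args += ["--elastic-replica-increment-step", str(increment_step)]
    if discrete_values is not None:
        # Assumption: the ARRAY option accepts a JSON-style list string.
        encoded = json.dumps(list(discrete_values))
        args += ["--elastic-replica-discrete-values", encoded]
    return args


if __name__ == "__main__":
    print(build_elastic_args(max_node_count=8, increment_step=2))
```

The returned list could be appended to the rest of a `hyp create hyp-pytorch-job` argument list, for example before passing it to `subprocess.run`. All three options are optional in the tables, so the helper returns an empty list when none are supplied.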