6 changes: 6 additions & 0 deletions README.md
@@ -364,6 +364,12 @@ hyp create hyp-pytorch-job \
| `--accelerator-partition-limit` | INTEGER | No | Limit for the number of accelerator partitions (minimum: 1) |
| `--preferred-topology` | TEXT | No | Preferred topology annotation for scheduling |
| `--required-topology` | TEXT | No | Required topology annotation for scheduling |
| `--max-node-count` | INTEGER | No | Maximum number of nodes |
| `--elastic-replica-increment-step` | INTEGER | No | Scaling step size for elastic training. Provide either this or `--elastic-replica-discrete-values` |
| `--elastic-graceful-shutdown-timeout-in-seconds` | INTEGER | No | Graceful shutdown timeout in seconds for elastic scaling operations |
| `--elastic-scaling-timeout-in-seconds` | INTEGER | No | Scaling timeout in seconds for elastic training |
| `--elastic-scale-up-snooze-time-in-seconds` | INTEGER | No | Period in seconds after a job restart during which no scale-up or workload admission is allowed |
| `--elastic-replica-discrete-values` | ARRAY | No | Alternative to `--elastic-replica-increment-step`. Exact values for the total replica count (array of integers) |
| `--debug` | FLAG | No | Enable debug mode (default: false) |

#### List Available Accelerator Partition Types
6 changes: 6 additions & 0 deletions doc/cli/training/cli_training.md
@@ -206,6 +206,12 @@ hyp create hyp-pytorch-job [OPTIONS]
| `--memory-limit` | FLOAT | No | Limit for the amount of memory in GiB |
| `--preferred-topology` | TEXT | No | Preferred topology annotation for scheduling |
| `--required-topology` | TEXT | No | Required topology annotation for scheduling |
| `--max-node-count` | INTEGER | No | Maximum number of nodes |
| `--elastic-replica-increment-step` | INTEGER | No | Scaling step size for elastic training. Provide either this or `--elastic-replica-discrete-values` |
| `--elastic-graceful-shutdown-timeout-in-seconds` | INTEGER | No | Graceful shutdown timeout in seconds for elastic scaling operations |
| `--elastic-scaling-timeout-in-seconds` | INTEGER | No | Scaling timeout in seconds for elastic training |
| `--elastic-scale-up-snooze-time-in-seconds` | INTEGER | No | Period in seconds after a job restart during which no scale-up or workload admission is allowed |
| `--elastic-replica-discrete-values` | ARRAY | No | Alternative to `--elastic-replica-increment-step`. Exact values for the total replica count (array of integers) |
| `--debug` | FLAG | No | Enable debug mode (default: false) |
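
The table says to provide either `--elastic-replica-increment-step` or `--elastic-replica-discrete-values`. A minimal Python sketch of that either/or rule follows. The helper name `check_elastic_options` is hypothetical and not part of the `hyp` CLI. Rejecting the case where both are given and requiring values of at least 1 are assumptions made for illustration, not documented CLI behavior.

```python
from typing import Optional, Sequence


def check_elastic_options(
    increment_step: Optional[int] = None,
    discrete_values: Optional[Sequence[int]] = None,
) -> str:
    """Return which elastic scaling mode a pair of option values selects.

    Hypothetical helper for illustration only; the real CLI may validate
    these options differently.
    """
    # Assumption: the two options are mutually exclusive.
    if increment_step is not None and discrete_values is not None:
        raise ValueError(
            "provide either --elastic-replica-increment-step or "
            "--elastic-replica-discrete-values, not both"
        )
    if increment_step is not None:
        # Assumption: a step size must be a positive integer.
        if increment_step < 1:
            raise ValueError("increment step must be at least 1")
        return "increment-step"
    if discrete_values is not None:
        # Assumption: the list is non-empty and holds positive replica counts.
        if len(discrete_values) == 0 or any(v < 1 for v in discrete_values):
            raise ValueError("discrete values must be positive integers")
        return "discrete-values"
    # Neither option given: no elastic scaling mode selected.
    return "none"
```

For example, `check_elastic_options(increment_step=2)` selects the step-based mode, while passing both arguments raises `ValueError`.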

### Volume Configuration