
Feature Request: Layer-wise (Mixed Precision) Quantization Configuration #3571

@lhpqaq

Description


Summary

It would be highly valuable if whisper.cpp supported mixed-precision, layer-wise quantization configuration, allowing users to specify different quantization types for different layers or tensors when converting a model. This would enable more flexible and better-optimized deployments that balance memory, accuracy, and performance against user and hardware requirements.

Motivation

  • Real-world edge deployments and research increasingly require fine-grained quantization strategies to optimize for specific accuracy, performance, and size constraints.
  • Currently, whisper.cpp supports applying only a single quantization type per model conversion. There is no straightforward way to assign different quantization types (e.g., Q8_0 for the encoder, Q4_0 for the decoder, FP16 for attention layers) to specific layers or tensor name patterns.
  • llama.cpp already provides a tensor_types-like configuration for flexible per-layer quantization control. Bringing similar options to whisper.cpp would enable hardware-aware and use-case-tailored deployment.

Proposed Solution

  • Provide a mechanism (such as a config file or a command-line flag) that lets users specify the desired quantization type per layer, per group of layers, or by regex/pattern match on tensor names (see the sketch after this list).
  • Default to the current unified quantization type when no mapping is provided, ensuring backward compatibility.
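
A minimal sketch of how such an override could be resolved during quantization, assuming an ordered list of user-supplied regex → type rules with the current single type as fallback. The rule format, type names, and tensor names below are illustrative assumptions, not existing whisper.cpp options or APIs; the real converter would use ggml's type enum.

```cpp
// Hypothetical sketch: resolve a per-tensor quantization override from an
// ordered list of regex -> type rules, falling back to the single default
// type (current behaviour) when no rule matches.
#include <cstdio>
#include <regex>
#include <string>
#include <vector>

// Stand-in for ggml's type enum, to keep the sketch self-contained.
enum class qtype { F16, Q8_0, Q5_0, Q4_0 };

struct qrule {
    std::regex pattern; // matched against the tensor name
    qtype      type;    // quantization type to apply on a match
};

// First matching rule wins; unmatched tensors keep the default type.
static qtype resolve_qtype(const std::string & tensor_name,
                           const std::vector<qrule> & rules,
                           qtype default_type) {
    for (const auto & rule : rules) {
        if (std::regex_search(tensor_name, rule.pattern)) {
            return rule.type;
        }
    }
    return default_type;
}

int main() {
    // Example mapping: keep attention projections in FP16, the rest of the
    // encoder at Q8_0, and the rest of the decoder at Q4_0. Tensor names are
    // illustrative of whisper-style naming, not an exact list.
    const std::vector<qrule> rules = {
        { std::regex("attn.*\\.(query|key|value)\\.weight"), qtype::F16  },
        { std::regex("^encoder\\."),                         qtype::Q8_0 },
        { std::regex("^decoder\\."),                         qtype::Q4_0 },
    };

    const char * names[] = {
        "encoder.blocks.0.mlp.0.weight",
        "decoder.blocks.3.attn.query.weight",
        "decoder.blocks.3.mlp.0.weight",
    };
    for (const char * name : names) {
        const qtype t = resolve_qtype(name, rules, /*default_type=*/qtype::Q5_0);
        std::printf("%s -> %d\n", name, static_cast<int>(t));
    }
    return 0;
}
```

The same rule list could be populated from a config file or repeated command-line flags; since unmatched tensors fall back to the default type, the existing single-type workflow is unchanged.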

Thank you for considering this improvement to whisper.cpp!

Do you think this is feasible? If so, I can try to implement it myself. Do you have any suggestions or advice?
