Allow AutoModel to use from_config with custom code. #13123
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@classmethod
def from_config(
ohh I think maybe we should support this in from_pretrained()? even for scheduler and such weightless things, we load them using from_pretrained(), no?
we normally use from_config like this, would be confusing I think
pipe = DiffusionPipeline.from_pretrained(..)
pipe.scheduler = NEWSCHEDULER.from_config(pipe.scheduler.config)
This is true. But the SchedulerMixin's from_pretrained just runs from_config under the hood.
diffusers/src/diffusers/schedulers/scheduling_utils.py
Lines 147 to 154 in 985d83c
I thought perhaps we could make an exception for just the AutoModel class? The load_config and then from_config flow doesn't work too well when trying to load custom code.
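To make the "from_pretrained just runs from_config under the hood" point concrete, here is a minimal sketch with a toy class. `ToyScheduler`, its config file name, and its parameters are hypothetical stand-ins, not the actual `SchedulerMixin` code; the sketch only assumes that a weightless component can be rebuilt from its JSON config alone.

```python
import json
import tempfile
from pathlib import Path


class ToyScheduler:
    """Hypothetical weightless component; not the diffusers SchedulerMixin."""

    config_name = "scheduler_config.json"  # assumed file name for this sketch

    def __init__(self, num_train_timesteps=1000, beta_start=0.0001):
        self.num_train_timesteps = num_train_timesteps
        self.beta_start = beta_start

    @classmethod
    def load_config(cls, path):
        # Read the JSON config stored in a local directory.
        return json.loads((Path(path) / cls.config_name).read_text())

    @classmethod
    def from_config(cls, config):
        # Build the object purely from a config dict.
        return cls(**config)

    @classmethod
    def from_pretrained(cls, path):
        # There are no weights to load, so this is load_config + from_config.
        return cls.from_config(cls.load_config(path))


with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, ToyScheduler.config_name).write_text(
        json.dumps({"num_train_timesteps": 500, "beta_start": 0.001})
    )
    scheduler = ToyScheduler.from_pretrained(tmp)

print(scheduler.num_train_timesteps, scheduler.beta_start)
```

In this toy version the two entry points differ only in where the config comes from, which is the overlap being discussed here.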
ohh thanks for explaining. I think I got it now.
but is it possible to not re-use the from_config API with a different usage pattern? I think maybe we can add a new arg to from_pretrained to support this use case, e.g. load_weights=True, init_from_config=False, etc.
Even for custom components that come with weights, we need to support this use case, for example for training, or for cases like this one, where we need to initialize the model to inspect its structure for quantisation: https://github.com/cubiq/Mellon/blob/main/modules/ModularDiffusers/loaders.py#L181
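A rough sketch of what the suggested flag could look like, using a toy class. The `load_weights` argument is only the name floated in this comment, and `ToyModel` and the dict standing in for a hub repo are hypothetical; nothing here reflects an existing diffusers signature.

```python
class ToyModel:
    """Hypothetical model class used only to illustrate the proposal."""

    def __init__(self, hidden_size=4):
        self.hidden_size = hidden_size
        self.weights = None  # stays None until weights are loaded

    @classmethod
    def from_pretrained(cls, repo, load_weights=True):
        # `repo` is a plain dict standing in for a hub repository.
        model = cls(**repo["config"])
        if load_weights:
            model.weights = list(repo["weights"])
        return model


repo = {"config": {"hidden_size": 2}, "weights": [0.1, 0.2]}

full = ToyModel.from_pretrained(repo)                         # default: config + weights
skeleton = ToyModel.from_pretrained(repo, load_weights=False)  # structure only

print(full.weights, skeleton.weights)
```

The structure-only path is the one that would serve the training and quantisation-inspection cases mentioned above.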
or maybe a brand new method if you prefer
init_from_hub? from_pretrained_config? etc
IMO from_config is the preferred option because it works well with ComponentSpec's default_creation_method as well. So we don't need to introduce too many changes to modular to get this functionality.
Models currently support passing in a pretrained_model_name_or_path. I know that there was a plan to deprecate this behaviour (but I'm not clear as to why the deprecation is needed?)
diffusers/src/diffusers/configuration_utils.py
Lines 237 to 251 in 76af013
> but I'm not clear as to why the deprecation is needed?
I think it's before my time, but if I have to guess, they deprecated it after adding from_pretrained() to schedulers to make the API cleaner: from_pretrained handles loading (with or without weights) and from_config takes only a config dict
so if we undo the deprecation & introduce the from_config on AutoModel, the mental model would be something like this?
- from_pretrained: create the model from the hub_id with default behavior (load weights for models, create config for weightless things); this corresponds to the default_creation_method in ComponentSpec
- from_config: used to explicitly create from a config without weights; accepts both a config dict and a path
this sounds fine to me
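As a sketch of the second half of that mental model, a from_config that accepts either a config dict or a path could look like the toy below. `ToyComponent` and the `config.json` file name are hypothetical and chosen for this illustration; the real lookup logic in diffusers may differ.

```python
import json
import os
import tempfile


class ToyComponent:
    """Hypothetical weightless component used only for illustration."""

    def __init__(self, scale=1.0):
        self.scale = scale

    @classmethod
    def from_config(cls, config_or_path):
        if isinstance(config_or_path, dict):
            config = config_or_path
        else:
            # Treat anything else as a directory containing config.json.
            with open(os.path.join(config_or_path, "config.json")) as f:
                config = json.load(f)
        return cls(**config)


from_dict = ToyComponent.from_config({"scale": 2.0})

with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "config.json"), "w") as f:
        json.dump({"scale": 0.5}, f)
    from_path = ToyComponent.from_config(tmp)

print(from_dict.scale, from_path.scale)
```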
> IMO from_config is the preferred option because it works well with ComponentSpec's default_creation_method as well
One note on this though: default_creation_method is currently aligned with the current from_pretrained/from_config usage. E.g. the default creation method would be from_pretrained for weightless components as long as it needs to load a config from the hub, while the image processor uses from_config because it's created from a config dict defined in the component spec directly
not sure if this complicates anything so just something to keep in mind
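To illustrate the distinction, here is a toy spec that dispatches on its default creation method. `ToySpec`, `ToyComponent`, `FAKE_HUB`, and all field names other than default_creation_method are hypothetical; this is a sketch of the idea, not the actual ComponentSpec implementation.

```python
from dataclasses import dataclass, field
from typing import Optional

FAKE_HUB = {"org/scheduler": {"steps": 50}}  # stand-in for configs stored on the hub


class ToyComponent:
    """Hypothetical weightless component."""

    def __init__(self, **config):
        self.config = dict(config)

    @classmethod
    def from_config(cls, config):
        return cls(**config)

    @classmethod
    def from_pretrained(cls, repo_id):
        # The config has to be fetched first, then the object is built from it.
        return cls.from_config(FAKE_HUB[repo_id])


@dataclass
class ToySpec:
    name: str
    component_cls: type
    default_creation_method: str
    config: dict = field(default_factory=dict)
    repo_id: Optional[str] = None

    def create(self):
        if self.default_creation_method == "from_config":
            # Config dict is defined directly in the spec.
            return self.component_cls.from_config(self.config)
        if self.default_creation_method == "from_pretrained":
            # Config lives in a repo and must be loaded.
            return self.component_cls.from_pretrained(self.repo_id)
        raise ValueError(f"unknown creation method: {self.default_creation_method}")


image_processor = ToySpec(
    "image_processor", ToyComponent, "from_config", config={"vae_scale_factor": 8}
).create()
scheduler = ToySpec(
    "scheduler", ToyComponent, "from_pretrained", repo_id="org/scheduler"
).create()

print(image_processor.config, scheduler.config)
```

If from_config on AutoModel also accepted a hub path, the first branch would no longer imply "config defined in the spec", which is the alignment concern raised here.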
What does this PR do?
AutoModel currently doesn't support creating objects using from_config, but this can be useful in cases where we might want to load a custom component that doesn't have any weights associated with it, e.g. custom scheduler-like components for pipelines.
Fixes # (issue)
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.