tfm.optimization.AdafactorConfig
Configuration for Adafactor optimizer.
Inherits From: BaseOptimizerConfig, Config, ParamsDict
tfm.optimization.AdafactorConfig(
default_params: dataclasses.InitVar[Optional[Mapping[str, Any]]] = None,
restrictions: dataclasses.InitVar[Optional[List[str]]] = None,
clipnorm: Optional[float] = None,
clipvalue: Optional[float] = None,
global_clipnorm: Optional[float] = None,
name: str = 'Adafactor',
factored: bool = True,
multiply_by_parameter_scale: bool = True,
beta1: Optional[float] = None,
decay_rate: float = 0.8,
step_offset: int = 0,
clipping_threshold: float = 1.0,
min_dim_size_to_factor: int = 128,
epsilon1: float = 1e-30,
epsilon2: float = 0.001,
weight_decay: Optional[float] = None,
include_in_weight_decay: Optional[str] = None
)
The attributes of this class match the arguments of the Adafactor implementation.
| Attributes | |
|---|---|
| BUILDER | |
| default_params | Dataclass field |
| restrictions | Dataclass field |
| clipnorm | Dataclass field |
| clipvalue | Dataclass field |
| global_clipnorm | Dataclass field |
| name | Dataclass field |
| factored | Dataclass field |
| multiply_by_parameter_scale | Dataclass field |
| beta1 | Dataclass field |
| decay_rate | Dataclass field |
| step_offset | Dataclass field |
| clipping_threshold | Dataclass field |
| min_dim_size_to_factor | Dataclass field |
| epsilon1 | Dataclass field |
| epsilon2 | Dataclass field |
| weight_decay | Dataclass field |
| include_in_weight_decay | Dataclass field |
Methods
as_dict
as_dict()
Returns a dict representation of params_dict.ParamsDict.
For the nested params_dict.ParamsDict, a nested dict will be returned.
from_args
@classmethod
from_args(*args, **kwargs)
Builds a config from the given list of arguments.
from_json
@classmethod
from_json(file_path: str)
Wrapper for from_yaml.
from_yaml
@classmethod
from_yaml(file_path: str)
get
get(
key, value=None
)
Accesses through built-in dictionary get method.
lock
lock()
Makes the ParamsDict immutable.
override
override(
override_params, is_strict=True
)
Override the ParamsDict with a set of given params.
| Args | |
|---|---|
| override_params | A dict or a ParamsDict specifying the parameters to be overridden. |
| is_strict | A boolean specifying whether the override is strict. If True, keys in override_params must be present in the ParamsDict. If False, keys in override_params can differ from those currently defined in the ParamsDict; in that case, the ParamsDict is extended to include the new keys. |
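The strict and non-strict behaviors described above can be sketched on plain nested dicts. This is only an illustration of the semantics, not the library's implementation: the helper name `override_sketch`, the key `my_new_key`, and the choice of `KeyError` for a strict failure are assumptions made for this example.

```python
def override_sketch(params, override_params, is_strict=True):
    """Merges override_params into params (nested dicts), in place.

    Hypothetical helper: mirrors the documented is_strict semantics only.
    """
    for key, value in override_params.items():
        if key not in params:
            if is_strict:
                # Strict mode: unknown keys are rejected (exception type assumed).
                raise KeyError(f"Key {key!r} is not defined in params.")
            # Non-strict mode: extend params with the new key.
            params[key] = value
        elif isinstance(value, dict) and isinstance(params[key], dict):
            # Recurse so that nested overrides touch only the named leaves.
            override_sketch(params[key], value, is_strict)
        else:
            params[key] = value
    return params


# Field names and defaults taken from the signature above.
params = {"decay_rate": 0.8, "factored": True}

override_sketch(params, {"decay_rate": 0.9})                    # existing key
override_sketch(params, {"my_new_key": 1}, is_strict=False)     # new key allowed
```

With `is_strict=True`, the second call would fail instead of adding `my_new_key`.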
replace
replace(
**kwargs
)
Returns an unlocked copy with the given overrides applied, leaving the current config unchanged.
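The copy-and-override idea can be illustrated with the standard library's `dataclasses.replace`, which behaves similarly for plain dataclasses. The class `AdafactorFieldsSketch` below is a hypothetical stand-in that borrows a few field names and defaults from the signature above; it is not the tfm class, and it does not model locking.

```python
import dataclasses
from typing import Optional


@dataclasses.dataclass
class AdafactorFieldsSketch:
    """Hypothetical stand-in with a few fields from the signature above."""
    factored: bool = True
    decay_rate: float = 0.8
    min_dim_size_to_factor: int = 128
    weight_decay: Optional[float] = None


base = AdafactorFieldsSketch()

# dataclasses.replace builds a new object; `base` itself is not modified.
tuned = dataclasses.replace(base, decay_rate=0.9, weight_decay=0.01)
```

After this runs, `base` still holds the defaults, while `tuned` carries the two overridden values and inherits the rest.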
validate
validate()
Validates the consistency of the parameters against the restrictions.
This method validates the internal consistency using the pre-defined list of restrictions. A restriction is defined as a string which specifies a binary operation. The supported binary operations are {'==', '!=', '<', '<=', '>', '>='}. Note that the meaning of these operators is consistent with the underlying Python implementation. Users should make sure the restrictions they define make sense for the types involved.
For example, for a ParamsDict like the following:

    a:
      a1: 1
      a2: 2
    b:
      bb:
        bb1: 10
        bb2: 20
      ccc:
        a1: 1
        a3: 3

one can define two restrictions like this: ['a.a1 == b.ccc.a1', 'a.a2 <= b.bb.bb2']
These restrictions enforce that a.a1 (1) equals b.ccc.a1 (1) and that a.a2 (2) is less than or equal to b.bb.bb2 (20).
| Raises | |
|---|---|
| KeyError | If either of the following happens: (1) a parameter named in a restriction is not defined in the ParamsDict, or (2) an inconsistency violating a restriction is found. |
| ValueError | If the restriction defined in the string is not supported. |
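A minimal sketch of how such restriction strings could be checked against nested dicts is shown below. It is not the library's implementation: the helper names `lookup` and `check_restrictions` are hypothetical, and the sketch handles only restrictions of the form "dotted.key OP dotted.key". The exception types follow the Raises table above.

```python
import operator

# Operators named in the text. Two-character tokens come first so that
# '<=' is not mistaken for '<' (dicts preserve insertion order).
_OPS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<=": operator.le,
    ">=": operator.ge,
    "<": operator.lt,
    ">": operator.gt,
}


def lookup(params, dotted_key):
    """Follows a dotted path such as 'b.bb.bb2' through nested dicts."""
    node = params
    for part in dotted_key.split("."):
        node = node[part]  # raises KeyError if a key is missing
    return node


def check_restrictions(params, restrictions):
    """Raises if any restriction is unsupported, undefined, or violated."""
    for restriction in restrictions:
        for token, op in _OPS.items():
            if token in restriction:
                left, right = (s.strip() for s in restriction.split(token))
                break
        else:
            raise ValueError(f"Unsupported restriction: {restriction!r}")
        if not op(lookup(params, left), lookup(params, right)):
            raise KeyError(f"Restriction violated: {restriction!r}")


# The example ParamsDict from the text, written as a plain nested dict.
params = {
    "a": {"a1": 1, "a2": 2},
    "b": {"bb": {"bb1": 10, "bb2": 20}, "ccc": {"a1": 1, "a3": 3}},
}

# Both restrictions hold for these values, so no exception is raised.
check_restrictions(params, ["a.a1 == b.ccc.a1", "a.a2 <= b.bb.bb2"])
```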
__contains__
__contains__(
key
)
Implements the membership test operator.
__eq__
__eq__(
other
)
| Class Variables | |
|---|---|
| IMMUTABLE_TYPES | (<class 'str'>, <class 'int'>, <class 'float'>, <class 'bool'>, <class 'NoneType'>) |
| RESERVED_ATTR | ['_locked', '_restrictions'] |
| SEQUENCE_TYPES | (<class 'list'>, <class 'tuple'>) |
| beta1 | None |
| clipnorm | None |
| clipping_threshold | 1.0 |
| clipvalue | None |
| decay_rate | 0.8 |
| default_params | None |
| epsilon1 | 1e-30 |
| epsilon2 | 0.001 |
| factored | True |
| global_clipnorm | None |
| include_in_weight_decay | None |
| min_dim_size_to_factor | 128 |
| multiply_by_parameter_scale | True |
| name | 'Adafactor' |
| restrictions | None |
| step_offset | 0 |
| weight_decay | None |