[Auto] update AuraFlow transformer docstrings (cluster-9591-5): merged 1 of 5 PRs #3

		@@ -202,7 +202,7 @@ class AuraFlowJointTransformerBlock(nn.Module):
		* No bias in the attention blocks
		* Most LayerNorms are in FP32

	-	Parameters:
	+	Args:
		dim (`int`): The number of channels in the input and output.
		num_attention_heads (`int`): The number of heads to use for multi-head attention.
		attention_head_dim (`int`): The number of channels in each head.
		@@ -279,21 +279,21 @@ class AuraFlowTransformer2DModel(ModelMixin, AttentionMixin, ConfigMixin, PeftAd
		r"""
		A 2D Transformer model as introduced in AuraFlow (https://blog.fal.ai/auraflow/).

	-	Parameters:
	+	Args:
		sample_size (`int`): The width of the latent images. This is fixed during training since
		it is used to learn a number of position embeddings.
		patch_size (`int`): Patch size to turn the input data into small patches.
	-	in_channels (`int`, optional, defaults to 4): The number of channels in the input.
	-	num_mmdit_layers (`int`, optional, defaults to 4): The number of layers of MMDiT Transformer blocks to use.
	-	num_single_dit_layers (`int`, optional, defaults to 32):
	+	in_channels (`int`, optional, defaults to `4`): The number of channels in the input.
	+	num_mmdit_layers (`int`, optional, defaults to `4`): The number of layers of MMDiT Transformer blocks to use.
	+	num_single_dit_layers (`int`, optional, defaults to `32`):
		The number of layers of Transformer blocks to use. These blocks use concatenated image and text
		representations.
	-	attention_head_dim (`int`, optional, defaults to 256): The number of channels in each head.
	-	num_attention_heads (`int`, optional, defaults to 12): The number of heads to use for multi-head attention.
	+	attention_head_dim (`int`, optional, defaults to `256`): The number of channels in each head.
	+	num_attention_heads (`int`, optional, defaults to `12`): The number of heads to use for multi-head attention.
		joint_attention_dim (`int`, optional): The number of `encoder_hidden_states` dimensions to use.
		caption_projection_dim (`int`): Number of dimensions to use when projecting the `encoder_hidden_states`.
	-	out_channels (`int`, defaults to 4): Number of output channels.
	-	pos_embed_max_size (`int`, defaults to 1024): Maximum positions to embed from the image latents.
	+	out_channels (`int`, defaults to `4`): Number of output channels.
	+	pos_embed_max_size (`int`, defaults to `1024`): Maximum positions to embed from the image latents.
		"""

		_no_split_modules = ["AuraFlowJointTransformerBlock", "AuraFlowSingleTransformerBlock", "AuraFlowPatchEmbed"]
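
Reviewer note: to make the documented defaults easier to sanity-check, here is a minimal sketch that collects them in a plain dataclass and derives a few sizes. `AuraFlowConfigSketch` is a hypothetical stand-in, not the model class. The docstring gives no defaults for `sample_size` and `patch_size`, so they are required here, and the values in the usage lines are illustrative only. The sketch also assumes the common convention that the hidden width is `num_attention_heads * attention_head_dim`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuraFlowConfigSketch:
    """Hypothetical container for the arguments listed in the docstring."""

    # No defaults are documented for these two, so callers must supply them.
    sample_size: int
    patch_size: int
    # Defaults below are copied from the docstring in this diff.
    in_channels: int = 4
    num_mmdit_layers: int = 4
    num_single_dit_layers: int = 32
    attention_head_dim: int = 256
    num_attention_heads: int = 12
    out_channels: int = 4
    pos_embed_max_size: int = 1024

    @property
    def inner_dim(self) -> int:
        # Assumption: hidden width = heads * channels per head.
        return self.num_attention_heads * self.attention_head_dim

    @property
    def total_blocks(self) -> int:
        # MMDiT blocks plus the single (concatenated image/text) blocks.
        return self.num_mmdit_layers + self.num_single_dit_layers

    @property
    def num_patches(self) -> int:
        # Square latent of width `sample_size`, cut into square patches.
        per_side = self.sample_size // self.patch_size
        return per_side * per_side


# Illustrative values only; they are not taken from the docstring.
cfg = AuraFlowConfigSketch(sample_size=64, patch_size=2)
print(cfg.inner_dim, cfg.total_blocks, cfg.num_patches)
```
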
		@@ -368,8 +368,10 @@ def __init__(
		# Copied from diffusers.models.unets.unet_2d_condition.UNet2DConditionModel.fuse_qkv_projections with FusedAttnProcessor2_0->FusedAuraFlowAttnProcessor2_0
		def fuse_qkv_projections(self):
		"""
	-	Enables fused QKV projections. For self-attention modules, all projection matrices (i.e., query, key, value)
	-	are fused. For cross-attention modules, key and value projection matrices are fused.
	+	Enables fused QKV projections.
	+
	+	For self-attention modules, all projection matrices (i.e., query, key, value) are fused.
	+	For cross-attention modules, only key and value projection matrices are fused.

		> [!WARNING] > This API is 🧪 experimental.
		"""
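
Reviewer note: the reworded docstring distinguishes the self-attention and cross-attention cases. The sketch below illustrates the general idea with plain Python lists. It is not the `FusedAuraFlowAttnProcessor2_0` implementation, and the helper names (`matvec`, `fuse_rows`, `split`) are hypothetical. It assumes each projection is a bias-free linear map stored as an `(out, in)` matrix, so fusing amounts to stacking the matrices along the output dimension and splitting the result afterwards.

```python
def matvec(weight, x):
    """Apply a bias-free linear projection: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def fuse_rows(*weights):
    """Stack projection matrices along the output (row) dimension."""
    fused = []
    for weight in weights:
        fused.extend(weight)
    return fused


def split(values, sizes):
    """Cut a fused output back into per-projection chunks."""
    chunks, start = [], 0
    for size in sizes:
        chunks.append(values[start:start + size])
        start += size
    return chunks


# Toy 2x2 projections with integer entries so comparisons are exact.
w_q = [[1, 0], [0, 1]]
w_k = [[2, 1], [1, 2]]
w_v = [[0, 3], [3, 0]]

# Self-attention: query, key and value all read the same input,
# so one fused matrix replaces three separate projections.
x = [5, 7]
q, k, v = split(matvec(fuse_rows(w_q, w_k, w_v), x), [2, 2, 2])

# Cross-attention: the query reads x, but key and value read a different
# input, so only the key and value matrices can share one projection.
context = [1, 4]
k_ctx, v_ctx = split(matvec(fuse_rows(w_k, w_v), context), [2, 2])
```

The fused path yields the same vectors as the separate projections; the difference is one larger matrix product in place of two or three smaller ones.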