---
description: 'The multi_head_attention layer for configurable components'
---

# multi_head_attention layer

Adds an attention layer with multiple heads. Attention weights are normalized with softmax, and the layer ends in a linear projection to the output size.
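
For intuition, here is a minimal PyTorch sketch of the standard scaled dot-product formulation this description suggests: per-head softmax attention followed by a final linear projection. The function name, parameter names, and shapes are illustrative assumptions, not this library's actual implementation.

```python
import torch
import torch.nn.functional as F

def multi_head_attention(x, w_q, w_k, w_v, w_out, heads=4):
    """x: any 2-D input (seq_len, features); returns (seq_len, size)."""
    seq_len, _ = x.shape
    size = w_out.shape[0]
    head_dim = size // heads  # assumes size divides evenly among heads

    # Project the input to queries, keys, and values, then split into
    # heads: (seq_len, size) -> (heads, seq_len, head_dim).
    def project(w):
        return (x @ w).view(seq_len, heads, head_dim).transpose(0, 1)

    q, k, v = project(w_q), project(w_k), project(w_v)

    # Scaled dot-product attention; softmax normalizes over the key axis.
    scores = q @ k.transpose(-2, -1) / head_dim ** 0.5
    attn = F.softmax(scores, dim=-1)
    heads_out = attn @ v  # (heads, seq_len, head_dim)

    # Concatenate the heads and apply the final linear projection.
    concat = heads_out.transpose(0, 1).reshape(seq_len, size)
    return concat @ w_out
```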

## required arguments

`size` - output size

## optional arguments

`heads` - number of attention heads to use. Defaults to 4.

## input size

Any 2-D tensor

## output size

The `size` argument (the first argument)

## syntax

```json
  "multi_head_attention SIZE heads=HEADS"
```

## examples

```json
  "multi_head_attention 1024 heads=4"
```
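
As a rough shape check, the example above can be run through the sketch from the description section. The input width (300) and sequence length (50) are arbitrary assumptions here, since any 2-D tensor is accepted; the output's last dimension is the `size` argument, 1024.

```python
import torch

# Hypothetical shapes for "multi_head_attention 1024 heads=4", reusing
# the multi_head_attention sketch defined earlier in this page.
x = torch.randn(50, 300)  # (seq_len, features): any 2-D input
w_q, w_k, w_v = (torch.randn(300, 1024) * 0.02 for _ in range(3))
w_out = torch.randn(1024, 1024) * 0.02

out = multi_head_attention(x, w_q, w_k, w_v, w_out, heads=4)
print(out.shape)  # torch.Size([50, 1024])
```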
