---
title: multi_head_attention
description: 'The multi_head_attention layer for configurable components'
---

# multi_head_attention layer

Adds an attention layer with multiple heads. Attention weights are computed with softmax, and the result is passed through a final linear layer.
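
The configured layer corresponds to standard multi-head self-attention. As a rough point of reference (a minimal PyTorch sketch, not this library's implementation), assuming the input's feature dimension already equals `size`:

```python
import torch
import torch.nn as nn

# Reference sketch only -- not this library's actual implementation.
size, heads = 1024, 4                  # mirrors the `size` and `heads` arguments
attn = nn.MultiheadAttention(embed_dim=size, num_heads=heads)

x = torch.randn(16, size)              # any 2-d tensor: (sequence, features);
                                       # PyTorch 1.11+ accepts unbatched 2-d input
out, _ = attn(x, x, x)                 # softmax over scaled dot products,
                                       # ending in a linear projection
print(out.shape)                       # torch.Size([16, 1024]) -- matches `size`
```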

## required arguments

`size` - Output size

## optional arguments

`heads` - Number of heads to use. Defaults to 4

## input size

Any 2-d tensor

## output size

The first argument (`size`)

## syntax

```json
"multi_head_attention SIZE heads=HEADS"
```

## examples

```json
"multi_head_attention 1024 heads=4"
```
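
Because `heads` is optional, it can be omitted to fall back to the default of 4:

```json
"multi_head_attention 1024"
```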