HyperGAN comes with a simple DSL for designing your own network architectures. A ConfigurableComponent can be used to create custom designs when building your models.
For example, a discriminator component can be defined as:
..."discriminator":{"class": "class:hypergan.discriminators.configurable_discriminator.ConfigurableDiscriminator","layers":["conv 32","relu","conv 64 ","relu","conv 128","relu","conv 256","relu","flatten","linear 1"]​}
This creates a network of four convolution layers, each striding down the spatial dimensions while increasing the filter count, followed by a linear layer with no activation. A discriminator should end in a logit (the raw output of a conv/linear/etc layer).
"generator": {"class": "class:hypergan.discriminators.configurable_discriminator.ConfigurableDiscriminator","layers": ["linear 128 name=w","relu","resize_conv 256","const 1*1*128","adaptive_instance_norm","relu","resize_conv 256","adaptive_instance_norm","relu","resize_conv 256","adaptive_instance_norm","relu","resize_conv 256","adaptive_instance_norm","relu","resize_conv 256","adaptive_instance_norm","relu","resize_conv 128","adaptive_instance_norm","relu","resize_conv 64","adaptive_instance_norm","relu","resize_conv 32","adaptive_instance_norm","relu","resize_conv 3","tanh"]​}
This is a generator. A generator takes in a latent space and returns a data type that matches the input.
`adaptive_instance_norm` looks up the style layer (named `w` by default) and uses it to perform adaptive instance normalization.
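Pulled from the generator example above, the named style layer and an `adaptive_instance_norm` that consumes it look like this:

```json
"layers": [
  "linear 128 name=w",
  "relu",
  "const 1*1*128",
  "adaptive_instance_norm",
  "relu"
]
```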
HyperGAN defaults to the image space of (-1, 1), which is the same range as tanh.
Common layer types:
`linear [outputs] (options)`
Creates a linear layer.
`conv [filters] (options)`
A convolution. Stride is applied if set. For example, `conv [filters] filter=5 stride=1` sets the stride to 1 and the filter size to 5.
`deconv [filters] (options)`
Doubles the width and height of your tensor. Known as conv2d_transpose in the literature.
`resize_conv [output filters] (options)`
Resizes the tensor to double its width and height, then applies a convolution.
`reshape [size]`
size can be:

* -1
* `*`-delimited dimensions, e.g. `4*4*256`

`flatten`
Same as `reshape -1`.
`const [size]`
size is `*`-delimited dimensions, e.g. `1*1*128`.
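As a sketch, the layer types above compose into a small generator stack. The layer strings follow the DSL as documented, but the sizes are illustrative: `linear 4096` is reshaped to `4*4*256`, doubled twice by `deconv`, then projected to 3 channels in the `(-1, 1)` range by `tanh`:

```json
"layers": [
  "linear 4096",
  "relu",
  "reshape 4*4*256",
  "deconv 128",
  "relu",
  "deconv 64",
  "relu",
  "resize_conv 3",
  "tanh"
]
```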
Configuration in HyperGAN uses JSON files. You can create a new config with the default template by running `hypergan new mymodel`. You can see all templates with `hypergan new mymodel -l`.
A hypergan configuration contains all hyperparameters for reproducing the full GAN.
In the original DCGAN you will have each of the following components:

* Distribution (latent space)
* Generator
* Discriminator
* Loss
* Trainer
Other architectures may differ. See the configuration templates.
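A minimal sketch of how these components sit together in one configuration file. Only the `"generator"` and `"discriminator"` keys are confirmed by the examples above; the other key names and all class paths are placeholders:

```json
{
  "distribution": { "class": "class:..." },
  "generator": { "class": "class:...", "layers": ["..."] },
  "discriminator": { "class": "class:...", "layers": ["..."] },
  "loss": { "class": "class:..." },
  "trainer": { "class": "class:..." }
}
```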
Each of the component types listed below shares a common base class.
A generator is responsible for projecting an encoding (sometimes called z space) to an output (normally an image). A single GAN object from HyperGAN has one generator.
A discriminator (sometimes called a critic) has two main purposes: to separate G from X, and to give the generator a useful error signal to learn from.
Each component in the GAN can be specified with a flexible DSL inside the JSON file.
Wasserstein Loss is simply:

```
d_loss = d_real - d_fake
g_loss = d_fake
```
d_loss and g_loss can be reversed as well - just add a '-' sign.
Least squares loss (lsgan) is:

```
d_loss = (d_real - b)**2 - (d_fake - a)**2
g_loss = (d_fake - c)**2
```

a, b, and c are all hyperparameters.
Includes support for Improved GAN. See `hypergan/losses/standard_gan_loss.py` for details.
Use with the AutoencoderDiscriminator. See the `began` configuration template.
| attribute | description | type |
| --- | --- | --- |
| batch_norm | batch_norm_1, layer_norm_1, or None | f(batch_size, name)(net):net |
| create | Called during graph creation | f(config, gan, net):net |
| discriminator | Set to restrict this loss to a single discriminator (defaults to all) | int >= 0 or None |
| label_smooth | improved gan - Label smoothing. | float > 0 |
| labels | lsgan - A triplet of values containing the (a, b, c) terms. | [a, b, c] floats |
| reduce | Reduces the output before applying the loss | f(net):net |
| reverse | Reverses the loss terms, if applicable | boolean |
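As a sketch, a loss entry using the attributes above might look like this. The module path comes from the file referenced earlier, but the class name inside it and the attribute values are assumptions:

```json
"loss": {
  "class": "class:hypergan.losses.standard_gan_loss.StandardGanLoss",
  "label_smooth": 0.1,
  "reverse": false
}
```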
The set of trainable variables is determined by the GAN implementation. These variables are the same across all trainers.
The consensus trainer trains G and D at the same time. Resize Conv is known not to work with this technique (PR welcome).
Configuration
| attribute | description | type |
| --- | --- | --- |
| learn_rate | Learning rate for the generator | float >= 0 |
| beta1 | (adam) | float >= 0 |
| beta2 | (adam) | float >= 0 |
| epsilon | (adam) | float >= 0 |
| decay | (rmsprop) | float >= 0 |
| momentum | (rmsprop) | float >= 0 |
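For example, a consensus trainer entry might look like the sketch below. The class path is left as a placeholder and the numbers are illustrative (the beta and epsilon values are the usual Adam defaults):

```json
"trainer": {
  "class": "class:...",
  "learn_rate": 0.0001,
  "beta1": 0.9,
  "beta2": 0.999,
  "epsilon": 1e-8
}
```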
Original GAN training. Locks the generator weights while training the discriminator, and vice versa.
Configuration
| attribute | description | type |
| --- | --- | --- |
| g_learn_rate | Learning rate for the generator | float >= 0 |
| g_beta1 | (adam) | float >= 0 |
| g_beta2 | (adam) | float >= 0 |
| g_epsilon | (adam) | float >= 0 |
| g_decay | (rmsprop) | float >= 0 |
| g_momentum | (rmsprop) | float >= 0 |
| d_learn_rate | Learning rate for the discriminator | float >= 0 |
| d_beta1 | (adam) | float >= 0 |
| d_beta2 | (adam) | float >= 0 |
| d_epsilon | (adam) | float >= 0 |
| d_decay | (rmsprop) | float >= 0 |
| d_momentum | (rmsprop) | float >= 0 |
| clipped_gradients | If set, gradients will be clipped to this value. | float > 0 or None |
| d_clipped_weights | If set, the discriminator weights will be clipped to this value. | float > 0 or None |
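A sketch of an alternating trainer entry with separate generator and discriminator optimizer settings. The class path is a placeholder and the values are illustrative; `beta1 = 0.5` is a common GAN choice, and `0.01` is the weight-clip value from the WGAN paper:

```json
"trainer": {
  "class": "class:...",
  "g_learn_rate": 0.0001,
  "g_beta1": 0.5,
  "g_beta2": 0.999,
  "d_learn_rate": 0.0001,
  "d_beta1": 0.5,
  "d_beta2": 0.999,
  "d_clipped_weights": 0.01
}
```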
Only trains on good z candidates.
Trains on a schedule.
An evolution-based trainer that plays a subgame between multiple generators/discriminators.