All parameters of the inference model
3.2.5. MODEL PROPERTIES AND COMPLEXITY
(1) Choice of the hidden-canvas function: a pre-image is produced in the hidden space and then transformed back to image space.
One of the largest deviations is the introduction of the hidden canvas into the generative model, which provides important richness, since it allows a pre-image to be constructed in a hidden space before a final corrective transformation, using the function f_o, is applied.
(2) Inference network: shares the parameters of the LSTM with the prior; removing this additional recursive function has no effect on performance.
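The hidden-canvas idea above can be sketched in a few lines: a canvas with the image's spatial size but extra channels accumulates sequential writes, and a final corrective map (the paper's f_o) converts it to per-pixel Bernoulli means. This is a toy numpy sketch; the random per-step deltas and the 1x1 linear map `w_o` are stand-ins for the real attention-based writer and learned output transformation.

```python
import numpy as np

rng = np.random.default_rng(0)

H = W = 28          # image size (MNIST-like)
C = 4               # hidden-canvas channels (the notes mention 4)
T = 20              # number of sequential steps

# Hidden canvas: same spatial size as the image but with C channels.
canvas = np.zeros((C, H, W))

# Toy stand-in for the per-step writes produced by the generative LSTM;
# in the real model each delta comes from the attention/writer network.
for t in range(T):
    delta = 0.1 * rng.standard_normal((C, H, W))
    canvas += delta  # additive canvas update

# Final corrective transformation f_o: here a hypothetical 1x1 linear map
# from C canvas channels to 1 image channel, followed by a sigmoid so the
# output can parameterize a per-pixel Bernoulli likelihood.
w_o = rng.standard_normal(C) / np.sqrt(C)
logits = np.tensordot(w_o, canvas, axes=([0], [0]))  # shape (H, W)
bernoulli_means = 1.0 / (1.0 + np.exp(-logits))
```

The key design point the notes highlight: the pre-image lives in a richer space (4 channels here) than the final image, and only the last transformation maps it down to pixel probabilities.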
4. Image Generation and Analysis
Data: binary images
Pixel probability model: Bernoulli likelihood
Units: 400 LSTM hidden units
Spatial transformer: 12x12 kernels, used for recognition or generative attention
Latent variables z_t: 4-D Gaussian distributions
Time steps: 20-80
Hidden canvas: same size as the image, with 4 channels
Iterations: approximately 800K
Mini-batch size: 24
Training likelihood bounds: averaged over the last 1K training iterations
Test likelihood bounds: mean bound over 24,000 random samples
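Since the pixels are modeled with a Bernoulli likelihood, the reconstruction term inside the likelihood bounds above is just the binary cross-entropy between the image and the predicted pixel means. A minimal sketch (the helper name and toy values are illustrative, not from the paper):

```python
import numpy as np

def bernoulli_log_likelihood(x, p, eps=1e-7):
    """Per-image Bernoulli log-likelihood:
    sum_i [ x_i * log(p_i) + (1 - x_i) * log(1 - p_i) ]."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return np.sum(x * np.log(p) + (1 - x) * np.log1p(-p))

# Toy check: a 2x2 binary "image" and the model's pixel means.
x = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.9, 0.1], [0.2, 0.8]])
ll = bernoulli_log_likelihood(x, p)  # 2*ln(0.9) + 2*ln(0.8) ≈ -0.657
```

This per-image log-likelihood is what gets combined with the KL terms of the latent variables to form the variational bounds reported on the training and test sets.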
4.1. MNIST and Multi-MNIST Data Sets
1. The binarized MNIST data set of Salakhutdinov & Murray (2008): 28x28 binary images with 50,000 training and 10,000 test images.
2. The Multi-MNIST data set: 64x64 images with two MNIST digits placed at random positions.
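The Multi-MNIST construction can be sketched as follows: paste two 28x28 digits at random offsets on a 64x64 canvas. The helper and the max-merge rule for overlaps are assumptions for illustration; the paper does not specify the exact compositing details.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_multi_mnist(digit_a, digit_b, size=64):
    """Place two 28x28 binary digits at random positions on a size x size canvas.

    Overlapping pixels are merged with a pixel-wise maximum so the
    result stays binary. (Hypothetical helper, for illustration only.)
    """
    canvas = np.zeros((size, size))
    for digit in (digit_a, digit_b):
        h, w = digit.shape
        top = rng.integers(0, size - h + 1)
        left = rng.integers(0, size - w + 1)
        canvas[top:top + h, left:left + w] = np.maximum(
            canvas[top:top + h, left:left + w], digit)
    return canvas

# Toy binary "digits" used in place of real MNIST images.
a = (rng.random((28, 28)) > 0.5).astype(float)
b = (rng.random((28, 28)) > 0.5).astype(float)
img = make_multi_mnist(a, b)
```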
Importance of each step: these results also indicate that longer sequences can lead to better performance.
The latent variables z_t have a diminishing contribution to the model as the number of steps grows.
Efficiently allocate and decide on the number of latent variables to use.
4.2. Omniglot
The Omniglot data set:
105x105 binary images across 1,628 classes, with just 20 images per class.
4.3. Multi-PIE
The Multi-PIE data set:
48x48 RGB face images from various viewpoints, converted to grayscale.
Trained the model on a subset comprising all 15 viewpoints but only 3 of the 19 illumination conditions.
93,130 training samples and 10,000 test samples.
5. One-Shot Generalization
Three tasks to evaluate one-shot generalization:
(1) unconditional (free) generation,
(2) generation of novel variations of a given exemplar,
(3) generation of representative samples from a novel alphabet.
Weak one-shot generalization test:
The model is trained on similar characters from the same alphabets and is expected to transfer this knowledge to the weak generation task.
Training data consists of all available alphabets; 3 character types from each alphabet were removed to form the test set.
Strong one-shot generalization test:
The model is trained on characters from 30 alphabets; the remaining 20 alphabets are used for testing.
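The two splits above differ in what is held out: the weak test removes 3 character types from every alphabet, while the strong test holds out 20 whole alphabets. A minimal sketch of both splits (function and variable names are hypothetical):

```python
import random

def omniglot_splits(alphabets, seed=0):
    """Sketch of the weak and strong generalization splits.

    `alphabets` maps alphabet name -> list of character-type ids.
    Weak split: hold out 3 character types from every alphabet.
    Strong split: train on 30 alphabets, test on the remaining 20.
    """
    rng = random.Random(seed)

    # Weak one-shot generalization: every alphabet appears in training,
    # but 3 character types per alphabet are reserved for testing.
    weak_train, weak_test = {}, {}
    for name, chars in alphabets.items():
        held_out = rng.sample(chars, 3)
        weak_test[name] = held_out
        weak_train[name] = [c for c in chars if c not in held_out]

    # Strong one-shot generalization: entire alphabets are held out.
    names = sorted(alphabets)
    rng.shuffle(names)
    strong_train, strong_test = names[:30], names[30:]
    return weak_train, weak_test, strong_train, strong_test

# Toy example: 50 alphabets with 20 character types each.
alphabets = {f"alpha_{i}": list(range(20)) for i in range(50)}
w_train, w_test, s_train, s_test = omniglot_splits(alphabets)
```

The strong split is the harder test: at generation time the model must produce plausible characters from alphabets it has never seen during training.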