All parameters of the inference model
3.2.5. MODEL PROPERTIES AND COMPLEXITY
(1) Choice of the hidden-canvas function: a pre-image is produced in the hidden space and then transformed back to image space.
One of the largest deviations is the introduction of the hidden canvas into the generative model, which provides important richness, since it allows a pre-image to be constructed in a hidden space before a final corrective transformation, using the function f_o, is applied.
(2) Inference network: shares the parameters of the LSTM with the prior; removing this additional recursive function has no effect on performance.
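The hidden-canvas idea above can be sketched in a few lines: a canvas with the image's spatial size but extra channels accumulates sequential writes, and a final corrective map (the paper's f_o) converts it to per-pixel Bernoulli means. This is a toy numpy sketch; the random per-step deltas and the 1x1 linear map `w_o` are stand-ins for the real attention-based writer and learned output transformation.

```python
import numpy as np

rng = np.random.default_rng(0)

H = W = 28          # image size (MNIST-like)
C = 4               # hidden-canvas channels (the notes mention 4)
T = 20              # number of sequential steps

# Hidden canvas: same spatial size as the image but with C channels.
canvas = np.zeros((C, H, W))

# Toy stand-in for the per-step writes produced by the generative LSTM;
# in the real model each delta comes from the attention/writer network.
for t in range(T):
    delta = 0.1 * rng.standard_normal((C, H, W))
    canvas += delta  # additive canvas update

# Final corrective transformation f_o: here a hypothetical 1x1 linear map
# from C canvas channels to 1 image channel, followed by a sigmoid so the
# output can parameterize a per-pixel Bernoulli likelihood.
w_o = rng.standard_normal(C) / np.sqrt(C)
logits = np.tensordot(w_o, canvas, axes=([0], [0]))  # shape (H, W)
bernoulli_means = 1.0 / (1.0 + np.exp(-logits))
```

The key design point the notes highlight: the pre-image lives in a richer space (4 channels here) than the final image, and only the last transformation maps it down to pixel probabilities.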
4. Image Generation and Analysis
Data: binary images
Pixel probability model: Bernoulli likelihood
Units: 400 LSTM hidden units
Spatial transformer: 12x12 kernels, used for recognition or generative attention
Latent variables z_t: 4-D Gaussian distributions
Time steps: 20-80
Hidden canvas: same size as the image, with 4 channels
Iterations: approximately 800K
Mini-batch size: 24
Training likelihood bounds: averaged over the last 1K training iterations
Test likelihood bounds: mean bound over 24,000 random samples
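Since the pixels are modeled with a Bernoulli likelihood, the reconstruction term inside the likelihood bounds above is just the binary cross-entropy between the image and the predicted pixel means. A minimal sketch (the helper name and toy values are illustrative, not from the paper):

```python
import numpy as np

def bernoulli_log_likelihood(x, p, eps=1e-7):
    """Per-image Bernoulli log-likelihood:
    sum_i [ x_i * log(p_i) + (1 - x_i) * log(1 - p_i) ]."""
    p = np.clip(p, eps, 1 - eps)  # avoid log(0)
    return np.sum(x * np.log(p) + (1 - x) * np.log1p(-p))

# Toy check: a 2x2 binary "image" and the model's pixel means.
x = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.9, 0.1], [0.2, 0.8]])
ll = bernoulli_log_likelihood(x, p)  # 2*ln(0.9) + 2*ln(0.8) ≈ -0.657
```

This per-image log-likelihood is what gets combined with the KL terms of the latent variables to form the variational bounds reported on the training and test sets.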
4.1. MNIST and Multi-MNIST Data Sets
1. The binarized MNIST data set of Salakhutdinov & Murray (2008): 28x28 binary images with 50,000 training and 10,000 test images.
2. The Multi-MNIST data set: 64x64 images with two MNIST digits placed at random positions.
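The Multi-MNIST construction can be sketched as follows: paste two 28x28 digits at random offsets on a 64x64 canvas. The helper and the max-merge rule for overlaps are assumptions for illustration; the paper does not specify the exact compositing details.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_multi_mnist(digit_a, digit_b, size=64):
    """Place two 28x28 binary digits at random positions on a size x size canvas.

    Overlapping pixels are merged with a pixel-wise maximum so the
    result stays binary. (Hypothetical helper, for illustration only.)
    """
    canvas = np.zeros((size, size))
    for digit in (digit_a, digit_b):
        h, w = digit.shape
        top = rng.integers(0, size - h + 1)
        left = rng.integers(0, size - w + 1)
        canvas[top:top + h, left:left + w] = np.maximum(
            canvas[top:top + h, left:left + w], digit)
    return canvas

# Toy binary "digits" used in place of real MNIST images.
a = (rng.random((28, 28)) > 0.5).astype(float)
b = (rng.random((28, 28)) > 0.5).astype(float)
img = make_multi_mnist(a, b)
```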
Importance of each step: these results also indicate that longer sequences can lead to better performance.
The latent variables z_t have a diminishing contribution to the model as the number of steps grows.
Efficiently allocate and decide on the number of latent variables to use.
4.2. Omniglot
The Omniglot data set:
105x105 binary images across 1,628 classes, with just 20 images per class.
4.3. Multi-PIE
The Multi-PIE data set:
48x48 RGB face images from various viewpoints, converted to grayscale.
Trained the model on a subset comprising all 15 viewpoints but only 3 of the 19 illumination conditions.
93,130 training samples and 10,000 test samples.
5. One-Shot Generalization
Three tasks to evaluate one-shot generalization:
(1) unconditional (free) generation,
(2) generation of novel variations of a given exemplar,
(3) generation of representative samples from a novel alphabet.
Weak one-shot generalization test:
The model is trained on similar characters from the same alphabets and is expected to transfer this knowledge to the weak generation task.
Training data consists of all available alphabets; 3 character types from each alphabet were removed to form the test set.
Strong one-shot generalization test:
The model is trained on characters from 30 alphabets; the remaining 20 alphabets are used for testing.
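The two splits above differ in what is held out: the weak test removes 3 character types from every alphabet, while the strong test holds out 20 whole alphabets. A minimal sketch of both splits (function and variable names are hypothetical):

```python
import random

def omniglot_splits(alphabets, seed=0):
    """Sketch of the weak and strong generalization splits.

    `alphabets` maps alphabet name -> list of character-type ids.
    Weak split: hold out 3 character types from every alphabet.
    Strong split: train on 30 alphabets, test on the remaining 20.
    """
    rng = random.Random(seed)

    # Weak one-shot generalization: every alphabet appears in training,
    # but 3 character types per alphabet are reserved for testing.
    weak_train, weak_test = {}, {}
    for name, chars in alphabets.items():
        held_out = rng.sample(chars, 3)
        weak_test[name] = held_out
        weak_train[name] = [c for c in chars if c not in held_out]

    # Strong one-shot generalization: entire alphabets are held out.
    names = sorted(alphabets)
    rng.shuffle(names)
    strong_train, strong_test = names[:30], names[30:]
    return weak_train, weak_test, strong_train, strong_test

# Toy example: 50 alphabets with 20 character types each.
alphabets = {f"alpha_{i}": list(range(20)) for i in range(50)}
w_train, w_test, s_train, s_test = omniglot_splits(alphabets)
```

The strong split is the harder test: at generation time the model must produce plausible characters from alphabets it has never seen during training.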