Objective: Transfer visual attributes (color, tone, texture, style, etc.) between two semantically related images, such as a photograph and a sketch.
Inner workings:
Image analogy
An image analogy A:A′::B:B′ is a relation where:
- B′ relates to B in the same way as A′ relates to A
- A and A′ are in pixel-wise correspondence
- B and B′ are in pixel-wise correspondence
In this paper, only a source image A and an example image B′ are given; A′ and B are latent images to be estimated.
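In equation form (a sketch with illustrative notation, not the paper's exact symbols), the two latents are tied to the two given images through a pair of dense pixel mappings, the nearest-neighbor fields discussed below:

```latex
% \phi_{a \to b} and \phi_{b \to a} denote dense pixel mappings (NNFs);
% illustrative notation, not the paper's exact symbols.
A'(p) = B'\!\left(\phi_{a \to b}(p)\right), \qquad B(p) = A\!\left(\phi_{b \to a}(p)\right)
```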

Dense correspondence
To find dense correspondences between the two images, they use features from a pre-trained CNN (VGG-19), keeping the feature maps at the ReLU layers.
The mapping is decomposed into two simpler sub-mappings: first a visual attribute transformation, then a spatial transformation.
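A minimal sketch of the feature extraction step, assuming PyTorch/torchvision; the choice of the relu{1..5}_1 activations is my assumption, the notes only say ReLU features of a pre-trained VGG-19:

```python
import torch
import torchvision.models as models

# Indices of the relu{1..5}_1 activations inside torchvision's
# vgg19().features (layer choice is an assumption of this sketch).
RELU_IDS = {1: "relu1_1", 6: "relu2_1", 11: "relu3_1",
            20: "relu4_1", 29: "relu5_1"}

def extract_features(image):
    """image: (1, 3, H, W) ImageNet-normalized tensor.
    Returns {layer_name: feature map}; deeper layers are coarser
    because of the intervening max-pooling."""
    vgg = models.vgg19(weights=models.VGG19_Weights.IMAGENET1K_V1).features.eval()
    feats, x = {}, image
    with torch.no_grad():
        for idx, layer in enumerate(vgg):
            x = layer(x)
            if idx in RELU_IDS:
                feats[RELU_IDS[idx]] = x
            if idx == max(RELU_IDS):
                break
    return feats
```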

Architecture:
The algorithm proceeds as follows (a simplified sketch is given after the list):
- Compute features at each layer for both input images using the pre-trained CNN, and initialize the feature maps of the latent images at the coarsest layer with those of the inputs.
- At the current layer, compute a forward and a reverse nearest-neighbor field (NNF, essentially an offset field mapping each pixel to its best match in the other image).
- Use these NNFs together with the input features at the current layer to reconstruct the feature maps of the latent images.
- Upsample the NNFs and use them to initialize the NNF search at the next (finer) layer.
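A minimal, self-contained NumPy sketch of this coarse-to-fine loop, under heavy simplifications that are mine rather than the authors': a per-pixel exhaustive search stands in for the patch-based randomized PatchMatch, a fixed blend weight `alpha` replaces the paper's layer-dependent weighting, nearest upsampling replaces deconvolution-based reconstruction, and both images are assumed to share the same resolution at every layer:

```python
import numpy as np

def nnf_search(f1, g1, f2, g2):
    """Brute-force forward NNF: for each pixel p pick q minimizing
    ||f1(p) - f2(q)||^2 + ||g1(p) - g2(q)||^2, a per-pixel stand-in
    for the patch-based PatchMatch objective. O((HW)^2) memory, so
    only usable on tiny feature maps."""
    C = f1.shape[0]
    a = np.concatenate([f1, g1]).reshape(2 * C, -1).T   # (HW, 2C)
    b = np.concatenate([f2, g2]).reshape(2 * C, -1).T
    dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return dist.argmin(axis=1)                          # flat q per p

def warp(feat, field):
    """Pull features through a flat NNF: out(p) = feat(field(p))."""
    C, H, W = feat.shape
    return feat.reshape(C, -1)[:, field].reshape(C, H, W)

def upsample2(field, H, W):
    """Nearest 2x upsampling of a flat NNF defined on an (H, W) grid."""
    ys, xs = np.divmod(field.reshape(H, W), W)
    ys = np.repeat(np.repeat(2 * ys, 2, axis=0), 2, axis=1)
    xs = np.repeat(np.repeat(2 * xs, 2, axis=0), 2, axis=1)
    return (ys * (2 * W) + xs).ravel()

def deep_analogy(feats_A, feats_Bp, alpha=0.8):
    """feats_*: lists of (C, H, W) arrays, coarsest layer first, each
    layer doubling the resolution of the previous one. Returns the
    forward (A -> B') and reverse (B' -> A) NNFs at the finest layer."""
    F_Ap, F_B = feats_A[0], feats_Bp[0]   # latents start as the inputs
    for i, (fa, fbp) in enumerate(zip(feats_A, feats_Bp)):
        fwd = nnf_search(fa, F_Ap, F_B, fbp)    # A  -> B'
        rev = nnf_search(F_B, fbp, fa, F_Ap)    # B' -> A
        if i + 1 == len(feats_A):
            return fwd, rev
        H, W = fa.shape[1:]
        # The real algorithm uses the upsampled NNFs to seed PatchMatch
        # at the finer layer; here they only drive the feature warps.
        fwd, rev = upsample2(fwd, H, W), upsample2(rev, H, W)
        fa2, fbp2 = feats_A[i + 1], feats_Bp[i + 1]
        # Latents at the finer layer: blend each input's own features
        # with the other image's features warped through the NNF.
        F_Ap = alpha * fa2 + (1 - alpha) * warp(fbp2, fwd)
        F_B = alpha * fbp2 + (1 - alpha) * warp(fa2, rev)
```

With the `extract_features` sketch above, `feats_A` could be built by taking its outputs in reverse order (relu5_1 down to relu1_1), squeezing the batch dimension, and converting to NumPy.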

Results:
Impressive quality on all types of visual transfer, but very slow! (~3 min per image on a GPU).
