Google Implements AI Landscape Photographer
Source – i-programmer.info
The latest artistic AI is Google’s Creatism, a deep learning photographer which it claims is capable of creating professional quality work. Looks like photographers aren’t immune from the robot/AI takeover.
One of the problems with using AI in any creative endeavour is that it is really difficult to judge how well things have worked: art is subjective, and even human art is often influenced by chance. You could put this more formally by saying that a Turing test restricted to creating art would be one of the easiest to fake. In this case, however, the two-man team tried very hard to validate what they had achieved. It is also true that this particular branch of photography, landscape photography, has some rather more deterministic rules about what we find acceptable. Landscape photos have elements of symmetry, starting from the sky-earth dividing line and extending to the landforms and other elements.
Rather than sending a robot photographer out into the field (this is landscape photography, after all), the team of two, Hui Fang and Meng Zhang, used Google Street View, which isn’t known for its stunning photography:
Our virtual photographer “travelled” ~40,000 panoramas in areas like the Alps, Banff and Jasper National Parks in Canada, Big Sur in California and Yellowstone National Park, and returned with creations that are quite impressive, some even approaching professional quality — as judged by professional photographers.
Without a labeled set of good and bad photos, training a Generative Adversarial Network (GAN) is difficult. The new idea is to use roughly orthogonal aesthetic qualities and train the network on each separately.
The generative part of the network used four operations:
- Composite an image from environment
- Apply saturation filter
- Apply HDR filter
- Apply dramatic mask
It used these to try to create artistically good photos that the main network couldn’t distinguish from human-taken photos that had been degraded in some way. The other network then tried to tell the difference between the improved photo and the original. In this way the network slowly learned how to improve a not-so-good photo.
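To make the idea of "degraded in some way" concrete for the saturation step, here is a minimal sketch in plain Python. It is not the team's code: the function names `adjust_saturation` and `make_training_pair`, the nested-list image format and the `degrade_factor` value are all assumptions chosen for illustration. It simply shows how a good photo and a washed-out copy of it could be paired up as labelled examples for a discriminator.

```python
import colorsys

def adjust_saturation(image, factor):
    """Scale the HSV saturation of every pixel by `factor`, clamped to [0, 1].

    `image` is a list of rows; each row is a list of (r, g, b) floats in [0, 1].
    This representation is an assumption made to keep the sketch dependency-free.
    """
    out = []
    for row in image:
        new_row = []
        for r, g, b in row:
            h, s, v = colorsys.rgb_to_hsv(r, g, b)
            s = min(1.0, max(0.0, s * factor))
            new_row.append(colorsys.hsv_to_rgb(h, s, v))
        out.append(new_row)
    return out

def make_training_pair(good_photo, degrade_factor=0.3):
    """Return the original labelled 1 and a desaturated copy labelled 0.

    Hypothetical helper: the labels and the degradation strength are
    illustrative choices, not values taken from the paper.
    """
    degraded = adjust_saturation(good_photo, degrade_factor)
    return [(good_photo, 1), (degraded, 0)]

# A single pure-red pixel at half saturation becomes a pinkish red.
print(adjust_saturation([[(1.0, 0.0, 0.0)]], 0.5))
```

Because only one quality is changed at a time, a pair like this gives the discriminator a signal about saturation alone, which fits the article's point about training on roughly orthogonal aesthetic qualities separately.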
From our point of view the interesting one was the dramatic mask. This is a generalization of vignetting, a change in brightness towards the edges of the image, often applied to make photos look more dramatic.
In this case the network attempted to learn a brightness modulation across the entire image. This was a way of making up for the long hours that landscape photographers spend waiting for the light distribution in the scene to be just right – or for the luck in taking the photo at the right moment lighting-wise by accident.
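The relationship between a vignette and a more general brightness mask can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the method from the paper: images are grayscale nested lists with values in [0, 1], the helper names `apply_mask` and `vignette_mask` are made up for this example, and the quadratic falloff is just one common choice. In the real system the mask would come from a trained network, whereas here it is written by hand.

```python
import math

def apply_mask(image, mask):
    """Multiply each grayscale pixel by its mask value and clamp to [0, 1]."""
    return [
        [min(1.0, max(0.0, p * m)) for p, m in zip(img_row, mask_row)]
        for img_row, mask_row in zip(image, mask)
    ]

def vignette_mask(height, width, strength=0.5):
    """Classic vignette: gain 1.0 at the centre, 1 - strength at the corners."""
    cy, cx = (height - 1) / 2, (width - 1) / 2
    max_dist = math.hypot(cy, cx) or 1.0  # avoid dividing by zero for 1x1 images
    return [
        [1.0 - strength * (math.hypot(y - cy, x - cx) / max_dist) ** 2
         for x in range(width)]
        for y in range(height)
    ]

flat = [[0.8] * 3 for _ in range(3)]

# Vignetting only darkens towards the edges.
darkened = apply_mask(flat, vignette_mask(3, 3))

# A general mask may brighten or darken any region independently.
# These values are hand-picked placeholders standing in for a learned mask.
free_mask = [[1.2, 1.0, 0.6],
             [1.1, 1.0, 0.7],
             [0.9, 0.8, 0.5]]
relit = apply_mask(flat, free_mask)
```

The vignette is fully determined by distance from the centre, while the free-form mask has one value per pixel, which is what lets it imitate uneven natural lighting.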
Images before and after applying the dramatic mask.
The dramatic mask technique looks as if it could be generally useful. It also takes what the AI can do with a photo well beyond what a human photographer can do.
After learning what makes a good photo, the AI was let loose on some test cases. It cropped, changed the saturation and balance and added a dramatic mask. One example is below:
To evaluate how well the AI had done in learning what makes a great landscape photo, a panel of professional photographers was asked to grade a set of photos, including those produced by the AI – a sort of landscape photo Turing test. Roughly 40% of the AI photos were judged to be pro or semi-pro.
You can see samples in the paper, in a GitHub gallery and on the Google blog, but the example below showing the Street View original and what the AI has done to it is typical.
You can see that the dramatic mask is worth its weight in Ektachrome. The final comment on the blog is interesting:
“The Street View panoramas served as a testing bed for our project. Someday this technique might even help you to take better photos in the real world. We compiled a showcase of photos created to our satisfaction. If you see a photo you like, you can click on it to bring out a nearby Street View panorama. Would you make the same decision if you were there holding the camera at that moment?”
Of course, you might think that there were better positions and crops but without that dramatic mask you might not have produced something quite as good. You can tell I really, really want to use it myself.