A recent research paper (entitled “A Neural Algorithm of Artistic Style”) has kicked off a flurry of online discussion with some striking visual examples. Essentially, the paper discusses a technique to train a deep neural network to separate artistic style from image structure, and combine the style of one image with the structure of another. The upshot of all of this is that you can train a huge neural network to turn photographs into “neural paintings” that look as though they were painted by famous artists — “digital counterfeits,” so to speak.
Here are some examples from the article. The first image is the original. The later images are the generated results, with the painting from which the style was sampled shown in miniature.
The original researchers have not released their code, unfortunately. However, some intrepid programmers have replicated their results over the last few days, and their open-source code is available on the Internet. All you need to run it is a Linux machine and a little bit of patience.
Today, I’m going to walk you through how to do that, and show you some of my own results. You can think of this as a loose sequel to our DeepDream tutorial. It’s a bit complicated, but anyone with a Linux machine can follow along — no coding experience needed.
Setting Up the Software
First off, if you aren’t in a big hurry or don’t have a Linux machine, you can still play with DeepStyle using the DeepForger Twitter bot (send it an image and a style, and it will eventually reply with the results you want). If you want to process more images quickly (and with more control over the results), read on for the tutorial.
First off, make sure you have an up-to-date copy of Ubuntu (14.04 is what I used). You should have at least a few extra gigabytes of hard drive space. For more information, check out our tutorial on dual-booting Ubuntu alongside Windows. You’re also going to need root privileges, so make sure you have them before proceeding.
Right off the bat: this is an open-source project, so we’re going to want Git installed. Git is the gold standard for version control software, and pretty much every open-source project worth knowing about is hosted on GitHub.
To download and install Git, just open a terminal, type “sudo apt-get install git”, and agree to the installer’s demands.
Next, we’re going to set up some basic tools needed to make the software work.
First, install Lua. This is the language that the tool is written in. It is pretty simple: just type “sudo apt-get install lua5.2” and follow the installation process.
Second, we’re going to get LuaRocks. This is the tool that makes it easier to install other tools (don’t you love Linux?). For this one, type “sudo apt-get install luarocks” and follow the installation steps.
Third, we’re going to install LuaJIT. This is a just-in-time compiler for Lua that will make our lives a little bit simpler. Just type “sudo apt-get install luajit”.
So far so good.
I'm a bot that creates forgeries from your photos in the style of famous painters. Click for instructions below! pic.twitter.com/3MpThDNwRE
— The Deep Forger (@DeepForger) September 5, 2015
Next up, we’re going to install Torch, a scientific computing and machine learning framework that makes up the backbone of the application. Unfortunately, this one can’t be installed using apt-get (the standard Ubuntu package manager).
Luckily, they do have a one-line installer that uses some command-line magic. Return to your terminal and enter “curl -s https://raw.githubusercontent.com/torch/ezinstall/master/install-all | bash”.
When you’re done, type “luajit -ltorch”. This will bring up the Torch interface and verify that everything was installed correctly.
Exit out of that.
Now we’re going to install loadcaffe — a neural-network-specific package. Install its only dependency by typing “sudo apt-get install libprotobuf-dev protobuf-compiler”. Then you can install the package itself using “sudo luarocks install loadcaffe”.
Double Checking Dependencies
Finally, we’re going to pre-emptively update a few packages just to make sure everything goes smoothly. Enter “sudo luarocks install image” to make sure that your image package is up to date. Next, enter “luarocks install nn”, which will do the same for your ‘nn’ package.
Installing Deep Style
Alright! At this point, we’re ready to actually install the software itself. For cleanliness’ sake, create a new folder in your home directory (“mkdir DeepStyle”). Then enter it using “cd DeepStyle”. Now type “sudo git clone https://github.com/jcjohnson/neural-style.git”.
Next up, we’ve got to download the model. Make a cup of coffee or something; this will take a while. Back in the terminal, type “sudo sh models/download_models.sh”. That’ll start a long, elaborate download process. If it fails because of permission errors, try giving yourself read-write permissions on the relevant folders using chmod.
Using Deep Style
Okay, we’re good to go. Using the software is pretty simple.
Make sure you’re in the DeepStyle/neural-style directory in the terminal. Now, you’re going to need some images to work on. Download them off the Internet (or whatever), then copy them into the DeepStyle/neural-style folder using the file browser.
Now you can use the command line to process individual images. The format is pretty straightforward:
th neural_style.lua -style_image YOURPAINTINGHERE.jpg -content_image YOURPHOTOHERE.jpg -gpu -1
(Obviously, you’ll need to replace the chunks in ALL CAPS with the names of your files).
That will get the neural network started. It’ll run for about an hour, spitting out new partially-converged images every few minutes until it finishes. The -gpu -1 flag stops it from trying to access your GPU.
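If you want to run several photos against several paintings, a small shell loop over that same command works. The filenames below are hypothetical placeholders for your own files, and the “echo” makes this a dry run — it prints each command so you can sanity-check the batch before committing hours of CPU time:

```shell
#!/bin/sh
# Dry run: print one neural_style.lua command per photo/painting pair.
# The filenames are placeholders; replace them with real files in this
# directory, and delete the word "echo" to actually start the renders.
for content in photo1.jpg photo2.jpg; do
  for style in starry_night.jpg picasso.jpg; do
    echo th neural_style.lua -style_image "$style" \
      -content_image "$content" -gpu -1
  done
done
```

At roughly an hour per image on a CPU, a four-pair batch like this is an overnight job.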
After several hours of trying (and bricking the operating system several times), I was unable to get Ubuntu and CUDA to play nice with my GPU (an NVIDIA GTX 970). If you have more luck with that, you’ll want to install CUDA and cudnn.torch (see the GitHub repo for more information). If not, that’s fine — it’ll still work using your CPU; it’ll just be a little slower.
If you have any issues getting all of this working, just ask me in the comments, and I’ll do my best to help you out.
Here are some images I’ve generated over the last couple of days. The results are mixed, but many of them are pretty impressive.
This one is of my friend Zack on a hiking trip to Yellowstone. The style comes from an abstract painting, created by Theresa Paden. I was curious to see how the system would do using an image with absolutely no structure. The results are pretty neat, and you can definitely see the similarities to the style image.
This one comes courtesy of one of my favorite artists, Charles Demuth (see: Incense of a New Church, and I Saw the Figure 5 in Gold). Interestingly, Demuth is one of the primary visual inspirations for the art of Team Fortress 2, as you can see from the style image.
I fed it an image of Jersey City that I found on Wikimedia. The results are… pretty good. It didn’t pick up on the angularity of Demuth’s style, but it certainly picked up the soft, textured look and the color palette.
This one is an attempt to generate a synthetic O’Keeffe, using a fairly mundane picture of some flowers I found. The results are, frankly, spectacular. Aesthetically, this is one of my favorite results. The richness of O’Keeffe’s colors and shapes comes through clearly. The layered edges of the flower petals become the edges of the leaves in the background. The flowers themselves dissolve into colors, becoming almost abstract.
It would be a good painting if a human did it. I’m very tempted to spend a couple of days rendering a higher resolution version of this one and have it framed.
Here’s my friend Shannon in her Halloween costume, by way of a Picasso print. Interestingly, the network chose to paint the lower portion of her face white (similar to the color layout of the Picasso piece). I’m not sure whether this was a coincidence, but the results are striking. It also seems to have correctly identified Shannon’s hair on the left-hand side, and re-drawn it using the color and linework from the hair in the style image. Ditto for her hat.
This is one of the pieces where the limitations of the technique start to become clear. If Picasso were actually painting Shannon, he’d have thrown away the structure of her face and skewed the features to achieve the effect he wanted. This system doesn’t understand those sorts of high level concepts, and is able to imitate only superficial aspects of the style, like the dark, angular lines and color palette.
Fairly straightforward: a picture of the Eiffel Tower, and Van Gogh’s other Starry Night (Starry Night Over the Rhône). It does a good job of rendering clouds in a Van Gogh-ey style, despite the absence of clouds in the original image. It also does a good job of translating the scene from day to night.
I wasn’t sure why it decided to render the tip of the Eiffel Tower as a pillar of fire. It looks cool, but it’s not really justifiable from the input data. Then I realized that the style image has thirteen long, vertical yellow stripes in it, in the form of the reflections in the water. That’s a pretty massive cluster, given so little training data. The poor thing has probably learned that any high-contrast vertical edge must be one of those reflections. You can see more extraneous vertical stripes faintly in the clouds.
Same Van Gogh painting, but this time I gave it some actual stars to paint — in this case, the Pillars of Creation portion of the Eagle Nebula. I like the results, although once again you can see its obsession with yellow stripes. Every vertical portion of the pillars becomes a bright, wobbly yellow line. It’s also clearly upset by the green, which did not occur in the training data, and does its best to get rid of it in favor of blue and black.
Some results from this are extremely compelling, although the technique has clear limitations. Some images have lousy composition, and the system has difficulty with more abstract artists like Picasso — who famously liked to distort his subject matter, scattering its features. The algorithm picks up his angular lines, and clashing colors, but is still a slave to the pixel values of the image. It doesn’t have the comprehension you’d need to deviate too far from the source material.
What excites me about all this is that I don’t think those limitations are fundamental.
The approach being used here — train a network on one image and use it to construct another — is fundamentally kind of a hack. It gives the network very little data to work with. A more advanced version of this application would use a network that has information on many paintings, and perhaps even real images, to give it plenty of context about the image it’s trying to “paint.”
A deep grasp of style can only exist in a broader context. You can’t derive it from a single image. Designing an architecture that gives the system access to broader data might allow it to derive a more “human-like” understanding of the image, and how artists represent different elements of the real world. Such a network might be able to produce images that are more abstract and have a better composition. Such algorithms would cease to be a cool toy (like this) and become a way to produce actual, original art.
Which is a very peculiar thought, in some ways.
Making Your Own Images
If you get a disappointing result, you can play around with the options a little to try to get more convincing results. The full list is on the GitHub page. The important ones are:

- -content_weight: How much weight to give to the content reconstruction term. Default is 5e0.
- -style_weight: How much weight to give to the style image. Default is 1e2.
- -style_scale: How large the image patches the system analyzes should be (larger becomes more abstract). Default is 1.0.
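A practical way to explore these knobs is to sweep one option and compare the outputs side by side. This sketch sweeps -style_weight and names each result after the weight used (style.jpg and photo.jpg are placeholder filenames; -output_image names the result file). As with the batch loop earlier, the “echo” makes it a dry run:

```shell
#!/bin/sh
# Sweep -style_weight and give each render its own output file.
# style.jpg and photo.jpg are placeholders for your own images;
# delete the word "echo" to actually run the renders.
for w in 1e1 1e2 1e3; do
  echo th neural_style.lua -style_image style.jpg -content_image photo.jpg \
    -style_weight "$w" -output_image "out_${w}.png" -gpu -1
done
```

Low weights stay close to the photo; high weights let the painting take over. Comparing out_1e1.png against out_1e3.png makes the trade-off obvious.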
Once you get everything working to your satisfaction, please post your most interesting images in the comments. I am really interested to see what you guys come up with.
Image Credits: human brain painter via Shutterstock