Neural networks are algorithms that learn by example, rather than by following a programmer’s set of rules. Although on this blog I’ve mostly been using them to generate new examples of things (like paint colors, halloween costumes, or craft beers), neural networks can do a lot more.
One thing neural networks can do is classify things. Give one a bunch of examples of one kind of thing and a bunch of examples of another kind of thing, and it will (hopefully) learn to tell the two apart. This is really useful – for identifying obstacles for self-driving cars, for telling diseased tissue from healthy tissue, and even (with mixed success) for identifying spam or troll comments. I wanted to test this kind of algorithm out, so I devised the simplest task I could think of: telling metal bands from My Little Ponies.
I’ve previously trained text-generating algorithms to generate metal bands and My Little Ponies, so I had datasets ready to go. IBM Watson has a very easy-to-use tool for training classifiers (there’s a classroom-friendly version at machinelearningforkids.co.uk). I loaded in all 1,300 of the My Little Pony names I had, and filled the rest of the tool’s memory with metal bands (about 18,700).
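If you’d rather do this in code than in a point-and-click tool like Watson’s, here’s a rough sketch of the same setup using scikit-learn as a stand-in. The file names `pony_names.txt` and `metal_bands.txt` are placeholders for whatever your own name lists are called, and the model choice is just one simple option, not what Watson does under the hood.

```python
# A stand-in for the point-and-click classifier: character n-gram features
# plus logistic regression over two lists of names (one name per line).
# File names below are placeholders for your own datasets.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

with open("pony_names.txt") as f:
    ponies = [line.strip() for line in f if line.strip()]
with open("metal_bands.txt") as f:
    metal = [line.strip() for line in f if line.strip()]

names = ponies + metal
labels = ["pony"] * len(ponies) + ["metal"] * len(metal)

# Character n-grams tend to work better than whole words for made-up names.
model = make_pipeline(
    CountVectorizer(analyzer="char_wb", ngram_range=(2, 4)),
    LogisticRegression(max_iter=1000),
)
model.fit(names, labels)

print(model.predict(["Princess Pie", "Sweetie Loo", "Flutter Buns"]))
```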
Then I entered some new pony names – neural network-generated pony names so they weren’t in the original dataset – to see how it would classify them. The result:
The neural network labeled *everything* as metal. People who have worked with neural network classifiers before will have seen this coming: with a dataset that was 94% metal bands and only 6% ponies, I had set myself up with a classic case of something called class imbalance. The neural network found it could achieve 94% accuracy on my training dataset by calling everything metal. Princess Pie? Metal band with 81% confidence. Sweetie Loo? 85% likely to be metal. Sparkle Cheer? 84% sure that’s a metal band. Flutter Buns? So, so metal. 97%. The only names it didn’t label as metal bands were ponies that were in the original dataset. So, Twilight Sparkle? 100% pony. Twilight Sprinkle, though? 83% metal.
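The trap is easy to see with a little arithmetic: a “classifier” that answers “metal” no matter what is already right on 18,700 of the 20,000 training examples.

```python
# The class-imbalance trap in one line of arithmetic: with 18,700 metal bands
# and only 1,300 ponies, answering "metal" for everything is already right
# 93.5% of the time (the ~94% quoted above), no learning required.
n_metal, n_pony = 18_700, 1_300
always_metal_accuracy = n_metal / (n_metal + n_pony)
print(f"Always-metal baseline accuracy: {always_metal_accuracy:.1%}")  # 93.5%
```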
The fix was easy: I trained the classifier again, this time with equal numbers of ponies and metal bands. The results were a lot more believable, and the classifier mostly agreed with the generator neural network about which names were ponies and which were metal bands. There were some surprises, though.
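In code, the rebalancing step is just a bit of bookkeeping before training. This sketch continues the scikit-learn stand-in from above (it reuses the `ponies`, `metal`, and `model` names from that example) and simply samples the metal list down to the size of the pony list; it’s the same idea as what the balanced Watson training run did, not the tool’s actual mechanism.

```python
# Rebalancing the stand-in training set: keep all the ponies and randomly
# sample an equal number of metal bands, then retrain on the balanced data.
import random

random.seed(0)                                     # reproducible sample
metal_sample = random.sample(metal, len(ponies))   # lists from the first sketch

balanced_names = ponies + metal_sample
balanced_labels = ["pony"] * len(ponies) + ["metal"] * len(metal_sample)

model.fit(balanced_names, balanced_labels)         # same pipeline as before
print(model.predict(["Princess Pie", "Sparkle Cheer", "Flutter Buns"]))
```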
When I fed the classifier names that were generated by a neural network trained on BOTH metal bands and ponies, it was not as confused as I had expected. Instead, it classified them with high confidence as one or the other.
According to this neural network, we may need to rethink Star Wars canon.

Leia Organa – 96% metal, 4% pony
Luke Skywalker – 31% metal, 69% pony
Darth Vader – 19% metal, 81% pony
Kylo Ren – 18% metal, 82% pony