Deep Learning Overview
Neural networks.
Deep learning. This is the “how” behind what we see as AI (artificial
intelligence) around us today, from Alexa’s voice recognition to Google’s image
search to your smartphone’s ability to unlock based on your face. While these
algorithms are very powerful indeed, they are also mysterious…
“Mysterious”? What
does that even mean? Simply put, we don’t understand how they conclude
what they conclude. It’s not as if someone wrote specific software instructions
on how to recognize a dog. Rather, a huge set of data is provided as input with
tags like “dog” and “not dog”. The system goes over the data and self-discovers
which patterns correspond to a dog. Remember that old joke about computers –
GIGO (Garbage In, Garbage Out)? That joke was about the instructions we keyed
into a computer (type the wrong instruction, you get the wrong result). With
AI, GIGO means something else – feed it the wrong data and/or the wrong
tags as input, and the patterns the system learns will be wrong.
Like the time
Google Photos started misidentifying many photos of black people as gorillas.
How could that have happened? Since the system had identified patterns on its
own, Google didn’t have a clue as to what it had “learnt”. How can we fix such
problems? We can’t. So Google’s fix for the gorilla misidentification problem
was to disallow any search for the word “gorilla”. An ugly and crude hack, no
doubt, but it stopped the tide of criticism that Google was racist…
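The garbage-in-garbage-out failure mode described above can be sketched with a toy example. This is my own illustration, not how Google Photos actually works: a tiny 1-nearest-neighbour "learner" that picks up patterns purely from human-supplied tags, with no hand-written recognition rules. Feed it wrong tags, and the same algorithm learns the wrong pattern.

```python
# Toy sketch of learning from labelled data (author's illustration,
# not any real system): a 1-nearest-neighbour classifier.

def predict(example, training_data):
    """Return the label of the closest training example."""
    closest = min(training_data, key=lambda item: abs(item[0] - example))
    return closest[1]

# Single made-up feature (say, ear length in cm); tags supplied by humans.
good_labels = [(5, "dog"), (6, "dog"), (20, "not dog"), (22, "not dog")]
bad_labels  = [(5, "not dog"), (6, "not dog"), (20, "dog"), (22, "dog")]

print(predict(5.5, good_labels))  # "dog" – it learned the right pattern
print(predict(5.5, bad_labels))   # "not dog" – same code, garbage tags in
```

Nobody wrote a rule saying what a dog is; the "rule" is entirely implied by the tagged examples, which is exactly why bad tags (or unrepresentative data) produce bad patterns that nobody can easily inspect or fix.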
One of the first
companies to recognize the potential of deep learning was, yes, Google. The
possible applications were obvious for a search company – text searches would
work even better if the system could “understand” phrases rather than just
return results based on keywords. Image recognition would open the door for
image searches. Translations might get better. Plus, Google was trying its
hardest to get to driverless cars back then, with limited success. Deep
learning, with its potential to “recognize” objects sounded like a possible
solution for driverless cars. But all that was potential. Google could also see
an immediate application – if the algorithms that decide which ads to show when
you search for something got better, it would lead to more clicks on ads and
thus more money.
But there was a
major hurdle. The CPUs of the time were not designed for crunching the
quantity of data that deep learning takes as input. So Google turned to
GPUs (Graphics Processing Units) – chips originally made for video games,
whose massively parallel design happens to suit the maths of deep learning far
better. GPUs are very expensive, and the numbers needed for deep learning are
huge. But money was never an issue at Google – they paid $130 million for the
first batch. It would be the beginning of a new line of sales for the company
they bought the GPUs from, Nvidia, which soon reorganized itself around the
deep learning market.
Thanks to its
head start, Google also started making custom chips for deep learning – it
called them TPUs (Tensor Processing Units), the "tensor" being a reference
to the kind of maths used extensively in deep learning. The funny thing? TPUs
calculate less precisely than CPUs, not more precisely – the first generation
worked with low-precision integers instead of full floating-point numbers. How
was that better? By working with small integers, they operated much faster, and
the huge number of calculations being done more than compensated for the slight
inaccuracies in the individual calculations.
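The idea behind low-precision arithmetic can be sketched in a few lines. This is a simplified illustration of integer quantization in general, not Google's actual TPU design: squeeze the numbers into 8-bit integers, do all the multiply-adds in integer arithmetic, and rescale at the end. Each result is slightly off, but close enough for a neural network that averages over millions of such operations.

```python
# Sketch of reduced-precision (8-bit integer) arithmetic –
# the general idea, not any specific chip's implementation.

def quantize(xs, scale=127.0):
    """Map floats in roughly [-1, 1] to 8-bit integers."""
    return [max(-127, min(127, round(x * scale))) for x in xs]

def int8_dot(a, b, scale=127.0):
    """Dot product done entirely in integer arithmetic,
    rescaled back to a float only at the end."""
    acc = 0                        # integer accumulator
    for qa, qb in zip(quantize(a), quantize(b)):
        acc += qa * qb             # cheap integer multiply-add
    return acc / (scale * scale)

weights = [0.5, -0.25, 0.8]
inputs  = [0.1, 0.9, -0.3]

exact  = sum(w * x for w, x in zip(weights, inputs))
approx = int8_dot(weights, inputs)
print(exact, approx)  # the two agree to within a fraction of a percent
```

The integer version is "wrong" on every single operation, yet the error is tiny relative to the result – which is why trading precision for speed pays off at the scale deep learning operates at.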
Impressive though deep learning is, it is nowhere near being a general-purpose AI. So far, its accomplishments have been in very specific areas (speech, images, games, etc.) – all controlled environments. A dynamic environment, where things change and actors adapt, is beyond its scope – that's why fighting fake news algorithmically is not possible today. Plus, it requires humongous amounts of data to "learn". Contrast that with children, who learn from far, far fewer data points. But who knows what the future might bring?