In 2007, while he was an Engineering Science student at U of T, Matt Zeiler (EngSci 0T9) saw a computer-generated video that changed his life.
The video depicted a flickering flame in lifelike detail, but the flame wasn’t real; every pixel in the video was generated by a computer algorithm, trained using the artificial intelligence technique known as neural nets.
“I realized I couldn’t write a function or a loop to generate a video as realistic as that,” he says. “It was a whole new way of programming computers, and I knew I had to learn more about it.”
Today, Zeiler is the founder and CEO of Clarifai, a company that leverages artificial intelligence (AI) techniques, including neural nets, in its mission to “understand every image and video to improve life.” In late January, Zeiler visited U of T Engineering as part of the Engineering Science Education Conference. A recording of his talk is available online.
Writer Tyler Irving sat down with Zeiler following his talk to learn more about his journey in AI and entrepreneurship.
You joined Engineering Science after a full year of pre-med studies at another university. Why did you want to switch paths?
I always liked building things. When we were kids, my brother and I helped my dad with lots of projects: a new garage, a new driveway for the cottage. The rest of my family is in the medical space, but for me, calculus and physics held more appeal than biology and chemistry. The idea of using them as tools to make something that’s never been done before was really exciting.
I applied to a couple of engineering schools, but at U of T I heard someone say something that has always stuck with me — Engineering Science will help you learn how to learn. I think that’s true: you are thrown deep into many subjects all at once, and you have to be able to pick up new things very quickly. That stretched my mind in ways I still find useful today.
How did you get into AI?
It was really the one piece of pure luck in my whole journey. When I was deciding on my major, I went to my residence advisor, who happened to be Graham Taylor, a PhD student working with Professor Geoff Hinton. Hinton is known as the godfather of deep learning, a set of techniques that includes neural nets. It was Taylor who showed me the flame video, which really got me thinking about the potential of artificial intelligence in a whole new way.
You did your undergraduate thesis with Hinton. What was the subject?
It was kind of strange. There was a group at Queen’s University studying the behaviour of pigeons. They had a motion capture system set up, with markers attached to the pigeon’s head, torso and feet. But the cameras were mounted on the ceiling, so often the feet were occluded by the body.
We were able to create algorithms that learned where the foot markers ought to be, based on videos where all the markers were visible, and we used those to fill in some of the missing markers.
Where did the technology for Clarifai come from?
After graduating from U of T, I went to New York University to work with Professor Rob Fergus. He had this idea for algorithms that would be able to scan through images and extract repeated patterns.
The advantage of this kind of system is that you don’t need to train it on a data set that has already been sorted by humans, because it learns to identify the important features. For example, if you fed it pictures of dogs, it would end up learning to identify features like a dog’s face. It doesn’t know what a dog is, but it knows that this pattern of pixels is key to understanding the image, and you can add the labels after.
During my PhD, I interned at Google Brain, but I left early after I realized that the models I had been working on at NYU were getting better results than theirs. Eventually I started working on some new models on my own, in my apartment, and that was the beginning of Clarifai. In 2013, I finished my PhD and submitted both the NYU models and the Clarifai models to the ImageNet challenge, an international competition to find the best-performing algorithms for image recognition. The Clarifai results won, which was an important early boost to the company.
Tell me more about your mission to “understand every image and video to improve life.”
Improving life can mean organizing your personal photos, helping companies understand social media profiles so they can market more relevant products, or improving your search for your next vacation on hotel listing sites.
Our software runs on our own cloud platform, and we make it available to our customers through an application programming interface (API) — all they need to take advantage of our algorithms is four lines of code. That’s been key to discovering new applications. When you empower developers to build new things on top of your platform, you’re going to learn a lot of new ideas.
Many of your competitors have been swallowed up by giant tech companies, but you have decided to remain independent. Why?
I think a lot of the companies that got acquired were founded by people who didn’t really want to be entrepreneurs. They had cool technology and good people, but the company was just a way of packaging those assets.
I have been passionate about this stuff since I was a kid, and from day one, I knew there was an opportunity to build something big with this technology. Because image and video recognition is all we do, we can remain laser-focused on that, and not get distracted by the other projects that larger companies are pursuing in parallel.
We don’t yet have the name recognition of the larger companies, but our business is growing and generating lots of revenue: we now have more than 60 employees. It’s exciting for me to be part of that growth and to be involved in how it all plays out.
What advice do you have for students who want to follow in your footsteps?
If you want to be a CEO, make sure you actually want to start a company and that you’re not just mesmerized by some cool piece of technology. As CEO, you’re going to have to do a lot of things that you never thought of, and not all of them are fun, such as when you have to tell someone they are not doing a good job.
For people who want to really innovate in AI, my advice is to go beyond the open-source toolkits that are now widely available online. Those are like building blocks: they’re great if you want to build something quickly, but to have a really new idea, you need to actually do the math and understand how it works. That means implementing it from scratch. It’s hard to do, but that’s how you gain deep knowledge and really make an impact.