interview

Internet pioneer Vint Cerf: Google's algorithm 'almost impossible to understand'

Vinton G. Cerf, co-inventor of the Internet and vice president of Google, talks about online abuse, Google's search algorithm and digital innovation in the Global South.

Vinton G. Cerf's job title of Chief Internet Evangelist at Google pretty well sums up his Internet career. In the 1970s, while working at Stanford University, Cerf helped invent some of the standards and technologies that underpin the Internet. Since then, the computer scientist has served as a fierce advocate for an open and free Internet.

#mediadev had the opportunity to talk to Cerf on the sidelines of the 2016 Internet Governance Forum (IGF) held in December in Mexico.

The following interview has been lightly edited for clarity and brevity.

#mediadev: When you look at the topics being discussed at IGF 2016 and compare them with the enthusiasm and hope with which the Internet was described in the 1990s, the Internet today seems fraught with threats. Would you say the history of the Internet is one of disappointment?

Vinton G. Cerf: No, it doesn't disappoint me. The problems we face – abuse, for example, or misinformation – in a very funny way mean the system has penetrated very deeply into our social fabric. Today, we estimate 47 percent of the population is online. I am hoping some day for about 100 percent. But I knew as far back as 1988 when I was pushing to make the Internet a commercially supported effort, a self-supporting engine of sorts, that this meant the general public would be online.

The general public is different from the people who built the network. For the most part, they were geeks and engineers and all they wanted was to make it work; that was their passion. The general public, on the other hand, wants it to work but they don't work to make it work – they just use it and expect it to work. But there are people out there who don't necessarily mean well and they're part of the general population, as well. So the problems we face are ones we encounter whenever we build infrastructure of this scope.

"It's not clear that anyone including us completely understands Google's algorithm and how it works"

Just to illustrate using a metaphor – if the Internet were a collection of road systems and these roads were global and you could go anywhere from one place to another, then you'd have the question of: How is the traffic arranged down the road? What side of the road do you drive on? Are you allowed to get drunk and drive the car? There are a whole series of problems that arise in this global road system and we now have the problem of making this a safer environment for everyone. This is going to be a major topic of discussion certainly for the rest of this decade.

Internet giants such as Facebook or Google have received a lot of criticism for their algorithms. Are you planning to publish the criteria you apply when using algorithms so that users can understand better how [the search engine] provides information?

There's a big problem with trying to do that. The first problem is that people are constantly trying to game the system. They are trying to find ways of forcing their websites or their ads to show up first. I use the phrase 'gaming the system' on purpose because there are people who build fake websites to point to [certain] targets in order to cause the site to escalate in rankings. If we were to expose the details of the algorithm, all we would get is increased amounts of gaming and greater difficulty defending against it. Gaming is harmful to net users so exposing the algorithm, in my view, doesn't make any sense.

Second, this is an incredibly complex algorithm – there are hundreds of variables that go into it, trying to figure out what is quality information, and those variables change whenever we experiment to try to improve the quality of the algorithm. To make this really clear, we're starting to introduce neural network mechanisms for machine learning. We're talking about systems that are almost impossible to understand. We can see their reaction to the training but if you open up the box, it's a box with a million dials in it that have all been set to some amount. Nobody really understands exactly how those dials got set because of the way training algorithms work. If somebody says to you, "I opened up the box and here is dial number 7,222 and it's set to 0.08 – what happens if we change it to 0.10?", nobody knows the answer to that.

It's understandable why people ask to be shown the algorithm. The problem is, it's not clear that anyone including us completely understands the algorithm and how it works. All we can do is to keep testing it to see what its responses are and adjusting it so that it improves the quality of the result.

As one of the big players, how does Google, compared to Facebook, decide what is valuable information and what needs to be filtered?

"The internet is not a magic wand"

There's a difference between indexing the web and [hosting information]. Google is indexing the entire world wide web and we have zero control over what people put on the web. All we can do is find the information, index it and help other people find it. In the case of Facebook, it hosts an incredible amount of content. And so Facebook – and Google in the case of YouTube [which it owns] – have some rules, some appropriate user policies, about what they're willing to host on their systems.

On YouTube, something like 500 hours of video per minute are being uploaded and it's not possible for human beings to actually observe that. We're trying to automate mechanisms to detect things like misuse of music. We have this fingerprint mechanism which we're using to detect if someone has uploaded a piece of music which should be protected or should be paid for or not used at all.

There's a more recent experiment about terrorist information. Google, Facebook and others recognize this content is potentially harmful to our societies and so we're experimenting with trying to identify that kind of information. We're pooling information that our algorithms and our human observers identify as potentially harmful and we put that into a common pool and each company uses whatever mechanisms it has in order to determine whether that should be filtered. So these are steps being taken by the most visible and influential companies who have the biggest impact on the user community.

The digital divide in the Global South now is about more than whether someone can get online or not. It's also about bigger issues such as Internet speed and quality of access. What are your thoughts or recommendations on improving digital access?

It's way beyond being a technical issue.

(Cerf then went on a tangent to talk about an IGF workshop he attended on digital innovation and entrepreneurship in the Global South, which emphasized what he believes is the important point of context.)

So what you do as an entrepreneur depends on where you are, what the conditions are, what facilities are available, what rules there are for the formation of companies and for their funding, and so on. We have to take a very contextual view of the problem. This is not one of those holistic things where you have one formula and it works for everyone. We have to start paying attention to what the local conditions are: What kind of training is available? Are there universities? Do I have people who are not only technically capable but understand business, sales, marketing, human relations, hiring people and legal structures? You need all those ingredients to make it work.

"The magic is in people's heads when they recognize a problem and a solution to it"

The Internet may enable some of that but it's not a magic wand. The magic is in people's heads when they recognize a problem and a solution to it. I am convinced there are just as many smart people per capita in the Global South as anywhere else in the world. The problem is they don't always get an opportunity to exercise their intelligence and their creativity. The real problem is figuring out what is getting in the way of those people being successful entrepreneurs.

What advice do you have for young innovators?

They shouldn't be afraid of taking risks or of the possibility that things may fail. It happens all the time in new businesses. If you're in Silicon Valley and you're a venture capital company, you assume that 85 to 90 percent of the companies you're trying to help are going to fail. … That's the important message to young people: it's not fatal to try it out. Even if you fail, you learn something and you should try again.

If you talk to entrepreneurs and ask them how many times they have tried, the numbers are astonishing. My favorite is the fellow who started Waze, which [Google acquired in 2013] for about $1.5 billion. He was talking to a bunch of young entrepreneurs … and he said, "I started four companies all of which failed before I got Waze off the ground. I am still here, I am not dead, it didn't kill me!" He learned a great deal from the failures of those businesses. It could have been that he didn't have the right people, that he misunderstood the market, that the market shifted out from under him or that somebody came along with better technology.

We are absolutely terrified that there are a couple of young people in a garage somewhere that are inventing a much better technology than we have right now. We don't know who they are or where they are but it makes us – I am in the research team – run as fast as we can looking at as many different possibilities as we can because we don't want those two guys in the garage to overtake us without us making an attempt to prevent that from happening.

Interview by Steffen Leidel (kh)

Date 19.12.2016
Permalink https://p.dw.com/p/2UXy3
