Be Careful What You Wish For: Caution on Net Neutrality

Net Neutrality seems to be the hottest topic around these days. The good guys are for it and the bad guys are against it, at least according to the press. However, it seems that virtually no one really understands what it is that one is for or against, nor the implications of what it is they think they want. Given how worked up everyone is, perhaps it would be a good idea to try to understand what it is all about.

Net Neutrality confuses what should be two very separate issues: handling traffic differently based on 1) what kind of traffic it is, and 2) who sent it. But in the Internet these aren’t separate, because of some bungled engineering.

A bit of explanation. The machinations required to get messages through the Internet are structured into layers of different kinds. Each message going through the Internet carries information (called a header) for each layer to use. When a message leaves the sender, it goes down through 4 layers, adding a header for each one. At each router it goes up through 3 layers (each performing its part of the task of moving data across the Net) to figure out where the message goes next (the primary purpose of the 3rd layer, IP), and then back down through those layers to the wire. Finally, at the destination, it goes back up through all the layers to the 4th (TCP), which makes sure everything got there and that the sender is not sending too fast for the receiver; then it is delivered to the app. The headers are arranged so that the lowest layer’s information comes first, then the next higher layer’s, and so on, with the user’s data last.

The idea is that each layer uses its header to do its thing and doesn’t understand the rest of the bytes in the message. To each layer, the bytes that don’t belong to its header are unintelligible; they are called user data (even though some of them aren’t really the user’s data, but intervening headers). For decades this discipline has been followed, partly because it is good solid practice and partly because bad things tend to happen when it is ignored. Of course, when a programmer writes the code for a router to process messages, unless another layer’s information is encrypted, the code for a layer can be written to “cheat” and look beyond its own header.
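To make the layering concrete, here is a toy sketch in Python (purely illustrative, not any real protocol stack): each layer prepends its own header on the way down and reads only its own header on the way back up, and everything beyond that header is opaque “user data” to it.

```python
# Toy illustration of layered encapsulation (not a real protocol stack).
def wrap(header: bytes, user_data: bytes) -> bytes:
    """A layer prepends its own header to whatever it was handed."""
    return header + user_data

def unwrap(header_len: int, message: bytes) -> tuple[bytes, bytes]:
    """A layer reads only its own header; the rest stays opaque to it."""
    return message[:header_len], message[header_len:]

app_data = b"hello"
segment = wrap(b"[tcp-hdr]", app_data)   # layer 4: TCP
packet = wrap(b"[ip-hdr]", segment)      # layer 3: IP
frame = wrap(b"[l2-hdr]", packet)        # lower layers: down to the wire

# A router climbs only as far as the IP layer to decide where to send it next.
_, packet_at_router = unwrap(len(b"[l2-hdr]"), frame)
ip_header, rest = unwrap(len(b"[ip-hdr]"), packet_at_router)
# To the IP code, `rest` (the TCP header plus the app's data) is just user data.
print(ip_header, rest)
```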

Generally, lower-layer information is about where the message is going and about getting it across the net correctly, with the necessary characteristics, within the finite resources of the network; information nearer the top (deeper in the message) is about who sent it. The network tries to be as invisible to the app as it can; at least that’s the theory.

When the Internet and its precursor the ARPANet were first built, computers and lines were much slower, tens of thousands of times slower. Voice was a high-bandwidth application then! The kind of data exchanged was often text, or numbers for number-crunching applications such as payrolls, weather forecasts, molecule-folding simulations, and the like. Of course, no one wanted to wait a long time for their data, but for this kind of data, delay and the “burstiness” of the traffic were not a big deal. In fact, it was burstiness that distinguished “data networks” from traditional telephone networks. More recently, with faster computers and faster lines, there has been a push to also support voice and video. These are very different from the early text-and-numbers uses of a network.

For voice and video, burstiness is a big problem. Voice and video should be delivered smoothly, at a constant rate. Delay is less of a problem than the variation in delay between messages, called jitter (a pretty self-describing term!). As messages go through the Internet, they are relayed by many routers along the way from source to destination. The amount of delay encountered in each router may vary widely and cause messages to arrive in erratic bursts. So even if the source sends a nice smooth flow of traffic, varying conditions at the routers along the way will introduce jitter, often considerable amounts, because lots of messages show up at the same routers at the same time.
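As a rough illustration (hypothetical numbers only), the sketch below sends packets at a perfectly even rate through 20 imaginary routers, each adding a small random queueing delay; the spread in arrival spacing, i.e. the jitter, is already noticeable even though each hop varies only a little.

```python
# Rough sketch of how small per-hop delay variations add up to jitter.
# All numbers here are made up for illustration.
import random
import statistics

random.seed(1)
SEND_INTERVAL_MS = 20.0          # packets leave the source every 20 ms, exactly
HOPS = 20                        # routers along the path

def one_way_delay_ms() -> float:
    # Each hop adds a small, random queueing delay (0-5 ms here).
    return sum(random.uniform(0.0, 5.0) for _ in range(HOPS))

send_times = [i * SEND_INTERVAL_MS for i in range(1000)]
arrival_times = sorted(t + one_way_delay_ms() for t in send_times)

spacings = [b - a for a, b in zip(arrival_times, arrival_times[1:])]
print(f"sent every {SEND_INTERVAL_MS} ms; "
      f"arrival spacing varies by about {statistics.stdev(spacings):.1f} ms")
```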

And while some packets are lost in ordinary operation, streaming voice doesn’t have time to replace lost packets before they’re no longer useful. It doesn’t take much jitter or packet loss to make voice unintelligible or video unwatchable, as we have all experienced. For mail and normal web browsing, jitter is irrelevant: generally the code waits for a whole line or paragraph before displaying it, and even if it didn’t, the text or picture would still be intelligible.

Why do these conditions vary? Think about the last time you went into a nearly empty store, with maybe only a half dozen customers in it, who all came in at different times, yet somehow all of you end up at the cash register at the same moment and you find yourself waiting in a queue. You thought it was Murphy’s Law! Well, it might be, but really it is just normal random behavior. (Even though it feels like it, it doesn’t happen every time!) The same thing happens in routers: different queue lengths at different times create different amounts of jitter. Again, for some messages this is not a problem, but jitter can make voice and video unusable. And if the queues get too long, that is bad for all messages; it’s called congestion. For networks like the Internet, congestion starts to appear at about 30% loading.
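The cash-register effect shows up in a toy simulation of a single output queue (illustrative numbers, not a model of any real router): packets arrive at random moments, each takes a fixed time to transmit, and the average wait climbs quickly as the load goes up.

```python
# Toy simulation of one router's output queue: random (Poisson) arrivals,
# fixed transmission time per packet, average waiting time at various loads.
import random
import statistics

random.seed(2)
SERVICE_MS = 1.0                                     # time to transmit one packet

def mean_wait_ms(load: float, n: int = 50_000) -> float:
    waits, clock, link_free_at = [], 0.0, 0.0
    for _ in range(n):
        clock += random.expovariate(load / SERVICE_MS)   # next random arrival
        start = max(clock, link_free_at)                 # wait if the link is busy
        waits.append(start - clock)
        link_free_at = start + SERVICE_MS
    return statistics.mean(waits)

for load in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"load {load:.0%}: average wait ≈ {mean_wait_ms(load):.2f} ms")
```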

There are two ways to deal with this: 1) overprovision the network (order more bandwidth) and run it at around 25% efficiency or lower (you can guess how popular this is), or 2) manage the traffic to reduce the probability that jitter-sensitive messages encounter queues, even if jitter-insensitive messages still hit them. In other words, treat jitter-sensitive traffic differently. No one cares if fragments of text or pictures arrive a few milliseconds earlier or later.

In the early 1960s, when Paul Baran proposed the kind of networks represented by the Internet, his insight was that computer networks would be the opposite of telephone networks: bursty rather than smooth and continuous. This is what the Internet was built to do, and it was perfect for file transfers, mail, and even the web. From the mid-70s, engineers recognized that eventually these networks would need to carry different kinds of traffic with different characteristics; it just wasn’t clear which traffic types would be needed first. By the early 90s, it was clear that voice and video over data networks would be the first applications that needed traffic characteristics different from the early applications. The problem was how to provide low-jitter flows in a network like the Internet. With the Internet boom on, growth was in double digits, and all the network providers could do was order more bandwidth and run their networks at very low efficiency. In fact, having to maintain low utilization to avoid congestion made it appear that voice could be supported.

Experts set out to solve the problem in the early 90s. Is this a hard problem? Well, let’s say moderately hard. A router has only so many tools to ensure smooth, regular intervals between messages. And the problem is cumulative: even if one router varies the spacing only a little, a message may go through 20 (or even more) routers to get to you, and a lot of “little bits” quickly become “a whole lot.” So what to do? Clearly, any solution would need to treat jitter-sensitive messages differently than jitter-insensitive messages. And since jitter is created by the variations in delay at each router, the primary thing to do with jitter-sensitive messages is to “get out of their way”! Forward them as quickly as possible. (Can someone say “fast lane”?)
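As a minimal sketch of that idea (hypothetical, not any vendor’s implementation), a router’s output scheduler can keep two queues and always transmit waiting jitter-sensitive packets first:

```python
# Minimal sketch of strict-priority output scheduling: jitter-sensitive
# packets are always sent before any jitter-insensitive packets waiting.
from collections import deque

class PriorityScheduler:
    def __init__(self) -> None:
        self.realtime = deque()   # voice, video, gaming: jitter-sensitive
        self.bulk = deque()       # mail, web pages, file transfers

    def enqueue(self, packet: bytes, jitter_sensitive: bool) -> None:
        (self.realtime if jitter_sensitive else self.bulk).append(packet)

    def next_to_send(self) -> bytes | None:
        # Real-time traffic "gets out in front"; bulk traffic waits its turn.
        if self.realtime:
            return self.realtime.popleft()
        if self.bulk:
            return self.bulk.popleft()
        return None

sched = PriorityScheduler()
sched.enqueue(b"web page fragment", jitter_sensitive=False)
sched.enqueue(b"voice sample", jitter_sensitive=True)
print(sched.next_to_send())   # b'voice sample' goes first
```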

During the mid- to late-90s, Internet engineers came up with at least two different approaches: Integrated Services (IntServ) and Differentiated Services (DiffServ). Unfortunately, by the turn of the century it was clear that neither worked as well as expected, so neither was widely deployed. Given the long string of what appeared to be successes and all of the hype surrounding the Internet, this was more than a little disappointing…and disconcerting.
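For flavor, DiffServ’s mechanism amounts to a few bits in the IP header (the DSCP field) that mark a packet’s class. The sketch below, which assumes a Linux-style socket API and a network that actually honors the marking (many do not), marks a UDP socket’s packets with the “Expedited Forwarding” code point intended for low-delay, low-jitter traffic.

```python
# Sketch: marking outgoing packets with a DiffServ code point (DSCP).
# Assumes a Linux-style socket API; whether the routers along the path
# honor the marking is entirely up to the network operators.
import socket

EF_DSCP = 46               # "Expedited Forwarding": low delay, low jitter
TOS_VALUE = EF_DSCP << 2   # DSCP occupies the top 6 bits of the old TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)

# Any datagram sent on this socket now carries the EF marking in its IP header.
sock.sendto(b"20 ms of voice samples", ("192.0.2.10", 5004))
```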

Soon after this experience, in 2001, Blumenthal and Clark wrote a paper making mostly non-technical arguments that internets should only behave in this bursty, “best-effort” way: the way the Internet already worked was the way networks were meant to be. This is odd, because before this time there was great confidence that anything could be achieved. Had they seen something that indicated otherwise? Regardless, the seed of net neutrality had been planted.

The major reason the proposed solutions didn’t work well was a design blunder made in the mid-80s, when the Internet engineers first discovered that congestion could occur. (If you don’t do something about congestion, a condition called “congestion collapse” occurs. This is as bad as it sounds: throughput in the network slows to a trickle.) They put the patch for congestion control too high in the layers, which meant that as the Internet grew, the effectiveness of the fix would decline.

“Discovered” is the right word. The Internet group did not see congestion collapse coming. They were caught flat-footed when it happened, even though everyone else had been doing research on the topic for 15 years, holding conferences on it, and so on, and the problem and the solution were reasonably well understood. Also, none of the other efforts proposed doing congestion control where the Internet did; they put it in the right place.

So why was this blunder made? The roots of the blundered congestion control lie in the early 1980s, when another blunder had taken away the layer where congestion control should have gone. Regardless of what the lower layers do to smooth out traffic, TCP is predatory and will generate jitter. Luckily, the effect is secondary, since interactive voice and video don’t use TCP.

Why was that blunder made? Because in the late 1970s, Internet architects made yet another blunder and didn’t realize that both network addresses and Internet addresses were necessary. (It isn’t relevant here, but even though you think you have an Internet address, it is really a network address.) The consequence of putting congestion control in the wrong place continually creates more jitter regardless of what the routers try to do to smooth it out.

Now the network operators – the much maligned Internet service providers – were between a rock and a hard place. Because the new approaches to supporting both kinds of traffic weren’t effective, they were confronted with just ordering more bandwidth to run their networks very inefficiently. But even that wasn’t really enough; they still needed to smooth out the delivery of jitter-sensitive messages.

Equipment vendors will tell you that they always try to build what their customers tell them they want. Their customers were desperate to be able to do something besides just order more bandwidth. Unlike other designs, the Internet does not provide any means for applications to tell the network what kind of traffic they are sending (yet another blunder).

There is only one other way to have a clue: if the routers just look a little bit into the next layer’s header, just “cheat” a little, they can tell what kind of traffic it is and make sure it is moved more smoothly, with lower jitter. Of course, if they can look a little bit further into the message, then why not look a lot further!? Once that step is taken, you may as well go all the way. Then they could also tell who is sending the messages, perhaps tell whether it is a competitor, and quite a bit more. The technical term for this is “Deep Packet Inspection.” Sounds ominous, doesn’t it? (A TCP/IP header is currently 40 bytes; DPI delves a few hundred bytes into the message.)
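To put those numbers in perspective, here is a hypothetical sketch, assuming a plain IPv4 packet with no header options: forwarding needs only the 20-byte IP header, “cheating a little” means also reading the TCP header that follows it, and DPI means reading past all 40 bytes of TCP/IP headers into the application’s own data.

```python
# Sketch of normal forwarding vs. "cheating a little" vs. deep packet
# inspection, assuming a plain IPv4 packet with a 20-byte header (no options).
import struct

def inspect(packet: bytes) -> None:
    # Forwarding needs only fields in the 20-byte IP header, e.g. the destination.
    dst_ip = ".".join(str(b) for b in packet[16:20])
    print(f"forwarding needs only the IP header: destination {dst_ip}")

    # "Cheating a little": peek at the TCP header (starting at byte 20 here)
    # to guess the kind of traffic from the port numbers.
    src_port, dst_port = struct.unpack("!HH", packet[20:24])
    print(f"transport peek: ports {src_port} -> {dst_port}")

    # Deep packet inspection: read past the 40 bytes of TCP/IP headers
    # into the application's own data.
    print(f"DPI reads the application data itself: {packet[40:72]!r}")

# A fabricated packet: 20-byte IP header + 20-byte TCP header + some data.
fake = (bytes(16) + bytes([192, 0, 2, 10])            # IP header ending in dst address
        + struct.pack("!HH", 443, 51000) + bytes(16)  # TCP header: ports, then padding
        + b"GET /index.html ...")                     # application data
inspect(fake)
```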

Now we have a problem! Network operators can play all sorts of unfair games, because previous blunders left them with bad choices. Customers won’t tolerate bad service, so network providers find themselves adding expensive capacity and losing even more efficiency in their networks.

Thus we arrive at the heart of the problem: we want network operators to be able to discriminate jitter-sensitive messages from jitter-insensitive messages, but, of course, we also want the network to be fairly available to everyone.

Where Are We on Blunders? The first blunder we mentioned is considered a great achievement, and people have gotten awards for it! It is now generally regarded as the right answer, and one can expect future work to follow it. The Internet community is still unaware that the second blunder even happened, or that since the early 80s the Internet has had the same structure as ITU networks, even though the solution has been known for 40 years. Fixing the third blunder has been consistently rejected every time it has come up over the last 25 years, even when the solution was already deployed and operational. Multiple proposals that avoid admitting the mistake have all turned out not to scale.

To some extent we have been lucky so far. Voice is (now) a low-bandwidth application, so it is unlikely to cause congestion, although too much of anything is not good. (Voice congestion has been observed at .01% loading; early VoIP deployments in corporations encountered this problem.) Video, on the other hand, is not even close to being low bandwidth (10,000 times more than voice), but watching a movie or listening to music is not really jitter-sensitive. (Most players build a 4-second buffer and, if that isn’t enough, and sometimes it isn’t, degrade the quality of the video to lower the bandwidth until the 4-second buffer can be maintained.)
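A rough sketch of that buffering strategy (hypothetical player logic and bit rates, not any real client) looks something like this: keep about 4 seconds of video buffered and step down to a lower-bandwidth encoding whenever the buffer can’t be kept full.

```python
# Rough sketch of the buffering strategy described above (hypothetical
# player logic): keep ~4 seconds of video buffered and lower the quality
# whenever the network can't keep the buffer full.
TARGET_BUFFER_S = 4.0
QUALITY_LADDER_KBPS = [4000, 2500, 1200, 600]   # available encodings, best first

def pick_quality(buffer_s: float, current: int) -> int:
    """Return the index of the encoding to fetch next."""
    if buffer_s < TARGET_BUFFER_S and current < len(QUALITY_LADDER_KBPS) - 1:
        return current + 1        # buffer shrinking: drop to a cheaper encoding
    if buffer_s >= TARGET_BUFFER_S and current > 0:
        return current - 1        # buffer healthy: try a better one again
    return current

# e.g. the buffer has fallen to 2.5 s while playing the 2500 kbps encoding:
print(QUALITY_LADDER_KBPS[pick_quality(2.5, 1)])   # -> 1200
```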

One application that has always been a problem is video calling, which is both high bandwidth and jitter-sensitive. In fact, any high bandwidth jitter-sensitive application, such as interactive gaming, will be strongly affected once volumes of traffic are sufficient to require congestion control to avoid collapse. If the jitter-sensitive traffic cannot be treated differently then video calls, interactive games, and other future uses will not be possible. Our luck is going to run out.

This wouldn’t be so bad if we could chalk it up to experience, i.e., if we only knew then what we know now. Unfortunately, we did know then what we know now; only the Internet made these blunders. There is going to be a need for other classes of service for new applications, and these design blunders are going to make it impossible to support them. There does not appear to be a way to deploy a solution to the congestion-control blunder without incurring great, prolonged pain. We seem to have blundered ourselves into a dead end.

Now the problem with the Net-Neutrality debate should be clear: Two problems are being conflated because of earlier blunders. We need to distinguish traffic of different types but not traffic of the same type. If net-neutrality bans distinguishing traffic altogether, we are in deep trouble. Be careful what you wish for, you may get it!

John Day is a pioneer in network architecture and the author of Patterns in Network Architecture: A Return to Fundamentals. The president of the Boston Map Society, he teaches Computer Science at Boston University Metropolitan College. He wrote this piece for High Tech Forum.