LTE For Networking People

When I started working on CoLTE, I had a ton of knowledge and experience in “regular” computer networking (i.e. TCP/IP) but zero experience with cellular. As I taught myself LTE from this background, some things were easy to grasp, others were a bit less familiar, and some made absolutely no sense at all. Given that a lot of other people are in the same boat, I wrote this post as a high-level “Transitioning to LTE” guide, aimed specifically at people who are already familiar with “regular” networking concepts (i.e. IP addressing, routing, link protocols, the TCP/IP stack, etc.).

Cellular Carriers Came From Telephony, Not Networking!

To understand the motivations behind the architecture and design of LTE, it’s important to realize that cellular carriers are much, much closer to phone companies than they are to ISPs. This heritage affects every aspect of cellular network design, but the most consequential difference is that voice in every prior generation of cellular networks was circuit-switched, not packet-switched. This meant that each call required a dedicated end-to-end circuit between the two phones to be provisioned and maintained for the life of the call. Naturally, the circuit had to be routed through the network core, just as in a regular phone company.

With 3G, cellular carriers added Internet access roughly the same way telephone companies supported dialup. The phone sets up a virtual circuit with a gateway in the telecom’s core (the SGSN, together with its companion node, the GGSN) that provides Internet access. The phone sends and receives packets over the virtual circuit, and the gateway handles things like NAT and IP forwarding. Note that in this design, packet switching exists only from the gateway out to the rest of the Internet: the route through the telecom network, from the gateway to the phone, still behaves like a circuit. As a result, all data traffic within the 3G network ends up being routed through the gateway, and any change in the network path from the phone to the gateway disrupts all communication. As a side note, on 3G networks where voice and data share the same circuit (notably the CDMA-based ones), this creates a fundamental limit: you can’t use any data while on a voice call!

“LTE” Really Just Means “An IP Network Core”

With LTE, cell carriers finally caught up to the Internet age, and as a result, all communication in the network is just IP datagrams! The cell towers each have IP addresses, the network runs some form of routing protocol, and towers exchange control signaling with the network core over SCTP. When a cell phone joins the network, it sets up a link-layer connection with the cell tower and performs setup and authentication with the network core, using signaling messages that the tower relays. The core then assigns the phone an IP address, and a GTP tunnel is established between the tower and the core to carry the phone’s IP packets.
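To make the tunneling concrete, here is a minimal sketch of how a user packet might be wrapped for the GTP user plane (GTP-U), assuming the plain 8-byte GTPv1-U header with no optional fields. The function names, the tunnel ID, and the payload are hypothetical and chosen only for illustration; a real tower would also wrap the result in UDP (port 2152) and an outer IP header addressed to the core.

```python
import struct
from typing import Tuple

GTPU_UDP_PORT = 2152   # registered UDP port for GTP-U (not used further in this sketch)
GTPU_FLAGS = 0x30      # version 1, protocol type GTP, no optional fields
GTPU_MSG_GPDU = 0xFF   # G-PDU: the payload is a tunneled user packet


def gtpu_encapsulate(teid: int, inner_packet: bytes) -> bytes:
    """Prefix an inner packet with the mandatory 8-byte GTP-U header.

    teid is the tunnel endpoint identifier that tells the receiver
    which phone's tunnel this packet belongs to.
    """
    header = struct.pack("!BBHI", GTPU_FLAGS, GTPU_MSG_GPDU,
                         len(inner_packet), teid)
    return header + inner_packet


def gtpu_decapsulate(frame: bytes) -> Tuple[int, bytes]:
    """Return (teid, inner_packet) from a GTP-U frame built as above."""
    flags, msg_type, length, teid = struct.unpack("!BBHI", frame[:8])
    if flags >> 5 != 1 or msg_type != GTPU_MSG_GPDU:
        raise ValueError("not a GTPv1-U G-PDU")
    if flags & 0x07:
        raise ValueError("optional header fields are not handled in this sketch")
    return teid, frame[8:8 + length]


# Hypothetical tunnel ID and stand-in payload, for illustration only.
frame = gtpu_encapsulate(0x1234, b"hello")
teid, payload = gtpu_decapsulate(frame)
```

The point of the sketch is that the tunnel itself is nothing exotic: it is a small header carrying a tunnel ID, placed in front of an otherwise ordinary IP packet.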

Once the phone’s connected to the network, the GTP tunnel carries all of its IP traffic… this includes Internet data, obviously, but also voice and texts. This leads me to one of the big secrets of LTE: literally everything in LTE is built on IP! Calls are VoLTE (Voice over LTE), which is really just a voice-over-IP call using SIP with some window-dressing! Text messages? Same thing! All of these services are nothing more than IP traffic to an IMS (IP Multimedia Subsystem) server in the telecom operator’s network. If you’re calling someone in another network, the two IMS servers set up a tunnel between them, but it’s still just tunneled IP traffic end-to-end.

Obviously, this has a huge simplifying effect on the telecom network architecture. Once you go to an all-IP core, you can finally ditch all of that legacy hardware and take advantage of the last forty years’ worth of network engineering. Tasks like firewalling, traffic shaping, and network monitoring become incredibly easy, as does building out the network.

Cellular Networks Are Centralized By Architectural Choice

Okay, so everything’s IP-based, and that’s great… but cellular networks still aren’t the same as the Internet. Astute readers will have already noticed that if all IP traffic from the handset is tunneled to the network core, then all traffic must be routed through the core, regardless of how close its final destination is. As a trivial example, all the datagrams for a call between two phones connected to the same cell tower will, in fact, be routed through the core – and if the core’s too congested, the call won’t go through, regardless of the available bandwidth at that tower. This is very un-Internet-like behavior, and it brings me to the biggest difference between cellular networks and the Internet: centralization.
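The same-tower example can be sketched as a toy model. Everything here is an assumption made for illustration: the hop names, the capacity numbers, and the simplification that a call fits only if every hop on its path has enough spare bandwidth. It is not how a real LTE core performs admission control.

```python
from typing import Dict, List

CORE = "core"


def tunneled_path(src_tower: str, dst_tower: str) -> List[str]:
    """Hops crossed by phone-to-phone traffic when everything is tunneled to the core."""
    return [src_tower, CORE, dst_tower]


def call_fits(free_mbps: Dict[str, float], src_tower: str,
              dst_tower: str, needed_mbps: float) -> bool:
    """True if every hop on the tunneled path has enough spare capacity."""
    return all(free_mbps[hop] >= needed_mbps
               for hop in tunneled_path(src_tower, dst_tower))


# Two phones on the same (hypothetical) tower: the path still goes via the core.
path = tunneled_path("tower-1", "tower-1")

# Plenty of room at the tower, none at the core: the call does not fit.
congested = call_fits({"tower-1": 50.0, CORE: 0.0}, "tower-1", "tower-1", 0.1)
```

Even with both phones attached to the same tower, the core sits on the path, so the core’s spare capacity decides whether the call goes through.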

Because of centralization, it’s challenging to create a good networking analogy for the relationship between the towers and the core. On one hand, GTP tunneling ensures that the first “hop” for all traffic generated by the handset is the network core – and in this view, the entire LTE network can be thought of as a single, massive WiFi router, with each individual cell tower representing a different antenna, and the network core performing tasks such as authentication, NAT, and packet switching upstream.

On the other hand, while this analogy is not wrong, it’s also absurd, given the size and scale of these networks and the amount of engineering and technology that goes into setting up and maintaining these tunnels. Cell towers are quite powerful devices, and it’s easy to imagine a world wherein each cell tower is in charge of performing NAT and routing, and the whole network core is simply packet-switched. In that world, each tower really would behave like an ordinary WiFi router. There are many reasons why this doesn’t happen, including vestigial design and backwards-compatibility with previous generations of cellular networks, but the main reason is that a cellular network is a single, massive enterprise network – and when we look at enterprise WiFi networks, we see the same designs!

Enterprise WiFi Is Equally Centralized

Our team recently realized that the best analogy for understanding LTE is Meraki. For those of you not familiar, Meraki is a set of cloud-based tools that automatically configure, deploy, manage, and provision enterprise-scale WiFi networks. Under Meraki, each WiFi access point is centrally controlled, and this architecture enables things such as (1) a shared SSID across a large number of access points, (2) seamless mobility management over the network, (3) centrally controlled authentication and sign-in, etc. Meraki is used in many large-scale enterprise or campus networks, basically anywhere that has a single “WiFi network” with tons of different individual access points. If this sounds remarkably similar to a cell network, it’s because it is.

Bringing This Back to CoLTE

Now that you’re up to speed with the basic LTE architecture, it’s time to bring this all back to the CoLTE project. A CoLTE network consists of a single cell tower combined with a single computer running all the core network logic. The core is connected to the Internet upstream, at which point it’s nothing but IP packets, so all the complexity goes away, the entire network condenses into a single physical unit, and the initial analogy of a WiFi router is incredibly apt. Hope this article’s been informative and useful, and thanks for reading!