Do you know your Kubernetes from Kafka? Your data pipelines from Elasticsearch? If there’s anything NFV has taught us, it’s this: don’t create a bespoke, sector-specific solution when a proven, general-purpose one exists. Steve Legge, COO, NetNumber, talks to Annie Turner.
AT: Where does your new TITAN.IUM platform leave your current solution?
SL: The name of the new platform is TITAN.IUM, and even from its name you can see it is the next evolution of the TITAN platform. TITAN.IUM is our new, cloud-native, InterGENerational platform, so we can support every generation of our network technology on one container-based platform, bringing together all of the legacy core signalling functions, whether it’s an STP [signalling transfer point] or a firewall or an HSS [home subscriber server] or whatever. We are not ‘just’ delivering a cloud-native 5G solution, but one that enables interworking with, and support of, legacy infrastructure.
We will continue to support TITAN for our customers – we foresee a natural migration over a period of time as infrastructure migrations occur, so we will keep application parity on both platforms.
AT: What about 5G?
SL: Of course, our customers are looking to 5G, so our first 5G applications will be routing, security and subscriber data management components, which we are rolling out now as proof of concept.
The first phase includes the SCP [service communication proxy] and the SEPP [security edge protection proxy], plus the NSSF [network slice selection function], the NRF [network repository function] and the BSF [binding support function] – they are really the infrastructure functions – along with the UDSF [unstructured data storage function].
At the moment we offer HSS and HLR [home location register] on our TITAN platform with a common AuC [authentication centre] and SDM [subscriber data management] back end. We find that customers who are looking to transition HLR to HSS can get a lot of value out of our architecture because of the shared AuC and SDM components.
For the second phase of the 5G subscriber data management functions, we will be offering the UDM [unified data management], UDR [unified data repository] and AUSF [authentication server function]. So by bringing the 5G subscriber data management components onto the same platform, you’ll have that span from 3G to 5G on TITAN.IUM.
Listen to the webinar on-demand, Cloud-Native Networks Have Arrived – The InterGENerational Future of Your Network.
AT: What prompted you to create TITAN.IUM?
SL: While there’s so much discussion about 5G and the wonderful new use cases it will support, there has been far less talk about how, in parallel with the move from 3G to 4G, the core network has been decentralising, a trend that will be both accelerated and amplified by 5G and edge computing.
Automation enabled by orchestration, not the piecemeal virtualisation we have now, is the only way to gain the promised benefits of 5G, which are greater coverage, higher capacity, lower latency, better reliability and faster speeds at less cost.
Automation is essential in the 5G core network because of the huge number of container-based microservices involved in its implementation and deployment, and their corresponding network functions. The management of the 5G network at the microservice level can no longer be done manually, and requires the support of orchestration and automation tools, and frameworks.
Additionally, decentralising the core introduces new issues, particularly for use cases dependent on low latency and rapid scalability.
We were already thinking about these issues when a customer brought a low-latency use case to us which really made us focus on them. The customer wanted to deploy a set of firewalls across a large geographic area and resolve a particular threat vector in less than 20 milliseconds.
The speed of light was the problem, as the optical network consumes 17 of those 20 milliseconds for transmission, leaving us with 3 milliseconds for resolution at two ends of a large national network. It was clear that none of the database technologies on the market were up to replicating the data that fast.
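The arithmetic behind that budget can be sketched in a few lines. This is an illustration only, not NetNumber's calculation: it assumes light travels through optical fibre at roughly 200,000 km/s (about two-thirds of its speed in a vacuum) and treats the 17 milliseconds as one-way transmission time. The function names are hypothetical.

```python
# Assumption: light in optical fibre covers roughly 200 km per millisecond
# (about 200,000 km/s). Real routes add switching and routing delay on top.
FIBRE_SPEED_KM_PER_MS = 200.0


def fibre_delay_ms(distance_km: float) -> float:
    """One-way propagation delay over a fibre path of the given length."""
    return distance_km / FIBRE_SPEED_KM_PER_MS


def remaining_budget_ms(total_ms: float, transmission_ms: float) -> float:
    """Time left for processing once transmission has taken its share."""
    return total_ms - transmission_ms


# Figures quoted in the interview: a 20 ms target, 17 ms spent in transmission.
processing_budget = remaining_budget_ms(20.0, 17.0)

# Under the assumed fibre speed, 17 ms corresponds to this much fibre.
implied_path_km = 17.0 * FIBRE_SPEED_KM_PER_MS

print(f"Processing budget: {processing_budget} ms")
print(f"Implied fibre path: {implied_path_km} km")
```

On those assumptions, 17 milliseconds of transmission implies a fibre path of about 3,400 km, which is why no amount of faster hardware at either end can win back more than the remaining 3 milliseconds.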
It was also clear that this kind of problem would occur more frequently as the core becomes more distributed. When you've got data for things like policies, subscribers and sessions living in the same data centre, within a few racks of each other and all cabled up, that’s great.
When you’ve got elements spread over continents or around the world, then things don’t perform the same and scalability takes on a different meaning in this context. To address these scenarios our customers require a toolbox of solutions that enable them to solve these new challenges with a set of capabilities that can evolve with their changing needs.
A big part of that is providing data replication and management solutions that leverage technologies developed specifically to address these types of challenges. We combine both data pipeline and database technologies to solve these challenges.
AT: What are data pipelines and how do they solve the data replication problem?
SL: If there is anything that network functions virtualisation (NFV) has taught the telecoms industry, it’s don’t create a bespoke, sector-specific solution when a proven, general-purpose one exists.
Apache Kafka is exactly that: it’s an open-source, stream-processing software platform, originally developed by LinkedIn back in 2010, and donated to the Apache Software Foundation. Its purpose is to provide a unified, high-throughput, low-latency platform for handling real-time data feeds using pipelines.
Pipeline technology is almost ubiquitous – for example, it’s used in many widely used web-scale applications – yet it remains little known outside the people who work in that part of applications. We have a super-dedicated group of people that are very deep in the part of the network that we build product for and our own programme has provided invaluable learning to get this type of essential tooling right.
As we were moving towards the launch of TITAN.IUM, we talked to our customers about these issues, which confirmed just how in step this thinking was with them. As well as echoing the challenges around distributed architectures, many were also saying, “NFV is not delivering what we thought it would. We need the next step,” although NFV taught the industry a lot and it would not be so well positioned to move to cloud native without it.
Last November we received our first RFP that wanted extraordinary detail about CI/CD [continuous integration and continuous delivery] tooling integration into a container-based platform and we’re seeing that interest grow. Although I don’t think we’ll see many deployments until the middle of 2021, it’s rising up operators’ agendas.
NetNumber grasped that we should only spend development budget on the things that make us different, or that can’t be solved or haven’t been solved elsewhere. The telecoms industry in general needs to take that view of the world.
AT: How did you respond once you had really understood the implications?
SL: We realised we needed to shift to Agile and implement much deeper levels of automation across our own business in the interests of speed, quality, consistency and scalability. At the start of 2019, we set out to transition our business to Agile practices over a two-year period.
We audited all our tooling and rationalised it down to about a quarter of the original number, and once everyone was using the same tooling and the same processes, we saw significant efficiency improvements. Internally, Agile operations take care of the inbound work from the customer or from the field into development. Getting the product from development into the field is DevOps’ role with the tooling. For our customers, Agile delivers not only a software product, but our highly skilled and knowledgeable people, the processes, the tooling, and simple integration with their systems.
Our transition has gone well and we are about three-quarters of the way through, ahead of schedule. Heavy investment in DevOps has enabled us to automate development, testing and more, which has brought huge benefits. Leveraging web-scale processes, tooling and technologies internally has had a very positive impact on our business, and likewise the adoption of web-scale technologies in TITAN.IUM has a similar complementary effect.
AT: Telecoms tends to think it has specialised needs – surely that is a big leap?
SL: It is and it’s not. We recognise that every one of our customers has a different infrastructure and is in a unique position regarding what they have implemented, and their business and operational priorities. Moving to cloud-native needs to be in steps and in the way that best suits each one of them. This is not just about the applications themselves: operators need a comprehensive set of tools that will allow them to continue to re-architect their network as they require.
Over time, we think their steps will become much more frequent. It won’t be the five- to seven-year timeframe that we saw with the introduction of earlier generations of mobile technology, but weeks or days as everything moves to being software defined and controlled. This means that the toolkit we provide to help our customers must address many things – and we need to be able to support all of them in a way that makes sense, both today and tomorrow.
We don’t want to leave anything or anyone behind, but make sure they can leverage the benefits of cloud-native tooling and infrastructure for what they have now and what they will build. Experience has taught us that new networks come and old networks don’t die, and that new technology turns into legacy very quickly.
AT: Can you describe the TITAN.IUM platform?
SL: TITAN.IUM is a cloud-native, container-based solution that brings forward all of the legacy protocol stacks and applications from the TITAN platform and integrates these into the new solution that also supports 5G. TITAN.IUM leverages Kubernetes-based container orchestration and a cloud-native software stack that allows us to support VM [virtual machines], NFV, and cloud-native, multi-cloud environments.
TITAN.IUM’s atomic elements that make up the solution are signalling processes and data brokers, and we communicate between them with data pipelines. Just to be clear, the data pipelines don’t replace the in-memory database for the signalling processor, which is integral to the TITAN platform and has been updated for TITAN.IUM, to ensure the extremely high performance it is known for.
Data pipelines are straightforward and organised into ‘topics’ which I picture as being like small pipes running inside big pipes. You can produce and consume as many topics as you like, and produce to any topic consumed from any other topic.
The data brokers manage that production and consumption, and the signalling processes write and read as appropriate. We deployed this solution for a couple of Tier 1 customers last year with a pre-release version of TITAN.IUM, although not with the cloud-native pieces because they weren’t ready for it, so we know just how well it works.
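The topic model described here can be sketched with a toy, in-memory broker. This is a simplified illustration, not Kafka's API and not TITAN.IUM's implementation: the `DataBroker` class, the `relay` helper and the topic names are all hypothetical, and a real broker adds persistence, partitioning and replication.

```python
from collections import defaultdict


class DataBroker:
    """Toy stand-in for a broker: one append-only log per named topic."""

    def __init__(self):
        self._logs = defaultdict(list)

    def produce(self, topic, message):
        """Append a message to a topic and return its offset in that log."""
        self._logs[topic].append(message)
        return len(self._logs[topic]) - 1

    def consume(self, topic, offset=0):
        """Return every message in a topic from the given offset onwards."""
        return list(self._logs[topic][offset:])


def relay(broker, source, target, transform):
    """Consume from one topic and produce the transformed result to another."""
    for message in broker.consume(source):
        broker.produce(target, transform(message))


broker = DataBroker()

# A signalling process writes to one topic (hypothetical topic names).
broker.produce("signalling.raw", "msg-1")
broker.produce("signalling.raw", "msg-2")

# Another process reads that topic and writes what it derives to a second one.
relay(broker, "signalling.raw", "signalling.checked", str.upper)

print(broker.consume("signalling.checked"))
print(broker.consume("signalling.raw", offset=1))
```

Because each consumer reads from its own offset, several processes can read the same topic independently, which is what lets elements be added or rearranged without changing the producers.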
TITAN.IUM provides our customers with a powerful InterGENerational multi-cloud platform that enables network and service migration for all generations of core network infrastructure across any virtualisation, NFV or cloud-native infrastructure. It gives them the option of having all SS7, Diameter, SIP, DNS, HTTP and HTTP/2 (5G) signalling components in a cloud-native architecture.
AT: What are the other advantages?
SL: Using data pipelines for elements to communicate with each other also means you can cluster the container-based elements – the building blocks of the distributed core architecture – and arrange them as you like. Customers can evolve the architecture in a very flexible way, over time, to suit their changing needs.
Importantly, these building blocks can be deployed in any type of compute environment, whether it’s cloud native or VM or NFV – we will have these mixed environments for the next few years. From a resource-separation standpoint, the components do what they’re meant to without being weighed down by having to attend to other things like element management or analytics. The data pipelines provide all that connectivity.
A major advantage of using Kafka for our pipelines is there’s a huge ecosystem of open-source components out there that work with it, such as Elasticsearch, which plugs natively into Kafka and which we use as our analytics solution.
Kafka’s ecosystem includes tracing, filtering, searching and more, which in previous solutions we’d have built ourselves. Now our customers get all of that functionality out of the box, as well as being able to easily integrate into solutions they may already have or are looking to implement.
It’s a very different world for us in terms of putting our products together with these building blocks as well as for our customers consuming the solutions.
AT: We’ve covered a lot of ground here, can you summarise why TITAN.IUM is such a big deal?
SL: For operators, TITAN.IUM provides the performance and scale needed for 5G while ensuring network and service evolution from 3G and 4G. Its unique architecture enables us to provide our customers with a containerised solution, regardless of where they are in their journey and the continuum of their infrastructure, so from here they will not have to rip and replace.
Automated workflows are critical for 5G, but they also unlock the economic benefits promised by NFV for prior generations of networks. We are bringing together the legacy network components with 5G applications into a solution that is infrastructure agnostic. Who would ever have thought we’d be putting an SS7 stack into a container to automate the STP lifecycle?
We wanted to achieve an infrastructure-independent solution so that, even if you’ve made the investment in NFV, we can deliver this into your VM infrastructure, where Kubernetes orchestration gives you the benefits of real application automation. Then, as you step into a fully cloud-native solution, it’s a less complex transition.
From our point of view, it’s all about architecting for the future – we deliver elements in a way that if orchestration shifts from Kubernetes to something else, we will be able to work with it. Similarly, it could be that Kafka is overtaken by something else for the pipelines, but our architecture has left the door open so we can be very flexible and adapt to new architectures and tooling as it evolves.
This article is sponsored by NetNumber.