AI-Enabled Edge Servers

A Definition First

Edge servers are servers that are not located in the cloud or in a data center. They are typically mounted in closets on factory floors and in airports, train and bus stations, trains, buses, autonomous vehicles, oil rigs, and hundreds of other settings. At one end they connect to a large number of “intelligent and connected things” such as sensors, actuators, cameras, pumps, factory equipment, and smartphones, while at the other they maintain an uplink to the corporate data center. Edge servers have the uninspiring but critical task of collecting, analyzing, and acting upon the massive amounts of data generated by the connected “Things”. They also carry the added responsibility of converting that raw data into “Operational Insights” and forwarding them to the upper layers.
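The collect-reduce-forward pattern described above can be sketched in a few lines. This is a minimal illustration with hypothetical names (`summarize_readings` and its fields are not from any real product): raw sensor samples are aggregated locally, and only the compact “operational insight” is sent upstream rather than the full raw stream.

```python
# Hypothetical sketch of the edge pattern: collect raw readings locally,
# reduce them to a compact "operational insight", forward only the summary.
from statistics import mean

def summarize_readings(readings, threshold):
    """Reduce raw sensor readings to a compact operational insight."""
    return {
        "count": len(readings),              # how many samples were seen
        "mean": mean(readings),              # average over the window
        "peak": max(readings),               # worst-case value
        "alert": any(r > threshold for r in readings),  # local decision
    }

# e.g. temperature samples from one sensor over a short window
raw = [21.4, 22.0, 21.8, 35.2, 21.9]
insight = summarize_readings(raw, threshold=30.0)
# only `insight`, not `raw`, would be forwarded to the upper layers
```

The point of the sketch is the bandwidth asymmetry: five raw samples collapse into one small record, and the alert decision is made on the device itself.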

Where Does AI Fit in This Picture?

The real value of edge servers can only be realized if they can process the collected data and make real-time decisions and predictions locally, with no reliance on remote resources. This requires that edge servers host pre-trained deep learning models and have the computational resources to run inference in real time. In most circles such servers are referred to as “Smart Edge Servers”. Latency and locality are key factors at the edge, since data-transport latencies and upstream service interruptions are intolerable in mission-critical applications. As an example, a small traffic camera on a lamp post should be able to detect a speeding car without relying on computational resources in the cloud.
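The traffic-camera example can be reduced to a toy sketch of the local decision loop. The function names and speed limit here are hypothetical illustrations, not any real camera API: the device estimates a vehicle’s speed from two timestamped position fixes and flags the violation on-device, with no round trip to the cloud.

```python
# Hypothetical sketch of on-device decision-making: a roadside camera
# estimates speed from two detections and decides locally, with no
# dependency on remote compute.
def estimate_speed_mps(pos1_m, t1_s, pos2_m, t2_s):
    """Speed in meters/second from two position fixes along the road."""
    return abs(pos2_m - pos1_m) / (t2_s - t1_s)

def is_speeding(speed_mps, limit_kmh=50.0):
    """Local decision: convert m/s to km/h and compare to the limit."""
    return speed_mps * 3.6 > limit_kmh

# Same car detected twice, 10 m apart and 0.5 s apart -> 20 m/s (72 km/h)
speed = estimate_speed_mps(0.0, 0.0, 10.0, 0.5)
flagged = is_speeding(speed)  # True: only the event, not video, goes upstream
```

In a real deployment the position fixes would come from an on-device vision model; the sketch only shows why the decision itself needs no remote resources.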

Unique Characteristics of Edge Servers

Edge servers are an extension of an organization’s IT infrastructure. They must be able to run the same workloads that run in data centers: virtual machines, containers, databases, and software-defined storage, just like any other server. Any deviation from this will be very costly when it comes to IT resource management and logistics. In effect, smart edge servers are designed to pack enterprise-class compute, management, security, and storage into one enclosure able to handle harsh environmental conditions. This forces vendors to build much smaller units that can be installed in a variety of spots (and not in server racks). Another common denominator among edge servers is their support for a wealth of wireless technologies such as Wi-Fi, 4G/LTE, Bluetooth, and a variety of other IoT-centric wireless technologies (LoRaWAN, LTE-M, etc.). To support legacy industrial applications, some must also offer wireline interfaces such as Ethernet, USB, and CANbus. This is typically done natively or with the help of plug-in modules.

As for deep learning workloads, most existing smart edge servers rely on either PCIe accelerator cards or carrier cards, mostly powered by various flavors of NVIDIA’s GPUs. It is also worth noting that AI-enabled edge servers are not the only intelligent edge devices: many vendors build standalone AI-enabled appliances, as well as embedded boards used strictly at the edge.

Opportunity for AI Chip Vendors

The leadership boundaries for deep learning data-center accelerator chips are pretty much drawn, at least for now. NVIDIA has clearly cornered the bulk of that market, and there are solid indications that solutions from the likes of Graphcore, Intel (Nervana), and Wave Computing are gaining some ground. The market for AI accelerator chips for edge applications, on the other hand, is wide open and desperately seeking fresh ideas. Existing AI-enabled edge servers rely almost entirely on costly modules and plug-ins from NVIDIA or a handful of third-party vendors using NVIDIA GPUs. The edge server market is estimated to grow at roughly 35% annually over the next five years, and such dramatic growth will invariably create fierce competition among suppliers, all of whom will have to compete on price, performance, and power dissipation. That is when BOM cost will be front and center, and the high cost of existing accelerator modules will stick out like a sore thumb. Fancy packaging, fan cooling, and general-purpose implementations will have to go. There will be a need for cheaper, lower-power, optimized accelerator chips from newcomers, and that is where I see opportunities. New entrants will also be able to add value by building application-specific features optimized for particular use cases, such as computer vision or voice processing.