
LinkedIn Re-Architects Edge-Building System to Support Diverse Inference Workflows


LinkedIn has detailed its re-architected edge-building system, an evolution designed to support diverse inference workflows for delivering fresher and more personalized recommendations to members worldwide. The new architecture addresses growing demands for real-time scalability, cost efficiency, and flexibility across its global platform.

The edge-building system powers LinkedIn’s graph by recommending "edges", or connections between members and content. These recommendations are generated through inference workflows, which run machine learning models to score and rank candidate suggestions. Over time, the system has evolved to balance freshness, latency, and resource efficiency across different inference modes.
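
At its core, each workflow scores candidate edges with a model and keeps the top-ranked ones. A minimal Python sketch of that step follows; the CandidateEdge structure, feature weights, and linear scorer are illustrative assumptions rather than LinkedIn’s actual code.

```python
# Hypothetical sketch of the core edge-scoring step; names and the
# scoring function are illustrative, not LinkedIn's production code.
from dataclasses import dataclass


@dataclass
class CandidateEdge:
    member_id: str
    target_id: str         # a member or piece of content to recommend
    features: list[float]


def score(edge: CandidateEdge) -> float:
    """Stand-in for a trained ML model; here, a fixed linear model."""
    weights = [0.6, 0.3, 0.1]
    return sum(w * f for w, f in zip(weights, edge.features))


def rank_candidates(candidates: list[CandidateEdge], k: int = 10) -> list[CandidateEdge]:
    """Score every candidate edge and keep the top-k recommendations."""
    return sorted(candidates, key=score, reverse=True)[:k]


if __name__ == "__main__":
    pool = [
        CandidateEdge("m1", "m42", [0.9, 0.2, 0.5]),
        CandidateEdge("m1", "m77", [0.1, 0.8, 0.3]),
    ]
    for edge in rank_candidates(pool, k=2):
        print(edge.target_id, round(score(edge), 3))
```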

The first generation relied on offline inference pipelines, which pre-computed recommendations in bulk. While effective at the platform’s earlier scale, this approach lacked the freshness needed to reflect dynamic member activity. To address this, LinkedIn introduced nearline inference, which runs models shortly after member actions are recorded, enabling more responsive recommendations while remaining cost-efficient.

Initial architecture using an offline inference model (Source: LinkedIn Engineering Blog)
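
A rough sketch of the nearline pattern described above, assuming a stream of recorded member actions: the queue, model stub, and recommendation store are hypothetical stand-ins for LinkedIn’s internal systems.

```python
# A minimal sketch of nearline inference, assuming an event stream of
# member actions; the queue, model call, and store are all hypothetical.
import queue

action_stream: queue.Queue = queue.Queue()
recommendation_store: dict[str, list[str]] = {}


def run_model(member_id: str) -> list[str]:
    """Stand-in for the ML model that re-scores candidates for a member."""
    return [f"suggested-for-{member_id}"]


def nearline_worker() -> None:
    """Consume recorded actions and refresh recommendations shortly after."""
    while True:
        member_id = action_stream.get()
        if member_id is None:   # sentinel to stop the worker
            break
        # Inference runs off the request path, so it stays cost-efficient
        # while keeping recommendations fresher than bulk offline runs.
        recommendation_store[member_id] = run_model(member_id)


action_stream.put("member-123")   # a member action is recorded
action_stream.put(None)
nearline_worker()
print(recommendation_store)
```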

The next stage of evolution focused on online inference, enabling real-time evaluation of candidate edges at request time. This shift provided the most up-to-date recommendations but introduced latency and resource scaling challenges. To manage this complexity, LinkedIn implemented remote inference capabilities, allowing models hosted in specialized serving systems to be invoked from multiple surfaces.
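
The sketch below illustrates the online path under assumed names: at request time it fans out parallel calls to a hypothetical remote model-serving system under a latency budget, which is exactly where the latency and scaling pressure appears.

```python
# A sketch of request-time (online) inference with a remote model call,
# assuming a hypothetical model-serving endpoint; the client is illustrative.
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def remote_score(model_name: str, candidate_id: str) -> float:
    """Stand-in for an RPC/HTTP call to a specialized model-serving system."""
    return 0.42  # a real client would send features and return the model score


def score_at_request_time(candidates: list[str], budget_s: float = 0.1) -> dict[str, float]:
    """Fan out remote inference calls in parallel under a latency budget."""
    scores: dict[str, float] = {}
    with ThreadPoolExecutor(max_workers=8) as pool:
        futures = {c: pool.submit(remote_score, "edge-ranker-v2", c) for c in candidates}
        for candidate, fut in futures.items():
            try:
                # Enforce the per-request budget; drop candidates that miss it.
                scores[candidate] = fut.result(timeout=budget_s)
            except FutureTimeout:
                continue
    return scores


print(score_at_request_time(["m42", "m77"]))
```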

The different inference models offer varying trade-offs in freshness, scalability, and efficiency:

Comparison of different inference models (Source: LinkedIn Engineering Blog)

The current architecture supports a mix of offline, nearline, online, and remote inference. A Directed Acyclic Graph (DAG) orchestrates these workflows, enabling parallel execution and flexible routing. For example, People You May Know leverages online inference for immediate updates, while large-scale content feeds continue to rely on offline computation.
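
A toy illustration of the DAG idea, with simplified, assumed workflow nodes: independent inference branches execute in parallel, and a merge node combines their outputs.

```python
# A toy sketch of DAG-style orchestration over the inference workflows;
# node names and the merge step are assumptions, not LinkedIn's internals.
from concurrent.futures import ThreadPoolExecutor


def offline_results():  return ["feed-item-1"]
def nearline_results(): return ["fresh-item-2"]
def online_results():   return ["realtime-item-3"]


def merge(*branches: list[str]) -> list[str]:
    """Join node: combine the outputs of the parallel inference branches."""
    return [item for branch in branches for item in branch]


def run_dag() -> list[str]:
    # Independent workflow nodes have no edges between them, so the
    # orchestrator can execute them in parallel before the merge node.
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(f) for f in (offline_results, nearline_results, online_results)]
        return merge(*(f.result() for f in futures))


print(run_dag())
```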

To improve candidate generation, LinkedIn has adopted Embedding-Based Retrieval (EBR), which creates embeddings from member profiles and retrieves relevant candidates from a vector store. These candidates are then scored online and merged with outputs from other workflows, enhancing both diversity and relevance.

Current architecture supporting diverse inference models (Source: LinkedIn Engineering Blog)
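
To make the EBR flow described above concrete, the sketch below embeds a member profile and runs a nearest-neighbor lookup against an in-memory vector store; the encoder, stored vectors, and cosine-similarity search are illustrative stand-ins for production components.

```python
# A minimal sketch of Embedding-Based Retrieval (EBR): the embedding
# function and in-memory vector store are hypothetical stand-ins.
import math

vector_store: dict[str, list[float]] = {
    "post-1": [0.9, 0.1, 0.0],
    "post-2": [0.2, 0.8, 0.1],
    "post-3": [0.4, 0.4, 0.8],
}


def embed_profile(profile_text: str) -> list[float]:
    """Stand-in for a learned encoder over member-profile features."""
    return [0.8, 0.2, 0.1]


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))


def retrieve(profile_text: str, k: int = 2) -> list[str]:
    """Nearest-neighbor lookup in the vector store; results then go to online scoring."""
    q = embed_profile(profile_text)
    ranked = sorted(vector_store, key=lambda cid: cosine(q, vector_store[cid]), reverse=True)
    return ranked[:k]


print(retrieve("ML engineer interested in distributed systems"))
```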

Ensuring consistency across workflows at LinkedIn’s scale required significant investment in shared feature stores, model management frameworks, and distributed serving infrastructure.
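
The following sketch hints at why a shared feature store matters: when offline and online scoring read identical features from a single source of truth, the two paths cannot drift apart. Store contents and feature names are assumptions for illustration.

```python
# A sketch of why a shared feature store matters: both workflows read the
# same features, so online and offline scores for an edge stay consistent.
# The store and feature names are illustrative assumptions.
shared_feature_store: dict[str, dict[str, float]] = {
    "member-123": {"connection_count": 250.0, "recent_activity": 0.7},
}


def get_features(member_id: str) -> dict[str, float]:
    """Single source of truth consumed by every inference workflow."""
    return shared_feature_store[member_id]


def offline_score(member_id: str) -> float:
    f = get_features(member_id)
    return 0.001 * f["connection_count"] + 0.5 * f["recent_activity"]


def online_score(member_id: str) -> float:
    # Uses the identical features and model logic, so results match offline.
    f = get_features(member_id)
    return 0.001 * f["connection_count"] + 0.5 * f["recent_activity"]


assert offline_score("member-123") == online_score("member-123")
print(online_score("member-123"))
```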

As Yi-Wen Liu, an engineer at LinkedIn, emphasized:

By decoupling workflows and supporting multiple inference strategies, we can flexibly balance freshness, scalability, and cost while continuing to deliver meaningful recommendations to our members.

According to LinkedIn engineers, the evolved edge-building system enables more efficient experiments and improved engagement through A/B testing. It also opens strategic opportunities in AI productivity, cost optimization, adoption of large language models and transformers, embedding-based retrieval, and advanced modeling techniques such as graph neural networks and sequential models, together enabling more timely, personalized, and actionable recommendations.
