Designing YouTube

YouTube System Design: Notes


1. Problem Statement

Design a scalable video streaming platform like YouTube.

Functional Requirements

  • Users can upload, watch, like, comment, and share videos.
  • Users can subscribe to channels.
  • Videos can be searched and recommended.
  • Support live streaming and video playback on multiple devices.

Non-Functional Requirements

  • High availability: videos must always be accessible.
  • Low latency: fast load and play times.
  • Scalability: millions of concurrent viewers and uploads.
  • Reliability: no data loss, even on server failure.

2. High-Level Architecture Overview

                  Client (Web / Mobile)
                           │
                           ▼
                     Load Balancer
                           │
                           ▼
           Application Servers (API Gateway)
                           │
             ┌─────────────┴──────────────┐
             │                            │
             ▼                            ▼
    Video Upload Service        Video Streaming Service
             │                            │
             ▼                            ▼
  Video Processing Pipeline   CDN (Content Delivery Network)
             │                            │
             ▼                            ▼
  Object Storage (e.g., S3)      Video Player Clients
             │
             ▼
  Metadata DB (SQL / NoSQL)

3. Core Components

| Component | Description |
| --- | --- |
| API Gateway / Load Balancer | Routes requests (upload, view, comment) to backend services. |
| Upload Service | Handles user uploads, validates file formats, and stores raw videos temporarily. |
| Transcoding Service | Converts raw videos into multiple resolutions/bitrates for adaptive streaming. |
| Storage Service | Stores videos in distributed object storage (e.g., AWS S3, Google Cloud Storage). |
| CDN | Delivers video content from edge servers closest to users. |
| Metadata Service | Manages video titles, tags, descriptions, likes, views, etc. |
| Recommendation System | Suggests videos based on history, trends, or user behavior. |
| Notification Service | Handles subscriptions and real-time notifications. |
| Search Service | Indexes and retrieves videos using keywords. |

4. Video Upload Flow

  1. User uploads a video file to the platform.
  2. Video passes through upload validation (format, size, limits).
  3. The file is stored temporarily (e.g., in an object store like S3).
  4. The transcoding service is triggered via a message queue (Kafka/SQS).
  5. The transcoder converts the video into multiple resolutions and bitrates (e.g., 240p up through 1080p and 4K).
  6. Each processed file is stored in permanent storage.
  7. The metadata database is updated with file info (title, duration, links).
  8. Video is made available for streaming via CDN.
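
The steps above can be sketched as a small in-process pipeline. This is only an illustration under stated assumptions: `queue.Queue` stands in for Kafka/SQS, plain dictionaries stand in for object storage and the metadata database, the transcoding step is stubbed out, and every name, limit, and format list below is hypothetical.

```python
import queue
import uuid

ALLOWED_FORMATS = {"mp4", "mov", "webm"}   # assumed allow-list
MAX_SIZE_BYTES = 10 * 1024**3              # assumed 10 GiB upload limit
RENDITIONS = ["240p", "480p", "720p", "1080p"]

raw_store = {}                    # stand-in for temporary object storage
video_store = {}                  # stand-in for permanent object storage
metadata_db = {}                  # stand-in for the metadata database
transcode_queue = queue.Queue()   # stand-in for Kafka/SQS


def upload(title: str, filename: str, data: bytes) -> str:
    """Steps 1-4: validate, store the raw file, enqueue a transcoding job."""
    ext = filename.rsplit(".", 1)[-1].lower()
    if ext not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {ext}")
    if len(data) > MAX_SIZE_BYTES:
        raise ValueError("file too large")
    video_id = str(uuid.uuid4())
    raw_store[video_id] = data
    metadata_db[video_id] = {"title": title, "status": "Processing", "links": {}}
    transcode_queue.put(video_id)
    return video_id


def transcode_worker() -> None:
    """Steps 5-8: drain the queue, write renditions, publish the video."""
    while not transcode_queue.empty():
        video_id = transcode_queue.get()
        raw = raw_store.pop(video_id)
        for rendition in RENDITIONS:
            key = f"{video_id}/{rendition}.mp4"
            video_store[key] = raw  # a real worker would run a transcoder here
            metadata_db[video_id]["links"][rendition] = key
        metadata_db[video_id]["status"] = "Published"
        transcode_queue.task_done()
```

Decoupling `upload` from `transcode_worker` through the queue is what lets the upload request return quickly while the slow transcoding work scales independently.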

5. Video Streaming Flow

  1. User requests to play a video.
  2. The client fetches metadata (video URL, resolution options).
  3. Video chunks are delivered via CDN using HTTP adaptive streaming (HLS/DASH).
  4. Client selects the best resolution dynamically based on network speed.
  5. Playback starts quickly because the CDN serves chunks from the edge location nearest the user.
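
For HLS, the "resolution options" the client fetches are typically listed in a master playlist that points at one media playlist per rendition. A minimal sketch of generating one follows; the bitrate ladder, the CDN base URL, and the `index.m3u8` naming are assumptions made for illustration, not values from these notes.

```python
# (name, width, height, peak bandwidth in bits per second): assumed ladder
LADDER = [
    ("240p", 426, 240, 400_000),
    ("480p", 854, 480, 1_000_000),
    ("720p", 1280, 720, 2_500_000),
    ("1080p", 1920, 1080, 5_000_000),
]


def master_playlist(base_url: str, ladder) -> str:
    """Build an HLS master playlist listing every rendition of one video."""
    lines = ["#EXTM3U"]
    for name, width, height, bandwidth in ladder:
        lines.append(
            f"#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},RESOLUTION={width}x{height}"
        )
        lines.append(f"{base_url}/{name}/index.m3u8")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical CDN path for a video with id "abc"
    print(master_playlist("https://cdn.example.com/videos/abc", LADDER))
```

The player reads this file once, then requests chunks from whichever rendition playlist matches its current bandwidth estimate.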

6. Technologies Used

| Area | Technologies |
| --- | --- |
| Frontend | React.js, Next.js, Video.js player |
| Backend | Node.js, Nest.js, Express |
| Storage | AWS S3 / Google Cloud Storage |
| Transcoding | FFmpeg, AWS Elastic Transcoder |
| Database | PostgreSQL (metadata), Redis (caching), Elasticsearch (search) |
| Queue System | Kafka / RabbitMQ for async video processing |
| CDN | CloudFront / Akamai / Cloudflare |
| Monitoring | Prometheus, Grafana, ELK Stack |

7. Database Design

Video Metadata Table (SQL)

| Field | Type | Description |
| --- | --- | --- |
| video_id | UUID | Unique video identifier |
| user_id | UUID | Owner/creator |
| title | String | Video title |
| description | Text | Description text |
| tags | Array | For search/recommendations |
| views | Int | View count |
| likes | Int | Like count |
| status | Enum | Uploaded / Processing / Published |
| created_at | Timestamp | Upload date |
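
As a rough sketch, the video metadata table could be declared as follows. SQLite is used here only so the example runs anywhere; it has no native UUID, array, or enum types, so those columns are approximated with `TEXT`, a JSON-encoded string, and a `CHECK` constraint. The index and the helper names are assumptions, and in PostgreSQL the column types would differ.

```python
import json
import sqlite3
import uuid

SCHEMA = """
CREATE TABLE videos (
    video_id    TEXT PRIMARY KEY,              -- UUID stored as text
    user_id     TEXT NOT NULL,                 -- owner/creator
    title       TEXT NOT NULL,
    description TEXT,
    tags        TEXT NOT NULL DEFAULT '[]',    -- JSON-encoded array
    views       INTEGER NOT NULL DEFAULT 0,
    likes       INTEGER NOT NULL DEFAULT 0,
    status      TEXT NOT NULL DEFAULT 'Uploaded'
                CHECK (status IN ('Uploaded', 'Processing', 'Published')),
    created_at  TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_videos_user ON videos (user_id, created_at);
"""


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with the videos table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def insert_video(conn, user_id: str, title: str, tags: list) -> str:
    """Insert a new video row and return its generated id."""
    video_id = str(uuid.uuid4())
    conn.execute(
        "INSERT INTO videos (video_id, user_id, title, tags) VALUES (?, ?, ?, ?)",
        (video_id, user_id, title, json.dumps(tags)),
    )
    return video_id
```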

User Table

Stores profile details, subscriptions, watch history, etc.

Search Index

  • Uses Elasticsearch for quick keyword search.
  • Stores title, tags, and description fields for full-text queries.
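
The idea behind keyword search can be shown with a toy inverted index, which maps each token to the set of videos containing it. This is not Elasticsearch's API; it is a deliberately simplified stand-in with hypothetical names, no ranking, and AND semantics across query terms.

```python
import re
from collections import defaultdict


def tokenize(text: str) -> list:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


class InvertedIndex:
    def __init__(self) -> None:
        self._postings = defaultdict(set)   # token -> set of video ids

    def add(self, video_id: str, title: str, tags: list, description: str = "") -> None:
        """Index the title, tags, and description of one video."""
        text = " ".join([title, " ".join(tags), description])
        for token in tokenize(text):
            self._postings[token].add(video_id)

    def search(self, query: str) -> set:
        """Return ids of videos that contain every token in the query."""
        tokens = tokenize(query)
        if not tokens:
            return set()
        result = set(self._postings.get(tokens[0], set()))
        for token in tokens[1:]:
            result &= self._postings.get(token, set())
        return result
```

A production search service adds relevance scoring, stemming, and typo tolerance on top of this basic structure.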

8. Scalability Strategies

| Strategy | Purpose |
| --- | --- |
| CDN | Reduce latency and server load for video playback. |
| Sharding | Distribute metadata across multiple DB servers. |
| Caching | Store trending videos and metadata in Redis. |
| Message Queues | Asynchronous processing for uploads/transcoding. |
| Microservices | Independent scaling of upload, streaming, and metadata systems. |
| Read Replicas | Improve DB read performance. |
| Load Balancer | Distribute traffic among servers. |
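
As one illustration of the sharding strategy, metadata rows could be routed to a database shard by hashing the video id. This is a minimal sketch that assumes a fixed shard count; the function name and the choice of SHA-256 are illustrative.

```python
import hashlib


def shard_for(video_id: str, num_shards: int) -> int:
    """Map a video id to a shard index in [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(video_id.encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce modulo the shard count.
    return int.from_bytes(digest[:8], "big") % num_shards
```

Simple modulo hashing remaps most keys when the shard count changes; consistent hashing is a common alternative when shards are added or removed often.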

9. Data Flow Summary

Upload Path

Client → API → Upload Service → Message Queue → Transcoder → Storage → Metadata DB

Playback Path

Client → API → Streaming Service (metadata and video URLs) → CDN → Player (adaptive bitrate playback)


10. Adaptive Bitrate Streaming (ABR)

Definition: Deliver videos at multiple qualities (240p, 480p, 720p, 1080p, etc.) and switch dynamically based on user bandwidth.

Protocols:

  • HLS (HTTP Live Streaming)
  • MPEG-DASH

Benefits:

  • Smooth playback
  • Handles variable network speeds
  • Reduces buffering
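
The switching decision can be sketched as a simple throughput-based rule: pick the highest rendition whose bitrate fits within a safety margin of the measured bandwidth. Real players also weigh buffer level and switch history; the ladder values and the 0.8 safety factor here are assumptions.

```python
# (rendition name, required bitrate in bits per second): assumed ladder
LADDER = [
    ("240p", 400_000),
    ("480p", 1_000_000),
    ("720p", 2_500_000),
    ("1080p", 5_000_000),
]


def pick_rendition(throughput_bps: float, ladder=LADDER, safety: float = 0.8) -> str:
    """Choose the best rendition that fits within the bandwidth budget."""
    budget = throughput_bps * safety
    affordable = [r for r in ladder if r[1] <= budget]
    if not affordable:
        # Nothing fits: fall back to the lowest bitrate rather than stalling.
        return min(ladder, key=lambda r: r[1])[0]
    return max(affordable, key=lambda r: r[1])[0]
```

For example, a measured throughput of 3.5 Mbit/s gives a budget of 2.8 Mbit/s, so this rule selects 720p.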

11. Search & Recommendation System

Search

  • Powered by Elasticsearch or Solr.
  • Indexes video metadata and tags.
  • Uses autocomplete and ranking algorithms.

Recommendation

  • Based on:

    • Watch history
    • Similar tags/topics
    • Trending videos
    • Collaborative filtering (similar users)
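
The "similar tags/topics" signal can be illustrated with Jaccard similarity between tag sets. This is a toy content-based sketch with hypothetical names, not a description of any production recommender, which would blend many signals such as watch history and collaborative filtering.

```python
def jaccard(a: set, b: set) -> float:
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_videos(target_id: str, catalog: dict, k: int = 3) -> list:
    """Return up to k video ids whose tags overlap most with the target's tags."""
    target_tags = catalog[target_id]
    scored = [
        (jaccard(target_tags, tags), video_id)
        for video_id, tags in catalog.items()
        if video_id != target_id
    ]
    scored = [item for item in scored if item[0] > 0]
    # Highest score first; break ties by id so the output is deterministic.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [video_id for _, video_id in scored[:k]]
```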

12. Notifications & Subscriptions

  • Pub/Sub model

    • When a user subscribes to a channel, the subscription is added to the subscriber DB.
    • When an uploader publishes a new video, an event triggers the notification service.
    • The notification is sent via email, app push, or in-app feed.
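
The pub/sub flow above can be sketched in memory as follows. The class and method names are hypothetical, dictionaries stand in for the subscriber DB and the in-app feed, and delivery by email or push is left out.

```python
from collections import defaultdict


class NotificationService:
    def __init__(self) -> None:
        self._subscribers = defaultdict(set)   # channel_id -> set of user ids
        self.inbox = defaultdict(list)         # user_id -> in-app feed messages

    def subscribe(self, user_id: str, channel_id: str) -> None:
        """Record that a user follows a channel."""
        self._subscribers[channel_id].add(user_id)

    def unsubscribe(self, user_id: str, channel_id: str) -> None:
        """Remove a subscription if it exists."""
        self._subscribers[channel_id].discard(user_id)

    def publish(self, channel_id: str, video_title: str) -> int:
        """Fan a new-video event out to every subscriber; return the count notified."""
        recipients = sorted(self._subscribers.get(channel_id, ()))
        for user_id in recipients:
            self.inbox[user_id].append(f"{channel_id} uploaded: {video_title}")
        return len(recipients)
```

For channels with very large subscriber counts, the fan-out loop would usually run asynchronously from a queue rather than inside the publish request.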

13. Caching

| Type | Data Cached | Tool |
| --- | --- | --- |
| Application Cache | User profiles, video metadata | Redis |
| CDN Cache | Static and video files | CloudFront / Akamai |
| Search Cache | Popular queries | Redis |
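
An application cache is commonly used in a cache-aside pattern: read from the cache first, fall back to the database on a miss, then populate the cache with an expiry. The sketch below uses an in-memory dictionary in place of Redis; `TTLCache`, `get_video_metadata`, and the loader callback are hypothetical names.

```python
import time


class TTLCache:
    """A tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}   # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]   # expired: treat as a miss
            return None
        return value

    def set(self, key, value) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)


def get_video_metadata(video_id: str, cache: TTLCache, load_from_db):
    """Cache-aside read: try the cache, otherwise load from the DB and cache the result."""
    cached = cache.get(video_id)
    if cached is not None:
        return cached
    value = load_from_db(video_id)
    cache.set(video_id, value)
    return value
```

The TTL bounds how stale cached metadata can get, which suits fields like titles far better than rapidly changing counters.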

14. Security & Compliance

  • Authentication & Authorization: OAuth 2.0 / JWT
  • Content Moderation: AI-based or manual review
  • Rate Limiting: limits abuse and helps mitigate DDoS attacks
  • Encrypted (signed, time-limited) Streaming URLs: deter piracy and unauthorized sharing
  • HTTPS everywhere to protect data in transit
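
One way to protect streaming URLs is to sign them with an expiry, so that a copied link stops working after a short time and cannot be altered. The sketch below is a generic HMAC scheme, not the signed-URL format of any particular CDN; the secret, query parameter names, and path are all assumptions.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"   # hypothetical key; a real key would come from a secret store


def _signature(path: str, expires_at: int, secret: bytes) -> str:
    message = f"{path}:{expires_at}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def sign_url(path: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Append an expiry timestamp and an HMAC signature to a URL path."""
    query = urlencode({"expires": expires_at, "sig": _signature(path, expires_at, secret)})
    return f"{path}?{query}"


def verify_url(url: str, now: int, secret: bytes = SECRET) -> bool:
    """Accept the URL only if it has not expired and the signature matches."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        expires_at = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    if now > expires_at:
        return False
    expected = _signature(parsed.path, expires_at, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(sig.encode("utf-8"), expected.encode("utf-8"))
```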

15. Key Metrics to Monitor

  • Video upload latency
  • Transcoding time per GB
  • Playback start time (Time-to-first-frame)
  • Buffering ratio
  • CDN hit/miss ratio
  • Average watch time per user
  • System availability (% uptime)
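
A few of these metrics reduce to simple ratios. The definitions below are common conventions rather than the only ones (for example, buffering ratio is taken here as stall time over total session time), and the function names are illustrative.

```python
def buffering_ratio(stall_seconds: float, playing_seconds: float) -> float:
    """Fraction of a viewing session spent stalled waiting for data."""
    total = stall_seconds + playing_seconds
    return stall_seconds / total if total else 0.0


def cdn_hit_ratio(hits: int, misses: int) -> float:
    """Fraction of CDN requests served from the edge cache."""
    total = hits + misses
    return hits / total if total else 0.0


def availability_percent(uptime_seconds: float, total_seconds: float) -> float:
    """Uptime as a percentage of the measurement window."""
    if total_seconds <= 0:
        raise ValueError("total_seconds must be positive")
    return 100.0 * uptime_seconds / total_seconds
```

As a worked example, 5 seconds of stalling in a session with 95 seconds of playback gives a buffering ratio of 5 / 100 = 0.05.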

16. Challenges & Trade-offs

| Challenge | Solution |
| --- | --- |
| Storage costs | Cold storage for old videos (e.g., AWS Glacier) |
| Bandwidth usage | Adaptive bitrate & CDN |
| Consistency vs latency | Use eventual consistency for views/likes |
| Scalability | Microservices + auto-scaling groups |
| Real-time metrics | Kafka + Stream processors (Flink / Spark) |
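
The eventual-consistency trade-off for views and likes can be illustrated with a counter that buffers increments and writes them to durable storage in batches. Readers see a slightly stale count in exchange for far fewer writes. The class name, flush threshold, and in-memory "durable" store are assumptions for the sketch.

```python
from collections import Counter


class ViewCounter:
    def __init__(self, flush_threshold: int = 100) -> None:
        self._pending = Counter()        # increments not yet persisted
        self._pending_total = 0
        self._flush_threshold = flush_threshold
        self.durable = Counter()         # stand-in for the views column in the DB

    def record_view(self, video_id: str) -> None:
        """Buffer one view; flush once enough increments have accumulated."""
        self._pending[video_id] += 1
        self._pending_total += 1
        if self._pending_total >= self._flush_threshold:
            self.flush()

    def flush(self) -> None:
        """Apply all buffered increments to durable storage in one batch."""
        self.durable.update(self._pending)
        self._pending.clear()
        self._pending_total = 0

    def displayed_views(self, video_id: str) -> int:
        """The count readers see; it may lag behind the true total until the next flush."""
        return self.durable[video_id]
```

A real deployment would also flush on a timer so that low-traffic videos do not wait indefinitely for their counts to update.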

17. Summary

| Feature | Technology / Design Choice |
| --- | --- |
| Video storage | Object storage (S3, GCS) |
| Metadata | SQL/NoSQL |
| Playback | CDN + Adaptive streaming |
| Processing | Distributed transcoding |
| Search | Elasticsearch |
| Queue | Kafka / RabbitMQ |
| Cache | Redis |
| Auth | OAuth 2.0 / JWT |
| Monitoring | Prometheus + Grafana |