Week 9: Queueing Theory

Week 9: Queueing Theory

Queuing theory: model the system with queue and requests, and see how it behaves.

Queuing Model

Queuing Model



  • Server
    – Anything hath performs tasks
  • Arrival rate (λ)
    – Also specified by the inter-arrival time (time between two consecutive arrivals)
  • Queuing delay (W)
    – Wait time in the queue
    – How long you wait in line
  • Service time (S), service rate (μ)
    – Time to service the request (Can be differ for each request)
    – How long it takes once you get to the front of the line
    – S = 1/μ
  • Response time (R)
    – Queuing delay + service time
    – Typically we want to minimize response time
  • Utilization
    – Fraction of time the server is busy
    – Trade-off: utilization and response time
    – Service time * arrival rate: S * λ, or λ / μ
  • Throughput: rate of task completions
    – If no overload, throughput = arrival rate
    – Overloaded state: λ > μ (Stability)


얼마나 큰 Queue가 필요할까??

Number of Tasks

N = Q + U

  • N: Avg number of tasks in the system
  • Q: Avg number of tasks in the queue
  • U: Avg number of tasks receiving service


Little’s formula ( Intuitive and trivial )

  • For only queue: Q = λ * W
  • For the entire system: N = λ * R


M/M/1 queue: One candidate modeling

Inter-arrival time: exponential distribution

Service time: exponential distribution

-> Poisson random process

R = S/(1-U)

Variance in R = S/(1-U)^2

M/M/1 Queue: response time vs utilization

M/M/1 Queue: response time vs utilization



Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.