Understanding Goroutines & The Go Scheduler (GMP Model)

Go’s concurrency model is one of the main reasons behind its popularity for building scalable systems. While launching a goroutine looks trivial using the go keyword, the internal mechanics behind it are powerful and carefully designed.

This blog explains goroutines from the ground up and then dives deep into how Go schedules them using the GMP scheduler model. Everything here focuses on how things actually work.

What Is a Goroutine?

A goroutine is a lightweight unit of execution managed entirely by the Go runtime, not by the operating system.

Key characteristics of goroutines:

Lightweight compared to OS threads
Dynamically growing stack
User-space threads
Exist within the same address space
Managed by the Go runtime scheduler
Multiplexed onto OS threads
Initial stack size is ~2 KB
Communicate via channels or shared memory

Go promotes a strong concurrency principle:

Do not communicate by sharing memory; instead, share memory by communicating.

This philosophy is central to Go’s design and is the reason channels are preferred over locks in many cases.
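As a small illustration of that principle, here is a minimal sketch in which worker goroutines never write to a shared variable and instead send their results over a channel. The function name sumSquares is made up for this example.

```go
package main

import "fmt"

// sumSquares (a hypothetical helper) starts one goroutine per input and
// collects the results over a channel, so only the calling goroutine ever
// writes to total.
func sumSquares(nums []int) int {
	results := make(chan int)
	for _, n := range nums {
		go func(v int) {
			results <- v * v // hand the value to the receiver
		}(n)
	}

	total := 0
	for range nums {
		total += <-results // receive exactly one result per goroutine
	}
	return total
}

func main() {
	fmt.Println(sumSquares([]int{1, 2, 3, 4})) // prints 30
}
```

The order in which results arrive is not defined, which is fine here because addition does not depend on order.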

Goroutines vs OS Threads

Goroutines are not OS threads.

OS threads are heavyweight, expensive to create, and managed by the OS
Goroutines are lightweight, cheap to create, and managed by Go

A Go program can run thousands or even millions of goroutines, which would be impractical with one OS thread per task.
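To get a feel for how cheap goroutines are, the sketch below launches a large number of them and waits for all of them to finish. The helper name runMany and the count of 100000 are arbitrary choices for this example.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// runMany (a hypothetical helper) launches n goroutines, each of which
// increments a shared counter once, and waits for all of them to finish.
func runMany(n int) int64 {
	var wg sync.WaitGroup
	var counter int64

	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			atomic.AddInt64(&counter, 1) // atomic, so no data race
		}()
	}

	wg.Wait() // block until every goroutine has called Done
	return atomic.LoadInt64(&counter)
}

func main() {
	fmt.Println(runMany(100000)) // prints 100000
}
```

Without the WaitGroup, main could return before the goroutines run, and the program would exit without waiting for them.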

Goroutines Are Managed by the Go Runtime

Some important points:

Goroutines are scheduled in user space
The OS does not know about goroutines
The Go runtime maps goroutines onto OS threads
Multiple goroutines can run on a single OS thread
Multiple OS threads can exist inside one Go process

This mapping is handled using the GMP scheduler model.

The GMP Scheduler Model

The Go scheduler internally uses the GMP model:

G : Goroutine
M : Machine (OS Thread)
P : Processor (Logical scheduling context)

This model defines how goroutines are executed efficiently across CPU cores.

G : Goroutine

A G represents a goroutine.

It contains:

Stack
Program Counter
CPU registers
Metadata (state, ID, etc.)

A goroutine cannot execute by itself. It must be:

Assigned to a P (processor)
Executed by an M (machine)

M : Machine (OS Thread)

Represents an OS thread
Created by the Go runtime
Executes Go code
An M must acquire a P to run goroutines

P : Processor (Logical Execution Context)

A P represents execution capacity: an M must hold a P before it can run Go code.

Each P :

Holds a local run queue of goroutines. When a goroutine is created, it is added to the local queue of the P that created it, so many goroutines may initially accumulate on a single P.
Contains scheduler state
Controls parallel execution

Key rule:

Parallelism is limited by the number of Ps, not by the number of goroutines or OS threads.

The number of Ps is controlled by GOMAXPROCS. By default, it equals the number of logical CPUs.
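You can inspect these values from a running program. The sketch below uses runtime.NumCPU and runtime.GOMAXPROCS from the standard library; the helper name schedulerInfo is made up for this example. Passing 0 to runtime.GOMAXPROCS only queries the current setting.

```go
package main

import (
	"fmt"
	"runtime"
)

// schedulerInfo (a hypothetical helper) reports the number of logical CPUs
// and the current GOMAXPROCS value, which is the number of Ps.
func schedulerInfo() (cpus, procs int) {
	// GOMAXPROCS(0) does not change the setting; it just returns it.
	return runtime.NumCPU(), runtime.GOMAXPROCS(0)
}

func main() {
	cpus, procs := schedulerInfo()
	fmt.Printf("logical CPUs: %d, GOMAXPROCS: %d\n", cpus, procs)

	// A positive argument changes the setting and returns the previous one.
	prev := runtime.GOMAXPROCS(1)
	fmt.Printf("was %d, now %d\n", prev, runtime.GOMAXPROCS(0))
	runtime.GOMAXPROCS(prev) // restore the original value
}
```

The same limit can be set without code changes through the GOMAXPROCS environment variable.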

Important detail:

The number of OS threads (M) can be greater than the number of processors (P).

Why?

OS threads can block on system calls or I/O
Go creates extra threads to avoid stalling execution

How Goroutines Are Executed (High-Level Flow)

Many Goroutines (G) are created by the program
Each newly created goroutine is placed into the local run queue of a Processor (P)
Each Processor (P) manages its own queue of runnable goroutines
An OS thread (M) acquires a Processor (P)
The OS thread (M) picks a goroutine (G) from the P’s local run queue
The OS thread (M) executes the goroutine (G)
The goroutine runs on the CPU until it blocks, completes, or is preempted

Global Run Queue

Go also maintains a global run queue.

Purpose:

Ensure fairness
Handle load balancing
Support scheduling when local queues are empty or unsuitable

The scheduler periodically checks the global queue, even when a P's local queue still has work, so goroutines waiting there are not starved.

Work Stealing

If a P runs out of runnable goroutines:

It steals half of the runnable goroutines from another P's local run queue
This keeps CPU cores busy
This maximizes parallelism

This mechanism is called work stealing and is crucial for Go’s scalability.
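The idea of stealing half can be sketched with a toy model. This is not the runtime's implementation (the real local queues are fixed-size and lock-free); the type P and the function stealHalf below are simplified stand-ins invented for this example.

```go
package main

import "fmt"

// P is a toy stand-in for a scheduler processor: just a queue of goroutine IDs.
type P struct {
	runq []int
}

// stealHalf moves half of victim's queue (rounded up) to thief and returns
// how many entries were moved. It is a single-threaded illustration only.
func stealHalf(thief, victim *P) int {
	n := (len(victim.runq) + 1) / 2
	if n == 0 {
		return 0 // nothing to steal
	}
	thief.runq = append(thief.runq, victim.runq[:n]...)
	victim.runq = victim.runq[n:]
	return n
}

func main() {
	busy := &P{runq: []int{1, 2, 3, 4, 5, 6}}
	idle := &P{}

	moved := stealHalf(idle, busy)
	fmt.Println(moved, idle.runq, busy.runq) // prints 3 [1 2 3] [4 5 6]
}
```

Taking half rather than a single goroutine means the idle P does not have to come back and steal again right away.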

What Happens When a Goroutine Blocks?

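The scheduler's response depends on how the goroutine blocks.

When a goroutine performs a blocking system call:

The M running it blocks in the kernel together with the goroutine
The P is detached from that M
Another M, idle or newly created, acquires that P and keeps running its queued goroutines

When a goroutine sleeps, waits on a channel, or waits on network I/O:

The goroutine is parked and taken off the CPU
The M keeps its P and picks the next runnable goroutine
The parked goroutine becomes runnable again once the timer fires, the channel operation can proceed, or the runtime's network poller reports that the I/O is ready

In both cases execution continues without blocking the entire scheduler.

This design ensures that blocking operations do not reduce parallelism.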
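A small experiment makes this visible for channel waits. The sketch below limits the program to a single P and still completes a two-way exchange, because a goroutine that cannot proceed on a channel operation is parked and the P runs the other one. The helper name pingWithOneP is made up for this example.

```go
package main

import (
	"fmt"
	"runtime"
)

// pingWithOneP (a hypothetical helper) limits the program to a single P and
// exchanges a message with a worker goroutine over unbuffered channels.
func pingWithOneP() string {
	prev := runtime.GOMAXPROCS(1)  // returns the previous setting
	defer runtime.GOMAXPROCS(prev) // restore it on return

	request := make(chan string)
	reply := make(chan string)

	go func() {
		msg := <-request // waits here until main sends
		reply <- msg + " pong"
	}()

	request <- "ping" // waits until the worker is ready to receive
	return <-reply    // waits until the worker replies
}

func main() {
	fmt.Println(pingWithOneP()) // prints: ping pong
}
```

If a blocked goroutine held on to the only P, this program would deadlock instead of finishing.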

M and P Relationship

Important observations:

An OS thread (M) holds a P only while executing Go code
A blocking system call causes it to release the P, and so does running out of work and going idle
Each OS thread can execute many goroutines over time
Only one goroutine runs at a time per P

Goroutine Creation Timing

A common point of confusion is when a goroutine actually comes into existence:

A goroutine is created and enqueued when the go statement executes, not when the function is declared.

go doWork()

At this moment:

A new G is created
It is placed into a P's run queue
The scheduler decides when it runs
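One observable consequence is that the arguments of the call are evaluated at the go statement, while the function body runs later, whenever the scheduler picks the goroutine. A minimal sketch (the helper name capturedAtGo is made up for this example):

```go
package main

import "fmt"

// capturedAtGo (a hypothetical helper) shows that the arguments of a go
// statement are evaluated when the statement executes, not when the new
// goroutine eventually runs.
func capturedAtGo() (seen, current int) {
	out := make(chan int, 1)

	x := 1
	go func(v int) {
		out <- v // v was fixed when the go statement executed
	}(x)
	x = 2 // changing x afterwards does not affect v

	return <-out, x
}

func main() {
	seen, current := capturedAtGo()
	fmt.Println(seen, current) // prints 1 2
}
```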

Program Startup Flow

When a Go program starts:

The runtime creates the Ps (their number is set by GOMAXPROCS)
Each P gets its own local run queue
The initial OS thread acquires a P
The main goroutine is placed into that P’s run queue
Execution begins

Over time, goroutines are redistributed across Ps through work stealing and the global queue.

Lifecycle of a Goroutine

Created → Runnable → Running → Blocked → Runnable → Terminated

Created: Goroutine is instantiated
Runnable: Waiting in a run queue
Running: Actively executing
Blocked: Waiting on I/O, syscall, channel, etc.
Terminated: Execution finished
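The lifecycle can be written down as a tiny state machine. The sketch below uses the simplified state names from this post, not the runtime's internal ones, and the names State, next, and canMove are invented for this example. It also includes the Running to Runnable transition that happens when a goroutine is preempted.

```go
package main

import "fmt"

// State is a simplified goroutine state, named as in the list above.
type State int

const (
	Created State = iota
	Runnable
	Running
	Blocked
	Terminated
)

var names = [...]string{"Created", "Runnable", "Running", "Blocked", "Terminated"}

// String makes State print as its name.
func (s State) String() string { return names[s] }

// next lists the transitions allowed in this simplified model.
var next = map[State][]State{
	Created:  {Runnable},
	Runnable: {Running},
	Running:  {Runnable, Blocked, Terminated}, // preempted, blocked, or finished
	Blocked:  {Runnable},
}

// canMove reports whether the model allows going from one state to another.
func canMove(from, to State) bool {
	for _, s := range next[from] {
		if s == to {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(canMove(Running, Blocked)) // prints true
	fmt.Println(canMove(Blocked, Running)) // prints false
}
```

Note that a blocked goroutine never goes straight back to Running: it must become Runnable and wait in a run queue first.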

Why the GMP Model Works So Well

The GMP scheduler allows Go to:

Scale efficiently across CPU cores
Avoid excessive OS thread creation
Handle blocking operations gracefully
Maximize CPU utilization
Keep concurrency simple for developers

All while writing code as simple as:

go func() { /* concurrent work */ }()

Final Thoughts

Goroutines may look simple, but the Go scheduler is doing a huge amount of work behind the scenes.

Understanding:

Goroutines (G)
OS threads (M)
Processors (P)
Local and global queues
Work stealing
Blocking behavior

…gives you a clear mental model of how Go concurrency actually works.

Once you understand GMP, Go’s concurrency stops feeling magical and starts feeling predictable, powerful, and elegant.
