Scheduler but in Lua

Write a program which schedules other programs.

Linux scheduling policy is usually buried deep inside the kernel. Even with sched_ext, the policy is typically implemented as an eBPF program compiled ahead of time.

I contributed to Lunatik as part of GSoC project in 2026.

luasched is a part of that project.

It is a Lunatik binding created with a different approach: the scheduler remains in eBPF, but task classification is delegated to a Lua script. Instead of rebuilding the scheduler whenever policy changes, we can simply edit a Lua file.

Motivation

Suppose we want:

Nginx workers to get low latency scheduling
Firefox background processes to receive larger time slices
Everything else to use the default policy

Traditionally this logic would be hardcoded inside the scheduler like so:

if (is_nginx(task))
    ...
else if (is_firefox(task))
    ...

There are two pain points:

Every change is in eBPF, and the eBPF code is a pain to write.
Every policy change requires recompilation.

With luasched, the scheduler asks Lua how a task should be treated.

if task:comm():match("^nginx") then
	ctx:dsq(REALTIME)
	ctx:slice_ns(10000000)
end

IMPORTANT

The scheduling mechanism stays in eBPF. The scheduling policy lives in Lua.

Architecture:

flowchart TD
A[enqueue task] ==> B["`sched_ext BPF
check pid inside eBPF map
if found, dispatch
if **not** found, invoke Lua
`"]
B --> G[(eBPF map)]
B ==>|Cache miss|D[bpf_luasched_run kfunc]
B -->|Cache hit|E[scx_bpf_dsq_insert]
D ==>|ctx:dsq, ctx:slice_ns|F[Lua runtime]
F ==> E

There are two parts to this scheduler:

Create an eBPF scheduler which handles the fast path.
Create a workload.lua policy handler which takes care of slow path.
The first time a task is seen, the scheduler invokes Lua.
Lua returns:
- target dispatch queue (DSQ)
- scheduling slice
The result is cached in a BPF hash map keyed by PID.
Subsequent enqueues avoid Lua entirely.

Dispatch Queues

The user needs to define (on eBPF side) dispatch queues, like so.

#define DSQ_REALTIME 0
#define DSQ_BATCH    1
#define DSQ_DEFAULT  2

During initialization (on eBPF side) these queues are created:

scx_bpf_create_dsq(DSQ_REALTIME, -1);
scx_bpf_create_dsq(DSQ_BATCH, -1);
scx_bpf_create_dsq(DSQ_DEFAULT, -1);

Tasks are inserted into one of these queues based on their Lua-assigned class.

The dispatcher prioritizes them in order:

flowchart LR
a[REALTIME] --> B[BATCH]
B --> C[DEFAULT]

This means latency-sensitive work can be serviced before background workloads.

eBPF scheduler

Now on eBPF side create a map for caching pids with enqueue decisions. Define an enqueue decision as the following struct:

struct task_class {
	s32 dsq;
	u64 slice_ns;
};

Now we create the map for caching the decisions

struct {
	__uint(type, BPF_MAP_TYPE_HASH);
	__uint(max_entries, 10240);
	__type(key, pid_t);
	__type(value, struct task_class);
} task_classes SEC(".maps");

The interesting part happens inside enqueue. If the task was previously classified, the cached result is reused.

void BPF_STRUCT_OPS(luasched_enqueue, struct task_struct *p, u64 enq_flags)
{
    pid_t pid = p->pid;
    struct task_class *cls;
    
    cls = bpf_map_lookup_elem(&task_classes, &pid);
    if (cls) {
    	scx_bpf_dsq_insert(p, cls->dsq, cls->slice_ns, 0);
    	return;
    }
    ...
    /* invoke Lua to determine the enqueue verdict */
}

If no cached result is found, Lua is invoked. The function is exposed as a kfunc

extern int bpf_luasched_run(
    const char *key, 
    size_t key__sz, 
    struct task_struct *task, 
    struct task_class *cls
) __ksym;

It recieves a pointer to struct task_class and modifies it according to a Lua policy

void BPF_STRUCT_OPS(luasched_enqueue, struct task_struct *p, u64 enq_flags)
{
    ...
    /* invoke Lua to determine the enqueue verdict */
    struct task_class received_cls = { .dsq = -1, .slice_ns = -1 };
    
    int ret = bpf_luasched_run(runtime, sizeof(runtime), p, &received_cls);
    
    bpf_map_update_elem(&task_classes, &pid, &received_cls, BPF_ANY);
    scx_bpf_dsq_insert(p, received_cls.dsq, received_cls.slice_ns, 0);
}

Writing Policy in Lua

The scheduler itself knows nothing about process names like nginx or firefox.

That knowledge lives entirely in Lua.

local policy = {
    { pattern = "^nginx", dsq = REALTIME, slice = 1000000 },
    { pattern = "^firefox", dsq = BATCH, slice = 10000000 },
}

On Lua we attach a handler to set the enqueue verdict

local sched = require("sched")
 
local function workload(ctx)
	local task = ctx:task()
	for _, rule in ipairs(policy) do
		if task:comm():match(rule.pattern) then
			ctx:dsq(rule.dsq)
			ctx:slice_ns(rule.slice)
			return
		end
	end
	ctx:dsq(DEFAULT)
	ctx:slice_ns(scx.SLICE_DFL)
end
 
sched.attach(workload)

Why Cache Results?

Calling into Lua on every enqueue would be expensive. A task’s command name rarely changes after startup. By classifying a task once and storing the result in a BPF map, the scheduler pays the Lua cost only once.

All future scheduling decisions become simple hash lookups.

Vantage Point

Explorer

Scheduler but in Lua

Motivation

Dispatch Queues

eBPF scheduler

Writing Policy in Lua

Why Cache Results?

Table of Contents

Backlinks