General User Models¶

General User Models (GUMs) learn about you by observing any interaction you have with your computer. The GUM takes as input any unstructured observation of a user (e.g., device screenshots) and constructs confidence-weighted propositions that capture the user's knowledge and preferences. GUMs introduce an architecture that infers new propositions about a user from multimodal observations, retrieves related propositions for context, and continuously revises existing propositions.
tl;dr Everything you do can be used to make your systems more context-aware.
Getting Started¶
First, you'll need to install the GUM package. There are two ways to do it:
Getting Started I: Installing the GUM package
You can start a GUM server directly from the command line.
Getting Started II: Starting a GUM server
We recommend running this in a tmux or screen session to keep it alive.
First, install SGLang and launch its server with your LM.
> pip install "sglang[all]"
> pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.4/
> # Launch model
> CUDA_VISIBLE_DEVICES=0 python -m sglang.launch_server ....
> # name of the model you launched
> export MODEL_NAME="model-org/model-name"
> # point this to the GUM multimodal model
> export GUM_LM_API_BASE="base-url"
> # (optionally) set an API key
> export GUM_LM_API_KEY="None"
Alternatively, we recommend using SkyPilot to serve and run your own models in the cloud. You can use the skypilot.yaml file in the repo; you'll need to replace the Hugging Face token (HF_TOKEN) with your own. By default, we use Qwen 2.5 VL 32B (AWQ quantized). A single H100 (80GB) should give you good enough throughput.
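Either serving route gives you an OpenAI-compatible endpoint (SGLang exposes one by default), which is what the environment variables above point GUM at. As a quick sanity check before starting GUM, you can hit that endpoint directly. The snippet below is a minimal sketch, not part of GUM itself; it assumes GUM_LM_API_BASE points at the server's OpenAI-compatible /v1 base URL and that the variables above are already exported.

import os
from openai import OpenAI
# Reuse the variables exported above: GUM_LM_API_BASE, GUM_LM_API_KEY, MODEL_NAME.
client = OpenAI(
    base_url=os.environ["GUM_LM_API_BASE"],
    api_key=os.environ.get("GUM_LM_API_KEY", "None"),
)
response = client.chat.completions.create(
    model=os.environ["MODEL_NAME"],
    messages=[{"role": "user", "content": "Reply with 'ok' if you can see this."}],
    max_tokens=8,
)
print(response.choices[0].message.content)

If this prints a reply, the model server is reachable and the environment variables are set correctly.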
Next, start up the GUM listening process:
Required Permissions
When you first run this command, your system will prompt you to grant accessibility and screen recording permissions to the application. You may need to restart the process a few times as you grant these permissions. This is necessary for GUM to observe your interactions and build its model.
Once you're all done, go ahead and try querying your GUM to view propositions and observations:
Getting Started III: Querying GUMs with the API
One of the main methods you'll use to interface with the GUM is the query function; it's exactly what the CLI calls under the hood. Simply pass your query in as a parameter, and matching propositions are retrieved with BM25. The query function takes many more arguments, which you can read about here.
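As a rough sketch of what this looks like from Python: the import path, constructor arguments, and exact query signature below are assumptions (check the API reference linked above for the real interface). The point is simply that you pass a plain-text query and get back matching propositions.

import asyncio
from gum import gum  # assumed import path; confirm against the API reference

async def main():
    # Assumed constructor: your name plus the model you exported as MODEL_NAME.
    async with gum("Your Name", model="model-org/model-name") as g:
        # Plain-text query; matching propositions are retrieved with BM25.
        results = await g.query("what is the user working on right now?")
        for result in results:
            print(result)

asyncio.run(main())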
Applications¶
Once you're all set up, check out the tutorials here. There's a host of cool applications you can build on top of GUMs. For example, you can set up an MCP server that uses GUMs here.
Under the hood¶

Observers collect raw interaction data.¶
Observers are modular components that capture various user interactions: screen content, notifications, etc. Each observer operates independently, streaming its observations to the GUM core for processing. We implement a Screen observer as an example.
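To make that interface concrete, here is an illustrative sketch of an observer. It is not the actual implementation; the class, callback, and Observation type below are made up for this example. It follows the pattern described above: capture something from the environment and stream it to the GUM core as an unstructured observation.

import asyncio
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Awaitable, Callable

@dataclass
class Observation:
    observer: str        # which observer produced this
    content: str         # raw, unstructured payload (text, an image path, ...)
    timestamp: datetime

class ClipboardObserver:
    """Toy observer: periodically captures data and streams it to the GUM core."""
    def __init__(self, emit: Callable[[Observation], Awaitable[None]], interval: float = 5.0):
        self.emit = emit            # async callback into the GUM core
        self.interval = interval    # seconds between captures
    async def run(self) -> None:
        while True:
            content = self._capture()  # stand-in for real capture logic
            if content:
                await self.emit(Observation("clipboard", content, datetime.now(timezone.utc)))
            await asyncio.sleep(self.interval)
    def _capture(self) -> str:
        return ""  # a real observer would call a platform API (screen capture, clipboard, ...) here

The built-in Screen observer follows the same pattern, but captures screen content instead.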
Propositions describe inferences made about the user.¶
The core of GUM is its proposition system, which transforms raw observations into structured knowledge. Each proposition carries a confidence score and connects to related information, continuously updating as new evidence arrives.
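As a conceptual illustration (not GUM's actual schema), a proposition can be pictured as roughly this shape:

from dataclasses import dataclass, field

@dataclass
class Proposition:
    text: str                  # e.g. "Prefers reading papers in the morning"
    confidence: float          # confidence weight attached to the inference
    evidence: list[str] = field(default_factory=list)  # observations supporting it
    related: list[int] = field(default_factory=list)   # links to related propositions
    def revise(self, confidence: float, new_evidence: str) -> None:
        """Update the proposition as new evidence arrives."""
        self.confidence = confidence
        self.evidence.append(new_evidence)

New observations can add evidence, adjust a proposition's confidence, or introduce and link entirely new propositions.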