The Lightdash Python SDK lets you query your semantic layer directly from Python. Use it in Jupyter notebooks, Python scripts, or anywhere you use Python to ensure everyone pulls from a single source of truth.
See it in action
Try the getting started Jupyter notebook for a hands-on walkthrough.
To use the SDK, you need three things from your Lightdash instance:
instance_url — the URL where you log into Lightdash (for example, https://app.lightdash.cloud for Lightdash Cloud, or your self-hosted URL like https://lightdash.mycompany.com).
project_uuid — the unique ID of the project you want to query.
access_token — a personal access token (PAT) used to authenticate as you.
You don’t need to be an admin to do any of this. You just need access to the project you want to query.
Click your avatar (top-right corner) and go to Settings.
In the left-hand menu, under your name, click Personal access tokens.
Click Generate new token.
Give it a description (for example, “Python SDK”) and pick an expiration date.
Click Generate token.
Copy the token immediately and save it somewhere safe — you won’t be able to see it again after closing the dialog. If you lose it, just generate a new one.
Treat your access token like a password. Don’t commit it to git or share it. For scripts, load it from an environment variable (for example, os.environ["LIGHTDASH_TOKEN"]) instead of pasting it directly into your code.
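For example, a small helper along these lines keeps the token out of your source. The helper name and the `LIGHTDASH_TOKEN` variable name are just conventions for this sketch, not something the SDK requires:

```python
import os

def get_lightdash_token(var: str = "LIGHTDASH_TOKEN") -> str:
    """Read a Lightdash PAT from the environment, failing fast if unset."""
    token = os.environ.get(var)
    if not token:
        raise RuntimeError(f"Set the {var} environment variable to your personal access token")
    return token
```

You can then pass `get_lightdash_token()` as the `access_token` when creating the client.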
```python
from lightdash import Client

client = Client(
    instance_url="https://app.lightdash.cloud",
    access_token="your-token",
    project_uuid="your-uuid",
)

model = client.get_model("orders")

# Build and execute a query
result = (
    model.query()
    .metrics(model.metrics.revenue, model.metrics.profit)
    .dimensions(model.dimensions.country)
    .filter(model.dimensions.status == "active")
    .sort(model.metrics.revenue.desc())
    .limit(100)
    .execute()
)

# Get results as a DataFrame
df = result.to_df()

# Or as a list of dictionaries
records = result.to_records()
```
Immutable — each method returns a new Query object, safe for reuse
Lazy evaluation — API calls only happen when .execute() is called
Order-independent — methods can be called in any order
Composable — create base queries and extend them
```python
# Create a reusable base query
base = (
    model.query()
    .metrics(model.metrics.revenue)
    .dimensions(model.dimensions.country)
)

# Extend it for different use cases
by_active = base.filter(model.dimensions.status == "active")
by_inactive = base.filter(model.dimensions.status == "inactive")
```
Access dimensions and metrics as attributes on the model:
```python
# Access via attribute
country = model.dimensions.country
revenue = model.metrics.revenue

# List all available fields
all_dimensions = model.dimensions.list()
all_metrics = model.metrics.list()
```
Features:
Lazy loading — fetched from API on first access, then cached
Fuzzy matching — mistyped field names produce suggestions for the closest matches
Tab completion — works in Jupyter/IPython for discovery
```python
result = query.execute()

# To pandas DataFrame
df = result.to_df()  # or result.to_df(backend="pandas")

# To polars DataFrame
df = result.to_df(backend="polars")

# To list of dictionaries
records = result.to_records()

# To JSON string
json_str = result.to_json_str()
```
For large result sets, results are paginated automatically:
```python
result = query.execute()

# Access a specific page
page_2 = result.page(2)

# Iterate through all pages
for page in result.iter_pages():
    process(page)

# Lazy DataFrame loading (polars only)
lazy_df = result.to_df_lazy()
```
Available properties on the result:
result.query_uuid — unique identifier for the query