API Access

Definity provides comprehensive REST APIs for programmatic access to all platform features.

  • API Documentation: All endpoints are documented in the Swagger UI at https://[definity-server]/api/docs
  • Direct Database Access: Not recommended - schemas may change between versions

Authentication

All API requests require authentication via Bearer token in the Authorization header:

Authorization: Bearer <your_token>

Generate tokens in the Definity UI under User → Generate API Token.

Data Retrieval — GET /api/*

Query pipelines, tasks, metrics, and lineage data from Definity. See the Swagger documentation for full endpoint details.
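
As a minimal illustration of an authenticated GET call (the /api/pipelines path below is only a placeholder; consult the Swagger UI for the actual endpoint paths and query parameters):

import requests

HOST_URL = "https://app.definity.run"
headers = {"Authorization": "Bearer <your_token>"}

# Placeholder endpoint path - see https://[definity-server]/api/docs for the real endpoints
response = requests.get(f"{HOST_URL}/api/pipelines", headers=headers)
response.raise_for_status()  # raises if the token is invalid or the request failed
print(response.json())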

Reporting — POST /log/*

Report task runs, transformations, and metric values to the Definity platform. This lets you integrate external pipelines and metrics for unified observability.

Basic Flow

  1. Start task: POST /log/task with status: "start"
  2. Report transformations: POST /log/tf with lineage (input/output datasets)
  3. Report dataset schema: POST /log/dataset with column metadata
  4. Report metrics: POST /log/metrics with metric values
  5. End task: POST /log/task with status: "end"

Example: Metrics

import requests
from datetime import datetime

# Configuration
HOST_URL = "https://app.definity.run"
headers = {"Authorization": "Bearer <your_token>"}
TASK_ID = datetime.now().isoformat()  # Must be unique across all tasks (a string; here a timestamp)

# 1. Start task
start_payload = {
    "task_id": TASK_ID,
    "status": "start",
    "env": "dev",  # optional
    "app_name": "demo-pipeline",
    "task_name": "basic_example",
    "task_type": "python",
    "app_pit": "2025-09-01"  # optional
}
requests.post(f"{HOST_URL}/log/task", headers=headers, json=start_payload)

# 2. Report metrics
metrics_payload = {
    "task_id": TASK_ID,
    "metrics": [{
        "asset_name": "basic_asset_example",
        "asset_type": "table",
        "metric_type": "cnt",
        "value": 12345
    }]
}
requests.post(f"{HOST_URL}/log/metrics", headers=headers, json=metrics_payload)

# 3. End task
end_payload = {
    "task_id": TASK_ID,
    "status": "end"
}
requests.post(f"{HOST_URL}/log/task", headers=headers, json=end_payload)
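
The snippets above ignore the HTTP responses for brevity. In a real pipeline you may want to keep the response and fail fast on errors, for example:

resp = requests.post(f"{HOST_URL}/log/metrics", headers=headers, json=metrics_payload)
resp.raise_for_status()  # raises if the request was rejected (e.g. invalid token or payload)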

Example: Lineage & Dataset Schema

import requests
from datetime import datetime

# Configuration
HOST_URL = "https://app.definity.run"
headers = {"Authorization": "Bearer <your_token>"}
TASK_ID = datetime.now().isoformat()  # Must be unique across all tasks (a string; here a timestamp)

# 1. Start task
start_payload = {
    "task_id": TASK_ID,
    "status": "start",
    "app_name": "spark-iceberg-producer",
    "task_name": "produce-tables",
    "task_type": "spark",
    "env": "production",  # optional
    "app_pit": "2025-09-01T10:00:00"  # optional
}
requests.post(f"{HOST_URL}/log/task", headers=headers, json=start_payload)

# 2. Report transformation with lineage (input datasets -> output dataset)
tf_payload = {
    "task_id": TASK_ID,
    "tf_id": 1,
    "status": "end",
    "query_str": "CREATE TABLE shared_db.table_b AS SELECT id, name, upper(event_type) as event_type_upper FROM shared_db.table_a",  # optional
    "input_datasets": {
        "shared_db.table_a": {
            "columns": ["id", "name", "event_type"]  # optional
        }
    },
    "output": {"name": "shared_db.table_b"}
}
requests.post(f"{HOST_URL}/log/tf", headers=headers, json=tf_payload)

# 3. Report dataset schema for the output table
dataset_payload = {
    "task_id": TASK_ID,
    "tf_id": 1,  # optional - links schema to a specific transformation
    "ds_name": "shared_db.table_b",
    "ds_type": "table",  # optional
    "metadata": [
        {"info_type": "schema_field", "info_value": "id-INTEGER", "info_num": 0},
        {"info_type": "schema_field", "info_value": "name-STRING", "info_num": 1},
        {"info_type": "schema_field", "info_value": "event_type_upper-STRING", "info_num": 2}
    ]
}
requests.post(f"{HOST_URL}/log/dataset", headers=headers, json=dataset_payload)

# 4. End task
end_payload = {
    "task_id": TASK_ID,
    "status": "end"
}
requests.post(f"{HOST_URL}/log/task", headers=headers, json=end_payload)

Cross-app lineage: Lineage is connected automatically through shared dataset names. If one app writes shared_db.table_a and another app reads it, the lineage will link across apps.
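
For example, a second app that reads shared_db.table_b (produced above) only needs to report it as an input dataset; the app, task, and output names below are illustrative:

import uuid
import requests

HOST_URL = "https://app.definity.run"
headers = {"Authorization": "Bearer <your_token>"}
CONSUMER_TASK_ID = str(uuid.uuid4())  # unique per task run

# Start the consumer task (illustrative names)
requests.post(f"{HOST_URL}/log/task", headers=headers, json={
    "task_id": CONSUMER_TASK_ID,
    "status": "start",
    "app_name": "spark-iceberg-consumer",
    "task_name": "read-table-b",
    "task_type": "spark"
})

# Reading shared_db.table_b links this app's lineage to the producer app above
requests.post(f"{HOST_URL}/log/tf", headers=headers, json={
    "task_id": CONSUMER_TASK_ID,
    "tf_id": 1,
    "status": "end",
    "input_datasets": {
        "shared_db.table_b": {
            "columns": ["id", "name", "event_type_upper"]  # optional
        }
    },
    "output": {"name": "shared_db.table_c"}  # illustrative downstream table
})

# End the consumer task
requests.post(f"{HOST_URL}/log/task", headers=headers, json={
    "task_id": CONSUMER_TASK_ID,
    "status": "end"
})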

Important: You must generate and manage unique task_id values. Use the same task_id across all API calls for a given task run.
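
One simple way to do this (one option among several) is to create the ID once at the start of the run, for example from a UUID or a timestamped name, and reuse it in every payload:

import uuid
from datetime import datetime

# One unique ID per task run, reused in every /log/* call for that run
TASK_ID = str(uuid.uuid4())
# or a human-readable variant:
# TASK_ID = f"demo-pipeline.basic_example.{datetime.now().isoformat()}"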