Core Concepts
Full Stack Data Observability
Full stack data observability is a holistic approach to maintaining complex data ecosystems. While classic data observability systems focus on data quality alone, full stack systems also monitor the software & code that generate & consume the data.
Once monitoring shifts from the data to the software, the user gains visibility into:
- The health & stability of the data applications
- The performance, resource efficiency & cost of the data applications
- The quality of the processed data, tested against the specific requirements of each application that accesses it
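The difference between data-only and full stack monitoring can be illustrated with a minimal sketch. The class and field names below are hypothetical, not definity's API; the point is that one per-run record carries application-health, resource, and data-quality signals side by side.

```python
import time
from dataclasses import dataclass, field

@dataclass
class RunMetrics:
    """Hypothetical per-run record combining app health, resources & data quality."""
    app_name: str
    started_at: float = field(default_factory=time.time)
    records_in: int = 0
    records_out: int = 0
    null_key_rows: int = 0        # data-quality signal
    peak_memory_mb: float = 0.0   # resource-efficiency signal

    def duration_s(self) -> float:
        # application-health signal: how long the run has been going
        return time.time() - self.started_at

    def null_key_ratio(self) -> float:
        # data-quality signal: fraction of input rows with a null key
        return self.null_key_rows / self.records_in if self.records_in else 0.0
```

A full stack system would collect records like this from inside the running application, rather than querying the output tables after the fact.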
AI for data platform insights, optimizations & data incidents
Modern data platforms & pipelines are complex, and the teams that run them prefer to focus on the business logic & value they are expected to deliver, not on quality checks and optimizations.
definity monitors hundreds of metrics covering all aspects — execution health, data health & resource utilization — learns the behavior of your pipelines, and uses AI/ML to generate tests and optimizations accordingly:
- Data quality tests
- Execution health tests
- Health insights
- Resource utilization recommendations
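As a rough sketch of how a learned test differs from a hand-written one: instead of a user hard-coding a threshold, the pass/fail bounds are derived from the pipeline's own history. The function below is illustrative only (definity's actual models are not public); it uses a simple mean ± 3σ rule over historical row counts.

```python
import statistics

def generate_row_count_test(history: list[int], sigmas: float = 3.0):
    """Return a test whose pass/fail bounds are learned from past runs.

    Illustrative stand-in for an auto-generated data quality test;
    a real system would use richer models than a 3-sigma band.
    """
    mean = statistics.mean(history)
    std = statistics.stdev(history)
    low, high = mean - sigmas * std, mean + sigmas * std

    def test(observed: int) -> bool:
        # Pass when today's row count falls inside the learned band
        return low <= observed <= high

    return test

check = generate_row_count_test([1000, 1020, 980, 1010, 995])
check(1005)  # within the learned band -> True
check(5000)  # anomalous -> False
```

The same pattern extends to execution-health tests (e.g. learned bounds on run duration) and resource recommendations (e.g. learned memory headroom).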
Real-time proactive observability
Unlike traditional data quality tools, definity does not run on the data after the pipeline has ended. Instead, it is injected into the pipeline itself and runs with a negligible footprint. This enables real-time analysis & a proactive response to incidents. Below are some examples:
| Real-time analysis | Active intervention | Impact |
|---|---|---|
| Identify stale input data before the application starts to transform it | Preempt the pipeline run | Resources saved & no downstream impact from stale data |
| Identify stuck jobs early | Stop the stuck jobs automatically | Meet SLAs, allow reruns & save wasted resources |
| Identify faulty data output automatically at runtime | Divert the final output to an alternate location | Avoid downstream contamination while allowing debugging & approval |