# Getting Started with Bruin
Most data teams end up with a stack that looks like this:
- Fivetran or Airbyte for data ingestion.
- dbt for SQL transformations.
- Airflow for orchestration.
- Great Expectations for data quality.
This stack works, but it comes with real costs: each tool brings its own configuration files, authentication setup, and learning curve. Airbyte needs connector configurations. dbt needs profiles.yml and dbt_project.yml. Airflow needs DAGs written in Python. Great Expectations needs expectation suites. For a small team or a solo developer, this overhead adds days to what should be a simple pipeline.
Bruin consolidates these tools into one. You define data sources, transformations, and quality checks in a single project. Everything runs through one CLI. If you know SQL, you can build a complete pipeline in an afternoon.
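To make that concrete, here's a sketch of a single Bruin asset: one SQL file whose embedded `@bruin` block declares the asset's name, materialization, and column-level quality checks, so the transformation and its validation live side by side. The asset name, the columns, and the `duckdb.sql` type are illustrative assumptions, not part of the pipeline you'll build below.

```sql
/* @bruin
name: analytics.example_orders
type: duckdb.sql

materialization:
  type: table

columns:
  - name: order_id
    type: integer
    checks:
      - name: not_null
      - name: unique
@bruin */

-- The transformation is plain SQL; Bruin materializes the result
-- as a table and runs the column checks above against it.
select order_id, customer_id, order_total
from raw.orders
```

A single `bruin run` executes the pipeline, quality checks included.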
In this guide, you'll build an e-commerce analytics pipeline using CSV files and SQL. You'll load raw data, clean and join it, and produce business metrics that answer questions like "What's our daily revenue?" and "Who are our best customers?"
By the end, you'll have four analytics tables:
| Table | Question | Key Metrics |
|---|---|---|
| daily_revenue | "How much did we make each day?" | Total revenue, order count, customer count |
| product_performance | "Which products sell best?" | Units sold, revenue, ranking |
| customer_metrics | "Who are our best customers?" | Total spent, order count, segment |
| category_performance | "Which categories drive revenue?" | Category revenue, average order value |
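As a preview of where you're headed, the daily_revenue asset could look something like the sketch below. The `raw.orders` source table and its column names are assumptions for illustration; the guide builds the real inputs from the CSV files step by step.

```sql
/* @bruin
name: analytics.daily_revenue
type: duckdb.sql

materialization:
  type: table
@bruin */

-- One row per day: total revenue, number of orders, distinct customers.
select
    cast(order_date as date)    as order_day,
    sum(order_total)            as total_revenue,
    count(*)                    as order_count,
    count(distinct customer_id) as customer_count
from raw.orders
group by 1
```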