Scaffolding a Full dbt Pipeline with Zero Hand-Written SQL

I recently tried an experiment: how much of a dbt pipeline can you build without actually writing SQL?

The answer: almost all of it.

The Three Macros That Do the Heavy Lifting

The dbt-labs/codegen package ships three macros that handle the bulk of the work:

generate_source connects to your warehouse, introspects a schema, and outputs the YAML for a _sources.yml listing every table. With the generate_columns and include_data_types arguments enabled, it also lists every column name and data type. No more manually typing out column definitions.

generate_base_model takes a source table and generates a staging model with a clean CTE structure: a source CTE, a renamed CTE, and a final select. You get the column list pulled straight from the warehouse, with your choice of leading or trailing commas and materialization.

generate_model_yaml points at an already-materialized model and generates the schema YAML with column names and data types. This is the file your docs and test definitions hang off.

All three run via dbt run-operation and print their output to stdout, so you just redirect it to a file (dbt's --quiet flag helps keep log lines out of that file). No copy-pasting from the Snowflake UI, no guessing column types.
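To make the redirect step concrete, here is a minimal Python sketch of how the three calls could be scripted. It is not part of the codegen package: the helper names, the raw_shop schema, and the file paths are all hypothetical, and it assumes a codegen version whose macros print to stdout plus a dbt version that supports --quiet. The argument names (schema_name, generate_columns, include_data_types, source_name, table_name, model_names) are the ones I know from the codegen README, so check them against your installed version.

```python
import json
import subprocess
from pathlib import Path


def build_run_operation(macro: str, args: dict) -> list[str]:
    """Build the argv for `dbt --quiet run-operation <macro> --args '<json>'`."""
    # JSON is valid YAML, so it can be passed straight to --args.
    return ["dbt", "--quiet", "run-operation", macro, "--args", json.dumps(args)]


def write_macro_output(macro: str, args: dict, target: Path) -> None:
    """Run the macro and save whatever it prints to `target`."""
    result = subprocess.run(
        build_run_operation(macro, args),
        capture_output=True,
        text=True,
        check=True,  # raise if dbt exits with a non-zero status
    )
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(result.stdout)


# Hypothetical plan: one entry per macro, with made-up names and paths.
EXAMPLE_CALLS = [
    (
        "generate_source",
        {"schema_name": "raw_shop", "generate_columns": True, "include_data_types": True},
        Path("models/staging/_sources.yml"),
    ),
    (
        "generate_base_model",
        {"source_name": "raw_shop", "table_name": "orders"},
        Path("models/staging/stg_orders.sql"),
    ),
    (
        "generate_model_yaml",
        {"model_names": ["stg_orders"]},
        Path("models/staging/_stg_models.yml"),
    ),
]
```

Looping over EXAMPLE_CALLS and passing each tuple to write_macro_output would run the macros in order. Nothing executes on import, so you can inspect the commands before pointing them at a real warehouse.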

The Workflow

  1. Run generate_source against a Snowflake schema: _sources.yml with 4 tables fully defined
  2. Run generate_base_model for each table: staging views with CTE structure, done
  3. Materialize staging, then run generate_model_yaml: schema files with every column listed, ready for descriptions and tests
  4. For marts (dim/fct), use a temporary source trick: define your staging views as a source so generate_base_model can introspect their columns too, then swap source() for ref() and add your join logic
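The source-to-ref swap in step 4 is mechanical enough to script. Below is a small Python sketch of one way to do it with a regular expression. The function name and the tmp_staging source name are hypothetical, and it assumes the generated SQL uses the plain two-argument {{ source('name', 'table') }} form.

```python
import re

# Matches {{ source('some_source', 'some_table') }} with either quote style.
SOURCE_CALL = re.compile(
    r"\{\{\s*source\(\s*['\"](?P<source>[^'\"]+)['\"]\s*,"
    r"\s*['\"](?P<table>[^'\"]+)['\"]\s*\)\s*\}\}"
)


def swap_source_to_ref(sql: str, temp_source: str) -> str:
    """Replace source() calls on the temporary source with ref() calls."""

    def _replace(match: re.Match) -> str:
        # Leave genuine sources untouched; only rewrite the temporary one.
        if match.group("source") != temp_source:
            return match.group(0)
        return "{{ ref('" + match.group("table") + "') }}"

    return SOURCE_CALL.sub(_replace, sql)


generated = "select * from {{ source('tmp_staging', 'stg_orders') }}"
print(swap_source_to_ref(generated, "tmp_staging"))
```

Restricting the swap to the named temporary source means any real source() references in the file survive. The join logic still has to be written by hand afterwards.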

The only hand-written part? The join in my fact model. Everything else was generated by these three macros.

Where It Gets Interesting

I added a CLAUDE.md file to the repo with the full workflow documented. This means anyone with Claude Code can just say “generate a pipeline from this Snowflake source” and the entire process runs end-to-end. Claude orchestrates the macro calls in the right order, redirects output to the right files, and handles the source-to-ref swap for marts. You just review the output.

The Takeaway

The real value isn’t that AI writes the code. It’s that codegen macros already do the heavy lifting; AI just orchestrates them in the right sequence and handles the plumbing between steps.