BigQuery’s Pipe Syntax: Making SQL Easier, Faster, and More Readable

Anyone who’s worked with SQL knows the love-hate relationship that comes with it. SQL is powerful, standardized, and expressive—but it’s also one of the most unintuitive languages to read and maintain at scale. The deeper your query gets, the harder it becomes to understand what’s actually happening and in what order.
That’s because SQL doesn’t flow in the order you think. You start by writing SELECT, but the database doesn’t actually process that first. The engine begins with the FROM clause, then joins data, filters it, aggregates it, and only later applies sorting and projection. The result is a query that’s written backward relative to the way it’s executed.
This mismatch between syntax and execution makes SQL cognitively heavy. Analysts often rely on subqueries or Common Table Expressions (CTEs) to simulate a top-to-bottom flow, adding complexity and redundancy. Even experienced users end up debugging nested blocks of logic instead of focusing on the problem they’re solving.
To address this, BigQuery supports pipe syntax (|>), an evolution of SQL that makes queries execute in the same order they are read. Instead of writing logic inside out, you write it step by step, like a data pipeline—each line representing a distinct transformation. The result is SQL that’s easier to understand, easier to debug, and far more intuitive for modern data analysis.
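As a first taste, here is a minimal sketch of a pipe query against the TPC-H-style orders table used in the examples below (the status value is illustrative):

```sql
-- Each |> step transforms the output of the previous one, top to bottom.
FROM orders
|> WHERE o_orderstatus = 'F'                              -- filter rows first
|> AGGREGATE COUNT(*) AS order_count GROUP BY o_custkey   -- then aggregate
|> ORDER BY order_count DESC                              -- then sort
|> LIMIT 5;                                               -- then trim the result
```

Read it top to bottom and you have already read it in execution order.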

[Image: traditional SQL’s syntactic order vs. its semantic (execution) order. Source: Google]
The image above illustrates the core problem pipe syntax solves. In traditional SQL, the syntactic order (how you write a query) is misaligned with the semantic order (how the database actually processes it). For example, SELECT appears at the top, but the engine doesn’t evaluate it until after joining, filtering, grouping, and sorting are done.
This sequence represents the logical and syntactic flow of a BigQuery pipe query — top to bottom, left to right — mirroring how the database actually processes the data.

FROM: The starting point of any pipe query. Defines the initial table or subquery from which data flows. In pipe syntax, FROM can stand alone as a valid query and acts as the entry to a sequential pipeline.
JOIN: Extends the current table with additional columns and rows from another table or subquery. Can be used multiple times in sequence (|> JOIN table USING (key)), making left-deep join trees readable without nested subqueries.
SET: Updates existing column values in place (|> SET column = expression). Functions like SELECT * REPLACE(…) in standard SQL but is more readable and modular when used as a pipeline step.
EXTEND: Adds computed columns to the existing dataset (|> EXTEND expression AS new_column). Similar to SELECT *, new_column in standard SQL but allows incremental creation of derived fields between other operations such as joins and filters.
DROP: Removes one or more columns from the current dataset (|> DROP column_name). Equivalent to SELECT * EXCEPT(column) in standard SQL and is often used to simplify output or reduce intermediate data size.
WHERE: Filters rows that satisfy a condition (|> WHERE condition). Can appear anywhere in the pipeline, before or after aggregation, removing the need for different keywords like HAVING or QUALIFY.
AGGREGATE: Performs full-table or grouped aggregations (|> AGGREGATE SUM(column) AS total GROUP BY category). Replaces GROUP BY and aggregate functions within SELECT, simplifying syntax and aligning it with the logical order of operations.
ORDER BY: Sorts rows in ascending or descending order (|> ORDER BY column DESC). Produces an ordered result table that can be followed by operators like LIMIT.
LIMIT: Restricts the number of rows in the result (|> LIMIT 10). Works after ORDER BY or on unordered tables, preserving the natural sequence of filters and aggregations.
CALL: Executes a table-valued function or machine learning model using the current table as input (|> CALL ML.PREDICT(MODEL project.model_name)). Eliminates the need for nested function calls, creating a linear and readable workflow.
SELECT: Defines the final projection of columns to include in the output (|> SELECT column1, column2). Serves as the closing operation in a pipeline, similar to the final SELECT in a standard SQL query.
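
To see how these operators compose, here is a hypothetical pipeline (the discount factor and grouping column are illustrative) that touches most of them in sequence:

```sql
FROM orders
|> JOIN customer ON o_custkey = c_custkey            -- widen with customer data
|> SET o_orderstatus = LOWER(o_orderstatus)          -- normalize in place
|> EXTEND o_totalprice * 0.9 AS discounted_price     -- add a derived column
|> DROP o_comment                                    -- shed an unused column
|> WHERE discounted_price > 100                      -- filter before aggregating
|> AGGREGATE SUM(discounted_price) AS revenue GROUP BY c_nationkey
|> ORDER BY revenue DESC
|> LIMIT 10
|> SELECT c_nationkey, revenue;                      -- final projection
```

Each line is one transformation, and deleting or reordering a step does not force a rewrite of the rest of the query.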

Below are the most common scenarios where pipe syntax simplifies SQL logic, making queries cleaner and faster to work with.
Aggregating Data Without Subqueries
Aggregations are where SQL starts to feel inside-out. If you want to count something, then count those counts, you’re suddenly in parentheses hell.

SQL
SELECT c_count, COUNT(*) AS custdist
FROM (
  SELECT c_custkey, COUNT(o_orderkey) AS c_count
  FROM customer
  JOIN orders ON c_custkey = o_custkey
  WHERE o_comment NOT LIKE '%unusual%packages%'
  GROUP BY c_custkey
)
GROUP BY c_count
ORDER BY custdist DESC;

Pipe Syntax
FROM customer
|> JOIN orders ON c_custkey = o_custkey
   AND o_comment NOT LIKE '%unusual%packages%'
|> AGGREGATE COUNT(o_orderkey) AS c_count GROUP BY c_custkey
|> AGGREGATE COUNT(*) AS custdist GROUP BY c_count
|> ORDER BY custdist DESC;

No more nesting or double grouping. Each step flows logically and can be modified independently without rewriting the entire query.
Cleaning and Transforming Columns Step by Step
When you need to lowercase text, calculate totals, and drop extra columns, standard SQL forces you to rewrite multiple SELECT statements. Pipe syntax introduces SET, EXTEND, and DROP operators so you can apply changes in sequence.

SQL
SELECT o_custkey, ROUND(o_totalprice) AS total_price
FROM (
  SELECT
    o_custkey,
    o_totalprice,
    LOWER(o_orderstatus) AS o_orderstatus
  FROM orders
)
WHERE ROUND(o_totalprice) > 1000;

Pipe Syntax
FROM orders
|> SET o_orderstatus = LOWER(o_orderstatus)
|> EXTEND ROUND(o_totalprice) AS total_price
|> WHERE total_price > 1000
|> SELECT o_custkey, total_price;

Each operation builds upon the previous one, making it easier to trace transformations and reuse logic.
Filtering After Aggregation Without Remembering “HAVING”
One of SQL’s quirks is that the timing of filters changes depending on the clause. You use WHERE before grouping and HAVING after, but in reality, both just filter rows. Pipe syntax allows you to use WHERE consistently, no matter where you place it.

SQL
SELECT department, COUNT(*) AS emp_count
FROM employees
WHERE active = TRUE
GROUP BY department
HAVING COUNT(*) > 5;

Pipe Syntax
FROM employees
|> WHERE active = TRUE
|> AGGREGATE COUNT(*) AS emp_count GROUP BY department
|> WHERE emp_count > 5;

Now you can write filters in the same order you think about them: first filter the data, then group it, then filter again on the results.
Debugging Queries Without Temporary Tables
In standard SQL, checking an intermediate result requires either creating a temporary table or wrapping the code in multiple CTEs. With pipe syntax, you can run the query up to any point in the pipeline.

SQL
WITH filtered AS (
  SELECT * FROM orders WHERE o_totalprice > 500
),
summed AS (
  SELECT o_custkey, SUM(o_totalprice) AS total
  FROM filtered
  GROUP BY o_custkey
)
SELECT * FROM summed WHERE total > 10000;

Pipe Syntax
FROM orders
|> WHERE o_totalprice > 500
|> AGGREGATE SUM(o_totalprice) AS total GROUP BY o_custkey
|> WHERE total > 10000;

Each prefix of the query is executable on its own, which means you can “peek” at the data at any stage. It’s a cleaner, more interactive way to debug and iterate.
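
For example, to inspect the filtered rows before aggregation, you can simply stop the pipeline after the first step:

```sql
-- Run only the prefix of the pipeline to peek at the intermediate rows
FROM orders
|> WHERE o_totalprice > 500;
```

Once the intermediate result looks right, append the remaining steps and run again.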
Chaining Models and Functions Without Nesting
When working with table-valued functions or BigQuery ML models, nesting can quickly get unreadable. Pipe syntax replaces those nested calls with linear chaining using CALL.

SQL
SELECT *
FROM ML.PREDICT(
MODEL `project.sentiment_model`,
(SELECT text FROM reviews)
);

Pipe Syntax
SELECT text FROM reviews
|> CALL ML.PREDICT(MODEL `project.sentiment_model`);

If you apply multiple models or transformations, you simply stack additional CALL lines—no parentheses required.
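
A sketch of that stacking, with hypothetical model names (both models here are illustrative, not from the source):

```sql
-- Two models applied in sequence; each CALL consumes the previous table
FROM reviews
|> SELECT text
|> CALL ML.PREDICT(MODEL `project.sentiment_model`)
|> CALL ML.PREDICT(MODEL `project.topic_model`);
```

Each CALL receives the previous step’s output as its input table, so the chain stays flat no matter how many models you add.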
Pivoting Data Without Nested Selects
Pivoting data has always been a tedious process, often requiring layers of subqueries. Pipe syntax simplifies it into one flowing sequence.

SQL
SELECT *
FROM (
  SELECT n_name, c_acctbal, c_mktsegment
  FROM customer
  JOIN nation USING (n_nationkey)
)
PIVOT(SUM(c_acctbal) FOR n_name IN ('PERU', 'KENYA', 'JAPAN'));

Pipe Syntax
FROM customer
|> JOIN nation USING (n_nationkey)
|> SELECT n_name, c_acctbal, c_mktsegment
|> PIVOT(SUM(c_acctbal) FOR n_name IN ('PERU', 'KENYA', 'JAPAN'));

The query now reads like a story: start with your data, join it, select what you need, then pivot.
Why Pipe Syntax Is a Game-Changer
Pipe syntax doesn’t reinvent SQL—it clarifies it. It preserves everything powerful about SQL’s declarative structure but removes the cognitive overhead of writing logic backward.
For data scientists, analysts, and engineers, that means:

Easier to read, write, and debug queries
No more juggling subqueries or CTEs for simple tasks
Streamlined logic that mirrors how you actually think
Fewer redundant subqueries and intermediate CTE layers

BigQuery’s pipe syntax is SQL for the modern data era—linear, intuitive, and optimized for real-world analytics.
©2025 DK New Media, LLC, All rights reserved. Originally published on Martech Zone: BigQuery’s Pipe Syntax: Making SQL Easier, Faster, and More Readable
