---
title: R2 SQL
description: A distributed SQL engine for R2 Data Catalog
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# R2 SQL

Note

R2 SQL is in **open beta**, and any developer with an [R2 subscription](https://developers.cloudflare.com/r2/pricing/) can start using it. Currently, outside of standard R2 storage and operations, you will not be billed for your use of R2 SQL. We will update [the pricing page](https://developers.cloudflare.com/r2-sql/platform/pricing) and provide at least 30 days notice before enabling billing.

Query Apache Iceberg tables managed by R2 Data Catalog using SQL.

R2 SQL is Cloudflare's serverless, distributed, analytics query engine for querying [Apache Iceberg ↗](https://iceberg.apache.org/) tables stored in [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/). R2 SQL is designed to efficiently query large amounts of data by automatically utilizing file pruning, Cloudflare's distributed compute, and R2 object storage.

Terminal window

```
❯ npx wrangler r2 sql query "3373912de3f5202317188ae01300bd6_data-catalog" \"SELECT * FROM default.transactions LIMIT 10"
 ⛅️ wrangler 4.38.0────────────────────────────────────────────────────────────────────────────▲ [WARNING] 🚧 `wrangler r2 sql query` is an open-beta command. Please report any issues to https://github.com/cloudflare/workers-sdk/issues/new/choose

┌─────────────────────────────┬──────────────────────────────────────┬─────────┬──────────┬──────────────────────────────────┬───────────────┬───────────────────┬──────────┐│ __ingest_ts                 │ transaction_id                       │ user_id │ amount   │ transaction_timestamp            │ location      │ merchant_category │ is_fraud │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.872554Z │ fdc1beed-157c-4d2d-90cf-630fdea58051 │ 1679    │ 13241.59 │ 2025-09-20T02:23:04.269988+00:00 │ NEW_YORK      │ RESTAURANT        │ false    │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.724378Z │ ea7ef106-8284-4d08-9348-ad33989b6381 │ 1279    │ 17615.79 │ 2025-09-20T02:23:04.271090+00:00 │ MIAMI         │ GAS_STATION       │ true     │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.724330Z │ afcdee4d-5c71-42be-97ec-e282b6937a8c │ 1843    │ 7311.65  │ 2025-09-20T06:23:04.267890+00:00 │ SEATTLE       │ GROCERY           │ true     │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.657007Z │ b99d14e0-dbe0-49bc-a417-0ee57f8bed99 │ 1976    │ 15228.21 │ 2025-09-16T23:23:04.269426+00:00 │ NEW_YORK      │ RETAIL            │ false    │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.656992Z │ 712cd094-ad4c-4d24-819a-0d3daaaceea1 │ 1184    │ 7570.89  │ 2025-09-20T00:23:04.269163+00:00 │ LOS_ANGELES   │ RESTAURANT        │ true     │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.656912Z │ b5a1aab3-676d-4492-92b8-aabcde6db261 │ 1196    │ 46611.25 │ 2025-09-20T16:23:04.268693+00:00 │ NEW_YORK      │ RETAIL            │ true     │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.613740Z │ 432d3976-8d89-4813-9099-ea2afa2c0e70 │ 1720    │ 21547.9  │ 2025-09-20T05:23:04.273681+00:00 │ SAN FRANCISCO │ GROCERY           │ true     │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.532068Z │ 25e0b851-3092-4ade-842f-e3189e07d4ee │ 1562    │ 29311.54 │ 2025-09-20T05:23:04.277405+00:00 │ NEW_YORK      │ RETAIL            │ false    │├─────────────────────────────┼──────────────────────────────────────┼─────────┼──────────┼──────────────────────────────────┼───────────────┼───────────────────┼──────────┤│ 2025-09-20T22:30:11.526037Z │ 8001746d-05fe-42fe-a189-40caf81d7aa2 │ 1817    │ 15976.5  │ 2025-09-15T16:23:04.266632+00:00 │ SEATTLE       │ RESTAURANT        │ true     │└─────────────────────────────┴──────────────────────────────────────┴─────────┴──────────┴──────────────────────────────────┴───────────────┴───────────────────┴──────────┘Read 11.3 kB across 4 files from R2On average, 3.36 kB / s
```

Create an end-to-end data pipeline by following [this step by step guide](https://developers.cloudflare.com/r2-sql/get-started/), which shows you how to stream events into an Apache Iceberg table and query it with R2 SQL.

```json
{"@context":"https://schema.org","@type":"WebPage","@id":"https://developers.cloudflare.com/r2-sql/#page","headline":"R2 SQL · R2 SQL docs","description":"A distributed SQL engine for R2 Data Catalog","url":"https://developers.cloudflare.com/r2-sql/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}}]}
```

---

---
title: Getting started
description: Create your first pipeline to ingest streaming data and write to R2 Data Catalog as an Apache Iceberg table.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Getting started

This guide will instruct you through:

* Creating your first [R2 bucket](https://developers.cloudflare.com/r2/buckets/) and enabling its [data catalog](https://developers.cloudflare.com/r2/data-catalog/).
* Creating an [API token](https://developers.cloudflare.com/r2/api/tokens/) needed for pipelines to authenticate with your data catalog.
* Creating your first pipeline with a simple ecommerce schema that writes to an [Apache Iceberg ↗](https://iceberg.apache.org/) table managed by R2 Data Catalog.
* Sending sample ecommerce data via HTTP endpoint.
* Validating data in your bucket and querying it with R2 SQL.

## Prerequisites

1. Sign up for a [Cloudflare account ↗](https://dash.cloudflare.com/sign-up/workers-and-pages).
2. Install [Node.js ↗](https://docs.npmjs.com/downloading-and-installing-node-js-and-npm).

Node.js version manager

Use a Node version manager like [Volta ↗](https://volta.sh/) or [nvm ↗](https://github.com/nvm-sh/nvm) to avoid permission issues and change Node.js versions. [Wrangler](https://developers.cloudflare.com/workers/wrangler/install-and-update/), discussed later in this guide, requires a Node version of `16.17.0` or later.

## 1\. Create an R2 bucket

* [ Wrangler CLI ](#tab-panel-10091)
* [ Dashboard ](#tab-panel-10092)

1. If not already logged in, run:  
Terminal window  
```  
npx wrangler login  
```
2. Create an R2 bucket:  
Terminal window  
```  
npx wrangler r2 bucket create pipelines-tutorial  
```

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select **Create bucket**.
3. Enter the bucket name: pipelines-tutorial
4. Select **Create bucket**.

## 2\. Enable R2 Data Catalog

* [ Wrangler CLI ](#tab-panel-10093)
* [ Dashboard ](#tab-panel-10094)

Enable the catalog on your R2 bucket:

Terminal window

```
npx wrangler r2 bucket catalog enable pipelines-tutorial
```

When you run this command, take note of the "Warehouse" and "Catalog URI". You will need these later.

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select the bucket: pipelines-tutorial.
3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, and select **Enable**.
4. Once enabled, note the **Catalog URI** and **Warehouse name**.

## 3\. Create an API token

Pipelines must authenticate to R2 Data Catalog with an [R2 API token](https://developers.cloudflare.com/r2/api/tokens/) that has catalog and R2 permissions.

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select **Manage API tokens**.
3. Select **Create Account API token**.
4. Give your API token a name.
5. Under **Permissions**, choose the **Admin Read & Write** permission.
6. Select **Create Account API Token**.
7. Note the **Token value**.

Note

This token also includes the R2 SQL Read permission, which allows you to query your data with R2 SQL.

## 4\. Create a pipeline

* [ Wrangler CLI ](#tab-panel-10095)
* [ Dashboard ](#tab-panel-10096)

First, create a schema file that defines your ecommerce data structure:

**Create `schema.json`:**

```
{  "fields": [    {      "name": "user_id",      "type": "string",      "required": true    },    {      "name": "event_type",      "type": "string",      "required": true    },    {      "name": "product_id",      "type": "string",      "required": false    },    {      "name": "amount",      "type": "float64",      "required": false    }  ]}
```

Use the interactive setup to create a pipeline that writes to R2 Data Catalog:

Terminal window

```
npx wrangler pipelines setup
```

Follow the prompts:

1. **Pipeline name**: Enter `ecommerce`
2. **Stream configuration**:

  * Enable HTTP endpoint: `yes`
  * Require authentication: `no` (for simplicity)
  * Configure custom CORS origins: `no`
  * Schema definition: `Load from file`
  * Schema file path: `schema.json` (or your file path)
3. **Sink configuration**:

  * Destination type: `Data Catalog Table`
  * R2 bucket name: `pipelines-tutorial`
  * Namespace: `default`
  * Table name: `ecommerce`
  * Catalog API token: Enter your token from step 3
  * Compression: `zstd`
  * Roll file when size reaches (MB): `100`
  * Roll file when time reaches (seconds): `10` (for faster data visibility in this tutorial)
4. **SQL transformation**: Choose `Use simple ingestion query` to use:  
```  
INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream  
```

After setup completes, note the HTTP endpoint URL displayed in the final output.

1. In the Cloudflare dashboard, go to **Pipelines** \> **Pipelines**.  
[ Go to **Pipelines** ](https://dash.cloudflare.com/?to=/:account/pipelines/overview)
2. Select **Create Pipeline**.
3. **Connect to a Stream**:

  * Pipeline name: `ecommerce`
  * Enable HTTP endpoint for sending data: Enabled
  * HTTP authentication: Disabled (default)
  * Select **Next**
4. **Define Input Schema**:

  * Select **JSON editor**
  * Copy in the schema:  
  ```  
  {  "fields": [    {      "name": "user_id",      "type": "string",      "required": true    },    {      "name": "event_type",      "type": "string",      "required": true    },    {      "name": "product_id",      "type": "string",      "required": false    },    {      "name": "amount",      "type": "f64",      "required": false    }  ]}  
  ```
  * Select **Next**
5. **Define Sink**:

  * Select your R2 bucket: `pipelines-tutorial`
  * Storage type: **R2 Data Catalog**
  * Namespace: `default`
  * Table name: `ecommerce`
  * **Advanced Settings**: Change **Maximum Time Interval** to `10 seconds`
  * Select **Next**
6. **Credentials**:

  * Disable **Automatically create an Account API token for your sink**
  * Enter **Catalog Token** from step 3
  * Select **Next**
7. **Pipeline Definition**:

  * Leave the default SQL query:  
  ```  
  INSERT INTO ecommerce_sink SELECT * FROM ecommerce_stream;  
  ```
  * Select **Create Pipeline**
8. After pipeline creation, note the **Stream ID** for the next step.

## 5\. Send sample data

Send ecommerce events to your pipeline's HTTP endpoint:

Terminal window

```
curl -X POST https://{stream-id}.ingest.cloudflare.com \  -H "Content-Type: application/json" \  -d '[    {      "user_id": "user_12345",      "event_type": "purchase",      "product_id": "widget-001",      "amount": 29.99    },    {      "user_id": "user_67890",      "event_type": "view_product",      "product_id": "widget-002"    },    {      "user_id": "user_12345",      "event_type": "add_to_cart",      "product_id": "widget-003",      "amount": 15.50    }  ]'
```

Replace `{stream-id}` with your actual stream endpoint from the pipeline setup.

## 6\. Validate data in your bucket

1. In the Cloudflare dashboard, go to the **R2 object storage** page.
2. Select your bucket: `pipelines-tutorial`.
3. You should see Iceberg metadata files and data files created by your pipeline. Note: If you aren't seeing any files in your bucket, try waiting a couple of minutes and trying again.
4. The data is organized in the Apache Iceberg format with metadata tracking table versions.

## 7\. Query your data using R2 SQL

Set up your environment to use R2 SQL:

Terminal window

```
export WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Or create a `.env` file with:

```
WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Where `YOUR_API_TOKEN` is the token you created in step 3\. For more information on setting environment variables, refer to [Wrangler system environment variables](https://developers.cloudflare.com/workers/wrangler/system-environment-variables/).

Query your data:

Terminal window

```
npx wrangler r2 sql query "YOUR_WAREHOUSE_NAME" "SELECT    user_id,    event_type,    product_id,    amountFROM default.ecommerceWHERE event_type = 'purchase'LIMIT 10"
```

Replace `YOUR_WAREHOUSE_NAME` with the warehouse name from step 2.

You can also query this table with any engine that supports Apache Iceberg. To learn more about connecting other engines to R2 Data Catalog, refer to [Connect to Iceberg engines](https://developers.cloudflare.com/r2/data-catalog/config-examples/).

## Learn more

[ Managing R2 Data Catalogs ](https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/) Enable or disable R2 Data Catalog on your bucket, retrieve configuration details, and authenticate your Iceberg engine. 

[ Try another example ](https://developers.cloudflare.com/r2-sql/tutorials/end-to-end-pipeline) Detailed tutorial for setting up a simple fraud detection data pipeline, and generate events for it in Python. 

[ Pipelines ](https://developers.cloudflare.com/pipelines/) Understand SQL transformations and pipeline configuration.

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/get-started/#page","headline":"Getting started · R2 SQL docs","description":"Create your first pipeline to ingest streaming data and write to R2 Data Catalog as an Apache Iceberg table.","url":"https://developers.cloudflare.com/r2-sql/get-started/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-06-03","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/get-started/","name":"Getting started"}}]}
```

---

---
title: Query data
description: Understand how to query data with R2 SQL
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Query data

Query [Apache Iceberg ↗](https://iceberg.apache.org/) tables managed by [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/). R2 SQL queries can be made via [Wrangler](https://developers.cloudflare.com/workers/wrangler/) or HTTP API.

## Get your warehouse name

To query data with R2 SQL, you'll need your warehouse name associated with your [catalog](https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/). To retrieve it, you can run the [r2 bucket catalog get command](https://developers.cloudflare.com/workers/wrangler/commands/r2/#r2-bucket-catalog-get):

Terminal window

```
npx wrangler r2 bucket catalog get <BUCKET_NAME>
```

Alternatively, you can find it in the dashboard by going to the **R2 object storage** page, selecting the bucket, switching to the **Settings** tab, scrolling to **R2 Data Catalog**, and finding **Warehouse name**.

## Query via Wrangler

To begin, install [npm ↗](https://docs.npmjs.com/getting-started). Then [install Wrangler, the Developer Platform CLI](https://developers.cloudflare.com/workers/wrangler/install-and-update/).

Wrangler needs an API token with permissions to access R2 Data Catalog, R2 storage, and R2 SQL to execute queries. The `r2 sql query` command looks for the token in the `WRANGLER_R2_SQL_AUTH_TOKEN` environment variable.

Set up your environment:

Terminal window

```
export WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Or create a `.env` file with:

```
WRANGLER_R2_SQL_AUTH_TOKEN=YOUR_API_TOKEN
```

Where `YOUR_API_TOKEN` is the token you created with the [required permissions](#authentication). For more information on setting environment variables, refer to [Wrangler system environment variables](https://developers.cloudflare.com/workers/wrangler/system-environment-variables/).

To run a SQL query, run the [r2 sql query command](https://developers.cloudflare.com/workers/wrangler/commands/r2/#r2-sql-query):

Terminal window

```
npx wrangler r2 sql query <WAREHOUSE> "SELECT * FROM namespace.table_name limit 10;"
```

For a full list of supported SQL commands, refer to the [R2 SQL reference](https://developers.cloudflare.com/r2-sql/sql-reference/).

## Query via API

Below is an example of using R2 SQL via the REST endpoint:

Terminal window

```
curl -X POST \  "https://api.sql.cloudflarestorage.com/api/v1/accounts/{ACCOUNT_ID}/r2-sql/query/{BUCKET_NAME}" \  -H "Authorization: Bearer ${WRANGLER_R2_SQL_AUTH_TOKEN}" \  -H "Content-Type: application/json" \  -d '{    "query": "SELECT * FROM namespace.table_name limit 10;"  }'
```

The API requires an API token with the appropriate permissions in the Authorization header. Refer to [Authentication](#authentication) for details on creating a token.

For a full list of supported SQL commands, refer to the [R2 SQL reference](https://developers.cloudflare.com/r2-sql/sql-reference/).

## Authentication

To query data with R2 SQL, you must provide a Cloudflare API token with R2 SQL, R2 Data Catalog, and R2 storage permissions. R2 SQL requires these permissions to access catalog metadata and read the underlying data files stored in R2.

### Create API token in the dashboard

Create an [R2 API token](https://developers.cloudflare.com/r2/api/tokens/#permissions) with the following permissions:

* Access to R2 Data Catalog (read-only)
* Access to R2 storage (Admin read/write)
* Access to R2 SQL (read-only)

Use this token value for the `WRANGLER_R2_SQL_AUTH_TOKEN` environment variable when querying with Wrangler, or in the Authorization header when using the REST API.

### Create API token via API

To create an API token programmatically for use with R2 SQL, you'll need to specify R2 SQL, R2 Data Catalog, and R2 storage permission groups in your [Access Policy](https://developers.cloudflare.com/r2/api/tokens/#access-policy).

#### Example Access Policy

```
[  {    "id": "f267e341f3dd4697bd3b9f71dd96247f",    "effect": "allow",    "resources": {      "com.cloudflare.edge.r2.bucket.4793d734c0b8e484dfc37ec392b5fa8a_default_my-bucket": "*",      "com.cloudflare.edge.r2.bucket.4793d734c0b8e484dfc37ec392b5fa8a_eu_my-eu-bucket": "*"    },    "permission_groups": [      {        "id": "f45430d92e2b4a6cb9f94f2594c141b8",        "name": "Workers R2 SQL Read"      },      {        "id": "d229766a2f7f4d299f20eaa8c9b1fde9",        "name": "Workers R2 Data Catalog Write"      },      {        "id": "bf7481a1826f439697cb59a20b22293e",        "name": "Workers R2 Storage Write"      }    ]  }]
```

To learn more about how to create API tokens for R2 SQL using the API, including required permission groups and usage examples, refer to the [Create API tokens via API documentation](https://developers.cloudflare.com/r2/api/tokens/#create-api-tokens-via-api).

## Additional resources

[ Manage R2 Data Catalogs ](https://developers.cloudflare.com/r2/data-catalog/manage-catalogs/) Enable or disable R2 Data Catalog on your bucket, retrieve configuration details, and authenticate your Iceberg engine. 

[ Build an end to end data pipeline ](https://developers.cloudflare.com/r2-sql/tutorials/end-to-end-pipeline) Detailed tutorial for setting up a simple fraud detection data pipeline, and generate events for it in Python.

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/query-data/#page","headline":"Query data · R2 SQL docs","description":"Understand how to query data with R2 SQL","url":"https://developers.cloudflare.com/r2-sql/query-data/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/query-data/","name":"Query data"}}]}
```

---

---
title: SQL reference
description: Comprehensive reference for SQL syntax, functions, and data types supported in R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# SQL reference

Note

R2 SQL is in public beta. Supported SQL grammar may change over time.

R2 SQL is Cloudflare's serverless, distributed, analytics query engine for querying [Apache Iceberg ↗](https://iceberg.apache.org/) tables stored in [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/). This page documents the supported SQL syntax.

---

## Query syntax

```
SELECT [DISTINCT] column_list | expression | aggregate_function | window_functionFROM namespace_name.table_name[JOIN namespace_name.table_name ON condition][WHERE conditions][GROUP BY column_list][HAVING conditions][QUALIFY window_condition][ORDER BY expression [ASC | DESC]][LIMIT number]
```

Two or more queries can be combined with [set operations](#set-operations) (`UNION`, `UNION ALL`, `INTERSECT`, `EXCEPT`).

---

## Schema discovery commands

### SHOW DATABASES

Lists all available namespaces.

```
SHOW DATABASES;
```

### SHOW NAMESPACES

Alias for `SHOW DATABASES`. Lists all available namespaces.

```
SHOW NAMESPACES;
```

### SHOW TABLES

Lists all tables within a specific namespace.

```
SHOW TABLES IN namespace_name;
```

### DESCRIBE

Describes the structure of a table, showing column names and data types.

```
DESCRIBE namespace_name.table_name;
```

---

## SELECT clause

### Syntax

```
SELECT [DISTINCT] column_specification [, column_specification, ...]
```

### Column specification

* **Column name**: `column_name`
* **All columns**: `*`
* **Qualified wildcard**: `table_name.*`
* **Column alias**: `column_name AS alias`
* **Expressions**: arithmetic, function calls, CASE expressions, and casts

### Examples

```
SELECT * FROM my_namespace.sales_data LIMIT 10SELECT customer_id, region, total_amount FROM my_namespace.sales_data LIMIT 10SELECT region, total_amount * 1.1 AS total_with_tax FROM my_namespace.sales_data LIMIT 10
```

### DISTINCT

`SELECT DISTINCT` returns unique rows. `DISTINCT ON (...)` returns the first row for each combination of the listed expressions, using the `ORDER BY` clause to determine which row is kept.

```
-- Unique combinationsSELECT DISTINCT region, department FROM my_namespace.sales_data
-- First row per region by amountSELECT DISTINCT ON (region) region, customer_id, total_amountFROM my_namespace.sales_dataORDER BY region, total_amount DESC
```

For counting unique values on large datasets, `approx_distinct()` is a faster alternative.

---

## Common table expressions (CTEs)

CTEs let you define named temporary result sets using `WITH` that you can reference in the main query. CTEs can reference different tables and can include JOINs. A CTE can also be joined with other CTEs or regular tables in the main query.

### Syntax

```
WITH cte_name AS (    SELECT ...    FROM namespace_name.table_name    [WHERE ...])SELECT ... FROM cte_name
```

### Chained CTEs

A CTE can reference a previously defined CTE.

```
WITH filtered AS (    SELECT customer_id, department, total_amount    FROM my_namespace.sales_data    WHERE total_amount > 0),summary AS (    SELECT department,           COUNT(*) AS order_count,           round(AVG(total_amount), 2) AS avg_amount    FROM filtered    GROUP BY department)SELECT *FROM summaryWHERE order_count > 100ORDER BY avg_amount DESC
```

### CTE joined with another table

```
WITH enterprise_zones AS (    SELECT zone_id, domain, plan    FROM my_namespace.zones    WHERE plan = 'enterprise')SELECT ez.domain, f.action, COUNT(*) AS cntFROM enterprise_zones ezINNER JOIN my_namespace.firewall_events f ON ez.zone_id = f.zone_idGROUP BY ez.domain, f.actionORDER BY cnt DESCLIMIT 20
```

### Two CTEs joined together

```
WITH top_zones AS (    SELECT zone_id, COUNT(*) AS req_count    FROM my_namespace.http_requests    GROUP BY zone_id    ORDER BY req_count DESC    LIMIT 50),zone_threats AS (    SELECT zone_id, COUNT(*) AS threat_count    FROM my_namespace.firewall_events    WHERE risk_score > 0.5    GROUP BY zone_id)SELECT tz.zone_id, tz.req_count, COALESCE(zt.threat_count, 0) AS threat_countFROM top_zones tzLEFT JOIN zone_threats zt ON tz.zone_id = zt.zone_idORDER BY tz.req_count DESCLIMIT 20
```

---

## FROM clause

### Syntax

```
SELECT * FROM namespace_name.table_name
```

R2 SQL queries can reference one or more tables. Tables are specified as `namespace_name.table_name`. Multiple tables can be combined using JOINs or comma-separated syntax. Refer to the [JOIN clause](#join-clause) section for details.

---

## JOIN clause

R2 SQL supports joining multiple Iceberg tables in a single query. All join types use standard SQL syntax.

### Supported join types

| Join type        | Syntax                          | Description                                                        |
| ---------------- | ------------------------------- | ------------------------------------------------------------------ |
| Inner join       | INNER JOIN ... ON               | Returns rows that match in both tables                             |
| Left outer join  | LEFT JOIN ... ON                | Returns all rows from the left table, NULLs for non-matching right |
| Right outer join | RIGHT JOIN ... ON               | Returns all rows from the right table, NULLs for non-matching left |
| Full outer join  | FULL OUTER JOIN ... ON          | Returns all rows from both tables, NULLs where no match            |
| Cross join       | CROSS JOIN                      | Cartesian product of both tables                                   |
| Implicit join    | FROM t1, t2 WHERE t1.id = t2.id | Comma-separated tables with join condition in WHERE                |

### Syntax

```
-- Explicit JOINSELECT columnsFROM namespace.table1 alias1[INNER | LEFT | RIGHT | FULL OUTER | CROSS] JOIN namespace.table2 alias2  ON alias1.column = alias2.column[WHERE conditions]
-- Implicit joinSELECT columnsFROM namespace.table1 alias1, namespace.table2 alias2WHERE alias1.column = alias2.column
```

### Multi-way joins

You can join three or more tables in a single query:

```
SELECT z.domain, h.method, f.action, COUNT(*) AS cntFROM my_namespace.zones zINNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_idINNER JOIN my_namespace.firewall_events f ON z.zone_id = f.zone_idWHERE h.status_code >= 400GROUP BY z.domain, h.method, f.actionORDER BY cnt DESCLIMIT 20
```

### Self-joins

A table can be joined with itself using different aliases:

```
SELECT f1.source_ip, f1.zone_id AS zone1, f2.zone_id AS zone2FROM my_namespace.firewall_events f1INNER JOIN my_namespace.firewall_events f2  ON f1.source_ip = f2.source_ip  AND f1.zone_id < f2.zone_idWHERE f1.action = 'block'LIMIT 20
```

### Join conditions

* Join conditions use the `ON` clause with equality (`=`) or expression-based predicates.
* Functions are supported in join predicates (for example, `ON LOWER(a.col) = LOWER(b.col)`).
* Multiple conditions can be combined with `AND`.

Note

Nested (parenthesized) joins are not supported. Write multi-way joins as a flat sequence of `JOIN` clauses instead of grouping them with parentheses.

```
-- Not supportedSELECT * FROM (t1 JOIN t2 ON t1.id = t2.id) JOIN t3 ON t2.id = t3.id
-- SupportedSELECT * FROM t1 JOIN t2 ON t1.id = t2.id JOIN t3 ON t2.id = t3.id
```

### Best practices for joins

* Include `WHERE` filters to reduce intermediate result sizes, especially for multi-way joins.
* Join large fact tables through a shared dimension table rather than directly cross-joining two large tables.
* Use `LIMIT` to cap result sizes.

---

## Subqueries

R2 SQL supports subqueries in multiple positions within a query.

### Subqueries in FROM (derived tables)

A subquery in the `FROM` clause creates a derived table that can be referenced in the outer query:

```
SELECT sub.domain, sub.total_requestsFROM (    SELECT z.domain, COUNT(*) AS total_requests    FROM my_namespace.zones z    INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id    GROUP BY z.domain) subWHERE sub.total_requests > 1000ORDER BY sub.total_requests DESCLIMIT 20
```

Note

`LATERAL` derived tables are not supported. Subqueries in `FROM` cannot reference columns from other tables in the same `FROM` clause.

Derived tables can be joined with other derived tables or regular tables:

```
SELECT req.domain, req.total_reqs, fw.total_eventsFROM (    SELECT zone_id, domain, COUNT(*) AS total_reqs    FROM my_namespace.zones z    INNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_id    GROUP BY zone_id, domain) reqINNER JOIN (    SELECT zone_id, COUNT(*) AS total_events    FROM my_namespace.firewall_events    GROUP BY zone_id) fw ON req.zone_id = fw.zone_idORDER BY fw.total_events DESCLIMIT 20
```

### `IN` / `NOT IN` subqueries

Filter rows based on whether a value exists in the result of a subquery:

```
-- Find requests from enterprise zonesSELECT method, status_code, COUNT(*) AS cntFROM my_namespace.http_requestsWHERE zone_id IN (    SELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise')GROUP BY method, status_codeORDER BY cnt DESCLIMIT 20
```

```
-- NOT IN exampleSELECT zone_id, COUNT(*) AS cntFROM my_namespace.http_requestsWHERE zone_id NOT IN (    SELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block')GROUP BY zone_idLIMIT 10
```

Warning

`NOT IN` subqueries are not supported on nullable columns. If the subquery column can contain `NULL` values, use `NOT EXISTS` instead.

```
-- Instead of NOT IN on a nullable column:SELECT z.domainFROM my_namespace.zones zWHERE NOT EXISTS (    SELECT 1 FROM my_namespace.firewall_events f    WHERE f.zone_id = z.zone_id)LIMIT 20
```

### `EXISTS` / `NOT EXISTS` subqueries

Test for the existence of rows matching a correlated condition:

```
-- Find zones with blocked firewall events (semi-join)SELECT z.domain, z.planFROM my_namespace.zones zWHERE EXISTS (    SELECT 1 FROM my_namespace.firewall_events f    WHERE f.zone_id = z.zone_id AND f.action = 'block')ORDER BY z.domainLIMIT 20
```

```
-- Find zones with NO firewall events (anti-join)SELECT z.domain, z.planFROM my_namespace.zones zWHERE NOT EXISTS (    SELECT 1 FROM my_namespace.firewall_events f    WHERE f.zone_id = z.zone_id)ORDER BY z.domainLIMIT 20
```

### Scalar subqueries

A subquery that returns a single value can be used in `SELECT`, `WHERE`, or `HAVING`:

```
-- In SELECT (constant value per row)SELECT z.domain, z.plan,       (SELECT COUNT(*) FROM my_namespace.zones) AS total_zonesFROM my_namespace.zones zWHERE z.plan = 'enterprise'LIMIT 10
```

```
-- In WHERE (comparison)SELECT z.domain, z.plan, z.requests_30dFROM my_namespace.zones zWHERE z.requests_30d > (    SELECT AVG(requests_30d) FROM my_namespace.zones)ORDER BY z.requests_30d DESCLIMIT 20
```

---

## WHERE clause

### Syntax

```
SELECT * FROM namespace_name.table_name WHERE condition [AND | OR condition ...]
```

### Conditions

#### Comparison operators

`=`, `!=`, `<>`, `<`, `>`, `<=`, `>=`

#### Null checks

* `column_name IS NULL`
* `column_name IS NOT NULL`

#### Boolean checks

* `IS TRUE`, `IS FALSE`, `IS NOT TRUE`, `IS NOT FALSE`
* `IS UNKNOWN`, `IS NOT UNKNOWN`

#### Range

* `column_name BETWEEN value1 AND value2`
* `column_name NOT BETWEEN value1 AND value2`

#### List membership

* `column_name IN ('value1', 'value2')`
* `column_name NOT IN ('value1', 'value2')`

#### Pattern matching

* `column_name LIKE 'pattern'`
* `column_name NOT LIKE 'pattern'`
* `column_name ILIKE 'pattern'` (case-insensitive)
* `column_name NOT ILIKE 'pattern'`
* `column_name SIMILAR TO 'regex_pattern'`

#### Logical operators

* `AND`
* `OR`
* `NOT`

### Examples

```
SELECT * FROM my_namespace.sales_dataWHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'
SELECT * FROM my_namespace.sales_dataWHERE status = 200 AND response_time > 1000
SELECT * FROM my_namespace.sales_dataWHERE (region = 'North' OR region = 'South')  AND total_amount IS NOT NULL
SELECT * FROM my_namespace.sales_dataWHERE department ILIKE '%eng%'
```

---

## GROUP BY clause

### Syntax

```
SELECT column_list, aggregation_function(column)FROM namespace_name.table_name[WHERE conditions]GROUP BY column_list
```

### Examples

```
SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY department
SELECT department, category, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY department, category
```

### GROUPING SETS, ROLLUP, and CUBE

These extensions compute multiple groupings, including subtotals and grand totals, in a single query.

* **`GROUPING SETS`**: Computes exactly the groupings you list. `()` produces the grand total.
* **`ROLLUP`**: Computes hierarchical subtotals from left to right. `ROLLUP(a, b)` groups by `(a, b)`, `(a)`, and `()`.
* **`CUBE`**: Computes every combination of the listed columns. `CUBE(a, b)` groups by `(a, b)`, `(a)`, `(b)`, and `()`.

```
-- Subtotals per department plus a grand totalSELECT department, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY ROLLUP(department)
-- Every combination of department and categorySELECT department, category, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY CUBE(department, category)
-- Explicit groupingsSELECT department, category, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY GROUPING SETS ((department, category), (department), ())
```

---

## HAVING clause

### Syntax

```
SELECT column_list, aggregation_function(column) AS aliasFROM namespace_name.table_nameGROUP BY column_listHAVING aggregation_function(column) comparison_operator value
```

### Examples

```
SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY departmentHAVING COUNT(*) > 1000
SELECT region, SUM(total_amount) AS totalFROM my_namespace.sales_dataGROUP BY regionHAVING SUM(total_amount) > 1000000
```

---

## ORDER BY clause

### Syntax

```
ORDER BY expression [ASC | DESC] [, expression [ASC | DESC], ...]
```

* **ASC**: Ascending order (default)
* **DESC**: Descending order
* Multi-column ordering is supported

### Examples

```
SELECT customer_id, total_amountFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLORDER BY total_amount DESCLIMIT 50
SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY departmentORDER BY dept_count DESC, department ASC
```

---

## LIMIT clause

### Syntax

```
LIMIT number
```

* **Type**: Integer only
* **Default**: 500

### Examples

```
SELECT * FROM my_namespace.sales_data LIMIT 100
```

---

## Window functions

Window functions compute a value across a set of rows related to the current row without collapsing them into a single output row. The window is defined inline with an `OVER (...)` clause containing an optional `PARTITION BY`, `ORDER BY`, and frame specification.

### Syntax

```
function(args) OVER (    [PARTITION BY expression [, ...]]    [ORDER BY expression [ASC | DESC] [, ...]]    [frame_specification])
```

### Supported functions

| Category  | Functions                                                        |
| --------- | ---------------------------------------------------------------- |
| Ranking   | ROW\_NUMBER, RANK, DENSE\_RANK, PERCENT\_RANK, CUME\_DIST, NTILE |
| Offset    | LAG, LEAD, FIRST\_VALUE, LAST\_VALUE, NTH\_VALUE                 |
| Aggregate | SUM, AVG, COUNT, MIN, MAX, and other aggregates used with OVER   |

### Examples

```
-- Rank rows within each partitionSELECT customer_id, region,       ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) AS rank_in_region,       LAG(total_amount) OVER (PARTITION BY region ORDER BY total_amount DESC) AS prev_amountFROM my_namespace.sales_data
-- Running total with an explicit frameSELECT customer_id, total_amount,       SUM(total_amount) OVER (ORDER BY total_amount ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) AS running_totalFROM my_namespace.sales_data
```

Note

The named `WINDOW` clause is not supported. Inline the `OVER (...)` specification at each call site instead of defining `WINDOW w AS (...)`.

### QUALIFY

`QUALIFY` filters rows based on the result of a window function, similar to how `HAVING` filters grouped rows.

```
-- Keep only the top 3 customers by amount in each regionSELECT customer_id, region, total_amountFROM my_namespace.sales_dataQUALIFY ROW_NUMBER() OVER (PARTITION BY region ORDER BY total_amount DESC) <= 3
```

---

## Set operations

Set operations combine the results of two or more `SELECT` statements.

### Syntax

```
SELECT ... FROM table1UNION | UNION ALL | INTERSECT | EXCEPTSELECT ... FROM table2
```

### Supported operations

| Operation | Description                                                        |
| --------- | ------------------------------------------------------------------ |
| UNION     | Returns all rows from both queries, removing duplicates            |
| UNION ALL | Returns all rows from both queries, including duplicates           |
| INTERSECT | Returns only rows that appear in both query results                |
| EXCEPT    | Returns rows from the first query that do not appear in the second |

### Examples

#### Union

```
-- Find zones that had either firewall blocks OR high-risk requestsSELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'UNIONSELECT zone_id FROM my_namespace.http_requests WHERE risk_score > 0.8
```

#### Intersect

```
-- Find zones with both firewall blocks AND entries in the zones tableSELECT zone_id FROM my_namespace.firewall_events WHERE action = 'block'INTERSECTSELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise'
```

#### Except

```
-- Find enterprise zones that have no firewall eventsSELECT zone_id FROM my_namespace.zones WHERE plan = 'enterprise'EXCEPTSELECT zone_id FROM my_namespace.firewall_events
```

### Requirements

* All queries in a set operation must return the same number of columns.
* Corresponding columns must have compatible data types.
* Column names in the result are taken from the first query.

Note

A `LIMIT` placed directly before a set operator is not valid. Wrap each query in a subquery if you need to limit its rows, or apply a single `LIMIT` to the combined result.

---

## EXPLAIN

Returns the execution plan for a query without running it.

```
EXPLAIN SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLGROUP BY department;
```

### EXPLAIN FORMAT JSON

Returns the execution plan as structured JSON for programmatic analysis.

```
EXPLAIN FORMAT JSON SELECT * FROM my_namespace.sales_data LIMIT 10;
```

---

## Expressions

Expressions can be used in `SELECT`, `WHERE`, `GROUP BY`, `HAVING`, and `ORDER BY` clauses.

### Literals

```
SELECT 42 AS int_val, 3.14 AS float_val, 'hello' AS str_val, TRUE AS bool_val, NULL AS null_valFROM my_namespace.sales_data LIMIT 1
```

### Arithmetic operators

`+`, `-`, `*`, `/`, `%`

```
SELECT customer_id, total_amount * 1.1 AS total_with_tax, total_amount % 10 AS remainderFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### String concatenation

```
SELECT customer_id || ' - ' || region AS labelFROM my_namespace.sales_dataLIMIT 5
```

### CASE expressions

Searched form:

```
SELECT customer_id,    CASE        WHEN total_amount > 1000 THEN 'high'        WHEN total_amount > 100 THEN 'medium'        ELSE 'low'    END AS tierFROM my_namespace.sales_dataLIMIT 10
```

Simple form:

```
SELECT customer_id,    CASE region        WHEN 'North' THEN 'N'        WHEN 'South' THEN 'S'        ELSE 'Other'    END AS region_codeFROM my_namespace.sales_dataLIMIT 10
```

### Type casting

```
-- CASTSELECT CAST(total_amount AS INT) AS amount_int FROM my_namespace.sales_data LIMIT 5
-- TRY_CAST (returns NULL on failure instead of error)SELECT TRY_CAST(customer_id AS INT) AS id_int FROM my_namespace.sales_data LIMIT 5
-- Shorthand (::)SELECT total_amount::INT AS amount_int FROM my_namespace.sales_data LIMIT 5
```

### EXTRACT

```
SELECT EXTRACT(YEAR FROM timestamp) AS yr,       EXTRACT(MONTH FROM timestamp) AS mo,       EXTRACT(DAY FROM timestamp) AS dyFROM my_namespace.sales_dataLIMIT 1
```

---

## Data type reference

| Type      | Description     | Example Values               |
| --------- | --------------- | ---------------------------- |
| integer   | Whole numbers   | 1, 42, \-10, 0               |
| float     | Decimal numbers | 1.5, 3.14, \-2.7, 0.0        |
| string    | Text values     | 'hello', 'GET', '2024-01-01' |
| boolean   | Boolean values  | true, false                  |
| timestamp | RFC3339         | '2025-09-24T01:00:00Z'       |
| date      | Date values     | '2025-09-24'                 |
| struct    | Named fields    | struct\_col\['field\_name'\] |
| array     | Ordered list    | array\_col\[1\] (1-indexed)  |
| map       | Key-value pairs | map\_keys(map\_col)          |

---

## Operator precedence

1. **Comparison operators**: `=`, `!=`, `<`, `<=`, `>`, `>=`, `LIKE`, `BETWEEN`, `IS NULL`, `IS NOT NULL`
2. **AND** (higher precedence)
3. **OR** (lower precedence)

Use parentheses to override default precedence:

```
SELECT * FROM my_namespace.sales_data WHERE (status = 404 OR status = 500) AND region = 'North'
```

---

## Complete query examples

### Basic query

```
SELECT *FROM my_namespace.sales_dataWHERE timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z'LIMIT 100
```

### Filtered query with sorting

```
SELECT customer_id, timestamp, status, total_amountFROM my_namespace.sales_dataWHERE status >= 400 AND total_amount > 5000ORDER BY total_amount DESCLIMIT 50
```

### Aggregation with HAVING

```
SELECT region, COUNT(*) AS region_count, AVG(total_amount) AS avg_amountFROM my_namespace.sales_dataWHERE status = 'completed'GROUP BY regionHAVING COUNT(*) > 1000ORDER BY avg_amount DESCLIMIT 20
```

### Conditional categorization

```
SELECT customer_id,    CASE        WHEN total_amount >= 1000 THEN 'Premium'        WHEN total_amount >= 100 THEN 'Standard'        ELSE 'Basic'    END AS tier,    total_amountFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLORDER BY total_amount DESCLIMIT 20
```

```json
{"@context":"https://schema.org","@type":"WebPage","@id":"https://developers.cloudflare.com/r2-sql/sql-reference/#page","headline":"SQL reference · R2 SQL docs","description":"Comprehensive reference for SQL syntax, functions, and data types supported in R2 SQL.","url":"https://developers.cloudflare.com/r2-sql/sql-reference/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-06-22","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"},"keywords":["SQL"]}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/sql-reference/","name":"SQL reference"}}]}
```

---

---
title: Aggregate functions
description: Reference for the aggregate functions supported in R2 SQL, organized by category.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Aggregate functions

Aggregate functions collapse multiple rows into a single result. They are used with `GROUP BY` to compute summaries per group, or without `GROUP BY` to compute a single result across all rows.

Most aggregates accept a `DISTINCT` modifier, for example `COUNT(DISTINCT customer_id)` or `SUM(DISTINCT total_amount)`.

Note

On large datasets, prefer the [approximate aggregates](#approximate-aggregates) (`approx_distinct`, `approx_median`, `approx_percentile_cont`) over their exact counterparts for lower memory and compute.

---

## Basic aggregates

### COUNT

Counts rows. `COUNT(*)` counts all rows. `COUNT(column)` counts non-NULL values.

```
SELECT COUNT(*) AS total_rowsFROM my_namespace.sales_data
SELECT department, COUNT(*) AS dept_countFROM my_namespace.sales_dataGROUP BY departmentORDER BY dept_count DESC
```

### SUM

Returns the sum of values in a column.

```
SELECT SUM(total_amount) AS grand_totalFROM my_namespace.sales_data
SELECT department, SUM(total_amount) AS dept_totalFROM my_namespace.sales_dataGROUP BY departmentORDER BY dept_total DESC
```

### AVG

Returns the average of values in a column. Alias: `mean`.

```
SELECT AVG(total_amount) AS avg_amountFROM my_namespace.sales_data
SELECT department, AVG(total_amount) AS avg_amountFROM my_namespace.sales_dataGROUP BY departmentORDER BY avg_amount DESC
```

### MIN

Returns the minimum value. Works on numeric and string columns.

```
SELECT MIN(total_amount) AS min_amount, MIN(customer_id) AS first_customerFROM my_namespace.sales_data
SELECT department, MIN(total_amount) AS min_amountFROM my_namespace.sales_dataGROUP BY department
```

### MAX

Returns the maximum value. Works on numeric and string columns.

```
SELECT MAX(total_amount) AS max_amount, MAX(customer_id) AS last_customerFROM my_namespace.sales_data
SELECT department, MAX(total_amount) AS max_amountFROM my_namespace.sales_dataGROUP BY department
```

### MEDIAN

Returns the exact median value. For large datasets, use [approx\_median](#approx%5Fmedian) instead.

```
SELECT MEDIAN(total_amount) AS median_amountFROM my_namespace.sales_data
SELECT department, MEDIAN(total_amount) AS median_amountFROM my_namespace.sales_dataGROUP BY department
```

### PERCENTILE\_CONT

Returns the exact value at a given percentile using `WITHIN GROUP (ORDER BY ...)`. The percentile parameter must be between `0.0` and `1.0` inclusive. For large datasets, use [approx\_percentile\_cont](#approx%5Fpercentile%5Fcont) instead.

```
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY total_amount) AS median,       PERCENTILE_CONT(0.95) WITHIN GROUP (ORDER BY total_amount) AS p95FROM my_namespace.sales_data
```

Note

`PERCENTILE_DISC` is not supported. Use `PERCENTILE_CONT` or [approx\_percentile\_cont](#approx%5Fpercentile%5Fcont).

---

## Approximate aggregates

Approximate aggregation functions produce statistically estimated results while using significantly less memory and compute than their exact counterparts. Use them when analyzing large datasets and an approximate result is acceptable.

### approx\_percentile\_cont

Returns the approximate value at a given percentile using a T-Digest algorithm. The percentile parameter must be between `0.0` and `1.0` inclusive.

```
SELECT approx_percentile_cont(total_amount, 0.5) AS median,       approx_percentile_cont(total_amount, 0.95) AS p95FROM my_namespace.sales_data
SELECT department,       approx_percentile_cont(total_amount, 0.5) AS medianFROM my_namespace.sales_dataGROUP BY departmentORDER BY median DESC
```

### approx\_percentile\_cont\_with\_weight

Returns the approximate weighted percentile. Rows are weighted by the `weight` column.

```
SELECT approx_percentile_cont_with_weight(unit_price, quantity, 0.5) AS weighted_medianFROM my_namespace.sales_dataWHERE unit_price IS NOT NULL AND quantity IS NOT NULL
```

### approx\_median

Returns the approximate median. Equivalent to `approx_percentile_cont(column, 0.5)`.

```
SELECT approx_median(total_amount) AS median_amountFROM my_namespace.sales_data
SELECT department, approx_median(total_amount) AS medianFROM my_namespace.sales_dataGROUP BY department
```

### approx\_distinct

Returns the approximate count of distinct values using HyperLogLog.

```
SELECT approx_distinct(customer_id) AS unique_customersFROM my_namespace.sales_data
SELECT department, approx_distinct(customer_id) AS unique_customersFROM my_namespace.sales_dataGROUP BY department
```

### approx\_top\_k

Returns the _k_ most frequent values with their approximate counts.

```
SELECT approx_top_k(department, 5) AS top_departmentsFROM my_namespace.sales_data
```

---

## Statistical aggregates

### var / var\_samp

Returns the sample variance.

```
SELECT var(total_amount) AS varianceFROM my_namespace.sales_data
SELECT department, var(total_amount) AS varianceFROM my_namespace.sales_dataGROUP BY department
```

### var\_pop

Returns the population variance.

```
SELECT var_pop(total_amount) AS pop_varianceFROM my_namespace.sales_data
```

### stddev / stddev\_samp

Returns the sample standard deviation.

```
SELECT stddev(total_amount) AS std_devFROM my_namespace.sales_data
SELECT department, stddev(total_amount) AS std_devFROM my_namespace.sales_dataGROUP BY department
```

### stddev\_pop

Returns the population standard deviation.

```
SELECT stddev_pop(total_amount) AS pop_std_devFROM my_namespace.sales_data
```

### covar\_samp

Returns the sample covariance. Alias: `covar`.

```
SELECT covar_samp(total_amount, CAST(quantity AS DOUBLE)) AS covarianceFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### covar\_pop

Returns the population covariance.

```
SELECT covar_pop(total_amount, CAST(quantity AS DOUBLE)) AS pop_covarianceFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### corr

Returns the Pearson correlation coefficient between two columns.

```
SELECT corr(total_amount, CAST(quantity AS DOUBLE)) AS correlationFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_slope

Returns the slope of the linear regression line.

```
SELECT regr_slope(total_amount, CAST(quantity AS DOUBLE)) AS slopeFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_intercept

Returns the y-intercept of the linear regression line.

```
SELECT regr_intercept(total_amount, CAST(quantity AS DOUBLE)) AS interceptFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_count

Returns the count of non-NULL pairs.

```
SELECT regr_count(total_amount, CAST(quantity AS DOUBLE)) AS pair_countFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_r2

Returns the coefficient of determination (R-squared).

```
SELECT regr_r2(total_amount, CAST(quantity AS DOUBLE)) AS r_squaredFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_avgx

Returns the average of the independent variable (x) for non-NULL pairs.

```
SELECT regr_avgx(total_amount, CAST(quantity AS DOUBLE)) AS avg_qtyFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_avgy

Returns the average of the dependent variable (y) for non-NULL pairs.

```
SELECT regr_avgy(total_amount, CAST(quantity AS DOUBLE)) AS avg_amountFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_sxx

Returns the sum of squares of the independent variable.

```
SELECT regr_sxx(total_amount, CAST(quantity AS DOUBLE)) AS sxxFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_syy

Returns the sum of squares of the dependent variable.

```
SELECT regr_syy(total_amount, CAST(quantity AS DOUBLE)) AS syyFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

### regr\_sxy

Returns the sum of products of the paired variables.

```
SELECT regr_sxy(total_amount, CAST(quantity AS DOUBLE)) AS sxyFROM my_namespace.sales_dataWHERE total_amount IS NOT NULL AND quantity IS NOT NULL
```

---

## Bitwise aggregates

### bit\_and

Returns the bitwise AND of all values in a group.

```
SELECT department, bit_and(quantity) AS and_resultFROM my_namespace.sales_dataWHERE quantity IS NOT NULLGROUP BY department
```

### bit\_or

Returns the bitwise OR of all values in a group.

```
SELECT department, bit_or(quantity) AS or_resultFROM my_namespace.sales_dataWHERE quantity IS NOT NULLGROUP BY department
```

### bit\_xor

Returns the bitwise XOR of all values in a group.

```
SELECT department, bit_xor(quantity) AS xor_resultFROM my_namespace.sales_dataWHERE quantity IS NOT NULLGROUP BY department
```

---

## Boolean aggregates

### bool\_and

Returns true if all values in a group are true.

```
SELECT department, bool_and(is_completed) AS all_completedFROM my_namespace.sales_dataWHERE is_completed IS NOT NULLGROUP BY department
```

### bool\_or

Returns true if any value in a group is true.

```
SELECT department, bool_or(is_completed) AS any_completedFROM my_namespace.sales_dataWHERE is_completed IS NOT NULLGROUP BY department
```

---

## Positional aggregates

### first\_value

Returns the first value in a group according to the specified ordering.

```
SELECT department,       first_value(customer_id ORDER BY total_amount ASC) AS lowest_spenderFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLGROUP BY department
```

### last\_value

Returns the last value in a group according to the specified ordering.

```
SELECT department,       last_value(customer_id ORDER BY total_amount ASC) AS highest_spenderFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLGROUP BY department
```

---

## Collection aggregates

These aggregates accumulate all input values into a single result. Because they hold every value in memory, use a `WHERE` filter or `GROUP BY` to keep group sizes bounded on large datasets.

### array\_agg

Collects values from a group into an array.

```
SELECT department, array_agg(customer_id) AS customersFROM my_namespace.sales_dataGROUP BY department
```

### string\_agg

Concatenates values from a group into a single string, separated by the given delimiter.

```
SELECT department, string_agg(customer_id, ', ') AS customer_listFROM my_namespace.sales_dataGROUP BY department
```

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/sql-reference/aggregate-functions/#page","headline":"Aggregate functions · R2 SQL docs","description":"Reference for the aggregate functions supported in R2 SQL, organized by category.","url":"https://developers.cloudflare.com/r2-sql/sql-reference/aggregate-functions/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-06-22","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"},"keywords":["SQL"]}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/sql-reference/","name":"SQL reference"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/sql-reference/aggregate-functions/","name":"Aggregate functions"}}]}
```

---

---
title: Complex types
description: Reference for querying struct, array, and map column types in R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Complex types

R2 SQL supports querying struct, array, and map column types stored in Iceberg tables. This page covers access patterns, supported functions, and examples for each type.

---

## Structs

Struct columns contain named fields. Access fields using bracket notation or the `get_field()` function.

### Bracket notation

```
SELECT pricing['price'] AS price,       pricing['discount_percent'] AS discountFROM my_namespace.productsLIMIT 5
```

### get\_field function

```
SELECT get_field(pricing, 'price') AS price,       get_field(pricing, 'discount_percent') AS discountFROM my_namespace.productsLIMIT 5
```

### Struct fields in WHERE

```
SELECT customer_id, pricing['price'] AS priceFROM my_namespace.productsWHERE pricing['price'] > 50LIMIT 10
```

### Struct fields in ORDER BY

```
SELECT customer_id, pricing['price'] AS priceFROM my_namespace.productsWHERE pricing['price'] IS NOT NULLORDER BY pricing['price'] DESCLIMIT 10
```

### Struct fields in GROUP BY

```
SELECT platforms['windows'] AS windows_support,       COUNT(*) AS product_count,       AVG(pricing['price']) AS avg_priceFROM my_namespace.productsWHERE pricing['price'] IS NOT NULLGROUP BY platforms['windows']
```

### Creating structs inline

```
-- named_struct creates a struct with named fieldsSELECT named_struct('id', customer_id, 'amount', total_amount) AS infoFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 1
-- struct creates a struct with positional fieldsSELECT struct(customer_id, total_amount, region) AS infoFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 1
```

---

## Arrays

Array columns contain ordered lists of values. Array indexing is **1-based**.

### Index access

```
SELECT customer_id, tags[1] AS first_tag, tags[2] AS second_tagFROM my_namespace.productsLIMIT 5
```

### Create arrays

#### make\_array

Creates an array from a list of values.

```
SELECT make_array(1, 2, 3) AS numsFROM my_namespace.sales_dataLIMIT 1
```

#### string\_to\_array

Splits a string into an array by a delimiter.

```
SELECT string_to_array(categories, ',') AS cat_arrayFROM my_namespace.productsWHERE categories IS NOT NULLLIMIT 5
```

#### range

Generates an array of integers from start (inclusive) to stop (exclusive).

```
SELECT range(0, 5) AS numsFROM my_namespace.sales_dataLIMIT 1
```

#### generate\_series

Generates an array of integers from start to stop (inclusive).

```
SELECT generate_series(1, 5) AS numsFROM my_namespace.sales_dataLIMIT 1
```

### Inspect arrays

#### array\_length

Returns the number of elements in an array.

```
SELECT customer_id, array_length(tags) AS tag_countFROM my_namespace.productsLIMIT 5
```

#### cardinality

Returns the total number of elements in an array. Alias for `array_length`.

```
SELECT customer_id, cardinality(tags) AS tag_countFROM my_namespace.productsLIMIT 5
```

#### empty

Returns true if an array has zero elements.

```
SELECT customer_id, empty(tags) AS has_no_tagsFROM my_namespace.productsLIMIT 5
```

#### array\_ndims

Returns the number of dimensions of an array.

```
SELECT array_ndims(make_array(1, 2, 3)) AS ndimsFROM my_namespace.sales_dataLIMIT 1
```

#### array\_dims

Returns the dimensions of an array.

```
SELECT array_dims(make_array(1, 2, 3)) AS dimsFROM my_namespace.sales_dataLIMIT 1
```

### Search arrays

#### array\_has

Returns true if an array contains a value.

```
SELECT customer_id, array_has(tags, 'premium') AS is_premiumFROM my_namespace.productsLIMIT 5
```

#### array\_has\_all

Returns true if the first array contains all elements of the second.

```
SELECT array_has_all(make_array(1, 2, 3, 4), make_array(2, 3)) AS has_allFROM my_namespace.sales_dataLIMIT 1
```

#### array\_has\_any

Returns true if the first array contains any element of the second.

```
SELECT array_has_any(make_array(1, 2, 3), make_array(3, 4, 5)) AS has_anyFROM my_namespace.sales_dataLIMIT 1
```

#### array\_position

Returns the position of the first occurrence of a value (1-indexed). Returns 0 if not found.

```
SELECT array_position(make_array('a', 'b', 'c', 'b'), 'b') AS posFROM my_namespace.sales_dataLIMIT 1
```

#### array\_positions

Returns all positions of a value as an array.

```
SELECT array_positions(make_array(1, 2, 1, 3, 1), 1) AS positionsFROM my_namespace.sales_dataLIMIT 1
```

### Transform arrays

#### array\_sort

Sorts array elements.

```
SELECT array_sort(make_array(3, 1, 2)) AS sortedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_reverse

Reverses the order of array elements.

```
SELECT array_reverse(make_array(1, 2, 3)) AS reversedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_distinct

Removes duplicate elements from an array.

```
SELECT array_distinct(make_array(1, 2, 2, 3, 3, 3)) AS unique_valsFROM my_namespace.sales_dataLIMIT 1
```

#### flatten

Flattens a nested array by one level.

```
SELECT flatten(make_array(make_array(1, 2), make_array(3, 4))) AS flatFROM my_namespace.sales_dataLIMIT 1
```

#### array\_slice

Returns a slice of an array from a start index to an end index (both inclusive, 1-indexed).

```
SELECT array_slice(make_array(10, 20, 30, 40, 50), 2, 4) AS slicedFROM my_namespace.sales_dataLIMIT 1
```

### Modify arrays

#### array\_append

Appends a value to the end of an array.

```
SELECT array_append(make_array(1, 2, 3), 4) AS appendedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_prepend

Prepends a value to the beginning of an array.

```
SELECT array_prepend(0, make_array(1, 2, 3)) AS prependedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_concat

Concatenates two or more arrays.

```
SELECT array_concat(make_array(1, 2), make_array(3, 4)) AS mergedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_remove

Removes the first occurrence of a value from an array.

```
SELECT array_remove(make_array(1, 2, 3, 2), 2) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_remove\_all

Removes all occurrences of a value from an array.

```
SELECT array_remove_all(make_array(1, 2, 3, 2, 2), 2) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_remove\_n

Removes the first _n_ occurrences of a value from an array.

```
SELECT array_remove_n(make_array(1, 2, 2, 2, 3), 2, 2) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_replace

Replaces the first occurrence of a value in an array.

```
SELECT array_replace(make_array(1, 2, 3), 2, 99) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_replace\_n

Replaces the first _n_ occurrences of a value in an array.

```
SELECT array_replace_n(make_array(1, 2, 2, 2, 3), 2, 99, 2) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_replace\_all

Replaces all occurrences of a value in an array.

```
SELECT array_replace_all(make_array(1, 2, 3, 2), 2, 99) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_pop\_back

Removes the last element from an array.

```
SELECT array_pop_back(make_array(1, 2, 3)) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_pop\_front

Removes the first element from an array.

```
SELECT array_pop_front(make_array(1, 2, 3)) AS resultFROM my_namespace.sales_dataLIMIT 1
```

#### array\_repeat

Repeats a value a given number of times as an array.

```
SELECT array_repeat(region, 3) AS repeatedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_resize

Resizes an array to a given length, filling with a default value.

```
SELECT array_resize(make_array(1, 2), 5, 0) AS resizedFROM my_namespace.sales_dataLIMIT 1
```

### Set operations on arrays

#### array\_intersect

Returns elements common to both arrays.

```
SELECT array_intersect(make_array(1, 2, 3), make_array(2, 3, 4)) AS commonFROM my_namespace.sales_dataLIMIT 1
```

#### array\_union

Returns all unique elements from both arrays.

```
SELECT array_union(make_array(1, 2, 3), make_array(3, 4, 5)) AS mergedFROM my_namespace.sales_dataLIMIT 1
```

#### array\_except

Returns elements in the first array that are not in the second.

```
SELECT array_except(make_array(1, 2, 3, 4), make_array(2, 4)) AS diffFROM my_namespace.sales_dataLIMIT 1
```

### Aggregate array values

#### array\_max

Returns the maximum value in an array.

```
SELECT customer_id, array_max(scores) AS max_scoreFROM my_namespace.productsLIMIT 5
```

#### array\_min

Returns the minimum value in an array.

```
SELECT customer_id, array_min(scores) AS min_scoreFROM my_namespace.productsLIMIT 5
```

#### array\_any\_value

Returns the first non-NULL value in an array.

```
SELECT array_any_value(make_array(NULL, 42, NULL)) AS first_valFROM my_namespace.sales_dataLIMIT 1
```

#### array\_element

Returns the element at a given index (1-indexed). Equivalent to bracket-notation access (`arr[idx]`).

```
SELECT array_element(make_array(10, 20, 30), 2) AS second_valFROM my_namespace.sales_dataLIMIT 1
```

### Convert arrays

#### array\_to\_string

Joins array elements into a string with a separator.

```
SELECT customer_id, array_to_string(tags, ', ') AS tag_listFROM my_namespace.productsLIMIT 5
```

---

## Maps

Map columns store key-value pairs. Use `map_keys`, `map_values`, and `map_extract` to query them.

### map\_keys

Returns all keys from a map as an array.

```
SELECT map_keys(metadata) AS keysFROM my_namespace.productsLIMIT 5
```

### map\_values

Returns all values from a map as an array.

```
SELECT map_values(metadata) AS valsFROM my_namespace.productsLIMIT 5
```

### map\_extract

Returns the value for a specific key.

```
SELECT map_extract(metadata, 'source') AS source,       map_extract(metadata, 'store_name') AS storeFROM my_namespace.productsLIMIT 5
```

### Creating maps inline

```
SELECT map(make_array('a', 'b'), make_array(1, 2)) AS mFROM my_namespace.sales_dataLIMIT 1
```

---

## Complete function index

### Struct functions

| Function                    | Description                          |
| --------------------------- | ------------------------------------ |
| struct\_col\['field'\]      | Bracket notation field access        |
| get\_field(struct, 'field') | Function-based field access          |
| named\_struct(k1, v1, ...)  | Create struct with named fields      |
| struct(v1, v2, ...)         | Create struct with positional fields |

### Array functions

| Function                            | Description                              |
| ----------------------------------- | ---------------------------------------- |
| make\_array(v1, v2, ...)            | Create array from values                 |
| string\_to\_array(str, delim)       | Split string into array                  |
| range(start, stop)                  | Generate integer range (exclusive stop)  |
| generate\_series(start, stop)       | Generate integer series (inclusive stop) |
| array\_length(arr)                  | Number of elements                       |
| cardinality(arr)                    | Number of elements                       |
| empty(arr)                          | True if empty                            |
| array\_ndims(arr)                   | Number of dimensions                     |
| array\_dims(arr)                    | Dimension information                    |
| array\_has(arr, val)                | Contains check                           |
| array\_has\_all(arr, arr2)          | Contains all check                       |
| array\_has\_any(arr, arr2)          | Contains any check                       |
| array\_position(arr, val)           | First position of value                  |
| array\_positions(arr, val)          | All positions of value                   |
| array\_sort(arr)                    | Sort elements                            |
| array\_reverse(arr)                 | Reverse order                            |
| array\_distinct(arr)                | Remove duplicates                        |
| flatten(arr)                        | Flatten one level                        |
| array\_slice(arr, start, end)       | Extract sub-array                        |
| array\_append(arr, val)             | Append to end                            |
| array\_prepend(val, arr)            | Prepend to start                         |
| array\_concat(arr1, arr2)           | Concatenate arrays                       |
| array\_remove(arr, val)             | Remove first occurrence                  |
| array\_remove\_all(arr, val)        | Remove all occurrences                   |
| array\_remove\_n(arr, val, n)       | Remove first _n_ occurrences             |
| array\_replace(arr, old, new)       | Replace first occurrence                 |
| array\_replace\_n(arr, old, new, n) | Replace first _n_ occurrences            |
| array\_replace\_all(arr, old, new)  | Replace all occurrences                  |
| array\_pop\_back(arr)               | Remove last element                      |
| array\_pop\_front(arr)              | Remove first element                     |
| array\_repeat(val, n)               | Repeat value _n_ times                   |
| array\_resize(arr, size, default)   | Resize with default fill                 |
| array\_intersect(arr1, arr2)        | Common elements                          |
| array\_union(arr1, arr2)            | Union of elements                        |
| array\_except(arr1, arr2)           | Difference of elements                   |
| array\_max(arr)                     | Maximum value                            |
| array\_min(arr)                     | Minimum value                            |
| array\_any\_value(arr)              | First non-NULL value                     |
| array\_to\_string(arr, delim)       | Join elements as string                  |
| array\_element(arr, idx)            | Element at index                         |

### Map functions

| Function                  | Description                          |
| ------------------------- | ------------------------------------ |
| map(keys\_arr, vals\_arr) | Create map from key and value arrays |
| map\_keys(map)            | All keys as array                    |
| map\_values(map)          | All values as array                  |
| map\_extract(map, key)    | Value for a specific key             |

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/sql-reference/complex-types/#page","headline":"Complex types · R2 SQL docs","description":"Reference for querying struct, array, and map column types in R2 SQL.","url":"https://developers.cloudflare.com/r2-sql/sql-reference/complex-types/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"},"keywords":["SQL"]}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/sql-reference/","name":"SQL reference"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/sql-reference/complex-types/","name":"Complex types"}}]}
```

---

---
title: Scalar functions
description: Reference for the scalar functions supported in R2 SQL, organized by category.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Scalar functions

Scalar functions transform individual values and can be used in `SELECT`, `WHERE`, `GROUP BY`, `HAVING`, and `ORDER BY` clauses.

---

## Core functions

### arrow\_cast

Casts an expression to a specific Arrow data type by string name.

```
SELECT arrow_cast(total_amount, 'Float32') AS amount_f32FROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 1
```

### arrow\_typeof

Returns the Arrow data type name of an expression.

```
SELECT arrow_typeof(total_amount) AS amount_type,       arrow_typeof(customer_id) AS id_typeFROM my_namespace.sales_dataLIMIT 1
```

### coalesce

Returns the first non-NULL argument.

```
SELECT coalesce(department, region, 'unknown') AS first_valFROM my_namespace.sales_dataLIMIT 5
```

### get\_field

Extracts a field from a struct by name.

```
SELECT get_field(named_struct('customer', customer_id, 'amount', total_amount), 'amount') AS amtFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 1
```

### greatest

Returns the largest value from a list of arguments.

```
SELECT greatest(total_amount, unit_price, quantity) AS max_valFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### least

Returns the smallest value from a list of arguments.

```
SELECT least(total_amount, unit_price, quantity) AS min_valFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### named\_struct

Creates a struct with named fields from key-value pairs.

```
SELECT named_struct('customer', customer_id, 'amount', total_amount) AS infoFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 1
```

### nullif

Returns NULL if both arguments are equal, otherwise returns the first argument.

```
SELECT nullif(department, 'Unknown') AS deptFROM my_namespace.sales_dataLIMIT 5
```

### nvl

Returns the second argument if the first is NULL. Alias: `ifnull`.

```
SELECT nvl(department, 'N/A') AS deptFROM my_namespace.sales_dataLIMIT 5
```

### nvl2

Returns the second argument if the first is not NULL, otherwise returns the third.

```
SELECT nvl2(department, 'has_dept', 'no_dept') AS dept_statusFROM my_namespace.sales_dataLIMIT 5
```

### overlay

Replaces a substring at a given position.

```
SELECT customer_id,       overlay(customer_id PLACING 'XX' FROM 1 FOR 2) AS maskedFROM my_namespace.sales_dataLIMIT 3
```

### struct

Creates a struct with positional fields. Alias: `row`.

```
SELECT struct(customer_id, total_amount, region) AS infoFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 1
```

---

## Crypto functions

### digest

Returns a hash of a string using a specified algorithm. Supported algorithms: `md5`, `sha224`, `sha256`, `sha384`, `sha512`.

```
SELECT customer_id, digest(customer_id, 'sha256') AS hashFROM my_namespace.sales_dataLIMIT 1
```

### md5

Returns the MD5 hash of a string.

```
SELECT customer_id, md5(customer_id) AS hashFROM my_namespace.sales_dataLIMIT 1
```

### sha224

Returns the SHA-224 hash of a string.

```
SELECT sha224(customer_id) AS hashFROM my_namespace.sales_dataLIMIT 1
```

### sha256

Returns the SHA-256 hash of a string.

```
SELECT customer_id, sha256(customer_id) AS hashFROM my_namespace.sales_dataLIMIT 1
```

### sha384

Returns the SHA-384 hash of a string.

```
SELECT sha384(customer_id) AS hashFROM my_namespace.sales_dataLIMIT 1
```

### sha512

Returns the SHA-512 hash of a string.

```
SELECT sha512(customer_id) AS hashFROM my_namespace.sales_dataLIMIT 1
```

---

## Datetime functions

### current\_date

Returns today's date. Alias: `today`.

```
SELECT current_date() AS today_dateFROM my_namespace.sales_dataLIMIT 1
```

### current\_time

Returns the current time. Precision is quantized to 10ms boundaries.

```
SELECT current_time() AS now_timeFROM my_namespace.sales_dataLIMIT 1
```

### date\_bin

Bins a timestamp into fixed-size intervals aligned to an origin.

```
SELECT date_bin(INTERVAL '1 hour', timestamp, '2025-01-01T00:00:00Z') AS hour_bin,       COUNT(*) AS cntFROM my_namespace.sales_dataGROUP BY date_bin(INTERVAL '1 hour', timestamp, '2025-01-01T00:00:00Z')ORDER BY hour_binLIMIT 5
```

### date\_part

Extracts a component from a timestamp. Alias: `datepart`.

Supported fields: `year`, `month`, `day`, `hour`, `minute`, `second`, `millisecond`, `microsecond`, `week`, `dow`, `doy`, `quarter`, `epoch`.

```
SELECT date_part('hour', timestamp) AS hr,       date_part('minute', timestamp) AS mnFROM my_namespace.sales_dataLIMIT 1
```

### date\_trunc

Truncates a timestamp to a specified unit. Alias: `datetrunc`.

Supported units: `year`, `month`, `week`, `day`, `hour`, `minute`, `second`.

```
SELECT date_trunc('day', timestamp) AS day_trunc, COUNT(*) AS cntFROM my_namespace.sales_dataGROUP BY date_trunc('day', timestamp)ORDER BY day_truncLIMIT 5
```

### from\_unixtime

Converts a Unix epoch (seconds) to a timestamp.

```
SELECT from_unixtime(1770000000) AS tsFROM my_namespace.sales_dataLIMIT 1
```

### make\_date

Constructs a date from year, month, and day components.

```
SELECT make_date(2026, 3, 1) AS dFROM my_namespace.sales_dataLIMIT 1
```

### make\_time

Constructs a time from hour, minute, and second components.

```
SELECT make_time(14, 30, 0) AS tFROM my_namespace.sales_dataLIMIT 1
```

### now

Returns the current timestamp. Aliases: `current_timestamp`.

Precision is quantized to 10ms boundaries.

```
SELECT now() AS current_tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_char

Formats a timestamp as a string using strftime format. Alias: `date_format`.

```
SELECT to_char(timestamp, '%Y-%m-%d %H:%M') AS formattedFROM my_namespace.sales_dataLIMIT 1
```

### to\_date

Parses a date from a string using a format pattern.

```
SELECT to_date('2026-03-01', '%Y-%m-%d') AS dFROM my_namespace.sales_dataLIMIT 1
```

### to\_local\_time

Strips timezone information from a timestamp.

```
SELECT to_local_time(timestamp) AS local_tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_time

Parses a time from a string using a format pattern.

```
SELECT to_time('14:30:00', '%H:%M:%S') AS tFROM my_namespace.sales_dataLIMIT 1
```

### to\_timestamp

Parses a timestamp from a string using a format pattern.

```
SELECT to_timestamp('2026-03-01 12:00:00', '%Y-%m-%d %H:%M:%S') AS tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_timestamp\_micros

Converts microseconds since Unix epoch to a timestamp.

```
SELECT to_timestamp_micros(1770000000000000) AS tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_timestamp\_millis

Converts milliseconds since Unix epoch to a timestamp.

```
SELECT to_timestamp_millis(1770000000000) AS tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_timestamp\_nanos

Converts nanoseconds since Unix epoch to a timestamp. Large values may overflow.

```
SELECT to_timestamp_nanos(1770000000000000000) AS tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_timestamp\_seconds

Converts seconds since Unix epoch to a timestamp.

```
SELECT to_timestamp_seconds(1770000000) AS tsFROM my_namespace.sales_dataLIMIT 1
```

### to\_unixtime

Converts a timestamp to a Unix epoch (seconds).

```
SELECT to_unixtime(timestamp) AS epochFROM my_namespace.sales_dataLIMIT 1
```

---

## Encoding functions

### decode

Decodes a string to binary data. Supported encoding: `base64`.

```
SELECT decode('aGVsbG8=', 'base64') AS rawFROM my_namespace.sales_dataLIMIT 1
```

### encode

Encodes binary data to a string. Supported encoding: `base64`.

```
SELECT encode(CAST('hello' AS BYTEA), 'base64') AS b64FROM my_namespace.sales_dataLIMIT 1
```

---

## JSON functions

### json\_as\_text

Returns any JSON value as unquoted text.

```
SELECT json_as_text(doc, 'description') AS descriptionFROM my_namespace.sales_dataLIMIT 5
```

### json\_contains

Returns true if the specified key path exists in the JSON.

```
SELECT customer_id, json_contains(doc, 'email') AS has_emailFROM my_namespace.sales_dataLIMIT 5
```

### json\_get

Extracts a value by key path. Returns a union type — use the typed variants (`json_get_str`, `json_get_int`, etc.) for predictable results.

```
SELECT json_get(doc, 'name') AS nameFROM my_namespace.sales_dataLIMIT 5
```

### json\_get\_array

Returns a JSON array as a list of strings.

```
SELECT json_get_array(doc, 'tags') AS tagsFROM my_namespace.sales_dataLIMIT 5
```

### json\_get\_bool

Returns a boolean value from a JSON column by key path.

```
SELECT json_get_bool(doc, 'active') AS is_activeFROM my_namespace.sales_dataLIMIT 5
```

### json\_get\_float

Returns a float value from a JSON column by key path.

```
SELECT json_get_float(doc, 'price') AS priceFROM my_namespace.sales_dataLIMIT 5
```

### json\_get\_int

Returns an integer value from a JSON column by key path.

```
SELECT json_get_int(doc, 'age') AS ageFROM my_namespace.sales_dataLIMIT 5
```

### json\_get\_json

Returns nested JSON as a raw JSON string.

```
SELECT json_get_json(doc, 'metadata') AS metadataFROM my_namespace.sales_dataLIMIT 5
```

### json\_get\_str

Returns a string value from a JSON column by key path.

```
SELECT json_get_str(doc, 'name') AS nameFROM my_namespace.sales_dataLIMIT 5
```

### json\_length

Returns the length of a JSON array or object.

```
SELECT json_length(doc, 'items') AS item_countFROM my_namespace.sales_dataLIMIT 5
```

---

## Math functions

### abs

Returns the absolute value of a number.

```
SELECT abs(total_amount - 500) AS distance_from_500FROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### cbrt

Returns the cube root of a number.

```
SELECT cbrt(CAST(quantity AS DOUBLE)) AS cbrt_qtyFROM my_namespace.sales_dataWHERE quantity IS NOT NULLLIMIT 5
```

### ceil

Returns the smallest integer greater than or equal to a number.

```
SELECT ceil(total_amount) AS rounded_upFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### degrees

Converts radians to degrees.

```
SELECT degrees(pi()) AS full_circleFROM my_namespace.sales_dataLIMIT 1
```

### exp

Returns _e_ raised to the given power.

```
SELECT exp(total_amount / 1000.0) AS exp_valFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### factorial

Returns the factorial of a non-negative integer.

```
SELECT factorial(5) AS fact5FROM my_namespace.sales_dataLIMIT 1
```

### floor

Returns the largest integer less than or equal to a number.

```
SELECT floor(total_amount) AS rounded_downFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### gcd

Returns the greatest common divisor of two integers.

```
SELECT gcd(12, 8) AS gcd_valFROM my_namespace.sales_dataLIMIT 1
```

### Hyperbolic functions

`sinh`, `cosh`, `tanh`, `asinh`, `acosh`, `atanh`

```
SELECT sinh(1.0) AS sh, cosh(1.0) AS ch, tanh(1.0) AS thFROM my_namespace.sales_dataLIMIT 1
```

### isnan

Returns true if the value is NaN.

```
SELECT isnan(0.0 / 0.0) AS is_nanFROM my_namespace.sales_dataLIMIT 1
```

### iszero

Returns true if the value is zero.

```
SELECT iszero(0.0) AS is_zeroFROM my_namespace.sales_dataLIMIT 1
```

### lcm

Returns the least common multiple of two integers.

```
SELECT lcm(4, 6) AS lcm_valFROM my_namespace.sales_dataLIMIT 1
```

### ln

Returns the natural logarithm.

```
SELECT ln(total_amount) AS ln_valFROM my_namespace.sales_dataWHERE total_amount > 0LIMIT 5
```

### log

Returns the logarithm of a value for a given base.

```
SELECT log(10.0, total_amount) AS log10_valFROM my_namespace.sales_dataWHERE total_amount > 0LIMIT 5
```

### log2

Returns the base-2 logarithm.

```
SELECT log2(total_amount) AS log2_valFROM my_namespace.sales_dataWHERE total_amount > 0LIMIT 5
```

### log10

Returns the base-10 logarithm.

```
SELECT log10(total_amount) AS log10_valFROM my_namespace.sales_dataWHERE total_amount > 0LIMIT 5
```

### nanvl

Returns the first argument if it is not NaN, otherwise returns the second.

```
SELECT nanvl(0.0 / 0.0, -1.0) AS safe_valFROM my_namespace.sales_dataLIMIT 1
```

### pi

Returns the value of pi.

```
SELECT pi() AS pi_valFROM my_namespace.sales_dataLIMIT 1
```

### power

Raises a number to a power. Alias: `pow`.

```
SELECT power(total_amount, 2.0) AS amount_squaredFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### radians

Converts degrees to radians.

```
SELECT radians(180.0) AS pi_valFROM my_namespace.sales_dataLIMIT 1
```

### random

Returns a random float between 0 and 1.

```
SELECT random() AS rndFROM my_namespace.sales_dataLIMIT 1
```

### round

Rounds a number to a specified number of decimal places.

```
SELECT round(total_amount, 2) AS roundedFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### signum

Returns the sign of a number: -1, 0, or 1.

```
SELECT signum(total_amount - 500) AS sign_valFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

### sqrt

Returns the square root of a number.

```
SELECT sqrt(CAST(quantity AS DOUBLE)) AS sqrt_qtyFROM my_namespace.sales_dataWHERE quantity IS NOT NULLLIMIT 5
```

### Trigonometric functions

`sin`, `cos`, `tan`, `asin`, `acos`, `atan`, `atan2`, `cot`

```
SELECT sin(1.0) AS s, cos(1.0) AS c, tan(1.0) AS t,       asin(0.5) AS as_val, acos(0.5) AS ac_val, atan(1.0) AS at_valFROM my_namespace.sales_dataLIMIT 1
```

### trunc

Truncates a number to a specified number of decimal places.

```
SELECT trunc(total_amount, 0) AS truncatedFROM my_namespace.sales_dataWHERE total_amount IS NOT NULLLIMIT 5
```

---

## Regex functions

### regexp\_count

Returns the number of matches of a pattern in a string.

```
SELECT department, regexp_count(department, '[aeiou]') AS vowelsFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### regexp\_instr

Returns the position of the first match of a pattern.

```
SELECT department, regexp_instr(department, '[0-9]') AS digit_posFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### regexp\_like

Returns true if a string matches a regular expression pattern.

```
SELECT department, regexp_like(department, '^[A-Z]{2}') AS starts_two_capsFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### regexp\_match

Returns the first match of a pattern as an array.

```
SELECT department, regexp_match(department, '([A-Z][a-z]+)') AS first_wordFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 3
```

### regexp\_replace

Replaces matches of a pattern with a replacement string.

```
SELECT department, regexp_replace(department, '[0-9]', '#') AS no_digitsFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

---

## String functions

### ascii

Returns the ASCII code of the first character.

```
SELECT customer_id, ascii(customer_id) AS first_codeFROM my_namespace.sales_dataLIMIT 3
```

### bit\_length

Returns the length of a string in bits.

```
SELECT customer_id, bit_length(customer_id) AS bitsFROM my_namespace.sales_dataLIMIT 3
```

### btrim

Trims characters from both sides of a string. Alias: `trim`.

```
SELECT btrim('  hello  ') AS trimmedFROM my_namespace.sales_dataLIMIT 1
```

### chr

Returns the character for a given ASCII code.

```
SELECT chr(65) AS letterFROM my_namespace.sales_dataLIMIT 1
```

### concat

Concatenates two or more strings.

```
SELECT concat(department, ' - ', region) AS labelFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### concat\_ws

Concatenates strings with a separator.

```
SELECT concat_ws('/', region, department) AS pathFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### contains

Returns true if a string contains a substring.

```
SELECT customer_id, contains(department, 'Sales') AS is_salesFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### ends\_with

Returns true if a string ends with a suffix.

```
SELECT customer_id, ends_with(department, 'ing') AS ends_ingFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### levenshtein

Returns the Levenshtein edit distance between two strings.

```
SELECT department, levenshtein(department, 'Engineering') AS distFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### lower

Converts a string to lowercase.

```
SELECT lower(department) AS dept_lowerFROM my_namespace.sales_dataLIMIT 5
```

### ltrim

Trims characters from the left side of a string.

```
SELECT ltrim('  hello') AS trimmedFROM my_namespace.sales_dataLIMIT 1
```

### octet\_length

Returns the length of a string in bytes.

```
SELECT customer_id, octet_length(customer_id) AS bytesFROM my_namespace.sales_dataLIMIT 3
```

### repeat

Repeats a string a given number of times.

```
SELECT repeat(region, 2) AS doubledFROM my_namespace.sales_dataLIMIT 3
```

### replace

Replaces all occurrences of a substring.

```
SELECT department, replace(department, ' ', '_') AS underscoredFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### rtrim

Trims characters from the right side of a string.

```
SELECT rtrim('hello  ') AS trimmedFROM my_namespace.sales_dataLIMIT 1
```

### split\_part

Splits a string by a delimiter and returns the specified part (1-indexed).

```
SELECT customer_id, split_part(customer_id, '-', 1) AS first_partFROM my_namespace.sales_dataWHERE customer_id IS NOT NULLLIMIT 5
```

### starts\_with

Returns true if a string starts with a prefix.

```
SELECT customer_id, starts_with(department, 'Eng') AS is_engFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### to\_hex

Converts an integer to a hexadecimal string.

```
SELECT to_hex(255) AS hex_ffFROM my_namespace.sales_dataLIMIT 1
```

### upper

Converts a string to uppercase.

```
SELECT upper(region) AS region_upperFROM my_namespace.sales_dataLIMIT 5
```

### uuid

Generates a random UUID.

```
SELECT uuid() AS new_idFROM my_namespace.sales_dataLIMIT 1
```

---

## Unicode functions

### character\_length

Returns the number of characters in a string. Aliases: `length`, `char_length`.

```
SELECT department, character_length(department) AS lenFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### find\_in\_set

Returns the position of a string within a comma-separated list.

```
SELECT find_in_set('North', 'South,North,East,West') AS posFROM my_namespace.sales_dataLIMIT 1
```

### initcap

Capitalizes the first letter of each word.

```
SELECT initcap('hello world') AS cappedFROM my_namespace.sales_dataLIMIT 1
```

### left

Returns the leftmost _n_ characters of a string.

```
SELECT department, left(department, 5) AS prefixFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### lpad

Left-pads a string to a specified length.

```
SELECT region, lpad(region, 15, '.') AS paddedFROM my_namespace.sales_dataLIMIT 5
```

### reverse

Reverses a string.

```
SELECT department, reverse(department) AS revFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### right

Returns the rightmost _n_ characters of a string.

```
SELECT department, right(department, 3) AS suffixFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### rpad

Right-pads a string to a specified length.

```
SELECT region, rpad(region, 15, '.') AS paddedFROM my_namespace.sales_dataLIMIT 5
```

### strpos

Returns the position of a substring (1-indexed). Aliases: `instr`, `position`.

```
SELECT department, strpos(department, 'a') AS a_posFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### substr

Returns a substring starting at a position for a given length. Alias: `substring`.

```
SELECT department, substr(department, 1, 8) AS first_eightFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

### substr\_index

Returns the substring before the _n_\-th occurrence of a delimiter. Alias: `substring_index`.

```
SELECT customer_id, substr_index(customer_id, '-', 1) AS first_segmentFROM my_namespace.sales_dataWHERE customer_id IS NOT NULLLIMIT 5
```

### translate

Replaces characters in a string based on a mapping.

```
SELECT department, translate(department, 'aeiou', '12345') AS codedFROM my_namespace.sales_dataWHERE department IS NOT NULLLIMIT 5
```

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/sql-reference/scalar-functions/#page","headline":"Scalar functions · R2 SQL docs","description":"Reference for the scalar functions supported in R2 SQL, organized by category.","url":"https://developers.cloudflare.com/r2-sql/sql-reference/scalar-functions/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-06-22","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"},"keywords":["SQL"]}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/sql-reference/","name":"SQL reference"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/sql-reference/scalar-functions/","name":"Scalar functions"}}]}
```

---

---
title: Troubleshooting guide
description: Troubleshoot common R2 SQL errors including query structure, type, and timeout issues.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Troubleshooting guide

This guide covers potential errors and limitations you may encounter when using R2 SQL. R2 SQL is in open beta, and supported functionality will evolve and change over time.

## Query structure errors

### Missing required clauses

**Error**: `expected exactly 1 table in FROM clause`

**Problem**: R2 SQL requires a `FROM` clause in your query.

```
-- Invalid - Missing FROM clauseSELECT user_id WHERE status = 200;
-- ValidSELECT user_idFROM my_namespace.http_requestsWHERE status = 200 AND timestamp BETWEEN '2025-09-24T01:00:00Z' AND '2025-09-25T01:00:00Z';
```

**Solution**: Always include `FROM` with a fully qualified table name (`namespace_name.table_name`).

---

## FROM clause issues

### Join performance issues

**Symptom**: Query returns 502 Bad Gateway or times out.

**Problem**: Multi-way joins across large tables can exceed resource limits, especially with `COUNT(DISTINCT)` or other memory-intensive aggregations.

```
-- May timeout: cross-joining two large fact tablesSELECT COUNT(DISTINCT h.ray_id), COUNT(DISTINCT f.event_id)FROM my_namespace.http_requests hINNER JOIN my_namespace.firewall_events f ON h.zone_id = f.zone_id
```

**Solution**:

* Add `WHERE` filters to reduce intermediate result sizes.
* Join through dimension tables instead of directly joining fact tables.
* Use `approx_distinct()` instead of `COUNT(DISTINCT)` for approximate counts.
* Break complex multi-way joins into smaller queries using CTEs or sequential queries.

```
-- Better: filter both sides and use approx_distinctSELECT z.plan,       approx_distinct(h.ray_id) AS unique_requestsFROM my_namespace.zones zINNER JOIN my_namespace.http_requests h ON z.zone_id = h.zone_idWHERE z.plan = 'enterprise'  AND h.status_code >= 400GROUP BY z.plan
```

### `NOT IN` on nullable columns

**Symptom**: `NOT IN` subquery returns unexpected results or errors.

**Problem**: `NOT IN` subqueries are not supported when the subquery column can contain `NULL` values.

```
-- Fails: nullable_col may contain NULLsSELECT zone_idFROM my_namespace.http_requestsWHERE zone_id NOT IN (    SELECT nullable_col FROM my_namespace.other_table)LIMIT 20
```

**Solution**: Use `NOT EXISTS` with a correlated subquery instead.

```
-- Works: NOT EXISTS handles NULLs correctlySELECT h.zone_idFROM my_namespace.http_requests hWHERE NOT EXISTS (    SELECT 1 FROM my_namespace.other_table o    WHERE o.nullable_col = h.zone_id)LIMIT 20
```

### Correlated subquery performance

**Symptom**: `EXISTS` or `NOT EXISTS` subquery runs slowly.

**Problem**: Correlated subqueries with complex conditions can be slow because the inner query is evaluated for each row of the outer query.

```
-- Slower: multiple filter conditions in correlated subquerySELECT z.domainFROM my_namespace.zones zWHERE EXISTS (    SELECT 1 FROM my_namespace.firewall_events f    WHERE f.zone_id = z.zone_id      AND f.risk_score > 0.9      AND f.colo = 'SJC')LIMIT 20
```

**Solution**:

* Simplify correlated conditions where possible.
* Consider rewriting as a `JOIN` with `GROUP BY` instead of `EXISTS`.
* Use an `IN` subquery with pre-aggregated results instead of `EXISTS`.

---

## WHERE clause issues

### JSON object filtering

**Error**: `unsupported binary operator` or `Error during planning: could not parse compound`

**Problem**: JSON functions are not yet implemented. You cannot filter on fields inside JSON objects using JSON path operators.

```
-- Invalid - JSON path operators not supportedSELECT * FROM my_namespace.requests WHERE json_data->>'level' = 'error'
-- Valid - Filter on the entire JSON columnSELECT * FROM my_namespace.logs WHERE json_data IS NOT NULL LIMIT 100
```

**Solution**:

* Denormalize frequently queried JSON fields into separate columns.
* Filter on the entire JSON field, and handle parsing in your application.

Note

Struct columns are supported and can be filtered using bracket notation. Refer to [Complex types](https://developers.cloudflare.com/r2-sql/sql-reference/complex-types/) for details.

```
SELECT * FROM my_namespace.products WHERE pricing['price'] > 50 LIMIT 100
```

---

## LIMIT clause issues

### Invalid limit values

**Error**: `maximum LIMIT is 10000`

**Problem**: LIMIT values must be between 1 and 10,000.

```
-- Invalid - Out of rangeSELECT * FROM my_namespace.events LIMIT 50000
-- ValidSELECT * FROM my_namespace.events LIMIT 10000
```

**Solution**: Use LIMIT values between 1 and 10,000.

### Pagination attempts

**Error**: `unsupported feature: OFFSET clause is not supported`

**Problem**: OFFSET is not supported.

```
-- Invalid - Pagination not supportedSELECT * FROM my_namespace.events LIMIT 100 OFFSET 200
-- Valid - Use cursor-based pagination with ORDER BY and WHERE-- Page 1SELECT * FROM my_namespace.eventsWHERE timestamp >= '2024-01-01'ORDER BY timestampLIMIT 100
-- Page 2 - Use the last timestamp from the previous pageSELECT * FROM my_namespace.eventsWHERE timestamp > '2024-01-01T10:30:00Z'ORDER BY timestampLIMIT 100
```

**Solution**: Implement cursor-based pagination using `ORDER BY` and `WHERE` conditions.

---

## Schema issues

### DDL and DML operations

**Error**: `only read-only queries are allowed`

**Problem**: R2 SQL is a read-only query engine. DDL and DML statements are not supported.

```
-- Invalid - Schema changes not supportedALTER TABLE my_namespace.events ADD COLUMN new_field STRINGUPDATE my_namespace.events SET status = 200 WHERE user_id = '123'CREATE TABLE my_namespace.test (id INT)DROP TABLE my_namespace.events
```

**Solution**: Manage your schema through your data ingestion pipeline and R2 Data Catalog.

---

## Performance optimization

### Query performance issues

If your queries are running slowly:

1. **Always include partition (timestamp) filters**: This is the most important optimization.  
```
-- Good - Narrows data scan to one daySELECT * FROM my_namespace.eventsWHERE timestamp BETWEEN '2024-01-01' AND '2024-01-02'LIMIT 100  
```
2. **Use selective filtering**: Include specific conditions to reduce result sets.  
```
-- Good - Multiple filters reduce scanned dataSELECT * FROM my_namespace.eventsWHERE status = 200 AND region = 'US' AND timestamp > '2024-01-01'LIMIT 100  
```
3. **Select specific columns**: Avoid `SELECT *` when you only need a few fields.  
```
-- Good - Only reads the columns you needSELECT user_id, status, timestampFROM my_namespace.eventsWHERE timestamp > '2024-01-01'LIMIT 100  
```
4. **Use EXPLAIN to inspect the execution plan**: Verify that predicate pushdown and file pruning are working.  
```  
EXPLAIN SELECT user_id, statusFROM my_namespace.eventsWHERE timestamp > '2024-01-01' AND status = 200  
```
5. **Enable compaction**: Enable compaction in R2 Data Catalog to reduce the number of small files scanned per query.

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/troubleshooting/#page","headline":"Troubleshooting guide · R2 SQL docs","description":"Troubleshoot common R2 SQL errors including query structure, type, and timeout issues.","url":"https://developers.cloudflare.com/r2-sql/troubleshooting/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-05-15","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"},"keywords":["SQL"]}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/troubleshooting/","name":"Troubleshooting guide"}}]}
```

---

---
title: Tutorials
description: Step-by-step R2 SQL tutorials for querying data stored in R2.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Tutorials

```json
{"@context":"https://schema.org","@type":"WebPage","@id":"https://developers.cloudflare.com/r2-sql/tutorials/#page","headline":"Tutorials · R2 SQL docs","description":"Step-by-step R2 SQL tutorials for querying data stored in R2.","url":"https://developers.cloudflare.com/r2-sql/tutorials/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/tutorials/","name":"Tutorials"}}]}
```

---

---
title: Build an end to end data pipeline
description: This tutorial demonstrates how to build a complete data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Build an end to end data pipeline

Learn how to create an end-to-end data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL for real-time transaction analysis.

In this tutorial, you will learn how to build a complete data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL. This also includes a sample Python script that creates and sends financial transaction data to your Pipeline that can be queried by R2 SQL or any Apache Iceberg-compatible query engine.

This tutorial demonstrates how to:

* Set up R2 Data Catalog to store our transaction events in an Apache Iceberg table
* Set up a Cloudflare Pipeline
* Create transaction data with fraud patterns to send to your Pipeline
* Query your data using R2 SQL for fraud analysis

## Prerequisites

1. Sign up for a [Cloudflare account ↗](https://dash.cloudflare.com/sign-up).
2. Install [Node.js ↗](https://nodejs.org/en/).
3. Install [Python 3.8+ ↗](https://python.org) for the data generation script.

Node.js version manager

Use a Node version manager like [Volta ↗](https://volta.sh/) or [nvm ↗](https://github.com/nvm-sh/nvm) to avoid permission issues and change Node.js versions.

Wrangler requires a Node version of 16.17.0 or later.

## 1\. Set up authentication

You will need API tokens to interact with Cloudflare services.

1. In the Cloudflare dashboard, go to the **API tokens** page.  
[ Go to **Account API tokens** ](https://dash.cloudflare.com/?to=/:account/api-tokens)
2. Select **Create Token**.
3. Select **Get started** next to Create Custom Token.
4. Enter a name for your API token.
5. Under **Permissions**, choose:

  * **Workers Pipelines** with Read, Send, and Edit permissions
  * **Workers R2 Data Catalog** with Read and Edit permissions
  * **Workers R2 SQL** with Read permissions
  * **Workers R2 Storage** with Read and Edit permissions
6. Optionally, add a TTL to this token.
7. Select **Continue to summary**.
8. Click **Create Token**
9. Note the **Token value**.

Export your new token as an environment variable:

Terminal window

```
export WRANGLER_R2_SQL_AUTH_TOKEN= #paste your token here
```

If this is your first time using Wrangler, make sure to log in.

Terminal window

```
npx wrangler login
```

## 2\. Create an R2 bucket and enable R2 Data Catalog

* [ Wrangler CLI ](#tab-panel-10100)
* [ Dashboard ](#tab-panel-10101)

Create an R2 bucket:

Terminal window

```
npx wrangler r2 bucket create fraud-pipeline
```

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select **Create bucket**.
3. Enter the bucket name: `fraud-pipeline`
4. Select **Create bucket**.

Enable the catalog on your R2 bucket:

* [ Wrangler CLI ](#tab-panel-10102)
* [ Dashboard ](#tab-panel-10103)

Terminal window

```
npx wrangler r2 bucket catalog enable fraud-pipeline
```

When you run this command, take note of the "Warehouse" and "Catalog URI". You will need these later.

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select the bucket: `fraud-pipeline`.
3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, and select **Enable**.
4. Once enabled, note the **Catalog URI** and **Warehouse name**.

Note

Copy the `warehouse` (ACCOUNTID\_BUCKETNAME) and paste it in the `export` below. We will use it later in the tutorial.

Terminal window

```
export WAREHOUSE= #Paste your warehouse here
```

### (Optional) Enable compaction on your R2 Data Catalog

R2 Data Catalog can automatically compact tables for you. In production event streaming use cases, it is common to end up with many small files, so it is recommended to enable compaction. Since the tutorial only demonstrates a sample use case, this step is optional.

* [ Wrangler CLI ](#tab-panel-10104)
* [ Dashboard ](#tab-panel-10105)

Terminal window

```
npx wrangler r2 bucket catalog compaction enable fraud-pipeline --token $WRANGLER_R2_SQL_AUTH_TOKEN
```

1. In the Cloudflare dashboard, go to the **R2 object storage** page.  
[ Go to **Overview** ](https://dash.cloudflare.com/?to=/:account/r2/overview)
2. Select the bucket: `fraud-pipeline`.
3. Switch to the **Settings** tab, scroll down to **R2 Data Catalog**, click on edit icon, and select **Enable**.
4. You can choose a target file size or leave the default. Click save.

## 3\. Set up the pipeline infrastructure

### 3.1\. Create the Pipeline stream

* [ Wrangler CLI ](#tab-panel-10106)
* [ Dashboard ](#tab-panel-10107)

First, create a schema file called `raw_transactions_schema.json` with the following `json` schema:

```
{  "fields": [    { "name": "transaction_id", "type": "string", "required": true },    { "name": "user_id", "type": "int64", "required": true },    { "name": "amount", "type": "float64", "required": false },    { "name": "transaction_timestamp", "type": "string", "required": false },    { "name": "location", "type": "string", "required": false },    { "name": "merchant_category", "type": "string", "required": false },    { "name": "is_fraud", "type": "bool", "required": false }  ]}
```

Create a stream to receive incoming fraud detection events:

Terminal window

```
npx wrangler pipelines streams create raw_events_stream \  --schema-file raw_transactions_schema.json \  --http-enabled true \  --http-auth false
```

Note

Note the **HTTP Ingest Endpoint URL** from the output. This is the endpoint you will use to send data to your pipeline.

Terminal window

```
# The http ingest endpoint from the output (see example below)export STREAM_ENDPOINT= #the http ingest endpoint from the output (see example below)
```

The output should look like this:

Terminal window

```
🌀 Creating stream 'raw_events_stream'...✨ Successfully created stream 'raw_events_stream' with id 'stream_id'.
Creation Summary:General:  Name:  raw_events_stream
HTTP Ingest:  Enabled:         Yes  Authentication:  Yes  Endpoint:        https://stream_id.ingest.cloudflare.com  CORS Origins:    None
Input Schema:┌───────────────────────┬────────┬────────────┬──────────┐│ Field Name            │ Type   │ Unit/Items │ Required │├───────────────────────┼────────┼────────────┼──────────┤│ transaction_id        │ string │            │ Yes      │├───────────────────────┼────────┼────────────┼──────────┤│ user_id               │ int64  │            │ Yes      │├───────────────────────┼────────┼────────────┼──────────┤│ amount                │float64 │            │ No       │├───────────────────────┼────────┼────────────┼──────────┤│ transaction_timestamp │ string │            │ No       │├───────────────────────┼────────┼────────────┼──────────┤│ location              │ string │            │ No       │├───────────────────────┼────────┼────────────┼──────────┤│ merchant_category     │ string │            │ No       │├───────────────────────┼────────┼────────────┼──────────┤│ is_fraud              │ bool   │            │ No       │└───────────────────────┴────────┴────────────┴──────────┘
```

### 3.2\. Create the data sink

Create a sink that writes data to your R2 bucket as Apache Iceberg tables:

Terminal window

```
npx wrangler pipelines sinks create raw_events_sink \  --type "r2-data-catalog" \  --bucket "fraud-pipeline" \  --roll-interval 30 \  --namespace "fraud_detection" \  --table "transactions" \  --catalog-token $WRANGLER_R2_SQL_AUTH_TOKEN
```

Note

This creates a `sink` configuration that will write to the Iceberg table `fraud_detection.transactions` in your R2 Data Catalog every 30 seconds. Pipelines automatically appends an `__ingest_ts` column that is used to partition the table by `DAY`.

### 3.3\. Create the pipeline

Connect your stream to your sink with SQL:

Terminal window

```
npx wrangler pipelines create raw_events_pipeline \  --sql "INSERT INTO raw_events_sink SELECT * FROM raw_events_stream"
```

1. In the Cloudflare dashboard, go to **Pipelines** \> **Pipelines**.  
[ Go to **Pipelines** ](https://dash.cloudflare.com/?to=/:account/pipelines/overview)
2. Select **Create Pipeline**.
3. **Connect to a Stream**:

  * Pipeline name: `raw_events`
  * Enable HTTP endpoint for sending data: Enabled
  * HTTP authentication: Disabled (default)
  * Select **Next**
4. **Define Input Schema**:

  * Select **JSON editor**
  * Copy in the schema:  
  ```  
  {  "fields": [    { "name": "transaction_id", "type": "string", "required": true },    { "name": "user_id", "type": "int64", "required": true },    { "name": "amount", "type": "float64", "required": false },    {      "name": "transaction_timestamp",      "type": "string",      "required": false    },    { "name": "location", "type": "string", "required": false },    { "name": "merchant_category", "type": "string", "required": false },    { "name": "is_fraud", "type": "bool", "required": false }  ]}  
  ```
  * Select **Next**
5. **Define Sink**:

  * Select your R2 bucket: `fraud-pipeline`
  * Storage type: **R2 Data Catalog**
  * Namespace: `fraud_detection`
  * Table name: `transactions`
  * **Advanced Settings**: Change **Maximum Time Interval** to `30 seconds`
  * Select **Next**
6. **Credentials**:

  * Disable **Automatically create an Account API token for your sink**
  * Enter **Catalog Token** from step 1
  * Select **Next**
7. **Pipeline Definition**:

  * Leave the default SQL query:  
  ```  
  INSERT INTO raw_events_sink SELECT * FROM raw_events_stream;  
  ```
  * Select **Create Pipeline**
8. After pipeline creation, note the **Stream ID** for the next step.

## 4\. Generate sample fraud detection data

Create a Python script to generate realistic transaction data with fraud patterns:

fraud\_data\_generator.py

```
import requestsimport jsonimport uuidimport randomimport timeimport osfrom datetime import datetime, timezone, timedelta
# Configuration - exported from the prior stepsSTREAM_ENDPOINT = os.environ["STREAM_ENDPOINT"]# From the stream you createdAPI_TOKEN = os.environ["WRANGLER_R2_SQL_AUTH_TOKEN"] #the same one created earlierEVENTS_TO_SEND = 1000 # Feel free to adjust this
def generate_transaction():    """Generate some random transactions with occasional fraud"""
    # User IDs    high_risk_users = [1001, 1002, 1003, 1004, 1005]    normal_users = list(range(1006, 2000))
    user_id = random.choice(high_risk_users + normal_users)    is_high_risk_user = user_id in high_risk_users
    # Generate amounts    if random.random() < 0.05:        amount = round(random.uniform(5000, 50000), 2)    elif random.random() < 0.03:        amount = round(random.uniform(0.01, 1.00), 2)    else:        amount = round(random.uniform(10, 500), 2)
    # Locations    normal_locations = ["NEW_YORK", "LOS_ANGELES", "CHICAGO", "MIAMI", "SEATTLE", "SAN FRANCISCO"]    high_risk_locations = ["UNKNOWN_LOCATION", "VPN_EXIT", "MARS", "BAT_CAVE"]
    if is_high_risk_user and random.random() < 0.3:        location = random.choice(high_risk_locations)    else:        location = random.choice(normal_locations)
    # Merchant categories    normal_merchants = ["GROCERY", "GAS_STATION", "RESTAURANT", "RETAIL"]    high_risk_merchants = ["GAMBLING", "CRYPTO", "MONEY_TRANSFER", "GIFT_CARDS"]
    if random.random() < 0.1:  # 10% high-risk merchants        merchant_category = random.choice(high_risk_merchants)    else:        merchant_category = random.choice(normal_merchants)
    # Series of checks to either increase fraud score by a certain margin    fraud_score = 0    if amount > 2000: fraud_score += 0.4    if amount < 1: fraud_score += 0.3    if location in high_risk_locations: fraud_score += 0.5    if merchant_category in high_risk_merchants: fraud_score += 0.3    if is_high_risk_user: fraud_score += 0.2
    # Compare the fraud scores    is_fraud = random.random() < min(fraud_score * 0.3, 0.8)
    # Generate timestamps (some fraud happens at unusual hours)    base_time = datetime.now(timezone.utc)    if is_fraud and random.random() < 0.4:  # 40% of fraud at night        hour = random.randint(0, 5)  # Late night/early morning        transaction_time = base_time.replace(hour=hour)    else:        transaction_time = base_time - timedelta(            hours=random.randint(0, 168)  # Last week        )
    return {        "transaction_id": str(uuid.uuid4()),        "user_id": user_id,        "amount": amount,        "transaction_timestamp": transaction_time.isoformat(),        "location": location,        "merchant_category": merchant_category,        "is_fraud": True if is_fraud else False    }
def send_batch_to_stream(events, batch_size=100):    """Send events to Cloudflare Stream in batches"""
    headers = {        "Authorization": f"Bearer {API_TOKEN}",        "Content-Type": "application/json"    }
    total_sent = 0    fraud_count = 0
    for i in range(0, len(events), batch_size):        batch = events[i:i + batch_size]        fraud_in_batch = sum(1 for event in batch if event["is_fraud"] == True)
        try:            response = requests.post(STREAM_ENDPOINT, headers=headers, json=batch)
            if response.status_code in [200, 201]:                total_sent += len(batch)                fraud_count += fraud_in_batch                print(f"Sent batch of {len(batch)} events (Total: {total_sent})")            else:                print(f"Failed to send batch: {response.status_code} - {response.text}")
        except Exception as e:            print(f"Error sending batch: {e}")
        time.sleep(0.1)
    return total_sent, fraud_count
def main():    print("Generating fraud detection data...")
    # Generate events    events = []    for i in range(EVENTS_TO_SEND):        events.append(generate_transaction())        if (i + 1) % 100 == 0:            print(f"Generated {i + 1} events...")
    fraud_events = sum(1 for event in events if event["is_fraud"] == True)    print(f"📊 Generated {len(events)} total events ({fraud_events} fraud, {fraud_events/len(events)*100:.1f}%)")
    # Send to stream    print("Sending data to Pipeline stream...")    sent, fraud_sent = send_batch_to_stream(events)
    print(f"\nComplete!")    print(f"   Events sent: {sent:,}")    print(f"   Fraud events: {fraud_sent:,} ({fraud_sent/sent*100:.1f}%)")    print(f"   Data is now flowing through your pipeline!")
if __name__ == "__main__":    main()
```

Install the required Python dependency and run the script:

Terminal window

```
pip install requestspython fraud_data_generator.py
```

## 5\. Query the data with R2 SQL

Now you can analyze your fraud detection data using R2 SQL. Here are some example queries:

### 5.1\. View recent transactions

Terminal window

```
npx wrangler r2 sql query "$WAREHOUSE" "SELECT    transaction_id,    user_id,    amount,    location,    merchant_category,    is_fraud,    transaction_timestampFROM fraud_detection.transactionsWHERE __ingest_ts > '2025-09-24T01:00:00Z'AND is_fraud = trueLIMIT 10"
```

### 5.2\. Filter the raw transactions into a new table to highlight high-value transactions

Create a new sink that will write the filtered data to a new Apache Iceberg table in R2 Data Catalog:

Terminal window

```
npx wrangler pipelines sinks create fraud_filter_sink \  --type "r2-data-catalog" \  --bucket "fraud-pipeline" \  --roll-interval 30 \  --namespace "fraud_detection" \  --table "fraud_transactions" \  --catalog-token $WRANGLER_R2_SQL_AUTH_TOKEN
```

Now you will create a new SQL query to process data from the original `raw_events_stream` stream and only write flagged transactions that are over the `amount` of 1,000.

Terminal window

```
npx wrangler pipelines create fraud_events_pipeline \  --sql "INSERT INTO fraud_filter_sink SELECT * FROM raw_events_stream WHERE is_fraud=true and amount > 1000"
```

Note

It may take a few minutes for the new Pipeline to fully Initialize and start processing the data. Also keep in mind the 30 second `roll-interval`.

Query the table and check the results:

Terminal window

```
npx wrangler r2 sql query "$WAREHOUSE" "SELECT    transaction_id,    user_id,    amount,    location,    merchant_category,    is_fraud,    transaction_timestampFROM fraud_detection.fraud_transactionsLIMIT 10"
```

Also verify that the non-fraudulent events are being filtered out:

Terminal window

```
npx wrangler r2 sql query "$WAREHOUSE" "SELECT    transaction_id,    user_id,    amount,    location,    merchant_category,    is_fraud,    transaction_timestampFROM fraud_detection.fraud_transactionsWHERE is_fraud = falseLIMIT 10"
```

You should see the following output:

```
Query executed successfully with no results
```

## Conclusion

You have successfully built an end to end data pipeline using Cloudflare's data platform. Through this tutorial, you have learned to:

1. **Use R2 Data Catalog**: Leveraged Apache Iceberg tables for efficient data storage
2. **Set up Cloudflare Pipelines**: Created streams, sinks, and pipelines for data ingestion
3. **Generated sample data**: Created transaction data with some basic fraud patterns
4. **Query your tables with R2 SQL**: Access raw and processed data tables stored in R2 Data Catalog

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/tutorials/end-to-end-pipeline/#page","headline":"Build an end to end data pipeline · R2 SQL docs","description":"This tutorial demonstrates how to build a complete data pipeline using Cloudflare Pipelines, R2 Data Catalog, and R2 SQL.","url":"https://developers.cloudflare.com/r2-sql/tutorials/end-to-end-pipeline/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-05-05","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/tutorials/","name":"Tutorials"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/tutorials/end-to-end-pipeline/","name":"Build an end to end data pipeline"}}]}
```

---

---
title: Platform
description: R2 SQL platform details including pricing and configuration.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Platform

```json
{"@context":"https://schema.org","@type":"WebPage","@id":"https://developers.cloudflare.com/r2-sql/platform/#page","headline":"Platform · R2 SQL docs","description":"R2 SQL platform details including pricing and configuration.","url":"https://developers.cloudflare.com/r2-sql/platform/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/platform/","name":"Platform"}}]}
```

---

---
title: Pricing
description: R2 SQL pricing based on data scanned, with included usage details and billing examples.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Pricing

R2 SQL charges based on a single dimension:

* **Data scanned**: The volume of compressed data read from R2 to execute your query.

R2 SQL pricing is additive to standard [R2 storage and operations](https://developers.cloudflare.com/r2/pricing/) and [R2 Data Catalog](https://developers.cloudflare.com/r2/data-catalog/platform/pricing/) charges. When the query engine reads files, those requests count as R2 Class B (read) operations. R2 does not charge for egress, so there is no additional data transfer cost.

All included usage is on a monthly basis.

## R2 SQL pricing

|              | Pricing                   |
| ------------ | ------------------------- |
| Included     | 10 GB / month             |
| Data scanned | $0.0025 / GB ($2.50 / TB) |

### What counts as data scanned

Data scanned is the compressed bytes read from R2 object storage to answer your query. This matches what you see in your R2 bucket — if a Parquet file is 100 MB on disk, scanning that file bills for 100 MB.

* **Minimum per query**: 10 MB. If a query scans less than 10 MB, you are billed for 10 MB.
* **Failed queries**: Queries that fail due to a system error or syntax error caught before execution are not charged. Queries that fail mid-execution due to a runtime error are also not charged.
* **Metadata-only operations**: Operations such as `EXPLAIN`, `SHOW`, and `DESCRIBE` do not scan data and are free. Standard R2 and R2 Data Catalog request charges still apply.

## Billing examples

### Example 1: Ad-hoc analytics on 500 GB of Parquet data

A user stores 500 GB of Parquet data in R2 Data Catalog and runs queries that scan a total of 50 GB of compressed data during the month.

| Dimension             | Usage        | Included    | Billable     | Cost      |
| --------------------- | ------------ | ----------- | ------------ | --------- |
| R2 storage            | 500 GB-month | 10 GB-month | 490 GB-month | $7.35     |
| R2 SQL (data scanned) | 50 GB        | 10 GB       | 40 GB        | $0.10     |
| **Total**             |              |             |              | **$7.45** |

### Example 2: Heavy query workload on 10 TB dataset

A data team stores 10 TB of compressed Parquet/Iceberg data and scans 50 TB of data per month across their queries. The team also makes 2 million catalog operations with compaction processing 500 GB.

| Dimension                         | Usage           | Included    | Billable       | Cost        |
| --------------------------------- | --------------- | ----------- | -------------- | ----------- |
| R2 storage                        | 10,000 GB-month | 10 GB-month | 9,990 GB-month | $149.85     |
| R2 SQL (data scanned)             | 50,000 GB       | 10 GB       | 49,990 GB      | $124.98     |
| R2 Data Catalog operations        | 2,000,000       | 1,000,000   | 1,000,000      | $9.00       |
| R2 Data Catalog compaction (data) | 500 GB          | 10 GB       | 490 GB         | $2.45       |
| **Total**                         |                 |             |                | **$286.28** |

## Frequently asked questions

### Is there a minimum billing increment per query?

Yes. Each query is billed for a minimum of 10 MB of data scanned. This covers the overhead of initializing the query engine.

### Does data scanned include R2 egress fees?

No. R2 does not charge for egress. The query engine runs within the Cloudflare network adjacent to R2 storage, so there are no data transfer costs.

## Cloudflare billing policy

To learn more about how usage is billed, refer to [Cloudflare Billing Policy](https://developers.cloudflare.com/billing/understand/billing-policy/).

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/platform/pricing/#page","headline":"R2 SQL - Pricing · R2 SQL docs","description":"R2 SQL pricing based on data scanned, with included usage details and billing examples.","url":"https://developers.cloudflare.com/r2-sql/platform/pricing/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-06-25","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/platform/","name":"Platform"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/platform/pricing/","name":"Pricing"}}]}
```

---

---
title: Reference
description: R2 SQL reference documentation for limitations, best practices, and Wrangler commands.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Reference

```json
{"@context":"https://schema.org","@type":"WebPage","@id":"https://developers.cloudflare.com/r2-sql/reference/#page","headline":"Reference · R2 SQL docs","description":"R2 SQL reference documentation for limitations, best practices, and Wrangler commands.","url":"https://developers.cloudflare.com/r2-sql/reference/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/reference/","name":"Reference"}}]}
```

---

---
title: Limitations and best practices
description: Supported features, known limitations, and best practices for R2 SQL queries.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Limitations and best practices

Note

R2 SQL is in open beta. Limitations and best practices will change over time.

This page summarizes supported features, limitations, and best practices.

## Quick reference

| Feature                                                 | Supported | Notes                                                                      |
| ------------------------------------------------------- | --------- | -------------------------------------------------------------------------- |
| SELECT, WHERE, GROUP BY, HAVING, ORDER BY, LIMIT        | Yes       |                                                                            |
| Column aliases (AS)                                     | Yes       |                                                                            |
| Expressions (CASE, CAST, LIKE, BETWEEN, IN, arithmetic) | Yes       | Full expression support                                                    |
| EXPLAIN                                                 | Yes       | Returns execution plan as text or JSON                                     |
| Scalar functions                                        | Yes       | Math, string, datetime, regex, crypto, array, map, struct, JSON            |
| Aggregate functions                                     | Yes       | Basic, approximate, statistical, bitwise, boolean, positional              |
| Approximate aggregates                                  | Yes       | approx\_distinct, approx\_median, approx\_percentile\_cont, approx\_top\_k |
| Struct / Array / Map column types                       | Yes       | Bracket notation, get\_field(), array functions, map functions             |
| CTEs (WITH ... AS)                                      | Yes       | Can reference different tables and include JOINs                           |
| JOINs (INNER, LEFT, RIGHT, FULL OUTER, CROSS)           | Yes       | All standard join types                                                    |
| Implicit joins (comma FROM)                             | Yes       |                                                                            |
| Subqueries (IN, NOT IN)                                 | Yes       | NOT IN not supported on nullable columns — use NOT EXISTS instead          |
| Subqueries (EXISTS, NOT EXISTS)                         | Yes       | semi-join and anti-join patterns                                           |
| Scalar subqueries                                       | Yes       | In SELECT, WHERE, HAVING                                                   |
| Derived tables (FROM subqueries)                        | Yes       | Can be nested and joined. LATERAL derived tables not supported.            |
| Self-joins                                              | Yes       | Same table with different aliases                                          |
| Window functions (OVER)                                 | Yes       | Inline OVER (...) only — named WINDOW clause not supported                 |
| QUALIFY                                                 | Yes       | Filter on a window function result                                         |
| SELECT DISTINCT / DISTINCT ON                           | Yes       |                                                                            |
| func(DISTINCT ...)                                      | Yes       | COUNT, SUM, AVG, and other aggregates                                      |
| Set operations (UNION, UNION ALL, INTERSECT, EXCEPT)    | Yes       |                                                                            |
| GROUPING SETS / ROLLUP / CUBE                           | Yes       |                                                                            |
| OFFSET                                                  | No        |                                                                            |
| INSERT / UPDATE / DELETE                                | No        | Read-only                                                                  |
| CREATE / DROP / ALTER                                   | No        | Read-only                                                                  |

For the full SQL syntax, refer to the [SQL reference](https://developers.cloudflare.com/r2-sql/sql-reference/).

---

## Unsupported SQL features

| Feature                                                            | Error                                               |
| ------------------------------------------------------------------ | --------------------------------------------------- |
| OFFSET                                                             | unsupported feature: OFFSET clause is not supported |
| Named WINDOW clause                                                | unsupported feature: WINDOW clause is not supported |
| INSERT / UPDATE / DELETE                                           | only read-only queries are allowed                  |
| CREATE / DROP / ALTER                                              | only read-only queries are allowed                  |
| UNNEST / PIVOT / UNPIVOT                                           | Not supported                                       |
| Wildcard modifiers (ILIKE, EXCLUDE, EXCEPT, REPLACE, RENAME on \*) | Not supported                                       |
| Nested (parenthesized) joins                                       | Not supported                                       |
| LATERAL derived tables / LATERAL VIEW                              | Not supported                                       |
| PERCENTILE\_DISC                                                   | Not supported — use PERCENTILE\_CONT                |

---

## Unsupported expression patterns

| Pattern                             | Alternative                                       |
| ----------------------------------- | ------------------------------------------------- |
| NOT IN subquery on nullable columns | Use NOT EXISTS with a correlated subquery instead |

Exact aggregates such as `COUNT(DISTINCT ...)`, `MEDIAN`, `PERCENTILE_CONT`, `ARRAY_AGG`, and `STRING_AGG` are supported. On large datasets, prefer the approximate alternatives (`approx_distinct`, `approx_median`, `approx_percentile_cont`) for lower memory and compute. Refer to [Aggregate functions](https://developers.cloudflare.com/r2-sql/sql-reference/aggregate-functions/).

---

## Runtime constraints

| Constraint                           | Details                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       |
| ------------------------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| Resource-intensive queries           | During open beta, queries that require high memory or compute may time out. This includes multi-way joins (three or more large tables), COUNT(DISTINCT) and other func(DISTINCT ...) across joins or high-cardinality columns, ARRAY\_AGG / STRING\_AGG, set operations that deduplicate large inputs, window functions over large partitions, and large sorts or high-cardinality GROUP BY. Add WHERE filters and LIMIT, and prefer approx\_\* aggregates to reduce the chance of a timeout. |
| Budget-gated functions               | MEDIAN, PERCENTILE\_CONT, ARRAY\_AGG, STRING\_AGG, NTH\_VALUE used as an aggregate, any aggregate with DISTINCT, and window functions (including those used through QUALIFY) are budget-gated up front. R2 SQL estimates the memory required before running the query and rejects it with a 400 error if too much data would be scanned. Add a GROUP BY or WHERE filters to reduce the rows processed.                                                                                        |
| Multi-table queries                  | JOINs, subqueries (IN, EXISTS, scalar, derived tables), and multi-table CTEs are supported. Performance depends on intermediate result size; use WHERE filters to manage join selectivity.                                                                                                                                                                                                                                                                                                    |
| Partitioned and unpartitioned tables | Both partitioned and unpartitioned Iceberg tables are supported.                                                                                                                                                                                                                                                                                                                                                                                                                              |
| Parquet format only                  | No CSV, JSON, or other formats.                                                                                                                                                                                                                                                                                                                                                                                                                                                               |
| Read-only                            | R2 SQL is a query engine, not a database. No writes.                                                                                                                                                                                                                                                                                                                                                                                                                                          |
| now() / current\_time() precision    | Quantized to 10ms boundaries and forced to UTC.                                                                                                                                                                                                                                                                                                                                                                                                                                               |

---

## Common error codes

| Code  | Meaning                                                            |
| ----- | ------------------------------------------------------------------ |
| 40003 | Invalid SQL syntax                                                 |
| 40004 | Invalid query (unsupported feature, unknown column, type mismatch) |
| 80001 | Edge service connection failure (retry)                            |

---

## Best practices

1. Include time-range filters in `WHERE` to limit data scanned.
2. Use specific column names instead of `SELECT *` for better performance.
3. Use `LIMIT` to control result set size.
4. Use approximate aggregation functions (`approx_distinct`, `approx_median`, `approx_percentile_cont`) instead of exact alternatives on large datasets.
5. Enable compaction in R2 Data Catalog to reduce the number of files scanned per query.
6. Use `EXPLAIN` to inspect the execution plan and verify predicate pushdown.
7. Use `WHERE` filters with multi-way joins to reduce intermediate result sizes. Joining three or more large tables without filters can exceed resource limits.
8. Join large fact tables through dimension tables rather than directly joining two large fact tables. For example, join `http_requests` to `firewall_events` through a shared `zones` dimension rather than cross-joining both fact tables.
9. Be cautious with `COUNT(DISTINCT)` across multi-way joins. This combination can produce very large intermediate results. Consider using `approx_distinct()` or breaking the query into smaller steps.
10. Use explicit `JOIN` syntax instead of implicit joins (comma-separated `FROM`) for readability and to ensure the optimizer can choose optimal join ordering.

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/reference/limitations-best-practices/#page","headline":"Limitations and best practices · R2 SQL docs","description":"Supported features, known limitations, and best practices for R2 SQL queries.","url":"https://developers.cloudflare.com/r2-sql/reference/limitations-best-practices/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-06-22","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"},"keywords":["SQL"]}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/reference/","name":"Reference"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/reference/limitations-best-practices/","name":"Limitations and best practices"}}]}
```

---

---
title: Wrangler commands
description: Wrangler CLI commands for querying data with R2 SQL.
image: https://developers.cloudflare.com/dev-products-preview.png
---

> Documentation Index  
> Fetch the complete documentation index at: https://developers.cloudflare.com/r2-sql/llms.txt  
> Use this file to discover all available pages before exploring further. 

[Skip to content](#%5Ftop) 

# Wrangler commands

Note

R2 SQL is currently in open beta. Report R2 SQL bugs in [GitHub ↗](https://github.com/cloudflare/workers-sdk/issues/new/choose). R2 SQL expects there to be a [WRANGLER\_R2\_SQL\_AUTH\_TOKEN](https://developers.cloudflare.com/r2-sql/query-data/#authentication) environment variable to be set.

### `r2 sql query`

Execute SQL query against R2 Data Catalog

* [  npm ](#tab-panel-10097)
* [  pnpm ](#tab-panel-10098)
* [  yarn ](#tab-panel-10099)

Terminal window

```
npx wrangler r2 sql query [WAREHOUSE] [QUERY]
```

Terminal window

```
pnpm wrangler r2 sql query [WAREHOUSE] [QUERY]
```

Terminal window

```
yarn wrangler r2 sql query [WAREHOUSE] [QUERY]
```

* `[WAREHOUSE]` ` string ` required  
R2 Data Catalog warehouse name
* `[QUERY]` ` string ` required  
The SQL query to execute

Global flags

* `--v` ` boolean ` alias: --version  
Show version number
* `--cwd` ` string `  
Run as if Wrangler was started in the specified directory instead of the current working directory
* `--config` ` string ` alias: --c  
Path to Wrangler configuration file
* `--env` ` string ` alias: --e  
Environment to use for operations, and for selecting .env and .dev.vars files
* `--env-file` ` string `  
Path to an .env file to load - can be specified multiple times - values from earlier files are overridden by values in later files
* `--experimental-provision` ` boolean ` aliases: --x-provision default: true  
Experimental: Enable automatic resource provisioning
* `--experimental-auto-create` ` boolean ` alias: --x-auto-create default: true  
Automatically provision draft bindings with new resources
* `--install-skills` ` boolean ` default: false  
Install Cloudflare agents skills, if not already present, without asking the user for confirmation

```json
{"@context":"https://schema.org","@type":"TechArticle","@id":"https://developers.cloudflare.com/r2-sql/reference/wrangler-commands/#page","headline":"Wrangler commands · R2 SQL docs","description":"Wrangler CLI commands for querying data with R2 SQL.","url":"https://developers.cloudflare.com/r2-sql/reference/wrangler-commands/","inLanguage":"en","image":"https://developers.cloudflare.com/dev-products-preview.png","dateModified":"2026-04-21","publisher":{"@type":"Organization","name":"Cloudflare","url":"https://www.cloudflare.com/"},"isPartOf":{"@type":"WebSite","@id":"https://developers.cloudflare.com/#website","name":"Cloudflare Docs","url":"https://developers.cloudflare.com/"}}
{"@context":"https://schema.org","@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"item":{"@id":"/directory/","name":"Directory"}},{"@type":"ListItem","position":2,"item":{"@id":"/r2-sql/","name":"R2 SQL"}},{"@type":"ListItem","position":3,"item":{"@id":"/r2-sql/reference/","name":"Reference"}},{"@type":"ListItem","position":4,"item":{"@id":"/r2-sql/reference/wrangler-commands/","name":"Wrangler commands"}}]}
```
