Skip to content

Partially fix #3160 by properly handling resultsFormat query#5917

Merged
rozenmd merged 10 commits into
cloudflare:mainfrom
kossnocorp:join-squash
May 27, 2024
Merged

Partially fix #3160 by properly handling resultsFormat query#5917
rozenmd merged 10 commits into
cloudflare:mainfrom
kossnocorp:join-squash

Conversation

@kossnocorp

@kossnocorp kossnocorp commented May 25, 2024

Copy link
Copy Markdown
Contributor

TL;DR: Cloudflare D1 has a critical bug in Workers SDK reproducible on the Cloudflare D1 Console and when running D1 through Wrangler. The PR fixes the bug and also improves compatibility with workerd.

The problem

The essence of the problem is selecting two columns with similar names:

SELECT
    projects.name,
    organizations.name
FROM
    projects
    JOIN organizations ON projects.organization_id = organizations.id;

...must return in the row containing both columns (tested on PostgreSQL v16.3):

test=# SELECT projects.name, organizations.name
        FROM projects
        JOIN organizations ON projects.organization_id = organizations.id;
    name    |    name    
------------+------------
 Cloudflare | Cloudflare
(1 row)

SQLite (on a database created by Wrangler) gives a similar result:

sqlite> SELECT
    projects.name,
    organizations.name
FROM
    projects
    JOIN organizations ON projects.organization_id = organizations.id;
Daisy Chain|Daisy Chain
Fake Daisy Chain|Fake Org
Mind Control|Daisy Chain
sqlite> 

I also added a test for this behavior to workerd that is passing:

await itShould(
  // ...
  () =>
    DB.prepare(
      `SELECT projects.name, organizations.name
      FROM projects
      JOIN organizations ON projects.organization_id = organizations.id;`
    ).raw(),
  [['Cloudflare', 'Cloudflare']]
)

However, when sending the same request through Workers SDK, I get:

| name       |
| Cloudflare |

As both columns have the same name, name, they got squashed into a single column.

A GitHub user raised the issue a year ago and since then it got fixed in workerd (that my test confirms), but it is still present in Workers SDK.

The problem is also reproducible in Cloudflare D1 Console, where I get the same data:

| name       |
| Cloudflare |

It breaks people's code in unexpected ways and creates an impression of D1 as an unreliable product that might behave differently in production.

In my opinion, it's a critical bug that must be prioritized.

The source

I chased the bug through wranglerworkerdminiflare to these line in database.worker.ts:

const cursor = this.db.prepare(query.sql)(...params);
const results = convertResults(all(cursor));

When running the test:

await db
  .prepare(
	`SELECT ${tableColours}.name, ${tablePalettes}.name FROM ${tableColours} JOIN ${tablePalettes} ON ${tableColours}.id = ${tablePalettes}.colour_id`
  )
  .raw();

...the cursor object passed through all results in the given structure:

[{ name: "Night" }]

You can see that at this point, the information is already lost, and the name columns have been reduced to one property.

After further researching I found that workerd sends the resultsFormat query that determines the format of the results.

It worked before only because workerd handles the case when the API always returns an array of objects either by sheer luck or as a workaround for Workers SDK behavior.

The solution

The PR fixes the problem and adds proper handling of the resultsFormat query.

It makes raw preserve columns with the same name, addressing the original problem.

It also adds handling of NONE which workerd sends with run and exec (hence the change in existing test cases).

Extras

Here's the SQL I used to test the behavior:

CREATE TABLE organizations (id INTEGER PRIMARY KEY, name TEXT);

CREATE TABLE projects (
    id INTEGER PRIMARY KEY,
    name TEXT,
    organization_id INTEGER,
    FOREIGN KEY (organization_id) REFERENCES organizations(id)
);

INSERT INTO
    organizations (id, name)
VALUES
    (1, 'Cloudflare');
    
INSERT INTO
    projects (id, name, organization_id)
VALUES
    (1, 'Cloudflare', 1);
    
SELECT
    projects.name,
    organizations.name
FROM
    projects
    JOIN organizations ON projects.organization_id = organizations.id;

What this PR solves / how to test

The PR fixes #3160.

Author has addressed the following

  • Tests
    • TODO (before merge)
    • Included
    • Not necessary because:
  • E2E Tests CI Job required? (Use "e2e" label or ask maintainer to run separately)
    • I don't know
    • Required / Maybe required
    • Not required because:
  • Changeset (Changeset guidelines)
    • TODO (before merge)
    • Included
    • Not necessary because:
  • Public documentation

@kossnocorp kossnocorp requested a review from a team as a code owner May 25, 2024 07:39
@changeset-bot

changeset-bot Bot commented May 25, 2024

Copy link
Copy Markdown

🦋 Changeset detected

Latest commit: 41624d9

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 4 packages
Name Type
miniflare Minor
@cloudflare/pages-shared Patch
@cloudflare/vitest-pool-workers Patch
wrangler Patch

Not sure what this means? Click here to learn what changesets are.

Click here if you're a maintainer who wants to add another changeset to this PR

@github-actions

github-actions Bot commented May 25, 2024

Copy link
Copy Markdown
Contributor

A wrangler prerelease is available for testing. You can install this latest build in your project with:

npm install --save-dev https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-wrangler-5917

You can reference the automatically updated head of this PR with:

npm install --save-dev https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/prs/5917/npm-package-wrangler-5917

Or you can use npx with this latest build directly:

npx https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-wrangler-5917 dev path/to/script.js
Additional artifacts:
npx https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-create-cloudflare-5917 --no-auto-update
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-cloudflare-kv-asset-handler-5917
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-miniflare-5917
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-cloudflare-pages-shared-5917
npm install https://prerelease-registry.devprod.cloudflare.dev/workers-sdk/runs/9256924792/npm-package-cloudflare-vitest-pool-workers-5917

Note that these links will no longer work once the GitHub Actions artifact expires.


wrangler@3.57.1 includes the following runtime dependencies:

Package Constraint Resolved
miniflare workspace:* 3.20240512.0
workerd 1.20240512.0 1.20240512.0
workerd --version 1.20240512.0 2024-05-12

Please ensure constraints are pinned, and miniflare/workerd minor versions match.

@petebacondarwin petebacondarwin added the product:d1 Relating to Cloudflare D1: https://developers.cloudflare.com/d1/ label May 25, 2024
This PR fixes cloudflare#3160 caused by ignoring the `resultsFormat` query that `workerd` sends.

The commits add proper handling of the query. It makes `raw` preserve columns with the same name by handling `resultsFormat=ROWS_AND_COLUMNS` and removes `results` from `run` as it sends `resultsFormat=NONE` (see https://github.com/cloudflare/workerd/blob/1d89f3b8e9cdcd898ea486656d72d9551e79f4a3/src/cloudflare/internal/d1-api.ts#L295)
@kossnocorp kossnocorp changed the title Add failing test case for #3160 Fix #3160 by properly handling resultsFormat query May 25, 2024
@kossnocorp

Copy link
Copy Markdown
Contributor Author

@geelen @petebacondarwin please take a look again. I found the root cause of the problem and fixed the issue.

I bealive it's a critical bug as it breaks people code in unexpected way and creates an impression of D1 as unreliable product that might behave differently in production.

I'm confident in the fix quality, so I think it won't create much work for you and fix an important problem for customers.

Comment thread packages/miniflare/src/workers/d1/database.worker.ts
This PR fixes cloudflare#3160 caused by ignoring the `resultsFormat` query that `workerd` sends.

The commits add proper handling of the query. It makes `raw` preserve columns with the same name by handling `resultsFormat=ROWS_AND_COLUMNS` and removes `results` from `run` as it sends `resultsFormat=NONE` (see https://github.com/cloudflare/workerd/blob/1d89f3b8e9cdcd898ea486656d72d9551e79f4a3/src/cloudflare/internal/d1-api.ts#L295)
@rozenmd

rozenmd commented May 27, 2024

Copy link
Copy Markdown
Contributor

Sorry, looks like the build is failing - can you run prettier against packages/miniflare/test/plugins/d1/test.ts?

Comment thread packages/miniflare/test/plugins/d1/test.ts Outdated
@rozenmd

rozenmd commented May 27, 2024

Copy link
Copy Markdown
Contributor

nvm found the issue, missing ;

@kossnocorp

Copy link
Copy Markdown
Contributor Author

@rozenmd sorry, I had to turn off automatic formatting as it was changing unrelated files, particularly in workerd. Do you want me try to fix formatting and force push the branch?

@rozenmd

rozenmd commented May 27, 2024

Copy link
Copy Markdown
Contributor

That's alright @kossnocorp - I can fix it up

@kossnocorp

Copy link
Copy Markdown
Contributor Author

@rozenmd thank you, I appreciate it!

@rozenmd

rozenmd commented May 27, 2024

Copy link
Copy Markdown
Contributor

Looked at this a bit more - can you remove this part?

It also adds handling of NONE which workerd sends with run and exec (hence the change in existing test cases).

We initially released this change in workerd and had to rapidly roll-forward a fix in our D1 Worker to prevent breaking compatibility with user Workers that relied on run/exec to return results. We intend to fix this properly via a compatibility_date later.

@kossnocorp

Copy link
Copy Markdown
Contributor Author

@rozenmd sure, I'm on it.

@kossnocorp

Copy link
Copy Markdown
Contributor Author

@rozenmd done!


// Check whether workerd raw test case passes here too
// Note that this test did not pass with the old binding
if (!t.context.bindings["__D1_BETA__DB"]) {

@rozenmd rozenmd May 27, 2024

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test suite is dynamic (yay) and gets run by both versions of the D1 binding.

I had to add this check to ensure that it's fixed only for new bindings (since old versions of wrangler have the old binding, and we have no control over that code)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fascinating!

Comment thread .changeset/little-jars-train.md Outdated
Comment thread .changeset/little-jars-train.md Outdated
Comment thread .changeset/little-jars-train.md Outdated
@rozenmd rozenmd requested a review from CarmenPopoviciu May 27, 2024 14:45
Comment thread .changeset/little-jars-train.md Outdated
Comment thread .changeset/little-jars-train.md Outdated
@rozenmd rozenmd changed the title Fix #3160 by properly handling resultsFormat query Partially fix #3160 by properly handling resultsFormat query May 27, 2024
@rozenmd rozenmd merged commit 64ccdd6 into cloudflare:main May 27, 2024
@kossnocorp kossnocorp deleted the join-squash branch May 27, 2024 20:08
@kossnocorp

Copy link
Copy Markdown
Contributor Author

@rozenmd thank you for helping me to land this PR into the main branch 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

product:d1 Relating to Cloudflare D1: https://developers.cloudflare.com/d1/

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

🐛 BUG: when trying to left join 2 tables that have same name columns - second column is not returned

5 participants