Skip to content

websocket: new --ws-text, --ws-binary options, change implicit default to websocket text frames#22093

Closed
zmajeed wants to merge 5 commits into
curl:masterfrom
zmajeedforks:zmajeed_websocket_text_frames
Closed

websocket: new --ws-text, --ws-binary options, change implicit default to websocket text frames#22093
zmajeed wants to merge 5 commits into
curl:masterfrom
zmajeedforks:zmajeed_websocket_text_frames

Conversation

@zmajeed

@zmajeed zmajeed commented Jun 18, 2026

Copy link
Copy Markdown

Two new options --ws-text and --ws-binary have been added to set websocket frames to text and binary respectively. The implicit default has been changed to text frames if neither option is given. Previously the default was binary frames.

Both options are documented. Four tests have been added for various combinations of the new options

I've gone for the simplest change - replacing the hardcoded CURLWS_BINARY in cr_ws_read with the frame type indicated by ws_text_frames in Curl_easy options

All other changes are boilerplate for adding new options. I did add CURLOPT_WS_OPTIONS to tool_setopt.c so I could use my_setopt_bitmask in config2setopts.c

The new options correspond to a new enum in tool_cfgable.h - I prefer this to a boolean flag

enum {
  WS_FRAMES_DEFAULT,            /* implicit default websocket frame type */
  WS_TEXT_FRAMES,               /* send text websocket frames */
  WS_BINARY_FRAMES              /* send binary websocket frames */
} ws_frame_type;

Let me make a case for changing the implicit default for websocket frames to text from binary frames

  1. Text frames are the more natural and obvious default over binary frames as they are given opcode 0x1 in the Websocket standard and binary frames are 0x2
  2. I found the curl websocket design doc https://github.com/curl/curl/blob/master/docs/internals/WEBSOCKET.md - it mentions defaulting to text frames - I don't know why this initial inclination wasn't implemented
  3. Many important websocket-based protocols, particularly those based on JSON-RPC, like Chrome Devtools Protocol exclusively use text frames. These protocols drop binary frames and terminate the connection. WebRTC is another example of a text only protocol.
  4. Further major API gateways like Amazon AWS do not allow binary frames to pass through even if the evneutal endpoint supports binary frames. These gateways also terminate the connection on receiving a binary frame
  5. The only examples I can find of protocols that only use binary frames are legacy protocols like VNC/RBF or MQTT that use websocket as the transport in a subprotocol. These are pretty rare.
  6. Many websocket-based protocols that use binary frames also support text frames for some messages. WAMP is an example of a mixed-frame protocol.
  7. This being the case for well-known protocols, it is very unlikely anyone has used curl websocket support in a real-world scenario so far - I just don't see how it could work in a production environment without a way to send text frames. The feature is also relatively new out of its experimental designation.
  8. The standard commandline tool for websocket is websocat. I can imagine some websocat users trying out curl and migrating over time - it would be nice to provide the same default so users can try a simple curl websocket message and not be disappointed with the behavior

Given all of the above, I feel it's better to go with the better default now even if it means breaking compatibility.

Fixes #21997

New --ws-text option sets text frames as default websocket frame type.
Similarly --ws-binary sets the default to binary frames.
The default frame type is text if neither option is given.
The last option wins if both --ws-text and --ws-binary are given.

Reported-by: Zartaj Majeed
Fixes curl#21997
Comment thread tests/data/test2725 Outdated
Comment thread tests/data/Makefile.am
Comment thread include/curl/websockets.h Outdated
Comment thread lib/urldata.h
Comment on lines +1203 to +1204
BIT(ws_text_frames); /* true if default is text frames */
/* false if default is binary frames */

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
BIT(ws_text_frames); /* true if default is text frames */
/* false if default is binary frames */
BIT(ws_text_frames);

being a bit and the name of the variable explains this already
IMO. Seems overkill to me to reiterate this in two comments.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While it's true that a client frame is either text or binary, I just don't find the naming to patently suggest that. I like the comment because it clearly states what !ws_text_frames means and I'll never need to go look in the code where it's used to figure out that it is indeed binary frames.

@vszakats vszakats Jun 19, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a non-text frame can be any other than binary, then
perhaps this should not be a bit, but an enum/int. Short
of that, I still see these comments adding no information.

@zmajeed zmajeed Jun 19, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment minimally and plainly makes it clear ws_text_frames false means binary frames are default - something the name and type of the variable itself doesn't tell you unless you already know its background -- think as a developer looking at this variable and websocket code for the first time - would you resent or appreciate the contextual info?

There's also the benefit of getting a hit here for simple grep and search for "binary frames" or "binary"

- --ws-text -o storage $URL
---

# `--ws-text`

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since binary is the current default, and this PR aims to introduce
text mode over that, and because the two are mutually exclusive
and reverse of each other, would it make sense to introduce just
one new option: --ws-text and support the --no-ws-text
as its reverse, like its done with other existing boolean options?

@zmajeed zmajeed Jun 18, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This point was raised in the issue too. I really prefer obvious option names to something implied or that relies on pre-knowledge. To me --no-ws-text doesn't obviously mean binary frames. Sure it could be documented that way. But if I put myself in the shoes of a casual user or even a user who wants to use curl for websocket without knowing much or anything about the protocol specs. Seeing just --ws-text and nothing else won't tell me about the possibility of binary frames - also this thing with --no- prefix might be well-known to you and regular curl users but I bet a lot of people don't know about --no- or don't know which options allow it.

I brought up websocat earlier - I think it's useful to look at what websocat has done - it's a popular tool and despite being made just for websocket and nothing else, it still gives separate text and binary options.

I think the bottom line of this debate is I don't see the names as boolean opposites - even though I agree they are mutually exclusive in effect - it's just the plain upfront affirmative naming that works for me compared to a negated option name that implies the other name

@vszakats vszakats Jun 19, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be any other format in the future? If it can be I could see
a point in spelling out the exact format in the option (instead
of rolling a separate option for each), but in such case perhaps
it should be an argument, such as --ws=binary (or similar).
This may even be more future proof to do now, just in case
there will be more format variations in the future.

(I don't believe that looking at naming conventions used
in other tools help in general. Names may be heavily dependent
on context and other existing conventions in each specifc tool.)

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The two separate options is the most user-friendly choice

A frame-type or mode option is not as user-friendly because it's longer, involves values, requires an extra docs and mental lookup to know what the values are and their syntax

Similarly for text + no-text, just not as user-friendly and ergonomic, because it hides the word binary and requires extra knowledge of curl options syntax

The naming convention is straight from the spec - the choice of separate options is just ergonomics and being considerate of users

I mention websocat because it's a good tool with good options that has held up over time - its choices reflect its users experience with doing commandline websocket messages - it's not particularly about looking at another tool but about looking at what the people using that tool to solve the same problem have come to like or dislike

Comment thread src/config2setopts.c Outdated
@testclutch

Copy link
Copy Markdown

Analysis of PR #22093 at f1375373:

Test 1139 failed, which has NOT been flaky recently, so there could be a real issue in this PR. Note that this test has failed in 156 different CI jobs (the link just goes to one of them).

Test 1119 failed, which has NOT been flaky recently, so there could be a real issue in this PR. Note that this test has failed in 144 different CI jobs (the link just goes to one of them).

Test 971 failed, which has NOT been flaky recently, so there could be a real issue in this PR. Note that this test has failed in 213 different CI jobs (the link just goes to one of them).

Test 1478 failed, which has NOT been flaky recently, so there could be a real issue in this PR. Note that this test has failed in 137 different CI jobs (the link just goes to one of them).

Test 2727 failed, which has NOT been flaky recently, so there could be a real issue in this PR.

Test 2724 failed, which has NOT been flaky recently, so there could be a real issue in this PR.

Test 2725 failed, which has NOT been flaky recently, so there could be a real issue in this PR.

There are more failures, but that's enough from Gha.

Generated by Testclutch

@zmajeed

zmajeed commented Jun 18, 2026

Copy link
Copy Markdown
Author

There's an awful lot of CI failures that I find hard to believe have anything to do with my changes - is testclutch bot to be believed when it says none of these tests are flaky?

@dfandrich

Copy link
Copy Markdown
Contributor

None of those failing tests are found in the recent test failures list so it's very likely they're related to this PR, especially since most of them are causing failures in multiple CI jobs.

Comment thread src/config2setopts.c Outdated
#endif
}
my_setopt_long(curl, CURLOPT_UPLOAD_FLAGS, config->upload_flags);
#ifndef CURL_DISABLE_WEBSOCKETS

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We try to keep protocol disable/enable limited to libcurl only, so that the tool works with libcurl builds independently of those. IOW: the tool always knows about WebSocket.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

Comment thread docs/cmdline-opts/ws-binary.md Outdated

# `--ws-binary`

Tell curl to send websocket binary frames by default. This option and the related `--ws-text` option determine the type of frame sent by curl to a websocket server. If neither option is given, the default is to send text frames. If both options are used, the last one wins.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the official casing of the protocol name:

Suggested change
Tell curl to send websocket binary frames by default. This option and the related `--ws-text` option determine the type of frame sent by curl to a websocket server. If neither option is given, the default is to send text frames. If both options are used, the last one wins.
Tell curl to send WebSocket binary frames by default. This option and the related `--ws-text` option determine the type of frame sent by curl to a WebSocket server. If neither option is given, the default is to send text frames. If both options are used, the last one wins.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense

@zmajeed zmajeed Jun 18, 2026

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should websocket options be a new websocket help category? Went ahead and made a new category

@bagder bagder added the feature-window A merge of this requires an open feature window label Jun 18, 2026
@zmajeed

zmajeed commented Jun 18, 2026

Copy link
Copy Markdown
Author

ok - I see the same thing failing in many checks - something about symbols in sync, manpages - what is this and how do I fix it?

@bagder

bagder commented Jun 18, 2026

Copy link
Copy Markdown
Member

I propose you run them one by one and check tests/log/stderr[num] on failure to get the details. The four below are for documentation and source files not all being in sync,

FAIL 971: 'Verify that options-in-versions and docs/cmdline-opts are in sync' source analysis, options-in-versions
FAIL 1119: 'Verify that symbols-in-versions and headers are in sync' source analysis, symbols-in-versions
FAIL 1139: 'Verify that all libcurl options have man pages' source analysis, symbols-in-versions, documentation, --manual
FAIL 1478: 'src/tool_listhelp.c is in sync with docs/cmdline-opts' source analysis, documentation, --help

@zmajeed

zmajeed commented Jun 18, 2026

Copy link
Copy Markdown
Author

this comment in src/tool_help.h - bitmask generated with make -C docs/cmdline-opts listcats

is there a way to make it work for out-of-tree builds? Or do I hack the makefile?

@zmajeed

zmajeed commented Jun 18, 2026

Copy link
Copy Markdown
Author

nvm, looks like I have to run managen manually for each case

@zmajeed

zmajeed commented Jun 18, 2026

Copy link
Copy Markdown
Author

similar to difficulty of running runtests.pl directly for out-of-tree builds - make test works but sets -s which cannot be overridden with -v - a bug in my opinion

@zmajeed

zmajeed commented Jun 19, 2026

Copy link
Copy Markdown
Author

Not getting this function size check for config2setopts - I added 4 simple lines to an already long function - is it simply the straw the broke its back?

@zmajeed

zmajeed commented Jun 19, 2026

Copy link
Copy Markdown
Author

Waiting on some advice on what to do with config2setopts length - should the function be broken into smaller pieces for each group of options so it's not teetering on triggering this failure every time - or are my 4 lines really that complicated that I have to find an alternative

@bagder

bagder commented Jun 23, 2026

Copy link
Copy Markdown
Member

Maybe we can use CURLOPT_TRANSFERTEXT to signal using text (UTF-8) instead of binary? It is basically the same functionality but for FTP, so not even a stretch.

Similarly it would probably make sense to reuse -B / --use-ascii for the command line for this, even if ascii should rather be text...

@zmajeed

zmajeed commented Jun 23, 2026

Copy link
Copy Markdown
Author

I think my perspective is of a new user who's looking at curl for websocket features - I'd prefer to not dig through legacy options that might or might not be useful for websocket - old option names won't be meaningful when read from websocket commandlines - yes --use-ascii could mean text frames with UTF-8 but why insist when --ws-text is so clear?

Is it that curl is never expected to support websocket options? If so then I could understand the opposition to the new options - otherwise why not give websocket a good first-class citizen start?

@bagder

bagder commented Jun 23, 2026

Copy link
Copy Markdown
Member

why insist when --ws-text is so clear?

  • it does not make sense to add protocol specific options for separate protocols if they do basically the same thing. It complicates rather than simplifies.
  • we want to add as few new options as possible for users' sake. It makes it harder to find the right one. The haystack grows.
  • this practice follows an existing pattern of how curl generally handles command line options

Is it that curl is never expected to support websocket options?

This is a WebSocket option we're talking about.

@zmajeed

zmajeed commented Jun 23, 2026

Copy link
Copy Markdown
Author

What I meant is curl is never expected to have a websocket category of options - no --ws- options ever - websocket behavior will be controlled only through legacy options

Since that's the case, I'll close my PR - and let you guys come up with a resolution to the issue

@bagder

bagder commented Jun 28, 2026

Copy link
Copy Markdown
Member

is curl is never expected to have a websocket category of options - no --ws- options ever

ever is a long time. It would probably be foolish of us to make absolute statements of what kind of options we might add in a future. If we feel we need such options and someone implements them, I can imagine we add some.

But our job is to always make sure we first try to avoid adding new options.

@zmajeed

zmajeed commented Jun 28, 2026

Copy link
Copy Markdown
Author

No worries - I'm happy with websocat - I'll close the PR now - should I close the issue too?

@zmajeed

zmajeed commented Jun 28, 2026

Copy link
Copy Markdown
Author

closing, no cigar

@zmajeed zmajeed closed this Jun 28, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cmdline tool feature-window A merge of this requires an open feature window libcurl API tests

Development

Successfully merging this pull request may close these issues.

curl websocket doesn't work with Chrome Devtools Protocol

5 participants