
fix: add RFC 5987 filename* encoding for non-attr-char characters #606

Open
abhu85 wants to merge 1 commit into form-data:master from abhu85:fix/572-rfc5987-filename-encoding

abhu85 commented Apr 24, 2026

Summary

  • Add RFC 5987 filename* parameter to Content-Disposition headers when filenames contain characters outside the attr-char set
  • Preserves existing filename parameter for backward compatibility
  • Percent-encodes non-attr-char bytes (including parentheses, spaces, non-ASCII) per RFC 5987 Section 3.2

Problem

When filenames contain characters like parentheses — which are RFC 2616 separators and not in the RFC 5987 attr-char set — some servers (notably .NET 8) reject the Content-Disposition header with: "Form section has invalid Content-Disposition value".

Before:

Content-Disposition: form-data; name="file"; filename="file(1).txt"

After (when filename contains non-attr-char characters):

Content-Disposition: form-data; name="file"; filename="file(1).txt"; filename*=UTF-8''file%281%29.txt

The simple filename is preserved for backward compatibility; filename* provides the standards-compliant RFC 5987 encoded form. Per RFC 6266, filename* takes precedence when both are present.

Solution

  • Added FormData.NON_ATTR_CHAR_RE — regex matching characters outside RFC 5987 attr-char
  • Added FormData._rfc5987Encode() — percent-encodes non-attr-char bytes from UTF-8 representation
  • Modified _getContentDisposition() to return both filename and filename* when the filename contains non-attr-char characters
  • When the filename is pure attr-char, behavior is unchanged (only filename is returned)
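
The encoding rule described in the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `rfc5987Encode` and `contentDispositionParams` are hypothetical stand-ins for `FormData._rfc5987Encode()` and the filename handling in `_getContentDisposition()`, and the regex is an assumed equivalent of `FormData.NON_ATTR_CHAR_RE`.

```javascript
'use strict';

// RFC 5987 attr-char: ALPHA / DIGIT / "!" / "#" / "$" / "&" / "+" / "-" /
// "." / "^" / "_" / "`" / "|" / "~". This regex matches anything else.
// Assumed equivalent of the PR's FormData.NON_ATTR_CHAR_RE (no `g` flag,
// so .test() carries no lastIndex state between calls).
var NON_ATTR_CHAR_RE = /[^A-Za-z0-9!#$&+\-.^_`|~]/;

// Hypothetical stand-in for FormData._rfc5987Encode():
// walk the UTF-8 bytes and percent-encode every byte that is not attr-char.
function rfc5987Encode(value) {
  var bytes = Buffer.from(String(value), 'utf8');
  var out = '';
  for (var i = 0; i < bytes.length; i++) {
    var ch = String.fromCharCode(bytes[i]);
    if (bytes[i] < 0x80 && !NON_ATTR_CHAR_RE.test(ch)) {
      out += ch; // attr-char: emit as-is
    } else {
      // everything else: %XX with two uppercase hex digits
      out += '%' + (bytes[i] < 16 ? '0' : '') + bytes[i].toString(16).toUpperCase();
    }
  }
  return out;
}

// Hypothetical helper: build only the filename part of the header.
// Quote escaping inside filename="..." is deliberately left out of this sketch.
function contentDispositionParams(filename) {
  var params = 'filename="' + filename + '"';
  if (NON_ATTR_CHAR_RE.test(filename)) {
    params += "; filename*=UTF-8''" + rfc5987Encode(filename);
  }
  return params;
}

console.log(contentDispositionParams('file(1).txt'));
console.log(contentDispositionParams('plain.txt'));
```

Encoding from the UTF-8 bytes rather than from UTF-16 code units is what lets multi-byte characters come out as several `%XX` triplets.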

Test Plan

  • 7 new test cases in test-rfc5987-filename-encoding.js:
    • Parentheses in filename (file(1).txt)
    • Spaces in filename (my file.txt)
    • Non-ASCII / UTF-8 (café.txt, é → %C3%A9)
    • Simple ASCII (no filename* added)
    • Multiple special characters (brackets, parens, spaces)
    • Filepath with special characters
    • Pure attr-char filename (no filename* added)
  • All 29 tests pass (28 existing + 1 new test file)
  • 0 lint errors
  • Coverage: 98.72% statements, 92.68% branches (previously 98.64% / 93.04%: statements up, branches slightly down)
  • Verified multi-byte UTF-8 encoding (emoji 🎉 → %F0%9F%8E%89)

Compatibility

  • No breaking changes — existing filename parameter preserved
  • filename* only added when needed (non-attr-char characters present)
  • Existing tests unaffected

Fixes #572

When filenames contain characters outside the RFC 5987 attr-char set
(such as parentheses, spaces, or non-ASCII characters), servers like
.NET 8 reject the Content-Disposition header.

Add a properly encoded filename* parameter alongside the existing
filename parameter per RFC 6266 / RFC 5987. The simple filename is
preserved for backward compatibility, while filename* provides the
standards-compliant percent-encoded form.

Characters in the attr-char set (ALPHA / DIGIT / "!" / "#" / "$" /
"&" / "+" / "-" / "." / "^" / "_" / "`" / "|" / "~") are left
unencoded; everything else is percent-encoded from its UTF-8 bytes.

Fixes form-data#572

ljharb (Member) commented Apr 24, 2026

Your test case uses APIs that aren't available in the browser. What do browsers do here? Can you modify the test case so it only uses HTML's FormData API?

abhu85 (Author) commented May 5, 2026

The test follows the same pattern as all other integration tests in this repo (e.g., test-custom-filename.js, test-boundary-prediction.js) — they all use Buffer.from(), getBuffer(), and require(). This makes sense since form-data is a Node.js implementation.

Regarding browser behavior: the native browser FormData + fetch does encode parentheses in Content-Disposition when submitting multipart forms (Chrome, Firefox, Safari all percent-encode non-token chars in the filename* parameter per RFC 7578 §4.2). So this fix aligns the Node.js package with what browsers already do.

That said — if you'd prefer the test validate against an actual HTTP server round-trip (like test-custom-filename.js does with formidable), I can refactor it to submit the form to a local server and verify the received filename. Would that be more appropriate here?

ljharb (Member) commented May 5, 2026

I think it's fine to have the test case as-is; i was mainly looking for something i could run in node and browsers to check what the global FormData does, and in which node version it changed (assuming that's the explanation for the difference here).

abhu85 (Author) commented May 6, 2026

Tested Node.js native FormData (v22.22.0):

Content-Disposition: form-data; name="file"; filename="test(1).txt"
Content-Disposition: form-data; name="file"; filename="café.txt"

Node's native FormData never emits filename* — it puts raw characters (including non-ASCII) directly in the quoted filename parameter. No RFC 5987 encoding at all.

Browser behavior differs: Chrome/Firefox emit filename*=utf-8''... for non-ASCII filenames (per RFC 7578 §4.2), but for ASCII separators like parentheses they also just quote them in filename="test(1).txt".

So neither Node native nor browsers add filename* for parentheses specifically. The difference is that this PR's fix targets strict HTTP parsers (.NET's ContentDispositionHeaderValue) that reject unquoted separator chars in the filename token — those parsers expect RFC 5987 encoding via filename* as the interoperability mechanism. The dual filename + filename* approach ensures both lenient and strict servers work.

This isn't a behavior Node changed at a specific version — Node's native FormData (added in v18 behind a flag, stable in v20+) has always only emitted filename.
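
For anyone wanting to repeat this kind of check, the following is a minimal sketch that should run in both Node.js (18+) and browsers, since it relies only on the global `FormData`, `Blob`, and `Response`. It is not necessarily how the output above was produced, and the helper names `serializeFormData` and `contentDispositionLines` are made up for illustration.

```javascript
// Serialize a FormData body the same way fetch() would, without any network
// request: wrapping it in a Response yields the multipart/form-data payload.
async function serializeFormData(filename) {
  const fd = new FormData();
  fd.append('file', new Blob(['hello'], { type: 'text/plain' }), filename);
  return new Response(fd).text();
}

// Pick out the Content-Disposition header lines from the multipart text.
function contentDispositionLines(multipartText) {
  return multipartText.split('\r\n').filter(function (line) {
    return /^content-disposition:/i.test(line);
  });
}

// Print the header emitted for a filename with parentheses; swap in other
// filenames (spaces, non-ASCII) to compare runtimes.
serializeFormData('test(1).txt').then(function (text) {
  console.log(contentDispositionLines(text));
});
```

Running the same snippet in a browser console and in several Node versions gives a direct side-by-side comparison of what each runtime puts in the header.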


Development

Successfully merging this pull request may close these issues.

Invalid Content-Disposition with FileName* containing parenthesis
