Skip to content

[MINOR] Reject traversal segments in note and folder paths#5227

Open
jongyoul wants to merge 3 commits intoapache:masterfrom
jongyoul:ZEPPELIN-fs-notebook-path
Open

[MINOR] Reject traversal segments in note and folder paths#5227
jongyoul wants to merge 3 commits intoapache:masterfrom
jongyoul:ZEPPELIN-fs-notebook-path

Conversation

@jongyoul
Copy link
Copy Markdown
Member

@jongyoul jongyoul commented May 6, 2026

What is this PR for?

NotebookRepo.buildNoteFileName composes a filesystem path or object-store key from a user-supplied note path. The previous implementation required only a leading / but otherwise concatenated the value verbatim, so a value such as /../etc/foo would compose to a location outside the configured notebook root for backends that perform a raw filesystem operation (FileSystemNotebookRepo) or build an object key directly from the string (S3 / Azure / GCS).

This PR centralises a small validation helper at the NotebookRepo interface level so every backend gets the same defence, fixes one folder-level path that bypassed buildNoteFileName, and adds the missing normalizeNotePath call in NotebookService.renameNote.

What type of PR is it?

Improvement

Todos

  • Add NotebookRepo.rejectTraversalSegments (URL-decoded, recursive, capped at 5 layers).
  • Call rejectTraversalSegments from the default buildNoteFileName, so every implementation is covered.
  • Apply it to FileSystemNotebookRepo's folder-level move / remove, which build the path without going through buildNoteFileName.
  • Add the missing normalizeNotePath in NotebookService.renameNote to match createNote, cloneNote, and moveFolder.
  • NotebookRepoPathValidationTest (27): .., ., URL-encoded variants (%2e%2e, %2E%2E), double-encoded variants, and an excessive-encoding-layer payload. Realistic note names including Korean characters and ... inside a segment are accepted.

What is the Jira issue?

N/A — [MINOR] change.

How should this be tested?

./mvnw install -pl zeppelin-server -am -DskipTests
./mvnw test -pl zeppelin-server -Dtest='NotebookRepoPathValidationTest,NotebookServiceTest' -DfailIfNoTests=false
./mvnw test -pl zeppelin-plugins/notebookrepo/filesystem

All test sets pass locally.

Questions

  • Does the license files need to update? No
  • Is there breaking changes for older versions? No — existing valid note paths render to byte-identical filenames; only paths containing literal traversal segments now produce a clear IOException instead of silently composing to an out-of-root location.
  • Does this needs documentation? No

jongyoul and others added 2 commits May 6, 2026 20:57
NotebookRepo.buildNoteFileName composes a filesystem path or object-store
key from a user-supplied note path. The previous implementation only
required a leading "/" but otherwise concatenated the value verbatim, so
paths such as "/../etc/foo" would compose to a location outside the
configured notebook root for backends that perform a raw filesystem
operation (FileSystemNotebookRepo) or build an object key directly from
the string (S3 / Azure / GCS).

Changes:

- NotebookRepo: add a default rejectTraversalSegments helper that
  URL-decodes the path repeatedly (capped at five layers) and refuses
  any "." or ".." segment. buildNoteFileName now calls it, so every
  NotebookRepo implementation is hardened in one place.
- FileSystemNotebookRepo: the folder-level move and remove APIs build
  the path directly from the input rather than via buildNoteFileName,
  so apply rejectTraversalSegments to those inputs as well.
- NotebookService.renameNote: normalize the rebuilt note path through
  normalizeNotePath, matching what other rename / move entry points
  already do (createNote, cloneNote, moveFolder).

Tests:

- NotebookRepoPathValidationTest (27): rejects "..", ".", URL-encoded
  variants such as "%2e%2e" and double-encoded variants, and the
  excessive-encoding-layer case. Accepts realistic note names including
  Korean characters and trailing dots inside a segment ("..." stays
  valid, only exact "..", "." segments are rejected).
- Existing FileSystemNotebookRepoTest (2) and NotebookServiceTest (5)
  continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Address reviewer feedback (simplify pass) on the previous commit:

- Move rejectTraversalSegments and decodeRepeatedly out of NotebookRepo
  default methods into a new final NotebookPathValidator class. Default
  methods on a public interface can be overridden, which would let an
  implementation accidentally — or intentionally — bypass the check; a
  static helper on a final class makes that impossible.
- Drop the duplicate decodeRepeatedly that lived as a private static in
  NotebookService and route normalizeNotePath through the same helper.
- Cache the segment-split Pattern as a static final field so the regex
  is compiled once instead of on every save / get / move call.
- Drop the InMemoryStub boilerplate from the test and exercise the
  utility class directly; the stub existed only to instantiate an
  interface that no longer carries the helpers.
- Keep the existing "Exceeded maximum decode attempts" error message so
  existing assertion tests in NotebookServiceTest continue to pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@jongyoul jongyoul marked this pull request as ready for review May 9, 2026 11:35
Copilot AI review requested due to automatic review settings May 9, 2026 11:36
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR hardens Zeppelin’s notebook path handling against path traversal by rejecting .. / . path segments (including URL-encoded forms) when composing note storage paths/keys, and aligns service-layer rename behavior with existing normalization.

Changes:

  • Introduces NotebookPathValidator with shared helpers to repeatedly URL-decode note paths (capped) and reject traversal segments.
  • Applies traversal-segment rejection in NotebookRepo.buildNoteFileName(...) so all backends using it get consistent protection.
  • Fixes missed normalization in NotebookService.renameNote(...) and adds traversal checks for folder move/remove in FileSystemNotebookRepo.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
zeppelin-server/src/main/java/org/apache/zeppelin/notebook/repo/NotebookPathValidator.java Adds centralized URL-decoding and traversal-segment rejection utilities.
zeppelin-server/src/main/java/org/apache/zeppelin/notebook/repo/NotebookRepo.java Enforces traversal-segment rejection in the default note filename/key builder.
zeppelin-plugins/notebookrepo/filesystem/src/main/java/org/apache/zeppelin/notebook/repo/FileSystemNotebookRepo.java Validates folder paths for traversal segments before folder move/remove operations.
zeppelin-server/src/main/java/org/apache/zeppelin/service/NotebookService.java Uses the shared decoder and normalizes rename paths before moving notes.
zeppelin-server/src/test/java/org/apache/zeppelin/notebook/repo/NotebookRepoPathValidationTest.java Adds test coverage for traversal rejection and repeated-decoding behavior.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

NotebookPathValidator.rejectTraversalSegments(folderPath);
NotebookPathValidator.rejectTraversalSegments(newFolderPath);
// [ZEPPELIN-4195] newFolderPath parent path maybe not exist
this.fs.tryMkDir(new Path(notebookDir, folderPath.substring(1)).getParent());
Copy link
Copy Markdown
Member Author

@jongyoul jongyoul May 9, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Copilot You are right — the mkDir line targets the wrong parent. However, that line was not introduced by this PR (the PR only added the two rejectTraversalSegments calls just above it; the buggy tryMkDir(folderPath...) predates this change in master). Since this PR is a focused security fix for path traversal, I would prefer to keep its scope tight and address the move(folderPath, newFolderPath, ...) mkdir bug in a separate follow-up.

Filed as ZEPPELIN-6415.

Comment on lines +62 to +75
/**
* Repeatedly URL-decodes {@code encoded} until it stabilises, capped at
* {@value #MAX_DECODE_LAYERS} layers.
*/
public static String decodeRepeatedly(String encoded) throws IOException {
String previous = encoded;
for (int attempt = 0; attempt < MAX_DECODE_LAYERS; attempt++) {
String decoded = URLDecoder.decode(previous, StandardCharsets.UTF_8);
if (decoded.equals(previous)) {
return decoded;
}
previous = decoded;
}
throw new IOException("Exceeded maximum decode attempts. Possible malicious input.");
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Copilot Good catch on both. Fixed in dd53762:

  1. URLDecoder.decode is now wrapped in a try/catch that re-throws IllegalArgumentException as IOException — keeps the declared failure mode consistent.
  2. The loop now runs MAX_DECODE_LAYERS + 1 iterations (pass <= MAX_DECODE_LAYERS), so the constant truly caps the number of accepted encoding layers: detecting stability after N layers needs N decodes plus one confirmation pass. Updated the Javadoc and added two tests — one for the malformed-encoding wrap, one for the exact 5-layer boundary.

…ract

Wrap URLDecoder.decode IllegalArgumentException as IOException so the
declared IOException contract is preserved for malformed percent-encoding,
and run the loop MAX_DECODE_LAYERS + 1 times so the constant truly caps the
number of accepted encoding layers (an extra pass is needed to confirm the
result has stabilised).
Copy link
Copy Markdown
Contributor

@tbonelee tbonelee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, approving. A couple of questions that might be worth a follow-up:

  1. The folder-level move(folderPath, ...) and remove(folderPath, ...) in S3NotebookRepo, OSSNotebookRepo, AzureNotebookRepo, etc. don't go through buildNoteFileName either, so they look like they'd benefit from the same rejectTraversalSegments call you added to FileSystemNotebookRepo. Was leaving them out intentional, or worth a follow-up?
  2. The segment splitter is /+, so backslash variants like /foo/..\bar pass through. I'm not sure how realistic this is, but on Windows + LocalFileSystem I wonder if java.io.File could end up interpreting \ as a separator and resolving outside the notebook dir? Might be worth considering [\\/]+, or it may not be a case we need to worry about.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants