[spark] Add union read for lake-enabled log tables by fresh-borzoni · Pull Request #2956 · apache/fluss

fresh-borzoni · 2026-03-29T23:58:26Z

Summary

Adds batch read for lake-enabled log tables. When a table has datalake enabled, reads combine lake storage (Paimon/Iceberg) with the Fluss log tail. Lake and log are planned as separate Spark partitions: lake tasks read from lake storage without Fluss connections, and log tail tasks reuse the existing reader. Falls back to pure log reads when no snapshot exists. Only enabled in FULL startup mode.

Tests cover both Paimon and Iceberg.
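
For readers skimming the summary, the planning step can be pictured roughly as below. This is only a sketch under assumptions, not the code in this PR: `UnionReadPartition`, `LakePartition`, `LogTailPartition`, `LakeSnapshot`, and `UnionReadPlanner` are hypothetical names, and the idea that the log tail starts at a per-bucket offset recorded with the lake snapshot is an assumption made for illustration.

```scala
// Hypothetical model of union-read planning; none of these names come from the PR.
sealed trait UnionReadPartition

// Read directly from lake storage (Paimon/Iceberg); needs no Fluss connection.
final case class LakePartition(splitId: String) extends UnionReadPartition

// Read one Fluss bucket from startOffset (inclusive) to stopOffset (exclusive).
final case class LogTailPartition(bucket: Int, startOffset: Long, stopOffset: Long)
    extends UnionReadPartition

// Assumed snapshot shape: lake splits plus, per bucket, the log offset
// up to which data is already present in the lake.
final case class LakeSnapshot(splits: Seq[String], tieredOffsets: Map[Int, Long])

object UnionReadPlanner {
  val EarliestOffset: Long = 0L

  def plan(
      snapshot: Option[LakeSnapshot],
      latestOffsets: Map[Int, Long]): Seq[UnionReadPartition] = {
    // No snapshot means no lake partitions, i.e. a pure log read.
    val lake: Seq[UnionReadPartition] =
      snapshot.toSeq.flatMap(s => s.splits.map(id => LakePartition(id)))

    val tiered: Map[Int, Long] =
      snapshot.map(_.tieredOffsets).getOrElse(Map.empty[Int, Long])

    // One log-tail partition per bucket that still has records beyond the lake.
    val logTail: Seq[UnionReadPartition] =
      latestOffsets.toSeq.sortBy(_._1).flatMap { case (bucket, latest) =>
        val start = tiered.getOrElse(bucket, EarliestOffset)
        if (start < latest) Some(LogTailPartition(bucket, start, latest)) else None
      }

    lake ++ logTail
  }
}
```

With no snapshot, `UnionReadPlanner.plan(None, latestOffsets)` yields only log partitions starting at the earliest offset, which mirrors the fallback described above. Keeping lake and log as distinct partition types is what would let lake tasks skip opening a Fluss connection.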

Follow-up PRs

PK table lake reads (sort-merge with lake snapshot)
Streaming with lake bootstrap
Filter/partition/limit push-down to lake source
DV support for Paimon

Yohahaha · 2026-04-02T06:14:17Z

@fresh-borzoni thank you for the patch. I created an issue to track it: #2983.

Yohahaha · 2026-04-02T06:17:07Z

PK table lake reads (sort-merge with lake snapshot)

I will add Spark SQL support for union read on PK tables: #2984.

fresh-borzoni · 2026-04-02T09:01:35Z

Thanks @Yohahaha! Just a heads up, this PR already implements batch union read for log tables, so #2983 should be covered once it's merged.

Regarding PK table union read (#2984), I was planning to follow up with that as noted in the PR description.
Happy to collaborate if you're interested, let me know!

Yohahaha · 2026-04-02T10:03:45Z

Thanks @Yohahaha! Just a heads up, this PR already implements batch union read for log tables, so #2983 should be covered once it's merged.

Yeah, you could add "closes #2983" to the PR description so that the corresponding issue is properly linked and closed, like other PRs.

When Fluss releases a new version, the RM can then easily collect the features in that version's scope.

Yohahaha · 2026-04-02T10:08:51Z

Regarding PK table union read (#2984), I was planning to follow up with that as noted in the PR description. Happy to collaborate if you're interested, let me know!

@fresh-borzoni I was planning to implement it over the next two weeks. Do you already have a draft PR?

Yohahaha

left some comments, thank you!

fluss-spark/fluss-spark-common/src/main/scala/org/apache/fluss/spark/read/FlussLakeBatch.scala

...luss-spark-ut/src/test/scala/org/apache/fluss/spark/lake/SparkLakeLogTableReadTestBase.scala

fresh-borzoni · 2026-04-02T12:44:06Z

@Yohahaha Thank you, feel free to take #2984. I have some starting code, but I'd love to collaborate and would appreciate your help here.

fresh-borzoni · 2026-04-02T22:45:57Z

@Yohahaha Ty for the review,
Addressed comments, PTAL 🙏

fresh-borzoni mentioned this pull request Apr 2, 2026

[spark] Batch union read for log table #2983

Open

2 tasks

polyzos mentioned this pull request Apr 2, 2026

[spark] Batch union read for pk table #2984

Open

2 tasks

Yohahaha reviewed Apr 2, 2026


fresh-borzoni added 2 commits April 2, 2026 19:21

[spark] Add union read for lake-enabled log tables

dcc63af

remove iceberg test for now, shading problem

656dfcd

fresh-borzoni force-pushed the spark-union-read branch 2 times, most recently from 37d1a7d to 9682f4e on April 2, 2026 22:25

address comments, improve test infra

281d734

fresh-borzoni force-pushed the spark-union-read branch from 9682f4e to 281d734 on April 2, 2026 22:44

fresh-borzoni requested a review from Yohahaha April 2, 2026 22:46

fresh-borzoni wants to merge 3 commits into apache:main from fresh-borzoni:spark-union-read
