GH-49351: [C++] Check TZDIR environment variable in vendored date library #49353

Open
canassa wants to merge 2 commits into apache:main from canassa:gh-49351-tzdir-support

@canassa canassa commented Feb 20, 2026

Rationale for This Change

The vendored Howard Hinnant date library hardcodes /usr/share/zoneinfo as the timezone database path in discover_tz_dir(). It does not check the TZDIR environment variable, which is the conventional mechanism for overriding this path (it is honored by glibc and the tz reference code, though POSIX itself does not specify it).

This causes timezone operations to fail on non-FHS Linux distributions such as NixOS, where zoneinfo resides under a non-standard path like /nix/store/.../share/zoneinfo.

The upstream library also lacks TZDIR support.


What Changes Are Included in This PR?

  • cpp/src/arrow/vendored/datetime/tz.cpp: Check the TZDIR environment variable in discover_tz_dir() before falling back to platform-specific hardcoded paths. Uses stat() and S_ISDIR() for validation, matching the existing pattern in the function.
  • cpp/src/arrow/vendored/datetime/README.md: Document the patch.
  • cpp/src/arrow/public_api_test.cc: Add a non-Windows test that sets TZDIR and verifies timezone resolution succeeds through Arrow's compute API.
  • python/pyarrow/conftest.py: Respect TZDIR in the emscripten timezone_data test marker.
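
To illustrate the lookup order described in the first bullet, here is a minimal sketch, not the actual patch. The names is_directory and resolve_tz_dir are hypothetical, and falling back silently when TZDIR is unset, empty, or not a directory is an assumption; the real discover_tz_dir() may behave differently in those cases.

```cpp
#include <sys/stat.h>

#include <cstdlib>
#include <string>

// Hypothetical helper: true if `path` exists and is a directory.
// Uses the same stat() + S_ISDIR() pattern the PR description mentions.
static bool is_directory(const std::string& path) {
  struct stat sb;
  return stat(path.c_str(), &sb) == 0 && S_ISDIR(sb.st_mode);
}

// Hypothetical stand-in for the lookup order: TZDIR first, then a
// hardcoded default. Assumption: an unset, empty, or invalid TZDIR
// falls through to the default instead of raising an error.
std::string resolve_tz_dir(const std::string& fallback = "/usr/share/zoneinfo") {
  const char* tzdir = std::getenv("TZDIR");
  if (tzdir != nullptr && *tzdir != '\0' && is_directory(tzdir)) {
    return tzdir;
  }
  return fallback;
}
```

With TZDIR pointing at an existing directory, resolve_tz_dir() returns that directory; otherwise it returns the fallback argument.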

Are These Changes Tested?

Yes. A new Misc.TZDIREnvironmentVariable test sets TZDIR to a valid zoneinfo directory and casts a UTC timestamp to America/New_York, verifying the code path works end-to-end.
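
A test that sets TZDIR generally needs to restore the previous value afterwards so that other tests in the same process are not affected. The sketch below shows one way to do that with a scoped guard on POSIX systems. ScopedEnvVar is a hypothetical name for illustration; this is not the helper the PR uses.

```cpp
#include <stdlib.h>  // setenv/unsetenv (POSIX)

#include <cstdlib>
#include <string>
#include <utility>

// Hypothetical RAII guard: sets an environment variable on construction
// and restores the previous state (value or absence) on destruction.
class ScopedEnvVar {
 public:
  ScopedEnvVar(std::string name, const std::string& value)
      : name_(std::move(name)), had_previous_(false) {
    if (const char* old = std::getenv(name_.c_str())) {
      had_previous_ = true;
      previous_ = old;
    }
    setenv(name_.c_str(), value.c_str(), /*overwrite=*/1);
  }

  ~ScopedEnvVar() {
    if (had_previous_) {
      setenv(name_.c_str(), previous_.c_str(), /*overwrite=*/1);
    } else {
      unsetenv(name_.c_str());
    }
  }

  ScopedEnvVar(const ScopedEnvVar&) = delete;
  ScopedEnvVar& operator=(const ScopedEnvVar&) = delete;

 private:
  std::string name_;
  bool had_previous_;
  std::string previous_;
};
```

A test body would construct the guard (for example with the name "TZDIR" and a valid zoneinfo directory) before running the timezone cast and let it go out of scope at the end of the test.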


Are There Any User-Facing Changes?

Arrow now respects the TZDIR environment variable on non-Windows platforms, enabling timezone operations on systems without /usr/share/zoneinfo.

Fixes #49351

…te library

The vendored Howard Hinnant date library hardcodes /usr/share/zoneinfo
as the timezone database path. This adds a TZDIR check in
discover_tz_dir() before falling back to platform-specific defaults,
consistent with POSIX conventions.

This fixes timezone operations on non-FHS Linux distributions (e.g.
NixOS) where zoneinfo lives under a non-standard path.
@github-actions

⚠️ GitHub issue #49351 has been automatically assigned in GitHub to PR creator.

@rok
Member

rok commented Feb 20, 2026

Thanks for opening this @canassa! Is this an issue you're personally seeing on NixOS?

@canassa
Author

canassa commented Feb 20, 2026

Thanks for opening this @canassa! Is this an issue you're personally seeing on NixOS?

Yes! I noticed this when working on a Python project that uses PyArrow on my NixOS desktop.

I could work around it by creating a symlink, but I figured it would be nice if Arrow supported the environment variable.

@rok
Member

rok commented Feb 20, 2026

Great to hear!
I don't have time to review this right now. I'll take a look next week. One thing right now - we try to avoid changing things in cpp/src/arrow/vendored/ if possible. Could you check if we can solve this without the change in cpp/src/arrow/vendored/datetime/tz.cpp?

@canassa
Author

canassa commented Feb 20, 2026

I don't have time to review this right now. I'll take a look next week. One thing right now - we try to avoid changing things in cpp/src/arrow/vendored/ if possible. Could you check if we can solve this without the change in cpp/src/arrow/vendored/datetime/tz.cpp?

Sure, I will take a look!

@rok
Member

rok commented Feb 20, 2026

The linting issue can be fixed with (source):

pre-commit run --show-diff-on-failure --color=always --all-files cpp

The cpp paginator issues are unrelated (#49347).

@canassa
Author

canassa commented Feb 21, 2026

Thanks, I pushed the linting fixes.

I also investigated other approaches, but the vendored library change seems to be unavoidable. get_tz_dir() is a static singleton: once initialized, the path is fixed for the lifetime of the process, and there's no setter.
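
The static-singleton behavior described here can be sketched as follows. This is an illustration of the general C++ pattern, not the vendored code: discover_dir_once, cached_tz_dir, and the DEMO_TZDIR variable are hypothetical names.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical discovery step: reads an environment variable once and
// falls back to a hardcoded default.
static std::string discover_dir_once() {
  const char* env = std::getenv("DEMO_TZDIR");
  return env != nullptr ? env : "/usr/share/zoneinfo";
}

// Function-local static: initialized on the first call only. Later
// changes to the environment are never observed, and there is no
// setter to override the cached value from outside.
const std::string& cached_tz_dir() {
  static const std::string dir = discover_dir_once();
  return dir;
}
```

Because the value is captured on first use, calling code cannot redirect the lookup afterwards, which is why the check has to live inside the discovery step itself.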
