diff --git a/.github/sync.yml b/.github/sync.yml index 43f5a0049f..16be95d8ae 100644 --- a/.github/sync.yml +++ b/.github/sync.yml @@ -20,9 +20,13 @@ apache/fory-site@main: dest: docs/community/DEVELOPMENT.md - source: docs/guide/ dest: docs/guide/ + deleteOrphaned: true - source: docs/specification/ dest: docs/specification/ + deleteOrphaned: true - source: docs/compiler/ dest: docs/compiler/ + deleteOrphaned: true - source: docs/benchmarks/ dest: docs/benchmarks/ + deleteOrphaned: true diff --git a/ci/release.py b/ci/release.py index c2f6eaf278..6d926b50a9 100644 --- a/ci/release.py +++ b/ci/release.py @@ -165,12 +165,7 @@ def bump_version(**kwargs): _update_scala_version, ) elif lang == "kotlin": - _bump_version( - "kotlin", - "pom.xml", - _normalize_java_version(new_version), - _update_kotlin_version, - ) + bump_kotlin_version(_normalize_java_version(new_version)) elif lang == "rust": bump_rust_version(new_version) elif lang == "python": @@ -269,6 +264,16 @@ def bump_rust_version(new_version): ) +def bump_kotlin_version(new_version): + _bump_version("kotlin", "pom.xml", new_version, _update_kotlin_version) + for p in [ + "kotlin/fory-kotlin", + "kotlin/fory-kotlin-ksp", + "kotlin/fory-kotlin-tests", + ]: + _bump_version(p, "pom.xml", new_version, _update_pom_parent_version) + + def bump_cpp_version(new_version): for p in [ "cpp", @@ -371,7 +376,7 @@ def _update_scala_version(lines, v): def _update_kotlin_version(lines, v): v = _normalize_java_version(v) - return _update_pom_version(lines, v, "fory-kotlin") + return _update_pom_version(lines, v, "fory-kotlin-parent") def _update_parent_pom_version(lines, v): @@ -384,6 +389,8 @@ def _update_pom_version(lines, v, prev): if prev in line: target_index = index + 1 break + if target_index == -1: + raise ValueError(f"Could not find POM version marker: {prev}") current_version_line = lines[target_index] # Find the start and end of the version number start = current_version_line.index("<version>") + len("<version>") diff --git 
a/docs/guide/cpp/configuration.md b/docs/guide/cpp/configuration.md index d58bfaaa31..a693f32091 100644 --- a/docs/guide/cpp/configuration.md +++ b/docs/guide/cpp/configuration.md @@ -1,6 +1,6 @@ --- title: Configuration -sidebar_position: 2 +sidebar_position: 4 id: configuration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -170,8 +170,18 @@ auto fory = Fory::builder().xlang(true).build_thread_safe(); // Returns ThreadS | `max_dyn_depth(uint32_t)` | Maximum nesting depth for dynamic types | `5` | | `check_struct_version(bool)` | Enable struct version checking | `false` | +## Security + +Security-related configuration: + +- Register all structs and polymorphic implementations before deserializing untrusted payloads. +- Use `check_struct_version(true)` with `compatible(false)` when exact schema matching is required. +- Keep `max_dyn_depth(...)` as low as your model permits to reject unexpectedly deep polymorphic + graphs. +- Prefer concrete fields over broad polymorphic fields for untrusted input. 
+ ## Related Topics - [Basic Serialization](basic-serialization.md) - Using configured Fory -- [Cross-Language](cross-language.md) - xlang mode details +- [Xlang Serialization](xlang-serialization.md) - xlang mode details - [Type Registration](type-registration.md) - Registering types diff --git a/docs/guide/cpp/custom-serializers.md b/docs/guide/cpp/custom-serializers.md index d59b75b497..093d265d84 100644 --- a/docs/guide/cpp/custom-serializers.md +++ b/docs/guide/cpp/custom-serializers.md @@ -368,4 +368,4 @@ static MyType read_data(ReadContext &ctx) { - [Type Registration](type-registration.md) - Registering serializers - [Basic Serialization](basic-serialization.md) - Using FORY_STRUCT macro - [Schema Evolution](schema-evolution.md) - Compatible mode -- [Cross-Language](cross-language.md) - Cross-language serialization +- [Xlang Serialization](xlang-serialization.md) - Cross-language serialization diff --git a/docs/guide/cpp/index.md b/docs/guide/cpp/index.md index 70acae8f6d..537ef9fd17 100644 --- a/docs/guide/cpp/index.md +++ b/docs/guide/cpp/index.md @@ -26,7 +26,7 @@ The C++ implementation provides high-performance serialization with compile-time ## Why Apache Fory™ C++? - **Fast binary encoding**: Fast serialization and optimized binary protocols -- **Cross-language**: Seamlessly serialize/deserialize data across Java, Python, C++, Go, JavaScript, and Rust +- **Xlang**: Seamlessly serialize/deserialize data across Java, Python, C++, Go, JavaScript, and Rust - **Type-safe**: Compile-time type checking with macro-based struct registration - **Reference tracking**: Automatic tracking of shared and circular references - **Schema evolution**: Compatible mode for independent schema changes @@ -212,7 +212,7 @@ Use xlang mode for cross-language payloads and schemas shared with other Fory ru Use native mode for C++-only traffic. 
Native mode is selected with `.xlang(false)`, uses schema-consistent payloads unless compatible mode is enabled, and keeps C++ object serialization on the C++ runtime path. It is optimized for C++ types and avoids portable xlang type-mapping constraints when the payload never leaves C++. -See [Cross-Language Serialization](cross-language.md) for C++ xlang registration and interoperability rules, and [Configuration](configuration.md) for native-mode builder options. +See [Xlang Serialization](xlang-serialization.md) for C++ xlang registration and interoperability rules, and [Native Serialization](native-serialization.md) for C++-only payloads. ## Thread Safety @@ -262,9 +262,11 @@ std::thread t2([&]() { - [Configuration](configuration.md) - Builder options and modes - [Basic Serialization](basic-serialization.md) - Object graph serialization -- [Cross-Language](cross-language.md) - xlang mode and interoperability +- [Xlang Serialization](xlang-serialization.md) - xlang mode and interoperability +- [Native Serialization](native-serialization.md) - C++-only serialization - [Schema Metadata](schema-metadata.md) - Field-level metadata (nullable, ref tracking) - [Schema Evolution](schema-evolution.md) - Compatible mode and schema changes - [Type Registration](type-registration.md) - Registering types - [Supported Types](supported-types.md) - All supported types +- [Custom Serializers](custom-serializers.md) - Extend serialization behavior - [Row Format](row-format.md) - Zero-copy row-based format diff --git a/docs/guide/cpp/native-serialization.md b/docs/guide/cpp/native-serialization.md new file mode 100644 index 0000000000..0a530278e9 --- /dev/null +++ b/docs/guide/cpp/native-serialization.md @@ -0,0 +1,211 @@ +--- +title: Native Serialization +sidebar_position: 3 +id: native_serialization +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. 
See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +C++ native serialization is the C++-only wire mode selected with `.xlang(false)`. Use it when every +writer and reader is C++ and the payload should follow C++ type behavior instead of the portable +xlang type system. + +Use [Xlang Serialization](xlang-serialization.md), the default C++ mode, when bytes must be read by +Java, Python, Go, Rust, JavaScript, or another non-C++ Fory runtime. + +## When To Use Native Serialization + +Use native serialization when: + +- A payload is produced and consumed only by C++ applications. +- The data model uses C++-specific types such as character types, unsigned-native type IDs, + `std::tuple`, smart pointers, or C++ polymorphic models. +- You want schema-consistent C++ payloads for lockstep services. +- You need compatible schema evolution for C++-only rolling deployments. +- You want to avoid portable xlang type-mapping constraints for a C++ boundary. 
+ +## Create a Native Runtime + +```cpp +#include "fory/serialization/fory.h" +#include +#include +#include + +using namespace fory::serialization; + +struct Order { + int64_t id; + double amount; + + bool operator==(const Order &other) const { + return id == other.id && amount == other.amount; + } +}; +FORY_STRUCT(Order, id, amount); + +int main() { + auto fory = Fory::builder() + .xlang(false) + .build(); + fory.register_struct<Order>(100); + + Order order{1, 42.5}; + auto bytes = fory.serialize(order).value(); + auto decoded = fory.deserialize<Order>(bytes).value(); + assert(order == decoded); +} +``` + +Use one configured `Fory` instance per thread, or build a thread-safe runtime when the same runtime +is shared by multiple threads: + +```cpp +auto fory = Fory::builder() + .xlang(false) + .track_ref(true) + .build_thread_safe(); +``` + +Register types before concurrent serialization starts. + +## Schema Evolution + +Native serialization defaults to schema-consistent mode. Enable compatible mode when C++-only +writer and reader schemas can differ: + +```cpp +auto fory = Fory::builder() + .xlang(false) + .compatible(true) + .build(); +``` + +Compatible mode writes schema metadata so readers can tolerate added, removed, or reordered fields +when field identity remains compatible. See [Schema Evolution](schema-evolution.md). + +## Registration + +Register structs with stable IDs or names before serialization: + +```cpp +fory.register_struct<Order>(100); +fory.register_struct<Order>("example", "Order"); +``` + +Use numeric IDs for compact payloads. Use namespace/type-name registration when independent teams +coordinate type identity by names. + +## C++ Object Surface + +Native serialization owns the C++-specific object surface: + +- Structs and classes described by `FORY_STRUCT`. +- Standard containers such as `std::vector`, `std::map`, `std::unordered_map`, `std::set`, and + `std::unordered_set`. +- `std::optional`, `std::variant`, and tuple-like values. 
+- `std::shared_ptr` and `std::unique_ptr`. +- Character types such as `char`, `char16_t`, and `char32_t`. +- Unsigned integer types with native-mode type IDs. +- Polymorphic serialization registered through the C++ runtime. + +Use [Supported Types](supported-types.md) for the full type surface and xlang mapping notes. + +## References And Smart Pointers + +Native serialization supports smart pointers and reference tracking: + +```cpp +auto fory = Fory::builder() + .xlang(false) + .track_ref(true) + .build(); +``` + +When reference tracking is enabled, shared pointer identity can be preserved and cyclic object +graphs can be represented through supported pointer patterns. Disable reference tracking for +value-shaped data when identity is not part of the model. + +## Native-Only Scalar Shapes + +Some C++ scalar shapes are not portable xlang payloads. Use native serialization when these shapes +must round-trip as C++ values: + +```cpp +auto fory = Fory::builder().xlang(false).build(); + +auto char_bytes = fory.serialize(char32_t{U'A'}).value(); +auto value = fory.deserialize<char32_t>(char_bytes).value(); + +auto unsigned_bytes = fory.serialize(uint64_t{42}).value(); +auto unsigned_value = fory.deserialize<uint64_t>(unsigned_bytes).value(); +``` + +For xlang payloads, use schema metadata and the shared xlang type mapping instead of relying on +C++ native-only type IDs. + +## Performance Guidelines + +- Reuse configured `Fory` instances. +- Use single-threaded `Fory` per thread for the fastest path; use `build_thread_safe()` for shared + concurrent use. +- Keep native schema-consistent mode for lockstep C++ services. +- Enable `.compatible(true)` only when C++-only schema evolution is required. +- Register structs with explicit numeric IDs for compact payloads. +- Disable reference tracking for value-shaped graphs. +- Prefer concrete types over polymorphic/dynamic fields on hot paths. 
+ +## Native And Xlang Comparison + +| Requirement | Use native serialization | Use xlang serialization | +| ---------------------------------------- | ------------------------ | ----------------------- | +| C++-only payloads | Yes | Optional | +| Non-C++ readers or writers | No | Yes | +| C++ native character and unsigned shapes | Yes | Limited | +| Smart pointers and C++ object graphs | Yes | Limited | +| Schema-consistent same-language payloads | Yes | No | +| Compatible schema evolution by default | No | Yes | +| Portable type mapping across runtimes | No | Yes | + +## Troubleshooting + +### A non-C++ runtime cannot read the payload + +The writer is using native serialization. Rebuild it with `.xlang(true)` and align type +registration with every peer runtime. + +### A rolling deployment fails after a field change + +Native serialization defaults to schema-consistent mode. Use `.compatible(true)` on both writer and +reader when schemas can differ. + +### A native-only scalar does not map to another language + +Use xlang serialization with explicit schema metadata for portable payloads. Native C++ type IDs +are only for C++ readers. + +### A shared pointer graph loses identity + +Enable `.track_ref(true)` and verify the graph uses supported pointer patterns. 
+ +## Related Topics + +- [Xlang Serialization](xlang-serialization.md) - Cross-runtime C++ payloads +- [Configuration](configuration.md) - Builder options +- [Basic Serialization](basic-serialization.md) - Object graph serialization +- [Supported Types](supported-types.md) - C++ type support +- [Polymorphic Serialization](polymorphism.md) - Polymorphic object models +- [Schema Evolution](schema-evolution.md) - Compatible mode diff --git a/docs/guide/cpp/polymorphism.md b/docs/guide/cpp/polymorphism.md index 74a7dc38a0..3ceb850015 100644 --- a/docs/guide/cpp/polymorphism.md +++ b/docs/guide/cpp/polymorphism.md @@ -1,6 +1,6 @@ --- title: Polymorphic Serialization -sidebar_position: 7 +sidebar_position: 8 id: polymorphism license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/cpp/row-format.md b/docs/guide/cpp/row-format.md index f12d265e57..5957cb6367 100644 --- a/docs/guide/cpp/row-format.md +++ b/docs/guide/cpp/row-format.md @@ -1,6 +1,6 @@ --- title: Row Format -sidebar_position: 20 +sidebar_position: 11 id: row_format license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/cpp/schema-evolution.md b/docs/guide/cpp/schema-evolution.md index 119ba0293a..9844a26dec 100644 --- a/docs/guide/cpp/schema-evolution.md +++ b/docs/guide/cpp/schema-evolution.md @@ -1,6 +1,6 @@ --- title: Schema Evolution -sidebar_position: 5 +sidebar_position: 6 id: schema_evolution license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -387,7 +387,7 @@ Test both upgrade and downgrade scenarios: // Test V3 -> V1 ``` -## Cross-Language Schema Evolution +## Xlang Schema Evolution Schema evolution works across languages when using xlang mode: @@ -413,4 +413,4 @@ Both instances can exchange data even with different schema versions. 
- [Configuration](configuration.md) - Enabling compatible mode - [Type Registration](type-registration.md) - Type ID management -- [Cross-Language](cross-language.md) - Cross-language considerations +- [Xlang Serialization](xlang-serialization.md) - Cross-language considerations diff --git a/docs/guide/cpp/schema-metadata.md b/docs/guide/cpp/schema-metadata.md index e0bb38dc57..23869b68af 100644 --- a/docs/guide/cpp/schema-metadata.md +++ b/docs/guide/cpp/schema-metadata.md @@ -1,6 +1,6 @@ --- title: Schema Metadata -sidebar_position: 4 +sidebar_position: 5 id: schema_metadata license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/cpp/supported-types.md b/docs/guide/cpp/supported-types.md index 49917d608e..c4dd9f0040 100644 --- a/docs/guide/cpp/supported-types.md +++ b/docs/guide/cpp/supported-types.md @@ -1,6 +1,6 @@ --- title: Supported Types -sidebar_position: 8 +sidebar_position: 9 id: supported_types license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -278,4 +278,4 @@ Currently not supported: - [Basic Serialization](basic-serialization.md) - Using these types - [Type Registration](type-registration.md) - Registering types -- [Cross-Language](cross-language.md) - Cross-language compatibility +- [Xlang Serialization](xlang-serialization.md) - Cross-language compatibility diff --git a/docs/guide/cpp/type-registration.md b/docs/guide/cpp/type-registration.md index 16c0f45b27..e7e94b2726 100644 --- a/docs/guide/cpp/type-registration.md +++ b/docs/guide/cpp/type-registration.md @@ -1,6 +1,6 @@ --- title: Type Registration -sidebar_position: 6 +sidebar_position: 7 id: type_registration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -25,7 +25,7 @@ This page explains how to register types for serialization. Apache Fory™ requires explicit type registration for struct types. 
This design enables: -- **Cross-Language Compatibility**: Registered type IDs are used across language boundaries +- **Xlang compatibility**: Registered type IDs are used across language boundaries - **Type Safety**: Detects type mismatches at deserialization time - **Polymorphic Serialization**: Enables serialization of polymorphic objects via smart pointers @@ -127,7 +127,7 @@ std::thread t2([&]() { }); ``` -## Cross-Language Registration +## Xlang Registration For cross-language compatibility, ensure: @@ -250,5 +250,5 @@ if (!result.ok()) { ## Related Topics - [Basic Serialization](basic-serialization.md) - Using registered types -- [Cross-Language](cross-language.md) - Cross-language considerations +- [Xlang Serialization](xlang-serialization.md) - Cross-language considerations - [Supported Types](supported-types.md) - All supported types diff --git a/docs/guide/cpp/cross-language.md b/docs/guide/cpp/xlang-serialization.md similarity index 96% rename from docs/guide/cpp/cross-language.md rename to docs/guide/cpp/xlang-serialization.md index eb73b84b61..0570c02176 100644 --- a/docs/guide/cpp/cross-language.md +++ b/docs/guide/cpp/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization -sidebar_position: 3 -id: cross_language +title: Xlang Serialization +sidebar_position: 2 +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -19,11 +19,11 @@ license: | limitations under the License. --- -This page explains how to use Fory for cross-language serialization between C++ and other languages. +This page explains how to use Fory xlang serialization between C++ and other languages. ## Overview -Apache Fory™ enables seamless data exchange between C++, Java, Python, Go, Rust, and JavaScript. The xlang (cross-language) mode ensures binary compatibility across all supported languages. 
+Apache Fory™ enables seamless data exchange between C++, Java, Python, Go, Rust, and JavaScript. Xlang mode ensures binary compatibility across all supported languages. ## Create an Xlang Runtime @@ -37,7 +37,7 @@ using namespace fory::serialization; auto fory = Fory::builder().xlang(true).build(); ``` -## Cross-Language Example +## Xlang Example ### C++ Producer diff --git a/docs/guide/csharp/configuration.md b/docs/guide/csharp/configuration.md index 0a548e1a2f..bbcaba7172 100644 --- a/docs/guide/csharp/configuration.md +++ b/docs/guide/csharp/configuration.md @@ -118,6 +118,15 @@ ThreadSafeFory fory = Fory.Builder() .BuildThreadSafe(); ``` +## Security + +Security-related configuration: + +- Register only the expected types before deserializing untrusted payloads. +- Use `CheckStructVersion(true)` with `Compatible(false)` when exact schema matching is required. +- Set `MaxDepth(...)` to reject unexpectedly deep dynamic object graphs. +- Prefer generated or registered concrete models over broad dynamic fields for untrusted input. + ## Related Topics - [Basic Serialization](basic-serialization.md) diff --git a/docs/guide/csharp/index.md b/docs/guide/csharp/index.md index 04f19b3b78..d2c18aa009 100644 --- a/docs/guide/csharp/index.md +++ b/docs/guide/csharp/index.md @@ -24,7 +24,7 @@ Apache Fory™ C# is a high-performance, cross-language serialization runtime fo ## Why Fory C#? 
- High performance binary serialization for .NET 8+ -- Cross-language compatibility with Fory implementations in Java, Python, C++, Go, Rust, and JavaScript +- Xlang compatibility with Fory implementations in Java, Python, C++, Go, Rust, and JavaScript - Source-generator-based serializers for `[ForyObject]` types - Optional reference tracking for shared and circular object graphs - Compatible mode for schema evolution @@ -87,7 +87,7 @@ User decoded = fory.Deserialize(payload); | --------------------------------------------- | --------------------------------------------- | | [Configuration](configuration.md) | Builder options and runtime modes | | [Basic Serialization](basic-serialization.md) | Typed and dynamic serialization APIs | -| [Cross-Language](cross-language.md) | Interoperability guidance | +| [Xlang Serialization](xlang-serialization.md) | Interoperability guidance | | [Schema Metadata](schema-metadata.md) | `[ForyField]` ids and schema type descriptors | | [Type Registration](type-registration.md) | Registering user types and custom serializers | | [Custom Serializers](custom-serializers.md) | Implementing `Serializer` | @@ -99,6 +99,6 @@ User decoded = fory.Deserialize(payload); ## Related Resources -- [Cross-language serialization specification](../../specification/xlang_serialization_spec.md) -- [Cross-language guide](../xlang/index.md) +- [Xlang serialization specification](../../specification/xlang_serialization_spec.md) +- [Xlang guide](../xlang/index.md) - [C# source directory](https://github.com/apache/fory/tree/main/csharp) diff --git a/docs/guide/csharp/supported-types.md b/docs/guide/csharp/supported-types.md index da5d4ced40..fed11cd457 100644 --- a/docs/guide/csharp/supported-types.md +++ b/docs/guide/csharp/supported-types.md @@ -97,4 +97,4 @@ Dynamic object payloads via `Serialize` / `Deserialize` suppor - [Basic Serialization](basic-serialization.md) - [Type Registration](type-registration.md) -- [Cross-Language](cross-language.md) +- 
[Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/csharp/type-registration.md b/docs/guide/csharp/type-registration.md index 1f8d8e2286..fd1e088546 100644 --- a/docs/guide/csharp/type-registration.md +++ b/docs/guide/csharp/type-registration.md @@ -79,4 +79,4 @@ fory.Register(101); - [Basic Serialization](basic-serialization.md) - [Custom Serializers](custom-serializers.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/csharp/cross-language.md b/docs/guide/csharp/xlang-serialization.md similarity index 95% rename from docs/guide/csharp/cross-language.md rename to docs/guide/csharp/xlang-serialization.md index 4551f72d15..3c53e52af3 100644 --- a/docs/guide/csharp/cross-language.md +++ b/docs/guide/csharp/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization +title: Xlang Serialization sidebar_position: 3 -id: cross_language +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -19,9 +19,9 @@ license: | limitations under the License. --- -Apache Fory™ C# supports cross-language serialization with other Fory runtimes. +Apache Fory™ C# supports xlang serialization with other Fory runtimes. -## Cross-Language Runtime +## Xlang Runtime C# always writes and reads the xlang frame header. There is no mode switch, so interoperability code only needs to configure the remaining runtime behavior such as compatibility mode and reference @@ -58,7 +58,7 @@ Use the same ID mapping on all languages. 
fory.Register("com.example", "Person"); ``` -## Cross-Language Example +## Xlang Example ### C# (Serializer) diff --git a/docs/guide/dart/configuration.md b/docs/guide/dart/configuration.md index 09fdadda44..52bd45c1c3 100644 --- a/docs/guide/dart/configuration.md +++ b/docs/guide/dart/configuration.md @@ -107,7 +107,7 @@ final fory = Fory(maxBinarySize: 8 * 1024 * 1024); | `maxCollectionSize` | 1 048 576 | | `maxBinarySize` | 64 MiB | -## Cross-Language Notes +## Xlang Notes When Fory is used to communicate between services written in different languages: @@ -115,8 +115,17 @@ When Fory is used to communicate between services written in different languages - Use the same numeric IDs or `namespace + typeName` pairs on every side. - Match the `compatible` setting on both the writing and reading side — mismatching modes will fail. +## Security + +Security-related configuration: + +- Register only the expected generated models before deserializing untrusted payloads. +- Use `checkStructVersion: true` with `compatible: false` when exact schema matching is required. +- Set `maxDepth`, `maxCollectionSize`, and `maxBinarySize` to reject unexpectedly large payloads. +- Prefer generated schemas and explicit field metadata over broad dynamic fields for untrusted input. + ## Related Topics - [Basic Serialization](basic-serialization.md) - [Schema Evolution](schema-evolution.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/dart/custom-serializers.md b/docs/guide/dart/custom-serializers.md index 096438ecb2..6fad7a5608 100644 --- a/docs/guide/dart/custom-serializers.md +++ b/docs/guide/dart/custom-serializers.md @@ -135,5 +135,5 @@ Skipping this step causes back-references to that object to resolve to `null`. 
## Related Topics - [Type Registration](type-registration.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Troubleshooting](troubleshooting.md) diff --git a/docs/guide/dart/index.md b/docs/guide/dart/index.md index b3d912395b..a92959bec7 100644 --- a/docs/guide/dart/index.md +++ b/docs/guide/dart/index.md @@ -23,7 +23,7 @@ Apache Fory™ Dart lets you serialize Dart objects to bytes and deserialize the ## Why Fory Dart? -- **Cross-language**: serialize in Dart, deserialize in Java, Go, C#, and more without writing any glue code +- **Xlang**: serialize in Dart, deserialize in Java, Go, C#, and more without writing any glue code - **Platform support**: use the same generated-serializer API on Dart VM/AOT, Flutter, and web - **Fast**: generated serializer code replaces reflection at runtime - **Schema evolution**: add or remove fields without breaking existing messages @@ -130,7 +130,7 @@ dart run build_runner build --delete-conflicting-outputs | [Configuration](configuration.md) | Runtime options, compatible mode, and safety limits | | [Basic Serialization](basic-serialization.md) | `serialize`, `deserialize`, generated registration, root graphs | | [Code Generation](code-generation.md) | `@ForyStruct`, build runner, and generated namespaces | -| [Cross-Language](cross-language.md) | Interoperability rules and field alignment | +| [Xlang Serialization](xlang-serialization.md) | Interoperability rules and field alignment | | [Schema Metadata](schema-metadata.md) | `@ForyField`, field IDs, nullability, references, polymorphism | | [Type Registration](type-registration.md) | ID-based vs name-based registration and registration rules | | [Custom Serializers](custom-serializers.md) | Manual `Serializer` implementations and unions | @@ -143,5 +143,5 @@ dart run build_runner build --delete-conflicting-outputs - [Xlang serialization specification](../../specification/xlang_serialization_spec.md) - [Xlang implementation 
guide](../../specification/xlang_implementation_guide.md) -- [Cross-language guide](../xlang/index.md) +- [Xlang guide](../xlang/index.md) - [Dart runtime source directory](https://github.com/apache/fory/tree/main/dart) diff --git a/docs/guide/dart/schema-evolution.md b/docs/guide/dart/schema-evolution.md index c165f1f629..53c0a82ba9 100644 --- a/docs/guide/dart/schema-evolution.md +++ b/docs/guide/dart/schema-evolution.md @@ -76,7 +76,7 @@ If you add field IDs after payloads are already in production, existing stored m - Change the registration identity (`id`, `namespace`, or `typeName`) of a type after messages are in production. - Change a field's logical meaning without changing its ID. -## Cross-Language Notes +## Xlang Notes Evolution only works when **all** runtimes that exchange messages agree on: @@ -90,4 +90,4 @@ Test rolling-upgrade scenarios with real round trips before deploying. - [Configuration](configuration.md) - [Schema Metadata](schema-metadata.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/dart/schema-metadata.md b/docs/guide/dart/schema-metadata.md index 978b7a1ffb..50b490016a 100644 --- a/docs/guide/dart/schema-metadata.md +++ b/docs/guide/dart/schema-metadata.md @@ -142,4 +142,4 @@ When the same model is defined in multiple languages: - [Code Generation](code-generation.md) - [Schema Evolution](schema-evolution.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/dart/supported-types.md b/docs/guide/dart/supported-types.md index 879dce949c..220ed988cd 100644 --- a/docs/guide/dart/supported-types.md +++ b/docs/guide/dart/supported-types.md @@ -155,5 +155,5 @@ width is one of the most common cross-language bugs. 
## Related Topics - [Schema Metadata](schema-metadata.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Schema Evolution](schema-evolution.md) diff --git a/docs/guide/dart/troubleshooting.md b/docs/guide/dart/troubleshooting.md index 8c8065a3cc..769b132122 100644 --- a/docs/guide/dart/troubleshooting.md +++ b/docs/guide/dart/troubleshooting.md @@ -140,7 +140,7 @@ dart test ## Related Topics -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Code Generation](code-generation.md) - [Custom Serializers](custom-serializers.md) - [Web Platform Support](web-platform-support.md) diff --git a/docs/guide/dart/type-registration.md b/docs/guide/dart/type-registration.md index 96ab9a029a..f75b1f3042 100644 --- a/docs/guide/dart/type-registration.md +++ b/docs/guide/dart/type-registration.md @@ -87,12 +87,12 @@ See [Custom Serializers](custom-serializers.md) for how to implement a serialize - Keep IDs (or names) **stable** once payloads are persisted or exchanged across services. Changing them will break deserialization of old messages. - Do not mix a numeric ID on one side with a name on the other for the same type. -## Cross-Language Requirements +## Xlang Requirements -The same numeric ID or `namespace + typeName` pair must be used in every runtime that reads or writes the type. See [Cross-Language](cross-language.md) for examples. +The same numeric ID or `namespace + typeName` pair must be used in every runtime that reads or writes the type. See [Xlang Serialization](xlang-serialization.md) for examples. 
## Related Topics - [Code Generation](code-generation.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Custom Serializers](custom-serializers.md) diff --git a/docs/guide/dart/cross-language.md b/docs/guide/dart/xlang-serialization.md similarity index 98% rename from docs/guide/dart/cross-language.md rename to docs/guide/dart/xlang-serialization.md index aabe9d9d5e..38ec6dc090 100644 --- a/docs/guide/dart/cross-language.md +++ b/docs/guide/dart/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization +title: Xlang Serialization sidebar_position: 4 -id: cross_language +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -218,4 +218,4 @@ dart test - [Type Registration](type-registration.md) - [Schema Evolution](schema-evolution.md) -- [Cross-language guide](../xlang/index.md) +- [Xlang guide](../xlang/index.md) diff --git a/docs/guide/go/codegen.md b/docs/guide/go/codegen.md index 81756ed6cf..743c60210b 100644 --- a/docs/guide/go/codegen.md +++ b/docs/guide/go/codegen.md @@ -1,6 +1,6 @@ --- title: Code Generation -sidebar_position: 100 +sidebar_position: 11 id: codegen license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/go/configuration.md b/docs/guide/go/configuration.md index 376558c6f8..4ca85273dd 100644 --- a/docs/guide/go/configuration.md +++ b/docs/guide/go/configuration.md @@ -1,6 +1,6 @@ --- title: Configuration -sidebar_position: 10 +sidebar_position: 4 id: configuration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -340,6 +340,14 @@ for req := range requests { 6. **Use compatible mode for evolving schemas**: Enable when struct definitions may change between service versions. 
+## Security + +Security-related configuration: + +- Register only the expected structs before deserializing untrusted data. +- Use `WithMaxDepth(...)` to reject unexpectedly deep payloads. +- Prefer concrete struct fields over broad `any` or interface-typed fields for untrusted input. + ## Related Topics - [Basic Serialization](basic-serialization.md) diff --git a/docs/guide/go/custom-serializers.md b/docs/guide/go/custom-serializers.md index e44f465863..9c0461f0fc 100644 --- a/docs/guide/go/custom-serializers.md +++ b/docs/guide/go/custom-serializers.md @@ -1,6 +1,6 @@ --- title: Custom Serializers -sidebar_position: 90 +sidebar_position: 10 id: custom_serializers license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -282,4 +282,4 @@ func TestMySerializer(t *testing.T) { - [Type Registration](type-registration.md) - [Supported Types](supported-types.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/go/index.md b/docs/guide/go/index.md index 260a22e1db..fb74931d62 100644 --- a/docs/guide/go/index.md +++ b/docs/guide/go/index.md @@ -24,7 +24,7 @@ Apache Fory Go is a high-performance serialization library for Go. It supports x ## Why Fory Go? - **High Performance**: Fast serialization and optimized binary protocols -- **Cross-Language**: Seamless data exchange with Java, Python, C++, Rust, and JavaScript +- **Xlang**: Seamless data exchange with Java, Python, C++, Rust, and JavaScript - **Automatic Serialization**: No IDL definitions or schema compilation required - **Reference Tracking**: Built-in support for circular references and shared objects - **Type Safety**: Strong typing with schema-aware serializers @@ -90,7 +90,7 @@ Use xlang mode for cross-language payloads and schemas shared with other Fory ru Use native mode for Go-only traffic. 
Native mode is selected with `fory.WithXlang(false)`, uses schema-consistent payloads unless compatible mode is enabled, and keeps Go object serialization on the Go runtime path. It is optimized for Go structs, pointers, interfaces, and Go-specific type behavior that does not need a portable xlang mapping. -See [Cross-Language Serialization](cross-language.md) for Go xlang registration and interoperability rules, and [Configuration](configuration.md) for native-mode options. +See [Xlang Serialization](xlang-serialization.md) for Go xlang registration and interoperability rules, and [Native Serialization](native-serialization.md) for Go-only payloads. ## Configuration @@ -118,7 +118,7 @@ Fory Go supports a wide range of types: See [Supported Types](supported-types.md) for the complete type mapping. -## Cross-Language Serialization +## Xlang Serialization Fory Go is fully compatible with other Fory implementations. Data serialized in Go can be deserialized in Java, Python, C++, Rust, or JavaScript: @@ -130,25 +130,27 @@ data, _ := f.Serialize(&User{ID: 1, Name: "Alice"}) // 'data' can be deserialized by Java, Python, etc. ``` -See [Cross-Language Serialization](cross-language.md) for type mapping and compatibility details. +See [Xlang Serialization](xlang-serialization.md) for type mapping and compatibility details. 
## Documentation -| Topic | Description | -| --------------------------------------------- | -------------------------------------- | -| [Basic Serialization](basic-serialization.md) | Core APIs and usage patterns | -| [Configuration](configuration.md) | Options and settings | -| [Cross-Language](cross-language.md) | Multi-language serialization | -| [Schema Metadata](schema-metadata.md) | Field-level configuration | -| [Type Registration](type-registration.md) | Registering types for serialization | -| [Supported Types](supported-types.md) | Complete type support reference | -| [References](references.md) | Circular references and shared objects | -| [Schema Evolution](schema-evolution.md) | Forward/backward compatibility | -| [Thread Safety](thread-safety.md) | Concurrent usage patterns | -| [Troubleshooting](troubleshooting.md) | Common issues and solutions | +| Topic | Description | +| ----------------------------------------------- | -------------------------------------- | +| [Basic Serialization](basic-serialization.md) | Core APIs and usage patterns | +| [Xlang Serialization](xlang-serialization.md) | Multi-language serialization | +| [Native Serialization](native-serialization.md) | Go-only serialization | +| [Configuration](configuration.md) | Options and settings | +| [Schema Metadata](schema-metadata.md) | Field-level configuration | +| [Type Registration](type-registration.md) | Registering types for serialization | +| [Supported Types](supported-types.md) | Complete type support reference | +| [References](references.md) | Circular references and shared objects | +| [Schema Evolution](schema-evolution.md) | Forward/backward compatibility | +| [Custom Serializers](custom-serializers.md) | Extend serialization behavior | +| [Thread Safety](thread-safety.md) | Concurrent usage patterns | +| [Troubleshooting](troubleshooting.md) | Common issues and solutions | ## Related Resources - [Xlang Serialization 
Specification](../../specification/xlang_serialization_spec.md) -- [Cross-Language Type Mapping](../../specification/xlang_type_mapping.md) +- [Xlang Type Mapping](../../specification/xlang_type_mapping.md) - [GitHub Repository](https://github.com/apache/fory) diff --git a/docs/guide/go/native-serialization.md b/docs/guide/go/native-serialization.md new file mode 100644 index 0000000000..00dde847dc --- /dev/null +++ b/docs/guide/go/native-serialization.md @@ -0,0 +1,216 @@ +--- +title: Native Serialization +sidebar_position: 3 +id: native_serialization +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +Go native serialization is the Go-only wire mode selected with `fory.WithXlang(false)`. Use it +when every writer and reader is a Go service and the payload should follow Go's type system instead +of the portable xlang type system. + +Use [Xlang Serialization](xlang-serialization.md), the default Go mode, when bytes must be read by +Java, Python, C++, Rust, JavaScript, or another non-Go Fory runtime. + +## When To Use Native Serialization + +Use native serialization when: + +- A payload is produced and consumed only by Go applications. 
+- The data model uses Go-specific behavior such as native `int`/`uint`, nil slices, nil maps, + pointers, interfaces, or Go-only dynamic values. +- You need schema-consistent Go payloads with the smallest same-schema metadata surface. +- You want compatible schema evolution for Go-only rolling deployments without committing to a + cross-language type mapping. +- You are using reflection or code-generated serializers for Go structs that never leave Go. + +## Create a Native Runtime + +```go +package main + +import "github.com/apache/fory/go/fory" + +type Order struct { + ID int64 + Amount float64 +} + +func main() { + f := fory.New(fory.WithXlang(false)) + if err := f.RegisterStruct(Order{}, 100); err != nil { + panic(err) + } + + data, err := f.Serialize(&Order{ID: 1, Amount: 42.5}) + if err != nil { + panic(err) + } + + var decoded Order + if err := f.Deserialize(data, &decoded); err != nil { + panic(err) + } +} +``` + +Reuse a configured `Fory` instance. The default instance owns reusable buffers and is not +thread-safe; use the thread-safe wrapper for concurrent goroutines. + +```go +import ( + "github.com/apache/fory/go/fory" + "github.com/apache/fory/go/fory/threadsafe" +) + +f := threadsafe.New(fory.WithXlang(false), fory.WithTrackRef(true)) +_ = f.RegisterStruct(Order{}, 100) +``` + +## Schema Evolution + +Native serialization defaults to schema-consistent mode. Writer and reader structs should match +when `WithCompatible(true)` is not set. + +Enable compatible mode when Go-only services roll independently: + +```go +writer := fory.New(fory.WithXlang(false), fory.WithCompatible(true)) +reader := fory.New(fory.WithXlang(false), fory.WithCompatible(true)) +``` + +Compatible mode writes schema metadata so readers can tolerate added, removed, or reordered fields +when field names or explicit field IDs remain compatible. See [Schema Evolution](schema-evolution.md). + +## Registration + +Register structs before serializing them. 
Prefer explicit numeric IDs for long-lived payloads: + +```go +_ = f.RegisterStruct(Order{}, 100) +_ = f.RegisterStruct(LineItem{}, 101) +``` + +Name-based registration is useful when ID coordination is harder: + +```go +_ = f.RegisterStructByName(Order{}, "example.Order") +``` + +If you register without stable IDs, every writer and reader must make the same registration choices. + +## Go Object Surface + +Native serialization keeps Go data on the Go runtime path: + +- Primitive numeric types, including Go-native `int` and `uint`. +- Structs with exported fields. +- Slices, arrays, maps, and Fory sets. +- Pointers and nil values, including nil slices and maps. +- Interfaces and dynamic values when registered serializers can resolve their concrete types. +- Time values such as `time.Time` and `time.Duration`. +- Reflection-based and code-generated serializers. + +Use [Supported Types](supported-types.md) for the full type surface and xlang mapping details. + +## References And Pointers + +Enable reference tracking for shared object identity or cycles: + +```go +f := fory.New(fory.WithXlang(false), fory.WithTrackRef(true)) + +type Node struct { + Value int32 + Next *Node `fory:"ref"` +} +``` + +Disable reference tracking for value-shaped data. It is faster and smaller, but repeated pointers +deserialize as independent values and cyclic graphs are unsupported. + +## Buffer Ownership + +The default `Fory` instance reuses its internal buffer. Copy serialized bytes if they must outlive +the next serialization call: + +```go +data, _ := f.Serialize(value) +stable := append([]byte(nil), data...) +``` + +The thread-safe wrapper copies bytes before returning them. 
For high-throughput single-threaded +code, serialize into a caller-owned `ByteBuffer`: + +```go +buf := fory.NewByteBuffer(nil) +err := f.SerializeTo(buf, value) +data := buf.GetByteSlice(0, buf.WriterIndex()) +_ = err +_ = data +``` + +## Performance Guidelines + +- Reuse `Fory` or the thread-safe wrapper instead of constructing a runtime per request. +- Keep schema-consistent mode for lockstep Go services; enable compatible mode only when schema + evolution is needed. +- Register structs with explicit numeric IDs. +- Disable reference tracking unless the graph requires identity or cycles. +- Use code generation for hot Go structs when reflection overhead matters. +- Copy returned bytes only when the data must survive the next serialization call. + +## Native And Xlang Comparison + +| Requirement | Use native serialization | Use xlang serialization | +| ---------------------------------------- | ------------------------ | ----------------------- | +| Go-only payloads | Yes | Optional | +| Non-Go readers or writers | No | Yes | +| Go-native `int`, `uint`, nil slice/map | Yes | Limited | +| Schema-consistent same-language payloads | Yes | No | +| Compatible schema evolution by default | No | Yes | +| Portable type mapping across runtimes | No | Yes | + +## Troubleshooting + +### A non-Go runtime cannot read the payload + +The writer is using native serialization. Rebuild it with `fory.WithXlang(true)` and align type +registration with every peer runtime. + +### A rolling deployment fails after a field change + +Native serialization defaults to schema-consistent mode. Use `fory.WithCompatible(true)` on both +writer and reader when struct definitions can differ. + +### A nil slice or map changes shape + +Use native serialization for Go-only payloads that must preserve Go nil slice/map semantics. +Cross-language schemas should model nullability explicitly. + +### Returned bytes change after another serialization + +The default runtime reuses its buffer. 
Copy the byte slice or use `threadsafe.New(...)`. + +## Related Topics + +- [Xlang Serialization](xlang-serialization.md) - Cross-runtime Go payloads +- [Configuration](configuration.md) - Go runtime options +- [Type Registration](type-registration.md) - Struct and enum registration +- [References](references.md) - Shared and circular references +- [Schema Evolution](schema-evolution.md) - Compatible mode +- [Code Generation](codegen.md) - Generated serializers diff --git a/docs/guide/go/references.md b/docs/guide/go/references.md index 27d42ca02c..32d09d7600 100644 --- a/docs/guide/go/references.md +++ b/docs/guide/go/references.md @@ -1,6 +1,6 @@ --- title: References -sidebar_position: 60 +sidebar_position: 8 id: references license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -353,4 +353,4 @@ func main() { - [Configuration](configuration.md) - [Struct Tags](schema-metadata.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/go/schema-evolution.md b/docs/guide/go/schema-evolution.md index f2d4b31c30..2c5e04d2f0 100644 --- a/docs/guide/go/schema-evolution.md +++ b/docs/guide/go/schema-evolution.md @@ -1,6 +1,6 @@ --- title: Schema Evolution -sidebar_position: 70 +sidebar_position: 9 id: schema_evolution license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -228,7 +228,7 @@ if config.Retries == 0 { } ``` -## Cross-Language Schema Evolution +## Xlang Schema Evolution Schema evolution works across languages: @@ -360,5 +360,5 @@ func main() { ## Related Topics - [Configuration](configuration.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Troubleshooting](troubleshooting.md) diff --git a/docs/guide/go/schema-metadata.md b/docs/guide/go/schema-metadata.md index e248cee95a..aff97a9b8c 100644 --- a/docs/guide/go/schema-metadata.md +++ b/docs/guide/go/schema-metadata.md @@ -1,6 
+1,6 @@ --- title: Schema Metadata -sidebar_position: 30 +sidebar_position: 5 id: schema_metadata license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/go/supported-types.md b/docs/guide/go/supported-types.md index 1b6776c9eb..8371e53c53 100644 --- a/docs/guide/go/supported-types.md +++ b/docs/guide/go/supported-types.md @@ -1,6 +1,6 @@ --- title: Supported Types -sidebar_position: 50 +sidebar_position: 7 id: supported_types license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -337,7 +337,7 @@ status := StatusActive data, _ := f.Serialize(status) ``` -## Cross-Language Type Mapping +## Xlang Type Mapping | Go Type | Java | Python | C++ | Rust | | --------------- | ---------- | --------- | ------------------ | -------------- | @@ -354,7 +354,7 @@ data, _ := f.Serialize(status) | `time.Time` | Instant | datetime | - | - | | `time.Duration` | Duration | timedelta | - | - | -See [Cross-Language Serialization](cross-language.md) for detailed mapping. +See [Xlang Serialization](xlang-serialization.md) for detailed mapping. ## Unsupported Types @@ -370,5 +370,5 @@ Attempting to serialize these types will result in an error. 
## Related Topics - [Type Registration](type-registration.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [References](references.md) diff --git a/docs/guide/go/thread-safety.md b/docs/guide/go/thread-safety.md index df67d12cd4..19f9209564 100644 --- a/docs/guide/go/thread-safety.md +++ b/docs/guide/go/thread-safety.md @@ -1,6 +1,6 @@ --- title: Thread Safety -sidebar_position: 110 +sidebar_position: 12 id: thread_safety license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/go/troubleshooting.md b/docs/guide/go/troubleshooting.md index e671d40886..5af317b053 100644 --- a/docs/guide/go/troubleshooting.md +++ b/docs/guide/go/troubleshooting.md @@ -1,6 +1,6 @@ --- title: Troubleshooting -sidebar_position: 120 +sidebar_position: 13 id: troubleshooting license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -242,7 +242,7 @@ type Good struct { } ``` -## Cross-Language Issues +## Xlang Issues ### Field Order Mismatch @@ -407,7 +407,7 @@ func TestRoundTrip(t *testing.T) { } ``` -### Test Cross-Language +### Test Xlang ```bash cd java/fory-core @@ -444,6 +444,6 @@ If you encounter issues not covered here: ## Related Topics - [Configuration](configuration.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Schema Evolution](schema-evolution.md) - [Thread Safety](thread-safety.md) diff --git a/docs/guide/go/type-registration.md b/docs/guide/go/type-registration.md index 91a4bd3b45..452cdb5be9 100644 --- a/docs/guide/go/type-registration.md +++ b/docs/guide/go/type-registration.md @@ -1,6 +1,6 @@ --- title: Type Registration -sidebar_position: 40 +sidebar_position: 6 id: type_registration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -25,7 +25,7 @@ Type registration tells Fory how to identify and serialize your custom types. Re 1. 
**Type Identification**: Fory needs to identify the actual type during deserialization 2. **Polymorphism**: When deserializing interface types, Fory must know which concrete type to create -3. **Cross-Language Compatibility**: Other languages need to recognize and deserialize your types +3. **Xlang compatibility**: Other languages need to recognize and deserialize your types ## Struct Registration @@ -163,7 +163,7 @@ f.RegisterStruct(Address{}, 1) f.RegisterStruct(Person{}, 2) ``` -## Cross-Language Registration +## Xlang Registration For cross-language serialization, types must be registered consistently across all languages. @@ -259,6 +259,6 @@ Two types registered with the same ID will conflict. ## Related Topics - [Basic Serialization](basic-serialization.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Supported Types](supported-types.md) - [Troubleshooting](troubleshooting.md) diff --git a/docs/guide/go/cross-language.md b/docs/guide/go/xlang-serialization.md similarity index 97% rename from docs/guide/go/cross-language.md rename to docs/guide/go/xlang-serialization.md index 2b1fe36bc3..e7df613b75 100644 --- a/docs/guide/go/cross-language.md +++ b/docs/guide/go/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization -sidebar_position: 20 -id: cross_language +title: Xlang Serialization +sidebar_position: 2 +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -19,7 +19,7 @@ license: | limitations under the License. --- -Fory Go enables seamless data exchange with Java, Python, C++, Rust, and JavaScript. This guide covers cross-language compatibility and type mapping. +Fory Go enables seamless data exchange with Java, Python, C++, Rust, and JavaScript. This guide covers xlang compatibility and type mapping. 
## Create an Xlang Runtime @@ -29,7 +29,7 @@ Go defaults to xlang mode with compatible schema evolution. Set the mode explici f := fory.New(fory.WithXlang(true)) ``` -## Type Registration for Cross-Language +## Type Registration for Xlang Use consistent type IDs across all languages: diff --git a/docs/guide/java/advanced-features.md b/docs/guide/java/advanced-features.md index c7d9af4060..b0957af85d 100644 --- a/docs/guide/java/advanced-features.md +++ b/docs/guide/java/advanced-features.md @@ -20,7 +20,7 @@ license: | --- This page covers advanced Java runtime features that are not part of first-use serialization. -Java native-mode zero-copy serialization is documented in [Native Mode](native-mode.md), and deep +Java native-mode zero-copy serialization is documented in [Native Serialization](native-serialization.md), and deep copy semantics are documented in [Object Copy](object-copy.md). ## Memory Allocation Customization @@ -155,6 +155,6 @@ static { - [Compression](compression.md) - Data compression options - [Configuration](configuration.md) - All ForyBuilder options -- [Native Mode](native-mode.md) - Java-only serialization, JDK hooks, and zero-copy buffers +- [Native Serialization](native-serialization.md) - Java-only serialization, JDK hooks, and zero-copy buffers - [Object Copy](object-copy.md) - Deep copy functionality -- [Cross-Language](cross-language.md) - Java xlang interoperability +- [Xlang Serialization](xlang-serialization.md) - Java xlang interoperability diff --git a/docs/guide/java/basic-serialization.md b/docs/guide/java/basic-serialization.md index 511008e01b..3e652ddf83 100644 --- a/docs/guide/java/basic-serialization.md +++ b/docs/guide/java/basic-serialization.md @@ -93,9 +93,9 @@ User decoded = fory.deserialize(bytes, User.class); When xlang bytes cross runtimes, every runtime must register the same type identity and compatible field metadata. 
The shared rules live in [Xlang](../xlang/index.md), while Java-specific API calls -are in [Cross-Language](cross-language.md). +are in [Xlang Serialization](xlang-serialization.md). -## Use Native Mode For Java-Only Traffic +## Use Native Serialization For Java-Only Traffic For same-language Java/JVM traffic, native mode is usually the better fit: @@ -106,7 +106,7 @@ Fory fory = Fory.builder() ``` Native mode supports the broad Java object serialization surface, including JDK serialization hooks, -object copy, and native-mode zero-copy buffers. See [Native Mode](native-mode.md). +object copy, and native-mode zero-copy buffers. See [Native Serialization](native-serialization.md). ## Common Options @@ -126,7 +126,7 @@ object copy, and native-mode zero-copy buffers. See [Native Mode](native-mode.md ## Related Topics - [Configuration](configuration.md) - All ForyBuilder options -- [Native Mode](native-mode.md) - Java-only serialization features +- [Native Serialization](native-serialization.md) - Java-only serialization features - [Schema Metadata](schema-metadata.md) - Field IDs, nullability, reference tracking, and enum IDs -- [Cross-Language](cross-language.md) - Java xlang interoperability +- [Xlang Serialization](xlang-serialization.md) - Java xlang interoperability - [Troubleshooting](troubleshooting.md) - Common API usage issues diff --git a/docs/guide/java/configuration.md b/docs/guide/java/configuration.md index 6cb89b43aa..61410fd9ff 100644 --- a/docs/guide/java/configuration.md +++ b/docs/guide/java/configuration.md @@ -1,6 +1,6 @@ --- title: Configuration -sidebar_position: 3 +sidebar_position: 4 id: configuration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -69,6 +69,29 @@ Fory fory = Fory.builder() .build(); ``` +## Security + +Keep class registration enabled for production and any untrusted payload source: + +```java +Fory fory = Fory.builder() + .requireClassRegistration(true) + .withMaxDepth(50) + .build(); +``` + 
+Security-related options: + +- `requireClassRegistration(true)` restricts deserialization to registered classes. +- `withMaxDepth(...)` rejects unexpectedly deep object graphs. +- `withDeserializeUnknownClass(false)` avoids materializing unknown classes from metadata. +- `checkJdkClassSerializable(true)` keeps the JDK serializability check for `java.*` classes. +- Class registration warnings can be useful during security audits; use + `suppressClassRegistrationWarnings(false)` when you need to surface unexpected types. + +Use `requireClassRegistration(false)` only for trusted payloads, and pair it with a `TypeChecker` +allow list when dynamic class loading is required. + ## Related Topics - [Schema Metadata](schema-metadata.md) - `@ForyField`, `@Ignore`, integer encoding annotations, `serializeEnumByName`, and `@ForyEnumId` diff --git a/docs/guide/java/custom-serializers.md b/docs/guide/java/custom-serializers.md index 4816ad1dd9..7abcdf5cbd 100644 --- a/docs/guide/java/custom-serializers.md +++ b/docs/guide/java/custom-serializers.md @@ -1,6 +1,6 @@ --- title: Custom Serializers -sidebar_position: 12 +sidebar_position: 11 id: custom_serializers license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/java/index.md b/docs/guide/java/index.md index 336e76257c..c52237d803 100644 --- a/docs/guide/java/index.md +++ b/docs/guide/java/index.md @@ -130,7 +130,7 @@ Use xlang mode for cross-language payloads and schemas shared with non-Java runt Use native mode for Java-only traffic. Native mode is selected with `.withXlang(false)`, uses schema-consistent payloads unless compatible mode is enabled, and owns Java-specific object behavior such as JDK serialization hooks, `Externalizable`, dynamic object graphs, object copy, and Java native-mode zero-copy buffers. It is optimized for the JVM type system and supports a broader Java object surface than xlang mode. 
If you are replacing JDK serialization, Kryo, FST, Hessian, or Java-only Protocol Buffers payloads, start with native mode. -See [Native Mode](native-mode.md) for Java-only serialization details and [Cross-Language Serialization](cross-language.md) for Java xlang registration and interoperability rules. +See [Native Serialization](native-serialization.md) for Java-only serialization details and [Xlang Serialization](xlang-serialization.md) for Java xlang registration and interoperability rules. ## Thread Safety @@ -213,6 +213,7 @@ ThreadSafeFory threadLocalFory = Fory.builder() - [Virtual Threads](virtual-threads.md) - Virtual-thread usage and pool sizing guidance - [Type Registration](type-registration.md) - Class registration and security - [Custom Serializers](custom-serializers.md) - Implement custom serializers -- [Cross-Language Serialization](cross-language.md) - Serialize data for other languages +- [Xlang Serialization](xlang-serialization.md) - Serialize data for other languages +- [Native Serialization](native-serialization.md) - Java-only serialization features - [Static Generated Serializers](static-generated-serializers.md) - Annotation-processor static generated serializers for `@ForyStruct` - [GraalVM Support](graalvm-support.md) - Build-time serializer compilation for native images diff --git a/docs/guide/java/native-mode.md b/docs/guide/java/native-mode.md deleted file mode 100644 index 9091caf0f9..0000000000 --- a/docs/guide/java/native-mode.md +++ /dev/null @@ -1,188 +0,0 @@ ---- -title: Native Mode -sidebar_position: 2 -id: native_mode -license: | - Licensed to the Apache Software Foundation (ASF) under one or more - contributor license agreements. See the NOTICE file distributed with - this work for additional information regarding copyright ownership. - The ASF licenses this file to You under the Apache License, Version 2.0 - (the "License"); you may not use this file except in compliance with - the License. 
You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. ---- - -Java native mode is the Java-only wire format selected with `withXlang(false)`. Use it when both -writer and reader are Java/JVM services and you want the broad Java object serialization surface: -JDK serialization hooks, dynamic object graphs, optional class-registration policies, object copy, -and Java-native collection and wrapper handling. Native mode is optimized for the JVM type system, -so it is the right starting point for Java/JVM-only replacements of JDK serialization, Kryo, FST, -Hessian, or Java-only Protocol Buffers payloads. - -Use xlang mode, the default, when bytes must be read by other Fory runtimes. - -## Create a Native-Mode Runtime - -```java -import org.apache.fory.Fory; - -Fory fory = Fory.builder() - .withXlang(false) - .requireClassRegistration(true) - .withRefTracking(true) - .build(); - -byte[] bytes = fory.serialize(object); -Object decoded = fory.deserialize(bytes); -``` - -Native mode defaults to schema-consistent serialization. 
Enable compatible mode only when Java -classes evolve independently across writer and reader deployments: - -```java -Fory fory = Fory.builder() - .withXlang(false) - .withCompatible(true) - .build(); -``` - -## Java Serialization Framework Replacement - -Java native mode supports the JDK serialization hooks that are part of many existing Java object -models and is the Fory mode to use when replacing Java-only serialization frameworks: - -- `writeObject` and `readObject` -- `writeReplace` and `readResolve` -- `readObjectNoData` -- `Externalizable` - -Use native mode when replacing JDK serialization, Kryo, FST, Hessian, or Java-only Protocol -Buffers payloads. Use xlang mode only when the bytes must be read by non-Java Fory runtimes. - -```java -public class MyClass implements Serializable { - private void writeObject(ObjectOutputStream out) throws IOException { - // Custom serialization logic. - } - - private void readObject(ObjectInputStream in) throws IOException { - // Custom deserialization logic. - } - - private Object writeReplace() { - return this; - } - - private Object readResolve() { - return this; - } -} -``` - -When an application must read data that may be either JDK `ObjectOutputStream` bytes or Fory -native-mode bytes, `JavaSerializer.serializedByJDK` can identify the JDK payload before falling -back to Fory: - -```java -if (JavaSerializer.serializedByJDK(bytes)) { - ObjectInputStream objectInputStream = ...; - return objectInputStream.readObject(); -} -return fory.deserialize(bytes); -``` - -Use this bridge only at boundaries that actually accept both formats. Native-mode Fory payloads -should otherwise be written and read by Fory directly. 
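For intuition about the dual-format boundary described above, the following is a minimal sketch of how a JDK `ObjectOutputStream` payload can be recognized from its header alone. It assumes only the documented JDK stream header (`ObjectStreamConstants.STREAM_MAGIC` followed by `STREAM_VERSION`); it is not the implementation of `JavaSerializer.serializedByJDK`, and `JdkStreamProbe`, `looksLikeJdkStream`, and `jdkSerialize` are hypothetical names introduced for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;

// Hypothetical helper for illustration only; not part of Fory.
final class JdkStreamProbe {
    /** Returns true when the bytes begin with the JDK object-stream header. */
    public static boolean looksLikeJdkStream(byte[] bytes) {
        if (bytes == null || bytes.length < 4) {
            return false;
        }
        // The header is two big-endian shorts: magic (0xACED) then version (5).
        int magic = ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
        int version = ((bytes[2] & 0xFF) << 8) | (bytes[3] & 0xFF);
        return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)
            && version == ObjectStreamConstants.STREAM_VERSION;
    }

    /** Serializes a value with plain JDK serialization to produce a sample payload. */
    public static byte[] jdkSerialize(Object value) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
            oos.writeObject(value);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        System.out.println(looksLikeJdkStream(jdkSerialize("hello")));   // prints "true"
        System.out.println(looksLikeJdkStream(new byte[] {1, 2, 3, 4})); // prints "false"
    }
}
```

A header probe like this only says that the bytes look like a JDK stream; it does not make the payload safe to deserialize, so the registration and trust rules in this guide still apply to whichever path handles the bytes.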
- -## Object Graphs And Reference Tracking - -Native mode supports shared references and circular references when reference tracking is enabled: - -```java -Fory fory = Fory.builder() - .withXlang(false) - .withRefTracking(true) - .build(); -``` - -Disable reference tracking only for value-shaped graphs where identity and cycles are not part of -the data model. - -## Object Copy - -Fory can deep-copy Java objects without materializing a byte array. For full copy semantics, custom -copy hooks, and troubleshooting, see [Object Copy](object-copy.md). - -```java -Fory fory = Fory.builder() - .withXlang(false) - .withRefCopy(true) - .build(); - -MyClass copy = fory.copy(original); -``` - -## Zero-Copy Serialization - -Native mode supports out-of-band `BufferObject` payloads for large binary values and primitive -arrays: - -```java -import java.util.ArrayList; -import java.util.Arrays; -import java.util.Collection; -import java.util.List; -import java.util.stream.Collectors; -import org.apache.fory.Fory; -import org.apache.fory.memory.MemoryBuffer; -import org.apache.fory.serializer.BufferObject; - -Fory fory = Fory.builder() - .withXlang(false) - .build(); - -List<Object> value = Arrays.asList("str", new byte[1000], new int[100], new double[100]); -Collection<BufferObject> bufferObjects = new ArrayList<>(); -byte[] bytes = fory.serialize(value, bufferObject -> !bufferObjects.add(bufferObject)); -List<MemoryBuffer> buffers = bufferObjects.stream() - .map(BufferObject::toBuffer) - .collect(Collectors.toList()); - -Object decoded = fory.deserialize(bytes, buffers); -``` - -The callback returns `false` for buffers that should be sent out-of-band. The main byte array still -contains the root object graph and references the buffers in callback order. - -## Registration And Security - -Class registration is enabled by default.
Register application classes before serializing them: - -```java -Fory fory = Fory.builder() - .withXlang(false) - .requireClassRegistration(true) - .build(); - -fory.register(MyClass.class); -``` - -`requireClassRegistration(false)` is available for trusted environments that need dynamic class -loading, but deserializing unregistered classes from untrusted input is unsafe. Keep class -registration enabled for service boundaries unless a `TypeChecker` or allow-list policy owns the -trust decision. - -## Related Topics - -- [Basic Serialization](basic-serialization.md) - Xlang-first Java quickstart -- [Configuration](configuration.md) - Java builder options -- [Schema Evolution](schema-evolution.md) - Compatible and schema-consistent mode -- [Object Copy](object-copy.md) - Deep-copy semantics -- [GraalVM Support](graalvm-support.md) - Native-image platform support diff --git a/docs/guide/java/native-serialization.md b/docs/guide/java/native-serialization.md new file mode 100644 index 0000000000..133c0d4ad5 --- /dev/null +++ b/docs/guide/java/native-serialization.md @@ -0,0 +1,402 @@ +--- +title: Native Serialization +sidebar_position: 3 +id: native_serialization +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. 
+--- + +Java native serialization is the Java-only wire format selected with `withXlang(false)`. Use it +when every writer and reader is a Java/JVM process and the payload should follow the JVM type system +instead of the portable xlang type system. Native serialization is the right starting point for +Java/JVM-only replacements of JDK serialization, Kryo, FST, Hessian, or Java-only Protocol Buffers +payloads. + +Native serialization in this page means Fory's `xlang=false` wire mode. It is separate from GraalVM +native image support, which is covered in [GraalVM Support](graalvm-support.md). + +Use [Xlang Serialization](xlang-serialization.md), the default Java mode, when bytes must be read by +non-Java Fory runtimes. + +## When To Use Native Serialization + +Use native serialization when: + +- A payload is produced and consumed only by Java/JVM applications. +- The object model uses Java-specific types, JDK collections, wrapper types, inheritance, + interfaces, or polymorphism that do not need a cross-language schema. +- Existing classes rely on JDK serialization hooks such as `writeObject`, `readObject`, + `writeReplace`, `readResolve`, `readObjectNoData`, or `Externalizable`. +- You need Java object copy through `Fory.copy(...)`. +- Large primitive arrays or binary payloads should use native-mode out-of-band buffers. +- You are replacing Java-only serialization frameworks and want the broadest Java object surface. + +Use xlang serialization instead when the payload must be read by Python, C++, Go, Rust, +JavaScript/TypeScript, C#, Swift, Dart, or another non-Java runtime. + +## Create a Native Runtime + +```java +import org.apache.fory.Fory; + +Fory fory = Fory.builder() + .withXlang(false) + .requireClassRegistration(true) + .withRefTracking(true) + .build(); + +byte[] bytes = fory.serialize(object); +Object decoded = fory.deserialize(bytes); +``` + +Create and reuse a `Fory` or `ThreadSafeFory` instance for each configuration. 
Fory creation is not +cheap because the runtime caches class metadata, serializers, and generated code. + +```java +import org.apache.fory.Fory; +import org.apache.fory.ThreadSafeFory; + +ThreadSafeFory fory = Fory.builder() + .withXlang(false) + .requireClassRegistration(true) + .withRefTracking(true) + .buildThreadSafeFory(); + +fory.register(Order.class, 100); +``` + +Register classes and serializers during startup before concurrent serialization starts. Use a +separate runtime when class loader, registration, security, schema evolution, or reference-tracking +settings differ. + +## Schema Evolution + +Native serialization defaults to schema-consistent mode. In schema-consistent mode, writer and +reader classes are expected to have the same schema. This is the most direct native-mode path and is +the right default for lockstep deployments. + +Enable compatible mode when Java classes can evolve independently across writer and reader +deployments: + +```java +Fory fory = Fory.builder() + .withXlang(false) + .withCompatible(true) + .build(); +``` + +Compatible mode lets readers tolerate added, removed, or reordered fields when the schema metadata +remains compatible. It also enables metadata sharing by default. See [Schema Evolution](schema-evolution.md) +for field IDs, class version checks, meta sharing, and unknown-class handling. + +## Registration And Security + +Class registration is enabled by default. Keep it enabled for service boundaries and register +application classes explicitly: + +```java +Fory fory = Fory.builder() + .withXlang(false) + .requireClassRegistration(true) + .build(); + +fory.register(Order.class, 100); +fory.register(LineItem.class, 101); +``` + +Explicit numeric IDs avoid registration-order drift. If you use `fory.register(MyClass.class)` +without an ID, every writer and reader must register classes in the same order. 
Name-based +registration is also available when type ID coordination is harder: + +```java +fory.register(Order.class, "com.example", "Order"); +``` + +Disable class registration only in trusted environments. If you need dynamic class loading, install +a `TypeChecker` or `AllowListChecker` so deserialization can reject unexpected classes: + +```java +import org.apache.fory.Fory; +import org.apache.fory.resolver.AllowListChecker; + +AllowListChecker checker = new AllowListChecker(AllowListChecker.CheckLevel.STRICT); +checker.allowClass("com.example.*"); + +Fory fory = Fory.builder() + .withXlang(false) + .requireClassRegistration(false) + .withTypeChecker(checker) + .withMaxDepth(100) + .build(); +``` + +Use `withMaxDepth(...)` to cap object graph depth for untrusted or externally supplied payloads. +See [Type Registration](type-registration.md) for the full security configuration. + +## Java Object Surface + +Native serialization owns the Java-specific object surface: + +- POJOs, records, enums, primitive arrays, object arrays, and common JDK collections. +- Inheritance, interfaces, polymorphic fields, shared references, and circular object graphs. +- Java wrapper and collection behavior that does not have to map to a portable xlang type. +- JDK serialization hooks for classes that require Java serialization compatibility. +- Custom serializers registered with `registerSerializer(...)` or `registerSerializerAndType(...)`. + +For ordinary application classes, Fory can use generated serializers and avoid JDK +`ObjectOutputStream` semantics. Classes that require JDK serialization hooks may use the Java +serialization-compatible path; prefer a Fory custom serializer for hot classes when the hook-based +path is too expensive. 
+ +## JDK Serialization Hooks + +Java native mode supports the JDK serialization hooks that are part of many existing Java object +models: + +- `writeObject` and `readObject` +- `writeReplace` and `readResolve` +- `readObjectNoData` +- `Externalizable` + +```java +import java.io.IOException; +import java.io.ObjectInputStream; +import java.io.ObjectOutputStream; +import java.io.Serializable; + +public class MyClass implements Serializable { + private void writeObject(ObjectOutputStream out) throws IOException { + // Custom serialization logic. + } + + private void readObject(ObjectInputStream in) throws IOException { + // Custom deserialization logic. + } + + private Object writeReplace() { + return this; + } + + private Object readResolve() { + return this; + } +} +``` + +Fory native payloads are not JDK `ObjectOutputStream` payloads. The hooks are honored for +Java-object compatibility, but new payloads should be written and read by Fory. + +## Migrating From Java Serialization Frameworks + +When replacing JDK serialization, Kryo, FST, Hessian, or a Java-only Protocol Buffers pipeline: + +1. Start with `.withXlang(false)` because the data is Java-only. +2. Keep `requireClassRegistration(true)` and register application classes with explicit IDs. +3. Use `.withCompatible(true)` if writer and reader deployments roll independently. +4. Enable `.withRefTracking(true)` only when identity or circular references matter. +5. Add custom serializers for hot classes that would otherwise use expensive JDK serialization hooks. +6. Keep old and new byte streams separated when possible. 
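When old and new byte streams cannot be kept fully separate (step 6), the boundary needs some way to tell the formats apart. The sketch below uses only JDK constants: a JDK `ObjectOutputStream` payload starts with `ObjectStreamConstants.STREAM_MAGIC` (`0xACED`) followed by `STREAM_VERSION` (`5`). `JdkStreamSniffer` and its method names are hypothetical and are not part of Fory; this is not a description of how any Fory helper is implemented, and a four-byte prefix match is only a heuristic.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamConstants;
import java.io.UncheckedIOException;

/** Hypothetical helper, not part of Fory: checks for the JDK object-stream header. */
final class JdkStreamSniffer {
  private JdkStreamSniffer() {}

  /** Returns true when the bytes start with STREAM_MAGIC (0xACED) and STREAM_VERSION (5). */
  static boolean looksLikeJdkStream(byte[] bytes) {
    if (bytes == null || bytes.length < 4) {
      return false;
    }
    // The header is two big-endian unsigned 16-bit values.
    int magic = ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
    int version = ((bytes[2] & 0xFF) << 8) | (bytes[3] & 0xFF);
    return magic == (ObjectStreamConstants.STREAM_MAGIC & 0xFFFF)
        && version == (ObjectStreamConstants.STREAM_VERSION & 0xFFFF);
  }

  /** Serializes a value with plain JDK serialization, for demonstration only. */
  static byte[] jdkSerialize(Object value) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    try (ObjectOutputStream oos = new ObjectOutputStream(out)) {
      oos.writeObject(value);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return out.toByteArray();
  }
}
```

A check like this belongs only at the one boundary that truly receives both formats; everywhere else, routing by topic, column, or file location avoids sniffing bytes at all.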
+ +When an application must read data that may be either JDK `ObjectOutputStream` bytes or Fory +native-mode bytes, `JavaSerializer.serializedByJDK` can identify the JDK payload before falling +back to Fory: + +```java +import java.io.ByteArrayInputStream; +import java.io.ObjectInputStream; +import org.apache.fory.serializer.JavaSerializer; + +if (JavaSerializer.serializedByJDK(bytes)) { + ObjectInputStream objectInputStream = new ObjectInputStream(new ByteArrayInputStream(bytes)); + return objectInputStream.readObject(); +} +return fory.deserialize(bytes); +``` + +Use this bridge only at boundaries that actually accept both formats. Native-mode Fory payloads +should otherwise be written and read by Fory directly. + +## Object Graphs And Reference Tracking + +Native mode supports shared references and circular references when reference tracking is enabled: + +```java +Fory fory = Fory.builder() + .withXlang(false) + .withRefTracking(true) + .build(); +``` + +Disable reference tracking only for value-shaped graphs where identity and cycles are not part of +the data model: + +```java +Fory fory = Fory.builder() + .withXlang(false) + .withRefTracking(false) + .build(); +``` + +Reference tracking is a semantic choice. Turning it off can improve performance and reduce payload +size, but repeated references deserialize as distinct objects and cycles are unsupported. + +## Object Copy + +Fory can deep-copy Java objects without materializing a byte array. For full copy semantics, custom +copy hooks, and troubleshooting, see [Object Copy](object-copy.md). + +```java +Fory fory = Fory.builder() + .withXlang(false) + .withRefCopy(true) + .build(); + +MyClass copy = fory.copy(original); +``` + +`withRefCopy(true)` controls reference preservation for copy operations. It is separate from +`withRefTracking(...)`, which controls serialization and deserialization. 
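Reference tracking and reference-preserving copy rest on the same idea: remembering which objects have already been visited so that shared references and cycles map to a single result object. The following is a conceptual sketch in plain JDK code, not Fory's implementation; `GraphNode` and `IdentityCopier` are hypothetical names used only for illustration.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical linked node used only for this illustration. */
final class GraphNode {
  String value;
  GraphNode next;

  GraphNode(String value) {
    this.value = value;
  }
}

/** Conceptual sketch, not Fory's implementation: deep copy that preserves identity. */
final class IdentityCopier {
  // Keyed by object identity (==), not equals(), so distinct-but-equal nodes stay distinct.
  private final Map<GraphNode, GraphNode> copies = new IdentityHashMap<>();

  GraphNode copy(GraphNode original) {
    if (original == null) {
      return null;
    }
    GraphNode existing = copies.get(original);
    if (existing != null) {
      return existing; // Already copied: reuse it so shared references and cycles survive.
    }
    GraphNode copy = new GraphNode(original.value);
    copies.put(original, copy); // Register before recursing so a cycle terminates.
    copy.next = copy(original.next);
    return copy;
  }
}
```

Without the identity map, the same traversal would either duplicate shared objects or recurse forever on a cycle, which is the trade-off described above for disabling reference tracking.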
+
+## Zero-Copy Serialization
+
+Native mode supports out-of-band `BufferObject` payloads for large binary values and primitive
+arrays:
+
+```java
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collection;
+import java.util.List;
+import java.util.stream.Collectors;
+import org.apache.fory.Fory;
+import org.apache.fory.memory.MemoryBuffer;
+import org.apache.fory.serializer.BufferObject;
+
+Fory fory = Fory.builder()
+  .withXlang(false)
+  .build();
+
+List<Object> value = Arrays.asList("str", new byte[1000], new int[100], new double[100]);
+Collection<BufferObject> bufferObjects = new ArrayList<>();
+byte[] bytes = fory.serialize(value, bufferObject -> !bufferObjects.add(bufferObject));
+List<MemoryBuffer> buffers = bufferObjects.stream()
+  .map(BufferObject::toBuffer)
+  .collect(Collectors.toList());
+
+Object decoded = fory.deserialize(bytes, buffers);
+```
+
+The callback returns `false` for buffers that should be sent out-of-band. The main byte array still
+contains the root object graph and references the buffers in callback order.
+
+Use this when the transport can carry the main payload and buffers separately. If the stream is
+stored or sent as one byte array, omit the callback and let Fory keep buffer contents in-band.
+
+Native serialization also supports byte arrays, `MemoryBuffer`, `ByteBuffer`, `OutputStream`,
+`ForyInputStream`, and `ForyReadableChannel` APIs. Choose the API that matches the boundary you
+already own; avoid copying through `byte[]` when a buffer or stream is already available.
+
+## Class Loaders
+
+```java
+ClassLoader loader = Thread.currentThread().getContextClassLoader();
+
+Fory fory = Fory.builder()
+  .withXlang(false)
+  .withClassLoader(loader)
+  .build();
+```
+
+Each `Fory` instance is tied to one class loader because class metadata and serializers are cached.
+Build a separate runtime for each application, plugin, or tenant class loader instead of switching
+loaders on an existing runtime.
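One way to follow the "one runtime per class loader" guidance is a small registry that builds a runtime lazily for each loader and hands back the same instance afterwards. The sketch below is an assumption about how an application might organize this, not a Fory API: `PerLoaderRuntimes` is a hypothetical name, and the runtime type is left generic so the example depends only on the JDK.

```java
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical helper, not a Fory API: caches one runtime per class loader. */
final class PerLoaderRuntimes<R> {
  private final Map<ClassLoader, R> runtimes = new ConcurrentHashMap<>();
  private final Function<ClassLoader, R> factory;

  PerLoaderRuntimes(Function<ClassLoader, R> factory) {
    this.factory = Objects.requireNonNull(factory, "factory");
  }

  /** Returns the runtime for this loader, building it on first use. */
  R forLoader(ClassLoader loader) {
    Objects.requireNonNull(loader, "loader");
    return runtimes.computeIfAbsent(loader, factory);
  }

  /** Drops the cached runtime, for example when a plugin or tenant is unloaded. */
  R release(ClassLoader loader) {
    Objects.requireNonNull(loader, "loader");
    return runtimes.remove(loader);
  }
}
```

In an application, the factory would typically be something like `loader -> Fory.builder().withXlang(false).withClassLoader(loader).build()`. The map holds strong references to each loader, so calling `release(...)` on unload matters if class loaders are meant to be garbage-collected.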
+ +## Performance Guidelines + +- Reuse `Fory` or `ThreadSafeFory` instances instead of rebuilding them per request. +- Register classes with explicit numeric IDs for compact type metadata and stable deployments. +- Keep schema-consistent mode for lockstep Java services; enable compatible mode only when schema + evolution requires it. +- Disable reference tracking for value-shaped graphs with no identity or cycles. +- Use async compilation on ordinary JVMs when startup latency can tolerate interpreter-first + serialization: + + ```java + Fory fory = Fory.builder() + .withXlang(false) + .withAsyncCompilation(true) + .build(); + ``` + +- Keep runtime code generation enabled on ordinary JVMs. Use static generated serializers for + GraalVM native image and Android flows. +- Use zero-copy out-of-band buffers for large primitive arrays or binary fields when the transport + supports split payloads. +- Replace expensive JDK serialization hooks with Fory custom serializers for hot classes when the + object contract allows it. + +## Native And Xlang Comparison + +| Requirement | Use native serialization | Use xlang serialization | +| ------------------------------------------- | ------------------------ | ----------------------- | +| Java/JVM-only payloads | Yes | Optional | +| Non-Java readers or writers | No | Yes | +| Broad Java object surface | Yes | Limited to xlang types | +| JDK serialization hooks | Yes | No | +| Java object copy | Yes | No | +| Portable type mapping across runtimes | No | Yes | +| Compatible schema evolution by default | No | Yes | +| Schema-consistent same-language performance | Yes | No | + +## Troubleshooting + +### A non-Java runtime cannot read the payload + +The writer is using native serialization. Rebuild the writer with `.withXlang(true)` and align type +registration with every peer runtime. + +### A class is rejected during deserialization + +Keep class registration enabled and register the class on both writer and reader. 
If dynamic class +loading is intentional, use `requireClassRegistration(false)` only with an allow-listing +`TypeChecker`. + +### A rolling deployment fails after a field change + +Native serialization defaults to schema-consistent mode. Use `.withCompatible(true)` when writer and +reader versions can differ, and add stable field metadata for long-lived schemas. + +### Object identity is not preserved + +Enable `.withRefTracking(true)` for serialization and deserialization. For `Fory.copy(...)`, enable +`.withRefCopy(true)`. + +### A migrated boundary receives both JDK and Fory bytes + +Use `JavaSerializer.serializedByJDK(...)` only at the mixed-format boundary, then route JDK bytes to +`ObjectInputStream` and Fory native bytes to `fory.deserialize(...)`. + +## Related Topics + +- [Basic Serialization](basic-serialization.md) - Xlang-first Java quickstart +- [Xlang Serialization](xlang-serialization.md) - Cross-runtime Java payloads +- [Configuration](configuration.md) - Java builder options +- [Schema Evolution](schema-evolution.md) - Compatible and schema-consistent mode +- [Type Registration](type-registration.md) - Registration and security +- [Object Copy](object-copy.md) - Deep-copy semantics +- [Custom Serializers](custom-serializers.md) - Custom Java serializers +- [Static Generated Serializers](static-generated-serializers.md) - Build-time generated serializers +- [GraalVM Support](graalvm-support.md) - Native-image platform support diff --git a/docs/guide/java/row-format.md b/docs/guide/java/row-format.md index dd43df8258..477f9ec136 100644 --- a/docs/guide/java/row-format.md +++ b/docs/guide/java/row-format.md @@ -1,6 +1,6 @@ --- title: Row Format -sidebar_position: 11 +sidebar_position: 12 id: row_format license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -189,6 +189,6 @@ std::string str = bar10->get_string(0); ## Related Topics -- [Cross-Language Serialization](cross-language.md) - xlang mode +- [Xlang 
Serialization](xlang-serialization.md) - xlang mode - [Advanced Features](advanced-features.md) - Zero-copy serialization - [Row Format Specification](https://fory.apache.org/docs/specification/row_format_spec) - Protocol details diff --git a/docs/guide/java/schema-evolution.md b/docs/guide/java/schema-evolution.md index 76dbdf500b..21270cc1a3 100644 --- a/docs/guide/java/schema-evolution.md +++ b/docs/guide/java/schema-evolution.md @@ -242,5 +242,5 @@ public class DeserializeIntoType { ## Related Topics - [Configuration](configuration.md) - All ForyBuilder options -- [Cross-Language Serialization](cross-language.md) - xlang mode +- [Xlang Serialization](xlang-serialization.md) - xlang mode - [Troubleshooting](troubleshooting.md) - Common schema issues diff --git a/docs/guide/java/schema-metadata.md b/docs/guide/java/schema-metadata.md index 49b5eccdd6..e3fcc595a9 100644 --- a/docs/guide/java/schema-metadata.md +++ b/docs/guide/java/schema-metadata.md @@ -714,4 +714,4 @@ public class User { - [Basic Serialization](basic-serialization.md) - Getting started with Fory serialization - [Configuration](configuration.md) - Runtime builder options - [Schema Evolution](schema-evolution.md) - Compatible mode and schema evolution -- [Cross-Language](cross-language.md) - Interoperability with Python, Rust, C++, Go +- [Xlang Serialization](xlang-serialization.md) - Interoperability with Python, Rust, C++, Go diff --git a/docs/guide/java/troubleshooting.md b/docs/guide/java/troubleshooting.md index 435528f610..e04a325185 100644 --- a/docs/guide/java/troubleshooting.md +++ b/docs/guide/java/troubleshooting.md @@ -198,4 +198,4 @@ FORY_LOG_LEVEL=INFO mvn test -Dtest=org.apache.fory.TestClass#testMethod - [Configuration](configuration.md) - All ForyBuilder options - [Schema Evolution](schema-evolution.md) - Compatible mode details - [Type Registration](type-registration.md) - Registration best practices -- [Native Mode](native-mode.md) - Java-only serialization features +- [Native 
Serialization](native-serialization.md) - Java-only serialization features diff --git a/docs/guide/java/cross-language.md b/docs/guide/java/xlang-serialization.md similarity index 61% rename from docs/guide/java/cross-language.md rename to docs/guide/java/xlang-serialization.md index e29f7b498e..425d5b41b3 100644 --- a/docs/guide/java/cross-language.md +++ b/docs/guide/java/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization -sidebar_position: 4 -id: cross_language +title: Xlang Serialization +sidebar_position: 2 +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -19,32 +19,43 @@ license: | limitations under the License. --- -Apache Fory™ supports seamless data exchange between Java and other languages (Python, Rust, Go, JavaScript, etc.) through the xlang serialization format. This enables multi-language microservices, polyglot data pipelines, and cross-platform data sharing. +Apache Fory™ xlang serialization is the Java wire mode for payloads that must be read by Python, +Rust, Go, JavaScript, C++, C#, Swift, Dart, or another non-Java Fory runtime. Java defaults to +xlang mode with compatible schema evolution, but examples set the mode explicitly so the payload +contract is visible in code. ## Create an Xlang Runtime -Java defaults to xlang mode with compatible schema evolution. Set the mode explicitly in xlang examples: +Use one long-lived `Fory` or `ThreadSafeFory` instance per configuration. Creating a runtime is +expensive because Fory caches type metadata and generated serializers. 
```java -import org.apache.fory.*; -import org.apache.fory.config.*; +import org.apache.fory.Fory; Fory fory = Fory.builder() .withXlang(true) - .withRefTracking(true) // Enable reference tracking for complex graphs + .requireClassRegistration(true) + .withRefTracking(true) .build(); ``` -## Register Types for Cross-Language Compatibility +`withRefTracking(true)` is required only when the cross-language data model includes shared object +identity or cycles. Disable it for value-shaped schemas. -Types must be registered with **consistent IDs or names** across all languages. Fory supports two registration methods: +Use [Native Serialization](native-serialization.md) instead when every writer and reader is Java +and the payload should preserve Java-specific object behavior. + +## Register Types + +Types must be registered with consistent IDs or names across all languages. Fory supports two +registration methods. ### Register by ID (Recommended for Performance) ```java public record Person(String name, int age) {} -// Register with numeric ID - faster and more compact +// Numeric ID registration is compact and fast. fory.register(Person.class, 1); Person person = new Person("Alice", 30); @@ -52,50 +63,55 @@ byte[] bytes = fory.serialize(person); // bytes can be deserialized by Python, Rust, Go, etc. ``` -**Benefits**: Faster serialization, smaller binary size -**Trade-offs**: Requires coordination to avoid ID conflicts across teams/services +Benefits: faster serialization and smaller binary size. + +Trade-off: every service must coordinate IDs so the same logical type uses the same number. ### Register by Name (Recommended for Flexibility) ```java public record Person(String name, int age) {} -// Register with string name - more flexible -fory.register(Person.class, "example.Person"); +// Namespace/type-name registration is easier to coordinate across teams. 
+fory.register(Person.class, "example", "Person"); Person person = new Person("Alice", 30); byte[] bytes = fory.serialize(person); // bytes can be deserialized by Python, Rust, Go, etc. ``` -**Benefits**: Less prone to conflicts, easier management across teams, no coordination needed -**Trade-offs**: Slightly larger binary size due to string encoding +Benefits: less risk of numeric ID conflicts and easier management across independently owned +services. + +Trade-off: the payload includes string identity, so it is larger than ID-based registration. + +The Java API also supports a single string type name, such as +`fory.register(Person.class, "example.Person")`. Use the same logical identity on every runtime. -## Cross-Language Example: Java ↔ Python +## Java To Python Example ### Java (Serializer) ```java -import org.apache.fory.*; -import org.apache.fory.config.*; +import org.apache.fory.Fory; public record Person(String name, int age) {} public class Example { - public static void main(String[] args) { - Fory fory = Fory.builder() - .withXlang(true) - .withRefTracking(true) - .build(); + public static void main(String[] args) { + Fory fory = Fory.builder() + .withXlang(true) + .withRefTracking(true) + .build(); - // Register with consistent name - fory.register(Person.class, "example.Person"); + // Register with the same logical name used by Python. + fory.register(Person.class, "example.Person"); - Person person = new Person("Bob", 25); - byte[] bytes = fory.serialize(person); + Person person = new Person("Bob", 25); + byte[] bytes = fory.serialize(person); - // Send bytes to Python service via network/file/queue - } + // Send bytes to Python by your service transport. + } } ``` @@ -112,10 +128,9 @@ class Person: fory = pyfory.Fory(xlang=True, ref=True) -# Register with the SAME name as Java +# Register with the same name as Java. 
fory.register_type(Person, typename="example.Person") -# Deserialize bytes from Java person = fory.deserialize(bytes_from_java) print(f"{person.name}, {person.age}") # Output: Bob, 25 ``` @@ -126,25 +141,24 @@ Xlang mode supports circular and shared references when reference tracking is en ```java public class Node { - public String value; - public Node next; - public Node parent; + public String value; + public Node next; + public Node parent; } Fory fory = Fory.builder() .withXlang(true) - .withRefTracking(true) // Required for circular references + .withRefTracking(true) .build(); fory.register(Node.class, "example.Node"); -// Create circular reference Node node1 = new Node(); node1.value = "A"; Node node2 = new Node(); node2.value = "B"; node1.next = node2; -node2.parent = node1; // Circular reference +node2.parent = node1; byte[] bytes = fory.serialize(node1); // Python/Rust/Go can correctly deserialize this with circular references preserved @@ -154,12 +168,16 @@ byte[] bytes = fory.serialize(node1); Not all Java types have equivalents in other languages. When using xlang mode: -- Use **primitive types** (`int`, `long`, `double`, `String`) for maximum compatibility -- Use **standard collections** (`List`, `Map`, `Set`) instead of language-specific ones -- Use **reduced-precision carriers** (`Float16`, `BFloat16`, `Float16List`, `BFloat16List`) for 16-bit float payloads -- Treat `Float16[]`, `BFloat16[]`, `Float16List`, and `BFloat16List` as `list` carriers by default; use `@ArrayType` when the schema must be `array` or `array` -- Avoid **Java-specific types** like `Optional`, `BigDecimal` (unless the target language supports them) -- See [Type Mapping Guide](../../specification/xlang_type_mapping.md) for complete compatibility matrix +- Use primitive types (`int`, `long`, `double`, `String`) for maximum compatibility. +- Use standard collections (`List`, `Map`, `Set`) instead of language-specific collections. 
+- Use reduced-precision carriers (`Float16`, `BFloat16`, `Float16List`, `BFloat16List`) for + 16-bit float payloads. +- Treat `Float16[]`, `BFloat16[]`, `Float16List`, and `BFloat16List` as `list` carriers by + default; use `@ArrayType` when the schema must be `array<float16>` or `array<bfloat16>`. +- Avoid Java-specific types like `Optional`, `BigDecimal`, and `EnumSet` unless every target runtime + has an agreed mapping. +- See [Type Mapping Guide](../../specification/xlang_type_mapping.md) for the complete + compatibility matrix. ### Lists and Dense Arrays @@ -226,12 +244,14 @@ Xlang mode has additional overhead compared to Java native mode: - **Disable reference tracking** if you don't need circular references (`withRefTracking(false)`) - **Use native mode** (`withXlang(false)`) when only Java serialization is needed -## Cross-Language Best Practices +## Best Practices -1. **Consistent Registration**: Ensure all services register types with identical IDs/names -2. **Version Compatibility**: Keep compatible mode for schema evolution across services +1. Use explicit type IDs or namespace/type names for every user type. +2. Keep compatible mode for independently deployed services. +3. Test payloads through every runtime before relying on a schema in production. +4. Use native serialization for Java-only traffic that needs Java-specific object behavior.
-## Troubleshooting Cross-Language Serialization +## Troubleshooting ### "Type not registered" errors @@ -250,13 +270,14 @@ Xlang mode has additional overhead compared to Java native mode: ## See Also -- [Cross-Language Serialization Specification](../../specification/xlang_serialization_spec.md) +- [Xlang Serialization Specification](../../specification/xlang_serialization_spec.md) - [Type Mapping Reference](../../specification/xlang_type_mapping.md) -- [Python Cross-Language Guide](../python/cross-language.md) -- [Rust Cross-Language Guide](../rust/cross-language.md) +- [Python Xlang Serialization Guide](../python/xlang-serialization.md) +- [Rust Xlang Serialization Guide](../rust/xlang-serialization.md) ## Related Topics - [Schema Evolution](schema-evolution.md) - Compatible mode - [Type Registration](type-registration.md) - Registration methods +- [Native Serialization](native-serialization.md) - Java-only serialization features - [Row Format](row-format.md) - Cross-language row format diff --git a/docs/guide/javascript/configuration.md b/docs/guide/javascript/configuration.md index aad60758dc..cd113cce50 100644 --- a/docs/guide/javascript/configuration.md +++ b/docs/guide/javascript/configuration.md @@ -96,6 +96,16 @@ const fory = new Fory({ hps }); Leave this unset unless you run on Node.js 20+ and have benchmarked your workload. +## Security + +Security-related configuration: + +- Register only the expected schemas before deserializing untrusted payloads. +- Set `maxDepth`, `maxBinarySize`, and `maxCollectionSize` for the maximum payload shape your + service accepts. +- Prefer explicit `Type.struct(...)` schemas over `Type.any()` for untrusted input. +- Pass `hps` only from the official package version you deploy with the runtime. 
+ ## Related Topics - [Basic Serialization](basic-serialization.md) diff --git a/docs/guide/javascript/index.md b/docs/guide/javascript/index.md index ac6b02b2bb..49d4de7290 100644 --- a/docs/guide/javascript/index.md +++ b/docs/guide/javascript/index.md @@ -23,7 +23,7 @@ Apache Fory JavaScript lets you serialize JavaScript and TypeScript objects to b ## Why Fory JavaScript? -- **Cross-language**: serialize in JavaScript, deserialize in Java, Python, Go, and more without writing glue code +- **Xlang**: serialize in JavaScript, deserialize in Java, Python, Go, and more without writing glue code - **Fast**: serializer code is generated and cached the first time you register a schema, not on every call - **Reference-aware**: shared references and circular object graphs are supported when enabled - **Explicit schemas**: field types, nullability, and polymorphism are declared once with `Type.*` builders or TypeScript decorators @@ -113,10 +113,10 @@ options; see [Configuration](configuration.md). 
| [Supported Types](supported-types.md) | Primitive, collection, time, enum, and struct mappings | | [References](references.md) | Shared references and circular object graphs | | [Schema Evolution](schema-evolution.md) | Compatible mode and evolving structs | -| [Cross-Language](cross-language.md) | Interop guidance and mapping rules | +| [Xlang Serialization](xlang-serialization.md) | Interop guidance and mapping rules | | [Troubleshooting](troubleshooting.md) | Common issues, limits, and debugging tips | ## Related Resources - [Xlang Serialization Specification](../../specification/xlang_serialization_spec.md) -- [Cross-Language Type Mapping](../../specification/xlang_type_mapping.md) +- [Xlang Type Mapping](../../specification/xlang_type_mapping.md) diff --git a/docs/guide/javascript/references.md b/docs/guide/javascript/references.md index 258e83a643..ef5f2fbd9d 100644 --- a/docs/guide/javascript/references.md +++ b/docs/guide/javascript/references.md @@ -100,7 +100,7 @@ Leave it disabled when: - you want the lowest overhead - object identity does not matter -## Cross-Language Note +## Xlang Note Reference tracking is part of the Fory binary protocol and works across runtimes. Both sides must enable reference tracking and mark the same fields as reference-tracked for the behavior to be consistent. 
@@ -108,4 +108,4 @@ Reference tracking is part of the Fory binary protocol and works across runtimes - [Basic Serialization](basic-serialization.md) - [Schema Evolution](schema-evolution.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/javascript/schema-evolution.md b/docs/guide/javascript/schema-evolution.md index 13ab85b3a7..3d6479e353 100644 --- a/docs/guide/javascript/schema-evolution.md +++ b/docs/guide/javascript/schema-evolution.md @@ -89,11 +89,11 @@ const fixedType = Type.struct( | Smallest possible messages | best choice | slightly larger | | Rolling upgrades | risky | safe | -## Cross-Language Requirement +## Xlang Requirement -Compatible mode only protects you from schema differences in the _fields_ of a type. You still need the same type identity (same numeric ID or same `namespace + typeName`) on every side. See [Cross-Language](cross-language.md). +Compatible mode only protects you from schema differences in the _fields_ of a type. You still need the same type identity (same numeric ID or same `namespace + typeName`) on every side. See [Xlang Serialization](xlang-serialization.md). 
## Related Topics - [Type Registration](type-registration.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/javascript/supported-types.md b/docs/guide/javascript/supported-types.md index 4c1d1564bc..48240de8d7 100644 --- a/docs/guide/javascript/supported-types.md +++ b/docs/guide/javascript/supported-types.md @@ -174,4 +174,4 @@ For types that need completely custom encoding, use `Type.ext(...)` and pass a c - [Basic Serialization](basic-serialization.md) - [References](references.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/javascript/troubleshooting.md b/docs/guide/javascript/troubleshooting.md index f39003be19..e23effea80 100644 --- a/docs/guide/javascript/troubleshooting.md +++ b/docs/guide/javascript/troubleshooting.md @@ -103,4 +103,4 @@ const fory = new Fory({ - [Basic Serialization](basic-serialization.md) - [References](references.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/javascript/type-registration.md b/docs/guide/javascript/type-registration.md index bb086fde0a..921d314655 100644 --- a/docs/guide/javascript/type-registration.md +++ b/docs/guide/javascript/type-registration.md @@ -161,13 +161,13 @@ Use **names** when: - schemas are already identified by package/module name - slightly larger metadata overhead is acceptable -## Cross-Language +## Xlang -For a message to round-trip between JavaScript and another runtime, both sides must use the same identity for a given type: same numeric ID, or same `namespace + typeName`. See [Cross-Language](cross-language.md). +For a message to round-trip between JavaScript and another runtime, both sides must use the same identity for a given type: same numeric ID, or same `namespace + typeName`. See [Xlang Serialization](xlang-serialization.md). 
## Related Topics - [Basic Serialization](basic-serialization.md) - [Schema Metadata](schema-metadata.md) - [Schema Evolution](schema-evolution.md) -- [Cross-Language](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) diff --git a/docs/guide/javascript/cross-language.md b/docs/guide/javascript/xlang-serialization.md similarity index 99% rename from docs/guide/javascript/cross-language.md rename to docs/guide/javascript/xlang-serialization.md index b93176d521..13ed956747 100644 --- a/docs/guide/javascript/cross-language.md +++ b/docs/guide/javascript/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization +title: Xlang Serialization sidebar_position: 20 -id: cross_language +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with diff --git a/docs/guide/kotlin/configuration.md b/docs/guide/kotlin/configuration.md index 5defacc3c6..fc44d872ea 100644 --- a/docs/guide/kotlin/configuration.md +++ b/docs/guide/kotlin/configuration.md @@ -114,3 +114,22 @@ val fory = ForyKotlin.builder().withXlang(false) .withLongCompressed(true) .build() ``` + +## Security + +Kotlin uses the Java runtime configuration surface. Keep class registration enabled for production +and any untrusted payload source: + +```kotlin +val fory = ForyKotlin.builder() + .requireClassRegistration(true) + .withMaxDepth(50) + .build() +``` + +Security-related configuration: + +- Keep `requireClassRegistration(true)` and register application classes or generated modules. +- Use `withMaxDepth(...)` to reject unexpectedly deep object graphs. +- Follow [Java Configuration](../java/configuration.md#security) for allow-listing and unknown-class + controls. 
diff --git a/docs/guide/kotlin/index.md b/docs/guide/kotlin/index.md index f7a822b51d..784d103f60 100644 --- a/docs/guide/kotlin/index.md +++ b/docs/guide/kotlin/index.md @@ -94,7 +94,7 @@ Use xlang mode for cross-language payloads and schemas shared with other Fory ru Use native mode for Kotlin/JVM-only traffic. Native mode is selected with `.withXlang(false)`, uses schema-consistent payloads unless compatible mode is enabled, and inherits the JVM native-mode object serialization path from Fory Java while adding Kotlin-specific serializers for data classes, unsigned values, ranges, stdlib types, and generated serializers. It is optimized for JVM and Kotlin type systems and is the right path for same-language Kotlin/JVM framework replacement payloads. -See [Configuration](configuration.md) for Kotlin builder setup and [Java Native Mode](../java/native-mode.md) for the full JVM native-mode behavior. +See [Configuration](configuration.md) for Kotlin builder setup and [Java Native Serialization](../java/native-serialization.md) for the full JVM native-mode behavior. ## Built on Fory Java diff --git a/docs/guide/python/basic-serialization.md b/docs/guide/python/basic-serialization.md index aea4764adf..8048aa8ec3 100644 --- a/docs/guide/python/basic-serialization.md +++ b/docs/guide/python/basic-serialization.md @@ -83,7 +83,7 @@ assert result[0] is result[1] ``` For arbitrary Python object graphs, local classes, functions, and methods, use -[Python Native Mode](python-native.md). +[Native Serialization](native-serialization.md). 
## Performance Tips @@ -108,5 +108,5 @@ for obj in objects: - [Configuration](configuration.md) - Fory parameters - [Type Registration](type-registration.md) - Registration patterns -- [Python Native Mode](python-native.md) - Functions and lambdas +- [Native Serialization](native-serialization.md) - Functions and lambdas - [Out-of-Band Serialization](out-of-band.md) - Buffer callback APIs diff --git a/docs/guide/python/configuration.md b/docs/guide/python/configuration.md index 7ed680ebb0..fc6433debb 100644 --- a/docs/guide/python/configuration.md +++ b/docs/guide/python/configuration.md @@ -1,6 +1,6 @@ --- title: Configuration -sidebar_position: 3 +sidebar_position: 4 id: configuration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -158,9 +158,97 @@ fory = pyfory.Fory( Use `strict=False` only for trusted data, preferably with a `policy=` deserialization policy. +## Security + +Treat native-mode bytes from untrusted sources the same way you would treat untrusted pickle bytes. +Native mode can reconstruct Python objects, import modules, invoke reduction hooks, and rebuild +dynamic classes or functions when `strict=False`. 
+ +### Production Configuration + +Keep `strict=True` for production payloads unless the whole data source is trusted and a +`DeserializationPolicy` owns the remaining trust decisions: + +```python +import pyfory + +fory = pyfory.Fory( + xlang=True, + ref=False, + strict=True, + max_depth=50, +) + +fory.register(UserModel, typename="example.User") +fory.register(OrderModel, typename="example.Order") +``` + +Use dynamic native-mode deserialization (`strict=False`) only for trusted Python-only payloads: + +```python +import pyfory + +fory = pyfory.Fory( + xlang=False, + ref=True, + strict=False, + max_depth=100, +) +``` + +### DeserializationPolicy + +When `strict=False` is necessary, use `DeserializationPolicy` to restrict the dynamic types and +hooks accepted during deserialization: + +```python +import pyfory +from pyfory import DeserializationPolicy + +dangerous_modules = {"subprocess", "os", "__builtin__"} + +class SafeDeserializationPolicy(DeserializationPolicy): + def validate_class(self, cls, is_local, **kwargs): + if cls.__module__ in dangerous_modules: + raise ValueError(f"Blocked dangerous class: {cls.__module__}.{cls.__name__}") + return None + + def intercept_reduce_call(self, callable_obj, args, **kwargs): + if getattr(callable_obj, "__name__", "") == "Popen": + raise ValueError("Blocked attempt to invoke subprocess.Popen") + return None + + def intercept_setstate(self, obj, state, **kwargs): + if isinstance(state, dict) and "password" in state: + state["password"] = "***REDACTED***" + return None + +policy = SafeDeserializationPolicy() +fory = pyfory.Fory(xlang=False, ref=True, strict=False, policy=policy) +``` + +Available policy hooks include: + +| Hook | Description | +| -------------------------------------------- | --------------------------------------------------- | +| `validate_class(cls, is_local)` | Validate or block class types | +| `validate_module(module, is_local)` | Validate or block module imports | +| `validate_function(func, 
is_local)` | Validate or block function references | +| `intercept_reduce_call(callable_obj, args)` | Intercept `__reduce__` invocations | +| `inspect_reduced_object(obj)` | Inspect or replace objects created via `__reduce__` | +| `intercept_setstate(obj, state)` | Sanitize state before `__setstate__` | +| `authorize_instantiation(cls, args, kwargs)` | Control class instantiation | + +### Security Checklist + +- Keep `strict=True` for untrusted data. +- Register all expected application types before deserialization. +- Use `DeserializationPolicy` when `strict=False` is necessary. +- Keep `max_depth` low enough to reject unexpectedly deep payloads. +- Do not treat xlang/native mode choice as a security control. + ## Related Topics - [Basic Serialization](basic-serialization.md) - Using configured Fory - [Type Registration](type-registration.md) - Registration patterns -- [Python Native Mode](python-native.md) - Python-only object serialization -- [Security](security.md) - Security best practices +- [Native Serialization](native-serialization.md) - Python-only object serialization diff --git a/docs/guide/python/custom-serializers.md b/docs/guide/python/custom-serializers.md index 5edc35a22a..520f0d484a 100644 --- a/docs/guide/python/custom-serializers.md +++ b/docs/guide/python/custom-serializers.md @@ -1,6 +1,6 @@ --- title: Custom Serializers -sidebar_position: 12 +sidebar_position: 10 id: custom_serializers license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -135,4 +135,4 @@ fory.register(MyClass, typename="com.example.MyClass", serializer=MySerializer(f - [Type Registration](type-registration.md) - Registration patterns - [Configuration](configuration.md) - Fory parameters -- [Cross-Language](cross-language.md) - type registration and schema rules for xlang +- [Xlang Serialization](xlang-serialization.md) - type registration and schema rules for xlang diff --git a/docs/guide/python/index.md b/docs/guide/python/index.md index 
7a7323b4ea..899f087eaf 100644 --- a/docs/guide/python/index.md +++ b/docs/guide/python/index.md @@ -148,16 +148,17 @@ Use xlang mode for cross-language payloads and dataclass schemas shared with oth Use native mode for Python-only traffic. Native mode is selected with `xlang=False`, uses schema-consistent payloads unless compatible mode is enabled, and owns pickle/cloudpickle-style behavior such as functions, lambdas, classes, methods, `__reduce__`, `__getstate__`, and out-of-band pickle protocol 5 buffers. It is optimized for Python's type system and supports a broader Python object surface than xlang mode, so use it when replacing pickle or cloudpickle. -See [Python Native Mode](python-native.md) for Python-only serialization details and [Cross-Language](cross-language.md) for Python xlang registration and interoperability rules. +See [Native Serialization](native-serialization.md) for Python-only serialization details and [Xlang Serialization](xlang-serialization.md) for Python xlang registration and interoperability rules. 
## Next Steps -- [Configuration](configuration.md) - Fory parameters and modes - [Basic Serialization](basic-serialization.md) - Basic usage patterns -- [Python Native Mode](python-native.md) - Functions, lambdas, classes -- [Cross-Language](cross-language.md) - xlang mode +- [Xlang Serialization](xlang-serialization.md) - xlang mode +- [Native Serialization](native-serialization.md) - Python-only serialization +- [Configuration](configuration.md) - Fory parameters, modes, and security +- [Type Registration](type-registration.md) - User-defined type registration +- [Custom Serializers](custom-serializers.md) - Extend serialization behavior - [Row Format](row-format.md) - Zero-copy row format -- [Security](security.md) - Security best practices ## Links diff --git a/docs/guide/python/native-serialization.md b/docs/guide/python/native-serialization.md new file mode 100644 index 0000000000..b67bc2f662 --- /dev/null +++ b/docs/guide/python/native-serialization.md @@ -0,0 +1,341 @@ +--- +title: Native Serialization +sidebar_position: 3 +id: native_serialization +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. + The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +Python native serialization is the Python-only wire mode selected with `xlang=False`. 
Use it when +every writer and reader is Python and the payload should follow Python's object model instead of +the portable xlang type system. + +Use [Xlang Serialization](xlang-serialization.md), the default Python mode, when bytes must be read +by Java, Go, Rust, C++, JavaScript, or another non-Python Fory runtime. + +## When To Use Native Serialization + +Use native serialization when: + +- A payload is produced and consumed only by Python applications. +- You are replacing `pickle` or `cloudpickle` for Python-only object graphs. +- The data model includes functions, lambdas, local classes, methods, or Python reduction hooks. +- The graph can contain shared objects or cycles that need Python reference tracking. +- You need pickle protocol 5-style out-of-band buffers for large Python data objects. + +Native mode can serialize Python-specific values such as global functions, local functions, lambdas, +local classes, methods, and objects customized with `__getstate__`, `__setstate__`, `__reduce__`, +or `__reduce_ex__`. Those values are not valid xlang payloads. + +## Create a Native Runtime + +Create `Fory` with `xlang=False`: + +```python +import pyfory +fory = pyfory.Fory(xlang=False, ref=False, strict=True) +``` + +Keep `strict=True` for registered, trusted type surfaces. Use `strict=False` only when native-mode +payloads need dynamic Python types such as functions, local classes, or objects reconstructed by +reduction hooks. 
+ +## Common Usage + +```python +import pyfory + +fory = pyfory.Fory(xlang=False, ref=True, strict=False) + +data = fory.dumps({"name": "Alice", "age": 30, "scores": [95, 87, 92]}) +print(fory.loads(data)) + +from dataclasses import dataclass + +@dataclass +class Person: + name: str + age: int + +person = Person("Bob", 25) +data = fory.dumps(person) +print(fory.loads(data)) # Person(name='Bob', age=25) +``` + +Use `dumps`/`loads` for pickle-style APIs, or `serialize`/`deserialize` when matching the xlang +API shape in code that switches modes explicitly. + +## Security And Dynamic Types + +Native mode can reconstruct Python objects that execute import and construction logic during +deserialization. Treat untrusted native-mode bytes the same way you would treat untrusted pickle +bytes. + +- Keep `strict=True` when deserializing data that should contain only registered or built-in types. +- Use `strict=False` only for trusted payloads that require dynamic Python classes or functions. +- Provide a `policy=` deserialization policy when dynamic types are required but the accepted type + surface should still be restricted. +- Do not use xlang/native mode choice as a security control. Apply strict mode, policies, + registration, and resource limits based on the payload source. + +## References And Cycles + +Enable `ref=True` when object identity, shared references, or cycles must round-trip: + +```python +import pyfory + +fory = pyfory.Fory(xlang=False, ref=True, strict=True) + +node = {} +node["self"] = node +data = fory.dumps(node) +decoded = fory.loads(data) +assert decoded["self"] is decoded +``` + +Disable reference tracking for value-shaped payloads that do not need identity preservation. It +keeps the payload smaller and the hot path simpler. + +## Pickle And Cloudpickle Replacement + +Native mode is the Python mode to choose when the existing boundary uses `pickle` or +`cloudpickle`. 
It supports richer Python values than JSON and xlang mode, including Python +functions, local classes, closures, and reduction hooks. + +Use xlang mode instead when the payload crosses language boundaries or the data model should be a +portable schema shared with other Fory runtimes. + +## Serialize Global Functions + +Capture and serialize functions defined at module level. Fory deserializes and returns the same +function object: + +```python +import pyfory + +fory = pyfory.Fory(xlang=False, ref=True, strict=False) + +def my_global_function(x): + return 10 * x + +data = fory.dumps(my_global_function) +print(fory.loads(data)(10)) # 100 +``` + +## Serialize Local Functions/Lambdas + +Serialize functions with closures and lambda expressions. Fory captures the closure variables +automatically: + +```python +import pyfory + +fory = pyfory.Fory(xlang=False, ref=True, strict=False) + +# Local functions with closures +def my_function(): + local_var = 10 + def local_func(x): + return x * local_var + return local_func + +data = fory.dumps(my_function()) +print(fory.loads(data)(10)) # 100 + +# Lambdas +data = fory.dumps(lambda x: 10 * x) +print(fory.loads(data)(10)) # 100 +``` + +## Serialize Global Classes/Methods + +Serialize class objects, instance methods, class methods, and static methods: + +```python +from dataclasses import dataclass +import pyfory +fory = pyfory.Fory(xlang=False, ref=True, strict=False) + +@dataclass +class Person: + name: str + age: int + + def f(self, x): + return self.age * x + + @classmethod + def g(cls, x): + return 10 * x + + @staticmethod + def h(x): + return 10 * x + +# Serialize global class +print(fory.loads(fory.dumps(Person))("Bob", 25)) # Person(name='Bob', age=25) + +# Serialize instance method +print(fory.loads(fory.dumps(Person("Bob", 20).f))(10)) # 200 + +# Serialize class method +print(fory.loads(fory.dumps(Person.g))(10)) # 100 + +# Serialize static method +print(fory.loads(fory.dumps(Person.h))(10)) # 100 +``` + +## Serialize 
Local Classes/Methods + +Serialize classes defined inside functions along with their methods: + +```python +from dataclasses import dataclass +import pyfory +fory = pyfory.Fory(xlang=False, ref=True, strict=False) + +def create_local_class(): + class LocalClass: + def f(self, x): + return 10 * x + + @classmethod + def g(cls, x): + return 10 * x + + @staticmethod + def h(x): + return 10 * x + return LocalClass + +# Serialize local class +data = fory.dumps(create_local_class()) +print(fory.loads(data)().f(10)) # 100 + +# Serialize local class instance method +data = fory.dumps(create_local_class()().f) +print(fory.loads(data)(10)) # 100 + +# Serialize local class method +data = fory.dumps(create_local_class().g) +print(fory.loads(data)(10)) # 100 + +# Serialize local class static method +data = fory.dumps(create_local_class().h) +print(fory.loads(data)(10)) # 100 +``` + +## Custom Python Object Hooks + +Native mode respects common Python customization hooks: + +```python +import pyfory + +class SessionToken: + def __init__(self, value): + self.value = value + + def __getstate__(self): + return {"value": self.value} + + def __setstate__(self, state): + self.value = state["value"] + +fory = pyfory.Fory(xlang=False, strict=False) +token = fory.loads(fory.dumps(SessionToken("abc"))) +print(token.value) # abc +``` + +Use these hooks for Python-only payloads. For xlang payloads, model the data as dataclasses with +portable field annotations instead. 
+ +## Out-of-Band Buffers + +Python native mode can use pickle protocol 5-style out-of-band buffers for large binary payloads +and data structures backed by external memory: + +```python +import pickle +import pyfory + +data = b"Large binary data" +pickle_buffer = pickle.PickleBuffer(data) + +buffer_objects = [] +fory = pyfory.Fory(xlang=False, ref=True, strict=False) +serialized = fory.dumps(pickle_buffer, buffer_callback=buffer_objects.append) +buffers = [obj.getbuffer() for obj in buffer_objects] +decoded = fory.loads(serialized, buffers=buffers) +assert bytes(decoded.raw()) == data +``` + +Use this when the payload stays in Python and large buffers should avoid extra copies. See +[Out-of-Band Serialization](out-of-band.md). + +## Native And Xlang Comparison + +| Requirement | Use native serialization | Use xlang serialization | +| ------------------------------------------ | ------------------------ | ----------------------- | +| Python-only payloads | Yes | Optional | +| Non-Python readers or writers | No | Yes | +| Functions, lambdas, local classes | Yes | No | +| `__reduce__` / `__getstate__` object hooks | Yes | No | +| Pickle/cloudpickle replacement | Yes | No | +| Portable type mapping across runtimes | No | Yes | + +## Performance Comparison + +```python +import pyfory +import pickle +import timeit + +fory = pyfory.Fory(xlang=False, ref=True, strict=False) + +obj = {f"key{i}": f"value{i}" for i in range(10000)} +print(f"Fory: {timeit.timeit(lambda: fory.dumps(obj), number=1000):.3f}s") +print(f"Pickle: {timeit.timeit(lambda: pickle.dumps(obj), number=1000):.3f}s") +``` + +## Troubleshooting + +### Another language cannot read the payload + +The writer is using native serialization. Rebuild it with `xlang=True`, register portable schemas +on every peer runtime, and avoid Python-only values such as lambdas or local classes. 
+ +### A dynamic class or function fails to deserialize + +Use `strict=False` for trusted payloads and provide a deserialization `policy=` when only selected +dynamic types should be accepted. + +### A cycle does not round-trip + +Create the runtime with `ref=True`. + +### A value depends on pickle hooks + +Keep the payload in native mode. Xlang mode does not execute Python `__reduce__`, +`__reduce_ex__`, `__getstate__`, or `__setstate__` object reconstruction hooks. + +## Related Topics + +- [Xlang Serialization](xlang-serialization.md) - Cross-runtime Python payloads +- [Configuration](configuration.md) - Python runtime options +- [Out-of-Band Serialization](out-of-band.md) - Zero-copy buffer support +- [Configuration](configuration.md#security) - Deserialization policies diff --git a/docs/guide/python/numpy-integration.md b/docs/guide/python/numpy-integration.md index ba7f5e1ede..b91c910ec8 100644 --- a/docs/guide/python/numpy-integration.md +++ b/docs/guide/python/numpy-integration.md @@ -1,6 +1,6 @@ --- title: NumPy & Pandas -sidebar_position: 10 +sidebar_position: 11 id: numpy_integration license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/python/python-native.md b/docs/guide/python/python-native.md deleted file mode 100644 index 51046e695e..0000000000 --- a/docs/guide/python/python-native.md +++ /dev/null @@ -1,216 +0,0 @@ ---- -title: Python Native Mode -sidebar_position: 2 -id: native_mode -license: | - Licensed to the Apache Software Foundation (ASF) under one or more - contributor license agreements. See the NOTICE file distributed with - this work for additional information regarding copyright ownership. - The ASF licenses this file to You under the Apache License, Version 2.0 - (the "License"); you may not use this file except in compliance with - the License. 
You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. ---- - -`pyfory` provides a Python native mode for same-language Python traffic. It covers the pickle and -cloudpickle-style object surface while keeping xlang mode, the default, focused on cross-language -payloads. Native mode is optimized for Python's type system and is the mode to use when replacing -pickle or cloudpickle for Python-only payloads. - -## Overview - -The binary protocol and API are similar to Fory's xlang mode, but Python native mode can serialize any Python object—including global functions, local functions, lambdas, local classes and types with customized serialization using `__getstate__/__reduce__/__reduce_ex__`, which are not allowed in xlang mode. - -To use Python native mode, create `Fory` with `xlang=False`: - -```python -import pyfory -fory = pyfory.Fory(xlang=False, ref=False, strict=True) -``` - -## Drop-in Replacement for Pickle/Cloudpickle - -`pyfory` can serialize Python objects that are outside the xlang type system with the following -configuration. Use this mode when the payload stays in Python and you want Fory as the -pickle/cloudpickle replacement: - -- **For circular references**: Set `ref=True` to enable reference tracking -- **For functions/classes**: Set `strict=False` to allow deserialization of dynamic types - -Security warning: when `strict=False`, Fory can deserialize arbitrary Python types. Use this only -with trusted data, and provide a `policy=` deserialization policy when dynamic types are required. 
- -### Common Usage - -```python -import pyfory - -# Create Fory instance -fory = pyfory.Fory(xlang=False, ref=True, strict=False) - -# serialize common Python objects -data = fory.dumps({"name": "Alice", "age": 30, "scores": [95, 87, 92]}) -print(fory.loads(data)) - -# serialize custom objects -from dataclasses import dataclass - -@dataclass -class Person: - name: str - age: int - -person = Person("Bob", 25) -data = fory.dumps(person) -print(fory.loads(data)) # Person(name='Bob', age=25) -``` - -Native mode can replace pickle and cloudpickle for Python-only object graphs and can serialize -richer values than JSON, including functions, local classes, methods, and objects that implement -`__reduce__`, `__reduce_ex__`, `__getstate__`, or `__setstate__`. These Python-specific features -are not valid xlang payloads. - -## Serialize Global Functions - -Capture and serialize functions defined at module level. Fory deserialize and return same function object: - -```python -import pyfory - -fory = pyfory.Fory(xlang=False, ref=True, strict=False) - -def my_global_function(x): - return 10 * x - -data = fory.dumps(my_global_function) -print(fory.loads(data)(10)) # 100 -``` - -## Serialize Local Functions/Lambdas - -Serialize functions with closures and lambda expressions. 
Fory captures the closure variables automatically: - -```python -import pyfory - -fory = pyfory.Fory(xlang=False, ref=True, strict=False) - -# Local functions with closures -def my_function(): - local_var = 10 - def local_func(x): - return x * local_var - return local_func - -data = fory.dumps(my_function()) -print(fory.loads(data)(10)) # 100 - -# Lambdas -data = fory.dumps(lambda x: 10 * x) -print(fory.loads(data)(10)) # 100 -``` - -## Serialize Global Classes/Methods - -Serialize class objects, instance methods, class methods, and static methods: - -```python -from dataclasses import dataclass -import pyfory -fory = pyfory.Fory(xlang=False, ref=True, strict=False) - -@dataclass -class Person: - name: str - age: int - - def f(self, x): - return self.age * x - - @classmethod - def g(cls, x): - return 10 * x - - @staticmethod - def h(x): - return 10 * x - -# Serialize global class -print(fory.loads(fory.dumps(Person))("Bob", 25)) # Person(name='Bob', age=25) - -# Serialize instance method -print(fory.loads(fory.dumps(Person("Bob", 20).f))(10)) # 200 - -# Serialize class method -print(fory.loads(fory.dumps(Person.g))(10)) # 100 - -# Serialize static method -print(fory.loads(fory.dumps(Person.h))(10)) # 100 -``` - -## Serialize Local Classes/Methods - -Serialize classes defined inside functions along with their methods: - -```python -from dataclasses import dataclass -import pyfory -fory = pyfory.Fory(xlang=False, ref=True, strict=False) - -def create_local_class(): - class LocalClass: - def f(self, x): - return 10 * x - - @classmethod - def g(cls, x): - return 10 * x - - @staticmethod - def h(x): - return 10 * x - return LocalClass - -# Serialize local class -data = fory.dumps(create_local_class()) -print(fory.loads(data)().f(10)) # 100 - -# Serialize local class instance method -data = fory.dumps(create_local_class()().f) -print(fory.loads(data)(10)) # 100 - -# Serialize local class method -data = fory.dumps(create_local_class().g) -print(fory.loads(data)(10)) # 100 
- -# Serialize local class static method -data = fory.dumps(create_local_class().h) -print(fory.loads(data)(10)) # 100 -``` - -## Performance Comparison - -```python -import pyfory -import pickle -import timeit - -fory = pyfory.Fory(xlang=False, ref=True, strict=False) - -obj = {f"key{i}": f"value{i}" for i in range(10000)} -print(f"Fory: {timeit.timeit(lambda: fory.dumps(obj), number=1000):.3f}s") -print(f"Pickle: {timeit.timeit(lambda: pickle.dumps(obj), number=1000):.3f}s") -``` - -## Related Topics - -- [Configuration](configuration.md) - Python native mode configuration -- [Out-of-Band Serialization](out-of-band.md) - Zero-copy buffers -- [Security](security.md) - DeserializationPolicy diff --git a/docs/guide/python/row-format.md b/docs/guide/python/row-format.md index 008551e698..a9f7219e41 100644 --- a/docs/guide/python/row-format.md +++ b/docs/guide/python/row-format.md @@ -1,6 +1,6 @@ --- title: Row Format -sidebar_position: 11 +sidebar_position: 12 id: row_format license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -186,6 +186,6 @@ pip install pyfory[format] ## Related Topics -- [Cross-Language Serialization](cross-language.md) - xlang mode +- [Xlang Serialization](xlang-serialization.md) - xlang mode - [Basic Serialization](basic-serialization.md) - Object serialization - [Row Format Specification](https://fory.apache.org/docs/specification/row_format_spec) - Protocol details diff --git a/docs/guide/python/schema-evolution.md b/docs/guide/python/schema-evolution.md index 199d636fb0..9ad7306787 100644 --- a/docs/guide/python/schema-evolution.md +++ b/docs/guide/python/schema-evolution.md @@ -96,5 +96,5 @@ print(user.email) # "unknown@example.com" ## Related Topics - [Configuration](configuration.md) - Compatible mode settings -- [Cross-Language](cross-language.md) - Schema evolution across languages +- [Xlang Serialization](xlang-serialization.md) - Schema evolution across languages - [Type Registration](type-registration.md) - 
Registration patterns diff --git a/docs/guide/python/schema-metadata.md b/docs/guide/python/schema-metadata.md index 4583a9d7cc..97fd6fde02 100644 --- a/docs/guide/python/schema-metadata.md +++ b/docs/guide/python/schema-metadata.md @@ -521,4 +521,4 @@ class User: - [Basic Serialization](basic-serialization.md) - Getting started with Fory serialization - [Schema Evolution](schema-evolution.md) - Compatible mode and schema evolution -- [Cross-Language](cross-language.md) - Interoperability with Java, Rust, C++, Go +- [Xlang Serialization](xlang-serialization.md) - Interoperability with Java, Rust, C++, Go diff --git a/docs/guide/python/security.md b/docs/guide/python/security.md deleted file mode 100644 index 1bcd441d70..0000000000 --- a/docs/guide/python/security.md +++ /dev/null @@ -1,148 +0,0 @@ ---- -title: Security Best Practices -sidebar_position: 6 -id: security -license: | - Licensed to the Apache Software Foundation (ASF) under one or more - contributor license agreements. See the NOTICE file distributed with - this work for additional information regarding copyright ownership. - The ASF licenses this file to You under the Apache License, Version 2.0 - (the "License"); you may not use this file except in compliance with - the License. You may obtain a copy of the License at - - http://www.apache.org/licenses/LICENSE-2.0 - - Unless required by applicable law or agreed to in writing, software - distributed under the License is distributed on an "AS IS" BASIS, - WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. - See the License for the specific language governing permissions and - limitations under the License. ---- - -This page covers security best practices and DeserializationPolicy. 
- -## Production Configuration - -Never disable `strict=True` in production unless your environment is completely trusted: - -```python -import pyfory - -# Recommended production settings -f = pyfory.Fory( - ref=True, # Handle circular references - strict=True, # IMPORTANT: Prevent malicious data - max_depth=100 # Prevent deep recursion attacks -) - -# Explicitly register allowed types -f.register(UserModel, type_id=100) -f.register(OrderModel, type_id=101) -# Never set strict=False in production with untrusted data! -``` - -## Development vs Production - -Use environment variables to switch between configurations: - -```python -import pyfory -import os - -# Development configuration -if os.getenv('ENV') == 'development': - fory = pyfory.Fory( - xlang=False, - ref=True, - strict=False, # Allow any type for development - max_depth=1000 # Higher limit for development - ) -else: - # Production configuration (security hardened) - fory = pyfory.Fory( - ref=True, - strict=True, # CRITICAL: Require registration - max_depth=100 # Reasonable limit - ) - # Register only known safe types - for idx, model_class in enumerate([UserModel, ProductModel, OrderModel]): - fory.register(model_class, type_id=100 + idx) -``` - -## DeserializationPolicy - -When `strict=False` is necessary (e.g., deserializing functions/lambdas), use `DeserializationPolicy` to implement fine-grained security controls during deserialization. 
- -**Why use DeserializationPolicy?** - -- Block dangerous classes/modules (e.g., `subprocess.Popen`) -- Intercept and validate `__reduce__` callables before invocation -- Sanitize sensitive data during `__setstate__` -- Replace or reject deserialized objects based on custom rules - -### Blocking Dangerous Classes - -```python -import pyfory -from pyfory import DeserializationPolicy - -dangerous_modules = {'subprocess', 'os', '__builtin__'} - -class SafeDeserializationPolicy(DeserializationPolicy): - """Block potentially dangerous classes during deserialization.""" - - def validate_class(self, cls, is_local, **kwargs): - # Block dangerous modules - if cls.__module__ in dangerous_modules: - raise ValueError(f"Blocked dangerous class: {cls.__module__}.{cls.__name__}") - return None - - def intercept_reduce_call(self, callable_obj, args, **kwargs): - # Block specific callable invocations during __reduce__ - if getattr(callable_obj, '__name__', "") == 'Popen': - raise ValueError("Blocked attempt to invoke subprocess.Popen") - return None - - def intercept_setstate(self, obj, state, **kwargs): - # Sanitize sensitive data - if isinstance(state, dict) and 'password' in state: - state['password'] = '***REDACTED***' - return None - -# Create Fory with custom security policy -policy = SafeDeserializationPolicy() -fory = pyfory.Fory(xlang=False, ref=True, strict=False, policy=policy) - -# Now deserialization is protected by your custom policy -data = fory.serialize(my_object) -result = fory.deserialize(data) # Policy hooks will be invoked -``` - -## Available Policy Hooks - -| Hook | Description | -| -------------------------------------------- | ------------------------------------------------- | -| `validate_class(cls, is_local)` | Validate/block class types during deserialization | -| `validate_module(module, is_local)` | Validate/block module imports | -| `validate_function(func, is_local)` | Validate/block function references | -| `intercept_reduce_call(callable_obj, 
args)` | Intercept `__reduce__` invocations | -| `inspect_reduced_object(obj)` | Inspect/replace objects created via `__reduce__` | -| `intercept_setstate(obj, state)` | Sanitize state before `__setstate__` | -| `authorize_instantiation(cls, args, kwargs)` | Control class instantiation | - -**See also:** `pyfory/policy.py` contains detailed documentation and examples for each hook. - -## Best Practices - -1. **Always use `strict=True` in production** -2. **Use `DeserializationPolicy`** when `strict=False` is necessary -3. **Block dangerous modules** (subprocess, os, etc.) -4. **Set appropriate `max_depth`** to prevent stack overflow -5. **Validate data sources** before deserialization -6. **Log security events** for auditing - -## Related Topics - -- [Type Registration](type-registration.md) - Registration and strict mode -- [Configuration](configuration.md) - Fory parameters -- [Python Native Mode](python-native.md) - Functions and lambdas diff --git a/docs/guide/python/troubleshooting.md b/docs/guide/python/troubleshooting.md index 5ec222371a..d81d2fe9de 100644 --- a/docs/guide/python/troubleshooting.md +++ b/docs/guide/python/troubleshooting.md @@ -192,4 +192,4 @@ ruff check --fix . - [Configuration](configuration.md) - Fory parameters - [Type Registration](type-registration.md) - Registration best practices -- [Security](security.md) - Security configuration +- [Configuration](configuration.md#security) - Security configuration diff --git a/docs/guide/python/type-registration.md b/docs/guide/python/type-registration.md index 8565ff608a..be662ca963 100644 --- a/docs/guide/python/type-registration.md +++ b/docs/guide/python/type-registration.md @@ -19,7 +19,7 @@ license: | limitations under the License. --- -This page covers Python type registration APIs. Use [Security](security.md) for +This page covers Python type registration APIs. Use [Configuration](configuration.md#security) for strict-mode policy, max-depth limits, and trusted-data guidance. 
## Type Registration @@ -81,5 +81,5 @@ same registration IDs or names on every peer that shares those payloads. ## Related Topics - [Configuration](configuration.md) - Fory parameters -- [Security](security.md) - Strict mode, deserialization policies, and size limits +- [Configuration](configuration.md#security) - Strict mode, deserialization policies, and size limits - [Custom Serializers](custom-serializers.md) - Custom serialization diff --git a/docs/guide/python/cross-language.md b/docs/guide/python/xlang-serialization.md similarity index 92% rename from docs/guide/python/cross-language.md rename to docs/guide/python/xlang-serialization.md index b1ed319ab7..6bf564951e 100644 --- a/docs/guide/python/cross-language.md +++ b/docs/guide/python/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization -sidebar_position: 4 -id: cross_language +title: Xlang Serialization +sidebar_position: 2 +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -19,7 +19,7 @@ license: | limitations under the License. --- -`pyfory` supports cross-language object graph serialization, allowing you to serialize data in Python and deserialize it in Java, Go, Rust, or other supported languages. +`pyfory` supports xlang object graph serialization, allowing you to serialize data in Python and deserialize it in Java, Go, Rust, or other supported languages. 
## Create an Xlang Runtime @@ -30,7 +30,7 @@ import pyfory fory = pyfory.Fory(xlang=True, ref=False, strict=True) ``` -## Cross-Language Example +## Xlang Example ### Python (Serializer) @@ -40,7 +40,7 @@ from dataclasses import dataclass f = pyfory.Fory(xlang=True, ref=True) -# Register type for cross-language compatibility +# Register type for xlang compatibility @dataclass class Person: name: str @@ -90,9 +90,9 @@ fory.register_by_name::<Person>("example", "Person"); let person: Person = fory.deserialize(&binary_data)?; ``` -## Type Annotations for Cross-Language +## Type Annotations for Xlang -Use pyfory type annotations for explicit cross-language type mapping: +Use pyfory type annotations for explicit xlang type mapping: ```python from dataclasses import dataclass @@ -196,10 +196,10 @@ The binary protocol and API are similar to `pyfory`'s Python native mode, but Py ## See Also -- [Cross-Language Serialization Specification](../../specification/xlang_serialization_spec.md) +- [Xlang Serialization Specification](../../specification/xlang_serialization_spec.md) - [Type Mapping Reference](../../specification/xlang_type_mapping.md) -- [Java Cross-Language Guide](../java/cross-language.md) -- [Rust Cross-Language Guide](../rust/cross-language.md) +- [Java Xlang Serialization Guide](../java/xlang-serialization.md) +- [Rust Xlang Serialization Guide](../rust/xlang-serialization.md) ## Related Topics diff --git a/docs/guide/rust/configuration.md b/docs/guide/rust/configuration.md index 93cf1d38a8..83fc6e538c 100644 --- a/docs/guide/rust/configuration.md +++ b/docs/guide/rust/configuration.md @@ -1,6 +1,6 @@ --- title: Configuration -sidebar_position: 2 +sidebar_position: 4 id: configuration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -128,8 +128,17 @@ let fory = Fory::builder() | `xlang(bool)` | Use xlang mode | `true` | | `max_dyn_depth(u32)` | Maximum nesting depth for dynamic types | `5` | +## Security + +Security-related configuration:
+ +- Register application structs and trait-object implementations before deserializing untrusted + payloads. +- Use `max_dyn_depth(...)` to reject unexpectedly deep dynamic object graphs. +- Prefer concrete typed fields over `dyn Any` or broad trait-object fields for untrusted input. + ## Related Topics - [Basic Serialization](basic-serialization.md) - Using configured Fory - [Schema Evolution](schema-evolution.md) - Compatible mode details -- [Cross-Language](cross-language.md) - xlang mode +- [Xlang Serialization](xlang-serialization.md) - xlang mode diff --git a/docs/guide/rust/custom-serializers.md b/docs/guide/rust/custom-serializers.md index 8320a0dd88..c3c5e8f90e 100644 --- a/docs/guide/rust/custom-serializers.md +++ b/docs/guide/rust/custom-serializers.md @@ -1,6 +1,6 @@ --- title: Custom Serializers -sidebar_position: 9 +sidebar_position: 10 id: custom_serializers license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/rust/index.md b/docs/guide/rust/index.md index a4377461d4..dccd998644 100644 --- a/docs/guide/rust/index.md +++ b/docs/guide/rust/index.md @@ -26,7 +26,7 @@ The Rust implementation provides versatile and high-performance serialization wi ## Why Apache Fory™ Rust? - **Fast binary encoding**: Zero-copy deserialization and optimized binary protocols -- **Cross-language**: Seamlessly serialize/deserialize data across Java, Python, C++, Go, JavaScript, and Rust +- **Xlang**: Seamlessly serialize/deserialize data across Java, Python, C++, Go, JavaScript, and Rust - **Type-safe**: Compile-time type checking with derive macros - **Circular references**: Automatic tracking of shared and circular references with `Rc`/`Arc` and weak pointers - **Polymorphic**: Serialize trait objects with `Box`, `Rc`, and `Arc` @@ -96,7 +96,7 @@ Use xlang mode for cross-language payloads and schemas shared with other Fory ru Use native mode for Rust-only traffic. 
Native mode is selected with `.xlang(false)`, uses schema-consistent payloads unless compatible mode is enabled, and keeps Rust object serialization on the Rust runtime path. It is optimized for Rust's type system and covers Rust-specific object features such as trait objects and shared-reference patterns that are not portable xlang payloads. -See [Cross-Language Serialization](cross-language.md) for Rust xlang registration and interoperability rules, and [Configuration](configuration.md) for native-mode builder options. +See [Xlang Serialization](xlang-serialization.md) for Rust xlang registration and interoperability rules, and [Native Serialization](native-serialization.md) for Rust-only payloads. ## Thread Safety @@ -186,7 +186,9 @@ fory-derive/ # Procedural macros - [Configuration](configuration.md) - Fory builder options and modes - [Basic Serialization](basic-serialization.md) - Object graph serialization +- [Xlang Serialization](xlang-serialization.md) - xlang mode +- [Native Serialization](native-serialization.md) - Rust-only serialization - [References](references.md) - Shared and circular references - [Polymorphism](polymorphism.md) - Trait object serialization -- [Cross-Language](cross-language.md) - xlang mode +- [Custom Serializers](custom-serializers.md) - Extend serialization behavior - [Row Format](row-format.md) - Zero-copy row-based format diff --git a/docs/guide/rust/native-serialization.md b/docs/guide/rust/native-serialization.md new file mode 100644 index 0000000000..387e8deae7 --- /dev/null +++ b/docs/guide/rust/native-serialization.md @@ -0,0 +1,225 @@ +--- +title: Native Serialization +sidebar_position: 3 +id: native_serialization +license: | + Licensed to the Apache Software Foundation (ASF) under one or more + contributor license agreements. See the NOTICE file distributed with + this work for additional information regarding copyright ownership. 
+ The ASF licenses this file to You under the Apache License, Version 2.0 + (the "License"); you may not use this file except in compliance with + the License. You may obtain a copy of the License at + + http://www.apache.org/licenses/LICENSE-2.0 + + Unless required by applicable law or agreed to in writing, software + distributed under the License is distributed on an "AS IS" BASIS, + WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + See the License for the specific language governing permissions and + limitations under the License. +--- + +Rust native serialization is the Rust-only wire mode selected with `.xlang(false)`. Use it when +every writer and reader is Rust and the payload should preserve Rust object-graph behavior instead +of the portable xlang type system. + +Use [Xlang Serialization](xlang-serialization.md), the default Rust mode, when bytes must be read +by Java, Python, C++, Go, JavaScript, or another non-Rust Fory runtime. + +## When To Use Native Serialization + +Use native serialization when: + +- A payload is produced and consumed only by Rust applications. +- The data model uses Rust-specific object graph features such as `Rc`, `Arc`, weak + pointers, `RefCell`, `Mutex`, trait objects, or `dyn Any`. +- You want schema-consistent Rust payloads for lockstep services. +- You need compatible schema evolution for Rust-only rolling deployments. +- You want compile-time serializers from `#[derive(ForyStruct)]` without portable xlang mapping + constraints. 
+ +## Create a Native Runtime + +```rust +use fory::{Error, Fory, ForyStruct}; + +#[derive(ForyStruct, Debug, PartialEq)] +struct Order { + id: i64, + amount: f64, +} + +fn main() -> Result<(), Error> { + let mut fory = Fory::builder().xlang(false).build(); + fory.register::<Order>(100)?; + + let order = Order { id: 1, amount: 42.5 }; + let bytes = fory.serialize(&order)?; + let decoded: Order = fory.deserialize(&bytes)?; + assert_eq!(order, decoded); + Ok(()) +} +``` + +Perform registrations before sharing a `Fory` instance across threads. Once configured, `Fory` can +be shared through `Arc`. + +## Schema Evolution + +Native serialization defaults to schema-consistent mode. Enable compatible mode only when Rust-only +writer and reader versions can differ: + +```rust +let mut writer = Fory::builder().xlang(false).compatible(true).build(); +let mut reader = Fory::builder().xlang(false).compatible(true).build(); +``` + +Compatible mode uses schema metadata to tolerate added, removed, or reordered fields when field +identity remains compatible. See [Schema Evolution](schema-evolution.md). + +## Registration + +Register application structs and enum-like types before serialization: + +```rust +fory.register::<Order>(100)?; +fory.register_by_name::<Order>("example", "Order")?; +``` + +Use explicit numeric IDs for compact payloads and stable deployments. Use namespace/type-name +registration when independent teams coordinate type identity by names. + +## Rust Object Surface + +Native serialization owns the Rust-specific object surface: + +- Structs and tuple structs with `#[derive(ForyStruct)]`. +- Enums and union-like models supported by Fory derive macros. +- `Vec`, maps, sets, tuples, arrays, and optional values. +- `Box`, `Rc`, `Arc`, `RcWeak`, and `ArcWeak`. +- `RefCell` and `Mutex`. +- Trait objects such as `Box<dyn Trait>`, `Rc<dyn Trait>`, and `Arc<dyn Trait>`. +- Runtime type dispatch with `Rc<dyn Any>` and `Arc<dyn Any>`. +- Date and time carriers, including optional `chrono` support.
+ +Use [Basic Serialization](basic-serialization.md), [References](references.md), and +[Trait Object Serialization](polymorphism.md) for focused examples. + +## Shared And Circular References + +Native mode can preserve shared references with `Rc` and `Arc`: + +```rust +use fory::{Error, Fory}; +use std::rc::Rc; + +fn main() -> Result<(), Error> { + let fory = Fory::builder().xlang(false).build(); + let shared = Rc::new(String::from("shared")); + let values = vec![shared.clone(), shared.clone()]; + + let bytes = fory.serialize(&values)?; + let decoded: Vec<Rc<String>> = fory.deserialize(&bytes)?; + assert!(Rc::ptr_eq(&decoded[0], &decoded[1])); + Ok(()) +} +``` + +Use `.track_ref(true)` when weak pointers or explicit cyclic graphs need reference tracking: + +```rust +let mut fory = Fory::builder().xlang(false).track_ref(true).build(); +``` + +Weak pointers serialize as references to their target when the target is still alive, and as null +when the target has been dropped. + +## Trait Objects + +Trait objects are Rust runtime features and belong in native serialization: + +```rust +use fory::{register_trait_type, Error, Fory, ForyStruct, Serializer}; + +trait Animal: Serializer { + fn name(&self) -> &str; +} + +#[derive(ForyStruct)] +struct Dog { + name: String, +} + +impl Animal for Dog { + fn name(&self) -> &str { + &self.name + } +} + +register_trait_type!(Animal, Dog); + +fn main() -> Result<(), Error> { + let mut fory = Fory::builder().xlang(false).compatible(true).build(); + fory.register::<Dog>(100)?; + + let value: Box<dyn Animal> = Box::new(Dog { name: "Milo".into() }); + let bytes = fory.serialize(&value)?; + let decoded: Box<dyn Animal> = fory.deserialize(&bytes)?; + assert_eq!(decoded.name(), "Milo"); + Ok(()) +} +``` + +Register every concrete implementation that can appear behind the trait object. + +## Performance Guidelines + +- Reuse a configured `Fory` instance and register types before concurrent use. +- Keep native schema-consistent mode for lockstep Rust services.
+- Enable `.compatible(true)` only when Rust-only schema evolution is required. +- Use derive-generated serializers for application structs. +- Use `.track_ref(true)` only for weak-pointer or cyclic graph scenarios that require it. +- Prefer concrete typed fields over `dyn Any` or trait objects on hot paths. + +## Native And Xlang Comparison + +| Requirement | Use native serialization | Use xlang serialization | +| ---------------------------------------- | ------------------------ | ----------------------- | +| Rust-only payloads | Yes | Optional | +| Non-Rust readers or writers | No | Yes | +| `Rc`, `Arc`, weak pointers | Yes | No | +| Trait objects and `dyn Any` | Yes | No | +| Schema-consistent same-language payloads | Yes | No | +| Compatible schema evolution by default | No | Yes | +| Portable type mapping across runtimes | No | Yes | + +## Troubleshooting + +### A non-Rust runtime cannot read the payload + +The writer is using native serialization. Rebuild it with `.xlang(true)` and align type +registration with every peer runtime. + +### A weak pointer fails to resolve + +Use `.track_ref(true)` and make sure the target object is still alive when serialized. Dropped weak +targets deserialize as null. + +### A trait object cannot deserialize + +Register the trait mapping and every concrete implementation that can appear behind the trait +object. + +### A rolling deployment fails after a field change + +Native serialization defaults to schema-consistent mode. Use `.compatible(true)` on both writer and +reader when schemas can differ. 
+ +## Related Topics + +- [Xlang Serialization](xlang-serialization.md) - Cross-runtime Rust payloads +- [Configuration](configuration.md) - Builder options +- [Basic Serialization](basic-serialization.md) - Object graph serialization +- [Shared & Circular References](references.md) - `Rc`, `Arc`, and weak pointers +- [Trait Object Serialization](polymorphism.md) - Trait objects and dynamic dispatch +- [Schema Evolution](schema-evolution.md) - Compatible mode diff --git a/docs/guide/rust/polymorphism.md b/docs/guide/rust/polymorphism.md index 2c1e7f8c61..1f21d9eacd 100644 --- a/docs/guide/rust/polymorphism.md +++ b/docs/guide/rust/polymorphism.md @@ -1,6 +1,6 @@ --- title: Trait Object Serialization -sidebar_position: 7 +sidebar_position: 8 id: polymorphism license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/rust/references.md b/docs/guide/rust/references.md index 98ed922b7a..a0e1a20acc 100644 --- a/docs/guide/rust/references.md +++ b/docs/guide/rust/references.md @@ -1,6 +1,6 @@ --- title: Shared & Circular References -sidebar_position: 6 +sidebar_position: 7 id: references license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/rust/row-format.md b/docs/guide/rust/row-format.md index a72202f574..9c5c545525 100644 --- a/docs/guide/rust/row-format.md +++ b/docs/guide/rust/row-format.md @@ -1,6 +1,6 @@ --- title: Row Format -sidebar_position: 10 +sidebar_position: 11 id: row_format license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -122,5 +122,5 @@ assert_eq!(prefs.values().get(0).unwrap(), "en"); ## Related Topics - [Basic Serialization](basic-serialization.md) - Object graph serialization -- [Cross-Language](cross-language.md) - Row format across languages +- [Xlang Serialization](xlang-serialization.md) - Row format across languages - [Row Format Specification](https://fory.apache.org/docs/specification/row_format_spec) - Protocol details 
diff --git a/docs/guide/rust/schema-evolution.md b/docs/guide/rust/schema-evolution.md index db3b40d1d2..154a6e2b32 100644 --- a/docs/guide/rust/schema-evolution.md +++ b/docs/guide/rust/schema-evolution.md @@ -1,6 +1,6 @@ --- title: Schema Evolution -sidebar_position: 8 +sidebar_position: 9 id: schema_evolution license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -215,4 +215,4 @@ assert_eq!(data, decoded); - [Configuration](configuration.md) - Enabling compatible mode - [Polymorphism](polymorphism.md) - Trait objects with schema evolution -- [Cross-Language](cross-language.md) - Schema evolution across languages +- [Xlang Serialization](xlang-serialization.md) - Schema evolution across languages diff --git a/docs/guide/rust/schema-metadata.md b/docs/guide/rust/schema-metadata.md index d87192c343..4aa3a892cf 100644 --- a/docs/guide/rust/schema-metadata.md +++ b/docs/guide/rust/schema-metadata.md @@ -1,6 +1,6 @@ --- title: Schema Metadata -sidebar_position: 5 +sidebar_position: 6 id: schema_metadata license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -456,4 +456,4 @@ struct User { - [Basic Serialization](basic-serialization.md) - Getting started with Fory serialization - [Schema Evolution](schema-evolution.md) - Compatible mode and schema evolution -- [Cross-Language](cross-language.md) - Interoperability with Java, C++, Go, Python +- [Xlang Serialization](xlang-serialization.md) - Interoperability with Java, C++, Go, Python diff --git a/docs/guide/rust/troubleshooting.md b/docs/guide/rust/troubleshooting.md index 97f3350532..5eab309025 100644 --- a/docs/guide/rust/troubleshooting.md +++ b/docs/guide/rust/troubleshooting.md @@ -1,6 +1,6 @@ --- title: Troubleshooting -sidebar_position: 11 +sidebar_position: 12 id: troubleshooting license: | Licensed to the Apache Software Foundation (ASF) under one or more diff --git a/docs/guide/rust/type-registration.md b/docs/guide/rust/type-registration.md index 
85a55af8e3..301cd1f756 100644 --- a/docs/guide/rust/type-registration.md +++ b/docs/guide/rust/type-registration.md @@ -1,6 +1,6 @@ --- title: Type Registration -sidebar_position: 4 +sidebar_position: 5 id: type_registration license: | Licensed to the Apache Software Foundation (ASF) under one or more @@ -120,5 +120,5 @@ let handles: Vec<_> = (0..4) ## Related Topics - [Configuration](configuration.md) - Fory builder options -- [Cross-Language](cross-language.md) - xlang mode registration +- [Xlang Serialization](xlang-serialization.md) - xlang mode registration - [Custom Serializers](custom-serializers.md) - Custom serialization diff --git a/docs/guide/rust/cross-language.md b/docs/guide/rust/xlang-serialization.md similarity index 93% rename from docs/guide/rust/cross-language.md rename to docs/guide/rust/xlang-serialization.md index dab15eeb63..299e158834 100644 --- a/docs/guide/rust/cross-language.md +++ b/docs/guide/rust/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization -sidebar_position: 3 -id: cross_language +title: Xlang Serialization +sidebar_position: 2 +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -38,7 +38,7 @@ fory.register::<MyStruct>(100)?; fory.register_by_name::<MyStruct>("com.example", "MyStruct")?; ``` -## Type Registration for Cross-Language +## Type Registration for Xlang ### Register by ID @@ -58,7 +58,7 @@ For more flexible type naming: fory.register_by_name::<User>("com.example", "User")?; ``` -## Cross-Language Example +## Xlang Example ### Rust (Serializer) @@ -177,10 +177,10 @@ explicit array field attribute when the schema is dense `array`.
## See Also -- [Cross-Language Serialization Specification](../../specification/xlang_serialization_spec.md) +- [Xlang Serialization Specification](../../specification/xlang_serialization_spec.md) - [Type Mapping Reference](../../specification/xlang_type_mapping.md) -- [Java Cross-Language Guide](../java/cross-language.md) -- [Python Cross-Language Guide](../python/cross-language.md) +- [Java Xlang Serialization Guide](../java/xlang-serialization.md) +- [Python Xlang Serialization Guide](../python/xlang-serialization.md) ## Related Topics diff --git a/docs/guide/scala/configuration.md b/docs/guide/scala/configuration.md index fe0490fb55..0932b6a0db 100644 --- a/docs/guide/scala/configuration.md +++ b/docs/guide/scala/configuration.md @@ -158,3 +158,22 @@ val fory = ForyScala.builder() In xlang mode, Scala collections use canonical `list`, `set`, and `map` payloads instead of Scala factory payloads. Generated optional fields use `Option[T]`. + +## Security + +Scala uses the Java runtime configuration surface. Keep class registration enabled for production +and any untrusted payload source: + +```scala +val fory = ForyScala.builder() + .requireClassRegistration(true) + .withMaxDepth(50) + .build() +``` + +Security-related configuration: + +- Keep `requireClassRegistration(true)` and register application classes or generated modules. +- Use `withMaxDepth(...)` to reject unexpectedly deep object graphs. +- Follow [Java Configuration](../java/configuration.md#security) for allow-listing and unknown-class + controls. diff --git a/docs/guide/scala/index.md b/docs/guide/scala/index.md index 775f705551..0a77b168f6 100644 --- a/docs/guide/scala/index.md +++ b/docs/guide/scala/index.md @@ -83,7 +83,7 @@ Use xlang mode for cross-language payloads and schemas shared with other Fory ru Use native mode for Scala/JVM-only traffic. 
Native mode is selected with `.withXlang(false)`, uses schema-consistent payloads unless compatible mode is enabled, and inherits the JVM native-mode object serialization path from Fory Java while adding Scala-specific serializers for case classes, collections, tuples, options, and enumerations. It is optimized for JVM and Scala type systems and is the right path for same-language Scala/JVM framework replacement payloads. -See [Configuration](configuration.md) for Scala builder setup and [Java Native Mode](../java/native-mode.md) for the full JVM native-mode behavior. +See [Configuration](configuration.md) for Scala builder setup and [Java Native Serialization](../java/native-serialization.md) for the full JVM native-mode behavior. ## Built on Fory Java diff --git a/docs/guide/swift/configuration.md b/docs/guide/swift/configuration.md index 4875e3e797..d312ef78b3 100644 --- a/docs/guide/swift/configuration.md +++ b/docs/guide/swift/configuration.md @@ -111,3 +111,12 @@ let fory = Fory() ```swift let fory = Fory(ref: true) ``` + +## Security + +Security-related configuration: + +- Register only the expected generated models before deserializing untrusted payloads. +- Use `checkClassVersion` with `compatible: false` when exact schema matching is required. +- Set `maxCollectionSize`, `maxBinarySize`, and `maxDepth` for the largest payload shape your + service accepts. 
diff --git a/docs/guide/swift/index.md b/docs/guide/swift/index.md index e96e785fff..92dcb709e4 100644 --- a/docs/guide/swift/index.md +++ b/docs/guide/swift/index.md @@ -25,7 +25,7 @@ Apache Fory Swift provides high-performance object graph serialization with stro - Fast binary serialization for Swift value and reference types - `@ForyStruct`, `@ForyEnum`, and `@ForyUnion` macros for zero-boilerplate model serialization -- Cross-language protocol compatibility (`xlang`) with Java, Rust, Go, Python, and more +- Xlang protocol compatibility with Java, Rust, Go, Python, and more - Compatible mode for schema evolution across versions - Built-in support for dynamic values (`Any`, `AnyObject`, `any Serializer`, `AnyHashable`) - Reference tracking for shared/circular graphs, including weak references on classes @@ -52,7 +52,7 @@ targets: [ - [Configuration](configuration.md) - [Basic Serialization](basic-serialization.md) -- [Cross-Language Serialization](cross-language.md) +- [Xlang Serialization](xlang-serialization.md) - [Schema Metadata](schema-metadata.md) - [Type Registration](type-registration.md) - [Custom Serializers](custom-serializers.md) diff --git a/docs/guide/swift/cross-language.md b/docs/guide/swift/xlang-serialization.md similarity index 94% rename from docs/guide/swift/cross-language.md rename to docs/guide/swift/xlang-serialization.md index ff2bb6133f..c189baef40 100644 --- a/docs/guide/swift/cross-language.md +++ b/docs/guide/swift/xlang-serialization.md @@ -1,7 +1,7 @@ --- -title: Cross-Language Serialization +title: Xlang Serialization sidebar_position: 3 -id: cross_language +id: xlang_serialization license: | Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with @@ -21,7 +21,7 @@ license: | Fory Swift can exchange payloads with other Fory runtimes using the xlang protocol. 
-## Recommended Cross-language Configuration +## Recommended Xlang Configuration ```swift let fory = Fory() @@ -48,7 +48,7 @@ fory.register(Order.self, id: 100) try fory.register(Order.self, namespace: "com.example", name: "Order") ``` -## Cross-language Rules +## Xlang Rules - Keep type registration mapping consistent across languages - Keep compatible mode enabled when independently evolving schemas. Swift enables it by default. @@ -92,7 +92,7 @@ Generated Swift code includes: - `ForyRegistration.register(_:)` helpers with transitive import registration - `toBytes` / `fromBytes` helpers on generated types -Use generated registration before cross-language serialization: +Use generated registration before xlang serialization: ```swift let fory = Fory(ref: true) @@ -111,7 +111,7 @@ cd integration_tests/idl_tests This runs Swift roundtrip matrix tests and Java peer roundtrip checks (`IDL_PEER_LANG=swift`). -## Debugging Cross-language Tests +## Debugging Xlang Tests Enable debug output when running xlang tests: diff --git a/docs/guide/xlang/index.md b/docs/guide/xlang/index.md index 9b4e8866af..af7f22b181 100644 --- a/docs/guide/xlang/index.md +++ b/docs/guide/xlang/index.md @@ -146,7 +146,7 @@ This generates native language types with consistent field/type mappings across | Topic | Description | | --------------------------------------------------------- | ------------------------------------------------ | | [Getting Started](getting-started.md) | Installation and basic setup for all languages | -| [Type Mapping](../../specification/xlang_type_mapping.md) | Cross-language type mapping reference | +| [Type Mapping](../../specification/xlang_type_mapping.md) | Xlang type mapping reference | | [Serialization](serialization.md) | Built-in types, custom types, reference handling | | [Zero-Copy](zero-copy.md) | Out-of-band serialization for large data | | [Row Format](row_format.md) | Cache-friendly binary format with random access | @@ -156,10 +156,10 @@ This generates 
native language types with consistent field/type mappings across For language-specific details and API reference: -- [Java Cross-Language Guide](../java/cross-language.md) -- [Python Cross-Language Guide](../python/cross-language.md) -- [C++ Cross-Language Guide](../cpp/cross-language.md) -- [Rust Cross-Language Guide](../rust/cross-language.md) +- [Java Xlang Serialization Guide](../java/xlang-serialization.md) +- [Python Xlang Serialization Guide](../python/xlang-serialization.md) +- [C++ Xlang Serialization Guide](../cpp/xlang-serialization.md) +- [Rust Xlang Serialization Guide](../rust/xlang-serialization.md) ## Specifications