diff --git a/attribute-value-table-design.md b/attribute-value-table-design.md new file mode 100644 index 00000000000..f8a43343104 --- /dev/null +++ b/attribute-value-table-design.md @@ -0,0 +1,251 @@ +# Dense known-tag storage (a.k.a. AttributeValueTable) — design + +Branch: `dougqh/attribute-value-table` (off `dougqh/tagmap-tagid-experiment`) + +## Goal + +Eliminate the **per-tag `TagMap$Entry` allocation** — the #1 remaining tracer allocator +(~1.1% of process allocation in the PetClinic JFR, even after the tag-id work). The tag-id +fast-path made tag *placement* fast (positional slot vs hash bucket) but still allocates one +`Entry` wrapper per tag set, and keeps it alive until serialize. + +**Idea:** for known tags, store the *values* in a dense `(id, value)` pair array — no +`Entry` object per tag. A span's known tags never materialize an `Entry`; the serializer reads +`(name, type, value)` straight from the arrays. + +## Phasing + +- **Phase 1 (this design): replace `OptimizedTagMap`'s `Entry[] knownEntries` in place** with dense + `long[] ids` + `Object[] values`. No new type, no interface, no codegen. This is purely an internal + storage change to one class, and it *removes* machinery as much as it adds (see below). It's the + measurable step that kills the per-tag `Entry` for known tags. +- **Phase 2 (later, if warranted): extract an `AttributeValueTable` interface + a codegen POJO** per + hot span type (real typed fields, no bounds checks, type-reject for free). Extracting the interface + from a *working* dense impl is an easy refactor — and we'll know its true shape from having built + it, rather than guessing now. The `set(long)→boolean` / `get(long)→EntryReader` contract below is + where that interface is headed; in phase 1 it's just how `OptimizedTagMap` works internally. + +Everything below describes **phase 1** unless marked otherwise. 
+ +## What phase 1 removes + +Replacing the positional `Entry[] knownEntries` with a dense scan-by-id store deletes the collision +machinery the positional slot model needed: first-writer-wins occupancy, the `collidedSlots` bitmask, +and bucket-eviction-on-reclaim. Dense `(id, value)` pairs have no positional collisions — you match by +id. `fieldPos`/`slotCount` stop mattering for storage (identity, name, and hash all come from the id); +they stay in the tagId for the eventual POJO but the dense store ignores them. + +## Storage — dense parallel arrays + +A **dense association list of only the tags actually present** — not arrays sized to the slot count: + +``` +long[] ids // the tag id of each present known tag, in insertion order +Object[] values // its value (boxed if primitive) +int size // number of used entries (arrays grow as needed) +``` + +- **`set(id, v)`**: scan `ids[0..size)` for a match (overwrite) else append. No `Entry`. Returns + `true` (stored) — unknown ids / type mismatches return `false` and the caller buckets them. +- **`get(id)`**: scan `ids` for the match → flyweight `EntryReader` over `(nameOf(id), values[i])`. +- **iterate/serialize**: dense walk of `ids[0..size)`; name = `nameOf(ids[i])`, value = `values[i]`. +- **Unknown tags** (`globalSerial == 0`) and **type mismatches** fall back to the hash buckets (still + `Entry`) — the minority. + +Why dense rather than positional-by-`fieldPos`: +- **Mixins need no special machinery** — a product tag is just another `(id, value)` pair; the list + holds only what's set, so disabled/unused products cost nothing. No segments, presence bitmask, or + `fieldPos` partition. +- **The id is stored**, so iteration names a tag directly (`nameOf`) — no `fieldPos → id` reverse map. +- **Maps onto `EntryReadingHelper`** (already in `LegacyTagMap`): a reusable `EntryReader` holding + `(tag, Object value)` with coercion via `TagValueConversions` and `EntryReader.entry()` to + materialize. 
The flyweight per index is `(nameOf(ids[i]), values[i])`. Almost nothing new. +- **`fieldPos` stops mattering** for the generic store — identity + name come from the id; positional + field layout is the POJO specialization's concern only. + +Trade-offs (accepted): **O(n) scan** instead of O(1)-by-position — fine in the small-map regime +(spans carry ~5–15 tags; a packed `long[]` scan is cache-friendly, and the common path is set-once + +one dense serialize pass). **Boxing** of the few primitive tags (status_code, port) — most tags are +strings (no box), and a boxed `Integer` is smaller than the `Entry` it replaces, so still a net win. + +A parallel `long[] prims` to avoid that boxing was **considered and rejected**: it adds a whole extra +per-span array *and* per-entry type tracking (which array holds the value), which costs more than the +handful of small boxes it would save. Single `Object[] values`, box the few fresh primitives. + +Prebuilt/shared `Entry`s holding a primitive are **not** a loss either: `Entry` caches its boxed value, +so a write sourced from a prebuilt `EntryReader` stores that *shared* box (`entry.objectValue()`) — +zero per-span allocation, same as today. So the only boxing is a fresh, per-span-varying primitive set +via the typed `set(long, int/...)` overloads — negligible. No real regression. + +### Type discipline + +The resolver declares each tag's type (`typeOf`). `set` accepts a value only if it matches; otherwise +it returns `false` and the caller buckets it as a normal `Entry`. Off-type writes degrade gracefully +instead of corrupting the slot. Type *coercion on read* (e.g. int → string for serialization) is +`EntryReader`'s job (via `TagValueConversions`), not a widening of the stored value. + +Memory: trades *N* per-tag `Entry` objects for two arrays (`ids` + `values`) sized to the tags +present, plus a box per primitive tag. 
Net win when a span carries more than ~2–3 known tags
+(PetClinic spans carry 5–10), and especially on the serialize path (zero transient `Entry`).
+
+## Write path
+
+```
+table.set(long id, value):                                   // returns true iff stored
+  if globalSerial(id) == 0 || !typeMatches(id, value): return false   // unknown / wrong type
+  for i in 0..size:                                          // small-n linear scan
+    if ids[i] == id: values[i] = value; return true          // overwrite
+  ids[size] = id; values[size] = value; size++; return true  // append (grow arrays if full), no Entry
+
+// caller (OptimizedTagMap):
+if (!table.set(id, value)) setInBuckets(id, value)           // Entry (unknown / off-type)
+// string set: id = keyOf(name); known -> table.set; else -> buckets
+```
+
+Interception (the 3-case routing in `DDSpanContext.setTag`) is unchanged and sits *above* this:
+the table is just the storage the non-intercepted / post-interceptor write lands in.
+
+## Read path
+
+All reads go through **`get(long) → EntryReader`** (a repositioned flyweight, the `EntryReadingHelper`
+pattern). `EntryReader`'s own accessors + `TagValueConversions` provide value reads and type coercion
+(e.g. int → string for serialization) in one place, so there are no separate typed getters on the
+table, and slot-stored vs bucket-stored values coerce identically. Materialize a retainable `Entry`
+only when a caller needs to hold one, via the existing `EntryReader.entry()`.
+
+## The payoff: no-`Entry` serialize
+
+`TagMap` is already `Iterable<EntryReader>` and the msgpack `TraceMapper` already consumes
+`EntryReader`, so the table reuses that contract with **no serializer change and no bespoke visitor**.
+`iterator()` does a dense walk of `ids[0..size)` and yields the repositioned flyweight `EntryReader`
+(name = `nameOf(ids[i])`, value = `values[i]`). `OptimizedTagMap` chains the table's readers then its
+bucket `Entry`s (also `EntryReader`s). Result: a span's known tags serialize with **zero `Entry`
+allocation**; only unknown/bucket tags retain `Entry`s.
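The write path and dense walk described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `OptimizedTagMap` change: the class name `DenseTagStore`, the initial capacity of 8, the doubling growth policy, and treating id `0` as "unknown" are hypothetical stand-ins. The design itself consults the resolver (`globalSerial`, `typeOf`) and still lists capacity and growth as an open question.

```java
import java.util.Arrays;

/** Hypothetical sketch of a dense (id, value) store; not the real OptimizedTagMap code. */
final class DenseTagStore {
  // Assumed initial capacity; the design leaves sizing and growth open.
  private long[] ids = new long[8];
  private Object[] values = new Object[8];
  private int size;

  /**
   * Overwrites the value of a matching id or appends a new pair.
   * Returns false for id 0, which stands in here for the unknown-tag / type checks.
   */
  boolean set(long id, Object value) {
    if (id == 0L) {
      return false; // caller would fall back to the hash buckets
    }
    for (int i = 0; i < size; i++) { // small-n linear scan
      if (ids[i] == id) {
        values[i] = value; // overwrite in place, insertion order preserved
        return true;
      }
    }
    if (size == ids.length) { // assumed policy: double both arrays together
      ids = Arrays.copyOf(ids, size * 2);
      values = Arrays.copyOf(values, size * 2);
    }
    ids[size] = id;
    values[size] = value;
    size++;
    return true;
  }

  /** Returns the stored value, or null when the id is absent. */
  Object get(long id) {
    for (int i = 0; i < size; i++) {
      if (ids[i] == id) {
        return values[i];
      }
    }
    return null;
  }

  int size() {
    return size;
  }

  // Positional accessors for a dense serialize-style walk over [0, size).
  long idAt(int index) {
    return ids[index];
  }

  Object valueAt(int index) {
    return values[index];
  }
}
```

A serialize pass over such a store would loop `i` from `0` to `size() - 1` and read `idAt(i)` / `valueAt(i)` directly, which is the per-tag allocation-free walk the design is after. Note that an `int` passed as `Object` is boxed at the call site, matching the boxing trade-off the design accepts.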
+ +## How product mixins interact + +The dense representation makes this nearly a non-question: **a product tag is just another `(id, value)` +pair**. The list holds only the tags actually set, so a span that doesn't trigger profiling/dsm/appsec +simply has none of their pairs — zero cost, decided per span, with no segments, no layout composition, +and no need for the span type at creation. `applies` stays a *codegen* concern (which span types may +emit which product tags / whether a product tag earns a stable id at all); it no longer shapes the +runtime storage. (The earlier positional-segment scheme — `fieldPos = [segment][offset]`, lazily +allocated per mixin — is moot under dense arrays and was dropped.) + +## API + +`AttributeValueTable` is the **slotted-only** store; `OptimizedTagMap` owns the hash buckets and +the composition. The key shape: **`set` returns whether it stored the value** — a `false` tells the +caller to place it in the buckets. The table knows nothing about buckets; routing is explicit and +the "did it slot?" check happens once, inside `set`. + +The table consults the registered `KnownTags.Resolver` directly (like `OptimizedTagMap` already uses +`KnownTags.slotCount()`) — no separate `Layout` object. The dense store needs only **one** addition +the codegen already knows: `typeOf(long)` (for type-reject + the reader's `type()`). No reverse +`fieldPos → id` lookup is needed — the id is stored, so iteration names a tag via `nameOf(ids[i])`. 
+
+```java
+// API outline only: method bodies omitted.
+public final class AttributeValueTable { // backed by KnownTags.Resolver (global layout)
+
+  // write: @return true if stored in a slot; false => caller must bucket it
+  public boolean set(long tagId, CharSequence value);
+  public boolean set(long tagId, Object value);
+  public boolean set(long tagId, boolean value);
+  public boolean set(long tagId, int value);
+  public boolean set(long tagId, long value);
+  public boolean set(long tagId, float value);
+  public boolean set(long tagId, double value);
+
+  public boolean remove(long tagId); // @return true if a slot was cleared
+  public void clear();
+
+  public boolean contains(long tagId);
+  public int size();
+
+  // read: returns a FLYWEIGHT EntryReader positioned at the matching entry (or null if absent).
+  // EntryReader's own type()/objectValue()/typed-value accessors cover value reads, so no
+  // separate getString/getInt/... and no separate Visitor are needed.
+  // NOTE: transient view, valid until the next table op; not retainable.
+  public TagMap.EntryReader get(long tagId);
+
+  // iteration yields the repositioned flyweight EntryReader -> plugs into the existing
+  // Iterable<EntryReader> serialize path with ZERO per-tag allocation.
+  public Iterator<TagMap.EntryReader> iterator();
+}
+```
+
+Read model: `TagMap` is already `Iterable<EntryReader>` and the msgpack writer already consumes
+`EntryReader`, so the table reuses that contract: no bespoke visitor and no separate typed getters
+(`EntryReader`'s own coercion covers reads, shared via `TagValueConversions`). `get`/`iterator`
+return a **flyweight** `EntryReader` (the `EntryReadingHelper` pattern: one reusable cursor
+repositioned per entry), so no `Entry` per tag. `OptimizedTagMap`'s iterator chains the table's
+readers then its bucket `Entry`s (also `EntryReader`s), so iteration is uniform. Materialize a
+retainable `Entry` via the existing `EntryReader.entry()` when a caller needs to hold it (the
+flyweight is otherwise a transient view, valid until the next table op).
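The "transient view" caveat is easy to get wrong, so here is a small sketch of the flyweight pattern the read model relies on. Everything in it is hypothetical: `FlyweightTagView` and `Cursor` stand in for the table and `EntryReader`, `materialize()` plays the role of `EntryReader.entry()`, and a plain `String[]` of names replaces the `nameOf(ids[i])` lookup so the example stays self-contained.

```java
import java.util.AbstractMap;
import java.util.Iterator;
import java.util.Map;
import java.util.NoSuchElementException;

/** Hypothetical sketch of a flyweight cursor over dense arrays; not the real EntryReader. */
final class FlyweightTagView implements Iterable<FlyweightTagView.Cursor> {

  /** Reusable view over one (name, value) pair. */
  static final class Cursor {
    private String name;
    private Object value;

    String name() {
      return name;
    }

    Object value() {
      return value;
    }

    /** Copies the current pair into a retainable object (the role of entry()). */
    Map.Entry<String, Object> materialize() {
      return new AbstractMap.SimpleImmutableEntry<>(name, value);
    }
  }

  private final String[] names; // stand-in for nameOf(ids[i])
  private final Object[] values;
  private final int size;
  private final Cursor cursor = new Cursor(); // the single shared flyweight

  FlyweightTagView(String[] names, Object[] values, int size) {
    this.names = names;
    this.values = values;
    this.size = size;
  }

  @Override
  public Iterator<Cursor> iterator() {
    return new Iterator<Cursor>() {
      private int next;

      @Override
      public boolean hasNext() {
        return next < size;
      }

      @Override
      public Cursor next() {
        if (next >= size) {
          throw new NoSuchElementException();
        }
        // Reposition the shared cursor instead of allocating a per-tag object.
        cursor.name = names[next];
        cursor.value = values[next];
        next++;
        return cursor;
      }
    };
  }
}
```

In this sketch every call to `next()` returns the same `Cursor` instance, so a reference kept from an earlier step silently changes when the iterator advances; only the result of `materialize()` is safe to hold. The sketch still allocates one iterator per walk, which is a per-pass cost rather than a per-tag one.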
+ +Composition + the three tiers: + +```java +// OptimizedTagMap.set(long id, value) +if (!table.set(id, value)) setInBuckets(id, value); +// slotted known -> table stores, returns true +// unslotted known -> table returns false -> bucket (id-bearing Entry) +// unknown (keyOf==0)-> caller buckets directly +``` + +(Open: add a `getAndSet`-style variant only if a caller needs the prior value; `set->boolean` +covers the common write path.) + +## API-compat strategy + +`TagMap` is a large `Entry`-centric interface. Plan: +1. Implement `AttributeValueTable` as an alternative storage *inside* `OptimizedTagMap` + (replace the `Entry[] knownEntries` with the dense `ids`/`values` arrays), rather than a new + top-level type — keeps the whole interface working. +2. Known-tag get/set/remove/iterate operate on the dense arrays; bucket paths unchanged. +3. `Entry`-returning methods materialize lazily via `EntryReader.entry()`. +4. Reuse the existing `Iterable` serialize path (flyweight per entry) — no new cursor. + +## Open questions + +1. **Initial array capacity / growth.** Starting size for `ids`/`values` and growth policy (spans + carry ~5–15 tags; pick a sensible default to avoid resizes without over-allocating tiny spans). +2. **`Ledger` / builder path** — how accumulated changes apply to the dense arrays. +3. **Scan vs index at larger N.** If some span types carry many tags, confirm the linear scan still + wins; otherwise a small index is an option (but adds cost the dense form is trying to avoid). + +Resolved during design: dense parallel arrays over positional-by-`fieldPos` (mixins become plain +pairs); single `Object[] values` over a parallel `long[] prims` (the extra array + type tracking +cost more than the few boxes); reads/serialize via the existing `EntryReader` rather than a bespoke +visitor; no separate `Layout` (consult the resolver, + `typeOf`). 
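To make the `set`-returns-`boolean` routing and the type-reject rule concrete, here is a hedged sketch of the composition. All names are illustrative: `TieredTagMapSketch`, the ids `COMPONENT` and `PEER_PORT`, and the local `typeOf` are hypothetical, and a `HashMap` stands in for the `Entry` hash buckets. The sketch also assumes last-write-wins across the two stores (a value moving to the buckets is removed from the dense arrays and vice versa), which the design text does not spell out.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch of dense-table-first routing with bucket fallback. */
final class TieredTagMapSketch {
  static final long COMPONENT = 1L; // assumed id, declared as CharSequence
  static final long PEER_PORT = 2L; // assumed id, declared as Integer

  /** Stand-in for the resolver's typeOf(long); null means "not a known tag". */
  static Class<?> typeOf(long id) {
    if (id == COMPONENT) return CharSequence.class;
    if (id == PEER_PORT) return Integer.class;
    return null;
  }

  private long[] ids = new long[4];
  private Object[] values = new Object[4];
  private int size;
  private final Map<Long, Object> buckets = new HashMap<>(); // stand-in for Entry buckets

  /** Dense write: true if stored, false if the caller must bucket the value. */
  private boolean tableSet(long id, Object value) {
    Class<?> declared = typeOf(id);
    if (declared == null || !declared.isInstance(value)) {
      return false; // unknown id or off-type value
    }
    for (int i = 0; i < size; i++) {
      if (ids[i] == id) {
        values[i] = value;
        return true;
      }
    }
    if (size == ids.length) {
      ids = Arrays.copyOf(ids, size * 2);
      values = Arrays.copyOf(values, size * 2);
    }
    ids[size] = id;
    values[size] = value;
    size++;
    return true;
  }

  /** Removes a pair from the dense arrays, shifting the tail to keep insertion order. */
  private void tableRemove(long id) {
    for (int i = 0; i < size; i++) {
      if (ids[i] == id) {
        int tail = size - i - 1;
        System.arraycopy(ids, i + 1, ids, i, tail);
        System.arraycopy(values, i + 1, values, i, tail);
        size--;
        values[size] = null; // drop the stale reference
        return;
      }
    }
  }

  /** Composition: try the dense table first, fall back to the buckets. */
  void set(long id, Object value) {
    if (tableSet(id, value)) {
      buckets.remove(id); // assumption: a tag lives in exactly one store
    } else {
      tableRemove(id);
      buckets.put(id, value);
    }
  }

  Object get(long id) {
    for (int i = 0; i < size; i++) {
      if (ids[i] == id) {
        return values[i];
      }
    }
    return buckets.get(id);
  }

  int denseSize() {
    return size;
  }

  int bucketSize() {
    return buckets.size();
  }
}
```

The point of the shape is that the "did it slot?" decision is made once, inside the table write, and the caller only reacts to the boolean. An off-type write such as a `String` for `PEER_PORT` degrades to the bucket path instead of corrupting the dense value.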
+ +## Performance: the trade, eyes open + +- **Write path (frequent): better** — scan + append into `ids`/`values`, no per-tag `Entry`. +- **Allocation / GC: better** — removes the 1.1% `Entry` lever; less GC (CPU the profile attributes + elsewhere). A typical (string-heavy) span allocates two arrays instead of N `Entry`s. +- **Read / serialize: some extra CPU per tag** — flyweight reposition + array read + `nameOf` + + coercion dispatch, vs today's `Entry` that caches name and typed value. **This is intrinsic to a + generic store** — you cannot match direct-field access without generating the fields (the POJO + endgame). Mitigations (lean flyweight, near-no-op coercion when the stored type matches) narrow it + but do not erase it. + +Why it's acceptable: the array-backed impl accepts that small read cost as the **price of generality** +(any tag, no codegen, no span-type-at-creation); **POJOs recover it for hot span types** on the same +interface. You pay the indirection only where you haven't specialized — i.e. where you don't care. +The net is likely neutral-to-positive even pre-POJO (cheaper frequent writes + lower GC; serialize is +a single pass per span); POJOs make it clearly positive where it counts. + +## How we'll measure + +**Standalone JMH first, three-way**, on a realistic PetClinic-like tag set (component, span.kind, +db.*, http.*), measuring throughput and **allocation (`-prof gc`)**: +1. today's `OptimizedTagMap` (`Entry[]`) — the baseline, +2. array-backed `AttributeValueTable` — does it regress read CPU? how much alloc does it save? +3. a **hand-written POJO** for one span type (e.g. `db.client`) — confirms the codegen endgame wins + enough to justify building the generator. + +If array-backed is promising (or break-even on CPU with the alloc win), integrate it (incl. 
the +`EntryReader` serialize path) and re-run the PetClinic CPU/alloc A/B with the existing harness; build +codegen POJOs for the hot span types once the hand-POJO confirms the payoff. diff --git a/dd-java-agent/agent-bootstrap/src/jmh/java/datadog/trace/bootstrap/instrumentation/decorator/PeerConnectionBenchmark.java b/dd-java-agent/agent-bootstrap/src/jmh/java/datadog/trace/bootstrap/instrumentation/decorator/PeerConnectionBenchmark.java new file mode 100644 index 00000000000..c7738d44192 --- /dev/null +++ b/dd-java-agent/agent-bootstrap/src/jmh/java/datadog/trace/bootstrap/instrumentation/decorator/PeerConnectionBenchmark.java @@ -0,0 +1,97 @@ +package datadog.trace.bootstrap.instrumentation.decorator; + +import static java.util.concurrent.TimeUnit.NANOSECONDS; +import static java.util.concurrent.TimeUnit.SECONDS; + +import datadog.trace.api.GlobalTracer; +import datadog.trace.bootstrap.instrumentation.api.AgentSpan; +import datadog.trace.common.writer.Writer; +import datadog.trace.core.CoreTracer; +import datadog.trace.core.DDSpan; +import java.net.InetAddress; +import java.net.InetSocketAddress; +import java.util.List; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.BenchmarkMode; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Level; +import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Mode; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.Warmup; + +/** + * Measures {@link BaseDecorator#onPeerConnection} on a real {@link DDSpan}. This is the + * tag-id-keyed fast-path (peer.hostname / peer.ipv4) end-to-end through the span/context/TagMap + * layers: compare this branch (id-keyed, slotted) against the prior commit (string-keyed, bucketed) + * by running the same benchmark on each. 
+ */ +@State(Scope.Benchmark) +@Warmup(iterations = 3, time = 5, timeUnit = SECONDS) +@Measurement(iterations = 5, time = 5, timeUnit = SECONDS) +@BenchmarkMode(Mode.AverageTime) +@OutputTimeUnit(NANOSECONDS) +@Fork(value = 1) +public class PeerConnectionBenchmark { + + BenchmarkDecorator decorator; + InetSocketAddress connection; + AgentSpan span; + + @Setup(Level.Trial) + public void setUp() throws Exception { + CoreTracer tracer = + CoreTracer.builder().strictTraceWrites(true).writer(new NoOpWriter()).build(); + GlobalTracer.forceRegister(tracer); + decorator = new BenchmarkDecorator(); + span = tracer.startSpan("benchmark", "peer.connection"); + // resolved IPv4 address carrying an explicit host name, so onPeerConnection exercises + // peer.hostname + peer.ipv4 without triggering a reverse-DNS lookup. + InetAddress address = InetAddress.getByAddress("benchmark.host", new byte[] {10, 0, 0, 1}); + connection = new InetSocketAddress(address, 8080); + } + + @Benchmark + public AgentSpan onPeerConnection() { + return decorator.onPeerConnection(span, connection); + } + + static final class BenchmarkDecorator extends BaseDecorator { + @Override + protected String[] instrumentationNames() { + return new String[] {"benchmark"}; + } + + @Override + protected CharSequence spanType() { + return "benchmark"; + } + + @Override + protected CharSequence component() { + return "benchmark"; + } + } + + private static final class NoOpWriter implements Writer { + @Override + public void write(final List trace) {} + + @Override + public void start() {} + + @Override + public boolean flush() { + return false; + } + + @Override + public void close() {} + + @Override + public void incrementDropCounts(final int spanCount) {} + } +} diff --git a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/BaseDecorator.java b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/BaseDecorator.java index 
6a8767e523f..bcd352786af 100644 --- a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/BaseDecorator.java +++ b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/BaseDecorator.java @@ -7,12 +7,12 @@ import datadog.trace.api.Config; import datadog.trace.api.DDTags; import datadog.trace.api.Functions; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.api.cache.QualifiedClassNameCache; import datadog.trace.bootstrap.instrumentation.api.AgentScope; import datadog.trace.bootstrap.instrumentation.api.AgentSpan; import datadog.trace.bootstrap.instrumentation.api.ErrorPriorities; -import datadog.trace.bootstrap.instrumentation.api.Tags; import java.lang.reflect.Method; import java.net.Inet4Address; import java.net.Inet6Address; @@ -81,7 +81,8 @@ protected final TagMap.Entry componentEntry() { // This approach while more complicated doesn't have any field initialization ordering issues. 
TagMap.Entry componentEntry = cachedComponentEntry; if (componentEntry == null) { - cachedComponentEntry = componentEntry = TagMap.Entry.create(Tags.COMPONENT, component()); + cachedComponentEntry = + componentEntry = TagMap.Entry.create(KnownTagIds.COMPONENT, component()); } return componentEntry; } @@ -165,26 +166,26 @@ public AgentSpan onPeerConnection(AgentSpan span, InetAddress remoteAddress, boo if (remoteAddress != null) { String ip = remoteAddress.getHostAddress(); if (resolved && Config.get().isPeerHostNameEnabled()) { - span.setTag(Tags.PEER_HOSTNAME, hostName(remoteAddress, ip)); + span.setTag(KnownTagIds.PEER_HOSTNAME, hostName(remoteAddress, ip)); } if (remoteAddress instanceof Inet4Address) { - span.setTag(Tags.PEER_HOST_IPV4, ip); + span.setTag(KnownTagIds.PEER_HOST_IPV4, ip); } else if (remoteAddress instanceof Inet6Address) { - span.setTag(Tags.PEER_HOST_IPV6, ip); + span.setTag(KnownTagIds.PEER_HOST_IPV6, ip); } } return span; } public AgentSpan setPeerPort(AgentSpan span, String port) { - span.setTag(Tags.PEER_PORT, port); + span.setTag(KnownTagIds.PEER_PORT, (CharSequence) port); return span; } public AgentSpan setPeerPort(AgentSpan span, int port) { if (port > UNSET_PORT) { - span.setTag(Tags.PEER_PORT, port); + span.setTag(KnownTagIds.PEER_PORT, port); } return span; } diff --git a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ClientDecorator.java b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ClientDecorator.java index 99dec2dbc08..48687afe130 100644 --- a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ClientDecorator.java +++ b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ClientDecorator.java @@ -1,5 +1,6 @@ package datadog.trace.bootstrap.instrumentation.decorator; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import 
datadog.trace.bootstrap.instrumentation.api.AgentSpan; import datadog.trace.bootstrap.instrumentation.api.Tags; @@ -22,7 +23,7 @@ private final TagMap.Entry spanKindEntry() { // decided to be cautious here, too. TagMap.Entry kindEntry = cachedSpanKindEntry; if (kindEntry == null) { - cachedSpanKindEntry = kindEntry = TagMap.Entry.create(Tags.SPAN_KIND, spanKind()); + cachedSpanKindEntry = kindEntry = TagMap.Entry.create(KnownTagIds.SPAN_KIND, spanKind()); } return kindEntry; } diff --git a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/DatabaseClientDecorator.java b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/DatabaseClientDecorator.java index 7336a059bdc..b1ddc014314 100644 --- a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/DatabaseClientDecorator.java +++ b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/DatabaseClientDecorator.java @@ -2,10 +2,10 @@ import static datadog.trace.api.gateway.Events.EVENTS; import static datadog.trace.bootstrap.instrumentation.api.ServiceNameSources.DB_CLIENT_SPLIT_BY_HOST; -import static datadog.trace.bootstrap.instrumentation.api.Tags.DB_TYPE; import datadog.appsec.api.blocking.BlockingException; import datadog.trace.api.Config; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.cache.DDCache; import datadog.trace.api.cache.DDCaches; import datadog.trace.api.gateway.BlockResponseFunction; @@ -16,7 +16,6 @@ import datadog.trace.api.naming.SpanNaming; import datadog.trace.bootstrap.instrumentation.api.AgentSpan; import datadog.trace.bootstrap.instrumentation.api.AgentTracer; -import datadog.trace.bootstrap.instrumentation.api.Tags; import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString; import java.util.function.BiConsumer; import java.util.function.BiFunction; @@ -70,11 +69,11 @@ public String getDbType() { */ public 
AgentSpan onConnection(final AgentSpan span, final CONNECTION connection) { if (connection != null) { - span.setTag(Tags.DB_USER, dbUser(connection)); + span.setTag(KnownTagIds.DB_USER, dbUser(connection)); onInstance(span, dbInstance(connection)); CharSequence hostName = dbHostname(connection); if (hostName != null) { - span.setTag(Tags.PEER_HOSTNAME, hostName); + span.setTag(KnownTagIds.PEER_HOSTNAME, hostName); if (Config.get().isDbClientSplitByHost()) { span.setServiceName(hostName.toString(), DB_CLIENT_SPLIT_BY_HOST); @@ -86,7 +85,7 @@ public AgentSpan onConnection(final AgentSpan span, final CONNECTION connection) protected AgentSpan onInstance(final AgentSpan span, final String dbInstance) { if (dbInstance != null) { - span.setTag(Tags.DB_INSTANCE, dbInstance); + span.setTag(KnownTagIds.DB_INSTANCE, dbInstance); String serviceName = dbClientService(dbInstance); if (null != serviceName) { span.setServiceName(serviceName, component()); @@ -149,7 +148,7 @@ public void onRawStatement(AgentSpan span, String sql) { protected void processDatabaseType(AgentSpan span, String dbType) { final NamingEntry namingEntry = CACHE.computeIfAbsent(dbType, NamingEntry::new); - span.setTag(DB_TYPE, namingEntry.dbType); + span.setTag(KnownTagIds.DB_TYPE, namingEntry.dbType); postProcessServiceAndOperationName(span, namingEntry); if (Config.get().isAppSecRaspEnabled() && dbType != null) { diff --git a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecorator.java b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecorator.java index 267d0149c3c..c47c5d6d792 100644 --- a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecorator.java +++ b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecorator.java @@ -13,6 +13,7 @@ import 
datadog.context.propagation.Propagators; import datadog.trace.api.Config; import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.datastreams.DataStreamsTransactionExtractor; import datadog.trace.api.datastreams.DataStreamsTransactionTracker; import datadog.trace.api.function.TriConsumer; @@ -314,7 +315,7 @@ public AgentSpan onRequest( if (request != null) { String method = method(request); - span.setTag(Tags.HTTP_METHOD, method); + span.setTag(KnownTagIds.HTTP_METHOD, method); // Copy of HttpClientDecorator url handling try { @@ -326,9 +327,10 @@ public AgentSpan onRequest( String path = encoded ? url.rawPath() : url.path(); if (valid) { span.setTag( - Tags.HTTP_URL, URIUtils.lazyValidURL(url.scheme(), url.host(), url.port(), path)); + KnownTagIds.HTTP_URL, + URIUtils.lazyValidURL(url.scheme(), url.host(), url.port(), path)); } else if (supportsRaw) { - span.setTag(Tags.HTTP_URL, URIUtils.lazyInvalidUrl(url.raw())); + span.setTag(KnownTagIds.HTTP_URL, URIUtils.lazyInvalidUrl(url.raw())); } if (extracted != null && extracted.getXForwardedHost() != null) { span.setTag(Tags.HTTP_HOSTNAME, extracted.getXForwardedHost()); diff --git a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ServerDecorator.java b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ServerDecorator.java index 20b11038ffd..d7a79f0cbd6 100644 --- a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ServerDecorator.java +++ b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/ServerDecorator.java @@ -1,15 +1,17 @@ package datadog.trace.bootstrap.instrumentation.decorator; import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.bootstrap.instrumentation.api.AgentSpan; import 
datadog.trace.bootstrap.instrumentation.api.Tags; public abstract class ServerDecorator extends BaseDecorator { + // id-keyed cached entries (set on every server span) so set() skips keyOf - see KnownTagIds private static final TagMap.Entry SPAN_KIND_ENTRY = - TagMap.Entry.create(Tags.SPAN_KIND, Tags.SPAN_KIND_SERVER); + TagMap.Entry.create(KnownTagIds.SPAN_KIND, Tags.SPAN_KIND_SERVER); private static final TagMap.Entry LANG_ENTRY = - TagMap.Entry.create(DDTags.LANGUAGE_TAG_KEY, DDTags.LANGUAGE_TAG_VALUE); + TagMap.Entry.create(KnownTagIds.LANGUAGE, DDTags.LANGUAGE_TAG_VALUE); @Override public AgentSpan afterStart(final AgentSpan span) { diff --git a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/http/HttpResourceDecorator.java b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/http/HttpResourceDecorator.java index 1458bd04eb1..c88d37d0aa7 100644 --- a/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/http/HttpResourceDecorator.java +++ b/dd-java-agent/agent-bootstrap/src/main/java/datadog/trace/bootstrap/instrumentation/decorator/http/HttpResourceDecorator.java @@ -1,10 +1,10 @@ package datadog.trace.bootstrap.instrumentation.decorator.http; import datadog.trace.api.Config; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.normalize.HttpResourceNames; import datadog.trace.bootstrap.instrumentation.api.AgentSpan; import datadog.trace.bootstrap.instrumentation.api.ResourceNamePriorities; -import datadog.trace.bootstrap.instrumentation.api.Tags; import datadog.trace.bootstrap.instrumentation.api.URIUtils; import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString; @@ -42,7 +42,7 @@ public final AgentSpan withRoute( if (encoded) { routeTag = URIUtils.decode(route.toString()); } - span.setTag(Tags.HTTP_ROUTE, routeTag); + span.setTag(KnownTagIds.HTTP_ROUTE, routeTag); if 
(Config.get().isHttpServerRouteBasedNaming()) { final CharSequence resourceName = HttpResourceNames.join(method, route); span.setResourceName(resourceName, ResourceNamePriorities.HTTP_FRAMEWORK_ROUTE); diff --git a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/BaseDecoratorTest.groovy b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/BaseDecoratorTest.groovy index 354a9c6bc4f..bffa7de056e 100644 --- a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/BaseDecoratorTest.groovy +++ b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/BaseDecoratorTest.groovy @@ -1,5 +1,6 @@ package datadog.trace.bootstrap.instrumentation.decorator +import datadog.trace.api.KnownTagIds import datadog.trace.api.TagMap import datadog.trace.bootstrap.instrumentation.api.AgentSpan import datadog.trace.bootstrap.instrumentation.api.AgentSpanContext @@ -52,14 +53,14 @@ class BaseDecoratorTest extends DDSpecification { then: if (!connection.isUnresolved()) { - 1 * span.setTag(Tags.PEER_HOSTNAME, connection.hostName) + 1 * span.setTag(KnownTagIds.PEER_HOSTNAME, connection.hostName) } - 1 * span.setTag(Tags.PEER_PORT, connection.port) + 1 * span.setTag(KnownTagIds.PEER_PORT, connection.port) if (connection.address instanceof Inet4Address) { - 1 * span.setTag(Tags.PEER_HOST_IPV4, connection.address.hostAddress) + 1 * span.setTag(KnownTagIds.PEER_HOST_IPV4, connection.address.hostAddress) } if (connection.address instanceof Inet6Address) { - 1 * span.setTag(Tags.PEER_HOST_IPV6, connection.address.hostAddress) + 1 * span.setTag(KnownTagIds.PEER_HOST_IPV6, connection.address.hostAddress) } 0 * _ diff --git a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpClientDecoratorTest.groovy 
b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpClientDecoratorTest.groovy index 1bc83457bd0..3c1881397a9 100644 --- a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpClientDecoratorTest.groovy +++ b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpClientDecoratorTest.groovy @@ -1,5 +1,7 @@ package datadog.trace.bootstrap.instrumentation.decorator +import datadog.trace.api.KnownTagIds + import datadog.trace.api.DDTags import datadog.trace.api.appsec.HttpClientRequest import datadog.trace.api.config.AppSecConfig @@ -69,7 +71,7 @@ class HttpClientDecoratorTest extends ClientDecoratorTest { 1 * span.setTag(DDTags.HTTP_QUERY, null) 1 * span.setTag(DDTags.HTTP_FRAGMENT, null) 1 * span.setTag(Tags.PEER_HOSTNAME, req.url.host) - 1 * span.setTag(Tags.PEER_PORT, req.url.port) + 1 * span.setTag(KnownTagIds.PEER_PORT, req.url.port) 1 * span.setResourceName({ it as String == req.method.toUpperCase() + " " + req.path }, ResourceNamePriorities.HTTP_PATH_NORMALIZER) if (renameService) { 1 * span.setServiceName(req.url.host, _) @@ -107,7 +109,7 @@ class HttpClientDecoratorTest extends ClientDecoratorTest { 1 * span.setTag(Tags.PEER_HOSTNAME, hostname) } if (port) { - 1 * span.setTag(Tags.PEER_PORT, port) + 1 * span.setTag(KnownTagIds.PEER_PORT, port) } if (url != null) { 1 * span.setResourceName({ it as String == expectedPath }, ResourceNamePriorities.HTTP_PATH_NORMALIZER) diff --git a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecoratorTest.groovy b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecoratorTest.groovy index da411dc2431..782f7e6241d 100644 --- a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecoratorTest.groovy +++ 
b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/HttpServerDecoratorTest.groovy @@ -1,5 +1,7 @@ package datadog.trace.bootstrap.instrumentation.decorator +import datadog.trace.api.KnownTagIds + import datadog.trace.api.DDTags import datadog.trace.api.TraceConfig @@ -70,10 +72,10 @@ class HttpServerDecoratorTest extends ServerDecoratorTest { then: if (req) { - 1 * this.span.setTag(Tags.HTTP_METHOD, "test-method") + 1 * this.span.setTag(KnownTagIds.HTTP_METHOD, "test-method") 1 * this.span.setTag(DDTags.HTTP_QUERY, _) 1 * this.span.setTag(DDTags.HTTP_FRAGMENT, _) - 1 * this.span.setTag(Tags.HTTP_URL, {it.toString() == url}) + 1 * this.span.setTag(KnownTagIds.HTTP_URL, {it.toString() == url}) 1 * this.span.setTag(Tags.HTTP_HOSTNAME, req.url.host) 2 * this.span.getRequestContext() 1 * this.span.setResourceName({ it as String == req.method.toUpperCase() + " " + req.path }, ResourceNamePriorities.HTTP_PATH_NORMALIZER) @@ -103,7 +105,7 @@ class HttpServerDecoratorTest extends ServerDecoratorTest { then: if (expectedUrl) { - 1 * this.span.setTag(Tags.HTTP_URL, {it.toString() == expectedUrl}) + 1 * this.span.setTag(KnownTagIds.HTTP_URL, {it.toString() == expectedUrl}) 2 * this.span.getRequestContext() } if (expectedUrl && tagQueryString) { @@ -119,7 +121,7 @@ class HttpServerDecoratorTest extends ServerDecoratorTest { 1 * this.span.getRequestContext() 1 * this.span.setResourceName({ it as String == expectedPath }) } - 1 * this.span.setTag(Tags.HTTP_METHOD, null) + 1 * this.span.setTag(KnownTagIds.HTTP_METHOD, null) _ * this.span.getLocalRootSpan() >> this.span 0 * _ @@ -153,13 +155,13 @@ class HttpServerDecoratorTest extends ServerDecoratorTest { decorator.onRequest(this.span, null, req, root()) then: - 1 * this.span.setTag(Tags.HTTP_URL, {it.toString() == expectedUrl}) + 1 * this.span.setTag(KnownTagIds.HTTP_URL, {it.toString() == expectedUrl}) 1 * this.span.setTag(Tags.HTTP_HOSTNAME, req.url.host) 1 * 
this.span.setTag(DDTags.HTTP_QUERY, expectedQuery) 1 * this.span.setTag(DDTags.HTTP_FRAGMENT, null) 2 * this.span.getRequestContext() 1 * this.span.setResourceName({ it as String == expectedResource }, ResourceNamePriorities.HTTP_PATH_NORMALIZER) - 1 * this.span.setTag(Tags.HTTP_METHOD, null) + 1 * this.span.setTag(KnownTagIds.HTTP_METHOD, null) _ * this.span.getLocalRootSpan() >> this.span 0 * _ @@ -229,7 +231,7 @@ class HttpServerDecoratorTest extends ServerDecoratorTest { } 1 * this.span.setTag(Tags.HTTP_FORWARDED_PORT, "123") if (conn?.port) { - 1 * this.span.setTag(Tags.PEER_PORT, conn.port) + 1 * this.span.setTag(KnownTagIds.PEER_PORT, conn.port) } 1 * this.span.setTag(Tags.HTTP_USER_AGENT, "some-user-agent") _ * this.span.getRequestContext() >> null diff --git a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/UrlConnectionDecoratorTest.groovy b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/UrlConnectionDecoratorTest.groovy index b31ab85d2d1..74bc461c73f 100644 --- a/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/UrlConnectionDecoratorTest.groovy +++ b/dd-java-agent/agent-bootstrap/src/test/groovy/datadog/trace/bootstrap/instrumentation/decorator/UrlConnectionDecoratorTest.groovy @@ -1,5 +1,7 @@ package datadog.trace.bootstrap.instrumentation.decorator +import datadog.trace.api.KnownTagIds + import datadog.trace.api.DDSpanTypes import datadog.trace.bootstrap.instrumentation.api.Tags import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString @@ -20,7 +22,7 @@ class UrlConnectionDecoratorTest extends ClientDecoratorTest { 1 * span.setTag(Tags.PEER_HOSTNAME, hostname) } if (port) { - 1 * span.setTag(Tags.PEER_PORT, port) + 1 * span.setTag(KnownTagIds.PEER_PORT, port) } 0 * _ diff --git a/dd-java-agent/instrumentation/jdbc/src/main/java/datadog/trace/instrumentation/jdbc/JDBCDecorator.java 
b/dd-java-agent/instrumentation/jdbc/src/main/java/datadog/trace/instrumentation/jdbc/JDBCDecorator.java index b6455d74372..b81bc1c69b1 100644 --- a/dd-java-agent/instrumentation/jdbc/src/main/java/datadog/trace/instrumentation/jdbc/JDBCDecorator.java +++ b/dd-java-agent/instrumentation/jdbc/src/main/java/datadog/trace/instrumentation/jdbc/JDBCDecorator.java @@ -11,6 +11,7 @@ import datadog.trace.api.BaseHash; import datadog.trace.api.Config; import datadog.trace.api.DDTraceId; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.naming.SpanNaming; import datadog.trace.api.propagation.W3CTraceParent; import datadog.trace.api.telemetry.LogCollector; @@ -277,7 +278,7 @@ private AgentSpan withQueryInfo(AgentSpan span, DBQueryInfo info, CharSequence c span.setResourceName(DB_QUERY); } span.context().setIntegrationName(component); - return span.setTag(Tags.COMPONENT, component); + return span.setTag(KnownTagIds.COMPONENT, component); } public boolean isOracle(final DBInfo dbInfo) { diff --git a/dd-trace-core/src/main/java/datadog/trace/core/DDSpan.java b/dd-trace-core/src/main/java/datadog/trace/core/DDSpan.java index 8ffcc77b49c..bbc0e08d708 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/DDSpan.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/DDSpan.java @@ -512,6 +512,48 @@ public DDSpan setTag(final String tag, final Object value) { return this; } + @Override + public DDSpan setTag(final long tagId, final Object value) { + context.setTag(tagId, value); + return this; + } + + @Override + public DDSpan setTag(final long tagId, final CharSequence value) { + context.setTag(tagId, value); + return this; + } + + @Override + public DDSpan setTag(final long tagId, final boolean value) { + context.setTag(tagId, value); + return this; + } + + @Override + public DDSpan setTag(final long tagId, final int value) { + context.setTag(tagId, value); + return this; + } + + @Override + public DDSpan setTag(final long tagId, final long value) { + 
context.setTag(tagId, value); + return this; + } + + @Override + public DDSpan setTag(final long tagId, final float value) { + context.setTag(tagId, value); + return this; + } + + @Override + public DDSpan setTag(final long tagId, final double value) { + context.setTag(tagId, value); + return this; + } + @Override public AgentSpan setAllTags(Map map) { context.setAllTags(map); diff --git a/dd-trace-core/src/main/java/datadog/trace/core/DDSpanContext.java b/dd-trace-core/src/main/java/datadog/trace/core/DDSpanContext.java index e7038db5dbe..20c46b73951 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/DDSpanContext.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/DDSpanContext.java @@ -1,6 +1,5 @@ package datadog.trace.core; -import static datadog.trace.api.DDTags.PARENT_ID; import static datadog.trace.api.DDTags.SPAN_LINKS; import static datadog.trace.api.cache.RadixTreeCache.HTTP_STATUSES; import static datadog.trace.bootstrap.instrumentation.api.ErrorPriorities.UNSET; @@ -11,6 +10,8 @@ import datadog.trace.api.DDTags; import datadog.trace.api.DDTraceId; import datadog.trace.api.Functions; +import datadog.trace.api.KnownTagIds; +import datadog.trace.api.KnownTags; import datadog.trace.api.ProcessTags; import datadog.trace.api.TagMap; import datadog.trace.api.cache.DDCache; @@ -385,7 +386,7 @@ public DDSpanContext( if (samplingPriority != PrioritySampling.UNSET) { setSamplingPriority(samplingPriority, SamplingMechanism.UNKNOWN); } - setTag(PARENT_ID, this.propagationTags.getLastParentId()); + setTag(KnownTagIds.PARENT_ID, this.propagationTags.getLastParentId()); } @Override @@ -901,6 +902,126 @@ public void setTag(final String tag, final String value) { } } + /** + * Sets a tag by its generated tag id. Three cases, classified by a single sign test on the id + * ({@link KnownTags#isIntercepted}): (a) reserved "virtual" tags and (c) intercepted-but-stored + * tags (e.g. 
http.method) are routed to the interceptor via an id dispatch, then stored if the + * interceptor didn't fully handle them; (b) non-intercepted stored tags go straight to the map + * (slot or bucket) keyed by id, bypassing the per-tag interceptor string switch. + */ + public void setTag(final long tagId, final Object value) { + if (null == value) { + String name = KnownTags.nameOf(tagId); + if (name != null) { + removeTag(name); + } + return; + } + if (KnownTags.isIntercepted(tagId)) { + if (!tagInterceptor.interceptTag(this, tagId, value)) { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + + public void setTag(final long tagId, final CharSequence value) { + if (null == value) { + String name = KnownTags.nameOf(tagId); + if (name != null) { + removeTag(name); + } + return; + } + if (KnownTags.isIntercepted(tagId)) { + if (!tagInterceptor.interceptTag(this, tagId, value)) { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + + public void setTag(final long tagId, final boolean value) { + if (KnownTags.isIntercepted(tagId)) { + // boxes on the (rare) reserved/intercepted path only + if (!tagInterceptor.interceptTag(this, tagId, value)) { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + + public void setTag(final long tagId, final int value) { + if (KnownTags.isIntercepted(tagId)) { + if (!tagInterceptor.interceptTag(this, tagId, value)) { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + + public void setTag(final long tagId, final long value) { + if (KnownTags.isIntercepted(tagId)) { + if (!tagInterceptor.interceptTag(this, tagId, value)) { + 
synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + + public void setTag(final long tagId, final float value) { + if (KnownTags.isIntercepted(tagId)) { + if (!tagInterceptor.interceptTag(this, tagId, value)) { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + + public void setTag(final long tagId, final double value) { + if (KnownTags.isIntercepted(tagId)) { + if (!tagInterceptor.interceptTag(this, tagId, value)) { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } else { + synchronized (unsafeTags) { + unsafeTags.set(tagId, value); + } + } + } + public void setTag(TagMap.EntryReader entry) { if (entry == null) { return; diff --git a/dd-trace-core/src/main/java/datadog/trace/core/taginterceptor/TagInterceptor.java b/dd-trace-core/src/main/java/datadog/trace/core/taginterceptor/TagInterceptor.java index 64bf017e9db..d32a37c9ca5 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/taginterceptor/TagInterceptor.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/taginterceptor/TagInterceptor.java @@ -24,6 +24,8 @@ import datadog.trace.api.Config; import datadog.trace.api.ConfigDefaults; import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; +import datadog.trace.api.KnownTags; import datadog.trace.api.Pair; import datadog.trace.api.TagMap; import datadog.trace.api.config.GeneralConfig; @@ -131,6 +133,36 @@ public boolean needsIntercept(String tag) { } } + /** + * Id-dispatched (fast) variant of {@link #interceptTag(DDSpanContext, String, Object)}: switches + * on the tagId's globalSerial (an int) instead of the tag-name string. Used by {@code + * DDSpanContext.setTag(long, Object)} for any {@link KnownTags#isIntercepted} id — reserved + * "virtual" tags AND intercepted-but-stored tags (e.g. 
http.method/url, peer.service). Hot tags + * get a dedicated case; the default falls back to resolving the name and running the (slower) + * string interception, so behavior matches the string set-path exactly. + */ + public boolean interceptTag(DDSpanContext span, long tagId, Object value) { + // Hot intercepted tags get a dedicated arm so the id path is fully string-free (no nameOf, no + // string switch). The serial already distinguishes http.method from http.url, so the + // url-as-resource rule is called with the known name constant directly. Any other intercepted + // id falls back to resolving the name and running the string interception (same behavior). + switch (KnownTags.globalSerial(tagId)) { + case KnownTagIds.ERROR_SERIAL: + return interceptError(span, value); + case KnownTagIds.HTTP_METHOD_SERIAL: + return interceptUrlResourceAsNameRule(span, HTTP_METHOD, value); + case KnownTagIds.HTTP_URL_SERIAL: + return interceptUrlResourceAsNameRule(span, HTTP_URL, value); + case KnownTagIds.PEER_SERVICE_SERIAL: + // mirrors the Tags.PEER_SERVICE arm of the string switch + span.setTag(DDTags.PEER_SERVICE_SOURCE, Tags.PEER_SERVICE); + return interceptServiceName(PEER_SERVICE, span, value); + default: + String name = KnownTags.nameOf(tagId); + return name != null && interceptTag(span, name, value); + } + } + public boolean interceptTag(DDSpanContext span, String tag, Object value) { switch (tag) { case DDTags.RESOURCE_NAME: diff --git a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/HttpEndpointPostProcessor.java b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/HttpEndpointPostProcessor.java index c2e0dd72761..7955593f220 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/HttpEndpointPostProcessor.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/HttpEndpointPostProcessor.java @@ -1,9 +1,6 @@ package datadog.trace.core.tagprocessor; -import static 
datadog.trace.bootstrap.instrumentation.api.Tags.HTTP_METHOD; -import static datadog.trace.bootstrap.instrumentation.api.Tags.HTTP_ROUTE; -import static datadog.trace.bootstrap.instrumentation.api.Tags.HTTP_URL; - +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.api.endpoint.EndpointResolver; import datadog.trace.api.internal.VisibleForTesting; @@ -61,16 +58,21 @@ public void processTags( return; } - if (unsafeTags.getObject(HTTP_METHOD) == null) { + if (unsafeTags.getEntry(KnownTagIds.HTTP_METHOD) == null) { return; } try { - String httpRoute = unsafeTags.getString(HTTP_ROUTE); - String httpUrl = unsafeTags.getString(HTTP_URL); + String httpRoute = stringValue(unsafeTags, KnownTagIds.HTTP_ROUTE); + String httpUrl = stringValue(unsafeTags, KnownTagIds.HTTP_URL); endpointResolver.resolveEndpoint(unsafeTags, httpRoute, httpUrl); } catch (Throwable t) { log.debug("Error processing HTTP endpoint for span {}", spanContext.getSpanId(), t); } } + + private static String stringValue(TagMap unsafeTags, long tagId) { + TagMap.Entry entry = unsafeTags.getEntry(tagId); + return entry == null ? 
null : entry.stringValue(); + } } diff --git a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/IntegrationAdder.java b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/IntegrationAdder.java index 0aabbc29c47..c4babce9941 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/IntegrationAdder.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/IntegrationAdder.java @@ -1,7 +1,6 @@ package datadog.trace.core.tagprocessor; -import static datadog.trace.api.DDTags.DD_INTEGRATION; - +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.bootstrap.instrumentation.api.AppendableSpanLinks; import datadog.trace.core.DDSpanContext; @@ -12,9 +11,9 @@ public void processTags( TagMap unsafeTags, DDSpanContext spanContext, AppendableSpanLinks spanLinks) { final CharSequence instrumentationName = spanContext.getIntegrationName(); if (instrumentationName != null) { - unsafeTags.set(DD_INTEGRATION, instrumentationName); + unsafeTags.set(KnownTagIds.INTEGRATION_ID, instrumentationName); } else { - unsafeTags.remove(DD_INTEGRATION); + unsafeTags.remove(KnownTagIds.INTEGRATION_ID); } } } diff --git a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/InternalTagsAdder.java b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/InternalTagsAdder.java index 68b13d19faf..1ee17e829e3 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/InternalTagsAdder.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/InternalTagsAdder.java @@ -2,7 +2,7 @@ import static datadog.trace.bootstrap.instrumentation.api.Tags.VERSION; -import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.bootstrap.instrumentation.api.AppendableSpanLinks; import datadog.trace.bootstrap.instrumentation.api.UTF8BytesString; @@ -11,28 +11,39 @@ public final class InternalTagsAdder extends TagsPostProcessor { 
private final UTF8BytesString ddService; - private final UTF8BytesString version; + + // base.service / version are fixed for the life of the tracer, so their TagMap.Entry objects are + // pre-built once and shared across every span (Entry is immutable and safe to share between + // maps). + // The entries are tag-id-bearing (KnownTagIds), so they also land in their positional slot. null + // when the corresponding value is absent/empty. See PR #11555 for the string-keyed precursor. + @Nullable private final TagMap.Entry baseServiceEntry; + @Nullable private final TagMap.Entry versionEntry; public InternalTagsAdder(@Nullable final String ddService, @Nullable final String version) { this.ddService = ddService != null ? UTF8BytesString.create(ddService) : null; - this.version = version != null && !version.isEmpty() ? UTF8BytesString.create(version) : null; + this.baseServiceEntry = TagMap.Entry.create(KnownTagIds.BASE_SERVICE, this.ddService); + this.versionEntry = + version != null && !version.isEmpty() + ? 
TagMap.Entry.create(KnownTagIds.VERSION, UTF8BytesString.create(version)) + : null; } @Override public void processTags( TagMap unsafeTags, DDSpanContext spanContext, AppendableSpanLinks spanLinks) { - if (spanContext == null || ddService == null) { + if (spanContext == null || ddService == null || ddService.length() == 0) { return; } if (!ddService.toString().equalsIgnoreCase(spanContext.getServiceName())) { - // service name != DD_SERVICE - unsafeTags.set(DDTags.BASE_SERVICE, ddService); + // service name != DD_SERVICE + unsafeTags.set(baseServiceEntry); } else { // as per config consistency, the version tag is added across tracers only if - // the service name is DD_SERVICE and version tag is not manually set - if (version != null && !unsafeTags.containsKey(VERSION)) { - unsafeTags.set(VERSION, version); + // the service name is DD_SERVICE and version tag is not manually set + if (versionEntry != null && !unsafeTags.containsKey(VERSION)) { + unsafeTags.set(versionEntry); } } } diff --git a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/PeerServiceCalculator.java b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/PeerServiceCalculator.java index 198e2c78f1c..bc17f6ec04c 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/PeerServiceCalculator.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/PeerServiceCalculator.java @@ -1,13 +1,12 @@ package datadog.trace.core.tagprocessor; import datadog.trace.api.Config; -import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.api.internal.VisibleForTesting; import datadog.trace.api.naming.NamingSchema; import datadog.trace.api.naming.SpanNaming; import datadog.trace.bootstrap.instrumentation.api.AppendableSpanLinks; -import datadog.trace.bootstrap.instrumentation.api.Tags; import datadog.trace.core.DDSpanContext; import java.util.Map; import javax.annotation.Nonnull; @@ -35,7 +34,7 @@ public 
PeerServiceCalculator() { @Override public void processTags( TagMap unsafeTags, DDSpanContext spanContext, AppendableSpanLinks spanLinks) { - Object peerService = unsafeTags.getObject(Tags.PEER_SERVICE); + Object peerService = peerService(unsafeTags); // the user set it if (peerService != null) { if (canRemap) { @@ -46,18 +45,23 @@ public void processTags( // calculate the defaults (if any) peerServiceNaming.tags(unsafeTags); // only remap if the mapping is not empty (saves one get) - remapPeerService(unsafeTags, canRemap ? unsafeTags.getObject(Tags.PEER_SERVICE) : null); + remapPeerService(unsafeTags, canRemap ? peerService(unsafeTags) : null); return; } // we have no peer.service and we do not compute defaults. Leave the map untouched } + private static Object peerService(TagMap unsafeTags) { + TagMap.Entry entry = unsafeTags.getEntry(KnownTagIds.PEER_SERVICE); + return entry == null ? null : entry.objectValue(); + } + private void remapPeerService(TagMap unsafeTags, Object value) { if (value != null) { String mapped = peerServiceMapping.get(value); if (mapped != null) { - unsafeTags.put(Tags.PEER_SERVICE, mapped); - unsafeTags.put(DDTags.PEER_SERVICE_REMAPPED_FROM, value); + unsafeTags.set(KnownTagIds.PEER_SERVICE, mapped); + unsafeTags.set(KnownTagIds.PEER_SERVICE_REMAPPED_FROM, value); } } } diff --git a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/RemoteHostnameAdder.java b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/RemoteHostnameAdder.java index bc0939a74cb..245a3f2a18f 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/RemoteHostnameAdder.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/RemoteHostnameAdder.java @@ -1,6 +1,6 @@ package datadog.trace.core.tagprocessor; -import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.bootstrap.instrumentation.api.AppendableSpanLinks; import datadog.trace.core.DDSpanContext; 
@@ -33,7 +33,7 @@ public void processTags( return; } - TagMap.Entry newEntry = TagMap.Entry.create(DDTags.TRACER_HOST, hostname); + TagMap.Entry newEntry = TagMap.Entry.create(KnownTagIds.TRACER_HOST_ID, hostname); unsafeTags.set(newEntry); this.cachedHostEntry = newEntry; } diff --git a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/ServiceNameSourceAdder.java b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/ServiceNameSourceAdder.java index 4b081889039..0a72d02c73e 100644 --- a/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/ServiceNameSourceAdder.java +++ b/dd-trace-core/src/main/java/datadog/trace/core/tagprocessor/ServiceNameSourceAdder.java @@ -1,7 +1,6 @@ package datadog.trace.core.tagprocessor; -import static datadog.trace.api.DDTags.DD_SVC_SRC; - +import datadog.trace.api.KnownTagIds; import datadog.trace.api.TagMap; import datadog.trace.bootstrap.instrumentation.api.AppendableSpanLinks; import datadog.trace.core.DDSpanContext; @@ -12,9 +11,9 @@ public void processTags( TagMap unsafeTags, DDSpanContext spanContext, AppendableSpanLinks spanLinks) { final CharSequence serviceNameSource = spanContext.getServiceNameSource(); if (serviceNameSource != null) { - unsafeTags.set(DD_SVC_SRC, serviceNameSource); + unsafeTags.set(KnownTagIds.SVC_SRC_ID, serviceNameSource); } else { - unsafeTags.remove(DD_SVC_SRC); + unsafeTags.remove(KnownTagIds.SVC_SRC_ID); } } } diff --git a/dd-trace-core/src/test/java/datadog/trace/core/DDSpanContextTest.java b/dd-trace-core/src/test/java/datadog/trace/core/DDSpanContextTest.java index 4185c9acdab..9193d29dbd1 100644 --- a/dd-trace-core/src/test/java/datadog/trace/core/DDSpanContextTest.java +++ b/dd-trace-core/src/test/java/datadog/trace/core/DDSpanContextTest.java @@ -23,7 +23,9 @@ import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertFalse; import static org.junit.jupiter.api.Assertions.assertInstanceOf; +import static 
org.junit.jupiter.api.Assertions.assertNotNull; import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertTrue; import static org.mockito.Mockito.mock; import static org.mockito.Mockito.times; import static org.mockito.Mockito.verify; @@ -31,6 +33,7 @@ import datadog.trace.api.DDTags; import datadog.trace.api.DDTraceId; +import datadog.trace.api.KnownTagIds; import datadog.trace.api.internal.TraceSegment; import datadog.trace.bootstrap.instrumentation.api.AgentSpan; import datadog.trace.bootstrap.instrumentation.api.AgentSpanContext; @@ -70,6 +73,69 @@ void setup() { .build(); } + @Test + void setTagById_storedTagResolvesByName() { + AgentSpan span = tracer.buildSpan("datadog", "fakeOperation").start(); + DDSpanContext context = (DDSpanContext) span.context(); + + // PARENT_ID is a stored tag (serial >= FIRST_STORED_SERIAL): set by id, it lands in the map and + // is findable / serialized by its resolved name. + context.setTag(KnownTagIds.PARENT_ID, "p123"); + assertEquals("p123", context.getTags().get(DDTags.PARENT_ID)); + + span.finish(); + } + + @Test + void setTagById_reservedTagIsIntercepted() { + AgentSpan span = tracer.buildSpan("datadog", "fakeOperation").start(); + DDSpanContext context = (DDSpanContext) span.context(); + + // ERROR is a reserved (virtual) tag: setting it by id dispatches through the interceptor + // (id-keyed), which sets the error flag and does NOT store an "error" tag. 
+ context.setTag(KnownTagIds.ERROR, true); + assertTrue(context.getErrorFlag()); + assertNull(context.getTags().get(Tags.ERROR)); + + span.finish(); + } + + @Test + void setTagById_interceptedButStoredTagRunsInterceptor() { + AgentSpan span = tracer.buildSpan("datadog", "fakeOperation").start(); + DDSpanContext context = (DDSpanContext) span.context(); + + // peer.service is intercepted-but-stored (case c): setting it by id must run the interceptor + // side-effect (which records the peer.service source) AND store the value, exactly like the + // string set-path. The id carries the INTERCEPTED flag so setTag(long) routes through the + // interceptor. + context.setTag(KnownTagIds.PEER_SERVICE, "my-remote-svc"); + + assertEquals(Tags.PEER_SERVICE, context.getTags().get(DDTags.PEER_SERVICE_SOURCE)); + assertEquals("my-remote-svc", context.getTags().get(Tags.PEER_SERVICE)); + + span.finish(); + } + + @Test + void commonTags_slotByNameViaCommonLayout() { + // env / product flags are build-time-known common tags (KnownTagIds). Set by name they resolve + // to their id and land in the common slot layout, and remain findable by both name and id. 
+ AgentSpan span = tracer.buildSpan("datadog", "fakeOperation").start(); + DDSpanContext context = (DDSpanContext) span.context(); + + context.setTag("env", "prod"); + context.setTag(DDTags.DJM_ENABLED, 1); + + assertEquals("prod", context.getTags().get("env")); + assertEquals(1, context.getTags().get(DDTags.DJM_ENABLED)); + // proves they occupy the shared slot layout (findable by id) + assertNotNull(context.getTags().getEntry(KnownTagIds.ENV_ID)); + assertNotNull(context.getTags().getEntry(KnownTagIds.DJM_ENABLED)); + + span.finish(); + } + @ParameterizedTest @ValueSource(strings = {DDTags.SERVICE_NAME, DDTags.RESOURCE_NAME, DDTags.SPAN_TYPE, "some.tag"}) void nullValuesForTagsDeleteExistingTags(String name) throws Exception { diff --git a/dd-trace-core/src/test/java/datadog/trace/core/taginterceptor/TagInterceptorTest.java b/dd-trace-core/src/test/java/datadog/trace/core/taginterceptor/TagInterceptorTest.java index 42a8b80e054..8e2b925b35b 100644 --- a/dd-trace-core/src/test/java/datadog/trace/core/taginterceptor/TagInterceptorTest.java +++ b/dd-trace-core/src/test/java/datadog/trace/core/taginterceptor/TagInterceptorTest.java @@ -5,6 +5,7 @@ import static datadog.trace.junit.utils.config.WithConfigExtension.injectSysConfig; import static org.junit.jupiter.api.Assertions.assertEquals; import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotNull; import static org.junit.jupiter.api.Assertions.assertNull; import static org.junit.jupiter.api.Assertions.assertTrue; import static org.junit.jupiter.params.provider.Arguments.arguments; @@ -16,6 +17,8 @@ import datadog.trace.api.DDSpanTypes; import datadog.trace.api.DDTags; +import datadog.trace.api.KnownTagIds; +import datadog.trace.api.KnownTags; import datadog.trace.api.ProductTraceSource; import datadog.trace.api.remoteconfig.ServiceNameCollector; import datadog.trace.api.remoteconfig.ServiceNameCollectorTestBridge; @@ -630,6 +633,23 @@ void 
urlAsResourceNameRuleSetsTheResourceName( } } + @Test + void urlAsResourceNameRuleViaTagId() { + // Drives the specialized HTTP_METHOD_SERIAL / HTTP_URL_SERIAL arms of interceptTag(long): + // setting http.method + http.url BY ID must run the same url-as-resource rule as the string + // path and produce the same resource name. + CoreTracer tracer = tracerBuilder().writer(new ListWriter()).build(); + + AgentSpan span = tracer.buildSpan("datadog", "fakeOperation").start(); + try { + span.setTag(KnownTagIds.HTTP_METHOD, "POST"); + span.setTag(KnownTagIds.HTTP_URL, "/with-method"); + assertEquals("POST /with-method", span.getResourceName().toString()); + } finally { + span.finish(); + } + } + @Test void whenUserSetsPeerServiceTheSourceShouldBePeerService() { CoreTracer tracer = tracerBuilder().writer(new ListWriter()).build(); @@ -688,6 +708,37 @@ void whenInterceptServletContextExtraServiceProviderIsCalled(String value, Strin } } + @Test + void knownTagIdInterceptedFlagMatchesNameBasedNeedsIntercept() throws Exception { + // No-regression guard: the INTERCEPTED bit baked into each KnownTagIds id must agree with the + // interceptor's name-based needsIntercept(name). If a new id is added (or interception of a + // name changes) without keeping the flag in sync, DDSpanContext.setTag(long) would either skip + // a needed interception or pointlessly intercept — this catches the drift. + RuleFlags ruleFlags = mock(RuleFlags.class); + when(ruleFlags.isEnabled(any())).thenReturn(true); + TagInterceptor interceptor = new TagInterceptor(ruleFlags); + + int checked = 0; + for (java.lang.reflect.Field field : KnownTagIds.class.getDeclaredFields()) { + if (field.getType() != long.class) { + continue; // ids are the long constants; skip *_SERIAL ints, ENV string, etc. 
+ } + long tagId = field.getLong(null); + String name = KnownTags.nameOf(tagId); + assertNotNull(name, "id " + field.getName() + " should resolve to a name"); + assertEquals( + interceptor.needsIntercept(name), + KnownTags.isIntercepted(tagId), + "INTERCEPTED flag for " + + field.getName() + + " (\"" + + name + + "\") disagrees with needsIntercept"); + checked++; + } + assertTrue(checked > 0, "expected to check at least one tag id"); + } + @Test void whenInterceptsProductTraceSourcePropagationTagUpdatePropagatedTraceSourceIsCalled() { RuleFlags ruleFlags = mock(RuleFlags.class); diff --git a/dd-trace-core/src/test/java/datadog/trace/core/tagprocessor/InternalTagsAdderTest.java b/dd-trace-core/src/test/java/datadog/trace/core/tagprocessor/InternalTagsAdderTest.java index ea3798a4427..a914b78fd28 100644 --- a/dd-trace-core/src/test/java/datadog/trace/core/tagprocessor/InternalTagsAdderTest.java +++ b/dd-trace-core/src/test/java/datadog/trace/core/tagprocessor/InternalTagsAdderTest.java @@ -14,6 +14,7 @@ import datadog.trace.test.util.DDJavaSpecification; import java.util.Collections; import java.util.Objects; +import org.junit.jupiter.api.Test; import org.tabletest.junit.TableTest; class InternalTagsAdderTest extends DDJavaSpecification { @@ -67,4 +68,16 @@ void shouldAddVersionWhenDdServiceEqualsServiceNameAndVersionSet( verify(spanContext, times(1)).getServiceName(); assertEquals(expected, Objects.toString(unsafeTags.get(VERSION), null)); } + + // Regression: empty DD_SERVICE is treated the same as unset — processTags exits early and writes + // no tags, regardless of the span's service name (the prebuilt base.service entry is null). 
+ @Test + void emptyDdServiceWritesNoTags() { + InternalTagsAdder adder = new InternalTagsAdder("", "1.0"); + DDSpanContext spanContext = mock(DDSpanContext.class); + + TagMap tags = TagMap.fromMap(Collections.emptyMap()); + adder.processTags(tags, spanContext, link -> {}); + assertTrue(tags.isEmpty()); + } } diff --git a/internal-api/build.gradle.kts b/internal-api/build.gradle.kts index fc95dd9e1f1..89b5b17e958 100644 --- a/internal-api/build.gradle.kts +++ b/internal-api/build.gradle.kts @@ -279,7 +279,33 @@ dependencies { testImplementation(libs.bundles.mockito) } +// Forward TagMapFuzzTest knobs (datadog.tagmap.fuzz.seed / .iterations) to the forked test JVM, so +// a failing run can be reproduced via -Ddatadog.tagmap.fuzz.seed= (forked test JVMs don't +// inherit -D from the Gradle invocation). +tasks.withType<Test>().configureEach { + System.getProperties().stringPropertyNames().forEach { + if (it.startsWith("datadog.tagmap.fuzz.")) { + systemProperty(it, System.getProperty(it)) + } + } +} + jmh { jmhVersion = libs.versions.jmh.get() duplicateClassesStrategy = DuplicatesStrategy.EXCLUDE + if (project.hasProperty("jmhInclude")) { + includes.set(listOf(project.property("jmhInclude") as String)) + } + if (project.hasProperty("jmhWarmup")) { + warmupIterations.set((project.property("jmhWarmup") as String).toInt()) + } + if (project.hasProperty("jmhIterations")) { + iterations.set((project.property("jmhIterations") as String).toInt()) + } + if (project.hasProperty("jmhFork")) { + fork.set((project.property("jmhFork") as String).toInt()) + } + if (project.hasProperty("jmhProfilers")) { + profilers.set((project.property("jmhProfilers") as String).split(",").toList()) + } } diff --git a/internal-api/src/jmh/java/datadog/trace/api/AttrStoreBenchmark.java b/internal-api/src/jmh/java/datadog/trace/api/AttrStoreBenchmark.java new file mode 100644 index 00000000000..80745ef691b --- /dev/null +++ b/internal-api/src/jmh/java/datadog/trace/api/AttrStoreBenchmark.java @@ -0,0
+1,277 @@ +package datadog.trace.api; + +import datadog.trace.util.TagSet; +import java.util.concurrent.TimeUnit; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.BenchmarkMode; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Mode; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.TearDown; +import org.openjdk.jmh.annotations.Threads; +import org.openjdk.jmh.annotations.Warmup; +import org.openjdk.jmh.infra.Blackhole; + +/** + * How much headroom is left in the dense known-tag store? Builds a span's known tags three ways and + * measures throughput (+ allocation via {@code -prof gc}). Models the real lifecycle — set N tags, + * then iterate once (serialize). + * + *

Phase 1 (dense storage inside {@code OptimizedTagMap}) has already landed, so {@code current} + * below is the LIVE dense store, NOT the old Entry-per-tag design. The Entry[]->dense win is + * evidenced elsewhere (petclinic CPU/req, JFR); this benchmark now isolates what's LEFT to chase: + * + *

    + *
  1. {@code current}: the live {@link TagMap} ({@code OptimizedTagMap}) — already dense ({@code + * long[] knownIds + Object[] knownValues}, no per-tag {@code Entry}), plus the full TagMap + * machinery (size bookkeeping, lazy buckets, keyOf upgrade path). + *
  2. {@code dense}: a bare {@code long[] ids + Object[] values} store — strips the TagMap + * machinery, so {@code current} vs {@code dense} measures that overhead. Both box the one + * int. + *
  3. {@code pojo}: a hand-written class with typed fields — the codegen endgame (no {@code + * Entry}, no boxing, no arrays-per-tag); shows the ceiling {@code dense}->{@code pojo} + * buys. + *
+ * + * Tag set is db.client-like (the dominant PetClinic span): 11 strings + 1 int. + */ +@BenchmarkMode(Mode.Throughput) +@OutputTimeUnit(TimeUnit.SECONDS) +@Fork(1) +@Warmup(iterations = 3) +@Measurement(iterations = 5) +@Threads(8) +@State(Scope.Benchmark) +public class AttrStoreBenchmark { + static final String[] NAMES = { + "component", + "span.kind", + "language", + "_dd.base_service", + "db.type", + "db.instance", + "db.operation", + "db.user", + "db.pool.name", + "peer.hostname", + "peer.ipv4", + "peer.port", // last is the int + }; + static final int PORT_IDX = 11; + static final int N = NAMES.length; + + static final long[] IDS = new long[N]; + static final Object[] VALUES = new Object[N]; // string values; port is boxed Integer + + @Setup + public void setup() { + for (int i = 0; i < N; ++i) { + IDS[i] = KnownTags.tagId(i + 1, i, NAMES[i]); // serial=i+1, fieldPos=i + VALUES[i] = (i == PORT_IDX) ? Integer.valueOf(5432) : ("value-" + i); + } + final TagSet.Data nameTable = TagSet.Support.create(NAMES); + final long[] slotIds = new long[nameTable.names.length]; + for (int i = 0; i < N; ++i) { + slotIds[TagSet.Support.indexOf(nameTable.hashes, nameTable.names, NAMES[i])] = IDS[i]; + } + KnownTags.register( + new KnownTags.Resolver() { + @Override + public String nameOf(long tagId) { + int gs = (int) ((tagId >>> 48) & 0x7FFF); + return (gs >= 1 && gs <= N) ? NAMES[gs - 1] : null; + } + + @Override + public long keyOf(String name) { + int slot = TagSet.Support.indexOf(nameTable.hashes, nameTable.names, name); + return slot < 0 ? 
0L : slotIds[slot]; + } + + @Override + public int slotCount() { + return N; + } + }); + } + + @TearDown + public void tearDown() { + KnownTags.register(null); + } + + // ---------- current: OptimizedTagMap (live dense store, no per-tag Entry) ---------- + @Benchmark + public TagMap build_current() { + TagMap map = TagMap.create(); + for (int i = 0; i < N; ++i) { + map.set(IDS[i], VALUES[i]); + } + return map; + } + + @Benchmark + public void buildIter_current(Blackhole bh) { + TagMap map = TagMap.create(); + for (int i = 0; i < N; ++i) { + map.set(IDS[i], VALUES[i]); + } + for (TagMap.EntryReader e : map) { + bh.consume(e.tag()); + bh.consume(e.objectValue()); + } + } + + // ---------- dense: long[] ids + Object[] values ---------- + @Benchmark + public DenseStore build_dense() { + DenseStore s = new DenseStore(); + for (int i = 0; i < N; ++i) { + s.set(IDS[i], VALUES[i]); + } + return s; + } + + @Benchmark + public void buildIter_dense(Blackhole bh) { + DenseStore s = new DenseStore(); + for (int i = 0; i < N; ++i) { + s.set(IDS[i], VALUES[i]); + } + for (int i = 0; i < s.size; ++i) { + bh.consume(KnownTags.nameOf(s.ids[i])); + bh.consume(s.values[i]); + } + } + + // ---------- pojo: typed fields ---------- + @Benchmark + public DbPojo build_pojo() { + DbPojo p = new DbPojo(); + for (int i = 0; i < N; ++i) { + if (i == PORT_IDX) { + p.set(IDS[i], 5432); + } else { + p.set(IDS[i], VALUES[i]); + } + } + return p; + } + + @Benchmark + public void buildIter_pojo(Blackhole bh) { + DbPojo p = new DbPojo(); + for (int i = 0; i < N; ++i) { + if (i == PORT_IDX) { + p.set(IDS[i], 5432); + } else { + p.set(IDS[i], VALUES[i]); + } + } + p.iterate(bh); + } + + /** Dense (id, value) store — phase-1 design. 
*/ + static final class DenseStore { + long[] ids = new long[16]; + Object[] values = new Object[16]; + int size; + + void set(long id, Object v) { + for (int i = 0; i < size; ++i) { + if (ids[i] == id) { + values[i] = v; + return; + } + } + if (size == ids.length) { + ids = java.util.Arrays.copyOf(ids, size * 2); + values = java.util.Arrays.copyOf(values, size * 2); + } + ids[size] = id; + values[size] = v; + size++; + } + } + + /** Hand-written POJO — phase-2 codegen endgame. serial = fieldPos+1 here. */ + static final class DbPojo { + String component, + spanKind, + language, + baseService, + dbType, + dbInstance, + dbOperation, + dbUser, + dbPoolName, + peerHostname, + peerIpv4; + int peerPort; + + void set(long id, Object v) { + switch ((int) ((id >>> 48) & 0x7FFF)) { + case 1: + component = (String) v; + break; + case 2: + spanKind = (String) v; + break; + case 3: + language = (String) v; + break; + case 4: + baseService = (String) v; + break; + case 5: + dbType = (String) v; + break; + case 6: + dbInstance = (String) v; + break; + case 7: + dbOperation = (String) v; + break; + case 8: + dbUser = (String) v; + break; + case 9: + dbPoolName = (String) v; + break; + case 10: + peerHostname = (String) v; + break; + case 11: + peerIpv4 = (String) v; + break; + default: /* off-type / unknown -> would bucket */ + break; + } + } + + void set(long id, int v) { + if (((int) ((id >>> 48) & 0x7FFF)) == 12) { + peerPort = v; + } + } + + void iterate(Blackhole bh) { + bh.consume(component); + bh.consume(spanKind); + bh.consume(language); + bh.consume(baseService); + bh.consume(dbType); + bh.consume(dbInstance); + bh.consume(dbOperation); + bh.consume(dbUser); + bh.consume(dbPoolName); + bh.consume(peerHostname); + bh.consume(peerIpv4); + bh.consume(peerPort); + } + } +} diff --git a/internal-api/src/jmh/java/datadog/trace/api/TagMapAccessBaselineBenchmark.java b/internal-api/src/jmh/java/datadog/trace/api/TagMapAccessBaselineBenchmark.java new file mode 100644 index 
00000000000..72eeccb1fda --- /dev/null +++ b/internal-api/src/jmh/java/datadog/trace/api/TagMapAccessBaselineBenchmark.java @@ -0,0 +1,69 @@ +package datadog.trace.api; + +import java.util.concurrent.TimeUnit; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.BenchmarkMode; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Level; +import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Mode; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.Threads; +import org.openjdk.jmh.annotations.Warmup; +import org.openjdk.jmh.infra.Blackhole; + +/** + * Master-equivalent control for {@link TagMapAccessBenchmark}: string insertion / lookup with NO + * {@link KnownTags.Resolver} registered, so every tag uses the hash buckets (no slot routing, no + * keyOf). This mirrors how master behaves and isolates the comparison "automatic insertion by id + * (this branch) vs the pre-feature string baseline". + * + *

Runs in its own benchmark class so each method's fork has no resolver registered (the resolver + * is global static; {@code TagMapAccessBenchmark} registers one in its own forks). + */ +@BenchmarkMode(Mode.Throughput) +@OutputTimeUnit(TimeUnit.SECONDS) +@Fork(1) +@Warmup(iterations = 3) +@Measurement(iterations = 5) +@Threads(8) +@State(Scope.Benchmark) +public class TagMapAccessBaselineBenchmark { + // same tag set as TagMapAccessBenchmark for an apples-to-apples comparison + static final String[] NAMES = TagMapAccessBenchmark.NAMES; + + static final Object[] VALUES = new Object[NAMES.length]; + + TagMap readMap; + + @Setup(Level.Trial) + public void setup() { + KnownTags.register(null); // no resolver: pure string / bucket path, like master + for (int i = 0; i < NAMES.length; ++i) { + VALUES[i] = "value-" + i; + } + this.readMap = TagMap.create(); + for (int i = 0; i < NAMES.length; ++i) { + this.readMap.set(NAMES[i], VALUES[i]); + } + } + + @Benchmark + public TagMap insertByString_noResolver() { + TagMap map = TagMap.create(); + for (int i = 0; i < NAMES.length; ++i) { + map.set(NAMES[i], VALUES[i]); + } + return map; + } + + @Benchmark + public void getByString_noResolver(Blackhole bh) { + for (int i = 0; i < NAMES.length; ++i) { + bh.consume(this.readMap.getEntry(NAMES[i])); + } + } +} diff --git a/internal-api/src/jmh/java/datadog/trace/api/TagMapAccessBenchmark.java b/internal-api/src/jmh/java/datadog/trace/api/TagMapAccessBenchmark.java new file mode 100644 index 00000000000..ecd1a1244b4 --- /dev/null +++ b/internal-api/src/jmh/java/datadog/trace/api/TagMapAccessBenchmark.java @@ -0,0 +1,193 @@ +package datadog.trace.api; + +import datadog.trace.util.TagSet; +import java.util.concurrent.TimeUnit; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.BenchmarkMode; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Level; +import org.openjdk.jmh.annotations.Measurement; +import 
org.openjdk.jmh.annotations.Mode; +import org.openjdk.jmh.annotations.OutputTimeUnit; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.Setup; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.Threads; +import org.openjdk.jmh.annotations.Warmup; +import org.openjdk.jmh.infra.Blackhole; + +/** + * Compares tag insertion / lookup by generated tag id vs by string name, with a {@link + * KnownTags.Resolver} registered (the production configuration once code generation is live). + * + *

Tag ids are built via {@link KnownTags#tagId} (which uses the runtime's own name hash), so the + * comparison is faithful even on the bucket-fallback path. + * + *

byId stores straight into the dense known-tag store at its positional slot ({@code + * knownValues[fieldPos]}, O(1), no scan); byString pays {@code keyOf(name)} to resolve the id first + * (via the real {@link datadog.trace.util.TagSet} table) and then slots it the same way. The bucket + * baseline (no resolver, master-equivalent) is {@link TagMapAccessBaselineBenchmark}. + * Apple M1 Max (10 core) - 8 threads - 1 fork - Java 8 (Zulu 8.0.382) - positional dense store + * + * Benchmark Mode Cnt Score Error Units + * insertById thrpt 5 126235943.1 ± 11653584.6 ops/s + * insertByString thrpt 5 57355057.5 ± 2976623.2 ops/s + * getObjectById thrpt 5 129726670.1 ± 10877596.1 ops/s + * getObjectByString thrpt 5 73544340.8 ± 1349944.7 ops/s + * getEntryById thrpt 5 129117822.8 ± 16455290.0 ops/s + * getEntryByString thrpt 5 73422181.5 ± 2210885.4 ops/s + * baseline.insertByString_noResolver thrpt 5 43334158.2 ± 4699836.5 ops/s (master path) + * baseline.getByString_noResolver thrpt 5 107969497.0 ± 7160811.9 ops/s (master path) + * + * + *

    + *
  • getObject by id vs by name: 129.7M vs 73.5M (~1.77x) — the common read. The whole + * gap is {@code keyOf}; both hit the slot and return the raw value with no Entry. Id-keyed + * value reads win. + *
  • getObject ~= getEntry (130M ~= 129M): the Entry "materialization penalty" vanishes + * for value use — escape analysis scalar-replaces the transient Entry when the caller + * consumes its value rather than retaining it, so {@code getEntry} needs no replacement here. + * (getEntryReader was measured and dropped: its eager name resolution made it the slowest + * read.) + *
  • insertById ~3x the bucket baseline (126M vs 43M) — O(1) positional claim + no + * per-tag Entry; insertByString +32% (57M vs 43M) even paying {@code keyOf}, so the + * former name-insert regression is gone. + *
+ */ +@BenchmarkMode(Mode.Throughput) +@OutputTimeUnit(TimeUnit.SECONDS) +@Fork(1) +@Warmup(iterations = 3) +@Measurement(iterations = 5) +@Threads(8) +@State(Scope.Benchmark) +public class TagMapAccessBenchmark { + // a representative HTTP-server-ish tag set + static final String[] NAMES = { + "http.request.method", + "http.response.status_code", + "http.route", + "url.path", + "url.scheme", + "server.address", + "server.port", + "client.address", + "network.protocol.version", + "user_agent.original", + "span.kind", + "component", + "language", + "error", + "resource.name", + "service.name", + "operation.name", + "env", + }; + + // globalSerial = i + 1 (unique, non-zero); fieldPos = i (the positional slot in the dense store) + static final long[] IDS = new long[NAMES.length]; + static final Object[] VALUES = new Object[NAMES.length]; + + static { + for (int i = 0; i < NAMES.length; ++i) { + IDS[i] = KnownTags.tagId(i + 1, i, NAMES[i]); + VALUES[i] = "value-" + i; + } + // Register the resolver at CLASS INIT, not in @Setup: a benchmark-class @Setup and the + // per-thread ReadMap @Setup have no guaranteed cross-scope ordering, but class init does (any + // access to IDS triggers it before ReadMap.build runs). Process-global for this benchmark fork. + // nameOf is a dense array index by globalSerial; keyOf goes through the real open-addressed + // TagSet table (the algorithm KnownTagIds uses in production). + final TagSet.Data nameTable = TagSet.Support.create(NAMES); + final long[] slotIds = new long[nameTable.names.length]; + for (int i = 0; i < NAMES.length; ++i) { + slotIds[TagSet.Support.indexOf(nameTable.hashes, nameTable.names, NAMES[i])] = IDS[i]; + } + KnownTags.register( + new KnownTags.Resolver() { + @Override + public String nameOf(long tagId) { + int globalSerial = (int) (tagId >>> 48); + return (globalSerial >= 1 && globalSerial <= NAMES.length) + ? 
NAMES[globalSerial - 1] + : null; + } + + @Override + public long keyOf(String name) { + int slot = TagSet.Support.indexOf(nameTable.hashes, nameTable.names, name); + return slot < 0 ? 0L : slotIds[slot]; + } + + @Override + public int slotCount() { + return NAMES.length; // fieldPos 0..NAMES.length-1 + } + }); + } + + /** + * Pre-populated read map, PER-THREAD (Scope.Thread): each thread owns its map AND its reused + * reader flyweight, so getEntryReader doesn't contend on a shared flyweight under @Threads(8). + */ + @State(Scope.Thread) + public static class ReadMap { + OptimizedTagMap map; + + @Setup(Level.Trial) + public void build() { + this.map = (OptimizedTagMap) TagMap.create(); + for (int i = 0; i < IDS.length; ++i) { + this.map.set(IDS[i], VALUES[i]); + } + } + } + + @Benchmark + public TagMap insertById() { + TagMap map = TagMap.create(); + for (int i = 0; i < IDS.length; ++i) { + map.set(IDS[i], VALUES[i]); + } + return map; + } + + @Benchmark + public TagMap insertByString() { + TagMap map = TagMap.create(); + for (int i = 0; i < NAMES.length; ++i) { + map.set(NAMES[i], VALUES[i]); + } + return map; + } + + // ---- value reads (getObject - raw value, no Entry; the common read) ---- + @Benchmark + public void getObjectById(ReadMap rm, Blackhole bh) { + for (int i = 0; i < IDS.length; ++i) { + bh.consume(rm.map.getObject(IDS[i])); + } + } + + @Benchmark + public void getObjectByString(ReadMap rm, Blackhole bh) { + for (int i = 0; i < NAMES.length; ++i) { + bh.consume(rm.map.getObject(NAMES[i])); + } + } + + // ---- entry reads (materializes an Entry per call; EA elides it for transient value use) ---- + @Benchmark + public void getEntryById(ReadMap rm, Blackhole bh) { + for (int i = 0; i < IDS.length; ++i) { + bh.consume(rm.map.getEntry(IDS[i]).objectValue()); + } + } + + @Benchmark + public void getEntryByString(ReadMap rm, Blackhole bh) { + for (int i = 0; i < NAMES.length; ++i) { + bh.consume(rm.map.getEntry(NAMES[i]).objectValue()); + } + } +} diff 
--git a/internal-api/src/jmh/java/datadog/trace/util/KeyOfBenchmark.java b/internal-api/src/jmh/java/datadog/trace/util/KeyOfBenchmark.java new file mode 100644 index 00000000000..aee20e94755 --- /dev/null +++ b/internal-api/src/jmh/java/datadog/trace/util/KeyOfBenchmark.java @@ -0,0 +1,253 @@ +package datadog.trace.util; + +import java.util.HashMap; +import java.util.Map; +import java.util.function.Supplier; +import org.openjdk.jmh.annotations.Benchmark; +import org.openjdk.jmh.annotations.Fork; +import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.State; +import org.openjdk.jmh.annotations.Threads; +import org.openjdk.jmh.annotations.Warmup; + +/** + * name -> id resolution shootout (the {@code keyOf} path), built on the generic {@link TagSet}. + * + *
    + *
  • tagSet — {@code TagSet.Support.indexOf} over static-final {@code int[] + * hashes} / {@code String[] names} (refs fold to constants) + a parallel {@code long[] ids}. + *
  • tagSet_throughClass — same, but via a {@code TagSet} instance (an + * instance-field load of hashes/names) — isolates the wrapper indirection vs static fields. + *
  • hashMap — {@code HashMap<String, Long>} (boxes the value). + *
  • switch — hand-written string {@code switch} (the thing keyOf replaces). At 16 cases + * it inlines fine; the at-scale degradation (hundreds of cases over FreqInlineSize) shows up + * against the real generated keyOf, not here. + *
+ * + *

Two term flavors: interned (realistic — instrumentation passes string literals → the + * {@code ==} fast path in eq) and copies (non-interned → forces {@code String.equals}). + * Terms are hit-dominated. + * Apple M1 Max (10 core) - 8 threads (per-thread state) - 2 forks - Java 8 (Zulu 8.0.382) + * + * Benchmark Mode Cnt Score Error Units + * KeyOfBenchmark.aa_baseline_termSelection thrpt 6 2743246161.5 ± 29519843.7 ops/s + * KeyOfBenchmark.tagSet thrpt 6 2275407420.3 ± 35217527.6 ops/s + * KeyOfBenchmark.tagSet_throughClass thrpt 6 2036161909.9 ± 49813775.7 ops/s + * KeyOfBenchmark.hashMap thrpt 6 1889985340.4 ± 46434121.2 ops/s + * KeyOfBenchmark.switch_ thrpt 6 1132557957.9 ±128775728.2 ops/s + * // copies (non-interned): tagSet 1843M, tagSet_throughClass 1708M, hashMap 1593M, switch_ 1137M + * + * + *

    + *
  • tagSet ~2x the switch (2275M vs 1133M) at only 16 cases — the gap widens toward the + * generated hundreds, where the switch exceeds the inline budget. The keyOf swap's win. + *
  • tagSet ~20% over HashMap (2275M vs 1890M). + *
  • static ~12% over the instance (tagSet 2275M vs tagSet_throughClass 2036M) — folded + * static-final arrays beat the instance-field load; pull {@code Data} into your own statics. + *
  • The switch is interning-insensitive (1133≈1137, dispatch-bound); hash contenders gain + * ~16-19% interned via the {@code ==} fast path. + *
+ */ +@Fork(2) +@Warmup(iterations = 2) +@Measurement(iterations = 3) +@Threads(8) +@State(Scope.Thread) +public class KeyOfBenchmark { + static final long UNKNOWN = 0L; + + static final String[] NAMES_IN = { + "span.type", "component", "span.kind", "db.type", "db.instance", "db.statement", + "peer.hostname", "peer.port", "http.method", "http.route", "http.status_code", "http.url", + "error", "resource", "service", "operation" + }; + + /** ids parallel to NAMES_IN — id == index+1, matched across all contenders. */ + static final long[] IDS_IN = + init( + () -> { + long[] ids = new long[NAMES_IN.length]; + for (int j = 0; j < ids.length; j++) { + ids[j] = j + 1L; + } + return ids; + }); + + // fastest path: build once, pull into static final so the refs fold + static final int[] HASHES; + static final String[] NAMES; + static final long[] IDS; + + static { + TagSet.Data data = TagSet.Support.create(NAMES_IN); + long[] ids = new long[data.names.length]; + for (int j = 0; j < NAMES_IN.length; j++) { + ids[TagSet.Support.indexOf(data.hashes, data.names, NAMES_IN[j])] = IDS_IN[j]; + } + HASHES = data.hashes; + NAMES = data.names; + IDS = ids; + } + + static final Map<String, Long> HASH_MAP = + init( + () -> { + Map<String, Long> m = new HashMap<>(NAMES_IN.length * 2); + for (int j = 0; j < NAMES_IN.length; j++) { + m.put(NAMES_IN[j], IDS_IN[j]); + } + return m; + }); + + /** Convenience instance — the through-the-class path (instance-field loads vs folded statics). 
*/ + static final TagSet TAG_SET = TagSet.of(NAMES_IN); + + // hit-dominated, two misses; interned and non-interned copies + static final String[] TERMS = { + "span.type", "component", "span.kind", "db.type", "db.instance", "db.statement", + "peer.hostname", "peer.port", "http.method", "http.route", "http.status_code", "http.url", + "error", "resource", "service", "operation", "unknown.tag", "custom.attr" + }; + + static final String[] TERM_COPIES = + init( + () -> { + String[] copies = new String[TERMS.length]; + for (int i = 0; i < TERMS.length; i++) { + copies[i] = new String(TERMS[i]); // defeat interning + } + return copies; + }); + + int termIndex = 0; // per-thread (Scope.Thread) — no shared-counter contention under @Threads(8) + + String nextTerm() { + int i = termIndex + 1; + if (i >= TERMS.length) { + i = 0; + } + termIndex = i; + return TERMS[i]; + } + + String nextTermCopy() { + int i = termIndex + 1; + if (i >= TERM_COPIES.length) { + i = 0; + } + termIndex = i; + return TERM_COPIES[i]; + } + + static <T> T init(Supplier<T> supplier) { + return supplier.get(); + } + + // ---- resolvers ---- + static long tagSetKeyOf(String t) { + int slot = TagSet.Support.indexOf(HASHES, NAMES, t); // folded static-final refs + return slot < 0 ? UNKNOWN : IDS[slot]; + } + + static long tagSetThroughClassKeyOf(String t) { + int slot = TAG_SET.indexOf(t); // instance-field load of hashes/names + return slot < 0 ? UNKNOWN : IDS[slot]; + } + + static long hashMapKeyOf(String t) { + Long v = HASH_MAP.get(t); + return v == null ? 
UNKNOWN : v.longValue(); + } + + static long switchKeyOf(String t) { + switch (t) { + case "span.type": + return 1L; + case "component": + return 2L; + case "span.kind": + return 3L; + case "db.type": + return 4L; + case "db.instance": + return 5L; + case "db.statement": + return 6L; + case "peer.hostname": + return 7L; + case "peer.port": + return 8L; + case "http.method": + return 9L; + case "http.route": + return 10L; + case "http.status_code": + return 11L; + case "http.url": + return 12L; + case "error": + return 13L; + case "resource": + return 14L; + case "service": + return 15L; + case "operation": + return 16L; + default: + return UNKNOWN; + } + } + + // ---- interned terms (realistic) ---- + @Benchmark + public String aa_baseline_termSelection() { + return nextTerm(); + } + + @Benchmark + public long tagSet() { + return tagSetKeyOf(nextTerm()); + } + + @Benchmark + public long tagSet_throughClass() { + return tagSetThroughClassKeyOf(nextTerm()); + } + + @Benchmark + public long hashMap() { + return hashMapKeyOf(nextTerm()); + } + + @Benchmark + public long switch_() { + return switchKeyOf(nextTerm()); + } + + // ---- non-interned copies (forces equals) ---- + @Benchmark + public String aa_baseline_termSelectionCopy() { + return nextTermCopy(); + } + + @Benchmark + public long tagSet_copy() { + return tagSetKeyOf(nextTermCopy()); + } + + @Benchmark + public long tagSet_throughClass_copy() { + return tagSetThroughClassKeyOf(nextTermCopy()); + } + + @Benchmark + public long hashMap_copy() { + return hashMapKeyOf(nextTermCopy()); + } + + @Benchmark + public long switch_copy() { + return switchKeyOf(nextTermCopy()); + } +} diff --git a/internal-api/src/jmh/java/datadog/trace/util/SetBenchmark.java b/internal-api/src/jmh/java/datadog/trace/util/SetBenchmark.java index 144e4748400..eec1bbee95e 100644 --- a/internal-api/src/jmh/java/datadog/trace/util/SetBenchmark.java +++ b/internal-api/src/jmh/java/datadog/trace/util/SetBenchmark.java @@ -1,44 +1,62 @@ package 
datadog.trace.util; import java.util.Arrays; -import java.util.Collections; import java.util.HashSet; import java.util.TreeSet; -import java.util.concurrent.ThreadLocalRandom; import java.util.function.Supplier; import org.openjdk.jmh.annotations.Benchmark; import org.openjdk.jmh.annotations.Fork; import org.openjdk.jmh.annotations.Measurement; +import org.openjdk.jmh.annotations.Scope; +import org.openjdk.jmh.annotations.State; import org.openjdk.jmh.annotations.Threads; import org.openjdk.jmh.annotations.Warmup; /** - * + * Ways to represent a small set of strings and test membership, split into hit and miss lookups + * (different cost shapes per structure). Lookups are interned (the {@code ==} fast path); misses + * are short and never present. Per-thread state ({@code @State(Scope.Thread)}) keeps the rotation + * counter off the shared-write path under {@code @Threads(8)} — an earlier shared-counter version + * capped the fast structures at a ~1.4B contention ceiling (since superseded by the numbers below). * *
    - * Benchmark showing possible ways to represent and check if a set includes an elememt... - *
  • (RECOMMENDED) HashSet - on par with TreeSet - idiomatic - *
  • (RECOMMENDED) TreeMap - on par with HashSet - better solution if custom comparator is - * needed (see CaseInsensitiveMapBenchmark) - *
  • array - slower than HashSet - *
  • sortedArray - slowest - slower than array for common case of small arrays + *
  • tagSetSupport (static) is the fastest membership path — 2336M hit / 2170M miss. It + * beats the TagSet instance ({@code tagSet_*}) by ~7% (hit) to ~12% (miss): the instance pays + * an instance-field load of hashes/names, while {@code Support.indexOf} over {@code static + * final} arrays lets the refs fold to constants. Matches KeyOfBenchmark's ~12% static-vs- + * instance gap. So when the set is fixed, pull {@code Data} into your own static finals. + *
  • vs HashSet — the static path is ~12% faster on hit and ~par on miss. But HashSet was + * noisy here (±22% error) while TagSet was tight (±2-7%), so TagSet also wins on + * predictability — and is allocation-free and positional-capable. + *
  • array / sortedArray / treeSet cluster ~0.65-1.0B — they compare/scan per element, so they + * slow on miss (hit early-exits; miss does the full scan / binary descent / tree walk). + * TreeSet is NOT uniquely slowest — worth it only for a custom comparator (case-insensitive, + * dodging {@code toLowerCase}), not speed. *
* * - * MacBook M1 - 8 threads - Java 21 - * 1/3 not found rate + * Apple M1 Max (10 core) - 8 threads (per-thread state) - 2 forks - Java 8 (Zulu 8.0.382) * - * Benchmark Mode Cnt Score Error Units - * SetBenchmark.contains_array thrpt 6 645561886.327 ± 100781717.494 ops/s - * SetBenchmark.contains_hashSet thrpt 6 1536236680.235 ± 114966961.506 ops/s - * SetBenchmark.contains_sortedArray thrpt 6 571476939.441 ± 21334620.460 ops/s - * SetBenchmark.contains_treeSet thrpt 6 1557663759.411 ± 95343683.124 ops/s + * Benchmark Mode Cnt Score Error Units + * SetBenchmark.array_hit thrpt 6 995578895.732 ± 73709080.997 ops/s + * SetBenchmark.array_miss thrpt 6 649860848.470 ± 32489300.626 ops/s + * SetBenchmark.hashSet_hit thrpt 6 2081738804.271 ± 464349157.190 ops/s + * SetBenchmark.hashSet_miss thrpt 6 2136501411.026 ± 474132929.024 ops/s + * SetBenchmark.sortedArray_hit thrpt 6 837595967.794 ± 113538780.712 ops/s + * SetBenchmark.sortedArray_miss thrpt 6 692064118.699 ± 25752553.077 ops/s + * SetBenchmark.tagSet_hit thrpt 6 2184722734.028 ± 61054981.099 ops/s + * SetBenchmark.tagSet_miss thrpt 6 1933588009.009 ± 159869680.982 ops/s + * SetBenchmark.tagSetSupport_hit thrpt 6 2335685599.706 ± 52460762.937 ops/s + * SetBenchmark.tagSetSupport_miss thrpt 6 2169715463.018 ± 141321499.862 ops/s + * SetBenchmark.treeSet_hit thrpt 6 798251906.675 ± 39041398.413 ops/s + * SetBenchmark.treeSet_miss thrpt 6 667078954.487 ± 56517120.187 ops/s * */ @Fork(2) @Warmup(iterations = 2) @Measurement(iterations = 3) @Threads(8) +@State(Scope.Thread) public class SetBenchmark { static final String[] STRINGS = new String[] { @@ -60,45 +78,60 @@ static T init(Supplier supplier) { return supplier.get(); } - static final String[] LOOKUPS = + /** Present in the set (interned). */ + static final String[] HITS = STRINGS; + + /** Never present. 
*/ + static final String[] MISSES = init( () -> { - String[] lookups = Arrays.copyOf(STRINGS, STRINGS.length * 10); - - for (int i = 0; i < STRINGS.length; ++i) { - lookups[STRINGS.length + i] = new String(STRINGS[i]); + String[] misses = new String[STRINGS.length * 4]; + for (int i = 0; i < misses.length; ++i) { + misses[i] = "dne-" + i; } - - // 2 / 3 of the key look-ups miss the set - for (int i = STRINGS.length * 2; i < lookups.length; ++i) { - lookups[i] = "dne-" + ThreadLocalRandom.current().nextInt(); - } - - Collections.shuffle(Arrays.asList(lookups)); - return lookups; + return misses; }); - static int sharedLookupIndex = 0; + int hitIndex = 0; // per-thread (Scope.Thread) — no shared-counter contention under @Threads(8) + int missIndex = 0; - static String nextString() { - int localIndex = ++sharedLookupIndex; - if (localIndex >= LOOKUPS.length) { - sharedLookupIndex = localIndex = 0; + String nextHit() { + int i = hitIndex + 1; + if (i >= HITS.length) { + i = 0; } - return LOOKUPS[localIndex]; + hitIndex = i; + return HITS[i]; + } + + String nextMiss() { + int i = missIndex + 1; + if (i >= MISSES.length) { + i = 0; + } + missIndex = i; + return MISSES[i]; } static final String[] ARRAY = STRINGS; - @Benchmark - public boolean contains_array() { - String needle = nextString(); + static boolean arrayContains(String needle) { for (String str : ARRAY) { if (needle.equals(str)) return true; } return false; } + @Benchmark + public boolean array_hit() { + return arrayContains(nextHit()); + } + + @Benchmark + public boolean array_miss() { + return arrayContains(nextMiss()); + } + static final String[] SORTED_ARRAY = init( () -> { @@ -108,21 +141,70 @@ public boolean contains_array() { }); @Benchmark - public boolean contains_sortedArray() { - return (Arrays.binarySearch(SORTED_ARRAY, nextString()) != -1); + public boolean sortedArray_hit() { + return Arrays.binarySearch(SORTED_ARRAY, nextHit()) >= 0; + } + + @Benchmark + public boolean sortedArray_miss() { + 
return Arrays.binarySearch(SORTED_ARRAY, nextMiss()) >= 0; } static final HashSet HASH_SET = new HashSet<>(Arrays.asList(STRINGS)); @Benchmark - public boolean contains_hashSet() { - return HASH_SET.contains(nextString()); + public boolean hashSet_hit() { + return HASH_SET.contains(nextHit()); + } + + @Benchmark + public boolean hashSet_miss() { + return HASH_SET.contains(nextMiss()); } static final TreeSet TREE_SET = new TreeSet<>(Arrays.asList(STRINGS)); @Benchmark - public boolean contains_treeSet() { - return HASH_SET.contains(nextString()); + public boolean treeSet_hit() { + return TREE_SET.contains(nextHit()); + } + + @Benchmark + public boolean treeSet_miss() { + return TREE_SET.contains(nextMiss()); + } + + static final TagSet TAG_SET = TagSet.of(STRINGS); + + @Benchmark + public boolean tagSet_hit() { + return TAG_SET.contains(nextHit()); + } + + @Benchmark + public boolean tagSet_miss() { + return TAG_SET.contains(nextMiss()); + } + + // The static Support path: hashes/names built once into static-final arrays (refs fold to + // constants) and probed directly via Support.indexOf -- vs tagSet_* above, which loads them + // through a TagSet instance. Mirrors KeyOfBenchmark's tagSet (static) vs tagSet_throughClass. 
+ static final int[] SUPPORT_HASHES; + static final String[] SUPPORT_NAMES; + + static { + TagSet.Data data = TagSet.Support.create(STRINGS); + SUPPORT_HASHES = data.hashes; + SUPPORT_NAMES = data.names; + } + + @Benchmark + public boolean tagSetSupport_hit() { + return TagSet.Support.indexOf(SUPPORT_HASHES, SUPPORT_NAMES, nextHit()) >= 0; + } + + @Benchmark + public boolean tagSetSupport_miss() { + return TagSet.Support.indexOf(SUPPORT_HASHES, SUPPORT_NAMES, nextMiss()) >= 0; } } diff --git a/internal-api/src/main/java/datadog/trace/api/KnownTagIds.java b/internal-api/src/main/java/datadog/trace/api/KnownTagIds.java new file mode 100644 index 00000000000..3802fd955af --- /dev/null +++ b/internal-api/src/main/java/datadog/trace/api/KnownTagIds.java @@ -0,0 +1,298 @@ +package datadog.trace.api; + +import datadog.trace.bootstrap.instrumentation.api.Tags; +import datadog.trace.util.TagSet; + +/** + * Hand-assigned tag-id constants for well-known tags, plus the {@link KnownTags.Resolver} that + * resolves them. This is the single registry shared by the tracer core and by instrumentation + * (decorators) — it lives in {@code internal-api} so both layers can reference the ids; the + * eventual code generator will replace the hand assignment here. + * + *

<p>Reserved serials {@code [1, KnownTags.FIRST_STORED_SERIAL)} name "virtual" tags handled by the + * tag interceptor / span fields and are NOT stored in the {@code TagMap}; their {@code fieldPos} is + * the {@link KnownTags#NO_SLOT} sentinel that is out of slot range, so any incidental store routes + * to the hash buckets rather than a positional slot. Serials {@code >= FIRST_STORED_SERIAL} name + * stored tags that slot/bucket normally (or, with {@code NO_SLOT}, are stored bucket-only). + * + *

<p>The resolver registers on class initialization, so simply referencing any constant here makes + * tag-id resolution live before the first span is built. + */ +public final class KnownTagIds { + // slot count = (max stored fieldPos) + 1. Stored tags use fieldPos 0..25. + static final int SLOT_COUNT = 26; + + // ---- reserved / virtual (tag-interceptor handled, not stored) ---- + // Reserved tags are always intercepted -> set the INTERCEPTED flag. + public static final int ERROR_SERIAL = 1; + public static final long ERROR = KnownTags.intercepted(KnownTags.tagId(ERROR_SERIAL, Tags.ERROR)); + + // ---- stored (slotted / bucketed) ---- + public static final int PARENT_ID_SERIAL = KnownTags.FIRST_STORED_SERIAL; + public static final long PARENT_ID = KnownTags.tagId(PARENT_ID_SERIAL, 0, DDTags.PARENT_ID); + + // common (process-constant) tags added by InternalTagsAdder to ~every span + public static final int BASE_SERVICE_SERIAL = KnownTags.FIRST_STORED_SERIAL + 1; + public static final long BASE_SERVICE = + KnownTags.tagId(BASE_SERVICE_SERIAL, 1, DDTags.BASE_SERVICE); + + public static final int VERSION_SERIAL = KnownTags.FIRST_STORED_SERIAL + 2; + public static final long VERSION = KnownTags.tagId(VERSION_SERIAL, 2, Tags.VERSION); + + // build-time-known constant tags merged into defaultSpanTags (see CoreTracer.withTracerTags). + // "env" is a base-mixin tag; the *_ENABLED flags are product-mixin tags. Hand-assigned for now.
+ public static final String ENV = "env"; + public static final int ENV_SERIAL = KnownTags.FIRST_STORED_SERIAL + 3; + public static final long ENV_ID = KnownTags.tagId(ENV_SERIAL, 3, ENV); + + public static final int DJM_ENABLED_SERIAL = KnownTags.FIRST_STORED_SERIAL + 4; + public static final long DJM_ENABLED = KnownTags.tagId(DJM_ENABLED_SERIAL, 4, DDTags.DJM_ENABLED); + + public static final int DSM_ENABLED_SERIAL = KnownTags.FIRST_STORED_SERIAL + 5; + public static final long DSM_ENABLED = KnownTags.tagId(DSM_ENABLED_SERIAL, 5, DDTags.DSM_ENABLED); + + // common tags added by the tag post-processors (RemoteHostnameAdder / IntegrationAdder / + // ServiceNameSourceAdder). Not intercepted; stored. + public static final int TRACER_HOST_SERIAL = KnownTags.FIRST_STORED_SERIAL + 6; + public static final long TRACER_HOST_ID = + KnownTags.tagId(TRACER_HOST_SERIAL, 6, DDTags.TRACER_HOST); + + public static final int INTEGRATION_SERIAL = KnownTags.FIRST_STORED_SERIAL + 7; + public static final long INTEGRATION_ID = + KnownTags.tagId(INTEGRATION_SERIAL, 7, DDTags.DD_INTEGRATION); + + public static final int SVC_SRC_SERIAL = KnownTags.FIRST_STORED_SERIAL + 8; + public static final long SVC_SRC_ID = KnownTags.tagId(SVC_SRC_SERIAL, 8, DDTags.DD_SVC_SRC); + + // peer.service tags, read/written by PeerServiceCalculator (post-processor; uses Map put/get that + // bypass the interceptor). peer.service is intercepted on the set-path but STORED, so it slots. + public static final int PEER_SERVICE_SERIAL = KnownTags.FIRST_STORED_SERIAL + 9; + public static final long PEER_SERVICE = + KnownTags.intercepted(KnownTags.tagId(PEER_SERVICE_SERIAL, 9, Tags.PEER_SERVICE)); + + public static final int PEER_SERVICE_REMAPPED_FROM_SERIAL = KnownTags.FIRST_STORED_SERIAL + 10; + public static final long PEER_SERVICE_REMAPPED_FROM = + KnownTags.tagId(PEER_SERVICE_REMAPPED_FROM_SERIAL, 10, DDTags.PEER_SERVICE_REMAPPED_FROM); + + // HTTP tags read by HttpEndpointPostProcessor. 
http.method/http.url are intercepted-but-stored + // (interceptTag side-effects then returns false → stored); http.route is not intercepted. All + // stored, so the string set-path slots them via keyOf and the id reads here find them. + public static final int HTTP_METHOD_SERIAL = KnownTags.FIRST_STORED_SERIAL + 11; + public static final long HTTP_METHOD = + KnownTags.intercepted(KnownTags.tagId(HTTP_METHOD_SERIAL, 11, Tags.HTTP_METHOD)); + + public static final int HTTP_ROUTE_SERIAL = KnownTags.FIRST_STORED_SERIAL + 12; + public static final long HTTP_ROUTE = KnownTags.tagId(HTTP_ROUTE_SERIAL, 12, Tags.HTTP_ROUTE); + + public static final int HTTP_URL_SERIAL = KnownTags.FIRST_STORED_SERIAL + 13; + public static final long HTTP_URL = + KnownTags.intercepted(KnownTags.tagId(HTTP_URL_SERIAL, 13, Tags.HTTP_URL)); + + // peer connection tags set by BaseDecorator.onPeerConnection on ~every client/producer span. + // Not intercepted; stored. Slotted (common across client instrumentations). + public static final int PEER_HOSTNAME_SERIAL = KnownTags.FIRST_STORED_SERIAL + 14; + public static final long PEER_HOSTNAME = + KnownTags.tagId(PEER_HOSTNAME_SERIAL, 14, Tags.PEER_HOSTNAME); + + public static final int PEER_HOST_IPV4_SERIAL = KnownTags.FIRST_STORED_SERIAL + 15; + public static final long PEER_HOST_IPV4 = + KnownTags.tagId(PEER_HOST_IPV4_SERIAL, 15, Tags.PEER_HOST_IPV4); + + public static final int PEER_HOST_IPV6_SERIAL = KnownTags.FIRST_STORED_SERIAL + 16; + public static final long PEER_HOST_IPV6 = + KnownTags.tagId(PEER_HOST_IPV6_SERIAL, 16, Tags.PEER_HOST_IPV6); + + public static final int PEER_PORT_SERIAL = KnownTags.FIRST_STORED_SERIAL + 17; + public static final long PEER_PORT = KnownTags.tagId(PEER_PORT_SERIAL, 17, Tags.PEER_PORT); + + // Universal decorator tags — set on ~every span (component/span.kind via Base/Server/Client + // decorators, language via ServerDecorator). span.kind is intercepted (setSpanKindOrdinal). 
+ public static final int COMPONENT_SERIAL = KnownTags.FIRST_STORED_SERIAL + 18; + public static final long COMPONENT = KnownTags.tagId(COMPONENT_SERIAL, 18, Tags.COMPONENT); + + public static final int SPAN_KIND_SERIAL = KnownTags.FIRST_STORED_SERIAL + 19; + public static final long SPAN_KIND = + KnownTags.intercepted(KnownTags.tagId(SPAN_KIND_SERIAL, 19, Tags.SPAN_KIND)); + + public static final int LANGUAGE_SERIAL = KnownTags.FIRST_STORED_SERIAL + 20; + public static final long LANGUAGE = KnownTags.tagId(LANGUAGE_SERIAL, 20, DDTags.LANGUAGE_TAG_KEY); + + // JDBC / database-client tags — set on every db span (58% of petclinic spans). Not intercepted + // (only db.statement is, and that's handled separately). + public static final int DB_TYPE_SERIAL = KnownTags.FIRST_STORED_SERIAL + 21; + public static final long DB_TYPE = KnownTags.tagId(DB_TYPE_SERIAL, 21, Tags.DB_TYPE); + + public static final int DB_INSTANCE_SERIAL = KnownTags.FIRST_STORED_SERIAL + 22; + public static final long DB_INSTANCE = KnownTags.tagId(DB_INSTANCE_SERIAL, 22, Tags.DB_INSTANCE); + + public static final int DB_USER_SERIAL = KnownTags.FIRST_STORED_SERIAL + 23; + public static final long DB_USER = KnownTags.tagId(DB_USER_SERIAL, 23, Tags.DB_USER); + + public static final int DB_OPERATION_SERIAL = KnownTags.FIRST_STORED_SERIAL + 24; + public static final long DB_OPERATION = + KnownTags.tagId(DB_OPERATION_SERIAL, 24, Tags.DB_OPERATION); + + public static final int DB_POOL_NAME_SERIAL = KnownTags.FIRST_STORED_SERIAL + 25; + public static final long DB_POOL_NAME = + KnownTags.tagId(DB_POOL_NAME_SERIAL, 25, Tags.DB_POOL_NAME); + + // Open-addressed name -> id table backing keyOf (data, not a switch): scales flat as the known + // set grows, where a generated switch eventually falls off the inline threshold. KEYOF_NAMES and + // KEYOF_VALUES are parallel; the table places names by hash and a parallel ids[] by slot. 
+ private static final String[] KEYOF_NAMES = { + Tags.ERROR, + DDTags.PARENT_ID, + DDTags.BASE_SERVICE, + Tags.VERSION, + ENV, + DDTags.DJM_ENABLED, + DDTags.DSM_ENABLED, + DDTags.TRACER_HOST, + DDTags.DD_INTEGRATION, + DDTags.DD_SVC_SRC, + Tags.PEER_SERVICE, + DDTags.PEER_SERVICE_REMAPPED_FROM, + Tags.HTTP_METHOD, + Tags.HTTP_ROUTE, + Tags.HTTP_URL, + Tags.PEER_HOSTNAME, + Tags.PEER_HOST_IPV4, + Tags.PEER_HOST_IPV6, + Tags.PEER_PORT, + Tags.COMPONENT, + Tags.SPAN_KIND, + DDTags.LANGUAGE_TAG_KEY, + Tags.DB_TYPE, + Tags.DB_INSTANCE, + Tags.DB_USER, + Tags.DB_OPERATION, + Tags.DB_POOL_NAME, + }; + + private static final long[] KEYOF_VALUES = { + ERROR, + PARENT_ID, + BASE_SERVICE, + VERSION, + ENV_ID, + DJM_ENABLED, + DSM_ENABLED, + TRACER_HOST_ID, + INTEGRATION_ID, + SVC_SRC_ID, + PEER_SERVICE, + PEER_SERVICE_REMAPPED_FROM, + HTTP_METHOD, + HTTP_ROUTE, + HTTP_URL, + PEER_HOSTNAME, + PEER_HOST_IPV4, + PEER_HOST_IPV6, + PEER_PORT, + COMPONENT, + SPAN_KIND, + LANGUAGE, + DB_TYPE, + DB_INSTANCE, + DB_USER, + DB_OPERATION, + DB_POOL_NAME, + }; + + private static final int[] KEYOF_HASHES; + private static final String[] KEYOF_KEYS; + private static final long[] KEYOF_IDS; + + static { + TagSet.Data data = TagSet.Support.create(KEYOF_NAMES); + long[] ids = new long[data.names.length]; + for (int j = 0; j < KEYOF_NAMES.length; j++) { + ids[TagSet.Support.indexOf(data.hashes, data.names, KEYOF_NAMES[j])] = KEYOF_VALUES[j]; + } + KEYOF_HASHES = data.hashes; + KEYOF_KEYS = data.names; + KEYOF_IDS = ids; + } + + static final KnownTags.Resolver RESOLVER = + new KnownTags.Resolver() { + @Override + public String nameOf(long tagId) { + switch (KnownTags.globalSerial(tagId)) { + case ERROR_SERIAL: + return Tags.ERROR; + case PARENT_ID_SERIAL: + return DDTags.PARENT_ID; + case BASE_SERVICE_SERIAL: + return DDTags.BASE_SERVICE; + case VERSION_SERIAL: + return Tags.VERSION; + case ENV_SERIAL: + return ENV; + case DJM_ENABLED_SERIAL: + return DDTags.DJM_ENABLED; + case 
DSM_ENABLED_SERIAL: + return DDTags.DSM_ENABLED; + case TRACER_HOST_SERIAL: + return DDTags.TRACER_HOST; + case INTEGRATION_SERIAL: + return DDTags.DD_INTEGRATION; + case SVC_SRC_SERIAL: + return DDTags.DD_SVC_SRC; + case PEER_SERVICE_SERIAL: + return Tags.PEER_SERVICE; + case PEER_SERVICE_REMAPPED_FROM_SERIAL: + return DDTags.PEER_SERVICE_REMAPPED_FROM; + case HTTP_METHOD_SERIAL: + return Tags.HTTP_METHOD; + case HTTP_ROUTE_SERIAL: + return Tags.HTTP_ROUTE; + case HTTP_URL_SERIAL: + return Tags.HTTP_URL; + case PEER_HOSTNAME_SERIAL: + return Tags.PEER_HOSTNAME; + case PEER_HOST_IPV4_SERIAL: + return Tags.PEER_HOST_IPV4; + case PEER_HOST_IPV6_SERIAL: + return Tags.PEER_HOST_IPV6; + case PEER_PORT_SERIAL: + return Tags.PEER_PORT; + case COMPONENT_SERIAL: + return Tags.COMPONENT; + case SPAN_KIND_SERIAL: + return Tags.SPAN_KIND; + case LANGUAGE_SERIAL: + return DDTags.LANGUAGE_TAG_KEY; + case DB_TYPE_SERIAL: + return Tags.DB_TYPE; + case DB_INSTANCE_SERIAL: + return Tags.DB_INSTANCE; + case DB_USER_SERIAL: + return Tags.DB_USER; + case DB_OPERATION_SERIAL: + return Tags.DB_OPERATION; + case DB_POOL_NAME_SERIAL: + return Tags.DB_POOL_NAME; + default: + return null; + } + } + + @Override + public int slotCount() { + return SLOT_COUNT; + } + + @Override + public long keyOf(String name) { + int slot = TagSet.Support.indexOf(KEYOF_HASHES, KEYOF_KEYS, name); + return slot < 0 ? 0L : KEYOF_IDS[slot]; + } + }; + + static { + KnownTags.register(RESOLVER); + } + + private KnownTagIds() {} +} diff --git a/internal-api/src/main/java/datadog/trace/api/KnownTags.java b/internal-api/src/main/java/datadog/trace/api/KnownTags.java new file mode 100644 index 00000000000..41167077b87 --- /dev/null +++ b/internal-api/src/main/java/datadog/trace/api/KnownTags.java @@ -0,0 +1,167 @@ +package datadog.trace.api; + +/** + * Registry for generated tag ID ↔ name resolution. The code generator populates this at tracer init + * via {@link #register(Resolver)}. 
Once registered, HotSpot CHA devirtualizes and inlines the + * resolver's switch, making {@link #nameOf}/{@link #keyOf} effectively zero-overhead. + */ +public final class KnownTags { + // Plain (non-volatile) fast-path flag: false until a resolver is ever registered. A plain read is + // free and hoistable, unlike a volatile read of `resolver` (costly on weak memory models such as + // ARM). A stale `false` is benign — callers treat the tag as unknown and use the hash buckets, + // which is correct, just unoptimized; the next read after publication takes the slot path. + private static boolean active; + + private static volatile Resolver resolver; + + /** Fast-path gate: true once a resolver has been registered. */ + public static boolean isActive() { + return active; + } + + /* + * tagId bit layout: [63 intercepted] [62-48 globalSerial (15 bits)] [47-32 fieldPos] + * [31-0 nameHash]. Bit 63 (the sign bit) marks a tag the tag interceptor must see, so the check + * is a single {@code tagId < 0}. globalSerial is globally unique per known tag; fieldPos is its + * slot in the global positional layout (TagMap.knownEntries index); nameHash is + * TagMap.Entry#_hash(name) and is layout-independent. Unknown (string-only) tags have the upper + * 32 bits zero. NOTE: TagMap.Entry decodes nameHash inline as (int) tagId on its hot path, so the + * low-32 encoding here must stay in sync with that. + */ + public static int globalSerial(long tagId) { + return (int) ((tagId >>> 48) & 0x7FFF); + } + + /** + * Flag bit (the sign bit) marking a tag the tag interceptor must process — reserved/"virtual" + * tags AND intercepted-but-stored tags (e.g. http.method, which the interceptor side-effects and + * also stores). Encoded in the id so {@code DDSpanContext.setTag(long)} can route with a single + * sign test ({@link #isIntercepted}) instead of resolving the name. Non-intercepted tags (peer.*, + * base.service, …) leave it clear and take the fast store path. 
Must agree with the interceptor's + * name-based {@code needsIntercept} for every assigned id. + */ + public static final long INTERCEPTED = Long.MIN_VALUE; // 1L << 63 + + /** True if the tagId is flagged for tag-interceptor processing. */ + public static boolean isIntercepted(long tagId) { + return tagId < 0L; + } + + /** Returns the tagId with the {@link #INTERCEPTED} flag set. */ + public static long intercepted(long tagId) { + return tagId | INTERCEPTED; + } + + public static int fieldPos(long tagId) { + return (int) ((tagId >>> 32) & 0xFFFF); + } + + public static int nameHash(long tagId) { + return (int) tagId; + } + + /** + * globalSerial partition. {@code [1, FIRST_STORED_SERIAL)} is reserved for "virtual" tags that + * are specially handled (redirected to span fields or processed by the tag interceptor) and are + * NOT stored in the TagMap — these are hand-assigned in tracer core. {@code [FIRST_STORED_SERIAL, + * ..]} is for generated convention tags that ARE stored (slotted/bucketed). {@code globalSerial + * == 0} means unknown / string-only. Both core and the code generator must agree on this + * boundary. + */ + public static final int FIRST_STORED_SERIAL = 256; + + /** True if the tagId names a reserved "virtual"/specially-handled tag (not stored in the map). */ + public static boolean isReserved(long tagId) { + int globalSerial = globalSerial(tagId); + return globalSerial > 0 && globalSerial < FIRST_STORED_SERIAL; + } + + /** True if the tagId names a generated, map-stored (slotted/bucketed) tag. */ + public static boolean isStored(long tagId) { + return globalSerial(tagId) >= FIRST_STORED_SERIAL; + } + + /** + * Sentinel {@code fieldPos} meaning "no positional slot". It is the maximum value the 16-bit + * fieldPos field can hold, so it always compares {@code >= slotCount()} and routes to the hash + * buckets rather than the fast positional array. Two kinds of tagId use it: + * + *

<ul>
+ * <li>Reserved/virtual tags ({@code globalSerial < FIRST_STORED_SERIAL}): not stored at all; + * the sentinel just guarantees an incidental store never lands in a slot.
+ * <li>Unslotted stored tags ({@code globalSerial >= FIRST_STORED_SERIAL}): "low-priority" tags + * that get a stable id (and so {@code keyOf}/{@code nameOf} unification with their string + * form) but are deliberately not given a slot, so they live in the buckets and don't widen + * {@code knownEntries[]} for every span. {@code getEntry(long)} for these resolves the name + * and rehashes, which is the cost of not owning a slot.
+ * </ul>
+ */ + public static final int NO_SLOT = 0xFFFF; + + /** + * True if the tagId names a stored tag that deliberately has no positional slot (bucket-only). + */ + public static boolean isUnslotted(long tagId) { + return isStored(tagId) && fieldPos(tagId) == NO_SLOT; + } + + /** + * Builds a tagId from its parts: {@code globalSerial} (globally unique per known tag), {@code + * fieldPos} (the tag's slot within its span type's positional table), and the tag {@code name} + * (whose hash is computed via the same function the runtime uses, so the low 32 bits match {@link + * TagMap.Entry#hash()}). Inverse of {@link #globalSerial}/{@link #fieldPos}/{@link #nameHash}. + * Intended for the code generator and tests. + */ + public static long tagId(int globalSerial, int fieldPos, String name) { + long nameHash = TagMap.Entry._hash(name) & 0xFFFFFFFFL; + return ((long) globalSerial << 48) | ((long) (fieldPos & 0xFFFF) << 32) | nameHash; + } + + /** + * Builds a tagId with no positional slot ({@code fieldPos == }{@link #NO_SLOT}). Use for reserved + * "virtual" tags and for "low-priority" stored tags that get a stable id but are intentionally + * kept out of the fast slot array (they route to the hash buckets). See {@link #NO_SLOT}. + */ + public static long tagId(int globalSerial, String name) { + return tagId(globalSerial, NO_SLOT, name); + } + + // Number of positional slots in the global layout = (max stored fieldPos) + 1, declared by the + // registered provider. Captured once at registration and read as a dynamic constant; TagMap sizes + // its knownEntries array to exactly this rather than a hardcoded max. 0 when no resolver. + private static int slotCount; + + /** Slot count of the registered provider (max stored fieldPos + 1); 0 if none. */ + public static int slotCount() { + return slotCount; + } + + public interface Resolver { + String nameOf(long tagId); + + long keyOf(String name); + + /** Number of positional slots this provider uses: (max stored fieldPos) + 1. 
*/ + int slotCount(); + } + + public static void register(Resolver resolver) { + KnownTags.resolver = resolver; // volatile write publishes the resolver + KnownTags.slotCount = (resolver != null) ? resolver.slotCount() : 0; + KnownTags.active = (resolver != null); // plain write; readers re-read resolver volatile anyway + } + + public static String nameOf(long tagId) { + if (!active) return null; + Resolver r = resolver; + return r != null ? r.nameOf(tagId) : null; + } + + public static long keyOf(String name) { + if (!active) return 0L; + Resolver r = resolver; + return r != null ? r.keyOf(name) : 0L; + } + + private KnownTags() {} +} diff --git a/internal-api/src/main/java/datadog/trace/api/TagMap.java b/internal-api/src/main/java/datadog/trace/api/TagMap.java index 95f676245ef..06742a10a0c 100644 --- a/internal-api/src/main/java/datadog/trace/api/TagMap.java +++ b/internal-api/src/main/java/datadog/trace/api/TagMap.java @@ -172,6 +172,28 @@ static Ledger ledger(int size) { void set(EntryReader newEntry); + /* + * Tag-id keyed variants. The tagId encodes the tag's identity (see KnownTags); generated + * instrumentation uses these to avoid hashing tag-name strings. OptimizedTagMap routes these to + * its positional slots; LegacyTagMap resolves the name via KnownTags.nameOf and delegates to the + * string-keyed methods. (Abstract rather than default to keep the two implementations explicit.) + */ + void set(long tagId, Object value); + + void set(long tagId, CharSequence value); + + void set(long tagId, boolean value); + + void set(long tagId, int value); + + void set(long tagId, long value); + + void set(long tagId, float value); + + void set(long tagId, double value); + + Entry getEntry(long tagId); + /** sets the value while returning the prior Entry */ Entry getAndSet(String tag, Object value); @@ -227,6 +249,12 @@ static Ledger ledger(int size) { */ Entry getAndRemove(String tag); + /** Tag-id keyed removal (no prior value). See {@link #remove(String)}. 
*/ + boolean remove(long tagId); + + /** Tag-id keyed removal returning the prior Entry. See {@link #getAndRemove(String)}. */ + Entry getAndRemove(long tagId); + /** Returns a mutable copy of this TagMap */ TagMap copy(); @@ -284,18 +312,51 @@ public static final EntryRemoval newRemoval(String tag) { return new EntryRemoval(tag); } - final String tag; + public static final EntryRemoval newRemoval(long tagId) { + return new EntryRemoval(tagId); + } + + // tagId encoding (see KnownTags): bit 63 = intercepted flag, bits 62-48 = globalSerial (0 for + // unknown tags), bits 47-32 = fieldPos, bits 31-0 = nameHash (_hash(tagName)). String-constructed + // entries have upper 32 bits zero with the hash lazily populated on first hash(). + // tagId-constructed entries have every field populated at construction; their tag name is + // resolved lazily via tag(). + long tagId; + + // Non-volatile: for tagId-constructed entries the name is resolved lazily and cached here. + // A benign race may cause multiple threads to re-resolve, but KnownTags.nameOf returns the + // same interned constant each time, so the extra lookup is harmless. + String tag; EntryChange(String tag) { this.tag = tag; + this.tagId = 0; // nameHash populated lazily in hash() + } + + EntryChange(long tagId) { + this.tagId = tagId; + this.tag = null; // resolved lazily via tag() } + // For tagId-constructed changes (entries and removals) the name is resolved lazily from the + // tagId via KnownTags and cached. Benign race: KnownTags.nameOf returns the same interned + // constant, so re-resolution is harmless. public final String tag() { - return this.tag; + String name = this.tag; + if (name != null) return name; + name = KnownTags.nameOf(this.tagId); + if (name != null) this.tag = name; + return name; } public final boolean matches(String tag) { - return (this.tag == tag) || this.tag.equals(tag); + // Read the field directly for the common (string-constructed) case so this stays inlinable.
+ // Only tagId-constructed entries with an unresolved name fall back to the virtual tag(). + String myTag = this.tag; + if (myTag == null) { + myTag = this.tag(); + if (myTag == null) return false; + } + return (myTag == tag) || myTag.equals(tag); } public abstract boolean isRemoval(); @@ -306,6 +367,10 @@ final class EntryRemoval extends EntryChange { super(tag); } + EntryRemoval(long tagId) { + super(tagId); + } + @Override public boolean isRemoval() { return true; @@ -386,6 +451,18 @@ public static final Entry create(String tag, CharSequence value) { : TagMap.Entry.newObjectEntry(tag, value); } + /** Tag-id keyed {@link #create(String, Object)}: null value yields null. */ + public static final Entry create(long tagId, Object value) { + return (value == null) ? null : TagMap.Entry.newAnyEntry(tagId, value); + } + + /** Tag-id keyed {@link #create(String, CharSequence)}: null/empty value yields null. */ + public static final Entry create(long tagId, CharSequence value) { + return (value == null || value.length() == 0) + ? null + : TagMap.Entry.newObjectEntry(tagId, value); + } + public static final Entry create(String tag, boolean value) { return TagMap.Entry.newBooleanEntry(tag, value); } @@ -465,11 +542,33 @@ static Entry newDoubleEntry(String tag, Double box) { return new Entry(tag, DOUBLE, double2Prim(box.doubleValue()), box); } - /* - * hash is stored in line for fast handling of Entry-s coming from another TagMap - * However, hash is lazily computed using the same trick as {@link java.lang.String}. 
- */ - int lazyTagHash; + static Entry newAnyEntry(long tagId, Object value) { + return new Entry(tagId, ANY, 0L, value); + } + + static Entry newObjectEntry(long tagId, Object value) { + return new Entry(tagId, OBJECT, 0L, value); + } + + static Entry newBooleanEntry(long tagId, boolean value) { + return new Entry(tagId, BOOLEAN, boolean2Prim(value), Boolean.valueOf(value)); + } + + static Entry newIntEntry(long tagId, int value) { + return new Entry(tagId, INT, int2Prim(value), null); + } + + static Entry newLongEntry(long tagId, long value) { + return new Entry(tagId, LONG, long2Prim(value), null); + } + + static Entry newFloatEntry(long tagId, float value) { + return new Entry(tagId, FLOAT, float2Prim(value), null); + } + + static Entry newDoubleEntry(long tagId, double value) { + return new Entry(tagId, DOUBLE, double2Prim(value), null); + } // To optimize construction of Entry around boxed primitives and Object entries, // no type checks are done during construction. @@ -493,21 +592,23 @@ static Entry newDoubleEntry(String tag, Double box) { private Entry(String tag, byte type, long prim, Object obj) { super(tag); - this.lazyTagHash = 0; // lazily computed + this.rawType = type; + this.rawPrim = prim; + this.rawObj = obj; + } + private Entry(long tagId, byte type, long prim, Object obj) { + super(tagId); this.rawType = type; this.rawPrim = prim; this.rawObj = obj; } int hash() { - // If value of hash read in this thread is zero, then hash is computed. - // hash is not held as a volatile, since this computation can safely be repeated as any time - int hash = this.lazyTagHash; + int hash = (int) this.tagId; if (hash != 0) return hash; - hash = _hash(this.tag); - this.lazyTagHash = hash; + this.tagId = hash & 0xFFFFFFFFL; return hash; } @@ -1028,6 +1129,36 @@ public Ledger set(String tag, double value) { return this.recordEntry(Entry.newDoubleEntry(tag, value)); } + // Tag-id keyed variants — record a tag-id-bearing Entry. 
build()/fill() shares the Entry + // object, so the tagId survives into the built map. + public Ledger set(long tagId, Object value) { + return this.recordEntry(Entry.newAnyEntry(tagId, value)); + } + + public Ledger set(long tagId, CharSequence value) { + return this.recordEntry(Entry.newObjectEntry(tagId, value)); + } + + public Ledger set(long tagId, boolean value) { + return this.recordEntry(Entry.newBooleanEntry(tagId, value)); + } + + public Ledger set(long tagId, int value) { + return this.recordEntry(Entry.newIntEntry(tagId, value)); + } + + public Ledger set(long tagId, long value) { + return this.recordEntry(Entry.newLongEntry(tagId, value)); + } + + public Ledger set(long tagId, float value) { + return this.recordEntry(Entry.newFloatEntry(tagId, value)); + } + + public Ledger set(long tagId, double value) { + return this.recordEntry(Entry.newDoubleEntry(tagId, value)); + } + public Ledger set(Entry entry) { return this.recordEntry(entry); } @@ -1036,6 +1167,10 @@ public Ledger remove(String tag) { return this.recordRemoval(EntryChange.newRemoval(tag)); } + public Ledger remove(long tagId) { + return this.recordRemoval(EntryChange.newRemoval(tagId)); + } + private Ledger recordEntry(Entry entry) { this.recordChange(entry); return this; @@ -1119,7 +1254,12 @@ void fill(TagMap map) { EntryChange change = entryChanges[i]; if (change.isRemoval()) { - map.remove(change.tag()); + // route tag-id removals by id (slot-aware, no name round-trip); string removals by name + if (KnownTags.globalSerial(change.tagId) != 0) { + map.remove(change.tagId); + } else { + map.remove(change.tag()); + } } else { map.set((Entry) change); } @@ -1206,7 +1346,7 @@ public OptimizedTagMap create(int size) { @Override public OptimizedTagMap empty() { - return OptimizedTagMap.EMPTY; + return OptimizedTagMap.empty(); } } @@ -1251,20 +1391,55 @@ public LegacyTagMap empty() { * removed from the collision chain. 
*/ final class OptimizedTagMap implements TagMap { - // Using special constructor that creates a frozen view of an existing array - // Bucket calculation requires that array length is a power of 2 - // e.g. size 0 will not work, it results in ArrayIndexOutOfBoundsException, but size 1 does - static final OptimizedTagMap EMPTY = new OptimizedTagMap(new Object[1], 0); - - private final Object[] buckets; + // The shared empty (frozen) instance, via an initialization-on-demand holder. + // + // It must NOT be a direct static field of OptimizedTagMap: TagMap.EMPTY is computed in + // TagMap. through the factory (which returns this empty instance), and TagMap. + // can run *during* OptimizedTagMap. (before its static fields are assigned). A direct + // field would still be null at that point, so TagMap.EMPTY would capture null. A separate holder + // class initializes independently, so the factory always gets a valid instance regardless of + // which class is touched first. + // + // Special constructor creates a frozen view of an existing array; bucket calculation requires a + // power-of-2 length (size 0 fails with AIOOBE, size 1 works). + static OptimizedTagMap empty() { + return EmptyHolder.EMPTY; + } + + private static final class EmptyHolder { + static final OptimizedTagMap EMPTY = new OptimizedTagMap(null, 0); + } + + // Hash buckets for unknown tags (globalSerial == 0). Lazily allocated on the first unknown-tag + // insertion; an all-known map never allocates it. A null buckets array means "no bucketed + // entries" and must be treated as empty everywhere it is read/scanned. + private Object[] buckets; private int size; private boolean frozen; + // Positional store for known tags, indexed directly by fieldPos (no linear scan). Lazily + // allocated + // on the first slotted write, sized to KnownTags.slotCount() and never grown. 
Parallel arrays: + // knownIds[p] is the tagId occupying slot p (0L = empty) and knownValues[p] its value (Object; + // primitives boxed; null = empty). knownCount is the number of OCCUPIED slots. A known tag claims + // its slot first-writer-wins by occupant: an empty slot is claimed, the same globalSerial + // overwrites in place, and a DIFFERENT globalSerial already holding the slot (a fieldPos + // conflict) + // routes to the hash buckets instead. Unslotted known tags (fieldPos >= slotCount(), e.g. + // NO_SLOT) + // and unknown tags (globalSerial == 0) also live in the hash buckets. + private long[] knownIds; + private Object[] knownValues; + private int knownCount; + public OptimizedTagMap() { - // needs to be a power of 2 for bucket masking calculation to work as intended - this.buckets = new Object[1 << 4]; + // buckets stay null until the first unknown-tag insertion (see setInBuckets) + this.buckets = null; this.size = 0; this.frozen = false; + this.knownIds = null; + this.knownValues = null; + this.knownCount = 0; } /** Used for inexpensive immutable */ @@ -1272,6 +1447,9 @@ private OptimizedTagMap(Object[] buckets, int size) { this.buckets = buckets; this.size = size; this.frozen = true; + this.knownIds = null; + this.knownValues = null; + this.knownCount = 0; } @Override @@ -1297,15 +1475,30 @@ public Object get(Object tag) { return this.getObject((String) tag); } + // No-alloc dense lookup for the typed getters: for a known tag (keyOf resolves to a non-zero id), + // returns the raw stored value (primitives pre-boxed) so the caller can coerce it via + // TagValueConversions WITHOUT materializing an Entry. Returns null when the tag is not a present + // known tag (caller then falls back to the bucket lookup). Stored values are never null. + private Object knownRawValue(String tag) { + long tagId = KnownTags.keyOf(tag); + if (tagId == 0L) return null; + int i = this.knownIndexOf(tagId); + return i < 0 ? 
null : this.knownValues[i]; + } + /** Provides the corresponding entry value as an Object - boxing if necessary */ public Object getObject(String tag) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return known; + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? null : entry.objectValue(); } /** Provides the corresponding entry value as a String - calling toString if necessary */ public String getString(String tag) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return TagValueConversions.toString(known); + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? null : entry.stringValue(); } @@ -1314,7 +1507,9 @@ public boolean getBoolean(String tag) { } public boolean getBooleanOrDefault(String tag, boolean defaultValue) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return TagValueConversions.toBoolean(known); + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? defaultValue : entry.booleanValue(); } @@ -1323,7 +1518,9 @@ public int getInt(String tag) { } public int getIntOrDefault(String tag, int defaultValue) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return TagValueConversions.toInt(known); + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? defaultValue : entry.intValue(); } @@ -1332,7 +1529,9 @@ public long getLong(String tag) { } public long getLongOrDefault(String tag, long defaultValue) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return TagValueConversions.toLong(known); + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? 
defaultValue : entry.longValue(); } @@ -1341,7 +1540,9 @@ public float getFloat(String tag) { } public float getFloatOrDefault(String tag, float defaultValue) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return TagValueConversions.toFloat(known); + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? defaultValue : entry.floatValue(); } @@ -1350,7 +1551,9 @@ public double getDouble(String tag) { } public double getDoubleOrDefault(String tag, double defaultValue) { - Entry entry = this.getEntry(tag); + Object known = this.knownRawValue(tag); + if (known != null) return TagValueConversions.toDouble(known); + Entry entry = this.getEntryFromBuckets(tag); return entry == null ? defaultValue : entry.doubleValue(); } @@ -1397,7 +1600,55 @@ public Set> entrySet() { @Override public Entry getEntry(String tag) { + // Known tags live in the dense store; resolve identity and check there first. keyOf is a no-op + // until a resolver is registered, so this is just a hash-bucket lookup in the common case. + long tagId = KnownTags.keyOf(tag); + if (tagId != 0L) { + Entry known = this.knownGet(tagId); + if (known != null) return known; + } + return this.getEntryFromBuckets(tag); + } + + @Override + public Entry getEntry(long tagId) { + Entry known = this.knownGet(tagId); + if (known != null) return known; + + // not a known tag (unknown tag id) - look up by resolved name + String name = KnownTags.nameOf(tagId); + return name == null ? null : this.getEntryFromBuckets(name); + } + + // Mirrors knownRawValue(String) but skips keyOf - an id already carries its slot (fieldPos), so + // this is a direct positional dense read with no name resolution and no Entry. + private Object knownRawValue(long tagId) { + int i = this.knownIndexOf(tagId); + return i < 0 ? null : this.knownValues[i]; + } + + // Value read by id - dense fast path (no keyOf, no Entry), bucket fallback by resolved name. 
+ public Object getObject(long tagId) { + Object known = this.knownRawValue(tagId); + if (known != null) return known; + String name = KnownTags.nameOf(tagId); + if (name == null) return null; + Entry entry = this.getEntryFromBuckets(name); + return entry == null ? null : entry.objectValue(); + } + + public String getString(long tagId) { + Object known = this.knownRawValue(tagId); + if (known != null) return TagValueConversions.toString(known); + String name = KnownTags.nameOf(tagId); + if (name == null) return null; + Entry entry = this.getEntryFromBuckets(name); + return entry == null ? null : entry.stringValue(); + } + + private Entry getEntryFromBuckets(String tag) { Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return null; int hash = TagMap.Entry._hash(tag); int bucketIndex = hash & (thisBuckets.length - 1); @@ -1426,53 +1677,309 @@ public Object put(String tag, Object value) { @Override public void set(TagMap.EntryReader newEntryReader) { - this.getAndSet(newEntryReader.entry()); + this.checkWriteAccess(); + // Cached-entry path (e.g. decorator componentEntry). entry() returns the reader's own Entry (no + // NEW allocation for a real Entry), carrying any id-encoded globalSerial. For a known tag we + // store its value in the dense store directly; otherwise the reader's entry goes to the bucket + // path. keyOf is a no-op until a resolver is registered (string-only entries keep their name). + Entry entry = newEntryReader.entry(); + long tagId = entry.tagId; + if (KnownTags.globalSerial(tagId) == 0 && KnownTags.isActive()) { + long resolved = KnownTags.keyOf(entry.tag()); + if (resolved != 0L) { + // Cache the resolved id back onto the entry so a SHARED cached entry (a decorator's + // SPAN_KIND_ENTRY / componentEntry, reused across every span) pays keyOf only on its first + // set, not on every span. Mirrors the write-back in getAndSet. Benign race on the shared + // entry: all writers store the same resolved id. 
+ entry.tagId = resolved; + tagId = resolved; + } + } + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, entry.objectValue())) { + return; + } + this.setInBuckets(entry); } + // String-keyed setters. Resolve the tag identity once: a registered KnownTags resolver maps known + // tag names to a non-zero id (carrying globalSerial), letting us store the value in the dense + // store with NO Entry allocation. Until a resolver is registered keyOf returns 0 and we fall back + // to the bucket path, preserving the name-keyed Entry behavior. @Override public void set(String tag, Object value) { - this.getAndSet(Entry.newAnyEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, value)) return; + this.setInBuckets(Entry.newAnyEntry(tag, value)); } @Override public void set(String tag, CharSequence value) { - this.getAndSet(Entry.newObjectEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, value)) return; + this.setInBuckets(Entry.newObjectEntry(tag, value)); } @Override public void set(String tag, boolean value) { - this.getAndSet(Entry.newBooleanEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, Boolean.valueOf(value))) return; + this.setInBuckets(Entry.newBooleanEntry(tag, value)); } @Override public void set(String tag, int value) { - this.getAndSet(Entry.newIntEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, Integer.valueOf(value))) return; + this.setInBuckets(Entry.newIntEntry(tag, value)); } @Override public void set(String tag, long value) { - this.getAndSet(Entry.newLongEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, Long.valueOf(value))) return; + this.setInBuckets(Entry.newLongEntry(tag, 
value)); } @Override public void set(String tag, float value) { - this.getAndSet(Entry.newFloatEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, Float.valueOf(value))) return; + this.setInBuckets(Entry.newFloatEntry(tag, value)); } @Override public void set(String tag, double value) { - this.getAndSet(Entry.newDoubleEntry(tag, value)); + this.checkWriteAccess(); + long id = KnownTags.keyOf(tag); + if (id != 0L && this.trySetKnownSlot(id, Double.valueOf(value))) return; + this.setInBuckets(Entry.newDoubleEntry(tag, value)); + } + + // Tag-id keyed setters. The id already carries the globalSerial, so a known tag (non-zero + // globalSerial) goes straight into its positional slot via trySetKnownSlot with NO Entry + // allocation (strings/objects by reference; primitives boxed once). An id without a globalSerial, + // an unslotted id, or a fieldPos conflict falls back to the bucket path, which builds an Entry + // that resolves its name lazily. 
+ @Override + public void set(long tagId, Object value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, value)) return; + this.setInBuckets(Entry.newAnyEntry(tagId, value)); + } + + @Override + public void set(long tagId, CharSequence value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, value)) return; + this.setInBuckets(Entry.newObjectEntry(tagId, value)); + } + + @Override + public void set(long tagId, boolean value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, Boolean.valueOf(value))) + return; + this.setInBuckets(Entry.newBooleanEntry(tagId, value)); + } + + @Override + public void set(long tagId, int value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, Integer.valueOf(value))) + return; + this.setInBuckets(Entry.newIntEntry(tagId, value)); + } + + @Override + public void set(long tagId, long value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, Long.valueOf(value))) + return; + this.setInBuckets(Entry.newLongEntry(tagId, value)); + } + + @Override + public void set(long tagId, float value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, Float.valueOf(value))) + return; + this.setInBuckets(Entry.newFloatEntry(tagId, value)); + } + + @Override + public void set(long tagId, double value) { + this.checkWriteAccess(); + if (KnownTags.globalSerial(tagId) != 0 && this.trySetKnownSlot(tagId, Double.valueOf(value))) + return; + this.setInBuckets(Entry.newDoubleEntry(tagId, value)); + } + + // Returns the slot index (== fieldPos) holding the known tag matching tagId, or -1 if the slot is + // empty or occupied by a different globalSerial (a conflict, which lives in the buckets instead). 
+ private int knownIndexOf(long tagId) { + int s = KnownTags.globalSerial(tagId); + if (s == 0) return -1; + int p = KnownTags.fieldPos(tagId); + long[] ids = this.knownIds; + if (ids == null || p < 0 || p >= ids.length) return -1; + return (KnownTags.globalSerial(ids[p]) == s) ? p : -1; + } + + // Materializes a real Entry for the dense entry at slot p (carrying the stored tagId so it + // resolves its name and serializes correctly). + private Entry knownEntryAt(int p) { + return Entry.newAnyEntry(this.knownIds[p], this.knownValues[p]); + } + + // Returns a materialized entry for tagId if its slot holds that known tag, else null. (Explicit + // getEntry path - materializing here is fine, this is not iteration.) + private Entry knownGet(long tagId) { + int p = this.knownIndexOf(tagId); + return p < 0 ? null : this.knownEntryAt(p); + } + + // Removes and returns (materialized) the known tag from its slot, else null. Clears the slot in + // place; conflicting tags that live in the buckets are NOT promoted back into the freed slot. + private Entry knownRemove(long tagId) { + int p = this.knownIndexOf(tagId); + if (p < 0) return null; + + Entry removed = this.knownEntryAt(p); + this.knownIds[p] = 0L; + this.knownValues[p] = null; + this.knownCount -= 1; + this.size -= 1; + return removed; } @Override public Entry getAndSet(Entry newEntry) { this.checkWriteAccess(); + // Resolve the entry's identity. A tag-id-constructed entry already carries its globalSerial; a + // string-constructed entry may be a known tag — resolve via KnownTags and upgrade its tagId so + // it routes to (and is recognized in) its slot. keyOf is a no-op until a resolver is + // registered. The slot handling lives in setKnown so this hot method stays small and inlinable. 
+ long tagId = newEntry.tagId; + int globalSerial = KnownTags.globalSerial(tagId); + if (globalSerial == 0 && KnownTags.isActive()) { + long resolved = KnownTags.keyOf(newEntry.tag()); + if (resolved != 0L) { + newEntry.tagId = resolved; + globalSerial = KnownTags.globalSerial(resolved); + } + } + + if (globalSerial != 0) { + return this.setKnown(newEntry, globalSerial); + } + return this.setInBuckets(newEntry); + } + + // Positional dense-write core: store a known tag's value WITHOUT constructing an Entry, indexed + // directly by fieldPos. tagId must carry a non-zero globalSerial. Returns true if the value was + // stored in its slot (claimed or overwritten in place), or false if it could not be slotted — + // unslotted (fieldPos >= slotCount()) or a fieldPos CONFLICT with a different globalSerial + // already + // holding the slot — in which case the caller must route it to the buckets. Primitives are + // expected pre-boxed by the caller; strings/objects are stored by reference. + private boolean trySetKnownSlot(long tagId, Object value) { + int slotCount = KnownTags.slotCount(); + if (slotCount == 0) return false; + int p = KnownTags.fieldPos(tagId); + if (p < 0 || p >= slotCount) return false; // unslotted (e.g. NO_SLOT) -> buckets + + long[] ids = this.knownIds; + if (ids == null) { + ids = this.knownIds = new long[slotCount]; + this.knownValues = new Object[slotCount]; + } + + long occupant = ids[p]; + if (occupant == 0L) { + // empty slot - claim it. A tag that previously lost the slot race is parked in the buckets; + // if + // the slot has since freed up it may now claim the slot, so evict any stale bucketed copy to + // keep a single source of truth. buckets is null in the all-known case, so this is free + // there. 
+ if (this.buckets != null) { + this.evictFromBuckets(tagId); + } + ids[p] = tagId; + this.knownValues[p] = value; + this.knownCount += 1; + this.size += 1; + return true; + } + if (KnownTags.globalSerial(occupant) == KnownTags.globalSerial(tagId)) { + // same tag - overwrite in place, no size change (refresh the id to keep the name current) + ids[p] = tagId; + this.knownValues[p] = value; + return true; + } + // a different globalSerial already holds this slot - conflict, route to buckets + return false; + } + + // Removes a stale bucketed copy of tagId (by resolved name), decrementing size if found. Used + // when + // a tag reclaims a freed slot to avoid a slot+bucket duplicate. Returns the removed entry or + // null. + private Entry evictFromBuckets(long tagId) { + String name = KnownTags.nameOf(tagId); + if (name == null) return null; + return this.removeFromBuckets(name, KnownTags.nameHash(tagId)); + } + + // Stores a known tag (slotting it if possible) and returns the prior entry (materialized) or + // null. + // Used by the prior-returning getAndSet path. On a fieldPos conflict or an unslotted tag, routes + // the value to the buckets and returns the prior bucketed entry (if any). + private Entry setKnown(Entry newEntry, int globalSerial) { + long tagId = newEntry.tagId; + int p = this.knownIndexOf(tagId); + if (p >= 0) { + // slot already holds this tag - overwrite in place, returning the prior slot value + Entry prev = this.knownEntryAt(p); + this.knownIds[p] = tagId; + this.knownValues[p] = newEntry.objectValue(); + return prev; + } + // not in its slot. The only possible prior with this name is a bucketed copy (either a current + // conflict victim or one parked there before its slot freed up). Capture+remove it: it is both + // the prior to return AND, for the claim case, the duplicate that must be evicted. + Entry prev = (this.buckets != null) ? 
this.evictFromBuckets(tagId) : null; + if (this.trySetKnownSlot(tagId, newEntry.objectValue())) { + // claimed/overwrote a slot; evictFromBuckets already accounted for any removed bucketed copy + return prev; + } + // conflict / unslotted - the value goes (back) into the buckets, keyed by its resolved name + this.setInBuckets(newEntry); + return prev; + } + + private Entry setInBuckets(Entry newEntry) { Object[] thisBuckets = this.buckets; + if (thisBuckets == null) { + // first unknown-tag insertion - lazily allocate the bucket array + // needs to be a power of 2 for bucket masking calculation to work as intended + thisBuckets = this.buckets = new Object[1 << 4]; + } int newHash = newEntry.hash(); int bucketIndex = newHash & (thisBuckets.length - 1); + // Use the resolved accessor, not the raw field: a tag-id-constructed entry has a null tag + // field until its name is lazily resolved. For string-constructed entries this is just a field + // read. + String newTag = newEntry.tag(); + Object bucket = thisBuckets[bucketIndex]; if (bucket == null) { thisBuckets[bucketIndex] = newEntry; @@ -1481,7 +1988,7 @@ public Entry getAndSet(Entry newEntry) { return null; } else if (bucket instanceof Entry) { Entry existingEntry = (Entry) bucket; - if (existingEntry.matches(newEntry.tag)) { + if (existingEntry.matches(newTag)) { thisBuckets[bucketIndex] = newEntry; // replaced existing entry - no size change @@ -1496,7 +2003,7 @@ public Entry getAndSet(Entry newEntry) { } else if (bucket instanceof BucketGroup) { BucketGroup lastGroup = (BucketGroup) bucket; - BucketGroup containingGroup = lastGroup.findContainingGroupInChain(newHash, newEntry.tag); + BucketGroup containingGroup = lastGroup.findContainingGroupInChain(newHash, newTag); if (containingGroup != null) { // replaced existing entry - no size change return containingGroup._replace(newHash, newEntry); @@ -1584,15 +2091,33 @@ public void putAll(TagMap that) { private void putAllOptimizedMap(OptimizedTagMap that) { if 
(this.size == 0) { + // empty dest: clone source buckets + dense store wholesale (no duplication possible) this.putAllIntoEmptyMap(that); + } else if (this.knownCount != 0 || that.knownCount != 0) { + // known tags in play with a non-empty dest: route every source entry through getAndSet so the + // dense store stays deduplicated against this map's existing entries. + this.putAllByEntry(that); } else { this.putAllMerge(that); } } + private void putAllByEntry(OptimizedTagMap that) { + // ctx-passing forEach avoids a capturing lambda allocation + that.forEach(this, (dest, reader) -> dest.getAndSet(reader.entry())); + } + private void putAllMerge(OptimizedTagMap that) { - Object[] thisBuckets = this.buckets; Object[] thatBuckets = that.buckets; + // nothing bucketed in the source - nothing to merge + if (thatBuckets == null) return; + + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) { + // dest has no bucket array yet (its size came from the dense store, but putAllMerge is only + // reached when both maps have knownCount == 0); allocate to receive the source's buckets + thisBuckets = this.buckets = new Object[1 << 4]; + } // Since TagMap-s don't support expansion, buckets are perfectly aligned // Check against both thisBuckets.length && thatBuckets.length is to help the JIT do bound check @@ -1708,31 +2233,58 @@ private void putAllMerge(OptimizedTagMap that) { * Specially optimized version of putAll for the common case of destination map being empty */ private void putAllIntoEmptyMap(OptimizedTagMap that) { - Object[] thisBuckets = this.buckets; Object[] thatBuckets = that.buckets; - // Check against both thisBuckets.length && thatBuckets.length is to help the JIT do bound check - // elimination - for (int i = 0; i < thisBuckets.length && i < thatBuckets.length; ++i) { - Object thatBucket = thatBuckets[i]; + // source has bucketed entries - lazily allocate dest buckets and clone them in. 
A source with + // null buckets leaves dest buckets null (still empty until something buckets). + if (thatBuckets != null) { + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) { + thisBuckets = this.buckets = new Object[1 << 4]; + } + + // Check against both thisBuckets.length && thatBuckets.length is to help the JIT do bound + // check elimination + for (int i = 0; i < thisBuckets.length && i < thatBuckets.length; ++i) { + Object thatBucket = thatBuckets[i]; - // faster to explicitly null check first, then do instanceof - if (thatBucket == null) { - // do nothing - } else if (thatBucket instanceof BucketGroup) { - // if it is a BucketGroup, then need to clone - BucketGroup thatGroup = (BucketGroup) thatBucket; + // faster to explicitly null check first, then do instanceof + if (thatBucket == null) { + // do nothing + } else if (thatBucket instanceof BucketGroup) { + // if it is a BucketGroup, then need to clone + BucketGroup thatGroup = (BucketGroup) thatBucket; - thisBuckets[i] = thatGroup.cloneChain(); - } else { // if ( thatBucket instanceof Entry ) - thisBuckets[i] = thatBucket; + thisBuckets[i] = thatGroup.cloneChain(); + } else { // if ( thatBucket instanceof Entry ) + thisBuckets[i] = thatBucket; + } } } + + // dest is empty, so the source's dense store transfers directly (values are shared, as with + // buckets above). size is copied wholesale below and already accounts for known entries. 
+ if (that.knownCount != 0) { + this.knownIds = that.knownIds.clone(); + this.knownValues = that.knownValues.clone(); + this.knownCount = that.knownCount; + } + this.size = that.size; } public void fillMap(Map map) { + long[] ids = this.knownIds; + Object[] values = this.knownValues; + if (ids != null) { + for (int i = 0; i < ids.length; ++i) { + if (ids[i] == 0L) continue; + map.put(KnownTags.nameOf(ids[i]), values[i]); + } + } + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return; for (int i = 0; i < thisBuckets.length; ++i) { Object thisBucket = thisBuckets[i]; @@ -1740,7 +2292,7 @@ public void fillMap(Map map) { if (thisBucket instanceof Entry) { Entry thisEntry = (Entry) thisBucket; - map.put(thisEntry.tag, thisEntry.objectValue()); + map.put(thisEntry.tag(), thisEntry.objectValue()); } else if (thisBucket instanceof BucketGroup) { BucketGroup thisGroup = (BucketGroup) thisBucket; @@ -1750,7 +2302,17 @@ public void fillMap(Map map) { } public void fillStringMap(Map stringMap) { + long[] ids = this.knownIds; + Object[] values = this.knownValues; + if (ids != null) { + for (int i = 0; i < ids.length; ++i) { + if (ids[i] == 0L) continue; + stringMap.put(KnownTags.nameOf(ids[i]), TagValueConversions.toString(values[i])); + } + } + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return; for (int i = 0; i < thisBuckets.length; ++i) { Object thisBucket = thisBuckets[i]; @@ -1758,7 +2320,7 @@ public void fillStringMap(Map stringMap) { if (thisBucket instanceof Entry) { Entry thisEntry = (Entry) thisBucket; - stringMap.put(thisEntry.tag, thisEntry.stringValue()); + stringMap.put(thisEntry.tag(), thisEntry.stringValue()); } else if (thisBucket instanceof BucketGroup) { BucketGroup thisGroup = (BucketGroup) thisBucket; @@ -1779,13 +2341,44 @@ public boolean remove(String tag) { return (this.getAndRemove(tag) != null); } + @Override + public boolean remove(long tagId) { + return (this.getAndRemove(tagId) != null); + } + + @Override + 
public Entry getAndRemove(long tagId) { + this.checkWriteAccess(); + + // known tags live in their slot - clear there first (by id, no name needed) + Entry slotEntry = this.knownRemove(tagId); + if (slotEntry != null) return slotEntry; + + // otherwise it may have collided into the buckets - look up by resolved name + String name = KnownTags.nameOf(tagId); + return name == null ? null : this.removeFromBuckets(name, KnownTags.nameHash(tagId)); + } + @Override public Entry getAndRemove(String tag) { this.checkWriteAccess(); + // known tags live in their slot - clear there first + long tagId = KnownTags.keyOf(tag); + if (tagId != 0L) { + Entry slotEntry = this.knownRemove(tagId); + if (slotEntry != null) return slotEntry; + } + + return this.removeFromBuckets(tag, TagMap.Entry._hash(tag)); + } + + // Removes tag from the hash buckets (only), decrementing size if found. Returns the removed entry + // or null. + private Entry removeFromBuckets(String tag, int hash) { Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return null; - int hash = TagMap.Entry._hash(tag); int bucketIndex = hash & (thisBuckets.length - 1); Object bucket = thisBuckets[bucketIndex]; @@ -1846,7 +2439,19 @@ public Stream stream() { @Override public void forEach(Consumer consumer) { + long[] ids = this.knownIds; + Object[] values = this.knownValues; + if (ids != null && this.knownCount != 0) { + EntryReadingHelper reader = new EntryReadingHelper(); + for (int i = 0; i < ids.length; ++i) { + if (ids[i] == 0L) continue; + reader.set(KnownTags.nameOf(ids[i]), values[i]); + consumer.accept(reader); + } + } + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return; for (int i = 0; i < thisBuckets.length; ++i) { Object thisBucket = thisBuckets[i]; @@ -1865,7 +2470,19 @@ public void forEach(Consumer consumer) { @Override public void forEach(T thisObj, BiConsumer consumer) { + long[] ids = this.knownIds; + Object[] values = this.knownValues; + if (ids != null && this.knownCount 
!= 0) { + EntryReadingHelper reader = new EntryReadingHelper(); + for (int i = 0; i < ids.length; ++i) { + if (ids[i] == 0L) continue; + reader.set(KnownTags.nameOf(ids[i]), values[i]); + consumer.accept(thisObj, reader); + } + } + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return; for (int i = 0; i < thisBuckets.length; ++i) { Object thisBucket = thisBuckets[i]; @@ -1885,7 +2502,19 @@ public void forEach(T thisObj, BiConsumer con @Override public void forEach( T thisObj, U otherObj, TriConsumer consumer) { + long[] ids = this.knownIds; + Object[] values = this.knownValues; + if (ids != null && this.knownCount != 0) { + EntryReadingHelper reader = new EntryReadingHelper(); + for (int i = 0; i < ids.length; ++i) { + if (ids[i] == 0L) continue; + reader.set(KnownTags.nameOf(ids[i]), values[i]); + consumer.accept(thisObj, otherObj, reader); + } + } + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return; for (int i = 0; i < thisBuckets.length; ++i) { Object thisBucket = thisBuckets[i]; @@ -1905,7 +2534,13 @@ public void forEach( public void clear() { this.checkWriteAccess(); - Arrays.fill(this.buckets, null); + // drop the bucket array entirely - it will be lazily re-allocated on the next unknown-tag write + this.buckets = null; + if (this.knownIds != null && this.knownCount != 0) { + Arrays.fill(this.knownIds, 0L); + Arrays.fill(this.knownValues, null); + this.knownCount = 0; + } this.size = 0; } @@ -1928,9 +2563,27 @@ void checkIntegrity() { // That was done to avoid the extra static initialization needed for an assertion // While that's probably an unnecessary optimization, this method is only called in tests + long[] ids = this.knownIds; + int occupied = 0; + for (int i = 0; ids != null && i < ids.length; ++i) { + long id = ids[i]; + if (id == 0L) continue; // empty slot + occupied += 1; + if (KnownTags.globalSerial(id) == 0) { + throw new IllegalStateException("known entry without globalSerial"); + } + // positional 
invariant: a tag occupies the slot equal to its fieldPos + if (KnownTags.fieldPos(id) != i) { + throw new IllegalStateException("known entry in wrong slot"); + } + } + if (occupied != this.knownCount) { + throw new IllegalStateException("incorrect knownCount"); + } + Object[] thisBuckets = this.buckets; - for (int i = 0; i < thisBuckets.length; ++i) { + for (int i = 0; thisBuckets != null && i < thisBuckets.length; ++i) { Object thisBucket = thisBuckets[i]; if (thisBucket instanceof Entry) { @@ -1970,10 +2623,10 @@ void checkIntegrity() { } int computeSize() { - Object[] thisBuckets = this.buckets; + int size = this.knownCount; - int size = 0; - for (int i = 0; i < thisBuckets.length; ++i) { + Object[] thisBuckets = this.buckets; + for (int i = 0; thisBuckets != null && i < thisBuckets.length; ++i) { Object curBucket = thisBuckets[i]; if (curBucket instanceof Entry) { @@ -1987,7 +2640,10 @@ int computeSize() { } boolean checkIfEmpty() { + if (this.knownCount != 0) return false; + Object[] thisBuckets = this.buckets; + if (thisBuckets == null) return true; for (int i = 0; i < thisBuckets.length; ++i) { Object curBucket = thisBuckets[i]; @@ -2061,7 +2717,7 @@ String toInternalString() { Object[] thisBuckets = this.buckets; StringBuilder ledger = new StringBuilder(128); - for (int i = 0; i < thisBuckets.length; ++i) { + for (int i = 0; thisBuckets != null && i < thisBuckets.length; ++i) { ledger.append('[').append(i).append("] = "); Object thisBucket = thisBuckets[i]; @@ -2082,55 +2738,86 @@ String toInternalString() { } abstract static class IteratorBase { + private final long[] knownIds; + private final Object[] knownValues; private final Object[] buckets; - private Entry nextEntry; + // Reused flyweight reader for dense (known) entries - no per-tag Entry allocation. Lazily + // created on the first dense entry. Consumers that need to RETAIN a yielded reader across + // iteration steps must capture a copy via reader.entry(). 
+ private EntryReadingHelper knownReader; + + // The pending reader (either the reused flyweight for dense entries, or a bucket Entry which is + // itself an EntryReader). null means none pending. + private EntryReader nextReader; + private int knownIndex = -1; private int bucketIndex = -1; private BucketGroup group = null; private int groupIndex = 0; IteratorBase(OptimizedTagMap map) { + this.knownIds = map.knownIds; + this.knownValues = map.knownValues; this.buckets = map.buckets; } public final boolean hasNext() { - if (this.nextEntry != null) return true; - - while (this.bucketIndex < this.buckets.length) { - this.nextEntry = this.advance(); - if (this.nextEntry != null) return true; - } + if (this.nextReader != null) return true; - return false; + this.nextReader = this.advance(); + return this.nextReader != null; } - final Entry nextEntryOrThrowNoSuchElement() { - if (this.nextEntry != null) { - Entry nextEntry = this.nextEntry; - this.nextEntry = null; - return nextEntry; + final EntryReader nextEntryOrThrowNoSuchElement() { + if (this.nextReader != null) { + EntryReader nextReader = this.nextReader; + this.nextReader = null; + return nextReader; } if (this.hasNext()) { - return this.nextEntry; + EntryReader nextReader = this.nextReader; + this.nextReader = null; + return nextReader; } else { throw new NoSuchElementException(); } } - final Entry nextEntryOrNull() { - if (this.nextEntry != null) { - Entry nextEntry = this.nextEntry; - this.nextEntry = null; - return nextEntry; + final EntryReader nextEntryOrNull() { + if (this.nextReader != null) { + EntryReader nextReader = this.nextReader; + this.nextReader = null; + return nextReader; } - return this.hasNext() ? 
this.nextEntry : null; + if (this.hasNext()) { + EntryReader nextReader = this.nextReader; + this.nextReader = null; + return nextReader; + } + return null; } - private final Entry advance() { + private final EntryReader advance() { + // drain the dense known entries first, via the reused flyweight reader (skip empty slots) + long[] ids = this.knownIds; + if (ids != null) { + while (++this.knownIndex < ids.length) { + if (ids[this.knownIndex] == 0L) continue; + EntryReadingHelper reader = this.knownReader; + if (reader == null) { + reader = this.knownReader = new EntryReadingHelper(); + } + reader.set(KnownTags.nameOf(ids[this.knownIndex]), this.knownValues[this.knownIndex]); + return reader; + } + } + + if (this.buckets == null) return null; + while (this.bucketIndex < this.buckets.length) { if (this.group != null) { for (++this.groupIndex; this.groupIndex < BucketGroup.LEN; ++this.groupIndex) { @@ -2429,18 +3116,21 @@ Entry replaceInChain(int hash, Entry entry) { Entry _replace(int hash, Entry entry) { // if ( this._mayContain(hash) ) return null; + // resolved accessor, not the raw field: tag-id entries have a null tag field until resolved + String tag = entry.tag(); + // first check to see if the item is already present Entry prevEntry = null; - if (this.hash0 == hash && this.entry0.matches(entry.tag)) { + if (this.hash0 == hash && this.entry0.matches(tag)) { prevEntry = this.entry0; this.entry0 = entry; - } else if (this.hash1 == hash && this.entry1.matches(entry.tag)) { + } else if (this.hash1 == hash && this.entry1.matches(tag)) { prevEntry = this.entry1; this.entry1 = entry; - } else if (this.hash2 == hash && this.entry2.matches(entry.tag)) { + } else if (this.hash2 == hash && this.entry2.matches(tag)) { prevEntry = this.entry2; this.entry2 = entry; - } else if (this.hash3 == hash && this.entry3.matches(entry.tag)) { + } else if (this.hash3 == hash && this.entry3.matches(tag)) { prevEntry = this.entry3; this.entry3 = entry; } @@ -2571,16 +3261,16 @@ void 
fillMapFromChain(Map map) { void _fillMap(Map map) { Entry entry0 = this.entry0; - if (entry0 != null) map.put(entry0.tag, entry0.objectValue()); + if (entry0 != null) map.put(entry0.tag(), entry0.objectValue()); Entry entry1 = this.entry1; - if (entry1 != null) map.put(entry1.tag, entry1.objectValue()); + if (entry1 != null) map.put(entry1.tag(), entry1.objectValue()); Entry entry2 = this.entry2; - if (entry2 != null) map.put(entry2.tag, entry2.objectValue()); + if (entry2 != null) map.put(entry2.tag(), entry2.objectValue()); Entry entry3 = this.entry3; - if (entry3 != null) map.put(entry3.tag, entry3.objectValue()); + if (entry3 != null) map.put(entry3.tag(), entry3.objectValue()); } void fillStringMapFromChain(Map map) { @@ -2591,16 +3281,16 @@ void fillStringMapFromChain(Map map) { void _fillStringMap(Map map) { Entry entry0 = this.entry0; - if (entry0 != null) map.put(entry0.tag, entry0.stringValue()); + if (entry0 != null) map.put(entry0.tag(), entry0.stringValue()); Entry entry1 = this.entry1; - if (entry1 != null) map.put(entry1.tag, entry1.stringValue()); + if (entry1 != null) map.put(entry1.tag(), entry1.stringValue()); Entry entry2 = this.entry2; - if (entry2 != null) map.put(entry2.tag, entry2.stringValue()); + if (entry2 != null) map.put(entry2.tag(), entry2.stringValue()); Entry entry3 = this.entry3; - if (entry3 != null) map.put(entry3.tag, entry3.stringValue()); + if (entry3 != null) map.put(entry3.tag(), entry3.stringValue()); } BucketGroup cloneChain() { @@ -2658,9 +3348,22 @@ public boolean isEmpty() { @Override public Iterator<Map.Entry<String, Object>> iterator() { - @SuppressWarnings({"rawtypes", "unchecked"}) - Iterator<Map.Entry<String, Object>> iter = (Iterator) this.map.iterator(); - return iter; + return new EntriesIterator(this.map); + } + } + + // Map.Entry view over the iterator. Dense entries are yielded as a reused flyweight EntryReader + // (not a Map.Entry), so materialize a real Map.Entry per element here via mapEntry(). 
Bucket + // Entry-s are themselves Map.Entry, so mapEntry() returns them directly without allocating. + static final class EntriesIterator extends IteratorBase + implements Iterator<Map.Entry<String, Object>> { + EntriesIterator(OptimizedTagMap map) { + super(map); + } + + @Override + public Map.Entry<String, Object> next() { + return this.nextEntryOrThrowNoSuchElement().mapEntry(); } } @@ -2991,6 +3694,19 @@ public TagMap.Entry getAndRemove(String tag) { return prior == null ? null : TagMap.Entry.newAnyEntry(tag, prior); } + // Tag-id keyed removals: LegacyTagMap is name-keyed, so resolve the name and delegate. + @Override + public boolean remove(long tagId) { + String name = KnownTags.nameOf(tagId); + return name != null && this.remove(name); + } + + @Override + public TagMap.Entry getAndRemove(long tagId) { + String name = KnownTags.nameOf(tagId); + return name == null ? null : this.getAndRemove(name); + } + @Override public Object getObject(String tag) { return this.get(tag); @@ -3148,6 +3864,49 @@ public void set(TagMap.EntryReader newEntryReader) { this.put(newEntryReader.tag(), newEntryReader.objectValue()); } + // Tag-id keyed variants: LegacyTagMap is name-keyed, so resolve the name via KnownTags and + // delegate to the string-keyed methods. Requires a registered KnownTags.Resolver. 
+ @Override + public void set(long tagId, Object value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public void set(long tagId, CharSequence value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public void set(long tagId, boolean value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public void set(long tagId, int value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public void set(long tagId, long value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public void set(long tagId, float value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public void set(long tagId, double value) { + this.set(KnownTags.nameOf(tagId), value); + } + + @Override + public TagMap.Entry getEntry(long tagId) { + String name = KnownTags.nameOf(tagId); + return name == null ? null : this.getEntry(name); + } + @Override public Object put(String key, Object value) { this.checkWriteAccess(); diff --git a/internal-api/src/main/java/datadog/trace/bootstrap/instrumentation/api/AgentSpan.java b/internal-api/src/main/java/datadog/trace/bootstrap/instrumentation/api/AgentSpan.java index 99c90b53b30..a10a4645edb 100644 --- a/internal-api/src/main/java/datadog/trace/bootstrap/instrumentation/api/AgentSpan.java +++ b/internal-api/src/main/java/datadog/trace/bootstrap/instrumentation/api/AgentSpan.java @@ -83,6 +83,48 @@ default boolean isValid() { AgentSpan setTag(String key, Object value); + /** + * Sets a tag by its generated tag id (see {@link datadog.trace.api.KnownTags}). The default + * resolves the id to its name and delegates to {@link #setTag(String, Object)}, so every span + * implementation works unchanged; mutable spans backed by {@code DDSpanContext} override this to + * take the id fast-path (no name hashing / interceptor string switch). If the id cannot be + * resolved (no resolver registered) the tag is left unchanged. 
+ */ + default AgentSpan setTag(long tagId, Object value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + + default AgentSpan setTag(long tagId, CharSequence value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + + default AgentSpan setTag(long tagId, boolean value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + + default AgentSpan setTag(long tagId, int value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + + default AgentSpan setTag(long tagId, long value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + + default AgentSpan setTag(long tagId, float value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + + default AgentSpan setTag(long tagId, double value) { + String name = datadog.trace.api.KnownTags.nameOf(tagId); + return name == null ? this : setTag(name, value); + } + /** entry may be null - in which case the tags remained unchanged */ AgentSpan setTag(TagMap.EntryReader entry); diff --git a/internal-api/src/main/java/datadog/trace/util/TagSet.java b/internal-api/src/main/java/datadog/trace/util/TagSet.java new file mode 100644 index 00000000000..570f3c7e003 --- /dev/null +++ b/internal-api/src/main/java/datadog/trace/util/TagSet.java @@ -0,0 +1,142 @@ +package datadog.trace.util; + +/** + * Flat open-addressed name set. Generic — it knows only names. + * + *

<p>Three ways to use it, trading convenience for indirection: + * + *
<ul> + *
<li>{@link Support}: static algorithm over raw arrays. Keep the arrays in your own + * (ideally {@code static final}) fields and the JIT folds the refs to constants. The fastest + * path; nothing to dereference. + *
<li>{@link Data}: a build-time carrier for the placed {@code {hashes, names}} returned + * by {@link Support#create}. Pull its fields into your own and discard it. + *
<li>The {@code TagSet} instance ({@link #of}): a convenience wrapper holding the + * arrays; {@link #indexOf}/{@link #contains} delegate to {@link Support}. Costs an + * instance-field load per call (the indirection the static path removes), which is fine off the hot + * path. + *
</ul> + * + *
<p>Consumers attach their own parallel payload arrays (ids, values, ...) sized to {@link #slots} + * and indexed by the slot {@code indexOf} returns. + * + *
<p>
Slot 0-value is the empty sentinel: {@link Support#hash} never returns 0, so {@code hashes[i] + * == 0} unambiguously means an empty slot. + */ +public final class TagSet { + private final int[] hashes; + private final String[] names; + public final int slots; // == hashes.length + + private TagSet(int[] hashes, String[] names) { + this.hashes = hashes; + this.names = names; + this.slots = hashes.length; + } + + /** + * Convenience instance — wraps the placed arrays. For the hot path prefer raw {@link Support}. + */ + public static TagSet of(String... names) { + Data data = Support.create(names); + return new TagSet(data.hashes, data.names); + } + + /** Slot of {@code name}, or -1. Delegates to {@link Support} on the instance's arrays. */ + public int indexOf(String name) { + return Support.indexOf(this.hashes, this.names, name); + } + + public boolean contains(String name) { + return indexOf(name) >= 0; + } + + /** Table size — allocate parallel payload arrays of this length. */ + public int slots() { + return this.slots; + } + + /** Build-time carrier. Pull the fields into your own (static final) fields; don't keep this. */ + public static final class Data { + public final int[] hashes; + public final String[] names; + + Data(int[] hashes, String[] names) { + this.hashes = hashes; + this.names = names; + } + } + + /** Static algorithm over raw arrays. Query helpers take raw arrays, never a Data or a TagSet. */ + public static final class Support { + private Support() {} + + /** Spread of String.hashCode; 0 reserved as the empty sentinel. */ + public static int hash(String name) { + int h = name.hashCode(); // cached on String -> field load + return h == 0 ? 0xDD06 : h ^ (h >>> 16); + } + + /** Power-of-two size, 2x-oversized so load factor stays <= 0.5. */ + public static int tableSizeFor(int n) { + int size = 1; + while (size <= n) { + size <<= 1; + } + return size << 1; + } + + /** Build the placed table. 
Returns a Data carrier; pull its arrays into your own fields. */ + public static Data create(String... names) { + int size = tableSizeFor(names.length); + int[] hashes = new int[size]; + String[] placed = new String[size]; + for (String name : names) { + put(hashes, placed, name, hash(name)); + } + return new Data(hashes, placed); + } + + /** Build-time placement. Returns the slot. */ + public static int put(int[] hashes, String[] names, String name, int h) { + final int mask = hashes.length - 1; + int i = h & mask; + for (int probes = 0; probes <= mask; probes++, i = (i + 1) & mask) { + if (hashes[i] == 0) { + hashes[i] = h; + names[i] = name; + return i; + } + if (hashes[i] == h && eq(names[i], name)) { + return i; // already present + } + } + throw new IllegalStateException("table full"); // impossible at LF <= 0.5 + } + + /** Probe; returns the slot or -1. Raw arrays — no Data, no instance. */ + public static int indexOf(int[] hashes, String[] names, String name, int h) { + final int mask = hashes.length - 1; + int i = h & mask; + for (int probes = 0; probes <= mask; probes++, i = (i + 1) & mask) { + int sh = hashes[i]; + if (sh == 0) { + return -1; + } + if (sh == h && eq(names[i], name)) { + return i; + } + } + return -1; + } + + public static int indexOf(int[] hashes, String[] names, String name) { + return indexOf(hashes, names, name, hash(name)); + } + + // `a` is a stored name on an occupied slot (never null); `b` is a non-null query. 
+ private static boolean eq(String a, String b) { + return a == b || a.equals(b); // interned literals hit the == fast path + } + } +} diff --git a/internal-api/src/test/java/datadog/trace/api/TagMapEmptyInitTest.java b/internal-api/src/test/java/datadog/trace/api/TagMapEmptyInitTest.java new file mode 100644 index 00000000000..3ec4f6c6641 --- /dev/null +++ b/internal-api/src/test/java/datadog/trace/api/TagMapEmptyInitTest.java @@ -0,0 +1,28 @@ +package datadog.trace.api; + +import static org.junit.jupiter.api.Assertions.assertNotNull; + +import java.util.Collections; +import org.junit.jupiter.api.Test; + +/** + * Diagnostic: is TagMap.EMPTY null when OptimizedTagMap is the first TagMap-related class touched + * in the JVM? Forked so nothing else initializes TagMap first. + */ +public class TagMapEmptyInitTest { + @Test + void emptyNotNull_whenOptimizedInitsFirst() { + // force OptimizedTagMap to initialize before the TagMap interface + OptimizedTagMap m = new OptimizedTagMap(); + m.set("x", "y"); + + System.out.println("OptimizedTagMap.EMPTY=" + OptimizedTagMap.EMPTY); + System.out.println("TagMap.EMPTY=" + TagMap.EMPTY); + + assertNotNull(OptimizedTagMap.EMPTY, "OptimizedTagMap.EMPTY null"); + assertNotNull(TagMap.EMPTY, "TagMap.EMPTY null"); + assertNotNull( + TagMap.fromMapImmutable(Collections.emptyMap()), "fromMapImmutable(empty) returned null"); + assertNotNull(TagMap.ledger().buildImmutable(), "empty ledger buildImmutable returned null"); + } +} diff --git a/internal-api/src/test/java/datadog/trace/api/TagMapEntryTest.java b/internal-api/src/test/java/datadog/trace/api/TagMapEntryTest.java index e7c483e80ec..be18fd25183 100644 --- a/internal-api/src/test/java/datadog/trace/api/TagMapEntryTest.java +++ b/internal-api/src/test/java/datadog/trace/api/TagMapEntryTest.java @@ -20,6 +20,8 @@ import java.util.concurrent.ThreadFactory; import java.util.function.Function; import java.util.function.Supplier; +import org.junit.jupiter.api.AfterAll; +import 
org.junit.jupiter.api.BeforeAll; import org.junit.jupiter.api.DisplayName; import org.junit.jupiter.api.Test; import org.junit.jupiter.params.ParameterizedTest; @@ -550,6 +552,133 @@ public void removalChange() { assertTrue(removalChange.isRemoval()); } + // --------------------------------------------------------------------------------------------- + // Tag-id-constructed entries: the name is resolved lazily from the tagId via KnownTags on first + // tag()/getKey(). That resolution (and the cache write to the volatile-free `tag` field) is a + // benign race — these tests run tag-id entries through the existing multi-threaded harness so 4 + // threads resolve the name concurrently; all must agree, and on the same interned constant. + // --------------------------------------------------------------------------------------------- + + static final String[] TAG_NAMES = {"tag.alpha", "tag.beta", "tag.gamma"}; + + static long tagId(int serial, int fieldPos, String name) { + return KnownTags.tagId(serial, fieldPos, name); + } + + @BeforeAll + static void registerResolver() { + KnownTags.register( + new KnownTags.Resolver() { + @Override + public String nameOf(long tagId) { + int serial = (int) (tagId >>> 48); + // returns the same interned constant each call, so racing resolutions agree by identity + return (serial >= 1 && serial <= TAG_NAMES.length) ? TAG_NAMES[serial - 1] : null; + } + + @Override + public long keyOf(String name) { + for (int i = 0; i < TAG_NAMES.length; ++i) { + if (TAG_NAMES[i].equals(name)) return tagId(i + 1, i, TAG_NAMES[i]); + } + return 0L; + } + + @Override + public int slotCount() { + return TAG_NAMES.length; // fieldPos 0..TAG_NAMES.length-1 + } + }); + } + + @AfterAll + static void clearResolver() { + KnownTags.register(null); + } + + // resolved name must be the exact interned constant, and hash() must equal the tagId's low 32 + // bits (nameHash) — both stressed concurrently by the shared-entry multi-threaded harness. 
+ static Check checkResolvedTagId(long id, String name, TagMap.Entry entry) { + return multiCheck( + checkKey(name, entry), + checkTrue(() -> entry.tag() == name, "tag() returns interned constant"), + checkEquals((int) (id & 0xFFFFFFFFL), () -> entry.hash(), "Entry::hash == nameHash")); + } + + @Test + @DisplayName("tag-id entry: Object resolves name lazily under race") + public void tagIdEntryObject() { + long id = tagId(1, 0, TAG_NAMES[0]); + test( + () -> TagMap.Entry.newAnyEntry(id, "bar"), + TagMap.Entry.ANY, + (entry) -> + multiCheck( + checkResolvedTagId(id, TAG_NAMES[0], entry), + checkValue("bar", entry), + checkTrue(entry::isObject), + checkType(TagMap.Entry.OBJECT, entry))); + } + + @Test + @DisplayName("tag-id entry: int resolves name lazily under race") + public void tagIdEntryInt() { + long id = tagId(2, 1, TAG_NAMES[1]); + test( + () -> TagMap.Entry.newIntEntry(id, 42), + TagMap.Entry.INT, + (entry) -> + multiCheck( + checkResolvedTagId(id, TAG_NAMES[1], entry), + checkValue(42, entry), + checkIsNumericPrimitive(entry), + checkType(TagMap.Entry.INT, entry))); + } + + @Test + @DisplayName("tag-id entry: boolean resolves name lazily under race") + public void tagIdEntryBoolean() { + long id = tagId(3, 2, TAG_NAMES[2]); + test( + () -> TagMap.Entry.newBooleanEntry(id, true), + TagMap.Entry.BOOLEAN, + (entry) -> + multiCheck( + checkResolvedTagId(id, TAG_NAMES[2], entry), + checkValue(true, entry), + checkType(TagMap.Entry.BOOLEAN, entry))); + } + + @Test + @DisplayName("string entry: lazy hash() under race") + public void stringEntryLazyHash() { + // string-constructed entry computes hash() lazily, writing into the low 32 bits of the `tagId` + // field (formerly a separate `int lazyTagHash`). Stress concurrent first-resolution. 
+ String name = "some.unknown.tag.name"; + test( + () -> TagMap.Entry.newObjectEntry(name, "v"), + TagMap.Entry.OBJECT, + (entry) -> + multiCheck( + checkEquals(TagMap.Entry._hash(name), () -> entry.hash(), "lazy hash()"), + checkKey(name, entry), + checkValue("v", entry))); + } + + @Test + @DisplayName("tag-id entry: matches() resolves the name under race") + public void tagIdEntryMatches() { + long id = tagId(1, 0, TAG_NAMES[0]); + test( + () -> TagMap.Entry.newObjectEntry(id, "bar"), + TagMap.Entry.OBJECT, + (entry) -> + multiCheck( + checkTrue(() -> entry.matches(TAG_NAMES[0]), "matches(name)"), + checkFalse(() -> entry.matches("nope"), "!matches(other)"), + checkKey(TAG_NAMES[0], entry))); + } + static final int NUM_THREADS = 4; static final ExecutorService EXECUTOR = Executors.newFixedThreadPool( diff --git a/internal-api/src/test/java/datadog/trace/api/TagMapFuzzTest.java b/internal-api/src/test/java/datadog/trace/api/TagMapFuzzTest.java index 48254ae9bd1..d6c2c52637e 100644 --- a/internal-api/src/test/java/datadog/trace/api/TagMapFuzzTest.java +++ b/internal-api/src/test/java/datadog/trace/api/TagMapFuzzTest.java @@ -11,35 +11,153 @@ import java.util.HashMap; import java.util.List; import java.util.Map; -import java.util.concurrent.ThreadLocalRandom; +import java.util.Random; import java.util.function.Supplier; +import org.junit.jupiter.api.AfterAll; +import org.junit.jupiter.api.BeforeAll; +import org.junit.jupiter.api.BeforeEach; import org.junit.jupiter.api.Test; +import org.junit.jupiter.api.TestInfo; +import org.junit.jupiter.api.extension.ExtendWith; +import org.junit.jupiter.api.extension.ExtensionContext; +import org.junit.jupiter.api.extension.TestWatcher; +@ExtendWith(TagMapFuzzTest.SeedReporter.class) public final class TagMapFuzzTest { static final int NUM_KEYS = 128; static final int MAX_NUM_ACTIONS = 32; static final int MIN_NUM_ACTIONS = 8; + // Seedable RNG for reproducibility. 
Each test reseeds in @BeforeEach: from + // -Ddatadog.tagmap.fuzz.seed when set, else a fresh random seed (always logged). On failure + // SeedReporter reprints the seed + reproduce command. Static because the random* helpers are + // static; the fuzz tests are single-threaded. + static final String SEED_PROPERTY = "datadog.tagmap.fuzz.seed"; + static Random rng; + static long currentSeed; + + @BeforeEach + void seedRng(TestInfo info) { + String prop = System.getProperty(SEED_PROPERTY); + currentSeed = (prop != null) ? Long.parseLong(prop.trim()) : new Random().nextLong(); + rng = new Random(currentSeed); + System.out.println( + info.getDisplayName() + + " seed=" + + currentSeed + + " (reproduce with -D" + + SEED_PROPERTY + + "=" + + currentSeed + + ")"); + } + + static final class SeedReporter implements TestWatcher { + @Override + public void testFailed(ExtensionContext ctx, Throwable cause) { + System.err.println( + "TagMapFuzzTest." + + ctx.getDisplayName() + + " FAILED with seed=" + + currentSeed + + "\n reproduce: ./gradlew :internal-api:test --tests \"" + + ctx.getRequiredTestClass().getName() + + "." + + ctx.getDisplayName() + + "\" -D" + + SEED_PROPERTY + + "=" + + currentSeed); + } + } + + // Closed-form KnownTags resolver for the fuzz keys ("key-0".."key-(NUM_KEYS-1)"). Lets the + // tag-id keyed actions (setById / putAllLedgerById) resolve their names so id-bearing entries + // unify with string-keyed entries in the buckets and remain findable by name. + // slot count for the fuzz layout; fieldPos = n % SLOT_COUNT so keys spread across slots and some + // collide (first-writer-wins -> the rest fall to buckets), exercising both paths. + static final int SLOT_COUNT = 32; + + @BeforeAll + static void registerResolver() { + KnownTags.register( + new KnownTags.Resolver() { + @Override + public String nameOf(long tagId) { + int globalSerial = (int) (tagId >>> 48); + return globalSerial == 0 ? 
null : "key-" + (globalSerial - 1); + } + + @Override + public long keyOf(String name) { + return isFuzzKey(name) ? tagIdOf(name) : 0L; + } + + @Override + public int slotCount() { + return SLOT_COUNT; + } + }); + } + + @AfterAll + static void clearResolver() { + KnownTags.register(null); + } + + static boolean isFuzzKey(String name) { + return name != null && name.startsWith("key-"); + } + + static long tagIdOf(String key) { + int n = Integer.parseInt(key.substring("key-".length())); + // globalSerial = n + 1 (non-zero, unique per key); fieldPos = n % SLOT_COUNT + return KnownTags.tagId(n + 1, n % SLOT_COUNT, key); + } + + // Number of random sequences per @Test run. Default 1 (fast CI); crank via + // -Ddatadog.tagmap.fuzz.iterations=N to hunt rare cases. Deterministic under a fixed seed. + static int iterations() { + return Integer.getInteger("datadog.tagmap.fuzz.iterations", 1); + } + @Test void test() { - test(generateTest()); + for (int i = 0, n = iterations(); i < n; ++i) { + test(generateTest()); + } } @Test void testMerge() { - TestCase mapACase = generateTest(); - TestCase mapBCase = generateTest(); + for (int i = 0, n = iterations(); i < n; ++i) { + TestCase mapACase = generateTest(); + TestCase mapBCase = generateTest(); + + OptimizedTagMap tagMapA = test(mapACase); + OptimizedTagMap tagMapB = test(mapBCase); - OptimizedTagMap tagMapA = test(mapACase); - OptimizedTagMap tagMapB = test(mapBCase); + HashMap hashMapA = new HashMap<>(tagMapA); + HashMap hashMapB = new HashMap<>(tagMapB); - HashMap hashMapA = new HashMap<>(tagMapA); - HashMap hashMapB = new HashMap<>(tagMapB); + tagMapA.putAll(tagMapB); + hashMapA.putAll(hashMapB); - tagMapA.putAll(tagMapB); - hashMapA.putAll(hashMapB); + assertMapEquals(hashMapA, tagMapA); - assertMapEquals(hashMapA, tagMapA); + // The merge must not mutate the source, AND must not leave the dest sharing a mutable + // BucketGroup chain with it (cloneChain must deep-copy). 
So the source must stay intact both + // right after the merge and after the dest is independently mutated - mutating a shared chain + // would corrupt the source. + assertMapEquals(hashMapB, tagMapB); + for (int j = 0; j < 16; ++j) { + tagMapA.set("merge-probe-" + j, "probe-" + j); + } + for (String key : hashMapB.keySet()) { + tagMapA.remove(key); + } + assertMapEquals(hashMapB, tagMapB); + } } @Test @@ -934,8 +1052,7 @@ public static final OptimizedTagMap test(TestCase test) { } public static final TestCase generateTest() { - int numActions = - ThreadLocalRandom.current().nextInt(MAX_NUM_ACTIONS - MIN_NUM_ACTIONS) + MIN_NUM_ACTIONS; + int numActions = rng.nextInt(MAX_NUM_ACTIONS - MIN_NUM_ACTIONS) + MIN_NUM_ACTIONS; return generateTest(numActions); } @@ -948,7 +1065,7 @@ public static final TestCase generateTest(int size) { } public static final MapAction randomAction() { - float actionSelector = ThreadLocalRandom.current().nextFloat(); + float actionSelector = rng.nextFloat(); switch (randomChoice(0.02, 0.1, 0.2)) { case 0: @@ -958,18 +1075,21 @@ public static final MapAction randomAction() { return randomChoice( () -> putAll(randomKeysAndValues()), () -> putAllTagMap(randomKeysAndValues()), - () -> putAllLedger(randomKeysAndValues())); + () -> putAllLedger(randomKeysAndValues()), + () -> putAllLedgerById(randomKeysAndValues())); case 2: return randomChoice( () -> remove(randomKey()), () -> removeLight(randomKey()), + () -> removeById(randomKey()), () -> getAndRemove(randomKey())); default: return randomChoice( () -> put(randomKey(), randomValue()), () -> set(randomKey(), randomValue()), + () -> setById(randomKey(), randomValue()), () -> getAndSet(randomKey(), randomValue())); } } @@ -982,6 +1102,10 @@ public static final MapAction set(String key, String value) { return new Set(key, value); } + public static final MapAction setById(String key, String value) { + return new SetById(key, value); + } + public static final MapAction getAndSet(String key, String 
value) { return new GetAndSet(key, value); } @@ -998,6 +1122,10 @@ public static final MapAction putAllLedger(String... keysAndValues) { return new PutAllLedger(keysAndValues); } + public static final MapAction putAllLedgerById(String... keysAndValues) { + return new PutAllLedgerById(keysAndValues); + } + public static final MapAction clear() { return Clear.INSTANCE; } @@ -1010,6 +1138,10 @@ public static final MapAction removeLight(String key) { return new RemoveLight(key); } + public static final MapAction removeById(String key) { + return new RemoveById(key); + } + public static final MapAction getAndRemove(String key) { return new GetAndRemove(key); } @@ -1032,11 +1164,11 @@ static final void assertMapEquals(Map expected, OptimizedTagMap } static final float randomFloat() { - return ThreadLocalRandom.current().nextFloat(); + return rng.nextFloat(); } static final int randomChoice(int numChoices) { - return ThreadLocalRandom.current().nextInt(numChoices); + return rng.nextInt(numChoices); } static final <T> T randomChoice(Supplier<T>... choiceSuppliers) { @@ -1046,7 +1178,7 @@ static final <T> T randomChoice(Supplier<T>... choiceSuppliers) { } static final int randomChoice(double... 
proportions) { } static final String randomKey() { - return "key-" + ThreadLocalRandom.current().nextInt(NUM_KEYS); + return "key-" + rng.nextInt(NUM_KEYS); } static final String randomValue() { - return "values-" + ThreadLocalRandom.current().nextInt(); + return "values-" + rng.nextInt(); } static final String[] randomKeysAndValues() { - int numEntries = ThreadLocalRandom.current().nextInt(NUM_KEYS); + int numEntries = rng.nextInt(NUM_KEYS); String[] keysAndValues = new String[numEntries << 1]; for (int i = 0; i < keysAndValues.length; i += 2) { @@ -1123,6 +1255,17 @@ static final TagMap.Ledger ledgerOf(String... keysAndValues) { return ledger; } + static final TagMap.Ledger ledgerByIdOf(String... keysAndValues) { + TagMap.Ledger ledger = TagMap.ledger(); + for (int i = 0; i < keysAndValues.length; i += 2) { + String key = keysAndValues[i]; + String value = keysAndValues[i + 1]; + + ledger.set(tagIdOf(key), value); + } + return ledger; + } + static final class TestCase { final List actions; @@ -1286,6 +1429,41 @@ public String toString() { } } + static final class SetById extends BasicAction { + final String key; + final String value; + + SetById(String key, String value) { + this.key = key; + this.value = value; + } + + @Override + protected void _applyToTestMap(TagMap testMap) { + testMap.set(tagIdOf(this.key), this.value); + } + + @Override + protected void _applyToExpectedMap(Map expectedMap) { + expectedMap.put(this.key, this.value); + } + + @Override + public void verifyTestMap(TagMap testMap) { + // findable by name (read-path unification) ... + assertEquals(this.value, testMap.get(this.key)); + // ... 
and by tag id + TagMap.Entry byId = testMap.getEntry(tagIdOf(this.key)); + assertNotNull(byId); + assertEquals(this.value, byId.objectValue()); + } + + @Override + public String toString() { + return String.format("setById(%s,%s)", literal(this.key), literal(this.value)); + } + } + static final class GetAndSet extends ReturningAction { final String key; final String value; @@ -1428,6 +1606,43 @@ public String toString() { } } + static final class PutAllLedgerById extends BasicAction { + final String[] keysAndValues; + final TagMap.Ledger ledger; + + PutAllLedgerById(String... keysAndValues) { + this.keysAndValues = keysAndValues; + this.ledger = ledgerByIdOf(keysAndValues); + } + + @Override + protected void _applyToTestMap(TagMap testMap) { + this.ledger.fill(testMap); + } + + @Override + protected void _applyToExpectedMap(Map expectedMap) { + for (TagMap.EntryChange change : this.ledger) { + // ledgerByIdOf doesn't produce removals, so this cast is safe + TagMap.Entry entry = (TagMap.Entry) change; + expectedMap.put(entry.tag(), entry.objectValue()); + } + } + + @Override + public void verifyTestMap(TagMap testMap) { + // ledger may contain multiple updates of the same key; compare against a built map + for (TagMap.EntryReader entry : this.ledger.buildImmutable()) { + assertEquals(entry.objectValue(), testMap.get(entry.tag()), "key=" + entry.tag()); + } + } + + @Override + public String toString() { + return String.format("putAllLedgerById(%s)", literalVarArgs(this.keysAndValues)); + } + } + static final class Remove extends BasicReturningAction { final String key; @@ -1489,6 +1704,39 @@ public String toString() { } } + static final class RemoveById extends ReturningAction<Boolean> { + final String key; + + RemoveById(String key) { + this.key = key; + } + + @Override + protected Boolean _applyToTestMap(TagMap testMap) { + return testMap.remove(tagIdOf(this.key)); + } + + @Override + protected Object _applyToExpectedMap(Map expectedMap) { + return 
expectedMap.remove(this.key); + } + + @Override + protected void _verifyResults(Object expected, Boolean actual) { + assertEquals((expected != null), actual); + } + + @Override + public void verifyTestMap(TagMap testMap) { + assertFalse(testMap.containsKey(this.key)); + } + + @Override + public String toString() { + return String.format("removeById(%s)", literal(this.key)); + } + } + static final class GetAndRemove extends ReturningAction { final String key; diff --git a/internal-api/src/test/java/datadog/trace/api/TagMapTagIdTest.java b/internal-api/src/test/java/datadog/trace/api/TagMapTagIdTest.java new file mode 100644 index 00000000000..ba716f3c935 --- /dev/null +++ b/internal-api/src/test/java/datadog/trace/api/TagMapTagIdTest.java @@ -0,0 +1,292 @@ +package datadog.trace.api; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotNull; +import static org.junit.jupiter.api.Assertions.assertNull; +import static org.junit.jupiter.api.Assertions.assertTrue; + +import datadog.trace.api.TagMap.Entry; +import java.util.HashMap; +import java.util.Map; +import org.junit.jupiter.api.AfterEach; +import org.junit.jupiter.api.BeforeEach; +import org.junit.jupiter.api.Test; + +/** + * Exercises the tag-id keyed {@code set}/{@code getEntry} surface on {@link TagMap} and {@link + * TagMap.Ledger}. + * + *

<p>In this (buckets-only) phase a tag-id keyed write builds a tag-id-bearing {@link Entry} and + * stores it in the regular hash buckets. The entry carries its full identity (globalSerial / + * fieldPos / nameHash) and resolves its name lazily via {@link KnownTags}, so it remains findable + * by string name once a {@link KnownTags.Resolver} is registered. + */ +public class TagMapTagIdTest { + // Test tags: name -> (globalSerial, fieldPos). nameHash is derived from Entry._hash(name) so the + // tag-id-bearing entry lands in the same hash bucket a string-keyed entry would. + static final String HTTP_METHOD = "http.request.method"; + static final String HTTP_STATUS = "http.response.status_code"; + static final String DB_SYSTEM = "db.system"; + // a "low-priority" stored tag: has an id, but deliberately no positional slot (NO_SLOT) so it + // lives in the hash buckets rather than widening knownEntries[]. + static final String MESSAGING_SYSTEM = "messaging.system"; + + static final long HTTP_METHOD_ID = tagId(1, 2, HTTP_METHOD); + static final long HTTP_STATUS_ID = tagId(2, 5, HTTP_STATUS); + static final long DB_SYSTEM_ID = tagId(3, 0, DB_SYSTEM); + // stored-range serial (>= FIRST_STORED_SERIAL) so it is an *unslotted stored* tag, not a reserved + static final long MESSAGING_SYSTEM_ID = + KnownTags.tagId(KnownTags.FIRST_STORED_SERIAL + 4, MESSAGING_SYSTEM); + + static long tagId(int globalSerial, int fieldPos, String name) { + return KnownTags.tagId(globalSerial, fieldPos, name); + } + + @Test + public void tagId_roundTripsThroughExtractors() { + long id = KnownTags.tagId(7, 13, "some.tag.name"); + assertEquals(7, KnownTags.globalSerial(id)); + assertEquals(13, KnownTags.fieldPos(id)); + assertEquals(Entry._hash("some.tag.name"), KnownTags.nameHash(id)); + } + + @BeforeEach + public void registerResolver() { + Map<Long, String> nameById = new HashMap<>(); + Map<String, Long> idByName = new HashMap<>(); + nameById.put(HTTP_METHOD_ID, HTTP_METHOD); + 
nameById.put(HTTP_STATUS_ID, HTTP_STATUS); + 
nameById.put(DB_SYSTEM_ID, DB_SYSTEM); + nameById.put(MESSAGING_SYSTEM_ID, MESSAGING_SYSTEM); + for (Map.Entry<Long, String> e : nameById.entrySet()) { + idByName.put(e.getValue(), e.getKey()); + } + KnownTags.register( + new KnownTags.Resolver() { + @Override + public String nameOf(long tagId) { + return nameById.get(tagId); + } + + @Override + public long keyOf(String name) { + Long id = idByName.get(name); + return id == null ? 0L : id; + } + + @Override + public int slotCount() { + return 6; // max stored fieldPos (HTTP_STATUS=5) + 1 + } + }); + } + + @AfterEach + public void clearResolver() { + KnownTags.register(null); + } + + @Test + public void setById_findableByIdAndName() { + TagMap map = TagMap.create(); + map.set(HTTP_METHOD_ID, "GET"); + + // findable by tag id + Entry byId = map.getEntry(HTTP_METHOD_ID); + assertNotNull(byId); + assertEquals("GET", byId.stringValue()); + + // findable by the resolved string name (read-path unification). getEntry materializes a fresh + // Entry from the dense store on each call, so identity is not preserved across calls; the + // logical entry (tag + value) is. 
+ Entry byName = map.getEntry(HTTP_METHOD); + assertNotNull(byName); + assertEquals(byId.tag(), byName.tag()); + assertEquals(byId.stringValue(), byName.stringValue()); + assertEquals("GET", map.get(HTTP_METHOD)); + } + + @Test + public void setById_preservesIdentityOnEntry() { + TagMap map = TagMap.create(); + map.set(HTTP_STATUS_ID, 404); + + Entry entry = map.getEntry(HTTP_STATUS_ID); + assertNotNull(entry); + // globalSerial survives on the stored entry + assertEquals(2, KnownTags.globalSerial(entry.tagId)); + assertEquals(5, KnownTags.fieldPos(entry.tagId)); + // name resolves lazily from the id + assertEquals(HTTP_STATUS, entry.tag()); + } + + @Test + public void setById_typedValues() { + TagMap map = TagMap.create(); + map.set(HTTP_STATUS_ID, 500); + + Entry entry = map.getEntry(HTTP_STATUS_ID); + assertTrue(entry.is(Entry.INT)); + assertEquals(500, entry.intValue()); + assertEquals(500, map.getInt(HTTP_STATUS)); + } + + @Test + public void setById_overwriteSameTag() { + TagMap map = TagMap.create(); + map.set(HTTP_METHOD_ID, "GET"); + map.set(HTTP_METHOD_ID, "POST"); + + assertEquals("POST", map.get(HTTP_METHOD)); + assertEquals(1, map.size()); + } + + @Test + public void setByName_findableById() { + TagMap map = TagMap.create(); + // string write of a known tag is still findable through the id read path (resolves to the + // same name and hash bucket) + map.set(DB_SYSTEM, "postgresql"); + + Entry byId = map.getEntry(DB_SYSTEM_ID); + assertNotNull(byId); + assertEquals("postgresql", byId.stringValue()); + } + + @Test + public void getEntryById_missingReturnsNull() { + TagMap map = TagMap.create(); + assertNull(map.getEntry(HTTP_METHOD_ID)); + } + + @Test + public void ledger_setById_buildsMap() { + TagMap map = + TagMap.ledger() + .set(HTTP_METHOD_ID, "GET") + .set(HTTP_STATUS_ID, 204) + .set(DB_SYSTEM_ID, "mysql") + .build(); + + assertEquals("GET", map.get(HTTP_METHOD)); + assertEquals(204, map.getInt(HTTP_STATUS)); + assertEquals("mysql", 
map.get(DB_SYSTEM)); + + // tag id survives the ledger -> build path + Entry status = map.getEntry(HTTP_STATUS_ID); + assertEquals(2, KnownTags.globalSerial(status.tagId)); + assertEquals(204, status.intValue()); + } + + @Test + public void ledger_mixedIdAndName() { + TagMap map = TagMap.ledger().set(HTTP_METHOD_ID, "PUT").set(DB_SYSTEM, "redis").build(); + + assertEquals("PUT", map.get(HTTP_METHOD)); + assertEquals("redis", map.get(DB_SYSTEM)); + assertEquals(2, map.size()); + } + + @Test + public void removeById_clearsAndReportsSize() { + TagMap map = TagMap.create(); + map.set(HTTP_METHOD_ID, "GET"); + assertEquals(1, map.size()); + + assertTrue(map.remove(HTTP_METHOD_ID)); + assertNull(map.getEntry(HTTP_METHOD_ID)); + assertNull(map.getEntry(HTTP_METHOD)); + assertEquals(0, map.size()); + assertFalse(map.remove(HTTP_METHOD_ID)); // already gone + } + + @Test + public void getAndRemoveById_returnsPrior() { + TagMap map = TagMap.create(); + map.set(HTTP_STATUS_ID, 404); + + Entry prior = map.getAndRemove(HTTP_STATUS_ID); + assertNotNull(prior); + assertEquals(404, prior.intValue()); + assertNull(map.getEntry(HTTP_STATUS_ID)); + } + + @Test + public void removeById_removesStringSetTag() { + // set by name, removed by id (id resolves to the same tag, slot or bucket) + TagMap map = TagMap.create(); + map.set(DB_SYSTEM, "postgresql"); + + assertTrue(map.remove(DB_SYSTEM_ID)); + assertNull(map.get(DB_SYSTEM)); + } + + @Test + public void ledger_removeById() { + TagMap map = + TagMap.ledger() + .set(HTTP_METHOD_ID, "GET") + .set(DB_SYSTEM_ID, "mysql") + .remove(DB_SYSTEM_ID) + .build(); + + assertEquals("GET", map.get(HTTP_METHOD)); + assertNull(map.get(DB_SYSTEM)); + assertEquals(1, map.size()); + } + + @Test + public void noSlotOverload_stampsNoSlotSentinel() { + long id = KnownTags.tagId(KnownTags.FIRST_STORED_SERIAL + 4, MESSAGING_SYSTEM); + assertEquals(KnownTags.FIRST_STORED_SERIAL + 4, KnownTags.globalSerial(id)); + assertEquals(KnownTags.NO_SLOT, 
KnownTags.fieldPos(id)); + assertEquals(Entry._hash(MESSAGING_SYSTEM), KnownTags.nameHash(id)); + // a stored serial + NO_SLOT fieldPos == an unslotted (bucket-only) stored tag + assertTrue(KnownTags.isStored(id)); + assertTrue(KnownTags.isUnslotted(id)); + } + + @Test + public void unslotted_setFindableByIdAndName() { + TagMap map = TagMap.create(); + map.set(MESSAGING_SYSTEM_ID, "kafka"); + + Entry byId = map.getEntry(MESSAGING_SYSTEM_ID); + assertNotNull(byId); + assertEquals("kafka", byId.stringValue()); + // NO_SLOT survives on the stored entry — it lives in the buckets, not a slot + assertEquals(KnownTags.NO_SLOT, KnownTags.fieldPos(byId.tagId)); + assertEquals(MESSAGING_SYSTEM, byId.tag()); + + // string read of the same tag unifies with the id-stored entry (logically; getEntry + // materializes a fresh Entry per call so identity is not preserved) + Entry byName = map.getEntry(MESSAGING_SYSTEM); + assertNotNull(byName); + assertEquals(byId.tag(), byName.tag()); + assertEquals(byId.stringValue(), byName.stringValue()); + assertEquals("kafka", map.get(MESSAGING_SYSTEM)); + } + + @Test + public void unslotted_stringSetFindableById() { + TagMap map = TagMap.create(); + map.set(MESSAGING_SYSTEM, "rabbitmq"); + + Entry byId = map.getEntry(MESSAGING_SYSTEM_ID); + assertNotNull(byId); + assertEquals("rabbitmq", byId.stringValue()); + } + + @Test + public void unslotted_removeById() { + TagMap map = TagMap.create(); + map.set(MESSAGING_SYSTEM_ID, "kafka"); + assertEquals(1, map.size()); + + assertTrue(map.remove(MESSAGING_SYSTEM_ID)); + assertNull(map.getEntry(MESSAGING_SYSTEM_ID)); + assertNull(map.get(MESSAGING_SYSTEM)); + assertEquals(0, map.size()); + } +} diff --git a/internal-api/src/test/java/datadog/trace/util/TagSetTest.java b/internal-api/src/test/java/datadog/trace/util/TagSetTest.java new file mode 100644 index 00000000000..e5ca5ff41e5 --- /dev/null +++ b/internal-api/src/test/java/datadog/trace/util/TagSetTest.java @@ -0,0 +1,102 @@ +package 
datadog.trace.util; + +import static org.junit.jupiter.api.Assertions.assertEquals; +import static org.junit.jupiter.api.Assertions.assertFalse; +import static org.junit.jupiter.api.Assertions.assertNotEquals; +import static org.junit.jupiter.api.Assertions.assertThrows; +import static org.junit.jupiter.api.Assertions.assertTrue; + +import datadog.trace.util.TagSet.Data; +import datadog.trace.util.TagSet.Support; +import org.junit.jupiter.api.Test; + +class TagSetTest { + + @Test + void hash_spread_and_zeroSentinel() { + // "".hashCode() == 0 -> remapped to the non-zero sentinel so 0 can mean "empty slot" + assertEquals(0xDD06, Support.hash("")); + + int raw = "foo".hashCode(); + assertEquals(raw ^ (raw >>> 16), Support.hash("foo")); + assertNotEquals(0, Support.hash("foo")); + } + + @Test + void tableSizeFor_isPow2_andOversized() { + assertEquals(2, Support.tableSizeFor(0)); + assertEquals(4, Support.tableSizeFor(1)); + assertEquals(8, Support.tableSizeFor(3)); + assertEquals(16, Support.tableSizeFor(4)); + } + + @Test + void instance_contains_internedAndCopy_andMiss() { + TagSet set = TagSet.of("foo", "bar", "baz"); + + assertEquals(8, set.slots()); // 3 names -> tableSizeFor(3) == 8 + + assertTrue(set.contains("foo")); // interned literal -> == fast path in eq + assertTrue(set.contains(new String("bar"))); // non-interned -> .equals path + assertFalse(set.contains("nope")); + + assertTrue(set.indexOf("baz") >= 0); + assertEquals(-1, set.indexOf("nope")); + } + + @Test + void support_create_then_indexOf() { + Data d = Support.create("x", "y"); + + int slot = Support.indexOf(d.hashes, d.names, "x"); // 3-arg overload computes the hash + assertTrue(slot >= 0); + assertEquals("x", d.names[slot]); + + assertEquals(-1, Support.indexOf(d.hashes, d.names, "q")); + } + + /** Controlled hashes force collision, linear-probe wraparound, and the already-present path. 
*/ + @Test + void put_and_indexOf_collisionAndWraparound() { + int[] hashes = new int[4]; // mask = 3 + String[] names = new String[4]; + + assertEquals(3, Support.put(hashes, names, "a", 7)); // 7 & 3 == 3 + assertEquals(0, Support.put(hashes, names, "b", 7)); // collides at 3, probes (3+1)&3 == 0 + assertEquals(3, Support.put(hashes, names, "a", 7)); // already present -> existing slot + + assertEquals(3, Support.indexOf(hashes, names, "a", 7)); // direct hit + assertEquals(0, Support.indexOf(hashes, names, "b", 7)); // hit after collision + wraparound + assertEquals( + -1, Support.indexOf(hashes, names, "c", 7)); // miss after probing 3 -> 0 -> 1(empty) + assertEquals(-1, Support.indexOf(hashes, names, "z", 6)); // 6 & 3 == 2, empty -> immediate miss + } + + @Test + void put_throwsWhenFull() { + int[] hashes = new int[2]; // mask = 1 + String[] names = new String[2]; + + Support.put(hashes, names, "a", 4); // 4 & 1 == 0 + Support.put(hashes, names, "b", 5); // 5 & 1 == 1 + + // both slots occupied, no match -> probe exhausts -> throw + assertThrows(IllegalStateException.class, () -> Support.put(hashes, names, "c", 6)); + } + + /** The documented usage: build a TagSet, attach a parallel payload indexed by slot. 
*/ + @Test + void parallelPayloadBySlot() { + String[] names = {"a", "b", "c"}; + Data d = Support.create(names); + + long[] ids = new long[d.names.length]; + for (int j = 0; j < names.length; j++) { + ids[Support.indexOf(d.hashes, d.names, names[j])] = j + 1L; + } + + assertEquals(1L, ids[Support.indexOf(d.hashes, d.names, "a")]); + assertEquals(2L, ids[Support.indexOf(d.hashes, d.names, "b")]); + assertEquals(3L, ids[Support.indexOf(d.hashes, d.names, "c")]); + } +} diff --git a/tag-conventions.yaml b/tag-conventions.yaml new file mode 100644 index 00000000000..1ee73279326 --- /dev/null +++ b/tag-conventions.yaml @@ -0,0 +1,150 @@ +# Tag conventions — structural inheritance + mixins (pull via include / push via applies) +# --------------------------------------------------------------------------- +# Language-agnostic spec the code generator consumes to emit per-language tag-id +# constants, the id<->name resolver, and the slot layout. Reconciled from the +# OTel-convention prototype + the tags Spring PetClinic emits. +# +# THREE COMPOSITION MECHANISMS (we use all three; each models a different relationship) +# extends — STRUCTURAL "is-a" inheritance between span types (single concept, may be +# multiple). `http.server` is-a `http` is-a `base`. The root `base` is +# implicitly in every span. Abstract layers (`abstract: true`) exist only to +# be extended. Carries the common tags down the chain. +# +# mixins are reusable tag bundles attached TWO ways — BOTH are needed: +# include: a span type PULLS a mixin in (`include: [peer]`). Use for a bundle that is +# intrinsic to that type and declared locally on it; core-owned. (has-a) +# applies: a mixin PUSHES itself onto span types (`applies: all | [types]`). Use for +# cross-cutting / product bundles (profiling, appsec, dsm, ci) that attach +# without the span type knowing — so a product team owns its own attachment +# and never edits the core span_types. Gated by `enabled_by`. 
+# +# resolved tags(span_type) = own tags +# + tags up the `extends` chain (incl. base) +# + tags of every mixin the type `include`s +# + tags of every mixin whose `applies` matches the type +# +# When to choose: extends = identity; include = a capability the type intrinsically has +# (pull, core-owned, local to the type); applies = orthogonal/optional enrichment a +# bundle projects onto spans (push, product-owned, config-gated). +# +# tag fields: tag | type (string|int|long|boolean|double) | required (required|conditional| +# recommended|optional|opt_in) | aliases. Impl hints (split out cross-language): +# slot (true default; false = id-only, lives in buckets) | intercepted | source (core|inst). +# --------------------------------------------------------------------------- + +# ===== Structural span types (extends) ===================================== +span_types: + + # root: common tags, implicitly in EVERY span + base: + abstract: true + tags: + - { tag: component, type: string, required: required } + - { tag: span.kind, type: string, required: required, intercepted: true } # otel: kind + - { tag: _dd.base_service, type: string, required: required, source: core } + - { tag: version, type: string, required: recommended, source: core } + - { tag: env, type: string, required: recommended, source: core } + - { tag: language, type: string, required: required, source: core } + - { tag: runtime-id, type: string, required: required, source: core } + - { tag: _dd.integration, type: string, required: optional, source: core, slot: false } + - { tag: _dd.git.commit.sha, type: string, required: optional, source: core, slot: false } + - { tag: _dd.git.repository_url, type: string, required: optional, source: core, slot: false } + # error enrichment — any span can fail; present only on failure, hence slot:false + - { tag: error.type, type: string, required: recommended, slot: false } + - { tag: error.message, type: string, required: recommended, slot: false } + - { tag: 
error.stack, type: string, required: recommended, slot: false } + + http: + abstract: true + extends: base + tags: + - { tag: http.method, type: string, required: required, intercepted: true, aliases: [http.request.method] } + - { tag: http.status_code, type: int, required: conditional, aliases: [http.response.status_code] } + - { tag: network.protocol.version, type: string, required: recommended, slot: false } + + http.server: # servlet.request + extends: http + tags: + - { tag: http.url, type: string, required: required, intercepted: true, aliases: [url.full] } + - { tag: http.route, type: string, required: conditional } + - { tag: http.hostname, type: string, required: required, aliases: [server.address] } + - { tag: http.useragent, type: string, required: recommended, slot: false } + - { tag: http.query.string, type: string, required: recommended, slot: false, aliases: [url.query] } + - { tag: servlet.path, type: string, required: optional, slot: false } + - { tag: servlet.context, type: string, required: optional, slot: false, intercepted: true } + + http.client: + extends: http + include: [ peer ] # PULL: an http client intrinsically has a remote peer + tags: + - { tag: http.url, type: string, required: required, intercepted: true, aliases: [url.full] } + - { tag: http.resend_count, type: int, required: recommended, slot: false } + + db.client: # h2.query / jdbc + extends: base + include: [ peer ] # PULL + tags: + - { tag: db.type, type: string, required: required, aliases: [db.system] } + - { tag: db.instance, type: string, required: recommended } + - { tag: db.operation, type: string, required: recommended, aliases: [db.operation.name] } + - { tag: db.user, type: string, required: recommended } + - { tag: db.pool.name, type: string, required: optional } + - { tag: db.statement, type: string, required: recommended, slot: false, intercepted: true, aliases: [db.query.text] } + + view.render: # response.render + extends: base + tags: + - { tag: view.name, type: 
string, required: recommended, slot: false } + +# ===== Mixins (reusable bundles) =========================================== +# Attached by include (pull, on a span type above) and/or by applies (push, here). +mixins: + + # peer — outbound/remote-peer capability. PULLED via `include` by client span types + # (it's intrinsic to them, core-owned), so no `applies`. + peer: + tags: + - { tag: peer.service, type: string, required: recommended, intercepted: true } + - { tag: _dd.peer.service.source, type: string, required: recommended } + - { tag: peer.hostname, type: string, required: recommended } + - { tag: peer.ipv4, type: string } + - { tag: peer.ipv6, type: string } + - { tag: peer.port, type: int } + + # products — PUSH via applies, gated by enabled_by; owned by the product teams. + profiling: + enabled_by: dd.profiling.enabled + applies: all + tags: + - { tag: _dd.profiling.enabled, type: boolean, source: core } + + dsm: # Data Streams Monitoring + enabled_by: dd.data.streams.enabled + applies: all + tags: + - { tag: _dd.dsm.enabled, type: boolean, source: core } + + appsec: # ASM — entry/web spans + enabled_by: dd.appsec.enabled + applies: [ http.server ] + tags: + - { tag: _dd.appsec.enabled, type: boolean, source: core, slot: false } + + ci_visibility: # test spans (a `test` span type, not modeled here yet) + enabled_by: dd.civisibility.enabled + applies: [ test ] + tags: + - { tag: test.name, type: string, slot: false } + - { tag: test.suite, type: string, slot: false } + - { tag: test.status, type: string, slot: false } + - { tag: test.framework, type: string, slot: false } + +# --------------------------------------------------------------------------- +# Notes +# - A mixin may be both included AND applied (different span types reach it different ways); +# the resolver de-dups so a tag pulled in twice is one slot. +# - `span.kind` enumerates: server | client | producer | consumer | internal | broker. 
+# - "virtual" tags (sampling.priority, resource.name, service, manual.keep/drop, span.type, +# measured, origin, analytics.sample_rate) are interceptor/span-field handled, NOT stored; +# they'd carry `virtual: true` (reserved-tier id, no slot). Omitted here. +# ---------------------------------------------------------------------------
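
Reviewer sketch (not part of the diff): the `resolved tags(span_type)` rule from the header of `tag-conventions.yaml` could be implemented roughly as follows. This is a minimal illustration under stated assumptions, not the actual code generator. The class `TagResolutionSketch`, its nested `SpanType` and `Mixin` types, and the `resolve` and `demo` methods are all hypothetical names. It assumes a single `extends` parent per type, that `include`d mixins are inherited down the `extends` chain, and that `applies` matches the exact span type name (or `all`); the spec does not confirm any of these.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the "resolved tags(span_type)" rule; not the real generator. */
public class TagResolutionSketch {

  /** A span type: parent (`extends`), pulled mixins (`include`), and its own tag names. */
  static final class SpanType {
    final String parent; // null for the root; single parent is an assumption of this sketch
    final List<String> includes;
    final List<String> tags;

    SpanType(String parent, List<String> includes, List<String> tags) {
      this.parent = parent;
      this.includes = includes;
      this.tags = tags;
    }
  }

  /** A mixin: optional `enabled_by` gate, `applies` targets ("all" or type names), tag names. */
  static final class Mixin {
    final String enabledBy; // null when the mixin is not config-gated
    final List<String> applies;
    final List<String> tags;

    Mixin(String enabledBy, List<String> applies, List<String> tags) {
      this.enabledBy = enabledBy;
      this.applies = applies;
      this.tags = tags;
    }
  }

  static Set<String> resolve(
      String type,
      Map<String, SpanType> types,
      Map<String, Mixin> mixins,
      Set<String> enabledConfig) {
    // LinkedHashSet de-dups, so a tag reached twice still counts once
    Set<String> resolved = new LinkedHashSet<>();

    // own tags + tags up the extends chain, with each layer's included mixins (pull)
    String current = type;
    while (current != null) {
      SpanType t = types.get(current);
      resolved.addAll(t.tags);
      for (String mixinName : t.includes) {
        resolved.addAll(mixins.get(mixinName).tags);
      }
      current = t.parent;
    }

    // mixins that push themselves onto this type via applies, if their gate is on
    for (Mixin m : mixins.values()) {
      boolean enabled = m.enabledBy == null || enabledConfig.contains(m.enabledBy);
      boolean matches = m.applies.contains("all") || m.applies.contains(type);
      if (enabled && matches) {
        resolved.addAll(m.tags);
      }
    }
    return resolved;
  }

  /** Resolves a trimmed-down `http.client` with only profiling enabled. */
  static Set<String> demo() {
    Map<String, SpanType> types = new LinkedHashMap<>();
    types.put(
        "base",
        new SpanType(null, Collections.emptyList(), Arrays.asList("component", "span.kind")));
    types.put(
        "http",
        new SpanType(
            "base", Collections.emptyList(), Arrays.asList("http.method", "http.status_code")));
    types.put("http.client", new SpanType("http", Arrays.asList("peer"), Arrays.asList("http.url")));

    Map<String, Mixin> mixins = new LinkedHashMap<>();
    mixins.put(
        "peer",
        new Mixin(null, Collections.emptyList(), Arrays.asList("peer.service", "peer.hostname")));
    mixins.put(
        "profiling",
        new Mixin(
            "dd.profiling.enabled", Arrays.asList("all"), Arrays.asList("_dd.profiling.enabled")));
    mixins.put(
        "appsec",
        new Mixin(
            "dd.appsec.enabled", Arrays.asList("http.server"), Arrays.asList("_dd.appsec.enabled")));

    return resolve("http.client", types, mixins, Collections.singleton("dd.profiling.enabled"));
  }

  public static void main(String[] args) {
    System.out.println(demo());
  }
}
```

In this trimmed example `http.client` picks up its own tag, the `peer` bundle through `include`, the `http` and `base` tags through `extends`, and the profiling tag through `applies: all`. The `appsec` mixin is skipped because it targets `http.server` and its gate is off. If `applies` is meant to match subtypes as well, the `matches` check would need to walk the `extends` chain instead.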