diff --git a/README.md b/README.md index 218b0748..670c5ad9 100644 --- a/README.md +++ b/README.md @@ -1,5 +1,5 @@ # DataOps Data Quality TestGen -![apache 2.0 license Badge](https://img.shields.io/badge/License%20-%20Apache%202.0%20-%20blue) ![PRs Badge](https://img.shields.io/badge/PRs%20-%20Welcome%20-%20green) [![Latest Version](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fhub.docker.com%2Fv2%2Frepositories%2Fdatakitchen%2Fdataops-testgen%2Ftags%2F&query=results%5B0%5D.name&label=latest%20version&color=06A04A)](https://hub.docker.com/r/datakitchen/dataops-testgen) [![Docker Pulls](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fhub.docker.com%2Fv2%2Frepositories%2Fdatakitchen%2Fdataops-testgen%2F&query=pull_count&style=flat&label=docker%20pulls&color=06A04A)](https://hub.docker.com/r/datakitchen/dataops-testgen) [![Documentation](https://img.shields.io/badge/docs-On%20datakitchen.io-06A04A?style=flat)](https://docs.datakitchen.io/articles/#!dataops-testgen-help/dataops-testgen-help) [![Static Badge](https://img.shields.io/badge/Slack-Join%20Discussion-blue?style=flat&logo=slack)](https://data-observability-slack.datakitchen.io/join) +![apache 2.0 license Badge](https://img.shields.io/badge/License%20-%20Apache%202.0%20-%20blue) ![PRs Badge](https://img.shields.io/badge/PRs%20-%20Welcome%20-%20green) [![Latest Version](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fhub.docker.com%2Fv2%2Frepositories%2Fdatakitchen%2Fdataops-testgen%2Ftags%2F&query=results%5B0%5D.name&label=latest%20version&color=06A04A)](https://hub.docker.com/r/datakitchen/dataops-testgen) [![Docker Pulls](https://img.shields.io/badge/dynamic/json?url=https%3A%2F%2Fhub.docker.com%2Fv2%2Frepositories%2Fdatakitchen%2Fdataops-testgen%2F&query=pull_count&style=flat&label=docker%20pulls&color=06A04A)](https://hub.docker.com/r/datakitchen/dataops-testgen) 
[![Documentation](https://img.shields.io/badge/docs-On%20datakitchen.io-06A04A?style=flat)](https://docs.datakitchen.io/articles/dataops-testgen-help/dataops-testgen-help) [![Static Badge](https://img.shields.io/badge/Slack-Join%20Discussion-blue?style=flat&logo=slack)](https://data-observability-slack.datakitchen.io/join)

DataOps Data Quality TestGen, or "TestGen" for short, helps you find data issues so you can alert your users and notify your suppliers. It delivers simple, fast data quality test generation and execution through data profiling, new-dataset screening and hygiene review, algorithmic generation of data quality validation tests, ongoing production testing of new data refreshes, and continuous anomaly monitoring of datasets. TestGen is part of DataKitchen's Open Source Data Observability.

@@ -110,7 +110,7 @@ Within the virtual environment, install the TestGen package with pip. pip install dataops-testgen ``` -Verify that the [_testgen_ command line](https://docs.datakitchen.io/articles/#!dataops-testgen-help/testgen-commands-and-details) works. +Verify that the [_testgen_ command line](https://docs.datakitchen.io/articles/dataops-testgen-help/testgen-commands-and-details) works. ```shell testgen --help ``` @@ -187,7 +187,7 @@ python3 dk-installer.py tg delete-demo ### Upgrade to latest version -New releases of TestGen are announced on the `#releases` channel on [Data Observability Slack](https://data-observability-slack.datakitchen.io/join), and release notes can be found on the [DataKitchen documentation portal](https://docs.datakitchen.io/articles/#!dataops-testgen-help/testgen-release-notes/a/h1_1691719522). Use the following command to upgrade to the latest released version. +New releases of TestGen are announced on the `#releases` channel on [Data Observability Slack](https://data-observability-slack.datakitchen.io/join), and release notes can be found on the [DataKitchen documentation portal](https://docs.datakitchen.io/articles/dataops-testgen-help/testgen-release-notes/a/h1_1691719522). Use the following command to upgrade to the latest released version. ```shell python3 dk-installer.py tg upgrade @@ -203,7 +203,7 @@ python3 dk-installer.py tg delete ### Access the _testgen_ CLI -The [_testgen_ command line](https://docs.datakitchen.io/articles/#!dataops-testgen-help/testgen-commands-and-details) can be accessed within the running container. +The [_testgen_ command line](https://docs.datakitchen.io/articles/dataops-testgen-help/testgen-commands-and-details) can be accessed within the running container. 
```shell docker compose exec engine bash @@ -232,7 +232,7 @@ We recommend you start by going through the [Data Observability Overview Demo](h For support requests, [join the Data Observability Slack](https://data-observability-slack.datakitchen.io/join) πŸ‘‹ and post on the `#support` channel. ### Connect to your database -Follow [these instructions](https://docs.datakitchen.io/articles/#!dataops-testgen-help/connect-your-database) to improve the quality of data in your database. +Follow [these instructions](https://docs.datakitchen.io/articles/dataops-testgen-help/connect-your-database) to improve the quality of data in your database. ### Community Talk and learn with other data practitioners who are building with DataKitchen. Share knowledge, get help, and contribute to our open-source project. diff --git a/deploy/charts/README.md b/deploy/charts/README.md index a8656084..c081df4b 100644 --- a/deploy/charts/README.md +++ b/deploy/charts/README.md @@ -40,9 +40,14 @@ set can be easily used on the first install and future upgrades. The following configuration is recommended for experimental installations, but you're free to adjust it for your needs. The next installation steps assume -that a file named tg-values.yaml exists with this configuration. +that a file named `tg-values.yaml` exists with this configuration. 
```yaml +image: + + # DataOps TestGen version to be installed / upgraded + tag: v4 + testgen: # Password that will be assigned to the 'admin' user during the database preparation @@ -54,10 +59,18 @@ testgen: # Whether to run the SSL certificate verifications when connecting to DataOps Observability observabilityVerifySsl: false -image: + # (Optional) E-mail and SMTP configurations for enabling the email notifications + emailNotifications: - # DataOps TestGen version to be installed / upgraded - tag: v4.0 + # The email address that notifications will be sent from + fromAddress: + + # SMTP configuration for sending emails + smtp: + endpoint: + port: + username: + password: ``` # Installing diff --git a/deploy/charts/testgen-app/Chart.yaml b/deploy/charts/testgen-app/Chart.yaml index 01b2c072..ac7f2e5d 100644 --- a/deploy/charts/testgen-app/Chart.yaml +++ b/deploy/charts/testgen-app/Chart.yaml @@ -15,7 +15,7 @@ type: application # This is the chart version. This version number should be incremented each time you make changes # to the chart and its templates, including the app version. # Versions are expected to follow Semantic Versioning (https://semver.org/) -version: 1.0.1 +version: 1.1.0 # This is the version number of the application being deployed. This version number should be # incremented each time you make changes to the application. 
Versions are not expected to diff --git a/deploy/charts/testgen-app/templates/_environment.yaml b/deploy/charts/testgen-app/templates/_environment.yaml index c329f75c..5bf01711 100644 --- a/deploy/charts/testgen-app/templates/_environment.yaml +++ b/deploy/charts/testgen-app/templates/_environment.yaml @@ -31,6 +31,18 @@ value: {{ .Values.testgen.trustTargetDatabaseCertificate | ternary "yes" "no" | quote }} - name: TG_EXPORT_TO_OBSERVABILITY_VERIFY_SSL value: {{ .Values.testgen.observabilityVerifySsl | ternary "yes" "no" | quote }} +{{- with .Values.testgen.emailNotifications }} +- name: TG_SMTP_ENDPOINT + value: {{ .smtp.endpoint | quote }} +- name: TG_SMTP_PORT + value: {{ .smtp.port | quote }} +- name: TG_SMTP_USERNAME + value: {{ .smtp.username | quote }} +- name: TG_SMTP_PASSWORD + value: {{ .smtp.password | quote }} +- name: TG_EMAIL_FROM_ADDRESS + value: {{ .fromAddress | quote }} +{{- end -}} {{- end -}} {{- define "testgen.hookEnvironment" -}} diff --git a/deploy/charts/testgen-app/values.yaml b/deploy/charts/testgen-app/values.yaml index d958ae09..8018a4cc 100644 --- a/deploy/charts/testgen-app/values.yaml +++ b/deploy/charts/testgen-app/values.yaml @@ -1,12 +1,12 @@ # Default values for testgen. 
testgen: - databaseHost: "postgresql" + databaseHost: "postgres" databaseName: "datakitchen" databaseSchema: "tgapp" databaseUser: "postgres" databasePasswordSecret: - name: "postgresql" + name: "services-secret" key: "postgres-password" authSecrets: create: true @@ -15,6 +15,13 @@ testgen: uiPassword: trustTargetDatabaseCertificate: false observabilityVerifySsl: true + emailNotifications: + fromAddress: + smtp: + endpoint: + port: + username: + password: labels: cliHooks: diff --git a/deploy/charts/testgen-services/Chart.lock b/deploy/charts/testgen-services/Chart.lock deleted file mode 100644 index 72fc8358..00000000 --- a/deploy/charts/testgen-services/Chart.lock +++ /dev/null @@ -1,6 +0,0 @@ -dependencies: -- name: postgresql - repository: https://charts.bitnami.com/bitnami - version: 16.3.0 -digest: sha256:92eb2890efc38c617fa56144f4f54c0ac1ee11818f6b00860ec00a87be48f249 -generated: "2025-02-24T09:05:32.15558-05:00" diff --git a/deploy/charts/testgen-services/Chart.yaml b/deploy/charts/testgen-services/Chart.yaml index 8e98b830..954994c9 100644 --- a/deploy/charts/testgen-services/Chart.yaml +++ b/deploy/charts/testgen-services/Chart.yaml @@ -12,18 +12,13 @@ description: A Helm chart for Kubernetes # pipeline. Library charts do not define any templates and therefore cannot be deployed. type: application -dependencies: - - name: postgresql - version: 16.3.0 - repository: https://charts.bitnami.com/bitnami - # This is the chart version. This version number should be incremented each time you make changes # to the chart and its templates, including the app version. # Versions are expected to follow Semantic Versioning (https://semver.org/) -version: 0.1.1 +version: 1.0.0 # This is the version number of the application being deployed. This version number should be # incremented each time you make changes to the application. Versions are not expected to # follow Semantic Versioning. They should reflect the version the application is using. 
# It is recommended to use it with quotes. -appVersion: "1.16.0" +appVersion: "4" diff --git a/deploy/charts/testgen-services/templates/_helpers.tpl b/deploy/charts/testgen-services/templates/_helpers.tpl new file mode 100644 index 00000000..de964984 --- /dev/null +++ b/deploy/charts/testgen-services/templates/_helpers.tpl @@ -0,0 +1,62 @@ +{{/* +Expand the name of the chart. +*/}} +{{- define "testgen-services.name" -}} +{{- default .Chart.Name .Values.nameOverride | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Create a default fully qualified app name. +We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec). +If release name contains chart name it will be used as a full name. +*/}} +{{- define "testgen-services.fullname" -}} +{{- if .Values.fullnameOverride }} +{{- .Values.fullnameOverride | trunc 63 | trimSuffix "-" }} +{{- else }} +{{- $name := default .Chart.Name .Values.nameOverride }} +{{- if contains $name .Release.Name }} +{{- .Release.Name | trunc 63 | trimSuffix "-" }} +{{- else }} +{{- printf "%s-%s" .Release.Name $name | trunc 63 | trimSuffix "-" }} +{{- end }} +{{- end }} +{{- end }} + +{{/* +Create chart name and version as used by the chart label. +*/}} +{{- define "testgen-services.chart" -}} +{{- printf "%s-%s" .Chart.Name .Chart.Version | replace "+" "_" | trunc 63 | trimSuffix "-" }} +{{- end }} + +{{/* +Common labels +*/}} +{{- define "testgen-services.labels" -}} +helm.sh/chart: {{ include "testgen-services.chart" . }} +{{ include "testgen-services.selectorLabels" . }} +{{- if .Chart.AppVersion }} +app.kubernetes.io/version: {{ .Chart.AppVersion | quote }} +{{- end }} +app.kubernetes.io/managed-by: {{ .Release.Service }} +{{- end }} + +{{/* +Selector labels +*/}} +{{- define "testgen-services.selectorLabels" -}} +app.kubernetes.io/name: {{ include "testgen-services.name" . 
}} +app.kubernetes.io/instance: {{ .Release.Name }} +{{- end }} + +{{/* +Create the name of the service account to use +*/}} +{{- define "testgen-services.serviceAccountName" -}} +{{- if .Values.serviceAccount.create }} +{{- default (include "testgen-services.fullname" .) .Values.serviceAccount.name }} +{{- else }} +{{- default "default" .Values.serviceAccount.name }} +{{- end }} +{{- end }} diff --git a/deploy/charts/testgen-services/templates/secret.yaml b/deploy/charts/testgen-services/templates/secret.yaml new file mode 100644 index 00000000..eb894fd6 --- /dev/null +++ b/deploy/charts/testgen-services/templates/secret.yaml @@ -0,0 +1,9 @@ +{{- if .Values.secret.create }} +apiVersion: v1 +kind: Secret +metadata: + name: {{ .Values.secret.name }} +type: Opaque +data: + {{ .Values.postgres.auth.passwordKey }}: {{ randAlphaNum 24 | b64enc }} +{{- end }} diff --git a/deploy/charts/testgen-services/templates/service.yaml b/deploy/charts/testgen-services/templates/service.yaml new file mode 100644 index 00000000..f53c5c82 --- /dev/null +++ b/deploy/charts/testgen-services/templates/service.yaml @@ -0,0 +1,15 @@ +apiVersion: v1 +kind: Service +metadata: + name: {{ include "testgen-services.fullname" . }} + labels: + {{- include "testgen-services.labels" . | nindent 4 }} +spec: + type: {{ .Values.postgres.service.type }} + ports: + - port: {{ .Values.postgres.service.port }} + targetPort: 5432 + protocol: TCP + name: postgres + selector: + {{- include "testgen-services.selectorLabels" . | nindent 4 }} diff --git a/deploy/charts/testgen-services/templates/serviceaccount.yaml b/deploy/charts/testgen-services/templates/serviceaccount.yaml new file mode 100644 index 00000000..b7873994 --- /dev/null +++ b/deploy/charts/testgen-services/templates/serviceaccount.yaml @@ -0,0 +1,13 @@ +{{- if .Values.serviceAccount.create -}} +apiVersion: v1 +kind: ServiceAccount +metadata: + name: {{ include "testgen-services.serviceAccountName" . 
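The new `secret.yaml` template above fills the Postgres password with Sprig's `randAlphaNum 24 | b64enc`. As a rough illustration (not part of the chart), what ends up in the Secret's `data` field can be sketched in Python:

```python
import base64
import secrets
import string

def rand_alpha_num(n: int) -> str:
    """Rough equivalent of Sprig's randAlphaNum: n chars from [A-Za-z0-9]."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(n))

password = rand_alpha_num(24)
# Kubernetes Secret `data` values are base64-encoded, matching `| b64enc`
encoded = base64.b64encode(password.encode()).decode()
```

Worth noting: `randAlphaNum` runs at template-render time, so every re-render rolls a fresh value; keeping the generated Secret stable across upgrades requires creating it once (e.g. `secret.create: false` with a pre-created Secret).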
}} + labels: + {{- include "testgen-services.labels" . | nindent 4 }} + {{- with .Values.serviceAccount.annotations }} + annotations: + {{- toYaml . | nindent 4 }} + {{- end }} +automountServiceAccountToken: {{ .Values.serviceAccount.automount }} +{{- end }} diff --git a/deploy/charts/testgen-services/templates/statefulset.yaml b/deploy/charts/testgen-services/templates/statefulset.yaml new file mode 100644 index 00000000..608ed230 --- /dev/null +++ b/deploy/charts/testgen-services/templates/statefulset.yaml @@ -0,0 +1,75 @@ +apiVersion: apps/v1 +kind: StatefulSet +metadata: + name: {{ include "testgen-services.fullname" . }} + labels: + {{- include "testgen-services.labels" . | nindent 4 }} +spec: + replicas: {{ .Values.postgres.replicaCount }} + selector: + matchLabels: + {{- include "testgen-services.selectorLabels" . | nindent 6 }} + template: + metadata: + {{- with .Values.podAnnotations }} + annotations: + {{- toYaml . | nindent 8 }} + {{- end }} + labels: + {{- include "testgen-services.labels" . | nindent 8 }} + {{- with .Values.podLabels }} + {{- toYaml . | nindent 8 }} + {{- end }} + spec: + {{- with .Values.postgres.imagePullSecrets }} + imagePullSecrets: + {{- toYaml . | nindent 8 }} + {{- end }} + serviceAccountName: {{ include "testgen-services.serviceAccountName" . 
}} + securityContext: + {{- toYaml .Values.podSecurityContext | nindent 8 }} + containers: + - name: {{ .Chart.Name }}-postgres + securityContext: + {{- toYaml .Values.securityContext | nindent 12 }} + image: "{{ .Values.postgres.image.repository }}:{{ .Values.postgres.image.tag }}" + imagePullPolicy: {{ .Values.postgres.image.pullPolicy }} + ports: + - containerPort: 5432 + name: postgres + env: + - name: POSTGRES_USER + value: {{ .Values.postgres.auth.user }} + - name: POSTGRES_DB + value: {{ .Values.postgres.auth.database }} + - name: POSTGRES_PASSWORD + valueFrom: + secretKeyRef: + name: {{ .Values.secret.name }} + key: {{ .Values.postgres.auth.passwordKey }} + volumeMounts: + - name: data + mountPath: /var/lib/postgresql/data + readinessProbe: + exec: + command: ["pg_isready", "-U", "{{ .Values.postgres.auth.user }}"] + initialDelaySeconds: 5 + periodSeconds: 10 + livenessProbe: + exec: + command: ["pg_isready", "-U", "{{ .Values.postgres.auth.user }}"] + initialDelaySeconds: 30 + periodSeconds: 10 + resources: + {{- toYaml .Values.postgres.resources | nindent 12 }} + volumeClaimTemplates: + - metadata: + name: data + spec: + accessModes: ["ReadWriteOnce"] + resources: + requests: + storage: {{ .Values.postgres.storage.size }} + {{- if .Values.postgres.storage.storageClass }} + storageClassName: "{{ .Values.postgres.storage.storageClass }}" + {{- end }} diff --git a/deploy/charts/testgen-services/values.yaml b/deploy/charts/testgen-services/values.yaml index af7ca7be..90c44114 100644 --- a/deploy/charts/testgen-services/values.yaml +++ b/deploy/charts/testgen-services/values.yaml @@ -2,13 +2,69 @@ # This is a YAML-formatted file. # Declare variables to be passed into your templates. -postgresql: - fullnameOverride: postgresql - auth: - database: "datakitchen" +nameOverride: "" +fullnameOverride: "postgres" + +serviceAccount: + # Specifies whether a service account should be created + create: true + # Automatically mount a ServiceAccount's API credentials? 
+ automount: true + # Annotations to add to the service account + annotations: {} + # The name of the service account to use. + # If not set and create is true, a name is generated using the fullname template + name: "" + +podAnnotations: {} +podLabels: {} + +podSecurityContext: {} + # fsGroup: 2000 + +securityContext: {} + # capabilities: + # drop: + # - ALL + # readOnlyRootFilesystem: true + # runAsNonRoot: true + # runAsUser: 1000 + +secret: + create: true + name: services-secret + +postgres: + replicaCount: 1 + + service: + type: ClusterIP + port: 5432 + image: - repository: bitnamilegacy/postgresql + repository: postgres + pullPolicy: IfNotPresent + tag: "14.1-alpine" + + imagePullSecrets: [] + + auth: + user: postgres + database: datakitchen + passwordKey: postgres-password + + storage: + size: 1Gi + storageClass: "" -global: - security: - allowInsecureImages: true + resources: {} + # We usually recommend not to specify default resources and to leave this as a conscious + # choice for the user. This also increases chances charts run on environments with little + # resources, such as Minikube. If you do want to specify resources, uncomment the following + # lines, adjust them as necessary, and remove the curly braces after 'resources:'. 
+ # limits: + # cpu: 100m + # memory: 128Mi + # requests: + # cpu: 100m + # memory: 128Mi diff --git a/deploy/install_linuxodbc.sh b/deploy/install_linuxodbc.sh index 9f585221..e0f4080d 100755 --- a/deploy/install_linuxodbc.sh +++ b/deploy/install_linuxodbc.sh @@ -1,7 +1,8 @@ #!/usr/bin/env sh # From: https://learn.microsoft.com/en-us/sql/connect/odbc/linux-mac/installing-the-microsoft-odbc-driver-for-sql-server -# modifications: Added --non-interactive and --no-cache flags, removed sudo, added aarch64 as an alias for arm64 +# modifications: added --non-interactive and --no-cache flags, removed sudo, added aarch64 as an alias for arm64, +# added certificate installation, isolated folder, replaced gpg --verify with gpgv architecture="unsupported" @@ -19,18 +20,30 @@ if [ "unsupported" = "$architecture" ]; then exit 1 fi -#Download the desired package(s) -curl -O https://download.microsoft.com/download/7/6/d/76de322a-d860-4894-9945-f0cc5d6a45f8/msodbcsql18_18.4.1.1-1_$architecture.apk -curl -O https://download.microsoft.com/download/7/6/d/76de322a-d860-4894-9945-f0cc5d6a45f8/mssql-tools18_18.4.1.1-1_$architecture.apk - -#(Optional) Verify signature, if 'gpg' is missing install it using 'apk add gnupg': -curl -O https://download.microsoft.com/download/7/6/d/76de322a-d860-4894-9945-f0cc5d6a45f8/msodbcsql18_18.4.1.1-1_$architecture.sig -curl -O https://download.microsoft.com/download/7/6/d/76de322a-d860-4894-9945-f0cc5d6a45f8/mssql-tools18_18.4.1.1-1_$architecture.sig - -curl https://packages.microsoft.com/keys/microsoft.asc | gpg --import - -gpg --verify msodbcsql18_18.4.1.1-1_$architecture.sig msodbcsql18_18.4.1.1-1_$architecture.apk -gpg --verify mssql-tools18_18.4.1.1-1_$architecture.sig mssql-tools18_18.4.1.1-1_$architecture.apk - -#Install the package(s) -apk add --no-cache --non-interactive --allow-untrusted msodbcsql18_18.4.1.1-1_$architecture.apk -apk add --no-cache --non-interactive --allow-untrusted mssql-tools18_18.4.1.1-1_$architecture.apk +( + set -e + 
tmpdir="$(mktemp -d)" + trap 'rm -rf "$tmpdir"' EXIT + cd "$tmpdir" + + # Recent Alpine versions lack the Microsoft certificate chain, so we download and install it manually + curl -fsSL -o cert.crt https://www.microsoft.com/pkiops/certs/Microsoft%20TLS%20G2%20ECC%20CA%20OCSP%2002.crt + openssl x509 -inform DER -in cert.crt -out /usr/local/share/ca-certificates/microsoft_tls_g2_ecc_ocsp_02.pem + update-ca-certificates + + # Download the desired packages + curl -O https://download.microsoft.com/download/9dcab408-e0d4-4571-a81a-5a0951e3445f/msodbcsql18_18.6.1.1-1_$architecture.apk + curl -O https://download.microsoft.com/download/b60bb8b6-d398-4819-9950-2e30cf725fb0/mssql-tools18_18.6.1.1-1_$architecture.apk + + # Verify signatures; if 'gpg' is missing, install it using 'apk add gnupg': + curl -O https://download.microsoft.com/download/9dcab408-e0d4-4571-a81a-5a0951e3445f/msodbcsql18_18.6.1.1-1_$architecture.sig + curl -O https://download.microsoft.com/download/b60bb8b6-d398-4819-9950-2e30cf725fb0/mssql-tools18_18.6.1.1-1_$architecture.sig + + curl https://packages.microsoft.com/keys/microsoft.asc | gpg --dearmor > microsoft.gpg + gpgv --keyring ./microsoft.gpg msodbcsql18_*.sig msodbcsql18_*.apk + gpgv --keyring ./microsoft.gpg mssql-tools18_*.sig mssql-tools18_*.apk + + # Install the packages + apk add --no-cache --allow-untrusted msodbcsql18_18.6.1.1-1_$architecture.apk + apk add --no-cache --allow-untrusted mssql-tools18_18.6.1.1-1_$architecture.apk +) diff --git a/deploy/testgen-base.dockerfile b/deploy/testgen-base.dockerfile index de45fcf7..d758a03f 100644 --- a/deploy/testgen-base.dockerfile +++ b/deploy/testgen-base.dockerfile @@ -1,4 +1,4 @@ -FROM python:3.12-alpine3.22 +FROM python:3.12-alpine3.23 ENV LANG=C.UTF-8 ENV LC_ALL=C.UTF-8 @@ -14,24 +14,22 @@ RUN apk update && apk upgrade && apk add --no-cache \ cmake \ musl-dev \ gfortran \ - linux-headers=6.14.2-r0 \ - # Tools needed for installing the MSSQL ODBC drivers \ + linux-headers=6.16.12-r0 \ + # Tools 
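The rewritten installer above wraps its work in a subshell with `set -e`, a throwaway directory, and an `EXIT` trap so partial downloads never pollute the filesystem. A minimal standalone sketch of that pattern (the file here is only a stand-in for the real downloads):

```shell
#!/usr/bin/env sh
# Sketch of the isolation pattern: fail fast, work in a temp dir,
# and clean it up even when a step errors out.
set -e
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
cd "$tmpdir"

# Stand-in for the curl downloads and signature checks
printf 'payload' > example.apk
ls -l example.apk
```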
needed for installing the MSSQL ODBC drivers curl \ gpg \ + gpgv \ + openssl \ # Additional libraries needed and their dev counterparts. We add both so that we can remove # the *-dev later, keeping the libraries - openblas=0.3.28-r0 \ - openblas-dev=0.3.28-r0 \ - unixodbc=2.3.12-r0 \ - unixodbc-dev=2.3.12-r0 \ + openblas=0.3.30-r2 \ + openblas-dev=0.3.30-r2 \ + unixodbc=2.3.14-r0 \ + unixodbc-dev=2.3.14-r0 \ + libarrow=21.0.0-r4 \ + apache-arrow-dev=21.0.0-r4 \ # Pinned versions for security - xz=5.8.1-r0 - -RUN apk add --no-cache \ - --repository https://dl-cdn.alpinelinux.org/alpine/v3.21/community \ - --repository https://dl-cdn.alpinelinux.org/alpine/v3.21/main \ - libarrow=18.1.0-r0 \ - apache-arrow-dev=18.1.0-r0 + xz=5.8.2-r0 COPY --chmod=775 ./deploy/install_linuxodbc.sh /tmp/dk/install_linuxodbc.sh RUN /tmp/dk/install_linuxodbc.sh @@ -39,6 +37,10 @@ RUN /tmp/dk/install_linuxodbc.sh # Install TestGen's main project empty pyproject.toml to install (and cache) the dependencies first COPY ./pyproject.toml /tmp/dk/pyproject.toml RUN mkdir /dk + +# Upgrading pip for security +RUN python3 -m pip install --upgrade pip==26.0 + RUN python3 -m pip install --prefix=/dk /tmp/dk RUN apk del \ @@ -46,9 +48,12 @@ RUN apk del \ g++ \ make \ cmake \ + curl \ musl-dev \ gfortran \ gpg \ + gpgv \ + openssl \ linux-headers \ openblas-dev \ unixodbc-dev \ diff --git a/deploy/testgen.dockerfile b/deploy/testgen.dockerfile index 4ff2ff94..5c4bb933 100644 --- a/deploy/testgen.dockerfile +++ b/deploy/testgen.dockerfile @@ -1,4 +1,4 @@ -ARG TESTGEN_BASE_LABEL=v9 +ARG TESTGEN_BASE_LABEL=v11 FROM datakitchen/dataops-testgen-base:${TESTGEN_BASE_LABEL} AS release-image @@ -17,6 +17,8 @@ COPY . 
/tmp/dk/ RUN python3 -m pip install --prefix=/dk /tmp/dk RUN rm -Rf /tmp/dk +RUN tg-patch-streamlit + RUN addgroup -S testgen && adduser -S testgen -G testgen # Streamlit has to be able to write to these dirs diff --git a/pyproject.toml b/pyproject.toml index 7267ca28..57285fc2 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -8,7 +8,7 @@ build-backend = "setuptools.build_meta" [project] name = "dataops-testgen" -version = "4.39.2" +version = "5.0.1" description = "DataKitchen's Data Quality DataOps TestGen" authors = [ { "name" = "DataKitchen, Inc.", "email" = "info@datakitchen.io" }, @@ -41,7 +41,7 @@ dependencies = [ "requests_extensions==1.1.3", "numpy==1.26.4", "pandas==2.1.4", - "streamlit==1.46.1", + "streamlit==1.53.0", "streamlit-extras==0.3.0", "streamlit-aggrid==0.3.4.post3", "plotly_express==0.4.1", @@ -62,9 +62,11 @@ dependencies = [ "cron-descriptor==2.0.5", "pybars3==0.9.7", "azure-identity==1.25.1", + "statsmodels==0.14.6", + "holidays~=0.89", # Pinned to match the manually compiled libs or for security - "pyarrow==18.1.0", + "pyarrow==21.0.0", "matplotlib==3.9.2", "scipy==1.14.1", "jinja2==3.1.6", @@ -92,12 +94,13 @@ release = [ [project.entry-points.console_scripts] testgen = "testgen.__main__:cli" +tg-patch-streamlit = "testgen.ui.scripts.patch_streamlit:patch" [project.urls] "Source Code" = "https://github.com/DataKitchen/dataops-testgen" "Bug Tracker" = "https://github.com/DataKitchen/dataops-testgen/issues" -"Documentation" = "https://docs.datakitchen.io/articles/#!dataops-testgen-help/dataops-testgen-help" -"Release Notes" = "https://docs.datakitchen.io/articles/#!dataops-testgen-help/testgen-release-notes" +"Documentation" = "https://docs.datakitchen.io/articles/dataops-testgen-help/dataops-testgen-help" +"Release Notes" = "https://docs.datakitchen.io/articles/dataops-testgen-help/testgen-release-notes" "Slack" = "https://data-observability-slack.datakitchen.io/join" "Homepage" = "https://example.com" @@ -107,6 +110,7 @@ 
include-package-data = true [tool.setuptools.package-data] "*" = ["*.toml", "*.sql", "*.yaml"] "testgen.template" = ["*.sql", "*.yaml", "**/*.sql", "**/*.yaml"] +"testgen.ui.static" = ["**/*.js", "**/*.css", "**/*.woff2"] "testgen.ui.assets" = ["*.svg", "*.png", "*.js", "*.css", "*.ico", "flavors/*.svg"] "testgen.ui.components.frontend" = ["*.html", "**/*.js", "**/*.css", "**/*.woff2", "**/*.svg"] @@ -274,3 +278,19 @@ push = false "pyproject.toml" = [ 'version = "{version}"', ] + +[[tool.streamlit.component.components]] +name = "table_group_wizard" +asset_dir = "ui/components/frontend/js" + +[[tool.streamlit.component.components]] +name = "edit_monitor_settings" +asset_dir = "ui/components/frontend/js" + +[[tool.streamlit.component.components]] +name = "table_monitoring_trends" +asset_dir = "ui/components/frontend/js" + +[[tool.streamlit.component.components]] +name = "edit_table_monitors" +asset_dir = "ui/components/frontend/js" diff --git a/setup.py b/setup.py new file mode 100644 index 00000000..16bcbedc --- /dev/null +++ b/setup.py @@ -0,0 +1,22 @@ +import os + +from setuptools import setup +from setuptools.command.build_py import build_py + +THIS_DIR = os.path.dirname(os.path.abspath(__file__)) +ROOT_TOML = os.path.abspath(os.path.join(THIS_DIR, "pyproject.toml")) + +class CustomBuildPy(build_py): + def run(self): + super().run() + target_toml = os.path.join(self.build_lib, "testgen", "pyproject.toml") + if os.path.exists(ROOT_TOML): + os.makedirs(os.path.dirname(target_toml), exist_ok=True) + self.copy_file(ROOT_TOML, target_toml) + + +setup( + cmdclass={ + "build_py": CustomBuildPy, + }, +) diff --git a/testgen/__main__.py b/testgen/__main__.py index 8463ab4f..62ae21c0 100644 --- a/testgen/__main__.py +++ b/testgen/__main__.py @@ -10,7 +10,6 @@ from click.core import Context from testgen import settings -from testgen.commands.run_generate_tests import run_test_gen_queries from testgen.commands.run_get_entities import ( run_get_results, run_get_test_suite, @@ 
-29,10 +28,11 @@ from testgen.commands.run_launch_db_config import run_launch_db_config from testgen.commands.run_observability_exporter import run_observability_exporter from testgen.commands.run_profiling import run_profiling -from testgen.commands.run_quick_start import run_quick_start, run_quick_start_increment +from testgen.commands.run_quick_start import run_monitor_increment, run_quick_start, run_quick_start_increment from testgen.commands.run_test_execution import run_test_execution from testgen.commands.run_test_metadata_exporter import run_test_metadata_exporter from testgen.commands.run_upgrade_db_config import get_schema_revision, is_db_revision_up_to_date, run_upgrade_db_config +from testgen.commands.test_generation import run_monitor_generation, run_test_generation from testgen.common import ( configure_logging, display_service, @@ -57,7 +57,7 @@ APP_MODULES = ["ui", "scheduler"] VERSION_DATA = version_service.get_version() - +CHILDREN_POLL_INTERVAL = 10 @dataclass class Configuration: @@ -133,18 +133,25 @@ def run_profile(table_group_id: str): @cli.command("run-test-generation", help="Generates or refreshes the tests for a table group.") +@click.option( + "-t", + "--test-suite-id", + required=False, + type=click.STRING, + help="ID of the test suite to generate. Use a test_suite_id shown in list-test-suites.", +) @click.option( "-tg", "--table-group-id", help="The identifier for the table group used during a profile run. Use a table_group_id shown in list-table-groups.", - required=True, + required=False, type=click.STRING, ) @click.option( "-ts", "--test-suite-key", help="The identifier for a test suite. Use a test_suite_key shown in list-test-suites.", - required=True, + required=False, type=click.STRING, ) @click.option( @@ -152,16 +159,37 @@ def run_profile(table_group_id: str): "--generation-set", help="A defined subset of tests to generate for your purpose. 
Use a generation_set defined for your project.", required=False, - default=None, + default="Standard", ) -@pass_configuration -def run_test_generation(configuration: Configuration, table_group_id: str, test_suite_key: str, generation_set: str): - LOG.info("CurrentStep: Generate Tests - Main Procedure") - message = run_test_gen_queries(table_group_id, test_suite_key, generation_set) - LOG.info("Current Step: Generate Tests - Main Procedure Complete") +@with_database_session +def run_generation(test_suite_id: str | None = None, table_group_id: str | None = None, test_suite_key: str | None = None, generation_set: str | None = None): + click.echo(f"run-test-generation for suite: {test_suite_id or test_suite_key}") + # For backward compatibility + if not test_suite_id: + test_suites = TestSuite.select_minimal_where( + TestSuite.table_groups_id == table_group_id, + TestSuite.test_suite == test_suite_key, + ) + if test_suites: + test_suite_id = test_suites[0].id + message = run_test_generation(test_suite_id, generation_set) click.echo("\n" + message) +@cli.command("run-monitor-generation", help="Generates or refreshes the monitors for a table group.") +@click.option( + "-t", + "--test-suite-id", + required=True, + type=click.STRING, + help="ID of the monitor suite to generate", +) +@with_database_session +def generate_monitors(test_suite_id: str): + click.echo(f"run-monitor-generation for suite: {test_suite_id}") + run_monitor_generation(test_suite_id, ["Freshness_Trend", "Volume_Trend", "Schema_Drift"]) + + @register_scheduler_job @cli.command("run-tests", help="Performs tests defined for a test suite.") @click.option( @@ -201,6 +229,22 @@ def run_tests(test_suite_id: str | None = None, project_key: str | None = None, click.echo("\n" + message) +@register_scheduler_job +@cli.command("run-monitors", help="Performs tests defined for a monitor suite.") +@click.option( + "-t", + "--test-suite-id", + required=True, + type=click.STRING, + help="ID of the monitor suite to 
run.", +) +@with_database_session +def run_monitors(test_suite_id: str): + click.echo(f"run-monitors for suite: {test_suite_id}") + message = run_test_execution(test_suite_id) + click.echo("\n" + message) + + @cli.command("list-profiles", help="Lists all profile runs for a table group.") @click.option( "-tg", @@ -384,7 +428,7 @@ def quick_start( click.echo("loading initial data") run_quick_start_increment(0) now_date = datetime.now(UTC) - time_delta = timedelta(days=-30) # 1 month ago + time_delta = timedelta(days=-35) # before the first monitor iteration (~34 days back) table_group_id = "0ea85e17-acbe-47fe-8394-9970725ad37d" test_suite_id = "9df7489d-92b3-49f9-95ca-512160d7896f" @@ -392,8 +436,8 @@ def quick_start( message = run_profiling(table_group_id, run_date=now_date + time_delta) click.echo("\n" + message) - LOG.info(f"run-test-generation with table_group_id: {table_group_id} test_suite: {settings.DEFAULT_TEST_SUITE_KEY}") - message = run_test_gen_queries(table_group_id, settings.DEFAULT_TEST_SUITE_KEY) + LOG.info(f"run-test-generation with test_suite_id: {test_suite_id}") + message = with_database_session(run_test_generation)(test_suite_id, "Standard") click.echo("\n" + message) run_test_execution(test_suite_id, run_date=now_date + time_delta) @@ -405,6 +449,22 @@ def quick_start( run_quick_start_increment(iteration) run_test_execution(test_suite_id, run_date=run_date) + monitor_iterations = 68 # ~5 weeks + monitor_interval = timedelta(hours=12) + monitor_test_suite_id = "823a1fef-9b6d-48d5-9d0f-2db9812cc318" + # Round down to nearest 12-hour mark (12:00 AM or 12:00 PM UTC) + now = datetime.now(UTC) + nearest_12h_mark = now.replace(hour=12 if now.hour >= 12 else 0, minute=0, second=0, microsecond=0) + monitor_run_date = nearest_12h_mark - monitor_interval * (monitor_iterations - 1) + weekday_morning_count = 0 + for iteration in range(1, monitor_iterations + 1): + click.echo(f"Running monitor iteration: {iteration} / {monitor_iterations}") + if 
monitor_run_date.weekday() < 5 and monitor_run_date.hour < 12:
+            weekday_morning_count += 1
+        run_monitor_increment(monitor_run_date, iteration, weekday_morning_count)
+        run_test_execution(monitor_test_suite_id, run_date=monitor_run_date)
+        monitor_run_date += monitor_interval
+
     click.echo("Quick start has successfully finished.")
@@ -646,7 +706,8 @@ def run_ui():
     use_ssl = os.path.isfile(settings.SSL_CERT_FILE) and os.path.isfile(settings.SSL_KEY_FILE)

-    patch_streamlit.patch(force=True)
+    if settings.IS_DEBUG:
+        patch_streamlit.patch(dev=True)

     @with_database_session
     def init_ui():
@@ -671,6 +732,7 @@ def init_ui():
         "--browser.gatherUsageStats=false",
         "--client.showErrorDetails=none",
         "--client.toolbarMode=minimal",
+        "--server.enableStaticServing=true",
         f"--server.sslCertFile={settings.SSL_CERT_FILE}" if use_ssl else "",
         f"--server.sslKeyFile={settings.SSL_KEY_FILE}" if use_ssl else "",
         "--",
@@ -715,8 +777,20 @@ def term_children(signum, _):
     signal.signal(signal.SIGINT, term_children)
     signal.signal(signal.SIGTERM, term_children)

-    for child in children:
-        child.wait()
+    terminating = False
+    while children:
+        try:
+            children[0].wait(CHILDREN_POLL_INTERVAL)
+        except subprocess.TimeoutExpired:
+            pass
+
+        # Iterate over a copy: removing items from the list being iterated skips elements
+        for child in children[:]:
+            if child.poll() is not None:
+                children.remove(child)
+                if not terminating:
+                    terminating = True
+                    term_children(signal.SIGTERM, None)
+

 if __name__ == "__main__":
diff --git a/testgen/commands/queries/execute_tests_query.py b/testgen/commands/queries/execute_tests_query.py
index d71572b6..90d67859 100644
--- a/testgen/commands/queries/execute_tests_query.py
+++ b/testgen/commands/queries/execute_tests_query.py
@@ -1,17 +1,28 @@
 import dataclasses
 from collections.abc import Iterable
-from datetime import datetime
+from datetime import date, datetime
 from typing import TypedDict
 from uuid import UUID

+import pandas as pd
+
 from testgen.common import read_template_sql_file
 from testgen.common.clean_sql import concat_columns
 from 
testgen.common.database.database_service import get_flavor_service, get_tg_schema, replace_params +from testgen.common.freshness_service import ( + count_excluded_minutes, + get_schedule_params, + is_excluded_day, + resolve_holiday_dates, +) from testgen.common.models.connection import Connection +from testgen.common.models.scheduler import JobSchedule from testgen.common.models.table_group import TableGroup from testgen.common.models.test_definition import TestRunType, TestScope from testgen.common.models.test_run import TestRun +from testgen.common.models.test_suite import TestSuite from testgen.common.read_file import replace_templated_functions +from testgen.utils import to_sql_timestamp @dataclasses.dataclass @@ -46,10 +57,12 @@ class TestExecutionDef(InputParameters): table_name: str column_name: str skip_errors: int + history_calculation: str custom_query: str + prediction: dict | str | None run_type: TestRunType test_scope: TestScope - template_name: str + template: str measure: str test_operator: str test_condition: str @@ -86,14 +99,27 @@ class TestExecutionSQL: "result_measure", ) - def __init__(self, connection: Connection, table_group: TableGroup, test_run: TestRun): + def __init__(self, connection: Connection, table_group: TableGroup, test_suite: TestSuite, test_run: TestRun): self.connection = connection self.table_group = table_group + self.test_suite = test_suite self.test_run = test_run - self.run_date = test_run.test_starttime.strftime("%Y-%m-%d %H:%M:%S") + self.run_date = test_run.test_starttime self.flavor = connection.sql_flavor self.flavor_service = get_flavor_service(self.flavor) + self._exclude_weekends = bool(self.test_suite.predict_exclude_weekends) + self._holiday_dates: set[date] | None = None + self._schedule_tz: str | None = None + if test_suite.is_monitor: + schedule = JobSchedule.get(JobSchedule.kwargs["test_suite_id"].astext == str(test_suite.id)) + self._schedule_tz = schedule.cron_tz or "UTC" if schedule else None + if 
test_suite.holiday_codes_list: + self._holiday_dates = resolve_holiday_dates( + test_suite.holiday_codes_list, + pd.DatetimeIndex([datetime(self.run_date.year - 1, 1, 1), datetime(self.run_date.year + 1, 12, 31)]), + ) + def _get_input_parameters(self, test_def: TestExecutionDef) -> str: return "; ".join( f"{field.name}={getattr(test_def, field.name)}" @@ -104,9 +130,10 @@ def _get_input_parameters(self, test_def: TestExecutionDef) -> str: def _get_params(self, test_def: TestExecutionDef | None = None) -> dict: quote = self.flavor_service.quote_character params = { + "TABLE_GROUPS_ID": self.table_group.id, "TEST_SUITE_ID": self.test_run.test_suite_id, "TEST_RUN_ID": self.test_run.id, - "RUN_DATE": self.run_date, + "RUN_DATE": to_sql_timestamp(self.run_date), "SQL_FLAVOR": self.flavor, "VARCHAR_TYPE": self.flavor_service.varchar_type, "QUOTE": quote, @@ -118,22 +145,24 @@ def _get_params(self, test_def: TestExecutionDef | None = None) -> dict: "TEST_DEFINITION_ID": test_def.id, "APP_SCHEMA_NAME": get_tg_schema(), "SCHEMA_NAME": test_def.schema_name, - "TABLE_GROUPS_ID": self.table_group.id, "TABLE_NAME": test_def.table_name, "COLUMN_NAME": f"{quote}{test_def.column_name or ''}{quote}", "COLUMN_NAME_NO_QUOTES": test_def.column_name, "CONCAT_COLUMNS": concat_columns(test_def.column_name, self.null_value) if test_def.column_name else "", "SKIP_ERRORS": test_def.skip_errors or 0, + "CUSTOM_QUERY": test_def.custom_query, "BASELINE_CT": test_def.baseline_ct, "BASELINE_UNIQUE_CT": test_def.baseline_unique_ct, "BASELINE_VALUE": test_def.baseline_value, "BASELINE_VALUE_CT": test_def.baseline_value_ct, - "THRESHOLD_VALUE": test_def.threshold_value, + "THRESHOLD_VALUE": test_def.threshold_value or 0, "BASELINE_SUM": test_def.baseline_sum, "BASELINE_AVG": test_def.baseline_avg, "BASELINE_SD": test_def.baseline_sd, - "LOWER_TOLERANCE": test_def.lower_tolerance, - "UPPER_TOLERANCE": test_def.upper_tolerance, + "LOWER_TOLERANCE": "NULL" if test_def.lower_tolerance in (None, "") 
else test_def.lower_tolerance,
+            "UPPER_TOLERANCE": "NULL" if test_def.upper_tolerance in (None, "") else test_def.upper_tolerance,
+            # SUBSET_CONDITION should be replaced after CUSTOM_QUERY
+            # since the latter may contain the former
             "SUBSET_CONDITION": test_def.subset_condition or "1=1",
             "GROUPBY_NAMES": test_def.groupby_names,
             "HAVING_CONDITION": f"HAVING {test_def.having_condition}" if test_def.having_condition else "",
@@ -146,10 +175,35 @@
             "MATCH_GROUPBY_NAMES": test_def.match_groupby_names,
             "CONCAT_MATCH_GROUPBY": concat_columns(test_def.match_groupby_names, self.null_value) if test_def.match_groupby_names else "",
             "MATCH_HAVING_CONDITION": f"HAVING {test_def.match_having_condition}" if test_def.match_having_condition else "",
-            "CUSTOM_QUERY": test_def.custom_query,
             "COLUMN_TYPE": test_def.column_type,
             "INPUT_PARAMETERS": self._get_input_parameters(test_def),
         })
+
+        # Freshness exclusion params - computed per test at execution time
+        if test_def.test_type == "Freshness_Trend" and test_def.baseline_sum:
+            sched = get_schedule_params(test_def.prediction)
+            has_exclusions = self._exclude_weekends or sched.excluded_days or sched.window_start is not None
+            if has_exclusions:
+                last_update = pd.Timestamp(test_def.baseline_sum)
+                excluded = int(count_excluded_minutes(
+                    last_update, self.run_date, self._exclude_weekends, self._holiday_dates,
+                    tz=self._schedule_tz, excluded_days=sched.excluded_days,
+                    window_start=sched.window_start, window_end=sched.window_end,
+                ))
+                is_excl = 1 if is_excluded_day(
+                    pd.Timestamp(self.run_date), self._exclude_weekends, self._holiday_dates,
+                    tz=self._schedule_tz, excluded_days=sched.excluded_days,
+                    window_start=sched.window_start, window_end=sched.window_end,
+                ) else 0
+                params["EXCLUDED_MINUTES"] = excluded
+                params["IS_EXCLUDED_DAY"] = is_excl
+            else:
+                params["EXCLUDED_MINUTES"] = 0
+                params["IS_EXCLUDED_DAY"] = 0
+        else:
+            params["EXCLUDED_MINUTES"] = 0
+            params["IS_EXCLUDED_DAY"] = 0
+
         return params

     def _get_query(
@@ -171,6 +225,14 @@ def _get_query(
             query = query.replace(":", "\\:")

         return query, None if no_bind else params
+
+    def has_schema_changes(self) -> tuple[str, dict]:
+        # Runs on App database
+        return self._get_query("has_schema_changes.sql")
+
+    def get_errored_autogen_monitors(self) -> tuple[str, dict]:
+        # Runs on App database
+        return self._get_query("get_errored_autogen_monitors.sql")

     def get_active_test_definitions(self) -> tuple[dict]:
         # Runs on App database
@@ -212,21 +274,28 @@ def disable_invalid_test_definitions(self) -> tuple[str, dict]:
         # Runs on App database
         return self._get_query("disable_invalid_test_definitions.sql")

-    def update_historic_thresholds(self) -> tuple[str, dict]:
+    def update_history_calc_thresholds(self) -> tuple[str, dict]:
         # Runs on App database
-        return self._get_query("update_historic_thresholds.sql")
+        return self._get_query("update_history_calc_thresholds.sql")

     def run_query_test(self, test_def: TestExecutionDef) -> tuple[str, dict]:
         # Runs on Target database
-        folder = "generic" if test_def.template_name.endswith("_generic.sql") else self.flavor
-        return self._get_query(
-            test_def.template_name,
-            f"flavors/{folder}/exec_query_tests",
-            no_bind=True,
-            # Final replace in CUSTOM_QUERY
-            extra_params={"DATA_SCHEMA": test_def.schema_name},
-            test_def=test_def,
-        )
+        if test_def.template.startswith("@"):
+            folder = "generic" if test_def.template.endswith("_generic.sql") else self.flavor
+            return self._get_query(
+                test_def.template,
+                f"flavors/{folder}/exec_query_tests",
+                no_bind=True,
+                # Final replace in CUSTOM_QUERY
+                extra_params={"DATA_SCHEMA": test_def.schema_name},
+                test_def=test_def,
+            )
+        else:
+            query = test_def.template
+            params = self._get_params(test_def)
+            params.update({"DATA_SCHEMA": test_def.schema_name})
+            query = replace_params(query, params)
+            return query, params

     def aggregate_cat_tests(
         self,
@@ -246,9 +315,18 @@ def aggregate_cat_tests(
measure = replace_templated_functions(measure, self.flavor) td.measure_expression = f"COALESCE(CAST({measure} AS {varchar_type}) {concat_operator} '|', '{self.null_value}|')" - condition = replace_params(f"{td.measure}{td.test_operator}{td.test_condition}", params) - condition = replace_templated_functions(condition, self.flavor) - td.condition_expression = f"CASE WHEN {condition} THEN '0,' ELSE '1,' END" + # For prediction mode, return -1 during training period + if td.history_calculation == "PREDICT" and (td.lower_tolerance in (None, "") or td.upper_tolerance in (None, "")): + td.condition_expression = "'-1,'" + else: + condition = ( + f"{td.measure} {td.test_operator} {td.test_condition}" + if "BETWEEN" in td.test_operator + else f"{td.measure}{td.test_operator}{td.test_condition}" + ) + condition = replace_params(condition, params) + condition = replace_templated_functions(condition, self.flavor) + td.condition_expression = f"CASE WHEN {condition} THEN '0,' ELSE '1,' END" aggregate_queries: list[tuple[str, None]] = [] aggregate_test_defs: list[list[TestExecutionDef]] = [] diff --git a/testgen/commands/queries/generate_tests_query.py b/testgen/commands/queries/generate_tests_query.py deleted file mode 100644 index cece2d3e..00000000 --- a/testgen/commands/queries/generate_tests_query.py +++ /dev/null @@ -1,111 +0,0 @@ -import logging -from datetime import UTC, datetime -from typing import ClassVar, TypedDict - -from testgen.common import CleanSQL, read_template_sql_file -from testgen.common.database.database_service import get_flavor_service, replace_params -from testgen.common.read_file import get_template_files - -LOG = logging.getLogger("testgen") - -class GenTestParams(TypedDict): - test_type: str - selection_criteria: str - default_parm_columns: str - default_parm_values: str - - -class CDeriveTestsSQL: - run_date = "" - project_code = "" - connection_id = "" - table_groups_id = "" - data_schema = "" - test_suite = "" - test_suite_id = "" - generation_set = 
"" - as_of_date = "" - sql_flavor = "" - gen_test_params: ClassVar[GenTestParams] = {} - - _use_clean = False - - def __init__(self, flavor): - self.sql_flavor = flavor - self.flavor_service = get_flavor_service(flavor) - - today = datetime.now(UTC).strftime("%Y-%m-%d %H:%M:%S") - self.run_date = today - self.as_of_date = today - - def _get_params(self) -> dict: - return { - **{key.upper(): value for key, value in self.gen_test_params.items()}, - "PROJECT_CODE": self.project_code, - "SQL_FLAVOR": self.sql_flavor, - "CONNECTION_ID": self.connection_id, - "TABLE_GROUPS_ID": self.table_groups_id, - "RUN_DATE": self.run_date, - "TEST_SUITE": self.test_suite, - "TEST_SUITE_ID": self.test_suite_id, - "GENERATION_SET": self.generation_set, - "AS_OF_DATE": self.as_of_date, - "DATA_SCHEMA": self.data_schema, - "QUOTE": self.flavor_service.quote_character, - } - - def _get_query(self, template_file_name: str, sub_directory: str | None = "generation") -> tuple[str, dict]: - query = read_template_sql_file(template_file_name, sub_directory) - params = self._get_params() - query = replace_params(query, params) - if self._use_clean: - query = CleanSQL(query) - return query, params - - def GetInsertTestSuiteSQL(self) -> tuple[str, dict]: - # Runs on App database - return self._get_query("gen_insert_test_suite.sql") - - def GetTestTypesSQL(self) -> tuple[str, dict]: - # Runs on App database - return self._get_query("gen_standard_test_type_list.sql") - - def GetTestDerivationQueriesAsList(self, template_directory: str) -> list[tuple[str, dict]]: - # Runs on App database - generic_template_directory = template_directory - flavor_template_directory = f"flavors.{self.sql_flavor}.{template_directory}" - - query_templates = {} - try: - for query_file in get_template_files(r"^.*sql$", generic_template_directory): - query_templates[query_file.name] = generic_template_directory - except: - LOG.debug( - f"query template '{generic_template_directory}' directory does not exist", - 
exc_info=True, - stack_info=True, - ) - - try: - for query_file in get_template_files(r"^.*sql$", flavor_template_directory): - query_templates[query_file.name] = flavor_template_directory - except: - LOG.debug( - f"query template '{generic_template_directory}' directory does not exist", - exc_info=True, - stack_info=True, - ) - - queries = [] - for filename, sub_directory in query_templates.items(): - queries.append(self._get_query(filename, sub_directory=sub_directory)) - - return queries - - def GetTestQueriesFromGenericFile(self) -> tuple[str, dict]: - # Runs on App database - return self._get_query("gen_standard_tests.sql") - - def GetDeleteOldTestsQuery(self) -> tuple[str, dict]: - # Runs on App database - return self._get_query("gen_delete_old_tests.sql") diff --git a/testgen/commands/queries/profiling_query.py b/testgen/commands/queries/profiling_query.py index a5fd7ba0..c1ec78fe 100644 --- a/testgen/commands/queries/profiling_query.py +++ b/testgen/commands/queries/profiling_query.py @@ -1,14 +1,14 @@ import dataclasses -import re from uuid import UUID from testgen.commands.queries.refresh_data_chars_query import ColumnChars from testgen.common import read_template_sql_file, read_template_yaml_file -from testgen.common.database.database_service import replace_params +from testgen.common.database.database_service import process_conditionals, replace_params from testgen.common.models.connection import Connection from testgen.common.models.profiling_run import ProfilingRun from testgen.common.models.table_group import TableGroup from testgen.common.read_file import replace_templated_functions +from testgen.utils import to_sql_timestamp @dataclasses.dataclass @@ -58,7 +58,7 @@ def __init__(self, connection: Connection, table_group: TableGroup, profiling_ru self.connection = connection self.table_group = table_group self.profiling_run = profiling_run - self.run_date = profiling_run.profiling_starttime.strftime("%Y-%m-%d %H:%M:%S") + self.run_date = 
profiling_run.profiling_starttime self.flavor = connection.sql_flavor self._profiling_template: dict = None @@ -68,7 +68,7 @@ def _get_params(self, column_chars: ColumnChars | None = None, table_sampling: T "CONNECTION_ID": self.connection.connection_id, "TABLE_GROUPS_ID": self.table_group.id, "PROFILE_RUN_ID": self.profiling_run.id, - "RUN_DATE": self.run_date, + "RUN_DATE": to_sql_timestamp(self.run_date), "SQL_FLAVOR": self.flavor, "DATA_SCHEMA": self.table_group.table_group_schema, "PROFILE_ID_COLUMN_MASK": self.table_group.profile_id_column_mask, @@ -106,7 +106,7 @@ def _get_query( params = {} if query: - query = self._process_conditionals(query, extra_params) + query = process_conditionals(query, extra_params) params.update(self._get_params(column_chars, table_sampling)) if extra_params: params.update(extra_params) @@ -116,32 +116,6 @@ def _get_query( return query, params - def _process_conditionals(self, query: str, extra_params: dict | None = None) -> str: - re_pattern = re.compile(r"^--\s+TG-(IF|ELSE|ENDIF)(?:\s+(\w+))?\s*$") - condition = None - updated_query = [] - for line in query.splitlines(True): - if re_match := re_pattern.match(line): - match re_match.group(1): - case "IF" if condition is None and (variable := re_match.group(2)) is not None: - result = extra_params.get(variable) - if result is None: - result = getattr(self, variable, None) - condition = bool(result) - case "ELSE" if condition is not None: - condition = not condition - case "ENDIF" if condition is not None: - condition = None - case _: - raise ValueError("Template conditional misused") - elif condition is not False: - updated_query.append(line) - - if condition is not None: - raise ValueError("Template conditional misused") - - return "".join(updated_query) - def _get_profiling_template(self) -> dict: if not self._profiling_template: self._profiling_template = read_template_yaml_file( diff --git a/testgen/commands/queries/refresh_data_chars_query.py 
b/testgen/commands/queries/refresh_data_chars_query.py index 9ef02506..1df6e994 100644 --- a/testgen/commands/queries/refresh_data_chars_query.py +++ b/testgen/commands/queries/refresh_data_chars_query.py @@ -6,7 +6,7 @@ from testgen.common.database.database_service import get_flavor_service, replace_params from testgen.common.models.connection import Connection from testgen.common.models.table_group import TableGroup -from testgen.utils import chunk_queries +from testgen.utils import chunk_queries, to_sql_timestamp @dataclasses.dataclass @@ -20,6 +20,8 @@ class ColumnChars: db_data_type: str = None is_decimal: bool = False approx_record_ct: int = None + # This should not default to 0 since we don't always retrieve actual row counts + # UI relies on the null value to know that the approx_record_ct should be displayed instead record_ct: int = None @@ -132,7 +134,7 @@ def get_staging_data_chars(self, data_chars: list[ColumnChars], run_date: dateti return [ [ self.table_group.id, - run_date, + to_sql_timestamp(run_date), column.schema_name, column.table_name, column.column_name, @@ -146,9 +148,9 @@ def get_staging_data_chars(self, data_chars: list[ColumnChars], run_date: dateti for column in data_chars ] - def update_data_chars(self, run_date: str) -> list[tuple[str, dict]]: + def update_data_chars(self, run_date: datetime) -> list[tuple[str, dict]]: # Runs on App database - params = {"RUN_DATE": run_date} + params = {"RUN_DATE": to_sql_timestamp(run_date)} return [ self._get_query("data_chars_update.sql", extra_params=params), self._get_query("data_chars_staging_delete.sql", extra_params=params), diff --git a/testgen/commands/run_generate_tests.py b/testgen/commands/run_generate_tests.py deleted file mode 100644 index 71b48491..00000000 --- a/testgen/commands/run_generate_tests.py +++ /dev/null @@ -1,93 +0,0 @@ -import logging - -from testgen import settings -from testgen.commands.queries.generate_tests_query import CDeriveTestsSQL -from testgen.common import 
execute_db_queries, fetch_dict_from_db, get_test_generation_params, set_target_db_params -from testgen.common.mixpanel_service import MixpanelService -from testgen.common.models import with_database_session -from testgen.common.models.connection import Connection - -LOG = logging.getLogger("testgen") - - -@with_database_session -def run_test_gen_queries(table_group_id: str, test_suite: str, generation_set: str | None = None): - if table_group_id is None: - raise ValueError("Table Group ID was not specified") - - LOG.info("CurrentStep: Assigning Connection Parameters") - connection = Connection.get_by_table_group(table_group_id) - set_target_db_params(connection.__dict__) - - clsTests = CDeriveTestsSQL(connection.sql_flavor) - - LOG.info(f"CurrentStep: Retrieving General Parameters for Test Suite {test_suite}") - params = get_test_generation_params(table_group_id, test_suite) - - - # Set static parms - clsTests.project_code = params["project_code"] - clsTests.test_suite = test_suite - clsTests.generation_set = generation_set if generation_set is not None else "" - clsTests.test_suite_id = params["test_suite_id"] if params["test_suite_id"] else "" - clsTests.connection_id = str(connection.connection_id) - clsTests.table_groups_id = table_group_id - clsTests.data_schema = params["table_group_schema"] - if params["profiling_as_of_date"] is not None: - clsTests.as_of_date = params["profiling_as_of_date"].strftime("%Y-%m-%d %H:%M:%S") - - if params["test_suite_id"]: - clsTests.test_suite_id = params["test_suite_id"] - else: - LOG.info("CurrentStep: Creating new Test Suite") - insert_ids, _ = execute_db_queries([clsTests.GetInsertTestSuiteSQL()]) - clsTests.test_suite_id = insert_ids[0] - - LOG.info("CurrentStep: Compiling Test Gen Queries") - - lstFunnyTemplateQueries = clsTests.GetTestDerivationQueriesAsList("gen_funny_cat_tests") - lstQueryTemplateQueries = clsTests.GetTestDerivationQueriesAsList("gen_query_tests") - lstGenericTemplateQueries = [] - - # Delete old 
Tests - deleteQuery = clsTests.GetDeleteOldTestsQuery() - - # Retrieve test_types as parms from list of dictionaries: test_type, selection_criteria, default_parm_columns, - # default_parm_values - lstTestTypes = fetch_dict_from_db(*clsTests.GetTestTypesSQL()) - - if lstTestTypes is None: - raise ValueError("Test Type Parameters not found") - elif ( - lstTestTypes[0]["test_type"] == "" - or lstTestTypes[0]["selection_criteria"] == "" - or lstTestTypes[0]["default_parm_columns"] == "" - or lstTestTypes[0]["default_parm_values"] == "" - ): - raise ValueError("Test Type parameters not correctly set") - - lstGenericTemplateQueries = [] - for dctTestParms in lstTestTypes: - clsTests.gen_test_params = dctTestParms - lstGenericTemplateQueries.append(clsTests.GetTestQueriesFromGenericFile()) - - LOG.info("TestGen CAT Queries were compiled") - - # Make sure delete, then generic templates run before the funny templates - lstQueries = [deleteQuery, *lstGenericTemplateQueries, *lstFunnyTemplateQueries, *lstQueryTemplateQueries] - - if lstQueries: - LOG.info("Running Test Generation Template Queries") - execute_db_queries(lstQueries) - message = "Test generation completed successfully." - else: - message = "No TestGen Queries were compiled." 
- - MixpanelService().send_event( - "generate-tests", - source=settings.ANALYTICS_JOB_SOURCE, - sql_flavor=clsTests.sql_flavor, - generation_set=clsTests.generation_set, - ) - - return message diff --git a/testgen/commands/run_launch_db_config.py b/testgen/commands/run_launch_db_config.py index 83da5d09..0d926fbe 100644 --- a/testgen/commands/run_launch_db_config.py +++ b/testgen/commands/run_launch_db_config.py @@ -53,7 +53,6 @@ def _get_params_mapping() -> dict: "TABLE_GROUPS_NAME": settings.DEFAULT_TABLE_GROUPS_NAME, "TEST_SUITE": settings.DEFAULT_TEST_SUITE_KEY, "TEST_SUITE_DESCRIPTION": settings.DEFAULT_TEST_SUITE_DESCRIPTION, - "MONITOR_TEST_SUITE": settings.DEFAULT_MONITOR_TEST_SUITE_KEY, "MAX_THREADS": settings.PROJECT_CONNECTION_MAX_THREADS, "MAX_QUERY_CHARS": settings.PROJECT_CONNECTION_MAX_QUERY_CHAR, "OBSERVABILITY_API_URL": settings.OBSERVABILITY_API_URL, diff --git a/testgen/commands/run_observability_exporter.py b/testgen/commands/run_observability_exporter.py index b8f966b9..71179e9d 100644 --- a/testgen/commands/run_observability_exporter.py +++ b/testgen/commands/run_observability_exporter.py @@ -15,6 +15,7 @@ execute_db_queries, fetch_dict_from_db, ) +from testgen.common.models import with_database_session from testgen.common.models.test_suite import TestSuite LOG = logging.getLogger("testgen") @@ -268,11 +269,13 @@ def _get_input_parameters(input_parameters): is_first = False elif len(items) == item_number: # is last value = item - ret.append({"name": name.strip(), "value": value.strip()}) + if value.strip(): + ret.append({"name": name.strip(), "value": value.strip()}) else: words = item.split(",") value = ",".join(words[:-1]) # everything but the last word - ret.append({"name": name.strip(), "value": value.strip()}) + if value.strip(): + ret.append({"name": name.strip(), "value": value.strip()}) name = words[-1] # the last word is the next name return ret @@ -309,6 +312,7 @@ def export_test_results(test_suite_id): 
mark_exported_results(test_suite_id, updated_ids) +@with_database_session def run_observability_exporter(project_code, test_suite): LOG.info("CurrentStep: Observability Export - Test Results") test_suites = TestSuite.select_minimal_where( diff --git a/testgen/commands/run_profiling.py b/testgen/commands/run_profiling.py index 3764f584..c97ec695 100644 --- a/testgen/commands/run_profiling.py +++ b/testgen/commands/run_profiling.py @@ -9,10 +9,9 @@ from testgen.commands.queries.profiling_query import HygieneIssueType, ProfilingSQL, TableSampling from testgen.commands.queries.refresh_data_chars_query import ColumnChars from testgen.commands.queries.rollup_scores_query import RollupScoresSQL -from testgen.commands.run_generate_tests import run_test_gen_queries from testgen.commands.run_refresh_data_chars import run_data_chars_refresh from testgen.commands.run_refresh_score_cards_results import run_refresh_score_cards_results -from testgen.commands.run_test_execution import run_test_execution_in_background +from testgen.commands.test_generation import run_monitor_generation, run_test_generation from testgen.common import ( execute_db_queries, fetch_dict_from_db, @@ -114,11 +113,8 @@ def run_profiling(table_group_id: str | UUID, username: str | None = None, run_d profiling_run.save() send_profiling_run_notifications(profiling_run) - _rollup_profiling_scores(profiling_run, table_group) - - if bool(table_group.monitor_test_suite_id) and not table_group.last_complete_profile_run_id: - _generate_monitor_tests(table_group_id, table_group.monitor_test_suite_id) + _generate_tests(table_group) finally: MixpanelService().send_event( "run-profiling", @@ -254,6 +250,7 @@ def update_frequency_progress(progress: ThreadedProgress) -> None: LOG.info("Updating profiling results with frequency analysis and deleting staging") execute_db_queries(sql_generator.update_frequency_analysis_results()) except Exception as e: + LOG.exception("Error running frequency analysis") 
profiling_run.set_progress("freq_analysis", "Warning", error=f"Error encountered. {get_exception_message(e)}") else: if error_data: @@ -294,6 +291,7 @@ def _run_hygiene_issue_detection(sql_generator: ProfilingSQL) -> None: ] ) except Exception as e: + LOG.exception("Error detecting hygiene issues") profiling_run.set_progress("hygiene_issues", "Warning", error=f"Error encountered. {get_exception_message(e)}") else: profiling_run.set_progress("hygiene_issues", "Completed") @@ -315,14 +313,24 @@ def _rollup_profiling_scores(profiling_run: ProfilingRun, table_group: TableGrou @with_database_session -def _generate_monitor_tests(table_group_id: str, test_suite_id: str) -> None: - try: - monitor_test_suite = TestSuite.get(test_suite_id) - if not monitor_test_suite: - LOG.info("Skipping test generation on missing monitor test suite") - else: - LOG.info("Generating monitor tests") - run_test_gen_queries(table_group_id, monitor_test_suite.test_suite, "Monitor") - run_test_execution_in_background(test_suite_id) - except Exception: - LOG.exception("Error generating monitor tests") +def _generate_tests(table_group: TableGroup) -> None: + is_first_profile_run = not table_group.last_complete_profile_run_id + + if bool(table_group.monitor_test_suite_id): + monitor_suite = TestSuite.get(table_group.monitor_test_suite_id) + try: + run_monitor_generation( + table_group.monitor_test_suite_id, + # Only Freshness depends on profiling results + ["Freshness_Trend"], + # Insert for new tables only, if user disabled regeneration + mode="upsert" if is_first_profile_run or monitor_suite.monitor_regenerate_freshness else "insert", + ) + except Exception: + LOG.exception("Error generating Freshness monitors") + + if is_first_profile_run and bool(table_group.default_test_suite_id): + try: + run_test_generation(table_group.default_test_suite_id, "Standard") + except Exception: + LOG.exception(f"Error generating tests for test suite: {table_group.default_test_suite_id}") diff --git 
a/testgen/commands/run_quick_start.py b/testgen/commands/run_quick_start.py index 5c9ea325..f1885c69 100644 --- a/testgen/commands/run_quick_start.py +++ b/testgen/commands/run_quick_start.py @@ -1,24 +1,31 @@ import logging +import math +import random +from datetime import datetime +from typing import Any import click from testgen import settings from testgen.commands.run_launch_db_config import get_app_db_params_mapping, run_launch_db_config +from testgen.commands.test_generation import run_monitor_generation from testgen.common.credentials import get_tg_schema from testgen.common.database.database_service import ( + apply_params, create_database, execute_db_queries, - replace_params, set_target_db_params, ) from testgen.common.database.flavor.flavor_service import ConnectionParams from testgen.common.models import with_database_session from testgen.common.models.scores import ScoreDefinition +from testgen.common.models.settings import PersistedSetting from testgen.common.models.table_group import TableGroup +from testgen.common.notifications.base import smtp_configured from testgen.common.read_file import read_template_sql_file LOG = logging.getLogger("testgen") - +random.seed(42) def _get_max_date(iteration: int): if iteration == 0: @@ -85,7 +92,7 @@ def _prepare_connection_to_target_database(params_mapping): set_target_db_params(connection_params) -def _get_params_mapping(iteration: int = 0) -> dict: +def _get_settings_params_mapping() -> dict: return { "TESTGEN_ADMIN_USER": settings.DATABASE_ADMIN_USER, "TESTGEN_ADMIN_PASSWORD": settings.DATABASE_ADMIN_PASSWORD, @@ -96,6 +103,12 @@ def _get_params_mapping(iteration: int = 0) -> dict: "PROJECT_DB_HOST": settings.PROJECT_DATABASE_HOST, "PROJECT_DB_PORT": settings.PROJECT_DATABASE_PORT, "SQL_FLAVOR": settings.PROJECT_SQL_FLAVOR, + } + + +def _get_quick_start_params_mapping(iteration: int = 0) -> dict: + return { + **_get_settings_params_mapping(), "MAX_SUPPLIER_ID_SEQ": _get_max_supplierid_seq(iteration), 
"MAX_PRODUCT_ID_SEQ": _get_max_productid_seq(iteration), "MAX_CUSTOMER_ID_SEQ": _get_max_customerid_seq(iteration), @@ -104,9 +117,62 @@ def _get_params_mapping(iteration: int = 0) -> dict: } +def _metric_cumulative_shift(iteration: int) -> tuple[float, float]: + """Compute cumulative metric shifts at a given iteration for Metric_Trend monitors. + + Returns (discount_shift, price_shift) β€” the total shift from baseline + that should be applied to the underlying data at this iteration. + Uses composite sine waves for organic-looking oscillation patterns. + """ + i = iteration + discount = -1.0 + 1.8 * math.sin(2 * math.pi * i / 14 + math.pi) + 0.7 * math.sin(2 * math.pi * i / 6 + math.pi + 0.5) + price = 80 * math.sin(2 * math.pi * i / 16) + 40 * math.sin(2 * math.pi * i / 7 + 0.3) + 100 + return discount, price + + +def _get_monitor_params_mapping(run_date: datetime, iteration: int = 0, weekday_morning_count: int = 0) -> dict: + # Volume: linear growth with jitter, spike at specific iteration for anomaly + if iteration == 60: + new_sales = 100 + else: + new_sales = random.randint(5, 15) # noqa: S311 + + # Freshness: weekday morning updates with 1-day outage after schedule goes active + is_weekday = run_date.weekday() < 5 + is_morning = run_date.hour < 12 + is_outage = weekday_morning_count == 21 + is_update_suppliers_iter = is_weekday and is_morning and not is_outage + + # Metrics: compute deltas for discount and price shifts + curr_discount, curr_price = _metric_cumulative_shift(iteration) + prev_discount, prev_price = _metric_cumulative_shift(iteration - 1) if iteration > 1 else (0.0, 0.0) + discount_delta = round(curr_discount - prev_discount, 3) + price_delta = round(curr_price - prev_price, 2) + + return { + **_get_settings_params_mapping(), + "ITERATION_NUMBER": iteration, + "RUN_DATE": run_date, + "NEW_SALES": new_sales, + "IS_ADD_CUSTOMER_COL_ITER": iteration == 47, + "IS_DELETE_CUSTOMER_COL_ITER": iteration == 58, + "IS_UPDATE_PRODUCT_ITER": not 24 < 
iteration < 28, + "IS_CREATE_RETURNS_TABLE_ITER": iteration == 52, + "IS_DELETE_CUSTOMER_ITER": iteration in (29, 36, 55), + "IS_UPDATE_SUPPLIERS_ITER": is_update_suppliers_iter, + "DISCOUNT_DELTA": discount_delta, + "PRICE_DELTA": price_delta, + } + + +def _get_quick_start_query(template_file_name: str, params: dict[str, Any]) -> tuple[str, dict[str, Any]]: + template = read_template_sql_file(template_file_name, "quick_start") + return apply_params(template, params), params + + def run_quick_start(delete_target_db: bool) -> None: # Init - params_mapping = _get_params_mapping() + params_mapping = _get_quick_start_params_mapping() _prepare_connection_to_target_database(params_mapping) # Create DB @@ -124,16 +190,18 @@ def run_quick_start(delete_target_db: bool) -> None: app_db_params = get_app_db_params_mapping() execute_db_queries( [ - (replace_params(read_template_sql_file("initial_data_seeding.sql", "quick_start"), app_db_params), app_db_params), + _get_quick_start_query("initial_data_seeding.sql", app_db_params), ], ) + with_database_session(_setup_initial_config)() + # Schema and Populate target db click.echo(f"Populating target db : {target_db_name}") execute_db_queries( [ - (replace_params(read_template_sql_file("recreate_target_data_schema.sql", "quick_start"), params_mapping), params_mapping), - (replace_params(read_template_sql_file("populate_target_data.sql", "quick_start"), params_mapping), params_mapping), + _get_quick_start_query("recreate_target_data_schema.sql", params_mapping), + _get_quick_start_query("populate_target_data.sql", params_mapping), ], use_target_db=True, ) @@ -145,10 +213,15 @@ def run_quick_start(delete_target_db: bool) -> None: ) ) with_database_session(score_definition.save)() + with_database_session(run_monitor_generation)("823a1fef-9b6d-48d5-9d0f-2db9812cc318", ["Volume_Trend", "Schema_Drift"]) + + +def _setup_initial_config(): + PersistedSetting.set("SMTP_CONFIGURED", smtp_configured()) def run_quick_start_increment(iteration): 
- params_mapping = _get_params_mapping(iteration) + params_mapping = _get_quick_start_params_mapping(iteration) _prepare_connection_to_target_database(params_mapping) target_db_name = params_mapping["PROJECT_DB"] @@ -156,14 +229,29 @@ def run_quick_start_increment(iteration): execute_db_queries( [ - (replace_params(read_template_sql_file("update_target_data.sql", "quick_start"), params_mapping), params_mapping), - (replace_params(read_template_sql_file(f"update_target_data_iter{iteration}.sql", "quick_start"), params_mapping), params_mapping), + _get_quick_start_query("update_target_data.sql", params_mapping), + _get_quick_start_query(f"update_target_data_iter{iteration}.sql", params_mapping), ], use_target_db=True, ) setup_cat_tests(iteration) +def run_monitor_increment(run_date, iteration, weekday_morning_count=0): + params_mapping = _get_monitor_params_mapping(run_date, iteration, weekday_morning_count) + _prepare_connection_to_target_database(params_mapping) + + target_db_name = params_mapping["PROJECT_DB"] + LOG.info(f"Incremental monitor updates of target db ({iteration}) : {target_db_name}") + + execute_db_queries( + [ + _get_quick_start_query("run_monitor_iteration.sql", params_mapping), + ], + use_target_db=True, + ) + + def setup_cat_tests(iteration): if iteration == 0: return @@ -172,12 +260,11 @@ def setup_cat_tests(iteration): elif iteration >=1: sql_file = "update_cat_tests.sql" - params_mapping = _get_params_mapping(iteration) - query = replace_params(read_template_sql_file(sql_file, "quick_start"), params_mapping) + params_mapping = _get_quick_start_params_mapping(iteration) execute_db_queries( [ - (query, params_mapping), + _get_quick_start_query(sql_file, params_mapping), ], use_target_db=False, ) diff --git a/testgen/commands/run_test_execution.py b/testgen/commands/run_test_execution.py index 14403f46..a809ad20 100644 --- a/testgen/commands/run_test_execution.py +++ b/testgen/commands/run_test_execution.py @@ -1,6 +1,7 @@ import logging import 
subprocess import threading +from collections import defaultdict from datetime import UTC, datetime, timedelta from functools import partial from typing import Literal @@ -11,6 +12,8 @@ from testgen.commands.queries.execute_tests_query import TestExecutionDef, TestExecutionSQL from testgen.commands.queries.rollup_scores_query import RollupScoresSQL from testgen.commands.run_refresh_score_cards_results import run_refresh_score_cards_results +from testgen.commands.test_generation import run_monitor_generation +from testgen.commands.test_thresholds_prediction import TestThresholdsPrediction from testgen.common import ( execute_db_queries, fetch_dict_from_db, @@ -25,6 +28,7 @@ from testgen.common.models.table_group import TableGroup from testgen.common.models.test_run import TestRun from testgen.common.models.test_suite import TestSuite +from testgen.common.notifications.monitor_run import send_monitor_notifications from testgen.common.notifications.test_run import send_test_run_notifications from testgen.ui.session import session from testgen.utils import get_exception_message @@ -80,11 +84,14 @@ def run_test_execution(test_suite_id: str | UUID, username: str | None = None, r data_chars = run_data_chars_refresh(connection, table_group, test_run.test_starttime) test_run.set_progress("data_chars", "Completed") - sql_generator = TestExecutionSQL(connection, table_group, test_run) + sql_generator = TestExecutionSQL(connection, table_group, test_suite, test_run) + + if test_suite.is_monitor: + _sync_monitor_definitions(sql_generator) # Update the thresholds before retrieving the test definitions in the next steps - LOG.info("Updating historic test thresholds") - execute_db_queries([sql_generator.update_historic_thresholds()]) + LOG.info("Updating test thresholds based on history calculations") + execute_db_queries([sql_generator.update_history_calc_thresholds()]) LOG.info("Retrieving active test definitions in test suite") test_defs = 
fetch_dict_from_db(*sql_generator.get_active_test_definitions()) @@ -116,7 +123,7 @@ def run_test_execution(test_suite_id: str | UUID, username: str | None = None, r # Run metadata tests last so that results for other tests are available to them for run_type in ["QUERY", "CAT", "METADATA"]: if (run_test_defs := [td for td in valid_test_defs if td.run_type == run_type]): - run_functions[run_type](run_test_defs) + run_functions[run_type](run_test_defs, save_progress=not test_suite.is_monitor) else: test_run.set_progress(run_type, "Completed") LOG.info(f"No {run_type} tests to run") @@ -150,17 +157,27 @@ def run_test_execution(test_suite_id: str | UUID, username: str | None = None, r test_suite.last_complete_test_run_id = test_run.id test_suite.save() - send_test_run_notifications(test_run) - _rollup_test_scores(test_run, table_group) + if not test_suite.is_monitor: + send_test_run_notifications(test_run) + _rollup_test_scores(test_run, table_group) + else: + send_monitor_notifications(test_run) finally: + scoring_endtime = datetime.now(UTC) + time_delta + try: + TestThresholdsPrediction(test_suite, test_run.test_starttime).run() + except Exception: + LOG.exception("Error predicting test thresholds") + MixpanelService().send_event( - "run-tests", + "run-monitors" if test_suite.is_monitor else "run-tests", source=settings.ANALYTICS_JOB_SOURCE, username=username, sql_flavor=connection.sql_flavor_code, test_count=test_run.test_ct, run_duration=(test_run.test_endtime - test_run.test_starttime.replace(tzinfo=UTC)).total_seconds(), - scoring_duration=(datetime.now(UTC) + time_delta - test_run.test_endtime).total_seconds(), + scoring_duration=(scoring_endtime - test_run.test_endtime).total_seconds(), + prediction_duration=(datetime.now(UTC) + time_delta - scoring_endtime).total_seconds(), ) return f""" @@ -169,7 +186,32 @@ def run_test_execution(test_suite_id: str | UUID, username: str | None = None, r """ -def _run_tests(sql_generator: TestExecutionSQL, run_type: 
Literal["QUERY", "METADATA"], test_defs: list[TestExecutionDef]) -> None: +def _sync_monitor_definitions(sql_generator: TestExecutionSQL) -> None: + test_suite_id = sql_generator.test_run.test_suite_id + + schema_changes = fetch_dict_from_db(*sql_generator.has_schema_changes())[0] + if schema_changes["has_table_drops"]: + run_monitor_generation(test_suite_id, ["Freshness_Trend", "Volume_Trend", "Metric_Trend"], mode="delete") + if schema_changes["has_table_adds"]: + # Freshness monitors will be inserted after profiling + run_monitor_generation(test_suite_id, ["Volume_Trend"], mode="insert") + + # Regenerate monitors that errored in previous run + errored_monitors = fetch_dict_from_db(*sql_generator.get_errored_autogen_monitors()) + if errored_monitors: + errored_by_type: dict[str, list[str]] = defaultdict(list) + for row in errored_monitors: + errored_by_type[row["test_type"]].append(row["table_name"]) + for test_type, table_names in errored_by_type.items(): + run_monitor_generation(test_suite_id, [test_type], mode="upsert", table_names=table_names) + + +def _run_tests( + sql_generator: TestExecutionSQL, + run_type: Literal["QUERY", "METADATA"], + test_defs: list[TestExecutionDef], + save_progress: bool = False, +) -> None: test_run = sql_generator.test_run test_run.set_progress(run_type, "Running") test_run.save() @@ -190,7 +232,7 @@ def update_test_progress(progress: ThreadedProgress) -> None: [sql_generator.run_query_test(td) for td in test_defs], use_target_db=run_type != "METADATA", max_threads=sql_generator.connection.max_threads, - progress_callback=update_test_progress, + progress_callback=update_test_progress if save_progress else None, ) if test_results: @@ -215,7 +257,11 @@ def update_test_progress(progress: ThreadedProgress) -> None: ) -def _run_cat_tests(sql_generator: TestExecutionSQL, test_defs: list[TestExecutionDef]) -> None: +def _run_cat_tests( + sql_generator: TestExecutionSQL, + test_defs: list[TestExecutionDef], + save_progress: bool = False, 
+) -> None: test_run = sql_generator.test_run test_run.set_progress("CAT", "Running") test_run.save() @@ -241,7 +287,7 @@ def update_aggegate_progress(progress: ThreadedProgress) -> None: aggregate_queries, use_target_db=True, max_threads=sql_generator.connection.max_threads, - progress_callback=update_aggegate_progress, + progress_callback=update_aggegate_progress if save_progress else None, ) if aggregate_results: @@ -281,7 +327,7 @@ def update_single_progress(progress: ThreadedProgress) -> None: single_queries, use_target_db=True, max_threads=sql_generator.connection.max_threads, - progress_callback=update_single_progress, + progress_callback=update_single_progress if save_progress else None, ) if single_results: diff --git a/testgen/commands/test_generation.py b/testgen/commands/test_generation.py new file mode 100644 index 00000000..0583112c --- /dev/null +++ b/testgen/commands/test_generation.py @@ -0,0 +1,211 @@ +import dataclasses +import logging +from datetime import UTC, datetime, timedelta +from typing import Literal +from uuid import UUID + +from testgen import settings +from testgen.common.database.database_service import ( + execute_db_queries, + fetch_dict_from_db, + get_flavor_service, + replace_params, +) +from testgen.common.mixpanel_service import MixpanelService +from testgen.common.models.connection import Connection +from testgen.common.models.table_group import TableGroup +from testgen.common.models.test_suite import TestSuite +from testgen.common.read_file import read_template_sql_file +from testgen.utils import to_sql_timestamp + +LOG = logging.getLogger("testgen") + +GenerationSet = Literal["Standard", "Monitor"] +MonitorTestType = Literal["Freshness_Trend", "Volume_Trend", "Schema_Drift"] +MonitorGenerationMode = Literal["upsert", "insert", "delete"] + +@dataclasses.dataclass +class TestTypeParams: + test_type: str + selection_criteria: str | None + generation_template: str | None + default_parm_columns: str | None + default_parm_values:
str | None + + +# Generate tests for a regular non-monitor test suite - don't use for monitors +def run_test_generation( + test_suite_id: str | UUID, + generation_set: GenerationSet = "Standard", + test_types: list[str] | None = None, +) -> str: + if test_suite_id is None: + raise ValueError("Test Suite ID was not specified") + + LOG.info(f"Starting test generation for test suite {test_suite_id}") + + LOG.info("Retrieving connection, table group, and test suite parameters") + test_suite = TestSuite.get(test_suite_id) + if test_suite.is_monitor: + raise ValueError("Cannot run regular test generation for monitor suite") + table_group = TableGroup.get(test_suite.table_groups_id) + connection = Connection.get(table_group.connection_id) + + success = False + try: + TestGeneration(connection, table_group, test_suite, generation_set, test_types).run() + success = True + except Exception: + LOG.exception("Test generation encountered an error.") + finally: + MixpanelService().send_event( + "generate-tests", + source=settings.ANALYTICS_JOB_SOURCE, + sql_flavor=connection.sql_flavor, + generation_set=generation_set, + ) + + return "Test generation completed." if success else "Test generation encountered an error. Check log for details." 
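The `run_test_generation` entry point above pairs an explicit `success` flag with a `finally` block, so the Mixpanel analytics event fires whether or not generation failed, and the caller gets a status string either way. A minimal sketch of that control-flow pattern, outside the patch (all names here are illustrative stand-ins, not TestGen APIs):

```python
import logging

LOG = logging.getLogger("demo")

def run_with_telemetry(job, send_event) -> str:
    # Track success explicitly so the event in `finally` is emitted
    # on both the success and failure paths.
    success = False
    try:
        job()
        success = True
    except Exception:
        LOG.exception("Job encountered an error.")
    finally:
        send_event("generate-tests", success=success)
    return "Completed." if success else "Encountered an error. Check log for details."

events = []

def record(name, **kwargs):
    events.append((name, kwargs))

def failing_job():
    raise RuntimeError("boom")

run_with_telemetry(lambda: None, record)  # success path
run_with_telemetry(failing_job, record)   # failure path still emits an event
```

Swallowing the exception (rather than re-raising after the event) mirrors the patch, where a human-readable status string is returned to the caller instead.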
+ + +def run_monitor_generation( + monitor_suite_id: str | UUID, + monitors: list[MonitorTestType], + mode: MonitorGenerationMode = "upsert", + table_names: list[str] | None = None, +) -> None: + """ + Modes: + - "upsert": Add tests for new tables + update tests for existing tables + no deletion + - "insert": Only add tests for new tables + - "delete": Only delete tests for dropped tables + """ + if monitor_suite_id is None: + raise ValueError("Monitor Suite ID was not specified") + + LOG.info(f"Starting monitor generation for {monitor_suite_id} (Mode = {mode}, Monitors = {monitors})") + + monitor_suite = TestSuite.get(monitor_suite_id) + if not monitor_suite.is_monitor: + raise ValueError("Cannot run monitor generation for regular test suite") + table_group = TableGroup.get(monitor_suite.table_groups_id) + connection = Connection.get(table_group.connection_id) + + TestGeneration(connection, table_group, monitor_suite, "Monitor", monitors).monitor_run(mode, table_names=table_names) + + +class TestGeneration: + + def __init__( + self, + connection: Connection, + table_group: TableGroup, + test_suite: TestSuite, + generation_set: str, + test_types_filter: list[MonitorTestType] | None = None, + ): + self.connection = connection + self.table_group = table_group + self.test_suite = test_suite + self.generation_set = generation_set + self.test_types_filter = test_types_filter + self.flavor = connection.sql_flavor + self.flavor_service = get_flavor_service(self.flavor) + + self.run_date = datetime.now(UTC) + self.as_of_date = self.run_date + if (delay_days := int(self.table_group.profiling_delay_days)): + self.as_of_date = self.run_date - timedelta(days=delay_days) + + def run(self) -> None: + LOG.info("Running test generation queries") + execute_db_queries([ + *self._get_generation_queries(), + self._get_query("delete_stale_autogen_tests.sql"), + ]) + + def monitor_run(self, mode: MonitorGenerationMode, table_names: list[str] | None = None) -> None: + if mode == 
"delete": + execute_db_queries([self._get_query("delete_stale_monitors.sql")]) + return + + extra_params = {"INSERT_ONLY": mode == "insert"} + if table_names: + table_list = ", ".join(f"'{table}'" for table in table_names) + extra_params["TABLE_FILTER"] = f"AND table_name IN ({table_list})" + + LOG.info("Running monitor generation queries") + execute_db_queries( + self._get_generation_queries(extra_params=extra_params), + ) + + def _get_generation_queries(self, extra_params: dict | None = None) -> list[tuple[str, dict]]: + test_types = fetch_dict_from_db(*self._get_query("get_test_types.sql", extra_params=extra_params)) + test_types = [TestTypeParams(**item) for item in test_types] + + if self.test_types_filter: + test_types = [tt for tt in test_types if tt.test_type in self.test_types_filter] + + selection_queries = [ + self._get_query("gen_selection_tests.sql", test_type=tt, extra_params=extra_params) + for tt in test_types + if tt.selection_criteria and tt.selection_criteria != "TEMPLATE" + ] + + template_queries = [] + for tt in test_types: + if template_file := tt.generation_template: + # Try flavor-specific template first, then fall back to generic + for directory in [f"flavors/{self.flavor}/gen_query_tests", "gen_query_tests", "gen_funny_cat_tests"]: + try: + template_queries.append(self._get_query(template_file, directory, extra_params=extra_params)) + break + except (ValueError, ModuleNotFoundError): + continue + else: + LOG.warning(f"Template file '{template_file}' not found for test type '{tt.test_type}'") + + return [*selection_queries, *template_queries] + + def _get_params(self, test_type: TestTypeParams | None = None) -> dict: + params = {} + if test_type: + params.update({ + "TEST_TYPE": test_type.test_type, + # Replace these first since they may contain other params + "SELECTION_CRITERIA": test_type.selection_criteria, + "DEFAULT_PARM_COLUMNS": test_type.default_parm_columns, + "DEFAULT_PARM_COLUMNS_UPDATE": ",".join([ + f"{column} = 
EXCLUDED.{column.strip()}" + for column in test_type.default_parm_columns.split(",") + ]) if test_type.default_parm_columns else "", + "DEFAULT_PARM_VALUES": test_type.default_parm_values, + }) + params.update({ + "TABLE_GROUPS_ID": self.table_group.id, + "TEST_SUITE_ID": self.test_suite.id, + "DATA_SCHEMA": self.table_group.table_group_schema, + "GENERATION_SET": self.generation_set, + "TEST_TYPES_FILTER": self.test_types_filter, + "RUN_DATE": to_sql_timestamp(self.run_date), + "AS_OF_DATE": to_sql_timestamp(self.as_of_date), + "SQL_FLAVOR": self.flavor, + "QUOTE": self.flavor_service.quote_character, + "INSERT_ONLY": False, + "TABLE_FILTER": "", + }) + return params + + def _get_query( + self, + template_file_name: str, + sub_directory: str | None = "generation", + test_type: TestTypeParams | None = None, + extra_params: dict | None = None, + ) -> tuple[str, dict | None]: + query = read_template_sql_file(template_file_name, sub_directory) + params = self._get_params(test_type) + if extra_params: + params.update(extra_params) + query = replace_params(query, params) + return query, params diff --git a/testgen/commands/test_thresholds_prediction.py b/testgen/commands/test_thresholds_prediction.py new file mode 100644 index 00000000..ca7b679b --- /dev/null +++ b/testgen/commands/test_thresholds_prediction.py @@ -0,0 +1,302 @@ +import json +import logging +from datetime import datetime + +import pandas as pd +from scipy import stats + +from testgen.common.database.database_service import ( + execute_db_queries, + fetch_dict_from_db, + replace_params, + write_to_app_db, +) +from testgen.common.freshness_service import ( + get_freshness_gap_threshold, + infer_schedule, + minutes_to_next_deadline, + resolve_holiday_dates, +) +from testgen.common.models import with_database_session +from testgen.common.models.scheduler import JobSchedule +from testgen.common.models.test_suite import PredictSensitivity, TestSuite +from testgen.common.read_file import read_template_sql_file 
+from testgen.common.time_series_service import ( + NotEnoughData, + get_sarimax_forecast, +) +from testgen.utils import to_dataframe, to_sql_timestamp + +LOG = logging.getLogger("testgen") + +NUM_FORECAST = 10 +T_DISTRIBUTION_THRESHOLD = 20 + +Z_SCORE_MAP = { + ("lower_tolerance", PredictSensitivity.low): -3.0, # 0.13th percentile + ("lower_tolerance", PredictSensitivity.medium): -2.5, # 0.62nd percentile + ("lower_tolerance", PredictSensitivity.high): -2.0, # 2.3rd percentile + ("upper_tolerance", PredictSensitivity.high): 2.0, # 97.7th percentile + ("upper_tolerance", PredictSensitivity.medium): 2.5, # 99.4th percentile + ("upper_tolerance", PredictSensitivity.low): 3.0, # 99.87th percentile +} + +FRESHNESS_THRESHOLD_MAP = { + # upper_pct floor_mult lower_pct + PredictSensitivity.high: (80, 1.0, 20), + PredictSensitivity.medium: (95, 1.25, 10), + PredictSensitivity.low: (99, 1.5, 5), +} + +SCHEDULE_DEADLINE_BUFFER_HOURS = { + PredictSensitivity.high: 1.5, + PredictSensitivity.medium: 3.0, + PredictSensitivity.low: 5.0, +} + +STALENESS_FACTOR_MAP = { + PredictSensitivity.high: 0.75, + PredictSensitivity.medium: 0.85, + PredictSensitivity.low: 0.95, +} + + +class TestThresholdsPrediction: + staging_table = "stg_test_definition_updates" + staging_columns = ( + "test_suite_id", + "test_definition_id", + "run_date", + "lower_tolerance", + "upper_tolerance", + "threshold_value", + "prediction", + ) + + @with_database_session + def __init__(self, test_suite: TestSuite, run_date: datetime): + self.test_suite = test_suite + self.run_date = run_date + schedule = JobSchedule.get(JobSchedule.kwargs["test_suite_id"].astext == str(test_suite.id)) + self.tz = schedule.cron_tz or "UTC" if schedule else None + + def run(self) -> None: + LOG.info("Retrieving historical test results for training prediction models") + test_results = fetch_dict_from_db(*self._get_query("get_historical_test_results.sql")) + if test_results: + df = to_dataframe(test_results, coerce_float=True) + 
grouped_dfs = df.groupby("test_definition_id", group_keys=False) + + LOG.info(f"Training prediction models for tests: {len(grouped_dfs)}") + prediction_results = [] + for test_def_id, group in grouped_dfs: + test_type = group["test_type"].iloc[0] + history = group[["test_time", "result_signal"]] + history = history.set_index("test_time") + + test_prediction = [ + self.test_suite.id, + test_def_id, + to_sql_timestamp(self.run_date), + ] + if test_type == "Freshness_Trend": + lower, upper, staleness, prediction = compute_freshness_threshold( + history, + sensitivity=self.test_suite.predict_sensitivity or PredictSensitivity.medium, + min_lookback=self.test_suite.predict_min_lookback or 1, + exclude_weekends=self.test_suite.predict_exclude_weekends, + holiday_codes=self.test_suite.holiday_codes_list, + schedule_tz=self.tz, + ) + test_prediction.extend([lower, upper, staleness, prediction]) + else: + lower, upper, prediction = compute_sarimax_threshold( + history, + sensitivity=self.test_suite.predict_sensitivity or PredictSensitivity.medium, + min_lookback=self.test_suite.predict_min_lookback or 1, + exclude_weekends=self.test_suite.predict_exclude_weekends, + holiday_codes=self.test_suite.holiday_codes_list, + schedule_tz=self.tz, + ) + if test_type == "Volume_Trend": + if lower is not None: + lower = max(lower, 0.0) + if upper is not None: + upper = max(upper, 0.0) + test_prediction.extend([lower, upper, None, prediction]) + + prediction_results.append(test_prediction) + + LOG.info("Writing predicted test thresholds to staging") + write_to_app_db(prediction_results, self.staging_columns, self.staging_table) + + LOG.info("Updating predicted test thresholds and deleting staging") + execute_db_queries([ + self._get_query("update_predicted_test_thresholds.sql"), + self._get_query("delete_staging_test_definitions.sql"), + ]) + + def _get_query( + self, + template_file_name: str, + sub_directory: str | None = "prediction", + ) -> tuple[str, dict]: + params = { + 
"TEST_SUITE_ID": self.test_suite.id, + "RUN_DATE": to_sql_timestamp(self.run_date), + } + query = read_template_sql_file(template_file_name, sub_directory) + query = replace_params(query, params) + return query, params + + +def compute_freshness_threshold( + history: pd.DataFrame, + sensitivity: PredictSensitivity, + min_lookback: int = 1, + exclude_weekends: bool = False, + holiday_codes: list[str] | None = None, + schedule_tz: str | None = None, +) -> tuple[float | None, float | None, float | None, str | None]: + """Compute freshness gap thresholds in business minutes. + + Returns (lower, upper, staleness_threshold, prediction_json) in business minutes, + or (None, None, None, None) if not enough data. + """ + if len(history) < min_lookback: + return None, None, None, None + + upper_percentile, floor_multiplier, lower_percentile = FRESHNESS_THRESHOLD_MAP[sensitivity] + staleness_factor = STALENESS_FACTOR_MAP[sensitivity] + + try: + result = get_freshness_gap_threshold( + history, + upper_percentile=upper_percentile, + floor_multiplier=floor_multiplier, + lower_percentile=lower_percentile, + exclude_weekends=exclude_weekends, + holiday_codes=holiday_codes, + tz=schedule_tz, + staleness_factor=staleness_factor, + ) + except NotEnoughData: + return None, None, None, None + + lower, upper = result.lower, result.upper + staleness: float | None = None + prediction_data: dict = {} + + if not schedule_tz: + return lower, upper, staleness, json.dumps(prediction_data) + + # --- Schedule inference --- + deadline_buffer = SCHEDULE_DEADLINE_BUFFER_HOURS[sensitivity] + + schedule = infer_schedule(history, schedule_tz) + if not schedule: + return lower, upper, staleness, json.dumps(prediction_data) + + prediction_data.update({ + "schedule_stage": schedule.stage, + "frequency": schedule.frequency, + "active_days": sorted(schedule.active_days) if schedule.active_days else None, + "window_start": schedule.window_start, + "window_end": schedule.window_end, + # Metadata stored for 
debugging purposes + "confidence": round(schedule.confidence, 4), + "num_events": schedule.num_events, + "sensitivity": sensitivity.value, + "deadline_buffer_hours": deadline_buffer, + }) + + if schedule.stage == "active": + excluded_days = frozenset(range(7)) - schedule.active_days if schedule.active_days else None + + # For sub-daily schedules, apply window exclusion for overnight gaps + has_window = ( + schedule.frequency == "sub_daily" + and schedule.window_start is not None + and schedule.window_end is not None + ) + + # Recompute gap thresholds with schedule-aware exclusion + if excluded_days or has_window: + try: + result = get_freshness_gap_threshold( + history, + upper_percentile=upper_percentile, + floor_multiplier=floor_multiplier, + lower_percentile=lower_percentile, + exclude_weekends=exclude_weekends, + holiday_codes=holiday_codes, + tz=schedule_tz, + staleness_factor=staleness_factor, + excluded_days=excluded_days, + window_start=schedule.window_start if has_window else None, + window_end=schedule.window_end if has_window else None, + ) + lower, upper = result.lower, result.upper + staleness = result.staleness + except NotEnoughData: + pass # Keep first-pass thresholds + + # Override upper threshold with schedule-based deadline (daily/weekly only) + if schedule.frequency != "sub_daily": + holiday_dates = resolve_holiday_dates(holiday_codes, history.index) if holiday_codes else None + schedule_upper = minutes_to_next_deadline( + result.last_update, schedule, + exclude_weekends, holiday_dates, schedule_tz, + deadline_buffer, excluded_days=excluded_days, + ) + if schedule_upper is not None: + upper = schedule_upper + + return lower, upper, staleness, json.dumps(prediction_data) + + +def compute_sarimax_threshold( + history: pd.DataFrame, + sensitivity: PredictSensitivity, + num_forecast: int = NUM_FORECAST, + min_lookback: int = 1, + exclude_weekends: bool = False, + holiday_codes: list[str] | None = None, + schedule_tz: str | None = None, +) -> 
tuple[float | None, float | None, str | None]: + """Compute SARIMAX-based thresholds for the next forecast point. + + Returns (lower, upper, forecast_json) or (None, None, None) if insufficient data. + """ + if len(history) < min_lookback: + return None, None, None + + try: + forecast = get_sarimax_forecast( + history, + num_forecast=num_forecast, + exclude_weekends=exclude_weekends, + holiday_codes=holiday_codes, + tz=schedule_tz, + ) + + num_points = len(history) + for key, z_score in Z_SCORE_MAP.items(): + if num_points < T_DISTRIBUTION_THRESHOLD: + percentile = stats.norm.cdf(z_score) + multiplier = stats.t.ppf(percentile, df=num_points - 1) + else: + multiplier = z_score + column = f"{key[0]}|{key[1].value}" + forecast[column] = forecast["mean"] + (multiplier * forecast["se"]) + + next_date = forecast.index[0] + lower_tolerance = forecast.at[next_date, f"lower_tolerance|{sensitivity.value}"] + upper_tolerance = forecast.at[next_date, f"upper_tolerance|{sensitivity.value}"] + + if pd.isna(lower_tolerance) or pd.isna(upper_tolerance): + return None, None, None + else: + return float(lower_tolerance), float(upper_tolerance), forecast.to_json() + except NotEnoughData: + return None, None, None diff --git a/testgen/common/__init__.py b/testgen/common/__init__.py index e413faa0..900b4c09 100644 --- a/testgen/common/__init__.py +++ b/testgen/common/__init__.py @@ -3,6 +3,5 @@ from .clean_sql import * from .credentials import * from .encrypt import * -from .get_pipeline_parms import * from .logs import * from .read_file import * diff --git a/testgen/common/database/database_service.py b/testgen/common/database/database_service.py index cc712403..4b340b18 100644 --- a/testgen/common/database/database_service.py +++ b/testgen/common/database/database_service.py @@ -2,6 +2,7 @@ import csv import importlib import logging +import re from collections.abc import Callable, Iterable from contextlib import suppress from dataclasses import dataclass, field @@ -193,7 +194,7 @@ 
def fetch_data(query: str, params: dict | None, index: int) -> tuple[list[Legacy LOG.exception(f"Failed to execute threaded query: {query}") return row_data, column_names, index, error - + result_data: list[LegacyRow] = [] result_columns: list[str] = [] error_data: dict[int, str] = {} @@ -284,12 +285,42 @@ def write_to_app_db(data: list[LegacyRow], column_names: Iterable[str], table_na connection.close() +def apply_params(query: str, params: dict[str, Any]) -> str: + query = process_conditionals(query, params) + query = replace_params(query, params) + return query + + def replace_params(query: str, params: dict[str, Any]) -> str: for key, value in params.items(): query = query.replace(f"{{{key}}}", "" if value is None else str(value)) return query +def process_conditionals(query: str, params: dict[str, Any]) -> str: + re_pattern = re.compile(r"^--\s+TG-(IF|ELSE|ENDIF)(?:\s+(\w+))?\s*$") + condition = None + updated_query = [] + for line in query.splitlines(True): + if re_match := re_pattern.match(line): + match re_match.group(1): + case "IF" if condition is None and (variable := re_match.group(2)) is not None: + condition = bool(params.get(variable)) + case "ELSE" if condition is not None: + condition = not condition + case "ENDIF" if condition is not None: + condition = None + case _: + raise ValueError("Template conditional misused") + elif condition is not False: + updated_query.append(line) + + if condition is not None: + raise ValueError("Template conditional misused") + + return "".join(updated_query) + + def get_queries_for_command( sub_directory: str, params: dict[str, Any], mask: str = r"^.*sql$", path: str | None = None ) -> list[str]: diff --git a/testgen/common/freshness_service.py b/testgen/common/freshness_service.py new file mode 100644 index 00000000..f7810787 --- /dev/null +++ b/testgen/common/freshness_service.py @@ -0,0 +1,609 @@ +import json +import logging +import zoneinfo +from collections import Counter +from datetime import date, datetime 
+from typing import NamedTuple + +import numpy as np +import pandas as pd + +from testgen.common.time_series_service import NotEnoughData, get_holiday_dates + +LOG = logging.getLogger("testgen") + +# Minimum completed gaps needed before freshness threshold is meaningful +MIN_FRESHNESS_GAPS = 5 + +# Default sliding window size β€” use only the most recent N gaps +MAX_FRESHNESS_GAPS = 40 + + +class FreshnessThreshold(NamedTuple): + lower: float | None + upper: float + staleness: float + last_update: pd.Timestamp + + +class InferredSchedule(NamedTuple): + stage: str # "training", "tentative", "active", "irregular" + frequency: str # "sub_daily", "daily", "weekly", "irregular" + active_days: frozenset[int] # weekday numbers (0=Mon, 6=Sun) + window_start: float | None # hour of day (0-24), P10 + window_end: float | None # hour of day (0-24), P90 + confidence: float # fraction of events matching schedule + num_events: int # total update events used + + +def get_freshness_gap_threshold( + history: pd.DataFrame, + upper_percentile: float, + floor_multiplier: float, + lower_percentile: float, + exclude_weekends: bool = False, + holiday_codes: list[str] | None = None, + tz: str | None = None, + staleness_factor: float = 0.85, + excluded_days: frozenset[int] | None = None, + window_start: float | None = None, + window_end: float | None = None, +) -> FreshnessThreshold: + """Compute freshness thresholds from completed gap durations. + + Extracts gaps between consecutive table updates (points where result_signal == 0) + and returns upper and lower thresholds based on percentiles, with a floor for the + upper bound derived from the maximum observed gap. + + When exclusion flags are set, gap durations are normalized by subtracting + excluded time (weekends/holidays) that fall within each gap. + + A sliding window limits the number of recent gaps used, so old outliers + age out of the distribution over time. 
+ + :param history: DataFrame with DatetimeIndex and a result_signal column. + :param upper_percentile: Percentile for upper bound (e.g. 80, 95, 99). + :param floor_multiplier: Multiplied by max gap to set an upper floor (e.g. 1.0, 1.25, 1.5). + :param lower_percentile: Percentile for lower bound (e.g. 5, 10, 20). + :param exclude_weekends: Subtract weekend days from gap durations. + :param holiday_codes: Country/market codes for holidays to subtract from gap durations. + :param tz: IANA timezone (e.g. "America/New_York") for weekday/holiday determination. + :returns: FreshnessThreshold with lower (in business minutes, None if not computed), + upper (in business minutes), and last_update timestamp. + :raises NotEnoughData: If fewer than MIN_FRESHNESS_GAPS completed gaps are found. + """ + signal = history.iloc[:, 0] + update_times = signal.index[signal == 0] + + if len(update_times) - 1 < MIN_FRESHNESS_GAPS: + raise NotEnoughData( + f"Need at least {MIN_FRESHNESS_GAPS} completed gaps, found {max(len(update_times) - 1, 0)}." 
+ ) + + has_exclusions = exclude_weekends or holiday_codes or excluded_days or (window_start is not None and window_end is not None) + holiday_dates = resolve_holiday_dates(holiday_codes, history.index) if holiday_codes else None + gaps_minutes = np.diff(update_times).astype("timedelta64[m]").astype(float) + + if has_exclusions: + for i in range(len(gaps_minutes)): + excluded_minutes = count_excluded_minutes( + update_times[i], update_times[i + 1], exclude_weekends, holiday_dates, + tz=tz, excluded_days=excluded_days, + window_start=window_start, window_end=window_end, + ) + gaps_minutes[i] = max(gaps_minutes[i] - excluded_minutes, 0) + + # Sliding window: keep only the most recent gaps + if len(gaps_minutes) > MAX_FRESHNESS_GAPS: + gaps_minutes = gaps_minutes[-MAX_FRESHNESS_GAPS:] + + upper = max( + float(np.percentile(gaps_minutes, upper_percentile)), + float(np.max(gaps_minutes)) * floor_multiplier, + ) + + lower = float(np.percentile(gaps_minutes, lower_percentile)) + if lower <= 0: + lower = None + + staleness = float(np.median(gaps_minutes)) * staleness_factor + + return FreshnessThreshold(lower=lower, upper=upper, staleness=staleness, last_update=update_times[-1]) + + +def resolve_holiday_dates(codes: list[str], index: pd.DatetimeIndex) -> set[date]: + return {d.date() if isinstance(d, datetime) else d for d in get_holiday_dates(codes, index)} + + +class ScheduleParams(NamedTuple): + excluded_days: frozenset[int] | None + window_start: float | None + window_end: float | None + + +def get_schedule_params(prediction: dict | str | None) -> ScheduleParams: + empty = ScheduleParams(excluded_days=None, window_start=None, window_end=None) + if not prediction: + return empty + prediction = prediction if isinstance(prediction, dict) else json.loads(prediction) + + if prediction.get("schedule_stage") != "active": + return empty + + active_days = prediction.get("active_days") + excluded_days = frozenset(range(7)) - frozenset(active_days) if active_days else None + + 
window_start: float | None = None + window_end: float | None = None + if prediction.get("frequency") == "sub_daily": + if (ws := prediction.get("window_start")) is not None and (we := prediction.get("window_end")) is not None: + window_start = float(ws) + window_end = float(we) + + return ScheduleParams(excluded_days=excluded_days, window_start=window_start, window_end=window_end) + + +def is_excluded_day( + dt: pd.Timestamp, + exclude_weekends: bool, + holiday_dates: set[date] | None, + tz: str | None = None, + excluded_days: frozenset[int] | None = None, + window_start: float | None = None, + window_end: float | None = None, +) -> bool: + """Check if a timestamp falls on excluded time. + + Excluded time includes: + - Weekends (if exclude_weekends is True) + - Holidays (if holiday_dates is provided) + - Inferred inactive days (if excluded_days is provided) + - Hours outside the active window on active days (if window_start/window_end are provided) + + When tz is provided, naive timestamps are interpreted as UTC and converted + to the given timezone for the weekday/holiday/hour check. 
+ """ + if tz: + local_ts = _to_local(dt, tz) + date_ = local_ts.date() + else: + local_ts = dt + date_ = dt.date() + + if exclude_weekends and date_.weekday() >= 5: + return True + if excluded_days and date_.weekday() in excluded_days: + return True + if holiday_dates and date_ in holiday_dates: + return True + + if window_start is not None and window_end is not None: + hour = local_ts.hour + local_ts.minute / 60.0 + if not _is_in_time_window(hour, window_start, window_end): + return True + return False + + +def next_business_day_start( + dt: pd.Timestamp, + exclude_weekends: bool, + holiday_dates: set[date] | None, + tz: str | None = None, + excluded_days: frozenset[int] | None = None, +) -> pd.Timestamp: + day = (pd.Timestamp(dt) + pd.DateOffset(days=1)).normalize() + while is_excluded_day(day, exclude_weekends, holiday_dates, tz=tz, excluded_days=excluded_days): + day += pd.DateOffset(days=1) + return day + + +def count_excluded_minutes( + start: pd.Timestamp, + end: pd.Timestamp, + exclude_weekends: bool, + holiday_dates: set[date] | None, + tz: str | None = None, + excluded_days: frozenset[int] | None = None, + window_start: float | None = None, + window_end: float | None = None, +) -> float: + """Count excluded minutes between two timestamps, including partial days. + + Iterates day-by-day from start to end, counting the overlap between each + excluded day (weekend or holiday) and the [start, end] interval. Partial + excluded days at the boundaries are correctly prorated. + + When window_start/window_end are provided (sub-daily active schedules), + hours outside the [window_start, window_end] range on active days are also + counted as excluded. Fully excluded days (weekends, holidays, inactive days) + still count their entire overlap as excluded β€” the window only applies to + days that are otherwise active. + + When tz is provided, naive timestamps are converted to the local timezone + for weekday/holiday determination. 
The overlap is computed in naive local + time (timezone stripped after conversion) so that every calendar day is + exactly 24 h β€” this keeps excluded minutes consistent with UTC-based raw + gaps and avoids DST distortion (fall-back days counting 25 h, spring-forward + days counting 23 h). + """ + start = pd.Timestamp(start) + end = pd.Timestamp(end) + + if tz: + # Convert to local, then strip timezone β†’ naive local time so each day + # is exactly 24 h. This prevents DST transitions from inflating/deflating + # excluded time relative to the UTC-based raw gap that callers subtract from. + start = _to_local(start, tz).tz_localize(None) + end = _to_local(end, tz).tz_localize(None) + + if start >= end: + return 0.0 + + has_window = window_start is not None and window_end is not None + + total_minutes = 0.0 + day_start = start.normalize() + + while day_start < end: + next_day = day_start + pd.Timedelta(days=1) + + if is_excluded_day(day_start, exclude_weekends, holiday_dates, excluded_days=excluded_days): + # Full day excluded (weekend, holiday, inactive day) + overlap_start = max(start, day_start) + overlap_end = min(end, next_day) + total_minutes += (overlap_end - overlap_start).total_seconds() / 60 + elif has_window: + # Active day but with window exclusion: exclude hours outside the window + # Compute the active window boundaries for this calendar day + win_open = day_start + pd.Timedelta(hours=window_start) + win_close = day_start + pd.Timedelta(hours=window_end) + + # Clip to the [start, end] interval + overlap_start = max(start, day_start) + overlap_end = min(end, next_day) + + # Excluded = time in [overlap_start, overlap_end] that is outside [win_open, win_close] + # = total overlap - time inside window + total_overlap = (overlap_end - overlap_start).total_seconds() / 60 + + # Compute overlap with the active window + active_start = max(overlap_start, win_open) + active_end = min(overlap_end, win_close) + active_minutes = max((active_end - 
active_start).total_seconds() / 60, 0) + + excluded_on_day = total_overlap - active_minutes + if excluded_on_day > 0: + total_minutes += excluded_on_day + + day_start = next_day + + return total_minutes + + +def add_business_minutes( + start: pd.Timestamp | datetime, + business_minutes: float, + exclude_weekends: bool, + holiday_dates: set[date] | None, + tz: str | None = None, + excluded_days: frozenset[int] | None = None, +) -> pd.Timestamp: + """Advance wall-clock time by N business minutes, skipping excluded days. + + Inverse of count_excluded_minutes: given a start time and a number of + business minutes to elapse, returns the wall-clock timestamp at which + those minutes will have passed, skipping weekends and holidays. + + When tz is provided, naive timestamps are interpreted as UTC and day + boundary checks use the local timezone. + """ + start = pd.Timestamp(start) + if business_minutes <= 0: + return start + + has_exclusions = exclude_weekends or bool(holiday_dates) or bool(excluded_days) + if not has_exclusions: + return start + pd.Timedelta(minutes=business_minutes) + + cursor = start + if tz: + cursor = _to_local(cursor, tz) + + remaining = business_minutes + + while remaining > 0: + day_start = cursor.normalize() + next_day = (day_start + pd.DateOffset(days=1)).normalize() + + if is_excluded_day(cursor, exclude_weekends, holiday_dates, excluded_days=excluded_days): + cursor = next_day + # Skip consecutive excluded days + for _ in range(365): + if not is_excluded_day(cursor, exclude_weekends, holiday_dates, excluded_days=excluded_days): + break + cursor = (cursor + pd.DateOffset(days=1)).normalize() + continue + + minutes_left_today = (next_day - cursor).total_seconds() / 60 + + if remaining <= minutes_left_today: + cursor = cursor + pd.Timedelta(minutes=remaining) + remaining = 0 + else: + remaining -= minutes_left_today + cursor = next_day + + if tz and start.tzinfo is None: + cursor = cursor.tz_convert("UTC").tz_localize(None) + + return cursor + + 
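The day-walking proration that `count_excluded_minutes` and `add_business_minutes` perform above can be hard to picture from the diff alone. Below is a minimal standalone sketch of the same idea — weekends only, naive stdlib `datetime` instead of `pd.Timestamp`, and no holiday/window/timezone handling — so every name here is illustrative, not the module's API:

```python
from datetime import datetime, timedelta

def weekend_minutes(start: datetime, end: datetime) -> float:
    """Minutes of [start, end] falling on Saturday/Sunday, partial days prorated."""
    total = 0.0
    day = start.replace(hour=0, minute=0, second=0, microsecond=0)
    while day < end:
        next_day = day + timedelta(days=1)
        if day.weekday() >= 5:  # Saturday=5, Sunday=6
            # Overlap of this weekend day with the interval, clipping the
            # boundaries so partial days at either end count correctly
            overlap = min(end, next_day) - max(start, day)
            total += overlap.total_seconds() / 60
        day = next_day
    return total

# Friday 18:00 -> Monday 06:00 is 60h wall-clock; removing the 48h weekend
# leaves 12h (720 min) of "business" time.
start = datetime(2024, 6, 7, 18, 0)   # a Friday evening
end = datetime(2024, 6, 10, 6, 0)     # the following Monday morning
wall_minutes = (end - start).total_seconds() / 60              # 3600.0
business_minutes = wall_minutes - weekend_minutes(start, end)  # 720.0
```

The real functions generalize the `weekday() >= 5` test to holidays, inferred inactive days, and sub-daily active windows, but the clip-to-overlap arithmetic is the same.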
+def _is_in_time_window(hour: float, window_start: float, window_end: float) -> bool: + if window_start <= window_end: + return window_start <= hour <= window_end + return hour >= window_start or hour <= window_end + + +def _to_local(ts: pd.Timestamp, tz: str) -> pd.Timestamp: + if ts.tzinfo is None: + ts = ts.tz_localize("UTC") + return ts.tz_convert(zoneinfo.ZoneInfo(tz)) + + +def _next_active_day(start: pd.Timestamp, active_days: frozenset[int], max_days: int = 14) -> pd.Timestamp | None: + candidate = start + for _ in range(max_days): + if candidate.weekday() in active_days: + return candidate + candidate += pd.Timedelta(days=1) + return None + + +def _set_fractional_hour(ts: pd.Timestamp, fractional_hour: float) -> pd.Timestamp: + hour = int(fractional_hour) + minute = int((fractional_hour - hour) * 60) + return ts.replace(hour=hour, minute=minute, second=0, microsecond=0) + + +def classify_frequency(gaps_hours: np.ndarray) -> str: + """Classify table update frequency from inter-update gaps. + + Frequency does NOT gate the schedule stage β€” stage is determined by + confidence (>= 0.75 β†’ "active"). However, frequency does affect which + threshold path is used: "sub_daily" enables within-window gap thresholds, + while other values use deadline-based thresholds. + + Multi-day cadences (median 36-120h, e.g. Mon/Wed/Fri or Tue/Thu) classify + as "irregular" because they fall between the daily and weekly bands, but + they can still reach "active" stage when ``detect_active_days`` finds a + consistent day-of-week and time-of-day pattern. + + Bands: + - sub_daily: median < 6h + - daily: 6h <= median < 36h + - weekly: 120h < median < 240h (roughly 5-10 days) + - irregular: everything else (0 gaps, 36-120h, 240h+) + + :param gaps_hours: Array of gap durations in hours between consecutive updates. + :returns: One of "sub_daily", "daily", "weekly", "irregular". 
+ """ + if len(gaps_hours) == 0: + return "irregular" + median_gap = float(np.median(gaps_hours)) + if median_gap < 6: + return "sub_daily" + elif median_gap < 36: + return "daily" + elif 120 < median_gap < 240: + return "weekly" + else: + return "irregular" + + +def detect_active_days( + update_times: list[pd.Timestamp], + tz: str, + min_weeks: int = 3, +) -> frozenset[int] | None: + """Detect which days of the week have updates. + + :param update_times: Sorted list of update timestamps (UTC or naive-UTC). + :param tz: IANA timezone for local day-of-week mapping. + :param min_weeks: Minimum weeks of data needed. + :returns: frozenset of weekday numbers (0=Mon, 6=Sun) or None if insufficient data. + """ + if len(update_times) < 2: + return None + + local_times = [_to_local(t, tz) for t in update_times] + + date_range_days = (local_times[-1] - local_times[0]).days + if date_range_days < min_weeks * 7: + return None + + day_counts: Counter[int] = Counter(t.weekday() for t in local_times) + weeks_observed = max(1, date_range_days // 7) + + active_days: set[int] = set() + for day in range(7): + count = day_counts.get(day, 0) + hit_rate = count / weeks_observed + if hit_rate >= 0.5: + active_days.add(day) + + return frozenset(active_days) if active_days else None + + +def detect_update_window( + update_times: list[pd.Timestamp], + active_days: frozenset[int], + tz: str, +) -> tuple[float, float] | None: + """Detect the time-of-day window when updates arrive on active days. + + :returns: (window_start, window_end) as hours 0-24, or None. 
+ """ + local_times = [_to_local(t, tz) for t in update_times] + + hours_on_active_days = [ + t.hour + t.minute / 60.0 + for t in local_times + if t.weekday() in active_days + ] + + if len(hours_on_active_days) < 10: + return None + + # Handle midnight-wrapping clusters (e.g., 23:00-01:00) + shifted = False + late = sum(1 for h in hours_on_active_days if h >= 22) / len(hours_on_active_days) + early = sum(1 for h in hours_on_active_days if h < 3) / len(hours_on_active_days) + if late > 0.25 and early > 0.25: + hours_on_active_days = [(h + 12) % 24 for h in hours_on_active_days] + shifted = True + + p10 = float(np.percentile(hours_on_active_days, 10)) + p90 = float(np.percentile(hours_on_active_days, 90)) + + if shifted: + p10 = (p10 - 12) % 24 + p90 = (p90 - 12) % 24 + + return (p10, p90) + + +def compute_schedule_confidence( + update_times: list[pd.Timestamp], + schedule: InferredSchedule, + tz: str, +) -> float: + """Fraction of historical updates that match the detected schedule. + + An update "matches" if it falls on an active day and (if a window is defined) + within the P10-P90 time window. + """ + if not update_times: + return 0.0 + + matching = 0 + for t in update_times: + lt = _to_local(t, tz) + if lt.weekday() not in schedule.active_days: + continue + if schedule.window_start is not None and schedule.window_end is not None: + hour = lt.hour + lt.minute / 60.0 + if not _is_in_time_window(hour, schedule.window_start, schedule.window_end): + continue + matching += 1 + return matching / len(update_times) + + +def infer_schedule( + history: pd.DataFrame, + tz: str, +) -> InferredSchedule | None: + """Attempt to infer a table's update schedule from its freshness history. + + :param history: DataFrame with DatetimeIndex and result_signal column (0 = update). + :param tz: IANA timezone for local time analysis. + :returns: InferredSchedule or None if insufficient data for any inference. 
+ """ + signal = history.iloc[:, 0] + update_times = list(signal.index[signal == 0]) + + if len(update_times) < 10: + return None + + # Compute gaps in hours + gaps_hours = np.diff(update_times).astype("timedelta64[m]").astype(float) / 60.0 + + frequency = classify_frequency(gaps_hours) + num_events = len(update_times) + + # Determine stage based on data quantity + local_times = [_to_local(t, tz) for t in update_times] + date_range_days = (local_times[-1] - local_times[0]).days + + if date_range_days < 21 or num_events < 10: + return None # Not enough data for any inference + + # Detect active days + active_days = detect_active_days(update_times, tz) + if active_days is None: + active_days = frozenset(range(7)) + + # Detect update window + window_result = detect_update_window(update_times, active_days, tz) + window_start = window_result[0] if window_result else None + window_end = window_result[1] if window_result else None + + # Build preliminary schedule for confidence scoring + preliminary = InferredSchedule( + frequency=frequency, + active_days=active_days, + window_start=window_start, + window_end=window_end, + confidence=0.0, + num_events=num_events, + stage="training", + ) + + confidence = compute_schedule_confidence(update_times, preliminary, tz) + + # Determine stage + if num_events < 20: + stage = "tentative" + elif confidence >= 0.75: + stage = "active" + elif confidence < 0.60: + stage = "irregular" + else: + stage = "tentative" + + return preliminary._replace(confidence=confidence, stage=stage) + + +def minutes_to_next_deadline( + last_update: pd.Timestamp, + schedule: InferredSchedule, + exclude_weekends: bool, + holiday_dates: set[date] | None, + tz: str, + buffer_hours: float, + excluded_days: frozenset[int] | None = None, +) -> float | None: + if schedule.window_end is None: + return None + + deadline_hour = (schedule.window_end + buffer_hours) % 24 + local_last = _to_local(last_update, tz) + + # Find the next active day after last_update + 
candidate = _next_active_day(local_last.normalize() + pd.Timedelta(days=1), schedule.active_days) + if candidate is None: + return None + + # Set the deadline time on that day + deadline_ts = _set_fractional_hour(candidate, deadline_hour) + + # If the deadline is already past relative to now, move to next active day + if deadline_ts <= local_last: + candidate = _next_active_day(candidate + pd.Timedelta(days=1), schedule.active_days) + if candidate is None: + return None + deadline_ts = _set_fractional_hour(candidate, deadline_hour) + + # Convert both to UTC for consistent gap calculation + utc_last = local_last.tz_convert("UTC").tz_localize(None) + utc_deadline = deadline_ts.tz_convert("UTC").tz_localize(None) + + wall_minutes = (utc_deadline - utc_last).total_seconds() / 60.0 + if wall_minutes <= 0: + return None + + if exclude_weekends or holiday_dates or excluded_days: + excl = count_excluded_minutes(utc_last, utc_deadline, exclude_weekends, holiday_dates, tz=tz, excluded_days=excluded_days) + return max(wall_minutes - excl, 0) + + return wall_minutes diff --git a/testgen/common/get_pipeline_parms.py b/testgen/common/get_pipeline_parms.py deleted file mode 100644 index 1f10b364..00000000 --- a/testgen/common/get_pipeline_parms.py +++ /dev/null @@ -1,24 +0,0 @@ -from typing import TypedDict - -from testgen.common.database.database_service import fetch_dict_from_db -from testgen.common.read_file import read_template_sql_file - - -class BaseParams(TypedDict): - project_code: str - connection_id: str - -class TestGenerationParams(BaseParams): - export_to_observability: str - test_suite_id: str - profiling_as_of_date: str - - -def get_test_generation_params(table_group_id: str, test_suite: str) -> TestGenerationParams: - results = fetch_dict_from_db( - read_template_sql_file("parms_test_gen.sql", "parms"), - {"TABLE_GROUP_ID": table_group_id, "TEST_SUITE": test_suite}, - ) - if not results: - raise ValueError("Connection parameters not found for test generation.") - 
return TestGenerationParams(results[0]) diff --git a/testgen/common/models/entity.py b/testgen/common/models/entity.py index 2fe3ac93..6d0b0950 100644 --- a/testgen/common/models/entity.py +++ b/testgen/common/models/entity.py @@ -1,5 +1,5 @@ from collections.abc import Iterable -from dataclasses import asdict, dataclass +from dataclasses import asdict, dataclass, fields from typing import Any, Self from uuid import UUID @@ -23,7 +23,7 @@ class EntityMinimal: @classmethod def columns(cls) -> list[str]: - return list(cls.__annotations__.keys()) + return [f.name for f in fields(cls)] def to_dict(self, json_safe: bool = False) -> dict[str, Any]: result = asdict(self) diff --git a/testgen/common/models/notification_settings.py b/testgen/common/models/notification_settings.py index 66c616bb..e15349f4 100644 --- a/testgen/common/models/notification_settings.py +++ b/testgen/common/models/notification_settings.py @@ -35,10 +35,15 @@ class ProfilingRunNotificationTrigger(enum.Enum): on_changes = "on_changes" +class MonitorNotificationTrigger(enum.Enum): + on_anomalies = "on_anomalies" + + class NotificationEvent(enum.Enum): test_run = "test_run" profiling_run = "profiling_run" score_drop = "score_drop" + monitor_run = "monitor_run" class NotificationSettingsValidationError(Exception): @@ -96,7 +101,7 @@ def _base_select_query( fk_count = len([None for fk in (test_suite_id, table_group_id, score_definition_id) if fk is not SENTINEL]) if fk_count > 1: raise ValueError("Only one foreign key can be used at a time.") - elif fk_count == 1 and (project_code is not SENTINEL or event is not SENTINEL): + elif fk_count == 1 and (project_code is not SENTINEL): raise ValueError("Filtering by project_code or event is not allowed when filtering by a foreign key.") query = select(cls) @@ -284,3 +289,40 @@ def create( ) ns.save() return ns + + +class MonitorNotificationSettings(RunNotificationSettings[TestRunNotificationTrigger]): + + __mapper_args__: ClassVar = { + "polymorphic_identity": 
NotificationEvent.monitor_run, + } + trigger_enum = MonitorNotificationTrigger + + @property + def table_name(self) -> str | None: + return self.settings["table_name"] if self.settings.get("table_name") else None + + @table_name.setter + def table_name(self, value: str | None) -> None: + self.settings = {**self.settings, "table_name": value} + + @classmethod + def create( + cls, + project_code: str, + table_group_id: UUID, + test_suite_id: UUID, + recipients: list[str], + trigger: MonitorNotificationTrigger, + table_name: str | None = None, + ) -> Self: + ns = cls( + event=NotificationEvent.monitor_run, + project_code=project_code, + table_group_id=table_group_id, + test_suite_id=test_suite_id, + recipients=recipients, + settings={"trigger": trigger.value, "table_name": table_name}, + ) + ns.save() + return ns diff --git a/testgen/common/models/scheduler.py b/testgen/common/models/scheduler.py index da3f7073..fa070e03 100644 --- a/testgen/common/models/scheduler.py +++ b/testgen/common/models/scheduler.py @@ -3,16 +3,19 @@ from typing import Any, Self from uuid import UUID, uuid4 +import streamlit as st from cron_converter import Cron from sqlalchemy import Boolean, Column, String, cast, delete, func, select, update from sqlalchemy.dialects import postgresql from sqlalchemy.orm import InstrumentedAttribute from testgen.common.models import Base, get_current_session +from testgen.common.models.entity import ENTITY_HASH_FUNCS from testgen.common.models.test_definition import TestDefinition from testgen.common.models.test_suite import TestSuite RUN_TESTS_JOB_KEY = "run-tests" +RUN_MONITORS_JOB_KEY = "run-monitors" RUN_PROFILE_JOB_KEY = "run-profile" @@ -29,13 +32,20 @@ class JobSchedule(Base): cron_tz: str = Column(String, nullable=False) active: bool = Column(Boolean, default=True) + @classmethod + @st.cache_data(show_spinner=False, hash_funcs=ENTITY_HASH_FUNCS) + def get(cls, *clauses) -> Self | None: + query = select(cls).where(*clauses) + return
get_current_session().scalars(query).first() + @classmethod def select_where(cls, *clauses, order_by: str | InstrumentedAttribute | None = None) -> Iterable[Self]: + test_job_keys = [RUN_TESTS_JOB_KEY, RUN_MONITORS_JOB_KEY] test_definitions_count = ( select(cls.id) .join(TestSuite, TestSuite.id == cast(cls.kwargs["test_suite_id"].astext, postgresql.UUID)) .join(TestDefinition, TestDefinition.test_suite_id == TestSuite.id) - .where(cls.key == RUN_TESTS_JOB_KEY, cls.active == True) + .where(cls.key.in_(test_job_keys), cls.active == True) .group_by(cls.id, TestSuite.id) .having(func.count(TestDefinition.id) > 0) .subquery() @@ -45,14 +55,14 @@ def select_where(cls, *clauses, order_by: str | InstrumentedAttribute | None = N .join(test_definitions_count, test_definitions_count.c.id == cls.id) .where(*clauses) ) - non_test_runs_query = select(cls).where(cls.key != RUN_TESTS_JOB_KEY, cls.active == True, *clauses) + non_test_runs_query = select(cls).where(cls.key.not_in(test_job_keys), cls.active == True, *clauses) query = test_runs_query.union_all(non_test_runs_query).order_by(order_by) return get_current_session().execute(query) @classmethod def delete(cls, job_id: str | UUID) -> None: - query = delete(cls).where(JobSchedule.id == UUID(job_id)) + query = delete(cls).where(JobSchedule.id == job_id) db_session = get_current_session() try: db_session.execute(query) @@ -60,10 +70,11 @@ def delete(cls, job_id: str | UUID) -> None: db_session.rollback() else: db_session.commit() + cls.clear_cache() @classmethod def update_active(cls, job_id: str | UUID, active: bool) -> None: - query = update(cls).where(JobSchedule.id == UUID(job_id)).values(active=active) + query = update(cls).where(JobSchedule.id == job_id).values(active=active) db_session = get_current_session() try: db_session.execute(query) @@ -71,10 +82,15 @@ def update_active(cls, job_id: str | UUID, active: bool) -> None: db_session.rollback() else: db_session.commit() + cls.clear_cache() @classmethod def count(cls): 
return get_current_session().query(cls).count() + + @classmethod + def clear_cache(cls) -> None: + cls.get.clear() def get_sample_triggering_timestamps(self, n=3) -> list[datetime]: schedule = Cron(cron_string=self.cron_expr).schedule(timezone_str=self.cron_tz) @@ -83,3 +99,9 @@ def get_sample_triggering_timestamps(self, n=3) -> list[datetime]: @property def cron_tz_str(self) -> str: return self.cron_tz.replace("_", " ") + + def save(self) -> None: + db_session = get_current_session() + db_session.add(self) + db_session.commit() + self.__class__.clear_cache() diff --git a/testgen/common/models/table_group.py b/testgen/common/models/table_group.py index ae24af04..938a851b 100644 --- a/testgen/common/models/table_group.py +++ b/testgen/common/models/table_group.py @@ -11,7 +11,6 @@ from testgen.common.models import get_current_session from testgen.common.models.custom_types import NullIfEmptyString, YNString from testgen.common.models.entity import ENTITY_HASH_FUNCS, Entity, EntityMinimal -from testgen.common.models.scheduler import RUN_TESTS_JOB_KEY, JobSchedule from testgen.common.models.scores import ScoreDefinition from testgen.common.models.test_suite import TestSuite @@ -28,6 +27,8 @@ class TableGroupMinimal(EntityMinimal): profiling_exclude_mask: str profile_use_sampling: bool profiling_delay_days: str + monitor_test_suite_id: UUID | None + last_complete_profile_run_id: UUID | None @dataclass @@ -62,6 +63,25 @@ class TableGroupSummary(EntityMinimal): latest_anomalies_likely_ct: int latest_anomalies_possible_ct: int latest_anomalies_dismissed_ct: int + monitor_test_suite_id: UUID | None + monitor_lookback: int | None + monitor_lookback_start: datetime | None + monitor_lookback_end: datetime | None + monitor_freshness_anomalies: int | None + monitor_schema_anomalies: int | None + monitor_volume_anomalies: int | None + monitor_metric_anomalies: int | None + monitor_freshness_has_errors: bool | None + monitor_volume_has_errors: bool | None + 
monitor_schema_has_errors: bool | None + monitor_metric_has_errors: bool | None + monitor_freshness_is_training: bool | None + monitor_volume_is_training: bool | None + monitor_metric_is_training: bool | None + monitor_freshness_is_pending: bool | None + monitor_volume_is_pending: bool | None + monitor_schema_is_pending: bool | None + monitor_metric_is_pending: bool | None class TableGroup(Entity): @@ -70,6 +90,11 @@ class TableGroup(Entity): id: UUID = Column(postgresql.UUID(as_uuid=True), primary_key=True, default=uuid4) project_code: str = Column(String, ForeignKey("projects.project_code")) connection_id: int = Column(BigInteger, ForeignKey("connections.connection_id")) + default_test_suite_id: UUID | None = Column( + postgresql.UUID(as_uuid=True), + ForeignKey("test_suites.id"), + default=None, + ) monitor_test_suite_id: UUID | None = Column( postgresql.UUID(as_uuid=True), ForeignKey("test_suites.id"), @@ -223,6 +248,59 @@ def select_summary(cls, project_code: str, for_dashboard: bool = False) -> Itera anomaly_types.id = latest_anomalies.anomaly_id ) GROUP BY latest_run.id + ), + ranked_test_runs AS ( + SELECT + table_groups.id AS table_group_id, + test_runs.id, + test_runs.test_starttime, + COALESCE(test_suites.monitor_lookback, 1) AS lookback, + ROW_NUMBER() OVER (PARTITION BY test_runs.test_suite_id ORDER BY test_runs.test_starttime DESC) AS position + FROM table_groups + INNER JOIN test_runs + ON (test_runs.test_suite_id = table_groups.monitor_test_suite_id) + INNER JOIN test_suites + ON (table_groups.monitor_test_suite_id = test_suites.id) + WHERE table_groups.project_code = :project_code + ), + monitor_tables AS ( + SELECT + ranked_test_runs.table_group_id, + SUM(CASE WHEN results.test_type = 'Freshness_Trend' AND results.result_code = 0 THEN 1 ELSE 0 END) AS freshness_anomalies, + SUM(CASE WHEN results.test_type = 'Schema_Drift' AND results.result_code = 0 THEN 1 ELSE 0 END) AS schema_anomalies, + SUM(CASE WHEN results.test_type = 'Volume_Trend' AND 
results.result_code = 0 THEN 1 ELSE 0 END) AS volume_anomalies, + SUM(CASE WHEN results.test_type = 'Metric_Trend' AND results.result_code = 0 THEN 1 ELSE 0 END) AS metric_anomalies, + BOOL_OR(results.result_status = 'Error') FILTER (WHERE results.test_type = 'Freshness_Trend' AND ranked_test_runs.position = 1) AS freshness_has_errors, + BOOL_OR(results.result_status = 'Error') FILTER (WHERE results.test_type = 'Volume_Trend' AND ranked_test_runs.position = 1) AS volume_has_errors, + BOOL_OR(results.result_status = 'Error') FILTER (WHERE results.test_type = 'Schema_Drift' AND ranked_test_runs.position = 1) AS schema_has_errors, + BOOL_OR(results.result_status = 'Error') FILTER (WHERE results.test_type = 'Metric_Trend' AND ranked_test_runs.position = 1) AS metric_has_errors, + BOOL_AND(results.result_code = -1) FILTER (WHERE results.test_type = 'Freshness_Trend' AND ranked_test_runs.position = 1) AS freshness_is_training, + BOOL_AND(results.result_code = -1) FILTER (WHERE results.test_type = 'Volume_Trend' AND ranked_test_runs.position = 1) AS volume_is_training, + BOOL_AND(results.result_code = -1) FILTER (WHERE results.test_type = 'Metric_Trend' AND ranked_test_runs.position = 1) AS metric_is_training, + BOOL_OR(results.test_type = 'Freshness_Trend') IS NOT TRUE AS freshness_is_pending, + BOOL_OR(results.test_type = 'Volume_Trend') IS NOT TRUE AS volume_is_pending, + -- Schema monitor only creates results on schema changes (Failed) + -- Mark it as pending only if there are no results of any test type + BOOL_OR(results.test_time IS NOT NULL) IS NOT TRUE AS schema_is_pending, + BOOL_OR(results.test_type = 'Metric_Trend') IS NOT TRUE AS metric_is_pending + FROM ranked_test_runs + INNER JOIN test_results AS results + ON (results.test_run_id = ranked_test_runs.id) + WHERE ranked_test_runs.position <= ranked_test_runs.lookback + AND results.table_name IS NOT NULL + GROUP BY ranked_test_runs.table_group_id + ), + lookback_windows AS ( + SELECT + table_group_id, + 
lookback, + MIN(test_starttime) FILTER (WHERE position = LEAST(lookback + 1, max_position)) AS lookback_start, + MAX(test_starttime) FILTER (WHERE position = 1) AS lookback_end + FROM ( + SELECT *, MAX(position) OVER (PARTITION BY table_group_id) as max_position + FROM ranked_test_runs + ) pos + GROUP BY table_group_id, lookback ) SELECT groups.id, groups.table_groups_name, @@ -240,10 +318,31 @@ def select_summary(cls, project_code: str, for_dashboard: bool = False) -> Itera latest_profile.definite_ct AS latest_anomalies_definite_ct, latest_profile.likely_ct AS latest_anomalies_likely_ct, latest_profile.possible_ct AS latest_anomalies_possible_ct, - latest_profile.dismissed_ct AS latest_anomalies_dismissed_ct + latest_profile.dismissed_ct AS latest_anomalies_dismissed_ct, + groups.monitor_test_suite_id AS monitor_test_suite_id, + lookback_windows.lookback AS monitor_lookback, + lookback_windows.lookback_start AS monitor_lookback_start, + lookback_windows.lookback_end AS monitor_lookback_end, + monitor_tables.freshness_anomalies AS monitor_freshness_anomalies, + monitor_tables.schema_anomalies AS monitor_schema_anomalies, + monitor_tables.volume_anomalies AS monitor_volume_anomalies, + monitor_tables.metric_anomalies AS monitor_metric_anomalies, + monitor_tables.freshness_has_errors AS monitor_freshness_has_errors, + monitor_tables.volume_has_errors AS monitor_volume_has_errors, + monitor_tables.schema_has_errors AS monitor_schema_has_errors, + monitor_tables.metric_has_errors AS monitor_metric_has_errors, + monitor_tables.freshness_is_training AS monitor_freshness_is_training, + monitor_tables.volume_is_training AS monitor_volume_is_training, + monitor_tables.metric_is_training AS monitor_metric_is_training, + monitor_tables.freshness_is_pending AS monitor_freshness_is_pending, + monitor_tables.volume_is_pending AS monitor_volume_is_pending, + monitor_tables.schema_is_pending AS monitor_schema_is_pending, + monitor_tables.metric_is_pending AS 
monitor_metric_is_pending FROM table_groups AS groups LEFT JOIN stats ON (groups.id = stats.table_groups_id) LEFT JOIN latest_profile ON (groups.id = latest_profile.table_groups_id) + LEFT JOIN monitor_tables ON (groups.id = monitor_tables.table_group_id) + LEFT JOIN lookback_windows ON (groups.id = lookback_windows.table_group_id) WHERE groups.project_code = :project_code {"AND groups.include_in_dashboard IS TRUE" if for_dashboard else ""}; """ @@ -331,12 +430,7 @@ def clear_cache(cls) -> bool: cls.select_minimal_where.clear() cls.select_summary.clear() - def save( - self, - add_scorecard_definition: bool = False, - add_monitor_test_suite: bool = False, - monitor_schedule_timezone: str = "UTC", - ) -> None: + def save(self, add_scorecard_definition: bool = False) -> None: if self.id: values = { column.key: getattr(self, column.key, None) @@ -349,38 +443,6 @@ def save( db_session.commit() else: super().save() - db_session = get_current_session() - if add_scorecard_definition: ScoreDefinition.from_table_group(self).save() - - if add_monitor_test_suite: - test_suite = TestSuite( - project_code=self.project_code, - test_suite=f"{self.table_groups_name} Monitor", - connection_id=self.connection_id, - table_groups_id=self.id, - export_to_observability=False, - dq_score_exclude=True, - view_mode="Monitor", - ) - test_suite.save() - - schedule_job = JobSchedule( - project_code=self.project_code, - key=RUN_TESTS_JOB_KEY, - cron_expr="0 * * * *", - cron_tz=monitor_schedule_timezone, - args=[], - kwargs={"test_suite_id": test_suite.id}, - ) - db_session.add(schedule_job) - - self.monitor_test_suite_id = test_suite.id - db_session.execute( - update(TableGroup) - .where(TableGroup.id == self.id).values(monitor_test_suite_id=test_suite.id) - ) - db_session.commit() - TableGroup.clear_cache() diff --git a/testgen/common/models/test_definition.py b/testgen/common/models/test_definition.py index 195121b7..a18b8223 100644 --- a/testgen/common/models/test_definition.py +++ 
b/testgen/common/models/test_definition.py @@ -33,7 +33,21 @@ @dataclass -class TestDefinitionSummary(EntityMinimal): +class TestTypeSummary(EntityMinimal): + test_name_short: str + default_test_description: str + measure_uom: str + measure_uom_description: str + default_parm_columns: str + default_parm_prompts: str + default_parm_help: str + default_severity: str + test_scope: TestScope + usage_notes: str + + +@dataclass +class TestDefinitionSummary(TestTypeSummary): id: UUID table_groups_id: UUID profile_run_id: UUID @@ -67,25 +81,17 @@ class TestDefinitionSummary(EntityMinimal): match_having_condition: str custom_query: str history_calculation: str + history_calculation_upper: str history_lookback: int - test_active: str + test_active: bool test_definition_status: str severity: str - lock_refresh: str + lock_refresh: bool last_auto_gen_date: datetime profiling_as_of_date: datetime last_manual_update: datetime - export_to_observability: str - test_name_short: str - default_test_description: str - measure_uom: str - measure_uom_description: str - default_parm_columns: str - default_parm_prompts: str - default_parm_help: str - default_severity: str - test_scope: TestScope - usage_notes: str + export_to_observability: bool + prediction: str | None @dataclass @@ -143,6 +149,17 @@ class TestType(Entity): usage_notes: str = Column(String) active: str = Column(String) + _summary_columns = ( + *[key for key in TestTypeSummary.__annotations__.keys() if key != "default_test_description"], + test_description.label("default_test_description"), + ) + + @classmethod + @st.cache_data(show_spinner=False, hash_funcs=ENTITY_HASH_FUNCS) + def select_summary_where(cls, *clauses) -> Iterable[TestTypeSummary]: + results = cls._select_columns_where(cls._summary_columns, *clauses) + return [TestTypeSummary(**row) for row in results] + class TestDefinition(Entity): __tablename__ = "test_definitions" @@ -179,6 +196,7 @@ class TestDefinition(Entity): match_groupby_names: str = 
Column(NullIfEmptyString) match_having_condition: str = Column(NullIfEmptyString) history_calculation: str = Column(NullIfEmptyString) + history_calculation_upper: str = Column(NullIfEmptyString) history_lookback: int = Column(ZeroIfEmptyInteger, default=0) test_mode: str = Column(String) custom_query: str = Column(QueryString) @@ -192,10 +210,12 @@ class TestDefinition(Entity): profiling_as_of_date: datetime = Column(postgresql.TIMESTAMP) last_manual_update: datetime = Column(UpdateTimestamp, nullable=False) export_to_observability: bool = Column(YNString) + prediction: dict[str, dict[str, float]] | None = Column(postgresql.JSONB) _default_order_by = (asc(func.lower(schema_name)), asc(func.lower(table_name)), asc(func.lower(column_name)), asc(test_type)) _summary_columns = ( - *[key for key in TestDefinitionSummary.__annotations__.keys() if key != "default_test_description"], + *TestDefinitionSummary.__annotations__.keys(), + *[key for key in TestTypeSummary.__annotations__.keys() if key != "default_test_description"], TestType.test_description.label("default_test_description"), ) _minimal_columns = TestDefinitionMinimal.__annotations__.keys() @@ -211,6 +231,7 @@ class TestDefinition(Entity): check_result, last_auto_gen_date, profiling_as_of_date, + prediction, ) @classmethod diff --git a/testgen/common/models/test_result.py b/testgen/common/models/test_result.py index de96c97e..dd8d9ded 100644 --- a/testgen/common/models/test_result.py +++ b/testgen/common/models/test_result.py @@ -2,7 +2,7 @@ from collections import defaultdict from uuid import UUID, uuid4 -from sqlalchemy import Boolean, Column, Enum, ForeignKey, String, and_, or_, select +from sqlalchemy import Boolean, Column, Enum, ForeignKey, Integer, String, or_, select from sqlalchemy.dialects import postgresql from sqlalchemy.orm import aliased @@ -40,6 +40,7 @@ class TestResult(Entity): status: TestResultStatus = Column("result_status", Enum(TestResultStatus)) message: str = Column("result_message", 
String) + result_code: int = Column(Integer) # Note: not all table columns are implemented by this entity @classmethod @@ -50,22 +51,7 @@ def diff(cls, test_run_id_a: UUID, test_run_id_b: UUID) -> list[TestResultDiffTy alias_a.status, alias_b.status, alias_b.test_definition_id, ).join( alias_b, - or_( - and_( - alias_a.auto_gen.is_(True), - alias_b.auto_gen.is_(True), - alias_a.test_suite_id == alias_b.test_suite_id, - alias_a.schema_name == alias_b.schema_name, - alias_a.table_name.isnot_distinct_from(alias_b.table_name), - alias_a.column_names.isnot_distinct_from(alias_b.column_names), - alias_a.test_type == alias_b.test_type, - ), - and_( - alias_a.auto_gen.isnot(True), - alias_b.auto_gen.isnot(True), - alias_a.test_definition_id == alias_b.test_definition_id, - ), - ), + alias_a.test_definition_id == alias_b.test_definition_id, full=True, ).where( or_(alias_a.test_run_id == test_run_id_a, alias_a.test_run_id.is_(None)), diff --git a/testgen/common/models/test_run.py b/testgen/common/models/test_run.py index 01671315..3709328a 100644 --- a/testgen/common/models/test_run.py +++ b/testgen/common/models/test_run.py @@ -12,7 +12,9 @@ from testgen.common.models import get_current_session from testgen.common.models.entity import Entity, EntityMinimal -from testgen.common.models.test_result import TestResultStatus +from testgen.common.models.project import Project +from testgen.common.models.table_group import TableGroup +from testgen.common.models.test_result import TestResult, TestResultStatus from testgen.common.models.test_suite import TestSuite from testgen.utils import is_uuid4 @@ -61,6 +63,20 @@ class TestRunSummary(EntityMinimal): dismissed_ct: int dq_score_testing: float + +@dataclass +class TestRunMonitorSummary(EntityMinimal): + test_run_id: UUID + table_group_id: UUID + test_endtime: datetime + table_groups_name: str + project_name: str + freshness_anomalies: int + schema_anomalies: int + volume_anomalies: int + table_name: str | None = None + + class 
LatestTestRun(NamedTuple): id: str run_time: datetime @@ -240,7 +256,7 @@ def select_summary( INNER JOIN test_suites ON (test_runs.test_suite_id = test_suites.id) INNER JOIN table_groups ON (test_suites.table_groups_id = table_groups.id) INNER JOIN projects ON (test_suites.project_code = projects.project_code) - WHERE TRUE + WHERE test_suites.is_monitor IS NOT TRUE {" AND test_suites.project_code = :project_code" if project_code else ""} {" AND test_suites.table_groups_id = :table_group_id" if table_group_id else ""} {" AND test_suites.id = :test_suite_id" if test_suite_id else ""} @@ -257,6 +273,54 @@ def select_summary( results = db_session.execute(text(query), params).mappings().all() return [TestRunSummary(**row) for row in results] + def get_monitoring_summary(self, table_name: str | None = None) -> TestRunMonitorSummary: + freshness_anomalies = func.sum(case( + ((TestResult.test_type == "Table_Freshness") & (TestResult.result_code == 0), 1), + else_=0, + )) + schema_anomalies = func.sum(case( + ((TestResult.test_type == "Schema_Drift") & (TestResult.result_code == 0), 1), + else_=0, + )) + volume_anomalies = func.sum(case( + ((TestResult.test_type == "Volume_Trend") & (TestResult.result_code == 0), 1), + else_=0, + )) + projection = [ + TestRun.id.label("test_run_id"), + TestRun.test_endtime, + TableGroup.id.label("table_group_id"), + TableGroup.table_groups_name, + Project.project_name, + freshness_anomalies.label("freshness_anomalies"), + schema_anomalies.label("schema_anomalies"), + volume_anomalies.label("volume_anomalies"), + ] + group_by = [ + TestRun.id, + TestRun.test_endtime, + TableGroup.id, + TableGroup.table_groups_name, + Project.project_name, + ] + if table_name: + projection.append(TestResult.table_name) + group_by.append(TestResult.table_name) + + query = ( + select(*projection) + .join(TableGroup, TableGroup.monitor_test_suite_id == TestRun.test_suite_id) + .join(Project, Project.project_code == TableGroup.project_code) + .join(TestResult, 
TestResult.test_run_id == TestRun.id) + .where( + TestRun.id == self.id, + (TestResult.table_name == table_name) if table_name else True, + ) + .group_by(*group_by) + ) + + return TestRunMonitorSummary(**get_current_session().execute(query).first()) + @classmethod def has_running_process(cls, ids: list[str]) -> bool: query = select(func.count(cls.id)).where(cls.id.in_(ids), cls.status == "Running") diff --git a/testgen/common/models/test_suite.py b/testgen/common/models/test_suite.py index ce7d601c..a8c35b8d 100644 --- a/testgen/common/models/test_suite.py +++ b/testgen/common/models/test_suite.py @@ -1,10 +1,11 @@ +import enum from collections.abc import Iterable from dataclasses import dataclass from datetime import datetime from uuid import UUID, uuid4 import streamlit as st -from sqlalchemy import BigInteger, Boolean, Column, ForeignKey, String, asc, func, text +from sqlalchemy import BigInteger, Boolean, Column, Enum, ForeignKey, Integer, String, asc, func, text from sqlalchemy.dialects import postgresql from sqlalchemy.orm import InstrumentedAttribute @@ -14,6 +15,11 @@ from testgen.utils import is_uuid4 +class PredictSensitivity(enum.Enum): + low = "low" + medium = "medium" + high = "high" + @dataclass class TestSuiteMinimal(EntityMinimal): id: UUID @@ -46,7 +52,6 @@ class TestSuiteSummary(EntityMinimal): last_run_log_ct: int last_run_dismissed_ct: int - class TestSuite(Entity): __tablename__ = "test_suites" @@ -63,7 +68,19 @@ class TestSuite(Entity): component_name: str = Column(NullIfEmptyString) last_complete_test_run_id: UUID = Column(postgresql.UUID(as_uuid=True)) dq_score_exclude: bool = Column(Boolean, default=False) - view_mode: str | None = Column(NullIfEmptyString, default=None) + is_monitor: bool = Column(Boolean, default=False) + monitor_lookback: int | None = Column(Integer) + monitor_regenerate_freshness: bool = Column(Boolean, default=True) + predict_sensitivity: PredictSensitivity | None = Column(Enum(PredictSensitivity)) + 
predict_min_lookback: int | None = Column(Integer) + predict_exclude_weekends: bool = Column(Boolean, default=False) + predict_holiday_codes: str | None = Column(String) + + @property + def holiday_codes_list(self) -> list[str] | None: + if not self.predict_holiday_codes: + return None + return [code.strip() for code in self.predict_holiday_codes.split(",")] _default_order_by = (asc(func.lower(test_suite)),) _minimal_columns = TestSuiteMinimal.__annotations__.keys() @@ -179,7 +196,8 @@ def select_summary(cls, project_code: str, table_group_id: str | UUID | None = N ON (connections.connection_id = suites.connection_id) LEFT JOIN table_groups AS groups ON (groups.id = suites.table_groups_id) - WHERE suites.project_code = :project_code + WHERE suites.is_monitor IS NOT TRUE + AND suites.project_code = :project_code {"AND suites.table_groups_id = :table_group_id" if table_group_id else ""} ORDER BY LOWER(suites.test_suite); """ diff --git a/testgen/common/notifications/monitor_run.py b/testgen/common/notifications/monitor_run.py new file mode 100644 index 00000000..c3893aad --- /dev/null +++ b/testgen/common/notifications/monitor_run.py @@ -0,0 +1,310 @@ +import logging + +from testgen.common.models import with_database_session +from testgen.common.models.notification_settings import ( + MonitorNotificationSettings, + MonitorNotificationTrigger, +) +from testgen.common.models.project import Project +from testgen.common.models.settings import PersistedSetting +from testgen.common.models.table_group import TableGroup +from testgen.common.models.test_result import TestResult, TestResultStatus +from testgen.common.models.test_run import TestRun +from testgen.common.notifications.notifications import BaseNotificationTemplate +from testgen.utils import log_and_swallow_exception + +LOG = logging.getLogger("testgen") + + +class MonitorEmailTemplate(BaseNotificationTemplate): + + def get_subject_template(self) -> str: + return ( + "[TestGen] Monitors Alert: 
{{summary.table_groups_name}}" + "{{#if summary.table_name}} | {{summary.table_name}}{{/if}}" + ' | {{total_anomalies}} {{pluralize total_anomalies "anomaly" "anomalies"}}' + ) + + def get_title_template(self): + return "Monitors Alert: {{summary.table_groups_name}}" + + def get_main_content_template(self): + return """ +
+ + + + + + + + + + {{#if summary.table_name}} + + + + + {{/if}} + + + + +
Project{{summary.project_name}}
Table Group{{summary.table_groups_name}}
Table{{summary.table_name}}
Time{{format_dt summary.test_endtime}}
+
+
+ + + + + + + + + + +
Anomalies Summary + View on TestGen > +
+ + + + {{#each summary_tags}} + {{>summary_tag .}} + {{/each}} + + +
+
+
+
+ + + + + + + {{#each anomalies}} + + + + + + {{/each}} + + + + + +
TableMonitorDetails
{{truncate 30 table_name}}{{type}}{{details}}
+ {{#if truncated}} + + {{truncated}} more + {{/if}} +
+
+ """ + + def get_summary_tag_template(self): + return """ + + + + + + +
+
+ {{{badge_content}}} +
+
{{type}}
+ + """ + + def get_extra_css_template(self) -> str: + return """ + .tg-summary-bar { + width: 350px; + border-radius: 4px; + overflow: hidden; + } + + .tg-summary-bar td { + height: 10px; + padding: 0; + line-height: 10px; + font-size: 0; + } + + .tg-summary-bar--caption { + margin-top: 4px; + color: var(--caption-text-color); + font-size: 13px; + font-style: italic; + line-height: 1; + } + + .tg-summary-bar--legend { + width: auto; + margin-right: 8px; + } + + .tg-summary-bar--legend-dot { + margin-right: 2px; + font-style: normal; + } + """ + + +@log_and_swallow_exception +@with_database_session +def send_monitor_notifications(test_run: TestRun, result_list_ct=20): + notifications = list(MonitorNotificationSettings.select( + enabled=True, + test_suite_id=test_run.test_suite_id, + )) + if not notifications: + return + + triggers = {MonitorNotificationTrigger.on_anomalies} + notifications = [ns for ns in notifications if ns.trigger in triggers] + if not notifications: + return + + table_group, = TableGroup.select_where(TableGroup.monitor_test_suite_id == test_run.test_suite_id) + if not table_group: + return + + project = Project.get(table_group.project_code) + for notification in notifications: + table_name = notification.settings.get("table_name") + test_results = list(TestResult.select_where( + TestResult.test_run_id == test_run.id, + (TestResult.table_name == table_name) if table_name else True, + )) + anomaly_results = [r for r in test_results if r.result_code == 0] + + if len(anomaly_results) <= 0: + continue + + anomalies = [] + for test_result in anomaly_results: + label = _TEST_TYPE_LABELS.get(test_result.test_type) + details = test_result.message or "N/A" + + if test_result.test_type == "Freshness_Trend": + parts = details.split(". 
", 1) + message = parts[1].rstrip(".") if len(parts) > 1 else None + prefix = "Table updated" if "detected: Yes" in details else "No table update" + details = f"{prefix} - {message}" if message else prefix + elif test_result.test_type == "Metric_Trend": + label = f"{label}: {test_result.column_names}" + + anomalies.append({ + "table_name": test_result.table_name or "N/A", + "type": label, + "details": details, + }) + + view_in_testgen_url = "".join( + ( + PersistedSetting.get("BASE_URL", ""), + "/monitors?project_code=", + str(table_group.project_code), + "&table_group_id=", + str(table_group.id), + "&table_name_filter=" if table_name else "", + table_name if table_name else "", + "&source=email", + ) + ) + try: + MonitorEmailTemplate().send( + notification.recipients, + { + "summary": { + "test_endtime": test_run.test_endtime, + "table_groups_name": table_group.table_groups_name, + "project_name": project.project_name, + "table_name": table_name, + }, + "total_anomalies": len(anomaly_results), + "summary_tags": _build_summary_tags(test_results), + "anomalies": anomalies[:result_list_ct], + "truncated": max(len(anomalies) - result_list_ct, 0), + "view_in_testgen_url": view_in_testgen_url, + }, + ) + except Exception: + LOG.exception("Failed sending monitor email notifications") + + +_TEST_TYPE_LABELS = { + "Freshness_Trend": "Freshness", + "Volume_Trend": "Volume", + "Schema_Drift": "Schema", + "Metric_Trend": "Metric", +} + +_BADGE_BASE = "text-align: center; font-weight: bold; font-size: 13px;" +_BADGE_STYLES = { + "anomaly": f"background-color: #EF5350; min-width: 15px; padding: 0 5px; border-radius: 10px; line-height: 20px; color: #ffffff; {_BADGE_BASE}", + "error": f"width: 20px; height: 20px; line-height: 20px; color: #FFA726; font-size: 16px; {_BADGE_BASE}", + "training": f"border: 2px solid #42A5F5; width: 20px; height: 20px; border-radius: 50%; line-height: 16px; color: #42A5F5; box-sizing: border-box; {_BADGE_BASE}", + "pending": f"width: 20px; height: 
20px; line-height: 20px; color: #9E9E9E; {_BADGE_BASE}", + "passed": f"background-color: #9CCC65; width: 20px; height: 20px; border-radius: 50%; line-height: 21px; color: #ffffff; {_BADGE_BASE}", +} +_BADGE_CONTENT = { + "error": "⚠", + "training": "···", + "pending": "—", + "passed": "✓", +} + + +def _build_summary_tags(test_results: list[TestResult]) -> list[dict]: + has_any_results = bool(test_results) + tags = [] + for type_key, label in _TEST_TYPE_LABELS.items(): + type_results = [r for r in test_results if r.test_type == type_key] + anomaly_count = sum(1 for r in type_results if r.result_code == 0) + has_errors = any(r.status == TestResultStatus.Error for r in type_results) + + # Schema Drift only creates results on detected changes, and has no training phase. + # Pending = no results of any type; no Schema results but other types ran = passed. + if type_key == "Schema_Drift": + is_pending = not has_any_results + is_training = False + else: + is_pending = not type_results + is_training = bool(type_results) and all(r.result_code == -1 for r in type_results) + + # Priority matches UI: anomalies > errors > training > pending > passed + if anomaly_count > 0: + state = "anomaly" + elif has_errors: + state = "error" + elif is_training: + state = "training" + elif is_pending: + state = "pending" + else: + state = "passed" + + tags.append({ + "type": label, + "badge_style": _BADGE_STYLES[state], + "badge_content": str(anomaly_count) if state == "anomaly" else _BADGE_CONTENT[state], + }) + return tags diff --git a/testgen/common/notifications/notifications.py b/testgen/common/notifications/notifications.py index e26fdab9..b4343e2e 100644 --- a/testgen/common/notifications/notifications.py +++ b/testgen/common/notifications/notifications.py @@ -393,7 +393,7 @@ def get_body_template(self) -> str: - TestGen Help diff --git a/testgen/common/notifications/profiling_run.py b/testgen/common/notifications/profiling_run.py index e97e494b..91e06092 100644 --- 
a/testgen/common/notifications/profiling_run.py +++ b/testgen/common/notifications/profiling_run.py @@ -258,7 +258,7 @@ def send_profiling_run_notifications(profiling_run: ProfilingRun, result_list_ct return profiling_run_issues_url = "".join( - (PersistedSetting.get("BASE_URL", ""), "/profiling-runs:hygiene?run_id=", str(profiling_run.id)) + (PersistedSetting.get("BASE_URL", ""), "/profiling-runs:hygiene?run_id=", str(profiling_run.id), "&source=email") ) hygiene_issues_summary = [] @@ -304,7 +304,7 @@ def send_profiling_run_notifications(profiling_run: ProfilingRun, result_list_ct "id": str(profiling_run.id), "issues_url": profiling_run_issues_url, "results_url": "".join( - (PersistedSetting.get("BASE_URL", ""), "/profiling-runs:results?run_id=", str(profiling_run.id)) + (PersistedSetting.get("BASE_URL", ""), "/profiling-runs:results?run_id=", str(profiling_run.id), "&source=email") ), "start_time": profiling_run.profiling_starttime, "end_time": profiling_run.profiling_endtime, diff --git a/testgen/common/notifications/score_drop.py b/testgen/common/notifications/score_drop.py index 62dfe817..1bf33d87 100644 --- a/testgen/common/notifications/score_drop.py +++ b/testgen/common/notifications/score_drop.py @@ -180,6 +180,7 @@ def send_score_drop_notifications(notification_data: list[tuple[ScoreDefinition, PersistedSetting.get("BASE_URL", ""), "/quality-dashboard:score-details?definition_id=", str(definition.id), + "&source=email", ) ), "diff": context_diff, diff --git a/testgen/common/notifications/test_run.py b/testgen/common/notifications/test_run.py index 7a24578f..8452cd39 100644 --- a/testgen/common/notifications/test_run.py +++ b/testgen/common/notifications/test_run.py @@ -3,7 +3,10 @@ from sqlalchemy import case, literal, select from testgen.common.models import get_current_session, with_database_session -from testgen.common.models.notification_settings import TestRunNotificationSettings, TestRunNotificationTrigger +from 
testgen.common.models.notification_settings import ( + TestRunNotificationSettings, + TestRunNotificationTrigger, +) from testgen.common.models.settings import PersistedSetting from testgen.common.models.test_definition import TestType from testgen.common.models.test_result import TestResult, TestResultStatus @@ -242,7 +245,10 @@ def get_extra_css_template(self) -> str: @with_database_session def send_test_run_notifications(test_run: TestRun, result_list_ct=20, result_status_min=5): - notifications = list(TestRunNotificationSettings.select(enabled=True, test_suite_id=test_run.test_suite_id)) + notifications = list(TestRunNotificationSettings.select( + enabled=True, + test_suite_id=test_run.test_suite_id, + )) if not notifications: return @@ -303,7 +309,10 @@ def send_test_run_notifications(test_run: TestRun, result_list_ct=20, result_sta TestType.test_name_short.label("test_type"), ) .join(TestType, TestType.test_type == TestResult.test_type) - .where(TestResult.test_run_id == test_run.id, TestResult.status == status) + .where( + TestResult.test_run_id == test_run.id, + TestResult.status == status, + ) .order_by(changed_case.desc()) .limit(result_count_by_status[status]) ) @@ -312,15 +321,18 @@ def send_test_run_notifications(test_run: TestRun, result_list_ct=20, result_sta tr_summary, = TestRun.select_summary(test_run_ids=[test_run.id]) + test_run_url = "".join( + ( + PersistedSetting.get("BASE_URL", ""), + "/test-runs:results?run_id=", + str(test_run.id), + "&source=email", + ) + ) + context = { "test_run": tr_summary, - "test_run_url": "".join( - ( - PersistedSetting.get("BASE_URL", ""), - "/test-runs:results?run_id=", - str(test_run.id), - ) - ), + "test_run_url": test_run_url, "test_run_id": str(test_run.id), "test_result_summary": [ { @@ -329,6 +341,7 @@ def send_test_run_notifications(test_run: TestRun, result_list_ct=20, result_sta "total": test_run.ct_by_status[status], "truncated": test_run.ct_by_status[status] - len(result_list), "result_list": 
result_list, + } + for status, label in summary_statuses + if (result_list := result_list_by_status.get(status, None)) diff --git a/testgen/common/read_yaml_metadata_records.py b/testgen/common/read_yaml_metadata_records.py index 28f8cf59..66619b1c 100644 --- a/testgen/common/read_yaml_metadata_records.py +++ b/testgen/common/read_yaml_metadata_records.py @@ -63,6 +63,9 @@ "target_data_lookups": [ "lookup_query", ], + "test_templates": [ + "template", + ], } diff --git a/testgen/common/time_series_service.py b/testgen/common/time_series_service.py new file mode 100644 index 00000000..7aca697a --- /dev/null +++ b/testgen/common/time_series_service.py @@ -0,0 +1,167 @@ +import logging +from datetime import datetime + +import holidays +import numpy as np +import pandas as pd +from statsmodels.tsa.statespace.sarimax import SARIMAX + +LOG = logging.getLogger("testgen") + +# This is a heuristic minimum to get a reasonable prediction +# Not a hard limit of the model +MIN_TRAIN_VALUES = 20 + + +class NotEnoughData(ValueError): + pass + + +def get_sarimax_forecast( + history: pd.DataFrame, + num_forecast: int, + exclude_weekends: bool = False, + holiday_codes: list[str] | None = None, + tz: str | None = None, +) -> pd.DataFrame: + """ + # Parameters + :param history: Pandas dataframe containing time series data to be used for training the model. + It must have a DatetimeIndex and a column with the historical values. + Only the first column will be used for the model. + :param num_forecast: Number of values to predict in the future. + :param exclude_weekends: Whether weekends should be considered exogenous when training the model and forecasting. + :param holiday_codes: List of country or financial market codes defining holidays to be considered exogenous when training the model and forecasting. + :param tz: IANA timezone (e.g. "America/New_York") for day-of-week/holiday checks. 
Naive timestamps are treated as UTC and converted to this timezone before determining weekday/holiday status.
+
+    # Return value
+    Returns a pandas DataFrame with forecast DatetimeIndex, "mean" column, and "se" (standard error) column.
+    """
+    if len(history) < MIN_TRAIN_VALUES:
+        raise NotEnoughData("Not enough data points in history.")
+
+    # statsmodels requires DatetimeIndex with a regular frequency
+    # Resample the data to get a regular time series
+    datetimes = history.index.to_series()
+    frequency = infer_frequency(datetimes)
+    resampled_history = history.resample(frequency).mean().interpolate(method="linear")
+
+    if len(resampled_history) < MIN_TRAIN_VALUES:
+        raise NotEnoughData("Not enough data points after resampling.")
+
+    # Generate DatetimeIndex with future dates
+    forecast_start = resampled_history.index[-1] + pd.to_timedelta(frequency)
+    forecast_index = pd.date_range(start=forecast_start, periods=num_forecast, freq=frequency)
+
+    # Detect holidays in the entire date range
+    holiday_dates = None
+    if holiday_codes:
+        all_dates_index = resampled_history.index.append(forecast_index)
+        holiday_dates = get_holiday_dates(holiday_codes, all_dates_index)
+
+    def get_exog_flags(index: pd.DatetimeIndex) -> pd.DataFrame:
+        exog = pd.DataFrame(index=index)
+        exog["is_excluded"] = 0
+        # Use local timezone for day-of-week and holiday checks when available
+        check_index = index.tz_localize("UTC").tz_convert(tz) if tz else index
+        if exclude_weekends:
+            # .dayofweek: 5=Saturday, 6=Sunday
+            exog.loc[check_index.dayofweek >= 5, "is_excluded"] = 1
+        if holiday_dates:
+            exog.loc[pd.Index(check_index.date).isin(holiday_dates), "is_excluded"] = 1
+        return exog
+
+    exog_train = get_exog_flags(resampled_history.index)
+
+    # When seasonal_order is not specified, this is effectively the ARIMAX model
+    model = SARIMAX(
+        resampled_history.iloc[:, 0],
+        exog=exog_train,
+        # (1, 1, 1) is a reasonable default order - tune if needed
+        order=(1, 1, 1),
+        # Prevent model from crashing when it encounters noisy/non-standard data
+        enforce_stationarity=False,
+        enforce_invertibility=False
+    )
+    fitted_model = model.fit(disp=False)
+
+    exog_forecast = get_exog_flags(forecast_index)
+    forecast = fitted_model.get_forecast(steps=num_forecast, exog=exog_forecast)
+
+    results = pd.DataFrame(index=forecast_index)
+    results["mean"] = forecast.predicted_mean
+
+    # SE estimation: take the max of three sources to prevent overconfident bounds.
+    # 1. Model SE (var_pred_mean): can be artificially small when AR/MA nearly cancel
+    # 2. Residual SE: the model's actual 1-step prediction errors (after Kalman burn-in)
+    # 3. Raw diff SE: std of first-differences of the original data - captures inherent
+    #    point-to-point variability that the model may underestimate
+    model_se = forecast.var_pred_mean ** 0.5
+    order_sum = model.k_ar + model.k_diff + model.k_ma
+    burn_in = max(order_sum, 3)
+    usable_residuals = fitted_model.resid.iloc[burn_in:]
+    resid_se = usable_residuals.std() if len(usable_residuals) >= 5 else 0.0
+    raw_diffs = np.diff(history.iloc[:, 0].values)
+    raw_diff_se = np.std(raw_diffs, ddof=1) if len(raw_diffs) > 1 else 0.0
+    results["se"] = np.maximum(model_se, max(resid_se, raw_diff_se))
+
+    return results
+
+
+def infer_frequency(datetime_series: pd.Series) -> str:
+    # Calculate the median interval between observations
+    time_diffs = datetime_series.diff().dropna()
+    median_diff = time_diffs.median()
+
+    total_seconds = median_diff.total_seconds()
+
+    # Close to an integer number of days
+    days = total_seconds / 86400
+    nearest_day = round(days)
+    if nearest_day >= 1 and abs(days - nearest_day) / nearest_day < 0.05:
+        return f"{int(nearest_day)}D"
+
+    # Close to an integer number of hours
+    hours = total_seconds / 3600
+    nearest_hour = round(hours)
+    if nearest_hour > 0 and abs(hours - nearest_hour) / nearest_hour < 0.05:
+        return f"{int(nearest_hour)}h"
+
+    # Fall back to minutes or seconds
+    frequency = f"{int(total_seconds // 60)}min"
+    return frequency if frequency != "0min" else f"{int(total_seconds)}s"
+
+
+def get_holiday_dates(holiday_codes: list[str], datetime_index: pd.DatetimeIndex) -> set[datetime]:
+    years = list(range(datetime_index.year.min(), datetime_index.year.max() + 1))
+
+    holiday_dates = set()
+    if holiday_codes:
+        for code in holiday_codes:
+            code = code.strip().upper()
+            found = False
+
+            try:
+                country_holidays = holidays.country_holidays(code, years=years)
+                holiday_dates.update(country_holidays.keys())
+                found = True
+            except NotImplementedError:
+                pass  # Not a valid country code
+
+            if not found:
+                try:
+                    financial_holidays = holidays.financial_holidays(code, years=years)
+                    holiday_dates.update(financial_holidays.keys())
+                    found = True
+                except NotImplementedError:
+                    pass  # Not a valid financial code
+
+            if not found:
+                LOG.warning(f"Holiday code '{code}' could not be resolved as a country or financial market")
+
+    return holiday_dates
diff --git a/testgen/settings.py b/testgen/settings.py
index 98d1c7f5..cf71768d 100644
--- a/testgen/settings.py
+++ b/testgen/settings.py
@@ -299,14 +299,6 @@
 defaults to: `default_suite_desc`
 """
 
-DEFAULT_MONITOR_TEST_SUITE_KEY: str = os.getenv("DEFAULT_MONITOR_TEST_SUITE_NAME", "default-monitor-suite-1")
-"""
-Key to be assgined to the auto generated monitoring test suite.
- -from env variable: `DEFAULT_MONITOR_TEST_SUITE_NAME` -defaults to: `default-monitor-suite-1` -""" - DEFAULT_PROFILING_TABLE_SET = os.getenv("DEFAULT_PROFILING_TABLE_SET", "") """ Comma separated list of specific table names to include when running diff --git a/testgen/template/data_chars/data_chars_update.sql b/testgen/template/data_chars/data_chars_update.sql index 448d07cf..38281dad 100644 --- a/testgen/template/data_chars/data_chars_update.sql +++ b/testgen/template/data_chars/data_chars_update.sql @@ -17,20 +17,37 @@ WITH new_chars AS ( schema_name, table_name, run_date +), +updated_records AS ( + UPDATE data_table_chars + SET approx_record_ct = n.approx_record_ct, + record_ct = n.record_ct, + column_ct = n.column_ct, + last_refresh_date = n.run_date, + drop_date = NULL + FROM new_chars n + INNER JOIN data_table_chars d ON ( + n.table_groups_id = d.table_groups_id + AND n.schema_name = d.schema_name + AND n.table_name = d.table_name + ) + WHERE data_table_chars.table_id = d.table_id + RETURNING data_table_chars.*, d.drop_date as old_drop_date ) -UPDATE data_table_chars -SET approx_record_ct = n.approx_record_ct, - record_ct = n.record_ct, - column_ct = n.column_ct, - last_refresh_date = n.run_date, - drop_date = NULL -FROM new_chars n - INNER JOIN data_table_chars d ON ( - n.table_groups_id = d.table_groups_id - AND n.schema_name = d.schema_name - AND n.table_name = d.table_name - ) -WHERE data_table_chars.table_id = d.table_id; +INSERT INTO data_structure_log ( + table_groups_id, + table_id, + table_name, + change_date, + change +) +SELECT u.table_groups_id, + u.table_id, + u.table_name, + u.last_refresh_date, + 'A' + FROM updated_records u + WHERE u.old_drop_date IS NOT NULL; -- Add new records WITH new_chars AS ( @@ -47,32 +64,48 @@ WITH new_chars AS ( schema_name, table_name, run_date +), +inserted_records AS ( + INSERT INTO data_table_chars ( + table_groups_id, + schema_name, + table_name, + add_date, + last_refresh_date, + approx_record_ct, + 
record_ct, + column_ct + ) + SELECT n.table_groups_id, + n.schema_name, + n.table_name, + n.run_date, + n.run_date, + n.approx_record_ct, + n.record_ct, + n.column_ct + FROM new_chars n + LEFT JOIN data_table_chars d ON ( + n.table_groups_id = d.table_groups_id + AND n.schema_name = d.schema_name + AND n.table_name = d.table_name + ) + WHERE d.table_id IS NULL + RETURNING data_table_chars.* ) -INSERT INTO data_table_chars ( - table_groups_id, - schema_name, - table_name, - add_date, - last_refresh_date, - approx_record_ct, - record_ct, - column_ct - ) -SELECT n.table_groups_id, - n.schema_name, - n.table_name, - n.run_date, - n.run_date, - n.approx_record_ct, - n.record_ct, - n.column_ct -FROM new_chars n - LEFT JOIN data_table_chars d ON ( - n.table_groups_id = d.table_groups_id - AND n.schema_name = d.schema_name - AND n.table_name = d.table_name - ) -WHERE d.table_id IS NULL; +INSERT INTO data_structure_log ( + table_groups_id, + table_id, + table_name, + change_date, + change +) +SELECT i.table_groups_id, + i.table_id, + i.table_name, + i.add_date, + 'A' + FROM inserted_records i; -- Mark dropped records WITH new_chars AS ( @@ -91,19 +124,35 @@ last_run AS ( FROM stg_data_chars_updates WHERE table_groups_id = :TABLE_GROUPS_ID GROUP BY table_groups_id +), +deleted_records AS ( + UPDATE data_table_chars + SET drop_date = l.last_run_date + FROM last_run l + INNER JOIN data_table_chars d ON (l.table_groups_id = d.table_groups_id) + LEFT JOIN new_chars n ON ( + d.table_groups_id = n.table_groups_id + AND d.schema_name = n.schema_name + AND d.table_name = n.table_name + ) + WHERE data_table_chars.table_id = d.table_id + AND d.drop_date IS NULL + AND n.table_name IS NULL + RETURNING data_table_chars.* ) -UPDATE data_table_chars -SET drop_date = l.last_run_date -FROM last_run l - INNER JOIN data_table_chars d ON (l.table_groups_id = d.table_groups_id) - LEFT JOIN new_chars n ON ( - d.table_groups_id = n.table_groups_id - AND d.schema_name = n.schema_name - AND 
d.table_name = n.table_name - ) -WHERE data_table_chars.table_id = d.table_id - AND d.drop_date IS NULL - AND n.table_name IS NULL; +INSERT INTO data_structure_log ( + table_groups_id, + table_id, + table_name, + change_date, + change +) +SELECT del.table_groups_id, + del.table_id, + del.table_name, + del.drop_date, + 'D' + FROM deleted_records del; -- ============================================================================== -- | Column Characteristics @@ -138,22 +187,43 @@ update_chars AS ( ) WHERE data_column_chars.table_id = d.table_id AND data_column_chars.column_name = d.column_name - RETURNING data_column_chars.*, d.db_data_type as old_data_type + RETURNING data_column_chars.*, d.db_data_type as old_data_type, d.drop_date as old_drop_date, n.run_date as run_date ) INSERT INTO data_structure_log ( - element_id, + table_groups_id, + table_id, + column_id, + table_name, + column_name, change_date, change, old_data_type, new_data_type ) -SELECT u.column_id, +SELECT u.table_groups_id, + u.table_id, + u.column_id, + u.table_name, + u.column_name, u.last_mod_date, 'M', u.old_data_type, u.db_data_type FROM update_chars u - WHERE u.old_data_type <> u.db_data_type; + WHERE u.old_data_type <> u.db_data_type + AND u.old_drop_date IS NULL +UNION ALL +SELECT u.table_groups_id, + u.table_id, + u.column_id, + u.table_name, + u.column_name, + u.run_date, + 'A', + NULL, + u.db_data_type + FROM update_chars u + WHERE u.old_drop_date IS NOT NULL; -- Add new records @@ -211,12 +281,20 @@ inserted_records AS ( RETURNING data_column_chars.* ) INSERT INTO data_structure_log ( - element_id, + table_groups_id, + table_id, + column_id, + table_name, + column_name, change_date, change, new_data_type ) -SELECT i.column_id, +SELECT i.table_groups_id, + i.table_id, + i.column_id, + i.table_name, + i.column_name, i.add_date, 'A', i.db_data_type @@ -256,12 +334,20 @@ deleted_records AS ( RETURNING data_column_chars.* ) INSERT INTO data_structure_log ( - element_id, + table_groups_id, + 
table_id, + column_id, + table_name, + column_name, change_date, change, old_data_type ) -SELECT del.column_id, +SELECT del.table_groups_id, + del.table_id, + del.column_id, + del.table_name, + del.column_name, del.drop_date, 'D', del.db_data_type diff --git a/testgen/template/dbsetup/020_create_standard_functions_sprocs.sql b/testgen/template/dbsetup/020_create_standard_functions_sprocs.sql index 01b65623..d2285833 100644 --- a/testgen/template/dbsetup/020_create_standard_functions_sprocs.sql +++ b/testgen/template/dbsetup/020_create_standard_functions_sprocs.sql @@ -298,7 +298,7 @@ END; $$ LANGUAGE plpgsql IMMUTABLE; -DROP AGGREGATE IF EXISTS {SCHEMA_NAME}.sum_ln (double precision); +DROP AGGREGATE IF EXISTS {SCHEMA_NAME}.sum_ln (double precision) CASCADE; CREATE AGGREGATE {SCHEMA_NAME}.sum_ln (double precision) ( SFUNC = sum_ln_agg_state, diff --git a/testgen/template/dbsetup/030_initialize_new_schema_structure.sql b/testgen/template/dbsetup/030_initialize_new_schema_structure.sql index 1880e323..d55ba76a 100644 --- a/testgen/template/dbsetup/030_initialize_new_schema_structure.sql +++ b/testgen/template/dbsetup/030_initialize_new_schema_structure.sql @@ -39,6 +39,16 @@ CREATE TABLE stg_data_chars_updates ( record_ct BIGINT ); +CREATE TABLE stg_test_definition_updates ( + test_suite_id UUID, + test_definition_id UUID, + run_date TIMESTAMP, + lower_tolerance VARCHAR(1000), + upper_tolerance VARCHAR(1000), + threshold_value VARCHAR(1000), + prediction JSONB +); + CREATE TABLE projects ( id UUID DEFAULT gen_random_uuid(), project_code VARCHAR(30) NOT NULL @@ -89,6 +99,7 @@ CREATE TABLE table_groups connection_id BIGINT CONSTRAINT table_groups_connections_connection_id_fk REFERENCES connections, + default_test_suite_id UUID DEFAULT NULL, monitor_test_suite_id UUID DEFAULT NULL, table_groups_name VARCHAR(100), table_group_schema VARCHAR(100), @@ -159,8 +170,15 @@ CREATE TABLE test_suites ( component_type VARCHAR(100), component_name VARCHAR(100), 
last_complete_test_run_id UUID, - dq_score_exclude BOOLEAN default FALSE, - view_mode VARCHAR(20) DEFAULT NULL, + dq_score_exclude BOOLEAN DEFAULT FALSE, + is_monitor BOOLEAN DEFAULT FALSE, + monitor_lookback INTEGER DEFAULT NULL, + monitor_regenerate_freshness BOOLEAN DEFAULT TRUE, + predict_sensitivity VARCHAR(6), + predict_min_lookback INTEGER, + predict_exclude_weekends BOOLEAN DEFAULT FALSE, + predict_holiday_codes VARCHAR(100), + CONSTRAINT test_suites_id_pk PRIMARY KEY (id) ); @@ -202,8 +220,10 @@ CREATE TABLE test_definitions ( match_subset_condition VARCHAR(500), match_groupby_names VARCHAR, match_having_condition VARCHAR(500), - history_calculation VARCHAR(20), + history_calculation VARCHAR(1000), + history_calculation_upper VARCHAR(1000), history_lookback INTEGER, + prediction JSONB, test_mode VARCHAR(20), custom_query VARCHAR, test_active VARCHAR(10) DEFAULT 'Y':: CHARACTER VARYING, @@ -355,14 +375,18 @@ CREATE TABLE profile_pair_rules ( CREATE TABLE data_structure_log ( - log_id UUID DEFAULT gen_random_uuid() - CONSTRAINT pk_dsl_id - PRIMARY KEY, - element_id UUID, - change_date TIMESTAMP, - change VARCHAR(10), - old_data_type VARCHAR(50), - new_data_type VARCHAR(50) + log_id UUID DEFAULT gen_random_uuid() + CONSTRAINT pk_dsl_id + PRIMARY KEY, + table_groups_id UUID, + table_id UUID, + column_id UUID, + table_name VARCHAR(120), + column_name VARCHAR(120), + change_date TIMESTAMP, + change VARCHAR(10), + old_data_type VARCHAR(50), + new_data_type VARCHAR(50) ); CREATE TABLE data_table_chars ( @@ -452,6 +476,7 @@ CREATE TABLE test_types ( measure_uom VARCHAR(100), measure_uom_description VARCHAR(200), selection_criteria TEXT, + generation_template VARCHAR(100), dq_score_prevalence_formula TEXT, dq_score_risk_factor TEXT, column_name_prompt TEXT, @@ -478,7 +503,7 @@ CREATE TABLE test_templates ( CONSTRAINT test_templates_test_types_test_type_fk REFERENCES test_types, sql_flavor VARCHAR(20) NOT NULL, - template_name VARCHAR(400), + template VARCHAR, 
CONSTRAINT test_templates_test_type_sql_flavor_pk PRIMARY KEY (test_type, sql_flavor) ); @@ -692,11 +717,6 @@ CREATE UNIQUE INDEX idx_tg_last_profile ON table_groups (last_complete_profile_run_id) WHERE last_complete_profile_run_id IS NOT NULL; --- Index Profile Results - ORIGINAL -- still relevant? -CREATE INDEX profile_results_tgid_sn_tn_cn - ON profile_results (table_groups_id, schema_name, table_name, column_name); - - -- Index test_suites CREATE UNIQUE INDEX uix_ts_id ON test_suites(id); @@ -724,6 +744,24 @@ CREATE INDEX ix_td_tg CREATE INDEX ix_td_ts_tc ON test_definitions(test_suite_id, table_name, column_name, test_type); +CREATE UNIQUE INDEX uix_td_autogen_schema + ON test_definitions (test_suite_id, test_type, schema_name) + WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NULL + AND column_name IS NULL; + +CREATE UNIQUE INDEX uix_td_autogen_table + ON test_definitions (test_suite_id, test_type, schema_name, table_name) + WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL; + +CREATE UNIQUE INDEX uix_td_autogen_column + ON test_definitions (test_suite_id, test_type, schema_name, table_name, column_name) + WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NOT NULL; + -- Index test_runs CREATE INDEX ix_trun_ts_fk ON test_runs(test_suite_id); @@ -744,6 +782,9 @@ CREATE INDEX ix_tr_pc_ts CREATE INDEX ix_tr_trun ON test_results(test_run_id); +CREATE INDEX ix_tr_trun_table + ON test_results(test_run_id, table_name); + CREATE INDEX ix_tr_tt ON test_results(test_type); @@ -753,6 +794,10 @@ CREATE INDEX ix_tr_pc_sctc_tt CREATE INDEX ix_tr_ts_tctt ON test_results(test_suite_id, table_name, column_names, test_type); +-- Index data_structure_log +CREATE INDEX ix_dsl_tg_tcd + ON data_structure_log (table_groups_id, table_name, change_date); + -- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -- PROFILING OPTIMIZATION -- - - - - - - - - - - - - - - - - - - 
- - - - - - - - - - - - - - - - - - - - @@ -777,6 +822,12 @@ CREATE INDEX ix_pr_prun CREATE INDEX ix_pr_pc_con ON profile_results(project_code, connection_id); +CREATE INDEX ix_pr_tg_s_t_c + ON profile_results (table_groups_id, schema_name, table_name, column_name); + +CREATE INDEX ix_pr_tg_rd + ON profile_results (table_groups_id, run_date); + CREATE UNIQUE INDEX uix_pr_tg_t_c_prun ON profile_results(table_groups_id, table_name, column_name, profile_run_id); @@ -794,12 +845,21 @@ CREATE INDEX ix_ares_anid ON profile_anomaly_results(anomaly_id); -- Index data_table_chars -CREATE INDEX idx_dtc_tgid_table - ON data_table_chars (table_groups_id, table_name); +CREATE INDEX idx_dtc_tg_schema_table + ON data_table_chars (table_groups_id, schema_name, table_name); + +CREATE INDEX idx_dtc_id + ON data_table_chars (table_id); -- Index data_column_chars -CREATE INDEX idx_dcc_tg_table_column - ON data_column_chars (table_groups_id, table_name, column_name); +CREATE INDEX idx_dcc_tg_schema_table_column + ON data_column_chars (table_groups_id, schema_name, table_name, column_name); + +CREATE INDEX idx_dcc_tableid_column + ON data_column_chars (table_id, column_name); + +CREATE INDEX idx_dcc_id + ON data_column_chars (column_id); -- Conditional Index for dq_scoring views CREATE INDEX idx_test_results_filter_join diff --git a/testgen/template/dbsetup/050_populate_new_schema_metadata.sql b/testgen/template/dbsetup/050_populate_new_schema_metadata.sql index 4628a4ae..4c7d0b79 100644 --- a/testgen/template/dbsetup/050_populate_new_schema_metadata.sql +++ b/testgen/template/dbsetup/050_populate_new_schema_metadata.sql @@ -17,14 +17,42 @@ TRUNCATE TABLE test_types; TRUNCATE TABLE generation_sets; INSERT INTO generation_sets (generation_set, test_type) -VALUES ('Monitor', 'Recency'), - ('Monitor', 'Row_Ct'), - ('Monitor', 'Row_Ct_Pct'), - ('Monitor', 'Daily_Record_Ct'), - ('Monitor', 'Monthly_Rec_Ct'), - ('Monitor', 'Weekly_Rec_Ct'), - ('Monitor', 'Table_Freshness'), - ('Monitor', 
'Schema_Drift'); +VALUES ('Standard', 'Alpha_Trunc'), + ('Standard', 'Avg_Shift'), + ('Standard', 'Constant'), + ('Standard', 'Daily_Record_Ct'), + ('Standard', 'Dec_Trunc'), + ('Standard', 'Distinct_Date_Ct'), + ('Standard', 'Distinct_Value_Ct'), + ('Standard', 'Dupe_Rows'), + ('Standard', 'Email_Format'), + ('Standard', 'Future_Date'), + ('Standard', 'Future_Date_1Y'), + ('Standard', 'Incr_Avg_Shift'), + ('Standard', 'LOV_Match'), + ('Standard', 'Min_Date'), + ('Standard', 'Min_Val'), + ('Standard', 'Missing_Pct'), + ('Standard', 'Monthly_Rec_Ct'), + ('Standard', 'Outlier_Pct_Above'), + ('Standard', 'Outlier_Pct_Below'), + ('Standard', 'Pattern_Match'), + ('Standard', 'Recency'), + ('Standard', 'Required'), + ('Standard', 'Street_Addr_Pattern'), + ('Standard', 'US_State'), + ('Standard', 'Unique'), + ('Standard', 'Unique_Pct'), + ('Standard', 'Valid_Characters'), + ('Standard', 'Valid_Month'), + ('Standard', 'Valid_US_Zip'), + ('Standard', 'Valid_US_Zip3'), + ('Standard', 'Variability_Decrease'), + ('Standard', 'Variability_Increase'), + ('Standard', 'Weekly_Rec_Ct'), + ('Monitor', 'Schema_Drift'), + ('Monitor', 'Freshness_Trend'), + ('Monitor', 'Volume_Trend'); TRUNCATE TABLE test_templates; diff --git a/testgen/template/dbsetup/060_create_standard_views.sql b/testgen/template/dbsetup/060_create_standard_views.sql index 536edcee..a36d0897 100644 --- a/testgen/template/dbsetup/060_create_standard_views.sql +++ b/testgen/template/dbsetup/060_create_standard_views.sql @@ -31,78 +31,6 @@ SELECT DISTINCT anomaly_id, table_groups_id, schema_name, table_name, column_nam WHERE disposition = 'Inactive'; -DROP VIEW IF EXISTS v_test_results; - -CREATE VIEW v_test_results -AS -SELECT p.project_name, - ts.test_suite, - tg.table_groups_name, - cn.connection_name, cn.project_host, cn.sql_flavor, - tt.dq_dimension, - r.schema_name, r.table_name, r.column_names, - r.test_time as test_date, - r.test_type, tt.id as test_type_id, tt.test_name_short, tt.test_name_long, - 
r.test_description, - tt.measure_uom, tt.measure_uom_description, - c.test_operator, - r.threshold_value::NUMERIC(16, 5) as threshold_value, - r.result_measure::NUMERIC(16, 5), - r.result_status, - r.input_parameters, - r.result_message, - tt.result_visualization, - tt.result_visualization_params, - CASE WHEN result_code <> 1 THEN r.severity END as severity, - CASE - WHEN result_code <> 1 THEN r.disposition - ELSE 'Passed' - END as disposition, - r.result_code as passed_ct, - (1 - COALESCE(r.result_code, 0))::INTEGER as exception_ct, - CASE - WHEN result_status = 'Warning' THEN 1 - END::INTEGER as warning_ct, - CASE - WHEN result_status = 'Failed' THEN 1 - END::INTEGER as failed_ct, - CASE - WHEN result_status = 'Error' THEN 1 - END as execution_error_ct, - p.project_code, - r.table_groups_id, - r.id as test_result_id, c.id as connection_id, - r.test_suite_id, - r.test_definition_id as test_definition_id_runtime, - CASE - WHEN r.auto_gen = TRUE THEN d.id - ELSE r.test_definition_id - END as test_definition_id_current, - r.test_run_id as test_run_id, - r.auto_gen - FROM test_results r -INNER JOIN test_types tt - ON (r.test_type = tt.test_type) -LEFT JOIN test_definitions d - ON (r.test_suite_id = d.test_suite_id - AND r.table_name = d.table_name - AND r.column_names = COALESCE(d.column_name, 'N/A') - AND r.test_type = d.test_type - AND r.auto_gen = TRUE - AND d.last_auto_gen_date IS NOT NULL) -INNER JOIN test_suites ts - ON (r.test_suite_id = ts.id) -INNER JOIN projects p - ON (ts.project_code = p.project_code) -INNER JOIN table_groups tg - ON (r.table_groups_id = tg.id) -INNER JOIN connections cn - ON (tg.connection_id = cn.connection_id) -LEFT JOIN cat_test_conditions c - ON (cn.sql_flavor = c.sql_flavor - AND r.test_type = c.test_type); - - DROP VIEW IF EXISTS v_queued_observability_results; CREATE VIEW v_queued_observability_results @@ -293,8 +221,8 @@ SELECT dcc.functional_data_type as semantic_data_type, r.test_time, r.table_name, r.column_names as 
column_name, COUNT(*) as test_ct, - SUM(r.result_code) as passed_ct, - SUM(1 - r.result_code) as issue_ct, + SUM(CASE WHEN r.result_code = 1 THEN 1 ELSE 0 END) as passed_ct, + SUM(CASE WHEN r.result_code = 0 THEN 1 ELSE 0 END) as issue_ct, MAX(r.dq_record_ct) as dq_record_ct, SUM_LN(COALESCE(r.dq_prevalence, 0.0)) as good_data_pct FROM test_results r @@ -334,8 +262,8 @@ WITH dimension_rollup AS (SELECT r.test_run_id, r.test_suite_id, r.table_groups_id, r.test_time, r.table_name, r.column_names, tt.dq_dimension, COUNT(*) as test_ct, - SUM(r.result_code) as passed_ct, - SUM(1 - r.result_code) as issue_ct, + SUM(CASE WHEN r.result_code = 1 THEN 1 ELSE 0 END) as passed_ct, + SUM(CASE WHEN r.result_code = 0 THEN 1 ELSE 0 END) as issue_ct, MAX(r.dq_record_ct) as dq_record_ct, SUM_LN(COALESCE(r.dq_prevalence::NUMERIC, 0)) as good_data_pct FROM test_results r @@ -479,8 +407,8 @@ SELECT dcc.functional_data_type as semantic_data_type, r.test_time, r.table_name, r.column_names as column_name, COUNT(*) as test_ct, - SUM(r.result_code) as passed_ct, - SUM(1 - r.result_code) as issue_ct, + SUM(CASE WHEN r.result_code = 1 THEN 1 ELSE 0 END) as passed_ct, + SUM(CASE WHEN r.result_code = 0 THEN 1 ELSE 0 END) as issue_ct, MAX(r.dq_record_ct) as dq_record_ct, SUM_LN(COALESCE(r.dq_prevalence, 0.0)) as good_data_pct FROM test_results r diff --git a/testgen/template/dbsetup/075_grant_role_rights.sql b/testgen/template/dbsetup/075_grant_role_rights.sql index 97a54b48..df1d6dea 100644 --- a/testgen/template/dbsetup/075_grant_role_rights.sql +++ b/testgen/template/dbsetup/075_grant_role_rights.sql @@ -22,6 +22,7 @@ GRANT SELECT, INSERT, DELETE, UPDATE ON {SCHEMA_NAME}.stg_functional_table_updates, {SCHEMA_NAME}.stg_secondary_profile_updates, {SCHEMA_NAME}.stg_data_chars_updates, + {SCHEMA_NAME}.stg_test_definition_updates, {SCHEMA_NAME}.test_runs, {SCHEMA_NAME}.functional_test_results, {SCHEMA_NAME}.connections, @@ -39,7 +40,8 @@ GRANT SELECT, INSERT, DELETE, UPDATE ON 
{SCHEMA_NAME}.score_definition_results_history, {SCHEMA_NAME}.score_history_latest_runs, {SCHEMA_NAME}.job_schedules, - {SCHEMA_NAME}.settings + {SCHEMA_NAME}.settings, + {SCHEMA_NAME}.notification_settings TO testgen_execute_role; diff --git a/testgen/template/dbsetup_anomaly_types/profile_anomaly_types_Table_Pattern_Mismatch.yaml b/testgen/template/dbsetup_anomaly_types/profile_anomaly_types_Table_Pattern_Mismatch.yaml index e31fd5dc..8771cd40 100644 --- a/testgen/template/dbsetup_anomaly_types/profile_anomaly_types_Table_Pattern_Mismatch.yaml +++ b/testgen/template/dbsetup_anomaly_types/profile_anomaly_types_Table_Pattern_Mismatch.yaml @@ -45,11 +45,8 @@ profile_anomaly_types: test_type: Table_Pattern_Mismatch sql_flavor: databricks lookup_type: null - lookup_query: "SELECT DISTINCT column_name, columns.table_name FROM information_schema.columns\ - \ JOIN information_schema.tables ON columns.table_name = tables.table_name AND\ - \ columns.table_schema = tables.table_schema WHERE columns.table_schema = '{TARGET_SCHEMA}'\ - \ AND columns.column_name = '{COLUMN_NAME}' AND UPPER(tables.table_type) = 'BASE\ - \ TABLE' ORDER BY table_name LIMIT {LIMIT};" + lookup_query: |- + SELECT DISTINCT column_name, columns.table_name FROM information_schema.columns JOIN information_schema.tables ON columns.table_name = tables.table_name AND columns.table_schema = tables.table_schema WHERE columns.table_schema = '{TARGET_SCHEMA}' AND columns.column_name = '{COLUMN_NAME}' AND UPPER(tables.table_type) = 'BASE TABLE' ORDER BY table_name LIMIT {LIMIT}; error_type: Profile Anomaly - id: '1122' test_id: '1008' @@ -88,9 +85,6 @@ profile_anomaly_types: test_type: Table_Pattern_Mismatch sql_flavor: snowflake lookup_type: null - lookup_query: "SELECT DISTINCT column_name, columns.table_name FROM information_schema.columns\ - \ JOIN information_schema.tables ON columns.table_name = tables.table_name AND\ - \ columns.table_schema = tables.table_schema WHERE columns.table_schema = 
'{TARGET_SCHEMA}'\ - \ AND columns.column_name = '{COLUMN_NAME}' AND UPPER(tables.table_type) = 'BASE\ - \ TABLE' ORDER BY table_name LIMIT {LIMIT};" + lookup_query: |- + SELECT DISTINCT column_name, columns.table_name FROM information_schema.columns JOIN information_schema.tables ON columns.table_name = tables.table_name AND columns.table_schema = tables.table_schema WHERE columns.table_schema = '{TARGET_SCHEMA}' AND columns.column_name = '{COLUMN_NAME}' AND UPPER(tables.table_type) = 'BASE TABLE' ORDER BY table_name LIMIT {LIMIT}; error_type: Profile Anomaly diff --git a/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance.yaml b/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance.yaml index 5b277a5e..3fe5b288 100644 --- a/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance.yaml @@ -4,12 +4,13 @@ test_types: test_name_short: Aggregate Balance test_name_long: Aggregate values per group match reference test_description: |- - Tests for exact match in aggregate values for each set of column values vs. reference dataset + Tests for exact match in aggregate measure for each set of column values compared to reference dataset. except_message: |- Aggregate measure per set of column values does not exactly match reference dataset. 
measure_uom: Mismatched measures measure_uom_description: null selection_criteria: null + generation_template: null dq_score_prevalence_formula: |- 1 dq_score_risk_factor: '1.0' @@ -218,28 +219,343 @@ test_types: - id: '2506' test_type: Aggregate_Balance sql_flavor: bigquery - template_name: ex_aggregate_match_same_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COUNT(*) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL + FROM + ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + {HAVING_CONDITION} + UNION ALL + SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} + {MATCH_HAVING_CONDITION} ) a + GROUP BY {GROUPBY_NAMES} ) s + WHERE total <> match_total + OR (total IS NOT NULL AND match_total IS NULL) + OR (total IS NULL AND match_total IS NOT NULL); - id: '2406' test_type: Aggregate_Balance sql_flavor: databricks - template_name: ex_aggregate_match_same_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COUNT(*) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL + FROM + ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + {HAVING_CONDITION} + UNION ALL + SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} + {MATCH_HAVING_CONDITION} ) a + GROUP BY {GROUPBY_NAMES} ) s + WHERE total <> match_total + OR (total IS NOT NULL AND match_total IS NULL) + OR (total IS NULL AND match_total IS NOT NULL); - id: '2206' test_type: Aggregate_Balance sql_flavor: mssql - template_name: ex_aggregate_match_same_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE total <> match_total
+        OR (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL);
   - id: '2306'
     test_type: Aggregate_Balance
     sql_flavor: postgresql
-    template_name: ex_aggregate_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE total <> match_total
+        OR (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL);
   - id: '2006'
     test_type: Aggregate_Balance
     sql_flavor: redshift
-    template_name: ex_aggregate_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE total <> match_total
+        OR (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL);
   - id: '2506'
     test_type: Aggregate_Balance
     sql_flavor: redshift_spectrum
-    template_name: ex_aggregate_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE total <> match_total
+        OR (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL);
   - id: '2106'
     test_type: Aggregate_Balance
     sql_flavor: snowflake
-    template_name: ex_aggregate_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE total <> match_total
+        OR (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL);
diff --git a/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Percent.yaml b/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Percent.yaml
index 84b28ecf..f5fc0618 100644
--- a/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Percent.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Percent.yaml
@@ -4,12 +4,13 @@ test_types:
     test_name_short: Aggregate Balance Percent
     test_name_long: Aggregate measure per group within percent of reference
     test_description: |-
-      Tests that aggregate measure for each set of column values fall within a percent range above or below the measure for reference dataset
+      Tests that aggregate measure for each set of column values falls within a percent range above or below the measure for reference dataset.
     except_message: |-
       Aggregate measure per set of column values is outside percent range of reference dataset.
     measure_uom: Mismatched measures
     measure_uom_description: null
     selection_criteria: null
+    generation_template: null
     dq_score_prevalence_formula: |-
       1
     dq_score_risk_factor: '1.0'
@@ -232,28 +233,343 @@ test_types:
   - id: '2509'
     test_type: Aggregate_Balance_Percent
     sql_flavor: bigquery
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
   - id: '2409'
     test_type: Aggregate_Balance_Percent
     sql_flavor: databricks
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
   - id: '2209'
     test_type: Aggregate_Balance_Percent
     sql_flavor: mssql
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
   - id: '2309'
     test_type: Aggregate_Balance_Percent
     sql_flavor: postgresql
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
   - id: '2009'
     test_type: Aggregate_Balance_Percent
    sql_flavor: redshift
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
   - id: '2509'
     test_type: Aggregate_Balance_Percent
     sql_flavor: redshift_spectrum
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
   - id: '2109'
     test_type: Aggregate_Balance_Percent
     sql_flavor: snowflake
-    template_name: ex_aggregate_match_percent_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0));
diff --git a/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Range.yaml b/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Range.yaml
index b4b03bc1..9d594da4 100644
--- a/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Range.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Aggregate_Balance_Range.yaml
@@ -4,12 +4,13 @@ test_types:
     test_name_short: Aggregate Balance Range
     test_name_long: Aggregate measure per group within hard range of reference
     test_description: |-
-      Tests that aggregate measure for each set of column values fall within a hard range above or below the measure for reference dataset
+      Tests that aggregate measure for each set of column values falls within a hard range above or below the measure for reference dataset.
     except_message: |-
       Aggregate measure per set of column values is outside expected range of reference dataset.
     measure_uom: Mismatched measures
     measure_uom_description: null
     selection_criteria: null
+    generation_template: null
     dq_score_prevalence_formula: |-
       1
     dq_score_risk_factor: '1.0'
@@ -232,28 +233,343 @@ test_types:
   - id: '2510'
     test_type: Aggregate_Balance_Range
     sql_flavor: bigquery
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
   - id: '2410'
     test_type: Aggregate_Balance_Range
     sql_flavor: databricks
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
   - id: '2210'
     test_type: Aggregate_Balance_Range
     sql_flavor: mssql
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
   - id: '2310'
     test_type: Aggregate_Balance_Range
     sql_flavor: postgresql
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
   - id: '2010'
     test_type: Aggregate_Balance_Range
     sql_flavor: redshift
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
   - id: '2510'
     test_type: Aggregate_Balance_Range
     sql_flavor: redshift_spectrum
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
   - id: '2110'
     test_type: Aggregate_Balance_Range
     sql_flavor: snowflake
-    template_name: ex_aggregate_match_range_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+          END AS result_message,
+        COUNT(*) as result_measure
+      FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+             FROM
+               ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+                 FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+                 WHERE {SUBSET_CONDITION}
+                 GROUP BY {GROUPBY_NAMES}
+                 {HAVING_CONDITION}
+                 UNION ALL
+                 SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+                 FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+                 WHERE {MATCH_SUBSET_CONDITION}
+                 GROUP BY {MATCH_GROUPBY_NAMES}
+                 {MATCH_HAVING_CONDITION} ) a
+             GROUP BY {GROUPBY_NAMES} ) s
+      WHERE (total IS NOT NULL AND match_total IS NULL)
+        OR (total IS NULL AND match_total IS NOT NULL)
+        OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE});
diff --git a/testgen/template/dbsetup_test_types/test_types_Aggregate_Minimum.yaml b/testgen/template/dbsetup_test_types/test_types_Aggregate_Minimum.yaml
index e5355a76..676052a2 100644
--- a/testgen/template/dbsetup_test_types/test_types_Aggregate_Minimum.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Aggregate_Minimum.yaml
@@ -4,12 +4,13 @@ test_types:
     test_name_short: Aggregate Minimum
     test_name_long: Aggregate values per group are at or above reference
    test_description: |-
-      Tests that aggregate values for each set of column values are at least the same as reference dataset
+      Tests that aggregate values for each set of column values are at least the same as reference dataset.
     except_message: |-
       Aggregate measure per set of column values is not at least the same as reference dataset.
   measure_uom: Mismatched measures
   measure_uom_description: null
   selection_criteria: null
+  generation_template: null
   dq_score_prevalence_formula: |-
     1
   dq_score_risk_factor: '1.0'
@@ -218,28 +219,343 @@ test_types:
 - id: '2502'
   test_type: Aggregate_Minimum
   sql_flavor: bigquery
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
 - id: '2402'
   test_type: Aggregate_Minimum
   sql_flavor: databricks
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
 - id: '2202'
   test_type: Aggregate_Minimum
   sql_flavor: mssql
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
 - id: '2302'
   test_type: Aggregate_Minimum
   sql_flavor: postgresql
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
 - id: '2002'
   test_type: Aggregate_Minimum
   sql_flavor: redshift
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
 - id: '2502'
   test_type: Aggregate_Minimum
   sql_flavor: redshift_spectrum
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
 - id: '2102'
   test_type: Aggregate_Minimum
   sql_flavor: snowflake
-  template_name: ex_aggregate_match_no_drops_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL
+      FROM
+        ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total
+            FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+           WHERE {SUBSET_CONDITION}
+          GROUP BY {GROUPBY_NAMES}
+          {HAVING_CONDITION}
+          UNION ALL
+          SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total
+            FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+           WHERE {MATCH_SUBSET_CONDITION}
+          GROUP BY {MATCH_GROUPBY_NAMES}
+          {MATCH_HAVING_CONDITION} ) a
+      GROUP BY {GROUPBY_NAMES} ) s
+    WHERE total < match_total
+       -- OR (total IS NOT NULL AND match_total IS NULL) -- New categories
+       OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories
diff --git a/testgen/template/dbsetup_test_types/test_types_Alpha_Trunc.yaml b/testgen/template/dbsetup_test_types/test_types_Alpha_Trunc.yaml
index 3e9297e5..aa070119 100644
--- a/testgen/template/dbsetup_test_types/test_types_Alpha_Trunc.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Alpha_Trunc.yaml
@@ -4,13 +4,14 @@ test_types:
   test_name_short: Alpha Truncation
   test_name_long: Maximum character count consistent
   test_description: |-
-    Tests that the maximum count of characters in a column value has not dropped vs. baseline data
+    Tests that maximum count of characters in column values has not dropped compared to baseline data.
   except_message: |-
-    Maximum length of values has dropped from prior expected length.
+    Maximum length of values has dropped compared to baseline.
   measure_uom: Values over max
   measure_uom_description: null
   selection_criteria: |-
     general_type ='A' AND max_length > 0 AND ( (min_length = avg_length AND max_length = avg_length) OR (numeric_ct <> value_ct ) ) AND functional_table_type NOT LIKE '%window%' /* The conditions below are to eliminate overlap with : LOV_Match (excluded selection criteria for this test_type), Pattern_Match (excluded selection criteria for this test_type), Constant (excluded functional_data_type Constant and Boolean) */ AND ( (distinct_value_ct NOT BETWEEN 2 AND 10 AND functional_data_type NOT IN ( 'Constant', 'Boolean') ) AND NOT ( fn_charcount(top_patterns, E' \| ' ) = 1 AND fn_charcount(top_patterns, E' \| ' ) IS NOT NULL AND REPLACE(SPLIT_PART(top_patterns, '|' , 2), 'N' , '' ) > ''))
+  generation_template: null
   dq_score_prevalence_formula: |-
     {VALUE_CT}::FLOAT * (FN_NORMAL_CDF(({MAX_LENGTH}::FLOAT - {AVG_LENGTH}::FLOAT) / (NULLIF({MAX_LENGTH}::FLOAT, 0) / 3)) - FN_NORMAL_CDF(({RESULT_MEASURE}::FLOAT - {AVG_LENGTH}::FLOAT) / (NULLIF({MAX_LENGTH}::FLOAT, 0) / 3)) ) /NULLIF({RECORD_CT}::FLOAT, 0)
   dq_score_risk_factor: '1.0'
diff --git a/testgen/template/dbsetup_test_types/test_types_Avg_Shift.yaml b/testgen/template/dbsetup_test_types/test_types_Avg_Shift.yaml
index b5a0aaf6..367c833c 100644
--- a/testgen/template/dbsetup_test_types/test_types_Avg_Shift.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Avg_Shift.yaml
@@ -4,7 +4,7 @@ test_types:
   test_name_short: Average Shift
   test_name_long: Column mean is consistent with reference
   test_description: |-
-    Tests for statistically-significant shift in mean value for column from average calculated at baseline.
+    Tests for statistically significant shift in mean value for column from average calculated at baseline.
   except_message: |-
     Standardized difference between averages is over the selected threshold level.
   measure_uom: Difference Measure
@@ -12,6 +12,7 @@ test_types:
     Cohen's D Difference (0.20 small, 0.5 mod, 0.8 large, 1.2 very large, 2.0 huge)
   selection_criteria: |-
     general_type='N' AND distinct_value_ct > 10 AND functional_data_type ilike 'Measure%' AND functional_data_type <> 'Measurement Spike' AND column_name NOT ilike '%latitude%' AND column_name NOT ilike '%longitude%'
+  generation_template: null
   dq_score_prevalence_formula: |-
     2.0 * (1.0 - fn_normal_cdf(ABS({RESULT_MEASURE}::FLOAT) / 2.0))
   dq_score_risk_factor: '0.75'
diff --git a/testgen/template/dbsetup_test_types/test_types_CUSTOM.yaml b/testgen/template/dbsetup_test_types/test_types_CUSTOM.yaml
index 9c404a15..fbfa7fa1 100644
--- a/testgen/template/dbsetup_test_types/test_types_CUSTOM.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_CUSTOM.yaml
@@ -4,13 +4,14 @@ test_types:
   test_name_short: Custom Test
   test_name_long: Custom-defined business rule
   test_description: |-
-    Custom SQL Query Test
+    Custom SQL Query Test. A highly flexible business-rule test covering any error state that can be expressed by a SQL query against one or more tables in the database.
   except_message: |-
     Errors were detected according to test definition.
   measure_uom: Errors found
   measure_uom_description: |-
     Count of errors identified by query
   selection_criteria: null
+  generation_template: null
   dq_score_prevalence_formula: |-
     ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0)
   dq_score_risk_factor: '1.0'
@@ -42,28 +43,273 @@ test_types:
 - id: '2504'
   test_type: CUSTOM
   sql_flavor: bigquery
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
 - id: '2404'
   test_type: CUSTOM
   sql_flavor: databricks
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
 - id: '2204'
   test_type: CUSTOM
   sql_flavor: mssql
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
 - id: '2304'
   test_type: CUSTOM
   sql_flavor: postgresql
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
 - id: '2004'
   test_type: CUSTOM
   sql_flavor: redshift
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
 - id: '2504'
   test_type: CUSTOM
   sql_flavor: redshift_spectrum
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
 - id: '2104'
   test_type: CUSTOM
   sql_flavor: snowflake
-  template_name: ex_custom_query_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      CASE
+        WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN NULL
+        ELSE '{COLUMN_NAME_NO_QUOTES}'
+      END as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */
+      'Skip_Errors={SKIP_ERRORS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM (
+      {CUSTOM_QUERY}
+    ) TEST;
diff --git a/testgen/template/dbsetup_test_types/test_types_Combo_Match.yaml b/testgen/template/dbsetup_test_types/test_types_Combo_Match.yaml
index 2c02c157..f9dffc4d 100644
--- a/testgen/template/dbsetup_test_types/test_types_Combo_Match.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Combo_Match.yaml
@@ -4,12 +4,13 @@ test_types:
   test_name_short: Reference Match
   test_name_long: Column values or combinations found in reference
   test_description: |-
-    Tests for the presence of one or a set of column values in a reference table
+    Tests for the presence of one or a set of column values in reference dataset.
   except_message: |-
     Column value combinations are not found in reference table values.
   measure_uom: Missing values
   measure_uom_description: null
   selection_criteria: null
+  generation_template: null
   dq_score_prevalence_formula: |-
     ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0)
   dq_score_risk_factor: '1.0'
@@ -195,28 +196,310 @@ test_types:
 - id: '2501'
   test_type: Combo_Match
   sql_flavor: bigquery
-  template_name: ex_data_match_bigquery.sql
+  template: |-
+    SELECT '{TEST_TYPE}' AS test_type,
+      '{TEST_DEFINITION_ID}' AS test_definition_id,
+      '{TEST_SUITE_ID}' AS test_suite_id,
+      '{TEST_RUN_ID}' AS test_run_id,
+      '{RUN_DATE}' AS test_time,
+      '{SCHEMA_NAME}' AS schema_name,
+      '{TABLE_NAME}' AS table_name,
+      '{COLUMN_NAME_NO_QUOTES}' AS column_names,
+      '{SKIP_ERRORS}' AS threshold_value,
+      {SKIP_ERRORS} AS skip_errors,
+      '{INPUT_PARAMETERS}' AS input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END AS result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CAST(COUNT(*) AS STRING),
+            ' error(s) identified, ',
+            CASE
+              WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+              ELSE 'within limit of '
+            END,
+            '{SKIP_ERRORS}.'
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) AS result_measure
+    FROM (
+      SELECT {COLUMN_NAME_NO_QUOTES}
+      FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+      WHERE {SUBSET_CONDITION}
+      GROUP BY {COLUMN_NAME_NO_QUOTES}
+      {HAVING_CONDITION}
+
+      EXCEPT DISTINCT
+
+      SELECT {MATCH_GROUPBY_NAMES}
+      FROM `{MATCH_SCHEMA_NAME}.{MATCH_TABLE_NAME}`
+      WHERE {MATCH_SUBSET_CONDITION}
+      GROUP BY {MATCH_GROUPBY_NAMES}
+      {MATCH_HAVING_CONDITION}
+    ) test;
 - id: '2401'
   test_type: Combo_Match
   sql_flavor: databricks
-  template_name: ex_data_match_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {COLUMN_NAME_NO_QUOTES}
+             FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+            WHERE {SUBSET_CONDITION}
+            GROUP BY {COLUMN_NAME_NO_QUOTES}
+            {HAVING_CONDITION}
+           EXCEPT
+           SELECT {MATCH_GROUPBY_NAMES}
+             FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+            WHERE {MATCH_SUBSET_CONDITION}
+            GROUP BY {MATCH_GROUPBY_NAMES}
+            {MATCH_HAVING_CONDITION}
+         ) test;
 - id: '2201'
   test_type: Combo_Match
   sql_flavor: mssql
-  template_name: ex_data_match_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {COLUMN_NAME_NO_QUOTES}
+             FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+            WHERE {SUBSET_CONDITION}
+            GROUP BY {COLUMN_NAME_NO_QUOTES}
+            {HAVING_CONDITION}
+           EXCEPT
+           SELECT {MATCH_GROUPBY_NAMES}
+             FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+            WHERE {MATCH_SUBSET_CONDITION}
+            GROUP BY {MATCH_GROUPBY_NAMES}
+            {MATCH_HAVING_CONDITION}
+         ) test;
 - id: '2301'
   test_type: Combo_Match
   sql_flavor: postgresql
-  template_name: ex_data_match_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {COLUMN_NAME_NO_QUOTES}
+             FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+            WHERE {SUBSET_CONDITION}
+            GROUP BY {COLUMN_NAME_NO_QUOTES}
+            {HAVING_CONDITION}
+           EXCEPT
+           SELECT {MATCH_GROUPBY_NAMES}
+             FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+            WHERE {MATCH_SUBSET_CONDITION}
+            GROUP BY {MATCH_GROUPBY_NAMES}
+            {MATCH_HAVING_CONDITION}
+         ) test;
 - id: '2001'
   test_type: Combo_Match
   sql_flavor: redshift
-  template_name: ex_data_match_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {COLUMN_NAME_NO_QUOTES}
+             FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+            WHERE {SUBSET_CONDITION}
+            GROUP BY {COLUMN_NAME_NO_QUOTES}
+            {HAVING_CONDITION}
+           EXCEPT
+           SELECT {MATCH_GROUPBY_NAMES}
+             FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+            WHERE {MATCH_SUBSET_CONDITION}
+            GROUP BY {MATCH_GROUPBY_NAMES}
+            {MATCH_HAVING_CONDITION}
+         ) test;
 - id: '2501'
   test_type: Combo_Match
   sql_flavor: redshift_spectrum
-  template_name: ex_data_match_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+      END AS result_message,
+      COUNT(*) as result_measure
+    FROM ( SELECT {COLUMN_NAME_NO_QUOTES}
+             FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+            WHERE {SUBSET_CONDITION}
+            GROUP BY {COLUMN_NAME_NO_QUOTES}
+            {HAVING_CONDITION}
+           EXCEPT
+           SELECT {MATCH_GROUPBY_NAMES}
+             FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE}
+            WHERE {MATCH_SUBSET_CONDITION}
+            GROUP BY {MATCH_GROUPBY_NAMES}
+            {MATCH_HAVING_CONDITION}
+         ) test;
 - id: '2101'
   test_type: Combo_Match
   sql_flavor: snowflake
-  template_name: ex_data_match_generic.sql
+  template: |-
+    SELECT '{TEST_TYPE}' as test_type,
+      '{TEST_DEFINITION_ID}' as test_definition_id,
+      '{TEST_SUITE_ID}' as test_suite_id,
+      '{TEST_RUN_ID}' as test_run_id,
+      '{RUN_DATE}' as test_time,
+      '{SCHEMA_NAME}' as schema_name,
+      '{TABLE_NAME}' as table_name,
+      '{COLUMN_NAME_NO_QUOTES}' as column_names,
+      '{SKIP_ERRORS}' as threshold_value,
+      {SKIP_ERRORS} as skip_errors,
+      '{INPUT_PARAMETERS}' as input_parameters,
+      NULL as result_signal,
+      CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+      CASE
+        WHEN COUNT(*) > 0 THEN
+          CONCAT(
+            CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+            CONCAT(
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          )
+        ELSE 'No errors found.'
+ END AS result_message, + COUNT(*) as result_measure + FROM ( SELECT {COLUMN_NAME_NO_QUOTES} + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} + {HAVING_CONDITION} + EXCEPT + SELECT {MATCH_GROUPBY_NAMES} + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} + {MATCH_HAVING_CONDITION} + ) test; diff --git a/testgen/template/dbsetup_test_types/test_types_Condition_Flag.yaml b/testgen/template/dbsetup_test_types/test_types_Condition_Flag.yaml index fcde8abd..11125999 100644 --- a/testgen/template/dbsetup_test_types/test_types_Condition_Flag.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Condition_Flag.yaml @@ -4,12 +4,13 @@ test_types: test_name_short: Custom Condition test_name_long: Column values match pre-defined condition test_description: |- - Tests that each record in the table matches a pre-defined, custom condition + Tests that each record in the table matches a predefined custom condition. except_message: |- - Value(s) found not matching defined condition. + Values found not matching defined condition. 
measure_uom: Values Failing measure_uom_description: null selection_criteria: null + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Constant.yaml b/testgen/template/dbsetup_test_types/test_types_Constant.yaml index 67521638..2bdd1a04 100644 --- a/testgen/template/dbsetup_test_types/test_types_Constant.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Constant.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Constant Match test_name_long: All column values match constant value test_description: |- - Tests that all values in the column match the constant value identified in baseline data + Tests that all values in column match the constant value identified in baseline data. except_message: |- - A constant value is expected for this column. + Column values do not match expected constant value. measure_uom: Mismatched values measure_uom_description: null selection_criteria: |- TEMPLATE + generation_template: gen_Constant.sql dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Daily_Record_Ct.yaml b/testgen/template/dbsetup_test_types/test_types_Daily_Record_Ct.yaml index 7f341c3f..389bf0af 100644 --- a/testgen/template/dbsetup_test_types/test_types_Daily_Record_Ct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Daily_Record_Ct.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Daily Records test_name_long: All dates present within date range test_description: |- - Tests for presence of every calendar date within min/max date range, per baseline data + Tests for presence of every calendar date within minimum and maximum date range, per baseline data. 
except_message: |- - Not every date value between min and max dates is present, unlike at baseline. + Not every date value between minimum and maximum dates is present. measure_uom: Missing dates measure_uom_description: null selection_criteria: |- general_type= 'D' AND date_days_present > 21 AND date_days_present - (DATEDIFF('day', '1800-01-05'::DATE, max_date) - DATEDIFF('day', '1800-01-05'::DATE, min_date) + 1) = 0 AND future_date_ct::FLOAT / NULLIF(value_ct, 0) <= 0.75 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT*{PRO_RECORD_CT}::FLOAT/NULLIF({DATE_DAYS_PRESENT}::FLOAT, 0)/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Dec_Trunc.yaml b/testgen/template/dbsetup_test_types/test_types_Dec_Trunc.yaml index ffa38aa9..02fe0dda 100644 --- a/testgen/template/dbsetup_test_types/test_types_Dec_Trunc.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Dec_Trunc.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Decimal Truncation test_name_long: Sum of fractional values at or above reference test_description: |- - Tests for decimal truncation by confirming that the sum of fractional values in data is no less than the sum at baseline + Tests for decimal truncation by confirming that sum of fractional values in data is no less than sum at baseline. except_message: |- - The sum of fractional values is under baseline, which may indicate decimal truncation + Sum of fractional values is under baseline sum, which may indicate decimal truncation. 
measure_uom: Fractional sum measure_uom_description: |- The sum of all decimal values from all data for this column selection_criteria: |- fractional_sum > 0 AND functional_table_type LIKE'%cumulative%' + generation_template: null dq_score_prevalence_formula: |- 1 dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Distinct_Date_Ct.yaml b/testgen/template/dbsetup_test_types/test_types_Distinct_Date_Ct.yaml index 1762b558..54be295e 100644 --- a/testgen/template/dbsetup_test_types/test_types_Distinct_Date_Ct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Distinct_Date_Ct.yaml @@ -4,7 +4,7 @@ test_types: test_name_short: Date Count test_name_long: Count of distinct dates at or above reference test_description: |- - Tests that the count of distinct dates referenced in the column has not dropped vs. baseline data + Tests that count of distinct dates referenced in column has not dropped compared to baseline data. except_message: |- Drop in count of unique dates recorded in column. 
measure_uom: Unique dates @@ -12,6 +12,7 @@ test_types: Count of unique dates in transactional date column selection_criteria: |- functional_data_type ILIKE 'Transactional Date%' AND date_days_present > 1 AND functional_table_type ILIKE '%cumulative%' + generation_template: null dq_score_prevalence_formula: |- (({RECORD_CT}-{PRO_RECORD_CT})::FLOAT*{DISTINCT_VALUE_CT}::FLOAT/NULLIF({PRO_RECORD_CT}::FLOAT, 0))/NULLIF({PRO_RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Distinct_Value_Ct.yaml b/testgen/template/dbsetup_test_types/test_types_Distinct_Value_Ct.yaml index 9e43a2b1..150289ab 100644 --- a/testgen/template/dbsetup_test_types/test_types_Distinct_Value_Ct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Distinct_Value_Ct.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Value Count test_name_long: Count of distinct values has not dropped test_description: |- - Tests that the count of unique values in the column has not changed from baseline. + Tests that count of unique values in column has not changed from baseline. except_message: |- Count of unique values in column has changed from baseline.
measure_uom: Unique Values measure_uom_description: null selection_criteria: |- distinct_value_ct between 2 and 10 AND value_ct > 50 AND functional_data_type IN ('Code', 'Category', 'Attribute', 'Description') AND NOT coalesce(top_freq_values,'') > '' + generation_template: gen_Distinct_Value_Ct.sql dq_score_prevalence_formula: |- ABS({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT*{PRO_RECORD_CT}::FLOAT/NULLIF({DISTINCT_VALUE_CT}::FLOAT, 0)/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Distribution_Shift.yaml b/testgen/template/dbsetup_test_types/test_types_Distribution_Shift.yaml index 8b5bcce2..b44fcd2d 100644 --- a/testgen/template/dbsetup_test_types/test_types_Distribution_Shift.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Distribution_Shift.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Distribution Shift test_name_long: Probability distribution consistent with reference test_description: |- - Tests the closeness of match between two distributions of aggregate measures across combinations of column values, using Jensen-Shannon Divergence test + Tests the closeness of match between two distributions of aggregate measures across combinations of column values, using Jensen-Shannon Divergence test. except_message: |- Divergence between two distributions exceeds specified threshold. 
measure_uom: Divergence level (0-1) measure_uom_description: |- Jensen-Shannon Divergence, from 0 (identical distributions), to 1.0 (max divergence) selection_criteria: null + generation_template: null dq_score_prevalence_formula: |- 1 dq_score_risk_factor: '0.75' @@ -220,28 +221,372 @@ test_types: - id: '2503' test_type: Distribution_Shift sql_flavor: bigquery - template_name: ex_relative_entropy_bigquery.sql + template: |- + -- Relative Entropy: measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver AS ( + SELECT {CONCAT_COLUMNS} AS category, + CAST(COUNT(*) AS FLOAT64) / CAST(SUM(COUNT(*)) OVER () AS FLOAT64) AS pct_of_total + FROM `{SCHEMA_NAME}.{TABLE_NAME}` v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} + ), + older_ver AS ( + SELECT {CONCAT_MATCH_GROUPBY} AS category, + CAST(COUNT(*) AS FLOAT64) / CAST(SUM(COUNT(*)) OVER () AS FLOAT64) AS pct_of_total + FROM `{MATCH_SCHEMA_NAME}.{TABLE_NAME}` v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} + ), + dataset AS ( + SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + COALESCE(l.pct_of_total, 0.0000001)) / 2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON l.category = o.category + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' AS threshold_value, + 
NULL AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END AS result_code, + CONCAT('Divergence Level: ', CAST(js_divergence AS STRING), ', Threshold: {THRESHOLD_VALUE}.') AS result_message, + js_divergence AS result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) + + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) AS js_divergence + FROM dataset + ) rslt; - id: '2403' test_type: Distribution_Shift sql_flavor: databricks - template_name: ex_relative_entropy_generic.sql + template: |- + -- Relative Entropy: measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver + AS ( SELECT {CONCAT_COLUMNS} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} ), + older_ver + AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} ), + dataset + AS ( SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON (l.category = o.category) ) + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + 
'{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' as threshold_value, + NULL as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, + CONCAT('Divergence Level: ', + CONCAT(CAST(js_divergence AS {VARCHAR_TYPE}), + ', Threshold: {THRESHOLD_VALUE}.')) as result_message, + js_divergence as result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) + + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) as js_divergence + FROM dataset ) rslt; - id: '2203' test_type: Distribution_Shift sql_flavor: mssql - template_name: ex_relative_entropy_mssql.sql + template: |- + -- Relative Entropy: measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. 
Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver + AS ( SELECT {CONCAT_COLUMNS} as category, + CAST(COUNT(*) as FLOAT) / CAST(SUM(COUNT(*)) OVER () as FLOAT) AS pct_of_total + FROM "{SCHEMA_NAME}"."{TABLE_NAME}" v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} ), + older_ver + AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, + CAST(COUNT(*) as FLOAT) / CAST(SUM(COUNT(*)) OVER () as FLOAT) AS pct_of_total + FROM "{MATCH_SCHEMA_NAME}"."{TABLE_NAME}" v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} ), + dataset + AS ( SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON (l.category = o.category) ) + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' as threshold_value, + NULL as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, + CONCAT('Divergence Level: ', + CONCAT(CAST(js_divergence AS VARCHAR), + ', Threshold: {THRESHOLD_VALUE}.')) as result_message, + js_divergence as result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LOG(new_pct/avg_pct)/LOG(2))) + + 0.5 * ABS(SUM(old_pct * LOG(old_pct/avg_pct)/LOG(2))) as js_divergence + FROM dataset ) rslt; - id: '2303' test_type: Distribution_Shift sql_flavor: postgresql - template_name: ex_relative_entropy_generic.sql + template: |- + -- Relative Entropy: 
measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver + AS ( SELECT {CONCAT_COLUMNS} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} ), + older_ver + AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} ), + dataset + AS ( SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON (l.category = o.category) ) + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' as threshold_value, + NULL as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, + CONCAT('Divergence Level: ', + CONCAT(CAST(js_divergence AS {VARCHAR_TYPE}), + ', Threshold: {THRESHOLD_VALUE}.')) as result_message, + js_divergence as result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) + + 0.5 * ABS(SUM(old_pct * 
LN(old_pct/avg_pct)/LN(2))) as js_divergence + FROM dataset ) rslt; - id: '2003' test_type: Distribution_Shift sql_flavor: redshift - template_name: ex_relative_entropy_generic.sql + template: |- + -- Relative Entropy: measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver + AS ( SELECT {CONCAT_COLUMNS} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} ), + older_ver + AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} ), + dataset + AS ( SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON (l.category = o.category) ) + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' as threshold_value, + NULL as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, + CONCAT('Divergence Level: ', + CONCAT(CAST(js_divergence 
AS {VARCHAR_TYPE}), + ', Threshold: {THRESHOLD_VALUE}.')) as result_message, + js_divergence as result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) + + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) as js_divergence + FROM dataset ) rslt; - id: '2503' test_type: Distribution_Shift sql_flavor: redshift_spectrum - template_name: ex_relative_entropy_generic.sql + template: |- + -- Relative Entropy: measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver + AS ( SELECT {CONCAT_COLUMNS} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} ), + older_ver + AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} ), + dataset + AS ( SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON (l.category = o.category) ) + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' as threshold_value, + NULL as 
skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, + CONCAT('Divergence Level: ', + CONCAT(CAST(js_divergence AS {VARCHAR_TYPE}), + ', Threshold: {THRESHOLD_VALUE}.')) as result_message, + js_divergence as result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) + + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) as js_divergence + FROM dataset ) rslt; - id: '2103' test_type: Distribution_Shift sql_flavor: snowflake - template_name: ex_relative_entropy_generic.sql + template: |- + -- Relative Entropy: measured by Jensen-Shannon Divergence + -- Smoothed and normalized version of KL divergence, + -- with scores between 0 (identical) and 1 (maximally different), + -- when using the base-2 logarithm. Formula is: + -- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) + -- Log base 2 of x = LN(x)/LN(2) + WITH latest_ver + AS ( SELECT {CONCAT_COLUMNS} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v1 + WHERE {SUBSET_CONDITION} + GROUP BY {COLUMN_NAME_NO_QUOTES} ), + older_ver + AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, + COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total + FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v2 + WHERE {MATCH_SUBSET_CONDITION} + GROUP BY {MATCH_GROUPBY_NAMES} ), + dataset + AS ( SELECT COALESCE(l.category, o.category) AS category, + COALESCE(o.pct_of_total, 0.0000001) AS old_pct, + COALESCE(l.pct_of_total, 0.0000001) AS new_pct, + (COALESCE(o.pct_of_total, 0.0000001) + + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct + FROM latest_ver l + FULL JOIN older_ver o + ON (l.category = o.category) ) + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as 
test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + -- '{GROUPBY_NAMES}' as column_names, + '{THRESHOLD_VALUE}' as threshold_value, + NULL as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, + CONCAT('Divergence Level: ', + CONCAT(CAST(js_divergence AS {VARCHAR_TYPE}), + ', Threshold: {THRESHOLD_VALUE}.')) as result_message, + js_divergence as result_measure + FROM ( + SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) + + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) as js_divergence + FROM dataset ) rslt; diff --git a/testgen/template/dbsetup_test_types/test_types_Dupe_Rows.yaml b/testgen/template/dbsetup_test_types/test_types_Dupe_Rows.yaml index a186f74d..480988a5 100644 --- a/testgen/template/dbsetup_test_types/test_types_Dupe_Rows.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Dupe_Rows.yaml @@ -4,12 +4,14 @@ test_types: test_name_short: Duplicate Rows test_name_long: Rows are not duplicated in table test_description: |- - Tests for the absence of duplicate rows based on unique combination of column values + Tests for the absence of duplicate rows based on unique combination of column values. except_message: |- Column value combinations are duplicated in the table. 
measure_uom: Duplicate records measure_uom_description: null - selection_criteria: null + selection_criteria: |- + TEMPLATE + generation_template: gen_Dupe_Rows.sql dq_score_prevalence_formula: |- (({RESULT_MEASURE}-{THRESHOLD_VALUE}))::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' @@ -138,28 +140,266 @@ test_types: - id: '2511' test_type: Dupe_Rows sql_flavor: bigquery - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; - id: '2411' test_type: Dupe_Rows sql_flavor: databricks - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; - id: '2211' test_type: Dupe_Rows sql_flavor: mssql - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; - id: '2311' test_type: Dupe_Rows sql_flavor: postgresql - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; - id: '2011' test_type: Dupe_Rows sql_flavor: redshift - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; - id: '2511' test_type: Dupe_Rows sql_flavor: redshift_spectrum - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; - id: '2111' test_type: Dupe_Rows sql_flavor: snowflake - template_name: ex_dupe_rows_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COALESCE(SUM(record_ct), 0) as result_measure + FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + GROUP BY {GROUPBY_NAMES} + HAVING COUNT(*) > 1 + ) test; diff --git a/testgen/template/dbsetup_test_types/test_types_Email_Format.yaml b/testgen/template/dbsetup_test_types/test_types_Email_Format.yaml index 7cebba6e..1ec48c42 100644 --- a/testgen/template/dbsetup_test_types/test_types_Email_Format.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Email_Format.yaml @@ -4,7 +4,7 @@ test_types: test_name_short: Email Format test_name_long: Email is correctly formatted test_description: |- - Tests that non-blank, non-empty email addresses match the standard format + Tests that non-blank, non-empty email addresses match standard format. except_message: |- Invalid email address formats found. measure_uom: Invalid emails @@ -12,6 +12,7 @@ test_types: Number of emails that do not match standard format selection_criteria: |- std_pattern_match='EMAIL' + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Freshness_Trend.yaml b/testgen/template/dbsetup_test_types/test_types_Freshness_Trend.yaml new file mode 100644 index 00000000..0cfeecf7 --- /dev/null +++ b/testgen/template/dbsetup_test_types/test_types_Freshness_Trend.yaml @@ -0,0 +1,391 @@ +test_types: + id: '1515' + test_type: Freshness_Trend + test_name_short: Freshness + test_name_long: Table updated within expected time window + test_description: |- + Tests that table has been updated within expected time window. + except_message: |- + Table has not been updated within expected time window. 
+ measure_uom: Interval since last update + measure_uom_description: null + selection_criteria: |- + TEMPLATE + generation_template: gen_Freshness_Trend.sql + dq_score_prevalence_formula: null + dq_score_risk_factor: null + column_name_prompt: null + column_name_help: null + default_parm_columns: subset_condition,history_calculation,history_calculation_upper,history_lookback + default_parm_values: null + default_parm_prompts: |- + Record Subset Condition,Lower Bound,Upper Bound,History Lookback + default_parm_help: |- + Condition defining a subset of records in main table + default_severity: Fail + run_type: QUERY + test_scope: table + dq_dimension: Recency + health_dimension: Recency + threshold_description: |- + Expected time window + result_visualization: binary_chart + result_visualization_params: '{"legend":{"labels":{"0":"Not updated on time","1":"Updated"}}}' + usage_notes: |- + This test compares the current table fingerprint, a calculated signature of column contents, against the previous fingerprint to confirm that the table has been updated within the expected time window. The table fingerprint is derived from a set of values and aggregates from the columns most likely to change. This test allows you to track the schedule and frequency of updates and refreshes to the table.
+ active: Y + cat_test_conditions: [] + target_data_lookups: [] + test_templates: + - id: '2517' + test_type: Freshness_Trend + sql_flavor: bigquery + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + DATETIME_DIFF(DATETIME('{RUN_DATE}'), DATETIME(NULLIF('{BASELINE_SUM}', '')), MINUTE) AS interval_minutes + FROM `{SCHEMA_NAME}.{TABLE_NAME}` + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 + -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' || CASE WHEN fingerprint <> '{BASELINE_VALUE}' THEN 'Yes' ELSE 'No' END + || CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. Earlier than expected.' 
+ WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' + ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(CAST(interval_minutes AS STRING), 'Unknown') + END AS result_signal + FROM test_data; + - id: '2417' + test_type: Freshness_Trend + sql_flavor: databricks + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + DATEDIFF(MINUTE, TO_TIMESTAMP(NULLIF('{BASELINE_SUM}', '')), TIMESTAMP '{RUN_DATE}') AS interval_minutes + FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 + -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' || CASE WHEN fingerprint <> '{BASELINE_VALUE}' THEN 'Yes' ELSE 'No' END 
+ || CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. Earlier than expected.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' + ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(interval_minutes::STRING, 'Unknown') + END AS result_signal + FROM test_data; + - id: '2217' + test_type: Freshness_Trend + sql_flavor: mssql + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + DATEDIFF(MINUTE, CAST(NULLIF('{BASELINE_SUM}', '') AS DATETIME2), CAST('{RUN_DATE}' AS DATETIME2)) AS interval_minutes + FROM "{SCHEMA_NAME}"."{TABLE_NAME}" WITH (NOLOCK) + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 + -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time 
range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' + CASE WHEN fingerprint <> '{BASELINE_VALUE}' THEN 'Yes' ELSE 'No' END + + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. Earlier than expected.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' + ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(CAST(interval_minutes AS VARCHAR), 'Unknown') + END AS result_signal + FROM test_data; + - id: '2317' + test_type: Freshness_Trend + sql_flavor: postgresql + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + (EXTRACT(EPOCH FROM ('{RUN_DATE}'::TIMESTAMP - NULLIF('{BASELINE_SUM}', '')::TIMESTAMP)) / 60)::INTEGER AS interval_minutes + FROM "{SCHEMA_NAME}"."{TABLE_NAME}" + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 
+ -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' || CASE WHEN fingerprint <> '{BASELINE_VALUE}' THEN 'Yes' ELSE 'No' END + || CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. Earlier than expected.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' 
+ ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(interval_minutes::TEXT, 'Unknown') + END AS result_signal + FROM test_data; + - id: '2017' + test_type: Freshness_Trend + sql_flavor: redshift + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + DATEDIFF(MINUTE, NULLIF('{BASELINE_SUM}', '')::TIMESTAMP, '{RUN_DATE}'::TIMESTAMP) AS interval_minutes + FROM "{SCHEMA_NAME}"."{TABLE_NAME}" + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 + -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' || CASE WHEN fingerprint <> '{BASELINE_VALUE}' THEN 'Yes' ELSE 'No' END + || CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. 
Earlier than expected.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' + ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(interval_minutes::VARCHAR, 'Unknown') + END AS result_signal + FROM test_data; + - id: '2617' + test_type: Freshness_Trend + sql_flavor: redshift_spectrum + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + DATEDIFF(MINUTE, NULLIF('{BASELINE_SUM}', '')::TIMESTAMP, '{RUN_DATE}'::TIMESTAMP) AS interval_minutes + FROM "{SCHEMA_NAME}"."{TABLE_NAME}" + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 + -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' || CASE WHEN fingerprint <> '{BASELINE_VALUE}'
THEN 'Yes' ELSE 'No' END + || CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. Earlier than expected.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' + ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(interval_minutes::VARCHAR, 'Unknown') + END AS result_signal + FROM test_data; + - id: '2117' + test_type: Freshness_Trend + sql_flavor: snowflake + template: |- + WITH test_data AS ( + SELECT + {CUSTOM_QUERY} AS fingerprint, + DATEDIFF(MINUTE, NULLIF('{BASELINE_SUM}', '')::TIMESTAMP, '{RUN_DATE}'::TIMESTAMP) AS interval_minutes + FROM "{SCHEMA_NAME}"."{TABLE_NAME}" + WHERE {SUBSET_CONDITION} + ) + SELECT '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + '{TABLE_NAME}' AS table_name, + '{COLUMN_NAME_NO_QUOTES}' AS column_names, + '{SKIP_ERRORS}' AS threshold_value, + {SKIP_ERRORS} AS skip_errors, + '{INPUT_PARAMETERS}' AS input_parameters, + fingerprint AS result_measure, + CASE + -- Training mode: tolerances not yet calculated + WHEN {LOWER_TOLERANCE} IS NULL AND {UPPER_TOLERANCE} IS NULL THEN -1 + -- No change and excluded day: suppress + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 1 THEN 1 + -- No change, beyond time range (business time): LATE + WHEN fingerprint = '{BASELINE_VALUE}' + AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN 0 + -- Table changed outside time 
range (business time): UNEXPECTED + WHEN fingerprint <> '{BASELINE_VALUE}' + AND NOT (interval_minutes - {EXCLUDED_MINUTES}) + BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN 0 + ELSE 1 + END AS result_code, + 'Table update detected: ' || CASE WHEN fingerprint <> '{BASELINE_VALUE}' THEN 'Yes' ELSE 'No' END + || CASE + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) BETWEEN {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} THEN '. On time.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) < {LOWER_TOLERANCE} THEN '. Earlier than expected.' + WHEN fingerprint <> '{BASELINE_VALUE}' AND (interval_minutes - {EXCLUDED_MINUTES}) > {UPPER_TOLERANCE} THEN '. Later than expected.' + WHEN fingerprint = '{BASELINE_VALUE}' AND {IS_EXCLUDED_DAY} = 0 AND (interval_minutes - {EXCLUDED_MINUTES}) > {THRESHOLD_VALUE} THEN '. Late.' + ELSE '' + END AS result_message, + CASE + WHEN fingerprint <> '{BASELINE_VALUE}' THEN '0' + ELSE COALESCE(interval_minutes::VARCHAR, 'Unknown') + END AS result_signal + FROM test_data; diff --git a/testgen/template/dbsetup_test_types/test_types_Future_Date.yaml b/testgen/template/dbsetup_test_types/test_types_Future_Date.yaml index 5aab6fc8..646cc9c0 100644 --- a/testgen/template/dbsetup_test_types/test_types_Future_Date.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Future_Date.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Past Dates test_name_long: Latest date is prior to test run date test_description: |- - Tests that the maximum date referenced in the column is no greater than the test date, consistent with baseline data + Tests that maximum date referenced in column is no greater than the test date, consistent with baseline data. except_message: |- Future date found when absent in baseline data. 
measure_uom: Future dates measure_uom_description: null selection_criteria: |- general_type='D'AND future_date_ct = 0 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Future_Date_1Y.yaml b/testgen/template/dbsetup_test_types/test_types_Future_Date_1Y.yaml index a11cebaf..7f55192c 100644 --- a/testgen/template/dbsetup_test_types/test_types_Future_Date_1Y.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Future_Date_1Y.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Future Year test_name_long: Future dates within year of test run date test_description: |- - Tests that the maximum date referenced in the column is no greater than one year beyond the test date, consistent with baseline data + Tests that maximum date referenced in column is no greater than one year beyond the test date, consistent with baseline data. except_message: |- - Future date beyond one-year found when absent in baseline. + Future date beyond one year found when absent in baseline. 
measure_uom: Future dates post 1 year measure_uom_description: null selection_criteria: |- general_type='D'AND future_date_ct > 0 AND max_date <='{AS_OF_DATE}'::DATE + INTERVAL'365 DAYS' + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Incr_Avg_Shift.yaml b/testgen/template/dbsetup_test_types/test_types_Incr_Avg_Shift.yaml index 1e4c6259..94655ff8 100644 --- a/testgen/template/dbsetup_test_types/test_types_Incr_Avg_Shift.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Incr_Avg_Shift.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: New Shift test_name_long: New record mean is consistent with reference test_description: |- - Tests for statistically-significant shift in mean of new values for column compared to average calculated at baseline. + Tests for statistically significant shift in mean of new values for column compared to average calculated at baseline. except_message: |- - Significant shift in average of new values vs. baseline avg + Significant shift in average of new values compared to baseline average. 
measure_uom: Z-score of mean shift measure_uom_description: |- Absolute Z-score (number of SD's outside mean) of prior avg - incremental avg selection_criteria: |- general_type='N' AND distinct_value_ct > 10 AND functional_data_type ilike 'Measure%' AND functional_data_type <> 'Measurement Spike' AND column_name NOT ilike '%latitude%' AND column_name NOT ilike '%longitude%' + generation_template: null dq_score_prevalence_formula: |- {RECORD_CT}::FLOAT*(1-FN_NORMAL_CDF({RESULT_MEASURE}::FLOAT))/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_LOV_All.yaml b/testgen/template/dbsetup_test_types/test_types_LOV_All.yaml index 36686814..85665563 100644 --- a/testgen/template/dbsetup_test_types/test_types_LOV_All.yaml +++ b/testgen/template/dbsetup_test_types/test_types_LOV_All.yaml @@ -4,12 +4,13 @@ test_types: test_name_short: Value Match All test_name_long: List of expected values all present in column test_description: |- - Tests that all values match a pipe-delimited list of expected values and that all expected values are present + Tests that column values match pipe-delimited list of expected values and that all expected values are present. except_message: |- - Column values found don't exactly match the expected list of values + Column values do not exactly match expected list of values. 
measure_uom: Values found measure_uom_description: null selection_criteria: null + generation_template: null dq_score_prevalence_formula: |- 1 dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_LOV_Match.yaml b/testgen/template/dbsetup_test_types/test_types_LOV_Match.yaml index 6f2aa126..fed0b3ec 100644 --- a/testgen/template/dbsetup_test_types/test_types_LOV_Match.yaml +++ b/testgen/template/dbsetup_test_types/test_types_LOV_Match.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Value Match test_name_long: All column values present in expected list test_description: |- - Tests that all values in the column match the list-of-values identified in baseline data. + Tests that column values match the list of values identified in baseline data. except_message: |- - Values not matching expected List-of-Values from baseline. + Values not matching expected list of values from baseline. measure_uom: Non-matching records measure_uom_description: null selection_criteria: |- functional_data_type IN ('Boolean', 'Code', 'Category') AND top_freq_values > '' AND distinct_value_ct BETWEEN 2 and 10 AND value_ct > 5 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Metric_Trend.yaml b/testgen/template/dbsetup_test_types/test_types_Metric_Trend.yaml new file mode 100644 index 00000000..545d25f6 --- /dev/null +++ b/testgen/template/dbsetup_test_types/test_types_Metric_Trend.yaml @@ -0,0 +1,169 @@ +test_types: + id: '1514' + test_type: Metric_Trend + test_name_short: Metric Trend + test_name_long: Aggregate metric is within tolerance range + test_description: |- + Tests that aggregate metric of all or subset of records in a table is within tolerance range. + except_message: |- + Aggregate metric is outside expected range. 
+ measure_uom: Aggregate metric + measure_uom_description: null + selection_criteria: null + generation_template: null + dq_score_prevalence_formula: null + dq_score_risk_factor: null + column_name_prompt: null + column_name_help: null + default_parm_columns: column_name,custom_query,history_calculation,history_calculation_upper,history_lookback + default_parm_values: null + default_parm_prompts: Metric Name,Metric Expression,Lower Bound,Upper Bound,History Lookback + default_parm_help: null + default_severity: Fail + run_type: CAT + test_scope: table + dq_dimension: Validity + health_dimension: null + threshold_description: |- + Expected aggregate metric range. + result_visualization: line_chart + result_visualization_params: null + usage_notes: |- + This test compares the aggregate metric of all or a subset of records in a table against a derived tolerance range. + active: Y + cat_test_conditions: + - id: '2516' + test_type: Metric_Trend + sql_flavor: bigquery + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2416' + test_type: Metric_Trend + sql_flavor: databricks + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2216' + test_type: Metric_Trend + sql_flavor: mssql + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2316' + test_type: Metric_Trend + sql_flavor: postgresql + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2016' + test_type: Metric_Trend + sql_flavor: redshift + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2616' + test_type: Metric_Trend + sql_flavor: redshift_spectrum + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: 
|- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2116' + test_type: Metric_Trend + sql_flavor: snowflake + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + target_data_lookups: + - id: '1484' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: bigquery + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM `{TARGET_SCHEMA}`.`{TABLE_NAME}`; + error_type: Test Results + - id: '1485' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: databricks + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM `{TARGET_SCHEMA}`.`{TABLE_NAME}`; + error_type: Test Results + - id: '1486' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: mssql + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1487' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: postgresql + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1488' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: redshift + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1489' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: redshift_spectrum + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS 
upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1490' + test_id: '1514' + test_type: Metric_Trend + sql_flavor: snowflake + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + test_templates: [] diff --git a/testgen/template/dbsetup_test_types/test_types_Min_Date.yaml b/testgen/template/dbsetup_test_types/test_types_Min_Date.yaml index 698d63a3..01dbf230 100644 --- a/testgen/template/dbsetup_test_types/test_types_Min_Date.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Min_Date.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Minimum Date test_name_long: All dates on or after set minimum test_description: |- - Tests that the earliest date referenced in the column is no earlier than baseline data + Tests that earliest date referenced in column is no earlier than baseline data. except_message: |- - The earliest date value found is before the earliest value at baseline. + Earliest date value found is before earliest value at baseline. 
measure_uom: Dates prior to limit measure_uom_description: null selection_criteria: |- general_type='D'and min_date IS NOT NULL AND distinct_value_ct > 1 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Min_Val.yaml b/testgen/template/dbsetup_test_types/test_types_Min_Val.yaml index ea5b7d56..bfac4c70 100644 --- a/testgen/template/dbsetup_test_types/test_types_Min_Val.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Min_Val.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Minimum Value test_name_long: All values at or above set minimum test_description: |- - Tests that the minimum value present in the column is no lower than the minimum value in baseline data + Tests that minimum value present in column is no lower than minimum value in baseline data. except_message: |- Minimum column value less than baseline. measure_uom: Values under limit measure_uom_description: null selection_criteria: |- general_type='N' AND functional_data_type ILIKE 'Measure%' AND min_value IS NOT NULL AND (distinct_value_ct >= 2 OR (distinct_value_ct=2 and min_value<>0 and max_value<>1)) + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Missing_Pct.yaml b/testgen/template/dbsetup_test_types/test_types_Missing_Pct.yaml index 7598d6ed..3bc7069a 100644 --- a/testgen/template/dbsetup_test_types/test_types_Missing_Pct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Missing_Pct.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Percent Missing test_name_long: Consistent ratio of missing values test_description: |- - Tests for statistically-significant shift in percentage of missing values in column vs. 
baseline data + Tests for statistically significant shift in percentage of missing values in column compared to baseline data. except_message: |- - Significant shift in percent of missing values vs. baseline. + Significant shift in percent of missing values compared to baseline. measure_uom: Difference measure measure_uom_description: |- Cohen's H Difference (0.20 small, 0.5 mod, 0.8 large, 1.2 very large, 2.0 huge) selection_criteria: |- record_ct <> value_ct + generation_template: null dq_score_prevalence_formula: |- 2.0 * (1.0 - fn_normal_cdf(ABS({RESULT_MEASURE}::FLOAT) / 2.0)) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Monthly_Rec_Ct.yaml b/testgen/template/dbsetup_test_types/test_types_Monthly_Rec_Ct.yaml index 0f155edc..4ce0fc6a 100644 --- a/testgen/template/dbsetup_test_types/test_types_Monthly_Rec_Ct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Monthly_Rec_Ct.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Monthly Records test_name_long: At least one date per month present within date range test_description: |- - Tests for presence of at least one date per calendar month within min/max date range, per baseline data + Tests for presence of at least one date per calendar month within minimum and maximum date range, per baseline data. except_message: |- - At least one date per month expected in min/max date range. + Not every month between minimum and maximum date range has at least one date present. 
measure_uom: Missing months measure_uom_description: |- Calendar months without date values present selection_criteria: |- functional_data_type ILIKE 'Transactional Date%' AND date_days_present > 1 AND functional_table_type ILIKE '%cumulative%' AND date_months_present > 2 AND date_months_present - (datediff( 'MON' , min_date, max_date) + 1) = 0 AND future_date_ct::FLOAT / NULLIF(value_ct, 0) <= 0.75 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT*{PRO_RECORD_CT}::FLOAT/NULLIF({DATE_MONTHS_PRESENT}::FLOAT, 0)/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Above.yaml b/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Above.yaml index 1901fdac..be6ad5eb 100644 --- a/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Above.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Above.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Outliers Above test_name_long: Consistent outlier counts over 2 SD above mean test_description: |- - Tests that percent of outliers over 2 SD above Mean doesn't exceed threshold + Tests that percent of outliers over two standard deviations above mean does not exceed threshold. except_message: |- - Percent of outliers exceeding 2 SD above the mean is greater than expected threshold. + Percent of outliers exceeding two standard deviations above the mean is greater than expected threshold. 
measure_uom: Pct records over limit measure_uom_description: null selection_criteria: |- functional_data_type = 'Measurement' AND distinct_value_ct > 30 AND NOT distinct_value_ct = max_value - min_value + 1 AND distinct_value_ct::FLOAT/value_ct::FLOAT > 0.1 AND stdev_value::FLOAT/avg_value::FLOAT > 0.01 AND column_name NOT ILIKE '%latitude%' AND column_name NOT ilike '%longitude%' + generation_template: null dq_score_prevalence_formula: |- GREATEST(0, {RESULT_MEASURE}::FLOAT-{THRESHOLD_VALUE}::FLOAT) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Below.yaml b/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Below.yaml index 0d9b45cb..0fd3341a 100644 --- a/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Below.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Outlier_Pct_Below.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Outliers Below test_name_long: Consistent outlier counts under 2 SD below mean test_description: |- - Tests that percent of outliers over 2 SD below Mean doesn't exceed threshold + Tests that percent of outliers over two standard deviations below mean does not exceed threshold. except_message: |- - Percent of outliers exceeding 2 SD below the mean is greater than expected threshold. + Percent of outliers exceeding two standard deviations below the mean is greater than expected threshold. 
measure_uom: Pct records under limit measure_uom_description: null selection_criteria: |- functional_data_type = 'Measurement' AND distinct_value_ct > 30 AND NOT distinct_value_ct = max_value - min_value + 1 AND distinct_value_ct::FLOAT/value_ct::FLOAT > 0.1 AND stdev_value::FLOAT/avg_value::FLOAT > 0.01 AND column_name NOT ILIKE '%latitude%' AND column_name NOT ilike '%longitude%' + generation_template: null dq_score_prevalence_formula: |- GREATEST(0, {RESULT_MEASURE}::FLOAT-{THRESHOLD_VALUE}::FLOAT) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Pattern_Match.yaml b/testgen/template/dbsetup_test_types/test_types_Pattern_Match.yaml index 84d0052b..6fd1f981 100644 --- a/testgen/template/dbsetup_test_types/test_types_Pattern_Match.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Pattern_Match.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Pattern Match test_name_long: Column values match alpha-numeric pattern test_description: |- - Tests that all values in the column match the same alpha-numeric pattern identified in baseline data + Tests that all values in column match the same alphanumeric pattern identified in baseline data. except_message: |- Alpha values do not match consistent pattern in baseline. 
measure_uom: Pattern Mismatches measure_uom_description: null selection_criteria: |- (functional_data_type IN ('Attribute', 'DateTime Stamp', 'Phone') OR functional_data_type ILIKE 'ID%' OR functional_data_type ILIKE 'Period%') AND fn_charcount(top_patterns, E' \| ' ) = 1 AND REPLACE(SPLIT_PART(top_patterns, '|' , 2), 'N' , '' ) > '' AND distinct_value_ct > 10 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Recency.yaml b/testgen/template/dbsetup_test_types/test_types_Recency.yaml index 278eb9d4..c69df2e2 100644 --- a/testgen/template/dbsetup_test_types/test_types_Recency.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Recency.yaml @@ -4,7 +4,7 @@ test_types: test_name_short: Recency test_name_long: Latest date within expected range of test date test_description: |- - Tests that the latest date in column is within a set number of days of the test date + Tests that most recent date in column is within a set number of days of the test date. except_message: |- Most recent date value not within expected days of test date. 
measure_uom: Days before test @@ -12,6 +12,7 @@ test_types: Number of days that most recent date precedes the date of test selection_criteria: |- general_type= 'D' AND max_date <= run_date AND NOT column_name IN ( 'filedate' , 'file_date' ) AND NOT functional_data_type IN ('Future Date', 'Schedule Date') AND DATEDIFF( 'DAY' , max_date, run_date) <= 62 + generation_template: null dq_score_prevalence_formula: |- (ABS({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT*{PRO_RECORD_CT}::FLOAT/(1.0+DATEDIFF('DAY', '{MIN_DATE}', '{MAX_DATE}'))::FLOAT)/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Required.yaml b/testgen/template/dbsetup_test_types/test_types_Required.yaml index ada30dfe..fcb3200b 100644 --- a/testgen/template/dbsetup_test_types/test_types_Required.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Required.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Required Entry test_name_long: Required non-null value present test_description: |- - Tests that a non-null value is present in each record for the column, consistent with baseline data + Tests that a non-null value is present in each record for the column, consistent with baseline data. except_message: |- - Every record for this column is expected to be filled, but some are missing. + Not every record for the column is filled with non-null values. 
measure_uom: Missing values measure_uom_description: null selection_criteria: |- record_ct = value_ct AND record_ct > 10 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Row_Ct.yaml b/testgen/template/dbsetup_test_types/test_types_Row_Ct.yaml index 776bea6a..47c71112 100644 --- a/testgen/template/dbsetup_test_types/test_types_Row_Ct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Row_Ct.yaml @@ -9,8 +9,8 @@ test_types: Row count less than baseline count. measure_uom: Row count measure_uom_description: null - selection_criteria: |- - TEMPLATE + selection_criteria: null + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({THRESHOLD_VALUE}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Row_Ct_Pct.yaml b/testgen/template/dbsetup_test_types/test_types_Row_Ct_Pct.yaml index 5b5ab463..08209512 100644 --- a/testgen/template/dbsetup_test_types/test_types_Row_Ct_Pct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Row_Ct_Pct.yaml @@ -6,12 +6,12 @@ test_types: test_description: |- Tests that the count of records is within a percentage above or below the baseline count. except_message: |- - Row Count is outside of threshold percent of baseline count. + Row count is outside of threshold percent of baseline count. 
measure_uom: Percent of baseline measure_uom_description: |- Row count percent above or below baseline - selection_criteria: |- - TEMPLATE + selection_criteria: null + generation_template: null dq_score_prevalence_formula: |- (100.0 - {RESULT_MEASURE}::FLOAT)/100.0 dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Schema_Drift.yaml b/testgen/template/dbsetup_test_types/test_types_Schema_Drift.yaml index 7dbd646b..d1ea92cf 100644 --- a/testgen/template/dbsetup_test_types/test_types_Schema_Drift.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Schema_Drift.yaml @@ -1,16 +1,17 @@ test_types: id: '1512' test_type: Schema_Drift - test_name_short: Schema Drift + test_name_short: Schema test_name_long: Table Schema Changed test_description: |- - Checks whether table schema has changed + Checks whether table schema has changed. except_message: |- Table schema has changed. - measure_uom: Was Schema Change Detected + measure_uom: Schema changes measure_uom_description: null selection_criteria: |- TEMPLATE + generation_template: gen_Schema_Drift.sql dq_score_prevalence_formula: null dq_score_risk_factor: null column_name_prompt: null @@ -19,7 +20,7 @@ test_types: default_parm_values: null default_parm_prompts: null default_parm_help: null - default_severity: Warning + default_severity: Fail run_type: METADATA test_scope: tablegroup dq_dimension: null @@ -29,35 +30,392 @@ test_types: result_visualization_params: '{"legend":{"labels":{"0":"No Changes","1":"Changes"}}}' usage_notes: |- This test compares the current table column types with previous data, to check whether the table schema has changed. This test allows you to track any changes to the table structure. 
- active: N + active: Y + cat_test_conditions: [] + target_data_lookups: [] + test_templates: + - id: '2514' + test_type: Schema_Drift + sql_flavor: bigquery - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN last_add_date IS NOT NULL AND
(last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified. ' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; - id: '2414' test_type: Schema_Drift sql_flavor: databricks - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, +
'{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified. ' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; - id: '2214' test_type: Schema_Drift sql_flavor: mssql - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 
'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified.
' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; - id: '2314' test_type: Schema_Drift sql_flavor: postgresql - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN
last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified. ' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; - id: '2014' test_type: Schema_Drift sql_flavor: redshift - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS
test_time, + '{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified. ' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; - id: '2614' test_type: Schema_Drift sql_flavor: redshift_spectrum - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS 
NOT NULL AND dsl.change = 'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified.
' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; - id: '2114' test_type: Schema_Drift sql_flavor: snowflake - template_name: ex_schema_drift_generic.sql + template: |- + WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM {APP_SCHEMA_NAME}.test_runs + WHERE test_suite_id = '{TEST_SUITE_ID}'::UUID + -- Ignore current run + AND id <> '{TEST_RUN_ID}'::UUID + ), + table_changes AS ( + SELECT + dsl.table_name, + MAX(prev_test.last_run_time) as window_start, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'A' THEN dsl.change_date ELSE NULL END) as last_add_date, + MAX(CASE WHEN dsl.column_id IS NULL AND dsl.change = 'D' THEN dsl.change_date ELSE NULL END) as last_drop_date, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'A') AS column_adds, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'D') AS column_drops, + COUNT(*) FILTER (WHERE dsl.column_id IS NOT NULL AND dsl.change = 'M') AS column_mods + FROM {APP_SCHEMA_NAME}.data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = '{TABLE_GROUPS_ID}'::UUID + -- if no previous tests, this comparison yields null and nothing is counted + AND dsl.change_date > prev_test.last_run_time + GROUP BY dsl.table_name + ) + SELECT + '{TEST_TYPE}' AS test_type, + '{TEST_DEFINITION_ID}' AS test_definition_id, + '{TEST_SUITE_ID}' AS test_suite_id, + '{TEST_RUN_ID}' AS test_run_id, + '{RUN_DATE}' AS test_time, + '{SCHEMA_NAME}' AS schema_name, + table_name, + '{INPUT_PARAMETERS}' AS input_parameters, + (CASE + WHEN last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'A' + WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'D' + ELSE 'M' + END) + || '|' || column_adds + || '|' || column_drops + || '|' || column_mods + || '|' || window_start::TEXT + AS result_signal, + 0 AS result_code, + CASE WHEN
last_add_date IS NOT NULL AND (last_drop_date IS NULL OR last_add_date > last_drop_date) THEN 'Table added. ' ELSE '' END + || CASE WHEN last_drop_date IS NOT NULL AND (last_add_date IS NULL OR last_drop_date > last_add_date) THEN 'Table dropped. ' ELSE '' END + || CASE WHEN column_adds > 0 THEN column_adds || ' columns added. ' ELSE '' END + || CASE WHEN column_drops > 0 THEN column_drops || ' columns dropped. ' ELSE '' END + || CASE WHEN column_mods > 0 THEN column_mods || ' columns modified. ' ELSE '' END + AS result_message, + column_adds + column_drops + column_mods AS result_measure + FROM table_changes; diff --git a/testgen/template/dbsetup_test_types/test_types_Street_Addr_Pattern.yaml b/testgen/template/dbsetup_test_types/test_types_Street_Addr_Pattern.yaml index 0fb0a904..c5f9a5c6 100644 --- a/testgen/template/dbsetup_test_types/test_types_Street_Addr_Pattern.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Street_Addr_Pattern.yaml @@ -12,6 +12,7 @@ test_types: Percent of records that match street address pattern selection_criteria: |- (std_pattern_match='STREET_ADDR') AND (avg_length <> round(avg_length)) AND (avg_embedded_spaces BETWEEN 2 AND 6) AND (avg_length < 35) + generation_template: null dq_score_prevalence_formula: |- ({VALUE_CT}::FLOAT * ({RESULT_MEASURE}::FLOAT - {THRESHOLD_VALUE}::FLOAT)/100.0)/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Table_Freshness.yaml b/testgen/template/dbsetup_test_types/test_types_Table_Freshness.yaml index 9149319b..ed3e6340 100644 --- a/testgen/template/dbsetup_test_types/test_types_Table_Freshness.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Table_Freshness.yaml @@ -4,26 +4,26 @@ test_types: test_name_short: Table Freshness test_name_long: Stale Table Not Updated test_description: |- - Confirms whether table has been updated based on data fingerprint + Tests that table has been updated based on data fingerprint. 
     except_message: |-
       Table has not been updated.
     measure_uom: Was Change Detected
     measure_uom_description: null
     selection_criteria: |-
       TEMPLATE
-    dq_score_prevalence_formula: |-
-      (({RESULT_MEASURE}-{THRESHOLD_VALUE}))::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0)
-    dq_score_risk_factor: '0.0'
+    generation_template: gen_Table_Freshness.sql
+    dq_score_prevalence_formula: null
+    dq_score_risk_factor: null
     column_name_prompt: |-
       null
     column_name_help: |-
       null
-    default_parm_columns: history_calculation,history_lookback,subset_condition,custom_query
+    default_parm_columns: subset_condition,custom_query
     default_parm_values: null
     default_parm_prompts: |-
-      History Aggregate,History Lookback,Record Subset Condition,Fingerprint Expression
+      Record Subset Condition,Fingerprint Expression
     default_parm_help: |-
-      Aggregate calculation to be performed on the N lookback results|Last N tests to use for history aggregate calculation|Condition defining a subset of records in main table|String expression combining key column measures into a distinct representation of table state
+      Condition defining a subset of records in main table|String expression combining key column measures into a distinct representation of table state
     default_severity: Log
     run_type: QUERY
     test_scope: table
@@ -42,28 +42,225 @@ test_types:
   - id: '2512'
     test_type: Table_Freshness
     sql_flavor: bigquery
-    template_name: ex_table_changed_bigquery.sql
+    template: |-
+      SELECT '{TEST_TYPE}' AS test_type,
+        '{TEST_DEFINITION_ID}' AS test_definition_id,
+        '{TEST_SUITE_ID}' AS test_suite_id,
+        '{TEST_RUN_ID}' AS test_run_id,
+        '{RUN_DATE}' AS test_time,
+        '{SCHEMA_NAME}' AS schema_name,
+        '{TABLE_NAME}' AS table_name,
+        '{COLUMN_NAME_NO_QUOTES}' AS column_names,
+        '{SKIP_ERRORS}' AS threshold_value,
+        {SKIP_ERRORS} AS skip_errors,
+        '{INPUT_PARAMETERS}' AS input_parameters,
+        fingerprint AS result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM (
+        SELECT {CUSTOM_QUERY} AS fingerprint
+        FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+        WHERE {SUBSET_CONDITION}
+      ) test;
   - id: '2412'
     test_type: Table_Freshness
     sql_flavor: databricks
-    template_name: ex_table_changed_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        fingerprint as result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM ( SELECT {CUSTOM_QUERY} as fingerprint
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+        ) test;
   - id: '2212'
     test_type: Table_Freshness
    sql_flavor: mssql
-    template_name: ex_table_changed_mssql.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        fingerprint as result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM ( SELECT {CUSTOM_QUERY} as fingerprint
+          FROM "{SCHEMA_NAME}"."{TABLE_NAME}" WITH (NOLOCK)
+          WHERE {SUBSET_CONDITION}
+        ) test;
   - id: '2312'
     test_type: Table_Freshness
     sql_flavor: postgresql
-    template_name: ex_table_changed_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        fingerprint as result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM ( SELECT {CUSTOM_QUERY} as fingerprint
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+        ) test;
   - id: '2012'
     test_type: Table_Freshness
     sql_flavor: redshift
-    template_name: ex_table_changed_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        fingerprint as result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM ( SELECT {CUSTOM_QUERY} as fingerprint
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+        ) test;
   - id: '2512'
     test_type: Table_Freshness
     sql_flavor: redshift_spectrum
-    template_name: ex_table_changed_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        fingerprint as result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM ( SELECT {CUSTOM_QUERY} as fingerprint
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+        ) test;
   - id: '2112'
     test_type: Table_Freshness
     sql_flavor: snowflake
-    template_name: ex_table_changed_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        fingerprint as result_signal,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_code,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 'No table change detected.'
+          ELSE 'Table change detected.'
+        END AS result_message,
+        CASE
+          WHEN '{LOWER_TOLERANCE}' = 'NULL' OR fingerprint = '{LOWER_TOLERANCE}' THEN 0
+          ELSE 1
+        END AS result_measure
+      FROM ( SELECT {CUSTOM_QUERY} as fingerprint
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+        ) test;
diff --git a/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Gain.yaml b/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Gain.yaml
index 746913cb..d4d1152b 100644
--- a/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Gain.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Gain.yaml
@@ -5,12 +5,13 @@
     test_name_long: Latest timeframe has at least all value combinations from prior period
     test_description: |-
-      Tests that column values in most recent time-window include at least same as prior time window
+      Tests that column values in the most recent time window include at least the same values as the prior time window.
     except_message: |-
-      Column values in most recent time-window don't include all values in prior window.
+      Column values in the most recent time window do not include all values in the prior time window.
     measure_uom: Mismatched values
     measure_uom_description: null
     selection_criteria: null
+    generation_template: null
     dq_score_prevalence_formula: |-
       ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0)
     dq_score_risk_factor: '1.0'
@@ -161,28 +162,320 @@ test_types:
   - id: '2507'
     test_type: Timeframe_Combo_Gain
     sql_flavor: bigquery
-    template_name: ex_window_match_no_drops_bigquery.sql
+    template: |-
+      SELECT
+        '{TEST_TYPE}' AS test_type,
+        '{TEST_DEFINITION_ID}' AS test_definition_id,
+        '{TEST_SUITE_ID}' AS test_suite_id,
+        '{TEST_RUN_ID}' AS test_run_id,
+        '{RUN_DATE}' AS test_time,
+        '{SCHEMA_NAME}' AS schema_name,
+        '{TABLE_NAME}' AS table_name,
+        '{COLUMN_NAME_NO_QUOTES}' AS column_names,
+        '{SKIP_ERRORS}' AS threshold_value,
+        {SKIP_ERRORS} AS skip_errors,
+        '{INPUT_PARAMETERS}' AS input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END AS result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN CONCAT(
+            CAST(COUNT(*) AS STRING), ' error(s) identified, ',
+            CASE
+              WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+              ELSE 'within limit of '
+            END,
+            '{SKIP_ERRORS}.'
+          )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) AS result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATE_SUB((SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), INTERVAL 2 * {WINDOW_DAYS} DAY)
+          AND {WINDOW_DATE_COLUMN} < DATE_SUB((SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), INTERVAL {WINDOW_DAYS} DAY)
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT DISTINCT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATE_SUB((SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), INTERVAL {WINDOW_DAYS} DAY)
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
   - id: '2407'
     test_type: Timeframe_Combo_Gain
     sql_flavor: databricks
-    template_name: ex_window_match_no_drops_databricks.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+          AND {WINDOW_DATE_COLUMN} < DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
   - id: '2207'
     test_type: Timeframe_Combo_Gain
     sql_flavor: mssql
-    template_name: ex_window_match_no_drops_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
   - id: '2307'
     test_type: Timeframe_Combo_Gain
     sql_flavor: postgresql
-    template_name: ex_window_match_no_drops_postgresql.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS VARCHAR), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM "{SCHEMA_NAME}"."{TABLE_NAME}"
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - 2 * {WINDOW_DAYS}
+          AND {WINDOW_DATE_COLUMN} < (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS}
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM "{SCHEMA_NAME}"."{TABLE_NAME}"
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS}
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
   - id: '2007'
     test_type: Timeframe_Combo_Gain
     sql_flavor: redshift
-    template_name: ex_window_match_no_drops_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
   - id: '2507'
     test_type: Timeframe_Combo_Gain
     sql_flavor: redshift_spectrum
-    template_name: ex_window_match_no_drops_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
   - id: '2107'
     test_type: Timeframe_Combo_Gain
     sql_flavor: snowflake
-    template_name: ex_window_match_no_drops_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+        EXCEPT
+        SELECT {COLUMN_NAME_NO_QUOTES}
+        FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+        WHERE {SUBSET_CONDITION}
+          AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        GROUP BY {COLUMN_NAME_NO_QUOTES}
+      ) test;
diff --git a/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Match.yaml b/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Match.yaml
index 8f6d9362..24b17cc4 100644
--- a/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Match.yaml
+++ b/testgen/template/dbsetup_test_types/test_types_Timeframe_Combo_Match.yaml
@@ -4,12 +4,13 @@ test_types:
     test_name_short: Timeframe Match
     test_name_long: Column value combinations from latest timeframe same as prior period
     test_description: |-
-      Tests for presence of same column values in most recent time-window vs. prior time window
+      Tests that the same column values are present in the most recent time window as in the prior time window.
     except_message: |-
-      Column values don't match in most recent time-windows.
+      Column values do not match in the most recent time windows.
     measure_uom: Mismatched values
     measure_uom_description: null
     selection_criteria: null
+    generation_template: null
     dq_score_prevalence_formula: |-
       ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0)
     dq_score_risk_factor: '1.0'
@@ -276,28 +277,432 @@ test_types:
   - id: '2508'
     test_type: Timeframe_Combo_Match
     sql_flavor: bigquery
-    template_name: ex_window_match_same_bigquery.sql
+    template: |-
+      SELECT '{TEST_TYPE}' AS test_type,
+        '{TEST_DEFINITION_ID}' AS test_definition_id,
+        '{TEST_SUITE_ID}' AS test_suite_id,
+        '{TEST_RUN_ID}' AS test_run_id,
+        '{RUN_DATE}' AS test_time,
+        '{SCHEMA_NAME}' AS schema_name,
+        '{TABLE_NAME}' AS table_name,
+        '{COLUMN_NAME_NO_QUOTES}' AS column_names,
+        '{SKIP_ERRORS}' AS threshold_value,
+        {SKIP_ERRORS} AS skip_errors,
+        '{INPUT_PARAMETERS}' AS input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END AS result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CAST(COUNT(*) AS STRING),
+              ' error(s) identified, ',
+              CASE
+                WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                ELSE 'within limit of '
+              END,
+              '{SKIP_ERRORS}.'
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) AS result_measure
+      FROM (
+        -- Values in the prior timeframe but not in the latest
+        (
+          SELECT 'Prior Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATE_ADD(
+              (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`),
+              INTERVAL -{WINDOW_DAYS} DAY
+            )
+          EXCEPT DISTINCT
+          SELECT 'Prior Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATE_ADD(
+              (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`),
+              INTERVAL -2 * {WINDOW_DAYS} DAY
+            )
+            AND {WINDOW_DATE_COLUMN} < DATE_ADD(
+              (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`),
+              INTERVAL -{WINDOW_DAYS} DAY
+            )
+        )
+        UNION ALL
+        -- Values in the latest timeframe but not in the prior
+        (
+          SELECT 'Latest Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATE_ADD(
+              (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`),
+              INTERVAL -2 * {WINDOW_DAYS} DAY
+            )
+            AND {WINDOW_DATE_COLUMN} < DATE_ADD(
+              (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`),
+              INTERVAL -{WINDOW_DAYS} DAY
+            )
+          EXCEPT DISTINCT
+          SELECT 'Latest Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}.{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATE_ADD(
+              (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`),
+              INTERVAL -{WINDOW_DAYS} DAY
+            )
+        )
+      ) test;
   - id: '2408'
     test_type: Timeframe_Combo_Match
     sql_flavor: databricks
-    template_name: ex_window_match_same_databricks.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        (
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+          EXCEPT
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+            AND {WINDOW_DATE_COLUMN} < DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+        )
+        UNION ALL
+        (
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+            AND {WINDOW_DATE_COLUMN} < DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+          EXCEPT
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`))
+        )
+      ) test;
   - id: '2208'
     test_type: Timeframe_Combo_Match
     sql_flavor: mssql
-    template_name: ex_window_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        (
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          EXCEPT
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+            AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        )
+        UNION ALL
+        (
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+            AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          EXCEPT
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        )
+      ) test;
   - id: '2308'
     test_type: Timeframe_Combo_Match
     sql_flavor: postgresql
-    template_name: ex_window_match_same_postgresql.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS VARCHAR), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        (
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM "{SCHEMA_NAME}"."{TABLE_NAME}"
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS}
+          EXCEPT
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM "{SCHEMA_NAME}"."{TABLE_NAME}"
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - 2 * {WINDOW_DAYS}
+            AND {WINDOW_DATE_COLUMN} < (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS}
+        )
+        UNION ALL
+        (
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM "{SCHEMA_NAME}"."{TABLE_NAME}"
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - 2 * {WINDOW_DAYS}
+            AND {WINDOW_DATE_COLUMN} < (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS}
+          EXCEPT
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM "{SCHEMA_NAME}"."{TABLE_NAME}"
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS}
+        )
+      ) test;
   - id: '2008'
     test_type: Timeframe_Combo_Match
    sql_flavor: redshift
-    template_name: ex_window_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        (
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          EXCEPT
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+            AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        )
+        UNION ALL
+        (
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+            AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          EXCEPT
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        )
+      ) test;
   - id: '2508'
     test_type: Timeframe_Combo_Match
     sql_flavor: redshift_spectrum
-    template_name: ex_window_match_same_generic.sql
+    template: |-
+      SELECT '{TEST_TYPE}' as test_type,
+        '{TEST_DEFINITION_ID}' as test_definition_id,
+        '{TEST_SUITE_ID}' as test_suite_id,
+        '{TEST_RUN_ID}' as test_run_id,
+        '{RUN_DATE}' as test_time,
+        '{SCHEMA_NAME}' as schema_name,
+        '{TABLE_NAME}' as table_name,
+        '{COLUMN_NAME_NO_QUOTES}' as column_names,
+        '{SKIP_ERRORS}' as threshold_value,
+        {SKIP_ERRORS} as skip_errors,
+        '{INPUT_PARAMETERS}' as input_parameters,
+        NULL as result_signal,
+        CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code,
+        CASE
+          WHEN COUNT(*) > 0 THEN
+            CONCAT(
+              CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ),
+              CONCAT(
+                CASE
+                  WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of '
+                  ELSE 'within limit of '
+                END,
+                '{SKIP_ERRORS}.'
+              )
+            )
+          ELSE 'No errors found.'
+        END AS result_message,
+        COUNT(*) as result_measure
+      FROM (
+        (
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+          EXCEPT
+          SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+            AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}))
+        )
+        UNION ALL
+        (
+          SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES}
+          FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE}
+          WHERE {SUBSET_CONDITION}
+            AND {WINDOW_DATE_COLUMN} >=
DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + EXCEPT + SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + ) + ) test; - id: '2108' test_type: Timeframe_Combo_Match sql_flavor: snowflake - template_name: ex_window_match_same_generic.sql + template: |- + SELECT '{TEST_TYPE}' as test_type, + '{TEST_DEFINITION_ID}' as test_definition_id, + '{TEST_SUITE_ID}' as test_suite_id, + '{TEST_RUN_ID}' as test_run_id, + '{RUN_DATE}' as test_time, + '{SCHEMA_NAME}' as schema_name, + '{TABLE_NAME}' as table_name, + '{COLUMN_NAME_NO_QUOTES}' as column_names, + '{SKIP_ERRORS}' as threshold_value, + {SKIP_ERRORS} as skip_errors, + '{INPUT_PARAMETERS}' as input_parameters, + NULL as result_signal, + CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, + CASE + WHEN COUNT(*) > 0 THEN + CONCAT( + CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), + CONCAT( + CASE + WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' + ELSE 'within limit of ' + END, + '{SKIP_ERRORS}.' + ) + ) + ELSE 'No errors found.' 
+ END AS result_message, + COUNT(*) as result_measure + FROM ( + ( + SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + EXCEPT + SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + ) + UNION ALL + ( + SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + EXCEPT + SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} + FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} + WHERE {SUBSET_CONDITION} + AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) + ) + ) test; diff --git a/testgen/template/dbsetup_test_types/test_types_US_State.yaml b/testgen/template/dbsetup_test_types/test_types_US_State.yaml index c9d51c5d..d663db1f 100644 --- a/testgen/template/dbsetup_test_types/test_types_US_State.yaml +++ 
b/testgen/template/dbsetup_test_types/test_types_US_State.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: US State test_name_long: Column value is two-letter US state code test_description: |- - Tests that the recorded column value is a valid US state. + Tests that values in the column are valid US states. except_message: |- - Column Value is not a valid US state. + Column values found that are not valid US states. measure_uom: Not US States measure_uom_description: |- Values that do not match 2-character US state abbreviations. selection_criteria: |- general_type= 'A' AND column_name ILIKE '%state%' AND distinct_value_ct < 70 AND max_length = 2 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Unique.yaml b/testgen/template/dbsetup_test_types/test_types_Unique.yaml index 61eabf82..a084f307 100644 --- a/testgen/template/dbsetup_test_types/test_types_Unique.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Unique.yaml @@ -6,12 +6,13 @@ test_types: test_description: |- Tests that no values for the column are repeated in multiple records. except_message: |- - Column values should be unique per row. 
measure_uom: Duplicate values measure_uom_description: |- Count of non-unique values selection_criteria: |- record_ct > 500 and record_ct = distinct_value_ct and value_ct > 0 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Unique_Pct.yaml b/testgen/template/dbsetup_test_types/test_types_Unique_Pct.yaml index 374a4d50..4f79e0dd 100644 --- a/testgen/template/dbsetup_test_types/test_types_Unique_Pct.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Unique_Pct.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Percent Unique test_name_long: Consistent ratio of unique values test_description: |- - Tests for statistically-significant shift in percentage of unique values vs. baseline data. + Tests for statistically significant shift in percentage of unique values compared to baseline data. except_message: |- - Significant shift in percent of unique values vs. baseline. + Significant shift in percent of unique values compared to baseline. 
measure_uom: Difference measure measure_uom_description: |- Cohen's H Difference (0.20 small, 0.5 mod, 0.8 large, 1.2 very large, 2.0 huge) selection_criteria: |- distinct_value_ct > 10 AND functional_data_type NOT ILIKE 'Measurement%' + generation_template: null dq_score_prevalence_formula: |- 2.0 * (1.0 - fn_normal_cdf(ABS({RESULT_MEASURE}::FLOAT) / 2.0)) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Valid_Characters.yaml b/testgen/template/dbsetup_test_types/test_types_Valid_Characters.yaml index 4d5f876d..e2e2f9ce 100644 --- a/testgen/template/dbsetup_test_types/test_types_Valid_Characters.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Valid_Characters.yaml @@ -4,7 +4,7 @@ test_types: test_name_short: Valid Characters test_name_long: Column contains no invalid characters test_description: |- - Tests for the presence of non-printing characters, leading spaces, or surrounding quotes. + Tests for presence of non-printing characters, leading spaces, or surrounding quotes. except_message: |- Invalid characters, such as non-printing characters, leading spaces, or surrounding quotes, were found. 
measure_uom: Invalid records @@ -12,6 +12,7 @@ test_types: Expected count of values with invalid characters selection_criteria: |- general_type = 'A' + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Valid_Month.yaml b/testgen/template/dbsetup_test_types/test_types_Valid_Month.yaml index 32e74026..07dd037f 100644 --- a/testgen/template/dbsetup_test_types/test_types_Valid_Month.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Valid_Month.yaml @@ -4,13 +4,14 @@ test_types: test_name_short: Valid Month test_name_long: Valid calendar month in expected format test_description: |- - Tests for the presence of a valid representation of a calendar month consistent with the format at baseline. + Tests for presence of valid representation of calendar months consistent with format at baseline. except_message: |- - Column values are not a valid representation of a calendar month consistent with the format at baseline. + Column values are not valid representations of calendar months. 
measure_uom: Invalid months measure_uom_description: null selection_criteria: |- functional_data_type = 'Period Month' + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip.yaml b/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip.yaml index 6c08cc73..29e12359 100644 --- a/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Valid US Zip test_name_long: Valid USA Postal Codes test_description: |- - Tests that postal codes match the 5 or 9 digit standard US format + Tests that postal codes match the 5-digit or 9-digit standard US formats. except_message: |- - Invalid US Zip Code formats found. + Invalid US zip code formats found. measure_uom: Invalid Zip Codes measure_uom_description: |- Expected count of values with invalid Zip Codes selection_criteria: |- functional_data_type = 'Zip' + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip3.yaml b/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip3.yaml index ab616fd8..f2611807 100644 --- a/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip3.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Valid_US_Zip3.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: 'Valid US Zip-3 ' test_name_long: Valid USA Zip-3 Prefix test_description: |- - Tests that postal codes match the 3 digit format of a regional prefix. + Tests that postal codes match the 3-digit format of a regional prefix. except_message: |- - Invalid 3-digit US Zip Code regional prefix formats found. 
+ Invalid 3-digit US zip code regional prefix formats found. measure_uom: Invalid Zip-3 Prefix measure_uom_description: |- Expected count of values with invalid Zip-3 Prefix Codes selection_criteria: |- functional_data_type = 'Zip3' + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Variability_Decrease.yaml b/testgen/template/dbsetup_test_types/test_types_Variability_Decrease.yaml index 6f476d0a..6cab00de 100644 --- a/testgen/template/dbsetup_test_types/test_types_Variability_Decrease.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Variability_Decrease.yaml @@ -6,12 +6,13 @@ test_types: test_description: |- Tests that the spread or dispersion of column values has decreased significantly over baseline, indicating a shift in stability of the measure. This could signal a change in a process or a data quality issue. except_message: |- - The Standard Deviation of the measure has decreased below the defined threshold. This could signal a change in a process or a data quality issue. + The standard deviation of the measure has decreased below the defined threshold. 
measure_uom: Pct SD shift measure_uom_description: |- Percent of baseline Standard Deviation selection_criteria: |- general_type = 'N' AND functional_data_type ilike 'Measure%' AND functional_data_type <> 'Measurement Spike' AND column_name NOT ilike '%latitude%' AND column_name NOT ilike '%longitude%' AND value_ct <> distinct_value_ct AND distinct_value_ct > 10 AND stdev_value > 0 AND avg_value IS NOT NULL AND NOT (distinct_value_ct = max_value - min_value + 1 AND distinct_value_ct > 2) + generation_template: null dq_score_prevalence_formula: |- 1 dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Variability_Increase.yaml b/testgen/template/dbsetup_test_types/test_types_Variability_Increase.yaml index ec4a921a..e05a1234 100644 --- a/testgen/template/dbsetup_test_types/test_types_Variability_Increase.yaml +++ b/testgen/template/dbsetup_test_types/test_types_Variability_Increase.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Variability Increase test_name_long: Variability has increased above threshold test_description: |- - Tests that the spread or dispersion of column values has increased significantly over baseline, indicating a drop in stability of the measure. + Tests that the spread or dispersion of column values has increased significantly over baseline, indicating a drop in stability of the measure. This could signal a change in a process or a data quality issue. except_message: |- - The Standard Deviation of the measure has increased beyond the defined threshold. This could signal a change in a process or a data quality issue. + The standard deviation of the measure has increased beyond the defined threshold. 
measure_uom: Pct SD shift measure_uom_description: |- Percent of baseline Standard Deviation selection_criteria: |- general_type = 'N' AND functional_data_type ilike 'Measure%' AND functional_data_type <> 'Measurement Spike' AND column_name NOT ilike '%latitude%' AND column_name NOT ilike '%longitude%' AND value_ct <> distinct_value_ct AND distinct_value_ct > 10 AND stdev_value > 0 AND avg_value IS NOT NULL AND NOT (distinct_value_ct = max_value - min_value + 1 AND distinct_value_ct > 2) + generation_template: null dq_score_prevalence_formula: |- 1 dq_score_risk_factor: '0.75' diff --git a/testgen/template/dbsetup_test_types/test_types_Volume_Trend.yaml b/testgen/template/dbsetup_test_types/test_types_Volume_Trend.yaml new file mode 100644 index 00000000..3bc15367 --- /dev/null +++ b/testgen/template/dbsetup_test_types/test_types_Volume_Trend.yaml @@ -0,0 +1,170 @@ +test_types: + id: '1513' + test_type: Volume_Trend + test_name_short: Volume + test_name_long: Number of rows is within tolerance range + test_description: |- + Tests that the row count of all or a subset of records in a table is within the tolerance range. + except_message: |- + Row count is outside expected range. + measure_uom: Row count + measure_uom_description: null + selection_criteria: |- + TEMPLATE + generation_template: gen_Volume_Trend.sql + dq_score_prevalence_formula: null + dq_score_risk_factor: null + column_name_prompt: null + column_name_help: null + default_parm_columns: subset_condition,history_calculation,history_calculation_upper,history_lookback + default_parm_values: null + default_parm_prompts: Record Subset Condition,Lower Bound,Upper Bound,History Lookback + default_parm_help: Condition defining a subset of records in main table + default_severity: Fail + run_type: CAT + test_scope: table + dq_dimension: Completeness + health_dimension: Volume + threshold_description: |- + Expected row count range. 
+ result_visualization: line_chart + result_visualization_params: null + usage_notes: |- + This test compares the row count of all or a subset of records in a table against a derived tolerance range. + active: Y + cat_test_conditions: + - id: '2515' + test_type: Volume_Trend + sql_flavor: bigquery + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2415' + test_type: Volume_Trend + sql_flavor: databricks + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2215' + test_type: Volume_Trend + sql_flavor: mssql + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2315' + test_type: Volume_Trend + sql_flavor: postgresql + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2015' + test_type: Volume_Trend + sql_flavor: redshift + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2615' + test_type: Volume_Trend + sql_flavor: redshift_spectrum + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + - id: '2115' + test_type: Volume_Trend + sql_flavor: snowflake + measure: |- + {CUSTOM_QUERY} + test_operator: NOT BETWEEN + test_condition: |- + {LOWER_TOLERANCE} AND {UPPER_TOLERANCE} + target_data_lookups: + - id: '1477' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: bigquery + lookup_type: null + lookup_query: |- + SELECT COUNT(*) AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM `{TARGET_SCHEMA}`.`{TABLE_NAME}`; + error_type: Test Results + - id: '1478' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: databricks + lookup_type: null + lookup_query: |- + 
SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM `{TARGET_SCHEMA}`.`{TABLE_NAME}`; + error_type: Test Results + - id: '1479' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: mssql + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1480' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: postgresql + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1481' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: redshift + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1482' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: redshift_spectrum + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + - id: '1483' + test_id: '1513' + test_type: Volume_Trend + sql_flavor: snowflake + lookup_type: null + lookup_query: |- + SELECT {CUSTOM_QUERY} AS current_count, + {LOWER_TOLERANCE} AS lower_bound, + {UPPER_TOLERANCE} AS upper_bound + FROM "{TARGET_SCHEMA}"."{TABLE_NAME}"; + error_type: Test Results + test_templates: [] diff --git a/testgen/template/dbsetup_test_types/test_types_Weekly_Rec_Ct.yaml b/testgen/template/dbsetup_test_types/test_types_Weekly_Rec_Ct.yaml index 8217f3ad..1aff7bb4 100644 --- a/testgen/template/dbsetup_test_types/test_types_Weekly_Rec_Ct.yaml +++ 
b/testgen/template/dbsetup_test_types/test_types_Weekly_Rec_Ct.yaml @@ -4,14 +4,15 @@ test_types: test_name_short: Weekly Records test_name_long: At least one date per week present within date range test_description: |- - Tests for presence of at least one date per calendar week within min/max date range, per baseline data + Tests for presence of at least one date per calendar week within the minimum and maximum date range, per baseline data. except_message: |- - At least one date per week expected in min/max date range. + Not every week within the minimum and maximum date range has at least one date present. measure_uom: Missing weeks measure_uom_description: |- Calendar weeks without date values present selection_criteria: |- functional_data_type ILIKE 'Transactional Date%' AND date_days_present > 1 AND functional_table_type ILIKE '%cumulative%' AND date_weeks_present > 3 AND date_weeks_present - (DATEDIFF('week', '1800-01-05'::DATE, max_date) - DATEDIFF('week', '1800-01-05'::DATE, min_date) + 1) = 0 AND future_date_ct::FLOAT / NULLIF(value_ct, 0) <= 0.75 + generation_template: null dq_score_prevalence_formula: |- ({RESULT_MEASURE}-{THRESHOLD_VALUE})::FLOAT*{PRO_RECORD_CT}::FLOAT/NULLIF({DATE_WEEKS_PRESENT}::FLOAT, 0)/NULLIF({RECORD_CT}::FLOAT, 0) dq_score_risk_factor: '1.0' diff --git a/testgen/template/dbupgrade/0162_incremental_upgrade.sql b/testgen/template/dbupgrade/0162_incremental_upgrade.sql new file mode 100644 index 00000000..69c9acff --- /dev/null +++ b/testgen/template/dbupgrade/0162_incremental_upgrade.sql @@ -0,0 +1,5 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE test_templates DROP COLUMN template_name; + +ALTER TABLE test_templates ADD COLUMN template VARCHAR; diff --git a/testgen/template/dbupgrade/0163_incremental_upgrade.sql b/testgen/template/dbupgrade/0163_incremental_upgrade.sql new file mode 100644 index 00000000..5f58e0a6 --- /dev/null +++ b/testgen/template/dbupgrade/0163_incremental_upgrade.sql @@ -0,0 +1,19 @@ +SET SEARCH_PATH TO 
{SCHEMA_NAME}; + +CREATE TABLE stg_test_definition_updates ( + test_suite_id UUID, + test_definition_id UUID, + run_date TIMESTAMP, + lower_tolerance VARCHAR(1000), + upper_tolerance VARCHAR(1000), + prediction JSONB +); + +ALTER TABLE test_definitions + ALTER COLUMN history_calculation TYPE VARCHAR(1000), + ADD COLUMN history_calculation_upper VARCHAR(1000), + ADD COLUMN prediction JSONB; + +ALTER TABLE test_suites + ADD COLUMN predict_sensitivity VARCHAR(6), + ADD COLUMN predict_min_lookback INTEGER; diff --git a/testgen/template/dbupgrade/0164_incremental_upgrade.sql b/testgen/template/dbupgrade/0164_incremental_upgrade.sql new file mode 100644 index 00000000..13e03f0b --- /dev/null +++ b/testgen/template/dbupgrade/0164_incremental_upgrade.sql @@ -0,0 +1,37 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE test_suites + ADD COLUMN monitor_lookback INTEGER DEFAULT NULL; + +ALTER TABLE data_structure_log + ADD COLUMN table_groups_id UUID; + +ALTER TABLE data_structure_log + ADD COLUMN table_name VARCHAR(120); + +ALTER TABLE data_structure_log + ADD COLUMN column_name VARCHAR(120); + +WITH update_log AS ( + SELECT + data_structure_log.element_id, + data_column_chars.table_groups_id, + data_column_chars.table_name, + data_column_chars.column_name + FROM data_structure_log + INNER JOIN data_column_chars + ON (data_structure_log.element_id = data_column_chars.column_id) +) +UPDATE data_structure_log +SET table_groups_id = u.table_groups_id, + table_name = u.table_name, + column_name = u.column_name +FROM update_log as u +INNER JOIN data_column_chars AS d + ON ( + d.column_id = u.element_id + AND d.table_groups_id = u.table_groups_id + AND d.table_name = u.table_name + AND d.column_name = u.column_name + ) +WHERE data_structure_log.element_id = d.column_id; diff --git a/testgen/template/dbupgrade/0165_incremental_upgrade.sql b/testgen/template/dbupgrade/0165_incremental_upgrade.sql new file mode 100644 index 00000000..3b0f47f1 --- /dev/null +++ 
b/testgen/template/dbupgrade/0165_incremental_upgrade.sql @@ -0,0 +1,5 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE test_suites + DROP COLUMN view_mode, + ADD COLUMN is_monitor BOOLEAN DEFAULT FALSE; diff --git a/testgen/template/dbupgrade/0166_incremental_upgrade.sql b/testgen/template/dbupgrade/0166_incremental_upgrade.sql new file mode 100644 index 00000000..c06eecbf --- /dev/null +++ b/testgen/template/dbupgrade/0166_incremental_upgrade.sql @@ -0,0 +1,17 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE data_structure_log + ADD COLUMN table_id UUID, + ADD COLUMN column_id UUID; + +UPDATE data_structure_log +SET table_id = dcc.table_id, + column_id = dcc.column_id +FROM data_column_chars dcc +WHERE data_structure_log.element_id = dcc.column_id; + +ALTER TABLE data_structure_log + DROP COLUMN element_id; + +CREATE INDEX ix_dsl_tg_tcd + ON data_structure_log (table_groups_id, table_name, change_date); diff --git a/testgen/template/dbupgrade/0167_incremental_upgrade.sql b/testgen/template/dbupgrade/0167_incremental_upgrade.sql new file mode 100644 index 00000000..69030443 --- /dev/null +++ b/testgen/template/dbupgrade/0167_incremental_upgrade.sql @@ -0,0 +1,5 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE test_suites + ADD COLUMN predict_exclude_weekends BOOLEAN DEFAULT FALSE, + ADD COLUMN predict_holiday_codes VARCHAR(100); diff --git a/testgen/template/dbupgrade/0168_incremental_upgrade.sql b/testgen/template/dbupgrade/0168_incremental_upgrade.sql new file mode 100644 index 00000000..db12d491 --- /dev/null +++ b/testgen/template/dbupgrade/0168_incremental_upgrade.sql @@ -0,0 +1,4 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE table_groups + ADD COLUMN default_test_suite_id UUID DEFAULT NULL; diff --git a/testgen/template/dbupgrade/0169_incremental_upgrade.sql b/testgen/template/dbupgrade/0169_incremental_upgrade.sql new file mode 100644 index 00000000..ec5d4f72 --- /dev/null +++ b/testgen/template/dbupgrade/0169_incremental_upgrade.sql @@ 
-0,0 +1,64 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +DROP VIEW IF EXISTS v_test_results; + +UPDATE test_results + SET column_names = NULL +WHERE column_names = 'N/A'; + +UPDATE test_results + SET test_definition_id = d.id +FROM test_results r + INNER JOIN test_definitions d ON ( + r.auto_gen IS TRUE + AND r.test_suite_id = d.test_suite_id + AND r.schema_name = d.schema_name + AND r.table_name IS NOT DISTINCT FROM d.table_name + AND r.column_names IS NOT DISTINCT FROM d.column_name + AND r.test_type = d.test_type + ) +WHERE d.last_auto_gen_date IS NOT NULL + AND test_results.id = r.id; + +CREATE UNIQUE INDEX uix_td_autogen_schema + ON test_definitions (test_suite_id, test_type, schema_name) + WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NULL + AND column_name IS NULL; + +CREATE UNIQUE INDEX uix_td_autogen_table + ON test_definitions (test_suite_id, test_type, schema_name, table_name) + WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL; + +CREATE UNIQUE INDEX uix_td_autogen_column + ON test_definitions (test_suite_id, test_type, schema_name, table_name, column_name) + WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NOT NULL; + +DROP INDEX idx_dtc_tgid_table; + +CREATE INDEX idx_dtc_tg_schema_table + ON data_table_chars (table_groups_id, schema_name, table_name); + +CREATE INDEX idx_dtc_id + ON data_table_chars (table_id); + +DROP INDEX idx_dcc_tg_table_column; + +CREATE INDEX idx_dcc_tg_schema_table_column + ON data_column_chars (table_groups_id, schema_name, table_name, column_name); + +CREATE INDEX idx_dcc_tableid_column + ON data_column_chars (table_id, column_name); + +CREATE INDEX idx_dcc_id + ON data_column_chars (column_id); + +ALTER INDEX IF EXISTS profile_results_tgid_sn_tn_cn + RENAME TO ix_pr_tg_s_t_c; + +CREATE INDEX ix_pr_tg_rd + ON profile_results (table_groups_id, run_date); diff --git a/testgen/template/dbupgrade/0170_incremental_upgrade.sql 
b/testgen/template/dbupgrade/0170_incremental_upgrade.sql new file mode 100644 index 00000000..44fa7769 --- /dev/null +++ b/testgen/template/dbupgrade/0170_incremental_upgrade.sql @@ -0,0 +1,4 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE test_types + ADD COLUMN generation_template VARCHAR(100); diff --git a/testgen/template/dbupgrade/0171_incremental_upgrade.sql b/testgen/template/dbupgrade/0171_incremental_upgrade.sql new file mode 100644 index 00000000..99b56008 --- /dev/null +++ b/testgen/template/dbupgrade/0171_incremental_upgrade.sql @@ -0,0 +1,4 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE test_suites + ADD COLUMN monitor_regenerate_freshness BOOLEAN DEFAULT TRUE; diff --git a/testgen/template/dbupgrade/0172_incremental_upgrade.sql b/testgen/template/dbupgrade/0172_incremental_upgrade.sql new file mode 100644 index 00000000..45142326 --- /dev/null +++ b/testgen/template/dbupgrade/0172_incremental_upgrade.sql @@ -0,0 +1,4 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +CREATE INDEX IF NOT EXISTS ix_tr_trun_table + ON test_results(test_run_id, table_name); diff --git a/testgen/template/dbupgrade/0173_incremental_upgrade.sql b/testgen/template/dbupgrade/0173_incremental_upgrade.sql new file mode 100644 index 00000000..5d5f5be3 --- /dev/null +++ b/testgen/template/dbupgrade/0173_incremental_upgrade.sql @@ -0,0 +1,4 @@ +SET SEARCH_PATH TO {SCHEMA_NAME}; + +ALTER TABLE stg_test_definition_updates + ADD COLUMN IF NOT EXISTS threshold_value VARCHAR(1000); diff --git a/testgen/template/execution/ex_get_tests_metadata.sql b/testgen/template/execution/ex_get_tests_metadata.sql index 1068b017..fbc1fdef 100644 --- a/testgen/template/execution/ex_get_tests_metadata.sql +++ b/testgen/template/execution/ex_get_tests_metadata.sql @@ -37,7 +37,7 @@ SELECT tt.test_type, else concat('HAVING ', match_having_condition) END as match_having_condition, coalesce(custom_query, '') as custom_query, - coalesce(tm.template_name, '') as template_name + coalesce(tm.template, '') as 
template FROM test_definitions td INNER JOIN test_suites ts ON (td.test_suite_id = ts.id) diff --git a/testgen/template/execution/get_active_test_definitions.sql b/testgen/template/execution/get_active_test_definitions.sql index f59b670c..c8701130 100644 --- a/testgen/template/execution/get_active_test_definitions.sql +++ b/testgen/template/execution/get_active_test_definitions.sql @@ -25,10 +25,12 @@ SELECT td.id, match_subset_condition, match_groupby_names, match_having_condition, + history_calculation, custom_query, + td.prediction, tt.run_type, tt.test_scope, - tm.template_name, + tm.template, c.measure, c.test_operator, c.test_condition @@ -43,4 +45,4 @@ FROM test_definitions td AND :SQL_FLAVOR = c.sql_flavor ) WHERE td.test_suite_id = :TEST_SUITE_ID - AND td.test_active = 'Y'; \ No newline at end of file + AND td.test_active = 'Y'; diff --git a/testgen/template/execution/get_errored_autogen_monitors.sql b/testgen/template/execution/get_errored_autogen_monitors.sql new file mode 100644 index 00000000..dd0fb0a5 --- /dev/null +++ b/testgen/template/execution/get_errored_autogen_monitors.sql @@ -0,0 +1,15 @@ +WITH prev_run AS ( + SELECT id + FROM test_runs + WHERE test_suite_id = :TEST_SUITE_ID ::UUID + AND id <> :TEST_RUN_ID ::UUID + AND status = 'Complete' + ORDER BY test_starttime DESC + LIMIT 1 +) +SELECT DISTINCT tr.test_type, tr.table_name +FROM test_results tr +INNER JOIN prev_run ON tr.test_run_id = prev_run.id +WHERE tr.result_status = 'Error' + AND tr.auto_gen IS TRUE + AND tr.test_type IN ('Freshness_Trend', 'Volume_Trend') diff --git a/testgen/template/execution/has_schema_changes.sql b/testgen/template/execution/has_schema_changes.sql new file mode 100644 index 00000000..eab034b8 --- /dev/null +++ b/testgen/template/execution/has_schema_changes.sql @@ -0,0 +1,18 @@ +WITH prev_test AS ( + SELECT MAX(test_starttime) AS last_run_time + FROM test_runs + WHERE test_suite_id = :TEST_SUITE_ID ::UUID + -- Ignore current run + AND id <> :TEST_RUN_ID ::UUID 
+), +recent_changes AS ( + SELECT dsl.change, dsl.column_id + FROM data_structure_log dsl + CROSS JOIN prev_test + WHERE dsl.table_groups_id = :TABLE_GROUPS_ID ::UUID + -- Changes since previous test run + AND dsl.change_date > COALESCE(prev_test.last_run_time, '1900-01-01') +) +SELECT + EXISTS (SELECT 1 FROM recent_changes WHERE change = 'A' AND column_id IS NULL) AS has_table_adds, + EXISTS (SELECT 1 FROM recent_changes WHERE change = 'D' AND column_id IS NULL) AS has_table_drops; diff --git a/testgen/template/execution/update_historic_thresholds.sql b/testgen/template/execution/update_historic_thresholds.sql deleted file mode 100644 index 4d3fbbeb..00000000 --- a/testgen/template/execution/update_historic_thresholds.sql +++ /dev/null @@ -1,71 +0,0 @@ -WITH filtered_defs AS ( - -- Step 1: Filter definitions first to minimize join surface area - SELECT id, - test_suite_id, - schema_name, - table_name, - column_name, - test_type, - history_calculation, - CASE WHEN history_calculation = 'Value' THEN 1 ELSE COALESCE(history_lookback, 1) END AS lookback - FROM test_definitions - WHERE test_suite_id = :TEST_SUITE_ID - AND test_active = 'Y' - AND history_calculation IS NOT NULL - AND history_lookback IS NOT NULL -), -normalized_results AS ( - -- Step 2: Normalize definition IDs for autogenerated tests - SELECT CASE - WHEN r.auto_gen THEN d.id - ELSE r.test_definition_id - END AS test_definition_id, - r.test_time, - r.result_signal - FROM test_results r - LEFT JOIN filtered_defs d ON r.auto_gen = TRUE - AND r.test_suite_id = d.test_suite_id - AND r.schema_name = d.schema_name - AND r.table_name IS NOT DISTINCT FROM d.table_name - AND r.column_names IS NOT DISTINCT FROM d.column_name - AND r.test_type = d.test_type - WHERE r.test_suite_id = :TEST_SUITE_ID -), -ranked_results AS ( - -- Step 3: Use a Window Function to get the N most recent results - SELECT n.test_definition_id, - n.result_signal, - CASE - WHEN n.result_signal ~ '^-?[0-9]*\.?[0-9]+$' THEN 
n.result_signal::NUMERIC - ELSE NULL - END AS signal_numeric, - ROW_NUMBER() OVER (PARTITION BY n.test_definition_id ORDER BY n.test_time DESC) AS rank - FROM normalized_results n - WHERE n.test_definition_id IN (SELECT id FROM filtered_defs) -), -stats AS ( - -- Step 4: Aggregate only the rows within the lookback range - SELECT d.id AS test_definition_id, - d.history_calculation, - MAX(CASE WHEN rr.rank = 1 THEN rr.result_signal END) AS val, - MIN(rr.signal_numeric) AS min, - MAX(rr.signal_numeric) AS max, - SUM(rr.signal_numeric) AS sum, - AVG(rr.signal_numeric) AS avg - FROM filtered_defs d - JOIN ranked_results rr ON d.id = rr.test_definition_id - WHERE rr.rank <= d.lookback - GROUP BY d.id, - d.history_calculation -) -UPDATE test_definitions t -SET baseline_value = CASE - WHEN s.history_calculation = 'Value' THEN s.val - WHEN s.history_calculation = 'Minimum' THEN s.min::VARCHAR - WHEN s.history_calculation = 'Maximum' THEN s.max::VARCHAR - WHEN s.history_calculation = 'Sum' THEN s.sum::VARCHAR - WHEN s.history_calculation = 'Average' THEN s.avg::VARCHAR - ELSE NULL - END -FROM stats s -WHERE t.id = s.test_definition_id; diff --git a/testgen/template/execution/update_history_calc_thresholds.sql b/testgen/template/execution/update_history_calc_thresholds.sql new file mode 100644 index 00000000..f9e0df33 --- /dev/null +++ b/testgen/template/execution/update_history_calc_thresholds.sql @@ -0,0 +1,115 @@ +WITH filtered_defs AS ( + -- Filter definitions first to minimize join surface area + SELECT id, + test_suite_id, + schema_name, + table_name, + column_name, + test_type, + history_calculation, + history_calculation_upper, + GREATEST( + CASE WHEN history_calculation = 'Value' THEN 1 ELSE COALESCE(history_lookback, 1) END, + CASE WHEN history_calculation_upper = 'Value' THEN 1 ELSE COALESCE(history_lookback, 1) END + ) AS lookback + FROM test_definitions + WHERE test_suite_id = :TEST_SUITE_ID + AND test_active = 'Y' + AND history_calculation IS NOT NULL + AND 
history_calculation <> 'PREDICT' + AND history_lookback IS NOT NULL +), +ranked_results AS ( + -- Use a Window Function to get the N most recent results + SELECT r.test_definition_id, + r.result_signal, + CASE + WHEN r.result_signal ~ '^-?[0-9]*\.?[0-9]+$' THEN r.result_signal::NUMERIC + ELSE NULL + END AS signal_numeric, + ROW_NUMBER() OVER (PARTITION BY r.test_definition_id ORDER BY r.test_time DESC) AS rank + FROM test_results r + WHERE r.test_suite_id = :TEST_SUITE_ID + AND r.test_definition_id IN (SELECT id FROM filtered_defs) +), +stats AS ( + -- Aggregate only the rows within the lookback range + SELECT d.id AS test_definition_id, + d.history_calculation, + d.history_calculation_upper, + MAX(CASE WHEN rr.rank = 1 THEN rr.result_signal END) AS val, + MIN(rr.signal_numeric) AS min, + MAX(rr.signal_numeric) AS max, + SUM(rr.signal_numeric) AS sum, + AVG(rr.signal_numeric) AS avg, + STDDEV(rr.signal_numeric) AS stddev + FROM filtered_defs d + JOIN ranked_results rr ON d.id = rr.test_definition_id + WHERE rr.rank <= d.lookback + GROUP BY d.id, + d.history_calculation, + d.history_calculation_upper +) +UPDATE test_definitions t +SET lower_tolerance = CASE + WHEN s.history_calculation = 'Value' THEN s.val + WHEN s.history_calculation = 'Minimum' THEN s.min::VARCHAR + WHEN s.history_calculation = 'Maximum' THEN s.max::VARCHAR + WHEN s.history_calculation = 'Sum' THEN s.sum::VARCHAR + WHEN s.history_calculation = 'Average' THEN s.avg::VARCHAR + WHEN s.history_calculation LIKE 'EXPR:[%]' THEN + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE( + SUBSTRING(s.history_calculation, 7, LENGTH(s.history_calculation) - 7), + '{VALUE}', COALESCE(s.val, 'NULL')), + '{MINIMUM}', COALESCE(s.min::VARCHAR, 'NULL')), + '{MAXIMUM}', COALESCE(s.max::VARCHAR, 'NULL')), + '{SUM}', COALESCE(s.sum::VARCHAR, 'NULL')), + '{AVERAGE}', COALESCE(s.avg::VARCHAR, 'NULL')), + '{STANDARD_DEVIATION}', COALESCE(s.stddev::VARCHAR, 'NULL')) + ELSE NULL + END, + upper_tolerance = CASE + WHEN 
s.history_calculation_upper = 'Value' THEN s.val + WHEN s.history_calculation_upper = 'Minimum' THEN s.min::VARCHAR + WHEN s.history_calculation_upper = 'Maximum' THEN s.max::VARCHAR + WHEN s.history_calculation_upper = 'Sum' THEN s.sum::VARCHAR + WHEN s.history_calculation_upper = 'Average' THEN s.avg::VARCHAR + WHEN s.history_calculation_upper LIKE 'EXPR:[%]' THEN + REPLACE(REPLACE(REPLACE(REPLACE(REPLACE(REPLACE( + SUBSTRING(s.history_calculation_upper, 7, LENGTH(s.history_calculation_upper) - 7), + '{VALUE}', COALESCE(s.val, 'NULL')), + '{MINIMUM}', COALESCE(s.min::VARCHAR, 'NULL')), + '{MAXIMUM}', COALESCE(s.max::VARCHAR, 'NULL')), + '{SUM}', COALESCE(s.sum::VARCHAR, 'NULL')), + '{AVERAGE}', COALESCE(s.avg::VARCHAR, 'NULL')), + '{STANDARD_DEVIATION}', COALESCE(s.stddev::VARCHAR, 'NULL')) + ELSE NULL + END +FROM stats s +WHERE t.id = s.test_definition_id; + + +WITH changed_fingerprints AS ( + SELECT test_definition_id, test_time, result_measure + FROM ( + SELECT test_definition_id, test_time, result_measure, + result_measure IS DISTINCT FROM LAG(result_measure) OVER (PARTITION BY test_definition_id ORDER BY test_time) AS changed + FROM test_results + WHERE test_suite_id = :TEST_SUITE_ID + AND test_type = 'Freshness_Trend' + ) tr + WHERE changed = TRUE +), +fingerprint_history AS ( + SELECT test_definition_id, + test_time AS change_time, + result_measure AS last_fingerprint, + ROW_NUMBER() OVER (PARTITION BY test_definition_id ORDER BY test_time DESC) AS rn + FROM changed_fingerprints +) +UPDATE test_definitions +SET baseline_value = h.last_fingerprint, + baseline_sum = h.change_time::VARCHAR +FROM fingerprint_history h +WHERE test_definitions.id = h.test_definition_id + AND h.rn = 1; diff --git a/testgen/template/execution/update_test_results.sql b/testgen/template/execution/update_test_results.sql index 2e210c1a..f5fbf7ad 100644 --- a/testgen/template/execution/update_test_results.sql +++ b/testgen/template/execution/update_test_results.sql @@ -7,6 +7,7 @@ SET 
test_description = COALESCE(r.test_description, d.test_description, tt.test_ CASE WHEN r.result_status = 'Error' THEN 'Error' WHEN COALESCE(d.severity, s.severity, tt.default_severity) = 'Log' THEN 'Log' + WHEN r.result_code = -1 THEN 'Log' WHEN r.result_code = 1 THEN 'Passed' WHEN r.result_code = 0 AND COALESCE(d.severity, s.severity, tt.default_severity) = 'Warning' THEN 'Warning' WHEN r.result_code = 0 AND COALESCE(d.severity, s.severity, tt.default_severity) = 'Fail' THEN 'Failed' @@ -22,7 +23,22 @@ SET test_description = COALESCE(r.test_description, d.test_description, tt.test_ ), result_message = COALESCE( r.result_message, - tt.measure_uom || ': ' || r.result_measure::VARCHAR || ', Threshold: ' || d.threshold_value::VARCHAR || ( + tt.measure_uom || ': ' || r.result_measure::VARCHAR || ( + CASE + WHEN d.threshold_value IS NOT NULL THEN ', Threshold: ' || d.threshold_value::VARCHAR + ELSE '' + END + ) || ( + CASE + WHEN d.lower_tolerance IS NOT NULL THEN ', Lower Bound: ' || d.lower_tolerance::VARCHAR + ELSE '' + END + ) || ( + CASE + WHEN d.upper_tolerance IS NOT NULL THEN ', Upper Bound: ' || d.upper_tolerance::VARCHAR + ELSE '' + END + ) || ( CASE WHEN r.skip_errors > 0 THEN 'Errors Ignored: ' || r.skip_errors::VARCHAR ELSE '' diff --git a/testgen/template/execution/update_test_run_stats.sql b/testgen/template/execution/update_test_run_stats.sql index 15dab138..2aeaa9e9 100644 --- a/testgen/template/execution/update_test_run_stats.sql +++ b/testgen/template/execution/update_test_run_stats.sql @@ -1,7 +1,7 @@ WITH stats AS ( SELECT r.id as test_run_id, COALESCE(COUNT(tr.id), 0) AS test_ct, - SUM(result_code) AS passed_ct, + COALESCE(SUM(CASE WHEN result_code = 1 THEN 1 END), 0) AS passed_ct, COALESCE(SUM(CASE WHEN tr.result_status = 'Failed' THEN 1 END), 0) AS failed_ct, COALESCE(SUM(CASE WHEN tr.result_status = 'Warning' THEN 1 END), 0) AS warning_ct, COALESCE(SUM(CASE WHEN tr.result_status = 'Log' THEN 1 END), 0) AS log_ct, diff --git 
a/testgen/template/flavors/bigquery/exec_query_tests/ex_data_match_bigquery.sql b/testgen/template/flavors/bigquery/exec_query_tests/ex_data_match_bigquery.sql deleted file mode 100644 index 374de512..00000000 --- a/testgen/template/flavors/bigquery/exec_query_tests/ex_data_match_bigquery.sql +++ /dev/null @@ -1,42 +0,0 @@ -SELECT '{TEST_TYPE}' AS test_type, - '{TEST_DEFINITION_ID}' AS test_definition_id, - '{TEST_SUITE_ID}' AS test_suite_id, - '{TEST_RUN_ID}' AS test_run_id, - '{RUN_DATE}' AS test_time, - '{SCHEMA_NAME}' AS schema_name, - '{TABLE_NAME}' AS table_name, - '{COLUMN_NAME_NO_QUOTES}' AS column_names, - '{SKIP_ERRORS}' AS threshold_value, - {SKIP_ERRORS} AS skip_errors, - '{INPUT_PARAMETERS}' AS input_parameters, - NULL as result_signal, - CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END AS result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CAST(COUNT(*) AS STRING), - ' error(s) identified, ', - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) AS result_measure -FROM ( - SELECT {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - GROUP BY {COLUMN_NAME_NO_QUOTES} - {HAVING_CONDITION} - - EXCEPT DISTINCT - - SELECT {MATCH_GROUPBY_NAMES} - FROM `{MATCH_SCHEMA_NAME}.{MATCH_TABLE_NAME}` - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} -) test; diff --git a/testgen/template/flavors/bigquery/exec_query_tests/ex_relative_entropy_bigquery.sql b/testgen/template/flavors/bigquery/exec_query_tests/ex_relative_entropy_bigquery.sql deleted file mode 100644 index 780538e4..00000000 --- a/testgen/template/flavors/bigquery/exec_query_tests/ex_relative_entropy_bigquery.sql +++ /dev/null @@ -1,50 +0,0 @@ --- Relative Entropy: measured by Jensen-Shannon Divergence --- Smoothed and normalized version of KL divergence, --- with scores between 0 (identical) and 1 (maximally different), --- when using the base-2 logarithm. Formula is: --- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) --- Log base 2 of x = LN(x)/LN(2) -WITH latest_ver AS ( - SELECT {CONCAT_COLUMNS} AS category, - CAST(COUNT(*) AS FLOAT64) / CAST(SUM(COUNT(*)) OVER () AS FLOAT64) AS pct_of_total - FROM `{SCHEMA_NAME}.{TABLE_NAME}` v1 - WHERE {SUBSET_CONDITION} - GROUP BY {COLUMN_NAME_NO_QUOTES} -), -older_ver AS ( - SELECT {CONCAT_MATCH_GROUPBY} AS category, - CAST(COUNT(*) AS FLOAT64) / CAST(SUM(COUNT(*)) OVER () AS FLOAT64) AS pct_of_total - FROM `{MATCH_SCHEMA_NAME}.{TABLE_NAME}` v2 - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} -), -dataset AS ( - SELECT COALESCE(l.category, o.category) AS category, - COALESCE(o.pct_of_total, 0.0000001) AS old_pct, - COALESCE(l.pct_of_total, 0.0000001) AS new_pct, - (COALESCE(o.pct_of_total, 0.0000001) + COALESCE(l.pct_of_total, 0.0000001)) / 2.0 AS avg_pct - FROM latest_ver l - FULL JOIN older_ver o - ON l.category = o.category -) -SELECT '{TEST_TYPE}' AS test_type, - 
'{TEST_DEFINITION_ID}' AS test_definition_id, - '{TEST_SUITE_ID}' AS test_suite_id, - '{TEST_RUN_ID}' AS test_run_id, - '{RUN_DATE}' AS test_time, - '{SCHEMA_NAME}' AS schema_name, - '{TABLE_NAME}' AS table_name, - '{COLUMN_NAME_NO_QUOTES}' AS column_names, - -- '{GROUPBY_NAMES}' as column_names, - '{THRESHOLD_VALUE}' AS threshold_value, - NULL AS skip_errors, - '{INPUT_PARAMETERS}' AS input_parameters, - NULL as result_signal, - CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END AS result_code, - CONCAT('Divergence Level: ', CAST(js_divergence AS STRING), ', Threshold: {THRESHOLD_VALUE}.') AS result_message, - js_divergence AS result_measure -FROM ( - SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) - + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) AS js_divergence - FROM dataset -) rslt; diff --git a/testgen/template/flavors/bigquery/exec_query_tests/ex_table_changed_bigquery.sql b/testgen/template/flavors/bigquery/exec_query_tests/ex_table_changed_bigquery.sql deleted file mode 100644 index 87365dc5..00000000 --- a/testgen/template/flavors/bigquery/exec_query_tests/ex_table_changed_bigquery.sql +++ /dev/null @@ -1,26 +0,0 @@ -SELECT '{TEST_TYPE}' AS test_type, - '{TEST_DEFINITION_ID}' AS test_definition_id, - '{TEST_SUITE_ID}' AS test_suite_id, - '{TEST_RUN_ID}' AS test_run_id, - '{RUN_DATE}' AS test_time, - '{SCHEMA_NAME}' AS schema_name, - '{TABLE_NAME}' AS table_name, - '{COLUMN_NAME_NO_QUOTES}' AS column_names, - '{SKIP_ERRORS}' AS threshold_value, - {SKIP_ERRORS} AS skip_errors, - '{INPUT_PARAMETERS}' AS input_parameters, - fingerprint AS result_signal, - /* Fails if table is the same */ - CASE WHEN fingerprint = '{BASELINE_VALUE}' THEN 0 ELSE 1 END AS result_code, - CASE - WHEN fingerprint = '{BASELINE_VALUE}' THEN 'No table change detected.' - ELSE 'Table change detected.' 
- END AS result_message, - CASE - WHEN fingerprint = '{BASELINE_VALUE}' THEN 0 ELSE 1 - END AS result_measure -FROM ( - SELECT {CUSTOM_QUERY} AS fingerprint - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} -) test; diff --git a/testgen/template/flavors/bigquery/exec_query_tests/ex_window_match_no_drops_bigquery.sql b/testgen/template/flavors/bigquery/exec_query_tests/ex_window_match_no_drops_bigquery.sql deleted file mode 100644 index 4e47eaff..00000000 --- a/testgen/template/flavors/bigquery/exec_query_tests/ex_window_match_no_drops_bigquery.sql +++ /dev/null @@ -1,40 +0,0 @@ -SELECT - '{TEST_TYPE}' AS test_type, - '{TEST_DEFINITION_ID}' AS test_definition_id, - '{TEST_SUITE_ID}' AS test_suite_id, - '{TEST_RUN_ID}' AS test_run_id, - '{RUN_DATE}' AS test_time, - '{SCHEMA_NAME}' AS schema_name, - '{TABLE_NAME}' AS table_name, - '{COLUMN_NAME_NO_QUOTES}' AS column_names, - '{SKIP_ERRORS}' AS threshold_value, - {SKIP_ERRORS} AS skip_errors, - '{INPUT_PARAMETERS}' AS input_parameters, - NULL as result_signal, - CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END AS result_code, - CASE - WHEN COUNT(*) > 0 THEN CONCAT( - CAST(COUNT(*) AS STRING), ' error(s) identified, ', - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) AS result_measure - FROM ( - SELECT {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATE_SUB((SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), INTERVAL 2 * {WINDOW_DAYS} DAY) - AND {WINDOW_DATE_COLUMN} < DATE_SUB((SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), INTERVAL {WINDOW_DAYS} DAY) - GROUP BY {COLUMN_NAME_NO_QUOTES} - EXCEPT DISTINCT - SELECT {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATE_SUB((SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), INTERVAL {WINDOW_DAYS} DAY) - GROUP BY {COLUMN_NAME_NO_QUOTES} - ) test; diff --git a/testgen/template/flavors/bigquery/exec_query_tests/ex_window_match_same_bigquery.sql b/testgen/template/flavors/bigquery/exec_query_tests/ex_window_match_same_bigquery.sql deleted file mode 100644 index 9b051977..00000000 --- a/testgen/template/flavors/bigquery/exec_query_tests/ex_window_match_same_bigquery.sql +++ /dev/null @@ -1,74 +0,0 @@ -SELECT '{TEST_TYPE}' AS test_type, - '{TEST_DEFINITION_ID}' AS test_definition_id, - '{TEST_SUITE_ID}' AS test_suite_id, - '{TEST_RUN_ID}' AS test_run_id, - '{RUN_DATE}' AS test_time, - '{SCHEMA_NAME}' AS schema_name, - '{TABLE_NAME}' AS table_name, - '{COLUMN_NAME_NO_QUOTES}' AS column_names, - '{SKIP_ERRORS}' AS threshold_value, - {SKIP_ERRORS} AS skip_errors, - '{INPUT_PARAMETERS}' AS input_parameters, - NULL as result_signal, - CASE WHEN COUNT(*) > {SKIP_ERRORS} THEN 0 ELSE 1 END AS result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CAST(COUNT(*) AS STRING), - ' error(s) identified, ', - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) AS result_measure -FROM ( - -- Values in the prior timeframe but not in the latest - ( - SELECT 'Prior Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATE_ADD( - (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), - INTERVAL -{WINDOW_DAYS} DAY - ) - EXCEPT DISTINCT - SELECT 'Prior Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATE_ADD( - (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), - INTERVAL -2 * {WINDOW_DAYS} DAY - ) - AND {WINDOW_DATE_COLUMN} < DATE_ADD( - (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), - INTERVAL -{WINDOW_DAYS} DAY - ) - ) - UNION ALL - -- Values in the latest timeframe but not in the prior - ( - SELECT 'Latest Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATE_ADD( - (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), - INTERVAL -2 * {WINDOW_DAYS} DAY - ) - AND {WINDOW_DATE_COLUMN} < DATE_ADD( - (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), - INTERVAL -{WINDOW_DAYS} DAY - ) - EXCEPT DISTINCT - SELECT 'Latest Timeframe' AS missing_from, {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}.{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATE_ADD( - (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}.{TABLE_NAME}`), - INTERVAL -{WINDOW_DAYS} DAY - ) - ) -) test; diff --git a/testgen/template/flavors/bigquery/gen_query_tests/gen_Freshness_Trend.sql b/testgen/template/flavors/bigquery/gen_query_tests/gen_Freshness_Trend.sql new file mode 100644 index 00000000..231cad88 --- /dev/null +++ b/testgen/template/flavors/bigquery/gen_query_tests/gen_Freshness_Trend.sql @@ -0,0 +1,204 @@ +WITH latest_run AS ( 
+ -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT p.profile_run_id, p.schema_name, p.table_name, p.column_name, + p.functional_data_type, p.general_type, + p.distinct_value_ct, p.record_ct, p.null_value_ct, + p.max_value, p.min_value, p.avg_value, p.stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + INNER JOIN data_table_chars dtc ON ( + dtc.table_groups_id = p.table_groups_id + AND dtc.schema_name = p.schema_name + AND dtc.table_name = p.table_name + -- Ignore dropped tables + AND dtc.drop_date IS NULL + ) + WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, 
schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND (functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp') +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT
profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + STRING_AGG(column_name, ',' ORDER BY element_type, fingerprint_order, column_name) AS column_names, + 'CAST(COUNT(*) AS STRING) || "|" || ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'CAST(MIN(@@@) AS STRING) || "|" || CAST(MAX(@@@) AS STRING) || "|" || CAST(COUNT(DISTINCT @@@) AS STRING)' + WHEN general_type = 'A' THEN 'CAST(MIN(@@@) AS STRING) || "|" || CAST(MAX(@@@) AS STRING) || "|" || CAST(COUNT(DISTINCT @@@) AS STRING) || "|" || CAST(SUM(LENGTH(@@@)) AS STRING)' + WHEN general_type = 'N' THEN 'ARRAY_TO_STRING([ + CAST(COUNT(@@@) AS STRING), + CAST(COUNT(DISTINCT MOD(CAST(COALESCE(@@@,0) AS NUMERIC) * 1000000, CAST(1000003 AS NUMERIC))) AS STRING), + COALESCE(CAST(ROUND(MIN(CAST(@@@ AS NUMERIC)), 6) AS STRING), ''''), + COALESCE(CAST(ROUND(MAX(CAST(@@@ AS NUMERIC)), 6) AS STRING), ''''), + CAST(MOD(COALESCE(SUM(MOD(CAST(ABS(COALESCE(@@@,0)) AS NUMERIC) * 1000000, CAST(1000000007 AS NUMERIC))), CAST(0 AS NUMERIC)), CAST(1000000007 AS NUMERIC)) AS STRING), + CAST(MOD(COALESCE(SUM(MOD(CAST(ABS(COALESCE(@@@,0)) AS NUMERIC) * 1000000, CAST(1000000009 AS NUMERIC))), CAST(0 AS NUMERIC)), CAST(1000000009 AS NUMERIC)) AS STRING) + ], ''|'', '''')' + END, + '@@@', '`' || column_name || '`' + ), + ' || "|" || ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, 
schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, groupby_names, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Freshness_Trend' AS test_type, + s.schema_name, + s.table_name, + s.column_names AS groupby_names, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'PREDICT' AS history_calculation, + NULL AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Freshness_Trend' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Freshness_Trend' AND generation_set = :GENERATION_SET) + {TABLE_FILTER} + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + groupby_names = EXCLUDED.groupby_names, + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N' + -- Don't update existing tests in "insert" mode + AND NOT COALESCE(:INSERT_ONLY, FALSE); diff --git a/testgen/template/flavors/bigquery/gen_query_tests/gen_Table_Freshness.sql 
b/testgen/template/flavors/bigquery/gen_query_tests/gen_Table_Freshness.sql new file mode 100644 index 00000000..b4aa6c85 --- /dev/null +++ b/testgen/template/flavors/bigquery/gen_query_tests/gen_Table_Freshness.sql @@ -0,0 +1,191 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, + distinct_value_ct, record_ct, null_value_ct, + max_value, min_value, avg_value, stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT 
profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( + functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' + ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + 
UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + 'CAST(COUNT(*) AS STRING) || "|" || ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'CAST(MIN(@@@) AS STRING) || "|" || CAST(MAX(@@@) AS STRING) || "|" || CAST(COUNT(DISTINCT @@@) AS STRING)' + WHEN general_type = 'A' THEN 'CAST(MIN(@@@) AS STRING) || "|" || CAST(MAX(@@@) AS STRING) || "|" || CAST(COUNT(DISTINCT @@@) AS STRING) || "|" || CAST(SUM(LENGTH(@@@)) AS STRING)' + WHEN general_type = 'N' THEN 'ARRAY_TO_STRING([ + CAST(COUNT(@@@) AS STRING), + CAST(COUNT(DISTINCT MOD(CAST(COALESCE(@@@,0) AS NUMERIC) * 1000000, CAST(1000003 AS NUMERIC))) AS STRING), + COALESCE(CAST(ROUND(MIN(CAST(@@@ AS NUMERIC)), 6) AS STRING), ''''), + COALESCE(CAST(ROUND(MAX(CAST(@@@ AS NUMERIC)), 6) AS STRING), ''''), + CAST(MOD(COALESCE(SUM(MOD(CAST(ABS(COALESCE(@@@,0)) AS NUMERIC) * 1000000, CAST(1000000007 AS NUMERIC))), CAST(0 AS NUMERIC)), CAST(1000000007 AS NUMERIC)) AS STRING), + CAST(MOD(COALESCE(SUM(MOD(CAST(ABS(COALESCE(@@@,0)) AS NUMERIC) * 1000000, CAST(1000000009 AS NUMERIC))), CAST(0 AS NUMERIC)), CAST(1000000009 AS NUMERIC)) AS STRING) + ], ''|'', '''')' + END, + '@@@', '`' || column_name || '`' + ), + ' || "|" || ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO 
test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Table_Freshness' AS test_type, + s.schema_name, + s.table_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'Value' AS history_calculation, + 1 AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Table_Freshness' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Table_Freshness' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/flavors/bigquery/gen_query_tests/gen_table_changed_test.sql b/testgen/template/flavors/bigquery/gen_query_tests/gen_table_changed_test.sql deleted file mode 100644 index 23c60db8..00000000 --- a/testgen/template/flavors/bigquery/gen_query_tests/gen_table_changed_test.sql +++ /dev/null @@ -1,168 +0,0 @@ -INSERT INTO test_definitions 
(table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date, - lock_refresh, history_calculation, history_lookback, custom_query ) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = '{PROJECT_CODE}' - AND r.table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND ts.id = '{TEST_SUITE_ID}' - AND p.run_date::DATE <= '{AS_OF_DATE}' - GROUP BY r.table_groups_id), -curprof AS (SELECT p.profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, record_ct, max_value, min_value, avg_value, stdev_value, null_value_ct - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), -locked AS (SELECT schema_name, table_name - FROM test_definitions - WHERE table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND test_suite_id = '{TEST_SUITE_ID}' - AND test_type = 'Table_Freshness' - AND lock_refresh = 'Y'), --- IDs - TOP 2 -id_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 - WHEN functional_data_type = 'ID-Secondary' THEN 2 - ELSE 3 - END, distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'ID%'), --- Process Date - TOP 1 -process_date_cols - AS (SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN 
column_name ILIKE '%mod%' THEN 1 - WHEN column_name ILIKE '%up%' THEN 1 - WHEN column_name ILIKE '%cr%' THEN 2 - WHEN column_name ILIKE '%in%' THEN 2 - END , distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'process%'), --- Transaction Date - TOP 1 -tran_date_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'transactional date%' - OR functional_data_type ILIKE 'period%' - OR functional_data_type = 'timestamp' ), - --- Numeric Measures -numeric_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, -/* - -- Subscores - distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, - (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, - LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, - stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, - 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, -*/ - -- Weighted score - ( - 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + - 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + - 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) - ) AS change_detection_score - FROM curprof - WHERE general_type = 'N' - AND (functional_data_type ILIKE 'Measure%' OR functional_data_type IN ('Sequence', 'Constant')) - ), -numeric_cols_ranked - AS ( SELECT *, - ROW_NUMBER() OVER (PARTITION BY 
schema_name, table_name - ORDER BY change_detection_score DESC, column_name) as rank - FROM numeric_cols - WHERE change_detection_score IS NOT NULL), -combined - AS ( SELECT profile_run_id, schema_name, table_name, column_name, 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order - FROM id_cols - WHERE rank <= 2 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order - FROM process_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order - FROM tran_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order - FROM numeric_cols_ranked - WHERE rank = 1 ), -newtests AS ( - SELECT profile_run_id, schema_name, table_name, - 'CAST(COUNT(*) AS STRING) || "|" || ' || - STRING_AGG( - REPLACE( - CASE - WHEN general_type = 'D' THEN - 'CAST(MIN(@@@) AS STRING) || "|" || CAST(MAX(@@@) AS STRING) || "|" || CAST(COUNT(DISTINCT @@@) AS STRING)' - WHEN general_type = 'A' THEN - 'CAST(MIN(@@@) AS STRING) || "|" || CAST(MAX(@@@) AS STRING) || "|" || CAST(COUNT(DISTINCT @@@) AS STRING) || "|" || CAST(SUM(LENGTH(@@@)) AS STRING)' - WHEN general_type = 'N' THEN - 'ARRAY_TO_STRING([ - CAST(COUNT(@@@) AS STRING), - CAST(COUNT(DISTINCT MOD(CAST(COALESCE(@@@,0) AS NUMERIC) * 1000000, CAST(1000003 AS NUMERIC))) AS STRING), - COALESCE(CAST(ROUND(MIN(CAST(@@@ AS NUMERIC)), 6) AS STRING), ''''), - COALESCE(CAST(ROUND(MAX(CAST(@@@ AS NUMERIC)), 6) AS STRING), ''''), - CAST(MOD(COALESCE(SUM(MOD(CAST(ABS(COALESCE(@@@,0)) AS NUMERIC) * 1000000, CAST(1000000007 AS NUMERIC))), CAST(0 AS NUMERIC)), CAST(1000000007 AS NUMERIC)) AS STRING), - CAST(MOD(COALESCE(SUM(MOD(CAST(ABS(COALESCE(@@@,0)) AS NUMERIC) * 1000000, CAST(1000000009 AS NUMERIC))), CAST(0 AS NUMERIC)), 
CAST(1000000009 AS NUMERIC)) AS STRING) - ], ''|'', '''')' - END, - '@@@', '`' || column_name || '`'), - ' || "|" || ' - ORDER BY element_type, fingerprint_order, column_name - ) as fingerprint - FROM combined - GROUP BY profile_run_id, schema_name, table_name -) -SELECT '{TABLE_GROUPS_ID}'::UUID as table_groups_id, - n.profile_run_id, - 'Table_Freshness' AS test_type, - '{TEST_SUITE_ID}' AS test_suite_id, - n.schema_name, n.table_name, - 0 as skip_errors, 'Y' as test_active, - - '{RUN_DATE}'::TIMESTAMP as last_auto_gen_date, - '{AS_OF_DATE}'::TIMESTAMP as profiling_as_of_date, - 'N' as lock_refresh, - 'Value' as history_calculation, - 1 as history_lookback, - fingerprint as custom_query -FROM newtests n -INNER JOIN test_types t - ON ('Table_Freshness' = t.test_type - AND 'Y' = t.active) -LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND '{GENERATION_SET}' = s.generation_set) -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name) -WHERE (s.generation_set IS NOT NULL - OR '{GENERATION_SET}' = '') - AND l.schema_name IS NULL; diff --git a/testgen/template/flavors/databricks/exec_query_tests/ex_window_match_no_drops_databricks.sql b/testgen/template/flavors/databricks/exec_query_tests/ex_window_match_no_drops_databricks.sql deleted file mode 100644 index fc354f45..00000000 --- a/testgen/template/flavors/databricks/exec_query_tests/ex_window_match_no_drops_databricks.sql +++ /dev/null @@ -1,42 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN 
COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' - END AS result_message, - COUNT(*) as result_measure -FROM ( - SELECT {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) - AND {WINDOW_DATE_COLUMN} < DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) - GROUP BY {COLUMN_NAME_NO_QUOTES} - EXCEPT - SELECT {COLUMN_NAME_NO_QUOTES} - FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) - GROUP BY {COLUMN_NAME_NO_QUOTES} - ) test; diff --git a/testgen/template/flavors/databricks/exec_query_tests/ex_window_match_same_databricks.sql b/testgen/template/flavors/databricks/exec_query_tests/ex_window_match_same_databricks.sql deleted file mode 100644 index a30768b1..00000000 --- a/testgen/template/flavors/databricks/exec_query_tests/ex_window_match_same_databricks.sql +++ /dev/null @@ -1,55 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - 
CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' - END AS result_message, - COUNT(*) as result_measure - FROM ( - ( -SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) -EXCEPT -SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) - AND {WINDOW_DATE_COLUMN} < DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) -) -UNION ALL -( -SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) - AND {WINDOW_DATE_COLUMN} < DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) - EXCEPT -SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM `{SCHEMA_NAME}`.`{TABLE_NAME}` -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD(day, - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM `{SCHEMA_NAME}`.`{TABLE_NAME}`)) -) - ) test; diff --git a/testgen/template/flavors/databricks/gen_query_tests/gen_Freshness_Trend.sql b/testgen/template/flavors/databricks/gen_query_tests/gen_Freshness_Trend.sql new file mode 100644 index 00000000..7aaba268 --- /dev/null +++ b/testgen/template/flavors/databricks/gen_query_tests/gen_Freshness_Trend.sql @@ -0,0 +1,204 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + 
SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT p.profile_run_id, p.schema_name, p.table_name, p.column_name, + p.functional_data_type, p.general_type, + p.distinct_value_ct, p.record_ct, p.null_value_ct, + p.max_value, p.min_value, p.avg_value, p.stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + INNER JOIN data_table_chars dtc ON ( + dtc.table_groups_id = p.table_groups_id + AND dtc.schema_name = p.schema_name + AND dtc.table_name = p.table_name + -- Ignore dropped tables + AND dtc.drop_date IS NULL + ) + WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 
functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( + functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' + ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT profile_run_id, schema_name, table_name, 
column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + STRING_AGG(column_name, ',' ORDER BY element_type, fingerprint_order, column_name) AS column_names, + 'COUNT(*)::STRING || ''|'' || ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'MIN(@@@)::STRING || ''|'' || MAX(@@@)::STRING || ''|'' || COUNT(DISTINCT @@@)::STRING' + WHEN general_type = 'A' THEN 'MIN(@@@)::STRING || ''|'' || MAX(@@@)::STRING || ''|'' || COUNT(DISTINCT @@@)::STRING || ''|'' || SUM(LENGTH(@@@))::STRING' + WHEN general_type = 'N' THEN 'CONCAT_WS(''|'', + COUNT(@@@)::STRING, + COUNT(DISTINCT MOD((COALESCE(@@@,0)::DECIMAL(38,6) * 1000000)::DECIMAL(38,0), 1000003))::STRING, + COALESCE((MIN(@@@)::DECIMAL(38,6))::STRING, ''''), + COALESCE((MAX(@@@)::DECIMAL(38,6))::STRING, ''''), + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000007)), 0), 1000000007)::STRING, ''''), + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000009)), 0), 1000000009)::STRING, '''') + )' + END, + '@@@', '`' || column_name || '`' + ), + ' || ''|'' || ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, groupby_names, + test_active, last_auto_gen_date, 
profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Freshness_Trend' AS test_type, + s.schema_name, + s.table_name, + s.column_names AS groupby_names, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'PREDICT' AS history_calculation, + NULL AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Freshness_Trend' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Freshness_Trend' AND generation_set = :GENERATION_SET) + {TABLE_FILTER} + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + groupby_names = EXCLUDED.groupby_names, + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N' + -- Don't update existing tests in "insert" mode + AND NOT COALESCE(:INSERT_ONLY, FALSE); diff --git a/testgen/template/flavors/databricks/gen_query_tests/gen_Table_Freshness.sql b/testgen/template/flavors/databricks/gen_query_tests/gen_Table_Freshness.sql new file mode 100644 index 00000000..5251efad --- /dev/null +++ b/testgen/template/flavors/databricks/gen_query_tests/gen_Table_Freshness.sql @@ -0,0 
+1,191 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, + distinct_value_ct, record_ct, null_value_ct, + max_value, min_value, avg_value, stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank 
+ FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( + functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' + ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, 
schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + 'COUNT(*)::STRING || ''|'' || ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'MIN(@@@)::STRING || ''|'' || MAX(@@@)::STRING || ''|'' || COUNT(DISTINCT @@@)::STRING' + WHEN general_type = 'A' THEN 'MIN(@@@)::STRING || ''|'' || MAX(@@@)::STRING || ''|'' || COUNT(DISTINCT @@@)::STRING || ''|'' || SUM(LENGTH(@@@))::STRING' + WHEN general_type = 'N' THEN 'CONCAT_WS(''|'', + COUNT(@@@)::STRING, + COUNT(DISTINCT MOD((COALESCE(@@@,0)::DECIMAL(38,6) * 1000000)::DECIMAL(38,0), 1000003))::STRING, + COALESCE((MIN(@@@)::DECIMAL(38,6))::STRING, ''''), + COALESCE((MAX(@@@)::DECIMAL(38,6))::STRING, ''''), + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000007)), 0), 1000000007)::STRING, ''''), + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000009)), 0), 1000000009)::STRING, '''') + )' + END, + '@@@', '`' || column_name || '`' + ), + ' || ''|'' || ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Table_Freshness' AS test_type, + s.schema_name, + s.table_name, + 'Y' AS test_active, + 
:RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'Value' AS history_calculation, + 1 AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Table_Freshness' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Table_Freshness' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/flavors/databricks/gen_query_tests/gen_table_changed_test.sql b/testgen/template/flavors/databricks/gen_query_tests/gen_table_changed_test.sql deleted file mode 100644 index 17e085da..00000000 --- a/testgen/template/flavors/databricks/gen_query_tests/gen_table_changed_test.sql +++ /dev/null @@ -1,164 +0,0 @@ -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date, - lock_refresh, history_calculation, history_lookback, custom_query ) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts 
- ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = '{PROJECT_CODE}' - AND r.table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND ts.id = '{TEST_SUITE_ID}' - AND p.run_date::DATE <= '{AS_OF_DATE}' - GROUP BY r.table_groups_id), -curprof AS (SELECT p.profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, record_ct, max_value, min_value, avg_value, stdev_value, null_value_ct - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), -locked AS (SELECT schema_name, table_name - FROM test_definitions - WHERE table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND test_suite_id = '{TEST_SUITE_ID}' - AND test_type = 'Table_Freshness' - AND lock_refresh = 'Y'), --- IDs - TOP 2 -id_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 - WHEN functional_data_type = 'ID-Secondary' THEN 2 - ELSE 3 - END, distinct_value_ct, column_name DESC) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'ID%'), --- Process Date - TOP 1 -process_date_cols - AS (SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN column_name ILIKE '%mod%' THEN 1 - WHEN column_name ILIKE '%up%' THEN 1 - WHEN column_name ILIKE '%cr%' THEN 2 - WHEN column_name ILIKE '%in%' THEN 2 - END , distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'process%'), --- Transaction Date - TOP 1 -tran_date_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, 
functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'transactional date%' - OR functional_data_type ILIKE 'period%' - OR functional_data_type = 'timestamp' ), - --- Numeric Measures -numeric_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, -/* - -- Subscores - distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, - (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, - LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, - stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, - 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, -*/ - -- Weighted score - ( - 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + - 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + - 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) - ) AS change_detection_score - FROM curprof - WHERE general_type = 'N' - AND (functional_data_type ILIKE 'Measure%' OR functional_data_type IN ('Sequence', 'Constant')) - ), -numeric_cols_ranked - AS ( SELECT *, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY change_detection_score DESC, column_name) as rank - FROM numeric_cols - WHERE change_detection_score IS NOT NULL), -combined - AS ( SELECT profile_run_id, schema_name, table_name, column_name, 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order - FROM id_cols - WHERE rank <= 2 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_P' AS 
element_type, general_type, 20 + rank AS fingerprint_order - FROM process_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order - FROM tran_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order - FROM numeric_cols_ranked - WHERE rank = 1 ), -newtests - AS (SELECT profile_run_id, schema_name, table_name, - 'COUNT(*)::STRING || ''|'' || ' || - STRING_AGG( - REPLACE( - CASE - WHEN general_type = 'D' THEN 'MIN(@@@)::STRING || ''|'' || MAX(@@@::STRING) || ''|'' || COUNT(DISTINCT @@@)::STRING' - WHEN general_type = 'A' THEN 'MIN(@@@)::STRING || ''|'' || MAX(@@@::STRING) || ''|'' || COUNT(DISTINCT @@@)::STRING || ''|'' || SUM(LENGTH(@@@))::STRING' - WHEN general_type = 'N' THEN 'CONCAT_WS(''|'', - COUNT(@@@)::STRING, - COUNT(DISTINCT MOD((COALESCE(@@@,0)::DECIMAL(38,6) * 1000000)::DECIMAL(38,0), 1000003))::STRING, - COALESCE((MIN(@@@)::DECIMAL(38,6))::STRING, ''''), - COALESCE((MAX(@@@)::DECIMAL(38,6))::STRING, ''''), - COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL, 1000000007)), 0), 1000000007)::STRING, ''''), - COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL, 1000000009)), 0), 1000000009)::STRING, '''') - )' - END, - '@@@', '`' || column_name || '`'), - ' || ''|'' || ' - ORDER BY element_type, fingerprint_order, column_name) as fingerprint - FROM combined - GROUP BY profile_run_id, schema_name, table_name) -SELECT '{TABLE_GROUPS_ID}'::UUID as table_groups_id, - n.profile_run_id, - 'Table_Freshness' AS test_type, - '{TEST_SUITE_ID}' AS test_suite_id, - n.schema_name, n.table_name, - 0 as skip_errors, 'Y' as test_active, - - '{RUN_DATE}'::TIMESTAMP as last_auto_gen_date, - '{AS_OF_DATE}'::TIMESTAMP as profiling_as_of_date, - 'N' as lock_refresh, - 'Value' as 
history_calculation, - 1 as history_lookback, - fingerprint as custom_query -FROM newtests n -INNER JOIN test_types t - ON ('Table_Freshness' = t.test_type - AND 'Y' = t.active) -LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND '{GENERATION_SET}' = s.generation_set) -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name) -WHERE (s.generation_set IS NOT NULL - OR '{GENERATION_SET}' = '') - AND l.schema_name IS NULL; - diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_no_drops_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_no_drops_generic.sql deleted file mode 100644 index 7e8d3fff..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_no_drops_generic.sql +++ /dev/null @@ -1,45 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure -FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL - FROM - ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - {HAVING_CONDITION} - UNION ALL - SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} ) a - GROUP BY {GROUPBY_NAMES} ) s - WHERE total < match_total --- OR (total IS NOT NULL AND match_total IS NULL) -- New categories - OR (total IS NULL AND match_total IS NOT NULL); -- Dropped categories diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_percent_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_percent_generic.sql deleted file mode 100644 index accad515..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_percent_generic.sql +++ /dev/null @@ -1,45 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure -FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL - FROM - ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - {HAVING_CONDITION} - UNION ALL - SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} ) a - GROUP BY {GROUPBY_NAMES} ) s - WHERE (total IS NOT NULL AND match_total IS NULL) - OR (total IS NULL AND match_total IS NOT NULL) - OR (total NOT BETWEEN match_total * (1 + {LOWER_TOLERANCE}/100.0) AND match_total * (1 + {UPPER_TOLERANCE}/100.0)); diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_range_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_range_generic.sql deleted file mode 100644 index e183241f..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_range_generic.sql +++ /dev/null @@ -1,45 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' 
- ) - ) - ELSE 'No errors found.' - END AS result_message, - COUNT(*) as result_measure -FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL - FROM - ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - {HAVING_CONDITION} - UNION ALL - SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} ) a - GROUP BY {GROUPBY_NAMES} ) s - WHERE (total IS NOT NULL AND match_total IS NULL) - OR (total IS NULL AND match_total IS NOT NULL) - OR (total NOT BETWEEN match_total + {LOWER_TOLERANCE} AND match_total + {UPPER_TOLERANCE}); diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_same_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_same_generic.sql deleted file mode 100644 index e5dbfbf8..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_aggregate_match_same_generic.sql +++ /dev/null @@ -1,45 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' 
- ) - ) - ELSE 'No errors found.' - END AS result_message, - COUNT(*) as result_measure -FROM ( SELECT {GROUPBY_NAMES}, SUM(TOTAL) as total, SUM(MATCH_TOTAL) as MATCH_TOTAL - FROM - ( SELECT {GROUPBY_NAMES}, {COLUMN_NAME_NO_QUOTES} as total, NULL as match_total - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - {HAVING_CONDITION} - UNION ALL - SELECT {MATCH_GROUPBY_NAMES}, NULL as total, {MATCH_COLUMN_NAMES} as match_total - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} ) a - GROUP BY {GROUPBY_NAMES} ) s - WHERE total <> match_total - OR (total IS NOT NULL AND match_total IS NULL) - OR (total IS NULL AND match_total IS NOT NULL); diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_custom_query_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_custom_query_generic.sql deleted file mode 100644 index 0d17c0fc..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_custom_query_generic.sql +++ /dev/null @@ -1,35 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - CASE - WHEN '{COLUMN_NAME_NO_QUOTES}' = '' OR '{COLUMN_NAME_NO_QUOTES}' IS NULL THEN 'N/A' - ELSE '{COLUMN_NAME_NO_QUOTES}' - END as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - /* TODO: 'custom_query= {CUSTOM_QUERY_ESCAPED}' as input_parameters, */ - 'Skip_Errors={SKIP_ERRORS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) 
> {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' - END AS result_message, - COUNT(*) as result_measure - FROM ( - {CUSTOM_QUERY} - ) TEST; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_data_match_2way_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_data_match_2way_generic.sql deleted file mode 100644 index 52dd918d..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_data_match_2way_generic.sql +++ /dev/null @@ -1,54 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure - FROM ( - ( SELECT {GROUPBY_NAMES} - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - {HAVING_CONDITION} - EXCEPT - SELECT {MATCH_COLUMN_NAMES} - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} ) - UNION - ( SELECT {MATCH_COLUMN_NAMES} - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} - EXCEPT - SELECT {GROUPBY_NAMES} - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - {HAVING_CONDITION} - ) - ) test; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_data_match_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_data_match_generic.sql deleted file mode 100644 index f7758fa1..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_data_match_generic.sql +++ /dev/null @@ -1,40 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure - FROM ( SELECT {COLUMN_NAME_NO_QUOTES} - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {COLUMN_NAME_NO_QUOTES} - {HAVING_CONDITION} - EXCEPT - SELECT {MATCH_GROUPBY_NAMES} - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{MATCH_TABLE_NAME}{QUOTE} - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} - {MATCH_HAVING_CONDITION} - ) test; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_dupe_rows_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_dupe_rows_generic.sql deleted file mode 100644 index b194bde3..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_dupe_rows_generic.sql +++ /dev/null @@ -1,34 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' duplicate row(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COALESCE(SUM(record_ct), 0) as result_measure - FROM ( SELECT {GROUPBY_NAMES}, COUNT(*) as record_ct - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - GROUP BY {GROUPBY_NAMES} - HAVING COUNT(*) > 1 - ) test; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_relative_entropy_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_relative_entropy_generic.sql deleted file mode 100644 index 6f30c530..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_relative_entropy_generic.sql +++ /dev/null @@ -1,49 +0,0 @@ --- Relative Entropy: measured by Jensen-Shannon Divergence --- Smoothed and normalized version of KL divergence, --- with scores between 0 (identical) and 1 (maximally different), --- when using the base-2 logarithm. Formula is: --- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) --- Log base 2 of x = LN(x)/LN(2) -WITH latest_ver - AS ( SELECT {CONCAT_COLUMNS} as category, - COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v1 - WHERE {SUBSET_CONDITION} - GROUP BY {COLUMN_NAME_NO_QUOTES} ), -older_ver - AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, - COUNT(*)::FLOAT / SUM(COUNT(*)) OVER ()::FLOAT AS pct_of_total - FROM {QUOTE}{MATCH_SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} v2 - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} ), -dataset - AS ( SELECT COALESCE(l.category, o.category) AS category, - COALESCE(o.pct_of_total, 0.0000001) AS old_pct, - COALESCE(l.pct_of_total, 0.0000001) AS new_pct, - (COALESCE(o.pct_of_total, 0.0000001) - + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct - FROM latest_ver l - FULL JOIN older_ver o - ON (l.category = o.category) ) -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - 
'{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, --- '{GROUPBY_NAMES}' as column_names, - '{THRESHOLD_VALUE}' as threshold_value, - NULL as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, - CONCAT('Divergence Level: ', - CONCAT(CAST(js_divergence AS {VARCHAR_TYPE}), - ', Threshold: {THRESHOLD_VALUE}.')) as result_message, - js_divergence as result_measure - FROM ( - SELECT 0.5 * ABS(SUM(new_pct * LN(new_pct/avg_pct)/LN(2))) - + 0.5 * ABS(SUM(old_pct * LN(old_pct/avg_pct)/LN(2))) as js_divergence - FROM dataset ) rslt; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_schema_drift_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_schema_drift_generic.sql deleted file mode 100644 index c6b7f9ef..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_schema_drift_generic.sql +++ /dev/null @@ -1,31 +0,0 @@ -WITH prev_test AS ( - SELECT MAX(test_time) as last_run_time - from {APP_SCHEMA_NAME}.test_results - where test_definition_id = '{TEST_DEFINITION_ID}' -), -change_counts AS ( - SELECT COUNT(*) FILTER (WHERE dsl.change = 'A') AS schema_adds, - COUNT(*) FILTER (WHERE dsl.change = 'D') AS schema_drops, - COUNT(*) FILTER (WHERE dsl.change = 'M') AS schema_mods - FROM prev_test, {APP_SCHEMA_NAME}.data_structure_log dsl - LEFT JOIN {APP_SCHEMA_NAME}.data_column_chars dcc ON dcc.column_id = dsl.element_id - WHERE dcc.table_groups_id = '{TABLE_GROUPS_ID}' - -- if there are no previous tests, this comparison yields null and nothing is counted. 
- AND change_date > prev_test.last_run_time -) -SELECT '{TEST_TYPE}' AS test_type, - '{TEST_DEFINITION_ID}' AS test_definition_id, - '{TEST_SUITE_ID}' AS test_suite_id, - '{TEST_RUN_ID}' AS test_run_id, - '{RUN_DATE}' AS test_time, - '1' AS threshold_value, - 1 AS skip_errors, - '{INPUT_PARAMETERS}' AS input_parameters, - schema_adds::VARCHAR || '|' || schema_mods::VARCHAR || '|' || schema_drops::VARCHAR AS result_signal, - CASE WHEN schema_adds+schema_mods+schema_drops > 0 THEN 0 ELSE 1 END AS result_code, - CASE WHEN schema_adds+schema_mods+schema_drops > 0 THEN - 'Table schema changes detected' - ELSE 'No table schema changes found.' - END AS result_message, - schema_adds+schema_mods+schema_drops AS result_measure -FROM change_counts diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_table_changed_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_table_changed_generic.sql deleted file mode 100644 index 672f19d6..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_table_changed_generic.sql +++ /dev/null @@ -1,29 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - fingerprint as result_signal, - /* Fails if table is the same */ - CASE WHEN fingerprint = '{BASELINE_VALUE}' THEN 0 ELSE 1 END as result_code, - - CASE - WHEN fingerprint = '{BASELINE_VALUE}' - THEN 'No table change detected.' - ELSE 'Table change detected.' 
- END AS result_message, - CASE - WHEN fingerprint = '{BASELINE_VALUE}' - THEN 0 - ELSE 1 - END as result_measure - FROM ( SELECT {CUSTOM_QUERY} as fingerprint - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - ) test; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_window_match_no_drops_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_window_match_no_drops_generic.sql deleted file mode 100644 index 7ece651a..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_window_match_no_drops_generic.sql +++ /dev/null @@ -1,42 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure -FROM ( - SELECT {COLUMN_NAME_NO_QUOTES} - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) - AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) - GROUP BY {COLUMN_NAME_NO_QUOTES} - EXCEPT - SELECT {COLUMN_NAME_NO_QUOTES} - FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) - GROUP BY {COLUMN_NAME_NO_QUOTES} - ) test; diff --git a/testgen/template/flavors/generic/exec_query_tests/ex_window_match_same_generic.sql b/testgen/template/flavors/generic/exec_query_tests/ex_window_match_same_generic.sql deleted file mode 100644 index 9b463d7c..00000000 --- a/testgen/template/flavors/generic/exec_query_tests/ex_window_match_same_generic.sql +++ /dev/null @@ -1,55 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS {VARCHAR_TYPE}), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' 
- ) - ) - ELSE 'No errors found.' - END AS result_message, - COUNT(*) as result_measure - FROM ( - ( -SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) -EXCEPT -SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) - AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) -) -UNION ALL -( -SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - 2 * {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) - AND {WINDOW_DATE_COLUMN} < DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) - EXCEPT -SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE} -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= DATEADD("day", - {WINDOW_DAYS}, (SELECT MAX({WINDOW_DATE_COLUMN}) FROM {QUOTE}{SCHEMA_NAME}{QUOTE}.{QUOTE}{TABLE_NAME}{QUOTE})) -) - ) test; diff --git a/testgen/template/flavors/mssql/exec_query_tests/ex_relative_entropy_mssql.sql b/testgen/template/flavors/mssql/exec_query_tests/ex_relative_entropy_mssql.sql deleted file mode 100644 index 4ec91d25..00000000 --- 
a/testgen/template/flavors/mssql/exec_query_tests/ex_relative_entropy_mssql.sql +++ /dev/null @@ -1,49 +0,0 @@ --- Relative Entropy: measured by Jensen-Shannon Divergence --- Smoothed and normalized version of KL divergence, --- with scores between 0 (identical) and 1 (maximally different), --- when using the base-2 logarithm. Formula is: --- 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m) --- Log base 2 of x = LN(x)/LN(2) -WITH latest_ver - AS ( SELECT {CONCAT_COLUMNS} as category, - CAST(COUNT(*) as FLOAT) / CAST(SUM(COUNT(*)) OVER () as FLOAT) AS pct_of_total - FROM "{SCHEMA_NAME}"."{TABLE_NAME}" v1 - WHERE {SUBSET_CONDITION} - GROUP BY {COLUMN_NAME_NO_QUOTES} ), -older_ver - AS ( SELECT {CONCAT_MATCH_GROUPBY} as category, - CAST(COUNT(*) as FLOAT) / CAST(SUM(COUNT(*)) OVER () as FLOAT) AS pct_of_total - FROM "{MATCH_SCHEMA_NAME}"."{TABLE_NAME}" v2 - WHERE {MATCH_SUBSET_CONDITION} - GROUP BY {MATCH_GROUPBY_NAMES} ), -dataset - AS ( SELECT COALESCE(l.category, o.category) AS category, - COALESCE(o.pct_of_total, 0.0000001) AS old_pct, - COALESCE(l.pct_of_total, 0.0000001) AS new_pct, - (COALESCE(o.pct_of_total, 0.0000001) - + COALESCE(l.pct_of_total, 0.0000001))/2.0 AS avg_pct - FROM latest_ver l - FULL JOIN older_ver o - ON (l.category = o.category) ) -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, --- '{GROUPBY_NAMES}' as column_names, - '{THRESHOLD_VALUE}' as threshold_value, - NULL as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN js_divergence > {THRESHOLD_VALUE} THEN 0 ELSE 1 END as result_code, - CONCAT('Divergence Level: ', - CONCAT(CAST(js_divergence AS VARCHAR), - ', Threshold: {THRESHOLD_VALUE}.')) as result_message, - js_divergence as result_measure - FROM 
( - SELECT 0.5 * ABS(SUM(new_pct * LOG(new_pct/avg_pct)/LOG(2))) - + 0.5 * ABS(SUM(old_pct * LOG(old_pct/avg_pct)/LOG(2))) as js_divergence - FROM dataset ) rslt; diff --git a/testgen/template/flavors/mssql/exec_query_tests/ex_table_changed_mssql.sql b/testgen/template/flavors/mssql/exec_query_tests/ex_table_changed_mssql.sql deleted file mode 100644 index b448fe84..00000000 --- a/testgen/template/flavors/mssql/exec_query_tests/ex_table_changed_mssql.sql +++ /dev/null @@ -1,29 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - fingerprint as result_signal, - /* Fails if table is the same */ - CASE WHEN fingerprint = '{BASELINE_VALUE}' THEN 0 ELSE 1 END as result_code, - - CASE - WHEN fingerprint = '{BASELINE_VALUE}' - THEN 'No table change detected.' - ELSE 'Table change detected.' 
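For reviewers: the deleted ex_relative_entropy_mssql.sql template above computes the Jensen-Shannon divergence as 0.5 * KL(p, m) + 0.5 * KL(q, m) with base-2 logs and 0.0000001 smoothing for missing categories. A minimal Python sketch of the same calculation (illustrative only, not part of the patch; the function name is hypothetical and the eps default mirrors the template's COALESCE constant):

```python
import math

def js_divergence(p, q, eps=1e-7):
    """Jensen-Shannon divergence with base-2 logs: 0 = identical, ~1 = maximally different.
    p and q map category -> share of total rows. Categories missing from one side are
    smoothed with eps, mirroring COALESCE(..., 0.0000001) in the SQL template."""
    total = 0.0
    for cat in set(p) | set(q):
        pc = p.get(cat, eps)
        qc = q.get(cat, eps)
        m = (pc + qc) / 2.0  # mixture distribution
        # 0.5 * KL(p||m) + 0.5 * KL(q||m), accumulated term by term
        total += 0.5 * pc * math.log2(pc / m) + 0.5 * qc * math.log2(qc / m)
    return total
```

Identical distributions score near 0; fully disjoint category sets score near 1, which is what the template compares against {THRESHOLD_VALUE}.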
- END AS result_message, - CASE - WHEN fingerprint = '{BASELINE_VALUE}' - THEN 0 - ELSE 1 - END as result_measure - FROM ( SELECT {CUSTOM_QUERY} as fingerprint - FROM "{SCHEMA_NAME}"."{TABLE_NAME}" WITH (NOLOCK) - WHERE {SUBSET_CONDITION} - ) test; diff --git a/testgen/template/flavors/mssql/gen_query_tests/gen_Freshness_Trend.sql b/testgen/template/flavors/mssql/gen_query_tests/gen_Freshness_Trend.sql new file mode 100644 index 00000000..aa18dac0 --- /dev/null +++ b/testgen/template/flavors/mssql/gen_query_tests/gen_Freshness_Trend.sql @@ -0,0 +1,204 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT p.profile_run_id, p.schema_name, p.table_name, p.column_name, + p.functional_data_type, p.general_type, + p.distinct_value_ct, p.record_ct, p.null_value_ct, + p.max_value, p.min_value, p.avg_value, p.stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + INNER JOIN data_table_chars dtc ON ( + dtc.table_groups_id = p.table_groups_id + AND dtc.schema_name = p.schema_name + AND dtc.table_name = p.table_name + -- Ignore dropped tables + AND dtc.drop_date IS NULL + ) + WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT 
profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + 
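For reviewers: the weighted change_detection_score above (repeated in the other generation templates in this patch) ranks numeric columns by how likely their aggregates are to move when the table changes. An illustrative Python rendering, not part of the patch: the function name and signature are hypothetical, Postgres LOG (base 10) is mirrored with math.log10, and the SQL's NULL-propagating NULLIF guards are simplified to plain fallbacks (the SQL would instead yield NULL and drop the column).

```python
import math

def change_detection_score(distinct_value_ct, record_ct, max_value, min_value,
                           avg_value, stdev_value, null_value_ct):
    """Weighted score for choosing a change-sensitive numeric column.
    Simplified: where the SQL's NULLIF chain yields NULL (avg_value = 0,
    record_ct 0 or 1), this version substitutes safe fallbacks instead."""
    denom_avg = abs(avg_value) if avg_value else 1.0  # ~ NULLIF(ABS(NULLIF(avg, 0)), 1)
    cardinality = distinct_value_ct / record_ct if record_ct else 0.0
    value_range = (max_value - min_value) / denom_avg
    # Postgres LOG() is base 10, hence log10 here
    nontriviality = min(1.0, math.log10(max(distinct_value_ct, 2))) / math.log10(max(record_ct, 2))
    variability = stdev_value / denom_avg
    null_penalty = 1.0 - null_value_ct / record_ct if record_ct > 1 else 1.0
    return (0.25 * cardinality + 0.15 * value_range + 0.10 * nontriviality
            + 0.40 * variability + 0.10 * null_penalty)
```

The 0.40 weight on variability dominates, so among otherwise similar columns the template prefers the one with the largest relative standard deviation.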
FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + STRING_AGG(column_name, ',' ORDER BY element_type, fingerprint_order, column_name) AS column_names, + 'CAST(COUNT(*) AS varchar) + ''|'' + ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'CAST(MIN(@@@) AS NVARCHAR) + ''|'' + CAST(MAX(@@@) AS NVARCHAR) + ''|'' + CAST(COUNT_BIG(DISTINCT @@@) AS NVARCHAR)' + WHEN general_type = 'A' THEN 'CAST(MIN(@@@) AS NVARCHAR) + ''|'' + CAST(MAX(@@@) AS NVARCHAR) + ''|'' + CAST(COUNT_BIG(DISTINCT @@@) AS NVARCHAR) + ''|'' + CAST(SUM(LEN(@@@)) AS NVARCHAR)' + WHEN general_type = 'N' THEN 'CONCAT_WS(''|'', + CAST(COUNT_BIG(@@@) AS VARCHAR(20)), + CAST(COUNT_BIG(DISTINCT CAST(CAST(CAST(COALESCE(@@@,0) AS DECIMAL(38,6)) * 1000000 AS DECIMAL(38,0)) % 1000003 AS INT)) AS VARCHAR(20)), + COALESCE(CAST(CAST(MIN(@@@) AS 
DECIMAL(38,6)) AS VARCHAR(50)), ''''), + COALESCE(CAST(CAST(MAX(@@@) AS DECIMAL(38,6)) AS VARCHAR(50)), ''''), + CAST((COALESCE(SUM(CAST(CAST(ABS(CAST(COALESCE(@@@,0) AS DECIMAL(38,6))) * 1000000 AS DECIMAL(38,0)) % 1000000007 AS DECIMAL(38,0))), 0) % 1000000007) AS VARCHAR(12)), + CAST((COALESCE(SUM(CAST(CAST(ABS(CAST(COALESCE(@@@,0) AS DECIMAL(38,6))) * 1000000 AS DECIMAL(38,0)) % 1000000009 AS DECIMAL(38,0))), 0) % 1000000009) AS VARCHAR(12)) + )' + END, + '@@@', '"' || column_name || '"' + ), + ' + ''|'' + ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, groupby_names, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Freshness_Trend' AS test_type, + s.schema_name, + s.table_name, + s.column_names AS groupby_names, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'PREDICT' AS history_calculation, + NULL AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Freshness_Trend' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Freshness_Trend' AND generation_set = :GENERATION_SET) + {TABLE_FILTER} + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO 
UPDATE SET + groupby_names = EXCLUDED.groupby_names, + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N' + -- Don't update existing tests in "insert" mode + AND NOT COALESCE(:INSERT_ONLY, FALSE); diff --git a/testgen/template/flavors/mssql/gen_query_tests/gen_Table_Freshness.sql b/testgen/template/flavors/mssql/gen_query_tests/gen_Table_Freshness.sql new file mode 100644 index 00000000..6379a897 --- /dev/null +++ b/testgen/template/flavors/mssql/gen_query_tests/gen_Table_Freshness.sql @@ -0,0 +1,191 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, + distinct_value_ct, record_ct, null_value_ct, + max_value, min_value, avg_value, stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date 
- TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 
0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + 'CAST(COUNT(*) AS varchar) + ''|'' + ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'CAST(MIN(@@@) AS NVARCHAR) + ''|'' + CAST(MAX(@@@) AS NVARCHAR) + ''|'' + CAST(COUNT_BIG(DISTINCT @@@) AS NVARCHAR)' + WHEN general_type = 'A' THEN 'CAST(MIN(@@@) AS NVARCHAR) + ''|'' + CAST(MAX(@@@) AS NVARCHAR) + ''|'' + CAST(COUNT_BIG(DISTINCT @@@) AS NVARCHAR) + ''|'' + CAST(SUM(LEN(@@@)) AS NVARCHAR)' + WHEN general_type = 'N' THEN 'CONCAT_WS(''|'', + CAST(COUNT_BIG(@@@) AS VARCHAR(20)), + CAST(COUNT_BIG(DISTINCT CAST(CAST(CAST(COALESCE(@@@,0) AS DECIMAL(38,6)) * 1000000 AS DECIMAL(38,0)) % 1000003 AS INT)) AS VARCHAR(20)), + COALESCE(CAST(CAST(MIN(@@@) AS DECIMAL(38,6)) AS VARCHAR(50)), ''''), + COALESCE(CAST(CAST(MAX(@@@) AS 
DECIMAL(38,6)) AS VARCHAR(50)), ''''), + CAST((COALESCE(SUM(CAST(CAST(ABS(CAST(COALESCE(@@@,0) AS DECIMAL(38,6))) * 1000000 AS DECIMAL(38,0)) % 1000000007 AS DECIMAL(38,0))), 0) % 1000000007) AS VARCHAR(12)), + CAST((COALESCE(SUM(CAST(CAST(ABS(CAST(COALESCE(@@@,0) AS DECIMAL(38,6))) * 1000000 AS DECIMAL(38,0)) % 1000000009 AS DECIMAL(38,0))), 0) % 1000000009) AS VARCHAR(12)) + )' + END, + '@@@', '"' || column_name || '"' + ), + ' + ''|'' + ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Table_Freshness' AS test_type, + s.schema_name, + s.table_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'Value' AS history_calculation, + 1 AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Table_Freshness' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Table_Freshness' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = 
EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/flavors/mssql/gen_query_tests/gen_table_changed_test.sql b/testgen/template/flavors/mssql/gen_query_tests/gen_table_changed_test.sql deleted file mode 100644 index d352848e..00000000 --- a/testgen/template/flavors/mssql/gen_query_tests/gen_table_changed_test.sql +++ /dev/null @@ -1,170 +0,0 @@ -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date, - lock_refresh, history_calculation, history_lookback, custom_query ) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = '{PROJECT_CODE}' - AND r.table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND ts.id = '{TEST_SUITE_ID}' - AND p.run_date::DATE <= '{AS_OF_DATE}' - GROUP BY r.table_groups_id), -curprof AS (SELECT p.profile_run_id, p.schema_name, p.table_name, p.column_name, p.functional_data_type, - p.general_type, p.distinct_value_ct, p.record_ct, p.max_value, p.min_value, - p.avg_value, p.stdev_value, p.null_value_ct - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), -locked AS (SELECT schema_name, table_name - FROM test_definitions - WHERE table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND test_suite_id = '{TEST_SUITE_ID}' - AND test_type = 'Table_Freshness' - AND lock_refresh = 'Y'), --- IDs - TOP 2 -id_cols - AS ( SELECT profile_run_id, schema_name, table_name, 
column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 - WHEN functional_data_type = 'ID-Secondary' THEN 2 - ELSE 3 - END, distinct_value_ct, column_name DESC) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'ID%'), --- Process Date - TOP 1 -process_date_cols - AS (SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN column_name ILIKE '%mod%' THEN 1 - WHEN column_name ILIKE '%up%' THEN 1 - WHEN column_name ILIKE '%cr%' THEN 2 - WHEN column_name ILIKE '%in%' THEN 2 - END , distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'process%'), --- Transaction Date - TOP 1 -tran_date_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'transactional date%' - OR functional_data_type ILIKE 'period%' - OR functional_data_type = 'timestamp' ), - --- Numeric Measures -numeric_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, -/* - -- Subscores -- save for reference - distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, - (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, - LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, - stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, - 1.0 - (null_value_ct * 1.0 / 
NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, -*/ - -- Weighted score - ( - 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + - 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + - 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) - ) AS change_detection_score - FROM curprof - WHERE general_type = 'N' - AND (functional_data_type ILIKE 'Measure%' OR functional_data_type IN ('Sequence', 'Constant')) - ), -numeric_cols_ranked - AS ( SELECT *, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY change_detection_score DESC, column_name) as rank - FROM numeric_cols - WHERE change_detection_score IS NOT NULL), -combined - AS ( SELECT profile_run_id, schema_name, table_name, column_name, 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order - FROM id_cols - WHERE rank <= 2 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order - FROM process_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order - FROM tran_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order - FROM numeric_cols_ranked - WHERE rank = 1 ), -newtests AS ( - SELECT - profile_run_id, - schema_name, - table_name, - 'CAST(COUNT(*) AS varchar) + ''|'' + ' || STRING_AGG( - REPLACE( - CASE - WHEN general_type = 'D' THEN 'CAST(MIN(@@@) AS NVARCHAR) + ''|'' + CAST(MAX(@@@) AS NVARCHAR) + ''|'' + CAST(COUNT_BIG(DISTINCT @@@) AS NVARCHAR)' - WHEN general_type = 'A' THEN 'CAST(MIN(@@@) AS NVARCHAR) + ''|'' + CAST(MAX(@@@) AS NVARCHAR) + ''|'' + 
CAST(COUNT_BIG(DISTINCT @@@) AS NVARCHAR) + ''|'' + CAST(SUM(LEN(@@@)) AS NVARCHAR)' - WHEN general_type = 'N' THEN 'CONCAT_WS(''|'', - CAST(COUNT_BIG(@@@) AS VARCHAR(20)), - CAST(COUNT_BIG(DISTINCT CAST(CAST(CAST(COALESCE(@@@,0) AS DECIMAL(38,6)) * 1000000 AS DECIMAL(38,0)) % 1000003 AS INT)) AS VARCHAR(20)), - COALESCE(CAST(CAST(MIN(@@@) AS DECIMAL(38,6)) AS VARCHAR(50)), ''''), - COALESCE(CAST(CAST(MAX(@@@) AS DECIMAL(38,6)) AS VARCHAR(50)), ''''), - CAST((COALESCE(SUM(CAST(CAST(ABS(CAST(COALESCE(@@@,0) AS DECIMAL(38,6))) * 1000000 AS DECIMAL(38,0)) % 1000000007 AS DECIMAL(38,0))), 0) % 1000000007) AS VARCHAR(12)), - CAST((COALESCE(SUM(CAST(CAST(ABS(CAST(COALESCE(@@@,0) AS DECIMAL(38,6))) * 1000000 AS DECIMAL(38,0)) % 1000000009 AS DECIMAL(38,0))), 0) % 1000000009) AS VARCHAR(12)) - )' - END, - '@@@', '"' || column_name || '"' - ), - ' + ''|'' + ' - ORDER BY element_type, fingerprint_order, column_name - ) as fingerprint - FROM combined - GROUP BY profile_run_id, schema_name, table_name -) -SELECT '{TABLE_GROUPS_ID}'::UUID as table_groups_id, - n.profile_run_id, - 'Table_Freshness' AS test_type, - '{TEST_SUITE_ID}' AS test_suite_id, - n.schema_name, n.table_name, - 0 as skip_errors, 'Y' as test_active, - - '{RUN_DATE}'::TIMESTAMP as last_auto_gen_date, - '{AS_OF_DATE}'::TIMESTAMP as profiling_as_of_date, - 'N' as lock_refresh, - 'Value' as history_calculation, - 1 as history_lookback, - fingerprint as custom_query -FROM newtests n -INNER JOIN test_types t - ON ('Table_Freshness' = t.test_type - AND 'Y' = t.active) -LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND '{GENERATION_SET}' = s.generation_set) -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name) -WHERE (s.generation_set IS NOT NULL - OR '{GENERATION_SET}' = '') - AND l.schema_name IS NULL; - diff --git a/testgen/template/flavors/postgresql/exec_query_tests/ex_window_match_no_drops_postgresql.sql 
b/testgen/template/flavors/postgresql/exec_query_tests/ex_window_match_no_drops_postgresql.sql deleted file mode 100644 index 6088cd63..00000000 --- a/testgen/template/flavors/postgresql/exec_query_tests/ex_window_match_no_drops_postgresql.sql +++ /dev/null @@ -1,42 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS VARCHAR), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure -FROM ( - SELECT {COLUMN_NAME_NO_QUOTES} - FROM "{SCHEMA_NAME}"."{TABLE_NAME}" - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - 2 * {WINDOW_DAYS} - AND {WINDOW_DATE_COLUMN} < (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS} - GROUP BY {COLUMN_NAME_NO_QUOTES} - EXCEPT - SELECT {COLUMN_NAME_NO_QUOTES} - FROM "{SCHEMA_NAME}"."{TABLE_NAME}" - WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS} - GROUP BY {COLUMN_NAME_NO_QUOTES} - ) test; diff --git a/testgen/template/flavors/postgresql/exec_query_tests/ex_window_match_same_postgresql.sql b/testgen/template/flavors/postgresql/exec_query_tests/ex_window_match_same_postgresql.sql deleted file mode 100644 index 4cf4faf2..00000000 --- a/testgen/template/flavors/postgresql/exec_query_tests/ex_window_match_same_postgresql.sql +++ /dev/null @@ -1,55 +0,0 @@ -SELECT '{TEST_TYPE}' as test_type, - '{TEST_DEFINITION_ID}' as test_definition_id, - '{TEST_SUITE_ID}' as test_suite_id, - '{TEST_RUN_ID}' as test_run_id, - '{RUN_DATE}' as test_time, - '{SCHEMA_NAME}' as schema_name, - '{TABLE_NAME}' as table_name, - '{COLUMN_NAME_NO_QUOTES}' as column_names, - '{SKIP_ERRORS}' as threshold_value, - {SKIP_ERRORS} as skip_errors, - '{INPUT_PARAMETERS}' as input_parameters, - NULL as result_signal, - CASE WHEN COUNT (*) > {SKIP_ERRORS} THEN 0 ELSE 1 END as result_code, - CASE - WHEN COUNT(*) > 0 THEN - CONCAT( - CONCAT( CAST(COUNT(*) AS VARCHAR), ' error(s) identified, ' ), - CONCAT( - CASE - WHEN COUNT(*) > {SKIP_ERRORS} THEN 'exceeding limit of ' - ELSE 'within limit of ' - END, - '{SKIP_ERRORS}.' - ) - ) - ELSE 'No errors found.' 
- END AS result_message, - COUNT(*) as result_measure - FROM ( - ( -SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM "{SCHEMA_NAME}"."{TABLE_NAME}" -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS} -EXCEPT -SELECT 'Prior Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM "{SCHEMA_NAME}"."{TABLE_NAME}" -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - 2 * {WINDOW_DAYS} - AND {WINDOW_DATE_COLUMN} < (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS} -) -UNION ALL -( -SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM "{SCHEMA_NAME}"."{TABLE_NAME}" -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - 2 * {WINDOW_DAYS} - AND {WINDOW_DATE_COLUMN} < (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS} - EXCEPT -SELECT 'Latest Timeframe' as missing_from, {COLUMN_NAME_NO_QUOTES} -FROM "{SCHEMA_NAME}"."{TABLE_NAME}" -WHERE {SUBSET_CONDITION} - AND {WINDOW_DATE_COLUMN} >= (SELECT MAX({WINDOW_DATE_COLUMN}) FROM "{SCHEMA_NAME}"."{TABLE_NAME}") - {WINDOW_DAYS} -) - ) test; diff --git a/testgen/template/gen_funny_cat_tests/gen_Constant.sql b/testgen/template/gen_funny_cat_tests/gen_Constant.sql new file mode 100644 index 00000000..4a0af8d6 --- /dev/null +++ b/testgen/template/gen_funny_cat_tests/gen_Constant.sql @@ -0,0 +1,117 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT p.* + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + 
WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID +), +all_runs AS ( + SELECT DISTINCT table_groups_id, run_date, + DENSE_RANK() OVER (PARTITION BY table_groups_id ORDER BY run_date DESC) AS run_rank + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +recent_runs AS ( + SELECT table_groups_id, run_date, run_rank + FROM all_runs + WHERE run_rank <= 5 +), +selected_columns AS ( + -- Select columns based on recent profiling results + SELECT p.schema_name, p.table_name, p.column_name, + SUM(CASE WHEN p.distinct_value_ct = 1 THEN 0 ELSE 1 END) AS always_one_val, + COUNT( + DISTINCT CASE + WHEN p.general_type = 'A' THEN p.min_text + WHEN p.general_type = 'N' THEN p.min_value::VARCHAR + WHEN p.general_type IN ('D','T') THEN p.min_date::VARCHAR + WHEN p.general_type = 'B' AND p.boolean_true_ct = p.value_ct THEN 'TRUE' + WHEN p.general_type = 'B' AND p.boolean_true_ct = 0 AND p.distinct_value_ct = 1 THEN 'FALSE' + END + ) AS agg_distinct_val_ct + FROM recent_runs rr + INNER JOIN profile_results p ON ( + rr.table_groups_id = p.table_groups_id + AND rr.run_date = p.run_date + ) + WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID + -- No dates as constants + AND NOT (p.general_type = 'D' AND rr.run_rank = 1) + GROUP BY p.schema_name, p.table_name, p.column_name + HAVING SUM(CASE WHEN p.distinct_value_ct = 1 THEN 0 ELSE 1 END) = 0 + AND SUM(CASE WHEN p.max_length < 100 THEN 0 ELSE 1 END) = 0 + AND COUNT( + DISTINCT CASE + WHEN p.general_type = 'A' THEN p.min_text + WHEN p.general_type = 'N' THEN p.min_value::VARCHAR + WHEN p.general_type IN ('D','T') THEN p.min_date::VARCHAR + WHEN p.general_type = 'B' AND p.boolean_true_ct = p.value_ct THEN 'TRUE' + WHEN p.general_type = 'B' AND p.boolean_true_ct = 0 AND p.distinct_value_ct = 1 THEN 'FALSE' + END + ) = 1 + -- Only constant if more than one profiling result + AND COUNT(*) > 1 +) +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + 
schema_name, table_name, column_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + baseline_value, threshold_value, skip_errors +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Constant' AS test_type, + r.schema_name, + r.table_name, + r.column_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + r.profile_run_id, + CASE WHEN r.general_type = 'A' THEN fn_quote_literal_escape(r.min_text, :SQL_FLAVOR)::VARCHAR + WHEN r.general_type = 'D' THEN fn_quote_literal_escape(r.min_date::VARCHAR, :SQL_FLAVOR)::VARCHAR + WHEN r.general_type = 'N' THEN r.min_value::VARCHAR + WHEN r.general_type = 'B' AND r.boolean_true_ct = 0 THEN 'FALSE'::VARCHAR + WHEN r.general_type = 'B' AND r.boolean_true_ct > 0 THEN 'TRUE'::VARCHAR + ELSE '' + END AS baseline_value, + '0' AS threshold_value, + 0 AS skip_errors +FROM latest_results r +-- Only insert tests for selected columns +INNER JOIN selected_columns c ON ( + r.schema_name = c.schema_name + AND r.table_name = c.table_name + AND r.column_name = c.column_name +) + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Constant' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Constant' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_column" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name, column_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NOT NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + baseline_value = EXCLUDED.baseline_value, + threshold_value 
= EXCLUDED.threshold_value, + skip_errors = EXCLUDED.skip_errors +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/gen_funny_cat_tests/gen_Distinct_Value_Ct.sql b/testgen/template/gen_funny_cat_tests/gen_Distinct_Value_Ct.sql new file mode 100644 index 00000000..c06b458a --- /dev/null +++ b/testgen/template/gen_funny_cat_tests/gen_Distinct_Value_Ct.sql @@ -0,0 +1,113 @@ +-- FIRST TYPE OF CONSTANT IS HANDLED IN SEPARATE SQL FILE gen_standard_tests.sql using generic parameters +-- Second type: constants with changing values (1 distinct value) + +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT p.* + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID +), +all_runs AS ( + SELECT DISTINCT table_groups_id, run_date, + DENSE_RANK() OVER (PARTITION BY table_groups_id ORDER BY run_date DESC) AS run_rank + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +recent_runs AS ( + SELECT table_groups_id, run_date, run_rank + FROM all_runs + WHERE run_rank <= 5 +), +selected_columns AS ( + -- Select columns based on recent profiling results + SELECT p.schema_name, p.table_name, p.column_name, + SUM(CASE WHEN p.distinct_value_ct = 1 THEN 0 ELSE 1 END) AS always_one_val, + COUNT( + DISTINCT CASE + WHEN p.general_type = 'A' THEN p.min_text + WHEN p.general_type = 'N' THEN p.min_value::VARCHAR + WHEN p.general_type IN ('D','T') THEN p.min_date::VARCHAR + WHEN p.general_type = 'B' AND p.boolean_true_ct = p.value_ct THEN 'TRUE' + WHEN p.general_type = 'B' AND p.boolean_true_ct = 0 AND p.distinct_value_ct = 1 THEN 'FALSE' + END + ) AS 
agg_distinct_val_ct + FROM recent_runs rr + INNER JOIN profile_results p ON ( + rr.table_groups_id = p.table_groups_id + AND rr.run_date = p.run_date + ) + WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID + GROUP BY p.schema_name, p.table_name, p.column_name + HAVING SUM(CASE WHEN p.distinct_value_ct = 1 THEN 0 ELSE 1 END) = 0 + AND ( + COUNT( + DISTINCT CASE + WHEN p.general_type = 'A' THEN p.min_text + WHEN p.general_type = 'N' THEN p.min_value::VARCHAR + WHEN p.general_type IN ('D','T') THEN p.min_date::VARCHAR + WHEN p.general_type = 'B' AND p.boolean_true_ct = p.value_ct THEN 'TRUE' + WHEN p.general_type = 'B' AND p.boolean_true_ct = 0 AND p.distinct_value_ct = 1 THEN 'FALSE' + END + ) > 1 + -- Include cases with only single profiling result -- can't yet assume constant + OR COUNT(*) = 1 + ) +) +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, column_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + baseline_value_ct, threshold_value, skip_errors +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Distinct_Value_Ct' AS test_type, + r.schema_name, + r.table_name, + r.column_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + r.profile_run_id, + r.distinct_value_ct AS baseline_value_ct, + r.distinct_value_ct AS threshold_value, + 0 AS skip_errors +FROM latest_results r +-- Only insert tests for selected columns +INNER JOIN selected_columns c ON ( + r.schema_name = c.schema_name + AND r.table_name = c.table_name + AND r.column_name = c.column_name +) + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Distinct_Value_Ct' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Distinct_Value_Ct' AND generation_set = 
:GENERATION_SET) + +-- Match "uix_td_autogen_column" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name, column_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NOT NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + baseline_value_ct = EXCLUDED.baseline_value_ct, + threshold_value = EXCLUDED.threshold_value, + skip_errors = EXCLUDED.skip_errors +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/gen_funny_cat_tests/gen_test_constant.sql b/testgen/template/gen_funny_cat_tests/gen_test_constant.sql deleted file mode 100644 index 98181fb6..00000000 --- a/testgen/template/gen_funny_cat_tests/gen_test_constant.sql +++ /dev/null @@ -1,107 +0,0 @@ --- Then insert new tests where a locked test is not already present - -INSERT INTO test_definitions (table_groups_id, profile_run_id, - test_type, test_suite_id, - schema_name, table_name, column_name, skip_errors, - last_auto_gen_date, test_active, - baseline_value, threshold_value, profiling_as_of_date) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = :PROJECT_CODE - AND r.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE - GROUP BY r.table_groups_id), - curprof AS (SELECT p.* - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), - locked AS (SELECT schema_name, table_name, column_name, test_type - FROM test_definitions - WHERE 
table_groups_id = :TABLE_GROUPS_ID - AND test_suite_id = :TEST_SUITE_ID - AND lock_refresh = 'Y'), - all_runs AS ( SELECT DISTINCT p.table_groups_id, p.schema_name, p.run_date, - DENSE_RANK() OVER (PARTITION BY p.table_groups_id ORDER BY p.run_date DESC) as run_rank - FROM profile_results p - INNER JOIN test_suites ts - ON p.connection_id = ts.connection_id - AND p.project_code = ts.project_code - WHERE p.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE), - recent_runs AS (SELECT table_groups_id, schema_name, run_date, run_rank - FROM all_runs - WHERE run_rank <= 5), - rightcols as (SELECT p.schema_name, p.table_name, p.column_name, - SUM(CASE WHEN distinct_value_ct = 1 THEN 0 ELSE 1 END) as always_one_val, - COUNT(DISTINCT CASE - WHEN p.general_type = 'A' THEN min_text - WHEN p.general_type = 'N' THEN min_value::VARCHAR - WHEN p.general_type IN ('D','T') THEN min_date::VARCHAR - WHEN p.general_type = 'B' - AND boolean_true_ct = value_ct THEN 'TRUE' - WHEN p.general_type = 'B' - AND p.boolean_true_ct = 0 - AND p.distinct_value_ct = 1 THEN 'FALSE' - END ) as agg_distinct_val_ct - FROM recent_runs rr - INNER JOIN profile_results p - ON (rr.table_groups_id = p.table_groups_id - AND rr.run_date = p.run_date) - -- No Dates as constants - WHERE NOT (p.general_type = 'D' AND rr.run_rank = 1) - GROUP BY p.schema_name, p.table_name, p.column_name - HAVING SUM(CASE WHEN distinct_value_ct = 1 THEN 0 ELSE 1 END) = 0 - AND SUM(CASE WHEN max_length < 100 THEN 0 ELSE 1 END) = 0 - AND COUNT(DISTINCT CASE - WHEN p.general_type = 'A' THEN min_text - WHEN p.general_type = 'N' THEN min_value::VARCHAR - WHEN p.general_type IN ('D','T') THEN min_date::VARCHAR - WHEN p.general_type = 'B' - AND boolean_true_ct = value_ct THEN 'TRUE' - WHEN p.general_type = 'B' - AND p.boolean_true_ct = 0 - AND p.distinct_value_ct = 1 THEN 'FALSE' - END ) = 1 - -- Only constant if more than one profiling result - AND COUNT(*) > 1), -newtests AS ( SELECT 
'Constant'::VARCHAR AS test_type, - :TEST_SUITE_ID ::UUID AS test_suite_id, - c.profile_run_id, - c.schema_name, c.table_name, c.column_name, - c.run_date AS last_run_date, - case when general_type='A' then fn_quote_literal_escape(min_text, :SQL_FLAVOR)::VARCHAR - when general_type='D' then fn_quote_literal_escape(min_date :: VARCHAR, :SQL_FLAVOR)::VARCHAR - when general_type='N' then min_value::VARCHAR - when general_type='B' and boolean_true_ct = 0 then 'FALSE'::VARCHAR - when general_type='B' and boolean_true_ct > 0 then 'TRUE'::VARCHAR - end as baseline_value - FROM curprof c - INNER JOIN rightcols r - ON (c.schema_name = r.schema_name - AND c.table_name = r.table_name - AND c.column_name = r.column_name) - LEFT JOIN generation_sets s - ON ('Constant' = s.test_type - AND :GENERATION_SET = s.generation_set) - WHERE (s.generation_set IS NOT NULL - OR :GENERATION_SET = '') ) -SELECT :TABLE_GROUPS_ID as table_groups_id, n.profile_run_id, - n.test_type, n.test_suite_id, n.schema_name, n.table_name, n.column_name, - 0 as skip_errors, :RUN_DATE ::TIMESTAMP as auto_gen_date, - 'Y' as test_active, COALESCE(baseline_value, '') as baseline_value, - '0' as threshold_value, :AS_OF_DATE ::TIMESTAMP - FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name - AND n.column_name = l.column_name - AND n.test_type = l.test_type) - WHERE l.test_type IS NULL; diff --git a/testgen/template/gen_funny_cat_tests/gen_test_distinct_value_ct.sql b/testgen/template/gen_funny_cat_tests/gen_test_distinct_value_ct.sql deleted file mode 100644 index 350e1048..00000000 --- a/testgen/template/gen_funny_cat_tests/gen_test_distinct_value_ct.sql +++ /dev/null @@ -1,99 +0,0 @@ --- FIRST TYPE OF CONSTANT IS HANDLED IN SEPARATE SQL FILE gen_standard_tests.sql using generic parameters --- Second type: constants with changing values (1 distinct value) -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, 
table_name, column_name, skip_errors, - last_auto_gen_date, test_active, - baseline_value_ct, threshold_value, profiling_as_of_date) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = :PROJECT_CODE - AND r.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE - GROUP BY r.table_groups_id), - curprof AS (SELECT p.* - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), - locked AS (SELECT schema_name, table_name, column_name, test_type - FROM test_definitions - WHERE table_groups_id = :TABLE_GROUPS_ID - AND test_suite_id = :TEST_SUITE_ID - AND lock_refresh = 'Y'), - all_runs AS ( SELECT DISTINCT p.table_groups_id, p.schema_name, p.run_date, - DENSE_RANK() OVER (PARTITION BY p.table_groups_id ORDER BY p.run_date DESC) as run_rank - FROM profile_results p - INNER JOIN test_suites ts - ON p.connection_id = ts.connection_id - AND p.project_code = ts.project_code - WHERE p.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE), - recent_runs AS (SELECT table_groups_id, schema_name, run_date, run_rank - FROM all_runs - WHERE run_rank <= 5), - rightcols as (SELECT p.schema_name, p.table_name, p.column_name, - SUM(CASE WHEN distinct_value_ct = 1 THEN 0 ELSE 1 END) as always_one_val, - COUNT(DISTINCT CASE - WHEN p.general_type = 'A' THEN min_text - WHEN p.general_type = 'N' THEN min_value::VARCHAR - WHEN p.general_type IN ('D','T') THEN min_date::VARCHAR - WHEN p.general_type = 'B' - AND boolean_true_ct = value_ct THEN 'TRUE' - WHEN p.general_type = 'B' - AND p.boolean_true_ct = 0 - AND p.distinct_value_ct = 1 THEN 'FALSE' - END ) as agg_distinct_val_ct - FROM 
recent_runs rr - INNER JOIN profile_results p - ON (rr.table_groups_id = p.table_groups_id - AND rr.run_date = p.run_date) - GROUP BY p.schema_name, p.table_name, p.column_name - HAVING SUM(CASE WHEN distinct_value_ct = 1 THEN 0 ELSE 1 END) = 0 - AND (COUNT(DISTINCT CASE - WHEN p.general_type = 'A' THEN min_text - WHEN p.general_type = 'N' THEN min_value::VARCHAR - WHEN p.general_type IN ('D','T') THEN min_date::VARCHAR - WHEN p.general_type = 'B' - AND boolean_true_ct = value_ct THEN 'TRUE' - WHEN p.general_type = 'B' - AND p.boolean_true_ct = 0 - AND p.distinct_value_ct = 1 THEN 'FALSE' - END ) > 1 - -- include cases with only single profiling result -- can't yet assume constant - OR COUNT(*) = 1)), -newtests AS ( SELECT 'Distinct_Value_Ct'::VARCHAR AS test_type, - :TEST_SUITE_ID ::UUID AS test_suite_id, - c.table_groups_id, c.profile_run_id, - c.schema_name, c.table_name, c.column_name, - c.run_date AS last_run_date, - c.distinct_value_ct - FROM curprof c - INNER JOIN rightcols r - ON (c.schema_name = r.schema_name - AND c.table_name = r.table_name - AND c.column_name = r.column_name) - LEFT JOIN generation_sets s - ON ('Distinct_Value_Ct' = s.test_type - AND :GENERATION_SET = s.generation_set) - WHERE (s.generation_set IS NOT NULL - OR :GENERATION_SET = '') ) -SELECT n.table_groups_id, n.profile_run_id, - n.test_type, n.test_suite_id, - n.schema_name, n.table_name, n.column_name, 0 as skip_errors, - :RUN_DATE ::TIMESTAMP as last_auto_gen_date, 'Y' as test_active, - distinct_value_ct as baseline_value_ct, distinct_value_ct as threshold_value, - :AS_OF_DATE ::TIMESTAMP as profiling_as_of_date - FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name - AND n.column_name = l.column_name - AND n.test_type = l.test_type) - WHERE l.test_type IS NULL; diff --git a/testgen/template/gen_funny_cat_tests/gen_test_row_ct.sql b/testgen/template/gen_funny_cat_tests/gen_test_row_ct.sql deleted file mode 100644 index 
c1e4578f..00000000 --- a/testgen/template/gen_funny_cat_tests/gen_test_row_ct.sql +++ /dev/null @@ -1,56 +0,0 @@ --- Insert new tests where a locked test is not already present -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, - skip_errors, threshold_value, - last_auto_gen_date, test_active, baseline_ct, profiling_as_of_date) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = :PROJECT_CODE - AND r.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE - GROUP BY r.table_groups_id), - curprof AS (SELECT p.* - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), - locked AS (SELECT schema_name, table_name, column_name, test_type - FROM test_definitions - WHERE table_groups_id = :TABLE_GROUPS_ID - AND test_suite_id = :TEST_SUITE_ID - AND lock_refresh = 'Y'), - newtests AS (SELECT table_groups_id, profile_run_id, - 'Row_Ct' AS test_type, - :TEST_SUITE_ID ::UUID AS test_suite_id, - schema_name, - table_name, - MAX(record_ct) as record_ct - FROM curprof c - LEFT JOIN generation_sets s - ON ('Row_Ct' = s.test_type - AND :GENERATION_SET = s.generation_set) - WHERE schema_name = :DATA_SCHEMA - AND functional_table_type LIKE '%cumulative%' - AND (s.generation_set IS NOT NULL - OR :GENERATION_SET = '') - GROUP BY project_code, table_groups_id, profile_run_id, - test_type, test_suite_id, schema_name, table_name ) -SELECT n.table_groups_id, n.profile_run_id, - n.test_type, n.test_suite_id, - n.schema_name, n.table_name, - 0 as skip_errors, record_ct AS threshold_value, - :RUN_DATE ::TIMESTAMP as last_auto_gen_date, - 'Y' as test_active, record_ct 
as baseline_ct, - :AS_OF_DATE ::TIMESTAMP as profiling_as_of_date -FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name - AND n.test_type = l.test_type) -WHERE l.test_type IS NULL; diff --git a/testgen/template/gen_funny_cat_tests/gen_test_row_ct_pct.sql b/testgen/template/gen_funny_cat_tests/gen_test_row_ct_pct.sql deleted file mode 100644 index 656ad687..00000000 --- a/testgen/template/gen_funny_cat_tests/gen_test_row_ct_pct.sql +++ /dev/null @@ -1,59 +0,0 @@ --- Insert new tests where a locked test is not already present -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, skip_errors, - last_auto_gen_date, profiling_as_of_date, test_active, - baseline_ct, threshold_value) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = :PROJECT_CODE - AND r.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE - GROUP BY r.table_groups_id), - curprof AS (SELECT p.* - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), - locked AS (SELECT schema_name, table_name, column_name, test_type - FROM test_definitions - WHERE table_groups_id = :TABLE_GROUPS_ID - AND test_suite_id = :TEST_SUITE_ID - AND lock_refresh = 'Y'), - newtests AS ( - SELECT table_groups_id, - profile_run_id, - 'Row_Ct_Pct' AS test_type, - :TEST_SUITE_ID ::UUID AS test_suite_id, - schema_name, - table_name, - MAX(record_ct) as record_ct - FROM curprof - LEFT JOIN generation_sets s - ON ('Row_Ct_Pct' = s.test_type - AND :GENERATION_SET = s.generation_set) - WHERE schema_name = :DATA_SCHEMA - AND functional_table_type NOT ILIKE 
'%cumulative%' - AND (s.generation_set IS NOT NULL - OR :GENERATION_SET = '') - GROUP BY project_code, table_groups_id, profile_run_id, - test_type, test_suite_id, schema_name, table_name - HAVING MAX(record_ct) >= 500) -SELECT n.table_groups_id, n.profile_run_id, - n.test_type, n.test_suite_id, - n.schema_name, n.table_name, 0 as skip_errors, - :RUN_DATE ::TIMESTAMP as last_auto_gen_date, - :AS_OF_DATE ::TIMESTAMP as profiling_as_of_date, - 'Y' as test_active, - record_ct as baseline_ct, 10 AS threshold_value - FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name - AND n.test_type = l.test_type) -WHERE l.test_type IS NULL; diff --git a/testgen/template/gen_query_tests/gen_Dupe_Rows.sql b/testgen/template/gen_query_tests/gen_Dupe_Rows.sql new file mode 100644 index 00000000..e1f98e0c --- /dev/null +++ b/testgen/template/gen_query_tests/gen_Dupe_Rows.sql @@ -0,0 +1,53 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + STRING_AGG(:QUOTE || column_name || :QUOTE, ', ' ORDER BY position) AS groupby_names + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + GROUP BY profile_run_id, schema_name, table_name +) +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + groupby_names, skip_errors +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Dupe_Rows' AS test_type, + s.schema_name, + s.table_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS 
profiling_as_of_date, + s.profile_run_id, + s.groupby_names, + 0 AS skip_errors +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Dupe_Rows' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Dupe_Rows' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + groupby_names = EXCLUDED.groupby_names, + skip_errors = EXCLUDED.skip_errors +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/gen_query_tests/gen_Freshness_Trend.sql b/testgen/template/gen_query_tests/gen_Freshness_Trend.sql new file mode 100644 index 00000000..19c75fd6 --- /dev/null +++ b/testgen/template/gen_query_tests/gen_Freshness_Trend.sql @@ -0,0 +1,202 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT p.profile_run_id, p.schema_name, p.table_name, p.column_name, + p.functional_data_type, p.general_type, + p.distinct_value_ct, p.record_ct, p.null_value_ct, + p.max_value, p.min_value, p.avg_value, p.stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + INNER JOIN data_table_chars dtc ON ( + dtc.table_groups_id = p.table_groups_id + AND dtc.schema_name = p.schema_name + AND dtc.table_name = p.table_name + -- Ignore dropped tables + AND dtc.drop_date IS NULL + ) + WHERE 
p.table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( + functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' + ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / 
LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT *, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + STRING_AGG(column_name, ',' ORDER BY element_type, fingerprint_order, column_name) AS column_names, + 'COUNT(*)::VARCHAR || ''|'' || ' || + STRING_AGG( + 
REPLACE( + CASE + WHEN general_type = 'D' THEN 'MIN(@@@)::VARCHAR || ''|'' || MAX(@@@::VARCHAR) || ''|'' || COUNT(DISTINCT @@@)::VARCHAR' + WHEN general_type = 'A' THEN 'MIN(@@@)::VARCHAR || ''|'' || MAX(@@@::VARCHAR) || ''|'' || COUNT(DISTINCT @@@)::VARCHAR || ''|'' || SUM(LENGTH(@@@))::VARCHAR' + WHEN general_type = 'N' THEN 'COUNT(@@@)::VARCHAR || ''|'' || + COUNT(DISTINCT MOD((COALESCE(@@@,0)::DECIMAL(38,6) * 1000000)::DECIMAL(38,0), 1000003))::VARCHAR || ''|'' || + COALESCE((MIN(@@@)::DECIMAL(38,6))::VARCHAR, '''') || ''|'' || + COALESCE((MAX(@@@)::DECIMAL(38,6))::VARCHAR, '''') || ''|'' || + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000007)), 0), 1000000007)::VARCHAR, '''') || ''|'' || + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000009)), 0), 1000000009)::VARCHAR, '''')' + END, + '@@@', '"' || column_name || '"' + ), + ' || ''|'' || ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, groupby_names, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Freshness_Trend' AS test_type, + s.schema_name, + s.table_name, + s.column_names AS groupby_names, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'PREDICT' AS history_calculation, + NULL AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Freshness_Trend' AND active = 'Y') + -- Only 
insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Freshness_Trend' AND generation_set = :GENERATION_SET) + {TABLE_FILTER} + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + groupby_names = EXCLUDED.groupby_names, + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N' + -- Don't update existing tests in "insert" mode + AND NOT COALESCE(:INSERT_ONLY, FALSE); diff --git a/testgen/template/gen_query_tests/gen_Schema_Drift.sql b/testgen/template/gen_query_tests/gen_Schema_Drift.sql new file mode 100644 index 00000000..903a29f6 --- /dev/null +++ b/testgen/template/gen_query_tests/gen_Schema_Drift.sql @@ -0,0 +1,30 @@ +-- Insert test for current schema +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, + test_active, last_auto_gen_date +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Schema_Drift' AS test_type, + :DATA_SCHEMA AS schema_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Schema_Drift' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Schema_Drift' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_schema" unique index exactly 
+ON CONFLICT (test_suite_id, test_type, schema_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NULL + AND column_name IS NULL + +-- Update test if it already exists +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/gen_query_tests/gen_Table_Freshness.sql b/testgen/template/gen_query_tests/gen_Table_Freshness.sql new file mode 100644 index 00000000..bb66dca9 --- /dev/null +++ b/testgen/template/gen_query_tests/gen_Table_Freshness.sql @@ -0,0 +1,189 @@ +WITH latest_run AS ( + -- Latest complete profiling run before as-of-date + SELECT MAX(run_date) AS last_run_date + FROM profile_results + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID + AND run_date::DATE <= :AS_OF_DATE ::DATE +), +latest_results AS ( + -- Column results for latest run + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, + distinct_value_ct, record_ct, null_value_ct, + max_value, min_value, avg_value, stdev_value + FROM profile_results p + INNER JOIN latest_run lr ON p.run_date = lr.last_run_date + WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID +), +-- IDs - TOP 2 +id_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE + WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 + WHEN functional_data_type = 'ID-Secondary' THEN 2 + ELSE 3 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'ID%' +), +-- Process Date - TOP 1 +process_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY + CASE 
+ WHEN column_name ILIKE '%mod%' THEN 1 + WHEN column_name ILIKE '%up%' THEN 1 + WHEN column_name ILIKE '%cr%' THEN 2 + WHEN column_name ILIKE '%in%' THEN 2 + END, distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND functional_data_type ILIKE 'process%' +), +-- Transaction Date - TOP 1 +tran_date_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, distinct_value_ct, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY distinct_value_ct DESC, column_name + ) AS rank + FROM latest_results + WHERE general_type IN ('A', 'D', 'N') + AND ( + functional_data_type ILIKE 'transactional date%' + OR functional_data_type ILIKE 'period%' + OR functional_data_type = 'timestamp' + ) +), +-- Numeric Measures +numeric_cols AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + functional_data_type, general_type, +/* + -- Subscores + distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, + (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, + LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, + stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, + 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, +*/ + -- Weighted score + ( + 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + + 0.15 * ((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + + 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + + 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) + ) AS change_detection_score + FROM latest_results + WHERE general_type = 'N' + AND ( + functional_data_type ILIKE 'Measure%' + OR functional_data_type IN ('Sequence', 'Constant') + ) +), +numeric_cols_ranked AS ( + SELECT 
*, + ROW_NUMBER() OVER ( + PARTITION BY schema_name, table_name + ORDER BY change_detection_score DESC, column_name + ) AS rank + FROM numeric_cols + WHERE change_detection_score IS NOT NULL +), +combined AS ( + SELECT profile_run_id, schema_name, table_name, column_name, + 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order + FROM id_cols + WHERE rank <= 2 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order + FROM process_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order + FROM tran_date_cols + WHERE rank = 1 + UNION ALL + SELECT profile_run_id, schema_name, table_name, column_name, + 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order + FROM numeric_cols_ranked + WHERE rank = 1 +), +selected_tables AS ( + SELECT profile_run_id, schema_name, table_name, + 'COUNT(*)::VARCHAR || ''|'' || ' || + STRING_AGG( + REPLACE( + CASE + WHEN general_type = 'D' THEN 'MIN(@@@)::VARCHAR || ''|'' || MAX(@@@::VARCHAR) || ''|'' || COUNT(DISTINCT @@@)::VARCHAR' + WHEN general_type = 'A' THEN 'MIN(@@@)::VARCHAR || ''|'' || MAX(@@@::VARCHAR) || ''|'' || COUNT(DISTINCT @@@)::VARCHAR || ''|'' || SUM(LENGTH(@@@))::VARCHAR' + WHEN general_type = 'N' THEN 'COUNT(@@@)::VARCHAR || ''|'' || + COUNT(DISTINCT MOD((COALESCE(@@@,0)::DECIMAL(38,6) * 1000000)::DECIMAL(38,0), 1000003))::VARCHAR || ''|'' || + COALESCE((MIN(@@@)::DECIMAL(38,6))::VARCHAR, '''') || ''|'' || + COALESCE((MAX(@@@)::DECIMAL(38,6))::VARCHAR, '''') || ''|'' || + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000007)), 0), 1000000007)::VARCHAR, '''') || ''|'' || + COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL(38,6), 1000000009)), 0), 1000000009)::VARCHAR, '''')' + END, + '@@@', '"' || 
column_name || '"' + ), + ' || ''|'' || ' + ORDER BY element_type, fingerprint_order, column_name + ) AS fingerprint + FROM combined + GROUP BY profile_run_id, schema_name, table_name +) +-- Insert tests for selected tables +INSERT INTO test_definitions ( + table_groups_id, test_suite_id, test_type, + schema_name, table_name, + test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, + history_calculation, history_lookback, custom_query +) +SELECT + :TABLE_GROUPS_ID ::UUID AS table_groups_id, + :TEST_SUITE_ID ::UUID AS test_suite_id, + 'Table_Freshness' AS test_type, + s.schema_name, + s.table_name, + 'Y' AS test_active, + :RUN_DATE ::TIMESTAMP AS last_auto_gen_date, + :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date, + s.profile_run_id, + 'Value' AS history_calculation, + 1 AS history_lookback, + s.fingerprint AS custom_query +FROM selected_tables s + -- Only insert if test type is active +WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Table_Freshness' AND active = 'Y') + -- Only insert if test type is included in generation set + AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Table_Freshness' AND generation_set = :GENERATION_SET) + +-- Match "uix_td_autogen_table" unique index exactly +ON CONFLICT (test_suite_id, test_type, schema_name, table_name) +WHERE last_auto_gen_date IS NOT NULL + AND table_name IS NOT NULL + AND column_name IS NULL + +-- Update tests if they already exist +DO UPDATE SET + test_active = EXCLUDED.test_active, + last_auto_gen_date = EXCLUDED.last_auto_gen_date, + profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + history_calculation = EXCLUDED.history_calculation, + history_lookback = EXCLUDED.history_lookback, + custom_query = EXCLUDED.custom_query +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/gen_query_tests/gen_Volume_Trend.sql b/testgen/template/gen_query_tests/gen_Volume_Trend.sql new file mode 
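The generated Table_Freshness query above concatenates COUNT(*) with per-column aggregates (min, max, distinct count, hashed sums for numerics), all joined by `|` separators, so that any content change in the table alters the stored fingerprint string. A minimal Python sketch of that idea (illustrative only; `column_fingerprint` and `table_fingerprint` are hypothetical names, not TestGen code):

```python
# Sketch of the '|'-joined fingerprint concept used by the Table_Freshness template.
# Assumption: rows are dicts; the real template emits SQL aggregates instead.

def column_fingerprint(values):
    """Summarize one column as count|min|max|distinct, mirroring the SQL aggregates."""
    non_null = [v for v in values if v is not None]
    return "|".join([
        str(len(non_null)),
        str(min(non_null)) if non_null else "",
        str(max(non_null)) if non_null else "",
        str(len(set(non_null))),
    ])

def table_fingerprint(rows, columns):
    """Join COUNT(*) and each column's summary; any data change alters the string."""
    parts = [str(len(rows))]  # COUNT(*) comes first, as in the generated query
    for col in columns:
        parts.append(column_fingerprint([r.get(col) for r in rows]))
    return "|".join(parts)
```

Comparing the fingerprint produced by one run with the stored value from the previous run is then enough to flag that the table changed, without retaining the data itself.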
100644
index 00000000..e4c0cf4c
--- /dev/null
+++ b/testgen/template/gen_query_tests/gen_Volume_Trend.sql
@@ -0,0 +1,47 @@
+-- Insert tests for current tables
+INSERT INTO test_definitions (
+    table_groups_id, test_suite_id, test_type,
+    schema_name, table_name,
+    test_active, last_auto_gen_date,
+    history_calculation, history_lookback, subset_condition, custom_query
+)
+SELECT
+    :TABLE_GROUPS_ID ::UUID AS table_groups_id,
+    :TEST_SUITE_ID ::UUID AS test_suite_id,
+    'Volume_Trend' AS test_type,
+    c.schema_name,
+    c.table_name,
+    'Y' AS test_active,
+    :RUN_DATE ::TIMESTAMP AS last_auto_gen_date,
+    'PREDICT' AS history_calculation,
+    NULL AS history_lookback,
+    NULL AS subset_condition,
+    'COUNT(CASE WHEN {SUBSET_CONDITION} THEN 1 END)' AS custom_query
+FROM data_table_chars c
+WHERE c.table_groups_id = :TABLE_GROUPS_ID ::UUID
+    -- Ignore dropped tables
+    AND c.drop_date IS NULL
+    -- Only insert if test type is active
+    AND EXISTS (SELECT 1 FROM test_types WHERE test_type = 'Volume_Trend' AND active = 'Y')
+    -- Only insert if test type is included in generation set
+    AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = 'Volume_Trend' AND generation_set = :GENERATION_SET)
+    {TABLE_FILTER}
+
+-- Match "uix_td_autogen_table" unique index exactly
+ON CONFLICT (test_suite_id, test_type, schema_name, table_name)
+WHERE last_auto_gen_date IS NOT NULL
+    AND table_name IS NOT NULL
+    AND column_name IS NULL
+
+-- Update tests if they already exist
+DO UPDATE SET
+    test_active = EXCLUDED.test_active,
+    last_auto_gen_date = EXCLUDED.last_auto_gen_date,
+    history_calculation = EXCLUDED.history_calculation,
+    history_lookback = EXCLUDED.history_lookback,
+    subset_condition = EXCLUDED.subset_condition,
+    custom_query = EXCLUDED.custom_query
+-- Ignore locked tests
+WHERE test_definitions.lock_refresh = 'N'
+    -- Don't update existing tests in "insert" mode
+    AND NOT COALESCE(:INSERT_ONLY, FALSE);
diff --git a/testgen/template/gen_query_tests/gen_dupe_rows_test.sql
b/testgen/template/gen_query_tests/gen_dupe_rows_test.sql deleted file mode 100644 index 3e75460c..00000000 --- a/testgen/template/gen_query_tests/gen_dupe_rows_test.sql +++ /dev/null @@ -1,54 +0,0 @@ -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date, - groupby_names ) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = '{PROJECT_CODE}' - AND r.table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND ts.id = '{TEST_SUITE_ID}' - AND p.run_date::DATE <= '{AS_OF_DATE}' - GROUP BY r.table_groups_id), - curprof AS (SELECT p.schema_name, p.table_name, p.profile_run_id, - STRING_AGG('{QUOTE}' || p.column_name || '{QUOTE}', ', ' ORDER BY p.position) as unique_by_columns - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) - GROUP BY p.schema_name, p.table_name, p.profile_run_id), - locked AS (SELECT schema_name, table_name - FROM test_definitions - WHERE table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND test_suite_id = '{TEST_SUITE_ID}' - AND test_type = 'Dupe_Rows' - AND lock_refresh = 'Y'), - newtests AS (SELECT * - FROM curprof p - INNER JOIN test_types t - ON ('Dupe_Rows' = t.test_type - AND 'Y' = t.active) - LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND '{GENERATION_SET}' = s.generation_set) - WHERE p.schema_name = '{DATA_SCHEMA}' - AND (s.generation_set IS NOT NULL - OR '{GENERATION_SET}' = '') ) -SELECT '{TABLE_GROUPS_ID}'::UUID as table_groups_id, - n.profile_run_id, - 'Dupe_Rows' AS test_type, - '{TEST_SUITE_ID}' AS test_suite_id, - n.schema_name, n.table_name, - 0 as skip_errors, 'Y' as test_active, - 
'{RUN_DATE}'::TIMESTAMP as last_auto_gen_date, - '{AS_OF_DATE}'::TIMESTAMP as profiling_as_of_date, - unique_by_columns as groupby_columns -FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name) -WHERE l.schema_name IS NULL; diff --git a/testgen/template/gen_query_tests/gen_schema_drift_tests.sql b/testgen/template/gen_query_tests/gen_schema_drift_tests.sql deleted file mode 100644 index 02817428..00000000 --- a/testgen/template/gen_query_tests/gen_schema_drift_tests.sql +++ /dev/null @@ -1,44 +0,0 @@ -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date, p.schema_name, p.profile_run_id - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = '{PROJECT_CODE}' - AND r.table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND ts.id = '{TEST_SUITE_ID}' - AND p.run_date::DATE <= '{AS_OF_DATE}' - GROUP BY r.table_groups_id, p.schema_name, p.profile_run_id), - locked AS (SELECT schema_name - FROM test_definitions - WHERE table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND test_suite_id = '{TEST_SUITE_ID}' - AND test_type = 'Schema_Drift' - AND lock_refresh = 'Y'), - newtests AS (SELECT * - FROM last_run lr - INNER JOIN test_types t - ON ('Schema_Drift' = t.test_type - AND 'Y' = t.active) - LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND '{GENERATION_SET}' = s.generation_set) - WHERE lr.schema_name = '{DATA_SCHEMA}' - AND (s.generation_set IS NOT NULL - OR '{GENERATION_SET}' = '') ) -SELECT '{TABLE_GROUPS_ID}'::UUID as table_groups_id, - n.profile_run_id, - 'Schema_Drift' AS test_type, - '{TEST_SUITE_ID}' AS test_suite_id, - n.schema_name, - 0 as skip_errors, 
'Y' as test_active, - '{RUN_DATE}'::TIMESTAMP as last_auto_gen_date, - '{AS_OF_DATE}'::TIMESTAMP as profiling_as_of_date -FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name) -WHERE l.schema_name IS NULL; diff --git a/testgen/template/gen_query_tests/gen_table_changed_test.sql b/testgen/template/gen_query_tests/gen_table_changed_test.sql deleted file mode 100644 index 4c578f13..00000000 --- a/testgen/template/gen_query_tests/gen_table_changed_test.sql +++ /dev/null @@ -1,162 +0,0 @@ -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date, - lock_refresh, history_calculation, history_lookback, custom_query ) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = '{PROJECT_CODE}' - AND r.table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND ts.id = '{TEST_SUITE_ID}' - AND p.run_date::DATE <= '{AS_OF_DATE}' - GROUP BY r.table_groups_id), -curprof AS (SELECT p.profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, record_ct, max_value, min_value, avg_value, stdev_value, null_value_ct - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), -locked AS (SELECT schema_name, table_name - FROM test_definitions - WHERE table_groups_id = '{TABLE_GROUPS_ID}'::UUID - AND test_suite_id = '{TEST_SUITE_ID}' - AND test_type = 'Table_Freshness' - AND lock_refresh = 'Y'), --- IDs - TOP 2 -id_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, 
table_name - ORDER BY - CASE - WHEN functional_data_type ILIKE 'ID-Unique%' THEN 1 - WHEN functional_data_type = 'ID-Secondary' THEN 2 - ELSE 3 - END, distinct_value_ct, column_name DESC) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'ID%'), --- Process Date - TOP 1 -process_date_cols - AS (SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - CASE - WHEN column_name ILIKE '%mod%' THEN 1 - WHEN column_name ILIKE '%up%' THEN 1 - WHEN column_name ILIKE '%cr%' THEN 2 - WHEN column_name ILIKE '%in%' THEN 2 - END , distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'process%'), --- Transaction Date - TOP 1 -tran_date_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, - distinct_value_ct, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY - distinct_value_ct DESC, column_name) AS rank - FROM curprof - WHERE general_type IN ('A', 'D', 'N') - AND functional_data_type ILIKE 'transactional date%' - OR functional_data_type ILIKE 'period%' - OR functional_data_type = 'timestamp' ), - --- Numeric Measures -numeric_cols - AS ( SELECT profile_run_id, schema_name, table_name, column_name, functional_data_type, general_type, -/* - -- Subscores - distinct_value_ct * 1.0 / NULLIF(record_ct, 0) AS cardinality_score, - (max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS range_score, - LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2)) AS nontriviality_score, - stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1) AS variability_score, - 1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1)) AS null_penalty, -*/ - -- Weighted score - ( - 0.25 * (distinct_value_ct * 1.0 / NULLIF(record_ct, 0)) + - 0.15 * 
((max_value - min_value) / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (LEAST(1, LOG(GREATEST(distinct_value_ct, 2))) / LOG(GREATEST(record_ct, 2))) + - 0.40 * (stdev_value / NULLIF(ABS(NULLIF(avg_value, 0)), 1)) + - 0.10 * (1.0 - (null_value_ct * 1.0 / NULLIF(NULLIF(record_ct, 0), 1))) - ) AS change_detection_score - FROM curprof - WHERE general_type = 'N' - AND (functional_data_type ILIKE 'Measure%' OR functional_data_type IN ('Sequence', 'Constant')) - ), -numeric_cols_ranked - AS ( SELECT *, - ROW_NUMBER() OVER (PARTITION BY schema_name, table_name - ORDER BY change_detection_score DESC, column_name) as rank - FROM numeric_cols - WHERE change_detection_score IS NOT NULL), -combined - AS ( SELECT profile_run_id, schema_name, table_name, column_name, 'ID' AS element_type, general_type, 10 + rank AS fingerprint_order - FROM id_cols - WHERE rank <= 2 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_P' AS element_type, general_type, 20 + rank AS fingerprint_order - FROM process_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'DATE_T' AS element_type, general_type, 30 + rank AS fingerprint_order - FROM tran_date_cols - WHERE rank = 1 - UNION ALL - SELECT profile_run_id, schema_name, table_name, column_name, 'MEAS' AS element_type, general_type, 40 + rank AS fingerprint_order - FROM numeric_cols_ranked - WHERE rank = 1 ), -newtests - AS (SELECT profile_run_id, schema_name, table_name, - 'COUNT(*)::VARCHAR || ''|'' || ' || - STRING_AGG( - REPLACE( - CASE - WHEN general_type = 'D' THEN 'MIN(@@@)::VARCHAR || ''|'' || MAX(@@@::VARCHAR) || ''|'' || COUNT(DISTINCT @@@)::VARCHAR' - WHEN general_type = 'A' THEN 'MIN(@@@)::VARCHAR || ''|'' || MAX(@@@::VARCHAR) || ''|'' || COUNT(DISTINCT @@@)::VARCHAR || ''|'' || SUM(LENGTH(@@@))::VARCHAR' - WHEN general_type = 'N' THEN 'COUNT(@@@)::VARCHAR || ''|'' || - COUNT(DISTINCT MOD((COALESCE(@@@,0)::DECIMAL(38,6) * 1000000)::DECIMAL(38,0), 
1000003))::VARCHAR || ''|'' || - COALESCE((MIN(@@@)::DECIMAL(38,6))::VARCHAR, '''') || ''|'' || - COALESCE((MAX(@@@)::DECIMAL(38,6))::VARCHAR, '''') || ''|'' || - COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL, 1000000007)), 0), 1000000007)::VARCHAR, '''') || ''|'' || - COALESCE(MOD(COALESCE(SUM(MOD((ABS(COALESCE(@@@,0))::DECIMAL(38,6) * 1000000)::DECIMAL, 1000000009)), 0), 1000000009)::VARCHAR, '''')' - END, - '@@@', '"' || column_name || '"'), - ' || ''|'' || ' - ORDER BY element_type, fingerprint_order, column_name) as fingerprint - FROM combined - GROUP BY profile_run_id, schema_name, table_name) -SELECT '{TABLE_GROUPS_ID}'::UUID as table_groups_id, - n.profile_run_id, - 'Table_Freshness' AS test_type, - '{TEST_SUITE_ID}' AS test_suite_id, - n.schema_name, n.table_name, - 0 as skip_errors, 'Y' as test_active, - - '{RUN_DATE}'::TIMESTAMP as last_auto_gen_date, - '{AS_OF_DATE}'::TIMESTAMP as profiling_as_of_date, - 'N' as lock_refresh, - 'Value' as history_calculation, - 1 as history_lookback, - fingerprint as custom_query -FROM newtests n -INNER JOIN test_types t - ON ('Table_Freshness' = t.test_type - AND 'Y' = t.active) -LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND '{GENERATION_SET}' = s.generation_set) -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name) -WHERE (s.generation_set IS NOT NULL - OR '{GENERATION_SET}' = '') - AND l.schema_name IS NULL; - diff --git a/testgen/template/generation/delete_stale_autogen_tests.sql b/testgen/template/generation/delete_stale_autogen_tests.sql new file mode 100644 index 00000000..570d9a05 --- /dev/null +++ b/testgen/template/generation/delete_stale_autogen_tests.sql @@ -0,0 +1,10 @@ +DELETE FROM test_definitions +WHERE test_suite_id = :TEST_SUITE_ID ::UUID + -- Delete old autogenerated tests + AND last_auto_gen_date < :RUN_DATE + -- Ignore manual tests + AND last_auto_gen_date IS NOT NULL + -- Ignore locked tests + AND 
lock_refresh = 'N'
+    -- Filter by test types if specified (NULL = no filter)
+    AND (:TEST_TYPES_FILTER IS NULL OR test_type = ANY(:TEST_TYPES_FILTER));
diff --git a/testgen/template/generation/delete_stale_monitors.sql b/testgen/template/generation/delete_stale_monitors.sql
new file mode 100644
index 00000000..86e11e1f
--- /dev/null
+++ b/testgen/template/generation/delete_stale_monitors.sql
@@ -0,0 +1,12 @@
+-- Deletes all monitors for dropped tables, including manual and locked ones
+DELETE FROM test_definitions td
+WHERE td.test_suite_id = :TEST_SUITE_ID ::UUID
+    -- Filter by test types if specified (NULL = no filter)
+    AND (:TEST_TYPES_FILTER IS NULL OR td.test_type = ANY(:TEST_TYPES_FILTER))
+    AND EXISTS (
+        SELECT 1 FROM data_table_chars dtc
+        WHERE dtc.table_groups_id = td.table_groups_id
+            AND dtc.schema_name = td.schema_name
+            AND dtc.table_name = td.table_name
+            AND dtc.drop_date IS NOT NULL
+    );
diff --git a/testgen/template/generation/gen_delete_old_tests.sql b/testgen/template/generation/gen_delete_old_tests.sql
deleted file mode 100644
index 0aeeec7d..00000000
--- a/testgen/template/generation/gen_delete_old_tests.sql
+++ /dev/null
@@ -1,5 +0,0 @@
-DELETE FROM test_definitions
- WHERE table_groups_id = :TABLE_GROUPS_ID
-   AND test_suite_id = :TEST_SUITE_ID
-   AND last_auto_gen_date IS NOT NULL
-   AND COALESCE(lock_refresh, 'N') <> 'Y';
diff --git a/testgen/template/generation/gen_insert_test_suite.sql b/testgen/template/generation/gen_insert_test_suite.sql
deleted file mode 100644
index c070f65b..00000000
--- a/testgen/template/generation/gen_insert_test_suite.sql
+++ /dev/null
@@ -1,6 +0,0 @@
-INSERT INTO test_suites
-    (project_code, test_suite, connection_id, table_groups_id, test_suite_description,
-     component_type, component_key)
-VALUES (:PROJECT_CODE, :TEST_SUITE, :CONNECTION_ID, :TABLE_GROUPS_ID, :TEST_SUITE || ' Test Suite',
-        'dataset', :TEST_SUITE)
-RETURNING id::VARCHAR;
diff --git a/testgen/template/generation/gen_selection_tests.sql
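Both deletion templates above use the same optional-filter idiom: `(:TEST_TYPES_FILTER IS NULL OR test_type = ANY(:TEST_TYPES_FILTER))`, where a NULL parameter disables the filter and a non-empty array restricts deletion to the listed test types. A minimal Python sketch of the equivalent logic (hypothetical helper names, not part of TestGen):

```python
# Sketch of the NULL-disables-filter pattern plus the stale-test predicate
# from delete_stale_autogen_tests.sql. Assumption: tests are dicts.

def matches_filter(test_type, test_types_filter=None):
    """None disables filtering; a list behaves like SQL's = ANY(array)."""
    return test_types_filter is None or test_type in test_types_filter

def select_stale(tests, run_date, test_types_filter=None):
    """Autogenerated, unlocked tests older than this generation run."""
    return [
        t for t in tests
        if t["last_auto_gen_date"] is not None       # ignore manual tests
        and t["last_auto_gen_date"] < run_date       # stale: older than this run
        and t["lock_refresh"] == "N"                 # ignore locked tests
        and matches_filter(t["test_type"], test_types_filter)
    ]
```

Passing the filter as NULL (None) thus keeps a single template usable both for full cleanups and for regenerating one test type at a time.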
b/testgen/template/generation/gen_selection_tests.sql
new file mode 100644
index 00000000..c6b846dd
--- /dev/null
+++ b/testgen/template/generation/gen_selection_tests.sql
@@ -0,0 +1,56 @@
+WITH latest_run AS (
+    -- Latest complete profiling run before as-of-date
+    SELECT MAX(run_date) AS last_run_date
+    FROM profile_results
+    WHERE table_groups_id = :TABLE_GROUPS_ID ::UUID
+        AND run_date::DATE <= :AS_OF_DATE ::DATE
+),
+selected_columns AS (
+    -- Column results for latest run matching selection criteria
+    SELECT p.*
+    FROM profile_results p
+    INNER JOIN latest_run lr ON p.run_date = lr.last_run_date
+    WHERE p.table_groups_id = :TABLE_GROUPS_ID ::UUID
+        AND {SELECTION_CRITERIA}
+)
+INSERT INTO test_definitions (
+    table_groups_id, test_suite_id, test_type,
+    schema_name, table_name, column_name,
+    test_active, last_auto_gen_date, profiling_as_of_date, profile_run_id, skip_errors,
+    {DEFAULT_PARM_COLUMNS}
+)
+SELECT
+    :TABLE_GROUPS_ID ::UUID AS table_groups_id,
+    :TEST_SUITE_ID ::UUID AS test_suite_id,
+    :TEST_TYPE AS test_type,
+    s.schema_name,
+    s.table_name,
+    s.column_name,
+    'Y' AS test_active,
+    :RUN_DATE ::TIMESTAMP AS last_auto_gen_date,
+    :AS_OF_DATE ::TIMESTAMP AS profiling_as_of_date,
+    s.profile_run_id,
+    0 AS skip_errors,
+    {DEFAULT_PARM_VALUES}
+FROM selected_columns s
+    -- Only insert if test type is active
+WHERE EXISTS (SELECT 1 FROM test_types WHERE test_type = :TEST_TYPE AND active = 'Y')
+    -- Only insert if test type is included in generation set
+    AND EXISTS (SELECT 1 FROM generation_sets WHERE test_type = :TEST_TYPE AND generation_set = :GENERATION_SET)
+
+-- Match "uix_td_autogen_column" unique index exactly
+ON CONFLICT (test_suite_id, test_type, schema_name, table_name, column_name)
+WHERE last_auto_gen_date IS NOT NULL
+    AND table_name IS NOT NULL
+    AND column_name IS NOT NULL
+
+-- Update tests if they already exist
+DO UPDATE SET
+    test_active = EXCLUDED.test_active,
+    last_auto_gen_date = EXCLUDED.last_auto_gen_date,
+
profiling_as_of_date = EXCLUDED.profiling_as_of_date, + profile_run_id = EXCLUDED.profile_run_id, + skip_errors = EXCLUDED.skip_errors, + {DEFAULT_PARM_COLUMNS_UPDATE} +-- Ignore locked tests +WHERE test_definitions.lock_refresh = 'N'; diff --git a/testgen/template/generation/gen_standard_test_type_list.sql b/testgen/template/generation/gen_standard_test_type_list.sql deleted file mode 100644 index 9f041c9f..00000000 --- a/testgen/template/generation/gen_standard_test_type_list.sql +++ /dev/null @@ -1,13 +0,0 @@ -SELECT t.test_type, - t.selection_criteria, - t.default_parm_columns, - t.default_parm_values -FROM test_types t -LEFT JOIN generation_sets s - ON (t.test_type = s.test_type - AND :GENERATION_SET = s.generation_set) -WHERE t.active = 'Y' - AND t.selection_criteria <> 'TEMPLATE' -- Also excludes NULL - AND (s.generation_set IS NOT NULL - OR :GENERATION_SET = '') -ORDER BY test_type; diff --git a/testgen/template/generation/gen_standard_tests.sql b/testgen/template/generation/gen_standard_tests.sql deleted file mode 100644 index 2053ba54..00000000 --- a/testgen/template/generation/gen_standard_tests.sql +++ /dev/null @@ -1,46 +0,0 @@ --- Insert new tests where a locked test is not already present -INSERT INTO test_definitions (table_groups_id, profile_run_id, test_type, test_suite_id, - schema_name, table_name, column_name, - skip_errors, test_active, last_auto_gen_date, profiling_as_of_date, - {DEFAULT_PARM_COLUMNS} ) -WITH last_run AS (SELECT r.table_groups_id, MAX(run_date) AS last_run_date - FROM profile_results p - INNER JOIN profiling_runs r - ON (p.profile_run_id = r.id) - INNER JOIN test_suites ts - ON p.project_code = ts.project_code - AND p.connection_id = ts.connection_id - WHERE p.project_code = :PROJECT_CODE - AND r.table_groups_id = :TABLE_GROUPS_ID - AND ts.id = :TEST_SUITE_ID - AND p.run_date::DATE <= :AS_OF_DATE - GROUP BY r.table_groups_id), - curprof AS (SELECT p.*, datediff('MM', p.min_date, p.max_date) as min_max_months, datediff('week', 
'1800-01-05'::DATE, p.max_date) - datediff('week', '1800-01-05'::DATE, p.min_date) as min_max_weeks - FROM last_run lr - INNER JOIN profile_results p - ON (lr.table_groups_id = p.table_groups_id - AND lr.last_run_date = p.run_date) ), - locked AS (SELECT schema_name, table_name, column_name - FROM test_definitions - WHERE table_groups_id = :TABLE_GROUPS_ID - AND test_suite_id = :TEST_SUITE_ID - AND test_type = :TEST_TYPE - AND lock_refresh = 'Y'), - newtests AS (SELECT * - FROM curprof - WHERE schema_name = :DATA_SCHEMA - AND {SELECTION_CRITERIA} ) -SELECT :TABLE_GROUPS_ID as table_groups_id, - n.profile_run_id, - :TEST_TYPE AS test_type, - :TEST_SUITE_ID AS test_suite_id, - n.schema_name, n.table_name, n.column_name, - 0 as skip_errors, 'Y' as test_active, :RUN_DATE ::TIMESTAMP as last_auto_gen_date, - :AS_OF_DATE ::TIMESTAMP as profiling_as_of_date, - {DEFAULT_PARM_VALUES} -FROM newtests n -LEFT JOIN locked l - ON (n.schema_name = l.schema_name - AND n.table_name = l.table_name - AND n.column_name = l.column_name) -WHERE l.schema_name IS NULL; diff --git a/testgen/template/generation/get_test_types.sql b/testgen/template/generation/get_test_types.sql new file mode 100644 index 00000000..775fc7e4 --- /dev/null +++ b/testgen/template/generation/get_test_types.sql @@ -0,0 +1,12 @@ +SELECT t.test_type, + t.selection_criteria, + t.generation_template, + t.default_parm_columns, + t.default_parm_values +FROM test_types t + INNER JOIN generation_sets s ON (t.test_type = s.test_type) + -- Only active test types +WHERE t.active = 'Y' + -- Only test types included in generation set + AND s.generation_set = :GENERATION_SET +ORDER BY test_type; diff --git a/testgen/template/parms/parms_test_gen.sql b/testgen/template/parms/parms_test_gen.sql deleted file mode 100644 index ebc717d1..00000000 --- a/testgen/template/parms/parms_test_gen.sql +++ /dev/null @@ -1,9 +0,0 @@ - SELECT tg.project_code, - tg.table_group_schema, - ts.export_to_observability, - ts.id::VARCHAR as 
test_suite_id, - CURRENT_TIMESTAMP AT TIME ZONE 'UTC' - - CAST(tg.profiling_delay_days AS integer) * INTERVAL '1 day' as profiling_as_of_date - FROM table_groups tg - LEFT JOIN test_suites ts ON tg.connection_id = ts.connection_id AND ts.test_suite = :TEST_SUITE - WHERE tg.id = :TABLE_GROUP_ID; diff --git a/testgen/template/prediction/delete_staging_test_definitions.sql b/testgen/template/prediction/delete_staging_test_definitions.sql new file mode 100644 index 00000000..99cd1edf --- /dev/null +++ b/testgen/template/prediction/delete_staging_test_definitions.sql @@ -0,0 +1,3 @@ +DELETE FROM stg_test_definition_updates +WHERE test_suite_id = :TEST_SUITE_ID + AND run_date = :RUN_DATE; diff --git a/testgen/template/prediction/get_historical_test_results.sql b/testgen/template/prediction/get_historical_test_results.sql new file mode 100644 index 00000000..800ecc10 --- /dev/null +++ b/testgen/template/prediction/get_historical_test_results.sql @@ -0,0 +1,24 @@ +WITH filtered_defs AS ( + -- Filter definitions first to minimize join surface area + SELECT id, + test_suite_id, + schema_name, + table_name, + column_name, + test_type + FROM test_definitions + WHERE test_suite_id = :TEST_SUITE_ID + AND test_active = 'Y' + AND history_calculation = 'PREDICT' +) +SELECT r.test_definition_id, + d.test_type, + r.test_time, + CASE + WHEN r.result_signal ~ '^-?[0-9]*\.?[0-9]+$' THEN r.result_signal::NUMERIC + ELSE NULL + END AS result_signal +FROM test_results r +JOIN filtered_defs d ON d.id = r.test_definition_id +WHERE r.test_suite_id = :TEST_SUITE_ID +ORDER BY r.test_time; diff --git a/testgen/template/prediction/update_predicted_test_thresholds.sql b/testgen/template/prediction/update_predicted_test_thresholds.sql new file mode 100644 index 00000000..55af464a --- /dev/null +++ b/testgen/template/prediction/update_predicted_test_thresholds.sql @@ -0,0 +1,9 @@ +UPDATE test_definitions +SET lower_tolerance = s.lower_tolerance, + upper_tolerance = s.upper_tolerance, + 
threshold_value = COALESCE(s.threshold_value, s.upper_tolerance), + prediction = s.prediction +FROM stg_test_definition_updates s +WHERE s.test_definition_id = test_definitions.id + AND s.test_suite_id = :TEST_SUITE_ID + AND s.run_date = :RUN_DATE; diff --git a/testgen/template/profiling/cde_flagger_query.sql b/testgen/template/profiling/cde_flagger_query.sql index a23e69ec..53ff280c 100644 --- a/testgen/template/profiling/cde_flagger_query.sql +++ b/testgen/template/profiling/cde_flagger_query.sql @@ -3,7 +3,7 @@ UPDATE data_column_chars WHERE table_groups_id = :TABLE_GROUPS_ID; WITH cde_selects - AS ( SELECT table_groups_id, table_name, column_name + AS ( SELECT table_groups_id, schema_name, table_name, column_name -- ,functional_data_type, -- record_ct, -- ROUND(100.0 * (value_ct - COALESCE(zero_length_ct, 0.0) - COALESCE(filled_value_ct, 0.0))::DEC(15, 3) / @@ -29,5 +29,6 @@ UPDATE data_column_chars SET critical_data_element = TRUE FROM cde_selects WHERE data_column_chars.table_groups_id = cde_selects.table_groups_id + AND data_column_chars.schema_name = cde_selects.schema_name AND data_column_chars.table_name = cde_selects.table_name AND data_column_chars.column_name = cde_selects.column_name; diff --git a/testgen/template/quick_start/initial_data_seeding.sql b/testgen/template/quick_start/initial_data_seeding.sql index f47a161b..fb1283ca 100644 --- a/testgen/template/quick_start/initial_data_seeding.sql +++ b/testgen/template/quick_start/initial_data_seeding.sql @@ -28,7 +28,7 @@ SELECT '0ea85e17-acbe-47fe-8394-9970725ad37d'::UUID as id, NULLIF('{PROFILING_TABLE_SET}', '') as profiling_table_set, NULLIF('{PROFILING_INCLUDE_MASK}', '') as profiling_include_mask, NULLIF('{PROFILING_EXCLUDE_MASK}', '') as profiling_exclude_mask, - 15000 as profile_sample_min_count; + 15000 as profile_sample_min_count; INSERT INTO test_suites (id, project_code, test_suite, connection_id, table_groups_id, test_suite_description, @@ -42,3 +42,58 @@ SELECT 
'9df7489d-92b3-49f9-95ca-512160d7896f'::UUID as id,
 'Y' as export_to_observability,
 NULL as component_key,
 '{OBSERVABILITY_COMPONENT_TYPE}' as component_type;
+
+INSERT INTO test_suites
+    (id, project_code, test_suite, connection_id, table_groups_id, test_suite_description,
+     export_to_observability, is_monitor, monitor_lookback, predict_min_lookback)
+SELECT '823a1fef-9b6d-48d5-9d0f-2db9812cc318'::UUID AS id,
+       '{PROJECT_CODE}' AS project_code,
+       '{TABLE_GROUPS_NAME} Monitors' AS test_suite,
+       1 AS connection_id,
+       '0ea85e17-acbe-47fe-8394-9970725ad37d'::UUID AS table_groups_id,
+       '{TABLE_GROUPS_NAME} Monitor Suite' AS test_suite_description,
+       'N' AS export_to_observability,
+       TRUE AS is_monitor,
+       28 AS monitor_lookback,
+       30 AS predict_min_lookback;
+
+INSERT INTO job_schedules
+    (id, project_code, key, args, kwargs, cron_expr, cron_tz, active)
+SELECT 'eac9d722-d06a-4b1f-b8c4-bb2854bd4cfd'::UUID AS id,
+       '{PROJECT_CODE}' AS project_code,
+       'run-monitors' AS key,
+       '[]'::JSONB AS args,
+       '{"test_suite_id": "823a1fef-9b6d-48d5-9d0f-2db9812cc318"}'::JSONB AS kwargs,
+       '0 */12 * * *' AS cron_expr,
+       'UTC' AS cron_tz,
+       TRUE AS active;
+
+UPDATE table_groups
+SET monitor_test_suite_id = '823a1fef-9b6d-48d5-9d0f-2db9812cc318'::UUID
+WHERE id = '0ea85e17-acbe-47fe-8394-9970725ad37d'::UUID;
+
+-- Metric monitors
+INSERT INTO test_definitions
+    (id, table_groups_id, test_suite_id, test_type, schema_name, table_name, column_name,
+     custom_query, history_calculation, history_calculation_upper, lower_tolerance, upper_tolerance, test_active)
+VALUES
+    -- Average Discount
+    ('a1b2c3d4-1006-4000-8000-000000000006'::UUID,
+     '0ea85e17-acbe-47fe-8394-9970725ad37d'::UUID,
+     '823a1fef-9b6d-48d5-9d0f-2db9812cc318'::UUID,
+     'Metric_Trend', '{PROJECT_SCHEMA}', 'f_ebike_sales', 'Average Discount',
+     'AVG(discount_amount)', NULL, NULL, 15, 25, 'Y'),
+
+    -- Average Product Price
+    ('a1b2c3d4-3333-4000-8000-000000000003'::UUID,
+     '0ea85e17-acbe-47fe-8394-9970725ad37d'::UUID,
+
'823a1fef-9b6d-48d5-9d0f-2db9812cc318'::UUID, + 'Metric_Trend', '{PROJECT_SCHEMA}', 'd_ebike_products', 'Average Product Price', + 'AVG(price)', NULL, NULL, 1000, 1500, 'Y'), + + -- Max Discount + ('a1b2c3d4-2006-4000-8000-000000000006'::UUID, + '0ea85e17-acbe-47fe-8394-9970725ad37d'::UUID, + '823a1fef-9b6d-48d5-9d0f-2db9812cc318'::UUID, + 'Metric_Trend', '{PROJECT_SCHEMA}', 'd_ebike_products', 'Max Discount', + 'MAX(max_discount)', 'PREDICT', NULL, NULL, NULL, 'Y'); diff --git a/testgen/template/quick_start/run_monitor_iteration.sql b/testgen/template/quick_start/run_monitor_iteration.sql new file mode 100644 index 00000000..5717dc6c --- /dev/null +++ b/testgen/template/quick_start/run_monitor_iteration.sql @@ -0,0 +1,132 @@ + +WITH max_sale_id AS ( + SELECT MAX(sale_id) AS max_id FROM demo.f_ebike_sales +), +new_sales AS ( + SELECT + max_id + ROW_NUMBER() OVER () AS sale_id, + sale_date + (i * INTERVAL '1 day') AS sale_date, + customer_id, + supplier_id, + product_id, + quantity_sold, + sale_price, + total_amount, + discount_amount, + adjusted_total_amount, + warranty_end_date, + next_maintenance_date, + return_reason + FROM + demo.f_ebike_sales, + max_sale_id, + generate_series(1, {NEW_SALES}) AS i + LIMIT {NEW_SALES} +) +INSERT INTO demo.f_ebike_sales ( + sale_id, + sale_date, + customer_id, + supplier_id, + product_id, + quantity_sold, + sale_price, + total_amount, + discount_amount, + adjusted_total_amount, + warranty_end_date, + next_maintenance_date, + return_reason +) +SELECT * FROM new_sales; + + +UPDATE demo.d_ebike_customers +SET last_contact = :RUN_DATE, + customer_decile = customer_decile + 1 +WHERE ctid IN ( + SELECT ctid + FROM demo.d_ebike_customers + ORDER BY RANDOM() + LIMIT 10 +); + + +-- TG-IF IS_UPDATE_SUPPLIERS_ITER +UPDATE demo.d_ebike_suppliers +SET last_order = :RUN_DATE +WHERE supplier_id IN (40001, 40002); +-- TG-ENDIF + + +-- TG-IF IS_UPDATE_PRODUCT_ITER +UPDATE demo.d_ebike_products +SET price = price + 50 * (RANDOM() - 0.5) +WHERE 
product_id IN ( + SELECT product_id + FROM demo.d_ebike_products + ORDER BY RANDOM() + LIMIT 4 +); +-- TG-ENDIF + + +-- Metric_Trend variation: shift discount averages and product prices each iteration +UPDATE demo.f_ebike_sales +SET discount_amount = GREATEST(0, discount_amount + {DISCOUNT_DELTA}); + +UPDATE demo.d_ebike_products +SET price = GREATEST(50, price + {PRICE_DELTA}); + + +-- TG-IF IS_DELETE_CUSTOMER_COL_ITER +ALTER TABLE demo.d_ebike_customers + DROP COLUMN occupation, + DROP COLUMN tax_id; +-- TG-ENDIF + + +-- TG-IF IS_ADD_CUSTOMER_COL_ITER +ALTER TABLE demo.d_ebike_customers + ADD COLUMN is_international BOOL DEFAULT FALSE, + ADD COLUMN first_contact DATE; +-- TG-ENDIF + + +-- TG-IF IS_CREATE_RETURNS_TABLE_ITER +CREATE TABLE demo.f_ebike_returns +( + return_id INTEGER, + sale_id INTEGER, + return_date DATE, + refund_amount NUMERIC(10, 2), + return_reason TEXT +); + +INSERT INTO demo.f_ebike_returns +( + return_id, + sale_id, + return_date, + refund_amount, + return_reason +) +SELECT + ROW_NUMBER() OVER (), + sale_id, + :RUN_DATE, + sale_price * 0.8, + 'No reason' +FROM demo.f_ebike_sales +ORDER BY RANDOM() +LIMIT 200; +-- TG-ENDIF + + +-- TG-IF IS_DELETE_CUSTOMER_ITER +DELETE FROM demo.d_ebike_customers +WHERE customer_id IN +( + SELECT customer_id FROM demo.d_ebike_customers ORDER BY RANDOM() LIMIT 1 +); +-- TG-ENDIF diff --git a/testgen/template/rollup_scores/calc_prevalence_test_results.sql b/testgen/template/rollup_scores/calc_prevalence_test_results.sql index 88fdb6fb..c75b64b7 100644 --- a/testgen/template/rollup_scores/calc_prevalence_test_results.sql +++ b/testgen/template/rollup_scores/calc_prevalence_test_results.sql @@ -13,6 +13,7 @@ UPDATE test_results FROM test_results r INNER JOIN data_table_chars tc ON (r.table_groups_id = tc.table_groups_id + AND r.schema_name = tc.schema_name AND r.table_name ILIKE tc.table_name) WHERE r.test_run_id = '{RUN_ID}'::UUID AND test_results.id = r.id; @@ -50,11 +51,13 @@ WITH result_calc AND r.column_names 
= p.column_name) LEFT JOIN data_table_chars tc ON (r.table_groups_id = tc.table_groups_id + AND r.schema_name = tc.schema_name AND r.table_name ILIKE tc.table_name) WHERE r.test_run_id = '{RUN_ID}'::UUID AND result_code = 0 AND r.result_measure IS NOT NULL AND tt.test_scope = 'column' + AND tt.dq_score_prevalence_formula IS NOT NULL AND NOT COALESCE(disposition, '') IN ('Dismissed', 'Inactive') ) UPDATE test_results SET dq_record_ct = c.dq_record_ct, @@ -78,11 +81,13 @@ WITH result_calc ON r.test_type = tt.test_type INNER JOIN data_table_chars tc ON (r.table_groups_id = tc.table_groups_id + AND r.schema_name = tc.schema_name AND r.table_name ILIKE tc.table_name) WHERE r.test_run_id = '{RUN_ID}'::UUID AND result_code = 0 AND r.result_measure IS NOT NULL AND tt.test_scope <> 'column' + AND tt.dq_score_prevalence_formula IS NOT NULL AND NOT COALESCE(disposition, '') IN ('Dismissed', 'Inactive') ) UPDATE test_results SET dq_record_ct = c.dq_record_ct, diff --git a/testgen/template/rollup_scores/rollup_scores_profile_table_group.sql b/testgen/template/rollup_scores/rollup_scores_profile_table_group.sql index 4290e384..d11f5df0 100644 --- a/testgen/template/rollup_scores/rollup_scores_profile_table_group.sql +++ b/testgen/template/rollup_scores/rollup_scores_profile_table_group.sql @@ -41,6 +41,7 @@ WITH score_detail ON (r.id = pr.profile_run_id) INNER JOIN data_column_chars dcc ON (pr.table_groups_id = dcc.table_groups_id + AND pr.schema_name = dcc.schema_name AND pr.table_name = dcc.table_name AND pr.column_name = dcc.column_name) LEFT JOIN profile_anomaly_results p @@ -77,6 +78,7 @@ WITH score_detail ON (r.id = pr.profile_run_id) INNER JOIN data_column_chars dcc ON (pr.table_groups_id = dcc.table_groups_id + AND pr.schema_name = dcc.schema_name AND pr.table_name = dcc.table_name AND pr.column_name = dcc.column_name) LEFT JOIN profile_anomaly_results p diff --git a/testgen/template/rollup_scores/rollup_scores_test_table_group.sql 
b/testgen/template/rollup_scores/rollup_scores_test_table_group.sql index 7aebeadd..ce1ec3b5 100644 --- a/testgen/template/rollup_scores/rollup_scores_test_table_group.sql +++ b/testgen/template/rollup_scores/rollup_scores_test_table_group.sql @@ -31,7 +31,7 @@ UPDATE data_column_chars -- Roll up latest scores to data_column_chars -- excludes multi-column tests WITH score_calc AS (SELECT dcc.column_id, - SUM(1 - r.result_code) as issue_ct, + SUM(CASE WHEN r.result_code = 0 THEN 1 ELSE 0 END) as issue_ct, -- Use AVG instead of MAX because column counts may differ by test_run AVG(r.dq_record_ct) as row_ct, -- bad data pct * record count = affected_data_points @@ -42,6 +42,7 @@ WITH score_calc ON (r.test_suite_id = ts.id AND r.test_run_id = ts.last_complete_test_run_id)) ON (dcc.table_groups_id = ts.table_groups_id + AND dcc.schema_name = r.schema_name AND dcc.table_name = r.table_name AND dcc.column_name = r.column_names) WHERE dcc.table_groups_id = :TABLE_GROUPS_ID @@ -72,6 +73,7 @@ WITH score_detail ON (r.test_suite_id = ts.id AND r.test_run_id = ts.last_complete_test_run_id)) ON (dtc.table_groups_id = ts.table_groups_id + AND dtc.schema_name = r.schema_name AND dtc.table_name = r.table_name) WHERE dtc.table_groups_id = :TABLE_GROUPS_ID AND COALESCE(ts.dq_score_exclude, FALSE) = FALSE diff --git a/testgen/ui/bootstrap.py b/testgen/ui/bootstrap.py index bb39c83b..b21cf6a2 100644 --- a/testgen/ui/bootstrap.py +++ b/testgen/ui/bootstrap.py @@ -11,6 +11,7 @@ from testgen.ui.views.data_catalog import DataCatalogPage from testgen.ui.views.hygiene_issues import HygieneIssuesPage from testgen.ui.views.login import LoginPage +from testgen.ui.views.monitors_dashboard import MonitorsDashboardPage from testgen.ui.views.profiling_results import ProfilingResultsPage from testgen.ui.views.profiling_runs import DataProfilingPage from testgen.ui.views.project_dashboard import ProjectDashboardPage @@ -42,6 +43,7 @@ TestSuitesPage, TestDefinitionsPage, ProjectSettingsPage, + 
MonitorsDashboardPage, ] LOG = logging.getLogger("testgen") diff --git a/testgen/ui/components/frontend/css/material-symbols-rounded.css b/testgen/ui/components/frontend/css/material-symbols-rounded.css index 15cf8997..16eec0f4 100644 --- a/testgen/ui/components/frontend/css/material-symbols-rounded.css +++ b/testgen/ui/components/frontend/css/material-symbols-rounded.css @@ -3,7 +3,7 @@ font-style: normal; font-weight: 100 700; font-display: block; - src: url("./material-symbols-rounded.woff2") format("woff2"); + src: url("/app/static/fonts/material-symbols-rounded.woff2") format("woff2"); } .material-symbols-rounded { font-family: "Material Symbols Rounded"; @@ -22,3 +22,11 @@ text-rendering: optimizeLegibility; font-feature-settings: "liga"; } + +.material-symbols-filled { + font-variation-settings: + 'FILL' 1, + 'wght' 400, + 'GRAD' 0, + 'opsz' 24; +} diff --git a/testgen/ui/components/frontend/css/roboto-font-faces.css b/testgen/ui/components/frontend/css/roboto-font-faces.css index 61d5de8f..1b435eaa 100644 --- a/testgen/ui/components/frontend/css/roboto-font-faces.css +++ b/testgen/ui/components/frontend/css/roboto-font-faces.css @@ -3,7 +3,7 @@ font-style: normal; font-weight: 400; font-display: swap; - src: url(./KFOmCnqEu92Fr1Mu7GxKOzY.woff2) format('woff2'); + src: url(/app/static/fonts/KFOmCnqEu92Fr1Mu7GxKOzY.woff2) format('woff2'); unicode-range: U+0100-02AF, U+0304, U+0308, U+0329, U+1E00-1E9F, U+1EF2-1EFF, U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F, U+A720-A7FF; } @@ -12,7 +12,7 @@ font-style: normal; font-weight: 400; font-display: swap; - src: url(./KFOmCnqEu92Fr1Mu4mxK.woff2) format('woff2'); + src: url(/app/static/fonts/KFOmCnqEu92Fr1Mu4mxK.woff2) format('woff2'); unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD; } @@ -21,7 +21,7 @@ font-style: normal; font-weight: 500; font-display: swap; - 
src: url(./KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2) format('woff2'); + src: url(/app/static/fonts/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2) format('woff2'); unicode-range: U+0100-02AF, U+0304, U+0308, U+0329, U+1E00-1E9F, U+1EF2-1EFF, U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F, U+A720-A7FF; } @@ -30,6 +30,6 @@ font-style: normal; font-weight: 500; font-display: swap; - src: url(./KFOlCnqEu92Fr1MmEU9fBBc4.woff2) format('woff2'); + src: url(/app/static/fonts/KFOlCnqEu92Fr1MmEU9fBBc4.woff2) format('woff2'); unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD; } diff --git a/testgen/ui/components/frontend/css/shared.css b/testgen/ui/components/frontend/css/shared.css index 7665ae48..8390aafe 100644 --- a/testgen/ui/components/frontend/css/shared.css +++ b/testgen/ui/components/frontend/css/shared.css @@ -20,8 +20,10 @@ body { --blue: #42A5F5; --brown: #8D6E63; --grey: #BDBDBD; + --light-grey: #E0E0E0; --empty: #EEEEEE; --empty-light: #FAFAFA; + --empty-dark: #BDBDBD; --empty-teal: #E7F1F0; --primary-text-color: #000000de; @@ -33,6 +35,8 @@ body { --tooltip-color: #333d; --tooltip-text-color: #fff; --dk-card-background: #fff; + --dk-dialog-background: #fff; + --selected-item-background: #06a04a17; --sidebar-background-color: white; --sidebar-item-hover-color: #f5f5f5; @@ -80,12 +84,16 @@ body { --select-hover-background: rgb(240, 242, 246); --app-background-color: #f8f9fa; + + --table-hover-color: #ecf0f1; + --table-selection-color: rgba(0,145,234,.28); } @media (prefers-color-scheme: dark) { body { --empty: #424242; --empty-light: #212121; + --empty-dark: #757575; --empty-teal: #242E2D; --primary-text-color: rgba(255, 255, 255); @@ -97,6 +105,7 @@ body { --tooltip-color: #eee; --tooltip-text-color: #000; --dk-card-background: #14181f; + --dk-dialog-background: #0e1117; --sidebar-background-color: #14181f; 
--sidebar-item-hover-color: #10141b; @@ -196,6 +205,18 @@ body { color: var(--disabled-text-color); } +.text-bold { + font-weight: 500; +} + +.text-small { + font-size: 13px; +} + +.text-large { + font-size: 16px; +} + .text-caption { font-size: 12px; color: var(--caption-text-color); @@ -644,6 +665,10 @@ code > .tg-icon:hover { border-radius: 8px; } +input { + line-height: normal !important; +} + input::-ms-reveal, input::-ms-clear { display: none; @@ -660,3 +685,66 @@ input::-ms-clear { .text-center { text-align: center; } + +.visible-overflow { + overflow: visible; +} + +.anomaly-tag { + display: inline-flex; + align-items: center; + justify-content: center; + vertical-align: middle; + border-radius: 18px; + background: var(--green); + height: 20px; + width: 20px; + box-sizing: border-box; +} + +.anomaly-tag > .material-symbols-rounded { + color: var(--empty-light); + font-size: 20px; +} + +.anomaly-tag.has-anomalies { + padding: 1px 5px; + border-radius: 10px; + background: var(--error-color); + color: var(--empty-light); + width: auto; + min-width: 20px; +} + +.anomaly-tag.has-errors { + position: relative; + background: transparent; +} + +.anomaly-tag.has-errors > .material-symbols-rounded { + color: var(--orange); + font-size: 22px; +} + +.anomaly-tag.is-training { + position: relative; + background: transparent; + border: 2px solid var(--blue); +} + +.anomaly-tag.is-training > .material-symbols-rounded { + color: var(--blue); +} + +.anomaly-tag.is-pending { + background: none; + color: var(--primary-text-color); +} + +.notifications--empty.tg-empty-state { + margin-top: 0; +} + +.warning-text { + color: var(--orange); +} diff --git a/testgen/ui/components/frontend/js/axis_utils.js b/testgen/ui/components/frontend/js/axis_utils.js index 1822092e..2e5240df 100644 --- a/testgen/ui/components/frontend/js/axis_utils.js +++ b/testgen/ui/components/frontend/js/axis_utils.js @@ -51,6 +51,17 @@ function niceBounds(axisStart, axisEnd, tickCount = 4) { }; } +function 
niceTicks(axisStart, axisEnd, tickCount = 4) { + const { min, max, step } = niceBounds(axisStart, axisEnd, tickCount); + const ticks = []; + let currentTick = min; + while (currentTick <= max) { + ticks.push(currentTick); + currentTick = currentTick + step; + } + return ticks; +} + /** * * @typedef Range @@ -73,4 +84,418 @@ function scale(value, ranges, zero=0) { return ((value - ranges.old.min) * newRange / oldRange) + ranges.new.min; } -export { niceBounds, scale }; +/** + * @param {SVGElement} svg + * @param {MouseEvent} event + * @returns {({x: number, y: number})} + */ +function screenToSvgCoordinates(svg, event) { + const pt = svg.createSVGPoint(); + pt.x = event.offsetX; + pt.y = event.offsetY; + const inverseCTM = svg.getScreenCTM().inverse(); + const svgPoint = pt.matrixTransform(inverseCTM); + return svgPoint; +} + +/** + * Generates an array of "nice" and properly spaced tick dates for a time-series axis. + * It automatically selects the best time step (granularity) based on the range. + * + * @param {Date[]} dates An array of Date objects representing the data points. + * @param {number} minTicks The minimum number of ticks desired. + * @param {number} maxTicks The maximum number of ticks desired. + * @returns {Date[]} An array of Date objects for the axis ticks. 
+ */ +function getAdaptiveTimeTicks(dates, minTicks, maxTicks) { + if (!dates || dates.length === 0) { + return []; + } + + if (typeof dates[0] === 'number') { + dates = dates.map(d => new Date(d * 1000)); + } + + const timestamps = dates.map(d => d.getTime()); + const minTime = Math.min(...timestamps); + const maxTime = Math.max(...timestamps); + const rangeMs = maxTime - minTime; + + const timeSteps = [ + { name: 'hour', ms: 3600000 }, + { name: '4 hours', ms: 4 * 3600000 }, + { name: '8 hours', ms: 8 * 3600000 }, + { name: 'day', ms: 86400000 }, + { name: 'week', ms: 7 * 86400000 }, + { name: 'month', ms: null, count: 1 }, + { name: '3 months', ms: null, count: 3 }, + { name: '6 months', ms: null, count: 6 }, + { name: 'year', ms: null, count: 12 }, + ]; + + let bestStepIndex = -1; + let ticks = []; + + for (let i = timeSteps.length - 1; i >= 0; i--) { + const step = timeSteps[i]; + let estimatedTickCount; + + if (step.ms !== null) { + estimatedTickCount = Math.ceil(rangeMs / step.ms) + 1; + } else { + estimatedTickCount = estimateMonthYearTicks(minTime, maxTime, step.count); + } + + if (estimatedTickCount <= maxTicks) { + bestStepIndex = i; + break; + } + } + + if (bestStepIndex === -1) { + const roughStep = rangeMs / (maxTicks - 1); + const niceMsStep = getNiceStep(roughStep); + return generateMsTicks(minTime, maxTime, niceMsStep).map(t => new Date(t)); + } + + const bestStep = timeSteps[bestStepIndex]; + if (bestStep.ms !== null) { + ticks = generateMsTicks(minTime, maxTime, bestStep.ms).map(t => new Date(t)); + } else { + ticks = generateMonthYearTicks(minTime, maxTime, bestStep.count); + } + + while (ticks.length < minTicks && bestStepIndex > 0) { + bestStepIndex--; + const nextStep = timeSteps[bestStepIndex]; + + if (nextStep.ms !== null) { + ticks = generateMsTicks(minTime, maxTime, nextStep.ms).map(t => new Date(t)); + } else { + ticks = generateMonthYearTicks(minTime, maxTime, nextStep.count); + } + } + + return ticks; +} + +/** Calculates a "nice" step 
size (1, 2, 5, etc. * power of 10) for raw milliseconds. */ +function getNiceStep(step) { + const exponent = Math.floor(Math.log10(step)); + const fraction = step / Math.pow(10, exponent); + let niceFraction; + if (fraction <= 1) niceFraction = 1; + else if (fraction <= 2) niceFraction = 2; + else if (fraction <= 5) niceFraction = 5; + else return 1 * Math.pow(10, exponent + 1); // Next power of 10 + + return niceFraction * Math.pow(10, exponent); +} + +/** Generates ticks for fixed-length steps (hours, days, weeks). */ +function generateMsTicks(minTime, maxTime, niceStepMs) { + // let tickStart = minTime; // Use it to start at minimum tick + let tickStart = Math.floor(minTime / niceStepMs) * niceStepMs; // Use it to start at a nicer tick + while (tickStart > minTime) { + tickStart -= niceStepMs; + } + + const ONE_DAY = 86400000; + if (niceStepMs >= ONE_DAY) { + const date = new Date(tickStart); + date.setHours(0, 0, 0, 0); + tickStart = date.getTime(); + while (tickStart + niceStepMs < minTime) { + tickStart += niceStepMs; + } + } + + const ticks = []; + const epsilon = 1e-10; + let currentTick = tickStart; + + while (currentTick <= maxTime + niceStepMs + epsilon) { + ticks.push(Math.round(currentTick)); + currentTick += niceStepMs; + } + + return ticks; +} + +/** Generates ticks for variable-length steps (months, years). 
*/ +function generateMonthYearTicks(minTime, maxTime, monthStep) { + const ticks = []; + let currentDate = new Date(minTime); + + currentDate.setDate(1); // Set to the 1st of the month + currentDate.setHours(0, 0, 0, 0); + + let year = currentDate.getFullYear(); + let month = currentDate.getMonth(); + + while (month % monthStep !== 0) { + month--; + if (month < 0) { + month = 11; + year--; + } + } + currentDate.setFullYear(year, month, 1); + + while (currentDate.getTime() + monthStep * 30 * 86400000 < minTime) { + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + while (currentDate.getTime() <= maxTime) { + ticks.push(new Date(currentDate.getTime())); + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + if (ticks.length > 0 && currentDate.getTime() - maxTime < monthStep * 30 * 86400000 / 2) { + ticks.push(new Date(currentDate.getTime())); + } + + return ticks; +} + +/** Estimates the number of ticks for month/year steps. */ +function estimateMonthYearTicks(minTime, maxTime, monthStep) { + const minDate = new Date(minTime); + const maxDate = new Date(maxTime); + + let years = maxDate.getFullYear() - minDate.getFullYear(); + let months = maxDate.getMonth() - minDate.getMonth(); + let totalMonths = years * 12 + months; + + return Math.ceil(totalMonths / monthStep) + 2; +} + +function getAdaptiveTimeTicksV2(dates, totalWidth, tickWidth) { + if (!dates || dates.length === 0) { + return []; + } + + if (typeof dates[0] === 'number') { + dates = dates.map(d => new Date(d)); + } + + const timestamps = dates.map(d => d.getTime()); + const minTime = Math.min(...timestamps); + const maxTime = Math.max(...timestamps); + const rangeMs = maxTime - minTime; + + const maxTicks = Math.floor(totalWidth / tickWidth); + const timeSteps = [ + { name: 'hour', ms: 3600000 }, + { name: '2 hours', ms: 7200000 }, + { name: '4 hours', ms: 14400000 }, + { name: '6 hours', ms: 21600000 }, + { name: '8 hours', ms: 28800000 }, + { name: '12 hours', ms: 43200000 }, + 
{ name: 'day', ms: 86400000 }, + { name: '2 days', ms: 172800000 }, + { name: '3 days', ms: 259200000 }, + { name: 'week', ms: 604800000 }, + { name: '2 weeks', ms: 1209600000 }, + { name: 'month', ms: null, count: 1 }, + { name: '3 months', ms: null, count: 3 }, + { name: '6 months', ms: null, count: 6 }, + { name: 'year', ms: null, count: 12 }, + ]; + + for (let i = 0; i < timeSteps.length; i++) { + const step = timeSteps[i]; + let tickCount = 0; + + if (step.ms !== null) { + // Precise calculation: how many strict ticks fit in [minTime, maxTime]? + const firstTick = Math.ceil(minTime / step.ms) * step.ms; + const lastTick = Math.floor(maxTime / step.ms) * step.ms; + if (lastTick >= firstTick) { + tickCount = Math.floor((lastTick - firstTick) / step.ms) + 1; + } + } else { + tickCount = estimateMonthYearTicksStrict(minTime, maxTime, step.count); + } + + if (tickCount <= maxTicks && tickCount > 0) { + if (step.ms !== null) { + return generateMsTicksStrict(minTime, maxTime, step.ms); + } else { + return generateMonthYearTicksStrict(minTime, maxTime, step.count); + } + } + } + + const targetStep = rangeMs / Math.max(1, maxTicks); + const niceStep = getNiceStep(targetStep); + return generateMsTicksStrict(minTime, maxTime, niceStep); +} + +/** * Generates ticks strictly within [minTime, maxTime]. + * Uses Math.ceil to start 'inside' the range. + */ +function generateMsTicksStrict(minTime, maxTime, stepMs) { + const ticks = []; + + let currentTick = Math.ceil(minTime / stepMs) * stepMs; + + while (currentTick <= maxTime) { + ticks.push(new Date(currentTick)); + currentTick += stepMs; + } + + return ticks; +} + +/** * Generates Month/Year ticks strictly within bounds.
+ */ +function generateMonthYearTicksStrict(minTime, maxTime, monthStep) { + const ticks = []; + let currentDate = new Date(minTime); + + currentDate.setDate(1); + currentDate.setHours(0, 0, 0, 0); + + let month = currentDate.getMonth(); + let year = currentDate.getFullYear(); + while (month % monthStep !== 0) { + month--; + if (month < 0) { month = 11; year--; } + } + currentDate.setFullYear(year, month, 1); + + while (currentDate.getTime() < minTime) { + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + while (currentDate.getTime() <= maxTime) { + ticks.push(new Date(currentDate)); + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + return ticks; +} + +function estimateMonthYearTicksStrict(minTime, maxTime, monthStep) { + let count = 0; + let d = new Date(minTime); + d.setDate(1); d.setHours(0,0,0,0); + + let m = d.getMonth(); + let y = d.getFullYear(); + while (m % monthStep !== 0) { m--; if(m<0){m=11; y--;} } + d.setFullYear(y, m, 1); + + while (d.getTime() < minTime) { + d.setMonth(d.getMonth() + monthStep); + } + while (d.getTime() <= maxTime) { + count++; + d.setMonth(d.getMonth() + monthStep); + } + return count; +} + +/** + * Formats an array of Date objects into smart, non-redundant labels. + * It only displays the year, month, or day when it changes from the previous tick. + * + * @param {Date[]} ticks An array of Date objects (the tick values). + * @returns {Array} An array of formatted labels (strings or string arrays). 
+ */ +function formatSmartTimeTicks(ticks) { + if (!ticks || ticks.length === 0) { + return []; + } + + const formattedLabels = []; + const locale = 'en-US'; + + const yearFormat = { year: 'numeric' }; + const monthFormat = { month: 'short' }; + const dayFormat = { day: 'numeric' }; + const timeFormat = { hour: '2-digit', minute: '2-digit', hourCycle: 'h23' }; + const ONE_DAY_MS = 86400000; + + const formatPart = (date, options) => date.toLocaleString(locale, options); + + for (let i = 0; i < ticks.length; i++) { + const currentTick = ticks[i]; + const previousTick = ticks[i - 1]; + const nextTick = ticks[i + 1]; + + let needsYear = false; + let needsMonth = false; + let needsDay = false; + let needsTime = false; + + if (!previousTick) { + needsYear = true; + needsMonth = true; + needsDay = true; + needsTime = nextTick && nextTick.getTime() - currentTick.getTime() < ONE_DAY_MS; + } else { + const curr = currentTick; + const prev = previousTick; + + if (curr.getFullYear() !== prev.getFullYear()) { + needsYear = true; + needsMonth = true; + needsDay = true; + } else if (curr.getMonth() !== prev.getMonth()) { + needsMonth = true; + needsDay = true; + } else if (curr.getDate() !== prev.getDate()) { + needsDay = true; + needsMonth = true; + } + + const stepMs = currentTick.getTime() - previousTick.getTime(); + if (stepMs < ONE_DAY_MS || (curr.getHours() !== 0 || curr.getMinutes() !== 0)) { + needsTime = true; + } + } + + let line1 = []; + let line2 = []; + + if (needsTime) { + line1.push(formatPart(currentTick, timeFormat)); + } + + if (needsMonth || needsDay) { + let datePart = []; + if (needsMonth) { + datePart.push(formatPart(currentTick, monthFormat)); + } + if (needsDay) { + datePart.push(formatPart(currentTick, dayFormat)); + } + const dateString = datePart.join(' '); + + if (needsTime) { + line2.push(dateString); + } else { + line1.push(dateString); + } + } + + if (needsYear) { + line2.push(formatPart(currentTick, yearFormat)); + } + + line1 = line1.filter(p => 
p.length > 0).join(' '); + line2 = line2.filter(p => p.length > 0).join(' '); + + if (line2.length > 0) { + formattedLabels.push([line1, line2]); + } else { + formattedLabels.push(line1); + } + } + + return formattedLabels; +} + +export { niceBounds, niceTicks, scale, screenToSvgCoordinates, getAdaptiveTimeTicks, getAdaptiveTimeTicksV2, formatSmartTimeTicks }; diff --git a/testgen/ui/components/frontend/js/components/button.js b/testgen/ui/components/frontend/js/components/button.js index ec543bc5..c78f2173 100644 --- a/testgen/ui/components/frontend/js/components/button.js +++ b/testgen/ui/components/frontend/js/components/button.js @@ -139,7 +139,7 @@ button.tg-button[disabled] { cursor: not-allowed; } -button.tg-button > i:has(+ span) { +button.tg-button > i:has(+ span:not(.tg-tooltip)) { margin-right: 8px; } diff --git a/testgen/ui/components/frontend/js/components/chart_canvas.js b/testgen/ui/components/frontend/js/components/chart_canvas.js new file mode 100644 index 00000000..e2e13648 --- /dev/null +++ b/testgen/ui/components/frontend/js/components/chart_canvas.js @@ -0,0 +1,655 @@ +/** + * A container that renders a coordinate system and all the + * provided (compatible) chart components "cocentered" in the + * aforementioned coordinates. 
+ * + * Functionalities: + * - display the axes and their ticks for the chart + * - display the hover-over elements, if any + * - allow zooming in and out + * + * @typedef Options + * @type {object} + * @property {number} width + * @property {number} height + * @property {Point[]} points + * @property {AxisConfigs?} axis + * @property {((point: Point) => SVGElement)?} legend + * @property {((getPoint: ((Point) => Point), showTooltip: ((message: string, point: Point) => void), hideTooltip: (() => void)) => SVGElement)?} markers + * + * @typedef Point + * @type {object} + * @property {number} x + * @property {number} y + * @property {number} originalX + * @property {number} originalY + * + * @typedef AxisConfigs + * @type {object} + * @property {SingleAxisConfig?} x + * @property {SingleAxisConfig?} y + * + * @typedef SingleAxisConfig + * @type {object} + * @property {any?} min + * @property {any?} max + * @property {string?} label + * @property {number?} ticksCount + * @property {boolean?} renderLine + * @property {boolean?} renderGridLines + * + * @typedef ChartRenderer + * @type {((viewBox: ChartViewBox, area: DrawingArea, getPoint: ((Point) => Point)) => SVGElement)} + * + * @typedef ChartViewBox + * @type {object} + * @property {number} minX + * @property {number} minY + * @property {number} width + * @property {number} height + * + * @typedef DrawingArea + * @type {object} + * @property {Point} topLeft + * @property {Point} topRight + * @property {Point} bottomLeft + * @property {Point} bottomRight + */ +import van from '../van.min.js'; +import { afterMount, getRandomId, getValue, loadStylesheet } from '../utils.js'; +import { colorMap } from '../display_utils.js'; +import { formatSmartTimeTicks, getAdaptiveTimeTicks, niceTicks, scale, screenToSvgCoordinates } from '../axis_utils.js'; +import { Button } from './button.js'; +import { Tooltip, withTooltip } from './tooltip.js'; + +const { div } = van.tags; +const { clipPath, defs, foreignObject, g, line, rect,
svg, text } = van.tags("http://www.w3.org/2000/svg"); + +const spacing = 8; +const topLegendHeight = spacing * 8; +const verticalAxisLabelWidth = spacing * 2; +const verticalAxisLabelLeftMargin = 5; +const verticalAxisTicksLeftMargin = spacing * 3; + +const horizontalAxisLabelHeight = spacing * 2; +const horizontalAxisTicksHeight = spacing * 6; +const horizontalAxisLabelBottomMargin = 0; +const horizontalAxisTicksBottomMargin = spacing * 5; + +const innerPaddingX = spacing * 3; +const innerPaddingY = spacing * 2; + +const cornerDash = 10; +const draggingOverlayColor = '#FFFFFF66'; + +const tickTextHeight = 14; + +const actionsWidth = 40; +const actionsHeight = 40; + +/** + * @param {Options} options + * @param {...ChartRenderer} charts + * @returns {HTMLDivElement} + */ +const ChartCanvas = (options, ...charts) => { + loadStylesheet('chartCanvas', stylesheet); + + const canvasWidth = van.state(0); + const canvasHeight = van.state(0); + + const topLeft = van.state({x: 0, y: 0}); + const topRight = van.state({x: 0, y: 0}); + const bottomLeft = van.state({x: 0, y: 0}); + const bottomRight = van.state({x: 0, y: 0}); + + const xAxisChartRange = van.state({min: 0, max: 0}); + const yAxisChartRange = van.state({min: 0, max: 0}); + + const xAxisLabel = van.state(null); + const xAxisDataRange = van.state({min: 0, max: 0}); + const initialXAxisDataRange = van.state({min: 0, max: 0}); + const xAxisTicksCount = van.state(8); + const xRenderLine = van.state(false); + const xRenderGridLines = van.state(true); + + const yAxisLabel = van.state(null); + const yAxisDataRange = van.state({min: 0, max: 0}); + const initialYAxisDataRange = van.state({min: 0, max: 0}); + const yAxisTicksCount = van.state(4); + const yRenderLine = van.state(false); + const yRenderGridLines = van.state(false); + + const legendRenderer = van.state(null); + const markersRenderer = van.state(null); + + const dataPoints = van.state([]); + const dataPointsMapping = van.state({}); + + const isZoomed = 
van.state(false); + const isDragZooming = van.state(false); + const dragZoomStartingPoint = van.state(null); + const dragZoomCurrentPoint = van.state(null); + const isHoveringOver = van.state(false); + + let /** @type {SVGElement?} */ interactiveLayerSvg; + + const DOMIdSuffix = getRandomId(); + const getDOMId = (domId) => `${domId}-${DOMIdSuffix}`; + + const asSVGX = (value) => scale(value, {old: xAxisDataRange.rawVal, new: xAxisChartRange.rawVal}, bottomLeft.rawVal.x); + const asSVGY = (value) => scale(value, {old: yAxisDataRange.rawVal, new: yAxisChartRange.rawVal}, bottomLeft.rawVal.y); + + van.derive(() => { + canvasWidth.val = getValue(options.width); + }); + + van.derive(() => { + canvasHeight.val = getValue(options.height); + }); + + van.derive(() => { + const axisConfig = getValue(options.axis); + const originalPoints = getValue(options.points); + + const xRange = {min: axisConfig?.x?.min, max: axisConfig?.x?.max}; + const yRange = {min: axisConfig?.y?.min, max: axisConfig?.y?.max}; + + if (!xRange.min || !xRange.max) { + const xAxisValues = originalPoints.map(p => p.x); + xRange.min = Math.min(...xAxisValues); + xRange.max = Math.max(...xAxisValues); + } + + if (!yRange.min || !yRange.max) { + const yAxisValues = originalPoints.map(p => p.y); + yRange.min = Math.min(...yAxisValues); + yRange.max = Math.max(...yAxisValues); + } + + xAxisLabel.val = axisConfig?.x?.label ?? null; + xAxisTicksCount.val = axisConfig?.x?.ticksCount ?? 8; + xAxisDataRange.val = {min: xRange.min, max: xRange.max}; + initialXAxisDataRange.val = {...xAxisDataRange.rawVal}; + xRenderLine.val = axisConfig?.x?.renderLine ?? false; + xRenderGridLines.val = axisConfig?.x?.renderGridLines ?? false; + + yAxisLabel.val = axisConfig?.y?.label ?? null; + yAxisTicksCount.val = axisConfig?.y?.ticksCount ?? 4; + yAxisDataRange.val = {min: yRange.min, max: yRange.max}; + initialYAxisDataRange.val = {...yAxisDataRange.rawVal}; + yRenderLine.val = axisConfig?.y?.renderLine ?? 
false; + yRenderGridLines.val = axisConfig?.y?.renderGridLines ?? false; + }); + + van.derive(() => { + legendRenderer.val = getValue(options.legend); + }); + + van.derive(() => { + markersRenderer.val = getValue(options.markers); + }); + + van.derive(() => { + xAxisChartRange.val; + yAxisChartRange.val; + + const originalPoints = getValue(options.points); + const dataPoints_ = []; + const dataPointsMapping_ = {}; + + for (const original of originalPoints) { + const point = {x: asSVGX(original.x), y: asSVGY(original.y)}; + dataPoints_.push(point); + dataPointsMapping_[`${original.x}-${original.y}`] = point; + } + + dataPoints.val = dataPoints_; + dataPointsMapping.val = dataPointsMapping_; + }); + + const resizeChartBoundaries = () => { + const marginTop = topLegendHeight; + const marginBottom = (xAxisLabel.rawVal ? horizontalAxisLabelHeight : 0) + horizontalAxisTicksHeight; + + let marginLeft = (yAxisLabel.rawVal ? verticalAxisLabelWidth : 0) + spacing * 2; + const yAxisElement = document.getElementById(getDOMId('y-axis-ticks-group')); + if (yAxisElement) { + const box = yAxisElement.getBoundingClientRect(); + marginLeft += box.width; + } + + topLeft.val = {x: marginLeft, y: marginTop}; + topRight.val = {x: canvasWidth.rawVal, y: marginTop}; + bottomLeft.val = {x: marginLeft, y: Math.max(canvasHeight.rawVal - marginBottom, 0)}; + bottomRight.val = {x: canvasWidth.rawVal, y: Math.max(canvasHeight.rawVal - marginBottom, 0)}; + + xAxisChartRange.val = {min: bottomLeft.rawVal.x + innerPaddingX, max: bottomRight.rawVal.x - innerPaddingX}; + yAxisChartRange.val = {min: bottomLeft.rawVal.y - innerPaddingY, max: topLeft.rawVal.y + innerPaddingY}; + }; + + van.derive(() => { + canvasWidth.val; + canvasHeight.val; + resizeChartBoundaries(); + + xAxisDataRange.val = {...xAxisDataRange.rawVal}; + yAxisDataRange.val = {...yAxisDataRange.rawVal}; + }); + + const startDragZoom = (event) => { + interactiveLayerSvg = event.target.parentNode; + dragZoomStartingPoint.val = 
screenToSvgCoordinates(interactiveLayerSvg, event); + isDragZooming.val = true; + document.addEventListener('mousemove', updateDragZoomRect); + document.addEventListener('mouseup', stopDragZoom); + document.addEventListener('touchmove', updateDragZoomRect); + document.addEventListener('touchend', stopDragZoom); + }; + const updateDragZoomRect = (event) => { + if (isDragZooming.val) { + dragZoomCurrentPoint.val = screenToSvgCoordinates(interactiveLayerSvg, event); + } + }; + const stopDragZoom = (event) => { + document.removeEventListener('mousemove', updateDragZoomRect); + document.removeEventListener('mouseup', stopDragZoom); + document.removeEventListener('touchmove', updateDragZoomRect); + document.removeEventListener('touchend', stopDragZoom); + + const startingPoint = dragZoomStartingPoint.rawVal; + const currentPoint = screenToSvgCoordinates(interactiveLayerSvg, event); + + isDragZooming.val = false; + dragZoomStartingPoint.val = null; + dragZoomCurrentPoint.val = null; + + const selectedMinX = Math.min(startingPoint.x, currentPoint.x); + const selectedMaxX = Math.max(startingPoint.x, currentPoint.x); + const selectedMinY = Math.min(startingPoint.y, currentPoint.y); + const selectedMaxY = Math.max(startingPoint.y, currentPoint.y); + + const selectedWidth = selectedMaxX - selectedMinX; + const selectedHeight = selectedMaxY - selectedMinY; + + if (selectedWidth > 0 || selectedHeight > 0) { + const currentXDataRange = xAxisDataRange.rawVal; + const currentYDataRange = yAxisDataRange.rawVal; + const currentXChartRange = xAxisChartRange.rawVal; + const currentYChartRange = yAxisChartRange.rawVal; + + let newXDataMin = scale(selectedMinX, {old: currentXChartRange, new: currentXDataRange}, 0); + let newXDataMax = scale(selectedMaxX, {old: currentXChartRange, new: currentXDataRange}, 0); + let newYDataMin = scale(selectedMinY, {old: currentYChartRange, new: currentYDataRange}, 0); + let newYDataMax = scale(selectedMaxY, {old: currentYChartRange, new: 
currentYDataRange}, 0); + +            if (newXDataMin > newXDataMax) [newXDataMin, newXDataMax] = [newXDataMax, newXDataMin]; +            if (newYDataMin > newYDataMax) [newYDataMin, newYDataMax] = [newYDataMax, newYDataMin]; + +            xAxisDataRange.val = {min: newXDataMin, max: newXDataMax}; +            yAxisDataRange.val = {min: newYDataMin, max: newYDataMax}; + +            isZoomed.val = true; +        } +    }; + +    const getSharedDefinitions = (drawingAreaClipId, yAxisClipId, xAxisClipId) => defs( +        {}, +        clipPath( +            {id: getDOMId(drawingAreaClipId)}, +            () => rect({ +                x: topLeft.val.x, +                y: topLeft.val.y, +                width: Math.max(bottomRight.val.x - bottomLeft.val.x, 0), +                height: Math.max(bottomLeft.val.y - topLeft.val.y, 0), +            }), +        ), +        yAxisClipId ? clipPath( +            {id: getDOMId(yAxisClipId)}, +            () => rect({ +                x: 0, +                y: topLeft.val.y - 10, +                width: 999999.9, +                height: Math.max(bottomLeft.val.y - topLeft.val.y, 0), +            }), +        ) : undefined, +        xAxisClipId ? clipPath( +            {id: getDOMId(xAxisClipId)}, +            () => rect({ +                x: topLeft.val.x, +                y: topLeft.val.y, +                width: Math.max(bottomRight.val.x - bottomLeft.val.x, 0), +                height: 999999.9, +            }), +        ) : undefined, +    ); + +    const resetZoom = () => { +        isZoomed.val = false; +        xAxisDataRange.val = {...initialXAxisDataRange.rawVal}; +        yAxisDataRange.val = {...initialYAxisDataRange.rawVal}; +        dataPoints.val = [...dataPoints.rawVal]; +    }; + +    const getPoint = (original) => { +        let point = dataPointsMapping.rawVal[`${original.x}-${original.y}`]; +        if (!point) { +            point = {x: asSVGX(original.x), y: asSVGY(original.y)}; +        } +        return {...point, originalX: original.x, originalY: original.y}; +    }; + +    const tooltipText = van.state(''); +    const shouldShowTooltip = van.state(false); +    const tooltipExtraStyle = van.state(''); +    const tooltipElement = Tooltip({ +        text: tooltipText, +        show: shouldShowTooltip, +        position: '--', +        style: tooltipExtraStyle, +    }); +    const showTooltip = (message, point) => { +        let timeout; + +        tooltipText.val = message; +        tooltipExtraStyle.val = 'visibility: hidden;'; +
shouldShowTooltip.val = true; + + timeout = setTimeout(() => { + const tooltipRect = tooltipElement.getBoundingClientRect(); + let tooltipX = point.x + 10; + let tooltipY = point.y + 10; + + if (tooltipX + tooltipRect.width >= bottomRight.rawVal.x) { + tooltipX = point.x - tooltipRect.width - 10; + } + + tooltipExtraStyle.val = `transform: translate(${tooltipX}px, ${tooltipY}px);`; + + clearTimeout(timeout); + }, 0); + }; + const hideTooltip = () => { + tooltipText.val = ''; + tooltipExtraStyle.val = ''; + shouldShowTooltip.val = false; + }; + + return div( + { + id: getDOMId('chart-canvas'), + class: 'tg-chart', + style: () => `width: ${canvasWidth.val}px; height: ${canvasHeight.val}px;`, + onmouseenter: () => isHoveringOver.val = true, + onmouseleave: () => isHoveringOver.val = false, + }, + svg( + { + width: '100%', + height: '100%', + style: 'z-index: 0;', + class: 'tg-chart-layer axis-layer', + viewBox: () => `0 0 ${canvasWidth.val} ${canvasHeight.val}`, + }, + getSharedDefinitions('axis-clippath', 'y-axis-ticks-clippath', 'x-axis-ticks-clippath'), + () => { + const maxY = canvasHeight.val; + const yLabelPos = {x: verticalAxisLabelLeftMargin, y: (bottomLeft.val.y - topLeft.val.y) / 2 + topLeft.val.y}; + const xLabelPos = {x: (bottomRight.val.x - bottomLeft.val.x) / 2, y: maxY - horizontalAxisLabelBottomMargin}; + + return g( + {}, + yAxisLabel.val ? text({...yLabelPos, 'text-anchor': 'middle', 'dominant-baseline': 'central', transform: `rotate(-90, ${yLabelPos.x}, ${yLabelPos.y})`, fill: 'var(--caption-text-color)'}, yAxisLabel.val) : null, + xAxisLabel.val ? 
text({...xLabelPos, fill: 'var(--caption-text-color)'}, xAxisLabel.val) : null, + ); + }, + () => { + const {min: yMin, max: yMax} = yAxisDataRange.val; + const ticks = niceTicks(yMin, yMax, yAxisTicksCount.val); + if (!yAxisLabel.val) { + return g(); + } + + afterMount(() => { + resizeChartBoundaries(); + }); + + return g( + {}, + g( + {id: getDOMId('y-axis-ticks-group'), 'clip-path': `url(#${getDOMId('y-axis-ticks-clippath')})`}, + ...ticks.map(value => { + const tickY = asSVGY(value); + if (tickY < topLeft.rawVal.y || (tickY + tickTextHeight) > bottomLeft.rawVal.y) { + return undefined; + } + + return text( + {x: verticalAxisTicksLeftMargin, y: tickY, class: 'text-small', 'dominant-baseline': 'central', fill: 'var(--caption-text-color)'}, + Math.floor(value * 1000) / 1000, + ); + }), + ), + () => yRenderGridLines.val ? g( + {'clip-path': `url(#${getDOMId('y-axis-ticks-clippath')})`}, + ...ticks.map(value => { + const tickY = asSVGY(value); + if (tickY < topLeft.rawVal.y || (tickY + tickTextHeight) > bottomLeft.rawVal.y) { + return undefined; + } + + return line({ + x1: bottomLeft.val.x, + y1: tickY, + x2: bottomRight.val.x, + y2: tickY, + stroke: colorMap.lightGrey, + }); + }), + ) : g(), + ); + }, + () => { + xAxisChartRange.val; + + const maxY = canvasHeight.val; + const {min: xMin, max: xMax} = xAxisDataRange.val; + const ticks = getAdaptiveTimeTicks([xMin, xMax], 4, 8); + const labels = formatSmartTimeTicks(ticks); + + return g( + {}, + g( + {id: getDOMId('x-axis-ticks-group'), 'clip-path': `url(#${getDOMId('x-axis-ticks-clippath')})`}, + ...ticks.map((value, idx) => { + const tickX = asSVGX(value.getTime()); + const labelLines = typeof labels[idx] === 'string' ? 
[labels[idx]] : labels[idx]; + return g( + {}, + labelLines.map((line, lineIdx) => text( + {x: tickX, y: maxY - horizontalAxisTicksBottomMargin + (lineIdx * 15), 'text-anchor': 'middle', 'dominant-baseline': 'central', class: 'text-small', fill: 'var(--caption-text-color)'}, + line, + )), + ); + }), + ), + () => xRenderGridLines.val ? g( + {'clip-path': `url(#${getDOMId('x-axis-ticks-clippath')})`}, + ...ticks.map(value => { + const tickX = asSVGX(value.getTime()); + + return line({ + x1: tickX, + y1: bottomRight.val.y, + x2: tickX, + y2: topRight.val.y, + stroke: colorMap.lightGrey, + }); + }), + ) : g(), + ); + }, + g( + {}, + () => yRenderLine.val ? line({x1: bottomLeft.val.x, y1: bottomLeft.val.y, x2: topLeft.val.x, y2: topLeft.val.y, stroke: colorMap.grey }) : g(), + () => xRenderLine.val ? line({x1: bottomLeft.val.x, y1: bottomLeft.val.y, x2: bottomRight.val.x, y2: bottomRight.val.y, stroke: colorMap.grey }) : g(), + ), + ), + svg( + { + width: '100%', + height: '100%', + style: 'z-index: 2;', + class: 'tg-chart-layer interactive-layer', + viewBox: () => `0 0 ${canvasWidth.val} ${canvasHeight.val}`, + }, + getSharedDefinitions('markers-clippath'), + () => { + const width = bottomRight.val.x - bottomLeft.val.x; + const height = bottomLeft.val.y - topLeft.val.y; + + return rect({ + x: topLeft.val.x, + y: topLeft.val.y, + width: Math.max(width, 0), + height: Math.max(height, 0), + fill: isDragZooming.val ? 
draggingOverlayColor : 'transparent', +                ontouchstart: startDragZoom, +                onmousedown: startDragZoom, +            }); +        }, +        () => { +            const children = []; +            if (legendRenderer.val) { +                children.push( +                    legendRenderer.rawVal({y: 20, x: topLeft.val.x}), +                ); +            } + +            if (markersRenderer.val) { +                children.push( +                    g( +                        {'clip-path': `url(#${getDOMId('markers-clippath')})`}, +                        markersRenderer.rawVal(getPoint, showTooltip, hideTooltip), +                    ) +                ); +            } + +            if (isHoveringOver.val) { +                children.push( +                    foreignObject( +                        {y: 0, x: canvasWidth.val - actionsWidth - (spacing * 2), width: actionsWidth, height: actionsHeight, class: 'visible-overflow'}, +                        withTooltip( +                            Button({ +                                type: 'icon', +                                icon: 'zoom_out_map', +                                iconSize: 20, +                                style: 'overflow: visible;', +                                onclick: resetZoom, +                            }), +                            {position: 'bottom-left', text: 'Autoscale'}, +                        ), +                    ) +                ); +            } + +            if (children.length <= 0) { +                children.push(g()); +            } + +            return g( +                {class: 'visible-overflow'}, +                ...children, +            ); +        }, +        () => { +            const isDragging = isDragZooming.val; +            const currentPoint = dragZoomCurrentPoint.val; +            const startingPoint = dragZoomStartingPoint.rawVal; +            if (!isDragging || !currentPoint || !startingPoint) { +                return g(); // NOTE: vanjs+svg might have an issue: if this is null, subsequent state changes won't trigger this reactive function +            } + +            const x = Math.min(startingPoint.x, currentPoint.x); +            const y = Math.min(startingPoint.y, currentPoint.y); +            const rectHeight = Math.abs(currentPoint?.y - startingPoint?.y); +            const rectWidth = Math.abs(currentPoint?.x - startingPoint?.x); + +            const strokeDashArray = [ +                cornerDash, +                rectWidth - cornerDash*2, +                cornerDash + 0.001, +                0.001, +                cornerDash, +                rectHeight - cornerDash*2, +                cornerDash, +                0.001, +                cornerDash, +                rectWidth - cornerDash*2, +                cornerDash, +                0.001, +                cornerDash, +                rectHeight - cornerDash*2, +                cornerDash, +                0.001, +            ]; + +            return g( +                {style: 'z-index: 3;'}, +                rect({ +                    x: x, +                    y: y, +                    width: rectWidth, +                    height: rectHeight, +                    fill: 'transparent', +
stroke: colorMap.grey, + 'stroke-width': 3, + 'stroke-dasharray': strokeDashArray.join(','), + }), + ); + }, + foreignObject({fill: 'none', width: '100%', height: '100%', 'pointer-events': 'none', style: 'overflow: visible;'}, tooltipElement), + ), + svg( + { + width: '100%', + height: '100%', + style: 'z-index: 1;', + viewBox: () => `0 0 ${canvasWidth.val} ${canvasHeight.val}`, + }, + getSharedDefinitions('charts-clippath'), + g( + {'clip-path': `url(#${getDOMId('charts-clippath')})`}, + ...charts.map((renderer) => () => { + const dataPointsMapping_ = dataPointsMapping.val; + if (Object.keys(dataPointsMapping_).length <= 0) { + return g(); + } + + return renderer( + { minX: 0, minY: 0, width: canvasWidth.val, height: canvasHeight.val }, + { topLeft: topLeft.val, topRight: topRight.val, bottomLeft: bottomLeft.val, bottomRight: bottomRight.val }, + getPoint, + ); + }), + ), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-chart { + position: relative; +} + +.tg-chart > svg { + z-index: 1; +} + +.tg-chart > svg { + position: absolute; +} +`); + +export { ChartCanvas }; diff --git a/testgen/ui/components/frontend/js/components/crontab_input.js b/testgen/ui/components/frontend/js/components/crontab_input.js index 3bb489a3..5f0fc190 100644 --- a/testgen/ui/components/frontend/js/components/crontab_input.js +++ b/testgen/ui/components/frontend/js/components/crontab_input.js @@ -1,16 +1,12 @@ /** + * @import { CronSample } from '../types.js'; + * * @typedef EditOptions * @type {object} * @property {CronSample?} sample * @property {(expr: string) => void} onChange * @property {(() => void)?} onClose * - * @typedef CronSample - * @type {object} - * @property {string?} error - * @property {string[]?} samples - * @property {string?} readable_expr - * * @typedef InitialValue * @type {object} * @property {string} timezone @@ -19,10 +15,13 @@ * @typedef Options * @type {object} * @property {(string|null)} id + * @property {(string|null)} name * 
@property {string?} testId + * @property {string?} class + * @property {CronSample?} sample + * @property {InitialValue?} value + * @property {('x_hours'|'x_days'|'certain_days'|'custom')[]?} modes + * @property {boolean?} hideExpression + * @property {((expr: string) => void)?} onChange */ import { getRandomId, getValue, loadStylesheet } from '../utils.js'; @@ -67,6 +66,7 @@ const CrontabInput = (/** @type Options */ props) => { { id: domId, class: () => `tg-crontab-input ${getValue(props.class) ?? ''}`, + style: 'position: relative', 'data-testid': getValue(props.testId) ?? null, }, div( @@ -76,6 +76,7 @@ const CrontabInput = (/** @type Options */ props) => { } }}, Input({ + name: props.name ?? getRandomId(), label: 'Schedule', icon: 'calendar_clock', readonly: true, @@ -85,12 +86,14 @@ const CrontabInput = (/** @type Options */ props) => { }), ), Portal( - {target: domId.val, align: 'right', style: 'width: 500px;', opened}, + {target: domId.val, targetRelative: true, align: 'right', style: 'width: 500px;', opened}, () => CrontabEditorPortal( { onChange: onEditorChange, onClose: () => opened.val = false, sample: props.sample, + modes: props.modes, + hideExpression: props.hideExpression, }, expression, ), @@ -109,11 +112,13 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { const xHoursState = { hours: van.state(1), minute: van.state(0), + startHour: van.state(0), }; const xDaysState = { days: van.state(1), hour: van.state(1), minute: van.state(0), + startDay: van.state(1), }; const certainDaysState = { sunday: van.state(false), @@ -134,12 +139,30 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { if (mode.val === 'x_hours') { const hours = xHoursState.hours.val; const minute = xHoursState.minute.val; - options.onChange(`${minute ?? 0} ${(hours && hours !== 1) ?
'*/' + hours : '*'} * * *`); + const startHour = xHoursState.startHour.val; + let hourField; + if (!hours || hours <= 1) { + hourField = '*'; + } else if (startHour > 0) { + hourField = generateSteppedValues(startHour, hours, 23); + } else { + hourField = '*/' + hours; + } + options.onChange(`${minute ?? 0} ${hourField} * * *`); } else if (mode.val === 'x_days') { const days = xDaysState.days.val; const hour = xDaysState.hour.val; const minute = xDaysState.minute.val; - options.onChange(`${minute ?? 0} ${hour ?? 0} ${(days && days !== 1) ? '*/' + days : '*'} * *`); + const startDay = xDaysState.startDay.val; + let dayField; + if (!days || days <= 1) { + dayField = '*'; + } else if (startDay > 1) { + dayField = generateSteppedValues(startDay, days, 31); + } else { + dayField = '*/' + days; + } + options.onChange(`${minute ?? 0} ${hour ?? 0} ${dayField} * *`); } else if (mode.val === 'certain_days') { const days = []; const dayMap = [ @@ -186,34 +209,34 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { { class: 'tg-crontab-editor-content flex-row' }, div( { class: 'tg-crontab-editor-left flex-column' }, - span( + !options.modes || options.modes.includes('x_hours') ? span( { class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'x_hours' ? 'selected' : ''}`, onclick: () => mode.val = 'x_hours', }, 'Every x hours', - ), - span( + ) : null, + !options.modes || options.modes.includes('x_days') ? span( { class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'x_days' ? 'selected' : ''}`, onclick: () => mode.val = 'x_days', }, 'Every x days', - ), - span( + ) : null, + !options.modes || options.modes.includes('certain_days') ? span( { class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'certain_days' ? 'selected' : ''}`, onclick: () => mode.val = 'certain_days', }, 'On certain days', - ), - span( + ) : null, + !options.modes || options.modes.includes('custom') ? span( { class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'custom' ? 
'selected' : ''}`, onclick: () => mode.val = 'custom', }, 'Custom', - ), + ) : null, ), div( { class: 'tg-crontab-editor-right flex-column p-4 fx-flex' }, @@ -224,16 +247,19 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { span({}, 'Every'), () => Select({ label: "", - options: Array.from({length: 24}, (_, i) => i).map(i => ({label: i.toString(), value: i})), + options: Array.from({length: 24}, (_, i) => i + 1).map(i => ({label: i.toString(), value: i})), triggerStyle: 'inline', portalClass: 'tg-crontab--select-portal', value: xHoursState.hours, - onChange: (value) => xHoursState.hours.val = value, + onChange: (value) => { + xHoursState.hours.val = value; + if (value <= 1) xHoursState.startHour.val = 0; + }, }), span({}, 'hours'), ), div( - {class: 'flex-row fx-gap-2'}, + {class: () => `flex-row fx-gap-2 ${xHoursState.hours.val > 1 ? 'mb-2' : ''}`}, span({}, 'on'), span({}, 'minute'), () => Select({ @@ -245,6 +271,18 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { onChange: (value) => xHoursState.minute.val = value, }), ), + div( + {class: () => `flex-row fx-gap-2 ${xHoursState.hours.val > 1 ? '' : 'hidden'}`}, + span({}, 'starting at hour'), + () => Select({ + label: "", + options: Array.from({length: 24}, (_, i) => i).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xHoursState.startHour, + onChange: (value) => xHoursState.startHour.val = value, + }), + ), ), div( { class: () => `${mode.val === 'x_days' ? 
'' : 'hidden'}`}, @@ -257,12 +295,15 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { triggerStyle: 'inline', portalClass: 'tg-crontab--select-portal', value: xDaysState.days, - onChange: (value) => xDaysState.days.val = value, + onChange: (value) => { + xDaysState.days.val = value; + if (value <= 1) xDaysState.startDay.val = 1; + }, }), span({}, 'days'), ), div( - {class: 'flex-row fx-gap-2'}, + {class: () => `flex-row fx-gap-2 ${xDaysState.days.val > 1 ? 'mb-2' : ''}`}, span({}, 'at'), () => Select({ label: "", @@ -281,6 +322,18 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { onChange: (value) => xDaysState.minute.val = value, }), ), + div( + {class: () => `flex-row fx-gap-2 ${xDaysState.days.val > 1 ? '' : 'hidden'}`}, + span({}, 'starting on day'), + () => Select({ + label: "", + options: Array.from({length: 31}, (_, i) => i + 1).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xDaysState.startDay, + onChange: (value) => xDaysState.startDay.val = value, + }), + ), ), div( { class: () => `${mode.val === 'certain_days' ? '' : 'hidden'}`}, @@ -369,7 +422,7 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { div( {class: 'flex-column fx-gap-1 mt-3 text-secondary'}, () => span( - { class: mode.val === 'custom' ? 'hidden': '' }, + { class: mode.val === 'custom' || getValue(options.hideExpression) ? 'hidden': '' }, `Cron Expression: ${expr.val ?? 
''}`, ), () => div( @@ -408,6 +461,25 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { ); }; +function generateSteppedValues(start, step, max) { + const values = []; + for (let i = start; i <= max; i += step) { + values.push(i); + } + return values.join(','); +} + +function parseSteppedList(field) { + const values = field.split(',').map(Number); + if (values.length < 2 || values.some(isNaN)) return null; + const step = values[1] - values[0]; + if (step <= 0) return null; + for (let i = 2; i < values.length; i++) { + if (values[i] - values[i - 1] !== step) return null; + } + return { start: values[0], step }; +} + /** * Populates the state variables for the initial mode based on the cron expression * @param {string} expr @@ -419,21 +491,35 @@ const CrontabEditorPortal = ({sample, ...options}, expr) => { function populateInitialModeState(expr, mode, xHoursState, xDaysState, certainDaysState) { const parts = (expr || '').trim().split(/\s+/); if (mode === 'x_hours' && parts.length === 5) { - // e.g. "M */H * * *" or "M * * * *" xHoursState.minute.val = Number(parts[0]) || 0; if (parts[1].startsWith('*/')) { xHoursState.hours.val = Number(parts[1].slice(2)) || 1; + xHoursState.startHour.val = 0; + } else if (parts[1].includes(',')) { + const parsed = parseSteppedList(parts[1]); + if (parsed) { + xHoursState.hours.val = parsed.step; + xHoursState.startHour.val = parsed.start; + } } else { xHoursState.hours.val = 1; + xHoursState.startHour.val = 0; } } else if (mode === 'x_days' && parts.length === 5) { - // e.g. 
"M H */D * *" or "M H * * *" xDaysState.minute.val = Number(parts[0]) || 0; xDaysState.hour.val = Number(parts[1]) || 0; if (parts[2].startsWith('*/')) { xDaysState.days.val = Number(parts[2].slice(2)) || 1; + xDaysState.startDay.val = 1; + } else if (parts[2].includes(',')) { + const parsed = parseSteppedList(parts[2]); + if (parsed) { + xDaysState.days.val = parsed.step; + xDaysState.startDay.val = parsed.start; + } } else { xDaysState.days.val = 1; + xDaysState.startDay.val = 1; } } else if (mode === 'certain_days' && parts.length === 5) { // e.g. "M H * * DAY[,DAY...]" @@ -464,14 +550,22 @@ function populateInitialModeState(expr, mode, xHoursState, xDaysState, certainDa function determineMode(expression) { // Normalize whitespace const expr = (expression || '').trim().replace(/\s+/g, ' '); - // x_hours: "M */H * * *" or "M * * * *" + // x_hours: "M */H * * *" or "M * * * *" or "M H1,H2,... * * *" if (/^\d{1,2} \*\/\d+ \* \* \*$/.test(expr) || /^\d{1,2} \* \* \* \*$/.test(expr)) { return 'x_hours'; } - // x_days: "M H */D * *" or "M H * * *" + if (/^\d{1,2} \d+(,\d+)+ \* \* \*$/.test(expr)) { + const hourField = expr.split(' ')[1]; + if (parseSteppedList(hourField)) return 'x_hours'; + } + // x_days: "M H */D * *" or "M H * * *" or "M H D1,D2,... * *" if (/^\d{1,2} \d{1,2} \*\/\d+ \* \*$/.test(expr) || /^\d{1,2} \d{1,2} \* \* \*$/.test(expr)) { return 'x_days'; } + if (/^\d{1,2} \d{1,2} \d+(,\d+)+ \* \*$/.test(expr)) { + const dayField = expr.split(' ')[2]; + if (parseSteppedList(dayField)) return 'x_days'; + } // certain_days: "M H * * DAY[,DAY...]" (DAY = SUN,MON,...) 
if (/^\d{1,2} \d{1,2} \* \* ((SUN|MON|TUE|WED|THU|FRI|SAT)(-(SUN|MON|TUE|WED|THU|FRI|SAT))?(,)?)+$/.test(expr)) { return 'certain_days'; @@ -481,6 +575,10 @@ function determineMode(expression) { const stylesheet = new CSSStyleSheet(); stylesheet.replace(` +.tg-crontab-input { + position: relative; +} + .tg-crontab-display { border-bottom: 1px dashed var(--border-color); } @@ -493,6 +591,7 @@ stylesheet.replace(` } .tg-crontab-editor-content { + align-items: stretch; border-bottom: 1px solid var(--border-color); } @@ -527,4 +626,4 @@ stylesheet.replace(` } `); -export { CrontabInput }; +export { CrontabInput, parseSteppedList }; diff --git a/testgen/ui/components/frontend/js/components/dual_pane.js b/testgen/ui/components/frontend/js/components/dual_pane.js new file mode 100644 index 00000000..65d89266 --- /dev/null +++ b/testgen/ui/components/frontend/js/components/dual_pane.js @@ -0,0 +1,80 @@ +/** + * @typedef Options + * @property {('left'|'right')} resizablePanel + * @property {string} resizablePanelDomId + * @property {number} minSize + * @property {number} maxSize + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; + +const { div, span } = van.tags; +const EMPTY_IMAGE = new Image(1, 1); +EMPTY_IMAGE.src = 'data:image/gif;base64,R0lGODlhAQABAIAAAP///wAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw=='; + +/** + * + * @param {Options} options + * @param {HTMLElement?} left + * @param {HTMLElement?} right + * @returns + */ +const DualPane = function (options, left, right) { + loadStylesheet('dualPanel', stylesheet); + + const dragState = van.state(null); + const dragConstraints = { min: options.minSize, max: options.maxSize }; + const dragResize = (/** @type Event */ event) => { + // https://stackoverflow.com/questions/36308460/why-is-clientx-reset-to-0-on-last-drag-event-and-how-to-solve-it + if (event.screenX && dragState.val) { + const dragWidth = dragState.val.startWidth + (event.screenX - dragState.val.startX) * 
(options.resizablePanel === 'right' ? -1 : 1); + const constrainedWidth = Math.min(dragConstraints.max, Math.max(dragWidth, dragConstraints.min)); + + const element = document.getElementById(options.resizablePanelDomId); + if (element) { + element.style.minWidth = `${constrainedWidth}px`; + } + } + }; + + return div( + { ...options, class: () => `tg-dualpane flex-row fx-align-flex-start ${getValue(options.class) ?? ''}` }, + left, + div( + { + class: 'tg-dualpane-divider', + draggable: true, + ondragstart: (event) => { + event.dataTransfer.effectAllowed = 'move'; + event.dataTransfer.setDragImage(EMPTY_IMAGE, 0, 0); + + const element = document.getElementById(options.resizablePanelDomId); + dragState.val = { startX: event.screenX, startWidth: element.offsetWidth }; + }, + ondragend: (event) => { + dragResize(event); + dragState.val = null; + }, + ondrag: (event) => dragState.rawVal ? dragResize(event) : null, + }, + '', + ), + right, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + .tg-dualpane { + /* height: auto; */ + } + + .tg-dualpane-divider { + min-height: 100px; + place-self: stretch; + cursor: col-resize; + min-width: 16px; + } +`); + +export { DualPane }; diff --git a/testgen/ui/components/frontend/js/components/empty_state.js b/testgen/ui/components/frontend/js/components/empty_state.js index 1ac1f55e..86628c88 100644 --- a/testgen/ui/components/frontend/js/components/empty_state.js +++ b/testgen/ui/components/frontend/js/components/empty_state.js @@ -58,6 +58,10 @@ const EMPTY_STATE_MESSAGE = { line1: '', line2: 'Configure an SMTP email server for TestGen to get alerts on profiling runs, test runs, and quality scorecards.', }, + monitors: { + line1: 'Monitor your tables', + line2: 'Set up freshness, volume, and schema monitors on your data to detect anomalies.', + }, }; const EmptyState = (/** @type Properties */ props) => { diff --git a/testgen/ui/components/frontend/js/components/freshness_chart.js
b/testgen/ui/components/frontend/js/components/freshness_chart.js new file mode 100644 index 00000000..919136fa --- /dev/null +++ b/testgen/ui/components/frontend/js/components/freshness_chart.js @@ -0,0 +1,211 @@ +/** + * @import {ChartViewBox, Point} from './chart_canvas.js'; + * + * @typedef Options + * @type {object} + * @property {number} width + * @property {number} height + * @property {number} lineWidth + * @property {number} lineHeight + * @property {number} markerSize + * @property {Point?} nestedPosition + * @property {ChartViewBox?} viewBox + * @property {Function?} showTooltip + * @property {Function?} hideTooltip + * @property {{startX: number?, endX: number, startTime: number?, endTime: number}?} predictedWindow + * + * @typedef FreshnessEvent + * @type {object} + * @property {Point} point + * @property {number} time + * @property {boolean} changed + * @property {string} status + * @property {string} message + * @property {boolean} isTraining + * @property {boolean} isPending + */ +import van from '../van.min.js'; +import { colorMap, formatTimestamp } from '../display_utils.js'; +import { getValue } from '../utils.js'; + +const { div, span } = van.tags; +const { circle, g, line, rect, svg } = van.tags("http://www.w3.org/2000/svg"); +const colorByStatus = { + Passed: colorMap.limeGreen, + Failed: colorMap.red, + Warning: colorMap.orange, + Log: colorMap.blueLight, +}; + +/** + * @param {Options} options + * @param {Array} events + */ +const FreshnessChart = (options, ...events) => { + const _options = { + ...defaultOptions, + ...(options ?? 
{}), + }; + + const minX = van.state(0); + const minY = van.state(0); + const width = van.state(0); + const height = van.state(0); + + van.derive(() => { + const viewBox = getValue(_options.viewBox); + width.val = viewBox?.width; + height.val = viewBox?.height; + minX.val = viewBox?.minX; + minY.val = viewBox?.minY; + }); + + const freshnessEvents = events.map(event => { + if (event.isPending) { + return null; + } + + const point = event.point; + const minY = point.y - (_options.lineHeight / 2) + 2; + const maxY = point.y + (_options.lineHeight / 2) - 2; + const lineProps = { x1: point.x, y1: minY, x2: point.x, y2: maxY }; + const eventColor = getFreshnessEventColor(event); + const markerProps = _options.showTooltip ? { + onmouseenter: () => _options.showTooltip?.(FreshnessChartTooltip(event), point), + onmouseleave: () => _options.hideTooltip?.(), + } : {}; + + return g( + {...markerProps}, + event.changed + ? line({ + ...lineProps, + style: `stroke: ${eventColor}; stroke-width: ${event.isTraining ? '1' : _options.lineWidth};`, + }) + : null, + !['Passed', 'Log'].includes(event.status) + ? rect({ + width: _options.markerSize, + height: _options.markerSize, + x: lineProps.x1 - (_options.markerSize / 2), + y: maxY - (_options.markerSize / 2), + fill: eventColor, + style: `transform-box: fill-box; transform-origin: center;`, + transform: 'rotate(45)', + }) + : circle({ + cx: lineProps.x1, + cy: maxY, + r: 2, + fill: event.isTraining ? 
'var(--dk-dialog-background)' : eventColor,
+                style: `stroke: ${eventColor}; stroke-width: 1;`,
+            }),
+            // Larger hit area for tooltip
+            rect({
+                width: _options.markerSize,
+                height: _options.lineHeight,
+                x: lineProps.x1 - (_options.markerSize / 2),
+                y: 0,
+                fill: 'transparent',
+                style: `transform-box: fill-box; transform-origin: center;`,
+            })
+        );
+    });
+
+    const extraAttributes = {};
+    if (_options.nestedPosition) {
+        extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x;
+        extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y;
+    } else {
+        extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`;
+    }
+
+    return svg(
+        {
+            width: '100%',
+            height: '100%',
+            ...extraAttributes,
+        },
+        ...freshnessEvents,
+        FreshnessPredictedWindow(_options),
+    );
+};
+
+const /** @type Options */ defaultOptions = {
+    width: 600,
+    height: 200,
+    lineWidth: 2,
+    lineHeight: 5,
+    markerSize: 8,
+    nestedPosition: {x: 0, y: 0},
+};
+
+/**
+ * @param {FreshnessEvent} event
+ * @returns
+ */
+const getFreshnessEventColor = (event) => {
+    if (!event.changed && (event.status === 'Passed' || event.isTraining)) {
+        return colorMap.emptyDark;
+    }
+    return colorByStatus[event.status];
+};
+
+/**
+ * @param {FreshnessEvent} event
+ * @returns {HTMLDivElement}
+ */
+const FreshnessChartTooltip = (event) => {
+    return div(
+        {class: 'flex-column'},
+        span({class: 'text-left mb-1'}, formatTimestamp(event.time, false)),
+        span(
+            {class: 'text-left text-small'},
+            `${event.changed ? 'Table updated' : 'No update'}${event.message ? ' - ' + event.message : ''}`,
+        ),
+    );
+};
+
+/**
+ * @param {Options} options
+ * @returns {SVGGElement|null}
+ */
+const FreshnessPredictedWindow = (options) => {
+    const window = getValue(options.predictedWindow);
+    if (!window) return null;
+
+    const barHeight = getValue(options.height);
+    const startX = window.startX ?? window.endX;
+    const windowWidth = window.endX - startX;
+    if (windowWidth <= 0) return null;
+
+    const markerProps = options.showTooltip ? {
+        onmouseenter: () => options.showTooltip?.(FreshnessWindowTooltip(window), {x: startX + windowWidth / 2, y: barHeight / 2}),
+        onmouseleave: () => options.hideTooltip?.(),
+    } : {};
+
+    return g(
+        {...markerProps},
+        rect({
+            width: windowWidth,
+            height: barHeight,
+            x: startX,
+            y: 0,
+            fill: colorMap.emptyDark,
+            opacity: 0.15,
+            rx: 2,
+        }),
+    );
+};
+
+const FreshnessWindowTooltip = (window) => {
+    return div(
+        {class: 'flex-column'},
+        span({class: 'text-left mb-1'}, 'Next update expected'),
+        window.startTime
+            ? span({class: 'text-left text-small'}, `${formatTimestamp(window.startTime, false)} - ${formatTimestamp(window.endTime, false)}`)
+            : span({class: 'text-left text-small'}, `By ${formatTimestamp(window.endTime, false)}`),
+    );
+};
+
+export { FreshnessChart };
diff --git a/testgen/ui/components/frontend/js/components/help_menu.js b/testgen/ui/components/frontend/js/components/help_menu.js
index 1a364a23..3ea341db 100644
--- a/testgen/ui/components/frontend/js/components/help_menu.js
+++ b/testgen/ui/components/frontend/js/components/help_menu.js
@@ -23,7 +23,7 @@ import { Icon } from './icon.js';
 
 const { a, div, span } = van.tags;
 
-const baseHelpUrl = 'https://docs.datakitchen.io/articles/#!dataops-testgen-help/';
+const baseHelpUrl = 'https://docs.datakitchen.io/articles/dataops-testgen-help/';
 
 const releaseNotesTopic = 'testgen-release-notes';
 const upgradeTopic = 'upgrade-testgen';
diff --git a/testgen/ui/components/frontend/js/components/icon.js b/testgen/ui/components/frontend/js/components/icon.js
index b4e879d7..6f76331b 100644
--- a/testgen/ui/components/frontend/js/components/icon.js
+++ b/testgen/ui/components/frontend/js/components/icon.js
@@ -1,8 +1,9 @@
 /**
  * @typedef Properties
  * @type {object}
+ * @property {string?} classes
  * @property {number?} size
- * @property {string} classes
+ * @property {boolean?} filled
  */
 import { getValue, isDataURL, loadStylesheet } from '../utils.js';
 import van from '../van.min.js';
@@ -18,7 +19,7 @@ const Icon = (/** @type Properties */ props, /** @type string */ icon) => {
         {
             width: () => getValue(props.size) || DEFAULT_SIZE,
             height: () => getValue(props.size) || DEFAULT_SIZE,
             src: icon,
-            class: () => `tg-icon tg-icon-image ${getValue(props.classes)}`,
+            class: () => `tg-icon tg-icon-image ${getValue(props.classes) ?? ''}`,
         }
     );
@@ -26,7 +27,7 @@ const Icon = (/** @type Properties */ props, /** @type string */ icon) => {
 
     return i(
         {
-            class: () => `material-symbols-rounded tg-icon text-secondary ${getValue(props.classes)}`,
+            class: () => `material-symbols-rounded tg-icon text-secondary ${getValue(props.filled) ? 'material-symbols-filled' : ''} ${getValue(props.classes) ?? ''}`,
             style: () => `font-size: ${getValue(props.size) || DEFAULT_SIZE}px;`,
             ...props,
         },
diff --git a/testgen/ui/components/frontend/js/components/link.js b/testgen/ui/components/frontend/js/components/link.js
index 96468963..f92f3fb2 100644
--- a/testgen/ui/components/frontend/js/components/link.js
+++ b/testgen/ui/components/frontend/js/components/link.js
@@ -17,6 +17,7 @@
  * @property {string?} tooltip
  * @property {string?} tooltipPosition
  * @property {boolean?} disabled
+ * @property {((event: any) => void)?} onClick
  */
 import { emitEvent, enforceElementWidth, getValue, loadStylesheet } from '../utils.js';
 import van from '../van.min.js';
@@ -42,6 +43,7 @@ const Link = (/** @type Properties */ props) => {
     const href = getValue(props.href);
     const params = getValue(props.params) ?? {};
     const open_new = !!getValue(props.open_new);
+    const onClick = getValue(props.onClick);
     const showTooltip = van.state(false);
     const isExternal = /http(s)?:\/\//.test(href);
@@ -54,11 +56,11 @@
             style: props.style,
             href: isExternal ? href : `/${href}${getQueryFromParams(params)}`,
             target: open_new ? '_blank' : '',
-            onclick: open_new ? null : (event) => {
+            onclick: open_new ? null : (onClick ?? ((event) => {
                 event.preventDefault();
                 event.stopPropagation();
                 emitEvent('LinkClicked', { href, params });
-            },
+            })),
             onmouseenter: props.tooltip ? (() => showTooltip.val = true) : undefined,
             onmouseleave: props.tooltip ? (() => showTooltip.val = false) : undefined,
         },
diff --git a/testgen/ui/components/frontend/js/components/monitor_anomalies_summary.js b/testgen/ui/components/frontend/js/components/monitor_anomalies_summary.js
new file mode 100644
index 00000000..5b53a219
--- /dev/null
+++ b/testgen/ui/components/frontend/js/components/monitor_anomalies_summary.js
@@ -0,0 +1,126 @@
+/**
+ * @typedef MonitorSummary
+ * @type {object}
+ * @property {number} freshness_anomalies
+ * @property {number} volume_anomalies
+ * @property {number} schema_anomalies
+ * @property {number} metric_anomalies
+ * @property {boolean?} freshness_has_errors
+ * @property {boolean?} volume_has_errors
+ * @property {boolean?} schema_has_errors
+ * @property {boolean?} metric_has_errors
+ * @property {boolean?} freshness_is_training
+ * @property {boolean?} volume_is_training
+ * @property {boolean?} metric_is_training
+ * @property {boolean?} freshness_is_pending
+ * @property {boolean?} volume_is_pending
+ * @property {boolean?} schema_is_pending
+ * @property {boolean?} metric_is_pending
+ * @property {number} lookback
+ * @property {number} lookback_start
+ * @property {number} lookback_end
+ * @property {string?} project_code
+ * @property {string?} table_group_id
+ *
+ * @typedef SummaryOptions
+ * @type {object}
+ * @property {function(string)?} onTagClick
+ * @property {object?} activeTypes
+ */
+import { emitEvent, getValue, loadStylesheet } from '../utils.js';
+import { formatDuration, humanReadableDuration } from '../display_utils.js';
+import { withTooltip } from './tooltip.js';
+import van from '../van.min.js';
+
+const { a, div, i, span } = van.tags;
+
+/**
+ * @param {MonitorSummary} summary
+ * @param {string?} label
+ * @param {SummaryOptions?} options
+ */
+const AnomaliesSummary = (summary, label = 'Anomalies', options = {}) => {
+    loadStylesheet('anomalies-summary', summaryStylesheet);
+
+    if (!summary.lookback) {
+        return span({class: 'text-secondary mt-3 mb-2'}, 'No monitor runs yet');
+    }
+
+    const SummaryTag = (typeKey, tagLabel, value, hasErrors, isTraining, isPending) => {
+        const isClickable = !!options.onTagClick;
+        const isActive = van.derive(() => (getValue(options.activeTypes) ?? []).includes(typeKey));
+
+        return div(
+            {
+                class: () => `flex-row fx-gap-1 p-1 border-radius-1 summary-tag ${isClickable ? 'clickable' : ''} ${isActive.val ? 'active' : ''}`,
+                onclick: isClickable ? (event) => {
+                    event.stopPropagation();
+                    options.onTagClick(typeKey);
+                } : undefined,
+            },
+            div(
+                {class: `flex-row fx-justify-center anomaly-tag ${value > 0 ? 'has-anomalies' : hasErrors ? 'has-errors' : isTraining ? 'is-training' : isPending ? 'is-pending' : ''}`},
+                value > 0
+                    ? value
+                    : hasErrors
+                        ? withTooltip(
+                            i({class: 'material-symbols-rounded'}, 'warning'),
+                            {text: 'Execution error', position: 'top-right'},
+                        )
+                        : isTraining
+                            ? withTooltip(
+                                i({class: 'material-symbols-rounded'}, 'more_horiz'),
+                                {text: 'Training model', position: 'top-right'},
+                            )
+                            : isPending
+                                ? withTooltip(
+                                    span({class: 'pl-2 pr-2', style: 'position: relative;'}, '-'),
+                                    {text: 'No results yet or not configured'},
+                                )
+                                : i({class: 'material-symbols-rounded'}, 'check'),
+            ),
+            span({}, tagLabel),
+        );
+    };
+
+    const numRuns = summary.lookback === 1 ? 'run' : `${summary.lookback} runs`;
+    const duration = humanReadableDuration(formatDuration(summary.lookback_start, new Date()), true);
+    const labelElement = span({class: 'text-small text-secondary'}, `${label} in last ${numRuns} (${duration})`);
+
+    const contentElement = div(
+        {class: 'flex-row fx-gap-5'},
+        SummaryTag('freshness', 'Freshness', summary.freshness_anomalies, summary.freshness_has_errors, summary.freshness_is_training, summary.freshness_is_pending),
+        SummaryTag('volume', 'Volume', summary.volume_anomalies, summary.volume_has_errors, summary.volume_is_training, summary.volume_is_pending),
+        SummaryTag('schema', 'Schema', summary.schema_anomalies, summary.schema_has_errors, false, summary.schema_is_pending),
+        SummaryTag('metrics', 'Metrics', summary.metric_anomalies, summary.metric_has_errors, summary.metric_is_training, summary.metric_is_pending),
+    );
+
+    if (summary.project_code && summary.table_group_id) {
+        return a(
+            {
+                class: `flex-column fx-gap-2 clickable`,
+                style: 'text-decoration: none; color: unset;',
+                href: summary.table_group_id ? `/monitors?project_code=${summary.project_code}&table_group_id=${summary.table_group_id}` : null,
+                onclick: summary.table_group_id ? (event) => {
+                    event.preventDefault();
+                    event.stopPropagation();
+                    emitEvent('LinkClicked', { href: 'monitors', params: {project_code: summary.project_code, table_group_id: summary.table_group_id} });
+                } : null,
+            },
+            labelElement,
+            contentElement,
+        );
+    }
+
+    return div({class: 'flex-column fx-gap-2'}, labelElement, contentElement);
+};
+
+const summaryStylesheet = new CSSStyleSheet();
+summaryStylesheet.replace(`
+.summary-tag.clickable:hover,
+.summary-tag.active {
+    background: var(--select-hover-background);
+}
+`);
+
+export { AnomaliesSummary };
diff --git a/testgen/ui/components/frontend/js/components/monitor_settings_form.js b/testgen/ui/components/frontend/js/components/monitor_settings_form.js
new file mode 100644
index 00000000..edd88d7a
--- /dev/null
+++ b/testgen/ui/components/frontend/js/components/monitor_settings_form.js
@@ -0,0 +1,405 @@
+/**
+ * @import { CronSample } from '../types.js';
+ *
+ * @typedef Schedule
+ * @type {object}
+ * @property {string?} cron_tz
+ * @property {string} cron_expr
+ * @property {boolean} active
+ *
+ * @typedef MonitorSuite
+ * @type {object}
+ * @property {string?} id
+ * @property {string?} table_groups_id
+ * @property {string?} test_suite
+ * @property {number?} monitor_lookback
+ * @property {boolean?} monitor_regenerate_freshness
+ * @property {('low'|'medium'|'high')?} predict_sensitivity
+ * @property {number?} predict_min_lookback
+ * @property {boolean?} predict_exclude_weekends
+ * @property {string?} predict_holiday_codes
+ *
+ * @typedef FormState
+ * @type {object}
+ * @property {boolean} dirty
+ * @property {boolean} valid
+ *
+ * @typedef Properties
+ * @type {object}
+ * @property {Schedule} schedule
+ * @property {MonitorSuite} monitorSuite
+ * @property {CronSample?} cronSample
+ * @property {boolean?} hideActiveCheckbox
+ * @property {(sch: Schedule, ts: MonitorSuite, state: FormState) => void} onChange
+ */
+import van from '../van.min.js';
+import { getValue, isEqual, loadStylesheet, emitEvent } from '../utils.js';
+import { Input } from './input.js';
+import { RadioGroup } from './radio_group.js';
+import { Caption } from './caption.js';
+import { Select } from './select.js';
+import { Checkbox } from './checkbox.js';
+import { CrontabInput, parseSteppedList } from './crontab_input.js';
+import { Icon } from './icon.js';
+import { Link } from './link.js';
+import { withTooltip } from './tooltip.js';
+import { numberBetween, required } from '../form_validators.js';
+import { timezones, holidayCodes } from '../values.js';
+import { formatDurationSeconds, humanReadableDuration } from '../display_utils.js';
+
+const { div, span } = van.tags;
+
+const monitorLookbackConfig = {
+    default: 14,
+    min: 1,
+    max: 200,
+};
+const predictLookbackConfig = {
+    default: 30,
+    min: 20,
+    max: 1000,
+};
+
+/**
+ * @param {Properties} props
+ * @returns
+ */
+const MonitorSettingsForm = (props) => {
+    loadStylesheet('monitor-settings-form', stylesheet);
+
+    const schedule = getValue(props.schedule) ?? {};
+    const cronTimezone = van.state(schedule.cron_tz ?? Intl.DateTimeFormat().resolvedOptions().timeZone);
+    const cronExpression = van.state(schedule.cron_expr ?? '0 */12 * * *');
+    const scheduleActive = van.state(schedule.active ?? true);
+
+    const monitorSuite = getValue(props.monitorSuite) ?? {};
+    const monitorLookback = van.state(monitorSuite.monitor_lookback ?? monitorLookbackConfig.default);
+    const monitorRegenerateFreshness = van.state(monitorSuite.monitor_regenerate_freshness ?? true);
+    const predictSensitivity = van.state(monitorSuite.predict_sensitivity ?? 'medium');
+    const predictMinLookback = van.state(monitorSuite.predict_min_lookback ?? predictLookbackConfig.default);
+    const predictExcludeWeekends = van.state(monitorSuite.predict_exclude_weekends ?? false);
+    const predictHolidayCodes = van.state(monitorSuite.predict_holiday_codes);
+
+    const updatedSchedule = van.derive(() => {
+        return {
+            cron_tz: cronTimezone.val,
+            cron_expr: cronExpression.val,
+            active: scheduleActive.val,
+        };
+    });
+    const updatedTestSuite = van.derive(() => {
+        return {
+            id: monitorSuite.id,
+            table_groups_id: monitorSuite.table_groups_id,
+            test_suite: monitorSuite.test_suite,
+            monitor_lookback: monitorLookback.val,
+            monitor_regenerate_freshness: monitorRegenerateFreshness.val,
+            predict_sensitivity: predictSensitivity.val,
+            predict_min_lookback: predictMinLookback.val,
+            predict_exclude_weekends: predictExcludeWeekends.val,
+            predict_holiday_codes: predictHolidayCodes.val,
+        };
+    });
+
+    const dirty = van.derive(() => !isEqual(updatedSchedule.val, schedule) || !isEqual(updatedTestSuite.val, monitorSuite));
+    const validityPerField = van.state({});
+
+    van.derive(() => {
+        const fieldsValidity = validityPerField.val;
+        const isValid = Object.keys(fieldsValidity).length > 0 &&
+            Object.values(fieldsValidity).every(v => v);
+        props.onChange?.(updatedSchedule.val, updatedTestSuite.val, { dirty: dirty.val, valid: isValid });
+    });
+
+    const setFieldValidity = (field, validity) => {
+        validityPerField.val = {...validityPerField.rawVal, [field]: validity};
+    };
+
+    return div(
+        { class: 'flex-column fx-gap-4' },
+        MainForm(
+            { setValidity: setFieldValidity },
+            monitorLookback,
+            monitorRegenerateFreshness,
+            cronExpression,
+        ),
+        ScheduleForm(
+            {
+                hideActiveCheckbox: getValue(props.hideActiveCheckbox),
+                originalActive: schedule.active ?? true,
+                cronSample: props.cronSample,
+                setValidity: setFieldValidity,
+            },
+            cronTimezone,
+            cronExpression,
+            scheduleActive,
+        ),
+        PredictionForm(
+            { setValidity: setFieldValidity },
+            predictSensitivity,
+            predictMinLookback,
+            predictExcludeWeekends,
+            predictHolidayCodes,
+        ),
+    );
+};
+
+const MainForm = (
+    options,
+    monitorLookback,
+    monitorRegenerateFreshness,
+    cronExpression,
+) => {
+    return div(
+        { class: 'flex-column fx-gap-4' },
+        div(
+            { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap monitor-settings-row' },
+            Input({
+                name: 'monitor_lookback',
+                label: 'Lookback Runs',
+                value: monitorLookback,
+                help: 'Number of monitor runs to summarize on dashboard views',
+                helpPlacement: 'bottom-right',
+                type: 'number',
+                step: 1,
+                onChange: (value, state) => {
+                    monitorLookback.val = value;
+                    options.setValidity?.('monitor_lookback', state.valid);
+                },
+                validators: [
+                    numberBetween(monitorLookbackConfig.min, monitorLookbackConfig.max, 1),
+                ],
+            }),
+            () => {
+                const cronDuration = determineDuration(cronExpression.val);
+                if (!cronDuration || !monitorLookback.val) {
+                    return span({});
+                }
+
+                const lookbackDuration = monitorLookback.val * cronDuration;
+                return div(
+                    { class: 'flex-column' },
+                    div(
+                        { class: 'flex-row fx-gap-1 text-caption mt-1 mb-3' },
+                        span('Lookback Window (calculated)'),
+                        withTooltip(
+                            Icon({ size: 16, classes: 'text-disabled' }, 'help'),
+                            { text: 'Time window to summarize on dashboard views. Calculated based on Lookback Runs and Schedule.', width: 200 },
+                        )
+                    ),
+                    span(humanReadableDuration(formatDurationSeconds(lookbackDuration))),
+                );
+            }
+        ),
+        div(
+            { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap mb-2 monitor-settings-row' },
+            Checkbox({
+                name: 'monitor_regenerate_freshness',
+                label: 'Reconfigure Freshness monitors after profiling',
+                help: 'When enabled, Freshness monitors will be automatically reconfigured with new fingerprints after each profiling run',
+                width: 350,
+                checked: monitorRegenerateFreshness,
+                onChange: (value) => monitorRegenerateFreshness.val = value,
+            }),
+        ),
+    );
+};
+
+const ScheduleForm = (
+    options,
+    cronTimezone,
+    cronExpression,
+    scheduleActive,
+) => {
+    const cronEditorValue = van.derive(() => {
+        if (cronExpression.val && cronTimezone.val) {
+            emitEvent('GetCronSample', {payload: {cron_expr: cronExpression.val, tz: cronTimezone.val}});
+        }
+        return {
+            timezone: cronTimezone.val,
+            expression: cronExpression.val,
+        };
+    });
+
+    return div(
+        { class: 'flex-column fx-gap-3 border border-radius-1 p-3', style: 'position: relative;' },
+        Caption({content: 'Monitor Schedule', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }),
+        div(
+            { class: 'flex-row fx-gap-3 fx-flex-wrap fx-align-flex-start monitor-settings-row' },
+            () => Select({
+                label: 'Timezone',
+                options: timezones.map(tz_ => ({label: tz_, value: tz_})),
+                value: cronTimezone,
+                allowNull: false,
+                filterable: true,
+                onChange: (value) => cronTimezone.val = value,
+                portalClass: 'short-select-portal',
+            }),
+            CrontabInput({
+                name: 'monitor_settings_schedule',
+                sample: options.cronSample,
+                value: cronEditorValue,
+                modes: ['x_hours', 'x_days'],
+                hideExpression: true,
+                onChange: (value) => cronExpression.val = value,
+            }),
+        ),
+        !options.hideActiveCheckbox
+            ? div(
+                { class: 'flex-row fx-gap-6 fx-flex-wrap' },
+                Checkbox({
+                    name: 'schedule_active',
+                    label: 'Activate schedule',
+                    checked: scheduleActive,
+                    onChange: (value) => scheduleActive.val = value,
+                }),
+                () => !scheduleActive.val
+                    ? div(
+                        { class: 'flex-row fx-gap-1' },
+                        Icon({ style: 'font-size: 16px; color: var(--purple);' }, 'info'),
+                        span(
+                            { class: 'text-caption', style: 'color: var(--purple);' },
+                            options.originalActive ? 'Monitor schedule will be paused.' : 'Monitor schedule is paused.',
+                        ),
+                    )
+                    : '',
+            )
+            : null,
+    );
+};
+
+const PredictionForm = (
+    options,
+    predictSensitivity,
+    predictMinLookback,
+    predictExcludeWeekends,
+    predictHolidayCodes,
+) => {
+    const excludeHolidays = van.state(!!predictHolidayCodes.val);
+    return div(
+        { class: 'flex-column fx-gap-4 border border-radius-1 p-3', style: 'position: relative;' },
+        Caption({content: 'Prediction Model', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }),
+        div(
+            { class: 'flex-row fx-gap-3 fx-flex-wrap monitor-settings-row' },
+            RadioGroup({
+                name: 'predict_sensitivity',
+                label: 'Sensitivity',
+                options: [
+                    { label: 'Low', value: 'low', help: 'Fewer alerts. Volume/Metric: 3 standard deviations. Freshness: wider interval tolerance.' },
+                    { label: 'Medium', value: 'medium', help: 'Balanced. Volume/Metric: 2.5 standard deviations. Freshness: moderate interval tolerance.' },
+                    { label: 'High', value: 'high', help: 'More alerts. Volume/Metric: 2 standard deviations. Freshness: tighter interval tolerance.' },
+                ],
+                value: predictSensitivity,
+                onChange: (value) => predictSensitivity.val = value,
+            }),
+            Input({
+                name: 'predict_min_lookback',
+                label: 'Minimum Training Lookback',
+                value: predictMinLookback,
+                help: 'Minimum number of monitor runs to use for training models',
+                type: 'number',
+                step: 1,
+                onChange: (value, state) => {
+                    predictMinLookback.val = value;
+                    options.setValidity?.('predict_min_lookback', state.valid);
+                },
+                validators: [
+                    numberBetween(predictLookbackConfig.min, predictLookbackConfig.max, 1),
+                ],
+            }),
+        ),
+        Checkbox({
+            name: 'predict_exclude_weekends',
+            label: 'Exclude weekends from training',
+            width: 250,
+            checked: predictExcludeWeekends,
+            onChange: (value) => predictExcludeWeekends.val = value,
+        }),
+        Checkbox({
+            name: 'predict_exclude_holidays',
+            label: 'Exclude holidays from training',
+            width: 250,
+            checked: excludeHolidays,
+            onChange: (value) => excludeHolidays.val = value,
+        }),
+        () => excludeHolidays.val
+            ? div(
+                { style: 'width: 250px; margin: -8px 0 0 25px; position: relative;' },
+                Input({
+                    name: 'predict_holiday_codes',
+                    label: 'Holiday Codes',
+                    value: predictHolidayCodes,
+                    help: 'Comma-separated list of country or financial market codes',
+                    autocompleteOptions: holidayCodes,
+                    onChange: (value, state) => {
+                        predictHolidayCodes.val = value;
+                        options.setValidity?.('predict_holiday_codes', state.valid);
+                    },
+                    validators: [
+                        required,
+                    ],
+                }),
+                div(
+                    { class: 'flex-row fx-gap-1 mt-1 text-caption' },
+                    span({}, 'See supported'),
+                    Link({
+                        open_new: true,
+                        label: 'codes',
+                        href: 'https://holidays.readthedocs.io/en/latest/#available-countries',
+                        right_icon: 'open_in_new',
+                        right_icon_size: 13,
+                    }),
+                ),
+            )
+            : '',
+    );
+};
+
+/**
+ * @param {string} expression
+ * @returns {number?}
+ */
+function determineDuration(expression) {
+    // Normalize whitespace
+    const expr = (expression || '').trim().replace(/\s+/g, ' ');
+    // "M * * * *"
+    if (/^\d{1,2} \* \* \* \*$/.test(expr)) {
+        return 60 * 60; // 1 hour
+    }
+    // "M */H * * *"
+    let match = expr.match(/^\d{1,2} \*\/(\d+) \* \* \*$/);
+    if (match) {
+        return Number(match[1]) * 60 * 60; // H hours
+    }
+    // "M H1,H2,... * * *" (stepped hours with starting offset)
+    if (/^\d{1,2} \d+(,\d+)+ \* \* \*$/.test(expr)) {
+        const parsed = parseSteppedList(expr.split(' ')[1]);
+        if (parsed) return parsed.step * 60 * 60;
+    }
+    // "M H * * *"
+    if (/^\d{1,2} \d{1,2} \* \* \*$/.test(expr)) {
+        return 24 * 60 * 60; // 1 day
+    }
+    // "M H */D * *"
+    match = expr.match(/^\d{1,2} \d{1,2} \*\/(\d+) \* \*$/);
+    if (match) {
+        return Number(match[1]) * 24 * 60 * 60; // D days
+    }
+    // "M H D1,D2,... * *" (stepped days with starting offset)
+    if (/^\d{1,2} \d{1,2} \d+(,\d+)+ \* \*$/.test(expr)) {
+        const parsed = parseSteppedList(expr.split(' ')[2]);
+        if (parsed) return parsed.step * 24 * 60 * 60;
+    }
+    return null;
+}
+
+const stylesheet = new CSSStyleSheet();
+stylesheet.replace(`
+.monitor-settings-row > * {
+    flex: 250px;
+}
+`);
+
+export { MonitorSettingsForm };
diff --git a/testgen/ui/components/frontend/js/components/monitoring_sparkline.js b/testgen/ui/components/frontend/js/components/monitoring_sparkline.js
new file mode 100644
index 00000000..716fb047
--- /dev/null
+++ b/testgen/ui/components/frontend/js/components/monitoring_sparkline.js
@@ -0,0 +1,276 @@
+/**
+ * @import {ChartViewBox, Point} from './chart_canvas.js';
+ *
+ * @typedef Options
+ * @type {object}
+ * @property {ChartViewBox} viewBox
+ * @property {string} lineColor
+ * @property {number} lineWidth
+ * @property {string} markerColor
+ * @property {number} markerSize
+ * @property {Point?} nestedPosition
+ * @property {number[]?} yAxisTicks
+ * @property {Object?} attributes
+ * @property {PredictionPoint[]?} prediction
+ * @property {('predict'|'static')?} predictionMethod
+ *
+ * @typedef MonitoringPoint
+ * @type {Object}
+ * @property {number} x
+ * @property {number} y
+ * @property {string?} label
+ * @property {boolean?} isAnomaly
+ * @property {boolean?} isTraining
+ * @property {boolean?} isPending
+ * @property {number?} lowerTolerance
+ * @property {number?} upperTolerance
+ *
+ * @typedef PredictionPoint
+ * @type {Object}
+ * @property {number} x
+ * @property {number?} y
+ * @property {number} upper
+ * @property {number} lower
+ */
+import van from '../van.min.js';
+import { colorMap, formatNumber, formatTimestamp } from '../display_utils.js';
+import { getValue } from '../utils.js';
+
+const { div, span } = van.tags();
+const { circle, g, path, polyline, rect, svg } = van.tags("http://www.w3.org/2000/svg");
+
+/**
+ * @param {Options} options
+ * @param {MonitoringPoint[]} points
+ */
+const MonitoringSparklineChart = (options, ...points) => {
+    const _options = {
+        ...defaultOptions,
+        ...(options ?? {}),
+    };
+
+    const minX = van.state(0);
+    const minY = van.state(0);
+    const width = van.state(0);
+    const height = van.state(0);
+    const linePoints = van.state(points.filter(e => !e.isPending));
+    const isStaticPrediction = _options.predictionMethod === 'static';
+    const predictionPoints = van.derive(() => {
+        const _linePoints = linePoints.val;
+        const _predictionPoints = _options.prediction ?? [];
+        if (_linePoints.length > 0 && _predictionPoints.length > 0) {
+            const lastPoint = _linePoints[_linePoints.length - 1];
+            if (isStaticPrediction) {
+                _predictionPoints.unshift({
+                    x: lastPoint.x,
+                    y: lastPoint.y,
+                    upper: lastPoint.upperTolerance ?? lastPoint.y,
+                    lower: lastPoint.lowerTolerance ?? lastPoint.y,
+                });
+            } else {
+                _predictionPoints.unshift({
+                    x: lastPoint.x,
+                    y: lastPoint.y,
+                    upper: lastPoint.upperTolerance ?? lastPoint.y,
+                    lower: lastPoint.lowerTolerance ?? lastPoint.y,
+                });
+            }
+        }
+        return _predictionPoints;
+    });
+
+    van.derive(() => {
+        const viewBox = getValue(_options.viewBox);
+        width.val = viewBox?.width;
+        height.val = viewBox?.height;
+        minX.val = viewBox?.minX;
+        minY.val = viewBox?.minY;
+    });
+
+    const extraAttributes = {...(_options.attributes ?? {})};
+    if (_options.nestedPosition) {
+        extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x;
+        extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y;
+    } else {
+        extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`;
+    }
+
+    return svg(
+        {
+            width: '100%',
+            height: '100%',
+            ...extraAttributes,
+        },
+        () => {
+            const validPoints = linePoints.val.filter(p =>
+                Number.isFinite(p.x) && Number.isFinite(p.y)
+            );
+            if (validPoints.length < 2) return '';
+            return polyline({
+                points: validPoints.map(point => `${point.x} ${point.y}`).join(', '),
+                style: `stroke: ${getValue(_options.lineColor)}; stroke-width: ${getValue(_options.lineWidth)};`,
+                fill: 'none',
+            });
+        },
+        () => {
+            const tolerancePoints = linePoints.val.filter(p =>
+                Number.isFinite(p.lowerTolerance) || Number.isFinite(p.upperTolerance)
+            );
+            if (tolerancePoints.length < 2) return '';
+
+            return path({
+                d: generateTolerancePath(tolerancePoints, _options.height, getValue(_options.lineWidth)),
+                fill: colorMap.blue,
+                'fill-opacity': 0.1,
+                stroke: 'none',
+            });
+        },
+        () => {
+            const validPoints = predictionPoints.rawVal.filter(p =>
+                Number.isFinite(p.x) && (Number.isFinite(p.upper) || Number.isFinite(p.lower))
+            );
+            if (validPoints.length < 2) return '';
+            return path({
+                d: generateShadowPath(validPoints, _options.height),
+                fill: colorMap.emptyDark,
+                opacity: 0.25,
+                stroke: 'none',
+            });
+        },
+        () => {
+            if (isStaticPrediction) return '';
+            const validPoints = predictionPoints.rawVal.filter(p =>
+                Number.isFinite(p.x) && Number.isFinite(p.y)
+            );
+            if (validPoints.length < 2) return '';
+            return polyline({
+                points: validPoints.map(point => `${point.x} ${point.y}`).join(', '),
+                style: `stroke: ${getValue(colorMap.grey)}; stroke-width: ${getValue(_options.lineWidth)};`,
+                fill: 'none',
+            });
+        },
+    );
+};
+
+function generateTolerancePath(points, chartHeight, minHeight = 0) {
+    const getBounds = (p) => {
+        let upper = Number.isFinite(p.upperTolerance) ? p.upperTolerance : 0;
+        let lower = Number.isFinite(p.lowerTolerance) ? p.lowerTolerance : chartHeight;
+        const height = lower - upper;
+        if (minHeight > 0 && height < minHeight) {
+            const midpoint = (upper + lower) / 2;
+            const halfMin = minHeight / 2;
+            upper = midpoint - halfMin;
+            lower = midpoint + halfMin;
+        }
+        return { upper, lower };
+    };
+
+    const bounds = points.map(getBounds);
+
+    let pathString = `M ${points[0].x} ${bounds[0].upper}`;
+    for (let i = 1; i < points.length; i++) {
+        pathString += ` L ${points[i].x} ${bounds[i].upper}`;
+    }
+    for (let i = points.length - 1; i >= 0; i--) {
+        pathString += ` L ${points[i].x} ${bounds[i].lower}`;
+    }
+    pathString += ' Z';
+    return pathString;
+}
+
+function generateShadowPath(data, chartHeight) {
+    const getUpper = (p) => Number.isFinite(p.upper) ? p.upper : 0;
+    const getLower = (p) => Number.isFinite(p.lower) ? p.lower : chartHeight;
+
+    let pathString = `M ${data[0].x} ${getUpper(data[0])}`;
+    for (let i = 1; i < data.length; i++) {
+        pathString += ` L ${data[i].x} ${getUpper(data[i])}`;
+    }
+    for (let i = data.length - 1; i >= 0; i--) {
+        pathString += ` L ${data[i].x} ${getLower(data[i])}`;
+    }
+    pathString += ' Z';
+    return pathString;
+}
+
+/**
+ * @param {*} options
+ * @param {MonitoringPoint[]} points
+ * @returns
+ */
+const MonitoringSparklineMarkers = (options, points) => {
+    return g(
+        {transform: options.transform ?? undefined},
+        ...points.map((point) => {
+            if (point.isPending || !Number.isFinite(point.x) || !Number.isFinite(point.y)) {
+                return null;
+            }
+
+            const size = options.anomalySize || defaultAnomalyMarkerSize;
+            return g(
+                {
+                    onmouseenter: () => options.showTooltip?.(MonitoringSparklineChartTooltip(point), point),
+                    onmouseleave: () => options.hideTooltip?.(),
+                },
+                circle({
+                    cx: point.x,
+                    cy: point.y,
+                    r: size,
+                    fill: 'transparent',
+                }),
+                point.isAnomaly
+                    ? rect({
+                        width: size,
+                        height: size,
+                        x: point.x - (size / 2),
+                        y: point.y - (size / 2),
+                        fill: options.anomalyColor || defaultAnomalyMarkerColor,
+                        style: `transform-box: fill-box; transform-origin: center;`,
+                        transform: 'rotate(45)',
+                    })
+                    : circle({
+                        cx: point.x,
+                        cy: point.y,
+                        r: options.size || defaultMarkerSize,
+                        fill: point.isTraining ? 'var(--dk-dialog-background)' : (options.color || defaultMarkerColor),
+                        style: `stroke: ${options.color || defaultMarkerColor}; stroke-width: 1;`,
+                    }),
+            );
+        }),
+    );
+};
+
+/**
+ * @param {MonitoringPoint} point
+ * @returns {HTMLDivElement}
+ */
+const MonitoringSparklineChartTooltip = (point) => {
+    return div(
+        {class: 'flex-column'},
+        span({class: 'text-left mb-1'}, formatTimestamp(point.originalX)),
+        span({class: 'text-left text-small'}, `${point.label || 'Value'}: ${formatNumber(point.originalY)}`),
+        point.lowerTolerance != undefined
+            ? span({class: 'text-left text-small'}, `Lower bound: ${formatNumber(point.originalLowerTolerance)}`)
+            : '',
+        point.upperTolerance != undefined
+            ? span({class: 'text-left text-small'}, `Upper bound: ${formatNumber(point.originalUpperTolerance)}`)
+            : '',
+    );
+};
+
+const /** @type Options */ defaultOptions = {
+    lineColor: colorMap.blueLight,
+    lineWidth: 3,
+    yAxisTicks: undefined,
+    attributes: {},
+};
+const defaultMarkerSize = 3;
+const defaultMarkerColor = colorMap.blueLight;
+const defaultAnomalyMarkerSize = 8;
+const defaultAnomalyMarkerColor = colorMap.red;
+
+export { MonitoringSparklineChart, MonitoringSparklineMarkers };
diff --git a/testgen/ui/components/frontend/js/components/portal.js b/testgen/ui/components/frontend/js/components/portal.js
index 51c82b63..12fa2e70 100644
--- a/testgen/ui/components/frontend/js/components/portal.js
+++ b/testgen/ui/components/frontend/js/components/portal.js
@@ -10,6 +10,7 @@
  * @property {boolean?} targetRelative
  * @property {boolean} opened
  * @property {'left' | 'right'} align
+ * @property {('top' | 'bottom')?} position
  * @property {(string|undefined)} style
  * @property {(string|undefined)} class
  */
@@ -19,7 +20,7 @@ import { getValue } from '../utils.js';
 const { div } = van.tags;
 
 const Portal = (/** @type Options */ options, ...args) => {
-    const { target, targetRelative, align = 'left' } = getValue(options);
+    const { target, targetRelative, align = 'left', position = 'bottom' } = getValue(options);
     const id = `${target}-portal`;
 
     window.testgen.portals[id] = { domId: id, targetId: target, opened: options.opened };
@@ -30,21 +31,13 @@ const Portal = (/** @type Options */ options, ...args) => {
         }
 
         const anchor = document.getElementById(target);
-        const anchorRect = anchor.getBoundingClientRect();
-        const top = (targetRelative ? 0 : anchorRect.top) + anchorRect.height;
-        const left = targetRelative ? 0 : anchorRect.left;
-        const right = targetRelative ? 0 : (window.innerWidth - anchorRect.right);
-        const minWidth = anchorRect.width;
-
         return div(
             {
                 id,
                 class: getValue(options.class) ?? '',
                 style: `position: absolute;
                     z-index: 99;
-                    min-width: ${minWidth}px;
-                    top: ${top}px;
-                    ${align === 'left' ? `left: ${left}px;` : `right: ${right}px;`}
+                    ${position === 'bottom' ? calculateBottomPosition(anchor, align, targetRelative) : calculateTopPosition(anchor, align, targetRelative)}
                     ${getValue(options.style)}`,
             },
             ...args,
@@ -52,4 +45,22 @@
     };
 };
 
+function calculateTopPosition(anchor, align, targetRelative) {
+    const anchorRect = anchor.getBoundingClientRect();
+    const bottom = (targetRelative ? anchorRect.height : anchorRect.top);
+    const left = targetRelative ? 0 : anchorRect.left;
+    const right = targetRelative ? 0 : (window.innerWidth - anchorRect.right);
+
+    return `min-width: ${anchorRect.width}px; bottom: ${bottom}px; ${align === 'left' ? `left: ${left}px;` : `right: ${right}px;`}`;
+}
+
+function calculateBottomPosition(anchor, align, targetRelative) {
+    const anchorRect = anchor.getBoundingClientRect();
+    const top = (targetRelative ? 0 : anchorRect.top) + anchorRect.height;
+    const left = targetRelative ? 0 : anchorRect.left;
+    const right = targetRelative ? 0 : (window.innerWidth - anchorRect.right);
+
+    return `min-width: ${anchorRect.width}px; top: ${top}px; ${align === 'left' ? `left: ${left}px;` : `right: ${right}px;`}`;
+}
+
 export { Portal };
diff --git a/testgen/ui/components/frontend/js/components/schema_changes_chart.js b/testgen/ui/components/frontend/js/components/schema_changes_chart.js
new file mode 100644
index 00000000..0116587d
--- /dev/null
+++ b/testgen/ui/components/frontend/js/components/schema_changes_chart.js
@@ -0,0 +1,163 @@
+/**
+ * @import {ChartViewBox, Point} from './chart_canvas.js';
+ *
+ * @typedef Options
+ * @type {object}
+ * @property {number} lineWidth
+ * @property {string} lineColor
+ * @property {number} markerSize
+ * @property {Point?} nestedPosition
+ * @property {ChartViewBox?} viewBox
+ * @property {Function?} showTooltip
+ * @property {Function?} hideTooltip
+ * @property {((e: SchemaEvent) => void)} onClick
+ *
+ * @typedef SchemaEvent
+ * @type {object}
+ * @property {Point} point
+ * @property {string | number} time
+ * @property {number} additions
+ * @property {number} deletions
+ * @property {number} modifications
+ * @property {string | number} window_start
+ */
+import van from '../van.min.js';
+import { colorMap, formatNumber, formatTimestamp } from '../display_utils.js';
+import { scale } from '../axis_utils.js';
+import { getValue } from '../utils.js';
+
+const { div, span } = van.tags();
+const { circle, g, rect, svg } = van.tags("http://www.w3.org/2000/svg");
+
+/**
+ * @param {Options} options
+ * @param {Array} events
+ */
+const SchemaChangesChart = (options, ...events) => {
+    const _options = {
+        ...defaultOptions,
+        ...(options ?? {}),
+    };
+
+    const minX = van.state(0);
+    const minY = van.state(0);
+    const width = van.state(0);
+    const height = van.state(0);
+
+    van.derive(() => {
+        const viewBox = getValue(_options.viewBox);
+        width.val = viewBox?.width;
+        height.val = viewBox?.height;
+        minX.val = viewBox?.minX;
+        minY.val = viewBox?.minY;
+    });
+
+    const currentViewBox = getValue(_options.viewBox);
+    const chartHeight = currentViewBox?.height ?? getValue(_options.height) ?? 100;
+
+    const maxValue = Math.ceil(Math.max(...events.map(e => Math.max(e.additions, e.deletions, e.modifications))) / 10) * 10 || 10;
+
+    const schemaEvents = events.map(e => {
+        const xPosition = e.point.x;
+        const markerProps = {};
+        const parts = [];
+
+        if (_options.showTooltip) {
+            markerProps.onmouseenter = () => _options.showTooltip?.(SchemaChangesChartTooltip(e), e.point);
+            markerProps.onmouseleave = () => _options.hideTooltip?.();
+        }
+
+        const totalChanges = e.additions + e.deletions + e.modifications;
+
+        if (totalChanges <= 0) {
+            parts.push(circle({
+                cx: xPosition,
+                cy: chartHeight - (_options.markerSize * 2),
+                r: _options.markerSize,
+                fill: colorMap.emptyDark,
+            }));
+        } else {
+            const barWidth = _options.lineWidth;
+            const gap = 1;
+            const groupWidth = (barWidth * 3) + (gap * 2);
+            const startX = xPosition - (groupWidth / 2);
+
+            const drawBar = (val, index, color) => {
+                const barHeight = scale(val, {old: {min: 0, max: maxValue}, new: {min: 0, max: chartHeight}});
+                const yPos = chartHeight - barHeight;
+
+                return rect({
+                    x: startX + (index * (barWidth + gap)),
+                    y: yPos,
+                    width: barWidth,
+                    height: Math.max(barHeight, 0),
+                    fill: color,
+                    'shape-rendering': 'crispEdges'
+                });
+            };
+
+            parts.push(drawBar(e.additions, 0, e.additions ? colorMap.blue : 'transparent'));
+            parts.push(drawBar(e.deletions, 1, e.deletions ? colorMap.orange : 'transparent'));
+            parts.push(drawBar(e.modifications, 2, e.modifications ? colorMap.purple : 'transparent'));
+
+            if (_options.onClick && totalChanges > 0) {
+                const barGroupWidth = (_options.lineWidth * 3) + 4;
+                const clickableWidth = Math.max(barGroupWidth + 4, 14);
+                parts.push(
+                    rect({
+                        width: clickableWidth,
+                        height: chartHeight,
+                        x: xPosition - (clickableWidth / 2),
+                        y: 0,
+                        fill: 'transparent',
+                        style: `transform-box: fill-box; transform-origin: center; cursor: pointer;`,
+                        onclick: () => _options.onClick?.(e),
+                    })
+                );
+            }
+        }
+
+        return g(
+            {...markerProps},
+            ...parts,
+        );
+    });
+
+    const extraAttributes = {};
+    if (_options.nestedPosition) {
+        extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x;
+        extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y;
+    } else {
+        extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`;
+    }
+
+    return svg(
+        {
+            width: '100%',
+            height: '100%',
+            ...extraAttributes,
+        },
+        ...schemaEvents,
+    );
+};
+
+const defaultOptions = {
+    lineWidth: 4,
+    lineColor: colorMap.red,
+    markerSize: 2,
+    nestedPosition: {x: 0, y: 0},
+};
+
+/**
+ * @param {SchemaEvent} event
+ * @returns {HTMLDivElement}
+ */
+const SchemaChangesChartTooltip = (event) => {
+    return div(
+        {class: 'flex-column'},
+        span({class: 'text-left mb-1'}, formatTimestamp(event.time, false)),
+        span({class: 'text-left text-small'}, `Additions: ${formatNumber(event.additions)}`),
+        span({class: 'text-left text-small'}, `Deletions: ${formatNumber(event.deletions)}`),
+        span({class: 'text-left text-small'}, `Modifications: ${formatNumber(event.modifications)}`),
+    );
+};
+
+export { SchemaChangesChart };
\ No newline at end of file
diff --git a/testgen/ui/components/frontend/js/components/schema_changes_list.js b/testgen/ui/components/frontend/js/components/schema_changes_list.js
new file mode 100644
index 00000000..80277e33
--- /dev/null
+++ b/testgen/ui/components/frontend/js/components/schema_changes_list.js
@@ -0,0 +1,125 @@
+/** + * @typedef DataStructureLog + * @type {object} + * @property {('A'|'D'|'M')} change + * @property {string} old_data_type + * @property {string} new_data_type + * @property {string} column_name + * + * @typedef Properties + * @type {object} + * @property {number} window_start + * @property {number} window_end + * @property {(DataStructureLog[])?} data_structure_logs + */ +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { Icon } from '../components/icon.js'; +import { formatTimestamp } from '../display_utils.js'; +import { getValue, loadStylesheet, resizeFrameHeightOnDOMChange, resizeFrameHeightToElement } from '../utils.js'; + +const { div, span } = van.tags; + +/** + * @param {Properties} props + */ +const SchemaChangesList = (props) => { + loadStylesheet('schema-changes-list', stylesheet); + const domId = 'schema-changes-list'; + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(1); + resizeFrameHeightToElement(domId); + resizeFrameHeightOnDOMChange(domId); + } + + const dataStructureLogs = getValue(props.data_structure_logs) ?? []; + const windowStart = getValue(props.window_start); + const windowEnd = getValue(props.window_end); + + return div( + { id: domId, class: 'flex-column fx-gap-1 fx-flex schema-changes-list' }, + span({ style: 'font-size: 16px; font-weight: 500;' }, 'Schema Changes'), + span( + { class: 'mb-3 text-caption', style: 'min-width: 200px;' }, + `${formatTimestamp(windowStart)} ~ ${formatTimestamp(windowEnd)}`, + ), + ...dataStructureLogs.map(log => StructureLogEntry(log)), + ); +}; + +const StructureLogEntry = (/** @type {DataStructureLog} */ log) => { + if (log.change === 'A') { + return div( + { class: 'flex-row fx-gap-1 fx-align-flex-start' }, + Icon( + {style: `font-size: 20px; color: var(--primary-text-color)`, filled: !log.column_name}, + log.column_name ? 
'add' : 'add_box', + ), + div( + { class: 'schema-changes-item flex-column' }, + span({ class: 'truncate-text' }, log.column_name ?? 'Table added'), + span(log.new_data_type), + ), + ); + } else if (log.change === 'D') { + return div( + { class: 'flex-row fx-gap-1' }, + Icon( + {style: `font-size: 20px; color: var(--primary-text-color)`, filled: !log.column_name}, + log.column_name ? 'remove' : 'indeterminate_check_box', + ), + div( + { class: 'schema-changes-item flex-column' }, + span({ class: 'truncate-text' }, log.column_name ?? 'Table dropped'), + ), + ); + } else if (log.change === 'M') { + return div( + { class: 'flex-row fx-gap-1 fx-align-flex-start' }, + Icon({style: `font-size: 18px; color: var(--primary-text-color)`}, 'change_history'), + div( + { class: 'schema-changes-item flex-column' }, + span({ class: 'truncate-text' }, log.column_name), + + div( + { class: 'flex-row fx-gap-1' }, + span({ class: 'truncate-text' }, log.old_data_type), + Icon({ size: 10 }, 'arrow_right_alt'), + span({ class: 'truncate-text' }, log.new_data_type), + ), + ), + ); + } + + return null; +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + .schema-changes-list { + overflow-y: auto; + } + + .schema-changes-item { + color: var(--secondary-text-color); + white-space: nowrap; + text-overflow: ellipsis; + overflow: hidden; + } + + .schema-changes-item span { + font-family: 'Courier New', Courier, monospace; + + white-space: nowrap; + text-overflow: ellipsis; + overflow: hidden; + } + + .schema-changes-item > span:first-child { + font-family: 'Roboto', 'Helvetica Neue', sans-serif; + color: var(--primary-text-color); + } +`); + +export { SchemaChangesList }; diff --git a/testgen/ui/components/frontend/js/components/select.js b/testgen/ui/components/frontend/js/components/select.js index 455b7783..3e3e658c 100644 --- a/testgen/ui/components/frontend/js/components/select.js +++ b/testgen/ui/components/frontend/js/components/select.js @@ -9,17 +9,19 @@ * @type 
{object} * @property {string?} id * @property {string} label - * @property {string?} value + * @property {string?|Array.?} value * @property {Array.} options * @property {boolean} allowNull * @property {Function|null} onChange * @property {boolean?} disabled * @property {boolean?} required + * @property {boolean?} multiSelect * @property {number?} width * @property {number?} height * @property {string?} style * @property {string?} testId * @property {number?} portalClass + * @property {('top' | 'bottom')?} portalPosition * @property {boolean?} filterable * @property {('normal' | 'inline')?} triggerStyle */ @@ -33,6 +35,10 @@ const { div, i, input, label, span } = van.tags; const Select = (/** @type {Properties} */ props) => { loadStylesheet('select', stylesheet); + if (getValue(props.multiSelect)) { + return MultiSelect(props); + } + const domId = van.derive(() => props.id?.val ?? getRandomId()); const opened = van.state(false); const optionsFilter = van.state(''); @@ -84,11 +90,12 @@ const Select = (/** @type {Properties} */ props) => { optionsFilter.val = event.target.value; }; - const showPortal = (/** @type Event */ event) => { - event.stopPropagation(); - event.stopImmediatePropagation(); - opened.val = getValue(props.disabled) ? false : true; - }; + // Reset filtering when closed + van.derive(() => { + if (!opened.val) { + optionsFilter.val = ''; + } + }); van.derive(() => { const currentOptions = getValue(options); @@ -115,7 +122,12 @@ const Select = (/** @type {Properties} */ props) => { class: () => `flex-column fx-gap-1 text-caption tg-select--label ${getValue(props.disabled) ? 'disabled' : ''}`, style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}; ${getValue(props.style)}`, 'data-testid': getValue(props.testId) ?? '', - onclick: showPortal, + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + event.stopImmediatePropagation(); + // Should toggle open/close unless disabled + opened.val = getValue(props.disabled) ? 
false : !opened.val; + }, }, span( { class: 'flex-row fx-gap-1', 'data-testid': 'select-label' }, @@ -144,6 +156,9 @@ const Select = (/** @type {Properties} */ props) => { 'data-testid': 'select-input', }, () => { + // Hack to display value again when closed + // For some reason, it goes away when opened + opened.val; return div( { class: 'tg-select--field--content', 'data-testid': 'select-input-display' }, valueIcon.val @@ -170,7 +185,7 @@ const Select = (/** @type {Properties} */ props) => { ), Portal( - {target: domId.val, targetRelative: true, opened}, + {target: domId.val, targetRelative: true, position: props.portalPosition?.val ?? props?.portalPosition, opened}, () => div( { class: () => `tg-select--options-wrapper mt-1 ${getValue(props.portalClass) ?? ''}`, @@ -197,6 +212,112 @@ const Select = (/** @type {Properties} */ props) => { ); }; +/** + * @param {Properties} props + */ +const MultiSelect = (props) => { + const domId = van.derive(() => props.id?.val ?? getRandomId()); + const opened = van.state(false); + const options = van.derive(() => getValue(props.options) ?? []); + + const selectedValues = isState(props.value) ? props.value : van.state(props.value ?? []); + + const displayLabel = van.derive(() => { + const selected = getValue(selectedValues) ?? []; + if (!selected.length) { + return '---'; + }; + const allOptions = getValue(options); + return selected + .map(value => allOptions.find(opt => opt.value === value)?.label ?? value) + .join(', '); + }); + + const toggleOption = (optionValue) => { + const current = [...(getValue(selectedValues) ?? [])]; + const index = current.indexOf(optionValue); + if (index >= 0) { + current.splice(index, 1); + } else { + current.push(optionValue); + } + selectedValues.val = current; + props.onChange?.(current, { valid: current.length > 0 || !getValue(props.required) }); + }; + + return div( + { + id: domId, + class: () => `flex-column fx-gap-1 text-caption tg-select--label ${getValue(props.disabled) ? 
'disabled' : ''}`, + style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}; ${getValue(props.style)}`, + 'data-testid': getValue(props.testId) ?? '', + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + event.stopImmediatePropagation(); + // Should toggle open/close unless disabled + opened.val = getValue(props.disabled) ? false : !opened.val; + }, + }, + span( + { class: 'flex-row fx-gap-1', 'data-testid': 'select-label' }, + props.label, + () => getValue(props.required) + ? span({ class: 'text-error' }, '*') + : '', + ), + + div( + { + class: () => `flex-row tg-select--field ${opened.val ? 'opened' : ''}`, + style: () => getValue(props.height) ? `height: ${getValue(props.height)}px;` : '', + 'data-testid': 'select-input', + }, + () => { + // Hack to display value again when closed + // For some reason, it goes away when opened + opened.val; + return div( + { class: 'tg-select--field--content tg-select--multi-display', 'data-testid': 'select-input-display' }, + displayLabel.val || '', + ); + }, + div( + { class: 'tg-select--field--icon', 'data-testid': 'select-input-trigger' }, + i({ class: 'material-symbols-rounded' }, 'expand_more'), + ), + ), + + Portal( + {target: domId.val, targetRelative: true, position: props.portalPosition?.val ?? props?.portalPosition, opened}, + () => div( + { + class: () => `tg-select--options-wrapper mt-1 ${getValue(props.portalClass) ?? ''}`, + 'data-testid': 'select-options', + }, + getValue(options).map(option => { + const isSelected = van.derive(() => (getValue(selectedValues) ?? []).includes(option.value)); + return div( + { + class: () => `tg-select--option fx-gap-2 ${isSelected.val ? 
'selected' : ''}`, + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + toggleOption(option.value); + }, + 'data-testid': 'select-options-item', + }, + input({ + type: 'checkbox', + class: 'tg-select--checkbox', + checked: isSelected, + }), + span(option.label), + ); + }), + ), + ), + ); +}; + const stylesheet = new CSSStyleSheet(); stylesheet.replace(` .tg-select--label { @@ -238,6 +359,12 @@ stylesheet.replace(` font-weight: 500; } +.tg-select--multi-display { + overflow: hidden; + text-overflow: ellipsis; + white-space: nowrap; +} + .tg-select--field--content > input { border: unset !important; background: transparent !important; @@ -298,6 +425,36 @@ stylesheet.replace(` color: var(--primary-color); } +.tg-select--checkbox { + appearance: none; + box-sizing: border-box; + margin: 0; + width: 18px; + height: 18px; + flex-shrink: 0; + border: 1px solid var(--secondary-text-color); + border-radius: 4px; + position: relative; + pointer-events: none; + transition-property: border-color, background-color; + transition-duration: 0.3s; +} + +.tg-select--checkbox:checked { + border-color: transparent; + background-color: var(--primary-color); +} + +.tg-select--checkbox:checked::after { + content: 'check'; + position: absolute; + top: -4px; + left: -3px; + font-family: 'Material Symbols Rounded'; + font-size: 22px; + color: white; +} + .tg-select--inline-trigger { border-bottom: 1px solid var(--border-color); } diff --git a/testgen/ui/components/frontend/js/components/table.js b/testgen/ui/components/frontend/js/components/table.js new file mode 100644 index 00000000..c21ac284 --- /dev/null +++ b/testgen/ui/components/frontend/js/components/table.js @@ -0,0 +1,540 @@ +/** + * @import {VanState} from '../van.min.js'; + * + * @typedef Column + * @type {object} + * @property {string} name + * @property {string} label + * @property {number?} colspan + * @property {number?} width + * @property {boolean?} sortable + * @property {('left' | 'center' | 
'right')?} align + * @property {('hidden' | 'visible')?} overflow + * + * @typedef Sort + * @type {object} + * @property {string?} field + * @property {('asc'|'desc')?} order + * + * @typedef SelectionOptions + * @type {object} + * @property {boolean?} multi + * @property {((rowIndexes: number[]) => void)?} onRowsSelected + * + * @typedef SortOptions + * @type {object} + * @property {string?} field + * @property {('asc'|'desc')?} order + * @property {((a: Sort) => void)} onSortChange + * + * @typedef PaginatorOptions + * @type {object} + * @property {number?} itemsPerPage + * @property {number?} totalItems + * @property {number?} currentPageIdx + * @property {((a: number, b: number) => void)?} onPageChange + * @property {HTMLElement?} leftContent + * + * @typedef Options + * @type {object} + * @property {(Column[] | Column[][])} columns + * @property {any?} header + * @property {any?} emptyState + * @property {string?} class + * @property {((row: any, index: number) => string)?} rowClass + * @property {string?} height + * @property {string?} width + * @property {boolean?} highDensity + * @property {boolean?} dynamicWidth + * @property {SortOptions?} sort + * @property {PaginatorOptions?} paginator + * @property {SelectionOptions?} selection + */ +import { getValue, loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; +import { Button } from './button.js'; +import { Icon } from './icon.js'; +import { Select } from './select.js'; + +const { colgroup, col, div, span, table, thead, th, tbody, tr, td } = van.tags; +const defaultItemsPerPage = 20; +const defaultHeight = 'calc(100% - 76.5px)'; +const defaultWidth = '100%'; + +/** + * @param {Options?} options + * @param {Row[]} rows + * @returns {HTMLElement} + */ +const Table = (options, rows) => { + loadStylesheet('table', stylesheet); + + const headerLines = van.derive(() => { + const columns = getValue(options.columns); + if (Array.isArray(columns[0])) { + return columns; + } + return [columns]; + });
+ const dataColumns = van.derive(() => getValue(headerLines)?.slice(-1)?.[0] ?? []); + const widthSum = van.state(0); + const columnWidths = []; + + van.derive(() => { + // Recompute the total from scratch so re-runs of this derivation do not accumulate + let sum = 0; + for (let i = 0; i < dataColumns.val.length; i++) { + const column = dataColumns.val[i]; + columnWidths[i] = columnWidths[i] ?? van.state(0); + columnWidths[i].val = column.width; + sum += column.width; + } + widthSum.val = sum || undefined; + }); + + const selectedRows = []; + van.derive(() => { + const rows_ = getValue(rows); + rows_.forEach((_, idx) => { + selectedRows[idx] = selectedRows[idx] ?? van.state(false); + selectedRows[idx].val = false; + }); + }); + van.derive(() => { + const selectedRows_ = []; + for (let i = 0; i < selectedRows.length; i++) { + if (selectedRows[i].val) { + selectedRows_.push(i); + } + } + + options.selection?.onRowsSelected?.(selectedRows_); + }); + const onRowSelected = (idx) => { + if (!options.selection?.multi) { + for (const state of selectedRows) { + state.val = false; + } + } + + if (options.selection?.onRowsSelected) { + selectedRows[idx].val = !selectedRows[idx].val; + } + }; + + + const renderPaginator = van.derive(() => getValue(options.paginator) != undefined); + const paginatorOptions = van.derive(() => { + const p = getValue(options.paginator); + return { + itemsPerPage: p?.itemsPerPage ?? defaultItemsPerPage, + totalItems: p?.totalItems ?? undefined, + currentPageIdx: p?.currentPageIdx ?? 
0, + onPageChange: p?.onPageChange, + leftContent: p?.leftContent, + }; + }); + + const sortOptions = van.derive(() => { + const s = getValue(options.sort); + + return { + field: s?.field, + order: s?.order, + onSortChange: (columnName) => { + let newSortOrder = 'desc'; + let columnNameOrClear = columnName; + if (s?.field === columnName && s?.order === 'desc') { + newSortOrder = 'asc'; + } else if (s?.field === columnName && s?.order === 'asc') { + newSortOrder = null; + columnNameOrClear = null; + } + + s?.onSortChange?.({field: columnNameOrClear, order: newSortOrder}); + }, + }; + }); + + return div( + { + class: () => `tg-table flex-column border border-radius-1 ${getValue(options.highDensity) ? 'tg-table-high-density' : ''} ${getValue(options.dynamicWidth) ? 'tg-table-dynamic-width' : ''} ${getValue(options.selection)?.onRowsSelected ? 'tg-table-hoverable' : ''}`, + style: () => `height: ${getValue(options.height) ? getValue(options.height) + 'px' : defaultHeight};`, + }, + options.header, + div( + {class: 'tg-table-scrollable flex-column fx-flex'}, + table( + { + class: () => getValue(options.class) ?? '', + style: () => { + const dynamicWidth = getValue(options.dynamicWidth) ?? false; + let widthNumber = getValue(options.width) ?? widthSum.val; + if (widthNumber < window.innerWidth) { + widthNumber = window.innerWidth; + } + return `width: ${(widthNumber && dynamicWidth) ? widthNumber + 'px' : defaultWidth}; ${dynamicWidth ? 'table-layout: fixed;' : ''}`; + }, + }, + () => colgroup( + ...dataColumns.val.map((_, idx) => col({style: `width: ${columnWidths[idx].val}px;`})), + ), + () => thead( + getValue(headerLines).map((headerLine, idx, allHeaderLines) => { + const dynamicWidth = getValue(options.dynamicWidth) ?? 
false; + return tr( + ...getValue(headerLine).map((column, colIdx) => + TableHeaderColumn( + column, + idx === allHeaderLines.length - 1, + columnWidths, + colIdx, + dynamicWidth, + sortOptions, + ) + ), + ); + }) + ), + () => { + const rows_ = getValue(rows); + if (rows_.length <= 0 && options.emptyState) { + return tbody( + {class: 'tg-table-empty-state-body'}, + tr( + td( + {colspan: dataColumns.val.length}, + options.emptyState, + ), + ), + ); + } + + return tbody( + rows_.map((row, idx) => + tr( + { + class: () => `${selectedRows[idx].val ? 'selected' : ''} ${options.rowClass?.(row, idx) ?? ''}`, + onclick: () => onRowSelected(idx), + }, + ...getValue(dataColumns).map(column => TableCell(column, row, idx)), + ) + ), + ) + }, + ), + ), + () => renderPaginator.val + ? Paginatior( + getValue(paginatorOptions).itemsPerPage, + getValue(paginatorOptions).totalItems, + getValue(paginatorOptions).currentPageIdx, + getValue(options.highDensity), + getValue(paginatorOptions).onPageChange, + getValue(paginatorOptions).leftContent, + ) + : undefined, + ); +}; + +/** + * @typedef SortOptionsB + * @type {object} + * @property {string?} field + * @property {('asc'|'desc')?} order + * @property {((field: string) => void)} onSortChange + * + * @param {Column} column + * @param {boolean} isDataColumn + * @param {VanState[]} columnWidths + * @param {number} columnIndex + * @param {boolean} dynamicWidth + * @param {VanState} sortOptions + */ +const TableHeaderColumn = ( + column, + isDataColumn, + columnWidths, + columnIndex, + dynamicWidth, + sortOptions, +) => { + let startX, startWidth; + + const doDrag = (e) => { + const newWidth = startWidth + (e.clientX - startX); + if (newWidth > 50) { + columnWidths[columnIndex].val = newWidth; + } + }; + + const stopDrag = () => { + document.removeEventListener('mousemove', doDrag); + document.removeEventListener('mouseup', stopDrag); + document.body.style.cursor = ''; + document.documentElement.style.userSelect = ''; + 
document.documentElement.style.pointerEvents = ''; + }; + + const initDrag = (e) => { + startX = e.clientX; + startWidth = columnWidths[columnIndex].val; + document.addEventListener('mousemove', doDrag); + document.addEventListener('mouseup', stopDrag); + document.body.style.cursor = 'col-resize'; + document.documentElement.style.userSelect = 'none'; + document.documentElement.style.pointerEvents = 'none'; + }; + + const sortIcon = van.derive(() => { + if (!isDataColumn || !column.sortable) { + return null; + } + + const isSorted = sortOptions.val.field === column.name; + return ( + Icon( + {style: `font-size: 13px; cursor: pointer; color: var(${isSorted ? '--primary-text-color' : '--disabled-text-color'})`}, + isSorted ? (sortOptions.val.order === 'desc' ? 'south' : 'north') : 'expand_all', + ) + ); + }); + + return th( + { + class: `${isDataColumn ? 'tg-table-column' : 'tg-table-helper-column'} text-small text-secondary ${column.name} ${column.sortable ? 'clickable' : ''}`, + align: column.align, + width: column.width, + colspan: column.colspan ?? 1, + 'data-testid': column.name, + style: `overflow-x: ${column.overflow ?? 'hidden'}`, + onclick: () => { + if (isDataColumn && column.sortable) { + sortOptions.val.onSortChange(column.name); + } + }, + }, + () => div( + {class: 'flex-row fx-gap-2', style: 'display: inline-flex'}, + span(column.label), + sortIcon.val, + ), + ( + isDataColumn && dynamicWidth + ? div( + {class: 'tg-column-resizer', onmousedown: initDrag}, + div() + ) + : null + ), + ); +}; + +/** + * + * @param {Column} column + * @param {Row} row + * @param {number} index + */ +const TableCell = (column, row, index) => { + return td( + { + class: `tg-table-cell ${column.name}`, + align: column.align, + width: column.width, + colspan: column.colspan ?? 1, + 'data-testid': `table-cell:${index},${column.name}`, + style: `overflow-x: ${column.overflow ?? 
'hidden'}`, + }, + getValue(row[column.name]), + ); +}; + +/** + * + * @param {number} itemsPerPage + * @param {number?} totalItems + * @param {number} currentPageIdx + * @param {boolean?} highDensity + * @param {((number, number) => void)?} onPageChange + * @param {HTMLElement?} leftContent + * @returns {HTMLElement} + */ +const Paginatior = ( + itemsPerPage, + totalItems, + currentPageIdx, + highDensity, + onPageChange, + leftContent = undefined, +) => { + const pageStart = itemsPerPage * currentPageIdx + 1; + const pageEnd = Math.min(pageStart + itemsPerPage - 1, totalItems); + const lastPage = (Math.floor(totalItems / itemsPerPage) + (totalItems % itemsPerPage > 0) - 1); + + return div( + {class: `tg-table-paginator flex-row fx-justify-content-flex-end ${highDensity ? '' : 'p-1'} text-secondary`}, + + leftContent, + leftContent != undefined ? span({class: 'fx-flex'}) : '', + + span({class: 'mr-2'}, 'Rows per page:'), + Select({ + triggerStyle: 'inline', + testId: 'items-per-page', + value: itemsPerPage, + options: [ + {label: '20', value: 20}, + {label: '50', value: 50}, + {label: '100', value: 100}, + ], + portalPosition: 'top', + onChange: (value) => onPageChange(currentPageIdx, parseInt(value)), + }), + span({class: 'mr-6'}, ''), + span({class: 'mr-6'}, `${pageStart}-${pageEnd} of ${totalItems ?? 
'∞'}`), + Button({ + type: 'icon', + icon: 'first_page', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: currentPageIdx === 0, + onclick: () => onPageChange(0, itemsPerPage), + }), + Button({ + type: 'icon', + icon: 'chevron_left', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: currentPageIdx === 0, + onclick: () => onPageChange(currentPageIdx - 1, itemsPerPage), + }), + Button({ + type: 'icon', + icon: 'chevron_right', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: pageEnd >= totalItems, + onclick: () => onPageChange(currentPageIdx + 1, itemsPerPage), + }), + Button({ + type: 'icon', + icon: 'last_page', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: pageEnd >= totalItems, + onclick: () => onPageChange(lastPage, itemsPerPage), + }), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-table { + background: var(--dk-card-background); +} + +.tg-table > .tg-table-scrollable { + overflow: auto; + border-radius: 4px; +} + +.tg-table > .tg-table-scrollable > table { + border-collapse: collapse; + border-color: var(--border-color); +} + +.tg-table > .tg-table-scrollable > table:has(.tg-table-empty-state-body) { + height: 100%; +} + +.tg-table > .tg-table-scrollable > table > thead { + border-bottom: var(--button-stroked-border); + position: sticky; + top: 0; + background: var(--dk-card-background); /* Ensure header background is solid when sticky */ + z-index: 1; /* Ensure header is above scrolling content */ +} + +.tg-table > .tg-table-scrollable > table > thead th { + font-weight: normal; +} + +.tg-table > .tg-table-scrollable > table > thead th > div { + text-overflow: ellipsis; + white-space: nowrap; + overflow-x: hidden; +} + +.tg-table > .tg-table-scrollable > table > thead th.tg-table-helper-column { + padding: 0px; +} + +.tg-table > .tg-table-scrollable > table > thead th.tg-table-column { + padding: 4px 8px; + height: 32px; + 
text-transform: uppercase; + position: relative; /* Needed for absolute positioning of resizer */ +} + +.tg-table > .tg-table-scrollable > table > thead th .tg-column-resizer { + position: absolute; + right: 0; + top: 0; + width: 5px; + height: 90%; + background: transparent; + cursor: col-resize; + z-index: 2; /* Ensure resizer is above other content */ +} + +.tg-table > .tg-table-scrollable > table > thead th .tg-column-resizer > div { + height: 100%; + width: 1px; + background: var(--border-color); +} + +.tg-table > .tg-table-scrollable > table > tbody > tr { + height: 40px; +} + +.tg-table > .tg-table-scrollable > table > tbody > tr:not(:last-of-type) { + border-bottom: var(--button-stroked-border); +} + +.tg-table > .tg-table-scrollable > table > tbody > tr.selected { + background-color: var(--table-selection-color); +} + +.tg-table > .tg-table-scrollable > table .tg-table-cell { + padding: 4px 8px; + height: 40px; +} + +.tg-table > .tg-table-paginator { + border-top: var(--button-stroked-border); +} + +.tg-table.tg-table-high-density > .tg-table-scrollable > table > thead th.tg-table-column { + padding: 0px 8px; + height: 27px; +} + +.tg-table.tg-table-high-density > .tg-table-scrollable > table .tg-table-cell { + padding: 0px 8px; + height: 27px; +} + +.tg-table.tg-table-dynamic-width > .tg-table-scrollable > table { + table-layout: fixed; +} + +.tg-table.tg-table-dynamic-width > .tg-table-scrollable > table > tbody td { + text-overflow: ellipsis; + white-space: nowrap; +} + +.tg-table.tg-table-hoverable > .tg-table-scrollable > table > tbody tr:hover { + background-color: var(--table-hover-color); +} +`); + +export { Table, TableHeaderColumn }; diff --git a/testgen/ui/components/frontend/js/components/table_group_form.js b/testgen/ui/components/frontend/js/components/table_group_form.js index c2329cdf..6b072255 100644 --- a/testgen/ui/components/frontend/js/components/table_group_form.js +++ b/testgen/ui/components/frontend/js/components/table_group_form.js 
@@ -83,7 +83,6 @@ const TableGroupForm = (props) => { const profileFlagCdes = van.state(tableGroup.profile_flag_cdes ?? true); const includeInDashboard = van.state(tableGroup.include_in_dashboard ?? true); const addScorecardDefinition = van.state(tableGroup.add_scorecard_definition ?? true); - const addMonitorTestSuite = van.state(tableGroup.add_monitor_test_suite ?? false); const profileUseSampling = van.state(tableGroup.profile_use_sampling ?? false); const profileSamplePercent = van.state(tableGroup.profile_sample_percent ?? 30); const profileSampleMinCount = van.state(tableGroup.profile_sample_min_count ?? 15000); @@ -123,7 +122,6 @@ const TableGroupForm = (props) => { profile_flag_cdes: profileFlagCdes.val, include_in_dashboard: includeInDashboard.val, add_scorecard_definition: addScorecardDefinition.val, - add_monitor_test_suite: addMonitorTestSuite.val, profile_use_sampling: profileUseSampling.val, profile_sample_percent: profileSamplePercent.val, profile_sample_min_count: profileSampleMinCount.val, @@ -190,7 +188,6 @@ const TableGroupForm = (props) => { profileFlagCdes, includeInDashboard, addScorecardDefinition, - addMonitorTestSuite, ), SamplingForm( { setValidity: setFieldValidity }, @@ -330,7 +327,6 @@ const SettingsForm = ( profileFlagCdes, includeInDashboard, addScorecardDefinition, - addMonitorTestSuite, ) => { return div( { class: 'flex-row fx-gap-3 fx-flex-wrap fx-align-flex-start border border-radius-1 p-3 mt-1', style: 'position: relative;' }, diff --git a/testgen/ui/components/frontend/js/components/test_definition_form.js b/testgen/ui/components/frontend/js/components/test_definition_form.js new file mode 100644 index 00000000..31812f87 --- /dev/null +++ b/testgen/ui/components/frontend/js/components/test_definition_form.js @@ -0,0 +1,451 @@ +/** + * @typedef TestDefinition + * @type {object} + * @property {string} id + * @property {string} table_groups_id + * @property {string?} profile_run_id + * @property {string} test_type + * @property 
{string} test_suite_id + * @property {string?} test_description + * @property {string} schema_name + * @property {string?} table_name + * @property {string?} column_name + * @property {number?} skip_errors + * @property {string?} baseline_ct + * @property {string?} baseline_unique_ct + * @property {string?} baseline_value + * @property {string?} baseline_value_ct + * @property {string?} threshold_value + * @property {string?} baseline_sum + * @property {string?} baseline_avg + * @property {string?} baseline_sd + * @property {string?} lower_tolerance + * @property {string?} upper_tolerance + * @property {string?} subset_condition + * @property {string?} groupby_names + * @property {string?} having_condition + * @property {string?} window_date_column + * @property {number?} window_days + * @property {string?} match_schema_name + * @property {string?} match_table_name + * @property {string?} match_column_names + * @property {string?} match_subset_condition + * @property {string?} match_groupby_names + * @property {string?} match_having_condition + * @property {string?} custom_query + * @property {string?} history_calculation + * @property {string?} history_calculation_upper + * @property {number?} history_lookback + * @property {boolean} test_active + * @property {string?} test_definition_status + * @property {string?} severity + * @property {boolean} lock_refresh + * @property {number?} last_auto_gen_date + * @property {number?} profiling_as_of_date + * @property {number?} last_manual_update + * @property {boolean} export_to_observability + * @property {string} test_name_short + * @property {string} default_test_description + * @property {string} measure_uom + * @property {string} measure_uom_description + * @property {string} default_parm_columns + * @property {string} default_parm_prompts + * @property {string} default_parm_help + * @property {string} default_severity + * @property {'column'|'referential'|'table'|'tablegroup'|'custom'} test_scope + * @property 
{string?} prediction + * + * @typedef Properties + * @type {object} + * @property {TestDefinition} definition + * @property {string?} class + * @property {(changes: object, state: { dirty: boolean, valid: boolean }) => void} onChange + */ + +import van from '../van.min.js'; +import { getValue, isEqual, loadStylesheet } from '../utils.js'; +import { Input } from './input.js'; +import { Select } from './select.js'; +import { Textarea } from './textarea.js'; +import { RadioGroup } from './radio_group.js'; +import { Caption } from './caption.js'; +import { numberBetween } from '../form_validators.js'; + +const { div, span } = van.tags; + +const thresholdColumns = [ + 'history_calculation', + 'history_calculation_upper', + 'history_lookback', + 'lower_tolerance', + 'upper_tolerance', +]; + +// Columns using the default { type: 'text' } do not need to be specified here +const PARAMETER_CONFIG = { + custom_query: { type: 'textarea' }, + lower_tolerance: { type: 'number' }, + upper_tolerance: { type: 'number' }, +}; + + +const TestDefinitionForm = (/** @type Properties */ props) => { + loadStylesheet('test-definition-form', stylesheet); + + const definition = getValue(props.definition); + + const paramColumns = (definition.default_parm_columns || '').split(',').map(v => v.trim()); + const paramLabels = (definition.default_parm_prompts || '').split(',').map(v => v.trim()); + const paramHelp = (definition.default_parm_help || '').split('|').map(v => v.trim()); + + const hasThresholds = paramColumns.includes('history_calculation'); + const dynamicParamColumns = paramColumns + .map((column, index) => ({ + ...(PARAMETER_CONFIG[column] || { type: 'text' }), + column, + label: paramLabels[index] || column.replaceAll('_', ' '), + help: paramHelp[index] || null, + })) + .filter(config => !hasThresholds || !thresholdColumns.includes(config.column)); + + const updatedDefinition = van.state({ ...definition }); + const validityPerField = van.state({}); + + van.derive(() => { + const newDefinition =
updatedDefinition.val + const fieldsValidity = validityPerField.val; + const isValid = Object.keys(fieldsValidity).length > 0 && + Object.values(fieldsValidity).every(v => v); + + const changes = {}; + for (const key in newDefinition) { + if (!isEqual(newDefinition[key], definition[key])) { + changes[key] = newDefinition[key]; + } + } + props.onChange?.(changes, { dirty: !!Object.keys(changes).length, valid: isValid }); + }); + + const setFieldValues = (updatedValues) => { + updatedDefinition.val = { ...updatedDefinition.rawVal, ...updatedValues }; + }; + + const setFieldValidity = (field, validity) => { + validityPerField.val = { ...validityPerField.rawVal, [field]: validity }; + }; + + return div( + { class: props.class }, + div( + { class: 'mb-2' }, + div({ class: 'text-large' }, definition.test_name_short), + definition.test_description || definition.default_test_description + ? span({ class: 'text-caption mt-2' }, definition.test_description ?? definition.default_test_description) + : null, + ), + () => div( + { class: 'flex-row fx-flex-wrap fx-gap-3' }, + dynamicParamColumns.map(config => { + const column = config.column; + const currentValue = () => updatedDefinition.val[column] ?? 
config.default; + + if (config.type === 'select') { + return div( + { class: 'td-form--field' }, + () => Select({ + label: config.label, + options: config.options, + value: currentValue(), + onChange: (value) => setFieldValues({ [column]: value }), + }), + ); + } + + if (config.type === 'number') { + return div( + { class: 'td-form--field' }, + () => Input({ + name: column, + label: config.label, + help: config.help, + type: 'number', + value: currentValue(), + step: config.step, + onChange: (value, state) => { + setFieldValues({ [column]: value || null }) + setFieldValidity(column, state.valid); + }, + }), + ); + } + + if (config.type === 'textarea') { + return div( + { class: 'td-form--field-wide' }, + () => Textarea({ + name: column, + label: config.label, + help: config.help, + value: currentValue(), + height: 100, + onChange: (value) => { + setFieldValues({ [column]: value || null }) + }, + }), + ); + } + + return div( + { class: 'td-form--field' }, + () => Input({ + name: column, + label: config.label, + help: config.help, + value: currentValue(), + onChange: (value, state) => { + setFieldValues({ [column]: value || null }) + setFieldValidity(column, state.valid); + }, + }), + ); + }), + ), + hasThresholds + ? 
ThresholdForm( + { setFieldValues, setFieldValidity }, + definition, + ) + : null, + ); +}; + +const thresholdModeOptions = [ + { + label: 'Prediction Model', + value: 'prediction', + help: 'Use time series prediction to automatically determine expected bounds', + }, + { + label: 'Historical Calculation', + value: 'historical', + help: 'Calculate bounds based on historical results', + }, + { + label: 'Static Thresholds', + value: 'static', + help: 'Manually specify fixed upper and lower bounds', + }, +]; + +const historyCalcOptions = [ + { label: 'Value', value: 'Value' }, + { label: 'Minimum', value: 'Minimum' }, + { label: 'Maximum', value: 'Maximum' }, + { label: 'Sum', value: 'Sum' }, + { label: 'Average', value: 'Average' }, + { label: 'Expression', value: 'Expression' }, +]; + +/** + * @typedef ThresholdFormOptions + * @type {object} + * @property {(updatedValues: object) => void} setFieldValues + * @property {(field: string, valid: boolean) => void} setFieldValidity + * + * @param {ThresholdFormOptions} options + * @param {TestDefinition} definition + */ +const ThresholdForm = (options, definition) => { + const { setFieldValues, setFieldValidity } = options; + const isFreshnessTrend = definition.test_type === 'Freshness_Trend'; + const initialHistoryCalc = definition.history_calculation; + + const initialMode = initialHistoryCalc === 'PREDICT' ? 'prediction' : initialHistoryCalc ? 'historical' : 'static'; + const mode = van.state(initialMode); + + const historyCalc = van.state(initialHistoryCalc === 'PREDICT' || !initialHistoryCalc ? 'Minimum' : initialHistoryCalc); + const historyCalcUpper = van.state(definition.history_calculation_upper ?? 
'Maximum'); + const historyLookback = van.state(definition.history_lookback || 10); + const lowerTolerance = van.state(definition.lower_tolerance); + const upperTolerance = van.state(definition.upper_tolerance); + + const lowerParsed = van.derive(() => parseExpressionValue(historyCalc.val)); + const upperParsed = van.derive(() => parseExpressionValue(historyCalcUpper.val)); + + return div( + { class: 'flex-column fx-gap-4 border border-radius-1 p-3 mt-5', style: 'position: relative;' }, + Caption({ content: 'Thresholds', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + RadioGroup({ + name: 'threshold_mode', + options: isFreshnessTrend + ? thresholdModeOptions.filter(option => option.value !== 'historical') + : thresholdModeOptions, + value: mode, + layout: 'vertical', + onChange: (newMode) => { + mode.val = newMode; + options.setFieldValues({ + 'history_calculation': newMode === 'prediction' ? 'PREDICT' : newMode === 'historical' ? historyCalc.val : null, + 'history_calculation_upper': newMode === 'historical' ? historyCalcUpper.val : null, + 'history_lookback': newMode === 'historical' ? historyLookback.val : null, + 'lower_tolerance': newMode === 'static' ? lowerTolerance.val : newMode === 'prediction' ? definition.lower_tolerance : null, + 'upper_tolerance': newMode === 'static' ? upperTolerance.val : newMode === 'prediction' ? definition.upper_tolerance : null, + }); + }, + }), + () => { + if (mode.val === 'historical') { + return div( + { class: 'flex-column fx-gap-3 mt-2' }, + div( + { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap' }, + div( + { class: 'td-form--field flex-column fx-gap-3' }, + () => Select({ + label: 'Lower Bound Calculation', + options: historyCalcOptions, + value: lowerParsed.val.selectValue, + onChange: (value) => { + const fieldValue = value === 'Expression' ? 
formatExpressionValue('') : value; + historyCalc.val = fieldValue; + setFieldValues({ history_calculation: fieldValue }); + }, + }), + () => lowerParsed.val.isExpression + ? Input({ + name: 'history_calculation_expression', + label: 'Lower Bound Expression', + value: lowerParsed.val.expression, + help: 'Use {VALUE}, {MINIMUM}, {MAXIMUM}, {SUM}, {AVERAGE}, {STANDARD_DEVIATION} to reference historical aggregates. Example: 0.5 * {AVERAGE}', + onChange: (value) => { + const fieldValue = formatExpressionValue(value); + setFieldValues({ history_calculation: fieldValue }); + }, + }) + : '', + ), + div( + { class: 'td-form--field flex-column fx-gap-3' }, + () => Select({ + label: 'Upper Bound Calculation', + options: historyCalcOptions, + value: upperParsed.val.selectValue, + onChange: (value) => { + const fieldValue = value === 'Expression' ? formatExpressionValue('') : value; + historyCalcUpper.val = fieldValue; + setFieldValues({ history_calculation_upper: fieldValue }); + }, + }), + () => upperParsed.val.isExpression + ? Input({ + name: 'history_calculation_upper_expression', + label: 'Upper Bound Expression', + value: upperParsed.val.expression, + help: 'Use {VALUE}, {MINIMUM}, {MAXIMUM}, {SUM}, {AVERAGE}, {STANDARD_DEVIATION} to reference historical aggregates. 
Example: 1.5 * {AVERAGE}', + onChange: (value) => { + const fieldValue = formatExpressionValue(value); + setFieldValues({ history_calculation_upper: fieldValue }); + }, + }) + : '', + ), + ), + div( + { class: 'flex-row fx-gap-3' }, + div( + { class: 'td-form--field' }, + Input({ + name: 'history_lookback', + label: 'History Lookback', + type: 'number', + value: historyLookback, + help: 'Number of historical runs to use for calculation', + step: 1, + disabled: () => lowerParsed.val.selectValue === 'Value' && upperParsed.val.selectValue === 'Value', + onChange: (value, state) => { + historyLookback.val = value; + setFieldValues({ history_lookback: value }); + setFieldValidity('history_lookback', state.valid); + }, + validators: [numberBetween(1, 1000, 1)], + }), + ), + ) + ); + } + + if (mode.val === 'static') { + return div( + { class: 'flex-row fx-gap-3 fx-flex-wrap mt-2' }, + !isFreshnessTrend + ? div( + { class: 'td-form--field' }, + Input({ + name: 'lower_tolerance', + label: 'Lower Bound', + type: 'number', + value: lowerTolerance, + onChange: (value, state) => { + lowerTolerance.val = value; + setFieldValues({ lower_tolerance: value }); + setFieldValidity('lower_tolerance', state.valid); + }, + }), + ) + : null, + div( + { class: 'td-form--field' }, + Input({ + name: 'upper_tolerance', + label: isFreshnessTrend ? 'Maximum interval since last update (minutes)' : 'Upper Bound', + type: 'number', + value: upperTolerance, + onChange: (value, state) => { + upperTolerance.val = value; + setFieldValues({ upper_tolerance: value }); + setFieldValidity('upper_tolerance', state.valid); + }, + }), + ), + ); + } + + return span({ class: 'text-caption mt-2' }, 'The prediction model will automatically determine expected bounds based on historical patterns.'); + }, + ); +}; + +/** + * @param {string?} value + * @returns {{ isExpression: boolean, selectValue: string?, expression: string? 
}} + */ +const parseExpressionValue = (value) => { + if (!value) { + return { isExpression: false, selectValue: value, expression: null }; + } + // Format: EXPR:[...] + const match = value.match(/^EXPR:\[(.*)\]$/); + if (match) { + return { isExpression: true, selectValue: 'Expression', expression: match[1] }; + } + return { isExpression: false, selectValue: value, expression: null }; +}; + +/** + * @param {string?} expression + * @returns {string} + */ +const formatExpressionValue = (expression) => `EXPR:[${expression || ''}]`; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.td-form--field { + flex: calc(50% - 8px) 0 0; +} + +.td-form--field-wide { + flex: 100% 1 1; +} +`); + +export { TestDefinitionForm }; diff --git a/testgen/ui/components/frontend/js/components/threshold_chart.js b/testgen/ui/components/frontend/js/components/threshold_chart.js new file mode 100644 index 00000000..ea92d8ad --- /dev/null +++ b/testgen/ui/components/frontend/js/components/threshold_chart.js @@ -0,0 +1,106 @@ +/** + * @import {ChartViewBox, DrawingArea} from './chart_canvas.js'; + * + * @typedef Point + * @type {object} + * @property {number} x + * @property {number} y + * + * @typedef Options + * @type {object} + * @property {number} width + * @property {number} height + * @property {DrawingArea} area + * @property {ChartViewBox} viewBox + * @property {number} paddingLeft + * @property {number} paddingRight + * @property {string} color + * @property {number} lineWidth + * @property {string} markerColor + * @property {number} markerSize + * @property {Point?} nestedPosition + * @property {number[]?} yAxisTicks + * + * @typedef MonitoringEvent + * @type {object} + * @property {number} value + * @property {string} time + */ +import van from '../van.min.js'; +import { colorMap } from '../display_utils.js'; +import { getValue } from '../utils.js'; + +const { polygon, polyline, svg } = van.tags("http://www.w3.org/2000/svg"); + +/** + * + * @param {Options} options + 
* @param {Point[]} line1 + * @param {Point[]?} line2 + */ +const ThresholdChart = (options, line1, line2) => { + const _options = { + ...defaultOptions, + ...(options ?? {}), + }; + + const minX = van.state(0); + const minY = van.state(0); + const width = van.state(0); + const height = van.state(0); + const widthFactor = van.state(1.0); + + van.derive(() => { + const viewBox = getValue(_options.viewBox); + width.val = viewBox.width; + height.val = viewBox.height; + minX.val = viewBox.minX; + minY.val = viewBox.minY; + widthFactor.val = viewBox.widthFactor; + }); + + const extraAttributes = {}; + if (_options.nestedPosition) { + extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x; + extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y; + } else { + extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`; + } + + let content = () => polyline({ + points: line1.map(point => `${point.x} ${point.y}`).join(', '), + style: `stroke: ${getValue(_options.color)}; stroke-width: ${getValue(_options.lineWidth)};`, + fill: 'none', + }); + if (line2) { + content = () => polygon({ + points: `${line1.map(point => `${point.x} ${point.y}`).join(', ')} ${line2.map(point => `${point.x} ${point.y}`).join(', ')}`, + fill: getValue(_options.color), + stroke: 'none', + }); + } + + return svg( + { + width: '100%', + height: '100%', + style: `overflow: visible;`, + ...extraAttributes, + }, + content, + ); +}; + +const /** @type Options */ defaultOptions = { + width: 600, + height: 200, + paddingLeft: 16, + paddingRight: 16, + color: colorMap.redLight, + lineWidth: 3, + markerColor: colorMap.red, + markerSize: 8, + yAxisTicks: undefined, +}; + +export { ThresholdChart }; diff --git a/testgen/ui/components/frontend/js/components/toggle.js b/testgen/ui/components/frontend/js/components/toggle.js index 8d01755a..0a635c7c 100644 --- a/testgen/ui/components/frontend/js/components/toggle.js +++
b/testgen/ui/components/frontend/js/components/toggle.js @@ -4,6 +4,7 @@ * @property {string} label * @property {string?} name * @property {boolean?} checked + * @property {string?} style * @property {function(boolean)?} onChange */ import van from '../van.min.js'; @@ -15,7 +16,7 @@ const Toggle = (/** @type Properties */ props) => { loadStylesheet('toggle', stylesheet); return label( - { class: 'flex-row fx-gap-2 clickable', 'data-testid': props.name ?? '' }, + { class: 'flex-row fx-gap-2 clickable', style: props.style ?? '', 'data-testid': props.name ?? '' }, input({ type: 'checkbox', role: 'switch', diff --git a/testgen/ui/components/frontend/js/components/tree.js b/testgen/ui/components/frontend/js/components/tree.js index 1b737b94..82acc371 100644 --- a/testgen/ui/components/frontend/js/components/tree.js +++ b/testgen/ui/components/frontend/js/components/tree.js @@ -508,7 +508,8 @@ stylesheet.replace(` } .tg-tree--row.selected { - background-color: #06a04a17; + background-color: var(--sidebar-item-hover-color); + color: var(--primary-color); font-weight: 500; } diff --git a/testgen/ui/components/frontend/js/components/wizard_progress_indicator.js b/testgen/ui/components/frontend/js/components/wizard_progress_indicator.js new file mode 100644 index 00000000..88bbb789 --- /dev/null +++ b/testgen/ui/components/frontend/js/components/wizard_progress_indicator.js @@ -0,0 +1,147 @@ + +/** + * @typedef WizardStepMeta + * @type {object} + * @property {number} index + * @property {string} title + * @property {boolean} skipped + * @property {string[]} includedSteps + * + * @typedef CurrentStep + * @type {object} + * @property {number} index + * @property {string} name + * + * @param {WizardStepMeta[]} steps + * @param {CurrentStep} currentStep + * @returns + */ +import van from '../van.min.js'; +import { colorMap } from '../display_utils.js'; + +const { div, i, span } = van.tags; + +const WizardProgressIndicator = (steps, currentStep) => { + const currentPhysicalIndex =
steps.findIndex(s => s.includedSteps.includes(currentStep.name)); + const progressWidth = van.state('0px'); + + const updateProgress = () => { + const container = document.getElementById('wizard-progress-container'); + const activeIcon = document.querySelector('.step-icon-current'); + + if (container && activeIcon) { + const containerRect = container.getBoundingClientRect(); + const iconRect = activeIcon.getBoundingClientRect(); + const centerOffset = (iconRect.left - containerRect.left) + (iconRect.width / 2); + progressWidth.val = `${centerOffset}px`; + } + }; + + setTimeout(updateProgress, 10); + + const progressLineStyle = () => ` + position: absolute; + top: 10px; + left: 0; + height: 4px; + width: ${progressWidth.val}; + background: ${colorMap.green}; + transition: width 0.3s ease-out; + z-index: -4; + `; + + const currentStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 step-icon-current`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row fx-justify-center', style: `border: 2px solid var(--secondary-text-color); background: var(--dk-dialog-background); border-radius: 50%; height: 24px; width: 24px;` }, + div({ style: 'width: 14px; height: 14px; border-radius: 50%; background: var(--secondary-text-color);' }, ''), + ), + span({}, title), + ); + + const pendingStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 ${currentPhysicalIndex === stepIndex ? 'step-icon-current' : 'text-secondary'}`, style: 'position: relative;' }, + stepIndex === 0 + ? 
div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row', style: `color: var(--empty-light); border: 2px solid var(--disabled-text-color); background: var(--dk-dialog-background); border-radius: 50%;` }, + i({style: 'width: 20px; height: 20px;'}, ''), + ), + span({}, title), + ); + + const completedStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 ${currentPhysicalIndex === stepIndex ? 'step-icon-current' : 'text-secondary'}`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row', style: `color: var(--empty-light); border: 2px solid ${colorMap.green}; background: ${colorMap.green}; border-radius: 50%;` }, + i( + { + class: 'material-symbols-rounded', + style: `font-size: 20px; color: var(--empty-light);`, + }, + 'check', + ), + ), + span({}, title), + ); + + const skippedStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 ${currentPhysicalIndex === stepIndex ? 'step-icon-current' : 'text-secondary'}`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? 
div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row', style: `color: var(--empty-light); border: 2px solid var(--grey); background: var(--grey); border-radius: 50%;` }, + i( + { + class: 'material-symbols-rounded', + style: `font-size: 20px; color: var(--empty-light);`, + }, + 'remove', + ), + ), + span({}, title), + ); + + return div( + { + id: 'wizard-progress-container', + class: 'flex-row fx-justify-space-between mb-5', + style: 'position: relative; margin-top: -20px;' + }, + div({ style: `position: absolute; top: 10px; left: 0; width: 100%; height: 4px; background: var(--disabled-text-color); z-index: -5;` }), + div({ style: progressLineStyle }), + + ...steps.map((step, physicalIdx) => { + if (step.index < currentStep.index) { + if (step.skipped) return skippedStepIndicator(step.title, physicalIdx); + return completedStepIndicator(step.title, physicalIdx); + } else if (step.includedSteps.includes(currentStep.name)) { + return currentStepIndicator(step.title, physicalIdx); + } else { + return pendingStepIndicator(step.title, physicalIdx); + } + }), + ); +}; + +export { WizardProgressIndicator }; diff --git a/testgen/ui/components/frontend/js/data_profiling/column_profiling_history.js b/testgen/ui/components/frontend/js/data_profiling/column_profiling_history.js index c2ca57cc..9d79f918 100644 --- a/testgen/ui/components/frontend/js/data_profiling/column_profiling_history.js +++ b/testgen/ui/components/frontend/js/data_profiling/column_profiling_history.js @@ -68,6 +68,7 @@ stylesheet.replace(` .column-history--item { padding: 8px; + border-radius: 4px; } .column-history--item:hover { @@ -75,7 +76,7 @@ stylesheet.replace(` } .column-history--item.selected { - background-color: #06a04a17; + background-color: var(--selected-item-background); } .column-history--item.selected > div { @@ -89,7 +90,7 @@ stylesheet.replace(` .column-history--divider { 
width: 1px; - background-color: var(--grey); + background-color: var(--border-color); margin: 0 10px; } diff --git a/testgen/ui/components/frontend/js/display_utils.js b/testgen/ui/components/frontend/js/display_utils.js index 070e0d8a..c590c9a0 100644 --- a/testgen/ui/components/frontend/js/display_utils.js +++ b/testgen/ui/components/frontend/js/display_utils.js @@ -3,7 +3,10 @@ function formatTimestamp( /** @type boolean */ showYear, ) { if (timestamp) { - const date = new Date(typeof timestamp === 'number' ? timestamp * 1000 : timestamp); + let date = timestamp; + if (typeof timestamp === 'number') { + date = new Date(timestamp.toString().length === 10 ? timestamp * 1000 : timestamp); + } if (!isNaN(date)) { const months = [ 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' ]; const hours = date.getHours(); @@ -24,7 +27,17 @@ function formatDuration( const startDate = new Date(typeof startTime === 'number' ? startTime * 1000 : startTime); const endDate = new Date(typeof endTime === 'number' ? endTime * 1000 : endTime); + const totalSeconds = Math.floor((endDate.getTime() - startDate.getTime()) / 1000); + return formatDurationSeconds(totalSeconds); +} + +function formatDurationSeconds( + /** @type number */ totalSeconds, +) { + if (!totalSeconds) { + return '--'; + } let formatted = [ { value: Math.floor(totalSeconds / (3600 * 24)), unit: 'd' }, @@ -37,6 +50,36 @@ function formatDuration( return formatted.trim() || '< 1s'; } +function humanReadableDuration(/** @type string */ duration, /** @type boolean */ round = false) { + if (duration === '< 1s') { + return 'Less than 1 second'; + } + + + const unitTemplates = { + d: (/** @type number */ value) => `${value} day${value === 1 ? '' : 's'}`, + h: (/** @type number */ value) => `${value} hour${value === 1 ? '' : 's'}`, + m: (/** @type number */ value) => `${value} minute${value === 1 ? '' : 's'}`, + s: (/** @type number */ value) => `${value} second${value === 1 ? 
'' : 's'}`, + }; + + if (round) { + const biggestPart = duration.split(' ')[0]; + const durationUnit = biggestPart.slice(-1)[0]; + const durationValue = Number(biggestPart.replace(durationUnit, '')); + return unitTemplates[durationUnit](durationValue); + } + + return duration + .split(' ') + .map(part => { + const unit = part.slice(-1)[0]; + const value = Number(part.replace(unit, '')); + return unitTemplates[unit](value); + }) + .join(' '); +} + function formatNumber(/** @type number | string */ number, /** @type number */ decimals = 3) { if (!['number', 'string'].includes(typeof number) || isNaN(number)) { return '--'; @@ -82,25 +125,50 @@ const caseInsensitiveIncludes = (/** @type string */ value, /** @type string */ return !search; } +/** + * Convert viewport units to pixels using the current + * window's `innerHeight` and defaulting to the top window's + * `innerHeight` when needed. + * + * @param {number} value + * @param {('height'|'width')} dim + * @returns {number} + */ +function viewPortUnitsToPixels(value, dim) { + if (typeof value !== 'number') { + return 0; + } + + const viewPortSize = window[`inner${capitalize(dim)}`] || window.top[`inner${capitalize(dim)}`]; + return (value / 100) * viewPortSize; +} + // https://m2.material.io/design/color/the-color-system.html#tools-for-picking-colors const colorMap = { red: '#EF5350', // Red 400 + redLight: '#FFB6C180', // Clear red + redDark: '#D32F2F', // Red 700 orange: '#FF9800', // Orange 500 yellow: '#FDD835', // Yellow 600 green: '#9CCC65', // Light Green 400 + greenLight: '#90EE90FF', // Clear green limeGreen: '#C0CA33', // Lime Green 600 purple: '#AB47BC', // Purple 400 purpleLight: '#CE93D8', // Purple 200 + deepPurple: '#9575CD', // Deep Purple 300 blue: '#2196F3', // Blue 500 blueLight: '#90CAF9', // Blue 200 indigo: '#5C6BC0', // Indigo 400 teal: '#26A69A', // Teal 400 + tealDark: '#009688', // Teal 500 brown: '#8D6E63', // Brown 400 brownLight: '#D7CCC8', // Brown 100 brownDark: '#4E342E', // Brown 
800 grey: '#BDBDBD', // Gray 400 + lightGrey: '#E0E0E0', // Gray 300 empty: 'var(--empty)', // Light: Gray 200, Dark: Gray 800 emptyLight: 'var(--empty-light)', // Light: Gray 50, Dark: Gray 900 + emptyDark: 'var(--empty-dark)', // Light: Gray 400, Dark: Gray 600 emptyTeal: 'var(--empty-teal)', } @@ -109,11 +177,14 @@ const DISABLED_ACTION_TEXT = 'You do not have permissions to perform this action export { formatTimestamp, formatDuration, + formatDurationSeconds, formatNumber, capitalize, humanReadableSize, caseInsensitiveSort, caseInsensitiveIncludes, + humanReadableDuration, + viewPortUnitsToPixels, colorMap, DISABLED_ACTION_TEXT, }; diff --git a/testgen/ui/components/frontend/js/main.js b/testgen/ui/components/frontend/js/main.js index 0cf78e3a..8819548e 100644 --- a/testgen/ui/components/frontend/js/main.js +++ b/testgen/ui/components/frontend/js/main.js @@ -9,73 +9,51 @@ import van from './van.min.js'; import pluginSpec from './plugins.js'; import { Streamlit } from './streamlit.js'; import { isEqual, getParents } from './utils.js'; -import { Button } from './components/button.js' -import { Breadcrumbs } from './components/breadcrumbs.js' -import { ExpanderToggle } from './components/expander_toggle.js'; -import { Link } from './components/link.js'; -import { Paginator } from './components/paginator.js'; -import { SortingSelector } from './components/sorting_selector.js'; -import { ColumnSelector } from './components/explorer_column_selector.js'; -import { TestRuns } from './pages/test_runs.js'; -import { ProfilingRuns } from './pages/profiling_runs.js'; -import { DataCatalog } from './pages/data_catalog.js'; -import { ProjectDashboard } from './pages/project_dashboard.js'; -import { TestSuites } from './pages/test_suites.js'; -import { QualityDashboard } from './pages/quality_dashboard.js'; -import { ScoreDetails } from './pages/score_details.js'; -import { ScoreExplorer } from './pages/score_explorer.js'; -import { ColumnProfilingResults } from 
'./data_profiling/column_profiling_results.js'; -import { ColumnProfilingHistory } from './data_profiling/column_profiling_history.js'; -import { ScheduleList } from './pages/schedule_list.js'; -import { Connections } from './pages/connections.js'; -import { TableGroupWizard } from './pages/table_group_wizard.js'; -import { HelpMenu } from './components/help_menu.js' -import { TableGroupList } from './pages/table_group_list.js'; -import { TableGroupDeleteConfirmation } from './pages/table_group_delete_confirmation.js'; -import { RunProfilingDialog } from './pages/run_profiling_dialog.js'; -import { ConfirmationDialog } from './pages/confirmation_dialog.js'; -import { TestDefinitionSummary } from './pages/test_definition_summary.js'; -import { NotificationSettings } from './pages/notification_settings.js'; let currentWindowVan = van; let topWindowVan = window.top.van; -const TestGenComponent = (/** @type {string} */ id, /** @type {object} */ props) => { - const componentById = { - breadcrumbs: Breadcrumbs, - button: Button, - expander_toggle: ExpanderToggle, - link: Link, - paginator: Paginator, - sorting_selector: SortingSelector, - sidebar: window.top.testgen.components.Sidebar, - test_runs: TestRuns, - profiling_runs: ProfilingRuns, - data_catalog: DataCatalog, - column_profiling_results: ColumnProfilingResults, - column_profiling_history: ColumnProfilingHistory, - project_dashboard: ProjectDashboard, - test_suites: TestSuites, - quality_dashboard: QualityDashboard, - score_details: ScoreDetails, - score_explorer: ScoreExplorer, - schedule_list: ScheduleList, - column_selector: ColumnSelector, - connections: Connections, - table_group_wizard: TableGroupWizard, - help_menu: HelpMenu, - table_group_list: TableGroupList, - table_group_delete: TableGroupDeleteConfirmation, - run_profiling_dialog: RunProfilingDialog, - confirm_dialog: ConfirmationDialog, - test_definition_summary: TestDefinitionSummary, - notification_settings: NotificationSettings, - }; - - if 
(Object.keys(window.testgen.plugins).includes(id)) { - return window.testgen.plugins[id](props); - } else if (Object.keys(componentById).includes(id)) { - return componentById[id](props); +const componentLoaders = { + breadcrumbs: () => import('./components/breadcrumbs.js').then(m => m.Breadcrumbs), + button: () => import('./components/button.js').then(m => m.Button), + expander_toggle: () => import('./components/expander_toggle.js').then(m => m.ExpanderToggle), + link: () => import('./components/link.js').then(m => m.Link), + paginator: () => import('./components/paginator.js').then(m => m.Paginator), + sorting_selector: () => import('./components/sorting_selector.js').then(m => m.SortingSelector), + sidebar: () => Promise.resolve(window.top.testgen.components.Sidebar), + test_runs: () => import('./pages/test_runs.js').then(m => m.TestRuns), + profiling_runs: () => import('./pages/profiling_runs.js').then(m => m.ProfilingRuns), + data_catalog: () => import('./pages/data_catalog.js').then(m => m.DataCatalog), + column_profiling_results: () => import('./data_profiling/column_profiling_results.js').then(m => m.ColumnProfilingResults), + column_profiling_history: () => import('./data_profiling/column_profiling_history.js').then(m => m.ColumnProfilingHistory), + project_dashboard: () => import('./pages/project_dashboard.js').then(m => m.ProjectDashboard), + test_suites: () => import('./pages/test_suites.js').then(m => m.TestSuites), + quality_dashboard: () => import('./pages/quality_dashboard.js').then(m => m.QualityDashboard), + score_details: () => import('./pages/score_details.js').then(m => m.ScoreDetails), + score_explorer: () => import('./pages/score_explorer.js').then(m => m.ScoreExplorer), + schedule_list: () => import('./pages/schedule_list.js').then(m => m.ScheduleList), + column_selector: () => import('./components/explorer_column_selector.js').then(m => m.ColumnSelector), + connections: () => import('./pages/connections.js').then(m => m.Connections), + 
table_group_wizard: () => import('./pages/table_group_wizard.js').then(m => m.TableGroupWizard), + help_menu: () => import('./components/help_menu.js').then(m => m.HelpMenu), + table_group_list: () => import('./pages/table_group_list.js').then(m => m.TableGroupList), + table_group_delete: () => import('./pages/table_group_delete_confirmation.js').then(m => m.TableGroupDeleteConfirmation), + run_profiling_dialog: () => import('./pages/run_profiling_dialog.js').then(m => m.RunProfilingDialog), + confirm_dialog: () => import('./pages/confirmation_dialog.js').then(m => m.ConfirmationDialog), + test_definition_summary: () => import('./pages/test_definition_summary.js').then(m => m.TestDefinitionSummary), + notification_settings: () => import('./pages/notification_settings.js').then(m => m.NotificationSettings), + monitors_dashboard: () => import('./pages/monitors_dashboard.js').then(m => m.MonitorsDashboard), + table_monitoring_trends: () => import('./pages/table_monitoring_trends.js').then(m => m.TableMonitoringTrend), + test_results_chart: () => import('./pages/test_results_chart.js').then(m => m.TestResultsChart), + schema_changes_list: () => import('./components/schema_changes_list.js').then(m => m.SchemaChangesList), + edit_monitor_settings: () => import('./pages/edit_monitor_settings.js').then(m => m.EditMonitorSettings), +}; + +const TestGenComponent = async (/** @type {string} */ id, /** @type {object} */ props) => { + const loader = window.testgen.plugins[id] ?? 
componentLoaders[id]; + if (loader) { + const Component = await loader(); + return Component(props); } return ''; }; @@ -120,7 +98,7 @@ window.addEventListener('message', async (event) => { window.testgen.states[componentKey] = componentState; } - return van.add(mountPoint, TestGenComponent(componentId, componentState)); + return van.add(mountPoint, await TestGenComponent(componentId, componentState)); } for (const [ key, value ] of Object.entries(event.data.args.props)) { @@ -159,10 +137,10 @@ async function loadPlugins() { try { const modules = await Promise.all(Object.values(pluginSpec).map(plugin => import(plugin.entrypoint))) for (const pluginModule of modules) { - if (pluginModule && pluginModule.components) { - Object.assign(window.testgen.plugins, pluginModule.components) + if (pluginModule && pluginModule.componentLoaders) { + Object.assign(window.testgen.plugins, pluginModule.componentLoaders) } else if (pluginModule) { - console.warn(`Plugin '${pluginModule}' does not export a member 'components'.`); + console.warn(`Plugin '${pluginModule}' does not export a member 'componentLoaders'.`); } } } catch (error) { diff --git a/testgen/ui/components/frontend/js/pages/confirmation_dialog.js b/testgen/ui/components/frontend/js/pages/confirmation_dialog.js index c1fa1aad..a91ba8dc 100644 --- a/testgen/ui/components/frontend/js/pages/confirmation_dialog.js +++ b/testgen/ui/components/frontend/js/pages/confirmation_dialog.js @@ -11,7 +11,6 @@ * * @typedef Properties * @type {object} - * @property {string} project_code * @property {string} message * @property {Constraint?} constraint * @property {Result?} result diff --git a/testgen/ui/components/frontend/js/pages/connections.js b/testgen/ui/components/frontend/js/pages/connections.js index 959510dc..27ac6c2a 100644 --- a/testgen/ui/components/frontend/js/pages/connections.js +++ b/testgen/ui/components/frontend/js/pages/connections.js @@ -119,7 +119,7 @@ stylesheet.replace(` .tg-connections--link { margin-left: 
auto; border-radius: 4px; - background: var(--dk-card-background); + background: var(--button-generic-background-color); border: var(--button-stroked-border); padding: 8px 8px 8px 16px; color: var(--primary-color) !important; diff --git a/testgen/ui/components/frontend/js/pages/data_catalog.js b/testgen/ui/components/frontend/js/pages/data_catalog.js index 2edc1efd..1e2f4dfb 100644 --- a/testgen/ui/components/frontend/js/pages/data_catalog.js +++ b/testgen/ui/components/frontend/js/pages/data_catalog.js @@ -13,8 +13,10 @@ * @property {string} functional_data_type * @property {number} record_ct * @property {number} value_ct - * @property {number} drop_date - * @property {number} table_drop_date + * @property {string} add_date + * @property {string} drop_date + * @property {string} table_add_date + * @property {string} table_drop_date * @property {boolean} critical_data_element * @property {boolean} table_critical_data_element * @property {string} data_source @@ -117,12 +119,12 @@ const DataCatalog = (/** @type Properties */ props) => { const tables = {}; columns.forEach((item) => { - const { column_id, table_id, column_name, table_name, record_ct, value_ct, drop_date, table_drop_date } = item; + const { column_id, table_id, column_name, table_name, record_ct, value_ct, add_date, drop_date, table_add_date, table_drop_date } = item; if (!tables[table_id]) { tables[table_id] = { id: table_id, label: table_name, - classes: table_drop_date ? 'text-disabled' : '', + classes: table_drop_date ? 'text-disabled' : (table_add_date && (Date.now() - new Date(table_add_date * 1000).getTime()) < 7 * 86400000) ? 'text-bold' : '', ...TABLE_ICON, iconColor: record_ct === 0 ? 'red' : null, iconTooltip: record_ct === 0 ? 'No records detected' : null, @@ -134,7 +136,7 @@ const DataCatalog = (/** @type Properties */ props) => { const columnNode = { id: column_id, label: column_name, - classes: drop_date ? 'text-disabled' : '', + classes: drop_date ? 
'text-disabled' : (add_date && (Date.now() - new Date(add_date * 1000).getTime()) < 7 * 86400000) ? 'text-bold' : '', ...getColumnIcon(item), iconColor: value_ct === 0 ? 'red' : null, iconTooltip: value_ct === 0 ? 'No non-null values detected' : null, @@ -333,7 +335,7 @@ const ExportOptions = (/** @type TreeNode[] */ treeNodes, /** @type SelectedNode tooltip: 'Download columns to Excel', tooltipPosition: 'left', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => exportOptionsOpened.val = !exportOptionsOpened.val, }), Portal( @@ -733,7 +735,7 @@ const ConditionalEmptyState = ( color: 'primary', label: 'Run Profiling', width: 'fit-content', - style: 'margin: auto; background: background: var(--dk-card-background);', + style: 'margin: auto; background: var(--button-generic-background-color);', disabled: !userCanEdit, tooltip: userCanEdit ? null : DISABLED_ACTION_TEXT, tooltipPosition: 'bottom', diff --git a/testgen/ui/components/frontend/js/pages/edit_monitor_settings.js b/testgen/ui/components/frontend/js/pages/edit_monitor_settings.js new file mode 100644 index 00000000..1b1614ad --- /dev/null +++ b/testgen/ui/components/frontend/js/pages/edit_monitor_settings.js @@ -0,0 +1,125 @@ +/** + * @import { MonitorSuite, Schedule } from '../components/monitor_settings_form.js'; + * @import { CronSample } from '../types.js'; + * + * @typedef TableGroup + * @type {object} + * @property {string} id + * @property {string} connection_id + * @property {string} table_groups_name + * @property {string} monitor_test_suite_id + * @property {string} last_complete_profile_run_id + * + * @typedef Properties + * @type {object} + * @property {TableGroup} table_group + * @property {Schedule} schedule + * @property {MonitorSuite} monitor_suite + * @property {CronSample?} cron_sample + */ +import van from '../van.min.js'; +import { Button } from '../components/button.js'; +import { Icon } 
from '../components/icon.js'; +import { MonitorSettingsForm } from '../components/monitor_settings_form.js'; +import { emitEvent, getValue, isEqual } from '../utils.js'; +import { Streamlit } from '../streamlit.js'; + +const { div, span } = van.tags; + +/** + * + * @param {Properties} props + * @returns + */ +const EditMonitorSettings = (props) => { + window.testgen.isPage = true; + + const tableGroup = getValue(props.table_group); + + const schedule = getValue(props.schedule); + const updatedSchedule = van.state(schedule); + + const monitorSuite = getValue(props.monitor_suite); + const updatedMonitorSuite = van.state(monitorSuite); + + const formState = van.state({dirty: false, valid: false}); + + return div( + {}, + div( + { class: 'flex-row fx-gap-1 mb-5 text-large' }, + span({ class: 'text-secondary' }, 'Table Group:'), + span(tableGroup.table_groups_name), + ), + MonitorSettingsForm( + { + schedule: props.schedule, + monitorSuite: props.monitor_suite, + cronSample: props.cron_sample, + onChange: (schedule, monitorSuite, state) => { + formState.val = state; + updatedSchedule.val = schedule; + updatedMonitorSuite.val = monitorSuite; + }, + }, + ), + div( + { class: 'flex-row fx-justify-space-between fx-gap-3 mt-4' }, + !monitorSuite.id + ? div( + { class: 'flex-row fx-gap-1' }, + Icon({ size: 16 }, 'info'), + span( + { class: 'text-caption' }, + tableGroup.last_complete_profile_run_id + ? 'Monitors will be configured based on latest profiling and run periodically on schedule.' + : 'Monitors will be configured after first profiling and run periodically on schedule.' 
+ ), + ) + : span({}), + Button({ + label: 'Save', + color: 'primary', + type: 'flat', + width: 'auto', + disabled: () => !formState.val.dirty || !formState.val.valid, + onclick: () => { + const payload = { + schedule: updatedSchedule.val, + monitor_suite: updatedMonitorSuite.val, + }; + emitEvent('SaveSettingsClicked', { payload }); + }, + }), + ), + ); +}; + +export { EditMonitorSettings }; + +export default (component) => { + const { data, setStateValue, setTriggerValue, parentElement } = component; + + Streamlit.enableV2(setTriggerValue); + + let componentState = parentElement.state; + if (componentState === undefined) { + componentState = {}; + for (const [ key, value ] of Object.entries(data)) { + componentState[key] = van.state(value); + } + + parentElement.state = componentState; + van.add(parentElement, EditMonitorSettings(componentState)); + } else { + for (const [ key, value ] of Object.entries(data)) { + if (!isEqual(componentState[key].val, value)) { + componentState[key].val = value; + } + } + } + + return () => { + parentElement.state = null; + }; +}; diff --git a/testgen/ui/components/frontend/js/pages/edit_table_monitors.js b/testgen/ui/components/frontend/js/pages/edit_table_monitors.js new file mode 100644 index 00000000..5fd564ae --- /dev/null +++ b/testgen/ui/components/frontend/js/pages/edit_table_monitors.js @@ -0,0 +1,322 @@ +/** + * @import { TestDefinition } from '../components/test_definition_form.js'; + * + * @typedef Properties + * @type {object} + * @property {string} table_name + * @property {TestDefinition[]} definitions + * @property {object} metric_test_type + * @property {{ success: boolean, timestamp: string }?} result + */ + +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { emitEvent, getValue, loadStylesheet, isEqual } from '../utils.js'; +import { Button } from '../components/button.js'; +import { Card } from '../components/card.js'; +import { Icon } from '../components/icon.js'; +import 
{ TestDefinitionForm } from '../components/test_definition_form.js'; + +const { div, span } = van.tags; + +const defaultMonitorOptions = [ + { key: 'Freshness_Trend', label: 'Freshness' }, + { key: 'Volume_Trend', label: 'Volume' }, +]; + +const EditTableMonitors = (/** @type Properties */ props) => { + loadStylesheet('edit-table-monitors', stylesheet); + window.testgen.isPage = true; + + const metricTestType = getValue(props.metric_test_type); + + const updatedDefinitions = van.state({}); // { [id]: changes } - only changes for existing definitions + const newMetrics = van.state({}); // { [tempId]: metric } + const deletedMetricIds = van.state([]); + + const showSaveSuccess = van.state(false); + let lastSaveTimestamp = null; + + van.derive(() => { + const result = getValue(props.result); + if (result?.success && result.timestamp !== lastSaveTimestamp) { + lastSaveTimestamp = result.timestamp; + showSaveSuccess.val = true; + updatedDefinitions.val = {}; + newMetrics.val = {}; + deletedMetricIds.val = []; + formStates.val = {}; + setTimeout(() => { showSaveSuccess.val = false; }, 2000); + } + }); + + const formStates = van.state({}); // { [id]: { dirty, valid } } + const isDirty = van.derive(() => { + return Object.values(formStates.val).some(s => s.dirty) // changes + || Object.keys(newMetrics.val).length // adds + || deletedMetricIds.val.length; // deletes + }); + const isValid = van.derive(() => Object.values(formStates.val).every(s => s.valid)); + + const existingMetrics = van.derive(() => Object.fromEntries( + getValue(props.definitions).filter(td => td.test_type === 'Metric_Trend').map(metric => [metric.id, metric]) + )); + const displayedMetrics = van.derive(() => { + const existing = Object.values(existingMetrics.val).filter(metric => !deletedMetricIds.val.includes(metric.id)); + return [...existing, ...Object.values(newMetrics.val)]; + }); + const selectedItem = van.state({ type: 'Freshness_Trend', id: null }); + + return div( + div( + { class: 
'edit-monitors flex-row fx-align-stretch' }, + div( + { class: 'edit-monitors--list' }, + defaultMonitorOptions.map(({ key, label }) => div( + { + class: () => `edit-monitors--item clickable p-2 border-radius-1 ${selectedItem.val.type === key ? 'selected' : ''}`, + onclick: () => selectedItem.val = { type: key, id: null }, + }, + span(label), + )), + div({ class: 'edit-monitors--list-divider mt-3 mb-1' }), + div( + { class: 'flex-row fx-justify-space-between fx-align-center mb-2' }, + span({ class: 'text-secondary' }, 'Metrics'), + Button({ + icon: 'add', + label: 'Add', + width: 'auto', + color: 'primary', + onclick: () => { + const tempId = `temp_${Date.now()}`; + const newMetric = { + _tempId: tempId, + column_name: '', + custom_query: '', + history_calculation: 'PREDICT', + history_calculation_upper: null, + history_lookback: null, + ...metricTestType, + }; + newMetrics.val = { ...newMetrics.val, [tempId]: newMetric }; + selectedItem.val = { type: 'Metric_Trend', id: tempId }; + }, + }), + ), + () => displayedMetrics.val.length + ? div( + displayedMetrics.val.map(metric => { + const id = metric.id || metric._tempId; + const isNew = !metric.id; + + return div( + { + class: () => `edit-monitors--item clickable p-2 pr-0 border-radius-1 flex-row fx-justify-space-between ${selectedItem.val.id === id ? 'selected' : ''}`, + onclick: () => selectedItem.val = { type: 'Metric_Trend', id }, + }, + span( + { style: `text-overflow: ellipsis; ${!metric.column_name ? 
'font-style: italic;' : ''}` }, + metric.column_name || '(Unnamed Metric)', + ), + Button({ + type: 'icon', + icon: 'delete', + onclick: (event) => { + // Prevent bubbling the event and triggering the parent's onclick + event.stopPropagation(); + if (isNew) { + const { [id]: _removed, ...remaining } = newMetrics.val; + newMetrics.val = remaining; + } else { + deletedMetricIds.val = [...deletedMetricIds.val, id]; + const { [id]: _removedDef, ...remainingDefs } = updatedDefinitions.val; + updatedDefinitions.val = remainingDefs; + } + const { [id]: _removedState, ...remainingStates } = formStates.val; + formStates.val = remainingStates; + if (selectedItem.val.id === id) { + selectedItem.val = { type: 'Freshness_Trend', id: null }; + } + }, + }), + ); + }), + ) + : div( + { class: 'flex-row fx-justify-center text-caption', style: 'height: 100px;' }, + 'No metrics defined yet', + ), + ), + span({ class: 'edit-monitors--divider' }), + () => { + const { type, id } = selectedItem.val; + + if (type === 'Metric_Trend') { + const isNew = id.startsWith('temp_'); + const metricDefinition = isNew + ? 
newMetrics.rawVal[id] + : { ...existingMetrics.val[id], ...updatedDefinitions.rawVal[id] }; + + return TestDefinitionForm({ + definition: metricDefinition, + class: 'edit-monitors--form', + onChange: (changes, state) => { + if (isNew) { + newMetrics.val = { + ...newMetrics.val, + [id]: { ...newMetrics.val[id], ...changes }, + }; + } else { + updatedDefinitions.val = { + ...updatedDefinitions.val, + [id]: { ...changes, id }, + }; + } + formStates.val = { ...formStates.val, [id]: state }; + }, + }); + } + + const selectedDef = getValue(props.definitions).find(td => td.test_type === type); + if (!selectedDef) { + return Card({ + class: 'edit-monitors--empty flex-row fx-justify-center', + content: 'Monitor not configured for this table.', + }); + } + + return TestDefinitionForm({ + definition: { ...selectedDef, ...updatedDefinitions.rawVal[selectedDef.id] }, + class: 'edit-monitors--form', + onChange: (changes, state) => { + updatedDefinitions.val = { + ...updatedDefinitions.val, + [selectedDef.id]: { ...changes, id: selectedDef.id }, + }; + formStates.val = { ...formStates.val, [selectedDef.id]: state }; + }, + }); + }, + ), + div( + { class: 'edit-monitors--footer flex-row fx-gap-3 fx-justify-content-flex-end fx-align-center mt-4 pt-4' }, + () => showSaveSuccess.val + ? 
span( + { class: 'flex-row fx-gap-1 text-secondary mr-4' }, + Icon({ style: 'color: var(--green);'}, 'check_circle'), + 'Changes saved', + ) + : '', + Button({ + label: 'Save', + color: 'primary', + type: 'stroked', + width: 'auto', + disabled: () => !isDirty.val || !isValid.val, + onclick: () => { + const payload = { + updated_definitions: Object.values(updatedDefinitions.val), + new_metrics: Object.values(newMetrics.val), + deleted_metric_ids: deletedMetricIds.val, + }; + emitEvent('SaveTestDefinition', { payload }); + }, + }), + Button({ + label: 'Save & Close', + color: 'primary', + type: 'flat', + width: 'auto', + disabled: () => !isDirty.val || !isValid.val, + onclick: () => { + const payload = { + updated_definitions: Object.values(updatedDefinitions.val), + new_metrics: Object.values(newMetrics.val), + deleted_metric_ids: deletedMetricIds.val, + close: true, + }; + emitEvent('SaveTestDefinition', { payload }); + }, + }), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.edit-monitors { + min-height: 350px; +} + +.edit-monitors--list { + flex: 200px 0 0; +} + +.edit-monitors--item { + height: 40px; +} + +.edit-monitors--item:hover { + background-color: var(--sidebar-item-hover-color); +} + +.edit-monitors--item.selected { + background-color: #06a04a17; +} + +.edit-monitors--item.selected > span { + font-weight: 500; +} + +.edit-monitors--list-divider { + height: 1px; + background-color: var(--border-color); +} + +.edit-monitors--divider { + width: 2px; + background-color: var(--border-color); + margin: 0 12px; +} + +.edit-monitors--form { + flex: auto; +} + +.edit-monitors--empty { + flex: 1; + margin: 0; +} + +.edit-monitors--footer { + border-top: 1px solid var(--border-color); +} +`); + +export { EditTableMonitors }; + +export default (component) => { + const { data, setTriggerValue, parentElement } = component; + Streamlit.enableV2(setTriggerValue); + + let componentState = parentElement.state; + if (componentState === 
undefined) { + componentState = {}; + for (const [key, value] of Object.entries(data)) { + componentState[key] = van.state(value); + } + parentElement.state = componentState; + van.add(parentElement, EditTableMonitors(componentState)); + } else { + for (const [key, value] of Object.entries(data)) { + if (!isEqual(componentState[key].val, value)) { + componentState[key].val = value; + } + } + } + + return () => { + parentElement.state = null; + }; +}; diff --git a/testgen/ui/components/frontend/js/pages/monitors_dashboard.js b/testgen/ui/components/frontend/js/pages/monitors_dashboard.js new file mode 100644 index 00000000..b5294681 --- /dev/null +++ b/testgen/ui/components/frontend/js/pages/monitors_dashboard.js @@ -0,0 +1,640 @@ +/** + * @import { MonitorSummary } from '../components/monitor_anomalies_summary.js'; + * @import { CronSample, FilterOption, ProjectSummary } from '../types.js'; + * + * @typedef Schedule + * @type {object} + * @property {boolean} active + * @property {string} cron_tz + * @property {CronSample} cron_sample + * + * @typedef Monitor + * @type {object} + * @property {string} table_group_id + * @property {string} table_name + * @property {('modified'|'added'|'dropped')} table_state + * @property {number?} freshness_anomalies + * @property {number?} volume_anomalies + * @property {number?} schema_anomalies + * @property {number?} metric_anomalies + * @property {string?} freshness_error_message + * @property {string?} volume_error_message + * @property {string?} schema_error_message + * @property {string?} metric_error_message + * @property {boolean?} freshness_is_training + * @property {boolean?} volume_is_training + * @property {boolean?} metric_is_training + * @property {boolean} freshness_is_pending + * @property {boolean} volume_is_pending + * @property {boolean} schema_is_pending + * @property {boolean} metric_is_pending + * @property {number?} lookback_start + * @property {number?} lookback_end + * @property {string?} latest_update + * 
@property {number?} row_count + * @property {number?} previous_row_count + * @property {number?} column_adds + * @property {number?} column_drops + * @property {number?} column_mods + * + * @typedef MonitorList + * @type {object} + * @property {Monitor[]} items + * @property {number} current_page + * @property {number} items_per_page + * @property {number} total_count + * + * @typedef MonitorListFilters + * @type {object} + * @property {string?} table_group_id + * @property {string?} table_name_filter + * @property {string?} anomaly_type_filter + * + * @typedef MonitorListSort + * @type {object} + * @property {string?} sort_field + * @property {('asc'|'desc')?} sort_order + * + * @typedef Permissions + * @type {object} + * @property {boolean} can_edit + * + * @typedef TableGroupFilterOption + * @type {FilterOption & { has_monitors: boolean }} + * + * @typedef Properties + * @type {object} + * @property {ProjectSummary} project_summary + * @property {MonitorSummary?} summary + * @property {Schedule?} schedule + * @property {TableGroupFilterOption[]} table_group_filter_options + * @property {boolean?} has_monitor_test_suite + * @property {string?} auto_open_table + * @property {MonitorList} monitors + * @property {MonitorListFilters} filters + * @property {MonitorListSort?} sort + * @property {Permissions} permissions + */ +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { emitEvent, getValue, loadStylesheet } from '../utils.js'; +import { formatDuration, formatTimestamp, humanReadableDuration, formatNumber, viewPortUnitsToPixels } from '../display_utils.js'; +import { Button } from '../components/button.js'; +import { Select } from '../components/select.js'; +import { Input } from '../components/input.js'; +import { Checkbox } from '../components/checkbox.js'; +import { EmptyState, EMPTY_STATE_MESSAGE } from '../components/empty_state.js'; +import { Icon } from '../components/icon.js'; +import { Table } from 
'../components/table.js'; +import { withTooltip } from '../components/tooltip.js'; +import { AnomaliesSummary } from '../components/monitor_anomalies_summary.js'; + +const { div, i, span, b } = van.tags; +const SHOW_CHANGES_COLUMNS_KEY = 'testgen__monitors__showchanges'; + +const MonitorsDashboard = (/** @type Properties */ props) => { + loadStylesheet('monitors-dashboard', stylesheet); + Streamlit.setFrameHeight(viewPortUnitsToPixels(90, 'height')); + window.testgen.isPage = true; + + let renderTime = new Date(); + const tableGroupFilterValue = van.derive(() => getValue(props.filters).table_group_id ?? null); + const tableNameFilterValue = van.derive(() => getValue(props.filters).table_name_filter ?? null); + const anomalyTypeFilterValue = van.derive(() => getValue(props.filters).anomaly_type_filter ?? []); + const tableSort = van.derive(() => { + const sort = getValue(props.sort); + return { + field: sort?.sort_field, + order: sort?.sort_order, + onSortChange: (sort) => emitEvent('SetParamValues', { payload: { sort_field: sort.field ?? null, sort_order: sort.order ?? null } }), + }; + }); + const showChangesColumns = van.state(Boolean(window.localStorage?.getItem(SHOW_CHANGES_COLUMNS_KEY) === '1')); + const setShowChanges = (value) => { + showChangesColumns.val = value ?? 
false; + window.localStorage?.setItem(SHOW_CHANGES_COLUMNS_KEY, Number(showChangesColumns.val)) + }; + const tablePaginator = van.derive(() => { + const result = getValue(props.monitors); + return { + currentPageIdx: result.current_page, + itemsPerPage: result.items_per_page, + totalItems: result.total_count, + onPageChange: (page, pageSize) => emitEvent('SetParamValues', { payload: { current_page: page, items_per_page: pageSize } }), + leftContent: div( + { class: 'ml-2' }, + Checkbox({ + label: span({ class: 'mr-1' }, 'Show changes'), + checked: showChangesColumns, + disabled: false, + onChange: setShowChanges, + }), + ), + }; + }); + const autoOpenTable = getValue(props.auto_open_table); + if (autoOpenTable) { + setTimeout(() => emitEvent('OpenMonitoringTrends', { payload: { table_name: autoOpenTable } }), 0); + } + + const openChartsDialog = (monitor) => emitEvent('OpenMonitoringTrends', { payload: { table_name: monitor.table_name }}); + + + const tableRows = van.derive(() => { + const result = getValue(props.monitors); + renderTime = new Date(); + return result.items.map(monitor => { + const rowCountChange = (monitor.row_count ?? 0) - (monitor.previous_row_count ?? 0); + + return { + _hasAnomalies: monitor.freshness_anomalies || monitor.volume_anomalies || monitor.schema_anomalies || monitor.metric_anomalies, + table_name: () => ['added', 'dropped'].includes(monitor.table_state) + ? withTooltip( + span( + { + class: monitor.table_state === 'dropped' ? 'text-disabled' : '', + style: `position: relative; ${monitor.table_state === 'added' ? 
'font-weight: 500;' : ''}`, + }, + monitor.table_name, + ), + { text: `Table ${monitor.table_state}` }, + ) + : monitor.table_name, + freshness_anomalies: () => AnomalyTag(monitor.freshness_anomalies, monitor.freshness_error_message, monitor.freshness_is_training, monitor.freshness_is_pending, () => openChartsDialog(monitor)), + volume_anomalies: () => AnomalyTag(monitor.volume_anomalies, monitor.volume_error_message, monitor.volume_is_training, monitor.volume_is_pending, () => openChartsDialog(monitor)), + schema_anomalies: () => AnomalyTag(monitor.schema_anomalies, monitor.schema_error_message, false, monitor.schema_is_pending, () => openChartsDialog(monitor)), + metric_anomalies: () => AnomalyTag(monitor.metric_anomalies, monitor.metric_error_message, monitor.metric_is_training, monitor.metric_is_pending, () => openChartsDialog(monitor)), + latest_update: () => monitor.latest_update + ? withTooltip( + span( + {class: 'text-small', style: 'position: relative;'}, + `${humanReadableDuration(formatDuration(monitor.latest_update, renderTime), true)} ago`, + ), + { text: `Latest update detected: ${formatTimestamp(monitor.latest_update)}` }, + ) + : span({class: 'text-small text-secondary'}, '-'), + row_count: () => rowCountChange !== 0 ? + withTooltip( + div( + {class: 'flex-row fx-gap-1', style: 'position: relative; display: inline-flex;'}, + Icon( + {style: 'font-size: 20px; color: var(--primary-text-color);'}, + rowCountChange > 0 ? 'arrow_upward' : 'arrow_downward', + ), + span({class: 'text-small'}, formatNumber(Math.abs(rowCountChange))), + ), + { + text: div( + {class: 'flex-column fx-align-flex-start mb-1'}, + span(`Previous count: ${formatNumber(monitor.previous_row_count)}`), + span(`Latest count: ${formatNumber(monitor.row_count)}`), + span(`Percent change: ${monitor.previous_row_count ? 
formatNumber(rowCountChange * 100 / monitor.previous_row_count, 2) : '100'}%`), + ), + }, + ) + : span({class: 'text-small text-secondary'}, '-'), + schema_changes: () => monitor.schema_anomalies ? + withTooltip( + div( + { + class: 'flex-row fx-gap-1 schema-changes', + onclick: () => { + const summary = getValue(props.summary); + emitEvent('OpenSchemaChanges', { payload: { + table_name: monitor.table_name, + start_time: summary?.lookback_start, + end_time: summary?.lookback_end, + }}); + }, + }, + monitor.table_state === 'added' + ? Icon({size: 20, classes: 'schema-icon', filled: true}, 'add_box') + : null, + monitor.table_state === 'dropped' + ? Icon({size: 20, classes: 'schema-icon', filled: true}, 'indeterminate_check_box') + : null, + monitor.column_adds ? div( + {class: 'flex-row'}, + Icon({size: 20, classes: 'schema-icon'}, 'add'), + span({class: 'text-small'}, formatNumber(monitor.column_adds)), + ) : null, + monitor.column_drops ? div( + {class: 'flex-row'}, + Icon({size: 20, classes: 'schema-icon'}, 'remove'), + span({class: 'text-small'}, formatNumber(monitor.column_drops)), + ) : null, + monitor.column_mods ? div( + {class: 'flex-row'}, + Icon({size: 18, classes: 'schema-icon'}, 'change_history'), + span({class: 'text-small'}, formatNumber(monitor.column_mods)), + ) : null, + ), + { + text: div( + {class: 'flex-column fx-align-flex-start'}, + monitor.table_state === 'added' + ? span({class: 'mb-1', style: 'font-size: 14px;'}, 'Table added.') + : null, + monitor.table_state === 'dropped' + ? span({class: 'mb-1', style: 'font-size: 14px;'}, 'Table dropped.') + : null, + b({class: 'mb-1'}, 'Columns'), + monitor.column_adds ? span(`Added: ${monitor.column_adds}`) : null, + monitor.column_drops ? span(`Dropped: ${monitor.column_drops}`) : null, + monitor.column_mods ? 
span(`Modified: ${monitor.column_mods}`) : null, + ), + width: 200, + position: 'right', + }, + ) : span({class: 'text-small text-secondary'}, '-'), + action: () => div( + { class: 'flex-row fx-justify-center fx-gap-2' }, + Button({ + icon: 'insights', + type: 'icon', + tooltip: 'View table trends', + tooltipPosition: 'top-left', + style: 'color: var(--secondary-text-color);', + onclick: () => openChartsDialog(monitor), + }), + getValue(props.permissions)?.can_edit + ? Button({ + icon: 'edit', + type: 'icon', + tooltip: 'Edit table monitors', + tooltipPosition: 'top-left', + style: 'color: var(--secondary-text-color);', + onclick: () => emitEvent('EditTableMonitors', { payload: { table_name: monitor.table_name }}), + }) + : null, + ), + }; + }); + }); + + const userCanEdit = getValue(props.permissions)?.can_edit ?? false; + const projectSummary = getValue(props.project_summary); + + return projectSummary.table_group_count > 0 + ? div( + {style: 'height: 100%;'}, + div( + { class: 'flex-row fx-align-flex-end fx-justify-space-between fx-gap-4 fx-flex-wrap mb-4' }, + Select({ + label: 'Table Group', + value: tableGroupFilterValue, + options: (getValue(props.table_group_filter_options) ?? []).map(option => ({ + ...option, + label: span( + { class: 'flex-row fx-gap-2' }, + span({ class: `has-monitors dot text-disabled ${option.has_monitors ? '' : 'invisible'}` }), + option.label, + ), + })), + allowNull: false, + style: 'font-size: 14px;', + testId: 'table-group-filter', + onChange: (value) => emitEvent('SetParamValues', {payload: {table_group_id: value, table_name: null}}), + }), + () => getValue(props.has_monitor_test_suite) + ? AnomaliesSummary(getValue(props.summary), 'Total anomalies', { + onTagClick: (type) => { + const current = anomalyTypeFilterValue.val; + const newFilter = current.length === 1 && current[0] === type ? 
null : type; + emitEvent('SetParamValues', { payload: { anomaly_type_filter: newFilter, current_page: 0 } }); + }, + activeTypes: anomalyTypeFilterValue, + }) + : '', + () => getValue(props.has_monitor_test_suite) && userCanEdit + ? div( + {class: 'flex-row fx-gap-3'}, + Button({ + icon: 'notifications', + tooltip: 'Configure email notifications for table group monitors', + tooltipPosition: 'bottom-left', + color: 'basic', + type: 'stroked', + style: 'background: var(--button-generic-background-color);', + onclick: () => emitEvent('EditNotifications', {}), + }), + Button({ + icon: 'settings', + tooltip: 'Edit monitor settings for table group', + tooltipPosition: 'bottom-left', + color: 'basic', + type: 'stroked', + style: 'background: var(--button-generic-background-color);', + onclick: () => emitEvent('EditMonitorSettings', {}), + }), + Button({ + icon: 'delete', + tooltip: 'Delete all monitors for table group', + tooltipPosition: 'bottom-left', + color: 'basic', + type: 'stroked', + style: 'background: var(--button-generic-background-color);', + onclick: () => emitEvent('DeleteMonitorSuite', {}), + }), + ) + : '', + ), + () => getValue(props.has_monitor_test_suite) ? 
Table(
+ {
+ header: () => div(
+ {class: 'flex-row fx-align-flex-end fx-gap-3 p-4 pt-2 pb-2'},
+ Input({
+ id: 'search-tables',
+ name: 'search-tables',
+ placeholder: 'Search tables',
+ clearable: true,
+ width: 230,
+ style: 'font-size: 14px;',
+ icon: 'search',
+ testId: 'search-tables',
+ value: tableNameFilterValue,
+ onChange: (value, state) => emitEvent('SetParamValues', {payload: {table_name_filter: value, current_page: 0}}),
+ }),
+ Select({
+ label: 'Anomaly type',
+ value: anomalyTypeFilterValue,
+ options: [
+ { label: 'Freshness', value: 'freshness' },
+ { label: 'Volume', value: 'volume' },
+ { label: 'Schema', value: 'schema' },
+ { label: 'Metrics', value: 'metrics' },
+ ],
+ multiSelect: true,
+ width: 200,
+ onChange: (values) => emitEvent('SetParamValues', {
+ payload: { anomaly_type_filter: values.length ? values.join(',') : null, current_page: 0 },
+ }),
+ }),
+ span({class: 'fx-flex'}, ''),
+ () => {
+ const schedule = getValue(props.schedule);
+ if (schedule && !schedule.active) {
+ return div(
+ { class: 'flex-row fx-gap-1' },
+ Icon({ style: 'font-size: 16px; color: var(--purple);' }, 'info'),
+ span(
+ { style: 'color: var(--purple);' },
+ 'Monitor schedule is paused.',
+ ),
+ );
+ }
+ if (schedule && schedule.cron_sample.samples) {
+ return withTooltip(
+ span(
+ { class: 'text-caption', style: 'position: relative;' },
+ `Next run: ${formatTimestamp(schedule.cron_sample.samples[0])}`,
+ ),
+ {
+ text: `Schedule: ${schedule.cron_sample.readable_expr} (${schedule.cron_tz})`,
+ width: 150,
+ },
+ );
+ }
+ return '';
+ },
+ ),
+ columns: () => {
+ const lookback = getValue(props.summary)?.lookback ?? 0;
+ const numRuns = lookback === 1 ? 'run' : `${lookback} runs`;
+ const showChanges = showChangesColumns.val;
+
+ return [
+ [
+ {name: 'filler_1', colspan: 1, label: ''},
+ {name: 'anomalies', label: `Anomalies in last ${numRuns}`, colspan: 4, padding: 8, align: 'center'},
+
+ ...(
+ showChanges
+ ? 
[ + {name: 'changes', label: `Changes in last ${numRuns}`, colspan: 3, padding: 8, align: 'center'}, + {name: 'filler_2', label: ''}, + ] + : [] + ), + ], + [ + {name: 'table_name', label: 'Table', width: 200, align: 'left', sortable: true, overflow: 'visible'}, + {name: 'freshness_anomalies', label: 'Freshness', width: 85, align: 'left', sortable: true, overflow: 'visible'}, + {name: 'volume_anomalies', label: 'Volume', width: 85, align: 'left', sortable: true, overflow: 'visible'}, + {name: 'schema_anomalies', label: 'Schema', width: 85, sortable: true, align: 'left'}, + {name: 'metric_anomalies', label: 'Metrics', width: 85, sortable: true, align: 'left', overflow: 'visible'}, + + ...( + showChanges + ? [ + {name: 'latest_update', label: 'Latest Update', width: 150, align: 'left', sortable: true, overflow: 'visible'}, + {name: 'row_count', label: 'Row Count', width: 150, align: 'left', sortable: true, overflow: 'visible'}, + {name: 'schema_changes', label: 'Schema', width: 150, align: 'left', overflow: 'visible'}, + ] + : [] + ), + + { + name: 'action', + label: showChanges ? `View trends | + Edit monitors` : 'View trends | Edit monitors', // Formatted this way for white-space: pre-line + width: 100, + align: 'center', + overflow: 'visible', + }, + ], + ]; + }, + emptyState: div( + {class: 'flex-row fx-justify-center empty-table-message'}, + span( + {class: 'text-secondary'}, + 'No tables found matching filters', + ), + ), + sort: tableSort, + paginator: tablePaginator, + rowClass: (row) => row._hasAnomalies ? 
'has-anomalies' : '', + }, + tableRows, + ) + : ConditionalEmptyState(projectSummary, userCanEdit), + ) + : ConditionalEmptyState(projectSummary, userCanEdit); +} + +/** + * @param {number?} anomalies + * @param {string?} errorMessage + * @param {boolean} isTraining + * @param {boolean} isPending + * @param {Function} onClick + */ +const AnomalyTag = (anomalies, errorMessage = null, isTraining = false, isPending = false, onClick = undefined) => { + if (isPending) { + return withTooltip( + span({class: 'text-secondary pl-2 pr-2', style: 'position: relative;'}, '-'), + { text: 'No results yet or not configured' }, + ); + } + + const hasErrors = !!errorMessage; + const content = van.derive(() => { + if (anomalies > 0) { + return span(anomalies); + } + if (hasErrors) { + return withTooltip( + i({class: 'material-symbols-rounded'}, 'warning'), + { + text: div( + { class: 'flex-column fx-gap-2 text-left' }, + span('Error in latest run. Reconfigure the monitor or contact support.'), + i(errorMessage), + ), + width: 360, + }, + ); + } + if (isTraining) { + return withTooltip( + i({class: 'material-symbols-rounded'}, 'more_horiz'), + {text: 'Training model'}, + ); + } + return i({class: 'material-symbols-rounded'}, 'check'); + }); + + return div( + { class: `anomaly-tag-wrapper flex-row p-1 ${onClick ? 'clickable' : ''}`, onclick: onClick }, + div( + { + class: `anomaly-tag ${anomalies > 0 ? 'has-anomalies' : ''} ${hasErrors ? 'has-errors' : ''} ${isTraining ? 
'is-training' : ''}`, + }, + content, + ), + ); +}; + +/** + * @param {ProjectSummary} projectSummary + * @param {boolean} userCanEdit + */ +const ConditionalEmptyState = (projectSummary, userCanEdit) => { + let args = { + label: 'No monitors yet for table group', + message: EMPTY_STATE_MESSAGE.monitors, + button: Button({ + type: 'stroked', + icon: 'settings', + label: 'Configure Monitors', + color: 'primary', + style: 'width: unset;', + disabled: !userCanEdit, + onclick: () => emitEvent('EditMonitorSettings', {}), + }), + } + if (projectSummary.connection_count <= 0) { + args = { + label: 'Your project is empty', + message: EMPTY_STATE_MESSAGE.connection, + link: { + label: 'Go to Connections', + href: 'connections', + params: { project_code: projectSummary.project_code }, + }, + }; + } else if (projectSummary.table_group_count <= 0) { + args = { + label: 'Your project is empty', + message: EMPTY_STATE_MESSAGE.tableGroup, + link: { + label: 'Go to Table Groups', + href: 'table-groups', + params: { + project_code: projectSummary.project_code, + connection_id: projectSummary.default_connection_id, + }, + }, + }; + } + + return EmptyState({ + icon: 'apps_outage', + ...args, + }); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.empty-table-message { + min-height: 300px; +} + +.has-monitors { + font-size: 5px; +} + +.tg-select--field .has-monitors { + display: none; +} + +th.tg-table-column.action span { + white-space: pre-line; + text-transform: none; +} + +.tg-table-column.table_name, +.tg-table-column.freshness_anomalies, +.tg-table-column.latest_update, +.tg-table-cell.table_name, +.tg-table-cell.freshness_anomalies, +.tg-table-cell.latest_update { + padding-left: 16px !important; +} + +.tg-table-column.table_name, +.tg-table-column.metric_anomalies, +.tg-table-column.schema_changes, +.tg-table-cell.table_name, +.tg-table-cell.metric_anomalies, +.tg-table-cell.schema_changes { + border-right: 1px dashed var(--border-color); +} + 
+.tg-table-cell.schema_changes { + padding-right: 0; + padding-left: 0; +} + +.schema-changes { + position: relative; + display: inline-flex; + cursor: pointer; + padding: 4px; + border-radius: 4px; +} + +.schema-changes:hover { + background: var(--select-hover-background); +} + +.tg-icon.schema-icon { + cursor: pointer; + color: var(--primary-text-color); +} + +.anomaly-tag-wrapper { + width: fit-content; + border-radius: 4px; +} +.anomaly-tag-wrapper.clickable:hover { + background: var(--select-hover-background); +} + +tr.has-anomalies { + background-color: rgba(239, 83, 80, 0.08); +} +`); + +export { MonitorsDashboard }; diff --git a/testgen/ui/components/frontend/js/pages/notification_settings.js b/testgen/ui/components/frontend/js/pages/notification_settings.js index 77c1374b..570115de 100644 --- a/testgen/ui/components/frontend/js/pages/notification_settings.js +++ b/testgen/ui/components/frontend/js/pages/notification_settings.js @@ -7,7 +7,13 @@ * @property {string[]} recipients * @property {string} trigger * @property {boolean} enabled + * @property {string[]} duplicates * + * @typedef Subtitle + * @type {object} + * @property {string} label + * @property {string} value + * * @typedef Permissions * @type {object} * @property {boolean} can_edit @@ -28,6 +34,7 @@ * @property {import('../components/select.js').Option[]} trigger_options * @property {Boolean} cde_enabled; * @property {Boolean} total_enabled; + * @property {Subtitle?} subtitle * @property {Result?} result */ import van from '../van.min.js'; @@ -98,6 +105,8 @@ const NotificationSettings = (/** @type Properties */ props) => { isEdit: van.state(false), }; + const subtitle = getValue(props.subtitle); + const resetForm = () => { newNotificationItemForm.id.val = null ; newNotificationItemForm.scope.val = null; @@ -109,7 +118,7 @@ const NotificationSettings = (/** @type Properties */ props) => { } van.derive(() => { - if (getValue(props.result)?.success) { + if (getValue(props.result)?.success && 
newNotificationItemForm.isEdit.rawVal) { resetForm(); } }); @@ -121,76 +130,90 @@ const NotificationSettings = (/** @type Properties */ props) => { ) => { const showTotalScore = totalScoreEnabled && item.total_score_threshold !== '0.0'; const showCdeScore = cdeScoreEnabled && item.cde_score_threshold !== '0.0'; + const duplicatedMessage = item.duplicates?.length + ? `This notification will be delivered multiple times for: ${item.duplicates.join(', ')}` + : ''; + return div( - { class: () => `table-row flex-row ${newNotificationItemForm.isEdit.val && newNotificationItemForm.id.val === item.id ? 'notifications--editing-row' : ''}` }, - event === 'score_drop' - ? div( - { style: `flex: ${columns[0]}%`, class: 'flex-column fx-gap-1 score-threshold' }, - showTotalScore ? div('Total score: ', b(item.total_score_threshold)) : '', - showCdeScore ? div(`${showTotalScore ? 'or ' : ''}CDE score: `, b(item.cde_score_threshold)) : '', - ) - : div( - { style: `flex: ${columns[0]}%` }, - div(scopeLabel(item.scope)), - div({ class: 'text-caption mt-1' }, triggerLabel(item.trigger)), - ), - div( - { style: `flex: ${columns[1]}%` }, - TruncatedText({ max: 6 }, ...item.recipients), - ), + { class: 'flex-column table-row'}, div( - { class: 'flex-row fx-gap-2', style: `flex: ${columns[2]}%` }, - permissions.can_edit - ? (newNotificationItemForm.isEdit.val && newNotificationItemForm.id.val === item.id - ? div( - { class: 'flex-row fx-gap-1' }, - Icon({ size: 18, classes: 'notifications--editing' }, 'edit'), - span({ class: 'notifications--editing' }, 'Editing'), - ) - : [ - item.enabled - ? Button({ + { class: () => `flex-row ${newNotificationItemForm.isEdit.val && newNotificationItemForm.id.val === item.id ? 'notifications--editing-row' : ''}` }, + event === 'score_drop' + ? div( + { style: `flex: ${columns[0]}%`, class: 'flex-column fx-gap-1 score-threshold' }, + showTotalScore ? div('Total score: ', b(item.total_score_threshold)) : '', + showCdeScore ? div(`${showTotalScore ? 
'or ' : ''}CDE score: `, b(item.cde_score_threshold)) : '', + ) + : div( + { style: `flex: ${columns[0]}%` }, + div(scopeLabel(item.scope)), + div({ class: 'text-caption mt-1' }, triggerLabel(item.trigger)), + ), + div( + { style: `flex: ${columns[1]}%` }, + TruncatedText({ max: 6 }, ...item.recipients), + ), + div( + { class: 'flex-row fx-gap-2', style: `flex: ${columns[2]}%` }, + permissions.can_edit + ? (newNotificationItemForm.isEdit.val && newNotificationItemForm.id.val === item.id + ? div( + { class: 'flex-row fx-gap-1' }, + Icon({ size: 18, classes: 'notifications--editing' }, 'edit'), + span({ class: 'notifications--editing' }, 'Editing'), + ) + : [ + item.enabled + ? Button({ + type: 'stroked', + icon: 'pause', + tooltip: 'Pause notification', + style: 'height: 32px;', + onclick: () => emitEvent('PauseNotification', { payload: item }), + }) + : Button({ + type: 'stroked', + icon: 'play_arrow', + tooltip: 'Resume notification', + style: 'height: 32px;', + onclick: () => emitEvent('ResumeNotification', { payload: item }), + }), + Button({ type: 'stroked', - icon: 'pause', - tooltip: 'Pause notification', + icon: 'edit', + tooltip: 'Edit notification', style: 'height: 32px;', - onclick: () => emitEvent('PauseNotification', { payload: item }), - }) - : Button({ + onclick: () => { + newNotificationItemForm.isEdit.val = true; + newNotificationItemForm.id.val = item.id; + newNotificationItemForm.recipientsString.val = item.recipients.join(', '); + if (event === 'score_drop') { + newNotificationItemForm.totalScoreThreshold.val = item.total_score_threshold; + newNotificationItemForm.cdeScoreThreshold.val = item.cde_score_threshold; + } else { + newNotificationItemForm.scope.val = item.scope; + newNotificationItemForm.trigger.val = item.trigger; + } + }, + }), + Button({ type: 'stroked', - icon: 'play_arrow', - tooltip: 'Resume notification', + icon: 'delete', + tooltip: 'Delete notification', + tooltipPosition: 'top-left', style: 'height: 32px;', - onclick: () => 
emitEvent('ResumeNotification', { payload: item }), + onclick: () => emitEvent('DeleteNotification', { payload: item }), }), - Button({ - type: 'stroked', - icon: 'edit', - tooltip: 'Edit notification', - style: 'height: 32px;', - onclick: () => { - newNotificationItemForm.isEdit.val = true; - newNotificationItemForm.id.val = item.id; - newNotificationItemForm.recipientsString.val = item.recipients.join(', '); - if (event === 'score_drop') { - newNotificationItemForm.totalScoreThreshold.val = item.total_score_threshold; - newNotificationItemForm.cdeScoreThreshold.val = item.cde_score_threshold; - } else { - newNotificationItemForm.scope.val = item.scope; - newNotificationItemForm.trigger.val = item.trigger; - } - }, - }), - Button({ - type: 'stroked', - icon: 'delete', - tooltip: 'Delete notification', - tooltipPosition: 'top-left', - style: 'height: 32px;', - onclick: () => emitEvent('DeleteNotification', { payload: item }), - }), - ]) : null, + ]) : null, + ), ), + duplicatedMessage + ? div( + { class: 'flex-row fx-gap-1 text-caption warning-text' }, + Icon({ size: 12, classes: 'warning-text' }, 'warning'), + span({}, duplicatedMessage), + ) + : '', ); } @@ -199,11 +222,18 @@ const NotificationSettings = (/** @type Properties */ props) => { return div( { id: domId, class: 'flex-column fx-gap-2', style: 'height: 100%; overflow-y: auto;' }, + subtitle + ? div( + { class: 'flex-row fx-gap-1 mb-5 text-large' }, + span({ class: 'text-secondary' }, `${subtitle.label}: `), + span(subtitle.value), + ) + : '', () => ExpansionPanel( { title: newNotificationItemForm.isEdit.val ? 
span({ class: 'notifications--editing' }, 'Edit Notification') - : 'Add Notification', + : span({ class: 'text-green' }, 'Add Notification'), testId: 'notification-item-editor', expanded: newNotificationItemForm.isEdit.val, }, @@ -246,15 +276,17 @@ const NotificationSettings = (/** @type Properties */ props) => { onChange: (value) => newNotificationItemForm.scope.val = value, portalClass: 'short-select-portal', }), - () => Select({ - label: 'When', - options: triggerOptions.map(([value, label]) => ({ - label: label, value: value - })), - value: newNotificationItemForm.trigger, - onChange: (value) => newNotificationItemForm.trigger.val = value, - portalClass: 'short-select-portal', - }), + () => event !== 'monitor_run' + ? Select({ + label: 'When', + options: triggerOptions.map(([value, label]) => ({ + label: label, value: value + })), + value: newNotificationItemForm.trigger, + onChange: (value) => newNotificationItemForm.trigger.val = value, + portalClass: 'short-select-portal', + }) + : '', ]), ), div( diff --git a/testgen/ui/components/frontend/js/pages/profiling_runs.js b/testgen/ui/components/frontend/js/pages/profiling_runs.js index 5d39c061..d166795c 100644 --- a/testgen/ui/components/frontend/js/pages/profiling_runs.js +++ b/testgen/ui/components/frontend/js/pages/profiling_runs.js @@ -231,7 +231,7 @@ const Toolbar = ( onChange: (value) => emitEvent('FilterApplied', { payload: { table_group_id: value } }), }), div( - { class: 'flex-row fx-gap-4' }, + { class: 'flex-row fx-gap-3' }, Button({ icon: 'notifications', type: 'stroked', @@ -239,7 +239,7 @@ const Toolbar = ( tooltip: 'Configure email notifications for profiling runs', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunNotificationsClicked', {}), }), Button({ @@ -249,7 +249,7 @@ const Toolbar = ( tooltip: 'Manage when profiling should run for table groups', 
tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunSchedulesClicked', {}), }), userCanEdit @@ -258,16 +258,16 @@ const Toolbar = ( type: 'stroked', label: 'Run Profiling', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunProfilingClicked', {}), }) : '', Button({ - type: 'icon', + type: 'stroked', icon: 'refresh', tooltip: 'Refresh profiling runs list', tooltipPosition: 'left', - style: 'border: var(--button-stroked-border); border-radius: 4px;', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RefreshData', {}), testId: 'profiling-runs-refresh', }), @@ -461,7 +461,7 @@ const ConditionalEmptyState = ( color: 'primary', label: 'Run Profiling', width: 'fit-content', - style: 'margin: auto; background: var(--dk-card-background);', + style: 'margin: auto; background: var(--button-generic-background-color);', disabled: !userCanEdit, tooltip: userCanEdit ? 
null : DISABLED_ACTION_TEXT, tooltipPosition: 'bottom', diff --git a/testgen/ui/components/frontend/js/pages/project_dashboard.js b/testgen/ui/components/frontend/js/pages/project_dashboard.js index 734298e4..292b0aba 100644 --- a/testgen/ui/components/frontend/js/pages/project_dashboard.js +++ b/testgen/ui/components/frontend/js/pages/project_dashboard.js @@ -1,6 +1,7 @@ /** * @import { FilterOption, ProjectSummary } from '../types.js'; * @import { TestSuiteSummary } from '../types.js'; + * @import { MonitorSummary } from '../components/monitor_anomalies_summary.js'; * * @typedef TableGroupSummary * @type {object} @@ -24,6 +25,7 @@ * @property {number} latest_anomalies_dismissed_ct * @property {number?} latest_tests_start * @property {TestSuiteSummary[]} test_suites + * @property {MonitorSummary?} monitoring_summary * * @typedef SortOption * @type {object} @@ -49,6 +51,7 @@ import { SummaryBar } from '../components/summary_bar.js'; import { EmptyState, EMPTY_STATE_MESSAGE } from '../components/empty_state.js'; import { ScoreMetric } from '../components/score_metric.js'; import { SummaryCounts } from '../components/summary_counts.js'; +import { AnomaliesSummary } from '../components/monitor_anomalies_summary.js'; const { div, h3, hr, span } = van.tags; @@ -93,7 +96,7 @@ const ProjectDashboard = (/** @type Properties */ props) => { { id: wrapperId, class: 'flex-column tg-overview' }, () => getValue(tableGroups).length ? div( - { class: 'flex-row fx-align-flex-end fx-gap-4' }, + { class: 'flex-row fx-align-flex-end fx-gap-3' }, Input({ width: 230, style: 'font-size: 14px;', @@ -116,7 +119,11 @@ const ProjectDashboard = (/** @type Properties */ props) => { ? getValue(filteredTableGroups).length ? div( { class: 'flex-column mt-4' }, - getValue(filteredTableGroups).map(tableGroup => TableGroupCard(tableGroup)) + getValue(filteredTableGroups).map(tableGroup => + tableGroup.monitoring_summary + ? 
TableGroupCardWithMonitor(tableGroup) + : TableGroupCard(tableGroup) + ) ) : div( { class: 'mt-7 text-secondary', style: 'text-align: center;' }, @@ -128,6 +135,7 @@ const ProjectDashboard = (/** @type Properties */ props) => { const TableGroupCard = (/** @type TableGroupSummary */ tableGroup) => { const useApprox = tableGroup.record_ct === null || tableGroup.record_ct === undefined; + return Card({ testId: 'table-group-summary-card', border: true, @@ -163,6 +171,50 @@ const TableGroupCard = (/** @type TableGroupSummary */ tableGroup) => { }); }; +const TableGroupCardWithMonitor = (/** @type TableGroupSummary */ tableGroup) => { + const useApprox = tableGroup.record_ct === null || tableGroup.record_ct === undefined; + return Card({ + testId: 'table-group-summary-card', + border: true, + content: () => div( + { class: 'flex-column' }, + + div( + { class: 'flex-row fx-align-flex-start fx-justify-space-between' }, + div( + { class: 'flex-column', style: 'flex: auto;' }, + div( + { class: 'flex-column', style: 'flex: auto;' }, + h3( + { class: 'tg-overview--title' }, + tableGroup.table_groups_name, + ), + span( + { class: 'text-caption mt-1 mb-3 tg-overview--subtitle' }, + `${formatNumber(tableGroup.table_ct ?? 0)} tables | + ${formatNumber(tableGroup.column_ct ?? 0)} columns | + ${formatNumber(useApprox ? tableGroup.approx_record_ct : tableGroup.record_ct)} rows + ${useApprox ? '*' : ''} | + ${formatNumber(useApprox ? tableGroup.approx_data_point_ct : tableGroup.data_point_ct)} data points + ${useApprox ? '*' : ''}`, + ), + ), + AnomaliesSummary(tableGroup.monitoring_summary, 'Monitor anomalies'), + ), + ScoreMetric(tableGroup.dq_score, tableGroup.dq_score_profiling, tableGroup.dq_score_testing), + ), + + hr({ class: 'tg-overview--table-group-divider' }), + TableGroupTestSuiteSummary(tableGroup.test_suites), + hr({ class: 'tg-overview--table-group-divider' }), + TableGroupLatestProfile(tableGroup), + useApprox + ? 
span({ class: 'text-caption text-right' }, '* Approximate counts based on server statistics') + : null, + ) + }); +}; + const TableGroupLatestProfile = (/** @type TableGroupSummary */ tableGroup) => { if (!tableGroup.latest_profile_start) { return div( @@ -209,7 +261,6 @@ const TableGroupLatestProfile = (/** @type TableGroupSummary */ tableGroup) => { }) : '', ), - div({ style: 'flex: 1 1 120px;' }) ); }; diff --git a/testgen/ui/components/frontend/js/pages/quality_dashboard.js b/testgen/ui/components/frontend/js/pages/quality_dashboard.js index 371c9ce8..3378f21e 100644 --- a/testgen/ui/components/frontend/js/pages/quality_dashboard.js +++ b/testgen/ui/components/frontend/js/pages/quality_dashboard.js @@ -106,10 +106,9 @@ const Toolbar = ( ]; return div( - { class: 'flex-row fx-align-flex-end mb-4' }, + { class: 'flex-row fx-align-flex-end fx-gap-3 mb-4' }, Input({ width: 230, - style: 'font-size: 14px; margin-right: 16px;', icon: 'search', clearable: true, placeholder: 'Search scorecards', @@ -132,7 +131,7 @@ const Toolbar = ( icon: 'data_exploration', label: 'Score Explorer', color: 'primary', - style: 'background: var(--button-generic-background-color); width: unset; margin-right: 16px;', + style: 'background: var(--button-generic-background-color); width: unset;', onclick: () => emitEvent('LinkClicked', { href: 'quality-dashboard:explorer', params: { project_code: projectSummary.project_code }, @@ -140,11 +139,11 @@ const Toolbar = ( }), }), Button({ - type: 'icon', + type: 'stroked', icon: 'refresh', tooltip: 'Refresh page data', tooltipPosition: 'left', - style: 'border: var(--button-stroked-border); border-radius: 4px;', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RefreshData', {}), testId: 'scorecards-refresh', }), diff --git a/testgen/ui/components/frontend/js/pages/schedule_list.js b/testgen/ui/components/frontend/js/pages/schedule_list.js index 39b0422a..9788f0fc 100644 --- 
a/testgen/ui/components/frontend/js/pages/schedule_list.js +++ b/testgen/ui/components/frontend/js/pages/schedule_list.js @@ -1,5 +1,5 @@ /** - * @import {CronSample} from '../components/crontab_input.js' + * @import { CronSample } from '../types.js' * * @typedef Schedule * @type {object} @@ -74,7 +74,7 @@ const ScheduleList = (/** @type Properties */ props) => { return div( { id: domId, class: 'flex-column fx-gap-2', style: 'height: 100%; overflow-y: auto;' }, ExpansionPanel( - {title: 'Add Schedule', testId: 'scheduler-cron-editor'}, + {title: span({ class: 'text-green' }, 'Add Schedule'), testId: 'scheduler-cron-editor'}, div( { class: 'flex-row fx-gap-2' }, () => Select({ diff --git a/testgen/ui/components/frontend/js/pages/score_explorer.js b/testgen/ui/components/frontend/js/pages/score_explorer.js index 27deb404..7bd64e02 100644 --- a/testgen/ui/components/frontend/js/pages/score_explorer.js +++ b/testgen/ui/components/frontend/js/pages/score_explorer.js @@ -277,7 +277,7 @@ const Toolbar = ( { class: 'flex-column' }, span({ class: 'text-caption mb-1' }, 'Filter by'), div( - { class: 'flex-row fx-flex-wrap fx-gap-4' }, + { class: 'flex-row fx-flex-wrap fx-gap-3' }, () => { const filters_ = getValue(filters); const filterValues_ = getValue(filterValues); @@ -286,7 +286,7 @@ const Toolbar = ( } return div( - { class: 'flex-row fx-flex-wrap fx-gap-4' }, + { class: 'flex-row fx-flex-wrap fx-gap-3' }, filters_.map(({ key, field, value, others }, idx) => { renderedFilters[key] = renderedFilters[key] ?? ( filterByColumns.val @@ -409,11 +409,10 @@ const Toolbar = ( ), ), userCanEdit ? 
div( - { class: 'flex-row fx-align-flex-end' }, + { class: 'flex-row fx-align-flex-end fx-gap-3' }, Input({ label: 'Scorecard Name', height: 40, - style: 'margin-right: 16px;', value: scoreName, testId: 'scorecard-name-input', onChange: debounce((name) => scoreName.val = name, 300), @@ -443,7 +442,7 @@ const Toolbar = ( label: 'Cancel', type: 'stroked', color: 'warn', - style: 'width: auto; margin-left: 16px;', + style: 'width: auto;', onclick: () => emitEvent('LinkClicked', { href, params }), }); }, diff --git a/testgen/ui/components/frontend/js/pages/table_group_delete_confirmation.js b/testgen/ui/components/frontend/js/pages/table_group_delete_confirmation.js index 96b74346..0c145a1a 100644 --- a/testgen/ui/components/frontend/js/pages/table_group_delete_confirmation.js +++ b/testgen/ui/components/frontend/js/pages/table_group_delete_confirmation.js @@ -68,7 +68,7 @@ const TableGroupDeleteConfirmation = (props) => { { class: 'flex-column fx-gap-4 mt-4' }, Alert( { type: 'warn' }, - div('This Table Group has related data, which may include profiling, test definitions and test results.'), + div('This Table Group has related data, which may include profiling, test definitions, test results, and monitor history.'), div({ class: 'mt-2' }, 'If you proceed, all related data will be permanently deleted.'), ), Toggle({ diff --git a/testgen/ui/components/frontend/js/pages/table_group_list.js b/testgen/ui/components/frontend/js/pages/table_group_list.js index e23212fe..b015f185 100644 --- a/testgen/ui/components/frontend/js/pages/table_group_list.js +++ b/testgen/ui/components/frontend/js/pages/table_group_list.js @@ -230,7 +230,7 @@ const Toolbar = (permissions, connections, selectedConnection, tableGroupNameFil return div( { class: 'flex-row fx-align-flex-end fx-justify-space-between fx-gap-4 fx-flex-wrap mb-4' }, div( - {class: 'flex-row fx-align-flex-end fx-gap-4'}, + {class: 'flex-row fx-align-flex-end fx-gap-3'}, (getValue(connections) ?? [])?.length > 1 ? 
Select({ testId: 'connection-select', @@ -256,7 +256,7 @@ const Toolbar = (permissions, connections, selectedConnection, tableGroupNameFil }), ), div( - { class: 'flex-row fx-gap-4' }, + { class: 'flex-row fx-gap-3' }, Button({ icon: 'notifications', type: 'stroked', @@ -264,7 +264,7 @@ const Toolbar = (permissions, connections, selectedConnection, tableGroupNameFil tooltip: 'Configure email notifications for profiling runs', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunNotificationsClicked', {}), }), Button({ @@ -274,7 +274,7 @@ const Toolbar = (permissions, connections, selectedConnection, tableGroupNameFil tooltip: 'Manage when profiling should run for table groups', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunSchedulesClicked', {}), }), permissions.can_edit diff --git a/testgen/ui/components/frontend/js/pages/table_group_wizard.js b/testgen/ui/components/frontend/js/pages/table_group_wizard.js index 074bff2b..48ef56b3 100644 --- a/testgen/ui/components/frontend/js/pages/table_group_wizard.js +++ b/testgen/ui/components/frontend/js/pages/table_group_wizard.js @@ -1,13 +1,17 @@ /** * @import { TableGroupPreview } from '../components/table_group_test.js' - * @import { Connection } from ''../components/connection_form.js' - * @import { TableGroup } from ''../components/table_group_form.js' + * @import { Connection } from '../components/connection_form.js' + * @import { TableGroup } from '../components/table_group_form.js' + * @import { CronSample } from '../types.js' * * @typedef WizardResult * @type {object} * @property {boolean} success * @property {string} message - * @property {string?} table_group_id + * @property {boolean} run_profiling + * @property {boolean} 
generate_test_suite + * @property {boolean} generate_monitor_suite + * @property {string?} test_suite_name * * @typedef Properties * @type {object} @@ -17,231 +21,614 @@ * @property {string[]?} steps * @property {boolean?} is_in_use * @property {TableGroupPreview?} table_group_preview + * @property {CronSample?} standard_cron_sample + * @property {CronSample?} monitor_cron_sample * @property {WizardResult?} results */ import van from '../van.min.js'; -import { Streamlit } from '../streamlit.js'; import { TableGroupForm } from '../components/table_group_form.js'; import { TableGroupTest } from '../components/table_group_test.js'; import { TableGroupStats } from '../components/table_group_stats.js'; -import { emitEvent, getValue, resizeFrameHeightOnDOMChange, resizeFrameHeightToElement } from '../utils.js'; +import { emitEvent, getValue, isEqual } from '../utils.js'; import { Button } from '../components/button.js'; import { Alert } from '../components/alert.js'; import { Checkbox } from '../components/checkbox.js'; import { Icon } from '../components/icon.js'; import { Caption } from '../components/caption.js'; +import { Input } from '../components/input.js'; +import { Select } from '../components/select.js'; +import { Link } from '../components/link.js'; +import { CrontabInput } from '../components/crontab_input.js'; +import { timezones } from '../values.js'; +import { requiredIf } from '../form_validators.js'; +import { MonitorSettingsForm } from '../components/monitor_settings_form.js'; +import { Streamlit } from '../streamlit.js'; +import { WizardProgressIndicator } from '../components/wizard_progress_indicator.js'; -const { div, i, span, strong } = van.tags; -const stepsTitle = { - tableGroup: 'Configure Table Group', - testTableGroup: 'Preview Table Group', - runProfiling: 'Run Profiling', -}; -const lastStepCustonButtonText = { - runProfiling: (state) => state ? 
'Save & Run Profiling' : 'Save', +const { div, span, strong } = van.tags; +const lastStepCustomButtonText = { + monitorSuite: (_, states) => states?.runProfiling?.val === true ? 'Save & Run' : 'Save', }; const defaultSteps = [ - 'tableGroup', - 'testTableGroup', + 'tableGroup', + 'testTableGroup', ]; /** * @param {Properties} props */ const TableGroupWizard = (props) => { - Streamlit.setFrameHeight(1); - window.testgen.isPage = true; - - const steps = props.steps?.val ?? defaultSteps; - const stepsState = { - tableGroup: van.state(props.table_group.val), - testTableGroup: van.state(false), - runProfiling: van.state(true), - }; - const stepsValidity = { - tableGroup: van.state(false), - testTableGroup: van.state(false), - runProfiling: van.state(true), - }; - const currentStepIndex = van.state(0); - const currentStepIsInvalid = van.derive(() => { - const stepKey = steps[currentStepIndex.val]; - return !stepsValidity[stepKey].val; - }); - const nextButtonType = van.derive(() => { - const isLastStep = currentStepIndex.val === steps.length - 1; - return isLastStep ? 'flat' : 'stroked'; - }); - const nextButtonLabel = van.derive(() => { - const isLastStep = currentStepIndex.val === steps.length - 1; - if (isLastStep) { - const stepKey = steps[currentStepIndex.val]; - const stepState = stepsState[stepKey]; - return lastStepCustonButtonText[stepKey]?.(stepState.val) ?? 'Save'; + window.testgen.isPage = true; + + const steps = getValue(props.steps) ?? 
defaultSteps; + const defaultTimezone = Intl.DateTimeFormat().resolvedOptions().timeZone; + const stepsState = { + tableGroup: van.state(getValue(props.table_group)), + testTableGroup: van.state(false), + runProfiling: van.state(true), + testSuite: van.state({ + generate: true, + name: '', + schedule: '0 0 * * *', + timezone: defaultTimezone, + }), + monitorSuite: van.state({ + generate: true, + monitor_lookback: 14, + schedule: '0 */12 * * *', + timezone: defaultTimezone, + predict_sensitivity: 'medium', + predict_min_lookback: 30, + predict_exclude_weekends: false, + predict_holiday_codes: undefined, + }), + }; + + const stepsValidity = { + tableGroup: van.state(false), + testTableGroup: van.state(false), + runProfiling: van.state(true), + testSuite: van.state(true), + monitorSuite: van.state(true), + }; + const currentStepIndex = van.state(0); + const currentStepIsInvalid = van.derive(() => { + const stepKey = steps[currentStepIndex.val]; + return !stepsValidity[stepKey].val; + }); + const nextButtonType = van.derive(() => { + const isLastStep = currentStepIndex.val === steps.length - 1; + return isLastStep ? 'flat' : 'stroked'; + }); + const nextButtonLabel = van.derive(() => { + const isLastStep = currentStepIndex.val === steps.length - 1; + if (isLastStep) { + const stepKey = steps[currentStepIndex.val]; + return lastStepCustomButtonText[stepKey]?.(stepKey, stepsState) ?? 
'Save'; + } + return 'Next'; + }); + + const tableGroupPreview = van.state(getValue(props.table_group_preview)); + const isComplete = van.derive(() => getValue(props.results)?.success === true); + + const setStep = (stepIdx) => { + currentStepIndex.val = stepIdx; + // Force scroll reset to top of dialog + document.activeElement?.blur(); + setTimeout(() => document.querySelector('.stDialog').scrollTop = 0, 1); + }; + const saveTableGroup = () => { + const payloadEntries = [ + ['tableGroup', 'table_group', stepsState.tableGroup.val], + ['testTableGroup', 'table_group_verified', stepsState.testTableGroup.val], + ['runProfiling', 'run_profiling', stepsState.runProfiling.val], + ['testSuite', 'standard_test_suite', stepsState.testSuite.val], + ['monitorSuite', 'monitor_test_suite', stepsState.monitorSuite.val], + ].filter(([stepKey,]) => steps.includes(stepKey)).map(([, eventKey, stepState]) => [eventKey, stepState]); + + const payload = Object.fromEntries(payloadEntries); + emitEvent('SaveTableGroupClicked', { payload }); + }; + + const domId = 'table-group-wizard-wrapper'; + + return div( + { id: domId }, + () => { + const stepIndex = currentStepIndex.val; + if (isComplete.val) { + return ''; + } + + return WizardProgressIndicator( + [ + { + index: 1, + title: 'Table Group', + skipped: false, + includedSteps: ['tableGroup', 'testTableGroup'], + }, + { + index: 2, + title: 'Profiling', + skipped: !stepsState.runProfiling.rawVal, + includedSteps: ['runProfiling'], + }, + { + index: 3, + title: 'Testing', + skipped: !stepsState.testSuite.rawVal.generate, + includedSteps: ['testSuite'], + }, + { + index: 4, + title: 'Monitors', + skipped: !stepsState.monitorSuite.rawVal.generate, + includedSteps: ['monitorSuite'], + }, + ], + { + index: stepIndex, + name: steps[stepIndex], + }, + ); + }, + WizardStep(0, currentStepIndex, () => { + currentStepIndex.val; + if (isComplete.val) { + return ''; + } + + const connections = getValue(props.connections) ?? 
[]; + const tableGroup = stepsState.tableGroup.rawVal; + + return TableGroupForm({ + connections, + tableGroup: tableGroup, + showConnectionSelector: connections.length > 1, + disableConnectionSelector: false, + disableSchemaField: props.is_in_use ?? false, + onChange: (updatedTableGroup, state) => { + stepsState.tableGroup.val = updatedTableGroup; + stepsValidity.tableGroup.val = state.valid; + }, + }); + }), + WizardStep(1, currentStepIndex, () => { + currentStepIndex.val; + + if (isComplete.val) { + return ''; + } + + const tableGroup = stepsState.tableGroup.rawVal; + van.derive(() => { + const renewedPreview = getValue(props.table_group_preview); + if (currentStepIndex.rawVal === 1) { + tableGroupPreview.val = renewedPreview; + stepsValidity.testTableGroup.val = tableGroupPreview.rawVal?.success ?? false; + stepsState.testTableGroup.val = tableGroupPreview.rawVal?.success ?? false; } - return 'Next'; - }); - - van.derive(() => { - const tableGroupPreview = getValue(props.table_group_preview); - stepsValidity.testTableGroup.val = tableGroupPreview?.success ?? false; - stepsState.testTableGroup.val = tableGroupPreview?.success ?? 
false; - }); - - const setStep = (stepIdx) => { - currentStepIndex.val = stepIdx; - }; - const saveTableGroup = () => { - const payloadEntries = [ - ['tableGroup', 'table_group', stepsState.tableGroup.val], - ['testTableGroup', 'table_group_verified', stepsState.testTableGroup.val], - ['runProfiling', 'run_profiling', stepsState.runProfiling.val], - ].filter(([stepKey,]) => steps.includes(stepKey)).map(([, eventKey, stepState]) => [eventKey, stepState]); - - const payload = Object.fromEntries(payloadEntries); - emitEvent('SaveTableGroupClicked', { payload }); - }; - - const domId = 'table-group-wizard-wrapper'; - resizeFrameHeightToElement(domId); - resizeFrameHeightOnDOMChange(domId); - - return div( - { id: domId, class: 'tg-table-group-wizard flex-column fx-gap-3' }, - div( - {}, - () => { - const stepName = steps[currentStepIndex.val]; - const stepNumber = currentStepIndex.val + 1; - return Caption({ - content: `Step ${stepNumber} of ${steps.length}: ${stepsTitle[stepName]}`, - }); - }, - ), - WizardStep(0, currentStepIndex, () => { - currentStepIndex.val; - - const connections = getValue(props.connections) ?? []; - const tableGroup = stepsState.tableGroup.rawVal; - - return TableGroupForm({ - connections, - tableGroup: tableGroup, - showConnectionSelector: connections.length > 1, - disableConnectionSelector: false, - disableSchemaField: props.is_in_use ?? 
false, - onChange: (updatedTableGroup, state) => { - stepsState.tableGroup.val = updatedTableGroup; - stepsValidity.tableGroup.val = state.valid; - }, - }); - }), - WizardStep(1, currentStepIndex, () => { - const tableGroup = stepsState.tableGroup.rawVal; - - if (currentStepIndex.val === 1) { - props.table_group_preview.val = undefined; - stepsValidity.testTableGroup.val = false; - stepsState.testTableGroup.val = false; - - emitEvent('PreviewTableGroupClicked', { payload: {table_group: tableGroup} }); - } - - return TableGroupTest( - props.table_group_preview, - { - onVerifyAcess: () => { - emitEvent('PreviewTableGroupClicked', { - payload: { - table_group: stepsState.tableGroup.rawVal, - verify_access: true, - }, - }); - } - } - ); - }), - () => { - const runProfiling = van.state(stepsState.runProfiling.rawVal); - const results = getValue(props.results) ?? {}; + }); - van.derive(() => { - stepsState.runProfiling.val = runProfiling.val; - }); + if (currentStepIndex.val === 1) { + emitEvent('PreviewTableGroupClicked', { payload: { table_group: tableGroup } }); + } - return WizardStep(2, currentStepIndex, () => { - currentStepIndex.val; - - return RunProfilingStep( - stepsState.tableGroup.rawVal, - runProfiling, - props.table_group_preview, - results?.success ?? 
false, - ); + return TableGroupTest( + tableGroupPreview, + { + onVerifyAcess: () => { + emitEvent('PreviewTableGroupClicked', { + payload: { + table_group: stepsState.tableGroup.rawVal, + verify_access: true, + } }); - }, + } + } + ); + }), + () => { + const runProfiling = van.state(stepsState.runProfiling.rawVal); + van.derive(() => { + stepsState.runProfiling.val = runProfiling.val; + }); + + return WizardStep(2, currentStepIndex, () => { + currentStepIndex.val; + + if (isComplete.val) { + return ''; + } + + return RunProfilingStep( + stepsState.tableGroup.rawVal, + runProfiling, + tableGroupPreview, + ); + }); + }, + () => { + const testSuiteState = stepsState.testSuite.rawVal; + const generateStandardTests = van.state(testSuiteState.generate); + const testSuiteName = van.state(testSuiteState.name); + const testSuiteSchedule = van.state(testSuiteState.schedule); + const testSuiteScheduleTimezone = van.state(testSuiteState.timezone); + const testSuiteCronSample = van.state({}); + const testSuiteCrontabEditorValue = van.derive(() => { + if (testSuiteSchedule.val && testSuiteScheduleTimezone.val) { + emitEvent('GetCronSampleAux', {payload: {cron_expr: testSuiteSchedule.val, tz: testSuiteScheduleTimezone.val}}); + } + + return { + expression: testSuiteSchedule.val, + timezone: testSuiteScheduleTimezone.val, + }; + }); + + van.derive(() => { + stepsState.testSuite.val = { + generate: generateStandardTests.val, + name: testSuiteName.val, + schedule: testSuiteSchedule.val, + timezone: testSuiteScheduleTimezone.val, + }; + }); + + van.derive(() => { + const sample = getValue(props.standard_cron_sample); + testSuiteCronSample.val = sample; + }); + + return WizardStep(3, currentStepIndex, () => { + if (currentStepIndex.val === 3) { + emitEvent('GetCronSampleAux', {payload: {cron_expr: testSuiteSchedule.val, tz: testSuiteScheduleTimezone.val}}); + } + + if (isComplete.val) { + return ''; + } + + const tableGroupName = stepsState.tableGroup.rawVal.table_groups_name; + if 
(!stepsState.testSuite.rawVal.name) { + testSuiteName.val = tableGroupName; + } + + return div( + { class: 'flex-column fx-gap-3' }, + Checkbox({ + label: div( + { class: 'flex-row' }, + span({ class: 'mr-1' }, 'Generate and schedule tests for the table group'), + strong(() => tableGroupName), + ), + checked: generateStandardTests, + disabled: false, + onChange: (value) => generateStandardTests.val = value, + }), + () => generateStandardTests.val + ? div( + { class: 'flex-column fx-gap-4' }, + () => Input({ + label: 'Test Suite Name', + value: testSuiteName, + validators: [ + requiredIf(() => generateStandardTests.val), + ], + onChange: (name, state) => { + testSuiteName.val = name; + stepsValidity.testSuite.val = state.valid && !!testSuiteScheduleTimezone.val && !!testSuiteSchedule.val; + }, + }), + div( + { class: 'flex-column fx-gap-3 border border-radius-1 p-3', style: 'position: relative;' }, + Caption({content: 'Test Run Schedule', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + div( + { class: 'flex-row fx-gap-3 fx-flex-wrap fx-align-flex-start monitor-settings-row' }, + Select({ + label: 'Timezone', + options: timezones.map(tz_ => ({label: tz_, value: tz_})), + value: testSuiteScheduleTimezone, + allowNull: false, + filterable: true, + style: 'flex: 1', + onChange: (value) => testSuiteScheduleTimezone.val = value, + }), + CrontabInput({ + name: 'tg_test_suite_schedule', + value: testSuiteCrontabEditorValue, + modes: ['x_hours', 'x_days'], + sample: testSuiteCronSample, + class: 'fx-flex', + onChange: (value) => testSuiteSchedule.val = value, + }), + ), + ), + ) + : span(), + div( + { class: 'flex-row fx-gap-1' }, + Icon({ size: 16 }, 'info'), + span( + { class: 'text-caption' }, + () => generateStandardTests.val + ? 'Tests will be generated after profiling and run periodically on schedule.' + : 'Test generation will be skipped. 
You can do this step later on the Test Suites page.', + ), + ), + ); + }); + }, + () => { + const monitorSuiteState = stepsState.monitorSuite.rawVal; + const generateMonitorTests = van.state(monitorSuiteState.generate); + const monitorSuiteLookback = van.state(monitorSuiteState.monitor_lookback); + const monitorSuiteSchedule = van.state(monitorSuiteState.schedule); + const monitorSuiteScheduleTimezone = van.state(monitorSuiteState.timezone); + const monitorPredictSensitivity = van.state(monitorSuiteState.predict_sensitivity); + const monitorPredictMinLookback = van.state(monitorSuiteState.predict_min_lookback); + const monitorPredictExcludeWeekends = van.state(monitorSuiteState.predict_exclude_weekends); + const monitorPredictHolidayCodes = van.state(monitorSuiteState.predict_holiday_codes); + + const monitorSuiteCronSample = van.state({}); + + van.derive(() => { + stepsState.monitorSuite.val = { + generate: generateMonitorTests.val, + monitor_lookback: monitorSuiteLookback.val, + schedule: monitorSuiteSchedule.val, + timezone: monitorSuiteScheduleTimezone.val, + predict_sensitivity: monitorPredictSensitivity.val, + predict_min_lookback: monitorPredictMinLookback.val, + predict_exclude_weekends: monitorPredictExcludeWeekends.val, + predict_holiday_codes: monitorPredictHolidayCodes.val, + }; + }); + + van.derive(() => { + const sample = getValue(props.monitor_cron_sample); + monitorSuiteCronSample.val = sample; + }); + + return WizardStep(4, currentStepIndex, () => { + currentStepIndex.val; + + if (isComplete.val) { + return ''; + } + + const tableGroupName = stepsState.tableGroup.rawVal.table_groups_name; + + return div( + { class: 'flex-column fx-gap-3' }, + Checkbox({ + label: div( + { class: 'flex-row' }, + span({ class: 'mr-1' }, 'Configure monitors for the table group'), + strong(() => tableGroupName), + ), + checked: generateMonitorTests, + disabled: false, + onChange: (value) => generateMonitorTests.val = value, + }), + () => generateMonitorTests.val + ? 
MonitorSettingsForm({ + schedule: { + active: true, + cron_expr: monitorSuiteSchedule.rawVal, + cron_tz: monitorSuiteScheduleTimezone.rawVal, + }, + monitorSuite: { + monitor_lookback: monitorSuiteLookback.rawVal, + predict_sensitivity: monitorPredictSensitivity.rawVal, + predict_min_lookback: monitorPredictMinLookback.rawVal, + predict_exclude_weekends: monitorPredictExcludeWeekends.rawVal, + predict_holiday_codes: monitorPredictHolidayCodes.rawVal, + }, + cronSample: monitorSuiteCronSample, + hideActiveCheckbox: true, + onChange: (schedule, monitorTestSuite, formState) => { + stepsValidity.monitorSuite.val = formState.valid; + monitorSuiteLookback.val = monitorTestSuite.monitor_lookback; + monitorSuiteSchedule.val = schedule.cron_expr; + monitorSuiteScheduleTimezone.val = schedule.cron_tz; + monitorPredictSensitivity.val = monitorTestSuite.predict_sensitivity; + monitorPredictMinLookback.val = monitorTestSuite.predict_min_lookback; + monitorPredictExcludeWeekends.val = monitorTestSuite.predict_exclude_weekends; + monitorPredictHolidayCodes.val = monitorTestSuite.predict_holiday_codes; + }, + }) + : span(), + div( + { class: 'flex-row fx-gap-1' }, + Icon({ size: 16 }, 'info'), + span( + { class: 'text-caption' }, + () => generateMonitorTests.val + ? 'Volume and Schema monitors will be configured and run periodically on schedule. Freshness monitors will be configured after profiling.' + : 'Monitor configuration will be skipped. You can do this step later on the Monitors page.', + ), + ), + ); + }); + }, + () => { + if (!isComplete.val) { + return ''; + } + + const results = getValue(props.results); + const projectCode = getValue(props.project_code); + const tableGroup = getValue(props.table_group); + const preview = getValue(props.table_group_preview); + + return div( + { class: 'flex-column' }, div( - { class: 'flex-column fx-gap-3' }, - () => { - const results = getValue(props.results) ?? {}; - return Object.keys(results).length > 0 - ? 
Alert({ type: results.success ? 'success' : 'error' }, span(results.message)) - : ''; - }, + { class: 'flex-column fx-gap-4 mb-4 p-5 border border-radius-2' }, + div( + { class: 'flex-row fx-gap-2' }, + Icon({ style: 'color: var(--green);' }, 'check_circle'), div( - { class: 'flex-row' }, - () => { - const results = getValue(props.results); - - if (currentStepIndex.val <= 0 || results?.success === true) { - return ''; - } - - return Button({ - label: 'Previous', - type: 'stroked', - color: 'basic', - width: 'auto', - style: 'margin-right: auto; min-width: 200px;', - onclick: () => setStep(currentStepIndex.val - 1), - }); - }, - () => { - const results = getValue(props.results); - const runProfiling = stepsState.runProfiling.val; - const stepKey = steps[currentStepIndex.val]; - - if (results && results.success && stepKey === 'runProfiling' && runProfiling) { - return Button({ - type: 'stroked', - color: 'primary', - label: 'Go to Profiling Runs', - width: 'auto', - icon: 'chevron_right', - style: 'margin-left: auto;', - onclick: () => emitEvent('GoToProfilingRunsClicked', { payload: { table_group_id: results.table_group_id } }), - }); - } - - return Button({ - label: nextButtonLabel, - type: nextButtonType, - color: 'primary', - width: 'auto', - style: 'margin-left: auto; min-width: 200px;', - disabled: currentStepIsInvalid, - onclick: () => { - if (currentStepIndex.val < steps.length - 1) { - return setStep(currentStepIndex.val + 1); - } - - saveTableGroup(); - }, - }); - }, + div('Table group ', strong(tableGroup.table_groups_name), ' created.'), + div( + { class: 'text-caption' }, + `Schema: ${tableGroup.table_group_schema} | ${Object.keys(preview.tables).length} tables | ${preview.stats.column_ct} columns`, + ), + ), + ), + div( + { class: 'flex-row fx-gap-2' }, + results.run_profiling + ? Icon({ style: 'color: var(--green);' }, 'play_circle') + : Icon({ style: 'color: var(--grey);' }, 'do_not_disturb_on'), + results.run_profiling + ? 
div( + { class: 'flex-row fx-gap-1' }, + div('Profiling run started.'), + Link({ + open_new: true, + label: 'View progress', + href: 'profiling-runs', + params: { project_code: projectCode, table_group_id: tableGroup.id }, + right_icon: 'open_in_new', + right_icon_size: 13, + }), + ) + : div( + div('Profiling skipped.'), + div( + { class: 'text-caption flex-row fx-gap-1' }, + 'Run profiling or configure a schedule on the ', + Link({ + open_new: true, + label: 'Table Groups', + href: 'table-groups', + params: { project_code: projectCode, connection_id: tableGroup.connection_id }, + right_icon: 'open_in_new', + right_icon_size: 13, + }), + ' page.', + ), + ), + ), + div( + { class: 'flex-row fx-gap-2' }, + results.generate_test_suite + ? Icon({ style: 'color: var(--blue);' }, 'pending') + : Icon({ style: 'color: var(--grey);' }, 'do_not_disturb_on'), + div( + results.generate_test_suite + ? div('Test suite ', strong(results.test_suite_name), ' created. Tests will be generated and scheduled after profiling.') + : div('Test generation skipped.'), + div( + { class: 'text-caption flex-row fx-gap-1' }, + results.generate_test_suite + ? 'Manage test suites and schedules on the ' + : 'Create test suites, generate and run tests, and configure schedules on the ', + Link({ + open_new: true, + label: 'Test Suites', + href: 'test-suites', + params: { project_code: projectCode, table_group_id: tableGroup.id }, + right_icon: 'open_in_new', + right_icon_size: 13, + }), + ' page.', + ), ), + ), + div( + { class: 'flex-row fx-gap-2' }, + results.generate_monitor_suite + ? Icon({ style: 'color: var(--blue);' }, 'pending') + : Icon({ style: 'color: var(--grey);' }, 'do_not_disturb_on'), + div( + div( + results.generate_monitor_suite + ? 'Volume and Schema monitors configured and scheduled. Freshness monitors will be configured after profiling.' + : 'Monitor configuration skipped.', + ), + div( + { class: 'text-caption flex-row fx-gap-1' }, + results.generate_monitor_suite + ? 
'Manage monitors and view anomalies on the ' + : 'Configure freshness, volume, and schema monitors on the ', + Link({ + open_new: true, + label: 'Monitors', + href: 'monitors', + params: { project_code: projectCode, table_group_id: tableGroup.id }, + right_icon: 'open_in_new', + right_icon_size: 13, + }), + ' page.', + ), + ), + ), + ), + div( + {class: 'flex-row fx-justify-content-flex-end'}, + Button({ + type: 'stroked', + color: 'primary', + label: 'Close', + width: 'auto', + onclick: () => emitEvent('CloseClicked', {}), + }), ), - ); + ); + }, + div( + { class: 'flex-column fx-gap-3 mt-4' }, + () => { + const results = getValue(props.results) ?? {}; + return results?.success === false + ? Alert({ type: 'error' }, span(results.message)) + : ''; + }, + div( + { class: 'flex-row' }, + () => { + if (currentStepIndex.val <= 0 || isComplete.val) { + return ''; + } + + return Button({ + label: 'Previous', + type: 'stroked', + color: 'basic', + width: 'auto', + style: 'margin-right: auto; min-width: 200px;', + onclick: () => setStep(currentStepIndex.val - 1), + }); + }, + () => { + if (isComplete.val) { + return ''; + } + + return Button({ + label: nextButtonLabel, + type: nextButtonType, + color: 'primary', + width: 'auto', + style: 'margin-left: auto; min-width: 200px;', + disabled: currentStepIsInvalid, + onclick: () => { + if (currentStepIndex.val < steps.length - 1) { + return setStep(currentStepIndex.val + 1); + } + + saveTableGroup(); + }, + }); + }, + ), + ), + ); }; /** @@ -251,34 +638,33 @@ const TableGroupWizard = (props) => { * @param {boolean?} disabled * @returns */ -const RunProfilingStep = (tableGroup, runProfiling, preview, disabled) => { - return div( - { class: 'flex-column fx-gap-3' }, - Checkbox({ - label: div( - { class: 'flex-row'}, - span({ class: 'mr-1' }, 'Execute profiling for the table group'), - strong(() => tableGroup.table_groups_name), - span('?'), - ), - checked: runProfiling, - disabled: disabled ?? 
false, - onChange: (value) => runProfiling.val = value, - }), - () => runProfiling.val && preview.val - ? TableGroupStats({ class: 'mt-1 mb-1' }, preview.val.stats) - : '', - div( - { class: 'flex-row fx-gap-1' }, - Icon({ size: 16 }, 'info'), - span( - { class: 'text-caption' }, - () => runProfiling.val - ? 'Profiling will be performed in a background process.' - : 'Profiling will be skipped. You can run this step later from the Profiling Runs page.', - ), - ), - ); +const RunProfilingStep = (tableGroup, runProfiling, preview) => { + return div( + { class: 'flex-column fx-gap-3' }, + Checkbox({ + label: div( + { class: 'flex-row' }, + span({ class: 'mr-1' }, 'Run profiling for the table group'), + strong(() => tableGroup.table_groups_name), + ), + checked: runProfiling, + disabled: false, + onChange: (value) => runProfiling.val = value, + }), + () => runProfiling.val && preview.val + ? TableGroupStats({ class: 'mt-1 mb-1' }, preview.val.stats) + : '', + div( + { class: 'flex-row fx-gap-1' }, + Icon({ size: 16 }, 'info'), + span( + { class: 'text-caption' }, + () => runProfiling.val + ? 'Profiling will be performed in a background process.' + : 'Profiling will be skipped. You can do this step later on the Table Groups page.', + ), + ), + ); }; /** @@ -287,12 +673,39 @@ const RunProfilingStep = (tableGroup, runProfiling, preview, disabled) => { * @param {any} content */ const WizardStep = (index, currentIndex, content) => { - const hidden = van.derive(() => getValue(currentIndex) !== getValue(index)); + const hidden = van.derive(() => getValue(currentIndex) !== getValue(index)); - return div( - { class: () => `flex-column fx-gap-3 ${hidden.val ? 'hidden' : ''}`}, - content, - ); + return div( + { class: () => `flex-column fx-gap-3 ${hidden.val ? 
'hidden' : ''}` }, + content, + ); }; export { TableGroupWizard }; + +export default (component) => { + const { data, setStateValue, setTriggerValue, parentElement } = component; + + Streamlit.enableV2(setTriggerValue); + + let componentState = parentElement.state; + if (componentState === undefined) { + componentState = {}; + for (const [ key, value ] of Object.entries(data)) { + componentState[key] = van.state(value); + } + + parentElement.state = componentState; + van.add(parentElement, TableGroupWizard(componentState)); + } else { + for (const [ key, value ] of Object.entries(data)) { + if (!isEqual(componentState[key].val, value)) { + componentState[key].val = value; + } + } + } + + return () => { + parentElement.state = null; + }; +}; diff --git a/testgen/ui/components/frontend/js/pages/table_monitoring_trends.js b/testgen/ui/components/frontend/js/pages/table_monitoring_trends.js new file mode 100644 index 00000000..6694657a --- /dev/null +++ b/testgen/ui/components/frontend/js/pages/table_monitoring_trends.js @@ -0,0 +1,886 @@ +/** + * @import {Point} from '../components/chart_canvas.js'; + * @import {FreshnessEvent} from '../components/freshness_chart.js'; + * @import {SchemaEvent} from '../components/schema_changes_chart.js'; + * @import {DataStructureLog} from '../components/schema_changes_list.js'; + * + * @typedef VolumeTrendEvent + * @type {object} + * @property {number} time + * @property {number} record_count + * @property {boolean} is_anomaly + * @property {boolean} is_pending + * @property {boolean} is_training + * @property {number?} lower_tolerance + * @property {number?} upper_tolerance + * + * @typedef MetricTrendEvent + * @type {object} + * @property {number} time + * @property {number} value + * @property {boolean} is_anomaly + * @property {boolean} is_training + * @property {boolean} is_pending + * @property {number?} lower_tolerance + * @property {number?} upper_tolerance + * + * @typedef MetricEventGroup + * @type {object} + * @property 
{string} test_definition_id + * @property {string} column_name + * @property {MetricTrendEvent[]} events + * + * @typedef PredictionSet + * @type {object} + * @property {('predict'|'static'|'freshness_window')} method + * @property {object?} mean + * @property {object?} lower_tolerance + * @property {object?} upper_tolerance + * @property {{start: number?, end: number}?} window + * + * @typedef Predictions + * @type {object} + * @property {PredictionSet?} volume_trend + * @property {PredictionSet?} freshness_trend + * + * @typedef Properties + * @type {object} + * @property {FreshnessEvent[]} freshness_events + * @property {VolumeTrendEvent[]} volume_events + * @property {SchemaEvent[]} schema_events + * @property {MetricEventGroup[]} metric_events + * @property {(DataStructureLog[])?} data_structure_logs + * @property {Predictions?} predictions + * @property {boolean} extended_history + */ +import van from '/app/static/js/van.min.js'; +import { Streamlit } from '/app/static/js/streamlit.js'; +import { emitEvent, getValue, loadStylesheet, parseDate, isEqual } from '/app/static/js/utils.js'; +import { FreshnessChart } from '/app/static/js/components/freshness_chart.js'; +import { colorMap, formatNumber } from '/app/static/js/display_utils.js'; +import { SchemaChangesChart } from '/app/static/js/components/schema_changes_chart.js'; +import { SchemaChangesList } from '/app/static/js/components/schema_changes_list.js'; +import { getAdaptiveTimeTicksV2, scale } from '/app/static/js/axis_utils.js'; +import { Tooltip } from '/app/static/js/components/tooltip.js'; +import { DualPane } from '/app/static/js/components/dual_pane.js'; +import { Button } from '/app/static/js/components/button.js'; +import { MonitoringSparklineChart, MonitoringSparklineMarkers } from '/app/static/js/components/monitoring_sparkline.js'; + +const { div, span } = van.tags; +const { circle, clipPath, defs, foreignObject, g, line, path, rect, svg, text } = van.tags("http://www.w3.org/2000/svg"); + 
+const spacing = 8; +const chartsWidth = 700; +const baseChartsYAxisWidth = 24; +const fresshnessChartHeight = 40; +const schemaChartHeight = 80; +const volumeTrendChartHeight = 80; +const metricTrendChartHeight = 80; +const paddingLeft = 16; +const paddingRight = 16; +const timeTickFormatter = new Intl.DateTimeFormat('en-US', { + month: 'short', + day: 'numeric', + hour: 'numeric', + hour12: true, +}); +const tickWidth = 90; + +/** + * @param {Properties} props + */ +const TableMonitoringTrend = (props) => { + window.testgen.isPage = true; + loadStylesheet('table-monitoring-trends', stylesheet); + + const shouldShowSidebar = van.state(false); + const schemaChartSelection = van.state(null); + van.derive(() => shouldShowSidebar.val = (getValue(props.data_structure_logs)?.length ?? 0) > 0); + + const getDataStructureLogs = (/** @type {SchemaEvent} */ event) => { + emitEvent('ShowDataStructureLogs', { payload: { start_time: event.window_start, end_time: event.time } }); + shouldShowSidebar.val = true; + schemaChartSelection.val = event; + }; + + return DualPane( + { + id: 'monitoring-trends-container', + class: () => `table-monitoring-trend-wrapper ${shouldShowSidebar.val ? 'has-sidebar' : ''}`, + minSize: 150, + maxSize: 400, + resizablePanel: 'right', + resizablePanelDomId: 'data-structure-logs-sidebar', + }, + div( + { class: '', style: 'width: 100%;' }, + () => { + const extendedHistory = getValue(props.extended_history) ?? false; + return div( + { class: 'extended-history-toggle' }, + Button({ + label: extendedHistory ? 'Show default view' : 'Show more history', + icon: extendedHistory ? 
'history_toggle_off' : 'history', + width: 'auto', + onclick: () => emitEvent('ToggleExtendedHistory', { payload: {} }), + }), + ); + }, + () => ChartsSection(props, { schemaChartSelection, getDataStructureLogs }), + ChartLegend({ + '': { + items: [ + { icon: svg({ width: 10, height: 10 }, + path({ d: 'M 8 5 A 3 3 0 0 0 2 5', fill: 'none', stroke: colorMap.emptyDark, 'stroke-width': 3, transform: 'rotate(45, 5, 5)' }), + path({ d: 'M 2 5 A 3 3 0 0 0 8 5', fill: 'none', stroke: colorMap.blueLight, 'stroke-width': 3, transform: 'rotate(45, 5, 5)' }), + circle({ cx: 5, cy: 5, r: 3, fill: 'var(--dk-dialog-background)', stroke: 'none' }) + ), label: 'Training' }, + { icon: svg({ width: 10, height: 10 }, circle({ cx: 5, cy: 5, r: 3, fill: colorMap.emptyDark, stroke: 'none' })), label: 'No change' }, + ], + }, + 'Freshness': { + items: [ + { icon: svg({ width: 10, height: 10 }, line({ x1: 4, y1: 0, x2: 4, y2: 10, stroke: colorMap.emptyDark, 'stroke-width': 2 })), label: 'Update' }, + { icon: svg({ width: 10, height: 10 }, circle({ cx: 5, cy: 5, r: 4, fill: colorMap.limeGreen })), label: 'On Time' }, + { + icon: svg( + { width: 10, height: 10, style: 'overflow: visible;' }, + rect({ x: 1.5, y: 1.5, width: 7, height: 7, fill: colorMap.red, transform: 'rotate(45 5 5)' }), + ), + label: 'Early/Late', + }, + ], + }, + 'Volume/Metrics': { + items: [ + { + icon: svg( + { width: 16, height: 10 }, + line({ x1: 0, y1: 5, x2: 16, y2: 5, stroke: colorMap.blueLight, 'stroke-width': 2 }), + circle({ cx: 8, cy: 5, r: 3, fill: colorMap.blueLight }) + ), + label: 'Actual', + }, + { + icon: svg( + { width: 10, height: 10, style: 'overflow: visible;' }, + rect({ x: 1.5, y: 1.5, width: 7, height: 7, fill: colorMap.red, transform: 'rotate(45 5 5)' }), + ), + label: 'Anomaly', + }, + { + icon: svg( + { width: 16, height: 10 }, + path({ d: 'M 0,4 L 16,2 L 16,8 L 0,6 Z', fill: colorMap.emptyDark, opacity: 0.4 }), + line({ x1: 0, y1: 5, x2: 16, y2: 5, stroke: colorMap.grey, 'stroke-width': 2 }) + 
), + label: 'Prediction', + }, + ], + }, + 'Schema': { + items: [ + { icon: svg({ width: 10, height: 10 }, rect({ width: 10, height: 10, fill: colorMap.blue })), label: 'Additions' }, + { icon: svg({ width: 10, height: 10 }, rect({ width: 10, height: 10, fill: colorMap.orange })), label: 'Deletions' }, + { icon: svg({ width: 10, height: 10 }, rect({ width: 10, height: 10, fill: colorMap.purple })), label: 'Modifications' }, + ], + }, + }), + ), + + () => { + const _shouldShowSidebar = shouldShowSidebar.val; + const selection = schemaChartSelection.val; + if (!_shouldShowSidebar || !selection) { + return span(); + } + + return div( + { id: 'data-structure-logs-sidebar', class: 'flex-column data-structure-logs-sidebar' }, + SchemaChangesList({ + data_structure_logs: props.data_structure_logs, + window_start: selection.window_start, + window_end: selection.time, + }), + Button({ + label: 'Hide', + style: 'margin-top: 8px; width: auto; align-self: flex-end;', + icon: 'double_arrow', + onclick: () => { + shouldShowSidebar.val = false; + schemaChartSelection.val = null; + }, + }), + ); + }, + ); +}; + +/** + * @param {Properties} props + * @param {object} options + * @param {import('van').State} options.schemaChartSelection + * @param {Function} options.getDataStructureLogs + */ +const ChartsSection = (props, { schemaChartSelection, getDataStructureLogs }) => { + const metricEvents = getValue(props.metric_events) ?? []; + const chartHeight = ( + + (spacing * 4) + fresshnessChartHeight + + (spacing * 4) + volumeTrendChartHeight + + (spacing * 4) + schemaChartHeight + + metricEvents.length * ((spacing * 4) + metricTrendChartHeight) + ); + + const predictions = getValue(props.predictions); + const freshnessWindow = predictions?.freshness_trend?.window; + const predictionTimes = Object.values(predictions ?? {}).reduce((predictionTimes, v) => [ + ...predictionTimes, + ...Object.keys(v.mean ?? {}).map(t => ({time: +t})), + ...(v.window ? [ + v.window.start ? 
{time: v.window.start} : null, + {time: v.window.end}, + ].filter(Boolean) : []), + ], []); + const freshnessEvents = (getValue(props.freshness_events) ?? []).map(e => ({ ...e, time: parseDate(e.time) })); + const schemaChangeEvents = (getValue(props.schema_events) ?? []).map(e => ({ ...e, time: parseDate(e.time), window_start: parseDate(e.window_start) })); + const schemaChangesMaxValue = schemaChangeEvents.reduce((currentValue, e) => Math.max(currentValue, e.additions, e.deletions), 10); + + // Compute dropped periods from schema events to hide volume/metric data between table drop and re-add + const droppedPeriods = []; + let dropStart = null; + const sorted = [...schemaChangeEvents].sort((a, b) => a.time - b.time); + for (const event of sorted) { + if (event.table_change === 'D' && dropStart === null) { + dropStart = event.time; + } else if (event.table_change === 'A' && dropStart !== null) { + droppedPeriods.push({ start: dropStart, end: event.time }); + dropStart = null; + } + } + const isInDroppedPeriod = (time) => droppedPeriods.some(p => time >= p.start && time <= p.end); + + const volumeTrendEvents = (getValue(props.volume_events) ?? []).map(e => ({ ...e, time: parseDate(e.time) })).filter(e => !isInDroppedPeriod(e.time)); + + const metricEventGroups = metricEvents.map(group => ({ + ...group, + events: group.events.map(e => ({ ...e, time: parseDate(e.time) })).filter(e => !isInDroppedPeriod(e.time)), + })); + + const volumes = [ + ...volumeTrendEvents + .flatMap((e) => [e.record_count, parseInt(e.lower_tolerance), parseInt(e.upper_tolerance)]) + .filter((v) => Number.isFinite(v)), + ...Object.keys(predictions?.volume_trend?.mean ?? {}) + .flatMap((time) => [ + parseInt(predictions.volume_trend.upper_tolerance[time]), + parseInt(predictions.volume_trend.lower_tolerance[time]), + ]) + .filter((v) => Number.isFinite(v)), + ]; + const volumeRange = volumes.length > 0 + ? 
{min: Math.min(...volumes), max: Math.max(...volumes)} + : {min: 0, max: 100}; + if (volumeRange.min === volumeRange.max) { + volumeRange.max = volumeRange.max + 100; + } + const tickDecimals = (value, range) => (range.max - range.min) < 1 ? 3 : (value >= 1000 ? 0 : 3); + + const metricRanges = metricEventGroups.map(group => { + const predictionKey = `metric:${group.test_definition_id}`; + const metricPrediction = predictions?.[predictionKey]; + + const metricValues = [ + ...group.events + .flatMap(e => [e.value, parseFloat(e.lower_tolerance), parseFloat(e.upper_tolerance)]) + .filter((v) => Number.isFinite(v)), + ...Object.keys(metricPrediction?.mean ?? {}) + .flatMap((time) => [ + parseFloat(metricPrediction.upper_tolerance[time]), + parseFloat(metricPrediction.lower_tolerance[time]), + ]) + .filter((v) => Number.isFinite(v)), + ]; + + const metricRange = metricValues.length > 0 + ? { min: Math.min(...metricValues), max: Math.max(...metricValues) } + : { min: 0, max: 100 }; + if (metricRange.min === metricRange.max) { + metricRange.max = metricRange.max + 100; + } + return metricRange; + }); + + const longestYTickText = Math.max( + String(volumeRange.min).length, + String(volumeRange.max).length, + String(schemaChangesMaxValue).length, + ...metricRanges.flatMap(range => [ + String(Number(range.min.toFixed(3))).length, + String(Number(range.max.toFixed(3))).length, + ]), + ); + const longestYTickSize = longestYTickText * 6 - baseChartsYAxisWidth; + const chartsYAxisWidth = baseChartsYAxisWidth + Math.max(longestYTickSize, 0); + const origin = { x: chartsYAxisWidth + paddingLeft, y: chartHeight + spacing }; + const end = { x: chartsWidth + chartsYAxisWidth - paddingRight, y: chartHeight - spacing }; + + let verticalPosition = 0; + const positionTracking = {}; + const nextPosition = (options) => { + verticalPosition += (options.spaces ?? 1) * spacing + (options.offset ?? 
0); + if (options.name) { + positionTracking[options.name] = verticalPosition; + } + return verticalPosition; + }; + + const allTimes = [ + ...freshnessEvents, + ...schemaChangeEvents, + ...volumeTrendEvents, + ...metricEventGroups.flatMap(group => group.events), + ...predictionTimes, + ].map(e => e.time); + + // Normalize to epoch milliseconds so the Set dedupes Dates and the sort is numeric, not lexicographic + const rawTimeline = [...new Set(allTimes.map((t) => +t))].sort((a, b) => a - b); + const dateRange = { min: rawTimeline[0] ?? (new Date()).getTime(), max: rawTimeline[rawTimeline.length - 1] ?? (new Date()).getTime() + 1 * 24 * 60 * 60 * 1000 }; + const toPixelX = (date) => scale(date.getTime(), { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x); + const xTickMinSpacing = 65; + const timeline = (() => { + const adaptiveTicks = getAdaptiveTimeTicksV2( + rawTimeline.map(time => new Date(time)), + end.x - origin.x, + xTickMinSpacing, + ); + + const seen = new Set(); + const candidates = []; + for (const date of [new Date(dateRange.min), ...adaptiveTicks, new Date(dateRange.max)]) { + if (!date) continue; + const t = date.getTime(); + if (!seen.has(t)) { + seen.add(t); + candidates.push(date); + } + } + candidates.sort((a, b) => a.getTime() - b.getTime()); + + if (candidates.length <= 2) return candidates; + + const first = candidates[0]; + const last = candidates[candidates.length - 1]; + const firstPx = toPixelX(first); + const lastPx = toPixelX(last); + + if (lastPx - firstPx < xTickMinSpacing) return [first]; + + const result = [first]; + let prevPx = firstPx; + + for (let i = 1; i < candidates.length - 1; i++) { + const px = toPixelX(candidates[i]); + if (px - prevPx >= xTickMinSpacing && lastPx - px >= xTickMinSpacing) { + result.push(candidates[i]); + prevPx = px; + } + } + + result.push(last); + return result; + })(); + + const parsedFreshnessEvents = freshnessEvents.map((e) => ({ + changed: e.changed, + status: e.status, + message: e.message, + isTraining: e.is_training, + isPending: e.is_pending, + time: e.time, + point: { + x: scale(e.time, { old: dateRange, new: { 
min: origin.x, max: end.x } }, origin.x), + y: fresshnessChartHeight / 2, + }, + })); + const parsedFreshnessWindow = freshnessWindow ? { + startX: freshnessWindow.start + ? scale(freshnessWindow.start, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x) + : null, + endX: scale(freshnessWindow.end, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x), + startTime: freshnessWindow.start, + endTime: freshnessWindow.end, + } : null; + + const parsedSchemaChangeEvents = schemaChangeEvents.map((e) => ({ + time: e.time, + additions: e.additions, + deletions: e.deletions, + modifications: e.modifications, + window_start: e.window_start, + point: { + x: scale(e.time, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x), + y: schemaChartHeight / 2, + }, + })); + + const parsedVolumeTrendEvents = volumeTrendEvents.toSorted((a, b) => a.time - b.time).map((e) => ({ + originalX: e.time, + originalY: e.record_count, + originalLowerTolerance: e.lower_tolerance != undefined + ? parseInt(e.lower_tolerance) + : undefined, + originalUpperTolerance: e.upper_tolerance != undefined + ? parseInt(e.upper_tolerance) + : undefined, + label: 'Row count', + isAnomaly: e.is_anomaly, + isTraining: e.is_training, + isPending: e.is_pending, + x: scale(e.time, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x), + y: scale(e.record_count, { old: volumeRange, new: { min: volumeTrendChartHeight, max: 0 } }, volumeTrendChartHeight), + lowerTolerance: e.lower_tolerance != undefined + ? scale(parseInt(e.lower_tolerance), { old: volumeRange, new: { min: volumeTrendChartHeight, max: 0 } }, volumeTrendChartHeight) + : undefined, + upperTolerance: e.upper_tolerance != undefined + ? scale(parseInt(e.upper_tolerance), { old: volumeRange, new: { min: volumeTrendChartHeight, max: 0 } }, volumeTrendChartHeight) + : undefined, + })); + + const parsedVolumeTrendPredictionPoints = Object.entries(predictions?.volume_trend?.mean ?? 
{}).toSorted(([a,], [b,]) => (+a) - (+b)).map(([time, count]) => ({ + x: scale(+time, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x), + y: scale(+count, { old: volumeRange, new: { min: volumeTrendChartHeight, max: 0 } }, volumeTrendChartHeight), + upper: predictions.volume_trend.upper_tolerance[time] != undefined + ? scale(parseInt(predictions.volume_trend.upper_tolerance[time]), { old: volumeRange, new: { min: volumeTrendChartHeight, max: 0 } }, volumeTrendChartHeight) + : undefined, + lower: predictions.volume_trend.lower_tolerance[time] != undefined + ? scale(parseInt(predictions.volume_trend.lower_tolerance[time]), { old: volumeRange, new: { min: volumeTrendChartHeight, max: 0 } }, volumeTrendChartHeight) + : undefined, + })).filter(p => p.x != undefined && (p.upper != undefined || p.lower != undefined)); + + const parsedMetricCharts = metricEventGroups.map((group, idx) => { + const predictionKey = `metric:${group.test_definition_id}`; + const metricPrediction = predictions?.[predictionKey]; + const metricRange = metricRanges[idx]; + + const parsedEvents = group.events.toSorted((a, b) => a.time - b.time).map(e => ({ + originalX: e.time, + originalY: e.value, + originalLowerTolerance: e.lower_tolerance, + originalUpperTolerance: e.upper_tolerance, + isAnomaly: e.is_anomaly, + isTraining: e.is_training, + isPending: e.is_pending, + x: scale(e.time, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x), + y: scale(e.value, { old: metricRange, new: { min: metricTrendChartHeight, max: 0 } }, metricTrendChartHeight), + lowerTolerance: e.lower_tolerance != undefined + ? scale(parseFloat(e.lower_tolerance), { old: metricRange, new: { min: metricTrendChartHeight, max: 0 } }, metricTrendChartHeight) + : undefined, + upperTolerance: e.upper_tolerance != undefined + ? 
scale(parseFloat(e.upper_tolerance), { old: metricRange, new: { min: metricTrendChartHeight, max: 0 } }, metricTrendChartHeight) + : undefined, + })); + + const parsedPredictionPoints = Object.entries(metricPrediction?.mean ?? {}).toSorted(([a,], [b,]) => (+a) - (+b)).map(([time, value]) => ({ + x: scale(+time, { old: dateRange, new: { min: origin.x, max: end.x } }, origin.x), + y: scale(+value, { old: metricRange, new: { min: metricTrendChartHeight, max: 0 } }, metricTrendChartHeight), + upper: metricPrediction.upper_tolerance[time] != undefined + ? scale(parseFloat(metricPrediction.upper_tolerance[time]), { old: metricRange, new: { min: metricTrendChartHeight, max: 0 } }, metricTrendChartHeight) + : undefined, + lower: metricPrediction.lower_tolerance[time] != undefined + ? scale(parseFloat(metricPrediction.lower_tolerance[time]), { old: metricRange, new: { min: metricTrendChartHeight, max: 0 } }, metricTrendChartHeight) + : undefined, + })).filter(p => p.x != undefined && (p.upper != undefined || p.lower != undefined)); + + return { + columnName: group.column_name, + testDefinitionId: group.test_definition_id, + events: parsedEvents, + range: metricRange, + predictionPoints: parsedPredictionPoints, + predictionMethod: metricPrediction?.method, + }; + }); + + let tooltipText = ''; + const shouldShowTooltip = van.state(false); + const tooltipExtraStyle = van.state(''); + const /** @type {HTMLDivElement} */ tooltipWrapperElement = foreignObject( + { fill: 'none', width: '100%', height: '100%', 'pointer-events': 'none', style: 'overflow: visible; position: absolute;' }, + () => { + const show = shouldShowTooltip.val; + const style = tooltipExtraStyle.val; + + return Tooltip({ + text: tooltipText, + position: '--', + show, + style, + }); + }, + ); + const showTooltip = (verticalOffset, message, point) => { + let timeout; + + tooltipText = message; + tooltipExtraStyle.val = 'visibility: hidden;'; + shouldShowTooltip.val = true; + + timeout = setTimeout(() => { + const 
tooltipRect = tooltipWrapperElement.querySelector('.tg-tooltip').getBoundingClientRect(); + + // Convert screen pixel dimensions to SVG user units for boundary checks + const svgElement = document.getElementById('monitoring-trends-charts-svg'); + const screenToSvg = (chartsWidth + chartsYAxisWidth) / svgElement.getBoundingClientRect().width; + const tooltipWidth = tooltipRect.width * screenToSvg; + const tooltipHeight = tooltipRect.height * screenToSvg; + + let tooltipX = point.x + 10; + let tooltipY = point.y + verticalOffset + 10; + + if ((tooltipX + tooltipWidth) >= (chartsWidth + chartsYAxisWidth)) { + tooltipX = point.x - tooltipWidth - 10; + } + + if (tooltipY + tooltipHeight >= chartHeight) { + tooltipY = chartHeight - tooltipHeight; + } + if (tooltipY < 0) { + tooltipY = 0; + } + + tooltipExtraStyle.val = `transform: translate(${tooltipX}px, ${tooltipY}px);`; + + clearTimeout(timeout); + }, 0); + }; + const hideTooltip = () => { + shouldShowTooltip.val = false; + tooltipExtraStyle.val = ''; + tooltipText = ''; + }; + + return svg( + { + id: 'monitoring-trends-charts-svg', + viewBox: `0 0 ${chartsWidth + chartsYAxisWidth} ${chartHeight}`, + style: `overflow: visible;`, + }, + + text({ x: origin.x, y: nextPosition({ spaces: 2 }), class: 'text-small', fill: 'var(--primary-text-color)' }, 'Freshness'), + FreshnessChart( + { + width: chartsWidth, + height: fresshnessChartHeight, + lineHeight: fresshnessChartHeight, + nestedPosition: { x: 0, y: nextPosition({ name: 'freshnessChart' }) }, + predictedWindow: parsedFreshnessWindow, + showTooltip: showTooltip.bind(null, positionTracking.freshnessChart), + hideTooltip, + }, + ...parsedFreshnessEvents, + ), + DividerLine({ x: origin.x - paddingLeft, y: nextPosition({ offset: fresshnessChartHeight }) }, end), + + text({ x: origin.x, y: nextPosition({ spaces: 2 }), class: 'text-small', fill: 'var(--primary-text-color)' }, 'Volume'), + MonitoringSparklineChart( + { + width: chartsWidth, + height: volumeTrendChartHeight, + 
nestedPosition: { x: 0, y: nextPosition({ name: 'volumeTrendChart' }) }, + lineWidth: 2, + attributes: {style: 'overflow: visible;'}, + prediction: parsedVolumeTrendPredictionPoints, + predictionMethod: predictions?.volume_trend?.method, // predictions may be undefined, as elsewhere + }, + ...parsedVolumeTrendEvents, + ), + MonitoringSparklineMarkers( + { + size: 2, + transform: `translate(0, ${positionTracking.volumeTrendChart})`, + showTooltip: showTooltip.bind(null, positionTracking.volumeTrendChart), + hideTooltip, + }, + parsedVolumeTrendEvents, + ), + DividerLine({ x: origin.x - paddingLeft, y: nextPosition({ offset: volumeTrendChartHeight }) }, end), + + // Schema Chart Selection Highlight + () => { + const selection = schemaChartSelection.val; + if (selection) { + const width = 16; + const height = schemaChartHeight + 3 * spacing; + return rect({ + width: width, + height: height, + x: selection.point.x - (width / 2), + y: selection.point.y + positionTracking.schemaChangesChart - 1.5 * spacing - (height / 2), + fill: colorMap.empty, + style: `transform-box: fill-box; transform-origin: center;`, + }); + } + + return g(); + }, + text({ x: origin.x, y: nextPosition({ spaces: 2 }), class: 'text-small', fill: 'var(--primary-text-color)' }, 'Schema'), + SchemaChangesChart( + { + width: chartsWidth, + height: schemaChartHeight, + nestedPosition: { x: 0, y: nextPosition({ name: 'schemaChangesChart' }) }, + onClick: getDataStructureLogs, + showTooltip: showTooltip.bind(null, positionTracking.schemaChangesChart), + hideTooltip, + }, + ...parsedSchemaChangeEvents, + ), + + ...parsedMetricCharts.flatMap((metricChart, idx) => { + const chartName = `metricTrendChart_${idx}`; + return [ + DividerLine({ x: origin.x - paddingLeft, y: nextPosition({ offset: idx === 0 ? 
schemaChartHeight : metricTrendChartHeight }) }, end), + text({ x: origin.x, y: nextPosition({ spaces: 2 }), class: 'text-small', fill: 'var(--primary-text-color)' }, `Metric: ${metricChart.columnName}`), + MonitoringSparklineChart( + { + width: chartsWidth, + height: metricTrendChartHeight, + nestedPosition: { x: 0, y: nextPosition({ name: chartName }) }, + lineWidth: 2, + attributes: {style: 'overflow: visible;'}, + prediction: metricChart.predictionPoints, + predictionMethod: metricChart.predictionMethod, + }, + ...metricChart.events, + ), + MonitoringSparklineMarkers( + { + size: 2, + transform: `translate(0, ${positionTracking[chartName]})`, + showTooltip: showTooltip.bind(null, positionTracking[chartName]), + hideTooltip, + }, + metricChart.events, + ), + ]; + }), + + g( + {}, + rect({ + width: chartsWidth, + height: chartHeight, + x: origin.x - paddingLeft, + y: 0, + rx: 4, + ry: 4, + stroke: 'var(--border-color)', + fill: 'transparent', + style: 'pointer-events: none;' + }), + + timeline.map((value, idx) => { + const valueAsDate = new Date(value); + const label = timeTickFormatter.format(valueAsDate); + const xPosition = scale(valueAsDate.getTime(), { + old: dateRange, + new: { min: origin.x, max: end.x }, + }, origin.x); + + return g( + {}, + defs( + clipPath( + { id: `xTickClip-${idx}` }, + rect({ x: xPosition, y: -4, width: 4, height: 4 }), + ), + ), + + rect({ + x: xPosition, + y: -4, + width: 4, + height: 8, + rx: 2, + ry: 1, + fill: colorMap.lightGrey, + 'clip-path': `url(#xTickClip-${idx})`, + }), + + text( + { + x: xPosition, + y: 0, + dx: -30, + dy: -8, + fill: colorMap.grey, + 'stroke-width': .1, + style: `font-size: 10px;`, + }, + label, + ), + ); + }), + + // Volume Chart Y axis + g( + { transform: `translate(${chartsYAxisWidth - 4}, ${positionTracking.volumeTrendChart + (volumeTrendChartHeight / 2)})` }, + text({ x: 0, y: 35, class: 'tick-text', 'text-anchor': 'end', fill: 'var(--caption-text-color)' }, formatNumber(volumeRange.min, 
tickDecimals(volumeRange.min, volumeRange))), + text({ x: 0, y: -35, class: 'tick-text', 'text-anchor': 'end', fill: 'var(--caption-text-color)' }, formatNumber(volumeRange.max, tickDecimals(volumeRange.max, volumeRange))), + ), + + // Schema Chart Y axis + g( + { transform: `translate(${chartsYAxisWidth - 4}, ${positionTracking.schemaChangesChart + (schemaChartHeight / 2)})` }, + text({ x: 0, y: -35, class: 'tick-text', 'text-anchor': 'end', fill: 'var(--caption-text-color)' }, formatNumber(schemaChangesMaxValue)), + text({ x: 0, y: 35, class: 'tick-text', 'text-anchor': 'end', fill: 'var(--caption-text-color)' }, 0), + ), + + // Metric Chart Y axes + ...parsedMetricCharts.map((metricChart, idx) => { + const chartName = `metricTrendChart_${idx}`; + return g( + { transform: `translate(${chartsYAxisWidth - 4}, ${positionTracking[chartName] + (metricTrendChartHeight / 2)})` }, + text({ x: 0, y: 35, class: 'tick-text', 'text-anchor': 'end', fill: 'var(--caption-text-color)' }, formatNumber(metricChart.range.min, tickDecimals(metricChart.range.min, metricChart.range))), + text({ x: 0, y: -35, class: 'tick-text', 'text-anchor': 'end', fill: 'var(--caption-text-color)' }, formatNumber(metricChart.range.max, tickDecimals(metricChart.range.max, metricChart.range))), + ); + }), + ), + tooltipWrapperElement, + ); +}; + +/** + * @param {Point} start + * @param {Point} end + */ +const DividerLine = (start, end) => { + return line({ x1: start.x, y1: start.y, x2: end.x + paddingRight, y2: start.y, stroke: 'var(--border-color)' }); +} + +/** + * @typedef LegendItem + * @type {object} + * @property {Element} icon + * @property {string} label + * + * @typedef LegendGroup + * @type {object} + * @property {LegendItem[]} items + * + * @param {Object.<string, LegendGroup>} legendGroups + */ +const ChartLegend = (legendGroups) => { + return div( + { class: 'chart-legend' }, + Object.entries(legendGroups).map(([groupName, { items }]) => + div( + { class: 'chart-legend-group' }, + span({ class: 
`chart-legend-group-label ${groupName ? '' : 'hidden'}` }, groupName), + ...items.map(item => + div( + { class: 'chart-legend-item' }, + item.icon, + span({ class: 'chart-legend-item-label' }, item.label), + ) + ), + ) + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + .table-monitoring-trend-wrapper { + min-height: 200px; + padding-top: 24px; + padding-right: 24px; + position: relative; + } + + .extended-history-toggle { + position: absolute; + top: -70px; + right: 48px; + z-index: 1; + } + + .table-monitoring-trend-wrapper:not(.has-sidebar) > .tg-dualpane-divider { + display: none; + } + + .data-structure-logs-sidebar { + align-self: stretch; + max-height: 500px; + } + + .tick-text { + font-size: 10px; + } + + .chart-legend { + display: flex; + flex-wrap: wrap; + gap: 36px; + row-gap: 8px; + padding: 12px 16px; + border-top: 1px solid var(--border-color); + background: var(--dk-dialog-background); + position: sticky; + bottom: 0; + margin-left: -24px; + margin-right: -48px; + margin-top: 24px; + } + + .chart-legend-group { + display: flex; + align-items: center; + gap: 12px; + } + + .chart-legend-group-label { + font-size: 11px; + color: var(--secondary-text-color); + font-weight: 500; + } + + .chart-legend-item { + display: inline-flex; + align-items: center; + gap: 4px; + } + + .chart-legend-item-label { + font-size: 11px; + color: var(--caption-text-color); + } +`); + +export { TableMonitoringTrend }; + +export default (component) => { + const { data, setStateValue, setTriggerValue, parentElement } = component; + + Streamlit.enableV2(setTriggerValue); + + let componentState = parentElement.state; + if (componentState === undefined) { + componentState = {}; + for (const [ key, value ] of Object.entries(data)) { + componentState[key] = van.state(value); + } + + parentElement.state = componentState; + van.add(parentElement, TableMonitoringTrend(componentState)); + } else { + for (const [ key, value ] of Object.entries(data)) { + if 
(!isEqual(componentState[key].val, value)) { + componentState[key].val = value; + } + } + } + + return () => { + parentElement.state = null; + }; +}; diff --git a/testgen/ui/components/frontend/js/pages/test_results_chart.js b/testgen/ui/components/frontend/js/pages/test_results_chart.js new file mode 100644 index 00000000..b26b6d41 --- /dev/null +++ b/testgen/ui/components/frontend/js/pages/test_results_chart.js @@ -0,0 +1,313 @@ +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { getValue, loadStylesheet, onFrameResized, parseDate, resizeFrameHeightOnDOMChange, resizeFrameHeightToElement } from '../utils.js'; +import { ChartCanvas } from '../components/chart_canvas.js'; +import { MonitoringSparklineChart, MonitoringSparklineMarkers } from '../components/monitoring_sparkline.js'; +import { ThresholdChart } from '../components/threshold_chart.js'; +import { colorMap } from '../display_utils.js'; +import { FreshnessChart } from '../components/freshness_chart.js'; + +const { div } = van.tags; +const { circle, g, rect, text } = van.tags("http://www.w3.org/2000/svg"); + +const freshnessColorByStatus = { + Passed: colorMap.limeGreen, + Log: colorMap.blueLight, +}; +const staleColorByStatus = { + Failed: colorMap.red, + Warning: colorMap.orange, + Log: colorMap.lightGrey, +}; + +const TestResultsChart = (/** @type Properties */ props) => { + loadStylesheet('testResultsChart', stylesheet); + Streamlit.setFrameHeight(1); + window.testgen.isPage = true; + + const width = van.state(0); + const height = van.state(0); + const points = van.state([]); + const thresholdPoints = van.state([]); + const allPoints = van.derive(() => [...points.val, ...thresholdPoints.val]); + const axis = van.state({ + x: { + min: null, + max: null, + label: null, + ticksCount: 8, + renderLine: false, + renderGridLines: false, + }, + y: { + min: 0, + max: 1, + label: null, + ticksCount: 8, + renderLine: false, + renderGridLines: false, + }, + }); + const legend = 
van.state(null); + const markers = van.state(null); + const visualizationType = van.state('line_chart'); + + const sharedInitialization = (data) => { + const dataPoints = []; + + let minY = null; + let maxY = null; + let minThreshold = null; + let maxThreshold = null; + for (const item of data) { + dataPoints.push({x: parseDate(item.test_date), y: item.result_measure}); + + minY = minY == undefined ? item.result_measure : Math.min(minY, item.result_measure); + maxY = maxY == undefined ? item.result_measure : Math.max(maxY, item.result_measure); + minThreshold = minThreshold == undefined ? item.threshold_value : Math.min(minThreshold, item.threshold_value); + maxThreshold = maxThreshold == undefined ? item.threshold_value : Math.max(maxThreshold, item.threshold_value); + } + + minY = Math.min(minY, minThreshold); + maxY = Math.max(maxY, maxThreshold); + if ((minY > 0 && maxY - minY < 0.1 * maxY) || minY === maxY) { + axis.val = { + x: { + ...axis.val.x, + }, + y: { + ...axis.val.y, + min: minY - 1, + max: maxY + 1, + }, + }; + } else { + axis.val = { + x: { + ...axis.val.x, + }, + y: { + ...axis.val.y, + min: undefined, + max: undefined, + }, + }; + } + points.val = dataPoints; + }; + const initializeLineChart = (data) => { + const thresholdDataPoints = []; + for (const item of data) { + thresholdDataPoints.push({x: parseDate(item.test_date), y: item.threshold_value}); + } + + let thresholdLineColor = colorMap.redLight; + if (data.every(item => '<>' === item.test_operator)) { + thresholdLineColor = colorMap.greenLight; + } + + thresholdPoints.val = thresholdDataPoints; + axis.val = { + x: { + ...axis.val.x, + label: 'Test Date', + renderLine: true, + renderGridLines: true, + }, + y: { + ...axis.val.y, + label: data[0].measure_uom, + }, + }; + + legend.val = (point) => g( + {transform: `translate(${point.x},${point.y})`}, + g( + {}, + circle({ + r: 4, + cx: 0, + cy: -4, + fill: colorMap.blue, + }), + text({x: 10, y: 0, class: 'text-small', fill: 
'var(--caption-text-color)'}, 'Observations'), + ), + g( + {transform: 'translate(0, 24)'}, + rect({ + x: -3, + y: -7, + width: 14, + height: 7, + fill: thresholdLineColor, + }), + text({x: 18, y: 0, class: 'text-small', fill: 'var(--caption-text-color)'}, 'Threshold'), + ), + ); + markers.val = (getPoint, showTooltip, hideTooltip) => { + const markerPoints = points.val.map((point) => getPoint(point)).filter((point) => !Number.isNaN(point.x) && !Number.isNaN(point.y)); + return MonitoringSparklineMarkers({showTooltip, hideTooltip}, markerPoints); + }; + }; + const initializeFreshnessChart = (data) => { + const updateStatuses = new Set(); + const staleStatuses = new Set(); + const dataPoints = []; + + for (const item of data) { + dataPoints.push({x: parseDate(item.test_date), y: item.result_measure, ...item}); + + if (item.result_measure >= 1) { + updateStatuses.add(item.result_status); + } else { + staleStatuses.add(item.result_status); + } + } + + points.val = dataPoints; + axis.val = { + x: { + ...axis.val.x, + label: 'Test Date', + renderLine: true, + }, + y: { + ...axis.val.y, + min: -1, + max: 1, + ticksCount: 3, + }, + }; + legend.val = (point) => g( + {transform: `translate(${point.x},${point.y})`}, + updateStatuses.size > 0 + ? g( + {}, + Array.from(updateStatuses).map((status, idx) => + circle({ + r: 4, + cx: 0 + (11 * idx), + cy: -4, + fill: freshnessColorByStatus[status], + }) + ), + text({x: 10 + (11 * (updateStatuses.size - 1)), y: 0, class: 'text-small', fill: 'var(--caption-text-color)'}, 'Update'), + ) + : null, + staleStatuses.size > 0 + ? g( + // offset the stale row below the Update row only when that row is rendered + {transform: `translate(0, ${updateStatuses.size > 0 ?
'24' : '0'})`}, + Array.from(staleStatuses).map((status, idx) => + rect({ + x: -3 + (12 * idx), + y: -7, + width: 7, + height: 7, + fill: staleColorByStatus[status], + style: `transform-box: fill-box; transform-origin: center;`, + transform: 'rotate(45)', + }) + ), + text({x: 10 + (12 * (staleStatuses.size - 1)), y: 0, class: 'text-small', fill: 'var(--caption-text-color)'}, 'No update'), + ) + : null, + ); + }; + const initializers = { + line_chart: initializeLineChart, + binary_chart: initializeFreshnessChart, + }; + + van.derive(() => { + const data = getValue(props.data); + + sharedInitialization(data); + visualizationType.val = data[0]?.result_visualization ?? 'line_chart'; + initializers[visualizationType.rawVal]?.(data); + }); + + const wrapperId = 'test-results-chart-wrapper'; + resizeFrameHeightToElement(wrapperId); + resizeFrameHeightOnDOMChange(wrapperId); + + onFrameResized(wrapperId, (box, element) => { + width.val = box.width; + height.val = box.height; + }); + + return div( + { id: wrapperId }, + ChartCanvas( + { + width, + height, + axis, + legend, + markers, + points: allPoints, + }, + (viewBox, area, getPoint) => { + let data = points.val.map((point) => getPoint(point)); + const visualization = visualizationType.val; + if (visualization === 'binary_chart') { + data = points.val.map((point) => ({changed: point.y > 0, expected: undefined, status: point.result_status, point: getPoint({x: point.x, y: 0})})); + return FreshnessChart( + {width: viewBox.width, height: viewBox.height, lineHeight: viewBox.height * 0.60}, + ...data, + ); + } + return MonitoringSparklineChart( + {viewBox: viewBox, lineWidth: 2, paddingLeft: area.bottomLeft.x, paddingRight: 0}, + ...data, + ); + }, + (viewBox, area, getPoint) => { + if (visualizationType.val !== 'line_chart') { + return null; + } + + const data = getValue(props.data); + const testOperators = data.map(r => r.test_operator); + const lines = [ + thresholdPoints.val.map((point) => getPoint(point)), + ]; + + let 
lineWidth = 2; + let lineColor = colorMap.redLight; + if (testOperators.every(op => ['<', '<='].includes(op))) { + lines.unshift( + thresholdPoints.val.map((point) => getPoint({ x: point.x, y: -99999 })).slice().reverse(), + ); + } else if (testOperators.every(op => ['>', '>='].includes(op))) { + const maxThresholdValue = Math.max(...thresholdPoints.val.map((point) => point.y)); + lines.unshift( + thresholdPoints.val.map((point) => getPoint({ x: point.x, y: maxThresholdValue * 1.1 })).slice().reverse(), + ); + } else if (testOperators.every(op => ['=', '<>'].includes(op))) { + lineWidth = 5; + if (testOperators.every(op => op === '<>')) { + lineColor = colorMap.greenLight; + } + } + + if (lines.length <= 0) { + return null; + } + return ThresholdChart( + {viewBox: viewBox, lineWidth, paddingLeft: area.bottomLeft.x, paddingRight: 0, color: lineColor}, + ...lines, + ); + }, + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +#test-results-chart-wrapper { + min-height: 450px; +} +`); + +export { TestResultsChart }; diff --git a/testgen/ui/components/frontend/js/pages/test_runs.js b/testgen/ui/components/frontend/js/pages/test_runs.js index 7f53a6e1..f05fa5ad 100644 --- a/testgen/ui/components/frontend/js/pages/test_runs.js +++ b/testgen/ui/components/frontend/js/pages/test_runs.js @@ -217,7 +217,7 @@ const Toolbar = ( return div( { class: 'flex-row fx-align-flex-end fx-justify-space-between mb-4 fx-gap-4 fx-flex-wrap' }, div( - { class: 'flex-row fx-gap-4' }, + { class: 'flex-row fx-gap-3' }, () => Select({ label: 'Table Group', value: getValue(props.table_group_options)?.find((op) => op.selected)?.value ?? 
null, @@ -238,7 +238,7 @@ const Toolbar = ( }), ), div( - { class: 'flex-row fx-gap-4' }, + { class: 'flex-row fx-gap-3' }, Button({ icon: 'notifications', type: 'stroked', @@ -246,7 +246,7 @@ const Toolbar = ( tooltip: 'Configure email notifications for test runs', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunNotificationsClicked', {}), }), Button({ @@ -256,7 +256,7 @@ const Toolbar = ( tooltip: 'Manage when test suites should run', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunSchedulesClicked', {}), }), userCanEdit @@ -265,16 +265,16 @@ const Toolbar = ( type: 'stroked', label: 'Run Tests', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunTestsClicked', {}), }) : '', Button({ - type: 'icon', + type: 'stroked', icon: 'refresh', tooltip: 'Refresh test runs list', tooltipPosition: 'left', - style: 'border: var(--button-stroked-border); border-radius: 4px;', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RefreshData', {}), testId: 'test-runs-refresh', }), @@ -435,7 +435,7 @@ const ConditionalEmptyState = ( color: 'primary', label: 'Run Tests', width: 'fit-content', - style: 'margin: auto; background: var(--dk-card-background);', + style: 'margin: auto; background: var(--button-generic-background-color);', disabled: !userCanEdit, tooltip: userCanEdit ? 
null : DISABLED_ACTION_TEXT, tooltipPosition: 'bottom', diff --git a/testgen/ui/components/frontend/js/pages/test_suites.js b/testgen/ui/components/frontend/js/pages/test_suites.js index 3fa1468f..a08d4770 100644 --- a/testgen/ui/components/frontend/js/pages/test_suites.js +++ b/testgen/ui/components/frontend/js/pages/test_suites.js @@ -47,7 +47,7 @@ const TestSuites = (/** @type Properties */ props) => { ? div( { class: 'tg-test-suites'}, () => div( - { class: 'flex-row fx-align-flex-end fx-justify-space-between mb-4' }, + { class: 'flex-row fx-align-flex-end fx-justify-space-between fx-gap-4 mb-4' }, Select({ label: 'Table Group', value: getValue(props.table_group_filter_options)?.find((op) => op.selected)?.value ?? null, @@ -58,7 +58,7 @@ const TestSuites = (/** @type Properties */ props) => { onChange: (value) => emitEvent('FilterApplied', {payload: value}), }), div( - { class: 'flex-row fx-gap-4' }, + { class: 'flex-row fx-gap-3' }, Button({ icon: 'notifications', type: 'stroked', @@ -66,7 +66,7 @@ const TestSuites = (/** @type Properties */ props) => { tooltip: 'Configure email notifications for test runs', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunNotificationsClicked', {}), }), Button({ @@ -76,7 +76,7 @@ const TestSuites = (/** @type Properties */ props) => { tooltip: 'Manage when test suites should run', tooltipPosition: 'bottom', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => emitEvent('RunSchedulesClicked', {}), }), userCanEdit @@ -85,7 +85,7 @@ const TestSuites = (/** @type Properties */ props) => { type: 'stroked', label: 'Add Test Suite', width: 'fit-content', - style: 'background: var(--dk-card-background);', + style: 'background: var(--button-generic-background-color);', onclick: () => 
emitEvent('AddTestSuiteClicked', {}), }) : '', @@ -153,27 +153,27 @@ const TestSuites = (/** @type Properties */ props) => { { class: 'flex-column' }, Caption({ content: 'Latest Run', style: 'margin-bottom: 2px;' }), testSuite.latest_run_start - ? [ - Link({ - href: 'test-runs:results', - params: { run_id: testSuite.latest_run_id }, - label: formatTimestamp(testSuite.latest_run_start), - class: 'mb-4', - }), - SummaryBar({ - items: [ - { label: 'Passed', value: parseInt(testSuite.last_run_passed_ct), color: 'green' }, - { label: 'Warning', value: parseInt(testSuite.last_run_warning_ct), color: 'yellow' }, - { label: 'Failed', value: parseInt(testSuite.last_run_failed_ct), color: 'red' }, - { label: 'Error', value: parseInt(testSuite.last_run_error_ct), color: 'brown' }, - { label: 'Log', value: parseInt(testSuite.last_run_log_ct), color: 'blue' }, - { label: 'Dismissed', value: parseInt(testSuite.last_run_dismissed_ct), color: 'grey' }, - ], - height: 20, - width: 350, - }) - ] - : span('--'), + ? 
[ + Link({ + href: 'test-runs:results', + params: { run_id: testSuite.latest_run_id }, + label: formatTimestamp(testSuite.latest_run_start), + class: 'mb-4', + }), + SummaryBar({ + items: [ + { label: 'Passed', value: parseInt(testSuite.last_run_passed_ct), color: 'green' }, + { label: 'Warning', value: parseInt(testSuite.last_run_warning_ct), color: 'yellow' }, + { label: 'Failed', value: parseInt(testSuite.last_run_failed_ct), color: 'red' }, + { label: 'Error', value: parseInt(testSuite.last_run_error_ct), color: 'brown' }, + { label: 'Log', value: parseInt(testSuite.last_run_log_ct), color: 'blue' }, + { label: 'Dismissed', value: parseInt(testSuite.last_run_dismissed_ct), color: 'grey' }, + ], + height: 20, + width: 350, + }) + ] + : span('--'), ), div( { class: 'flex-column' }, @@ -223,7 +223,7 @@ const ConditionalEmptyState = ( color: 'primary', label: 'Add Test Suite', width: 'fit-content', - style: 'margin: auto; background: var(--dk-card-background);', + style: 'margin: auto; background: var(--button-generic-background-color);', disabled: !userCanEdit, tooltip: userCanEdit ? 
null : DISABLED_ACTION_TEXT, tooltipPosition: 'bottom', diff --git a/testgen/ui/components/frontend/js/streamlit.js b/testgen/ui/components/frontend/js/streamlit.js index 9a348bda..a30ace8c 100644 --- a/testgen/ui/components/frontend/js/streamlit.js +++ b/testgen/ui/components/frontend/js/streamlit.js @@ -1,12 +1,26 @@ const Streamlit = { - init: () => { + _v2: false, + _customSendDataHandler: undefined, + init() { sendMessageToStreamlit('streamlit:componentReady', { apiVersion: 1 }); }, - setFrameHeight: (height) => { - sendMessageToStreamlit('streamlit:setFrameHeight', { height: height }); + enableV2(handler) { + this._v2 = true; + this._customSendDataHandler = handler; }, - sendData: (data) => { - sendMessageToStreamlit('streamlit:setComponentValue', { value: data, dataType: 'json' }); + setFrameHeight(height) { + if (!this._v2) { + sendMessageToStreamlit('streamlit:setFrameHeight', { height: height }); + } + }, + sendData(data) { + if (this._v2) { + const event = data.event; + const triggerData = Object.fromEntries(Object.entries(data).filter(([k, v]) => k !== 'event')); + this._customSendDataHandler(event, triggerData); + } else { + sendMessageToStreamlit('streamlit:setComponentValue', { value: data, dataType: 'json' }); + } }, }; diff --git a/testgen/ui/components/frontend/js/types.js b/testgen/ui/components/frontend/js/types.js index cbf5812d..4926c190 100644 --- a/testgen/ui/components/frontend/js/types.js +++ b/testgen/ui/components/frontend/js/types.js @@ -1,10 +1,19 @@ /** + * @import { MonitorSummary } from '../js/components/monitor_anomalies_summary.js'; + * * @typedef FilterOption * @type {object} * @property {string} label * @property {string} value * @property {boolean} selected * + * @typedef CronSample + * @type {object} + * @property {string?} error + * @property {string[]?} samples + * @property {string?} readable_expr + * @property {string?} id + * * @typedef ProjectSummary * @type {object} * @property {string} project_code @@ -38,4 +47,5 @@ * 
@property {number} last_run_error_ct * @property {number} last_run_log_ct * @property {number} last_run_dismissed_ct + * @property {MonitorSummary?} monitoring_summary */ diff --git a/testgen/ui/components/frontend/js/utils.js b/testgen/ui/components/frontend/js/utils.js index 3a81eb17..5dc5560f 100644 --- a/testgen/ui/components/frontend/js/utils.js +++ b/testgen/ui/components/frontend/js/utils.js @@ -38,6 +38,23 @@ function resizeFrameHeightOnDOMChange(/** @type string */elementId) { observer.observe(window.frameElement.contentDocument.body, {subtree: true, childList: true}); } +/** + * @param {string} elementId + * @param {((rect: DOMRect, element: HTMLElement) => void)} callback + * @returns {ResizeObserver} + */ +function onFrameResized(elementId, callback) { + const observer = new ResizeObserver(() => { + const element = document.getElementById(elementId); + if (element) { + callback(element.getBoundingClientRect(), element); + } + }); + observer.observe(window.frameElement); + + return observer; +} + function loadStylesheet( /** @type string */key, /** @type CSSStyleSheet */stylesheet, @@ -207,4 +224,19 @@ function checkIsRequired(validators) { return isRequired; } -export { afterMount, debounce, emitEvent, enforceElementWidth, getRandomId, getValue, getParents, isEqual, isState, loadStylesheet, resizeFrameHeightToElement, resizeFrameHeightOnDOMChange, friendlyPercent, slugify, isDataURL, checkIsRequired }; +/** + * + * @param {(string|number)} value + * @returns {number} + */ +function parseDate(value) { + if (typeof value === 'string') { + return Date.parse(value); + } else if (typeof value === 'number') { + return value * 1000; + } + + return value; +} + +export { afterMount, debounce, emitEvent, enforceElementWidth, getRandomId, getValue, getParents, isEqual, isState, loadStylesheet, resizeFrameHeightToElement, resizeFrameHeightOnDOMChange, friendlyPercent, slugify, isDataURL, checkIsRequired, onFrameResized, parseDate }; diff --git 
a/testgen/ui/components/frontend/js/values.js b/testgen/ui/components/frontend/js/values.js index 99a23a36..725ba2ff 100644 --- a/testgen/ui/components/frontend/js/values.js +++ b/testgen/ui/components/frontend/js/values.js @@ -1,3 +1,266 @@ -const timezones = Intl.supportedValuesOf('timeZone'); +// Chrome does not include UTC: https://github.com/mdn/browser-compat-data/issues/25828 +const timezones = [ 'UTC', ...Intl.supportedValuesOf('timeZone').filter(tz => tz !== 'UTC') ]; -export { timezones }; +const holidayCodes = [ + 'USA', + 'NYSE', + 'ECB', + 'BombayStockExchange', + 'EuropeanCentralBank', + 'IceFuturesEurope', + 'NationalStockExchangeOfIndia', + 'NewYorkStockExchange', + 'BrasilBolsaBalcao', + 'Afghanistan', + 'AlandIslands', + 'Albania', + 'Algeria', + 'AmericanSamoa', + 'Andorra', + 'Angola', + 'Anguilla', + 'Antarctica', + 'AntiguaAndBarbuda', + 'Argentina', + 'Armenia', + 'Aruba', + 'Australia', + 'Austria', + 'Azerbaijan', + 'Bahamas', + 'Bahrain', + 'Bangladesh', + 'Barbados', + 'Belarus', + 'Belgium', + 'Belgium', + 'Belize', + 'Benin', + 'Bermuda', + 'Bhutan', + 'Bolivia', + 'BonaireSintEustatiusAndSaba', + 'BosniaAndHerzegovina', + 'Botswana', + 'BouvetIsland', + 'Brazil', + 'BritishIndianOceanTerritory', + 'BritishVirginIslands', + 'Brunei', + 'Bulgaria', + 'BurkinaFaso', + 'Burundi', + 'CaboVerde', + 'Cambodia', + 'Cameroon', + 'Canada', + 'CaymanIslands', + 'CentralAfricanRepublic', + 'Chad', + 'Chile', + 'China', + 'ChristmasIsland', + 'CocosIslands', + 'Colombia', + 'Comoros', + 'Congo', + 'CookIslands', + 'CostaRica', + 'Croatia', + 'Cuba', + 'Curacao', + 'Cyprus', + 'Czechia', + 'Denmark', + 'Djibouti', + 'Dominica', + 'DominicanRepublic', + 'DRCongo', + 'Ecuador', + 'Egypt', + 'ElSalvador', + 'EquatorialGuinea', + 'Eritrea', + 'Estonia', + 'Eswatini', + 'Ethiopia', + 'FalklandIslands', + 'FaroeIslands', + 'Fiji', + 'Finland', + 'France', + 'FrenchGuiana', + 'FrenchPolynesia', + 'FrenchSouthernTerritories', + 'Gabon', + 'Gambia', + 
'Georgia', + 'Germany', + 'Ghana', + 'Gibraltar', + 'Greece', + 'Greenland', + 'Grenada', + 'Guadeloupe', + 'Guam', + 'Guatemala', + 'Guernsey', + 'Guinea', + 'GuineaBissau', + 'Guyana', + 'Haiti', + 'HeardIslandAndMcDonaldIslands', + 'Honduras', + 'HongKong', + 'Hungary', + 'Iceland', + 'India', + 'Indonesia', + 'Iran', + 'Iraq', + 'Ireland', + 'IsleOfMan', + 'Israel', + 'Italy', + 'IvoryCoast', + 'Jamaica', + 'Japan', + 'Jersey', + 'Jordan', + 'Kazakhstan', + 'Kenya', + 'Kiribati', + 'Kuwait', + 'Kyrgyzstan', + 'Laos', + 'Latvia', + 'Lebanon', + 'Lesotho', + 'Liberia', + 'Libya', + 'Liechtenstein', + 'Lithuania', + 'Luxembourg', + 'Macau', + 'Madagascar', + 'Malawi', + 'Malaysia', + 'Maldives', + 'Mali', + 'Malta', + 'MarshallIslands', + 'Martinique', + 'Mauritania', + 'Mauritius', + 'Mayotte', + 'Mexico', + 'Micronesia', + 'Moldova', + 'Monaco', + 'Mongolia', + 'Montenegro', + 'Montserrat', + 'Morocco', + 'Mozambique', + 'Myanmar', + 'Namibia', + 'Nauru', + 'Nepal', + 'Netherlands', + 'NewCaledonia', + 'NewZealand', + 'Nicaragua', + 'Niger', + 'Nigeria', + 'Niue', + 'NorfolkIsland', + 'NorthKorea', + 'NorthMacedonia', + 'NorthernMarianaIslands', + 'Norway', + 'Oman', + 'Pakistan', + 'Palau', + 'Palestine', + 'Panama', + 'PapuaNewGuinea', + 'Paraguay', + 'Peru', + 'Philippines', + 'PitcairnIslands', + 'Poland', + 'Portugal', + 'PuertoRico', + 'Qatar', + 'Reunion', + 'Romania', + 'Russia', + 'Rwanda', + 'SaintBarthelemy', + 'SaintHelenaAscensionAndTristanDaCunha', + 'SaintKittsAndNevis', + 'SaintLucia', + 'SaintMartin', + 'SaintPierreAndMiquelon', + 'SaintVincentAndTheGrenadines', + 'Samoa', + 'SanMarino', + 'SaoTomeAndPrincipe', + 'SaudiArabia', + 'Senegal', + 'Serbia', + 'Seychelles', + 'SierraLeone', + 'Singapore', + 'SintMaarten', + 'Slovakia', + 'Slovenia', + 'SolomonIslands', + 'Somalia', + 'SouthAfrica', + 'SouthGeorgiaAndTheSouthSandwichIslands', + 'SouthKorea', + 'SouthSudan', + 'Spain', + 'SriLanka', + 'Sudan', + 'Suriname', + 'SvalbardAndJanMayen', + 
'Sweden', + 'Switzerland', + 'SyrianArabRepublic', + 'Taiwan', + 'Tajikistan', + 'Tanzania', + 'Thailand', + 'TimorLeste', + 'Togo', + 'Tokelau', + 'Tonga', + 'TrinidadAndTobago', + 'Tunisia', + 'Turkey', + 'Turkmenistan', + 'TurksAndCaicosIslands', + 'Tuvalu', + 'Uganda', + 'Ukraine', + 'UnitedArabEmirates', + 'UnitedKingdom', + 'UnitedStates', + 'UnitedStatesMinorOutlyingIslands', + 'UnitedStatesVirginIslands', + 'Uruguay', + 'Uzbekistan', + 'Vanuatu', + 'VaticanCity', + 'Venezuela', + 'Vietnam', + 'WallisAndFutuna', + 'WesternSahara', + 'Yemen', + 'Zambia', + 'Zimbabwe', +]; + +export { timezones, holidayCodes }; diff --git a/testgen/ui/components/utils/component.py b/testgen/ui/components/utils/component.py index c81ced41..330a42ac 100644 --- a/testgen/ui/components/utils/component.py +++ b/testgen/ui/components/utils/component.py @@ -1,6 +1,10 @@ import pathlib +from collections.abc import Callable +import streamlit as st from streamlit.components import v1 as components +from streamlit.components.v2.bidi_component.state import BidiComponentResult +from streamlit.components.v2.types import ComponentRenderer components_dir = pathlib.Path(__file__).parent.parent.joinpath("frontend") component_function = components.declare_component("testgen", path=components_dir) @@ -11,3 +15,38 @@ def component(*, id_, props, key=None, default=None, on_change=None): if not component_props: component_props = {} return component_function(id=id_, props=component_props, key=key, default=default, on_change=on_change) + + +def component_v2_wrapped(renderer: ComponentRenderer) -> ComponentRenderer: + def wrapped_renderer(key: str | None = None, **kwargs) -> BidiComponentResult: + on_change_callbacks = { + name: fn for name, fn, in kwargs.items() + if _is_change_callback(name) + } + other_kwargs = { + "key": key, + **{ + name: value for name, value, in kwargs.items() + if not _is_change_callback(name) and name != "key" + } + } + for name, callback in on_change_callbacks.items(): + 
on_change_callbacks[name] = _wrap_handler(key, name, callback) + + return renderer(**other_kwargs, **on_change_callbacks) + return wrapped_renderer + + +def _is_change_callback(name: str) -> bool: + return name.startswith("on_") and name.endswith("_change") + + +def _wrap_handler(key: str | None, callback_name: str | None, callback: Callable | None): + if key and callback_name and callback: + def wrapper(): + component_value = st.session_state[key] or {} + trigger_value_name = callback_name.removeprefix("on_").removesuffix("_change") + trigger_value = (component_value.get(trigger_value_name) or {}).get("payload") + return callback(trigger_value) + return wrapper + return callback diff --git a/testgen/ui/components/widgets/__init__.py b/testgen/ui/components/widgets/__init__.py index 29506460..63ff76d7 100644 --- a/testgen/ui/components/widgets/__init__.py +++ b/testgen/ui/components/widgets/__init__.py @@ -1,6 +1,8 @@ # ruff: noqa: F401 -from testgen.ui.components.utils.component import component +from streamlit.components import v2 as components_v2 + +from testgen.ui.components.utils.component import component, component_v2_wrapped from testgen.ui.components.widgets.breadcrumbs import breadcrumbs from testgen.ui.components.widgets.button import button from testgen.ui.components.widgets.card import card @@ -27,3 +29,27 @@ from testgen.ui.components.widgets.summary import summary_bar, summary_counts from testgen.ui.components.widgets.testgen_component import testgen_component from testgen.ui.components.widgets.wizard import WizardStep, wizard + +table_group_wizard = component_v2_wrapped(components_v2.component( + name="dataops-testgen.table_group_wizard", + js="pages/table_group_wizard.js", + isolate_styles=False, +)) + +edit_monitor_settings = component_v2_wrapped(components_v2.component( + name="dataops-testgen.edit_monitor_settings", + js="pages/edit_monitor_settings.js", + isolate_styles=False, +)) + +table_monitoring_trends = 
component_v2_wrapped(components_v2.component( + name="dataops-testgen.table_monitoring_trends", + js="pages/table_monitoring_trends.js", + isolate_styles=False, +)) + +edit_table_monitors = component_v2_wrapped(components_v2.component( + name="dataops-testgen.edit_table_monitors", + js="pages/edit_table_monitors.js", + isolate_styles=False, +)) diff --git a/testgen/ui/components/widgets/page.py b/testgen/ui/components/widgets/page.py index 8053e353..c6c68148 100644 --- a/testgen/ui/components/widgets/page.py +++ b/testgen/ui/components/widgets/page.py @@ -9,7 +9,7 @@ from testgen.ui.session import session from testgen.ui.views.dialogs.application_logs_dialog import application_logs_dialog -UPGRADE_URL = "https://docs.datakitchen.io/articles/#!dataops-testgen-help/upgrade-testgen" +UPGRADE_URL = "https://docs.datakitchen.io/articles/dataops-testgen-help/upgrade-testgen" def page_header( diff --git a/testgen/ui/components/widgets/paginator.py b/testgen/ui/components/widgets/paginator.py index fe5af404..5a71b30b 100644 --- a/testgen/ui/components/widgets/paginator.py +++ b/testgen/ui/components/widgets/paginator.py @@ -32,7 +32,7 @@ def on_page_change(): if on_change: on_change() - if page_index is None: + if page_index is None and bind_to_query is not None: bound_value = st.query_params.get(bind_to_query, "") page_index = int(bound_value) if bound_value.isdigit() else 0 page_index = page_index if page_index < math.ceil(count / page_size) else 0 diff --git a/testgen/ui/navigation/router.py b/testgen/ui/navigation/router.py index 15c7813d..bb6ae98d 100644 --- a/testgen/ui/navigation/router.py +++ b/testgen/ui/navigation/router.py @@ -27,7 +27,7 @@ def __init__( self._routes = {route.path: route(self) for route in routes} if routes else {} self._pending_navigation: dict | None = None - def _init_session(self): + def _init_session(self, url: str): # Clear cache on initial load or page refresh st.cache_data.clear() @@ -37,13 +37,16 @@ def _init_session(self): except 
Exception as e: LOG.exception("Error capturing the base URL") + source = st.query_params.pop("source", None) + MixpanelService().send_event(f"nav-{url}", page_load=True, source=source) + def run(self) -> None: streamlit_pages = [route.streamlit_page for route in self._routes.values()] current_page = st.navigation(streamlit_pages, position="hidden") if not session.initialized: - self._init_session() + self._init_session(url=current_page.url_path) session.initialized = True # This hack is needed because the auth cookie is not set if navigation happens immediately after login diff --git a/testgen/ui/pdf/hygiene_issue_report.py b/testgen/ui/pdf/hygiene_issue_report.py index 05d03058..df858ec1 100644 --- a/testgen/ui/pdf/hygiene_issue_report.py +++ b/testgen/ui/pdf/hygiene_issue_report.py @@ -108,7 +108,7 @@ def build_summary_table(document, hi_data): ), ("Profiling Date", profiling_timestamp, "Table Group", hi_data["table_groups_name"]), - ("Database/Schema", hi_data["schema_name"], "Disposition", hi_data["disposition"] or "No Decision"), + ("Database/Schema", hi_data["schema_name"], "Action", hi_data["disposition"] or "No Decision"), ("Table", hi_data["table_name"], "Data Type", hi_data["db_data_type"]), ("Column", hi_data["column_name"], "Semantic Data Type", hi_data["functional_data_type"]), ( diff --git a/testgen/ui/pdf/test_result_report.py b/testgen/ui/pdf/test_result_report.py index a621ee7c..50f79b55 100644 --- a/testgen/ui/pdf/test_result_report.py +++ b/testgen/ui/pdf/test_result_report.py @@ -123,7 +123,7 @@ def build_summary_table(document, tr_data): ("Test Run Date", test_timestamp, None, "Test Suite", tr_data["test_suite"]), ("Database/Schema", tr_data["schema_name"], None, "Table Group", tr_data["table_groups_name"]), ("Table", tr_data["table_name"], None, "Data Quality Dimension", tr_data["dq_dimension"]), - ("Column", tr_data["column_names"], None, "Disposition", tr_data["disposition"] or "No Decision"), + ("Column", tr_data["column_names"], None, 
"Action", tr_data["disposition"] or "No Decision"), ( "Column Tags", ( diff --git a/testgen/ui/queries/profiling_queries.py b/testgen/ui/queries/profiling_queries.py index a128a241..14f34b13 100644 --- a/testgen/ui/queries/profiling_queries.py +++ b/testgen/ui/queries/profiling_queries.py @@ -1,3 +1,5 @@ +from typing import Literal + import pandas as pd import streamlit as st @@ -180,17 +182,20 @@ def get_tables_by_condition( WITH active_test_definitions AS ( SELECT test_defs.table_groups_id, + test_defs.schema_name, test_defs.table_name, COUNT(*) AS count FROM test_definitions test_defs LEFT JOIN data_column_chars ON ( test_defs.table_groups_id = data_column_chars.table_groups_id + AND test_defs.schema_name = data_column_chars.schema_name AND test_defs.table_name = data_column_chars.table_name AND test_defs.column_name = data_column_chars.column_name ) WHERE test_active = 'Y' AND column_id IS NULL GROUP BY test_defs.table_groups_id, + test_defs.schema_name, test_defs.table_name ) """ if include_active_tests else ""} @@ -250,6 +255,7 @@ def get_tables_by_condition( {""" LEFT JOIN active_test_definitions active_tests ON ( table_chars.table_groups_id = active_tests.table_groups_id + AND table_chars.schema_name = active_tests.schema_name AND table_chars.table_name = active_tests.table_name ) """ if include_active_tests else ""} @@ -410,6 +416,7 @@ def get_columns_by_condition( """ if include_tags else ""} LEFT JOIN profile_results ON ( column_chars.last_complete_profile_run_id = profile_results.profile_run_id + AND column_chars.schema_name = profile_results.schema_name AND column_chars.table_name = profile_results.table_name AND column_chars.column_name = profile_results.column_name ) @@ -472,3 +479,91 @@ def get_hygiene_issues(profile_run_id: str, table_name: str, column_name: str | } results = fetch_all_from_db(query, params) return [ dict(row) for row in results ] + + +@st.cache_data(show_spinner=False) +def get_profiling_anomalies( + profile_run_id: str, + 
likelihood: str | None = None, + issue_type_id: str | None = None, + table_name: str | None = None, + column_name: str | None = None, + action: Literal["Confirmed", "Dismissed", "Muted", "No Action"] | None = None, + sorting_columns: list[str] | None = None, +) -> pd.DataFrame: + query = f""" + SELECT + r.table_name, + r.column_name, + r.schema_name, + r.db_data_type, + t.anomaly_name, + t.issue_likelihood, + r.disposition, + null as action, + CASE + WHEN t.issue_likelihood = 'Possible' THEN 'Possible: speculative test that often identifies problems' + WHEN t.issue_likelihood = 'Likely' THEN 'Likely: typically indicates a data problem' + WHEN t.issue_likelihood = 'Definite' THEN 'Definite: indicates a highly-likely data problem' + WHEN t.issue_likelihood = 'Potential PII' + THEN 'Potential PII: may require privacy policies, standards and procedures for access, storage and transmission.' + END AS likelihood_explanation, + CASE + WHEN t.issue_likelihood = 'Potential PII' THEN 4 + WHEN t.issue_likelihood = 'Possible' THEN 3 + WHEN t.issue_likelihood = 'Likely' THEN 2 + WHEN t.issue_likelihood = 'Definite' THEN 1 + END AS likelihood_order, + t.anomaly_description, r.detail, t.suggested_action, + r.anomaly_id, r.table_groups_id::VARCHAR, r.id::VARCHAR, p.profiling_starttime, r.profile_run_id::VARCHAR, + tg.table_groups_name, + + -- These are used in the PDF report + dcc.functional_data_type, + dcc.description as column_description, + COALESCE(dcc.critical_data_element, dtc.critical_data_element) as critical_data_element, + COALESCE(dcc.data_source, dtc.data_source, tg.data_source) as data_source, + COALESCE(dcc.source_system, dtc.source_system, tg.source_system) as source_system, + COALESCE(dcc.source_process, dtc.source_process, tg.source_process) as source_process, + COALESCE(dcc.business_domain, dtc.business_domain, tg.business_domain) as business_domain, + COALESCE(dcc.stakeholder_group, dtc.stakeholder_group, tg.stakeholder_group) as stakeholder_group, + 
COALESCE(dcc.transform_level, dtc.transform_level, tg.transform_level) as transform_level, + COALESCE(dcc.aggregation_level, dtc.aggregation_level) as aggregation_level, + COALESCE(dcc.data_product, dtc.data_product, tg.data_product) as data_product + FROM profile_anomaly_results r + INNER JOIN profile_anomaly_types t + ON r.anomaly_id = t.id + INNER JOIN profiling_runs p + ON r.profile_run_id = p.id + INNER JOIN table_groups tg + ON r.table_groups_id = tg.id + LEFT JOIN data_column_chars dcc + ON (tg.id = dcc.table_groups_id + AND r.schema_name = dcc.schema_name + AND r.table_name = dcc.table_name + AND r.column_name = dcc.column_name) + LEFT JOIN data_table_chars dtc + ON dcc.table_id = dtc.table_id + WHERE r.profile_run_id = :profile_run_id + {"AND t.issue_likelihood = :likelihood" if likelihood else ""} + {"AND t.id = :issue_type_id" if issue_type_id else ""} + {"AND r.table_name = :table_name" if table_name else ""} + {"AND r.column_name ILIKE :column_name" if column_name else ""} + {"AND r.disposition IS NULL" if action == "No Action" else "AND r.disposition = :disposition" if action else ""} + {f"ORDER BY {', '.join(' '.join(col) for col in sorting_columns)}" if sorting_columns else ""} + """ + params = { + "profile_run_id": profile_run_id, + "likelihood": likelihood, + "issue_type_id": issue_type_id, + "table_name": table_name, + "column_name": column_name, + "disposition": { + "Muted": "Inactive", + }.get(action, action), + } + df = fetch_df_from_db(query, params) + dct_replace = {"Confirmed": "βœ“", "Dismissed": "✘", "Inactive": "πŸ”‡"} + df["action"] = df["disposition"].replace(dct_replace) + + return df diff --git a/testgen/ui/queries/scoring_queries.py b/testgen/ui/queries/scoring_queries.py index d16243ab..f8d78bdd 100644 --- a/testgen/ui/queries/scoring_queries.py +++ b/testgen/ui/queries/scoring_queries.py @@ -94,7 +94,7 @@ def get_score_card_issue_reports(selected_issues: list["SelectedIssue"]) -> list suites.test_suite, types.dq_dimension, CASE - 
WHEN results.result_code <> 1 THEN results.disposition + WHEN results.result_code = 0 THEN results.disposition ELSE 'Passed' END as disposition, results.test_run_id::VARCHAR, @@ -102,12 +102,7 @@ def get_score_card_issue_reports(selected_issues: list["SelectedIssue"]) -> list types.test_type, results.auto_gen, results.test_suite_id, - results.test_definition_id::VARCHAR as test_definition_id_runtime, - CASE - WHEN results.auto_gen = TRUE - THEN definitions.id - ELSE results.test_definition_id - END::VARCHAR AS test_definition_id_current, + results.test_definition_id::VARCHAR, results.table_groups_id::VARCHAR, types.id::VARCHAR AS test_type_id, column_chars.description as column_description, @@ -127,13 +122,6 @@ def get_score_card_issue_reports(selected_issues: list["SelectedIssue"]) -> list ON (results.test_suite_id = suites.id) INNER JOIN table_groups groups ON (results.table_groups_id = groups.id) - LEFT JOIN test_definitions definitions - ON (results.test_suite_id = definitions.test_suite_id - AND results.table_name = definitions.table_name - AND COALESCE(results.column_names, 'N/A') = COALESCE(definitions.column_name, 'N/A') - AND results.test_type = definitions.test_type - AND results.auto_gen = TRUE - AND definitions.last_auto_gen_date IS NOT NULL) LEFT JOIN data_column_chars column_chars ON (groups.id = column_chars.table_groups_id AND results.schema_name = column_chars.schema_name diff --git a/testgen/ui/queries/source_data_queries.py b/testgen/ui/queries/source_data_queries.py index 49348626..d1537023 100644 --- a/testgen/ui/queries/source_data_queries.py +++ b/testgen/ui/queries/source_data_queries.py @@ -11,6 +11,7 @@ from testgen.common.models.test_definition import TestDefinition from testgen.common.read_file import replace_templated_functions from testgen.ui.services.database_service import fetch_from_target_db, fetch_one_from_db +from testgen.ui.utils import parse_fuzzy_date from testgen.utils import to_dataframe LOG = logging.getLogger("testgen") @@ 
-109,7 +110,7 @@ def get_test_issue_source_query(issue_data: dict, limit: int = DEFAULT_LIMIT) -> if not lookup_data or not lookup_data.lookup_query: return None - test_definition = TestDefinition.get(issue_data["test_definition_id_current"]) + test_definition = TestDefinition.get(issue_data["test_definition_id"]) if not test_definition: return None @@ -118,15 +119,18 @@ def get_test_issue_source_query(issue_data: dict, limit: int = DEFAULT_LIMIT) -> "TABLE_NAME": issue_data["table_name"], "COLUMN_NAME": issue_data["column_names"], # Don't quote this - queries already have quotes "COLUMN_TYPE": issue_data["column_type"], - "TEST_DATE": str(issue_data["test_date"]), + "TEST_DATE": str(parsed_test_date) if (parsed_test_date := parse_fuzzy_date(issue_data["test_date"])) + else None, "CUSTOM_QUERY": test_definition.custom_query, "BASELINE_VALUE": test_definition.baseline_value, "BASELINE_CT": test_definition.baseline_ct, "BASELINE_AVG": test_definition.baseline_avg, "BASELINE_SD": test_definition.baseline_sd, - "LOWER_TOLERANCE": test_definition.lower_tolerance, - "UPPER_TOLERANCE": test_definition.upper_tolerance, - "THRESHOLD_VALUE": test_definition.threshold_value, + "LOWER_TOLERANCE": "NULL" if test_definition.lower_tolerance in (None, "") else test_definition.lower_tolerance, + "UPPER_TOLERANCE": "NULL" if test_definition.upper_tolerance in (None, "") else test_definition.upper_tolerance, + "THRESHOLD_VALUE": test_definition.threshold_value or 0, + # SUBSET_CONDITION should be replaced after CUSTOM_QUERY + # since the latter may contain the former "SUBSET_CONDITION": test_definition.subset_condition or "1=1", "GROUPBY_NAMES": test_definition.groupby_names, "HAVING_CONDITION": f"HAVING {test_definition.having_condition}" if test_definition.having_condition else "", @@ -138,7 +142,7 @@ def get_test_issue_source_query(issue_data: dict, limit: int = DEFAULT_LIMIT) -> "MATCH_HAVING_CONDITION": f"HAVING {test_definition.match_having_condition}" if 
                test_definition.having_condition else "",
         "COLUMN_NAME_NO_QUOTES": issue_data["column_names"],
         "WINDOW_DATE_COLUMN": test_definition.window_date_column,
-        "WINDOW_DAYS": test_definition.window_days,
+        "WINDOW_DAYS": test_definition.window_days or 0,
         "CONCAT_COLUMNS": concat_columns(issue_data["column_names"], ""),
         "CONCAT_MATCH_GROUPBY": concat_columns(test_definition.match_groupby_names, ""),
         "LIMIT": limit,
@@ -158,7 +162,7 @@ def get_test_issue_source_data(
 ) -> tuple[Literal["OK"], None, str, pd.DataFrame] | tuple[Literal["NA", "ND", "ERR"], str, str | None, None]:
     lookup_query = None
     try:
-        test_definition = TestDefinition.get(issue_data["test_definition_id_current"])
+        test_definition = TestDefinition.get(issue_data["test_definition_id"])
 
         if not test_definition:
             return "NA", "Test definition no longer exists.", None, None
@@ -184,7 +188,7 @@ def get_test_issue_source_data(
 def get_test_issue_source_query_custom(
     issue_data: dict,
 ) -> str:
-    lookup_data = _get_lookup_data_custom(issue_data["test_definition_id_current"])
+    lookup_data = _get_lookup_data_custom(issue_data["test_definition_id"])
 
     if not lookup_data or not lookup_data.lookup_query:
         return None
@@ -201,7 +205,7 @@ def get_test_issue_source_data_custom(
     limit: int | None = None,
 ) -> tuple[Literal["OK"], None, str, pd.DataFrame] | tuple[Literal["NA", "ND", "ERR"], str, str | None, None]:
     try:
-        test_definition = TestDefinition.get(issue_data["test_definition_id_current"])
+        test_definition = TestDefinition.get(issue_data["test_definition_id"])
 
         if not test_definition:
             return "NA", "Test definition no longer exists.", None, None
diff --git a/testgen/ui/queries/test_result_queries.py b/testgen/ui/queries/test_result_queries.py
index 806ec032..ad35a8b4 100644
--- a/testgen/ui/queries/test_result_queries.py
+++ b/testgen/ui/queries/test_result_queries.py
@@ -35,13 +35,13 @@ def get_test_results(
                tt.test_name_short, tt.test_name_long, r.test_description, tt.measure_uom, tt.measure_uom_description,
                c.test_operator, r.threshold_value::NUMERIC(16, 5), r.result_measure::NUMERIC(16, 5), r.result_status,
                CASE
-                 WHEN r.result_code <> 1 THEN r.disposition
+                 WHEN r.result_code = 0 THEN r.disposition
                  ELSE 'Passed'
                END as disposition,
                NULL::VARCHAR(1) as action,
-               r.input_parameters, r.result_message, CASE WHEN result_code <> 1 THEN r.severity END as severity,
-               r.result_code as passed_ct,
-               (1 - r.result_code)::INTEGER as exception_ct,
+               r.input_parameters, r.result_message, CASE WHEN result_code = 0 THEN r.severity END as severity,
+               CASE WHEN r.result_code = 1 THEN 1 ELSE 0 END as passed_ct,
+               CASE WHEN r.result_code = 0 THEN 1 ELSE 0 END as exception_ct,
                CASE WHEN result_status = 'Warning' THEN 1 END::INTEGER as warning_ct,
@@ -57,11 +57,7 @@ def get_test_results(
                p.project_code, r.table_groups_id::VARCHAR,
                r.id::VARCHAR as test_result_id, r.test_run_id::VARCHAR,
                c.id::VARCHAR as connection_id, r.test_suite_id::VARCHAR,
-               r.test_definition_id::VARCHAR as test_definition_id_runtime,
-               CASE
-                 WHEN r.auto_gen = TRUE THEN d.id
-                 ELSE r.test_definition_id
-               END::VARCHAR as test_definition_id_current,
+               r.test_definition_id::VARCHAR,
                r.auto_gen,
 
                -- These are used in the PDF report
@@ -80,13 +76,6 @@ def get_test_results(
           FROM run_results r
     INNER JOIN test_types tt
             ON (r.test_type = tt.test_type)
-     LEFT JOIN test_definitions d
-            ON (r.test_suite_id = d.test_suite_id
-           AND r.table_name = d.table_name
-           AND COALESCE(r.column_names, 'N/A') = COALESCE(d.column_name, 'N/A')
-           AND r.test_type = d.test_type
-           AND r.auto_gen = TRUE
-           AND d.last_auto_gen_date IS NOT NULL)
    INNER JOIN test_suites ts
            ON r.test_suite_id = ts.id
    INNER JOIN projects p
@@ -126,36 +115,31 @@ def get_test_results(
 @st.cache_data(show_spinner=False)
 def get_test_result_history(tr_data, limit: int | None = None):
     query = f"""
-        SELECT test_date,
-            test_type,
-            test_name_short,
-            test_name_long,
-            measure_uom,
-            test_operator,
-            threshold_value::NUMERIC,
-            result_measure::NUMERIC,
-            result_status,
-            result_visualization,
-            result_visualization_params
-        FROM v_test_results
-        WHERE {f"""
-            test_suite_id = :test_suite_id
-            AND table_name = :table_name
-            AND column_names {"= :column_names" if tr_data["column_names"] else "IS NULL"}
-            AND test_type = :test_type
-            AND auto_gen = TRUE
-        """ if tr_data["auto_gen"] else """
-            test_definition_id_runtime = :test_definition_id_runtime
-        """}
-        ORDER BY test_date DESC
+        SELECT r.test_time AS test_date,
+            r.test_type,
+            tt.test_name_short,
+            tt.test_name_long,
+            tt.measure_uom,
+            c.test_operator,
+            r.threshold_value::NUMERIC(16, 5),
+            r.result_measure::NUMERIC(16, 5),
+            r.result_status,
+            tt.result_visualization,
+            tt.result_visualization_params
+        FROM test_results r
+        INNER JOIN test_types tt ON (r.test_type = tt.test_type)
+        INNER JOIN table_groups tg ON (r.table_groups_id = tg.id)
+        INNER JOIN connections cn ON (tg.connection_id = cn.connection_id)
+        LEFT JOIN cat_test_conditions c ON (
+            cn.sql_flavor = c.sql_flavor
+            AND r.test_type = c.test_type
+        )
+        WHERE r.test_definition_id = :test_definition_id
+        ORDER BY r.test_time DESC
         {'LIMIT ' + str(limit) if limit else ''};
     """
     params = {
-        "test_suite_id": tr_data["test_suite_id"],
-        "table_name": tr_data["table_name"],
-        "column_names": tr_data["column_names"],
-        "test_type": tr_data["test_type"],
-        "test_definition_id_runtime": tr_data["test_definition_id_runtime"],
+        "test_definition_id": tr_data["test_definition_id"],
     }
     df = fetch_df_from_db(query, params)
diff --git a/testgen/ui/scripts/patch_streamlit.py b/testgen/ui/scripts/patch_streamlit.py
index bba728a1..b9683003 100644
--- a/testgen/ui/scripts/patch_streamlit.py
+++ b/testgen/ui/scripts/patch_streamlit.py
@@ -1,10 +1,12 @@
 # ruff: noqa: TRY002
-import functools
 import pathlib
+import re
 import shutil
 
 import streamlit
+import streamlit.components.v2.manifest_scanner as streamlit_manifest_scanner
+import streamlit.web.server.app_static_file_handler as streamlit_app_static_file_handler
 from bs4 import BeautifulSoup, Tag
 
 INJECTED_CLASS = "testgen-mods"
@@ -13,29 +15,26 @@ STREAMLIT_JS_FOLDER = STREAMLIT_ROOT / "static" / "static" / "js"
 STREAMLIT_CSS_FOLDER = STREAMLIT_ROOT / "static" / "static" / "css"
 
 TESTGEN_ROOT = pathlib.Path(__file__).parent.parent.parent
-
-
-def patch(force: bool = False) -> list[str]:
-    operations = [
-        "ui/assets/style.css:insert",
-        "ui/assets/scripts.js:insert",
-        "ui/components/frontend/css/KFOmCnqEu92Fr1Mu7GxKOzY.woff2:copy",
-        "ui/components/frontend/css/KFOmCnqEu92Fr1Mu4mxK.woff2:copy",
-        "ui/components/frontend/css/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2:copy",
-        "ui/components/frontend/css/KFOlCnqEu92Fr1MmEU9fBBc4.woff2:copy",
-        "ui/components/frontend/css/material-symbols-rounded.woff2:copy",
-        "ui/components/frontend/css/roboto-font-faces.css:inject",
-        "ui/components/frontend/css/material-symbols-rounded.css:inject",
-        "ui/components/frontend/js/van.min.js:copy",
-        "ui/components/frontend/js/components/sidebar.js:inject",
-    ]
-
-    _patch_streamlit_index(*operations, force=force)
-
-    return [op.split(":")[0] for op in operations]
-
-
-def _patch_streamlit_index(*operations: str, force: bool = False) -> None:
+TESTGEN_STATIC_FOLDER = pathlib.Path(__file__).parent.parent.parent / "ui" / "static"
+STATIC_FILES = [
+    "css/style.css",
+    "css/shared.css",
+    "css/roboto-font-faces.css",
+    "css/material-symbols-rounded.css",
+    "js/scripts.js",
+    "js/sidebar.js",
+    "js/van.min.js",
+]
+
+
+def patch(dev: bool = False) -> None:
+    _allow_static_files([".js", ".css"])
+    _patch_streamlit_index(*STATIC_FILES, dev=dev)
+    if dev:
+        _allow_pyproject_from_editable_installs()
+
+
+def _patch_streamlit_index(*static_files: str, dev: bool = False) -> None:
     """
     Patches the index.html inside streamlit package to inject Testgen's
     own styles and scripts before rendering time.
@@ -47,12 +46,12 @@ def _patch_streamlit_index(*operations: str, force: bool = False) -> None:
 
     NOTE: keeps a .bak of the original index.html file
 
     :param filename: list of path to valid .css and .js files
-    :param force: to use in development while actively changing the
+    :param dev: to use in development while actively changing the
         injected files to force re-injection
     """
     html = BeautifulSoup(STREAMLIT_INDEX.read_text(), features="html.parser")
 
-    if force or not html.find_all(attrs={"class": INJECTED_CLASS}):
+    if dev or not html.find_all(attrs={"class": INJECTED_CLASS}):
         streamlit_index_backup = STREAMLIT_INDEX.with_suffix(".bak")
 
         if not streamlit_index_backup.exists():
@@ -63,49 +62,160 @@
         head = html.find(name="head")
 
         if head:
-            actions = {
-                "insert": _inline_tag,
-                "copy": _sourced_tag,
-                "inject": functools.partial(_sourced_tag, inject=True),
-            }
-            for operation in operations:
-                filename, action = operation.split(":")
-                if (filepath := (TESTGEN_ROOT / filename)).exists():
-                    if tag := actions[action](filepath, html):
+            for relative_path in static_files:
+                if (TESTGEN_STATIC_FOLDER / relative_path).exists():
+                    if tag := _create_tag(relative_path, html):
                         head.append(tag)
-
         STREAMLIT_INDEX.write_text(str(html))
 
 
-def _inline_tag(filepath: pathlib.Path, html: BeautifulSoup, **_) -> Tag:
-    tag_for_ext = {
-        ".css": lambda: html.new_tag("style", **{"class": INJECTED_CLASS}),
-        ".js": lambda: html.new_tag("script", **{"type": "module", "class": INJECTED_CLASS}),
-    }
-
-    try:
-        tag = tag_for_ext[filepath.suffix]()
-    except:
-        raise Exception(f"Unsupported insert operation for file with extension {filepath.suffix}") from None
-
-    tag.string = filepath.read_text()
-    return tag
-
-
-def _sourced_tag(filepath: pathlib.Path, html: BeautifulSoup, inject: bool = False) -> Tag | None:
+def _create_tag(relative_filepath: str, html: BeautifulSoup) -> Tag | None:
     tag_for_ext = {
         ".css": lambda: html.new_tag(
-            "link", **{"href": f"./static/css/{filepath.name}", "rel": "stylesheet", "class": INJECTED_CLASS}
+            "link", **{"href": f"/app/static/{relative_filepath}", "rel": "stylesheet", "class": INJECTED_CLASS}
         ),
         ".js": lambda: html.new_tag(
-            "script", **{"type": "module", "src": f"./static/js/{filepath.name}", "class": INJECTED_CLASS}
+            "script", **{"type": "module", "src": f"/app/static/{relative_filepath}", "class": INJECTED_CLASS}
         ),
     }
 
-    copy_to = ({".js": STREAMLIT_JS_FOLDER}).get(filepath.suffix, STREAMLIT_CSS_FOLDER)
-
-    shutil.copy(filepath, copy_to)
-    if not inject or filepath.suffix not in tag_for_ext:
-        return None
-
-    return tag_for_ext[filepath.suffix]()
+    extension = f".{relative_filepath.rsplit('.', 1)[-1]}"
+    if extension in tag_for_ext:
+        return tag_for_ext[extension]()
+    return None
+
+
+def _allow_static_files(extensions: list[str]):
+    file_path = pathlib.Path(streamlit_app_static_file_handler.__file__)
+    backup_file_path = file_path.with_suffix(".py.bak")
+
+    if not backup_file_path.exists():
+        shutil.copy(file_path, backup_file_path)
+    shutil.copy(backup_file_path, file_path)
+
+    content = file_path.read_text()
+
+    match = re.search(r"(SAFE_APP_STATIC_FILE_EXTENSIONS\s*=\s*\()([^)]*)(\))", content, re.DOTALL)
+
+    if match:
+        prefix = match.group(1)
+        existing_extensions_str = match.group(2)
+        suffix = match.group(3)
+
+        existing_extensions: list[str] = []
+        for line in existing_extensions_str.splitlines():
+            stripped_line = line.strip()
+            if stripped_line and not stripped_line.startswith("#"):
+                found_exts = re.findall(r'\"(\.[a-zA-Z0-9]+)\"', stripped_line)
+                existing_extensions.extend(found_exts)
+
+        all_extensions = []
+        for ext in existing_extensions + extensions:
+            if not ext.startswith("."):
+                ext = "." + ext
+            all_extensions.append(ext)
+        all_extensions = sorted(set(all_extensions))
+
+        new_extensions_formatted_lines = []
+        for ext in all_extensions:
+            new_extensions_formatted_lines.append(f'    "{ext}",')
+
+        new_tuple_content = "\n".join(new_extensions_formatted_lines)
+        new_tuple_str = f"{prefix}\n{new_tuple_content}\n{suffix}"
+
+        new_content = content.replace(match.group(0), new_tuple_str)
+        file_path.write_text(new_content)
+    else:
+        raise RuntimeError("Could not find SAFE_APP_STATIC_FILE_EXTENSIONS in the file.")
+
+
+def _allow_pyproject_from_editable_installs() -> None:
+    injected_functions = """
+import json
+
+
+def _pyproject_via_editable_package(dist: importlib.metadata.Distribution) -> Path | None:
+    from urllib.parse import urlparse, unquote
+
+    if _is_editable_package(dist):
+        try:
+            content = dist.read_text("direct_url.json")
+            data = json.loads(content)
+            if not data.get("dir_info", {}).get("editable", False):
+                return None
+            file_url = data.get("url")
+            if not file_url or not file_url.startswith("file://"):
+                return None
+            path_str = unquote(urlparse(file_url).path)
+            project_root = Path(path_str)
+            pyproject_path = project_root / "pyproject.toml"
+            if pyproject_path.exists():
+                return pyproject_path
+        except (json.JSONDecodeError, ValueError):
+            pass
+    return None
+
+
+def _is_editable_package(dist: importlib.metadata.Distribution) -> bool:
+    content = dist.read_text("direct_url.json")
+    if not content:
+        return False
+    try:
+        data = json.loads(content)
+        return data.get("dir_info", {}).get("editable", False)
+    except (json.JSONDecodeError, ValueError):
+        pass
+    return False
+
+
+IGNORED_DIR_NAMES = ["__pycache__", "invocations", "tests"]
+def _find_first_package_dir(project_path: Path) -> Path | None:
+    for item in sorted(project_path.iterdir()):
+        if item.is_dir():
+            if item.name.startswith(".") or item.name in IGNORED_DIR_NAMES:
+                continue
+            if (item / "__init__.py").exists():
+                return item
+    return None
+"""
+
+    file_path = pathlib.Path(streamlit_manifest_scanner.__file__)
+    backup_file_path = file_path.with_suffix(".py.bak")
+
+    if not backup_file_path.exists():
+        shutil.copy(file_path, backup_file_path)
+    shutil.copy(backup_file_path, file_path)
+
+    content = file_path.read_text()
+    to_replace = """    for finder in (
+        _pyproject_via_read_text,
+        _pyproject_via_dist_files,
+        lambda d: _pyproject_via_import_spec(d, package_name),
+    ):"""
+    new_value = """    for finder in (
+        _pyproject_via_read_text,
+        _pyproject_via_dist_files,
+        lambda d: _pyproject_via_import_spec(d, package_name),
+        _pyproject_via_editable_package,
+    ):"""
+
+    new_content = content + "\n" + injected_functions
+    new_content = new_content.replace(to_replace, new_value)
+
+    to_replace = """    if not package_root:
+        package_root = pyproject_path.parent"""
+    new_value = """    if not package_root:
+        if _is_editable_package(dist):
+            package_root = _find_first_package_dir(pyproject_path.parent)
+
+        if not package_root:
+            package_root = pyproject_path.parent"""
+
+    new_content = new_content.replace(to_replace, new_value)
+
+    file_path.write_text(new_content)
+
+
+if __name__ == "__main__":
+    patch(dev=True)
+    print("patched internal streamlit files")  # noqa: T201
diff --git a/testgen/ui/static/css/material-symbols-rounded.css b/testgen/ui/static/css/material-symbols-rounded.css
new file mode 100644
index 00000000..16eec0f4
--- /dev/null
+++ b/testgen/ui/static/css/material-symbols-rounded.css
@@ -0,0 +1,32 @@
+@font-face {
+    font-family: "Material Symbols Rounded";
+    font-style: normal;
+    font-weight: 100 700;
+    font-display: block;
+    src: url("/app/static/fonts/material-symbols-rounded.woff2") format("woff2");
+}
+
+.material-symbols-rounded {
+    font-family: "Material Symbols Rounded";
+    font-weight: normal;
+    font-style: normal;
+    font-size: 24px;
+    line-height: 1;
+    letter-spacing: normal;
+    text-transform: none;
+    display: inline-block;
+    white-space: nowrap;
+    word-wrap: normal;
+    direction: ltr;
+    -webkit-font-smoothing: antialiased;
+    -moz-osx-font-smoothing: grayscale;
+    text-rendering: optimizeLegibility;
+    font-feature-settings: "liga";
+}
+
+.material-symbols-filled {
+    font-variation-settings:
+        'FILL' 1,
+        'wght' 400,
+        'GRAD' 0,
+        'opsz' 24;
+}
diff --git a/testgen/ui/static/css/roboto-font-faces.css b/testgen/ui/static/css/roboto-font-faces.css
new file mode 100644
index 00000000..1b435eaa
--- /dev/null
+++ b/testgen/ui/static/css/roboto-font-faces.css
@@ -0,0 +1,35 @@
+@font-face {
+    font-family: 'Roboto';
+    font-style: normal;
+    font-weight: 400;
+    font-display: swap;
+    src: url(/app/static/fonts/KFOmCnqEu92Fr1Mu7GxKOzY.woff2) format('woff2');
+    unicode-range: U+0100-02AF, U+0304, U+0308, U+0329, U+1E00-1E9F, U+1EF2-1EFF, U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F, U+A720-A7FF;
+}
+
+@font-face {
+    font-family: 'Roboto';
+    font-style: normal;
+    font-weight: 400;
+    font-display: swap;
+    src: url(/app/static/fonts/KFOmCnqEu92Fr1Mu4mxK.woff2) format('woff2');
+    unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
+}
+
+@font-face {
+    font-family: 'Roboto';
+    font-style: normal;
+    font-weight: 500;
+    font-display: swap;
+    src: url(/app/static/fonts/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2) format('woff2');
+    unicode-range: U+0100-02AF, U+0304, U+0308, U+0329, U+1E00-1E9F, U+1EF2-1EFF, U+2020, U+20A0-20AB, U+20AD-20CF, U+2113, U+2C60-2C7F, U+A720-A7FF;
+}
+
+@font-face {
+    font-family: 'Roboto';
+    font-style: normal;
+    font-weight: 500;
+    font-display: swap;
+    src: url(/app/static/fonts/KFOlCnqEu92Fr1MmEU9fBBc4.woff2) format('woff2');
+    unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+02BB-02BC, U+02C6, U+02DA, U+02DC, U+0304, U+0308, U+0329, U+2000-206F, U+2074, U+20AC, U+2122, U+2191, U+2193, U+2212, U+2215, U+FEFF, U+FFFD;
+}
diff --git a/testgen/ui/static/css/shared.css b/testgen/ui/static/css/shared.css
new file mode 100644
index 00000000..8390aafe
--- /dev/null
+++ b/testgen/ui/static/css/shared.css
@@ -0,0 +1,750 @@
+html,
+body {
+    height: 100%;
+    margin: unset;
+    color: var(--primary-text-color);
+    font-size: 14px;
+    font-family: 'Roboto', 'Helvetica Neue', sans-serif;
+}
+
+body {
+    --primary-color: #06a04a;
+    --link-color: #1976d2;
+    --error-color: #EF5350;
+
+    --red: #EF5350;
+    --orange: #FF9800;
+    --yellow: #FDD835;
+    --green: #9CCC65;
+    --purple: #AB47BC;
+    --blue: #42A5F5;
+    --brown: #8D6E63;
+    --grey: #BDBDBD;
+    --light-grey: #E0E0E0;
+    --empty: #EEEEEE;
+    --empty-light: #FAFAFA;
+    --empty-dark: #BDBDBD;
+    --empty-teal: #E7F1F0;
+
+    --primary-text-color: #000000de;
+    --secondary-text-color: #0000008a;
+    --disabled-text-color: #00000042;
+    --caption-text-color: rgba(49, 51, 63, 0.6); /* Match Streamlit's caption color */
+    --form-field-color: rgb(240, 242, 246); /* Match Streamlit's form field color */
+    --border-color: rgba(0, 0, 0, .12);
+    --tooltip-color: #333d;
+    --tooltip-text-color: #fff;
+    --dk-card-background: #fff;
+    --dk-dialog-background: #fff;
+    --selected-item-background: #06a04a17;
+
+    --sidebar-background-color: white;
+    --sidebar-item-hover-color: #f5f5f5;
+    --sidebar-active-item-color: #f5f5f5;
+    --sidebar-active-item-border-color: #b4e3c9;
+
+    --field-underline-color: #9e9e9e;
+
+    --button-hover-state-opacity: 0.12;
+    --button-generic-background-color: #ffffff;
+
+    --button-basic-background: transparent;
+    --button-basic-text-color: rgba(0, 0, 0, .87);
+    --button-basic-hover-state-background: rgba(0, 0, 0, .54);
+
+    --button-basic-flat-text-color: rgba(0, 0, 0);
+    --button-basic-flat-background: rgba(0, 0, 0, .87);
+
+    --button-basic-stroked-text-color: rgba(0, 0, 0, .87);
+    --button-basic-stroked-background: transparent;
+
+    --button-primary-background: transparent;
+    --button-primary-text-color: var(--primary-color);
+    --button-primary-hover-state-background: var(--primary-color);
+
+    --button-primary-flat-text-color: rgba(255, 255, 255);
+    --button-primary-flat-background: var(--primary-color);
+
+    --button-primary-stroked-text-color: var(--primary-color);
+    --button-primary-stroked-background: transparent;
+    --button-stroked-border: 1px solid var(--border-color);
+
+    --button-warn-background: transparent;
+    --button-warn-text-color: var(--red);
+    --button-warn-hover-state-background: var(--red);
+
+    --button-warn-flat-text-color: rgba(255, 255, 255);
+    --button-warn-flat-background: var(--red);
+
+    --button-warn-stroked-text-color: var(--red);
+    --button-warn-stroked-background: transparent;
+
+    --portal-background: white;
+    --portal-box-shadow: rgba(0, 0, 0, 0.16) 0px 4px 16px;
+    --select-hover-background: rgb(240, 242, 246);
+
+    --app-background-color: #f8f9fa;
+
+    --table-hover-color: #ecf0f1;
+    --table-selection-color: rgba(0,145,234,.28);
+}
+
+@media (prefers-color-scheme: dark) {
+    body {
+        --empty: #424242;
+        --empty-light: #212121;
+        --empty-dark: #757575;
+        --empty-teal: #242E2D;
+
+        --primary-text-color: rgba(255, 255, 255);
+        --secondary-text-color: rgba(255, 255, 255, .7);
+        --disabled-text-color: rgba(255, 255, 255, .5);
+        --caption-text-color: rgba(250, 250, 250, .6); /* Match Streamlit's caption color */
+        --form-field-color: rgb(38, 39, 48); /* Match Streamlit's form field color */
+        --border-color: rgba(255, 255, 255, .25);
+        --tooltip-color: #eee;
+        --tooltip-text-color: #000;
+        --dk-card-background: #14181f;
+        --dk-dialog-background: #0e1117;
+
+        --sidebar-background-color: #14181f;
+        --sidebar-item-hover-color: #10141b;
+        --sidebar-active-item-color: #10141b;
+        --sidebar-active-item-border-color: #b4e3c9;
+        --dk-text-value-background: unset;
+
+        --button-generic-background-color: rgb(38, 39, 48);
+
+        --button-basic-background: transparent;
+        --button-basic-text-color: rgba(255, 255, 255);
+        --button-basic-hover-state-background: rgba(255, 255, 255, .54);
+
+        --button-basic-flat-text-color: rgba(255, 255, 255);
+        --button-basic-flat-background: rgba(255, 255, 255, .54);
+
+        --button-basic-stroked-text-color: rgba(255, 255, 255, .87);
+        --button-basic-stroked-background: transparent;
+
+        --button-stroked-border: 1px solid var(--border-color);
+
+        --portal-background: #14181f;
+        --portal-box-shadow: rgba(0, 0, 0, 0.95) 0px 4px 16px;
+        --select-hover-background: rgb(38, 39, 48);
+
+        --app-background-color: rgb(14, 17, 23);
+    }
+}
+
+.clickable {
+    cursor: pointer !important;
+}
+
+.hidden {
+    display: none !important;
+}
+
+.invisible {
+    visibility: hidden !important;
+}
+
+.dot {
+    font-size: 10px;
+    font-style: normal;
+}
+
+.dot::before {
+    content: '⬀';
+}
+
+/* Table styles */
+.table {
+    background-color: var(--dk-card-background);
+    border: var(--button-stroked-border);
+    border-radius: 8px;
+    padding: 16px;
+    box-sizing: border-box;
+}
+
+.table-row {
+    padding: 8px 0;
+}
+
+.table.hoverable .table-row:hover {
+    background-color: var(--select-hover-background);
+}
+
+.table-row:not(:last-child) {
+    border-bottom: var(--button-stroked-border);
+}
+
+.table-header {
+    border-bottom: var(--button-stroked-border);
+    padding: 0 0 8px 0;
+    font-size: 12px;
+    color: var(--caption-text-color);
+    text-transform: uppercase;
+}
+
+.table-header > *,
+.table-row > * {
+    box-sizing: border-box;
+    padding: 0 4px;
+}
+/* */
+
+/* Text utilities */
+.text-primary {
+    color: var(--primary-text-color);
+}
+
+.text-secondary {
+    color: var(--secondary-text-color);
+}
+
+.text-disabled {
+    color: var(--disabled-text-color);
+}
+
+.text-bold {
+    font-weight: 500;
+}
+
+.text-small {
+    font-size: 13px;
+}
+
+.text-large {
+    font-size: 16px;
+}
+
+.text-caption {
+    font-size: 12px;
+    color: var(--caption-text-color);
+}
+
+.text-error {
+    color: var(--error-color);
+}
+
+.text-green {
+    color: var(--primary-color);
+}
+
+.text-capitalize {
+    text-transform: capitalize;
+}
+
+.text-code {
+    font-family: 'Courier New', Courier, monospace;
+    line-height: 1.5;
+    white-space: pre-wrap;
+}
+/* */
+
+/* Flex utilities */
+.flex-row {
+    display: flex;
+    flex-direction: row;
+    align-items: center;
+}
+
+.flex-column {
+    display: flex;
+    flex-direction: column;
+}
+
+.fx-flex {
+    flex: 1 1 0%;
+}
+
+.fx-flex-wrap {
+    flex-wrap: wrap;
+}
+
+.fx-align-flex-center {
+    align-items: center;
+}
+
+.fx-align-flex-start {
+    align-items: flex-start;
+}
+
+.fx-align-flex-end {
+    align-items: flex-end;
+}
+
+.fx-align-baseline {
+    align-items: baseline;
+}
+
+.fx-align-stretch {
+    align-items: stretch;
+}
+
+.fx-justify-flex-end {
+    justify-items: flex-end;
+}
+
+.fx-justify-content-flex-end {
+    justify-content: flex-end;
+}
+
+.fx-justify-flex-start {
+    justify-content: flex-start;
+}
+
+.fx-justify-center {
+    justify-content: center;
+}
+
+.fx-justify-space-between {
+    justify-content: space-between;
+}
+
+.fx-flex-align-content {
+    align-content: flex-start;
+}
+
+.fx-gap-1 {
+    gap: 4px;
+}
+
+.fx-gap-2 {
+    gap: 8px;
+}
+
+.fx-gap-3 {
+    gap: 12px;
+}
+
+.fx-gap-4 {
+    gap: 16px;
+}
+
+.fx-gap-5 {
+    gap: 24px;
+}
+
+.fx-gap-6 {
+    gap: 32px;
+}
+
+.fx-gap-7 {
+    gap: 40px;
+}
+
+/* */
+
+/* Whitespace utilities */
+.mt-0 {
+    margin-top: 0;
+}
+
+.mt-1 {
+    margin-top: 4px;
+}
+
+.mt-2 {
+    margin-top: 8px;
+}
+
+.mt-3 {
+    margin-top: 12px;
+}
+
+.mt-4 {
+    margin-top: 16px;
+}
+
+.mt-5 {
+    margin-top: 24px;
+}
+
+.mt-6 {
+    margin-top: 32px;
+}
+
+.mt-7 {
+    margin-top: 40px;
+}
+
+.mr-0 {
+    margin-right: 0;
+}
+
+.mr-1 {
+    margin-right: 4px;
+}
+
+.mr-2 {
+    margin-right: 8px;
+}
+
+.mr-3 {
+    margin-right: 12px;
+}
+
+.mr-4 {
+    margin-right: 16px;
+}
+
+.mr-5 {
+    margin-right: 24px;
+}
+
+.mr-6 {
+    margin-right: 32px;
+}
+
+.mr-7 {
+    margin-right: 40px;
+}
+
+.mb-0 {
+    margin-bottom: 0;
+}
+
+.mb-1 {
+    margin-bottom: 4px;
+}
+
+.mb-2 {
+    margin-bottom: 8px;
+}
+
+.mb-3 {
+    margin-bottom: 12px;
+}
+
+.mb-4 {
+    margin-bottom: 16px;
+}
+
+.mb-5 {
+    margin-bottom: 24px;
+}
+
+.mb-6 {
+    margin-bottom: 32px;
+}
+
+.mb-7 {
+    margin-bottom: 40px;
+}
+
+.ml-0 {
+    margin-left: 0;
+}
+
+.ml-1 {
+    margin-left: 4px;
+}
+
+.ml-2 {
+    margin-left: 8px;
+}
+
+.ml-3 {
+    margin-left: 12px;
+}
+
+.ml-4 {
+    margin-left: 16px;
+}
+
+.ml-5 {
+    margin-left: 24px;
+}
+
+.ml-6 {
+    margin-left: 32px;
+}
+
+.ml-7 {
+    margin-left: 40px;
+}
+
+.p-0 {
+    padding: 0;
+}
+
+.p-1 {
+    padding: 4px;
+}
+
+.p-2 {
+    padding: 8px;
+}
+
+.p-3 {
+    padding: 12px;
+}
+
+.p-4 {
+    padding: 16px;
+}
+
+.p-5 {
+    padding: 24px;
+}
+
+.p-6 {
+    padding: 32px;
+}
+
+.p-7 {
+    padding: 40px;
+}
+
+.pt-0 {
+    padding-top: 0;
+}
+
+.pt-1 {
+    padding-top: 4px;
+}
+
+.pt-2 {
+    padding-top: 8px;
+}
+
+.pt-3 {
+    padding-top: 12px;
+}
+
+.pt-4 {
+    padding-top: 16px;
+}
+
+.pt-5 {
+    padding-top: 24px;
+}
+
+.pt-6 {
+    padding-top: 32px;
+}
+
+.pt-7 {
+    padding-top: 40px;
+}
+
+.pr-0 {
+    padding-right: 0;
+}
+
+.pr-1 {
+    padding-right: 4px;
+}
+
+.pr-2 {
+    padding-right: 8px;
+}
+
+.pr-3 {
+    padding-right: 12px;
+}
+
+.pr-4 {
+    padding-right: 16px;
+}
+
+.pr-5 {
+    padding-right: 24px;
+}
+
+.pr-6 {
+    padding-right: 32px;
+}
+
+.pr-7 {
+    padding-right: 40px;
+}
+
+.pb-0 {
+    padding-bottom: 0;
+}
+
+.pb-1 {
+    padding-bottom: 4px;
+}
+
+.pb-2 {
+    padding-bottom: 8px;
+}
+
+.pb-3 {
+    padding-bottom: 12px;
+}
+
+.pb-4 {
+    padding-bottom: 16px;
+}
+
+.pb-5 {
+    padding-bottom: 24px;
+}
+
+.pb-6 {
+    padding-bottom: 32px;
+}
+
+.pb-7 {
+    padding-bottom: 40px;
+}
+
+.pl-0 {
+    padding-left: 0;
+}
+
+.pl-1 {
+    padding-left: 4px;
+}
+
+.pl-2 {
+    padding-left: 8px;
+}
+
+.pl-3 {
+    padding-left: 12px;
+}
+
+.pl-4 {
+    padding-left: 16px;
+}
+
+.pl-5 {
+    padding-left: 24px;
+}
+
+.pl-6 {
+    padding-left: 32px;
+}
+
+.pl-7 {
+    padding-left: 40px;
+}
+/* */
+
+code {
+    position: relative;
+    border-radius: 0.5rem;
+    display: block;
+    margin: 0px;
+    overflow: auto;
+    padding: 24px 16px;
+    color: var(--primary-text-color);
+    background-color: var(--empty-light);
+}
+
+code > .tg-icon {
+    position: absolute;
+    top: 21px;
+    right: 16px;
+    color: var(--secondary-text-color);
+    cursor: pointer;
+    opacity: 0;
+}
+
+code > .tg-icon:hover {
+    opacity: 1;
+}
+
+.accent-primary {
+    accent-color: var(--primary-color);
+}
+
+.border {
+    border: var(--button-stroked-border);
+}
+
+.border-radius-1 {
+    border-radius: 4px;
+}
+
+.border-radius-2 {
+    border-radius: 8px;
+}
+
+input {
+    line-height: normal !important;
+}
+
+input::-ms-reveal,
+input::-ms-clear {
+    display: none;
+}
+
+.text-left {
+    text-align: left;
+}
+
+.text-right {
+    text-align: right;
+}
+
+.text-center {
+    text-align: center;
+}
+
+.visible-overflow {
+    overflow: visible;
+}
+
+.anomaly-tag {
+    display: inline-flex;
+    align-items: center;
+    justify-content: center;
+    vertical-align: middle;
+    border-radius: 18px;
+    background: var(--green);
+    height: 20px;
+    width: 20px;
+    box-sizing: border-box;
+}
+
+.anomaly-tag > .material-symbols-rounded {
+    color: var(--empty-light);
+    font-size: 20px;
+}
+
+.anomaly-tag.has-anomalies {
+    padding: 1px 5px;
+    border-radius: 10px;
+    background: var(--error-color);
+    color: var(--empty-light);
+    width: auto;
+    min-width: 20px;
+}
+
+.anomaly-tag.has-errors {
+    position: relative;
+    background: transparent;
+}
+
+.anomaly-tag.has-errors > .material-symbols-rounded {
+    color: var(--orange);
+    font-size: 22px;
+}
+
+.anomaly-tag.is-training {
+    position: relative;
+    background: transparent;
+    border: 2px solid var(--blue);
+}
+
+.anomaly-tag.is-training > .material-symbols-rounded {
+    color: var(--blue);
+}
+
+.anomaly-tag.is-pending {
+    background: none;
+    color: var(--primary-text-color);
+}
+
+.notifications--empty.tg-empty-state {
+    margin-top: 0;
+}
+
+.warning-text {
+    color: var(--orange);
+}
diff --git a/testgen/ui/assets/style.css b/testgen/ui/static/css/style.css
similarity index 90%
rename from testgen/ui/assets/style.css
rename to testgen/ui/static/css/style.css
index dedd11fa..2637dbd5 100644
--- a/testgen/ui/assets/style.css
+++ b/testgen/ui/static/css/style.css
@@ -37,6 +37,12 @@ body {
     --app-background-color: #f8f9fa;
 }
 
+.stBidiComponent {
+    font-size: 14px;
+    font-family: 'Roboto', 'Helvetica Neue', sans-serif;
+    color: var(--primary-text-color);
+}
+
 img.dk-logo-img {
     margin: 0 0 30px 0;
     width: 100%;
@@ -78,6 +84,11 @@ img.dk-logo-img {
 }
 
 /* Sidebar */
+div[data-testid="stSidebarContent"] {
+    padding-left: unset;
+    padding-right: unset;
+}
+
 [data-testid="stSidebarContent"] [data-testid="stSidebarHeader"] {
     padding: 16px 20px 20px;
     margin-bottom: 0;
@@ -94,6 +105,10 @@ section.stSidebar {
     background-color: var(--sidebar-background-color);
 }
 
+section.stSidebar > [data-testid="stSidebarContent"] {
+    overflow: visible;
+}
+
 [data-testid="stSidebarNav"],
 [data-testid="stSidebarUserContent"] {
     display: none;
@@ -133,6 +148,10 @@ div[data-testid="stDialog"] div[role="dialog"]:has(i.xl-dialog) {
     width: calc(95rem);
 }
 
+div[data-testid="stDialog"] button[aria-label="Close"]:focus {
+    outline: none;
+}
+
 div[data-testid="stSpinner"] {
     background: transparent;
 }
@@ -235,6 +254,18 @@ button[title="Show password text"] {
     flex-direction: row;
 }
 
+/* Required after Streamlit upgrade to 1.53.0 */
+.stVerticalBlock > div.stElementContainer:has(> div.stHtml > i.flex-row) {
+    display: none;
+}
+
+.stVerticalBlock:has(> div.stElementContainer > div.stHtml > i.flex-row) > div.stElementContainer:nth-child(2) {
+    width: auto;
+}
+/* ... */
+
+
 div[data-testid="stVerticalBlockBorderWrapper"]:has( > div > div[data-testid="stVerticalBlock"] > div.element-container > div.stHtml > i.flex-row) [data-testid="stVerticalBlock"] > div[data-testid="element-container"],
 div[data-testid="stVerticalBlockBorderWrapper"]:has( > div > div[data-testid="stVerticalBlock"] > div.element-container > div.stHtml > i.flex-row) [data-testid="stVerticalBlock"] > div[data-testid="element-container"] > div[data-testid] {
     width: auto !important;
@@ -341,10 +372,13 @@ Use as testgen.text("text", "extra_styles") */
 }
 
 /* Help menu */
-.st-key-tg-header--help [data-testid="stPageLink"] {
+.st-key-tg-header--help {
+    position: relative;
+}
+
+.st-key-tg-header--help .stElementContainer:has([data-testid="stPageLink"]) {
     position: absolute;
-    top: -7px;
-    right: 0;
+    right: -7px;
     z-index: 5;
 }
 
@@ -483,14 +517,18 @@ div[data-testid="stPopoverBody"]:has(i.tg-header--help-wrapper) {
 /* */
 
 /* Export Menu */
-.st-key-tg--export-popover [data-testid="stPopoverButton"] > div:last-child {
+/* .st-key-tg--export-popover [data-testid="stPopoverButton"] > div:last-child {
     display: none;
-}
+} */
 
 .st-key-tg--export-popover [data-testid="stPopover"] {
     width: auto;
 }
 
+[data-testid="stLayoutWrapper"]:has(> .st-key-tg--export-popover > .stElementContainer > .stHtml > .flex-row) {
+    width: auto;
+}
+
 div[data-testid="stPopoverBody"]:has(i.tg--export-wrapper) {
     min-width: 150px;
     border-radius: 8px;
@@ -501,6 +539,10 @@ div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--expor
     gap: 0;
 }
 
+div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--export-wrapper) > .stElementContainer {
+    width: 100%;
+}
+
 div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--export-wrapper) button {
     width: 100%;
     padding: 4px 16px;
@@ -508,6 +550,10 @@ div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--expor
     border-radius: 0;
 }
 
+div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--export-wrapper) [data-testid="stElementContainer"] button > div {
+    justify-content: flex-start;
+}
+
 div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--export-wrapper) [data-testid="stElementContainer"]:nth-child(2) button {
     border-top-left-radius: 8px;
     border-top-right-radius: 8px;
@@ -524,6 +570,10 @@ div[data-testid="stPopoverBody"] [data-testid="stVerticalBlock"]:has(i.tg--expor
 }
 /* */
 
+input {
+    line-height: normal !important;
+}
+
 input::-ms-reveal,
 input::-ms-clear {
     display: none;
diff --git a/testgen/ui/components/frontend/css/KFOlCnqEu92Fr1MmEU9fBBc4.woff2 b/testgen/ui/static/fonts/KFOlCnqEu92Fr1MmEU9fBBc4.woff2
similarity index 100%
rename from testgen/ui/components/frontend/css/KFOlCnqEu92Fr1MmEU9fBBc4.woff2
rename to testgen/ui/static/fonts/KFOlCnqEu92Fr1MmEU9fBBc4.woff2
diff --git a/testgen/ui/components/frontend/css/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2 b/testgen/ui/static/fonts/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2
similarity index 100%
rename from testgen/ui/components/frontend/css/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2
rename to testgen/ui/static/fonts/KFOlCnqEu92Fr1MmEU9fChc4EsA.woff2
diff --git a/testgen/ui/components/frontend/css/KFOmCnqEu92Fr1Mu4mxK.woff2 b/testgen/ui/static/fonts/KFOmCnqEu92Fr1Mu4mxK.woff2
similarity index 100%
rename from testgen/ui/components/frontend/css/KFOmCnqEu92Fr1Mu4mxK.woff2
rename to testgen/ui/static/fonts/KFOmCnqEu92Fr1Mu4mxK.woff2
diff --git a/testgen/ui/components/frontend/css/KFOmCnqEu92Fr1Mu7GxKOzY.woff2 b/testgen/ui/static/fonts/KFOmCnqEu92Fr1Mu7GxKOzY.woff2
similarity index 100%
rename from testgen/ui/components/frontend/css/KFOmCnqEu92Fr1Mu7GxKOzY.woff2
rename to testgen/ui/static/fonts/KFOmCnqEu92Fr1Mu7GxKOzY.woff2
diff --git a/testgen/ui/components/frontend/css/material-symbols-rounded.woff2 b/testgen/ui/static/fonts/material-symbols-rounded.woff2
similarity index 100%
rename from testgen/ui/components/frontend/css/material-symbols-rounded.woff2
rename to testgen/ui/static/fonts/material-symbols-rounded.woff2
diff --git a/testgen/ui/static/js/axis_utils.js b/testgen/ui/static/js/axis_utils.js
new file mode 100644
index 00000000..2e5240df
--- /dev/null
+++ b/testgen/ui/static/js/axis_utils.js
@@ -0,0 +1,501 @@
+// https://stackoverflow.com/a/4955179
+function niceNumber(value, round = false) {
+    const exponent = Math.floor(Math.log10(value));
+    const fraction = value / Math.pow(10, exponent);
+    let niceFraction;
+
+    if (round) {
+        if (fraction < 1.5) {
+            niceFraction = 1;
+        } else if (fraction < 3) {
+            niceFraction = 2;
+        } else if (fraction < 7) {
+            niceFraction = 5;
+        } else {
+            niceFraction = 10;
+        }
+    } else {
+        if (fraction <= 1) {
+            niceFraction = 1;
+        } else if (fraction <= 2) {
+            niceFraction = 2;
+        } else if (fraction <= 5) {
+            niceFraction = 5;
+        } else {
+            niceFraction = 10;
+        }
+    }
+
+    return niceFraction * Math.pow(10, exponent);
+}
+
+function niceBounds(axisStart, axisEnd, tickCount = 4) {
+    let axisWidth = axisEnd - axisStart;
+
+    if (axisWidth == 0) {
+        axisStart -= 0.5;
+        axisEnd += 0.5;
+        axisWidth = axisEnd - axisStart;
+    }
+
+    const niceRange = niceNumber(axisWidth);
+    const niceTick = niceNumber(niceRange / (tickCount - 1), true);
+    axisStart = Math.floor(axisStart / niceTick) * niceTick;
+    axisEnd = Math.ceil(axisEnd / niceTick) * niceTick;
+
+    return {
+        min: axisStart,
+        max: axisEnd,
+        step: niceTick,
+        range: axisEnd - axisStart,
+    };
+}
+
+function niceTicks(axisStart, axisEnd, tickCount = 4) {
+    const { min, max, step } = niceBounds(axisStart, axisEnd, tickCount);
+    const ticks = [];
+    let currentTick = min;
+    while (currentTick <= max) {
+        ticks.push(currentTick);
+        currentTick = currentTick + step;
+    }
+    return ticks;
+}
+
+/**
+ *
+ * @typedef Range
+ * @type {object}
+ * @property {number} max
+ * @property {number} min
+ *
+ * @param {number} value
+ * @param {({new: Range, old: Range})} ranges
+ * @param {number} [zero=0]
+ */
+function scale(value, ranges, zero=0) {
const oldRange = (ranges.old.max - ranges.old.min); + const newRange = (ranges.new.max - ranges.new.min); + + if (oldRange === 0) { + return zero; + } + + return ((value - ranges.old.min) * newRange / oldRange) + ranges.new.min; +} + +/** + * @param {SVGElement} svg + * @param {MouseEvent} event + * @returns {({x: number, y: number})} + */ +function screenToSvgCoordinates(svg, event) { + const pt = svg.createSVGPoint(); + pt.x = event.offsetX; + pt.y = event.offsetY; + const inverseCTM = svg.getScreenCTM().inverse(); + const svgPoint = pt.matrixTransform(inverseCTM); + return svgPoint; +} + +/** + * Generates an array of "nice" and properly spaced tick dates for a time-series axis. + * It automatically selects the best time step (granularity) based on the range. + * + * @param {Date[]} dates An array of Date objects representing the data points. + * @param {number} minTicks The minimum number of ticks desired. + * @param {number} maxTicks The maximum number of ticks desired. + * @returns {Date[]} An array of Date objects for the axis ticks. 
+ */ +function getAdaptiveTimeTicks(dates, minTicks, maxTicks) { + if (!dates || dates.length === 0) { + return []; + } + + if (typeof dates[0] === 'number') { + dates = dates.map(d => new Date(d * 1000)); + } + + const timestamps = dates.map(d => d.getTime()); + const minTime = Math.min(...timestamps); + const maxTime = Math.max(...timestamps); + const rangeMs = maxTime - minTime; + + const timeSteps = [ + { name: 'hour', ms: 3600000 }, + { name: '4 hours', ms: 4 * 3600000 }, + { name: '8 hours', ms: 8 * 3600000 }, + { name: 'day', ms: 86400000 }, + { name: 'week', ms: 7 * 86400000 }, + { name: 'month', ms: null, count: 1 }, + { name: '3 months', ms: null, count: 3 }, + { name: '6 months', ms: null, count: 6 }, + { name: 'year', ms: null, count: 12 }, + ]; + + let bestStepIndex = -1; + let ticks = []; + + for (let i = timeSteps.length - 1; i >= 0; i--) { + const step = timeSteps[i]; + let estimatedTickCount; + + if (step.ms !== null) { + estimatedTickCount = Math.ceil(rangeMs / step.ms) + 1; + } else { + estimatedTickCount = estimateMonthYearTicks(minTime, maxTime, step.count); + } + + if (estimatedTickCount <= maxTicks) { + bestStepIndex = i; + break; + } + } + + if (bestStepIndex === -1) { + const roughStep = rangeMs / (maxTicks - 1); + const niceMsStep = getNiceStep(roughStep); + return generateMsTicks(minTime, maxTime, niceMsStep).map(t => new Date(t)); + } + + const bestStep = timeSteps[bestStepIndex]; + if (bestStep.ms !== null) { + ticks = generateMsTicks(minTime, maxTime, bestStep.ms).map(t => new Date(t)); + } else { + ticks = generateMonthYearTicks(minTime, maxTime, bestStep.count); + } + + while (ticks.length < minTicks && bestStepIndex > 0) { + bestStepIndex--; + const nextStep = timeSteps[bestStepIndex]; + + if (nextStep.ms !== null) { + ticks = generateMsTicks(minTime, maxTime, nextStep.ms).map(t => new Date(t)); + } else { + ticks = generateMonthYearTicks(minTime, maxTime, nextStep.count); + } + } + + return ticks; +} + +/** Calculates a "nice" step 
size (1, 2, 5, etc. * power of 10) for raw milliseconds. */ +function getNiceStep(step) { + const exponent = Math.floor(Math.log10(step)); + const fraction = step / Math.pow(10, exponent); + let niceFraction; + if (fraction <= 1) niceFraction = 1; + else if (fraction <= 2) niceFraction = 2; + else if (fraction <= 5) niceFraction = 5; + else return 1 * Math.pow(10, exponent + 1); // Next power of 10 + + return niceFraction * Math.pow(10, exponent); +} + +/** Generates ticks for fixed-length steps (hours, days, weeks). */ +function generateMsTicks(minTime, maxTime, niceStepMs) { + // let tickStart = minTime; // Use it to start at minimum tick + let tickStart = Math.floor(minTime / niceStepMs) * niceStepMs; // Use it to start at a nicer tick + while (tickStart > minTime) { + tickStart -= niceStepMs; + } + + const ONE_DAY = 86400000; + if (niceStepMs >= ONE_DAY) { + const date = new Date(tickStart); + date.setHours(0, 0, 0, 0); + tickStart = date.getTime(); + while (tickStart + niceStepMs < minTime) { + tickStart += niceStepMs; + } + } + + const ticks = []; + const epsilon = 1e-10; + let currentTick = tickStart; + + while (currentTick <= maxTime + niceStepMs + epsilon) { + ticks.push(Math.round(currentTick)); + currentTick += niceStepMs; + } + + return ticks; +} + +/** Generates ticks for variable-length steps (months, years). 
*/ +function generateMonthYearTicks(minTime, maxTime, monthStep) { + const ticks = []; + let currentDate = new Date(minTime); + + currentDate.setDate(1); // Set to the 1st of the month + currentDate.setHours(0, 0, 0, 0); + + let year = currentDate.getFullYear(); + let month = currentDate.getMonth(); + + while (month % monthStep !== 0) { + month--; + if (month < 0) { + month = 11; + year--; + } + } + currentDate.setFullYear(year, month, 1); + + while (currentDate.getTime() + monthStep * 30 * 86400000 < minTime) { + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + while (currentDate.getTime() <= maxTime) { + ticks.push(new Date(currentDate.getTime())); + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + if (ticks.length > 0 && currentDate.getTime() - maxTime < monthStep * 30 * 86400000 / 2) { + ticks.push(new Date(currentDate.getTime())); + } + + return ticks; +} + +/** Estimates the number of ticks for month/year steps. */ +function estimateMonthYearTicks(minTime, maxTime, monthStep) { + const minDate = new Date(minTime); + const maxDate = new Date(maxTime); + + let years = maxDate.getFullYear() - minDate.getFullYear(); + let months = maxDate.getMonth() - minDate.getMonth(); + let totalMonths = years * 12 + months; + + return Math.ceil(totalMonths / monthStep) + 2; +} + +function getAdaptiveTimeTicksV2(dates, totalWidth, tickWidth) { + if (!dates || dates.length === 0) { + return []; + } + + if (typeof dates[0] === 'number') { + dates = dates.map(d => new Date(d)); + } + + const timestamps = dates.map(d => d.getTime()); + const minTime = Math.min(...timestamps); + const maxTime = Math.max(...timestamps); + const rangeMs = maxTime - minTime; + + const maxTicks = Math.floor(totalWidth / tickWidth); + const timeSteps = [ + { name: 'hour', ms: 3600000 }, + { name: '2 hours', ms: 7200000 }, + { name: '4 hours', ms: 14400000 }, + { name: '6 hours', ms: 21600000 }, + { name: '8 hours', ms: 28800000 }, + { name: '12 hours', ms: 43200000 }, + 
{ name: 'day', ms: 86400000 },
+        { name: '2 days', ms: 172800000 },
+        { name: '3 days', ms: 259200000 },
+        { name: 'week', ms: 604800000 },
+        { name: '2 weeks', ms: 1209600000 },
+        { name: 'month', ms: null, count: 1 },
+        { name: '3 months', ms: null, count: 3 },
+        { name: '6 months', ms: null, count: 6 },
+        { name: 'year', ms: null, count: 12 },
+    ];
+
+    for (let i = 0; i < timeSteps.length; i++) {
+        const step = timeSteps[i];
+        let tickCount = 0;
+
+        if (step.ms !== null) {
+            // Precise calculation: how many strict ticks fit in [minTime, maxTime]?
+            const firstTick = Math.ceil(minTime / step.ms) * step.ms;
+            const lastTick = Math.floor(maxTime / step.ms) * step.ms;
+            if (lastTick >= firstTick) {
+                tickCount = Math.floor((lastTick - firstTick) / step.ms) + 1;
+            }
+        } else {
+            tickCount = estimateMonthYearTicksStrict(minTime, maxTime, step.count);
+        }
+
+        if (tickCount <= maxTicks && tickCount > 0) {
+            if (step.ms !== null) {
+                return generateMsTicksStrict(minTime, maxTime, step.ms);
+            } else {
+                return generateMonthYearTicksStrict(minTime, maxTime, step.count);
+            }
+        }
+    }
+
+    const targetStep = rangeMs / Math.max(1, maxTicks);
+    const niceStep = getNiceStep(targetStep);
+    return generateMsTicksStrict(minTime, maxTime, niceStep);
+}
+
+/**
+ * Generates ticks strictly within [minTime, maxTime].
+ * Uses Math.ceil to start 'inside' the range.
+ */
+function generateMsTicksStrict(minTime, maxTime, stepMs) {
+    const ticks = [];
+
+    let currentTick = Math.ceil(minTime / stepMs) * stepMs;
+
+    while (currentTick <= maxTime) {
+        ticks.push(new Date(currentTick));
+        currentTick += stepMs;
+    }
+
+    return ticks;
+}
+
+/**
+ * Generates Month/Year ticks strictly within bounds.
+ */ +function generateMonthYearTicksStrict(minTime, maxTime, monthStep) { + const ticks = []; + let currentDate = new Date(minTime); + + currentDate.setDate(1); + currentDate.setHours(0, 0, 0, 0); + + let month = currentDate.getMonth(); + let year = currentDate.getFullYear(); + while (month % monthStep !== 0) { + month--; + if (month < 0) { month = 11; year--; } + } + currentDate.setFullYear(year, month, 1); + + while (currentDate.getTime() < minTime) { + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + while (currentDate.getTime() <= maxTime) { + ticks.push(new Date(currentDate)); + currentDate.setMonth(currentDate.getMonth() + monthStep); + } + + return ticks; +} + +function estimateMonthYearTicksStrict(minTime, maxTime, monthStep) { + let count = 0; + let d = new Date(minTime); + d.setDate(1); d.setHours(0,0,0,0); + + let m = d.getMonth(); + let y = d.getFullYear(); + while (m % monthStep !== 0) { m--; if(m<0){m=11; y--;} } + d.setFullYear(y, m, 1); + + while (d.getTime() < minTime) { + d.setMonth(d.getMonth() + monthStep); + } + while (d.getTime() <= maxTime) { + count++; + d.setMonth(d.getMonth() + monthStep); + } + return count; +} + +/** + * Formats an array of Date objects into smart, non-redundant labels. + * It only displays the year, month, or day when it changes from the previous tick. + * + * @param {Date[]} ticks An array of Date objects (the tick values). + * @returns {Array} An array of formatted labels (strings or string arrays). 
+ */ +function formatSmartTimeTicks(ticks) { + if (!ticks || ticks.length === 0) { + return []; + } + + const formattedLabels = []; + const locale = 'en-US'; + + const yearFormat = { year: 'numeric' }; + const monthFormat = { month: 'short' }; + const dayFormat = { day: 'numeric' }; + const timeFormat = { hour: '2-digit', minute: '2-digit', hourCycle: 'h23' }; + const ONE_DAY_MS = 86400000; + + const formatPart = (date, options) => date.toLocaleString(locale, options); + + for (let i = 0; i < ticks.length; i++) { + const currentTick = ticks[i]; + const previousTick = ticks[i - 1]; + const nextTick = ticks[i + 1]; + + let needsYear = false; + let needsMonth = false; + let needsDay = false; + let needsTime = false; + + if (!previousTick) { + needsYear = true; + needsMonth = true; + needsDay = true; + needsTime = nextTick && nextTick.getTime() - currentTick.getTime() < ONE_DAY_MS; + } else { + const curr = currentTick; + const prev = previousTick; + + if (curr.getFullYear() !== prev.getFullYear()) { + needsYear = true; + needsMonth = true; + needsDay = true; + } else if (curr.getMonth() !== prev.getMonth()) { + needsMonth = true; + needsDay = true; + } else if (curr.getDate() !== prev.getDate()) { + needsDay = true; + needsMonth = true; + } + + const stepMs = currentTick.getTime() - previousTick.getTime(); + if (stepMs < ONE_DAY_MS || (curr.getHours() !== 0 || curr.getMinutes() !== 0)) { + needsTime = true; + } + } + + let line1 = []; + let line2 = []; + + if (needsTime) { + line1.push(formatPart(currentTick, timeFormat)); + } + + if (needsMonth || needsDay) { + let datePart = []; + if (needsMonth) { + datePart.push(formatPart(currentTick, monthFormat)); + } + if (needsDay) { + datePart.push(formatPart(currentTick, dayFormat)); + } + const dateString = datePart.join(' '); + + if (needsTime) { + line2.push(dateString); + } else { + line1.push(dateString); + } + } + + if (needsYear) { + line2.push(formatPart(currentTick, yearFormat)); + } + + line1 = line1.filter(p => 
p.length > 0).join(' '); + line2 = line2.filter(p => p.length > 0).join(' '); + + if (line2.length > 0) { + formattedLabels.push([line1, line2]); + } else { + formattedLabels.push(line1); + } + } + + return formattedLabels; +} + +export { niceBounds, niceTicks, scale, screenToSvgCoordinates, getAdaptiveTimeTicks, getAdaptiveTimeTicksV2, formatSmartTimeTicks }; diff --git a/testgen/ui/static/js/components/alert.js b/testgen/ui/static/js/components/alert.js new file mode 100644 index 00000000..dfb28edd --- /dev/null +++ b/testgen/ui/static/js/components/alert.js @@ -0,0 +1,125 @@ +/** + * @typedef Properties + * @type {object} + * @property {string?} icon + * @property {number?} timeout + * @property {boolean?} closeable + * @property {string?} class + * @property {'info'|'success'|'warn'|'error'} type + * @property {Function?} onClose + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet, getRandomId } from '../utils.js'; +import { Icon } from './icon.js'; +import { Button } from './button.js'; + +const { div } = van.tags; + +const Alert = (/** @type Properties */ props, /** @type Array */ ...children) => { + loadStylesheet('alert', stylesheet); + + const elementId = getValue(props.id) ?? 'tg-alert-' + getRandomId(); + const close = () => { + props.onClose ? props.onClose() : document.getElementById(elementId)?.remove(); + }; + const timeout = getValue(props.timeout); + if (timeout && timeout > 0) { + setTimeout(close, timeout); + } + + return div( + { + ...props, + id: elementId, + class: () => `tg-alert flex-row ${getValue(props.class) ?? ''} tg-alert-${getValue(props.type)}`, + role: 'alert', + }, + () => { + const icon = getValue(props.icon); + if (!icon) { + return ''; + } + + return Icon({size: 20, classes: 'mr-2'}, icon); + }, + div( + {class: 'flex-column'}, + ...children, + ), + () => { + const isCloseable = getValue(props.closeable) ?? 
false; + if (!isCloseable) { + return ''; + } + + return Button({ + type: 'icon', + icon: 'close', + style: `margin-left: auto;`, + onclick: close, + }); + }, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-alert { + padding: 16px; + border-radius: 0.5rem; + font-size: 16px; + line-height: 24px; +} + +.tg-alert-info { + background-color: rgba(28, 131, 225, 0.1); + color: rgb(0, 66, 128); +} + +.tg-alert-success { + background-color: rgba(33, 195, 84, 0.1); + color: rgb(23, 114, 51); +} + +.tg-alert-warn { + background-color: rgba(255, 227, 18, 0.1); + color: rgb(146, 108, 5); +} + +.tg-alert-error { + background-color: rgba(255, 43, 43, 0.09); + color: rgb(125, 53, 59); +} + +@media (prefers-color-scheme: dark) { + .tg-alert-info { + background-color: rgba(61, 157, 243, 0.2); + color: rgb(199, 235, 255); + } + + .tg-alert-success { + background-color: rgba(61, 213, 109, 0.2); + color: rgb(223, 253, 233); + } + + .tg-alert-warn { + background-color: rgba(255, 227, 18, 0.2); + color: rgb(255, 255, 194); + } + + .tg-alert-error { + background-color: rgba(255, 108, 108, 0.2); + color: rgb(255, 222, 222); + } +} + +.tg-alert > .tg-icon { + color: inherit !important; +} + +.tg-alert > .tg-button { + color: inherit !important; +} +`); + +export { Alert }; diff --git a/testgen/ui/static/js/components/attribute.js b/testgen/ui/static/js/components/attribute.js new file mode 100644 index 00000000..61240f7f --- /dev/null +++ b/testgen/ui/static/js/components/attribute.js @@ -0,0 +1,49 @@ +/** + * @typedef Properties + * @type {object} + * @property {string} label + * @property {string?} help + * @property {string | number} value + * @property {number?} width + * @property {string?} class + */ +import { getValue, loadStylesheet } from '../utils.js'; +import { Icon } from './icon.js'; +import { withTooltip } from './tooltip.js'; +import van from '../van.min.js'; + +const { div } = van.tags; + +const Attribute = (/** @type Properties */ props) => { 
+ loadStylesheet('attribute', stylesheet); + + return div( + { style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}`, class: props.class }, + div( + { class: 'flex-row fx-gap-1 text-caption mb-1' }, + props.label, + () => getValue(props.help) + ? withTooltip( + Icon({size: 16, classes: 'text-disabled' }, 'help'), + { text: props.help, position: 'top', width: 200 }, + ) + : null, + ), + div( + { class: 'attribute-value' }, + () => { + const value = getValue(props.value); + return (value || value === 0) ? value : '--'; + }, + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.attribute-value { + word-wrap: break-word; +} +`); + +export { Attribute }; diff --git a/testgen/ui/static/js/components/box_plot.js b/testgen/ui/static/js/components/box_plot.js new file mode 100644 index 00000000..ef1957b9 --- /dev/null +++ b/testgen/ui/static/js/components/box_plot.js @@ -0,0 +1,290 @@ +/** + * @typedef Properties + * @type {object} + * @property {number} minimum + * @property {number} maximum + * @property {number} median + * @property {number} lowerQuartile + * @property {number} upperQuartile + * @property {number} average + * @property {number} standardDeviation + * @property {number?} width + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { colorMap, formatNumber } from '../display_utils.js'; +import { niceBounds } from '../axis_utils.js'; + +const { div } = van.tags; +const boxColor = colorMap.teal; +const lineColor = colorMap.limeGreen; + +const BoxPlot = (/** @type Properties */ props) => { + loadStylesheet('boxPlot', stylesheet); + + const { minimum, maximum, median, lowerQuartile, upperQuartile, average, standardDeviation, width } = props; + const axisTicks = van.derive(() => niceBounds(getValue(minimum), getValue(maximum))); + + return div( + { + class: 'flex-row fx-flex-wrap fx-gap-6', + style: () => `max-width: ${width ? 
getValue(width) + 'px' : '100%'};`, + }, + div( + { class: 'pl-7 pr-7', style: 'flex: 300px' }, + div( + { + class: 'tg-box-plot--line', + style: () => { + const { min, range } = axisTicks.val; + return `left: ${(getValue(average) - getValue(standardDeviation) - min) * 100 / range}%; + width: ${getValue(standardDeviation) * 2 * 100 / range}%;`; + }, + }, + div({ class: 'tg-box-plot--dot' }), + ), + div( + { + class: 'tg-box-plot--grid', + style: () => { + const { min, max, range } = axisTicks.val; + + return `grid-template-columns: + ${(getValue(minimum) - min) * 100 / range}% + ${(getValue(lowerQuartile) - getValue(minimum)) * 100 / range}% + ${(getValue(median) - getValue(lowerQuartile)) * 100 / range}% + ${(getValue(upperQuartile) - getValue(median)) * 100 / range}% + ${(getValue(maximum) - getValue(upperQuartile)) * 100 / range}% + ${(max - getValue(maximum)) * 100 / range}%;`; + }, + }, + div({ class: 'tg-box-plot--space-left' }), + div({ class: 'tg-box-plot--top-left' }), + div({ class: 'tg-box-plot--bottom-left' }), + div({ class: 'tg-box-plot--mid-left' }), + div({ class: 'tg-box-plot--mid-right' }), + div({ class: 'tg-box-plot--top-right' }), + div({ class: 'tg-box-plot--bottom-right' }), + div({ class: 'tg-box-plot--space-right' }), + ), + () => { + const { min, max, step, range } = axisTicks.val; + const ticks = []; + let currentTick = min; + while (currentTick <= max) { + ticks.push(currentTick); + currentTick += step; + } + + return div( + { class: 'tg-box-plot--axis' }, + ticks.map(position => div( + { + class: 'tg-box-plot--axis-tick', + style: `left: ${(position - min) * 100 / range}%;` + }, + formatNumber(position), + )), + ); + }, + ), + div( + { class: 'flex-column fx-gap-2 text-caption', style: 'flex: 150px;' }, + div( + { class: 'flex-row fx-gap-2' }, + div({ class: 'tg-blox-plot--legend-line' }), + 'Average---Standard Deviation', + ), + div( + { class: 'flex-row fx-gap-2' }, + div({ class: 'tg-blox-plot--legend-whisker' }), + 
'Minimum---Maximum', + ), + div( + { class: 'flex-row fx-gap-2' }, + div({ class: 'tg-blox-plot--legend-box' }), + '25th---Median---75th', + ), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-box-plot--line { + position: relative; + margin: 8px 0 24px 0; + border-top: 2px dotted ${lineColor}; +} + +.tg-box-plot--dot { + position: absolute; + top: -1px; + left: 50%; + transform: translateX(-50%) translateY(-50%); + width: 10px; + height: 10px; + border-radius: 5px; + background-color: ${lineColor}; +} + +.tg-box-plot--grid { + height: 24px; + display: grid; + grid-template-rows: 50% 50%; +} + +.tg-box-plot--grid div { + border-color: var(--caption-text-color); + border-style: solid; +} + +.tg-box-plot--space-left { + grid-column-start: 1; + grid-column-end: 2; + grid-row-start: 1; + grid-row-end: 3; + border: 0; +} + +.tg-box-plot--top-left { + grid-column-start: 2; + grid-column-end: 3; + grid-row-start: 1; + grid-row-end: 2; + border-width: 0 0 1px 2px; +} + +.tg-box-plot--bottom-left { + grid-column-start: 2; + grid-column-end: 3; + grid-row-start: 2; + grid-row-end: 3; + border-width: 1px 0 0 2px; +} + +.tg-box-plot--mid-left { + grid-column-start: 3; + grid-column-end: 4; + grid-row-start: 1; + grid-row-end: 3; + border-width: 1px 2px 1px 1px; + border-radius: 4px 0 0 4px; + background-color: ${boxColor}; +} + +.tg-box-plot--mid-right { + grid-column-start: 4; + grid-column-end: 5; + grid-row-start: 1; + grid-row-end: 3; + border-width: 1px 1px 1px 2px; + border-radius: 0 4px 4px 0; + background-color: ${boxColor}; +} + +.tg-box-plot--top-right { + grid-column-start: 5; + grid-column-end: 6; + grid-row-start: 1; + grid-row-end: 2; + border-width: 0 2px 1px 0; +} + +.tg-box-plot--bottom-right { + grid-column-start: 5; + grid-column-end: 6; + grid-row-start: 2; + grid-row-end: 3; + border-width: 1px 2px 0 0; +} + +.tg-box-plot--space-right { + grid-column-start: 6; + grid-column-end: 7; + grid-row-start: 1; + grid-row-end: 3; + 
border: 0; +} + +.tg-box-plot--axis { + position: relative; + margin: 24px 0; + width: 100%; + height: 2px; + background-color: var(--disabled-text-color); + color: var(--caption-text-color); +} + +.tg-box-plot--axis-tick { + position: absolute; + top: 8px; + transform: translateX(-50%); +} + +.tg-box-plot--axis-tick::before { + position: absolute; + top: -9px; + left: 50%; + transform: translateX(-50%); + width: 4px; + height: 4px; + border-radius: 2px; + background-color: var(--disabled-text-color); + content: ''; +} + +.tg-blox-plot--legend-line { + width: 26px; + border: 1px dotted ${lineColor}; + position: relative; +} + +.tg-blox-plot--legend-line::after { + position: absolute; + left: 50%; + transform: translateX(-50%) translateY(-50%); + width: 6px; + height: 6px; + border-radius: 6px; + background-color: ${lineColor}; + content: ''; +} + +.tg-blox-plot--legend-whisker { + width: 24px; + height: 12px; + border: solid var(--caption-text-color); + border-width: 0 2px 0 2px; + position: relative; +} + +.tg-blox-plot--legend-whisker::after { + position: absolute; + top: 5px; + width: 24px; + height: 2px; + background-color: var(--caption-text-color); + content: ''; +} + +.tg-blox-plot--legend-box { + width: 26px; + height: 12px; + border: 1px solid var(--caption-text-color); + border-radius: 4px; + background-color: ${boxColor}; + position: relative; +} + +.tg-blox-plot--legend-box::after { + position: absolute; + left: 12px; + width: 2px; + height: 12px; + background-color: var(--caption-text-color); + content: ''; +} +`); + +export { BoxPlot }; diff --git a/testgen/ui/static/js/components/breadcrumbs.js b/testgen/ui/static/js/components/breadcrumbs.js new file mode 100644 index 00000000..52a18a98 --- /dev/null +++ b/testgen/ui/static/js/components/breadcrumbs.js @@ -0,0 +1,90 @@ +/** + * @typedef Breadcrumb + * @type {object} + * @property {string} path + * @property {object} params + * @property {string} label + * + * @typedef Properties + * @type {object} + 
* @property {Array.<Breadcrumb>} breadcrumbs
+ */
+import van from '../van.min.js';
+import { Streamlit } from '../streamlit.js';
+import { emitEvent, getValue, loadStylesheet } from '../utils.js';
+
+const { a, div, span } = van.tags;
+
+const Breadcrumbs = (/** @type Properties */ props) => {
+    loadStylesheet('breadcrumbs', stylesheet);
+
+    if (!window.testgen.isPage) {
+        Streamlit.setFrameHeight(24);
+    }
+
+    return div(
+        {class: 'tg-breadcrumbs-wrapper'},
+        () => {
+            const breadcrumbs = getValue(props.breadcrumbs) || [];
+
+            return div(
+                { class: 'tg-breadcrumbs' },
+                breadcrumbs.reduce((items, b, idx) => {
+                    const isLastItem = idx === breadcrumbs.length - 1;
+                    items.push(a({
+                        class: `tg-breadcrumbs--${ isLastItem ? 'current' : 'active'}`,
+                        onclick: (event) => {
+                            event.preventDefault();
+                            event.stopPropagation();
+                            emitEvent('LinkClicked', { href: b.path, params: b.params });
+                        }},
+                        b.label,
+                    ));
+                    if (!isLastItem) {
+                        items.push(span({class: 'tg-breadcrumbs--arrow'}, '>'));
+                    }
+                    return items;
+                }, [])
+            );
+        }
+    )
+};
+
+const stylesheet = new CSSStyleSheet();
+stylesheet.replace(`
+.tg-breadcrumbs-wrapper {
+    height: 100%;
+}
+
+.tg-breadcrumbs {
+    display: flex;
+    align-items: center;
+    color: var(--secondary-text-color);
+    height: 100%;
+}
+
+.tg-breadcrumbs > a {
+    text-decoration: unset;
+}
+
+.tg-breadcrumbs--arrow {
+    margin-left: 4px;
+    margin-right: 4px;
+}
+
+.tg-breadcrumbs--active {
+    cursor: pointer;
+    color: var(--secondary-text-color);
+}
+
+.tg-breadcrumbs--active:hover {
+    text-decoration: underline;
+}
+
+.tg-breadcrumbs--current {
+    pointer-events: none;
+    color: var(--secondary-text-color);
+}
+`);
+
+export { Breadcrumbs };
diff --git a/testgen/ui/static/js/components/button.js b/testgen/ui/static/js/components/button.js
new file mode 100644
index 00000000..c78f2173
--- /dev/null
+++ b/testgen/ui/static/js/components/button.js
@@ -0,0 +1,215 @@
+/**
+ * @typedef Properties
+ * @type {object}
+ * @property {'basic' | 'flat' | 'icon' | 'stroked'}
type + * @property {'basic' | 'primary' | 'warn'} color + * @property {(string|null)} width + * @property {(string|null)} label + * @property {(string|null)} icon + * @property {(int|null)} iconSize + * @property {(string|null)} tooltip + * @property {(string|null)} tooltipPosition + * @property {(string|null)} id + * @property {(Function|null)} onclick + * @property {(bool)} disabled + * @property {string?} style + * @property {string?} testId + */ +import { emitEvent, enforceElementWidth, getValue, loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { Tooltip } from './tooltip.js'; + +const { button, i, span } = van.tags; +const BUTTON_TYPE = { + BASIC: 'basic', + FLAT: 'flat', + ICON: 'icon', + STROKED: 'stroked', +}; +const DEFAULT_ICON_SIZE = 18; + + +const Button = (/** @type Properties */ props) => { + loadStylesheet('button', stylesheet); + + const width = getValue(props.width); + const isIconOnly = getValue(props.type) === BUTTON_TYPE.ICON || (getValue(props.icon) && !getValue(props.label)); + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(40); + if (isIconOnly) { // Force a 40px width for the parent iframe & handle window resizing + enforceElementWidth(window.frameElement, 40); + } + + if (width) { + enforceElementWidth(window.frameElement, width); + } + if (props.tooltip) { + window.frameElement.parentElement.setAttribute('data-tooltip', props.tooltip.val); + window.frameElement.parentElement.setAttribute('data-tooltip-position', props.tooltipPosition.val); + } + } + + const onClickHandler = props.onclick || (() => emitEvent('ButtonClicked')); + const showTooltip = van.state(false); + + return button( + { + id: getValue(props.id) ?? undefined, + class: () => `tg-button tg-${getValue(props.type)}-button tg-${getValue(props.color) ?? 'basic'}-button ${getValue(props.type) !== 'icon' && isIconOnly ? 'tg-icon-button' : ''}`, + style: () => `width: ${isIconOnly ? 
'' : (width ?? '100%')}; ${getValue(props.style)}`, + onclick: onClickHandler, + disabled: props.disabled, + onmouseenter: props.tooltip ? (() => showTooltip.val = true) : undefined, + onmouseleave: props.tooltip ? (() => showTooltip.val = false) : undefined, + 'data-testid': getValue(props.testId) ?? '', + }, + () => window.testgen.isPage && getValue(props.tooltip) ? Tooltip({ + text: props.tooltip, + show: showTooltip, + position: props.tooltipPosition, + }) : '', + span({class: 'tg-button-focus-state-indicator'}, ''), + props.icon ? i({ + class: 'material-symbols-rounded', + style: () => `font-size: ${getValue(props.iconSize) ?? DEFAULT_ICON_SIZE}px;` + }, props.icon) : undefined, + !isIconOnly ? span(props.label) : undefined, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +button.tg-button { + height: 40px; + + position: relative; + + display: flex; + flex-direction: row; + align-items: center; + justify-content: center; + + outline: 0; + border: unset; + border-radius: 4px; + padding: 8px 11px; + + cursor: pointer; + + font-size: 14px; +} + +button.tg-button .tg-button-focus-state-indicator { + border-radius: inherit; + overflow: hidden; +} + +button.tg-button .tg-button-focus-state-indicator::before { + content: ""; + opacity: 0; + top: 0; + left: 0; + right: 0; + bottom: 0; + position: absolute; + pointer-events: none; + border-radius: inherit; +} + +button.tg-button.tg-stroked-button { + border: var(--button-stroked-border); +} + +button.tg-button.tg-icon-button { + width: 40px; +} + +button.tg-button:has(span) { + padding: 8px 16px; +} + +button.tg-button:not(.tg-icon-button):has(span):has(i) { + padding-left: 12px; +} + +button.tg-button[disabled] { + color: var(--disabled-text-color) !important; + cursor: not-allowed; +} + +button.tg-button > i:has(+ span:not(.tg-tooltip)) { + margin-right: 8px; +} + +button.tg-button:hover:not([disabled]) .tg-button-focus-state-indicator::before { + opacity: var(--button-hover-state-opacity); 
+} + + +/* Basic button colors */ +button.tg-button.tg-basic-button { + color: var(--button-basic-text-color); + background: var(--button-basic-background); +} + +button.tg-button.tg-basic-button .tg-button-focus-state-indicator::before { + background: var(--button-basic-hover-state-background); +} + +button.tg-button.tg-basic-button.tg-flat-button { + color: var(--button-basic-flat-text-color); + background: var(--button-basic-flat-background); +} + +button.tg-button.tg-basic-button.tg-stroked-button { + color: var(--button-basic-stroked-text-color); + background: var(--button-basic-stroked-background); +} +/* ... */ + +/* Primary button colors */ +button.tg-button.tg-primary-button { + color: var(--button-primary-text-color); + background: var(--button-primary-background); +} + +button.tg-button.tg-primary-button .tg-button-focus-state-indicator::before { + background: var(--button-primary-hover-state-background); +} + +button.tg-button.tg-primary-button.tg-flat-button { + color: var(--button-primary-flat-text-color); + background: var(--button-primary-flat-background); +} + +button.tg-button.tg-primary-button.tg-stroked-button { + color: var(--button-primary-stroked-text-color); + background: var(--button-primary-stroked-background); +} +/* ... */ + +/* Warn button colors */ +button.tg-button.tg-warn-button { + color: var(--button-warn-text-color); + background: var(--button-warn-background); +} + +button.tg-button.tg-warn-button .tg-button-focus-state-indicator::before { + background: var(--button-warn-hover-state-background); +} + +button.tg-button.tg-warn-button.tg-flat-button { + color: var(--button-warn-flat-text-color); + background: var(--button-warn-flat-background); +} + +button.tg-button.tg-warn-button.tg-stroked-button { + color: var(--button-warn-stroked-text-color); + background: var(--button-warn-stroked-background); +} +/* ... 
*/ +`); + +export { Button }; diff --git a/testgen/ui/static/js/components/caption.js b/testgen/ui/static/js/components/caption.js new file mode 100644 index 00000000..8f7f21f4 --- /dev/null +++ b/testgen/ui/static/js/components/caption.js @@ -0,0 +1,29 @@ +/** +* @typedef Properties +* @type {object} +* @property {string} content +* @property {string?} style +*/ +import van from '../van.min.js'; +import { loadStylesheet } from '../utils.js'; + +const { span } = van.tags; + +const Caption = (/** @type Properties */ props) => { + loadStylesheet('caption', stylesheet); + + return span( + { class: 'tg-caption', style: props.style }, + props.content + ); +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-caption { + color: var(--caption-text-color); + font-size: 14px; +} +`); + +export { Caption }; diff --git a/testgen/ui/static/js/components/card.js b/testgen/ui/static/js/components/card.js new file mode 100644 index 00000000..b883b9b7 --- /dev/null +++ b/testgen/ui/static/js/components/card.js @@ -0,0 +1,61 @@ +/** + * @typedef Properties + * @type {object} + * @property {object?} title + * @property {object} content + * @property {object?} actionContent + * @property {boolean?} border + * @property {string?} id + * @property {string?} class + * @property {string?} testId + */ +import { loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; + +const { div, h3 } = van.tags; + +const Card = (/** @type Properties */ props) => { + loadStylesheet('card', stylesheet); + + return div( + { class: `tg-card mb-4 ${props.border ? 'tg-card-border' : ''} ${props.class}`, id: props.id ?? '', 'data-testid': props.testId ?? '' }, + () => + props.title || props.actionContent ? + div( + { class: 'flex-row fx-justify-space-between fx-align-flex-start fx-gap-4' }, + () => + props.title ? 
+ h3( + { class: 'tg-card--title' }, + props.title, + ) : + '', + props.actionContent, + ) : + '', + props.content, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-card { + border-radius: 8px; + background-color: var(--dk-card-background); + padding: 16px; +} + +.tg-card-border { + border: 1px solid var(--border-color); +} + +.tg-card--title { + margin: 0 0 16px; + color: var(--secondary-text-color); + font-size: 16px; + font-weight: 500; + text-transform: capitalize; +} +`); + +export { Card }; diff --git a/testgen/ui/static/js/components/chart_canvas.js b/testgen/ui/static/js/components/chart_canvas.js new file mode 100644 index 00000000..e2e13648 --- /dev/null +++ b/testgen/ui/static/js/components/chart_canvas.js @@ -0,0 +1,655 @@ +/** + * A container that renders a coordinate system and all the + * provided (compatible) chart components "co-centered" in the + * aforementioned coordinates. + * + * Functionalities: + * - display the axes and their ticks for the chart + * - display the hover-over elements, if any + * - allow zooming in and out + * + * @typedef Options + * @type {object} + * @property {number} width + * @property {number} height + * @property {Point[]} points + * @property {AxisConfigs?} axis + * @property {((point: Point) => SVGElement)?} legend + * @property {((getPoint: ((Point) => Point), showTooltip: ((message: string, point: Point) => void), hideTooltip: (() => void)) => SVGElement)?} markers + * + * @typedef Point + * @type {object} + * @property {number} x + * @property {number} y + * @property {number} originalX + * @property {number} originalY + * + * @typedef AxisConfigs + * @type {object} + * @property {SingleAxisConfig?} x + * @property {SingleAxisConfig?} y + * + * @typedef SingleAxisConfig + * @type {object} + * @property {any?} min + * @property {any?} max + * @property {string?} label + * @property {number?} ticksCount + * @property {boolean?} renderLine + * @property {boolean?} renderGridLines + * + *
@typedef ChartRenderer + * @type {((viewBox: ChartViewBox, area: DrawingArea, getPoint: ((Point) => Point)) => SVGElement)} + * + * @typedef ChartViewBox + * @type {object} + * @property {number} minX + * @property {number} minY + * @property {number} width + * @property {number} height + * + * @typedef DrawingArea + * @type {object} + * @property {Point} topLeft + * @property {Point} topRight + * @property {Point} bottomLeft + * @property {Point} bottomRight + */ +import van from '../van.min.js'; +import { afterMount, getRandomId, getValue, loadStylesheet } from '../utils.js'; +import { colorMap } from '../display_utils.js'; +import { formatSmartTimeTicks, getAdaptiveTimeTicks, niceTicks, scale, screenToSvgCoordinates } from '../axis_utils.js'; +import { Button } from './button.js'; +import { Tooltip, withTooltip } from './tooltip.js'; + +const { div } = van.tags; +const { clipPath, defs, foreignObject, g, line, rect, svg, text } = van.tags("http://www.w3.org/2000/svg"); + +const spacing = 8; +const topLegendHeight = spacing * 8; +const verticalAxisLabelWidth = spacing * 2; +const verticalAxisLabelLeftMargin = 5; +const verticalAxisTicksLeftMargin = spacing * 3; + +const horizontalAxisLabelHeight = spacing * 2; +const horizontalAxisTicksHeight = spacing * 6; +const horizontalAxisLabelBottomMargin = 0; +const horizontalAxisTicksBottomMargin = spacing * 5; + +const innerPaddingX = spacing * 3; +const innerPaddingY = spacing * 2; + +const cornerDash = 10; +const draggingOverlayColor = '#FFFFFF66'; + +const tickTextHeight = 14; + +const actionsWidth = 40; +const actionsHeight = 40; + +/** + * @param {Options} options + * @param {...ChartRenderer} charts + * @returns {HTMLDivElement} + */ +const ChartCanvas = (options, ...charts) => { + loadStylesheet('chartCanvas', stylesheet); + + const canvasWidth = van.state(0); + const canvasHeight = van.state(0); + + const topLeft = van.state({x: 0, y: 0}); + const topRight = van.state({x: 0, y: 0}); + const bottomLeft = 
van.state({x: 0, y: 0}); + const bottomRight = van.state({x: 0, y: 0}); + + const xAxisChartRange = van.state({min: 0, max: 0}); + const yAxisChartRange = van.state({min: 0, max: 0}); + + const xAxisLabel = van.state(null); + const xAxisDataRange = van.state({min: 0, max: 0}); + const initialXAxisDataRange = van.state({min: 0, max: 0}); + const xAxisTicksCount = van.state(8); + const xRenderLine = van.state(false); + const xRenderGridLines = van.state(true); + + const yAxisLabel = van.state(null); + const yAxisDataRange = van.state({min: 0, max: 0}); + const initialYAxisDataRange = van.state({min: 0, max: 0}); + const yAxisTicksCount = van.state(4); + const yRenderLine = van.state(false); + const yRenderGridLines = van.state(false); + + const legendRenderer = van.state(null); + const markersRenderer = van.state(null); + + const dataPoints = van.state([]); + const dataPointsMapping = van.state({}); + + const isZoomed = van.state(false); + const isDragZooming = van.state(false); + const dragZoomStartingPoint = van.state(null); + const dragZoomCurrentPoint = van.state(null); + const isHoveringOver = van.state(false); + + let /** @type {SVGElement?} */ interactiveLayerSvg; + + const DOMIdSuffix = getRandomId(); + const getDOMId = (domId) => `${domId}-${DOMIdSuffix}`; + + const asSVGX = (value) => scale(value, {old: xAxisDataRange.rawVal, new: xAxisChartRange.rawVal}, bottomLeft.rawVal.x); + const asSVGY = (value) => scale(value, {old: yAxisDataRange.rawVal, new: yAxisChartRange.rawVal}, bottomLeft.rawVal.y); + + van.derive(() => { + canvasWidth.val = getValue(options.width); + }); + + van.derive(() => { + canvasHeight.val = getValue(options.height); + }); + + van.derive(() => { + const axisConfig = getValue(options.axis); + const originalPoints = getValue(options.points); + + const xRange = {min: axisConfig?.x?.min, max: axisConfig?.x?.max}; + const yRange = {min: axisConfig?.y?.min, max: axisConfig?.y?.max}; + + if (!xRange.min || !xRange.max) { + const xAxisValues = 
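The `asSVGX`/`asSVGY` helpers above delegate to `scale` from `../axis_utils.js`, whose implementation is not part of this diff. A minimal linear-interpolation sketch consistent with the call sites (`scale(value, {old, new}, fallback)`) might look like this; the fallback behavior for a degenerate source range is an assumption:

```javascript
// Hypothetical sketch of the `scale` helper from axis_utils.js (the real
// implementation is not in this diff). It linearly maps `value` from the
// `ranges.old` interval into `ranges.new`, returning `fallback` when the
// source range has zero width.
function scale(value, ranges, fallback) {
    const oldSpan = ranges.old.max - ranges.old.min;
    if (oldSpan === 0) {
        return fallback; // assumed behavior for a degenerate range
    }
    const ratio = (value - ranges.old.min) / oldSpan;
    return ranges.new.min + ratio * (ranges.new.max - ranges.new.min);
}

// Mapping a data value of 50 from [0, 100] onto an SVG x range [40, 640]:
const svgX = scale(50, { old: { min: 0, max: 100 }, new: { min: 40, max: 640 } }, 0);
// svgX === 340
```

Note that for the y axis the chart range is inverted (`min` is the bottom pixel, `max` the top), so the same linear mapping naturally flips the direction.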
originalPoints.map(p => p.x); + xRange.min = Math.min(...xAxisValues); + xRange.max = Math.max(...xAxisValues); + } + + if (!yRange.min || !yRange.max) { + const yAxisValues = originalPoints.map(p => p.y); + yRange.min = Math.min(...yAxisValues); + yRange.max = Math.max(...yAxisValues); + } + + xAxisLabel.val = axisConfig?.x?.label ?? null; + xAxisTicksCount.val = axisConfig?.x?.ticksCount ?? 8; + xAxisDataRange.val = {min: xRange.min, max: xRange.max}; + initialXAxisDataRange.val = {...xAxisDataRange.rawVal}; + xRenderLine.val = axisConfig?.x?.renderLine ?? false; + xRenderGridLines.val = axisConfig?.x?.renderGridLines ?? false; + + yAxisLabel.val = axisConfig?.y?.label ?? null; + yAxisTicksCount.val = axisConfig?.y?.ticksCount ?? 4; + yAxisDataRange.val = {min: yRange.min, max: yRange.max}; + initialYAxisDataRange.val = {...yAxisDataRange.rawVal}; + yRenderLine.val = axisConfig?.y?.renderLine ?? false; + yRenderGridLines.val = axisConfig?.y?.renderGridLines ?? false; + }); + + van.derive(() => { + legendRenderer.val = getValue(options.legend); + }); + + van.derive(() => { + markersRenderer.val = getValue(options.markers); + }); + + van.derive(() => { + xAxisChartRange.val; + yAxisChartRange.val; + + const originalPoints = getValue(options.points); + const dataPoints_ = []; + const dataPointsMapping_ = {}; + + for (const original of originalPoints) { + const point = {x: asSVGX(original.x), y: asSVGY(original.y)}; + dataPoints_.push(point); + dataPointsMapping_[`${original.x}-${original.y}`] = point; + } + + dataPoints.val = dataPoints_; + dataPointsMapping.val = dataPointsMapping_; + }); + + const resizeChartBoundaries = () => { + const marginTop = topLegendHeight; + const marginBottom = (xAxisLabel.rawVal ? horizontalAxisLabelHeight : 0) + horizontalAxisTicksHeight; + + let marginLeft = (yAxisLabel.rawVal ? 
verticalAxisLabelWidth : 0) + spacing * 2; + const yAxisElement = document.getElementById(getDOMId('y-axis-ticks-group')); + if (yAxisElement) { + const box = yAxisElement.getBoundingClientRect(); + marginLeft += box.width; + } + + topLeft.val = {x: marginLeft, y: marginTop}; + topRight.val = {x: canvasWidth.rawVal, y: marginTop}; + bottomLeft.val = {x: marginLeft, y: Math.max(canvasHeight.rawVal - marginBottom, 0)}; + bottomRight.val = {x: canvasWidth.rawVal, y: Math.max(canvasHeight.rawVal - marginBottom, 0)}; + + xAxisChartRange.val = {min: bottomLeft.rawVal.x + innerPaddingX, max: bottomRight.rawVal.x - innerPaddingX}; + yAxisChartRange.val = {min: bottomLeft.rawVal.y - innerPaddingY, max: topLeft.rawVal.y + innerPaddingY}; + }; + + van.derive(() => { + canvasWidth.val; + canvasHeight.val; + resizeChartBoundaries(); + + xAxisDataRange.val = {...xAxisDataRange.rawVal}; + yAxisDataRange.val = {...yAxisDataRange.rawVal}; + }); + + const startDragZoom = (event) => { + interactiveLayerSvg = event.target.parentNode; + dragZoomStartingPoint.val = screenToSvgCoordinates(interactiveLayerSvg, event); + isDragZooming.val = true; + document.addEventListener('mousemove', updateDragZoomRect); + document.addEventListener('mouseup', stopDragZoom); + document.addEventListener('touchmove', updateDragZoomRect); + document.addEventListener('touchend', stopDragZoom); + }; + const updateDragZoomRect = (event) => { + if (isDragZooming.val) { + dragZoomCurrentPoint.val = screenToSvgCoordinates(interactiveLayerSvg, event); + } + }; + const stopDragZoom = (event) => { + document.removeEventListener('mousemove', updateDragZoomRect); + document.removeEventListener('mouseup', stopDragZoom); + document.removeEventListener('touchmove', updateDragZoomRect); + document.removeEventListener('touchend', stopDragZoom); + + const startingPoint = dragZoomStartingPoint.rawVal; + const currentPoint = screenToSvgCoordinates(interactiveLayerSvg, event); + + isDragZooming.val = false; + 
dragZoomStartingPoint.val = null; + dragZoomCurrentPoint.val = null; + + const selectedMinX = Math.min(startingPoint.x, currentPoint.x); + const selectedMaxX = Math.max(startingPoint.x, currentPoint.x); + const selectedMinY = Math.min(startingPoint.y, currentPoint.y); + const selectedMaxY = Math.max(startingPoint.y, currentPoint.y); + + const selectedWidth = selectedMaxX - selectedMinX; + const selectedHeight = selectedMaxY - selectedMinY; + + if (selectedWidth > 0 || selectedHeight > 0) { + const currentXDataRange = xAxisDataRange.rawVal; + const currentYDataRange = yAxisDataRange.rawVal; + const currentXChartRange = xAxisChartRange.rawVal; + const currentYChartRange = yAxisChartRange.rawVal; + + let newXDataMin = scale(selectedMinX, {old: currentXChartRange, new: currentXDataRange}, 0); + let newXDataMax = scale(selectedMaxX, {old: currentXChartRange, new: currentXDataRange}, 0); + let newYDataMin = scale(selectedMinY, {old: currentYChartRange, new: currentYDataRange}, 0); + let newYDataMax = scale(selectedMaxY, {old: currentYChartRange, new: currentYDataRange}, 0); + + if (newXDataMin > newXDataMax) [newXDataMin, newXDataMax] = [newXDataMax, newXDataMin]; + if (newYDataMin > newYDataMax) [newYDataMin, newYDataMax] = [newYDataMax, newYDataMin]; + + xAxisDataRange.val = {min: newXDataMin, max: newXDataMax}; + yAxisDataRange.val = {min: newYDataMin, max: newYDataMax}; + + isZoomed.val = true; + } + }; + + const getSharedDefinitions = (drawingAreaClipId, yAxisClipId, xAxisClipId) => defs( + {}, + clipPath( + {id: getDOMId(drawingAreaClipId)}, + () => rect({ + x: topLeft.val.x, + y: topLeft.val.y, + width: Math.max(bottomRight.val.x - bottomLeft.val.x, 0), + height: Math.max(bottomLeft.val.y - topLeft.val.y, 0), + }), + ), + yAxisClipId ? clipPath( + {id: getDOMId(yAxisClipId)}, + () => rect({ + x: 0, + y: topLeft.val.y - 10, + width: 999999.9, + height: Math.max(bottomLeft.val.y - topLeft.val.y, 0), + }), + ) : undefined, + xAxisClipId ?
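The range arithmetic in `stopDragZoom` above can be isolated as a pure function: a selection rectangle in chart (SVG) coordinates is mapped back into data coordinates and, after normalizing inverted axes, becomes the new axis range. The helper names below are illustrative, not part of the module:

```javascript
// Standalone sketch of the drag-zoom range math (names are illustrative).
// linmap mirrors what `scale` is assumed to do for non-degenerate ranges.
function linmap(value, from, to) {
    return to.min + ((value - from.min) / (from.max - from.min)) * (to.max - to.min);
}

function zoomedRange(selStart, selEnd, chartRange, dataRange) {
    let min = linmap(Math.min(selStart, selEnd), chartRange, dataRange);
    let max = linmap(Math.max(selStart, selEnd), chartRange, dataRange);
    // An inverted axis (e.g. SVG y grows downward) can flip the result order.
    if (min > max) [min, max] = [max, min];
    return { min, max };
}

// Selecting pixels 100..200 on an x axis drawn across 0..400 px for data 0..1000:
const r = zoomedRange(200, 100, { min: 0, max: 400 }, { min: 0, max: 1000 });
// r = { min: 250, max: 500 }
```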
clipPath( + {id: getDOMId(xAxisClipId)}, + () => rect({ + x: topLeft.val.x, + y: topLeft.val.y, + width: Math.max(bottomRight.val.x - bottomLeft.val.x, 0), + height: 999999.9, + }), + ) : undefined, + ); + + const resetZoom = () => { + isZoomed.val = false; + xAxisDataRange.val = {...initialXAxisDataRange.rawVal}; + yAxisDataRange.val = {...initialYAxisDataRange.rawVal}; + dataPoints.val = [...dataPoints.rawVal]; + }; + + const getPoint = (original) => { + let point = dataPointsMapping.rawVal[`${original.x}-${original.y}`]; + if (!point) { + point = {x: asSVGX(original.x), y: asSVGY(original.y)}; + } + return {...point, originalX: original.x, originalY: original.y}; + }; + + const tooltipText = van.state(''); + const shouldShowTooltip = van.state(false); + const tooltipExtraStyle = van.state(''); + const tooltipElement = Tooltip({ + text: tooltipText, + show: shouldShowTooltip, + position: '--', + style: tooltipExtraStyle, + }); + const showTooltip = (message, point) => { + let timeout; + + tooltipText.val = message; + tooltipExtraStyle.val = 'visibility: hidden;'; + shouldShowTooltip.val = true; + + timeout = setTimeout(() => { + const tooltipRect = tooltipElement.getBoundingClientRect(); + let tooltipX = point.x + 10; + let tooltipY = point.y + 10; + + if (tooltipX + tooltipRect.width >= bottomRight.rawVal.x) { + tooltipX = point.x - tooltipRect.width - 10; + } + + tooltipExtraStyle.val = `transform: translate(${tooltipX}px, ${tooltipY}px);`; + + clearTimeout(timeout); + }, 0); + }; + const hideTooltip = () => { + tooltipText.val = ''; + tooltipExtraStyle.val = ''; + shouldShowTooltip.val = false; + }; + + return div( + { + id: getDOMId('chart-canvas'), + class: 'tg-chart', + style: () => `width: ${canvasWidth.val}px; height: ${canvasHeight.val}px;`, + onmouseenter: () => isHoveringOver.val = true, + onmouseleave: () => isHoveringOver.val = false, + }, + svg( + { + width: '100%', + height: '100%', + style: 'z-index: 0;', + class: 'tg-chart-layer axis-layer', + 
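The `showTooltip` logic in this file offsets the tooltip 10px to the right of the hovered point and flips it to the left side when it would run past the chart's right boundary. Extracted as a pure function (names here are illustrative):

```javascript
// Sketch of showTooltip's horizontal placement rule: prefer the right side
// of the point, flip left when the tooltip would overflow the right edge.
function tooltipX(pointX, tooltipWidth, rightEdge, offset = 10) {
    const x = pointX + offset;
    if (x + tooltipWidth >= rightEdge) {
        return pointX - tooltipWidth - offset;
    }
    return x;
}

// A 120px-wide tooltip near an 800px right edge flips to the left:
tooltipX(700, 120, 800); // → 570
// With room to spare it stays on the right of the point:
tooltipX(100, 120, 800); // → 110
```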
viewBox: () => `0 0 ${canvasWidth.val} ${canvasHeight.val}`, + }, + getSharedDefinitions('axis-clippath', 'y-axis-ticks-clippath', 'x-axis-ticks-clippath'), + () => { + const maxY = canvasHeight.val; + const yLabelPos = {x: verticalAxisLabelLeftMargin, y: (bottomLeft.val.y - topLeft.val.y) / 2 + topLeft.val.y}; + const xLabelPos = {x: (bottomRight.val.x - bottomLeft.val.x) / 2, y: maxY - horizontalAxisLabelBottomMargin}; + + return g( + {}, + yAxisLabel.val ? text({...yLabelPos, 'text-anchor': 'middle', 'dominant-baseline': 'central', transform: `rotate(-90, ${yLabelPos.x}, ${yLabelPos.y})`, fill: 'var(--caption-text-color)'}, yAxisLabel.val) : null, + xAxisLabel.val ? text({...xLabelPos, fill: 'var(--caption-text-color)'}, xAxisLabel.val) : null, + ); + }, + () => { + const {min: yMin, max: yMax} = yAxisDataRange.val; + const ticks = niceTicks(yMin, yMax, yAxisTicksCount.val); + if (!yAxisLabel.val) { + return g(); + } + + afterMount(() => { + resizeChartBoundaries(); + }); + + return g( + {}, + g( + {id: getDOMId('y-axis-ticks-group'), 'clip-path': `url(#${getDOMId('y-axis-ticks-clippath')})`}, + ...ticks.map(value => { + const tickY = asSVGY(value); + if (tickY < topLeft.rawVal.y || (tickY + tickTextHeight) > bottomLeft.rawVal.y) { + return undefined; + } + + return text( + {x: verticalAxisTicksLeftMargin, y: tickY, class: 'text-small', 'dominant-baseline': 'central', fill: 'var(--caption-text-color)'}, + Math.floor(value * 1000) / 1000, + ); + }), + ), + () => yRenderGridLines.val ? 
g( + {'clip-path': `url(#${getDOMId('y-axis-ticks-clippath')})`}, + ...ticks.map(value => { + const tickY = asSVGY(value); + if (tickY < topLeft.rawVal.y || (tickY + tickTextHeight) > bottomLeft.rawVal.y) { + return undefined; + } + + return line({ + x1: bottomLeft.val.x, + y1: tickY, + x2: bottomRight.val.x, + y2: tickY, + stroke: colorMap.lightGrey, + }); + }), + ) : g(), + ); + }, + () => { + xAxisChartRange.val; + + const maxY = canvasHeight.val; + const {min: xMin, max: xMax} = xAxisDataRange.val; + const ticks = getAdaptiveTimeTicks([xMin, xMax], 4, 8); + const labels = formatSmartTimeTicks(ticks); + + return g( + {}, + g( + {id: getDOMId('x-axis-ticks-group'), 'clip-path': `url(#${getDOMId('x-axis-ticks-clippath')})`}, + ...ticks.map((value, idx) => { + const tickX = asSVGX(value.getTime()); + const labelLines = typeof labels[idx] === 'string' ? [labels[idx]] : labels[idx]; + return g( + {}, + labelLines.map((line, lineIdx) => text( + {x: tickX, y: maxY - horizontalAxisTicksBottomMargin + (lineIdx * 15), 'text-anchor': 'middle', 'dominant-baseline': 'central', class: 'text-small', fill: 'var(--caption-text-color)'}, + line, + )), + ); + }), + ), + () => xRenderGridLines.val ? g( + {'clip-path': `url(#${getDOMId('x-axis-ticks-clippath')})`}, + ...ticks.map(value => { + const tickX = asSVGX(value.getTime()); + + return line({ + x1: tickX, + y1: bottomRight.val.y, + x2: tickX, + y2: topRight.val.y, + stroke: colorMap.lightGrey, + }); + }), + ) : g(), + ); + }, + g( + {}, + () => yRenderLine.val ? line({x1: bottomLeft.val.x, y1: bottomLeft.val.y, x2: topLeft.val.x, y2: topLeft.val.y, stroke: colorMap.grey }) : g(), + () => xRenderLine.val ? 
line({x1: bottomLeft.val.x, y1: bottomLeft.val.y, x2: bottomRight.val.x, y2: bottomRight.val.y, stroke: colorMap.grey }) : g(), + ), + ), + svg( + { + width: '100%', + height: '100%', + style: 'z-index: 2;', + class: 'tg-chart-layer interactive-layer', + viewBox: () => `0 0 ${canvasWidth.val} ${canvasHeight.val}`, + }, + getSharedDefinitions('markers-clippath'), + () => { + const width = bottomRight.val.x - bottomLeft.val.x; + const height = bottomLeft.val.y - topLeft.val.y; + + return rect({ + x: topLeft.val.x, + y: topLeft.val.y, + width: Math.max(width, 0), + height: Math.max(height, 0), + fill: isDragZooming.val ? draggingOverlayColor : 'transparent', + ontouchstart: startDragZoom, + onmousedown: startDragZoom, + }); + }, + () => { + const children = []; + if (legendRenderer.val) { + children.push( + legendRenderer.rawVal({y: 20, x: topLeft.val.x}), + ); + } + + if (markersRenderer.val) { + children.push( + g( + {'clip-path': `url(#${getDOMId('markers-clippath')})`}, + markersRenderer.rawVal(getPoint, showTooltip, hideTooltip), + ) + ); + } + + if (isHoveringOver.val) { + children.push( + foreignObject( + {y: 0, x: canvasWidth.val - actionsWidth - (spacing * 2), width: actionsWidth, height: actionsHeight, class: 'visible-overflow'}, + withTooltip( + Button({ + type: 'icon', + icon: 'zoom_out_map', + iconSize: 20, + style: 'overflow: visible;', + onclick: resetZoom, + }), + {position: 'bottom-left', text: 'Autoscale'}, + ), + ) + ); + } + + if (children.length <= 0) { + children.push(g()); + } + + return g( + {class: 'visible-overflow'}, + ...children, + ); + }, + () => { + const isDragging = isDragZooming.val; + const currentPoint = dragZoomCurrentPoint.val; + const startingPoint = dragZoomStartingPoint.rawVal; + if (!isDragging || !currentPoint || !startingPoint) { + return g(); // NOTE: vanjs+svg might have an issue; if this is null, subsequent state changes won't trigger this reactive function + } + + const x = Math.min(startingPoint.x, currentPoint.x); +
const y = Math.min(startingPoint.y, currentPoint.y); + const rectHeight = Math.abs(currentPoint?.y - startingPoint?.y); + const rectWidth = Math.abs(currentPoint?.x - startingPoint?.x); + + const strokeDashArray = [ + cornerDash, + rectWidth - cornerDash*2, + cornerDash + 0.001, + 0.001, + cornerDash, + rectHeight - cornerDash*2, + cornerDash, + 0.001, + cornerDash, + rectWidth - cornerDash*2, + cornerDash, + 0.001, + cornerDash, + rectHeight - cornerDash*2, + cornerDash, + 0.001, + ]; + + return g( + {style: 'z-index: 3;'}, + rect({ + x: x, + y: y, + width: rectWidth, + height: rectHeight, + fill: 'transparent', + stroke: colorMap.grey, + 'stroke-width': 3, + 'stroke-dasharray': strokeDashArray.join(','), + }), + ); + }, + foreignObject({fill: 'none', width: '100%', height: '100%', 'pointer-events': 'none', style: 'overflow: visible;'}, tooltipElement), + ), + svg( + { + width: '100%', + height: '100%', + style: 'z-index: 1;', + viewBox: () => `0 0 ${canvasWidth.val} ${canvasHeight.val}`, + }, + getSharedDefinitions('charts-clippath'), + g( + {'clip-path': `url(#${getDOMId('charts-clippath')})`}, + ...charts.map((renderer) => () => { + const dataPointsMapping_ = dataPointsMapping.val; + if (Object.keys(dataPointsMapping_).length <= 0) { + return g(); + } + + return renderer( + { minX: 0, minY: 0, width: canvasWidth.val, height: canvasHeight.val }, + { topLeft: topLeft.val, topRight: topRight.val, bottomLeft: bottomLeft.val, bottomRight: bottomRight.val }, + getPoint, + ); + }), + ), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-chart { + position: relative; +} + +.tg-chart > svg { + z-index: 1; +} + +.tg-chart > svg { + position: absolute; +} +`); + +export { ChartCanvas }; diff --git a/testgen/ui/static/js/components/checkbox.js b/testgen/ui/static/js/components/checkbox.js new file mode 100644 index 00000000..45591ecc --- /dev/null +++ b/testgen/ui/static/js/components/checkbox.js @@ -0,0 +1,116 @@ +/** + * @typedef 
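The `strokeDashArray` built above is a corner-bracket trick: alternating dash/gap lengths are chosen so that only `cornerDash`-long segments are painted at each corner of the selection rectangle (the stroke starts at the rect's top-left and runs clockwise), with 0.001 entries acting as near-zero spacers to keep the dash/gap alternation aligned. A standalone sketch:

```javascript
// Sketch of the corner-only dash pattern for a w-by-h rect (illustrative
// helper, not part of the module). Entries alternate dash, gap, dash, gap...
function cornerDashArray(w, h, corner) {
    const eps = 0.001; // tiny spacer to keep the alternation in phase
    return [
        corner, w - corner * 2, corner + eps, eps, // top edge, top-right corner
        corner, h - corner * 2, corner, eps,       // right edge, bottom-right corner
        corner, w - corner * 2, corner, eps,       // bottom edge, bottom-left corner
        corner, h - corner * 2, corner, eps,       // left edge back to the start
    ];
}

// The dashes and gaps should add up to (roughly) the rect's perimeter:
const dashes = cornerDashArray(200, 100, 10);
const total = dashes.reduce((a, b) => a + b, 0);
// total is ~600 for a 200x100 rect (plus the tiny eps spacers)
```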
Properties + * @type {object} + * @property {string?} name + * @property {string} label + * @property {string?} help + * @property {boolean?} checked + * @property {boolean?} indeterminate + * @property {function(boolean, Event)?} onChange + * @property {number?} width + * @property {string?} testId + * @property {boolean?} disabled + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { withTooltip } from './tooltip.js'; +import { Icon } from './icon.js'; + +const { input, label, span } = van.tags; + +const Checkbox = (/** @type Properties */ props) => { + loadStylesheet('checkbox', stylesheet); + + return label( + { + class: 'flex-row fx-gap-2 clickable', + 'data-testid': props.testId ?? props.name ?? '', + style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}`, + }, + input({ + type: 'checkbox', + name: props.name ?? '', + class: 'tg-checkbox--input clickable', + checked: props.checked, + indeterminate: props.indeterminate, + onchange: van.derive(() => { + const onChange = props.onChange?.val ?? props.onChange; + return onChange ? (/** @type Event */ event) => onChange(event.target.checked, event) : null; + }), + disabled: props.disabled ?? false, + }), + span({'data-testid': 'checkbox-label'}, props.label), + () => getValue(props.help) + ? 
withTooltip( + Icon({ size: 16, classes: 'text-disabled' }, 'help'), + { text: props.help, position: 'top', width: 200 } + ) + : null, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-checkbox--input { + appearance: none; + box-sizing: border-box; + margin: 0; + width: 18px; + height: 18px; + flex-shrink: 0; + border: 1px solid var(--secondary-text-color); + border-radius: 4px; + position: relative; + transition-property: border-color, background-color; + transition-duration: 0.3s; +} + +.tg-checkbox--input:focus, +.tg-checkbox--input:focus-visible { + outline: none; +} + +.tg-checkbox--input:focus-visible::before { + content: ''; + box-sizing: border-box; + position: absolute; + top: -4px; + left: -4px; + width: 24px; + height: 24px; + border: 3px solid var(--border-color); + border-radius: 7px; +} + +.tg-checkbox--input:checked, +.tg-checkbox--input:indeterminate { + border-color: transparent; + background-color: var(--primary-color); +} + +.tg-checkbox--input:checked:disabled, +.tg-checkbox--input:indeterminate:disabled { + cursor: not-allowed; + background-color: var(--disabled-text-color); +} + +.tg-checkbox--input:checked::after, +.tg-checkbox--input:indeterminate::after { + position: absolute; + top: -4px; + left: -3px; + font-family: 'Material Symbols Rounded'; + font-size: 22px; + color: white; +} + +.tg-checkbox--input:checked::after { + content: 'check'; +} + +.tg-checkbox--input:indeterminate::after { + content: 'check_indeterminate_small'; +} +`); + +export { Checkbox }; diff --git a/testgen/ui/static/js/components/code.js b/testgen/ui/static/js/components/code.js new file mode 100644 index 00000000..4f9f6ba7 --- /dev/null +++ b/testgen/ui/static/js/components/code.js @@ -0,0 +1,43 @@ +/** + * @typedef Options + * @type {object} + * @property {string?} id + * @property {string?} testId + * @property {string?} class + */ + +import van from '../van.min.js'; +import { getRandomId } from '../utils.js'; +import { Icon } from 
'./icon.js'; + +const { code } = van.tags; + +/** + * + * @param {Options} options + * @param {...HTMLElement} children + */ +const Code = (options, ...children) => { + const domId = options.id ?? `code-snippet-${getRandomId()}`; + const icon = 'content_copy'; + + return code( + { ...options, id: domId, class: options.class ?? '', 'data-testid': options.testId ?? '' }, + ...children, + Icon( + { + classes: '', + onclick: () => { + const parentElement = document.getElementById(domId); + const content = (parentElement.textContent || parentElement.innerText).replace(icon, ''); + if (content) { + navigator.clipboard.writeText(content); + } + }, + }, + 'content_copy', + ), + ); +}; + +export { Code }; diff --git a/testgen/ui/static/js/components/connection_form.js b/testgen/ui/static/js/components/connection_form.js new file mode 100644 index 00000000..011e425a --- /dev/null +++ b/testgen/ui/static/js/components/connection_form.js @@ -0,0 +1,1343 @@ +/** + * @import { FileValue } from './file_input.js'; + * @import { VanState } from '../van.min.js'; + * + * @typedef Flavor + * @type {object} + * @property {string} label + * @property {string} value + * @property {string} icon + * @property {string} flavor + * @property {string} connection_string + * + * @typedef ConnectionStatus + * @type {object} + * @property {string} message + * @property {boolean} successful + * @property {string?} details + * + * @typedef Connection + * @type {object} + * @property {string} connection_id + * @property {string} connection_name + * @property {string} sql_flavor + * @property {string} sql_flavor_code + * @property {string} project_code + * @property {string} project_host + * @property {string} project_port + * @property {string} project_db + * @property {string} project_user + * @property {string} project_pw_encrypted + * @property {boolean} connect_by_url + * @property {string?} url + * @property {boolean} connect_by_key + * @property {boolean} connect_with_identity + * @property 
{string?} private_key + * @property {string?} private_key_passphrase + * @property {string?} http_path + * @property {string?} warehouse + * @property {ConnectionStatus?} status + * + * @typedef FormState + * @type {object} + * @property {boolean} dirty + * @property {boolean} valid + * + * @typedef FieldsCache + * @type {object} + * @property {FileValue} privateKey + * @property {FileValue} serviceAccountKey + * + * @typedef Properties + * @type {object} + * @property {Connection} connection + * @property {Array.<Flavor>} flavors + * @property {boolean} disableFlavor + * @property {FileValue?} cachedPrivateKeyFile + * @property {FileValue?} cachedServiceAccountKeyFile + * @property {string?} dynamicConnectionUrl + * @property {(c: Connection, state: FormState, cache?: FieldsCache) => void} onChange + */ +import van from '../van.min.js'; +import { Button } from './button.js'; +import { Alert } from './alert.js'; +import { getValue, emitEvent, loadStylesheet, isEqual } from '../utils.js'; +import { Input } from './input.js'; +import { Slider } from './slider.js'; +import { Select } from './select.js'; +import { maxLength, minLength, required, requiredIf, sizeLimit } from '../form_validators.js'; +import { RadioGroup } from './radio_group.js'; +import { FileInput } from './file_input.js'; +import { ExpansionPanel } from './expansion_panel.js'; +import { Caption } from './caption.js'; + +const { div, span } = van.tags; +const clearSentinel = ''; +const secretsPlaceholder = ''; +const defaultPorts = { + redshift: '5439', + redshift_spectrum: '5439', + azure_mssql: '1433', + synapse_mssql: '1433', + mssql: '1433', + postgresql: '5432', + snowflake: '443', + databricks: '443', +}; + +/** + * + * @param {Properties} props + * @param {(any|undefined)} saveButton + * @returns {HTMLElement} + */ +const ConnectionForm = (props, saveButton) => { + loadStylesheet('connectionform', stylesheet); + + const connection = getValue(props.connection); + const isEditMode =
!!connection?.connection_id; + const defaultPort = defaultPorts[connection?.sql_flavor]; + + const connectionStatus = van.state(undefined); + van.derive(() => { + connectionStatus.val = getValue(props.connection)?.status; + }); + + const connectionFlavor = van.state(connection?.sql_flavor_code); + const connectionName = van.state(connection?.connection_name ?? ''); + const connectionMaxThreads = van.state(connection?.max_threads ?? 4); + const connectionQueryChars = van.state(connection?.max_query_chars ?? 20000); + const privateKeyFile = van.state(getValue(props.cachedPrivateKeyFile) ?? null); + const serviceAccountKeyFile = van.state(getValue(props.cachedServiceAccountKeyFile) ?? null); + + const updatedConnection = van.state({ + project_code: connection.project_code, + connection_id: connection.connection_id, + sql_flavor: connection?.sql_flavor ?? undefined, + project_host: connection?.project_host ?? '', + project_port: connection?.project_port ?? defaultPort ?? '', + project_db: connection?.project_db ?? '', + project_user: connection?.project_user ?? '', + project_pw_encrypted: isEditMode ? '' : (connection?.project_pw_encrypted ?? ''), + connect_by_url: connection?.connect_by_url ?? false, + connect_by_key: connection?.connect_by_key ?? false, + private_key: isEditMode ? '' : (connection?.private_key ?? ''), + private_key_passphrase: isEditMode ? '' : (connection?.private_key_passphrase ?? ''), + http_path: connection?.http_path ?? '', + warehouse: connection?.warehouse ?? '', + url: connection?.url ?? '', + service_account_key: connection?.service_account_key ?? '', + connect_with_identity: connection?.connect_with_identity ?? false, + sql_flavor_code: connectionFlavor.rawVal ?? '', + connection_name: connectionName.rawVal ?? '', + max_threads: connectionMaxThreads.rawVal ?? 4, + max_query_chars: connectionQueryChars.rawVal ?? 20000, + }); + const dynamicConnectionUrl = van.state(props.dynamicConnectionUrl?.rawVal ?? 
''); + + van.derive(() => { + const previousValue = updatedConnection.oldVal; + const currentValue = updatedConnection.rawVal; + + if (shouldRefreshUrl(previousValue, currentValue)) { + emitEvent('ConnectionUpdated', {payload: updatedConnection.rawVal}); + } + }); + + van.derive(() => { + const updatedUrl = getValue(props.dynamicConnectionUrl); + dynamicConnectionUrl.val = updatedUrl; + }); + + const dirty = van.derive(() => !isEqual(updatedConnection.val, connection)); + const validityPerField = van.state({}); + + const authenticationForms = { + redshift: () => RedshiftForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('redshift_form', isValid); + }, + connection, + dynamicConnectionUrl, + ), + redshift_spectrum: () => RedshiftSpectrumForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('redshift_spectrum_form', isValid); + }, + connection, + ), + azure_mssql: () => AzureMSSQLForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('mssql_form', isValid); + }, + connection, + dynamicConnectionUrl, + ), + synapse_mssql: () => SynapseMSSQLForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('mssql_form', isValid); + }, + connection, + dynamicConnectionUrl, + ), + mssql: () => MSSQLForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + 
updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('mssql_form', isValid); + }, + connection, + dynamicConnectionUrl, + ), + postgresql: () => PostgresqlForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('postgresql_form', isValid); + }, + connection, + dynamicConnectionUrl, + ), + snowflake: () => SnowflakeForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, fileValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + privateKeyFile.val = fileValue; + setFieldValidity('snowflake_form', isValid); + }, + connection, + getValue(props.cachedPrivateKeyFile) ?? null, + dynamicConnectionUrl, + ), + databricks: () => DatabricksForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + setFieldValidity('databricks_form', isValid); + }, + connection, + dynamicConnectionUrl, + ), + bigquery: () => BigqueryForm( + updatedConnection, + getValue(props.flavors).find(f => f.value === connectionFlavor.rawVal), + (formValue, fileValue, isValid) => { + updatedConnection.val = {...updatedConnection.val, ...formValue}; + serviceAccountKeyFile.val = fileValue; + setFieldValidity('bigquery_form', isValid); + }, + connection, + getValue(props.cachedServiceAccountKeyFile) ?? 
null + ), + }; + + const setFieldValidity = (field, validity) => { + validityPerField.val = {...validityPerField.val, [field]: validity}; + } + + const authenticationForm = van.derive(() => { + const selectedFlavorCode = connectionFlavor.val; + validityPerField.val = {connection_name: validityPerField.val.connection_name}; + const flavor = getValue(props.flavors).find(f => f.value === selectedFlavorCode); + return authenticationForms[flavor.value](); + }); + + van.derive(() => { + const selectedFlavorCode = connectionFlavor.val; + const previousFlavorCode = connectionFlavor.oldVal; + const updatedConnection_ = updatedConnection.rawVal; + + const isCustomPort = updatedConnection_?.project_port !== defaultPorts[previousFlavorCode]; + if (selectedFlavorCode !== previousFlavorCode && (!isCustomPort || !updatedConnection_?.project_port)) { + updatedConnection.val = {...updatedConnection_, project_port: defaultPorts[selectedFlavorCode]}; + } + }); + + van.derive(() => { + const selectedFlavor = connectionFlavor.val; + const flavorObject = getValue(props.flavors).find(f => f.value === selectedFlavor); + + updatedConnection.val = { + ...updatedConnection.val, + sql_flavor: flavorObject.flavor, + sql_flavor_code: flavorObject.value, + connection_name: connectionName.val, + max_threads: connectionMaxThreads.val, + max_query_chars: connectionQueryChars.val, + }; + }); + + van.derive(() => { + const fieldsValidity = validityPerField.val; + const isValid = Object.keys(fieldsValidity).length > 0 && + Object.values(fieldsValidity).every(v => v); + props.onChange?.( + updatedConnection.val, + { dirty: dirty.val, valid: isValid }, + { privateKey: privateKeyFile.rawVal, serviceAccountKey: serviceAccountKeyFile.rawVal } + ); + }); + + return div( + { class: 'flex-column fx-gap-3 fx-align-stretch', style: 'overflow-y: auto;' }, + Select({ + label: 'Database Type', + value: connectionFlavor, + options: props.flavors, + disabled: props.disableFlavor, + help: 'Type of database server to 
connect to. This determines the database driver and SQL dialect that will be used by TestGen.', + testId: 'sql_flavor', + }), + Input({ + name: 'connection_name', + label: 'Connection Name', + value: connectionName, + help: 'Unique name to describe the connection', + onChange: (value, state) => { + connectionName.val = value; + setFieldValidity('connection_name', state.valid); + }, + validators: [ required, minLength(3), maxLength(40) ], + }), + + authenticationForm, + + ExpansionPanel( + { + title: 'Advanced Tuning', + }, + div( + { class: 'flex-row fx-gap-3' }, + Slider({ + label: 'Max Threads', + hint: 'Maximum number of concurrent threads that run tests. Default values should be retained unless test queries are failing.', + value: connectionMaxThreads.rawVal, + min: 1, + max: 8, + onChange: (value) => connectionMaxThreads.val = value, + }), + Slider({ + label: 'Max Expression Length', + hint: 'Some tests are consolidated into queries for maximum performance. Default values should be retained unless test queries are failing.', + value: connectionQueryChars.rawVal, + min: 500, + max: 50000, + onChange: (value) => connectionQueryChars.val = value, + }), + ), + ), + + div( + { class: 'flex-row fx-gap-3 fx-justify-space-between' }, + Button({ + label: 'Test Connection', + color: 'basic', + type: 'stroked', + width: 'auto', + onclick: () => emitEvent('TestConnectionClicked', { payload: updatedConnection.val }), + }), + saveButton, + ), + () => { + return connectionStatus.val + ? Alert( + { + type: connectionStatus.val.successful ? 'success' : 'error', + closeable: true, + onClose: () => connectionStatus.val = undefined, + }, + div( + { class: 'flex-column' }, + span(connectionStatus.val.message), + connectionStatus.val.details ? 
span(connectionStatus.val.details) : '', + ) + ) + : ''; + }, + ); +}; + +/** + * @param {VanState} connection + * @param {Flavor} flavor + * @param {(params: Partial, isValid: boolean) => void} onChange + * @param {Connection?} originalConnection + * @param {VanState} dynamicConnectionUrl + * @returns {HTMLElement} + */ +const RedshiftForm = ( + connection, + flavor, + onChange, + originalConnection, + dynamicConnectionUrl, +) => { + const isValid = van.state(true); + const connectByUrl = van.state(connection.rawVal.connect_by_url ?? false); + const connectionHost = van.state(connection.rawVal.project_host ?? ''); + const connectionPort = van.state(connection.rawVal.project_port || defaultPorts[flavor.flavor]); + const connectionDatabase = van.state(connection.rawVal.project_db ?? ''); + const connectionUsername = van.state(connection.rawVal.project_user ?? ''); + const connectionPassword = van.state(connection.rawVal?.project_pw_encrypted ?? ''); + const connectionUrl = van.state(connection.rawVal?.url ?? ''); + + const validityPerField = {}; + + van.derive(() => { + onChange({ + project_host: connectionHost.val, + project_port: connectionPort.val, + project_db: connectionDatabase.val, + project_user: connectionUsername.val, + project_pw_encrypted: connectionPassword.val, + connect_by_url: connectByUrl.val, + url: connectByUrl.val ? connectionUrl.val : connectionUrl.rawVal, + connect_by_key: false, + }, isValid.val); + }); + + van.derive(() => { + const newUrlValue = (dynamicConnectionUrl.val ?? 
'').replace(extractPrefix(dynamicConnectionUrl.rawVal), ''); + if (!connectByUrl.rawVal) { + connectionUrl.val = newUrlValue; + } + }); + + return div( + {class: 'flex-column fx-gap-3 fx-flex'}, + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Server', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + RadioGroup({ + label: 'Connect by', + options: [ + { + label: 'Host', + value: false, + }, + { + label: 'URL', + value: true, + }, + ], + value: connectByUrl, + onChange: (value) => connectByUrl.val = value, + layout: 'inline', + }), + div( + { class: 'flex-row fx-gap-3 fx-flex' }, + Input({ + name: 'db_host', + label: 'Host', + value: connectionHost, + class: 'fx-flex', + disabled: connectByUrl, + onChange: (value, state) => { + connectionHost.val = value; + validityPerField['db_host'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + maxLength(250), + requiredIf(() => !connectByUrl.val), + ], + }), + Input({ + name: 'db_port', + label: 'Port', + value: connectionPort, + type: 'number', + disabled: connectByUrl, + onChange: (value, state) => { + connectionPort.val = value; + validityPerField['db_port'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + minLength(3), + maxLength(5), + requiredIf(() => !connectByUrl.val), + ], + }) + ), + Input({ + name: 'db_name', + label: 'Database', + value: connectionDatabase, + disabled: connectByUrl, + onChange: (value, state) => { + connectionDatabase.val = value; + validityPerField['db_name'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + maxLength(100), + requiredIf(() => !connectByUrl.val), + ], + }), + () => div( + { class: 'flex-row fx-gap-3 fx-align-stretch', style: 'position: relative;' }, + Input({ + label: 'URL', + value: connectionUrl, 
+ class: 'fx-flex', + name: 'url_suffix', + prefix: span({ style: 'white-space: nowrap; color: var(--disabled-text-color)' }, extractPrefix(dynamicConnectionUrl.val)), + disabled: !connectByUrl.val, + onChange: (value, state) => { + connectionUrl.val = value; + validityPerField['url_suffix'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => connectByUrl.val), + ], + }), + ), + ), + + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Authentication', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + Input({ + name: 'db_user', + label: 'Username', + value: connectionUsername, + onChange: (value, state) => { + connectionUsername.val = value; + validityPerField['db_user'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + required, + maxLength(50), + ], + }), + Input({ + name: 'password', + label: 'Password', + value: connectionPassword, + type: 'password', + passwordSuggestions: false, + placeholder: (originalConnection?.connection_id && originalConnection?.project_pw_encrypted) ? secretsPlaceholder : '', + onChange: (value, state) => { + connectionPassword.val = value; + validityPerField['password'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + }), + ), + ); +}; + +const RedshiftSpectrumForm = RedshiftForm; + +const PostgresqlForm = RedshiftForm; + +const AzureMSSQLForm = ( + connection, + flavor, + onChange, + originalConnection, + dynamicConnectionUrl, +) => { + const isValid = van.state(true); + const connectByUrl = van.state(connection.rawVal.connect_by_url ?? false); + const connectionHost = van.state(connection.rawVal.project_host ?? 
''); + const connectionPort = van.state(connection.rawVal.project_port || defaultPorts[flavor.flavor]); + const connectionDatabase = van.state(connection.rawVal.project_db ?? ''); + const connectionUsername = van.state(connection.rawVal.project_user ?? ''); + const connectionPassword = van.state(connection.rawVal?.project_pw_encrypted ?? ''); + const connectionUrl = van.state(connection.rawVal?.url ?? ''); + const connectWithIdentity = van.state(connection.rawVal?.connect_with_identity ?? false); + + const validityPerField = {}; + + van.derive(() => { + onChange({ + project_host: connectionHost.val, + project_port: connectionPort.val, + project_db: connectionDatabase.val, + project_user: connectionUsername.val, + project_pw_encrypted: connectionPassword.val, + connect_by_url: connectByUrl.val, + url: connectByUrl.val ? connectionUrl.val : connectionUrl.rawVal, + connect_by_key: false, + connect_with_identity: connectWithIdentity.val, + }, isValid.val); + }); + + van.derive(() => { + const newUrlValue = (dynamicConnectionUrl.val ?? 
'').replace(extractPrefix(dynamicConnectionUrl.rawVal), ''); + if (!connectByUrl.rawVal) { + connectionUrl.val = newUrlValue; + } + }); + + return div( + {class: 'flex-column fx-gap-3 fx-flex'}, + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Server', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + RadioGroup({ + label: 'Connect by', + options: [ + { + label: 'Host', + value: false, + }, + { + label: 'URL', + value: true, + }, + ], + value: connectByUrl, + onChange: (value) => connectByUrl.val = value, + layout: 'inline', + }), + div( + { class: 'flex-row fx-gap-3 fx-flex' }, + Input({ + name: 'db_host', + label: 'Host', + value: connectionHost, + class: 'fx-flex', + disabled: connectByUrl, + onChange: (value, state) => { + connectionHost.val = value; + validityPerField['db_host'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + maxLength(250), + requiredIf(() => !connectByUrl.val), + ], + }), + Input({ + name: 'db_port', + label: 'Port', + value: connectionPort, + type: 'number', + disabled: connectByUrl, + onChange: (value, state) => { + connectionPort.val = value; + validityPerField['db_port'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + minLength(3), + maxLength(5), + requiredIf(() => !connectByUrl.val), + ], + }) + ), + Input({ + name: 'db_name', + label: 'Database', + value: connectionDatabase, + disabled: connectByUrl, + onChange: (value, state) => { + connectionDatabase.val = value; + validityPerField['db_name'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + maxLength(100), + requiredIf(() => !connectByUrl.val), + ], + }), + () => div( + { class: 'flex-row fx-gap-3 fx-align-stretch', style: 'position: relative;' }, + Input({ + label: 'URL', + value: connectionUrl, 
+ class: 'fx-flex', + name: 'url_suffix', + prefix: span({ style: 'white-space: nowrap; color: var(--disabled-text-color)' }, extractPrefix(dynamicConnectionUrl.val)), + disabled: !connectByUrl.val, + onChange: (value, state) => { + connectionUrl.val = value; + validityPerField['url_suffix'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => connectByUrl.val), + ], + }), + ), + ), + + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Authentication', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + RadioGroup({ + label: 'Connection Strategy', + options: [ + {label: 'Connect By Password', value: false}, + {label: 'Connect with Managed Identity', value: true}, + ], + value: connectWithIdentity, + onChange: (value) => connectWithIdentity.val = value, + layout: 'inline', + }), + + () => { + const _connectWithIdentity = connectWithIdentity.val; + if (_connectWithIdentity) { + return div( + {class: 'flex-row p-4 fx-justify-center text-secondary'}, + 'Microsoft Entra ID credentials configured on host machine will be used', + ); + } + + return div( + {class: 'flex-column fx-gap-1'}, + Input({ + name: 'db_user', + label: 'Username', + value: connectionUsername, + onChange: (value, state) => { + connectionUsername.val = value; + validityPerField['db_user'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectWithIdentity.val), + maxLength(50), + ], + }), + Input({ + name: 'password', + label: 'Password', + value: connectionPassword, + type: 'password', + passwordSuggestions: false, + placeholder: (originalConnection?.connection_id && originalConnection?.project_pw_encrypted) ? 
secretsPlaceholder : '', + onChange: (value, state) => { + connectionPassword.val = value; + validityPerField['password'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + }), + ); + }, + ), + ); +}; + +const SynapseMSSQLForm = RedshiftForm; + +const MSSQLForm = RedshiftForm; + +/** + * @param {VanState} connection + * @param {Flavor} flavor + * @param {(params: Partial, isValid: boolean) => void} onChange + * @param {Connection?} originalConnection + * @param {VanState} dynamicConnectionUrl + * @returns {HTMLElement} + */ +const DatabricksForm = ( + connection, + flavor, + onChange, + originalConnection, + dynamicConnectionUrl, +) => { + const isValid = van.state(true); + const connectByUrl = van.state(connection.rawVal?.connect_by_url ?? false); + const connectionHost = van.state(connection.rawVal?.project_host ?? ''); + const connectionPort = van.state(connection.rawVal?.project_port || defaultPorts[flavor.flavor]); + const connectionHttpPath = van.state(connection.rawVal?.http_path ?? ''); + const connectionDatabase = van.state(connection.rawVal?.project_db ?? ''); + const connectionUsername = van.state(connection.rawVal?.project_user ?? ''); + const connectionPassword = van.state(connection.rawVal?.project_pw_encrypted ?? ''); + const connectionUrl = van.state(connection.rawVal?.url ?? ''); + + const validityPerField = {}; + + van.derive(() => { + onChange({ + project_host: connectionHost.val, + project_port: connectionPort.val, + project_db: connectionDatabase.val, + project_user: connectionUsername.val, + project_pw_encrypted: connectionPassword.val, + http_path: connectionHttpPath.val, + connect_by_url: connectByUrl.val, + url: connectByUrl.val ? connectionUrl.val : connectionUrl.rawVal, + connect_by_key: false, + }, isValid.val); + }); + + van.derive(() => { + const newUrlValue = (dynamicConnectionUrl.val ?? 
'').replace(extractPrefix(dynamicConnectionUrl.rawVal), ''); + if (!connectByUrl.rawVal) { + connectionUrl.val = newUrlValue; + } + }); + + return div( + {class: 'flex-column fx-gap-3 fx-flex'}, + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Server', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + RadioGroup({ + label: 'Connect by', + options: [ + { + label: 'Host', + value: false, + }, + { + label: 'URL', + value: true, + }, + ], + value: connectByUrl, + onChange: (value) => connectByUrl.val = value, + layout: 'inline', + }), + div( + { class: 'flex-row fx-gap-3 fx-flex' }, + Input({ + name: 'db_host', + label: 'Host', + value: connectionHost, + class: 'fx-flex', + disabled: connectByUrl, + onChange: (value, state) => { + connectionHost.val = value; + validityPerField['db_host'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + maxLength(250), + ], + }), + Input({ + name: 'db_port', + label: 'Port', + value: connectionPort, + type: 'number', + disabled: connectByUrl, + onChange: (value, state) => { + connectionPort.val = value; + validityPerField['db_port'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + minLength(3), + maxLength(5), + ], + }) + ), + Input({ + label: 'HTTP Path', + value: connectionHttpPath, + class: 'fx-flex', + name: 'http_path', + disabled: connectByUrl, + onChange: (value, state) => { + connectionHttpPath.val = value; + validityPerField['http_path'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + maxLength(50), + ], + }), + Input({ + name: 'db_name', + label: 'Database', + value: connectionDatabase, + disabled: connectByUrl, + onChange: 
(value, state) => { + connectionDatabase.val = value; + validityPerField['db_name'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + maxLength(100), + ], + }), + () => div( + { class: 'flex-row fx-gap-3 fx-align-stretch', style: 'position: relative;' }, + Input({ + label: 'URL', + value: connectionUrl, + class: 'fx-flex', + name: 'url_suffix', + prefix: span({ style: 'white-space: nowrap; color: var(--disabled-text-color)' }, extractPrefix(dynamicConnectionUrl.val)), + disabled: !connectByUrl.val, + onChange: (value, state) => { + connectionUrl.val = value; + validityPerField['url_suffix'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => connectByUrl.val), + ], + }), + ), + ), + + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Authentication', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + Input({ + name: 'db_user', + label: 'Username', + value: connectionUsername, + onChange: (value, state) => { + connectionUsername.val = value; + validityPerField['db_user'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + required, + maxLength(50), + ], + }), + Input({ + name: 'password', + label: 'Password', + value: connectionPassword, + type: 'password', + passwordSuggestions: false, + placeholder: (originalConnection?.connection_id && originalConnection?.project_pw_encrypted) ? 
secretsPlaceholder : '', + onChange: (value, state) => { + connectionPassword.val = value; + validityPerField['password'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + }), + ), + ); +}; + +/** + * @param {VanState} connection + * @param {Flavor} flavor + * @param {(params: Partial, fileValue: FileValue, isValid: boolean) => void} onChange + * @param {Connection?} originalConnection + * @param {string?} cachedFile + * @param {VanState} dynamicConnectionUrl + * @returns {HTMLElement} + */ +const SnowflakeForm = ( + connection, + flavor, + onChange, + originalConnection, + cachedFile, + dynamicConnectionUrl, +) => { + const isValid = van.state(false); + const clearPrivateKeyPhrase = van.state(connection.rawVal?.private_key_passphrase === clearSentinel); + const connectByUrl = van.state(connection.rawVal.connect_by_url ?? false); + const connectByKey = van.state(connection.rawVal?.connect_by_key ?? false); + const connectionHost = van.state(connection.rawVal.project_host ?? ''); + const connectionPort = van.state(connection.rawVal.project_port || defaultPorts[flavor.flavor]); + const connectionDatabase = van.state(connection.rawVal.project_db ?? ''); + const connectionWarehouse = van.state(connection.rawVal.warehouse ?? ''); + const connectionUsername = van.state(connection.rawVal.project_user ?? ''); + const connectionPassword = van.state(connection.rawVal?.project_pw_encrypted ?? ''); + const connectionPrivateKey = van.state(connection.rawVal?.private_key ?? ''); + const connectionPrivateKeyPassphrase = van.state( + clearPrivateKeyPhrase.rawVal + ? '' + : (connection.rawVal?.private_key_passphrase ?? '') + ); + const connectionUrl = van.state(connection.rawVal?.url ?? 
''); + + const validityPerField = {}; + + const privateKeyFileRaw = van.state(cachedFile); + + van.derive(() => { + onChange({ + project_host: connectionHost.val, + project_port: connectionPort.val, + project_db: connectionDatabase.val, + project_user: connectionUsername.val, + project_pw_encrypted: connectionPassword.val, + connect_by_url: connectByUrl.val, + url: connectByUrl.val ? connectionUrl.val : connectionUrl.rawVal, + connect_by_key: connectByKey.val, + private_key: connectionPrivateKey.val, + private_key_passphrase: clearPrivateKeyPhrase.val ? clearSentinel : connectionPrivateKeyPassphrase.val, + warehouse: connectionWarehouse.val, + }, privateKeyFileRaw.val, isValid.val); + }); + + van.derive(() => { + const newUrlValue = (dynamicConnectionUrl.val ?? '').replace(extractPrefix(dynamicConnectionUrl.rawVal), ''); + if (!connectByUrl.rawVal) { + connectionUrl.val = newUrlValue; + } + }); + + return div( + {class: 'flex-column fx-gap-3 fx-flex'}, + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Server', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + RadioGroup({ + label: 'Connect by', + options: [ + { + label: 'Host', + value: false, + }, + { + label: 'URL', + value: true, + }, + ], + value: connectByUrl, + onChange: (value) => connectByUrl.val = value, + layout: 'inline', + }), + div( + { class: 'flex-row fx-gap-3 fx-flex' }, + Input({ + name: 'db_host', + label: 'Host', + value: connectionHost, + class: 'fx-flex', + disabled: connectByUrl, + onChange: (value, state) => { + connectionHost.val = value; + validityPerField['db_host'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + maxLength(250), + ], + }), + Input({ + name: 'db_port', + label: 'Port', + value: connectionPort, + type: 'number', + disabled: connectByUrl, + onChange: 
(value, state) => { + connectionPort.val = value; + validityPerField['db_port'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + minLength(3), + maxLength(5), + ], + }) + ), + Input({ + name: 'db_name', + label: 'Database', + value: connectionDatabase, + disabled: connectByUrl, + onChange: (value, state) => { + connectionDatabase.val = value; + validityPerField['db_name'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !connectByUrl.val), + maxLength(100), + ], + }), + Input({ + name: 'warehouse', + label: 'Warehouse', + value: connectionWarehouse, + disabled: connectByUrl, + onChange: (value, state) => { + connectionWarehouse.val = value; + validityPerField['warehouse'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + maxLength(100), + ], + }), + () => div( + { class: 'flex-row fx-gap-3 fx-align-stretch', style: 'position: relative;' }, + Input({ + label: 'URL', + value: connectionUrl, + class: 'fx-flex', + name: 'url_suffix', + prefix: span({ style: 'white-space: nowrap; color: var(--disabled-text-color)' }, extractPrefix(dynamicConnectionUrl.val)), + disabled: !connectByUrl.val, + onChange: (value, state) => { + connectionUrl.val = value; + validityPerField['url_suffix'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => connectByUrl.val), + ], + }), + ), + ), + + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Authentication', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + RadioGroup({ + label: 'Connection Strategy', + options: [ + {label: 'Connect By Password', value: false}, + {label: 'Connect By Key-Pair', value: true}, + ], + value: connectByKey, 
+ onChange: (value) => connectByKey.val = value, + layout: 'inline', + }), + + Input({ + name: 'db_user', + label: 'Username', + value: connectionUsername, + onChange: (value, state) => { + connectionUsername.val = value; + validityPerField['db_user'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + required, + maxLength(50), + ], + }), + () => { + if (connectByKey.val) { + const hasPrivateKeyPhrase = originalConnection?.private_key_passphrase || connectionPrivateKeyPassphrase.val; + + return div( + { class: 'flex-column fx-gap-3' }, + div( + { class: 'key-pair-passphrase-field'}, + Input({ + name: 'private_key_passphrase', + label: 'Private Key Passphrase', + value: connectionPrivateKeyPassphrase, + type: 'password', + passwordSuggestions: false, + help: 'Passphrase used when creating the private key. Leave empty if the private key is not encrypted.', + placeholder: () => (originalConnection?.connection_id && originalConnection?.private_key_passphrase && !clearPrivateKeyPhrase.val) ? secretsPlaceholder : '', + onChange: (value, state) => { + if (value) { + clearPrivateKeyPhrase.val = false; + } + connectionPrivateKeyPassphrase.val = value; + validityPerField['private_key_passphrase'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + clearable: hasPrivateKeyPhrase, + clearableCondition: 'always', + onClear: () => { + clearPrivateKeyPhrase.val = true; + connectionPrivateKeyPassphrase.val = ''; + }, + }), + ), + FileInput({ + name: 'private_key', + label: 'Upload private key (rsa_key.p8)', + placeholder: (originalConnection?.connection_id && originalConnection?.private_key) + ? 'Drop file here or browse files to replace existing key' + : undefined, + value: privateKeyFileRaw, + onChange: (value, state) => { + let isFieldValid = state.valid; + + privateKeyFileRaw.val = value; + try { + if (value?.content) { + connectionPrivateKey.val = value.content.split(',')?.[1] ?? 
''; + } catch (err) { + console.error(err); + isFieldValid = false; + } + + validityPerField['private_key'] = isFieldValid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !originalConnection?.connection_id || !originalConnection?.private_key), + sizeLimit(200 * 1024 * 1024), + ], + }), + ); + } + + return Input({ + name: 'password', + label: 'Password', + value: connectionPassword, + type: 'password', + passwordSuggestions: false, + placeholder: (originalConnection?.connection_id && originalConnection?.project_pw_encrypted) ? secretsPlaceholder : '', + onChange: (value, state) => { + connectionPassword.val = value; + validityPerField['password'] = state.valid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + }); + }, + ), + ); +}; + +/** + * @param {VanState} connection + * @param {Flavor} flavor + * @param {(params: Partial, fileValue: FileValue, isValid: boolean) => void} onChange + * @param {Connection?} originalConnection + * @param {FileValue?} cachedFile + * @returns {HTMLElement} + */ +const BigqueryForm = ( + connection, + flavor, + onChange, + originalConnection, + cachedFile, +) => { + const isValid = van.state(false); + const serviceAccountKey = van.state(connection.rawVal.service_account_key ?? null); + const projectId = van.state(''); + const serviceAccountKeyFileRaw = van.state(cachedFile); + + const validityPerField = {}; + + van.derive(() => { + projectId.val = serviceAccountKey.val?.project_id ?? 
''; + isValid.val = !!projectId.val; + }); + + van.derive(() => { + onChange({ service_account_key: serviceAccountKey.val, project_db: projectId.val }, serviceAccountKeyFileRaw.val, isValid.val); + }); + + return div( + {class: 'flex-column fx-gap-3 fx-flex'}, + div( + { class: 'flex-column border border-radius-1 p-3 mt-1 fx-gap-1', style: 'position: relative;' }, + Caption({content: 'Service Account Key', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + + () => { + return div( + { class: 'flex-column fx-gap-3' }, + FileInput({ + name: 'service_account_key', + label: 'Upload service account key (.json)', + placeholder: (originalConnection?.connection_id && originalConnection?.service_account_key) + ? 'Drop file here or browse files to replace existing key' + : undefined, + value: serviceAccountKeyFileRaw, + onChange: (value, state) => { + let isFieldValid = state.valid; + try { + if (value?.content) { + serviceAccountKey.val = JSON.parse(atob(value.content.split(',')?.[1] ?? 
'')); + } + } catch (err) { + console.error(err); + isFieldValid = false; + } + serviceAccountKeyFileRaw.val = value; + validityPerField['service_account_key'] = isFieldValid; + isValid.val = Object.values(validityPerField).every(v => v); + }, + validators: [ + requiredIf(() => !originalConnection?.connection_id || !originalConnection?.service_account_key), + sizeLimit(20 * 1024), + ], + }), + ); + }, + + div( + { class: 'text-caption text-right' }, + () => `Project ID: ${projectId.val}`, + ), + ), + ); +}; + +function extractPrefix(url) { + if (!url) { + return ''; + } + + if (url.includes('@')) { + const parts = url.split('@'); + if (!parts[0]) { + return ''; + } + return `${parts[0]}@`; + } + + return url.slice(0, url.indexOf('://') + 3); +} + +function shouldRefreshUrl(previous, current) { + if (current.connect_by_url) { + return false; + } + + const fields = ['sql_flavor', 'project_host', 'project_port', 'project_db', 'project_user', 'connect_by_key', 'http_path', 'warehouse', 'connect_with_identity']; + return fields.some((fieldName) => previous[fieldName] !== current[fieldName]); +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.key-pair-passphrase-field { + position: relative; +} + +.key-pair-passphrase-field > i { + position: absolute; + top: 26px; + right: 8px; +} + +`); + +export { ConnectionForm }; diff --git a/testgen/ui/static/js/components/crontab_input.js b/testgen/ui/static/js/components/crontab_input.js new file mode 100644 index 00000000..5f0fc190 --- /dev/null +++ b/testgen/ui/static/js/components/crontab_input.js @@ -0,0 +1,629 @@ +/** + * @import { CronSample } from '../types.js'; + * + * @typedef EditOptions + * @type {object} + * @property {CronSample?} sample + * @property {(expr: string) => void} onChange + * @property {(() => void)?} onClose + * + * @typedef InitialValue + * @type {object} + * @property {string} timezone + * @property {string} expression + * + * @typedef Options + * @type {object} + * @property 
{(string|null)} id
+ * @property {(string|null)} name
+ * @property {string?} testId
+ * @property {string?} class
+ * @property {CronSample?} sample
+ * @property {InitialValue?} value
+ * @property {('x_hours'|'x_days'|'certain_days'|'custom')[]?} modes
+ * @property {boolean?} hideExpression
+ * @property {((expr: string) => void)?} onChange
+ */
+import { getRandomId, getValue, loadStylesheet } from '../utils.js';
+import van from '../van.min.js';
+import { Portal } from './portal.js';
+import { Button } from './button.js';
+import { Input } from './input.js';
+import { required } from '../form_validators.js';
+import { Select } from './select.js';
+import { Checkbox } from './checkbox.js';
+import { Link } from './link.js';
+
+const { div, span } = van.tags;
+
+const CrontabInput = (/** @type Options */ props) => {
+    loadStylesheet('crontab-input', stylesheet);
+
+    const domId = van.derive(() => props.id?.val ?? `tg-crontab-wrapper-${getRandomId()}`);
+    const opened = van.state(false);
+    const expression = van.state(props.value?.rawVal?.expression ?? props.value?.expression ?? '');
+    const readableSchedule = van.state(null);
+    const timezone = van.derive(() => getValue(props.value)?.timezone);
+    const disabled = van.derive(() => !timezone.val);
+    const placeholder = van.derive(() => !timezone.val ? 'Select a timezone first' : 'Click to select schedule');
+
+    const onEditorChange = (cronExpr) => {
+        expression.val = cronExpr;
+        const onChange = props.onChange?.val ?? props.onChange;
+        if (onChange && cronExpr) {
+            onChange(cronExpr);
+        }
+    };
+
+    van.derive(() => {
+        const sample = getValue(props.sample) ?? {};
+        if (!sample.error && sample.readable_expr) {
+            readableSchedule.val = `${sample.readable_expr} (${timezone.val})`;
+        }
+    });
+
+    return div(
+        {
+            id: domId,
+            class: () => `tg-crontab-input ${getValue(props.class) ?? ''}`,
+            style: 'position: relative',
+            'data-testid': getValue(props.testId) ??
null, + }, + div( + {onclick: () => { + if (!disabled.val) { + opened.val = true; + } + }}, + Input({ + name: props.name ?? getRandomId(), + label: 'Schedule', + icon: 'calendar_clock', + readonly: true, + disabled: disabled, + placeholder: placeholder, + value: readableSchedule, + }), + ), + Portal( + {target: domId.val, targetRelative: true, align: 'right', style: 'width: 500px;', opened}, + () => CrontabEditorPortal( + { + onChange: onEditorChange, + onClose: () => opened.val = false, + sample: props.sample, + modes: props.modes, + hideExpression: props.hideExpression, + }, + expression, + ), + ), + ); +}; + +/** + * @param {EditOptions} options + * @param {import('../van.min.js').VanState} expr + * @returns {HTMLElement} + */ +const CrontabEditorPortal = ({sample, ...options}, expr) => { + const mode = van.state(expr.rawVal ? determineMode(expr.rawVal) : 'x_hours'); + + const xHoursState = { + hours: van.state(1), + minute: van.state(0), + startHour: van.state(0), + }; + const xDaysState = { + days: van.state(1), + hour: van.state(1), + minute: van.state(0), + startDay: van.state(1), + }; + const certainDaysState = { + sunday: van.state(false), + monday: van.state(false), + tuesday: van.state(false), + wednesday: van.state(false), + thursday: van.state(false), + friday: van.state(false), + saturday: van.state(false), + hour: van.state(1), + minute: van.state(0), + }; + + // Populate initial state based on the initial mode and expression + populateInitialModeState(expr.rawVal, mode.rawVal, xHoursState, xDaysState, certainDaysState); + + van.derive(() => { + if (mode.val === 'x_hours') { + const hours = xHoursState.hours.val; + const minute = xHoursState.minute.val; + const startHour = xHoursState.startHour.val; + let hourField; + if (!hours || hours <= 1) { + hourField = '*'; + } else if (startHour > 0) { + hourField = generateSteppedValues(startHour, hours, 23); + } else { + hourField = '*/' + hours; + } + options.onChange(`${minute ?? 
0} ${hourField} * * *`); + } else if (mode.val === 'x_days') { + const days = xDaysState.days.val; + const hour = xDaysState.hour.val; + const minute = xDaysState.minute.val; + const startDay = xDaysState.startDay.val; + let dayField; + if (!days || days <= 1) { + dayField = '*'; + } else if (startDay > 1) { + dayField = generateSteppedValues(startDay, days, 31); + } else { + dayField = '*/' + days; + } + options.onChange(`${minute ?? 0} ${hour ?? 0} ${dayField} * *`); + } else if (mode.val === 'certain_days') { + const days = []; + const dayMap = [ + { key: 'sunday', val: certainDaysState.sunday.val, label: 'SUN' }, + { key: 'monday', val: certainDaysState.monday.val, label: 'MON' }, + { key: 'tuesday', val: certainDaysState.tuesday.val, label: 'TUE' }, + { key: 'wednesday', val: certainDaysState.wednesday.val, label: 'WED' }, + { key: 'thursday', val: certainDaysState.thursday.val, label: 'THU' }, + { key: 'friday', val: certainDaysState.friday.val, label: 'FRI' }, + { key: 'saturday', val: certainDaysState.saturday.val, label: 'SAT' }, + ]; + // Collect selected days + dayMap.forEach(d => { if (d.val) days.push(d.label); }); + // If days are consecutive, use range notation + let dayField = '*'; + if (days.length > 0) { + // Find ranges + const indices = days.map(d => dayMap.findIndex(dm => dm.label === d)).sort((a,b) => a-b); + let ranges = [], rangeStart = null, prev = null; + indices.forEach((idx, i) => { + if (rangeStart === null) rangeStart = idx; + if (prev !== null && idx !== prev + 1) { + ranges.push([rangeStart, prev]); + rangeStart = idx; + } + prev = idx; + if (i === indices.length - 1) ranges.push([rangeStart, idx]); + }); + // Convert ranges to crontab format + dayField = ranges.map(([start, end]) => { + if (start === end) return dayMap[start].label; + return `${dayMap[start].label}-${dayMap[end].label}`; + }).join(','); + } + const hour = certainDaysState.hour.val; + const minute = certainDaysState.minute.val; + options.onChange(`${minute ?? 
0} ${hour ?? 0} * * ${dayField}`); + } + }); + + return div( + { class: 'tg-crontab-editor flex-column border-radius-1 mt-1' }, + div( + { class: 'tg-crontab-editor-content flex-row' }, + div( + { class: 'tg-crontab-editor-left flex-column' }, + !options.modes || options.modes.includes('x_hours') ? span( + { + class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'x_hours' ? 'selected' : ''}`, + onclick: () => mode.val = 'x_hours', + }, + 'Every x hours', + ) : null, + !options.modes || options.modes.includes('x_days') ? span( + { + class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'x_days' ? 'selected' : ''}`, + onclick: () => mode.val = 'x_days', + }, + 'Every x days', + ) : null, + !options.modes || options.modes.includes('certain_days') ? span( + { + class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'certain_days' ? 'selected' : ''}`, + onclick: () => mode.val = 'certain_days', + }, + 'On certain days', + ) : null, + !options.modes || options.modes.includes('custom') ? span( + { + class: () => `tg-crontab-editor-mode p-4 ${mode.val === 'custom' ? 'selected' : ''}`, + onclick: () => mode.val = 'custom', + }, + 'Custom', + ) : null, + ), + div( + { class: 'tg-crontab-editor-right flex-column p-4 fx-flex' }, + div( + { class: () => `${mode.val === 'x_hours' ? '' : 'hidden'}`}, + div( + {class: 'flex-row fx-gap-2 mb-2'}, + span({}, 'Every'), + () => Select({ + label: "", + options: Array.from({length: 24}, (_, i) => i + 1).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xHoursState.hours, + onChange: (value) => { + xHoursState.hours.val = value; + if (value <= 1) xHoursState.startHour.val = 0; + }, + }), + span({}, 'hours'), + ), + div( + {class: () => `flex-row fx-gap-2 ${xHoursState.hours.val > 1 ? 
'mb-2' : ''}`}, + span({}, 'on'), + span({}, 'minute'), + () => Select({ + label: "", + options: Array.from({length: 60}, (_, i) => i).map(i => ({label: i.toString().padStart(2, '0'), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xHoursState.minute, + onChange: (value) => xHoursState.minute.val = value, + }), + ), + div( + {class: () => `flex-row fx-gap-2 ${xHoursState.hours.val > 1 ? '' : 'hidden'}`}, + span({}, 'starting at hour'), + () => Select({ + label: "", + options: Array.from({length: 24}, (_, i) => i).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xHoursState.startHour, + onChange: (value) => xHoursState.startHour.val = value, + }), + ), + ), + div( + { class: () => `${mode.val === 'x_days' ? '' : 'hidden'}`}, + div( + {class: 'flex-row fx-gap-2 mb-2'}, + span({}, 'Every'), + () => Select({ + label: "", + options: Array.from({length: 31}, (_, i) => i + 1).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xDaysState.days, + onChange: (value) => { + xDaysState.days.val = value; + if (value <= 1) xDaysState.startDay.val = 1; + }, + }), + span({}, 'days'), + ), + div( + {class: () => `flex-row fx-gap-2 ${xDaysState.days.val > 1 ? 
'mb-2' : ''}`}, + span({}, 'at'), + () => Select({ + label: "", + options: Array.from({length: 24}, (_, i) => i).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xDaysState.hour, + onChange: (value) => xDaysState.hour.val = value, + }), + () => Select({ + label: "", + options: Array.from({length: 60}, (_, i) => i).map(i => ({label: i.toString().padStart(2, '0'), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xDaysState.minute, + onChange: (value) => xDaysState.minute.val = value, + }), + ), + div( + {class: () => `flex-row fx-gap-2 ${xDaysState.days.val > 1 ? '' : 'hidden'}`}, + span({}, 'starting on day'), + () => Select({ + label: "", + options: Array.from({length: 31}, (_, i) => i + 1).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal', + value: xDaysState.startDay, + onChange: (value) => xDaysState.startDay.val = value, + }), + ), + ), + div( + { class: () => `${mode.val === 'certain_days' ? 
'' : 'hidden'}`}, + div( + {class: 'flex-row fx-gap-2 mb-2'}, + Checkbox({ + label: 'Monday', + checked: certainDaysState.monday, + onChange: (v) => certainDaysState.monday.val = v, + }), + Checkbox({ + label: 'Tuesday', + checked: certainDaysState.tuesday, + onChange: (v) => certainDaysState.tuesday.val = v, + }), + Checkbox({ + label: 'Wednesday', + checked: certainDaysState.wednesday, + onChange: (v) => certainDaysState.wednesday.val = v, + }), + ), + div( + {class: 'flex-row fx-gap-2 mb-2'}, + + Checkbox({ + label: 'Thursday', + checked: certainDaysState.thursday, + onChange: (v) => certainDaysState.thursday.val = v, + }), + Checkbox({ + label: 'Friday', + checked: certainDaysState.friday, + onChange: (v) => certainDaysState.friday.val = v, + }), + Checkbox({ + label: 'Saturday', + checked: certainDaysState.saturday, + onChange: (v) => certainDaysState.saturday.val = v, + }), + Checkbox({ + label: 'Sunday', + checked: certainDaysState.sunday, + onChange: (v) => certainDaysState.sunday.val = v, + }), + ), + div( + {class: 'flex-row fx-gap-2'}, + span({}, 'at'), + () => Select({ + label: "", + options: Array.from({length: 24}, (_, i) => i).map(i => ({label: i.toString(), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal shorter', + value: certainDaysState.hour, + onChange: (value) => certainDaysState.hour.val = value, + }), + () => Select({ + label: "", + options: Array.from({length: 60}, (_, i) => i).map(i => ({label: i.toString().padStart(2, '0'), value: i})), + triggerStyle: 'inline', + portalClass: 'tg-crontab--select-portal shorter', + value: certainDaysState.minute, + onChange: (value) => certainDaysState.minute.val = value, + }), + ), + ), + div( + { class: () => `${mode.val === 'custom' ? '' : 'hidden'}`}, + () => Input({ + name: 'cron_expr', + label: 'Cron Expression', + value: expr, + validators: [ + required, + ((sampleState) => { + return () => { + const sample = getValue(sampleState) ?? 
{}; + return sample.error || null; + }; + })(sample), + ], + onChange: (value, state) => mode.val === 'custom' && options.onChange(value), + }), + ), + span({class: 'fx-flex'}, ''), + div( + {class: 'flex-column fx-gap-1 mt-3 text-secondary'}, + () => span( + { class: mode.val === 'custom' || getValue(options.hideExpression) ? 'hidden': '' }, + `Cron Expression: ${expr.val ?? ''}`, + ), + () => div( + { class: 'flex-column' }, + span('Next Runs:'), + (getValue(sample) ?? {})?.samples?.map(item => span({ class: 'text-caption' }, item)), + ), + () => div( + {class: `flex-row fx-gap-1 text-caption ${mode.val === 'custom' ? '': 'hidden'}`}, + span({}, 'Learn more about'), + Link({ + open_new: true, + label: 'cron expressions', + href: 'https://crontab.guru/', + right_icon: 'open_in_new', + right_icon_size: 13, + }), + ), + ), + ), + ), + div( + { class: 'flex-row fx-justify-space-between p-3' }, + span({class: 'fx-flex'}, ''), + div( + { class: 'flex-row fx-gap-2' }, + Button({ + type: 'stroked', + color: 'primary', + label: 'Close', + style: 'width: auto;', + onclick: options?.onClose, + }), + ), + ), + ); +}; + +function generateSteppedValues(start, step, max) { + const values = []; + for (let i = start; i <= max; i += step) { + values.push(i); + } + return values.join(','); +} + +function parseSteppedList(field) { + const values = field.split(',').map(Number); + if (values.length < 2 || values.some(isNaN)) return null; + const step = values[1] - values[0]; + if (step <= 0) return null; + for (let i = 2; i < values.length; i++) { + if (values[i] - values[i - 1] !== step) return null; + } + return { start: values[0], step }; +} + +/** + * Populates the state variables for the initial mode based on the cron expression + * @param {string} expr + * @param {string} mode + * @param {object} xHoursState + * @param {object} xDaysState + * @param {object} certainDaysState + */ +function populateInitialModeState(expr, mode, xHoursState, xDaysState, certainDaysState) { + const 
parts = (expr || '').trim().split(/\s+/); + if (mode === 'x_hours' && parts.length === 5) { + xHoursState.minute.val = Number(parts[0]) || 0; + if (parts[1].startsWith('*/')) { + xHoursState.hours.val = Number(parts[1].slice(2)) || 1; + xHoursState.startHour.val = 0; + } else if (parts[1].includes(',')) { + const parsed = parseSteppedList(parts[1]); + if (parsed) { + xHoursState.hours.val = parsed.step; + xHoursState.startHour.val = parsed.start; + } + } else { + xHoursState.hours.val = 1; + xHoursState.startHour.val = 0; + } + } else if (mode === 'x_days' && parts.length === 5) { + xDaysState.minute.val = Number(parts[0]) || 0; + xDaysState.hour.val = Number(parts[1]) || 0; + if (parts[2].startsWith('*/')) { + xDaysState.days.val = Number(parts[2].slice(2)) || 1; + xDaysState.startDay.val = 1; + } else if (parts[2].includes(',')) { + const parsed = parseSteppedList(parts[2]); + if (parsed) { + xDaysState.days.val = parsed.step; + xDaysState.startDay.val = parsed.start; + } + } else { + xDaysState.days.val = 1; + xDaysState.startDay.val = 1; + } + } else if (mode === 'certain_days' && parts.length === 5) { + // e.g. "M H * * DAY[,DAY...]" + certainDaysState.minute.val = Number(parts[0]) || 0; + certainDaysState.hour.val = Number(parts[1]) || 0; + const days = parts[4].split(','); + const dayKeys = ['sunday','monday','tuesday','wednesday','thursday','friday','saturday']; + const dayLabels = ['SUN','MON','TUE','WED','THU','FRI','SAT']; + dayKeys.forEach((key, idx) => { + certainDaysState[key].val = days.some(d => { + if (d.includes('-')) { + // Range, e.g. 
MON-WED + const [start, end] = d.split('-'); + const startIdx = dayLabels.indexOf(start); + const endIdx = dayLabels.indexOf(end); + return idx >= startIdx && idx <= endIdx; + } + return d === dayLabels[idx]; + }); + }); + } +} + +/** + * @param {string} expression + * @returns {'x_hours'|'x_days'|'certain_days'|'custom'} + */ +function determineMode(expression) { + // Normalize whitespace + const expr = (expression || '').trim().replace(/\s+/g, ' '); + // x_hours: "M */H * * *" or "M * * * *" or "M H1,H2,... * * *" + if (/^\d{1,2} \*\/\d+ \* \* \*$/.test(expr) || /^\d{1,2} \* \* \* \*$/.test(expr)) { + return 'x_hours'; + } + if (/^\d{1,2} \d+(,\d+)+ \* \* \*$/.test(expr)) { + const hourField = expr.split(' ')[1]; + if (parseSteppedList(hourField)) return 'x_hours'; + } + // x_days: "M H */D * *" or "M H * * *" or "M H D1,D2,... * *" + if (/^\d{1,2} \d{1,2} \*\/\d+ \* \*$/.test(expr) || /^\d{1,2} \d{1,2} \* \* \*$/.test(expr)) { + return 'x_days'; + } + if (/^\d{1,2} \d{1,2} \d+(,\d+)+ \* \*$/.test(expr)) { + const dayField = expr.split(' ')[2]; + if (parseSteppedList(dayField)) return 'x_days'; + } + // certain_days: "M H * * DAY[,DAY...]" (DAY = SUN,MON,...) 
+ if (/^\d{1,2} \d{1,2} \* \* ((SUN|MON|TUE|WED|THU|FRI|SAT)(-(SUN|MON|TUE|WED|THU|FRI|SAT))?(,)?)+$/.test(expr)) { + return 'certain_days'; + } + return 'custom'; +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-crontab-input { + position: relative; +} + +.tg-crontab-display { + border-bottom: 1px dashed var(--border-color); +} + +.tg-crontab-editor { + border-radius: 8px; + background: var(--portal-background); + box-shadow: var(--portal-box-shadow); + overflow: auto; +} + +.tg-crontab-editor-content { + align-items: stretch; + border-bottom: 1px solid var(--border-color); +} + +.tg-crontab-editor-left { + border-right: 1px solid var(--border-color); +} + +.tg-crontab-editor-right { + place-self: stretch; +} + +.tg-crontab-editor-mode { + cursor: pointer; +} + +.tg-crontab-editor-mode.selected, +.tg-crontab-editor-mode:hover { + background: var(--select-hover-background); +} + +.tg-crontab--select-portal { + max-height: 150px; + -ms-overflow-style: none; /* Internet Explorer 10+ */ + scrollbar-width: none; /* Firefox, Safari 18.2+, Chromium 121+ */ +} +.tg-crontab--select-portal::-webkit-scrollbar { + display: none; /* Older Safari and Chromium */ +} + +.tg-crontab--select-portal.shorter { + max-height: 120px; +} +`); + +export { CrontabInput, parseSteppedList }; diff --git a/testgen/ui/static/js/components/dot.js b/testgen/ui/static/js/components/dot.js new file mode 100644 index 00000000..d79b20fa --- /dev/null +++ b/testgen/ui/static/js/components/dot.js @@ -0,0 +1,15 @@ +import van from '../van.min.js'; + +const { span } = van.tags; + + +const dot = (props, color, size) => span({ + ...props, + style: `${props.style ?? ''} ${sizeRules(size ?? 10)} border-radius: 50%; background: ${color ?? 
'black'};`, +}); + +function sizeRules(size) { + return `width: ${size}px; min-width: ${size}px; max-width: ${size}px; height: ${size}px; min-height: ${size}px; max-height: ${size}px;` +} + +export { dot }; \ No newline at end of file diff --git a/testgen/ui/static/js/components/dual_pane.js b/testgen/ui/static/js/components/dual_pane.js new file mode 100644 index 00000000..65d89266 --- /dev/null +++ b/testgen/ui/static/js/components/dual_pane.js @@ -0,0 +1,80 @@ +/** + * @typedef Options + * @property {('left'|'right')} resizablePanel + * @property {string} resizablePanelDomId + * @property {number} minSize + * @property {number} maxSize + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; + +const { div, span } = van.tags; +const EMPTY_IMAGE = new Image(1, 1); +EMPTY_IMAGE.src = 'data:image/gif;base64,R0lGODlhAQABAIAAAP///wAAACH5BAEAAAAALAAAAAABAAEAAAICRAEAOw=='; + +/** + * + * @param {Options} options + * @param {HTMLElement?} left + * @param {HTMLElement?} right + * @returns + */ +const DualPane = function (options, left, right) { + loadStylesheet('dualPanel', stylesheet); + + const dragState = van.state(null); + const dragConstraints = { min: options.minSize, max: options.maxSize }; + const dragResize = (/** @type Event */ event) => { + // https://stackoverflow.com/questions/36308460/why-is-clientx-reset-to-0-on-last-drag-event-and-how-to-solve-it + if (event.screenX && dragState.val) { + const dragWidth = dragState.val.startWidth + (event.screenX - dragState.val.startX) * (options.resizablePanel === 'right' ? -1 : 1); + const constrainedWidth = Math.min(dragConstraints.max, Math.max(dragWidth, dragConstraints.min)); + + const element = document.getElementById(options.resizablePanelDomId); + if (element) { + element.style.minWidth = `${constrainedWidth}px`; + } + } + }; + + return div( + { ...options, class: () => `tg-dualpane flex-row fx-align-flex-start ${getValue(options.class) ?? 
''}` },
+        left,
+        div(
+            {
+                class: 'tg-dualpane-divider',
+                draggable: true,
+                ondragstart: (event) => {
+                    event.dataTransfer.effectAllowed = 'move';
+                    event.dataTransfer.setDragImage(EMPTY_IMAGE, 0, 0);
+
+                    const element = document.getElementById(options.resizablePanelDomId);
+                    dragState.val = { startX: event.screenX, startWidth: element.offsetWidth };
+                },
+                ondragend: (event) => {
+                    dragResize(event);
+                    dragState.val = null;
+                },
+                ondrag: (event) => dragState.rawVal ? dragResize(event) : null,
+            },
+            '',
+        ),
+        right,
+    );
+}
+
+const stylesheet = new CSSStyleSheet();
+stylesheet.replace(`
+    .tg-dualpane {
+        /* height: auto; */
+    }
+
+    .tg-dualpane-divider {
+        min-height: 100px;
+        place-self: stretch;
+        cursor: col-resize;
+        min-width: 16px;
+    }
+`);
+
+export { DualPane };
diff --git a/testgen/ui/static/js/components/editable_card.js b/testgen/ui/static/js/components/editable_card.js
new file mode 100644
index 00000000..4dc8e54e
--- /dev/null
+++ b/testgen/ui/static/js/components/editable_card.js
@@ -0,0 +1,64 @@
+/**
+ * @typedef Properties
+ * @type {object}
+ * @property {string} title
+ * @property {object} content
+ * @property {object} editingContent
+ * @property {function} onSave
+ * @property {function?} onCancel
+ * @property {function?} hasChanges
+ */
+import { getValue } from '../utils.js';
+import van from '../van.min.js';
+import { Card } from './card.js';
+import { Button } from './button.js';
+
+const { div } = van.tags;
+
+const EditableCard = (/** @type Properties */ props) => {
+    const editing = van.state(false);
+    const onCancel = van.derive(() => {
+        const cancelFunction = props.onCancel?.val ?? props.onCancel;
+        return () => {
+            editing.val = false;
+            cancelFunction?.();
+        }
+    });
+    const saveDisabled = van.derive(() => {
+        const hasChanges = props.hasChanges?.val ?? props.hasChanges;
+        return !hasChanges?.();
+    });
+
+    return Card({
+        title: props.title,
+        content: [
+            () => editing.val ?
getValue(props.editingContent) : getValue(props.content), + () => editing.val ? div( + { class: 'flex-row fx-justify-content-flex-end fx-gap-3 mt-4' }, + Button({ + type: 'stroked', + label: 'Cancel', + width: 'auto', + onclick: onCancel, + }), + Button({ + type: 'stroked', + color: 'primary', + label: 'Save', + width: 'auto', + disabled: saveDisabled, + onclick: props.onSave, + }), + ) : '', + ], + actionContent: () => !editing.val ? Button({ + type: 'stroked', + label: 'Edit', + icon: 'edit', + width: 'auto', + onclick: () => editing.val = true, + }) : '', + }); +}; + +export { EditableCard }; diff --git a/testgen/ui/static/js/components/empty_state.js b/testgen/ui/static/js/components/empty_state.js new file mode 100644 index 00000000..86628c88 --- /dev/null +++ b/testgen/ui/static/js/components/empty_state.js @@ -0,0 +1,120 @@ +/** +* @typedef Message +* @type {object} +* @property {string} line1 +* @property {string} line2 +* +* @typedef Link +* @type {object} +* @property {string} href +* @property {string} label +* +* @typedef Properties +* @type {object} +* @property {string} icon +* @property {string} label +* @property {Message} message +* @property {Link?} link +* @property {any?} button +* @property {string?} class +*/ +import van from '../van.min.js'; +import { Card } from '../components/card.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { Link } from './link.js'; + +const { i, span, strong } = van.tags; + +const EMPTY_STATE_MESSAGE = { + connection: { + line1: 'Begin by connecting your database.', + line2: 'TestGen delivers data quality through data profiling, hygiene review, test generation, and test execution.', + }, + tableGroup: { + line1: 'Profile your tables to detect hygiene issues', + line2: 'Create table groups for your connected databases to run data profiling and hygiene review.', + }, + profiling: { + line1: 'Profile your tables to detect hygiene issues', + line2: 'Run data profiling on your table groups to 
understand data types, column contents, and data patterns.', + }, + testSuite: { + line1: 'Run data validation tests', + line2: 'Automatically generate tests from data profiling results or write custom tests for your business rules.', + }, + testExecution: { + line1: 'Run data validation tests', + line2: 'Execute tests to assess data quality of your tables.' + }, + score: { + line1: 'Track data quality scores', + line2: 'Create custom scorecards to assess quality of your data assets across different categories.', + }, + explorer: { + line1: 'Track data quality scores', + line2: 'Filter or select columns to assess the quality of your data assets across different categories.', + }, + notifications: { + line1: '', + line2: 'Configure an SMTP email server for TestGen to get alerts on profiling runs, test runs, and quality scorecards.', + }, + monitors: { + line1: 'Monitor your tables', + line2: 'Set up freshness, volume, and schema monitors on your data to detect anomalies.', + }, +}; + +const EmptyState = (/** @type Properties */ props) => { + loadStylesheet('empty-state', stylesheet); + + return Card({ + class: `tg-empty-state flex-column fx-align-flex-center ${getValue(props.class ?? '')}`, + content: [ + span({ class: 'tg-empty-state--title mb-5' }, props.label), + i({class: 'material-symbols-rounded mb-5'}, props.icon), + strong({ class: 'mb-2' }, props.message.line1), + span({ class: 'mb-5' }, props.message.line2), + ( + getValue(props.button) ?? + ( + getValue(props.link) + ? 
Link({ + class: 'tg-empty-state--link', + right_icon: 'chevron_right', + ...(getValue(props.link)), + }) + : '' + ) + ), + ], + }); +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-empty-state { + margin-top: 80px; + border: 1px solid var(--border-color); + padding: 112px 24px !important; +} + +.tg-empty-state--title { + font-size: 24px; + color: var(--secondary-text-color); +} + +.tg-empty-state > i { + font-size: 60px; + color: var(--disabled-text-color); +} + +.tg-empty-state > .tg-empty-state--link { + margin: auto; + border-radius: 4px; + border: var(--button-stroked-border); + padding: 8px 8px 8px 16px; + color: var(--primary-color); +} +`); + +export { EMPTY_STATE_MESSAGE, EmptyState }; diff --git a/testgen/ui/static/js/components/expander_toggle.js b/testgen/ui/static/js/components/expander_toggle.js new file mode 100644 index 00000000..72aab775 --- /dev/null +++ b/testgen/ui/static/js/components/expander_toggle.js @@ -0,0 +1,61 @@ +/** + * @typedef Properties + * @type {object} + * @property {boolean} default + * @property {string?} expandLabel + * @property {string?} collapseLabel + * @property {string?} style + * @property {Function?} onExpand + * @property {Function?} onCollapse + */ +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { getValue, loadStylesheet } from '../utils.js'; + +const { div, span, i } = van.tags; + +const ExpanderToggle = (/** @type Properties */ props) => { + loadStylesheet('expanderToggle', stylesheet); + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(24); + } + + const expandedState = van.state(!!getValue(props.default)); + const expandLabel = getValue(props.expandLabel) || 'Expand'; + const collapseLabel = getValue(props.collapseLabel) || 'Collapse'; + + return div( + { + class: 'expander-toggle', + style: () => getValue(props.style) ?? '', + onclick: () => { + expandedState.val = !expandedState.val; + const handler = (expandedState.val ? 
props.onExpand : props.onCollapse) ?? Streamlit.sendData; + handler(expandedState.val); + } + }, + span( + { class: 'expander-toggle--label' }, + () => expandedState.val ? collapseLabel : expandLabel, + ), + i( + { class: 'material-symbols-rounded' }, + () => expandedState.val ? 'keyboard_arrow_up' : 'keyboard_arrow_down', + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.expander-toggle { + display: flex; + flex-flow: row nowrap; + justify-content: flex-end; + align-items: center; + cursor: pointer; + color: #1976d2; +} +`); + +export { ExpanderToggle }; diff --git a/testgen/ui/static/js/components/expansion_panel.js b/testgen/ui/static/js/components/expansion_panel.js new file mode 100644 index 00000000..40f38bf5 --- /dev/null +++ b/testgen/ui/static/js/components/expansion_panel.js @@ -0,0 +1,67 @@ +/** + * @typedef Options + * @type {object} + * @property {string} title + * @property {string?} testId + * @property {bool} expanded + */ + +import van from '../van.min.js'; +import { loadStylesheet } from '../utils.js'; +import { Icon } from './icon.js'; + +const { div, span } = van.tags; + +/** + * + * @param {Options} options + * @param {...HTMLElement} children + */ +const ExpansionPanel = (options, ...children) => { + loadStylesheet('expansion-panel', stylesheet); + + const expanded = van.state(options.expanded ?? false); + const icon = van.derive(() => expanded.val ? 'keyboard_arrow_up' : 'keyboard_arrow_down'); + const expansionClass = van.derive(() => expanded.val ? '' : 'collapsed'); + + return div( + { class: () => `tg-expansion-panel ${expansionClass.val}`, 'data-testid': options.testId ?? 
'' }, + div( + { + class: 'tg-expansion-panel--title flex-row fx-justify-space-between clickable', + 'data-testid': 'expansion-panel-trigger', + onclick: () => expanded.val = !expanded.val, + }, + span({}, options.title), + Icon({}, icon), + ), + div( + { class: 'tg-expansion-panel--content mt-4' }, + ...children, + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-expansion-panel { + border: 1px solid var(--border-color); + border-radius: 8px; + padding: 12px; +} + +.tg-expansion-panel--title:hover { + color: var(--primary-color); +} + +.tg-expansion-panel--title:hover i.tg-icon { + color: var(--primary-color) !important; +} + +.tg-expansion-panel.collapsed > .tg-expansion-panel--content { + height: 0; + display: none; +} +`); + +export { ExpansionPanel }; diff --git a/testgen/ui/static/js/components/explorer_column_selector.js b/testgen/ui/static/js/components/explorer_column_selector.js new file mode 100644 index 00000000..1d86c542 --- /dev/null +++ b/testgen/ui/static/js/components/explorer_column_selector.js @@ -0,0 +1,283 @@ +/** + * @typedef FilterValue + * @type {object} + * @property {string} field + * @property {string} value + * @property {Array?} others + * + * @typedef Selection + * @type {Array} + * + * @typedef Column + * @type {object} + * @property {string} name + * @property {string} table + * @property {string} table_group + * @property {boolean?} selected + * + * @typedef Properties + * @type {object} + * @property {Array} columns + */ +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { emitEvent, getValue, isEqual, loadStylesheet, slugify } from '../utils.js'; +import { Tree } from './tree.js'; +import { Icon } from './icon.js'; +import { Button } from './button.js'; + +const { div, i, span } = van.tags; +const tableGroupFieldName = 'table_groups_name'; +const tableFieldName = 'table_name'; +const columnFieldName = 'column_name'; + +const TRANSLATIONS = { + table_groups_name: 
'Table Group',
+    table_name: 'Table',
+    column_name: 'Column',
+};
+
+const ColumnSelector = (/** @type Properties */ props) => {
+    loadStylesheet('column-selector', stylesheet);
+
+    window.testgen.isPage = true;
+    Streamlit.setFrameHeight(400);
+
+    const initialSelection = van.state([]);
+    const selection = van.state([]);
+    const valueById = van.state({});
+    const treeNodes = van.state([]);
+    const changed = van.derive(() => {
+        const current = selection.val;
+        const initial = initialSelection.val;
+        return !isEqual(current, initial);
+    });
+
+    van.derive(() => {
+        const initialization = initialize(getValue(props.columns) ?? []);
+
+        valueById.val = initialization.valueById;
+        treeNodes.val = initialization.treeNodes;
+        selection.val = initialization.selection;
+        initialSelection.val = initialization.selection;
+    });
+
+    return div(
+        {class: 'flex-column fx-gap-2 column-selector-wrapper'},
+        div(
+            {class: 'flex-row column-selector'},
+            Tree({
+                id: 'column-selector-tree',
+                classes: 'column-selector--tree',
+                multiSelect: true,
+                onMultiSelect: (selected) => {
+                    if (!selected) {
+                        selection.val = [];
+                        return;
+                    }
+
+                    selection.val = getSelectionFromTreeNodes(selected, getValue(valueById));
+                },
+                nodes: treeNodes,
+            }),
+            span({class: 'column-selector--divider'}),
+            () => {
+                const selection_ = getValue(selection);
+                return div(
+                    {class: 'flex-row fx-flex-wrap fx-align-flex-start fx-flex-align-content fx-gap-2 column-selector--selected'},
+                    selection_.map((item) => ColumnFilter(item)),
+                );
+            },
+        ),
+        div(
+            {class: 'flex-row fx-justify-content-flex-end'},
+            Button({
+                type: 'stroked',
+                color: 'primary',
+                label: 'Apply',
+                width: 'auto',
+                disabled: van.derive(() => !changed.val),
+                onclick: () => emitEvent('ColumnFiltersUpdated', {payload: selection.val}),
+            }),
+        )
+    );
+};
+
+function initialize(/** @type Array */ columns) {
+    const valueById = {};
+    const treeNodesMapping = {};
+
+    for (const columnObject of columns) {
+        const tableGroup =
slugify(columnObject.table_group); + const table = slugify(columnObject.table); + const column = slugify(columnObject.name); + + const tableGroupId = `${tableGroupFieldName}:${tableGroup}` + const tableId = `${tableFieldName}:${tableGroup}:${table}` + const columnId = `${columnFieldName}:${tableGroup}:${table}:${column}` + + valueById[tableGroupId] = columnObject.table_group; + valueById[tableId] = columnObject.table; + valueById[columnId] = columnObject.name; + + treeNodesMapping[tableGroupId] = treeNodesMapping[tableGroupId] ?? { + id: tableGroupId, + label: columnObject.table_group, + icon: 'dataset', + selected: false, + children: {}, + }; + treeNodesMapping[tableGroupId].children[tableId] = treeNodesMapping[tableGroupId].children[tableId] ?? { + id: tableId, + label: columnObject.table, + icon: 'table', + selected: false, + children: {}, + }; + treeNodesMapping[tableGroupId].children[tableId].children[columnId] = { + id: columnId, + label: columnObject.name, + icon: 'abc', + selected: columnObject.selected ?? false, + }; + } + + const treeNodes = Object.values(treeNodesMapping); + for (const tableGroup of treeNodes) { + tableGroup.children = Object.values(tableGroup.children); + for (const table of tableGroup.children) { + table.children = Object.values(table.children); + table.selected = table.children.every(child => child.selected); + } + tableGroup.selected = tableGroup.children.every(child => child.selected); + } + + return { treeNodes, valueById, selection: getSelectionFromTreeNodes(treeNodes, valueById) }; +} + +function getSelectionFromTreeNodes(treeNodes, valueById) { + if (!treeNodes || treeNodes.length === 0) { + return []; + } + + const selection = []; + const isFromUserAction = treeNodes[0].all !== undefined; + const propertyToCheck = isFromUserAction ? 
'all' : 'selected'; + for (const tableGroup of treeNodes) { + if (tableGroup[propertyToCheck]) { + selection.push({field: tableGroupFieldName, value: valueById[tableGroup.id]}); + continue; + } + + for (const table of tableGroup.children) { + if (table[propertyToCheck]) { + selection.push({ + field: tableFieldName, + value: valueById[table.id], + others: [ + {field: tableGroupFieldName, value: valueById[tableGroup.id]}, + ], + }); + continue; + } + + for (const column of table.children) { + if (isFromUserAction || column.selected) { + selection.push({ + field: columnFieldName, + value: valueById[column.id], + others: [ + {field: tableFieldName, value: valueById[table.id]}, + {field: tableGroupFieldName, value: valueById[tableGroup.id]}, + ], + }); + } + } + } + } + + return selection; +} + +const ColumnFilter = ( + /** @type FilterValue */ filter, +) => { + const expanded = van.state(false); + const expandIcon = van.derive(() => expanded.val ? 'keyboard_arrow_up' : 'keyboard_arrow_down'); + + return div( + { + class: 'flex-row column-selector--filter', + 'data-testid': 'column-selector-filter', + style: 'background: var(--form-field-color); border-radius: 8px; padding: 8px 12px;', + }, + div( + {class: 'flex-column'}, + div( + { class: 'flex-row', 'data-testid': 'column-selector-filter' }, + span({ class: 'text-secondary mr-1', 'data-testid': 'column-selector-filter-label' }, `${TRANSLATIONS[filter.field] ?? filter.field} =`), + span({'data-testid': 'column-selector-filter-value'}, filter.value), + ), + () => { + const expanded_ = getValue(expanded); + if (!expanded_) { + return ''; + } + + return div( + {class: 'flex-column', 'data-testid': 'column-selector-filter-others'}, + filter.others.map((item) => ColumnFilterLine(item.field, item.value)), + ); + }, + ), + filter.others?.length > 0 + ? 
Icon( + { + size: 16, + classes: 'clickable text-secondary ml-1', + 'data-testid': 'column-selector-filter-expand', + onclick: () => expanded.val = !expanded.val, + }, + expandIcon, + ) + : '', + ); +}; + +const ColumnFilterLine = (/** @type string */ field, /** @type string */ value) => { + return div( + { class: 'flex-row', 'data-testid': 'column-selector-filter' }, + span({ class: 'text-secondary mr-1', 'data-testid': 'column-selector-filter-label' }, `${TRANSLATIONS[field] ?? field} =`), + span({'data-testid': 'column-selector-filter-value'}, value), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.column-selector-wrapper { + height: 100%; + overflow-y: hidden; +} + +.column-selector { + height: calc(100% - 48px); + align-items: stretch; +} + +.column-selector--tree { + flex: 1; +} + +.column-selector--divider { + width: 1px; + background-color: var(--grey); + margin: 0 10px; +} + +.column-selector--selected { + flex: 2; + overflow-y: auto; +} +`); + +export { ColumnSelector, ColumnFilter }; diff --git a/testgen/ui/static/js/components/file_input.js b/testgen/ui/static/js/components/file_input.js new file mode 100644 index 00000000..5b49f503 --- /dev/null +++ b/testgen/ui/static/js/components/file_input.js @@ -0,0 +1,241 @@ +/** + * @import {InputState} from './input.js'; + * @import {Validator} from '../form_validators.js'; + * + * @typedef FileValue + * @type {object} + * @property {string} name + * @property {string} content + * @property {number} size + * + * @typedef Options + * @type {object} + * @property {string} label + * @property {string?} placeholder + * @property {string} name + * @property {string} value + * @property {string?} class + * @property {Array?} validators + * @property {function(FileValue?, InputState)?} onChange + * + */ +import van from '../van.min.js'; +import { checkIsRequired, getRandomId, getValue, loadStylesheet } from "../utils.js"; +import { Icon } from './icon.js'; +import { Button } from 
'./button.js'; +import { humanReadableSize } from '../display_utils.js'; + +const { div, input, label, span } = van.tags; + +/** + * File uploader component that emits change events with a base64 + * encoding of the uploaded file. + * + * @param {Options} options + * @returns {HTMLElement} + */ +const FileInput = (options) => { + loadStylesheet('file-uploader', stylesheet); + + const value = van.state(getValue(options.value)); + const inputId = `file-uploader-${getRandomId()}`; + const fileOver = van.state(false); + const cssClass = van.derive(() => `tg-file-uploader flex-column fx-gap-2 ${getValue(options.class) ?? ''}`) + const showLoading = van.state(false); + const loadingIndicatorProgress = van.state(0); + const loadingIndicatorStyle = van.derive(() => `width: ${loadingIndicatorProgress.val}%;`); + const errors = van.derive(() => { + const validators = getValue(options.validators) ?? []; + return validators.map(v => v(value.val)).filter(error => error); + }); + const isRequired = van.state(false); + + van.derive(() => { + isRequired.val = checkIsRequired(getValue(options.validators) ?? []); + }); + + let sizeLimit = undefined; + let sizeLimitValidator = (getValue(options.validators) ?? 
[]).filter(v => v.args?.name === 'sizeLimit')[0]; + if (sizeLimitValidator) { + sizeLimit = sizeLimitValidator.args.limit; + } + + let hasBeenChecked = false; + van.derive(() => { + if (options.onChange && (!hasBeenChecked || value.val !== value.oldVal || errors.val.length !== errors.oldVal.length)) { + options.onChange(value.val, { errors: errors.val, valid: errors.val.length <= 0 }); + } + hasBeenChecked = true; + }); + + const browseFile = () => { + document.getElementById(inputId).click(); + }; + + const loadFile = (event) => { + const selectedFile = event.target.files[0]; + if (!selectedFile) { + value.val = null; + showLoading.val = false; + loadingIndicatorProgress.val = 0; + return; + } + + const fileReader = new FileReader(); + fileReader.addEventListener('loadstart', (event) => { + loadingIndicatorProgress.val = 0; + showLoading.val = event.lengthComputable; + }); + fileReader.addEventListener('progress', (event) => { + if (showLoading.val) { + loadingIndicatorProgress.val = (event.loaded / event.total) * 100; // scale ratio to 0-100 to match the percentage width style + } + }); + fileReader.addEventListener('loadend', (event) => { + loadingIndicatorProgress.val = 100; + value.val = { + name: selectedFile.name, + content: fileReader.result, + size: event.loaded, + }; + }); + + fileReader.readAsDataURL(selectedFile); + }; + + const unloadFile = (event) => { + event.stopPropagation(); + value.val = null; + showLoading.val = false; + loadingIndicatorProgress.val = 0; + }; + + return div( + { class: cssClass }, + label( + { class: 'tg-file-uploader--label text-caption flex-row fx-gap-1' }, + options.label, + () => isRequired.val + ? span({ class: 'text-error' }, '*') + : '', + ), + div( + { class: () => `tg-file-uploader--dropzone flex-column clickable ${fileOver.val ? 
'on-dragover' : ''}` }, + div( + { + onclick: browseFile, + ondragenter: (event) => { + event.preventDefault(); + fileOver.val = true; + }, + ondragleave: (event) => { + if (!event.currentTarget.contains(event.relatedTarget)) { + fileOver.val = false; + } + }, + ondragover: (event) => event.preventDefault(), + ondrop: (/** @type {DragEvent} */event) => { + event.preventDefault(); + fileOver.val = false; + + let files = [...(event.dataTransfer.items ?? [])].filter((item) => item.kind === 'file').map((item) => item.getAsFile()); + if (!event.dataTransfer.items) { + files = [...(event.dataTransfer.files ?? [])]; + } + + loadFile({ target: { files }}); + }, + }, + input({ + id: inputId, + type: 'file', + name: options.name, + tabindex: '-1', + onchange: loadFile, + }), + () => value.val + ? FileSummary(value.val, unloadFile) + : FileSelectionDropZone(options.placeholder ?? 'Drop file here or browse files', sizeLimit) + ), + () => showLoading.val + ? div({ class: 'tg-file-uploader--loading', style: loadingIndicatorStyle }, '') + : '', + ), + ); +}; + +/** + * + * @param {string} placeholder + * @param {number} sizeLimit + * @returns + */ +const FileSelectionDropZone = (placeholder, sizeLimit) => { + return div( + { class: 'flex-row fx-gap-4' }, + Icon({size: 48}, 'cloud_upload'), + div( + { class: 'flex-column fx-gap-1' }, + span({}, placeholder), + span({ class: 'text-secondary text-caption' }, `Limit ${humanReadableSize(sizeLimit)} per file`), + ), + ); +}; + +const FileSummary = (value, onFileUnload) => { + const fileName = getValue(value).name; + const fileSize = humanReadableSize(getValue(value).size); + + return div( + { class: 'flex-row fx-gap-4' }, + Icon({size: 48}, 'draft'), + div( + { class: 'flex-column fx-gap-1' }, + span({}, fileName), + span({ class: 'text-secondary text-caption' }, `Size: ${fileSize}`), + ), + span({ style: 'margin: 0px auto;'}), + Button({ + type: 'icon', + color: 'basic', + icon: 'close', + onclick: onFileUnload, + }), + ); +}; + 
+const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-file-uploader { +} + +.tg-file-uploader--dropzone { + border-radius: 8px; + background: var(--form-field-color); + padding: 16px; + position: relative; + border: 1px transparent dashed; +} + +.tg-file-uploader--dropzone.on-dragover { + border-color: var(--primary-color); +} + +.tg-file-uploader--dropzone input[type="file"] { + display: none; +} + +.tg-file-uploader--loading { + height: 3px; + background: var(--primary-color); + position: absolute; + width: 0%; + left: 0; + bottom: 0; + border-bottom-left-radius: 8px; + border-bottom-right-radius: 8px; + transition: 200ms width ease-in; +} +`); + +export { FileInput }; diff --git a/testgen/ui/static/js/components/frequency_bars.js b/testgen/ui/static/js/components/frequency_bars.js new file mode 100644 index 00000000..d26073ce --- /dev/null +++ b/testgen/ui/static/js/components/frequency_bars.js @@ -0,0 +1,121 @@ +/** + * @typedef FrequencyItem + * @type {object} + * @property {string} value + * @property {number} count + * + * @typedef Properties + * @type {object} + * @property {FrequencyItem[]} items + * @property {number} total + * @property {number} nullCount + * @property {string} title + * @property {string?} color + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { colorMap, formatNumber } from '../display_utils.js'; + +const { div, span } = van.tags; +const defaultColor = 'teal'; +const otherColor = colorMap['emptyTeal']; +const nullColor = colorMap['emptyLight']; + +const FrequencyBars = (/** @type Properties */ props) => { + loadStylesheet('frequencyBars', stylesheet); + + const total = van.derive(() => getValue(props.total)); + const nullCount = van.derive(() => getValue(props.nullCount)); + const color = van.derive(() => { + const colorValue = getValue(props.color) || defaultColor; + return colorMap[colorValue] || colorValue; + }); + const width = van.derive(() => { + const maxCount 
= getValue(props.items).reduce((max, { count }) => Math.max(max, count), 0); + return String(maxCount).length * 7; + }); + + return () => div( + div( + { class: 'mb-2 text-secondary' }, + props.title, + ), + getValue(props.items).map(({ value, count }) => { + return div( + { class: 'flex-row fx-gap-2' }, + div( + { class: 'tg-frequency-bars' }, + span({ + class: 'tg-frequency-bars--fill', + style: `width: 100%; background-color: ${nullColor};`, + }), + span({ + class: 'tg-frequency-bars--fill', + style: () => `width: ${(total.val - nullCount.val) * 100 / total.val}%; + ${(total.val - nullCount.val) ? 'min-width: 1px;' : ''} + background-color: ${otherColor};`, + }), + span({ + class: 'tg-frequency-bars--fill', + style: () => `width: ${count * 100 / total.val}%; + ${count ? 'min-width: 1px;' : ''} + background-color: ${color.val};`, + }), + ), + div( + { + class: 'text-caption tg-frequency-bars--count', + style: () => `width: ${width.val}px;`, + }, + formatNumber(count), + ), + div(value), + ); + }), + div( + { class: 'tg-frequency-bars--legend flex-row fx-flex-wrap text-caption mt-1' }, + span({ class: 'dot', style: `color: ${color.val};` }), + 'Value', + span({ class: 'dot', style: `color: ${otherColor};` }), + 'Other', + span({ class: 'dot', style: `color: ${nullColor};` }), + 'Null', + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-frequency-bars { + width: 150px; + height: 15px; + flex-shrink: 0; + position: relative; +} + +.tg-frequency-bars--fill { + position: absolute; + border-radius: 4px; + height: 100%; +} + +.tg-frequency-bars--count { + flex-shrink: 0; + text-align: right; +} + +.tg-frequency-bars--legend { + font-style: italic; +} + +.tg-frequency-bars--legend span { + margin-right: 2px; + font-size: 4px; +} + +.tg-frequency-bars--legend span:not(:first-child) { + margin-left: 8px; +} +`); + +export { FrequencyBars }; diff --git a/testgen/ui/static/js/components/freshness_chart.js 
b/testgen/ui/static/js/components/freshness_chart.js new file mode 100644 index 00000000..919136fa --- /dev/null +++ b/testgen/ui/static/js/components/freshness_chart.js @@ -0,0 +1,211 @@ +/** + * @import {ChartViewBox, Point} from './chart_canvas.js'; + * + * @typedef Options + * @type {object} + * @property {number} width + * @property {number} height + * @property {number} lineWidth + * @property {number} lineHeight + * @property {number} markerSize + * @property {Point?} nestedPosition + * @property {ChartViewBox?} viewBox + * @property {Function?} showTooltip + * @property {Function?} hideTooltip + * @property {{startX: number?, endX: number, startTime: number?, endTime: number}?} predictedWindow + * + * @typedef FreshnessEvent + * @type {object} + * @property {Point} point + * @property {number} time + * @property {boolean} changed + * @property {string} status + * @property {string} message + * @property {boolean} isTraining + * @property {boolean} isPending + */ +import van from '../van.min.js'; +import { colorMap, formatTimestamp } from '../display_utils.js'; +import { getValue } from '../utils.js'; + +const { div, span } = van.tags; +const { circle, g, line, rect, svg } = van.tags("http://www.w3.org/2000/svg"); +const colorByStatus = { + Passed: colorMap.limeGreen, + Failed: colorMap.red, + Warning: colorMap.orange, + Log: colorMap.blueLight, +}; + +/** + * @param {Options} options + * @param {Array} events + */ +const FreshnessChart = (options, ...events) => { + const _options = { + ...defaultOptions, + ...(options ?? 
{}), + }; + + const minX = van.state(0); + const minY = van.state(0); + const width = van.state(0); + const height = van.state(0); + + van.derive(() => { + const viewBox = getValue(_options.viewBox); + width.val = viewBox?.width; + height.val = viewBox?.height; + minX.val = viewBox?.minX; + minY.val = viewBox?.minY; + }); + + const freshnessEvents = events.map(event => { + if (event.isPending) { + return null; + } + + const point = event.point; + const minY = point.y - (_options.lineHeight / 2) + 2; + const maxY = point.y + (_options.lineHeight / 2) - 2; + const lineProps = { x1: point.x, y1: minY, x2: point.x, y2: maxY }; + const eventColor = getFreshnessEventColor(event); + const markerProps = _options.showTooltip ? { + onmouseenter: () => _options.showTooltip?.(FreshnessChartTooltip(event), point), + onmouseleave: () => _options.hideTooltip?.(), + } : {}; + + return g( + {...markerProps}, + event.changed + ? line({ + ...lineProps, + style: `stroke: ${eventColor}; stroke-width: ${event.isTraining ? '1' : _options.lineWidth};`, + }) + : null, + !['Passed', 'Log'].includes(event.status) + ? rect({ + width: _options.markerSize, + height: _options.markerSize, + x: lineProps.x1 - (_options.markerSize / 2), + y: maxY - (_options.markerSize / 2), + fill: eventColor, + style: `transform-box: fill-box; transform-origin: center;`, + transform: 'rotate(45)', + }) + : circle({ + cx: lineProps.x1, + cy: maxY, + r: 2, + fill: event.isTraining ? 
'var(--dk-dialog-background)' : eventColor, + style: `stroke: ${eventColor}; stroke-width: 1;`, + }), + // Larger hit area for tooltip + rect({ + width: _options.markerSize, + height: _options.lineHeight, + x: lineProps.x1 - (_options.markerSize / 2), + y: 0, + fill: 'transparent', + style: `transform-box: fill-box; transform-origin: center;`, + }) + ); + }); + + const extraAttributes = {}; + if (_options.nestedPosition) { + extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x; + extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y; + } else { + extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`; + } + + return svg( + { + width: '100%', + height: '100%', + ...extraAttributes, + }, + ...freshnessEvents, + FreshnessPredictedWindow(_options), + ); +}; + +const /** @type Options */ defaultOptions = { + width: 600, + height: 200, + lineWidth: 2, + lineHeight: 5, + markerSize: 8, + nestedPosition: {x: 0, y: 0}, +}; + +/** + * @param {FreshnessEvent} event + * @returns + */ +const getFreshnessEventColor = (event) => { + if (!event.changed && (event.status === 'Passed' || event.isTraining)) { + return colorMap.emptyDark; + } + return colorByStatus[event.status]; +} + +/** + * @param {FreshnessEvent} event + * @returns {HTMLDivElement} + */ +const FreshnessChartTooltip = (event) => { + return div( + {class: 'flex-column'}, + span({class: 'text-left mb-1'}, formatTimestamp(event.time, false)), + span( + {class: 'text-left text-small'}, + `${event.changed ? 'Table updated' : 'No update'}${event.message ? ' - ' + event.message : ''}`, + ), + ); +}; + +/** + * @param {Options} options + * @returns {SVGGElement|null} + */ +const FreshnessPredictedWindow = (options) => { + const window = getValue(options.predictedWindow); + if (!window) return null; + + const barHeight = getValue(options.height); + const startX = window.startX ?? 
window.endX; + const windowWidth = window.endX - startX; + if (windowWidth <= 0) return null; + + const markerProps = options.showTooltip ? { + onmouseenter: () => options.showTooltip?.(FreshnessWindowTooltip(window), {x: startX + windowWidth / 2, y: barHeight / 2}), + onmouseleave: () => options.hideTooltip?.(), + } : {}; + + return g( + {...markerProps}, + rect({ + width: windowWidth, + height: barHeight, + x: startX, + y: 0, + fill: colorMap.emptyDark, + opacity: 0.15, + rx: 2, + }), + ); +}; + +const FreshnessWindowTooltip = (window) => { + return div( + {class: 'flex-column'}, + span({class: 'text-left mb-1'}, 'Next update expected'), + window.startTime + ? span({class: 'text-left text-small'}, `${formatTimestamp(window.startTime, false)} - ${formatTimestamp(window.endTime, false)}`) + : span({class: 'text-left text-small'}, `By ${formatTimestamp(window.endTime, false)}`), + ); +}; + +export { FreshnessChart }; diff --git a/testgen/ui/static/js/components/help_menu.js b/testgen/ui/static/js/components/help_menu.js new file mode 100644 index 00000000..3ea341db --- /dev/null +++ b/testgen/ui/static/js/components/help_menu.js @@ -0,0 +1,161 @@ +/** + * @typedef Version + * @type {object} + * @property {string} edition + * @property {string} current + * @property {string} latest + * + * @typedef Permissions + * @type {object} + * @property {boolean} can_edit + * + * @typedef Properties + * @type {object} + * @property {string} help_topic + * @property {string} support_email + * @property {Version} version + * @property {Permissions} permissions +*/ +import van from '../van.min.js'; +import { emitEvent, getRandomId, getValue, loadStylesheet, resizeFrameHeightOnDOMChange, resizeFrameHeightToElement } from '../utils.js'; +import { Streamlit } from '../streamlit.js'; +import { Icon } from './icon.js'; + +const { a, div, span } = van.tags; + +const baseHelpUrl = 'https://docs.datakitchen.io/articles/dataops-testgen-help/'; +const releaseNotesTopic = 
'testgen-release-notes'; +const upgradeTopic = 'upgrade-testgen'; + +const slackUrl = 'https://data-observability-slack.datakitchen.io/join'; +const trainingUrl = 'https://info.datakitchen.io/data-quality-training-and-certifications'; + +const HelpMenu = (/** @type Properties */ props) => { + loadStylesheet('help-menu', stylesheet); + Streamlit.setFrameHeight(1); + window.testgen.isPage = true; + + const domId = `help-menu-${getRandomId()}`; + const version = getValue(props.version) ?? {}; + + resizeFrameHeightToElement(domId); + resizeFrameHeightOnDOMChange(domId); + + return div( + { id: domId }, + div( + { class: 'flex-column pt-3' }, + getValue(props.help_topic) + ? HelpLink(`${baseHelpUrl}${getValue(props.help_topic)}`, 'Help for this Page', 'description') + : null, + HelpLink(baseHelpUrl, 'TestGen Help', 'help'), + HelpLink(trainingUrl, 'Training Portal', 'school'), + getValue(props.permissions)?.can_edit + ? div( + { class: 'help-item', onclick: () => emitEvent('AppLogsClicked') }, + Icon({ classes: 'help-item-icon' }, 'browse_activity'), + 'Application Logs', + ) + : null, + span({ class: 'help-divider' }), + HelpLink(slackUrl, 'Slack Community', 'group'), + getValue(props.support_email) + ? HelpLink( + `mailto:${getValue(props.support_email)} + ?subject=${version.edition}: Contact Support + &body=%0D%0D%0DVersion: ${version.edition} ${version.current}`, + 'Contact Support', + 'email', + ) + : null, + span({ class: 'help-divider' }), + version.current || version.latest + ? div( + { class: 'help-version' }, + version.current + ? HelpLink(`${baseHelpUrl}${releaseNotesTopic}`, `${version.edition} ${version.current}`, null, null) + : null, + version.latest !== version.current + ? HelpLink( + `${baseHelpUrl}${upgradeTopic}`, + `New version available! ${version.latest}`, + null, + 'latest', + ) + : null, + ) + : null, + ), + ); +} + +const HelpLink = ( + /** @type string */ url, + /** @type string */ label, + /** @type string? 
*/ icon, + /** @type string */ classes = 'help-item', +) => { + return a( + { + class: classes, + href: url, + target: '_blank', + onclick: () => emitEvent('ExternalLinkClicked'), + }, + icon ? Icon({ classes: 'help-item-icon' }, icon) : null, + label, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.help-item { + padding: 12px 24px; + color: var(--primary-text-color); + text-decoration: none; + display: flex; + align-items: center; + gap: 8px; + cursor: pointer; + transition: 0.3s; +} + +.help-item:hover { + background-color: var(--select-hover-background); + color: var(--primary-color); +} + +.help-item-icon { + color: var(--primary-text-color); + transition: 0.3s; +} + +.help-item:hover .help-item-icon { + color: var(--primary-color); +} + +.help-divider { + height: 1px; + background-color: var(--border-color); + margin: 0 16px; +} + +.help-version { + padding: 16px 16px 8px; + display: flex; + flex-direction: column; + align-items: flex-end; + gap: 8px; +} + +.help-version > a { + color: var(--secondary-text-color); + text-decoration: none; +} + +.help-version > a.latest { + color: var(--red); +} +`); + +export { HelpMenu }; diff --git a/testgen/ui/static/js/components/icon.js b/testgen/ui/static/js/components/icon.js new file mode 100644 index 00000000..6f76331b --- /dev/null +++ b/testgen/ui/static/js/components/icon.js @@ -0,0 +1,46 @@ +/** + * @typedef Properties + * @type {object} + * @property {string?} classes + * @property {number?} size + * @property {boolean?} filled + */ +import { getValue, isDataURL, loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; + +const { i, img } = van.tags; +const DEFAULT_SIZE = 20; + +const Icon = (/** @type Properties */ props, /** @type string */ icon) => { + loadStylesheet('icon', stylesheet); + + if (isDataURL(getValue(icon))) { + return img( + { + width: () => getValue(props.size) || DEFAULT_SIZE, + height: () => getValue(props.size) || DEFAULT_SIZE, src: icon, + class: () 
=> `tg-icon tg-icon-image ${getValue(props.classes) ?? ''}`, + } + ); + } + + return i( + { + class: () => `material-symbols-rounded tg-icon text-secondary ${getValue(props.filled) ? 'material-symbols-filled' : ''} ${getValue(props.classes) ?? ''}`, + style: () => `font-size: ${getValue(props.size) || DEFAULT_SIZE}px;`, + ...props, + }, + icon, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-icon { + position: relative; + cursor: default; +} +`); + +export { Icon }; diff --git a/testgen/ui/static/js/components/input.js b/testgen/ui/static/js/components/input.js new file mode 100644 index 00000000..130aba5c --- /dev/null +++ b/testgen/ui/static/js/components/input.js @@ -0,0 +1,333 @@ +/** + * @import { Properties as TooltipProperties } from './tooltip.js'; + * @import { Validator } from '../form_validators.js'; + * + * @typedef InputState + * @type {object} + * @property {boolean} valid + * @property {string[]} errors + * + * @typedef Properties + * @type {object} + * @property {string?} id + * @property {string?} name + * @property {string?} label + * @property {string?} help + * @property {TooltipProperties['position']} helpPlacement + * @property {(string | number)?} value + * @property {string?} placeholder + * @property {string[]?} autocompleteOptions + * @property {string?} icon + * @property {boolean?} clearable + * @property {('value' | 'always')?} clearableCondition + * @property {boolean?} passwordSuggestions + * @property {function(string, InputState)?} onChange + * @property {boolean?} disabled + * @property {boolean?} readonly + * @property {function(string, InputState)?} onClear + * @property {number?} width + * @property {number?} height + * @property {string?} style + * @property {string?} type + * @property {string?} class + * @property {string?} testId + * @property {any?} prefix + * @property {number} step + * @property {Array?} validators + */ +import van from '../van.min.js'; +import { debounce, 
getValue, loadStylesheet, getRandomId, checkIsRequired } from '../utils.js'; +import { Icon } from './icon.js'; +import { withTooltip } from './tooltip.js'; +import { Portal } from './portal.js'; +import { caseInsensitiveIncludes } from '../display_utils.js'; + +const { div, input, label, i, small, span } = van.tags; +const defaultHeight = 38; +const iconSize = 22; +const addonIconSize = 20; +const passwordFieldTypeSwitch = { + password: 'text', + text: 'password', +}; + +const Input = (/** @type Properties */ props) => { + loadStylesheet('input', stylesheet); + + const domId = van.derive(() => getValue(props.id) ?? getRandomId()); + const value = van.derive(() => getValue(props.value) ?? ''); + const errors = van.derive(() => { + const validators = getValue(props.validators) ?? []; + return validators.map(v => v(value.val)).filter(error => error); + }); + const firstError = van.derive(() => { + return errors.val[0] ?? ''; + }); + const originalInputType = van.derive(() => getValue(props.type) ?? 'text'); + const inputType = van.state(originalInputType.rawVal); + + const isRequired = van.state(false); + const isDirty = van.state(false); + const onChange = props.onChange?.val ?? props.onChange; + if (onChange) { + onChange(value.val, { errors: errors.val, valid: errors.val.length <= 0 }); + } + van.derive(() => { + const onChange = props.onChange?.val ?? props.onChange; + if (onChange && (value.val !== value.oldVal || errors.val.length !== errors.oldVal.length)) { + onChange(value.val, { errors: errors.val, valid: errors.val.length <= 0 }); + } + }); + + van.derive(() => { + isRequired.val = checkIsRequired(getValue(props.validators) ?? []); + }); + + const onClear = props.onClear?.val ?? props.onClear ?? 
(() => value.val = ''); + + const autocompleteOpened = van.state(false); + const autocompleteOptions = van.derive(() => { + const filtered = getValue(props.autocompleteOptions)?.filter(option => caseInsensitiveIncludes(option, value.val)); + if (!filtered?.length) { + autocompleteOpened.val = false; + } + return filtered; + }); + const onAutocomplete = (/** @type string */ option) => { + autocompleteOpened.val = false; + value.val = option; + }; + + return label( + { + id: domId, + class: () => `flex-column fx-gap-1 tg-input--label ${getValue(props.class) ?? ''}`, + style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}; ${getValue(props.style)}`, + 'data-testid': props.testId ?? props.name ?? '', + }, + div( + { class: 'flex-row fx-gap-1 text-caption' }, + props.label, + () => isRequired.val + ? span({ class: 'text-error' }, '*') + : '', + () => getValue(props.help) + ? withTooltip( + Icon({ size: 16, classes: 'text-disabled' }, 'help'), + { text: props.help, position: getValue(props.helpPlacement) ?? 'top', width: 200 } + ) + : null, + ), + div( + { + class: () => { + const sufixIconCount = Number(value.val && originalInputType.val === 'password') + Number(value.val && getValue(props.clearable)); + return `flex-row tg-input--field ${getValue(props.disabled) ? 'tg-input--disabled' : ''} sufix-padding-${sufixIconCount}`; + }, + style: () => `height: ${getValue(props.height) || defaultHeight}px;`, + }, + props.prefix + ? div( + { class: 'tg-input--field-prefix' }, + props.prefix, + ) + : undefined, + input({ + value, + name: props.name ?? '', + type: inputType, + disabled: props.disabled, + ...(inputType.val === 'number' ? {step: getValue(props.step)} : {}), + ...(props.readonly ? {readonly: true} : {}), + ...(props.passwordSuggestions ?? true ? {} : {autocomplete: 'off', 'data-op-ignore': true}), + placeholder: () => getValue(props.placeholder) ?? 
'', + oninput: debounce((/** @type Event */ event) => { + isDirty.val = true; + value.val = event.target.value; + }, 300), + onclick: van.derive(() => autocompleteOptions.val?.length + ? () => autocompleteOpened.val = true + : null + ), + }), + () => getValue(props.icon) ? i( + { + class: 'material-symbols-rounded tg-input--icon text-secondary', + style: `top: ${((getValue(props.height) || defaultHeight) - iconSize) / 2}px`, + }, + props.icon, + ) : '', + () => { + const clearableCondition = getValue(props.clearableCondition) ?? 'value'; + const showClearable = getValue(props.clearable) && ( + clearableCondition === 'always' + || (clearableCondition === 'value' && value.val) + ); + + return div( + { class: 'flex-row tg-input--icon-actions' }, + originalInputType.val === 'password' && value.val + ? i( + { + class: 'material-symbols-rounded tg-input--visibility clickable text-secondary', + style: `top: ${((getValue(props.height) || defaultHeight) - addonIconSize) / 2}px`, + onclick: () => inputType.val = passwordFieldTypeSwitch[inputType.val], + }, + inputType.val === 'password' ? 'visibility' : 'visibility_off', + ) + : '', + showClearable + ? i( + { + class: () => `material-symbols-rounded tg-input--clear text-secondary clickable`, + style: `top: ${((getValue(props.height) || defaultHeight) - addonIconSize) / 2}px`, + onclick: onClear, + }, + 'clear', + ) + : '', + ); + }, + ), + () => + isDirty.val && firstError.val + ? 
small({ class: 'tg-input--error' }, firstError) + : '', + Portal( + { target: domId.val, targetRelative: true, opened: autocompleteOpened }, + () => div( + { class: 'tg-input--options-wrapper' }, + autocompleteOptions.val?.map(option => + div( + { + class: 'tg-input--option', + onclick: (/** @type Event */ event) => { + // https://stackoverflow.com/questions/61273446/stop-click-event-propagation-on-a-label + event.preventDefault(); + event.stopPropagation(); + onAutocomplete(option); + }, + }, + option, + ) + ), + ), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-input--field { + position: relative; +} + +.tg-input--icon { + position: absolute; + left: 4px; + font-size: ${iconSize}px; +} + +.tg-input--field:has(.tg-input--icon) { + padding-left: 28px; +} + +.tg-input--icon-actions { + position: absolute; + right: 8px; +} + +.tg-input--clear, +.tg-input--visibility { + font-size: ${addonIconSize}px; +} + +.tg-input--field.sufix-padding-1 { + padding-right: ${addonIconSize + 8}px; +} + +.tg-input--field.sufix-padding-2 { + padding-right: ${addonIconSize * 2 + 8 * 2}px;; +} + +.tg-input--field { + box-sizing: border-box; + width: 100%; + border-radius: 8px; + border: 1px solid transparent; + transition: border-color 0.3s; + background-color: var(--form-field-color); + color: var(--primary-text-color); + font-size: 14px; +} +.tg-input--field > .tg-input--field-prefix { + padding-left: 8px; +} +.tg-input--field > input { + width: 100%; + height: 100%; + box-sizing: border-box; + font-size: 14px; + background-color: var(--form-field-color); + color: var(--primary-text-color); + border: unset; + padding: 4px 8px; + border-radius: 8px; + outline: none; +} + +.tg-input--field > input::placeholder { + font-style: italic; + color: var(--disabled-text-color); +} + +.tg-input--field:has(input:focus), +.tg-input--field:has(input:focus-visible) { + border-color: var(--primary-color); +} + +.tg-input--options-wrapper { + border-radius: 8px; + 
background: var(--portal-background); + box-shadow: var(--portal-box-shadow); + min-height: 40px; + max-height: 400px; + overflow: auto; + z-index: 99; +} + +.tg-input--options-wrapper > .tg-input--option:first-child { + border-top-left-radius: 8px; + border-top-right-radius: 8px; +} + +.tg-input--options-wrapper > .tg-input--option:last-child { + border-bottom-left-radius: 8px; + border-bottom-right-radius: 8px; +} + +.tg-input--option { + display: flex; + align-items: center; + height: 32px; + padding: 0px 8px; + cursor: pointer; + font-size: 14px; + color: var(--primary-text-color); +} +.tg-input--option:hover { + background: var(--select-hover-background); +} + +.tg-input--disabled > input { + cursor: not-allowed; + color: var(--disabled-text-color); +} + +.tg-input--label > .tg-input--error { + height: 12px; + color: var(--error-color); +} +`); + +export { Input }; diff --git a/testgen/ui/static/js/components/line_chart.js b/testgen/ui/static/js/components/line_chart.js new file mode 100644 index 00000000..fd16bd06 --- /dev/null +++ b/testgen/ui/static/js/components/line_chart.js @@ -0,0 +1,317 @@ +/** + * @import { Point } from './spark_line.js'; + * + * @typedef TrendChartOptions + * @type {object} + * @property {number?} width + * @property {number?} height + * @property {Ticks?} ticks + * @property {number?} xMinSpanBetweenTicks + * @property {number?} yMinSpanBetweenTicks + * @property {number?} padding + * @property {number?} xAxisLeftPadding + * @property {number?} xAxisRightPadding + * @property {number?} yAxisTopPadding + * @property {number?} yAxisBottomPadding + * @property {string?} axisColor + * @property {number?} axisWidth + * @property {number?} tooltipOffsetX + * @property {number?} tooltipOffsetY + * @property {TrendChartFormatters?} formatters + * @property {TrendChartValueGetters?} getters + * @property {Function?} lineDiscriminator + * @property {Function?} lineColor + * @property {Function?} onShowPointTooltip + * @property {Function?} 
onRefreshClicked + * + * @typedef Ticks + * @type {object} + * @property {Array} x + * @property {Array} y + * + * @typedef TrendChartValueGetters + * @type {object} + * @property {(item: any) => number} x + * @property {(item: any) => number} y + * + * @typedef TrendChartFormatters + * @type {object} + * @property {(tick: number) => string} x + * @property {(tick: number) => string} y + * + * @typedef TrendLegendOptions + * @type {object} + * @property {Point} origin + * @property {Point} end + * @property {string?} refreshTooltip + * @property {() => void} onRefreshClicked + * @property {(lineId: string) => void} onLineClicked + * @property {(lineId: string) => void} onLineMouseEnter + * @property {(lineId: string) => void} onLineMouseLeave + */ +import van from '../van.min.js'; +import { getValue } from '../utils.js'; +import { colorMap } from '../display_utils.js'; +import { Tooltip } from './tooltip.js'; +import { SparkLine } from './spark_line.js'; +import { Button } from './button.js'; +import { scale } from '../axis_utils.js'; + +const { div, i, span } = van.tags(); +const { circle, foreignObject, g, line, polyline, svg, text } = van.tags("http://www.w3.org/2000/svg"); + +/** + * Draws 2D coordinate system and sparklines inside. + * + * @param {TrendChartOptions} options + * @param {Array | Array} values + */ +const LineChart = ( + options, + ...values +) => { + const _options = { + ...defaultOptions, + ...(options ?? 
{}), + }; + const variables = { + 'axis-color': _options.axisColor, + 'axis-width': _options.axisWidth, + 'line-width': _options.lineWidth, + }; + const style = Object.entries(variables).map(([key, value]) => `--${key}: ${value}`).join(';'); + const origin = {x: _options.padding, y: _options.padding}; + const end = {x: _options.width - _options.padding, y: _options.height - _options.padding}; + const xAxis = {x1: origin.x, y1: end.y, x2: end.x, y2: end.y}; + const yAxis = {x1: end.x, y1: origin.y, x2: end.x, y2: end.y}; + + let /** @type {Array} */ xValues = _options.ticks?.x; + let /** @type {Array} */ yValues = _options.ticks?.y; + + if (!xValues) { + xValues = Array.from(values.reduce((set, v) => set.add(_options.getters.x(v)), new Set())) + .sort((a, b) => a - b); + } + + if (!yValues) { + yValues = Array.from(values.reduce((set, v) => set.add(_options.getters.y(v)), new Set())) + .sort((a, b) => a - b); + } + + const xTicks = xValues.filter((value, idx, ticks) => { + return idx === 0 || ((value - ticks[idx - 1]) >= _options.xMinSpanBetweenTicks); + }).map((value) => ({ value, label: _options.formatters.x(value) })); + const yTicks = yValues.filter((value, idx, ticks) => { + return idx === 0 || ((value - ticks[idx - 1]) >= _options.yMinSpanBetweenTicks); + }).map((value) => ({ value, label: _options.formatters.y(value) })); + + const asSVGX = (/** @type {number} */ value) => { + return scale(value, { + old: {min: Math.min(...xValues), max: Math.max(...xValues)}, + new: {min: origin.x + _options.xAxisLeftPadding, max: end.x - _options.xAxisRightPadding}, + }, origin.x + _options.xAxisLeftPadding); + }; + const asSVGY = (/** @type {number} */ value) => { + return _options.height - scale(value, { + old: {min: Math.min(...yValues), max: Math.max(...yValues)}, + new: {min: origin.y + _options.yAxisBottomPadding, max: end.y - _options.yAxisTopPadding}, + }, end.y - _options.yAxisTopPadding); + }; + + const lines = values + .map(v => ({...v, x: 
asSVGX(_options.getters.x(v)), y: asSVGY(_options.getters.y(v))})) + .reduce((lines, value) => { + const lineId = _options.lineDiscriminator(value); + if (!Object.keys(lines).includes(String(lineId))) { + lines[lineId] = []; + } + lines[lineId].push(value); + return lines; + }, {}); + const linesStates = Object.keys(lines).reduce((result, lineId) => ({ + ...result, + [lineId]: { + dimmed: van.state(false), + hidden: van.state(false), + }, + }), {}); + const linesOpacity = Object.entries(linesStates).reduce((result, [lineId, {dimmed, hidden}]) => ({ + ...result, + [lineId]: van.derive(() => (getValue(dimmed) || getValue(hidden)) ? 0.2 : 1.0), + }), {}); + + function dimAllExcept(lineId) { + if (linesStates[lineId].hidden.val) { + return; + } + + Object.values(linesStates).forEach(states => states.dimmed.val = true); + linesStates[lineId].dimmed.val = false; + } + + function resetDimmedLines() { + Object.values(linesStates).forEach(states => states.dimmed.val = false); + } + + function toggleLineVisibility(lineId) { + linesStates[lineId].hidden.val = !linesStates[lineId].hidden.val; + } + + const tooltipText = van.state(''); + const showTooltip = van.state(false); + const tooltipExtraStyle = van.state(''); + const tooltip = Tooltip({ + text: tooltipText, + show: showTooltip, + position: '--', + style: tooltipExtraStyle, + }); + + return svg( + { + width: '100%', + height: '100%', + viewBox: `0 0 ${_options.width} ${_options.height}`, + style: `${style}; overflow: visible;`, + }, + + Legend( + { + origin, + end, + refreshTooltip: 'Recalculate Trend', + onLineMouseEnter: dimAllExcept, + onLineMouseLeave: resetDimmedLines, + onLineClicked: toggleLineVisibility, + onRefreshClicked: _options.onRefreshClicked, + }, + Object.entries(lines).map(([lineId, _], idx) => ({ id: lineId, color: _options.lineColor(lineId, idx), opacity: linesOpacity[lineId] })), + ), + + line({...xAxis, style: 'stroke: var(--axis-color); stroke-width: var(--axis-width)'}), + xTicks.map(({ value }) 
=> circle({ cx: asSVGX(value), cy: end.y, r: 2, 'pointer-events': 'none', fill: 'var(--axis-color)' })), + xTicks.map(({ value, label }) => { + const dx = Math.max(5, label.length * 5.5 / 2); + return text({x: asSVGX(value), y: end.y, dx: -dx, dy: 20, style: 'stroke: var(--axis-color); stroke-width: .1; fill: var(--axis-color);' }, label); + }), + + line({...yAxis, style: 'stroke: var(--axis-color); stroke-width: var(--axis-width)'}), + yTicks.map(({ value, label }) => text({ + x: end.x, + y: asSVGY(value), + dx: 5, + dy: 5, + style: 'stroke: var(--axis-color); stroke-width: .1; fill: var(--axis-color);' }, + label, + )), + + Object.entries(lines).map(([lineId, line], idx) => + SparkLine( + { + color: _options.lineColor(lineId, idx), + stroke: _options.lineWidth, + opacity: linesOpacity[lineId], + hidden: linesStates[lineId].hidden, + interactive: _options.onShowPointTooltip != undefined, + onPointMouseEnter: (point, line) => { + tooltipText.val = _options.onShowPointTooltip?.(point, line); + tooltipExtraStyle.val = `transform: translate(${point.x + _options.tooltipOffsetX}px, ${point.y + _options.tooltipOffsetY}px);`; + showTooltip.val = true; + }, + onPointMouseLeave: () => { + tooltipText.val = ''; + tooltipExtraStyle.val = ''; + showTooltip.val = false; + }, + testId: lineId, + }, + line, + ) + ), + + _options.onShowPointTooltip + ? foreignObject({fill: 'none', width: '100%', height: '100%', 'pointer-events': 'none', style: 'overflow: visible;'}, tooltip) + : '', + ); +}; + +/** + * Renders a representation of each line displayed in the chart and allows reacting to events on each. 
+ * + * @param {TrendLegendOptions} options + * @param {Array<{lineId: string, color: string, opacity: number}>} lines + */ +const Legend = (options, lines) => { + const title = 'Score Trend'; + const lineLength = 15; + const lineHeight = 4; + + return foreignObject( + { + x: 0, + y: 0, + width: '100%', + height: '40', + overflow: 'visible', + }, + div( + {class: 'flex-row pt-2 pl-6 pr-6'}, + span({class: 'mr-1 text-secondary', style: 'font-size: 16px; font-weight: 500;'}, title), + options?.onRefreshClicked ? + Button({ + type: 'icon', + icon: 'refresh', + style: 'width: 32px; height: 32px;', + tooltip: options?.refreshTooltip || null, + onclick: options?.onRefreshClicked, + 'data-testid': 'refresh-history', + }) + : null, + div( + {class: 'flex-row ml-7', style: 'margin-right: auto;'}, + ...lines.map((line) => + div( + { + class: 'flex-row clickable mr-3', + style: () => `opacity: ${getValue(line.opacity)}`, + onclick: () => options?.onLineClicked(line.id), + onmouseenter: () => options?.onLineMouseEnter(line.id), + onmouseleave: () => options?.onLineMouseLeave(line.id), + }, + i({style: `width: ${lineLength}px; height: ${lineHeight}px; background: ${line.color}; display: block; margin-right: 2px; border-radius: 10px;`}), + span({class: 'text-caption'}, line.id), + ) + ), + ), + ) + ); +}; + +const defaultOptions = { + width: 600, + height: 200, + padding: 32, + xMinSpanBetweenTicks: 10, + yMinSpanBetweenTicks: 10, + xAxisLeftPadding: 16, + xAxisRightPadding: 16, + yAxisTopPadding: 16, + yAxisBottomPadding: 16, + axisColor: colorMap.grey, + axisWidth: 2, + lineWidth: 3, + tooltipOffsetX: 10, + tooltipOffsetY: 10, + formatters: { + x: String, + y: String, + }, + getters: { + x: (/** @type {Point} */ item) => item.x, + y: (/** @type {Point} */ item) => item.y, + }, + lineDiscriminator: (/** @type {Point} */ item) => '0', + lineColor: (lineId, idx) => ['blue', 'green', 'yellow', 'brown'][idx] ?? 
'grey', +}; + +export { LineChart }; diff --git a/testgen/ui/static/js/components/link.js b/testgen/ui/static/js/components/link.js new file mode 100644 index 00000000..f92f3fb2 --- /dev/null +++ b/testgen/ui/static/js/components/link.js @@ -0,0 +1,137 @@ +/** + * @typedef Properties + * @type {object} + * @property {string} href + * @property {object} params + * @property {string} label + * @property {boolean} open_new + * @property {boolean} underline + * @property {string?} left_icon + * @property {number?} left_icon_size + * @property {string?} right_icon + * @property {number?} right_icon_size + * @property {number?} height + * @property {number?} width + * @property {string?} style + * @property {string?} class + * @property {string?} tooltip + * @property {string?} tooltipPosition + * @property {boolean?} disabled + * @property {((event: any) => void)?} onClick + */ +import { emitEvent, enforceElementWidth, getValue, loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { Tooltip } from './tooltip.js'; + +const { a, div, i, span } = van.tags; + +const Link = (/** @type Properties */ props) => { + loadStylesheet('link', stylesheet); + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(getValue(props.height) || 24); + const width = getValue(props.width); + if (width) { + enforceElementWidth(window.frameElement, width); + } + if (props.tooltip) { + window.frameElement.parentElement.setAttribute('data-tooltip', props.tooltip.val); + window.frameElement.parentElement.setAttribute('data-tooltip-position', props.tooltipPosition?.val); + } + } + + const href = getValue(props.href); + const params = getValue(props.params) ?? {}; + const open_new = !!getValue(props.open_new); + const onClick = getValue(props.onClick); + const showTooltip = van.state(false); + const isExternal = /http(s)?:\/\//.test(href); + + return a( + { + class: `tg-link + ${getValue(props.underline) ? 'tg-link--underline' : ''} + ${getValue(props.disabled) ?
'disabled' : ''} + ${getValue(props.class) ?? ''}`, + style: props.style, + href: isExternal ? href : `/${href}${getQueryFromParams(params)}`, + target: open_new ? '_blank' : '', + onclick: open_new ? null : (onClick ?? ((event) => { + event.preventDefault(); + event.stopPropagation(); + emitEvent('LinkClicked', { href, params }); + })), + onmouseenter: props.tooltip ? (() => showTooltip.val = true) : undefined, + onmouseleave: props.tooltip ? (() => showTooltip.val = false) : undefined, + }, + () => getValue(props.tooltip) ? Tooltip({ + text: props.tooltip, + show: showTooltip, + position: props.tooltipPosition, + }) : '', + div( + {class: 'tg-link--wrapper'}, + props.left_icon ? LinkIcon(props.left_icon, props.left_icon_size, 'left') : undefined, + span({class: 'tg-link--text'}, props.label), + props.right_icon ? LinkIcon(props.right_icon, props.right_icon_size, 'right') : undefined, + ), + ); +}; + +const LinkIcon = ( + /** @type string */icon, + /** @type number */size, + /** @type string */position, +) => { + return i( + {class: `material-symbols-rounded tg-link--icon tg-link--icon-${position}`, style: `font-size: ${getValue(size) || 20}px;`}, + icon, + ); +}; + +function getQueryFromParams(/** @type object */ params) { + const query = Object.entries(params).reduce((query, [ key, value ]) => { + if (key && value) { + return `${query}${query ? '&' : ''}${key}=${value}`; + } + return query; + }, ''); + return query ? 
`?${query}` : ''; +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + .tg-link { + width: fit-content; + display: flex; + flex-direction: column; + text-decoration: unset !important; + color: var(--link-color); + cursor: pointer; + } + + .tg-link.disabled { + pointer-events: none; + cursor: not-allowed; + } + + .tg-link .tg-link--wrapper { + display: flex; + align-items: center; + } + + .tg-link.tg-link--underline::after { + content: ""; + height: 0; + width: 0; + border-top: 1px solid #1976d2; /* pseudo elements do not inherit variables */ + transition: width 50ms linear; + } + + .tg-link.tg-link--underline:hover::after { + width: 100%; + } +`); + +export { Link }; diff --git a/testgen/ui/static/js/components/monitor_anomalies_summary.js b/testgen/ui/static/js/components/monitor_anomalies_summary.js new file mode 100644 index 00000000..5b53a219 --- /dev/null +++ b/testgen/ui/static/js/components/monitor_anomalies_summary.js @@ -0,0 +1,126 @@ +/** + * @typedef MonitorSummary + * @type {object} + * @property {number} freshness_anomalies + * @property {number} volume_anomalies + * @property {number} schema_anomalies + * @property {number} metric_anomalies + * @property {boolean?} freshness_has_errors + * @property {boolean?} volume_has_errors + * @property {boolean?} schema_has_errors + * @property {boolean?} metric_has_errors + * @property {boolean?} freshness_is_training + * @property {boolean?} volume_is_training + * @property {boolean?} metric_is_training + * @property {boolean?} freshness_is_pending + * @property {boolean?} volume_is_pending + * @property {boolean?} schema_is_pending + * @property {boolean?} metric_is_pending + * @property {number} lookback + * @property {number} lookback_start + * @property {number} lookback_end + * @property {string?} project_code + * @property {string?} table_group_id + * + * @typedef SummaryOptions + * @type {object} + * @property {function(string)?} onTagClick + * @property {object?} activeTypes + */ 
+import { emitEvent, getValue, loadStylesheet } from '../utils.js'; +import { formatDuration, humanReadableDuration } from '../display_utils.js'; +import { withTooltip } from './tooltip.js'; +import van from '../van.min.js'; + +const { a, div, i, span } = van.tags; + +/** + * @param {MonitorSummary} summary + * @param {string?} label + * @param {SummaryOptions?} options + */ +const AnomaliesSummary = (summary, label = 'Anomalies', options = {}) => { + loadStylesheet('anomalies-summary', summaryStylesheet); + + if (!summary.lookback) { + return span({class: 'text-secondary mt-3 mb-2'}, 'No monitor runs yet'); + } + + const SummaryTag = (typeKey, tagLabel, value, hasErrors, isTraining, isPending) => { + const isClickable = !!options.onTagClick; + const isActive = van.derive(() => (getValue(options.activeTypes) ?? []).includes(typeKey)); + + return div( + { + class: () => `flex-row fx-gap-1 p-1 border-radius-1 summary-tag ${isClickable ? 'clickable' : ''} ${isActive.val ? 'active' : ''}`, + onclick: isClickable ? (event) => { + event.stopPropagation(); + options.onTagClick(typeKey); + } : undefined, + }, + div( + {class: `flex-row fx-justify-center anomaly-tag ${value > 0 ? 'has-anomalies' : hasErrors ? 'has-errors' : isTraining ? 'is-training' : isPending ? 'is-pending' : ''}`}, + value > 0 + ? value + : hasErrors + ? withTooltip( + i({class: 'material-symbols-rounded'}, 'warning'), + {text: 'Execution error', position: 'top-right'}, + ) + : isTraining + ? withTooltip( + i({class: 'material-symbols-rounded'}, 'more_horiz'), + {text: 'Training model', position: 'top-right'}, + ) + : isPending + ? withTooltip( + span({class: 'pl-2 pr-2', style: 'position: relative;'}, '-'), + {text: 'No results yet or not configured'}, + ) + : i({class: 'material-symbols-rounded'}, 'check'), + ), + span({}, tagLabel), + ); + }; + + const numRuns = summary.lookback === 1 ? 
'run' : `${summary.lookback} runs`; + const duration = humanReadableDuration(formatDuration(summary.lookback_start, new Date()), true); + const labelElement = span({class: 'text-small text-secondary'}, `${label} in last ${numRuns} (${duration})`); + + const contentElement = div( + {class: 'flex-row fx-gap-5'}, + SummaryTag('freshness', 'Freshness', summary.freshness_anomalies, summary.freshness_has_errors, summary.freshness_is_training, summary.freshness_is_pending), + SummaryTag('volume', 'Volume', summary.volume_anomalies, summary.volume_has_errors, summary.volume_is_training, summary.volume_is_pending), + SummaryTag('schema', 'Schema', summary.schema_anomalies, summary.schema_has_errors, false, summary.schema_is_pending), + SummaryTag('metrics', 'Metrics', summary.metric_anomalies, summary.metric_has_errors, summary.metric_is_training, summary.metric_is_pending), + ); + + if (summary.project_code && summary.table_group_id) { + return a( + { + class: 'flex-column fx-gap-2 clickable', + style: 'text-decoration: none; color: unset;', + href: summary.table_group_id ? `/monitors?project_code=${summary.project_code}&table_group_id=${summary.table_group_id}` : null, + onclick: summary.table_group_id ?
(event) => { + event.preventDefault(); + event.stopPropagation(); + emitEvent('LinkClicked', { href: 'monitors', params: {project_code: summary.project_code, table_group_id: summary.table_group_id} }); + }: null, + }, + labelElement, + contentElement, + ); + } + + return div({class: 'flex-column fx-gap-2'}, labelElement, contentElement); +}; + +const summaryStylesheet = new CSSStyleSheet(); +summaryStylesheet.replace(` +.summary-tag.clickable:hover, +.summary-tag.active { + background: var(--select-hover-background); +} +`); + +export { AnomaliesSummary }; diff --git a/testgen/ui/static/js/components/monitor_settings_form.js b/testgen/ui/static/js/components/monitor_settings_form.js new file mode 100644 index 00000000..edd88d7a --- /dev/null +++ b/testgen/ui/static/js/components/monitor_settings_form.js @@ -0,0 +1,405 @@ +/** + * @import { CronSample } from '../types.js'; + * + * @typedef Schedule + * @type {object} + * @property {string?} cron_tz + * @property {string} cron_expr + * @property {boolean} active + * + * @typedef MonitorSuite + * @type {object} + * @property {string?} id + * @property {string?} table_groups_id + * @property {string?} test_suite + * @property {number?} monitor_lookback + * @property {boolean?} monitor_regenerate_freshness + * @property {('low'|'medium'|'high')?} predict_sensitivity + * @property {number?} predict_min_lookback + * @property {boolean?} predict_exclude_weekends + * @property {string?} predict_holiday_codes + * + * @typedef FormState + * @type {object} + * @property {boolean} dirty + * @property {boolean} valid + * + * @typedef Properties + * @type {object} + * @property {Schedule} schedule + * @property {MonitorSuite} monitorSuite + * @property {CronSample?} cronSample + * @property {boolean?} hideActiveCheckbox + * @property {(sch: Schedule, ts: MonitorSuite, state: FormState) => void} onChange + */ +import van from '../van.min.js'; +import { getValue, isEqual, loadStylesheet, emitEvent } from '../utils.js'; +import { 
Input } from './input.js'; +import { RadioGroup } from './radio_group.js'; +import { Caption } from './caption.js'; +import { Select } from './select.js'; +import { Checkbox } from './checkbox.js'; +import { CrontabInput, parseSteppedList } from './crontab_input.js'; +import { Icon } from './icon.js'; +import { Link } from './link.js'; +import { withTooltip } from './tooltip.js'; +import { numberBetween, required } from '../form_validators.js'; +import { timezones, holidayCodes } from '../values.js'; +import { formatDurationSeconds, humanReadableDuration } from '../display_utils.js'; + +const { div, span } = van.tags; + +const monitorLookbackConfig = { + default: 14, + min: 1, + max: 200, +}; +const predictLookbackConfig = { + default: 30, + min: 20, + max: 1000, +} + +/** + * + * @param {Properties} props + * @returns + */ +const MonitorSettingsForm = (props) => { + loadStylesheet('monitor-settings-form', stylesheet); + + const schedule = getValue(props.schedule) ?? {}; + const cronTimezone = van.state(schedule.cron_tz ?? Intl.DateTimeFormat().resolvedOptions().timeZone); + const cronExpression = van.state(schedule.cron_expr ?? '0 */12 * * *'); + const scheduleActive = van.state(schedule.active ?? true); + + const monitorSuite = getValue(props.monitorSuite) ?? {}; + const monitorLookback = van.state(monitorSuite.monitor_lookback ?? monitorLookbackConfig.default); + const monitorRegenerateFreshness = van.state(monitorSuite.monitor_regenerate_freshness ?? true); + const predictSensitivity = van.state(monitorSuite.predict_sensitivity ?? 'medium'); + const predictMinLookback = van.state(monitorSuite.predict_min_lookback ?? predictLookbackConfig.default); + const predictExcludeWeekends = van.state(monitorSuite.predict_exclude_weekends ?? 
false); + const predictHolidayCodes = van.state(monitorSuite.predict_holiday_codes); + + const updatedSchedule = van.derive(() => { + return { + cron_tz: cronTimezone.val, + cron_expr: cronExpression.val, + active: scheduleActive.val, + }; + }); + const updatedTestSuite = van.derive(() => { + return { + id: monitorSuite.id, + table_groups_id: monitorSuite.table_groups_id, + test_suite: monitorSuite.test_suite, + monitor_lookback: monitorLookback.val, + monitor_regenerate_freshness: monitorRegenerateFreshness.val, + predict_sensitivity: predictSensitivity.val, + predict_min_lookback: predictMinLookback.val, + predict_exclude_weekends: predictExcludeWeekends.val, + predict_holiday_codes: predictHolidayCodes.val, + }; + }); + + const dirty = van.derive(() => !isEqual(updatedSchedule.val, schedule) || !isEqual(updatedTestSuite.val, monitorSuite)); + const validityPerField = van.state({}); + + van.derive(() => { + const fieldsValidity = validityPerField.val; + const isValid = Object.keys(fieldsValidity).length > 0 && + Object.values(fieldsValidity).every(v => v); + props.onChange?.(updatedSchedule.val, updatedTestSuite.val, { dirty: dirty.val, valid: isValid }); + }); + + const setFieldValidity = (field, validity) => { + validityPerField.val = {...validityPerField.rawVal, [field]: validity}; + } + + return div( + { class: 'flex-column fx-gap-4' }, + MainForm( + { setValidity: setFieldValidity }, + monitorLookback, + monitorRegenerateFreshness, + cronExpression, + ), + ScheduleForm( + { + hideActiveCheckbox: getValue(props.hideActiveCheckbox), + originalActive: schedule.active ?? 
true, + cronSample: props.cronSample, + setValidity: setFieldValidity, + }, + cronTimezone, + cronExpression, + scheduleActive, + ), + PredictionForm( + { setValidity: setFieldValidity }, + predictSensitivity, + predictMinLookback, + predictExcludeWeekends, + predictHolidayCodes, + ), + ); +}; + +const MainForm = ( + options, + monitorLookback, + monitorRegenerateFreshness, + cronExpression, +) => { + return div( + { class: 'flex-column fx-gap-4' }, + div( + { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap monitor-settings-row' }, + Input({ + name: 'monitor_lookback', + label: 'Lookback Runs', + value: monitorLookback, + help: 'Number of monitor runs to summarize on dashboard views', + helpPlacement: 'bottom-right', + type: 'number', + step: 1, + onChange: (value, state) => { + monitorLookback.val = value; + options.setValidity?.('monitor_lookback', state.valid); + }, + validators: [ + numberBetween(monitorLookbackConfig.min, monitorLookbackConfig.max, 1), + ], + }), + () => { + const cronDuration = determineDuration(cronExpression.val); + if (!cronDuration || !monitorLookback.val) { + return span({}); + } + + const lookbackDuration = monitorLookback.val * cronDuration; + return div( + { class: 'flex-column' }, + div( + { class: 'flex-row fx-gap-1 text-caption mt-1 mb-3' }, + span('Lookback Window (calculated)'), + withTooltip( + Icon({ size: 16, classes: 'text-disabled' }, 'help'), + { text: 'Time window to summarize on dashboard views. 
Calculated based on Lookback Runs and Schedule.', width: 200 }, + ) + ), + span(humanReadableDuration(formatDurationSeconds(lookbackDuration))), + ); + } + ), + div( + { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap mb-2 monitor-settings-row' }, + Checkbox({ + name: 'monitor_regenerate_freshness', + label: 'Reconfigure Freshness monitors after profiling', + help: 'When enabled, Freshness monitors will be automatically reconfigured with new fingerprints after each profiling run', + width: 350, + checked: monitorRegenerateFreshness, + onChange: (value) => monitorRegenerateFreshness.val = value, + }), + ), + ); +}; + +const ScheduleForm = ( + options, + cronTimezone, + cronExpression, + scheduleActive, +) => { + const cronEditorValue = van.derive(() => { + if (cronExpression.val && cronTimezone.val) { + emitEvent('GetCronSample', {payload: {cron_expr: cronExpression.val, tz: cronTimezone.val}}); + } + return { + timezone: cronTimezone.val, + expression: cronExpression.val, + }; + }); + + return div( + { class: 'flex-column fx-gap-3 border border-radius-1 p-3', style: 'position: relative;' }, + Caption({content: 'Monitor Schedule', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + div( + { class: 'flex-row fx-gap-3 fx-flex-wrap fx-align-flex-start monitor-settings-row' }, + () => Select({ + label: 'Timezone', + options: timezones.map(tz_ => ({label: tz_, value: tz_})), + value: cronTimezone, + allowNull: false, + filterable: true, + onChange: (value) => cronTimezone.val = value, + portalClass: 'short-select-portal', + }), + CrontabInput({ + name: 'monitor_settings_schedule', + sample: options.cronSample, + value: cronEditorValue, + modes: ['x_hours', 'x_days'], + hideExpression: true, + onChange: (value) => cronExpression.val = value, + }), + ), + !options.hideActiveCheckbox + ? 
div( + { class: 'flex-row fx-gap-6 fx-flex-wrap' }, + Checkbox({ + name: 'schedule_active', + label: 'Activate schedule', + checked: scheduleActive, + onChange: (value) => scheduleActive.val = value, + }), + () => !scheduleActive.val + ? div( + { class: 'flex-row fx-gap-1' }, + Icon({ style: 'font-size: 16px; color: var(--purple);' }, 'info'), + span( + { class: 'text-caption', style: 'color: var(--purple);' }, + options.originalActive ? 'Monitor schedule will be paused.' : 'Monitor schedule is paused.', + ), + ) + : '', + ) + : null, + ); +}; + +const PredictionForm = ( + options, + predictSensitivity, + predictMinLookback, + predictExcludeWeekends, + predictHolidayCodes, +) => { + const excludeHolidays = van.state(!!predictHolidayCodes.val); + return div( + { class: 'flex-column fx-gap-4 border border-radius-1 p-3', style: 'position: relative;' }, + Caption({content: 'Prediction Model', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + div( + { class: 'flex-row fx-gap-3 fx-flex-wrap monitor-settings-row' }, + RadioGroup({ + name: 'predict_sensitivity', + label: 'Sensitivity', + options: [ + { label: 'Low', value: 'low', help: 'Fewer alerts. Volume/Metric: 3 standard deviations. Freshness: wider interval tolerance.' }, + { label: 'Medium', value: 'medium', help: 'Balanced. Volume/Metric: 2.5 standard deviations. Freshness: moderate interval tolerance.' }, + { label: 'High', value: 'high', help: 'More alerts. Volume/Metric: 2 standard deviations. Freshness: tighter interval tolerance.' 
}, + ], + value: predictSensitivity, + onChange: (value) => predictSensitivity.val = value, + }), + Input({ + name: 'predict_min_lookback', + type: 'number', + label: 'Minimum Training Lookback', + value: predictMinLookback, + help: 'Minimum number of monitor runs to use for training models', + step: 1, + onChange: (value, state) => { + predictMinLookback.val = value; + options.setValidity?.('predict_min_lookback', state.valid); + }, + validators: [ + numberBetween(predictLookbackConfig.min, predictLookbackConfig.max, 1), + ], + }), + ), + Checkbox({ + name: 'predict_exclude_weekends', + label: 'Exclude weekends from training', + width: 250, + checked: predictExcludeWeekends, + onChange: (value) => predictExcludeWeekends.val = value, + }), + Checkbox({ + name: 'predict_exclude_holidays', + label: 'Exclude holidays from training', + width: 250, + checked: excludeHolidays, + onChange: (value) => excludeHolidays.val = value, + }), + () => excludeHolidays.val + ? div( + { style: 'width: 250px; margin: -8px 0 0 25px; position: relative;' }, + Input({ + name: 'predict_holiday_codes', + label: 'Holiday Codes', + value: predictHolidayCodes, + help: 'Comma-separated list of country or financial market codes', + autocompleteOptions: holidayCodes, + onChange: (value, state) => { + predictHolidayCodes.val = value; + options.setValidity?.('predict_holiday_codes', state.valid); + }, + validators: [ + required, + ], + }), + div( + { class: 'flex-row fx-gap-1 mt-1 text-caption' }, + span({}, 'See supported'), + Link({ + open_new: true, + label: 'codes', + href: 'https://holidays.readthedocs.io/en/latest/#available-countries', + right_icon: 'open_in_new', + right_icon_size: 13, + }), + ), + ) + : '', + ); +}; + +/** + * @param {string} expression + * @returns {number?} + */ +function determineDuration(expression) { + // Normalize whitespace + const expr = (expression || '').trim().replace(/\s+/g, ' '); + // "M * * * *" + if (/^\d{1,2} \* \* \* \*$/.test(expr)) { +
return 60 * 60; // 1 hour + } + // "M */H * * *" + let match = expr.match(/^\d{1,2} \*\/(\d+) \* \* \*$/); + if (match) { + return Number(match[1]) * 60 * 60; // H hours + } + // "M H1,H2,... * * *" (stepped hours with starting offset) + if (/^\d{1,2} \d+(,\d+)+ \* \* \*$/.test(expr)) { + const parsed = parseSteppedList(expr.split(' ')[1]); + if (parsed) return parsed.step * 60 * 60; + } + // "M H * * *" + if (/^\d{1,2} \d{1,2} \* \* \*$/.test(expr)) { + return 24 * 60 * 60; // 1 day + } + // "M H */D * *" + match = expr.match(/^\d{1,2} \d{1,2} \*\/(\d+) \* \*$/); + if (match) { + return Number(match[1]) * 24 * 60 * 60; // D days + } + // "M H D1,D2,... * *" (stepped days with starting offset) + if (/^\d{1,2} \d{1,2} \d+(,\d+)+ \* \*$/.test(expr)) { + const parsed = parseSteppedList(expr.split(' ')[2]); + if (parsed) return parsed.step * 24 * 60 * 60; + } + return null; +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.monitor-settings-row > * { + flex: 250px; +} +`); + +export { MonitorSettingsForm }; diff --git a/testgen/ui/static/js/components/monitoring_sparkline.js b/testgen/ui/static/js/components/monitoring_sparkline.js new file mode 100644 index 00000000..716fb047 --- /dev/null +++ b/testgen/ui/static/js/components/monitoring_sparkline.js @@ -0,0 +1,276 @@ +/** + * @import {ChartViewBox, Point} from './chart_canvas.js'; + * + * @typedef Options + * @type {object} + * @property {ChartViewBox} viewBox + * @property {string} lineColor + * @property {number} lineWidth + * @property {string} markerColor + * @property {number} markerSize + * @property {Point?} nestedPosition + * @property {number[]?} yAxisTicks + * @property {Object?} attributes + * @property {PredictionPoint[]?} prediction + * @property {('predict'|'static')?} predictionMethod + * + * @typedef MonitoringPoint + * @type {Object} + * @property {number} x + * @property {number} y + * @property {string?} label + * @property {boolean?} isAnomaly + * @property {boolean?} isTraining 
+ * @property {boolean?} isPending
+ * @property {number?} lowerTolerance
+ * @property {number?} upperTolerance
+ * @property {number?} originalX
+ * @property {number?} originalY
+ * @property {number?} originalLowerTolerance
+ * @property {number?} originalUpperTolerance
+
+ * @typedef PredictionPoint
+ * @type {Object}
+ * @property {number} x
+ * @property {number?} y
+ * @property {number} upper
+ * @property {number} lower
+ */
+import van from '../van.min.js';
+import { colorMap, formatNumber, formatTimestamp } from '../display_utils.js';
+import { getValue } from '../utils.js';
+
+const { div, span } = van.tags;
+const { circle, g, path, polyline, rect, svg } = van.tags("http://www.w3.org/2000/svg");
+
+/**
+ *
+ * @param {Options} options
+ * @param {MonitoringPoint[]} points
+ */
+const MonitoringSparklineChart = (options, ...points) => {
+ const _options = {
+ ...defaultOptions,
+ ...(options ?? {}),
+ };
+
+ const minX = van.state(0);
+ const minY = van.state(0);
+ const width = van.state(0);
+ const height = van.state(0);
+ const linePoints = van.state(points.filter(e => !e.isPending));
+ const isStaticPrediction = _options.predictionMethod === 'static';
+ const predictionPoints = van.derive(() => {
+ const _linePoints = linePoints.val;
+ const _predictionPoints = _options.prediction ?? [];
+ if (_linePoints.length > 0 && _predictionPoints.length > 0) {
+ // Anchor the prediction band to the last observed point;
+ // both the 'predict' and 'static' methods use the same anchor.
+ const lastPoint = _linePoints[_linePoints.length - 1];
+ _predictionPoints.unshift({
+ x: lastPoint.x,
+ y: lastPoint.y,
+ upper: lastPoint.upperTolerance ?? lastPoint.y,
+ lower: lastPoint.lowerTolerance ?? lastPoint.y,
+ });
+ }
+ return _predictionPoints;
+ });
+
+ van.derive(() => {
+ const viewBox = getValue(_options.viewBox);
+ width.val = viewBox?.width;
+ height.val = viewBox?.height;
+ minX.val = viewBox?.minX;
+ minY.val = viewBox?.minY;
+ });
+
+ const extraAttributes = {...(_options.attributes ??
{})}; + if (_options.nestedPosition) { + extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x; + extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y; + } else { + extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`; + } + + return svg( + { + width: '100%', + height: '100%', + ...extraAttributes, + }, + () => { + const validPoints = linePoints.val.filter(p => + Number.isFinite(p.x) && Number.isFinite(p.y) + ); + if (validPoints.length < 2) return ''; + return polyline({ + points: validPoints.map(point => `${point.x} ${point.y}`).join(', '), + style: `stroke: ${getValue(_options.lineColor)}; stroke-width: ${getValue(_options.lineWidth)};`, + fill: 'none', + }); + }, + () => { + const tolerancePoints = linePoints.val.filter(p => + Number.isFinite(p.lowerTolerance) || Number.isFinite(p.upperTolerance) + ); + if (tolerancePoints.length < 2) return ''; + + return path({ + d: generateTolerancePath(tolerancePoints, _options.height, getValue(_options.lineWidth)), + fill: colorMap.blue, + 'fill-opacity': 0.1, + stroke: 'none', + }); + }, + () => { + const validPoints = predictionPoints.rawVal.filter(p => + Number.isFinite(p.x) && (Number.isFinite(p.upper) || Number.isFinite(p.lower)) + ); + if (validPoints.length < 2) return ''; + return path({ + d: generateShadowPath(validPoints, _options.height), + fill: colorMap.emptyDark, + opacity: 0.25, + stroke: 'none', + }); + }, + () => { + if (isStaticPrediction) return ''; + const validPoints = predictionPoints.rawVal.filter(p => + Number.isFinite(p.x) && Number.isFinite(p.y) + ); + if (validPoints.length < 2) return ''; + return polyline({ + points: validPoints.map(point => `${point.x} ${point.y}`).join(', '), + style: `stroke: ${getValue(colorMap.grey)}; stroke-width: ${getValue(_options.lineWidth)};`, + fill: 'none', + }); + }, + ); +}; + +function generateTolerancePath(points, chartHeight, minHeight = 0) { + const 
getBounds = (p) => { + let upper = Number.isFinite(p.upperTolerance) ? p.upperTolerance : 0; + let lower = Number.isFinite(p.lowerTolerance) ? p.lowerTolerance : chartHeight; + const height = lower - upper; + if (minHeight > 0 && height < minHeight) { + const midpoint = (upper + lower) / 2; + const halfMin = minHeight / 2; + upper = midpoint - halfMin; + lower = midpoint + halfMin; + } + return { upper, lower }; + }; + + const bounds = points.map(getBounds); + + let pathString = `M ${points[0].x} ${bounds[0].upper}`; + for (let i = 1; i < points.length; i++) { + pathString += ` L ${points[i].x} ${bounds[i].upper}`; + } + for (let i = points.length - 1; i >= 0; i--) { + pathString += ` L ${points[i].x} ${bounds[i].lower}`; + } + pathString += ' Z'; + return pathString; +} + +function generateShadowPath(data, chartHeight) { + const getUpper = (p) => Number.isFinite(p.upper) ? p.upper : 0; + const getLower = (p) => Number.isFinite(p.lower) ? p.lower : chartHeight; + + let pathString = `M ${data[0].x} ${getUpper(data[0])}`; + for (let i = 1; i < data.length; i++) { + pathString += ` L ${data[i].x} ${getUpper(data[i])}`; + } + for (let i = data.length - 1; i >= 0; i--) { + pathString += ` L ${data[i].x} ${getLower(data[i])}`; + } + pathString += ' Z'; + return pathString; +} + +/** + * + * @param {*} options + * @param {MonitoringPoint[]} points + * @returns + */ +const MonitoringSparklineMarkers = (options, points) => { + return g( + {transform: options.transform ?? undefined}, + ...points.map((point) => { + if (point.isPending || !Number.isFinite(point.x) || !Number.isFinite(point.y)) { + return null; + } + + const size = options.anomalySize || defaultAnomalyMarkerSize; + return g( + { + onmouseenter: () => options.showTooltip?.(MonitoringSparklineChartTooltip(point), point), + onmouseleave: () => options.hideTooltip?.(), + }, + circle({ + cx: point.x, + cy: point.y, + r: size, + fill: 'transparent', + }), + point.isAnomaly + ? 
rect({ + width: size, + height: size, + x: point.x - (size / 2), + y: point.y - (size / 2), + fill: options.anomalyColor || defaultAnomalyMarkerColor, + style: `transform-box: fill-box; transform-origin: center;`, + transform: 'rotate(45)', + + }) + : circle({ + cx: point.x, + cy: point.y, + r: options.size || defaultMarkerSize, + fill: point.isTraining ? 'var(--dk-dialog-background)' : (options.color || defaultMarkerColor), + style: `stroke: ${options.color || defaultMarkerColor}; stroke-width: 1;`, + }), + ); + }), + ); +}; + +/** + * * @param {MonitoringPoint} point + * @returns {HTMLDivElement} + */ +const MonitoringSparklineChartTooltip = (point) => { + return div( + {class: 'flex-column'}, + span({class: 'text-left mb-1'}, formatTimestamp(point.originalX)), + span({class: 'text-left text-small'}, `${point.label || 'Value'}: ${formatNumber(point.originalY)}`), + point.lowerTolerance != undefined + ? span({class: 'text-left text-small'}, `Lower bound: ${formatNumber(point.originalLowerTolerance)}`) + : '', + point.upperTolerance != undefined + ? 
span({class: 'text-left text-small'}, `Upper bound: ${formatNumber(point.originalUpperTolerance)}`) + : '', + ); +}; + +const /** @type Options */ defaultOptions = { + lineColor: colorMap.blueLight, + lineWidth: 3, + yAxisTicks: undefined, + attributes: {}, +}; +const defaultMarkerSize = 3; +const defaultMarkerColor = colorMap.blueLight; +const defaultAnomalyMarkerSize = 8; +const defaultAnomalyMarkerColor = colorMap.red; + +export { MonitoringSparklineChart, MonitoringSparklineMarkers }; diff --git a/testgen/ui/static/js/components/paginator.js b/testgen/ui/static/js/components/paginator.js new file mode 100644 index 00000000..7799e7f2 --- /dev/null +++ b/testgen/ui/static/js/components/paginator.js @@ -0,0 +1,110 @@ +/** + * @typedef Properties + * @type {object} + * @property {number} count + * @property {number} pageSize + * @property {number?} pageIndex + * @property {function(number)?} onChange + */ + +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { emitEvent, getValue, loadStylesheet } from '../utils.js'; + +const { div, span, i, button } = van.tags; + +const Paginator = (/** @type Properties */ props) => { + loadStylesheet('paginator', stylesheet); + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(32); + } + + const { count, pageSize } = props; + const pageIndexState = van.derive(() => getValue(props.pageIndex) ?? 0); + + van.derive(() => { + const onChange = props.onChange?.val ?? props.onChange ?? 
changePage; + onChange(pageIndexState.val); + }); + + return div( + { class: 'tg-paginator' }, + span( + { class: 'tg-paginator--label' }, + () => { + const pageIndex = pageIndexState.val; + const countValue = getValue(count); + const pageSizeValue = getValue(pageSize); + return `${pageSizeValue * pageIndex + 1} - ${Math.min(countValue, pageSizeValue * (pageIndex + 1))} of ${countValue}`; + }, + ), + button( + { + class: 'tg-paginator--button', + onclick: () => pageIndexState.val = 0, + disabled: () => pageIndexState.val === 0, + }, + i({class: 'material-symbols-rounded'}, 'first_page') + ), + button( + { + class: 'tg-paginator--button', + onclick: () => pageIndexState.val--, + disabled: () => pageIndexState.val === 0, + }, + i({class: 'material-symbols-rounded'}, 'chevron_left') + ), + button( + { + class: 'tg-paginator--button', + onclick: () => pageIndexState.val++, + disabled: () => pageIndexState.val === Math.ceil(getValue(count) / getValue(pageSize)) - 1, + }, + i({class: 'material-symbols-rounded'}, 'chevron_right') + ), + button( + { + class: 'tg-paginator--button', + onclick: () => pageIndexState.val = Math.ceil(getValue(count) / getValue(pageSize)) - 1, + disabled: () => pageIndexState.val === Math.ceil(getValue(count) / getValue(pageSize)) - 1, + }, + i({class: 'material-symbols-rounded'}, 'last_page') + ), + ); +}; + +function changePage(/** @type number */ page_index) { + emitEvent('PageChanged', { page_index }) +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-paginator { + display: flex; + flex-direction: row; + align-items: center; + justify-content: flex-end; +} + +.tg-paginator--label { + margin-right: 20px; + color: var(--secondary-text-color); +} + +.tg-paginator--button { + background-color: transparent; + border: none; + height: 32px; + padding: 4px; + color: var(--secondary-text-color); + cursor: pointer; +} + +.tg-paginator--button[disabled] { + color: var(--disabled-text-color); + cursor: not-allowed; +} +`); + 
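The range label and button-disable logic in `Paginator` above reduce to a few lines of index arithmetic. A minimal, framework-free sketch for illustration; the `paginatorState` helper is hypothetical and not part of this diff:

```javascript
// Hypothetical standalone model of the Paginator arithmetic above.
// count: total items; pageSize: items per page; pageIndex: zero-based page.
function paginatorState(count, pageSize, pageIndex) {
    // Last valid zero-based page index; clamp to 0 for empty result sets.
    const lastPage = Math.max(Math.ceil(count / pageSize) - 1, 0);
    const start = pageSize * pageIndex + 1;
    const end = Math.min(count, pageSize * (pageIndex + 1));
    return {
        label: `${start} - ${end} of ${count}`, // range label shown next to the buttons
        isFirst: pageIndex === 0,               // disables first_page / chevron_left
        isLast: pageIndex === lastPage,         // disables chevron_right / last_page
    };
}
```

The last partial page is handled by the `Math.min` on `end`, e.g. `paginatorState(95, 10, 9)` labels the page as "91 - 95 of 95" and marks it as the last one.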
+export { Paginator }; diff --git a/testgen/ui/static/js/components/percent_bar.js b/testgen/ui/static/js/components/percent_bar.js new file mode 100644 index 00000000..a0260344 --- /dev/null +++ b/testgen/ui/static/js/components/percent_bar.js @@ -0,0 +1,79 @@ +/** + * @typedef Properties + * @type {object} + * @property {string} label + * @property {number} value + * @property {number} total + * @property {string?} color + * @property {number?} height + * @property {number?} width + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { colorMap, formatNumber } from '../display_utils.js'; + +const { div, span } = van.tags; +const defaultHeight = 10; +const defaultColor = 'purpleLight'; + +const PercentBar = (/** @type Properties */ props) => { + loadStylesheet('percentBar', stylesheet); + const value = van.derive(() => getValue(props.value)); + const total = van.derive(() => getValue(props.total)); + + return div( + { style: () => `max-width: ${props.width ? getValue(props.width) + 'px' : '100%'};` }, + div( + { class: () => `tg-percent-bar--label ${value.val ? '' : 'text-secondary'}` }, + () => `${getValue(props.label)}: ${formatNumber(value.val)}`, + ), + div( + { + class: 'tg-percent-bar', + style: () => `height: ${getValue(props.height) || defaultHeight}px;`, + }, + span({ + class: 'tg-percent-bar--fill', + style: () => { + const color = getValue(props.color) || defaultColor; + return `width: ${value.val * 100 / total.val}%; + ${value.val ? 'min-width: 1px;' : ''} + background-color: ${colorMap[color] || color};` + }, + }), + span({ + class: 'tg-percent-bar--empty', + style: () => `width: ${(total.val - value.val) * 100 / total.val}%; + ${(total.val - value.val) ? 
'min-width: 1px;' : ''};`, + }), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-percent-bar--label { + margin-bottom: 4px; +} + +.tg-percent-bar { + height: 100%; + display: flex; + flex-flow: row nowrap; + align-items: flex-start; + justify-content: flex-start; + border-radius: 4px; + overflow: hidden; +} + +.tg-percent-bar--fill { + height: 100%; +} + +.tg-percent-bar--empty { + height: 100%; + background-color: ${colorMap['empty']} +} +`); + +export { PercentBar }; diff --git a/testgen/ui/static/js/components/portal.js b/testgen/ui/static/js/components/portal.js new file mode 100644 index 00000000..12fa2e70 --- /dev/null +++ b/testgen/ui/static/js/components/portal.js @@ -0,0 +1,66 @@ +/** + * Container for any floating elements anchored to another element. + * + * NOTE: Ensure options is an object and turn individual properties into van.state + * if dynamic updates are needed. + * + * @typedef Options + * @type {object} + * @property {string} target + * @property {boolean?} targetRelative + * @property {boolean} opened + * @property {'left' | 'right'} align + * @property {('top' | 'bottom')?} position + * @property {(string|undefined)} style + * @property {(string|undefined)} class + */ +import van from '../van.min.js'; +import { getValue } from '../utils.js'; + +const { div } = van.tags; + +const Portal = (/** @type Options */ options, ...args) => { + const { target, targetRelative, align = 'left', position = 'bottom' } = getValue(options); + const id = `${target}-portal`; + + window.testgen.portals[id] = { domId: id, targetId: target, opened: options.opened }; + + return () => { + if (!getValue(options.opened)) { + return ''; + } + + const anchor = document.getElementById(target); + return div( + { + id, + class: getValue(options.class) ?? '', + style: `position: absolute; + z-index: 99; + ${position === 'bottom' ? 
calculateBottomPosition(anchor, align, targetRelative) : calculateTopPosition(anchor, align, targetRelative)} + ${getValue(options.style)}`, + }, + ...args, + ); + }; +}; + +function calculateTopPosition(anchor, align, targetRelative) { + const anchorRect = anchor.getBoundingClientRect(); + const bottom = (targetRelative ? anchorRect.height : anchorRect.top); + const left = targetRelative ? 0 : anchorRect.left; + const right = targetRelative ? 0 : (window.innerWidth - anchorRect.right); + + return `min-width: ${anchorRect.width}px; bottom: ${bottom}px; ${align === 'left' ? `left: ${left}px;` : `right: ${right}px;`}`; +} + +function calculateBottomPosition(anchor, align, targetRelative) { + const anchorRect = anchor.getBoundingClientRect(); + const top = (targetRelative ? 0 : anchorRect.top) + anchorRect.height; + const left = targetRelative ? 0 : anchorRect.left; + const right = targetRelative ? 0 : (window.innerWidth - anchorRect.right); + + return `min-width: ${anchorRect.width}px; top: ${top}px; ${align === 'left' ? 
`left: ${left}px;` : `right: ${right}px;`}`; +} + +export { Portal }; diff --git a/testgen/ui/static/js/components/radio_group.js b/testgen/ui/static/js/components/radio_group.js new file mode 100644 index 00000000..4f8b0008 --- /dev/null +++ b/testgen/ui/static/js/components/radio_group.js @@ -0,0 +1,158 @@ +/** +* @typedef Option + * @type {object} + * @property {string} label + * @property {string} help + * @property {string | number | boolean | null} value + * + * @typedef Properties + * @type {object} + * @property {string} label + * @property {Option[]} options + * @property {string | number | boolean | null} value + * @property {function(string | number | boolean | null)?} onChange + * @property {number?} width + * @property {('default' | 'inline' | 'vertical')?} layout + */ +import van from '../van.min.js'; +import { getRandomId, getValue, loadStylesheet } from '../utils.js'; +import { withTooltip } from './tooltip.js'; +import { Icon } from './icon.js'; + +const { div, input, label, span } = van.tags; + +const RadioGroup = (/** @type Properties */ props) => { + loadStylesheet('radioGroup', stylesheet); + + const groupName = getRandomId(); + const layout = getValue(props.layout) ?? 'default'; + + return div( + { class: () => `tg-radio-group--wrapper ${layout}`, style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}` }, + div( + { class: 'text-caption tg-radio-group--label' }, + props.label, + ), + () => div( + { class: 'tg-radio-group' }, + getValue(props.options).map(option => label( + { class: `flex-row fx-gap-2 clickable ${layout === 'vertical' ? 'fx-align-flex-start' : ''}` }, + input({ + type: 'radio', + name: groupName, + value: option.value, + checked: () => option.value === getValue(props.value), + onchange: van.derive(() => { + const onChange = props.onChange?.val ?? props.onChange; + return onChange ? () => onChange(option.value) : null; + }), + class: 'tg-radio-group--input', + }), + layout === 'vertical' + ? 
div( + { class: 'flex-column fx-gap-1' }, + option.label, + span( + { class: 'text-caption tg-radio-group--help' }, + option.help, + ), + ) + : option.label, + layout !== 'vertical' && option.help + ? withTooltip( + Icon({ size: 16, classes: 'text-disabled' }, 'help'), + { text: option.help, position: 'top', width: 200 } + ) + : null, + )), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-radio-group--wrapper.inline { + display: flex; + flex-direction: row; + align-items: center; + gap: 8px; +} + +.tg-radio-group--wrapper.default .tg-radio-group--label, +.tg-radio-group--wrapper.vertical .tg-radio-group--label { + margin-bottom: 4px; +} + +.tg-radio-group--wrapper.vertical .tg-radio-group--label { + margin-bottom: 12px; +} + +.tg-radio-group--wrapper.default .tg-radio-group, +.tg-radio-group--wrapper.inline .tg-radio-group { + display: flex; + flex-direction: row; + align-items: center; + gap: 16px; + height: 32px; +} + +.tg-radio-group--wrapper.vertical .tg-radio-group { + display: flex; + flex-direction: column; + gap: 12px; +} + +.tg-radio-group--input { + flex: 0 0 auto; + appearance: none; + box-sizing: border-box; + margin: 0; + width: 18px; + height: 18px; + border: 1px solid var(--secondary-text-color); + border-radius: 9px; + position: relative; + transition-property: border-color, background-color; + transition-duration: 0.3s; +} + +.tg-radio-group--input:focus, +.tg-radio-group--input:focus-visible { + outline: none; +} + +.tg-radio-group--input:focus-visible::before { + content: ''; + box-sizing: border-box; + position: absolute; + top: -4px; + left: -4px; + width: 24px; + height: 24px; + border: 3px solid var(--border-color); + border-radius: 12px; +} + +.tg-radio-group--input:checked { + border-color: var(--primary-color); +} + +.tg-radio-group--input:checked::after { + content: ''; + box-sizing: border-box; + position: absolute; + top: 3px; + left: 3px; + width: 10px; + height: 10px; + background-color: 
var(--primary-color); + border-radius: 5px; +} + +.tg-radio-group--help { + white-space: pre-wrap; + line-height: 16px; +} +`); + +export { RadioGroup }; diff --git a/testgen/ui/static/js/components/schema_changes_chart.js b/testgen/ui/static/js/components/schema_changes_chart.js new file mode 100644 index 00000000..0116587d --- /dev/null +++ b/testgen/ui/static/js/components/schema_changes_chart.js @@ -0,0 +1,163 @@ +/** + * @import {ChartViewBox, Point} from './chart_canvas.js'; + * * @typedef Options + * @type {object} + * @property {number} lineWidth + * @property {string} lineColor + * @property {number} markerSize + * @property {Point?} nestedPosition + * @property {ChartViewBox?} viewBox + * @property {Function?} showTooltip + * @property {Function?} hideTooltip + * @property {((e: SchemaEvent) => void)} onClick + * * @typedef SchemaEvent + * @type {object} + * @property {Point} point + * @property {string | number} time + * @property {number} additions + * @property {number} deletions + * @property {number} modifications + * @property {string | number} window_start + */ +import van from '../van.min.js'; +import { colorMap, formatNumber, formatTimestamp } from '../display_utils.js'; +import { scale } from '../axis_utils.js'; +import { getValue } from '../utils.js'; + +const { div, span } = van.tags(); +const { circle, g, rect, svg } = van.tags("http://www.w3.org/2000/svg"); + +/** + * * @param {Options} options + * @param {Array} events + */ +const SchemaChangesChart = (options, ...events) => { + const _options = { + ...defaultOptions, + ...(options ?? 
{}), + }; + + const minX = van.state(0); + const minY = van.state(0); + const width = van.state(0); + const height = van.state(0); + + van.derive(() => { + const viewBox = getValue(_options.viewBox); + width.val = viewBox?.width; + height.val = viewBox?.height; + minX.val = viewBox?.minX; + minY.val = viewBox?.minY; + }); + + const currentViewBox = getValue(_options.viewBox); + const chartHeight = currentViewBox?.height ?? getValue(_options.height) ?? 100; + + const maxValue = Math.ceil(Math.max(...events.map(e => Math.max(e.additions, e.deletions, e.modifications))) / 10) * 10 || 10; + + const schemaEvents = events.map(e => { + const xPosition = e.point.x; + const markerProps = {}; + const parts = []; + + if (_options.showTooltip) { + markerProps.onmouseenter = () => _options.showTooltip?.(SchemaChangesChartTooltip(e), e.point); + markerProps.onmouseleave = () => _options.hideTooltip?.(); + } + + const totalChanges = e.additions + e.deletions + e.modifications; + + if (totalChanges <= 0) { + parts.push(circle({ + cx: xPosition, + cy: chartHeight - (_options.markerSize * 2), + r: _options.markerSize, + fill: colorMap.emptyDark, + })); + } else { + const barWidth = _options.lineWidth; + const gap = 1; + const groupWidth = (barWidth * 3) + (gap * 2); + const startX = xPosition - (groupWidth / 2); + + const drawBar = (val, index, color) => { + const barHeight = scale(val, {old: {min: 0, max: maxValue}, new: {min: 0, max: chartHeight}}); + const yPos = chartHeight - barHeight; + + return rect({ + x: startX + (index * (barWidth + gap)), + y: yPos, + width: barWidth, + height: Math.max(barHeight, 0), + fill: color, + 'shape-rendering': 'crispEdges' + }); + }; + + parts.push(drawBar(e.additions, 0, e.additions ? colorMap.blue : 'transparent')); + parts.push(drawBar(e.deletions, 1, e.deletions ? colorMap.orange : 'transparent')); + parts.push(drawBar(e.modifications, 2, e.modifications ? 
colorMap.purple : 'transparent')); + + if (_options.onClick && totalChanges > 0) { + const barGroupWidth = (_options.lineWidth * 3) + 4; + const clickableWidth = Math.max(barGroupWidth + 4, 14); + parts.push( + rect({ + width: clickableWidth, + height: chartHeight, + x: xPosition - (clickableWidth / 2), + y: 0, + fill: 'transparent', + style: `transform-box: fill-box; transform-origin: center; cursor: pointer;`, + onclick: () => _options.onClick?.(e), + }) + ); + } + } + + return g( + {...markerProps}, + ...parts, + ); + }); + + const extraAttributes = {}; + if (_options.nestedPosition) { + extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x; + extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y; + } else { + extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`; + } + + return svg( + { + width: '100%', + height: '100%', + ...extraAttributes, + }, + ...schemaEvents, + ); +}; + +const defaultOptions = { + lineWidth: 4, + lineColor: colorMap.red, + markerSize: 2, + nestedPosition: {x: 0, y: 0}, +}; + +/** + * * @param {SchemaEvent} event + * @returns {HTMLDivElement} + */ +const SchemaChangesChartTooltip = (event) => { + return div( + {class: 'flex-column'}, + span({class: 'text-left mb-1'}, formatTimestamp(event.time, false)), + span({class: 'text-left text-small'}, `Additions: ${formatNumber(event.additions)}`), + span({class: 'text-left text-small'}, `Deletions: ${formatNumber(event.deletions)}`), + span({class: 'text-left text-small'}, `Modifications: ${formatNumber(event.modifications)}`), + ); +}; + +export { SchemaChangesChart }; \ No newline at end of file diff --git a/testgen/ui/static/js/components/schema_changes_list.js b/testgen/ui/static/js/components/schema_changes_list.js new file mode 100644 index 00000000..80277e33 --- /dev/null +++ b/testgen/ui/static/js/components/schema_changes_list.js @@ -0,0 +1,125 @@ +/** + * @typedef DataStructureLog + * 
@type {object} + * @property {('A'|'D'|'M')} change + * @property {string} old_data_type + * @property {string} new_data_type + * @property {string} column_name + * + * @typedef Properties + * @type {object} + * @property {number} window_start + * @property {number} window_end + * @property {(DataStructureLog[])?} data_structure_logs + */ +import van from '../van.min.js'; +import { Streamlit } from '../streamlit.js'; +import { Icon } from '../components/icon.js'; +import { formatTimestamp } from '../display_utils.js'; +import { getValue, loadStylesheet, resizeFrameHeightOnDOMChange, resizeFrameHeightToElement } from '../utils.js'; + +const { div, span } = van.tags; + +/** + * @param {Properties} props + */ +const SchemaChangesList = (props) => { + loadStylesheet('schema-changes-list', stylesheet); + const domId = 'schema-changes-list'; + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(1); + resizeFrameHeightToElement(domId); + resizeFrameHeightOnDOMChange(domId); + } + + const dataStructureLogs = getValue(props.data_structure_logs) ?? []; + const windowStart = getValue(props.window_start); + const windowEnd = getValue(props.window_end); + + return div( + { id: domId, class: 'flex-column fx-gap-1 fx-flex schema-changes-list' }, + span({ style: 'font-size: 16px; font-weight: 500;' }, 'Schema Changes'), + span( + { class: 'mb-3 text-caption', style: 'min-width: 200px;' }, + `${formatTimestamp(windowStart)} ~ ${formatTimestamp(windowEnd)}`, + ), + ...dataStructureLogs.map(log => StructureLogEntry(log)), + ); +}; + +const StructureLogEntry = (/** @type {DataStructureLog} */ log) => { + if (log.change === 'A') { + return div( + { class: 'flex-row fx-gap-1 fx-align-flex-start' }, + Icon( + {style: `font-size: 20px; color: var(--primary-text-color)`, filled: !log.column_name}, + log.column_name ? 'add' : 'add_box', + ), + div( + { class: 'schema-changes-item flex-column' }, + span({ class: 'truncate-text' }, log.column_name ?? 
'Table added'), + span(log.new_data_type), + ), + ); + } else if (log.change === 'D') { + return div( + { class: 'flex-row fx-gap-1' }, + Icon( + {style: `font-size: 20px; color: var(--primary-text-color)`, filled: !log.column_name}, + log.column_name ? 'remove' : 'indeterminate_check_box', + ), + div( + { class: 'schema-changes-item flex-column' }, + span({ class: 'truncate-text' }, log.column_name ?? 'Table dropped'), + ), + ); + } else if (log.change === 'M') { + return div( + { class: 'flex-row fx-gap-1 fx-align-flex-start' }, + Icon({style: `font-size: 18px; color: var(--primary-text-color)`}, 'change_history'), + div( + { class: 'schema-changes-item flex-column' }, + span({ class: 'truncate-text' }, log.column_name), + + div( + { class: 'flex-row fx-gap-1' }, + span({ class: 'truncate-text' }, log.old_data_type), + Icon({ size: 10 }, 'arrow_right_alt'), + span({ class: 'truncate-text' }, log.new_data_type), + ), + ), + ); + } + + return null; +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + .schema-changes-list { + overflow-y: auto; + } + + .schema-changes-item { + color: var(--secondary-text-color); + white-space: nowrap; + text-overflow: ellipsis; + overflow: hidden; + } + + .schema-changes-item span { + font-family: 'Courier New', Courier, monospace; + + white-space: nowrap; + text-overflow: ellipsis; + overflow: hidden; + } + + .schema-changes-item > span:first-child { + font-family: 'Roboto', 'Helvetica Neue', sans-serif; + color: var(--primary-text-color); + } +`); + +export { SchemaChangesList }; diff --git a/testgen/ui/static/js/components/score_breakdown.js b/testgen/ui/static/js/components/score_breakdown.js new file mode 100644 index 00000000..acd2ffe1 --- /dev/null +++ b/testgen/ui/static/js/components/score_breakdown.js @@ -0,0 +1,232 @@ +import van from '../van.min.js'; +import { dot } from '../components/dot.js'; +import { Caption } from '../components/caption.js'; +import { Select } from '../components/select.js'; +import 
{ emitEvent, getValue, loadStylesheet } from '../utils.js'; +import { caseInsensitiveSort } from '../display_utils.js'; +import { getScoreColor } from '../score_utils.js'; + +const { div, i, span } = van.tags; + +const ScoreBreakdown = (score, breakdown, category, scoreType, onViewDetails) => { + loadStylesheet('score-breakdown', stylesheet); + + return div( + { class: 'table', 'data-testid': 'score-breakdown' }, + div( + { class: 'flex-row fx-justify-space-between fx-align-flex-start text-caption' }, + div( + { class: 'breakdown-controls table-header flex-row fx-align-flex-center fx-gap-2' }, + span('Score grouped by'), + () => { + const selectedCategory = getValue(category); + return Select({ + label: '', + value: selectedCategory, + options: Object.entries(CATEGORIES) + .sort((A, B) => caseInsensitiveSort(A[1], B[1])) + .map(([value, label]) => ({ value, label })), + height: 32, + onChange: (value) => emitEvent('CategoryChanged', { payload: value }), + testId: 'groupby-selector', + }); + }, + span('for'), + () => { + const scoreValue = getValue(score); + const selectedScoreType = getValue(scoreType); + const scoreTypeOptions = ['score', 'cde_score'].filter((s) => scoreValue[s]) + if (!scoreTypeOptions.length) { + scoreTypeOptions.push('score'); + } + return Select({ + label: '', + value: selectedScoreType, + options: scoreTypeOptions.map((s) => ({ label: SCORE_TYPE_LABEL[s], value: s })), + height: 32, + onChange: (value) => emitEvent('ScoreTypeChanged', { payload: value }), + testId: 'score-type-selector', + }); + }, + ), + () => ['table_name', 'column_name'].includes(getValue(category)) ? span('* Top 100 values by impact') : '', + ), + () => div( + { class: 'table-header breakdown-columns flex-row' }, + getValue(breakdown)?.columns?.map(column => span({ + style: `flex: ${BREAKDOWN_COLUMNS_SIZES[column] ?? 
COLUMN_DEFAULT_SIZE};` }, + getReadableColumn(column, getValue(scoreType)), + )), + ), + () => { + const scoreValue = getValue(score); + const categoryValue = getValue(category); + const scoreTypeValue = getValue(scoreType); + const breakdownValue = getValue(breakdown); + const columns = breakdownValue?.columns; + return div( + breakdownValue?.items?.map((row) => div( + { class: 'table-row flex-row', 'data-testid': 'score-breakdown-row' }, + columns.map((columnName) => TableCell(row, columnName, scoreValue, categoryValue, scoreTypeValue, onViewDetails)), + )), + ); + }, + ); +}; + +/** + * Translate a column name into its display label for the table header. + * + * @param {string} column + * @param {('score' | 'cde_score')} scoreType + * @returns {string} + */ +function getReadableColumn(column, scoreType) { + if (column === 'impact') { + return `Impact on ${SCORE_TYPE_LABEL[scoreType]}`; + } + const label = BREAKDOWN_COLUMN_LABEL[column]; + if (['table_name', 'column_name'].includes(column)) { + return `${label} *`; + } + return label; +} + +/** + * Render a single table cell, dispatching to a specialized cell component for known columns. + * + * @param {object} row + * @param {string} column + * @returns {HTMLElement} + */ +const TableCell = (row, column, score=undefined, category=undefined, scoreType=undefined, onViewDetails=undefined) => { + const componentByColumn = { + column_name: BreakdownColumnCell, + impact: ImpactCell, + score: ScoreCell, + issue_ct: IssueCountCell, + }; + + if (componentByColumn[column]) { + return componentByColumn[column](row[column], row, score, category, scoreType, onViewDetails); + } + + const size = BREAKDOWN_COLUMNS_SIZES[column] ?? COLUMN_DEFAULT_SIZE; + return div( + { style: `flex: ${size}; max-width: ${size}; word-wrap: break-word;`, 'data-testid': 'score-breakdown-cell' }, + span(row[column] ??
'-'), + ); +}; + +const BreakdownColumnCell = (value, row) => { + const size = COLUMN_DEFAULT_SIZE; + return div( + { class: 'flex-column', style: `flex: ${size}; max-width: ${size}; word-wrap: break-word;`, 'data-testid': 'score-breakdown-cell' }, + Caption({ content: row.table_name, style: 'font-size: 12px;' }), + span(value), + ); +}; + +const ImpactCell = (value) => { + return div( + { class: 'flex-row', style: `flex: ${BREAKDOWN_COLUMNS_SIZES.impact}`, 'data-testid': 'score-breakdown-cell' }, + value && !String(value).startsWith('-') + ? i( + {class: 'material-symbols-rounded', style: 'font-size: 20px; color: #E57373;'}, + 'arrow_downward_alt', + ) + : '', + span(value ?? '-'), + ); +}; + +const ScoreCell = (value) => { + return div( + { class: 'flex-row', style: `flex: ${BREAKDOWN_COLUMNS_SIZES.score}`, 'data-testid': 'score-breakdown-cell' }, + dot({ class: 'mr-2' }, getScoreColor(value)), + span(value ?? '--'), + ); +}; + +const IssueCountCell = (value, row, score, category, scoreType, onViewDetails) => { + let drilldown = row[category]; + if (category === 'table_name') { + drilldown = `${row.table_groups_id}.${row.table_name}`; + } else if (category === 'column_name') { + drilldown = `${row.table_groups_id}.${row.table_name}.${row.column_name}`; + } + + return div( + { class: 'flex-row', style: `flex: ${BREAKDOWN_COLUMNS_SIZES.issue_ct}`, 'data-testid': 'score-breakdown-cell' }, + span({ class: 'mr-2', style: 'min-width: 40px;' }, value || '-'), + (value && onViewDetails) + ? 
div( + { + class: 'flex-row clickable', + style: 'color: var(--link-color);', + 'data-testid': 'view-issues', + onclick: () => onViewDetails(score.project_code, score.name, scoreType, category, drilldown), + }, + span('View'), + i({class: 'material-symbols-rounded', style: 'font-size: 20px;'}, 'chevron_right'), + ) + : '', + ); +}; + +const CATEGORIES = { + table_name: 'Tables', + column_name: 'Columns', + semantic_data_type: 'Semantic Data Types', + dq_dimension: 'Quality Dimensions', + table_groups_name: 'Table Group', + data_location: 'Data Location', + data_source: 'Data Source', + source_system: 'Source System', + source_process: 'Source Process', + business_domain: 'Business Domain', + stakeholder_group: 'Stakeholder Group', + transform_level: 'Transform Level', + data_product: 'Data Product', +}; + +const BREAKDOWN_COLUMN_LABEL = { + ...CATEGORIES, + table_name: 'Table', + column_name: 'Table | Column', + semantic_data_type: 'Semantic Data Type', + dq_dimension: 'Quality Dimension', + impact: '', + score: 'Individual Score', + issue_ct: 'Issue Count', +}; + +const SCORE_TYPE_LABEL = { + score: 'Total Score', + cde_score: 'CDE Score', +}; + +const COLUMN_DEFAULT_SIZE = '40%'; +const BREAKDOWN_COLUMNS_SIZES = { + impact: '20%', + score: '20%', + issue_ct: '20%', +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.breakdown-controls { + border-bottom: unset; + text-transform: unset; + font-size: 16px; + font-weight: 500; + line-height: 25px; + margin-bottom: 8px; +} + +.breakdown-columns { + text-transform: capitalize; +} +`); + +export { ScoreBreakdown }; diff --git a/testgen/ui/static/js/components/score_card.js b/testgen/ui/static/js/components/score_card.js new file mode 100644 index 00000000..130bc470 --- /dev/null +++ b/testgen/ui/static/js/components/score_card.js @@ -0,0 +1,218 @@ +/** + * @typedef Score + * @type {object} + * @property {string} project_code + * @property {string} name + * @property {number} score + * @property 
{number} profiling_score + * @property {number} testing_score + * @property {number} cde_score + * @property {Array} categories + * @property {Array} history + * + * @typedef HistoryEntry + * @type {object} + * @property {number} score + * @property {string} category + * @property {string} time + * + * @typedef ScoreCardOptions + * @type {object} + * @property {boolean} showHistory + */ +import van from '../van.min.js'; +import { Card } from './card.js'; +import { dot } from './dot.js'; +import { Attribute } from './attribute.js'; +import { getScoreColor } from '../score_utils.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { scale } from '../axis_utils.js'; +import { SparkLine } from './spark_line.js'; +import { colorMap } from '../display_utils.js'; + +const { div, i, span } = van.tags; +const { circle, g, rect, svg, text } = van.tags("http://www.w3.org/2000/svg"); + +/** + * Render a scorecard's charts for the total and CDE scores and the individual + * category scores. + * + * All three "sections" are optional and can be missing. + * + * @param {Score} score + * @param {(Function|Array|any|undefined)} actions + * @param {ScoreCardOptions?} options + * @returns {HTMLElement} + */ +const ScoreCard = (score, actions, options) => { + loadStylesheet('score-card', stylesheet); + + const title = van.derive(() => getValue(score)?.name ?? ''); + + return Card({ + title: title, + actionContent: actions, + class: 'tg-score-card', + testId: 'scorecard', + content: () => { + const score_ = getValue(score); + const categories = score_.dimensions ?? score_.categories ?? []; + const categoriesLabel = score_.categories_label ?? 'Quality Dimension'; + + const overallScoreHistory = score_.history?.filter(e => e.category === 'score') ?? []; + const cdeScoreHistory = score_.history?.filter(e => e.category === 'cde_score') ?? []; + + return div( + { class: 'flex-row fx-justify-center fx-align-flex-start' }, + score_.score ? div( + { class: 'mr-4' }, + ScoreChart( + "Total Score", + score_.score, + overallScoreHistory, + (options?.showHistory ?? false) && overallScoreHistory.length > 1, + colorMap.teal, + ), + div( + { class: 'flex-row fx-justify-center fx-gap-2 mt-1' }, + Attribute({ label: 'Profiling', value: score_.profiling_score }), + Attribute({ label: 'Testing', value: score_.testing_score }), + ), + ) : '', + score_.cde_score + ? ScoreChart( + "CDE Score", + score_.cde_score, + cdeScoreHistory, + (options?.showHistory ?? false) && cdeScoreHistory.length > 1, + colorMap.purpleLight, + ) + : '', + (score_.cde_score && categories.length > 0) ? i({ class: 'mr-4 ml-4' }) : '', + categories.length > 0 ? div( + { class: 'flex-column' }, + span({ class: 'mb-2 text-caption' }, categoriesLabel), + div( + { class: 'tg-score-card--categories' }, + categories.map(category => div( + { class: 'flex-row fx-align-flex-center fx-gap-2', 'data-testid': 'scorecard-category' }, + dot({}, getScoreColor(category.score)), + span({ class: 'tg-score-card--category-score', 'data-testid': 'scorecard-category-score' }, category.score ?? '--'), + span( + { class: 'tg-score-card--category-label', title: category.label, 'data-testid': 'scorecard-category-label', style: 'position: relative;' }, + category.label, + ), + )), + ), + ) : '', + ); + }, + }); +}; + +/** + * Circle chart for displaying score. + * + * @param {string} label + * @param {number} score + * @param {Array} history + * @param {boolean} showHistory + * @param {string?} trendColor + * @returns {SVGElement} + */ +const ScoreChart = (label, score, history, showHistory, trendColor) => { + const variables = { + size: '100px', + 'stroke-width': '4px', + color: getScoreColor(score), + 'half-size': 'calc(var(--size) / 2)', + radius: 'calc((var(--size) - var(--stroke-width)) / 2)', + circumference: 'calc(var(--radius) * pi * 2)', + dash: `calc((${score ??
100} * var(--circumference)) / 100)`, + }; + const style = Object.entries(variables).map(([key, value]) => `--${key}: ${value}`).join(';'); + const historyLine = history.map(e => ({ x: Date.parse(e.time), y: e.score })); + const yLength = 30; + const xValues = historyLine.map(line => line.x); + const yValues = historyLine.map(line => line.y); + const xRanges = {old: {min: Math.min(...xValues), max: Math.max(...xValues)}, new: {min: 0, max: 80}}; + const yRanges = {old: {min: Math.min(...yValues), max: Math.max(...yValues)}, new: {min: 0, max: yLength}}; + + return svg( + { class: 'tg-score-chart', width: 100, height: 100, viewBox: "0 0 100 100", overflow: 'visible', 'data-testid': 'score-chart', style }, + circle({ class: 'tg-score-chart--bg' }), + circle({ class: 'tg-score-chart--fg' }), + text({ x: '50%', y: '40%', 'dominant-baseline': 'middle', 'text-anchor': 'middle', fill: 'var(--primary-text-color)', 'font-size': '18px', 'font-weight': 500, 'data-testid': 'score-chart-value' }, score ?? '-'), + text({ x: '50%', y: '40%', 'dominant-baseline': 'middle', 'text-anchor': 'middle', fill: 'var(--secondary-text-color)', 'font-size': '14px', class: 'tg-score-chart--label', 'data-testid': 'score-chart-text' }, label), + + showHistory ? 
g( + {fill: 'none', style: 'transform: translate(10px, 70px);'}, + rect({ width: 80, height: 30, x: 0, y: 0, rx: 2, ry: 2, fill: 'var(--dk-card-background)', stroke: 'var(--empty)' }), + SparkLine({color: trendColor}, historyLine.map(line => ({ x: scale(line.x, xRanges), y: yLength - scale(line.y, yRanges, yLength)}))), + ) : null, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-score-card { + height: 216px; + width: fit-content; + box-sizing: border-box; + border: 1px solid var(--border-color); + border-radius: 8px; + margin-bottom: unset !important; +} + +.tg-score-card--categories { + display: flex; + flex-direction: column; + flex-wrap: wrap; + row-gap: 8px; + column-gap: 16px; + max-height: 100px; + overflow-y: auto; +} +.tg-score-card--categories > div { + min-width: 160px; +} + +.tg-score-card--category-score { + min-width: 30px; + font-weight: 500; +} + +.tg-score-card--category-label { + display: block; + overflow-x: hidden; + text-wrap: nowrap; + text-overflow: ellipsis; +} + +svg.tg-score-chart circle { + cx: var(--half-size); + cy: var(--half-size); + r: var(--radius); + stroke-width: var(--stroke-width); + fill: none; + stroke-linecap: round; +} + +svg.tg-score-chart circle.tg-score-chart--bg { + stroke: var(--empty); +} + +svg.tg-score-chart circle.tg-score-chart--fg { + transform: rotate(-90deg); + transform-origin: var(--half-size) var(--half-size); + stroke-dasharray: var(--dash) calc(var(--circumference) - var(--dash)); + transition: stroke-dasharray 0.3s linear 0s; + stroke: var(--color); +} + +svg.tg-score-chart text.tg-score-chart--label { + transform: translateY(20px); +} +`); + +export { ScoreCard }; diff --git a/testgen/ui/static/js/components/score_history.js b/testgen/ui/static/js/components/score_history.js new file mode 100644 index 00000000..93b7b115 --- /dev/null +++ b/testgen/ui/static/js/components/score_history.js @@ -0,0 +1,83 @@ +/** + * @typedef ScoreHistoryEntry + * @type {object} + * @property 
{number} score + * @property {('score'|'cde_score')} category + * @property {string} time + */ +import van from '../van.min.js'; +import { emitEvent, getValue, loadStylesheet } from '../utils.js'; +import { colorMap } from '../display_utils.js'; +import { LineChart } from './line_chart.js'; + +const { div, span, strong } = van.tags; + +const TRANSLATIONS = { + score: 'Total Score', + cde_score: 'CDE Score', +}; + +/** + * Render the scorecard history as line charts for the enabled scores. + * + * @param {Object} props + * @param {...ScoreHistoryEntry} entries + * @returns {HTMLElement} + */ +const ScoreHistory = (props, ...entries) => { + loadStylesheet('score-trend', stylesheet); + + const lineColors = { + [TRANSLATIONS.score]: colorMap.teal, + [TRANSLATIONS.cde_score]: colorMap.purpleLight, + default: colorMap.grey, + }; + + return div( + { ...props, class: `tg-score-trend flex-row ${props?.class ?? ''}`, 'data-testid': 'score-trend' }, + LineChart( + { + width: 600, + height: 200, + tooltipOffsetX: -100, + tooltipOffsetY: 10, + xMinSpanBetweenTicks: 3 * 24 * 60 * 60 * 1000, + yMinSpanBetweenTicks: 5, + getters: { + x: (/** @type {ScoreHistoryEntry} */ entry) => Date.parse(entry.time), + y: (/** @type {ScoreHistoryEntry} */ entry) => Number(entry.score), + }, + formatters: { + x: (value) => new Intl.DateTimeFormat("en-US", {month: 'short', day: 'numeric'}).format(value), + y: (value) => String(Math.trunc(value)), + }, + lineDiscriminator: (/** @type {ScoreHistoryEntry} */ entry) => TRANSLATIONS[entry.category], + lineColor: (lineId) => lineColors[lineId] ?? lineColors.default, + onShowPointTooltip: (point, _) => { + return div( + { class: 'flex-column fx-align-flex-start fx-justify-flex-start'}, + strong(TRANSLATIONS[point.category]), + span(point.score), + span(new Intl.DateTimeFormat("en-US", {dateStyle: 'long', timeStyle: 'long'}).format(Date.parse(point.time))), + ); + }, + onRefreshClicked: getValue(props.showRefresh) ?
() => emitEvent('RecalculateHistory', { payload: getValue(props.score).id }) : undefined, + }, + ...entries, + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-score-trend { + width: fit-content; + box-sizing: border-box; + border: 1px solid var(--border-color); + border-radius: 8px; + margin-bottom: unset !important; + background-color: var(--dk-card-background); +} +`); + +export { ScoreHistory }; diff --git a/testgen/ui/static/js/components/score_issues.js b/testgen/ui/static/js/components/score_issues.js new file mode 100644 index 00000000..659f8020 --- /dev/null +++ b/testgen/ui/static/js/components/score_issues.js @@ -0,0 +1,380 @@ +/** + * @typedef Issue + * @type {object} + * @property {string} id + * @property {('hygiene' | 'test')} issue_type + * @property {string} table_group_id + * @property {string} table + * @property {string} column + * @property {string} type + * @property {string} status + * @property {string} detail + * @property {number} time + * @property {string} name + * @property {string} run_id + * + * @typedef Score + * @type {object} + * @property {string} project_code + * @property {string} name + */ +import van from '../van.min.js'; +import { Link } from '../components/link.js'; +import { Caption } from '../components/caption.js'; +import { dot } from '../components/dot.js'; +import { Button } from '../components/button.js'; +import { Checkbox } from '../components/checkbox.js'; +import { Select } from './select.js'; +import { Paginator } from '../components/paginator.js'; +import { emitEvent, loadStylesheet } from '../utils.js'; +import { colorMap, formatTimestamp, caseInsensitiveSort } from '../display_utils.js'; + +const { div, i, span } = van.tags; +const PAGE_SIZE = 100; +const SCROLL_CONTAINER = window.top.document.querySelector('.stMain'); +const statusColors = { + 'Potential PII': colorMap.grey, + Likely: colorMap.orange, + Possible: colorMap.yellow, + Definite: colorMap.red, + Warning: 
colorMap.yellow, + Failed: colorMap.red, + Passed: colorMap.green, +}; + +const IssuesTable = ( + /** @type Issue[] */ issues, + /** @type string[] */ columns, + /** @type Score */ score, + /** @type ('score' | 'cde_score') */ scoreType, + /** @type ('table_name' | 'column_name' | 'semantic_data_type' | 'dq_dimension') */ category, + /** @type string */ drilldown, + /** @type function */ onBack, +) => { + loadStylesheet('score-issues-table', stylesheet); + + const drilldownParts = drilldown.split('.'); + const pageIndex = van.state(0); + const filters = { + table: van.state(['table_name', 'column_name'].includes(category) ? drilldownParts[1] : null), + column: van.state(category === 'column_name' ? drilldownParts[2] : null), + type: van.state(null), + status: van.state(null), + } + + const filteredIssues = van.derive(() => { + pageIndex.val = 0; + return issues + .filter(({ table, column, type, status }) => ( + [ table, null ].includes(filters.table.val) + && [ column, null ].includes(filters.column.val) + && [ type, null ].includes(filters.type.val) + && [ status, null ].includes(filters.status.val) + )); + }); + const displayedIssues = van.derive(() => filteredIssues.val.slice(PAGE_SIZE * pageIndex.val, PAGE_SIZE * (pageIndex.val + 1))); + const selectedIssues = van.state([]); + + return div( + { class: 'table pb-0', 'data-testid': 'score-issues' }, + div( + { class: 'flex-row fx-justify-space-between fx-align-flex-start'}, + div( + div( + { + class: 'issues-nav flex-row clickable', + style: 'color: var(--link-color);', + onclick: () => onBack(score.project_code, score.name, scoreType, category), + }, + i({class: 'material-symbols-rounded', style: 'font-size: 20px;'}, 'chevron_left'), + span('Back'), + ), + div( + { class: 'issues-header table-header flex-row fx-align-flex-center fx-gap-1' }, + span(`Hygiene / Test Issues (${issues.length ?? 0}) for`), + span( + { class: 'text-primary' }, + `${COLUMN_LABEL[category] ?? 
'-'}: ${['table_name', 'column_name'].includes(category) ? drilldownParts.slice(1).join(' > ') : drilldown}`, + ), + category === 'column_name' + ? ColumnProfilingButton(drilldownParts[2], drilldownParts[1], drilldownParts[0]) + : null, + ), + ), + div( + { class: 'flex-row' }, + () => { + const count = selectedIssues.val.length; + return count + ? span( + { class: 'text-secondary mr-4' }, + span({ style: 'font-weight: 500' }, count), + ` issue${count > 1 ? 's' : ''} selected` + ) + : ''; + }, + Button({ + icon: 'download', + type: 'stroked', + label: 'Issue Reports', + width: 'fit-content', + style: 'margin-left: auto; background-color: var(--dk-card-background)', + onclick: () => emitEvent('IssueReportsExported', { payload: selectedIssues.val }), + disabled: () => !selectedIssues.val.length, + tooltip: () => selectedIssues.val.length ? '' : 'No issues selected', + }), + ), + ), + () => Toolbar(filters, issues, category), + () => displayedIssues.val.length + ? div( + div( + { class: 'table-header issues-columns flex-row' }, + Checkbox({ + checked: () => selectedIssues.val.length === displayedIssues.val.length, + indeterminate: () => !!selectedIssues.val.length && selectedIssues.val.length < displayedIssues.val.length, + onChange: (checked) => { + if (checked) { + selectedIssues.val = displayedIssues.val.map(({ id, issue_type }) => ({ id, issue_type })); + } else { + selectedIssues.val = []; + } + }, + }), + span({ class: category === 'column_name' ? null : 'ml-6' }), + columns.map(c => span({ style: `flex: ${c === 'detail' ?
'1 1' : '0 0'} ${ISSUES_COLUMNS_SIZES[c]};` }, ISSUES_COLUMN_LABEL[c])) + ), + displayedIssues.val.map((row) => div( + { class: 'table-row flex-row issues-row' }, + Checkbox({ + checked: () => selectedIssues.val.map(({ id }) => id).includes(row.id), + onChange: (checked) => { + if (checked) { + selectedIssues.val = [ ...selectedIssues.val, { id: row.id, issue_type: row.issue_type } ]; + } else { + selectedIssues.val = selectedIssues.val.filter(({ id }) => id !== row.id); + } + }, + }), + category === 'column_name' + ? span({ class: 'ml-2' }) + : ColumnProfilingButton(row.column, row.table, row.table_group_id), + columns.map((columnName) => TableCell(row, columnName)), + )), + () => Paginator({ + pageIndex, + count: filteredIssues.val.length, + pageSize: PAGE_SIZE, + onChange: (newIndex) => { + if (newIndex !== pageIndex.val) { + pageIndex.val = newIndex; + SCROLL_CONTAINER.scrollTop = 0; + } + }, + }), + ) + : div( + { class: 'mt-7 mb-6 text-secondary', style: 'text-align: center;' }, + 'No issues found matching filters', + ), + ); +}; + +const ColumnProfilingButton = ( + /** @type {string} */ column_name, + /** @type {string} */ table_name, + /** @type {string} */ table_group_id, +) => { + return Button({ + type: 'icon', + icon: 'insert_chart', + iconSize: 22, + style: 'color: var(--secondary-text-color);', + tooltip: 'View profiling for column', + tooltipPosition: 'top-right', + onclick: () => emitEvent('ColumnProflingClicked', { payload: { column_name, table_name, table_group_id } }), + }); +}; + +const Toolbar = ( + /** @type {object} */ filters, + /** @type Issue[] */ issues, + /** @type ('table_name' | 'column_name' | 'semantic_data_type' | 'dq_dimension') */ category, +) => { + const filterOptions = { + table: [ ...new Set(issues.map(({ table }) => table)) ] + .sort(caseInsensitiveSort) + .map(value => ({ label: value, value })), + column: van.derive(() => ( + [ ...new Set(issues + .filter(({ table }) => table === filters.table.val) + .map(({ column }) => 
column) + )] + .sort(caseInsensitiveSort) + .map(value => ({ label: value, value })) + )), + type: [ ...new Set(issues.map(({ type }) => type)) ] + .sort(caseInsensitiveSort) + .map(value => ({ label: value, value })), + status: [ 'Definite', 'Failed', 'Likely', 'Possible', 'Warning', 'Potential PII' ] + .map(value => ({ + label: div({ class: 'flex-row fx-gap-2' }, dot({}, statusColors[value]), span(value)), + value, + })), + }; + + const displayedFilters = [ 'type', 'status' ]; + if (category !== 'column_name') { + displayedFilters.unshift('column'); + } + if (!['table_name', 'column_name'].includes(category)) { + displayedFilters.unshift('table'); + } + + return div( + { class: 'flex-row fx-flex-wrap fx-gap-3 fx-align-flex-end mb-4' }, + displayedFilters.map(key => Select({ + id: `score-issues-${key}`, + label: SCORE_LABEL[key], + height: 32, + style: 'font-size: 14px;', + value: filters[key], + options: filterOptions[key], + allowNull: true, + disabled: () => key === 'column' ? !filters.table.val : false, + onChange: v => filters[key].val = v, + })), + ); +}; + +/** + * Render a single issue table cell, dispatching to a specialized cell component for known columns. + * + * @param {object} row + * @param {string} column + * @returns {HTMLElement} + */ +const TableCell = (row, column) => { + const componentByColumn = { + column: IssueColumnCell, + type: IssueCell, + status: StatusCell, + detail: DetailCell, + time: TimeCell, + }; + + if (componentByColumn[column]) { + return componentByColumn[column](row[column], row); + } + + const size = ISSUES_COLUMNS_SIZES[column]; + return div( + { style: `flex: 0 0 ${size}; max-width: ${size}; word-wrap: break-word;` }, + span(row[column]), + ); +}; + +const IssueColumnCell = (value, row) => { + const size = ISSUES_COLUMNS_SIZES.column; + return div( + { class: 'flex-column', style: `flex: 0 0 ${size}; max-width: ${size}; word-wrap: break-word;` }, + Caption({ content: row.table, style: 'font-size: 12px;' }), + span(value), + ); +}; + + +const IssueCell = (value, row) => { + return div( + { class:
'flex-column', style: `flex: 0 0 ${ISSUES_COLUMNS_SIZES.type}` }, + Caption({ content: `${row.issue_type} issue`, style: 'font-size: 12px; text-transform: capitalize;' }), + span(value), + ); +}; + +const StatusCell = (value, row) => { + return div( + { class: 'flex-row fx-align-flex-center', style: `flex: 0 0 ${ISSUES_COLUMNS_SIZES.status}` }, + dot({ class: 'mr-2' }, statusColors[value]), + span({}, value), + ); +}; + +const DetailCell = (value, row) => { + return div( + { style: `flex: 1 1 ${ISSUES_COLUMNS_SIZES.detail}` }, + span(value), + ); +}; + +const TimeCell = (value, row) => { + return div( + { class: 'flex-column', style: `flex: 0 0 ${ISSUES_COLUMNS_SIZES.time}` }, + row.issue_type === 'test' + ? Caption({ content: row.name, style: 'font-size: 12px;' }) + : '', + Link({ + label: formatTimestamp(value), + open_new: true, + href: row.issue_type === 'test' ? 'test-runs:results' : 'profiling-runs:hygiene', + params: { + run_id: row.run_id, + table_name: row.table, + column_name: row.column, + selected: row.id, + }, + }), + ); +}; + +const SCORE_LABEL = { + table: 'Table', + column: 'Column', + type: 'Issue Type', + status: 'Likelihood / Status', +}; + +const COLUMN_LABEL = { + table_name: 'Table', + column_name: 'Table > Column', + semantic_data_type: 'Semantic Data Type', + dq_dimension: 'Quality Dimension', +}; + +const ISSUES_COLUMN_LABEL = { + column: 'Table | Column', + type: 'Issue Type', + status: 'Likelihood / Status', + detail: 'Detail', + time: 'Test Suite | Start Time', +}; + +const ISSUES_COLUMNS_SIZES = { + column: '30%', + type: '20%', + status: '10%', + detail: '30%', + time: '10%', +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + +.issues-nav { + margin-left: -4px; + margin-bottom: 8px; +} + +.issues-header { + border-bottom: unset; + text-transform: unset; + font-size: 16px; + font-weight: 500; + line-height: 25px; +} + +.issues-columns { + text-transform: capitalize; +} +`); + +export { IssuesTable }; diff --git 
a/testgen/ui/static/js/components/score_legend.js b/testgen/ui/static/js/components/score_legend.js new file mode 100644 index 00000000..e5b53281 --- /dev/null +++ b/testgen/ui/static/js/components/score_legend.js @@ -0,0 +1,27 @@ +import van from '../van.min.js'; +import { getScoreColor } from '../score_utils.js'; +import { dot } from './dot.js'; + +const { div, span } = van.tags; + +const ScoreLegend = (/** @type string */ style) => { + return div( + { class: 'flex-row fx-gap-3 text-secondary', style }, + span({ class: 'fx-flex' }), + LegendItem('N/A', NaN), + LegendItem('0-85', 0), + LegendItem('86-90', 86), + LegendItem('91-95', 91), + LegendItem('96-100', 96), + ); +} + +const LegendItem = (label, value) => { + return div( + { class: 'flex-row fx-align-flex-center' }, + dot({ class: 'mr-2' }, getScoreColor(value)), + span({}, label), + ); +}; + +export { ScoreLegend }; diff --git a/testgen/ui/static/js/components/score_metric.js b/testgen/ui/static/js/components/score_metric.js new file mode 100644 index 00000000..321caed0 --- /dev/null +++ b/testgen/ui/static/js/components/score_metric.js @@ -0,0 +1,37 @@ +import van from '../van.min.js'; +import { Attribute } from './attribute.js'; +import { Caption } from './caption.js'; +import { loadStylesheet } from '../utils.js'; + +const { div, span } = van.tags; + +const ScoreMetric = function( + /** @type number */ score, + /** @type number? */ profilingScore, + /** @type number? */ testingScore, +) { + loadStylesheet('scoreMetric', stylesheet); + + return div( + { class: 'flex-column fx-align-flex-center score-metric' }, + Caption({ content: 'Score' }), + span( + { style: 'font-size: 28px;' }, + score ?? '--', + ), + (profilingScore || testingScore) ? 
div( + { class: 'flex-row fx-gap-2 mt-1' }, + Attribute({ label: 'Profiling', value: profilingScore }), + Attribute({ label: 'Testing', value: testingScore }), + ) : '', + ); +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.score-metric { + min-width: 120px; +} +`); + +export { ScoreMetric }; diff --git a/testgen/ui/static/js/components/select.js b/testgen/ui/static/js/components/select.js new file mode 100644 index 00000000..3e3e658c --- /dev/null +++ b/testgen/ui/static/js/components/select.js @@ -0,0 +1,467 @@ +/** + * @typedef SelectOption + * @type {object} + * @property {string} label + * @property {string} value + * @property {string?} icon + * + * @typedef Properties + * @type {object} + * @property {string?} id + * @property {string} label + * @property {string?|Array.?} value + * @property {Array.} options + * @property {boolean} allowNull + * @property {Function|null} onChange + * @property {boolean?} disabled + * @property {boolean?} required + * @property {boolean?} multiSelect + * @property {number?} width + * @property {number?} height + * @property {string?} style + * @property {string?} testId + * @property {number?} portalClass + * @property {('top' | 'bottom')?} portalPosition + * @property {boolean?} filterable + * @property {('normal' | 'inline')?} triggerStyle + */ +import van from '../van.min.js'; +import { getRandomId, getValue, loadStylesheet, isState, isEqual } from '../utils.js'; +import { Portal } from './portal.js'; +import { Icon } from './icon.js'; + +const { div, i, input, label, span } = van.tags; + +const Select = (/** @type {Properties} */ props) => { + loadStylesheet('select', stylesheet); + + if (getValue(props.multiSelect)) { + return MultiSelect(props); + } + + const domId = van.derive(() => props.id?.val ?? getRandomId()); + const opened = van.state(false); + const optionsFilter = van.state(''); + const options = van.derive(() => { + const options = getValue(props.options) ?? 
[]; + const allowNull = getValue(props.allowNull); + + if (allowNull) { + return [ + {label: "---", value: null}, + ...options, + ]; + } + + return options; + }); + const filteredOptions = van.derive(() => { + const allOptions = getValue(options); + const isFilterable = getValue(props.filterable); + const filterTerm = getValue(optionsFilter); + if (isFilterable && filterTerm.length) { + const filteredOptions_ = []; + for (let i = 0; i < allOptions.length; i++) { + const option = allOptions[i]; + if (option.label === filterTerm) { + return allOptions; + } + + if (option.label.toLowerCase().includes(filterTerm.toLowerCase())) { + filteredOptions_.push(option); + } + } + return filteredOptions_; + } + return allOptions; + }); + + const value = isState(props.value) ? props.value : van.state(props.value ?? null); + const initialSelection = options.val?.find((op) => op.value === value.val); + const valueLabel = van.state(initialSelection?.label ?? ''); + const valueIcon = van.state(initialSelection?.icon ?? undefined); + + const changeSelection = (/** @type SelectOption */ option) => { + opened.val = false; + value.val = option.value; + }; + + const filterOptions = (/** @type InputEvent */ event) => { + optionsFilter.val = event.target.value; + }; + + // Reset filtering when closed + van.derive(() => { + if (!opened.val) { + optionsFilter.val = ''; + } + }); + + van.derive(() => { + const currentOptions = getValue(options); + const previousValue = value.oldVal; + let currentValue = getValue(value); + const selectedOption = currentOptions.find((op) => op.value === currentValue); + + if (selectedOption === undefined) { + currentValue = null; + setTimeout(() => value.val = null, 0.1); + } + + if (!isEqual(currentValue, previousValue)) { + valueLabel.val = selectedOption?.label ?? ''; + valueIcon.val = selectedOption?.icon ?? 
undefined; + + props.onChange?.(currentValue, { valid: !!currentValue || !getValue(props.required) }); + } + }); + + return label( + { + id: domId, + class: () => `flex-column fx-gap-1 text-caption tg-select--label ${getValue(props.disabled) ? 'disabled' : ''}`, + style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}; ${getValue(props.style)}`, + 'data-testid': getValue(props.testId) ?? '', + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + event.stopImmediatePropagation(); + // Should toggle open/close unless disabled + opened.val = getValue(props.disabled) ? false : !opened.val; + }, + }, + span( + { class: 'flex-row fx-gap-1', 'data-testid': 'select-label' }, + props.label, + () => getValue(props.required) + ? span({ class: 'text-error' }, '*') + : '', + ), + + () => getValue(props.triggerStyle) === 'inline' + ? div( + {class: 'tg-select--inline-trigger flex-row'}, + span({}, valueLabel.val ?? '---'), + div( + { class: 'tg-select--field--icon ', 'data-testid': 'select-input-trigger' }, + i( + { class: 'material-symbols-rounded' }, + 'expand_more', + ), + ), + ) + : div( + { + class: () => `flex-row tg-select--field ${opened.val ? 'opened' : ''}`, + style: () => getValue(props.height) ? `height: ${getValue(props.height)}px;` : '', + 'data-testid': 'select-input', + }, + () => { + // Hack to display value again when closed + // For some reason, it goes away when opened + opened.val; + return div( + { class: 'tg-select--field--content', 'data-testid': 'select-input-display' }, + valueIcon.val + ? Icon({ classes: 'mr-2' }, valueIcon.val) + : undefined, + getValue(props.filterable) + ? 
input({ + id: `tg-select--field--${getRandomId()}`, + value: valueLabel.val, + onkeyup: filterOptions, + }) + : valueLabel.val, + ); + }, + div( + { class: 'tg-select--field--icon', 'data-testid': 'select-input-trigger' }, + i( + { + class: 'material-symbols-rounded', + }, + 'expand_more', + ), + ), + ), + + Portal( + {target: domId.val, targetRelative: true, position: props.portalPosition?.val ?? props?.portalPosition, opened}, + () => div( + { + class: () => `tg-select--options-wrapper mt-1 ${getValue(props.portalClass) ?? ''}`, + 'data-testid': 'select-options', + }, + getValue(filteredOptions).map(option => + div( + { + class: () => `tg-select--option ${getValue(value) === option.value ? 'selected' : ''}`, + onclick: (/** @type Event */ event) => { + changeSelection(option); + event.stopPropagation(); + }, + 'data-testid': 'select-options-item', + }, + option.icon + ? Icon({ classes: 'mr-2' }, option.icon) + : undefined, + span(option.label), + ) + ), + ), + ), + ); +}; + +/** + * @param {Properties} props + */ +const MultiSelect = (props) => { + const domId = van.derive(() => props.id?.val ?? getRandomId()); + const opened = van.state(false); + const options = van.derive(() => getValue(props.options) ?? []); + + const selectedValues = isState(props.value) ? props.value : van.state(props.value ?? []); + + const displayLabel = van.derive(() => { + const selected = getValue(selectedValues) ?? []; + if (!selected.length) { + return '---'; + }; + const allOptions = getValue(options); + return selected + .map(value => allOptions.find(opt => opt.value === value)?.label ?? value) + .join(', '); + }); + + const toggleOption = (optionValue) => { + const current = [...(getValue(selectedValues) ?? 
[])]; + const index = current.indexOf(optionValue); + if (index >= 0) { + current.splice(index, 1); + } else { + current.push(optionValue); + } + selectedValues.val = current; + props.onChange?.(current, { valid: current.length > 0 || !getValue(props.required) }); + }; + + return div( + { + id: domId, + class: () => `flex-column fx-gap-1 text-caption tg-select--label ${getValue(props.disabled) ? 'disabled' : ''}`, + style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}; ${getValue(props.style)}`, + 'data-testid': getValue(props.testId) ?? '', + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + event.stopImmediatePropagation(); + // Should toggle open/close unless disabled + opened.val = getValue(props.disabled) ? false : !opened.val; + }, + }, + span( + { class: 'flex-row fx-gap-1', 'data-testid': 'select-label' }, + props.label, + () => getValue(props.required) + ? span({ class: 'text-error' }, '*') + : '', + ), + + div( + { + class: () => `flex-row tg-select--field ${opened.val ? 'opened' : ''}`, + style: () => getValue(props.height) ? `height: ${getValue(props.height)}px;` : '', + 'data-testid': 'select-input', + }, + () => { + // Hack to display value again when closed + // For some reason, it goes away when opened + opened.val; + return div( + { class: 'tg-select--field--content tg-select--multi-display', 'data-testid': 'select-input-display' }, + displayLabel.val || '', + ); + }, + div( + { class: 'tg-select--field--icon', 'data-testid': 'select-input-trigger' }, + i({ class: 'material-symbols-rounded' }, 'expand_more'), + ), + ), + + Portal( + {target: domId.val, targetRelative: true, position: props.portalPosition?.val ?? props?.portalPosition, opened}, + () => div( + { + class: () => `tg-select--options-wrapper mt-1 ${getValue(props.portalClass) ?? ''}`, + 'data-testid': 'select-options', + }, + getValue(options).map(option => { + const isSelected = van.derive(() => (getValue(selectedValues) ?? 
[]).includes(option.value)); + return div( + { + class: () => `tg-select--option fx-gap-2 ${isSelected.val ? 'selected' : ''}`, + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + toggleOption(option.value); + }, + 'data-testid': 'select-options-item', + }, + input({ + type: 'checkbox', + class: 'tg-select--checkbox', + checked: isSelected, + }), + span(option.label), + ); + }), + ), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-select--label { + position: relative; +} +.tg-select--label.disabled { + cursor: not-allowed; + color: var(--disabled-text-color); +} + +.tg-select--label.disabled .tg-select--field { + color: var(--disabled-text-color); +} + +.tg-select--field { + box-sizing: border-box; + width: 100%; + height: 38px; + min-width: 200px; + border: 1px solid transparent; + transition: border-color 0.3s; + background-color: var(--form-field-color); + padding: 4px 8px; + color: var(--primary-text-color); + border-radius: 8px; +} + +.tg-select--field.opened { + border-color: var(--primary-color); +} + +.tg-select--field--content { + font-size: 14px; + display: flex; + align-items: center; + justify-content: flex-start; + height: 100%; + flex: 1; + font-weight: 500; +} + +.tg-select--multi-display { + overflow: hidden; + text-overflow: ellipsis; + white-space: nowrap; +} + +.tg-select--field--content > input { + border: unset !important; + background: transparent !important; + outline: none !important; + width: 100%; + font-weight: 500; + font-family: 'Roboto', 'Helvetica Neue', sans-serif; + color: var(--primary-text-color); +} + +.tg-select--field--icon { + display: flex; + align-items: center; + justify-content: center; + width: 20px; + height: 100%; +} + +.tg-select--field--icon i { + font-size: 20px; +} + +.tg-select--options-wrapper { + border-radius: 8px; + background: var(--portal-background); + box-shadow: var(--portal-box-shadow); + min-height: 40px; + max-height: 400px; + overflow: auto; + 
z-index: 99; +} + +.tg-select--options-wrapper > .tg-select--option:first-child { + border-top-left-radius: 8px; + border-top-right-radius: 8px; +} + +.tg-select--options-wrapper > .tg-select--option:last-child { + border-bottom-left-radius: 8px; + border-bottom-right-radius: 8px; +} + +.tg-select--option { + display: flex; + align-items: center; + height: 40px; + padding: 0px 16px; + cursor: pointer; + font-size: 14px; + color: var(--primary-text-color); +} +.tg-select--option:hover { + background: var(--select-hover-background); +} + +.tg-select--option.selected { + background: var(--select-hover-background); + color: var(--primary-color); +} + +.tg-select--checkbox { + appearance: none; + box-sizing: border-box; + margin: 0; + width: 18px; + height: 18px; + flex-shrink: 0; + border: 1px solid var(--secondary-text-color); + border-radius: 4px; + position: relative; + pointer-events: none; + transition-property: border-color, background-color; + transition-duration: 0.3s; +} + +.tg-select--checkbox:checked { + border-color: transparent; + background-color: var(--primary-color); +} + +.tg-select--checkbox:checked::after { + content: 'check'; + position: absolute; + top: -4px; + left: -3px; + font-family: 'Material Symbols Rounded'; + font-size: 22px; + color: white; +} + +.tg-select--inline-trigger { + border-bottom: 1px solid var(--border-color); +} + +.tg-select--inline-trigger > span { + min-width: 24px; +} +`); + +export { Select }; diff --git a/testgen/ui/static/js/components/slider.js b/testgen/ui/static/js/components/slider.js new file mode 100644 index 00000000..2582fc8b --- /dev/null +++ b/testgen/ui/static/js/components/slider.js @@ -0,0 +1,164 @@ +/** + * @typedef Properties + * @type {object} + * @property {string} label + * @property {number} value + * @property {number} min + * @property {number} max + * @property {number} step + * @property {function(number)?} onChange + * @property {string?} hint + */ +import van from '../van.min.js'; +import { 
getValue, loadStylesheet } from '../utils.js'; + +const { input, label, span } = van.tags; + +const Slider = (/** @type Properties */ props) => { + loadStylesheet('slider', stylesheet); + + const value = van.state(getValue(props.value) ?? getValue(props.min) ?? 0); + + const handleInput = e => { + value.val = Number(e.target.value); + props.onChange?.(value.val); + }; + + return label( + { class: 'flex-col fx-gap-1 clickable tg-slider--label text-caption' }, + props.label, + input({ + type: "range", + min: props.min ?? 0, + max: props.max ?? 100, + step: props.step ?? 1, + value: value, + oninput: handleInput, + class: 'tg-slider--input', + }), + span({ class: "tg-slider--value" }, () => value.val), + props.hint && span({ class: "tg-slider--hint" }, props.hint) + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-slider--label { + display: flex; + flex-direction: column; + gap: 0.5em; + font-family: inherit; +} + +.tg-slider--value { + font-size: 0.9em; + color: var(--primary-text-color); +} + +.tg-slider--hint { + font-size: 0.8em; + color: var(--disabled-text-color); +} + +/* Basic reset and common styles for the range input */ +input[type=range].tg-slider--input { + -webkit-appearance: none; /* Override default WebKit styles */ + appearance: none; /* Override default pseudo-element styles */ + width: 100%; /* Full width */ + height: 20px; /* Set height to accommodate thumb; track will be smaller */ + cursor: pointer; + outline: none; + background: transparent; /* Make default track invisible, we'll style it manually */ + accent-color: var(--primary-color); /* Sets thumb and selected track color for modern browsers (Chrome, Edge, Firefox) */ +} + +/* --- Thumb Styling (#06a04a) --- */ +/* WebKit (Chrome, Safari, Opera, Edge Chromium) */ +input[type=range].tg-slider--input::-webkit-slider-thumb { + -webkit-appearance: none; /* Required to style */ + appearance: none; + height: 20px; /* Thumb height */ + width: 20px; /* Thumb width */ + 
background-color: var(--primary-color); /* Thumb color */
+    border-radius: 50%; /* Make it circular */
+    border: none; /* No border */
+    margin-top: -7px; /* Vertically center thumb on track. (Thumb height - Track height) / 2 = (20px - 6px) / 2 = 7px */
+    /* This assumes track height is 6px (defined below) */
+}
+
+/* Firefox */
+input[type=range].tg-slider--input::-moz-range-thumb {
+    height: 20px; /* Thumb height */
+    width: 20px; /* Thumb width */
+    background-color: var(--primary-color); /* Thumb color */
+    border-radius: 50%; /* Make it circular */
+    border: none; /* No border */
+}
+
+/* IE / Edge Legacy (EdgeHTML) */
+input[type=range].tg-slider--input::-ms-thumb {
+    height: 20px; /* Thumb height */
+    width: 20px; /* Thumb width */
+    background-color: var(--primary-color); /* Thumb color */
+    border-radius: 50%; /* Make it circular */
+    border: 0; /* No border */
+    /* margin-top: 1px; IE may need slight adjustment if track style requires it */
+}
+
+/* --- Track Styling --- */
+/* Track "unselected" section: #EEEEEE */
+/* Track "selected" section: #06a04a */
+
+/* WebKit browsers */
+input[type=range].tg-slider--input::-webkit-slider-runnable-track {
+    width: 100%;
+    height: 6px; /* Track height */
+    background: var(--grey); /* Color of the "unselected" part of the track */
+    /* accent-color (set on the input) will color the "selected" part */
+    /* background: transparent !important; */
+    border-radius: 3px; /* Rounded track edges */
+}
+
+/* Firefox */
+input[type=range].tg-slider--input::-moz-range-track {
+    width: 100%;
+    height: 6px; /* Track height */
+    /* background: var(--grey); Color of the "unselected" part of the track */
+    background: transparent !important;
+    border-radius: 3px; /* Rounded track edges */
+}
+
+/* For Firefox, the "selected" part of the track is ::-moz-range-progress */
+/* This is often handled by accent-color, but explicitly styling it provides a fallback.
*/ +input[type=range].tg-slider--input::-moz-range-progress { + height: 6px; /* Must match track height */ + background-color: var(--primary-color); /* Color of the "selected" part */ + border-radius: 3px; /* Rounded track edges */ +} + +/* IE / Edge Legacy (EdgeHTML) */ +input[type=range].tg-slider--input::-ms-track { + width: 100%; + height: 6px; /* Track height */ + cursor: pointer; + + /* Needs to be transparent for ms-fill-lower and ms-fill-upper to show through */ + background: transparent; + border-color: transparent; + color: transparent; + border-width: 7px 0; /* Adjust vertical positioning; (thumb height - track height) / 2 */ +} + +input[type=range].tg-slider--input::-ms-fill-lower { + background: var(--primary-color); /* Color of the "selected" part */ + border-radius: 3px; /* Rounded track edges */ +} + +input[type=range].tg-slider--input::-ms-fill-upper { + background: var(--grey); /* Color of the "unselected" part */ + border-radius: 3px; /* Rounded track edges */ +} + +`); + +export { Slider }; \ No newline at end of file diff --git a/testgen/ui/static/js/components/sorting_selector.js b/testgen/ui/static/js/components/sorting_selector.js new file mode 100644 index 00000000..847850e5 --- /dev/null +++ b/testgen/ui/static/js/components/sorting_selector.js @@ -0,0 +1,260 @@ +import {Streamlit} from "../streamlit.js"; +import van from '../van.min.js'; +import { loadStylesheet } from '../utils.js'; + +/** + * @typedef ColDef + * @type {Array.} + * + * @typedef StateItem + * @type {Array.} + * + * @typedef Properties + * @type {object} + * @property {Array.} columns + * @property {Array.} state + */ +const { button, div, i, span } = van.tags; + +const SortingSelector = (/** @type {Properties} */ props) => { + loadStylesheet('sortingSelector', stylesheet); + + let defaultDirection = "ASC"; + + const columns = props.columns.val; + const prevComponentState = props.state.val || []; + + const columnLabel = columns.reduce((acc, [colLabel, colId]) => ({ ...acc, 
[colId]: colLabel}), {}); + + if (!window.testgen.isPage) { + Streamlit.setFrameHeight(100 + 30 * columns.length); + } + + const componentState = columns.reduce( + (state, [colLabel, colId]) => ( + { ...state, [colId]: van.state(prevComponentState[colId] || { direction: "ASC", order: null })} + ), + {} + ); + + const directionIcons = { + ASC: `arrow_upward`, + DESC: `arrow_downward`, + } + + const activeColumnItem = (colId) => { + const state = componentState[colId]; + const directionIcon = van.derive(() => directionIcons[state.val.direction]); + return button( + { + class: 'flex-row', + onclick: () => { + state.val = { ...state.val, direction: state.val.direction === "DESC" ? "ASC" : "DESC" }; + }, + }, + i( + { class: `material-symbols-rounded` }, + directionIcon, + ), + span(columnLabel[colId]), + i( + { + class: `material-symbols-rounded clickable dismiss-button`, + style: `margin-left: auto;`, + onclick: (event) => { + event?.preventDefault(); + event?.stopPropagation(); + + componentState[colId].val = { direction: defaultDirection, order: null }; + }, + }, + 'close', + ), + ) + } + + const selectColumn = (colId, direction) => { + const activeColumnsCount = Object.values(componentState).filter((columnState) => columnState.val.order != null).length; + componentState[colId].val = { direction: direction, order: activeColumnsCount }; + } + + prevComponentState.forEach(([colId, direction]) => selectColumn(colId, direction)); + + const reset = () => { + columns.map( + ([colLabel, colId]) => ( + componentState[colId].val = { direction: defaultDirection, order: null } + ) + ); + } + + const externalComponentState = () => Object.entries(componentState).filter( + ([colId, colState]) => colState.val.order !== null + ).sort( + ([colIdA, colStateA], [colIdB, colStateB]) => colStateA.val.order - colStateB.val.order + ).map( + ([colId, colState]) => [colId, colState.val.direction] + ) + + const apply = () => { + Streamlit.sendData(externalComponentState()); + } + + const 
columnItem = (colId) => { + const state = componentState[colId]; + return button( + { + onclick: () => selectColumn(colId, defaultDirection), + hidden: state.val.order !== null, + }, + i( + { + class: `material-symbols-rounded`, + style: `color: var(--disabled-text-color);`, + }, + `expand_all` + ), + span(columnLabel[colId]), + ) + } + + const resetDisabled = () => Object.entries(componentState).filter( + ([colId, colState]) => colState.val.order != null + ).length === 0; + + const applyDisabled = () => externalComponentState().toString() === (props.state.val || []).toString(); + + return div( + { class: 'tg-sort-selector' }, + div( + { + class: `tg-sort-selector--header`, + }, + span("Selected columns") + ), + () => div( + { + class: 'tg-sort-selector--column-list', + style: `flex-grow: 1`, + }, + Object.entries(componentState) + .filter(([, colState]) => colState.val.order != null) + .sort(([, colStateA], [, colStateB]) => colStateA.val.order - colStateB.val.order) + .map(([colId,]) => activeColumnItem(colId)) + ), + div( + { class: `tg-sort-selector--header` }, + span("Available columns") + ), + div( + { + class: 'tg-sort-selector--column-list', + }, + columns.map(([colLabel, colId]) => van.derive(() => columnItem(colId))), + ), + div( + { class: `tg-sort-selector--footer` }, + button( + { + onclick: reset, + style: `color: var(--button-text-color);`, + disabled: van.derive(resetDisabled), + }, + span(`Reset`), + ), + button( + { onclick: apply, disabled: van.derive(applyDisabled) }, + span(`Apply`), + ) + ) + ); +}; + + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` + +.tg-sort-selector { + height: 100vh; + display: flex; + flex-direction: column; + align-content: flex-end; + justify-content: space-between; +} + +.tg-sort-selector--column-list { + display: flex; + flex-direction: column; +} + +.tg-sort-selector--column-list button { + margin: 0; + border: 0; + padding: 5px 0; + text-align: left; + background: transparent; + color: 
var(--button-text-color); +} + +.tg-sort-selector--column-list button:hover { + background: #00000010; +} + +.tg-sort-selector--column-list button * { + vertical-align: middle; +} + +.tg-sort-selector--column-list button i { + font-size: 20px; +} + + +.tg-sort-selector--column-list { + border-bottom: 3px dotted var(--disabled-text-color); + padding-bottom: 8px; + margin-bottom: 8px; +} + +.tg-sort-selector--header { + text-align: right; + text-transform: uppercase; + font-size: 70%; + color: var(--secondary-text-color); +} + +.tg-sort-selector--footer { + display: flex; + flex-direction: row; + justify-content: space-between; + margin-top: 8px; +} + +.tg-sort-selector--footer button { + background-color: var(--button-stroked-background); + color: var(--button-stroked-text-color); + border: var(--button-stroked-border); + padding: 5px 20px; + border-radius: 5px; +} + +.tg-sort-selector--footer button[disabled] { + color: var(--disabled-text-color) !important; +} + +.dismiss-button { + margin-left: auto; + color: var(--disabled-text-color); +} +.dismiss-button:hover { + color: var(--button-text-color); +} + +@media (prefers-color-scheme: dark) { + .tg-sort-selector--column-list button:hover { + background: #FFFFFF20; + } +} + +`); + +export { SortingSelector }; diff --git a/testgen/ui/static/js/components/spark_line.js b/testgen/ui/static/js/components/spark_line.js new file mode 100644 index 00000000..89985808 --- /dev/null +++ b/testgen/ui/static/js/components/spark_line.js @@ -0,0 +1,67 @@ +/** + * @typedef SparklineOptions + * @type {object} + * @property {string} color + * @property {number} stroke + * @property {number?} opacity + * @property {bool?} hidden + * @property {boolean?} interactive + * @property {Function?} onPointMouseEnter + * @property {Function?} onPointMouseLeave + * @property {string?} testId + * + * @typedef Point + * @type {object} + * @property {number} x + * @property {number} y +*/ +import { getValue } from '../utils.js'; +import van from 
'../van.min.js';
+
+const { circle, g, polyline } = van.tags("http://www.w3.org/2000/svg");
+const defaultCircleRadius = 3;
+const onHoverCircleRadius = 5;
+
+/**
+ * Creates a line to be rendered inside an SVG.
+ *
+ * @param {SparklineOptions} options
+ * @param {Array.<Point>} line
+ * @returns {SVGGElement}
+ */
+const SparkLine = (
+    /** @type {SparklineOptions} */ options,
+    /** @type {Array.<Point>} */ line,
+) => {
+    const display = van.derive(() => getValue(options.hidden) === true ? 'none' : '');
+    return g(
+        { fill: 'none', opacity: options.opacity ?? 1, style: 'overflow: visible;', 'data-testid': options.testId, display },
+        polyline({
+            points: line.map(point => `${point.x} ${point.y}`).join(', '),
+            style: `stroke: ${options.color}; stroke-width: ${options.stroke ?? 1};`,
+        }),
+        options?.interactive
+            ? line.map(point => {
+                const circleRadius = van.state(defaultCircleRadius);
+
+                return circle({
+                    cx: point.x,
+                    cy: point.y,
+                    r: circleRadius,
+                    'pointer-events': 'all',
+                    fill: options.color,
+                    onmouseenter: () => {
+                        circleRadius.val = onHoverCircleRadius;
+                        options?.onPointMouseEnter?.(point, line);
+                    },
+                    onmouseleave: () => {
+                        circleRadius.val = defaultCircleRadius;
+                        options?.onPointMouseLeave?.(point, line);
+                    },
+                });
+            })
+            : '',
+    );
+};
+
+export { SparkLine };
diff --git a/testgen/ui/static/js/components/summary_bar.js b/testgen/ui/static/js/components/summary_bar.js
new file mode 100644
index 00000000..c16dcc61
--- /dev/null
+++ b/testgen/ui/static/js/components/summary_bar.js
@@ -0,0 +1,99 @@
+/**
+ * @typedef SummaryItem
+ * @type {object}
+ * @property {number} value
+ * @property {string} color
+ * @property {string} label
+ * @property {boolean?} showPercent
+ *
+ * @typedef Properties
+ * @type {object}
+ * @property {Array.<SummaryItem>} items
+ * @property {string?} label
+ * @property {number?} height
+ * @property {number?} width
+ */
+import van from '../van.min.js';
+import { friendlyPercent, getValue, loadStylesheet } from '../utils.js';
+import { colorMap,
formatNumber } from '../display_utils.js'; + +const { div, span } = van.tags; +const defaultHeight = 24; + +const SummaryBar = (/** @type Properties */ props) => { + loadStylesheet('summaryBar', stylesheet); + const total = van.derive(() => getValue(props.items).reduce((sum, item) => sum + item.value, 0)); + + return div( + () => props.label ? div( + { class: 'tg-summary-bar--label' }, + props.label, + ) : '', + () => div( + { + class: 'tg-summary-bar', + style: () => `height: ${getValue(props.height) || defaultHeight}px; max-width: ${props.width ? getValue(props.width) + 'px' : '100%'};` + }, + getValue(props.items).map(item => span({ + class: 'tg-summary-bar--item', + style: () => `width: ${item.value * 100 / total.val}%; + ${item.value ? 'min-width: 1px;' : ''} + background-color: ${colorMap[item.color] || item.color};`, + })), + ), + () => total.val ? div( + { class: 'tg-summary-bar--caption flex-row fx-flex-wrap text-caption mt-1' }, + getValue(props.items).map(item => item.label + ? div( + { class: 'tg-summary-bar--legend flex-row' }, + span({ + class: 'dot', + style: `color: ${colorMap[item.color] || item.color};`, + }), + `${item.label}: ${formatNumber(item.value || 0)}` + (item.showPercent ? 
` (${friendlyPercent(item.value * 100 / total.val)}%)` : '') + ) + : null, + ), + ) : '', + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-summary-bar--label { + margin-bottom: 4px; +} + +.tg-summary-bar { + height: 100%; + display: flex; + flex-flow: row nowrap; + align-items: flex-start; + justify-content: flex-start; + border-radius: 4px; + overflow: hidden; +} + +.tg-summary-bar--item { + height: 100%; +} + +.tg-summary-bar--caption { + font-style: italic; +} + +.tg-summary-bar--legend { + width: auto; +} + +.tg-summary-bar--legend:not(:last-child) { + margin-right: 8px; +} + +.tg-summary-bar--legend span { + margin-right: 2px; + font-size: 4px; +} +`); + +export { SummaryBar }; diff --git a/testgen/ui/static/js/components/summary_counts.js b/testgen/ui/static/js/components/summary_counts.js new file mode 100644 index 00000000..c2ea688d --- /dev/null +++ b/testgen/ui/static/js/components/summary_counts.js @@ -0,0 +1,45 @@ +/** + * @typedef SummaryItem + * @type {object} + * @property {string} value + * @property {string} color + * @property {string} label + * + * @typedef Properties + * @type {object} + * @property {Array.} items + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; +import { colorMap, formatNumber } from '../display_utils.js'; + +const { div } = van.tags; + +const SummaryCounts = (/** @type Properties */ props) => { + loadStylesheet('summaryCounts', stylesheet); + + return div( + { class: 'flex-row fx-gap-5' }, + getValue(props.items).map(item => div( + { class: 'flex-row fx-align-stretch fx-gap-2' }, + div({ class: 'tg-summary-counts--bar', style: `background-color: ${colorMap[item.color] || item.color};` }), + div( + div({ class: 'text-caption' }, item.label), + div({ class: 'tg-summary-counts--count' }, formatNumber(item.value)), + ) + )), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-summary-counts--bar { + width: 4px; +} + 
+.tg-summary-counts--count { + font-size: 16px; +} +`); + +export { SummaryCounts }; diff --git a/testgen/ui/static/js/components/table.js b/testgen/ui/static/js/components/table.js new file mode 100644 index 00000000..c3ae90c1 --- /dev/null +++ b/testgen/ui/static/js/components/table.js @@ -0,0 +1,540 @@ +/** + * @import {VanState} from '../van.min.js'; + * + * @typedef Column + * @type {object} + * @property {string} name + * @property {string} label + * @property {number?} colspan + * @property {number?} width + * @property {boolean?} sortable + * @property {('left' | 'center' | 'right')?} align + * @property {('hidden' | 'visible')?} overflow + * + * @typedef Sort + * @type {object} + * @property {string?} field + * @property {('asc'|'desc')?} order + * + * @typedef SelectonOptions + * @type {object} + * @property {boolean?} multi + * @property {((rowIndexes: number[]) => void)?} onRowsSelected + * + * @typedef SortOptions + * @type {object} + * @property {string?} field + * @property {('asc'|'desc')?} order + * @property {((a: Sort) => void)} onSortChange + * + * @typedef PaginatorOptions + * @type {object} + * @property {number?} itemsPerPage + * @property {number?} totalItems + * @property {number?} currentPageIdx + * @property {((a: number, b: number) => void)?} onPageChange + * @property {HTMLElement?} leftContent + * + * @typedef Options + * @type {object} + * @property {(Column[] | Column[][])} columns + * @property {any?} header + * @property {any?} emptyState + * @property {string?} class + * @property {((row: any, index: number) => string)?} rowClass + * @property {string?} height + * @property {string?} width + * @property {boolean?} highDensity + * @property {boolean?} dynamicWidth + * @property {SortOptions?} sort + * @property {PaginatorOptions?} paginator + * @property {SelectonOptions?} selection + */ +import { getValue, loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; +import { Button } from './button.js'; +import { Icon } 
from './icon.js';
+import { Select } from './select.js';
+
+const { colgroup, col, div, span, table, thead, th, tbody, tr, td } = van.tags;
+const defaultItemsPerPage = 20;
+const defaultHeight = 'calc(100% - 76.5px)';
+const defaultWidth = '100%';
+
+/**
+ * @param {Options?} options
+ * @param {Row[]} rows
+ * @returns {HTMLElement}
+ */
+const Table = (options, rows) => {
+    loadStylesheet('table', stylesheet);
+
+    const headerLines = van.derive(() => {
+        const columns = getValue(options.columns);
+        if (Array.isArray(columns[0])) {
+            return columns;
+        }
+        return [columns];
+    });
+    const dataColumns = van.derive(() => getValue(headerLines)?.slice(-1)?.[0] ?? []);
+    const widthSum = van.state(0);
+    const columnWidths = [];
+
+    van.derive(() => {
+        // Recompute the total from scratch; accumulating into widthSum.val would double-count on re-runs
+        let sum = 0;
+        for (let i = 0; i < dataColumns.val.length; i++) {
+            const column = dataColumns.val[i];
+            columnWidths[i] = columnWidths[i] ?? van.state(0);
+            columnWidths[i].val = column.width;
+            sum += column.width;
+        }
+        widthSum.val = sum || undefined;
+    });
+
+    const selectedRows = [];
+    van.derive(() => {
+        const rows_ = getValue(rows);
+        rows_.forEach((_, idx) => {
+            selectedRows[idx] = selectedRows[idx] ?? van.state(false);
+            selectedRows[idx].val = false;
+        });
+    });
+    van.derive(() => {
+        const selectedRows_ = [];
+        for (let i = 0; i < selectedRows.length; i++) {
+            if (selectedRows[i].val) {
+                selectedRows_.push(i);
+            }
+        }
+
+        options.selection?.onRowsSelected?.(selectedRows_);
+    });
+    const onRowSelected = (idx) => {
+        if (!options.selection?.multi) {
+            for (const state of selectedRows) {
+                state.val = false;
+            }
+        }
+
+        if (options.selection?.onRowsSelected) {
+            selectedRows[idx].val = !selectedRows[idx].val;
+        }
+    };
+
+    const renderPaginator = van.derive(() => getValue(options.paginator) != undefined);
+    const paginatorOptions = van.derive(() => {
+        const p = getValue(options.paginator);
+        return {
+            itemsPerPage: p?.itemsPerPage ?? defaultItemsPerPage,
+            totalItems: p?.totalItems ??
undefined,
+            currentPageIdx: p?.currentPageIdx ?? 0,
+            onPageChange: p?.onPageChange,
+            leftContent: p?.leftContent,
+        };
+    });
+
+    const sortOptions = van.derive(() => {
+        const s = getValue(options.sort);
+
+        return {
+            field: s?.field,
+            order: s?.order,
+            onSortChange: (columnName) => {
+                let newSortOrder = 'desc';
+                let columnNameOrClear = columnName;
+                if (s?.field === columnName && s?.order === 'desc') {
+                    newSortOrder = 'asc';
+                } else if (s?.field === columnName && s?.order === 'asc') {
+                    newSortOrder = null;
+                    columnNameOrClear = null;
+                }
+
+                s?.onSortChange?.({field: columnNameOrClear, order: newSortOrder});
+            },
+        };
+    });
+
+    return div(
+        {
+            class: () => `tg-table flex-column border border-radius-1 ${getValue(options.highDensity) ? 'tg-table-high-density' : ''} ${getValue(options.dynamicWidth) ? 'tg-table-dynamic-width' : ''} ${options.selection?.onRowsSelected ? 'tg-table-hoverable' : ''}`,
+            style: () => `height: ${getValue(options.height) ? getValue(options.height) + 'px' : defaultHeight};`,
+        },
+        options.header,
+        div(
+            {class: 'tg-table-scrollable flex-column fx-flex'},
+            table(
+                {
+                    class: () => getValue(options.class) ?? '',
+                    style: () => {
+                        const dynamicWidth = getValue(options.dynamicWidth) ?? false;
+                        let widthNumber = getValue(options.width) ?? widthSum.val;
+                        if (widthNumber < window.innerWidth) {
+                            widthNumber = window.innerWidth;
+                        }
+                        return `width: ${(widthNumber && dynamicWidth) ? widthNumber + 'px' : defaultWidth}; ${dynamicWidth ? 'table-layout: fixed;' : ''}`;
+                    },
+                },
+                () => colgroup(
+                    ...dataColumns.val.map((_, idx) => col({style: `width: ${columnWidths[idx].val}px;`})),
+                ),
+                () => thead(
+                    getValue(headerLines).map((headerLine, idx, allHeaderLines) => {
+                        const dynamicWidth = getValue(options.dynamicWidth) ??
false;
+                        return tr(
+                            ...getValue(headerLine).map((column, colIdx) =>
+                                TableHeaderColumn(
+                                    column,
+                                    idx === allHeaderLines.length - 1,
+                                    columnWidths,
+                                    colIdx,
+                                    dynamicWidth,
+                                    sortOptions,
+                                )
+                            ),
+                        );
+                    })
+                ),
+                () => {
+                    const rows_ = getValue(rows);
+                    if (rows_.length <= 0 && options.emptyState) {
+                        return tbody(
+                            {class: 'tg-table-empty-state-body'},
+                            tr(
+                                td(
+                                    {colspan: dataColumns.val.length},
+                                    options.emptyState,
+                                ),
+                            ),
+                        );
+                    }
+
+                    return tbody(
+                        rows_.map((row, idx) =>
+                            tr(
+                                {
+                                    class: () => `${selectedRows[idx].val ? 'selected' : ''} ${options.rowClass?.(row, idx) ?? ''}`,
+                                    onclick: () => onRowSelected(idx),
+                                },
+                                ...getValue(dataColumns).map(column => TableCell(column, row, idx)),
+                            )
+                        ),
+                    );
+                },
+            ),
+        ),
+        () => renderPaginator.val
+            ? Paginatior(
+                getValue(paginatorOptions).itemsPerPage,
+                getValue(paginatorOptions).totalItems,
+                getValue(paginatorOptions).currentPageIdx,
+                getValue(options.highDensity),
+                getValue(paginatorOptions).onPageChange,
+                getValue(paginatorOptions).leftContent,
+            )
+            : undefined,
+    );
+};
+
+/**
+ * @typedef SortOptionsB
+ * @type {object}
+ * @property {string?} field
+ * @property {('asc'|'desc')?} order
+ * @property {((field: string) => void)} onSortChange
+ *
+ * @param {Column} column
+ * @param {boolean} isDataColumn
+ * @param {VanState[]} columnWidths
+ * @param {number} columnIndex
+ * @param {boolean} dynamicWidth
+ * @param {VanState} sortOptions
+ */
+const TableHeaderColumn = (
+    column,
+    isDataColumn,
+    columnWidths,
+    columnIndex,
+    dynamicWidth,
+    sortOptions,
+) => {
+    let startX, startWidth;
+
+    const doDrag = (e) => {
+        const newWidth = startWidth + (e.clientX - startX);
+        if (newWidth > 50) {
+            columnWidths[columnIndex].val = newWidth;
+        }
+    };
+
+    const stopDrag = () => {
+        document.removeEventListener('mousemove', doDrag);
+        document.removeEventListener('mouseup', stopDrag);
+        document.body.style.cursor = '';
+        document.documentElement.style.userSelect = '';
+
document.documentElement.style.pointerEvents = ''; + }; + + const initDrag = (e) => { + startX = e.clientX; + startWidth = columnWidths[columnIndex].val; + document.addEventListener('mousemove', doDrag); + document.addEventListener('mouseup', stopDrag); + document.body.style.cursor = 'col-resize'; + document.documentElement.style.userSelect = 'none'; + document.documentElement.style.pointerEvents = 'none'; + }; + + const sortIcon = van.derive(() => { + if (!isDataColumn || !column.sortable) { + return null; + } + + const isSorted = sortOptions.val.field === column.name; + return ( + Icon( + {style: `font-size: 13px; cursor: pointer; color: var(${isSorted ? '--primary-text-color' : '--disabled-text-color'})`}, + isSorted ? (sortOptions.val.order === 'desc' ? 'south' : 'north') : 'expand_all', + ) + ); + }); + + return th( + { + class: `${isDataColumn ? 'tg-table-column' : 'tg-table-helper-column'} text-small text-secondary ${column.name} ${column.sortable ? 'clickable' : ''}`, + align: column.align, + width: column.width, + colspan: column.colspan ?? 1, + 'data-testid': column.name, + style: `overflow-x: ${column.overflow ?? 'hidden'}`, + onclick: () => { + if (isDataColumn && column.sortable) { + sortOptions.val.onSortChange(column.name); + } + }, + }, + () => div( + {class: 'flex-row fx-gap-2', style: 'display: inline-flex'}, + span(column.label), + sortIcon.val, + ), + ( + isDataColumn && dynamicWidth + ? div( + {class: 'tg-column-resizer', onmousedown: initDrag}, + div() + ) + : null + ), + ); +}; + +/** + * + * @param {Column} column + * @param {Row} row + * @param {number} index + */ +const TableCell = (column, row, index) => { + return td( + { + class: `tg-table-cell ${column.name}`, + align: column.align, + width: column.width, + colspan: column.colspan ?? 1, + 'data-testid': `table-cell:${index},${column.name}`, + style: `overflow-x: ${column.overflow ?? 
'hidden'}`, + }, + getValue(row[column.name]), + ); +}; + +/** + * + * @param {number} itemsPerPage + * @param {number?} totalItems + * @param {number} currentPageIdx + * @param {boolean?} highDensity + * @param {((number, number) => void)?} onPageChange + * @param {HTMLElement?} leftContent + * @returns {HTMLElement} + */ +const Paginatior = ( + itemsPerPage, + totalItems, + currentPageIdx, + highDensity, + onPageChange, + leftContent = undefined, +) => { + const pageStart = itemsPerPage * currentPageIdx + 1; + const pageEnd = Math.min(pageStart + itemsPerPage - 1, totalItems); + const lastPage = (Math.floor(totalItems / itemsPerPage) + (totalItems % itemsPerPage > 0) - 1); + + return div( + {class: `tg-table-paginator flex-row fx-justify-content-flex-end ${highDensity ? '' : 'p-1'} text-secondary`}, + + leftContent, + leftContent != undefined ? span({class: 'fx-flex'}) : '', + + span({class: 'mr-2'}, 'Rows per page:'), + Select({ + triggerStyle: 'inline', + testId: 'items-per-page', + value: itemsPerPage, + options: [ + {label: '20', value: 20}, + {label: '50', value: 50}, + {label: '100', value: 100}, + ], + portalPosition: 'top', + onChange: (value) => onPageChange(currentPageIdx, parseInt(value)), + }), + span({class: 'mr-6'}, ''), + span({class: 'mr-6'}, `${pageStart}-${pageEnd} of ${totalItems ?? 
'∞'}`), + Button({ + type: 'icon', + icon: 'first_page', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: currentPageIdx === 0, + onclick: () => onPageChange(0, itemsPerPage), + }), + Button({ + type: 'icon', + icon: 'chevron_left', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: currentPageIdx === 0, + onclick: () => onPageChange(currentPageIdx - 1, itemsPerPage), + }), + Button({ + type: 'icon', + icon: 'chevron_right', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: pageEnd >= totalItems, + onclick: () => onPageChange(currentPageIdx + 1, itemsPerPage), + }), + Button({ + type: 'icon', + icon: 'last_page', + iconSize: 24, + style: 'color: var(--secondary-text-color)', + disabled: pageEnd >= totalItems, + onclick: () => onPageChange(lastPage, itemsPerPage), + }), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-table { + background: var(--dk-card-background); +} + +.tg-table > .tg-table-scrollable { + overflow: auto; + border-radius: 4px; +} + +.tg-table > .tg-table-scrollable > table { + border-collapse: collapse; + border-color: var(--border-color); +} + +.tg-table > .tg-table-scrollable > table:has(.tg-table-empty-state-body) { + height: 100%; +} + +.tg-table > .tg-table-scrollable > table > thead { + border-bottom: var(--button-stroked-border); + position: sticky; + top: 0; + background: var(--dk-card-background); /* Ensure header background is solid when sticky */ + z-index: 1; /* Ensure header is above scrolling content */ +} + +.tg-table > .tg-table-scrollable > table > thead th { + font-weight: normal; +} + +.tg-table > .tg-table-scrollable > table > thead th > div { + text-overflow: ellipsis; + white-space: nowrap; + overflow-x: hidden; +} + +.tg-table > .tg-table-scrollable > table > thead th.tg-table-helper-column { + padding: 0px; +} + +.tg-table > .tg-table-scrollable > table > thead th.tg-table-column { + padding: 4px 8px; + height: 32px; + 
text-transform: uppercase; + position: relative; /* Needed for absolute positioning of resizer */ +} + +.tg-table > .tg-table-scrollable > table > thead th .tg-column-resizer { + position: absolute; + right: 0; + top: 0; + width: 5px; + height: 90%; + background: transparent; + cursor: col-resize; + z-index: 2; /* Ensure resizer is above other content */ +} + +.tg-table > .tg-table-scrollable > table > thead th .tg-column-resizer > div { + height: 100%; + width: 1px; + background: var(--border-color); +} + +.tg-table > .tg-table-scrollable > table > tbody > tr { + height: 40px; +} + +.tg-table > .tg-table-scrollable > table > tbody > tr:not(:last-of-type) { + border-bottom: var(--button-stroked-border); +} + +.tg-table > .tg-table-scrollable > table > tbody > tr.selected { + background-color: var(--table-selection-color); +} + +.tg-table > .tg-table-scrollable > table .tg-table-cell { + padding: 4px 8px; + height: 40px; +} + +.tg-table > .tg-table-paginator { + border-top: var(--button-stroked-border); +} + +.tg-table.tg-table-high-density > .tg-table-scrollable > table > thead th.tg-table-column { + padding: 0px 8px; + height: 27px; +} + +.tg-table.tg-table-high-density > .tg-table-scrollable > table .tg-table-cell { + padding: 0px 8px; + height: 27px; +} + +.tg-table.tg-table-dynamic-width > .tg-table-scrollable > table { + table-layout: fixed; +} + +.tg-table.tg-table-dynamic-width > .tg-table-scrollable > table > tbody td { + text-overflow: ellipsis; + white-space: nowrap; +} + +.tg-table.tg-table-hoverable > .tg-table-scrollable > table > tbody tr:hover { + background-color: var(--table-hover-color); +} +`); + +export { Table, TableHeaderColumn }; diff --git a/testgen/ui/static/js/components/table_group_form.js b/testgen/ui/static/js/components/table_group_form.js new file mode 100644 index 00000000..6b072255 --- /dev/null +++ b/testgen/ui/static/js/components/table_group_form.js @@ -0,0 +1,541 @@ +/** + * @import { Connection } from './connection_form.js'; + 
* + * @typedef TableGroup + * @type {object} + * @property {string?} id + * @property {string?} connection_id + * @property {string?} table_groups_name + * @property {string?} profiling_include_mask + * @property {string?} profiling_exclude_mask + * @property {string?} profiling_table_set + * @property {string?} table_group_schema + * @property {string?} profile_id_column_mask + * @property {string?} profile_sk_column_mask + * @property {number?} profiling_delay_days + * @property {boolean?} profile_flag_cdes + * @property {boolean?} include_in_dashboard + * @property {boolean?} add_scorecard_definition + * @property {boolean?} profile_use_sampling + * @property {number?} profile_sample_percent + * @property {number?} profile_sample_min_count + * @property {string?} description + * @property {string?} data_source + * @property {string?} source_system + * @property {string?} source_process + * @property {string?} data_location + * @property {string?} business_domain + * @property {string?} stakeholder_group + * @property {string?} transform_level + * @property {string?} data_product + * + * @typedef FormState + * @type {object} + * @property {boolean} dirty + * @property {boolean} valid + * + * @typedef Properties + * @type {object} + * @property {TableGroup} tableGroup + * @property {Connection[]} connections + * @property {boolean?} showConnectionSelector + * @property {boolean?} disableConnectionSelector + * @property {boolean?} disableSchemaField + * @property {(tg: TableGroup, state: FormState) => void} onChange + */ +import van from '../van.min.js'; +import { getValue, isEqual, loadStylesheet } from '../utils.js'; +import { Input } from './input.js'; +import { Checkbox } from './checkbox.js'; +import { ExpansionPanel } from './expansion_panel.js'; +import { required } from '../form_validators.js'; +import { Select } from './select.js'; +import { Caption } from './caption.js'; +import { Textarea } from './textarea.js'; + +const { div } = van.tags; + +const 
normalizeTableSet = (value) => { + return value?.split(/[,\n]/) + .map(part => part.trim()) + .filter(part => part) + .join(', '); +} + +/** + * + * @param {Properties} props + * @returns + */ +const TableGroupForm = (props) => { + loadStylesheet('table-group-form', stylesheet); + + const tableGroup = getValue(props.tableGroup); + const tableGroupConnectionId = van.state(tableGroup.connection_id); + const tableGroupsName = van.state(tableGroup.table_groups_name); + const profilingIncludeMask = van.state(tableGroup.profiling_include_mask ?? '%'); + const profilingExcludeMask = van.state(tableGroup.profiling_exclude_mask ?? 'tmp%'); + const profilingTableSet = van.state(normalizeTableSet(tableGroup.profiling_table_set)); + const tableGroupSchema = van.state(tableGroup.table_group_schema); + const profileIdColumnMask = van.state(tableGroup.profile_id_column_mask ?? '%_id'); + const profileSkColumnMask = van.state(tableGroup.profile_sk_column_mask ?? '%_sk'); + const profilingDelayDays = van.state(tableGroup.profiling_delay_days ?? 0); + const profileFlagCdes = van.state(tableGroup.profile_flag_cdes ?? true); + const includeInDashboard = van.state(tableGroup.include_in_dashboard ?? true); + const addScorecardDefinition = van.state(tableGroup.add_scorecard_definition ?? true); + const profileUseSampling = van.state(tableGroup.profile_use_sampling ?? false); + const profileSamplePercent = van.state(tableGroup.profile_sample_percent ?? 30); + const profileSampleMinCount = van.state(tableGroup.profile_sample_min_count ?? 
15000); + const description = van.state(tableGroup.description); + const dataSource = van.state(tableGroup.data_source); + const sourceSystem = van.state(tableGroup.source_system); + const sourceProcess = van.state(tableGroup.source_process); + const dataLocation = van.state(tableGroup.data_location); + const businessDomain = van.state(tableGroup.business_domain); + const stakeholderGroup = van.state(tableGroup.stakeholder_group); + const transformLevel = van.state(tableGroup.transform_level); + const dataProduct = van.state(tableGroup.data_product); + + const connectionOptions = van.derive(() => { + const connections = getValue(props.connections) ?? []; + return connections.map(c => ({ + label: c.connection_name, + value: c.connection_id, + icon: c.flavor.icon, + })); + }); + const showConnectionSelector = getValue(props.showConnectionSelector) ?? false; + const disableSchemaField = van.derive(() => getValue(props.disableSchemaField) ?? false) + + const updatedTableGroup = van.derive(() => { + return { + id: tableGroup.id, + connection_id: tableGroupConnectionId.val, + table_groups_name: tableGroupsName.val, + profiling_include_mask: profilingIncludeMask.val, + profiling_exclude_mask: profilingExcludeMask.val, + profiling_table_set: normalizeTableSet(profilingTableSet.val), + table_group_schema: tableGroupSchema.val, + profile_id_column_mask: profileIdColumnMask.val, + profile_sk_column_mask: profileSkColumnMask.val, + profiling_delay_days: profilingDelayDays.val, + profile_flag_cdes: profileFlagCdes.val, + include_in_dashboard: includeInDashboard.val, + add_scorecard_definition: addScorecardDefinition.val, + profile_use_sampling: profileUseSampling.val, + profile_sample_percent: profileSamplePercent.val, + profile_sample_min_count: profileSampleMinCount.val, + description: description.val, + data_source: dataSource.val, + source_system: sourceSystem.val, + source_process: sourceProcess.val, + data_location: dataLocation.val, + business_domain: businessDomain.val, 
+ stakeholder_group: stakeholderGroup.val, + transform_level: transformLevel.val, + data_product: dataProduct.val, + }; + }); + const dirty = van.derive(() => !isEqual(updatedTableGroup.val, tableGroup)); + const validityPerField = van.state({}); + if (showConnectionSelector) { + validityPerField.val.connection_id = !!tableGroupConnectionId.val; + } + + van.derive(() => { + const fieldsValidity = validityPerField.val; + const isValid = Object.keys(fieldsValidity).length > 0 && + Object.values(fieldsValidity).every(v => v); + props.onChange?.(updatedTableGroup.val, { dirty: dirty.val, valid: isValid }); + }); + + const setFieldValidity = (field, validity) => { + validityPerField.val = {...validityPerField.rawVal, [field]: validity}; + } + + return div( + { class: 'flex-column fx-gap-3' }, + showConnectionSelector + ? Select({ + name: 'connection_id', + label: 'Connection', + value: tableGroupConnectionId.rawVal, + options: connectionOptions, + required: true, + disabled: props.disableConnectionSelector, + onChange: (value, state) => { + tableGroupConnectionId.val = value; + setFieldValidity('connection_id', state.valid); + }, + }) + : undefined, + MainForm( + { disableSchemaField, setValidity: setFieldValidity }, + tableGroupsName, + tableGroupSchema, + ), + CriteriaForm( + { setValidity: setFieldValidity }, + profilingIncludeMask, + profilingExcludeMask, + profilingTableSet, + profileIdColumnMask, + profileSkColumnMask, + ), + SettingsForm( + { editMode: !!tableGroup.id, setValidity: setFieldValidity }, + profilingDelayDays, + profileFlagCdes, + includeInDashboard, + addScorecardDefinition, + ), + SamplingForm( + { setValidity: setFieldValidity }, + profileUseSampling, + profileSamplePercent, + profileSampleMinCount, + ), + TaggingForm( + { setValidity: setFieldValidity }, + description, + dataSource, + sourceSystem, + sourceProcess, + dataLocation, + businessDomain, + stakeholderGroup, + transformLevel, + dataProduct, + ), + ); +}; + +const MainForm = ( + options, 
+ tableGroupsName, + tableGroupSchema, +) => { + return div( + { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap' }, + Input({ + name: 'table_groups_name', + label: 'Name', + value: tableGroupsName, + class: 'tg-column-flex', + help: 'Unique name to describe the table group', + helpPlacement: 'bottom-right', + onChange: (value, state) => { + tableGroupsName.val = value; + options.setValidity?.('table_groups_name', state.valid); + }, + validators: [ required ], + }), + Input({ + name: 'table_group_schema', + label: 'Schema', + value: tableGroupSchema, + class: 'tg-column-flex', + help: 'Database schema containing the tables for the Table Group', + helpPlacement: 'bottom-left', + disabled: options.disableSchemaField, + onChange: (value, state) => { + tableGroupSchema.val = value; + options.setValidity?.('table_group_schema', state.valid); + }, + validators: [ required ], + }), + ); +}; + +const CriteriaForm = ( + options, + profilingIncludeMask, + profilingExcludeMask, + profilingTableSet, + profileIdColumnMask, + profileSkColumnMask, +) => { + return div( + { class: 'flex-column fx-gap-3 border border-radius-1 p-3 mt-1', style: 'position: relative;' }, + Caption({content: 'Criteria', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + div( + { class: 'flex-row fx-gap-3 fx-flex-wrap fx-align-flex-start' }, + div( + { class: 'tg-column-flex flex-column fx-gap-3', }, + Input({ + name: 'profiling_include_mask', + label: 'Tables to Include Mask', + value: profilingIncludeMask, + help: 'SQL filter supported by your database\'s LIKE operator for table names to include', + onChange: (value, state) => { + profilingIncludeMask.val = value; + options.setValidity?.('profiling_include_mask', state.valid); + }, + }), + Input({ + name: 'profiling_exclude_mask', + label: 'Tables to Exclude Mask', + value: profilingExcludeMask, + help: 'SQL filter supported by your database\'s LIKE operator for table names to exclude', 
+ onChange: (value, state) => { + profilingExcludeMask.val = value; + options.setValidity?.('profiling_exclude_mask', state.valid); + }, + }), + ), + Textarea({ + name: 'profiling_table_set', + label: 'Explicit Table List', + value: profilingTableSet, + height: 108, + class: 'tg-column-flex', + help: 'List of specific table names to include, separated by commas or newlines', + onChange: (value) => profilingTableSet.val = value, + }), + ), + div( + { class: 'flex-row fx-gap-3 fx-flex-wrap' }, + Input({ + name: 'profile_id_column_mask', + label: 'Profiling ID Column Mask', + value: profileIdColumnMask, + class: 'tg-column-flex', + help: 'SQL filter supported by your database\'s LIKE operator representing ID columns', + onChange: (value, state) => { + profileIdColumnMask.val = value; + options.setValidity?.('profile_id_column_mask', state.valid); + }, + }), + Input({ + name: 'profile_sk_column_mask', + label: 'Profiling Surrogate Key Column Mask', + value: profileSkColumnMask, + class: 'tg-column-flex', + help: 'SQL filter supported by your database\'s LIKE operator representing surrogate key columns', + onChange: (value, state) => { + profileSkColumnMask.val = value + options.setValidity?.('profile_sk_column_mask', state.valid); + }, + }), + ), + ); +}; + +const SettingsForm = ( + options, + profilingDelayDays, + profileFlagCdes, + includeInDashboard, + addScorecardDefinition, +) => { + return div( + { class: 'flex-row fx-gap-3 fx-flex-wrap fx-align-flex-start border border-radius-1 p-3 mt-1', style: 'position: relative;' }, + Caption({content: 'Settings', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + div( + { class: 'tg-column-flex flex-column fx-gap-3' }, + Checkbox({ + name: 'profile_flag_cdes', + label: 'Detect critical data elements (CDE) during profiling', + checked: profileFlagCdes, + onChange: (value) => profileFlagCdes.val = value, + }), + Checkbox({ + name: 'include_in_dashboard', + label: 'Include 
table group in Project Dashboard', + checked: includeInDashboard, + onChange: (value) => includeInDashboard.val = value, + }), + () => !options.editMode + ? Checkbox({ + name: 'add_scorecard_definition', + label: 'Add scorecard for table group', + help: 'Add a new scorecard to the Quality Dashboard upon creation of this table group', + checked: addScorecardDefinition, + onChange: (value) => addScorecardDefinition.val = value, + }) + : null, + ), + Input({ + name: 'profiling_delay_days', + type: 'number', + label: 'Min Profiling Age (in days)', + value: profilingDelayDays, + class: 'tg-column-flex', + help: 'Number of days to wait before new profiling will be available to generate tests', + onChange: (value, state) => { + profilingDelayDays.val = value; + options.setValidity?.('profiling_delay_days', state.valid); + }, + }), + ); +}; + +const SamplingForm = ( + options, + profileUseSampling, + profileSamplePercent, + profileSampleMinCount, +) => { + return ExpansionPanel( + { title: 'Sampling Parameters', testId: 'sampling-panel' }, + div( + { class: 'flex-column fx-gap-3' }, + Checkbox({ + name: 'profile_use_sampling', + label: 'Use profile sampling', + help: 'When checked, profiling will be based on a sample of records instead of the full table', + checked: profileUseSampling, + onChange: (value) => profileUseSampling.val = value, + }), + div( + { class: 'flex-row fx-gap-3' }, + Input({ + name: 'profile_sample_percent', + class: 'fx-flex', + type: 'number', + label: 'Sample percent', + value: profileSamplePercent, + help: 'Percent of records to include in the sample, unless the calculated count falls below the specified minimum', + onChange: (value, state) => { + profileSamplePercent.val = value; + options.setValidity?.('profile_sample_percent', state.valid); + }, + }), + Input({ + name: 'profile_sample_min_count', + class: 'fx-flex', + type: 'number', + label: 'Min Sample Record Count', + value: profileSampleMinCount, + help: 'Minimum number of records to be 
included in any sample (if available)', + onChange: (value, state) => { + profileSampleMinCount.val = value; + options.setValidity?.('profile_sample_min_count', state.valid); + }, + }), + ), + ), + ); +}; + +const TaggingForm = ( + options, + description, + dataSource, + sourceSystem, + sourceProcess, + dataLocation, + businessDomain, + stakeholderGroup, + transformLevel, + dataProduct, +) => { + return ExpansionPanel( + { title: 'Table Group Tags', testId: 'tags-panel' }, + Input({ + name: 'description', + class: 'fx-flex mb-3', + label: 'Description', + value: description, + onChange: (value, state) => { + description.val = value; + options.setValidity?.('description', state.valid); + }, + }), + div( + { class: 'tg-tagging-form-fields flex-column fx-gap-3 fx-flex-wrap' }, + Input({ + name: 'data_source', + label: 'Data Source', + value: dataSource, + help: 'Original source of the dataset', + onChange: (value, state) => { + dataSource.val = value; + options.setValidity?.('data_source', state.valid); + }, + }), + Input({ + name: 'source_process', + label: 'Source Process', + value: sourceProcess, + help: 'Process, program, or data flow that produced the dataset', + onChange: (value, state) => { + sourceProcess.val = value; + options.setValidity?.('source_process', state.valid); + }, + }), + Input({ + name: 'business_domain', + label: 'Business Domain', + value: businessDomain, + help: 'Business division responsible for the dataset, e.g., Finance, Sales, Manufacturing', + onChange: (value, state) => { + businessDomain.val = value; + options.setValidity?.('business_domain', state.valid); + }, + }), + Input({ + name: 'transform_level', + label: 'Transform Level', + value: transformLevel, + help: 'Data warehouse processing stage, e.g., Raw, Conformed, Processed, Reporting, or Medallion level (bronze, silver, gold)', + onChange: (value, state) => { + transformLevel.val = value; + options.setValidity?.('transform_level', state.valid); + }, + }), + Input({ + name: 
'source_system', + label: 'Source System', + value: sourceSystem, + help: 'Enterprise system source for the dataset', + onChange: (value, state) => { + sourceSystem.val = value; + options.setValidity?.('source_system', state.valid); + }, + }), + Input({ + name: 'data_location', + label: 'Data Location', + value: dataLocation, + help: 'Physical or virtual location of the dataset, e.g., Headquarters, Cloud', + onChange: (value, state) => { + dataLocation.val = value; + options.setValidity?.('data_location', state.valid); + }, + }), + Input({ + name: 'stakeholder_group', + label: 'Stakeholder Group', + value: stakeholderGroup, + help: 'Data owners or stakeholders responsible for the dataset', + onChange: (value, state) => { + stakeholderGroup.val = value; + options.setValidity?.('stakeholder_group', state.valid); + }, + }), + Input({ + name: 'data_product', + label: 'Data Product', + value: dataProduct, + help: 'Data domain that comprises the dataset', + onChange: (value, state) => { + dataProduct.val = value; + options.setValidity?.('data_product', state.valid); + }, + }), + ), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-column-flex { + flex: 250px; +} +.tg-tagging-form-fields { + height: 332px; +} +`); + +export { TableGroupForm }; diff --git a/testgen/ui/static/js/components/table_group_stats.js b/testgen/ui/static/js/components/table_group_stats.js new file mode 100644 index 00000000..361118cd --- /dev/null +++ b/testgen/ui/static/js/components/table_group_stats.js @@ -0,0 +1,130 @@ +/** + * @typedef TableGroupStats + * @type {object} + * @property {string} id + * @property {string} table_groups_name + * @property {string} table_group_schema + * @property {number} table_ct + * @property {number} column_ct + * @property {number} approx_record_ct + * @property {number?} record_ct + * @property {number} approx_data_point_ct + * @property {number?} data_point_ct + * + * @typedef Properties + * @type {object} + * @property {boolean?} 
hideApproxCaption + * @property {boolean?} hideWarning + * @property {string?} class + */ +import van from '../van.min.js'; +import { formatNumber } from '../display_utils.js'; +import { Alert } from '../components/alert.js'; + +const { div, span, strong } = van.tags; +const profilingWarningText = 'Profiling on large datasets could be time-consuming or resource-intensive, depending on your database configuration.'; + +/** + * @param {Properties} props + * @param {TableGroupStats} stats + * @returns {HTMLElement} + */ +const TableGroupStats = (props, stats) => { + const useApprox = stats.record_ct === null || stats.record_ct === undefined; + const rowCount = useApprox ? stats.approx_record_ct : stats.record_ct; + const dataPointCount = useApprox ? stats.approx_data_point_ct : stats.data_point_ct; + const warning = !props.hideWarning ? WarningText(rowCount, dataPointCount) : null; + + return div( + { class: `flex-column fx-gap-1 p-3 border border-radius-2 ${props.class ?? ''}` }, + span( + span({ class: 'text-secondary' }, 'Schema: '), + stats.table_group_schema, + ), + div( + { class: 'flex-row' }, + div( + { class: 'flex-column fx-gap-1', style: 'flex: 1 1 50%;' }, + span( + span({ class: 'text-secondary' }, 'Tables: '), + formatNumber(stats.table_ct), + ), + span( + span({ class: 'text-secondary' }, 'Columns: '), + formatNumber(stats.column_ct), + ), + ), + div( + { class: 'flex-column fx-gap-1', style: 'flex: 1 1 50%;' }, + span( + span({ class: 'text-secondary' }, 'Rows: '), + formatNumber(rowCount), + useApprox ? ' *' : '', + ), + span( + span({ class: 'text-secondary' }, 'Data points: '), + formatNumber(dataPointCount), + useApprox ? ' *' : '', + ), + ), + ), + useApprox && !props.hideApproxCaption + ? span( + { class: 'text-caption text-right mt-1' }, + '* Approximate counts based on server statistics', + ) + : null, + warning + ? 
Alert({ type: 'warn', icon: 'warning', class: 'mt-2' }, warning)
+ : null,
+ );
+};
+
+/**
+ * @param {number | null} rowCount
+ * @param {number | null} dataPointCount
+ * @returns {HTMLElement | null}
+ */
+const WarningText = (rowCount, dataPointCount) => {
+ if (rowCount == null) { // Unknown counts
+ return div(`WARNING: ${profilingWarningText}`);
+ }
+
+ const rowTier = getStatTier(rowCount);
+ const dataPointTier = getStatTier(dataPointCount);
+
+ if (rowTier || dataPointTier) {
+ let category;
+ if (rowTier && dataPointTier) {
+ category = rowTier === dataPointTier
+ ? [ strong(rowTier), ' of rows and data points' ]
+ : [ strong(rowTier), ' of rows and ', strong(dataPointTier), ' of data points' ];
+ } else {
+ category = rowTier
+ ? [ strong(rowTier), ' of rows' ]
+ : [ strong(dataPointTier), ' of data points' ];
+ }
+ return div(
+ div('WARNING: The table group has ', ...category, '.'),
+ div({ class: 'mt-2' }, profilingWarningText),
+ );
+ }
+ return null;
+}
+
+/**
+ * @param {number | null} count
+ * @returns {string | null}
+ */
+function getStatTier(count) {
+ if (count > 1000000000) {
+ return 'billions';
+ } else if (count > 1000000) {
+ return 'millions';
+ } else if (count > 100000) {
+ return 'hundreds of thousands';
+ }
+ return null;
+}
+
+export { TableGroupStats };
diff --git a/testgen/ui/static/js/components/table_group_test.js b/testgen/ui/static/js/components/table_group_test.js
new file mode 100644
index 00000000..ff987f06
--- /dev/null
+++ b/testgen/ui/static/js/components/table_group_test.js
@@ -0,0 +1,127 @@
+/**
+ * @import { TableGroupStats } from './table_group_stats.js'
+ *
+ * @typedef TablePreview
+ * @type {object}
+ * @property {number} column_ct
+ * @property {number} approx_record_ct
+ * @property {number} approx_data_point_ct
+ * @property {boolean} can_access
+ *
+ * @typedef TableGroupPreview
+ * @type {object}
+ * @property {TableGroupStats} stats
+ * @property {Record<string, TablePreview>?} tables
+ * @property
{boolean?} success + * @property {string?} message + * + * @typedef ComponentOptions + * @type {object} + * @property {(() => void)?} onVerifyAcess + */ +import van from '../van.min.js'; +import { getValue } from '../utils.js'; +import { formatNumber } from '../display_utils.js'; +import { Alert } from '../components/alert.js'; +import { Icon } from '../components/icon.js'; +import { Button } from '../components/button.js'; +import { TableGroupStats } from './table_group_stats.js'; + +const { div, span } = van.tags; + +/** + * @param {TableGroupPreview?} preview + * @param {ComponentOptions} options + * @returns {HTMLElement} + */ +const TableGroupTest = (preview, options) => { + return div( + { class: 'flex-column fx-gap-2' }, + div( + { class: 'flex-row fx-justify-space-between fx-align-flex-end' }, + span({ class: 'text-caption text-right' }, '* Approximate row counts based on server statistics'), + options.onVerifyAcess + ? div( + { class: 'flex-row' }, + span({ class: 'fx-flex' }), + Button({ + label: 'Verify Access', + width: 'fit-content', + type: 'stroked', + onclick: options.onVerifyAcess, + }), + ) + : '', + ), + () => getValue(preview) + ? TableGroupStats({ hideWarning: true, hideApproxCaption: true }, getValue(preview).stats) + : '', + () => { + const tableGroupPreview = getValue(preview); + const wasPreviewExecuted = tableGroupPreview && typeof tableGroupPreview.success === 'boolean'; + + if (!wasPreviewExecuted) { + return ''; + } + + const tables = tableGroupPreview?.tables ?? {}; + const hasTables = Object.keys(tables).length > 0; + const verifiedAccess = Object.values(tables).some(({ can_access }) => can_access != null); + const tableAccessWarning = Object.values(tables).some(({ can_access }) => can_access != null && can_access === false) + ? 
tableGroupPreview.message + : ''; + + const columns = ['50%', '14%', '14%', '14%', '8%']; + + return div( + {class: 'flex-column fx-gap-2'}, + div( + { class: 'table hoverable p-3 pb-0' }, + div( + { class: 'table-header flex-row' }, + span({ style: `flex: 1 1 ${columns[0]}; max-width: ${columns[0]};` }, 'Tables'), + span({ style: `flex: 1 1 ${columns[1]};` }, 'Columns'), + span({ style: `flex: 1 1 ${columns[2]};` }, 'Rows *'), + span({ style: `flex: 1 1 ${columns[3]};` }, 'Data Points *'), + verifiedAccess + ? span({class: 'flex-row fx-justify-center', style: `flex: 1 1 ${columns[4]};`}, 'Can access?') + : '', + ), + div( + { class: 'flex-column', style: 'max-height: 400px; overflow-y: auto;' }, + hasTables + ? Object.entries(tables).map(([ tableName, table ]) => + div( + { class: 'table-row flex-row fx-justify-space-between' }, + span( + { style: `flex: 1 1 ${columns[0]}; max-width: ${columns[0]}; word-wrap: break-word;` }, + tableName, + ), + span({ style: `flex: 1 1 ${columns[1]};` }, formatNumber(table.column_ct)), + span({ style: `flex: 1 1 ${columns[2]};` }, formatNumber(table.approx_record_ct)), + span({ style: `flex: 1 1 ${columns[3]};` }, formatNumber(table.approx_data_point_ct)), + table.can_access != null + ? span( + {class: 'flex-row fx-justify-center', style: `flex: 1 1 ${columns[4]};`}, + table.can_access + ? Icon({classes: 'text-green', size: 20}, 'check_circle') + : Icon({classes: 'text-error', size: 20}, 'dangerous'), + ) + : '', + ), + ) + : div( + { class: 'flex-row fx-justify-center', style: 'height: 50px; font-size: 16px;'}, + tableGroupPreview.message ?? 'No tables found.' + ), + ), + ), + tableAccessWarning ? 
+ Alert({type: 'warn', closeable: true, icon: 'warning'}, span(tableAccessWarning)) + : '', + ); + }, + ); +}; + +export { TableGroupTest }; diff --git a/testgen/ui/static/js/components/tabs.js b/testgen/ui/static/js/components/tabs.js new file mode 100644 index 00000000..b23b9ca5 --- /dev/null +++ b/testgen/ui/static/js/components/tabs.js @@ -0,0 +1,128 @@ +/** + * @typedef {Object} TabProps + * @property {string} label + */ +import { getValue, loadStylesheet } from '../utils.js'; +import van from '../van.min.js'; + +const { div, button, span } = van.tags; + +/** + * @param {TabProps} props + * @param {...any} children + * @returns {{label: string, children: van.ChildDom[]}} + */ +const Tab = ({ label }, ...children) => ({ + label, + children, +}); + +/** + * @param {object} props + * @param {...Tab} tabs + */ +const Tabs = (props, ...tabs) => { + loadStylesheet('tabs', stylesheet); + + const activeTab = van.state(0); + + let labelsContainerEl; + const highlightEl = span({ class: "tg-tabs--highlight" }); + + const updateHighlight = () => { + if (!labelsContainerEl?.isConnected || !labelsContainerEl.children.length) return; + + const activeLabel = labelsContainerEl.children[activeTab.val]; + if (!activeLabel) return; + + highlightEl.style.width = `${activeLabel.offsetWidth}px`; + highlightEl.style.left = `${activeLabel.offsetLeft}px`; + highlightEl.style.opacity = '1'; + }; + + labelsContainerEl = div( + { class: "tg-tabs--labels" }, + ...tabs.map((tab, i) => + button({ + class: () => `tg-tabs--tab--label ${i === activeTab.val ? 'active' : ''}`, + onclick: () => (activeTab.val = i), + }, + tab.label + )), + highlightEl, + ); + + const tabsContainerEl = div({ ...props, class: () => `${getValue(props.class) ?? 
''} tg-tabs--container` }, + labelsContainerEl, + div({ class: "tg-tabs--content" }, () => div({class: "tg-tabs--content-inner"}, tabs[activeTab.val].children)), + ); + + van.derive(() => { + activeTab.val; + requestAnimationFrame(updateHighlight); + }); + + const resizeObserver = new ResizeObserver(() => { + requestAnimationFrame(updateHighlight); + }); + + tabsContainerEl.onadd = () => { + resizeObserver.observe(labelsContainerEl); + updateHighlight(); + }; + + tabsContainerEl.onremove = () => { + resizeObserver.disconnect(); + }; + + return tabsContainerEl; +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-tabs--container { + width: 100%; +} + +.tg-tabs--labels { + position: relative; + display: flex; + border-bottom: 1px solid #dddfe2; +} + +.tg-tabs--tab--label { + padding: 12px 20px; + cursor: pointer; + background-color: transparent; + border: none; + font-size: 0.875rem; + color: var(--secondary-text-color); + font-weight: 500; + transition: color 0.2s ease-in-out; + white-space: nowrap; +} + +.tg-tabs--tab--label:hover { + color: var(--primary-color); + border-radius: 6px 6px 0 0; +} + +.tg-tabs--tab--label.active { + color: var(--primary-color); +} + +.tg-tabs--highlight { + position: absolute; + bottom: -1px; + height: 2px; + background-color: var(--primary-color); + transition: left 0.3s cubic-bezier(0.25, 0.8, 0.25, 1), width 0.3s cubic-bezier(0.25, 0.8, 0.25, 1); + opacity: 0; +} + +.tg-tabs--content { + padding-top: 20px; +} +`); + +export { Tabs, Tab }; \ No newline at end of file diff --git a/testgen/ui/static/js/components/test_definition_form.js b/testgen/ui/static/js/components/test_definition_form.js new file mode 100644 index 00000000..31812f87 --- /dev/null +++ b/testgen/ui/static/js/components/test_definition_form.js @@ -0,0 +1,451 @@ +/** + * @typedef TestDefinition + * @type {object} + * @property {string} id + * @property {string} table_groups_id + * @property {string?} profile_run_id + * @property {string} 
test_type + * @property {string} test_suite_id + * @property {string?} test_description + * @property {string} schema_name + * @property {string?} table_name + * @property {string?} column_name + * @property {number?} skip_errors + * @property {string?} baseline_ct + * @property {string?} baseline_unique_ct + * @property {string?} baseline_value + * @property {string?} baseline_value_ct + * @property {string?} threshold_value + * @property {string?} baseline_sum + * @property {string?} baseline_avg + * @property {string?} baseline_sd + * @property {string?} lower_tolerance + * @property {string?} upper_tolerance + * @property {string?} subset_condition + * @property {string?} groupby_names + * @property {string?} having_condition + * @property {string?} window_date_column + * @property {number?} window_days + * @property {string?} match_schema_name + * @property {string?} match_table_name + * @property {string?} match_column_names + * @property {string?} match_subset_condition + * @property {string?} match_groupby_names + * @property {string?} match_having_condition + * @property {string?} custom_query + * @property {string?} history_calculation + * @property {string?} history_calculation_upper + * @property {number?} history_lookback + * @property {boolean} test_active + * @property {string?} test_definition_status + * @property {string?} severity + * @property {boolean} lock_refresh + * @property {number?} last_auto_gen_date + * @property {number?} profiling_as_of_date + * @property {number?} last_manual_update + * @property {boolean} export_to_observability + * @property {string} test_name_short + * @property {string} default_test_description + * @property {string} measure_uom + * @property {string} measure_uom_description + * @property {string} default_parm_columns + * @property {string} default_parm_prompts + * @property {string} default_parm_help + * @property {string} default_severity + * @property {'column'|'referential'|'table'|'tablegroup'|'custom'} 
test_scope + * @property {string?} prediction + * + * @typedef Properties + * @type {object} + * @property {TestDefinition} definition + * @property {string?} class + * @property {(changes: object, state: {dirty: boolean, valid: boolean}) => void} onChange + */ + +import van from '../van.min.js'; +import { getValue, isEqual, loadStylesheet } from '../utils.js'; +import { Input } from './input.js'; +import { Select } from './select.js'; +import { Textarea } from './textarea.js'; +import { RadioGroup } from './radio_group.js'; +import { Caption } from './caption.js'; +import { numberBetween } from '../form_validators.js'; + +const { div, span } = van.tags; + +const thresholdColumns = [ + 'history_calculation', + 'history_calculation_upper', + 'history_lookback', + 'lower_tolerance', + 'upper_tolerance', +]; + +// Columns using the default { type: 'text' } do not need to be specified here +const PARAMETER_CONFIG = { + custom_query: { type: 'textarea' }, + lower_tolerance: { type: 'number' }, + upper_tolerance: { type: 'number' }, +}; + + +const TestDefinitionForm = (/** @type Properties */ props) => { + loadStylesheet('test-definition-form', stylesheet); + + const definition = getValue(props.definition); + + const paramColumns = (definition.default_parm_columns || '').split(',').map(v => v.trim()); + const paramLabels = (definition.default_parm_prompts || '').split(',').map(v => v.trim()); + const paramHelp = (definition.default_parm_help || '').split('|').map(v => v.trim()); + + const hasThresholds = paramColumns.includes('history_calculation'); + const dynamicParamColumns = paramColumns + .map((column, index) => ({ + ...(PARAMETER_CONFIG[column] || { type: 'text' }), + column, + label: paramLabels[index] || column.replaceAll('_', ' '), + help: paramHelp[index] || null, + })) + .filter(config => !hasThresholds || !thresholdColumns.includes(config.column)); + + const updatedDefinition = van.state({ ...definition }); + const validityPerField = van.state({}); + + van.derive(() => { + const 
newDefinition = updatedDefinition.val + const fieldsValidity = validityPerField.val; + const isValid = Object.keys(fieldsValidity).length > 0 && + Object.values(fieldsValidity).every(v => v); + + const changes = {}; + for (const key in newDefinition) { + if (!isEqual(newDefinition[key], definition[key])) { + changes[key] = newDefinition[key]; + } + } + props.onChange?.(changes, { dirty: !!Object.keys(changes).length, valid: isValid }); + }); + + const setFieldValues = (updatedValues) => { + updatedDefinition.val = { ...updatedDefinition.rawVal, ...updatedValues }; + }; + + const setFieldValidity = (field, validity) => { + validityPerField.val = { ...validityPerField.rawVal, [field]: validity }; + }; + + return div( + { class: props.class }, + div( + { class: 'mb-2' }, + div({ class: 'text-large' }, definition.test_name_short), + definition.test_description || definition.default_test_description + ? span({ class: 'text-caption mt-2' }, definition.test_description ?? definition.default_test_description) + : null, + ), + () => div( + { class: 'flex-row fx-flex-wrap fx-gap-3' }, + dynamicParamColumns.map(config => { + const column = config.column; + const currentValue = () => updatedDefinition.val[column] ?? 
config.default; + + if (config.type === 'select') { + return div( + { class: 'td-form--field' }, + () => Select({ + label: config.label, + options: config.options, + value: currentValue(), + onChange: (value) => setFieldValues({ [column]: value }), + }), + ); + } + + if (config.type === 'number') { + return div( + { class: 'td-form--field' }, + () => Input({ + name: column, + label: config.label, + help: config.help, + type: 'number', + value: currentValue(), + step: config.step, + onChange: (value, state) => { + setFieldValues({ [column]: value || null }) + setFieldValidity(column, state.valid); + }, + }), + ); + } + + if (config.type === 'textarea') { + return div( + { class: 'td-form--field-wide' }, + () => Textarea({ + name: column, + label: config.label, + help: config.help, + value: currentValue(), + height: 100, + onChange: (value) => { + setFieldValues({ [column]: value || null }) + }, + }), + ); + } + + return div( + { class: 'td-form--field' }, + () => Input({ + name: column, + label: config.label, + help: config.help, + value: currentValue(), + onChange: (value, state) => { + setFieldValues({ [column]: value || null }) + setFieldValidity(column, state.valid); + }, + }), + ); + }), + ), + hasThresholds + ? 
ThresholdForm( + { setFieldValues, setFieldValidity }, + definition, + ) + : null, + ); +}; + +const thresholdModeOptions = [ + { + label: 'Prediction Model', + value: 'prediction', + help: 'Use time series prediction to automatically determine expected bounds', + }, + { + label: 'Historical Calculation', + value: 'historical', + help: 'Calculate bounds based on historical results', + }, + { + label: 'Static Thresholds', + value: 'static', + help: 'Manually specify fixed upper and lower bounds', + }, +]; + +const historyCalcOptions = [ + { label: 'Value', value: 'Value' }, + { label: 'Minimum', value: 'Minimum' }, + { label: 'Maximum', value: 'Maximum' }, + { label: 'Sum', value: 'Sum' }, + { label: 'Average', value: 'Average' }, + { label: 'Expression', value: 'Expression' }, +]; + +/** + * @typedef ThresholdFormOptions + * @type {object} + * @property {(updatedValues: object) => void} setFieldValues + * @property {(field: string, valid: boolean) => void} setFieldValidity + * + * @param {ThresholdFormOptions} options + * @param {TestDefinition} definition + */ +const ThresholdForm = (options, definition) => { + const { setFieldValues, setFieldValidity } = options; + const isFreshnessTrend = definition.test_type === 'Freshness_Trend'; + const initialHistoryCalc = definition.history_calculation; + + const initialMode = initialHistoryCalc === 'PREDICT' ? 'prediction' : initialHistoryCalc ? 'historical' : 'static'; + const mode = van.state(initialMode); + + const historyCalc = van.state(initialHistoryCalc === 'PREDICT' || !initialHistoryCalc ? 'Minimum' : initialHistoryCalc); + const historyCalcUpper = van.state(definition.history_calculation_upper ?? 
'Maximum'); + const historyLookback = van.state(definition.history_lookback || 10); + const lowerTolerance = van.state(definition.lower_tolerance); + const upperTolerance = van.state(definition.upper_tolerance); + + const lowerParsed = van.derive(() => parseExpressionValue(historyCalc.val)); + const upperParsed = van.derive(() => parseExpressionValue(historyCalcUpper.val)); + + return div( + { class: 'flex-column fx-gap-4 border border-radius-1 p-3 mt-5', style: 'position: relative;' }, + Caption({ content: 'Thresholds', style: 'position: absolute; top: -10px; background: var(--app-background-color); padding: 0px 8px;' }), + RadioGroup({ + name: 'threshold_mode', + options: isFreshnessTrend + ? thresholdModeOptions.filter(option => option.value !== 'historical') + : thresholdModeOptions, + value: mode, + layout: 'vertical', + onChange: (newMode) => { + mode.val = newMode; + options.setFieldValues({ + 'history_calculation': newMode === 'prediction' ? 'PREDICT' : newMode === 'historical' ? historyCalc.val : null, + 'history_calculation_upper': newMode === 'historical' ? historyCalcUpper.val : null, + 'history_lookback': newMode === 'historical' ? historyLookback.val : null, + 'lower_tolerance': newMode === 'static' ? lowerTolerance.val : newMode === 'prediction' ? definition.lower_tolerance : null, + 'upper_tolerance': newMode === 'static' ? upperTolerance.val : newMode === 'prediction' ? definition.upper_tolerance : null, + }); + }, + }), + () => { + if (mode.val === 'historical') { + return div( + { class: 'flex-column fx-gap-3 mt-2' }, + div( + { class: 'flex-row fx-align-flex-start fx-gap-3 fx-flex-wrap' }, + div( + { class: 'td-form--field flex-column fx-gap-3' }, + () => Select({ + label: 'Lower Bound Calculation', + options: historyCalcOptions, + value: lowerParsed.val.selectValue, + onChange: (value) => { + const fieldValue = value === 'Expression' ? 
formatExpressionValue('') : value; + historyCalc.val = fieldValue; + setFieldValues({ history_calculation: fieldValue }); + }, + }), + () => lowerParsed.val.isExpression + ? Input({ + name: 'history_calculation_expression', + label: 'Lower Bound Expression', + value: lowerParsed.val.expression, + help: 'Use {VALUE}, {MINIMUM}, {MAXIMUM}, {SUM}, {AVERAGE}, {STANDARD_DEVIATION} to reference historical aggregates. Example: 0.5 * {AVERAGE}', + onChange: (value) => { + const fieldValue = formatExpressionValue(value); + setFieldValues({ history_calculation: fieldValue }); + }, + }) + : '', + ), + div( + { class: 'td-form--field flex-column fx-gap-3' }, + () => Select({ + label: 'Upper Bound Calculation', + options: historyCalcOptions, + value: upperParsed.val.selectValue, + onChange: (value) => { + const fieldValue = value === 'Expression' ? formatExpressionValue('') : value; + historyCalcUpper.val = fieldValue; + setFieldValues({ history_calculation_upper: fieldValue }); + }, + }), + () => upperParsed.val.isExpression + ? Input({ + name: 'history_calculation_upper_expression', + label: 'Upper Bound Expression', + value: upperParsed.val.expression, + help: 'Use {VALUE}, {MINIMUM}, {MAXIMUM}, {SUM}, {AVERAGE}, {STANDARD_DEVIATION} to reference historical aggregates. 
Example: 1.5 * {AVERAGE}', + onChange: (value) => { + const fieldValue = formatExpressionValue(value); + setFieldValues({ history_calculation_upper: fieldValue }); + }, + }) + : '', + ), + ), + div( + { class: 'flex-row fx-gap-3' }, + div( + { class: 'td-form--field' }, + Input({ + name: 'history_lookback', + label: 'History Lookback', + type: 'number', + value: historyLookback, + help: 'Number of historical runs to use for calculation', + step: 1, + disabled: () => lowerParsed.val.selectValue === 'Value' && upperParsed.val.selectValue === 'Value', + onChange: (value, state) => { + historyLookback.val = value; + setFieldValues({ history_lookback: value }); + setFieldValidity('history_lookback', state.valid); + }, + validators: [numberBetween(1, 1000, 1)], + }), + ), + ) + ); + } + + if (mode.val === 'static') { + return div( + { class: 'flex-row fx-gap-3 fx-flex-wrap mt-2' }, + !isFreshnessTrend + ? div( + { class: 'td-form--field' }, + Input({ + name: 'lower_tolerance', + label: 'Lower Bound', + type: 'number', + value: lowerTolerance, + onChange: (value, state) => { + lowerTolerance.val = value; + setFieldValues({ lower_tolerance: value }); + setFieldValidity('lower_tolerance', state.valid); + }, + }), + ) + : null, + div( + { class: 'td-form--field' }, + Input({ + name: 'upper_tolerance', + label: isFreshnessTrend ? 'Maximum interval since last update (minutes)' : 'Upper Bound', + type: 'number', + value: upperTolerance, + onChange: (value, state) => { + upperTolerance.val = value; + setFieldValues({ upper_tolerance: value }); + setFieldValidity('upper_tolerance', state.valid); + }, + }), + ), + ); + } + + return span({ class: 'text-caption mt-2' }, 'The prediction model will automatically determine expected bounds based on historical patterns.'); + }, + ); +}; + +/** + * @param {string?} value + * @returns {{ isExpression: boolean, selectValue: string?, expression: string? 
}} + */ +const parseExpressionValue = (value) => { + if (!value) { + return { isExpression: false, selectValue: value, expression: null }; + } + // Format: EXPR:[...] + const match = value.match(/^EXPR:\[(.*)\]$/); + if (match) { + return { isExpression: true, selectValue: 'Expression', expression: match[1] }; + } + return { isExpression: false, selectValue: value, expression: null }; +}; + +/** + * @param {string?} expression + * @returns {string} + */ +const formatExpressionValue = (expression) => `EXPR:[${expression || ''}]`; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.td-form--field { + flex: calc(50% - 8px) 0 0; +} + +.td-form--field-wide { + flex: 100% 1 1; +} +`); + +export { TestDefinitionForm }; diff --git a/testgen/ui/static/js/components/textarea.js b/testgen/ui/static/js/components/textarea.js new file mode 100644 index 00000000..828d8c86 --- /dev/null +++ b/testgen/ui/static/js/components/textarea.js @@ -0,0 +1,101 @@ +/** + * @typedef Properties + * @type {object} + * @property {string?} id + * @property {string?} name + * @property {string?} label + * @property {string?} help + * @property {TooltipProperties['position']} helpPlacement + * @property {(string | number)?} value + * @property {string?} placeholder + * @property {string?} icon + * @property {boolean?} disabled + * @property {function(string, InputState)?} onChange + * @property {string?} style + * @property {string?} class + * @property {number?} width + * @property {number?} height + * @property {string?} testId + */ +import van from '../van.min.js'; +import { debounce, getValue, loadStylesheet, getRandomId } from '../utils.js'; +import { Icon } from './icon.js'; +import { withTooltip } from './tooltip.js'; + +const { div, label, textarea } = van.tags; +const defaultHeight = 64; + +const Textarea = (/** @type Properties */ props) => { + loadStylesheet('textarea', stylesheet); + + const domId = van.derive(() => getValue(props.id) ?? 
getRandomId()); + const value = van.derive(() => getValue(props.value) ?? ''); + + const onChange = props.onChange?.val ?? props.onChange; + if (onChange) { + onChange(value.val); + } + van.derive(() => { + const onChange = props.onChange?.val ?? props.onChange; + if (onChange && value.val !== value.oldVal) { + onChange(value.val); + } + }); + + return label( + { + id: domId, + class: () => `flex-column fx-gap-1 ${getValue(props.class) ?? ''}`, + style: () => `width: ${props.width ? getValue(props.width) + 'px' : 'auto'}; ${getValue(props.style) ?? ''}`, + 'data-testid': props.testId ?? props.name ?? '', + }, + div( + { class: 'flex-row fx-gap-1 text-caption' }, + props.label, + () => getValue(props.help) + ? withTooltip( + Icon({ size: 16, classes: 'text-disabled' }, 'help'), + { text: props.help, position: getValue(props.helpPlacement) ?? 'top', width: 200 } + ) + : null, + ), + textarea({ + class: () => `tg-textarea--field ${getValue(props.disabled) ? 'tg-textarea--disabled' : ''}`, + style: () => `min-height: ${getValue(props.height) || defaultHeight}px;`, + value, + name: props.name ?? '', + disabled: props.disabled, + placeholder: () => getValue(props.placeholder) ?? 
'', + oninput: debounce((/** @type Event */ event) => value.val = event.target.value, 300), + }), + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-textarea--field { + box-sizing: border-box; + width: 100%; + border-radius: 8px; + border: 1px solid transparent; + transition: border-color 0.3s; + background-color: var(--form-field-color); + padding: 4px 8px; + color: var(--primary-text-color); + font-size: 14px; + resize: vertical; +} + +.tg-textarea--field::placeholder { + font-style: italic; + color: var(--disabled-text-color); +} + +.tg-textarea--field:focus, +.tg-textarea--field:focus-visible { + outline: none; + border-color: var(--primary-color); +} +`); + +export { Textarea }; diff --git a/testgen/ui/static/js/components/threshold_chart.js b/testgen/ui/static/js/components/threshold_chart.js new file mode 100644 index 00000000..ea92d8ad --- /dev/null +++ b/testgen/ui/static/js/components/threshold_chart.js @@ -0,0 +1,106 @@ +/** + * @import {ChartViewBox, DrawingArea} from './chart_canvas.js'; + * + * @typedef Point + * @type {object} + * @property {number} x + * @property {number} y + * + * @typedef Options + * @type {object} + * @property {number} width + * @property {number} height + * @property {DrawingArea} area + * @property {ChartViewBox} viewBox + * @property {number} paddingLeft + * @property {number} paddingRight + * @property {string} color + * @property {number} lineWidth + * @property {string} markerColor + * @property {number} markerSize + * @property {Point?} nestedPosition + * @property {number[]?} yAxisTicks + * + * @typedef MonitoringEvent + * @type {object} + * @property {number} value + * @property {string} time + */ +import van from '../van.min.js'; +import { colorMap } from '../display_utils.js'; +import { getValue } from '../utils.js'; + +const { polygon, polyline, svg } = van.tags("http://www.w3.org/2000/svg"); + +/** + * + * @param {Options} options + * @param {Array} line1 + * @param {Array?} line2 + */ 
+const ThresholdChart = (options, line1, line2) => { + const _options = { + ...defaultOptions, + ...(options ?? {}), + }; + + const minX = van.state(0); + const minY = van.state(0); + const width = van.state(0); + const height = van.state(0); + const widthFactor = van.state(1.0); + + van.derive(() => { + const viewBox = getValue(_options.viewBox); + width.val = viewBox.width; + height.val = viewBox.height; + minX.val = viewBox.minX; + minY.val = viewBox.minY; + widthFactor.val = viewBox.widthFactor; + }); + + const extraAttributes = {}; + if (_options.nestedPosition) { + extraAttributes.x = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).x; + extraAttributes.y = () => (_options.nestedPosition?.rawVal || _options.nestedPosition).y; + } else { + extraAttributes.viewBox = () => `${minX.val} ${minY.val} ${width.val} ${height.val}`; + } + + let content = () => polyline({ + points: line1.map(point => `${point.x} ${point.y}`).join(', '), + style: `stroke: ${getValue(_options.color)}; stroke-width: ${getValue(_options.lineWidth)};`, + fill: 'none', + }); + if (line2) { + content = () => polygon({ + points: `${line1.map(point => `${point.x} ${point.y}`).join(', ')} ${line2.map(point => `${point.x} ${point.y}`).join(', ')}`, + fill: getValue(_options.color), + stroke: 'none', + }); + } + + return svg( + { + width: '100%', + height: '100%', + style: `overflow: visible;`, + ...extraAttributes, + }, + content, + ); +}; + +const /** @type Options */ defaultOptions = { + width: 600, + height: 200, + paddingLeft: 16, + paddingRight: 16, + color: colorMap.redLight, + lineWidth: 3, + markerColor: colorMap.red, + markerSize: 8, + yAxisTicks: undefined, +}; + +export { ThresholdChart }; diff --git a/testgen/ui/static/js/components/toggle.js b/testgen/ui/static/js/components/toggle.js new file mode 100644 index 00000000..0a635c7c --- /dev/null +++ b/testgen/ui/static/js/components/toggle.js @@ -0,0 +1,89 @@ +/** + * @typedef Properties + * @type {object} + * 
@property {string} label + * @property {string?} name + * @property {boolean?} checked + * @property {string?} style + * @property {function(boolean)?} onChange + */ +import van from '../van.min.js'; +import { loadStylesheet } from '../utils.js'; + +const { input, label } = van.tags; + +const Toggle = (/** @type Properties */ props) => { + loadStylesheet('toggle', stylesheet); + + return label( + { class: 'flex-row fx-gap-2 clickable', style: props.style ?? '', 'data-testid': props.name ?? '' }, + input({ + type: 'checkbox', + role: 'switch', + class: 'tg-toggle--input clickable', + name: props.name ?? '', + checked: props.checked, + onchange: van.derive(() => { + const onChange = props.onChange?.val ?? props.onChange; + return onChange ? (/** @type Event */ event) => onChange(event.target.checked) : null; + }), + }), + props.label, + ); +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-toggle--input { + appearance: none; + margin: 0; + width: 28px; + height: 16px; + flex-shrink: 0; + border-radius: 8px; + background-color: var(--disabled-text-color); + position: relative; + transition-property: background-color; + transition-duration: 0.3s; +} + +.tg-toggle--input::after { + content: ''; + position: absolute; + top: 2px; + left: 2px; + width: 12px; + height: 12px; + border-radius: 6px; + background-color: #fff; + transition-property: left; + transition-duration: 0.3s; +} + +.tg-toggle--input:focus, +.tg-toggle--input:focus-visible { + outline: none; +} + +.tg-toggle--input:focus-visible::before { + content: ''; + box-sizing: border-box; + position: absolute; + top: -3px; + left: -3px; + width: 34px; + height: 22px; + border: 3px solid var(--border-color); + border-radius: 11px; +} + +.tg-toggle--input:checked { + background-color: var(--primary-color); +} + +.tg-toggle--input:checked::after { + left: 14px; +} +`); + +export { Toggle }; diff --git a/testgen/ui/static/js/components/tooltip.js b/testgen/ui/static/js/components/tooltip.js new 
file mode 100644 index 00000000..e3b23a39 --- /dev/null +++ b/testgen/ui/static/js/components/tooltip.js @@ -0,0 +1,171 @@ +// Code modified from vanjs-ui +// https://www.npmjs.com/package/vanjs-ui +// https://cdn.jsdelivr.net/npm/vanjs-ui@0.10.0/dist/van-ui.nomodule.js + +/** + * @typedef {'top-left' | 'top' | 'top-right' | 'right' | 'bottom-right' | 'bottom' | 'bottom-left' | 'left'} TooltipPosition + * + * @typedef Properties + * @type {object} + * @property {string} text + * @property {boolean} show + * @property {TooltipPosition?} position + * @property {number} width + * @property {string?} style + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet } from '../utils.js'; + +const { div, span } = van.tags; +const defaultPosition = 'top'; + +const Tooltip = (/** @type Properties */ props) => { + loadStylesheet('tooltip', stylesheet); + + return span( + { + class: () => `tg-tooltip ${getValue(props.position) || defaultPosition} ${getValue(props.show) ? '' : 'hidden'}`, + style: () => `opacity: ${getValue(props.show) ? 1 : 0}; max-width: ${getValue(props.width) || '400'}px; ${getValue(props.style) ?? 
''}`, + }, + props.text, + div({ class: 'tg-tooltip--triangle' }), + ); +}; + +const withTooltip = (/** @type HTMLElement */ component, /** @type Properties */ tooltipProps) => { + const showTooltip = van.state(false); + const tooltip = Tooltip({ ...tooltipProps, show: showTooltip }); + + component.onmouseenter = () => showTooltip.val = true; + component.onmouseleave = () => showTooltip.val = false; + component.appendChild(tooltip); + + return component; +}; + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-tooltip { + width: max-content; + position: absolute; + z-index: 1; + border-radius: 4px; + background-color: var(--tooltip-color); + padding: 4px 8px; + color: var(--tooltip-text-color); + font-size: 13px; + font-family: 'Roboto', 'Helvetica Neue', sans-serif; + text-align: center; + text-wrap: wrap; + transition: opacity 0.3s; +} + +.tg-tooltip--triangle { + width: 0; + height: 0; + position: absolute; + border: solid transparent; +} + +.tg-tooltip.top-left { + right: 50%; + bottom: 125%; + transform: translateX(20px); +} +.top-left .tg-tooltip--triangle { + bottom: -5px; + right: 20px; + margin-right: -5px; + border-width: 5px 5px 0; + border-top-color: var(--tooltip-color); +} + +.tg-tooltip.top { + left: 50%; + bottom: 125%; + transform: translateX(-50%); +} +.top .tg-tooltip--triangle { + bottom: -5px; + left: 50%; + margin-left: -5px; + border-width: 5px 5px 0; + border-top-color: var(--tooltip-color); +} + +.tg-tooltip.top-right { + left: 50%; + bottom: 125%; + transform: translateX(-20px); +} +.top-right .tg-tooltip--triangle { + bottom: -5px; + left: 20px; + margin-left: -5px; + border-width: 5px 5px 0; + border-top-color: var(--tooltip-color); +} + +.tg-tooltip.right { + left: 125%; +} +.right .tg-tooltip--triangle { + top: 50%; + left: -5px; + margin-top: -5px; + border-width: 5px 5px 5px 0; + border-right-color: var(--tooltip-color); +} + +.tg-tooltip.bottom-right { + left: 50%; + top: 125%; + transform: translateX(-20px); +} 
+.bottom-right .tg-tooltip--triangle { + top: -5px; + left: 20px; + margin-left: -5px; + border-width: 0 5px 5px; + border-bottom-color: var(--tooltip-color); +} + +.tg-tooltip.bottom { + top: 125%; + left: 50%; + transform: translateX(-50%); +} +.bottom .tg-tooltip--triangle { + top: -5px; + left: 50%; + margin-left: -5px; + border-width: 0 5px 5px; + border-bottom-color: var(--tooltip-color); +} + +.tg-tooltip.bottom-left { + right: 50%; + top: 125%; + transform: translateX(20px); +} +.bottom-left .tg-tooltip--triangle { + top: -5px; + right: 20px; + margin-right: -5px; + border-width: 0 5px 5px; + border-bottom-color: var(--tooltip-color); +} + +.tg-tooltip.left { + right: 125%; +} +.left .tg-tooltip--triangle { + top: 50%; + right: -5px; + margin-top: -5px; + border-width: 5px 0 5px 5px; + border-left-color: var(--tooltip-color); +} +`); + +export { Tooltip, withTooltip }; diff --git a/testgen/ui/static/js/components/tree.js b/testgen/ui/static/js/components/tree.js new file mode 100644 index 00000000..82acc371 --- /dev/null +++ b/testgen/ui/static/js/components/tree.js @@ -0,0 +1,528 @@ +/** + * @typedef TreeNode + * @type {object} + * @property {string} id + * @property {string} label + * @property {string?} classes + * @property {string?} icon + * @property {number?} iconSize + * @property {'red'?} iconColor + * @property {string?} iconTooltip + * @property {TreeNode[]?} children + * @property {number?} level + * @property {boolean?} expanded + * @property {boolean?} hidden + * @property {boolean?} selected + * + * @typedef SelectedNode + * @type {object} + * @property {string} id + * @property {boolean} all + * @property {SelectedNode[]?} children + * + * @typedef Properties + * @type {object} + * @property {string} id + * @property {string} classes + * @property {TreeNode[]} nodes + * @property {(string|string[])?} selected + * @property {function(string)?} onSelect + * @property {boolean?} multiSelect + * @property {boolean?} multiSelectToggle + * 
@property {string?} multiSelectToggleLabel + * @property {function(SelectedNode[] | null)?} onMultiSelect + * @property {(function(TreeNode, string): boolean) | null} isNodeHidden + * @property {function()?} onApplySearchOptions + * @property {(function(): boolean) | null} hasActiveFilters + * @property {function()?} onApplyFilters + * @property {function()?} onResetFilters + */ +import van from '../van.min.js'; +import { getValue, loadStylesheet, getRandomId, isState } from '../utils.js'; +import { Input } from './input.js'; +import { Button } from './button.js'; +import { Portal } from './portal.js'; +import { Icon } from './icon.js'; +import { Checkbox } from './checkbox.js'; +import { Toggle } from './toggle.js'; +import { withTooltip } from './tooltip.js'; +import { caseInsensitiveIncludes } from '../display_utils.js'; + +const { div, h3, span } = van.tags; +const levelOffset = 14; + +const Tree = (/** @type Properties */ props, /** @type any? */ searchOptionsContent, /** @type any? */ filtersContent) => { + loadStylesheet('tree', stylesheet); + + // Use only initial prop value as default and maintain internal state + const initialSelection = props.selected?.rawVal || props.selected || null; + const selected = van.state(initialSelection); + + const treeNodes = van.derive(() => { + const nodes = getValue(props.nodes) || []; + const treeSelected = initTreeState(nodes, selected.rawVal); + if (!treeSelected) { + selected.val = null; + } + return nodes; + }); + + const multiSelect = isState(props.multiSelect) ? props.multiSelect : van.state(!!props.multiSelect); + const noMatches = van.derive(() => treeNodes.val.every(node => node.hidden.val)); + + van.derive(() => { + const onSelect = props.onSelect?.val ?? props.onSelect; + if (!multiSelect.val && onSelect) { + onSelect(selected.val); + } + }); + + van.derive(() => { + if (!multiSelect.val) { + selectTree(treeNodes.val, false); + } + props.onMultiSelect?.(multiSelect.val ? 
getMultiSelection(treeNodes.val) : null); + }); + + return div( + { + id: props.id, + class: () => `flex-column ${getValue(props.classes)}`, + }, + Toolbar(treeNodes, multiSelect, props, searchOptionsContent, filtersContent), + div( + { class: 'tg-tree' }, + () => div( + { + class: 'tg-tree--nodes', + onclick: van.derive(() => multiSelect.val ? () => props.onMultiSelect?.(getMultiSelection(treeNodes.val)) : null), + }, + treeNodes.val.map(node => TreeNode(node, selected, multiSelect.val)), + ), + ), + () => noMatches.val + ? span({ class: 'tg-tree--empty mt-7 mb-7 text-secondary' }, 'No matching items found') + : '', + ); +}; + +const Toolbar = ( + /** @type { val: TreeNode[] } */ nodes, + /** @type object */ multiSelect, + /** @type Properties */ props, + /** @type any? */ searchOptionsContent, + /** @type any? */ filtersContent, +) => { + const search = van.state(''); + const searchOptionsDomId = `tree-search-options-${getRandomId()}`; + const searchOptionsOpened = van.state(false); + + const filterDomId = `tree-filters-${getRandomId()}`; + const filtersOpened = van.state(false); + const filtersActive = van.state(false); + const isNodeHidden = (/** @type TreeNode */ node) => props.isNodeHidden + ? props.isNodeHidden?.(node, search.val) + : !caseInsensitiveIncludes(node.label, search.val); + + return div( + { class: 'tg-tree--actions' }, + div( + { class: 'flex-row fx-gap-1 mb-1' }, + Input({ + icon: 'search', + clearable: true, + height: 32, + onChange: (/** @type string */ value) => { + search.val = value; + filterTree(nodes.val, isNodeHidden); + if (value) { + expandOrCollapseTree(nodes.val, true); + } + }, + }), + searchOptionsContent ? 
[ + div( + { class: 'tg-tree--search-options' }, + Button({ + id: searchOptionsDomId, + type: 'icon', + icon: 'settings', + style: 'width: 24px; height: 24px; padding: 4px;', + tooltip: 'Search options', + tooltipPosition: 'bottom', + onclick: () => searchOptionsOpened.val = !searchOptionsOpened.val, + }), + ), + Portal( + { target: searchOptionsDomId, opened: searchOptionsOpened }, + () => div( + { class: 'tg-tree--portal' }, + searchOptionsContent, + Button({ + type: 'stroked', + color: 'primary', + label: 'Apply', + style: 'width: 80px; margin-top: 12px; margin-left: auto;', + onclick: () => { + props.onApplySearchOptions?.(); + filterTree(nodes.val, isNodeHidden); + searchOptionsOpened.val = false; + }, + }), + ), + ) + ] : null, + Button({ + type: 'icon', + icon: 'expand_all', + style: 'width: 24px; height: 24px; padding: 4px;', + tooltip: 'Expand All', + tooltipPosition: 'bottom', + onclick: () => expandOrCollapseTree(nodes.val, true), + }), + Button({ + type: 'icon', + icon: 'collapse_all', + style: 'width: 24px; height: 24px; padding: 4px;', + tooltip: 'Collapse All', + tooltipPosition: 'bottom', + onclick: () => expandOrCollapseTree(nodes.val, false), + }), + ), + div( + { class: 'flex-row fx-justify-space-between mb-1' }, + div( + { class: 'text-secondary' }, + props.multiSelectToggle + ? Toggle({ + label: props.multiSelectToggleLabel ?? 'Select multiple', + checked: multiSelect, + onChange: (/** @type boolean */ checked) => multiSelect.val = checked, + }) + : null, + ), + filtersContent ? [ + div( + { class: () => `tg-tree--filter-button ${filtersActive.val ? 'active' : ''}` }, + Button({ + id: filterDomId, + type: 'basic', + label: 'Filters', + icon: 'filter_list', + style: 'height: 24px; padding: 4px;', + tooltip: () => filtersActive.val ? 
'Filters active' : null, + tooltipPosition: 'bottom', + onclick: () => filtersOpened.val = !filtersOpened.val, + }), + ), + Portal( + { target: filterDomId, opened: filtersOpened }, + () => div( + { class: 'tg-tree--portal' }, + h3( + { class: 'flex-row fx-justify-space-between'}, + 'Filters', + Button({ + type: 'icon', + icon: 'close', + iconSize: 22, + onclick: () => filtersOpened.val = false, + }), + ), + filtersContent, + div( + { class: 'flex-row fx-justify-space-between mt-4' }, + Button({ + label: 'Reset filters', + width: '110px', + disabled: () => !props.hasActiveFilters(), + onclick: props.onResetFilters, + }), + Button({ + type: 'stroked', + color: 'primary', + label: 'Apply', + width: '80px', + onclick: () => { + props.onApplyFilters?.(); + filterTree(nodes.val, isNodeHidden); + filtersActive.val = props.hasActiveFilters(); + filtersOpened.val = false; + }, + }), + ), + ), + ) + ] : null, + ) + ); +}; + +const TreeNode = ( + /** @type TreeNode */ node, + /** @type string */ selected, + /** @type boolean */ multiSelect, +) => { + const hasChildren = !!node.children?.length; + return div( + { + onclick: multiSelect + ? (/** @type Event */ event) => { + if (hasChildren) { + if (!event.fromChild) { + // Prevent the default behavior of toggling the "checked" property - we want to control it + event.preventDefault(); + selectTree( + node.children, + node.selected.val ? false : node.children.some(child => !child.hidden.val && !child.selected.val), + ); + } + node.selected.val = node.children.every(child => child.selected.val); + } else { + node.selected.val = !node.selected.val; + } + event.fromChild = true; + } + : null, + }, + div( + { + class: () => `tg-tree--row flex-row clickable ${node.classes || ''} + ${selected.val === node.id ? 'selected' : ''} + ${node.hidden.val ? 'hidden' : ''}`, + style: `padding-left: ${levelOffset * node.level}px;`, + onclick: () => selected.val = node.id, + }, + Icon( + { + classes: hasChildren ? 
'' : 'invisible', + onclick: (/** @type Event */ event) => { + event.stopPropagation(); + node.expanded.val = hasChildren ? !node.expanded.val : false; + }, + }, + () => node.expanded.val ? 'arrow_drop_down' : 'arrow_right', + ), + multiSelect + ? [ + Checkbox({ + checked: () => node.selected.val, + indeterminate: hasChildren ? () => isIndeterminate(node) : false, + }), + span({ class: 'mr-1' }), + ] + : null, + () => { + if (node.icon) { + const icon = Icon({ size: node.iconSize, classes: `tg-tree--row-icon ${node.iconColor}` }, node.icon); + return node.iconTooltip ? withTooltip(icon, { text: node.iconTooltip, position: 'right' }) : icon; + } + return null; + }, + node.label, + ), + hasChildren ? div( + { class: () => node.expanded.val ? '' : 'hidden' }, + node.children.map(node => TreeNode(node, selected, multiSelect)), + ) : null, + ); +}; + +const initTreeState = ( + /** @type TreeNode[] */ nodes, + /** @type string */ selected, + /** @type number */ level = 0, +) => { + let treeExpanded = false; + nodes.forEach(node => { + node.level = level; + // Expand node if it is initial selection + let expanded = node.id === selected; + if (node.children) { + // Expand node if initial selection is a descendent + expanded = initTreeState(node.children, selected, level + 1) || expanded; + } + node.expanded = van.state(expanded); + node.hidden = van.state(false); + node.selected = van.state(node.selected ?? 
false); + treeExpanded = treeExpanded || expanded; + }); + return treeExpanded; +}; + +const filterTree = ( + /** @type TreeNode[] */ nodes, + /** @type function(TreeNode): boolean */ isNodeHidden, +) => { + nodes.forEach(node => { + let hidden = isNodeHidden(node); + if (node.children) { + filterTree(node.children, isNodeHidden); + hidden = hidden && node.children.every(child => child.hidden.rawVal); + } + node.hidden.val = hidden; + }); +}; + +const expandOrCollapseTree = ( + /** @type TreeNode[] */ nodes, + /** @type boolean */ expanded, +) => { + nodes.forEach(node => { + if (node.children) { + expandOrCollapseTree(node.children, expanded); + node.expanded.val = expanded; + } + }); +}; + +const selectTree = ( + /** @type TreeNode[] */ nodes, + /** @type boolean */ selected, +) => { + nodes.forEach(node => { + if (!selected || !node.hidden.val) { + node.selected.val = selected; + if (node.children) { + selectTree(node.children, selected); + } + } + }); +}; + +/** + * @param {TreeNode[]} nodes + * @returns {SelectedNode[]} + */ +const getMultiSelection = (nodes) => { + const selected = []; + nodes.forEach(node => { + if (node.children) { + const selectedChildren = getMultiSelection(node.children); + if (selectedChildren.length) { + selected.push({ + id: node.id, + all: selectedChildren.length === node.children.length + && (selectedChildren[0]?.children === undefined || selectedChildren.every(child => child.all)), + children: selectedChildren, + }); + } + } else if (node.selected.val) { + selected.push({ id: node.id }); + } + }); + return selected; +}; + +/** + * + * @param {TreeNode} node + * @returns {boolean} + */ +const isIndeterminate = (node) => { + return !node.selected.val && isAnyDescendantSelected(node); +}; + + +/** + * + * @param {TreeNode} node + * @returns {boolean} + */ +const isAnyDescendantSelected = (node) => { + if ((node.children ?? 
[]).length <= 0) { + return false; + } + + for (const child of node.children) { + if (getValue(child.selected) || isAnyDescendantSelected(child)) { + return true; + } + } + + return false; +} + +const stylesheet = new CSSStyleSheet(); +stylesheet.replace(` +.tg-tree { + overflow: auto; +} + +.tg-tree--empty { + text-align: center; +} + +.tg-tree--actions { + margin: 4px; + border-bottom: 1px solid var(--border-color); +} + +.tg-tree--actions > div > label { + flex: auto; +} + +.tg-tree--filter-button { + position: relative; + border-radius: 4px; + border: 1px solid transparent; + transition: 0.3s; +} + +.tg-tree--filter-button.active { + border-color: var(--primary-color); +} + +.tg-tree--portal { + border-radius: 8px; + background: var(--dk-card-background); + box-shadow: var(--portal-box-shadow); + padding: 16px; + overflow: visible; + z-index: 99; +} + +.tg-tree--portal > h3 { + margin: 0 0 12px; + font-size: 18px; + font-weight: 500; +} + +.tg-tree--nodes { + width: fit-content; + min-width: 100%; +} + +.tg-tree--row { + box-sizing: border-box; + width: auto; + min-width: fit-content; + border: solid transparent; + border-width: 1px 0; + padding-right: 8px; + transition: background-color 0.3s; +} + +.tg-tree--row:hover { + background-color: var(--sidebar-item-hover-color); +} + +.tg-tree--row.selected { + background-color: var(--sidebar-item-hover-color); + color: var(--primary-color); + font-weight: 500; +} + +.tg-tree--row-icon { + margin-right: 4px; + width: 24px; + color: #B0BEC5; + text-align: center; +} + +.tg-tree--row-icon.red { + color: var(--red); +} +`); + +export { Tree }; diff --git a/testgen/ui/static/js/components/truncated_text.js b/testgen/ui/static/js/components/truncated_text.js new file mode 100644 index 00000000..c5d50241 --- /dev/null +++ b/testgen/ui/static/js/components/truncated_text.js @@ -0,0 +1,39 @@ +/** + * @import { TooltipPosition } from './tooltip.js'; + * + * @typedef TruncatedTextOptions + * @type {object} + * @property 
{number} max
+ * @property {string?} class
+ * @property {TooltipPosition?} tooltipPosition
+ */
+import van from '../van.min.js';
+import { withTooltip } from './tooltip.js';
+import { caseInsensitiveSort } from '../display_utils.js';
+
+const { div, span, i } = van.tags;
+
+/**
+ * @param {TruncatedTextOptions} options
+ * @param {string[]} children
+ */
+const TruncatedText = ({ max, ...options }, ...children) => {
+    // Copy before sorting so `children` itself is never mutated
+    const sortedChildren = [...children].sort((a, b) => a.length - b.length);
+    const tooltipText = [...children].sort(caseInsensitiveSort).join(', ');
+
+    return div(
+        { class: () => `${options.class ?? ''}`, style: 'position: relative;' },
+        span(sortedChildren.slice(0, max).join(', ')),
+        sortedChildren.length > max
+            ? withTooltip(
+                i({class: 'text-caption'}, ` + ${sortedChildren.length - max} more`),
+                {
+                    text: tooltipText,
+                    position: options.tooltipPosition,
+                }
+            )
+            : '',
+    );
+};
+
+export { TruncatedText };
diff --git a/testgen/ui/static/js/components/wizard_progress_indicator.js b/testgen/ui/static/js/components/wizard_progress_indicator.js
new file mode 100644
index 00000000..88bbb789
--- /dev/null
+++ b/testgen/ui/static/js/components/wizard_progress_indicator.js
@@ -0,0 +1,147 @@
+
+/**
+ * @typedef WizardStepMeta
+ * @type {object}
+ * @property {number} index
+ * @property {string} title
+ * @property {boolean} skipped
+ * @property {string[]} includedSteps
+ *
+ * @typedef CurrentStep
+ * @type {object}
+ * @property {number} index
+ * @property {string} name
+ *
+ * @param {WizardStepMeta[]} steps
+ * @param {CurrentStep} currentStep
+ * @returns {HTMLElement}
+ */
+import van from '../van.min.js';
+import { colorMap } from '../display_utils.js';
+
+const { div, i, span } = van.tags;
+
+const WizardProgressIndicator = (steps, currentStep) => {
+    const currentPhysicalIndex = steps.findIndex(s => s.includedSteps.includes(currentStep.name));
+    const progressWidth = van.state('0px');
+
+    const updateProgress = () => {
+        const container =
document.getElementById('wizard-progress-container'); + const activeIcon = document.querySelector('.step-icon-current'); + + if (container && activeIcon) { + const containerRect = container.getBoundingClientRect(); + const iconRect = activeIcon.getBoundingClientRect(); + const centerOffset = (iconRect.left - containerRect.left) + (iconRect.width / 2); + progressWidth.val = `${centerOffset}px`; + } + }; + + setTimeout(updateProgress, 10); + + const progressLineStyle = () => ` + position: absolute; + top: 10px; + left: 0; + height: 4px; + width: ${progressWidth.val}; + background: ${colorMap.green}; + transition: width 0.3s ease-out; + z-index: -4; + `; + + const currentStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 step-icon-current`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row fx-justify-center', style: `border: 2px solid var(--secondary-text-color); background: var(--dk-dialog-background); border-radius: 50%; height: 24px; width: 24px;` }, + div({ style: 'width: 14px; height: 14px; border-radius: 50%; background: var(--secondary-text-color);' }, ''), + ), + span({}, title), + ); + + const pendingStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 ${currentPhysicalIndex === stepIndex ? 'step-icon-current' : 'text-secondary'}`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? 
div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row', style: `color: var(--empty-light); border: 2px solid var(--disabled-text-color); background: var(--dk-dialog-background); border-radius: 50%;` }, + i({style: 'width: 20px; height: 20px;'}, ''), + ), + span({}, title), + ); + + const completedStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 ${currentPhysicalIndex === stepIndex ? 'step-icon-current' : 'text-secondary'}`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row', style: `color: var(--empty-light); border: 2px solid ${colorMap.green}; background: ${colorMap.green}; border-radius: 50%;` }, + i( + { + class: 'material-symbols-rounded', + style: `font-size: 20px; color: var(--empty-light);`, + }, + 'check', + ), + ), + span({}, title), + ); + + const skippedStepIndicator = (title, stepIndex) => div( + { class: `flex-column fx-align-flex-center fx-gap-1 ${currentPhysicalIndex === stepIndex ? 'step-icon-current' : 'text-secondary'}`, style: 'position: relative;' }, + stepIndex === 0 + ? div({ style: 'position: absolute; width: 50%; height: 50%; left: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + stepIndex === steps.length - 1 + ? 
div({ style: 'position: absolute; width: 50%; height: 50%; right: 0px; background: var(--dk-dialog-background); z-index: -1;' }, '') + : '', + div( + { class: 'flex-row', style: `color: var(--empty-light); border: 2px solid var(--grey); background: var(--grey); border-radius: 50%;` }, + i( + { + class: 'material-symbols-rounded', + style: `font-size: 20px; color: var(--empty-light);`, + }, + 'remove', + ), + ), + span({}, title), + ); + + return div( + { + id: 'wizard-progress-container', + class: 'flex-row fx-justify-space-between mb-5', + style: 'position: relative; margin-top: -20px;' + }, + div({ style: `position: absolute; top: 10px; left: 0; width: 100%; height: 4px; background: var(--disabled-text-color); z-index: -5;` }), + div({ style: progressLineStyle }), + + ...steps.map((step, physicalIdx) => { + if (step.index < currentStep.index) { + if (step.skipped) return skippedStepIndicator(step.title, physicalIdx); + return completedStepIndicator(step.title, physicalIdx); + } else if (step.includedSteps.includes(currentStep.name)) { + return currentStepIndicator(step.title, physicalIdx); + } else { + return pendingStepIndicator(step.title, physicalIdx); + } + }), + ); +}; + +export { WizardProgressIndicator }; diff --git a/testgen/ui/static/js/display_utils.js b/testgen/ui/static/js/display_utils.js new file mode 100644 index 00000000..c590c9a0 --- /dev/null +++ b/testgen/ui/static/js/display_utils.js @@ -0,0 +1,190 @@ +function formatTimestamp( + /** @type number | string */ timestamp, + /** @type boolean */ showYear, +) { + if (timestamp) { + let date = timestamp; + if (typeof timestamp === 'number') { + date = new Date(timestamp.toString().length === 10 ? timestamp * 1000 : timestamp); + } + if (!isNaN(date)) { + const months = [ 'Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec' ]; + const hours = date.getHours(); + const minutes = date.getMinutes(); + return `${months[date.getMonth()]} ${date.getDate()}, ${showYear ? 
date.getFullYear() + ' at ': ''}${(hours % 12) || 12}:${String(minutes).padStart(2, '0')} ${hours / 12 >= 1 ? 'PM' : 'AM'}`; + } + } + return '--'; +} + +function formatDuration( + /** @type Date | number | string */ startTime, + /** @type Date | number | string */ endTime, +) { + if (!startTime || !endTime) { + return '--'; + } + + const startDate = new Date(typeof startTime === 'number' ? startTime * 1000 : startTime); + const endDate = new Date(typeof endTime === 'number' ? endTime * 1000 : endTime); + + const totalSeconds = Math.floor((endDate.getTime() - startDate.getTime()) / 1000); + return formatDurationSeconds(totalSeconds); +} + +function formatDurationSeconds( + /** @type number */ totalSeconds, +) { + if (!totalSeconds) { + return '--'; + } + + let formatted = [ + { value: Math.floor(totalSeconds / (3600 * 24)), unit: 'd' }, + { value: Math.floor((totalSeconds % (3600 * 24)) / 3600), unit: 'h' }, + { value: Math.floor((totalSeconds % 3600) / 60), unit: 'm' }, + { value: totalSeconds % 60, unit: 's' }, + ].map(({ value, unit }) => value ? `${value}${unit}` : '') + .join(' '); + + return formatted.trim() || '< 1s'; +} + +function humanReadableDuration(/** @type string */ duration, /** @type boolean */ round = false) { + if (duration === '< 1s') { + return 'Less than 1 second'; + } + + + const unitTemplates = { + d: (/** @type number */ value) => `${value} day${value === 1 ? '' : 's'}`, + h: (/** @type number */ value) => `${value} hour${value === 1 ? '' : 's'}`, + m: (/** @type number */ value) => `${value} minute${value === 1 ? '' : 's'}`, + s: (/** @type number */ value) => `${value} second${value === 1 ? 
'' : 's'}`, + }; + + if (round) { + const biggestPart = duration.split(' ')[0]; + const durationUnit = biggestPart.slice(-1)[0]; + const durationValue = Number(biggestPart.replace(durationUnit, '')); + return unitTemplates[durationUnit](durationValue); + } + + return duration + .split(' ') + .map(part => { + const unit = part.slice(-1)[0]; + const value = Number(part.replace(unit, '')); + return unitTemplates[unit](value); + }) + .join(' '); +} + +function formatNumber(/** @type number | string */ number, /** @type number */ decimals = 3) { + if (!['number', 'string'].includes(typeof number) || isNaN(number)) { + return '--'; + } + // toFixed - rounds to specified number of decimal places + // toLocaleString - adds commas as necessary + return parseFloat(Number(number).toFixed(decimals)).toLocaleString(); +} + +function capitalize(/** @type string */ text) { + return text.toLowerCase() + .split(' ') + .map((s) => s.charAt(0).toUpperCase() + s.substring(1)) + .join(' '); +} + +/** + * Display bytes in the closest unit with an integer part. + * + * @param {number} bytes + * @returns {string} + */ +function humanReadableSize(bytes) { + const thresholds = { + MB: 1024 * 1024, + KB: 1024, + }; + + for (const [unit, startsAt] of Object.entries(thresholds)) { + if (bytes > startsAt) { + return `${(bytes / startsAt).toFixed()}${unit}`; + } + } + + return `${bytes}B`; +} + +const caseInsensitiveSort = new Intl.Collator('en').compare; +const caseInsensitiveIncludes = (/** @type string */ value, /** @type string */ search) => { + if (value && search) { + return value.toLowerCase().includes(search.toLowerCase()); + } + return !search; +} + +/** + * Convert viewport units to pixels using the current + * window's `innerHeight` and defaulting to the top window's + * `innerHeight` when needed. 
+ * + * @param {number} value + * @param {('height'|'width')} dim + * @returns {number} + */ +function viewPortUnitsToPixels(value, dim) { + if (typeof value !== 'number') { + return 0; + } + + const viewPortSize = window[`inner${capitalize(dim)}`] || window.top[`inner${capitalize(dim)}`]; + return (value / 100) * viewPortSize; +} + +// https://m2.material.io/design/color/the-color-system.html#tools-for-picking-colors +const colorMap = { + red: '#EF5350', // Red 400 + redLight: '#FFB6C180', // Clear red + redDark: '#D32F2F', // Red 700 + orange: '#FF9800', // Orange 500 + yellow: '#FDD835', // Yellow 600 + green: '#9CCC65', // Light Green 400 + greenLight: '#90EE90FF', // Clear green + limeGreen: '#C0CA33', // Lime Green 600 + purple: '#AB47BC', // Purple 400 + purpleLight: '#CE93D8', // Purple 200 + deepPurple: '#9575CD', // Deep Purple 300 + blue: '#2196F3', // Blue 500 + blueLight: '#90CAF9', // Blue 200 + indigo: '#5C6BC0', // Indigo 400 + teal: '#26A69A', // Teal 400 + tealDark: '#009688', // Teal 500 + brown: '#8D6E63', // Brown 400 + brownLight: '#D7CCC8', // Brown 100 + brownDark: '#4E342E', // Brown 800 + grey: '#BDBDBD', // Gray 400 + lightGrey: '#E0E0E0', // Gray 300 + empty: 'var(--empty)', // Light: Gray 200, Dark: Gray 800 + emptyLight: 'var(--empty-light)', // Light: Gray 50, Dark: Gray 900 + emptyDark: 'var(--empty-dark)', // Light: Gray 400, Dark: Gray 600 + emptyTeal: 'var(--empty-teal)', +} + +const DISABLED_ACTION_TEXT = 'You do not have permissions to perform this action. 
Contact your administrator.';
+
+export {
+    formatTimestamp,
+    formatDuration,
+    formatDurationSeconds,
+    formatNumber,
+    capitalize,
+    humanReadableSize,
+    caseInsensitiveSort,
+    caseInsensitiveIncludes,
+    humanReadableDuration,
+    viewPortUnitsToPixels,
+    colorMap,
+    DISABLED_ACTION_TEXT,
+};
diff --git a/testgen/ui/static/js/form_validators.js b/testgen/ui/static/js/form_validators.js
new file mode 100644
index 00000000..635b8b6a
--- /dev/null
+++ b/testgen/ui/static/js/form_validators.js
@@ -0,0 +1,131 @@
+/**
+ * @typedef Validator
+ * @type {Function}
+ * @param {any} value
+ * @param {object} form
+ * @returns {string?}
+ */
+
+function required(value) {
+    if (!value) {
+        return 'This field is required';
+    }
+    return null;
+}
+
+/**
+ * @param {(v: any) => boolean} condition
+ * @returns {Validator}
+ */
+function requiredIf(condition) {
+    const validator = (value) => {
+        if (condition(value)) {
+            return required(value);
+        }
+        return null;
+    };
+    validator['args'] = { name: 'requiredIf', condition };
+
+    return validator;
+}
+
+function noSpaces(value) {
+    if (value?.includes(' ')) {
+        return 'Value cannot contain spaces.';
+    }
+    return null;
+}
+
+/**
+ * @param {number} min
+ * @returns {Validator}
+ */
+function minLength(min) {
+    return (value) => {
+        if (value && value.length < min) {
+            return `Value must be at least ${min} characters long.`;
+        }
+        return null;
+    };
+}
+
+/**
+ * @param {number} max
+ * @returns {Validator}
+ */
+function maxLength(max) {
+    return (value) => {
+        // Only strings are length-checked; empty values are left to `required`
+        if (typeof value === 'string' && value.length > max) {
+            return `Value must be ${max} characters long or shorter.`;
+        }
+        return null;
+    };
+}
+
+/**
+ * @param {number} min
+ * @param {number} max
+ * @param {number} [precision]
+ * @returns {Validator}
+ */
+function numberBetween(min, max, precision = null) {
+    return (value) => {
+        const valueNumber = parseFloat(value);
+        if (isNaN(valueNumber)) {
+            return 'Value must be a numeric type.';
+        }
+
+        if (valueNumber < min || valueNumber > max) {
+            return `Value must be between ${min} and ${max}.`;
+        }
+
+        if (precision !== null) {
+            const strValue = value.toString();
+            const decimalPart = strValue.includes('.') ? strValue.split('.')[1] : '';
+
+            if (decimalPart.length > precision) {
+                if (precision === 0) {
+                    return 'Value must be an integer.';
+                } else {
+                    return `Value must have at most ${precision} digits after the decimal point.`;
+                }
+            }
+        }
+        return null;
+    };
+}
+
+/**
+ * To use with FileInput: enforce a cap on the size of files
+ * allowed to be uploaded.
+ *
+ * @param {number} limit
+ * @returns {Validator}
+ */
+function sizeLimit(limit) {
+    /**
+     * @import {FileValue} from './components/file_input.js';
+     * @param {FileValue} value
+     */
+    const validator = (value) => {
+        if (value != null && value.size > limit) {
+            return `Uploaded file must be smaller than ${limit}.`;
+        }
+        return null;
+    };
+    validator['args'] = { name: 'sizeLimit', limit };
+
+    return validator;
+}
+
+export {
+    maxLength,
+    minLength,
+    numberBetween,
+    noSpaces,
+    required,
+    requiredIf,
+    sizeLimit,
+};
diff --git a/testgen/ui/static/js/score_utils.js b/testgen/ui/static/js/score_utils.js
new file mode 100644
index 00000000..3ed4f079
--- /dev/null
+++ b/testgen/ui/static/js/score_utils.js
@@ -0,0 +1,31 @@
+import { colorMap } from './display_utils.js';
+
+/**
+ * Get a color based on a numeric score.
+ * + * @param {number} score + * @returns {string} + */ +function getScoreColor(score) { + if (Number.isNaN(parseFloat(score))) { + const stringScore = String(score); + if (stringScore.startsWith('>')) { + return colorMap.green; + } else if (stringScore.startsWith('<')) { + return colorMap.red; + } + return colorMap.grey; + } + + if (score >= 96) { + return colorMap.green; + } else if (score >= 91) { + return colorMap.yellow; + } else if (score >= 86) { + return colorMap.orange; + } else { + return colorMap.red; + } +} + +export { getScoreColor }; diff --git a/testgen/ui/assets/scripts.js b/testgen/ui/static/js/scripts.js similarity index 52% rename from testgen/ui/assets/scripts.js rename to testgen/ui/static/js/scripts.js index 52b4e520..88dc7c4c 100644 --- a/testgen/ui/assets/scripts.js +++ b/testgen/ui/static/js/scripts.js @@ -1,4 +1,4 @@ -import van from './static/js/van.min.js'; +import van from './van.min.js'; window.van = van; @@ -13,6 +13,38 @@ window.addEventListener('message', async function(event) { } }); +document.addEventListener('click', (event) => { + const openedPortals = (Object.values(window.testgen.portals) ?? 
[]).filter(portal => portal.opened.val);
+    if (openedPortals.length === 0) {
+        return;
+    }
+
+    const targetParents = getParents(event.target);
+    for (const portal of openedPortals) {
+        const targetEl = document.getElementById(portal.targetId);
+        const portalEl = document.getElementById(portal.domId);
+
+        if (event?.target?.id !== portal.targetId && event?.target?.id !== portal.domId && !targetParents.includes(targetEl) && !targetParents.includes(portalEl)) {
+            portal.opened.val = false;
+        }
+    }
+});
+
+function getParents(/** @type HTMLElement */ element) {
+    const parents = [];
+
+    let currentParent = element.parentElement;
+    do {
+        if (currentParent !== null) {
+            parents.push(currentParent);
+            currentParent = currentParent.parentElement;
+        }
+    }
+    // `tagName` is uppercase for HTML elements
+    while (currentParent !== null && currentParent.tagName !== 'IFRAME');
+
+    return parents;
+}
+
 async function copyToClipboard(text) {
     if (navigator.clipboard && window.isSecureContext) {
         await navigator.clipboard.writeText(text || '');
@@ -46,5 +78,6 @@ window.testgen = {
     states: {},
     components: {},
     loadedStylesheets: {},
+    portals: {},
     changeLocation: url => window.location.href = url,
 };
diff --git a/testgen/ui/components/frontend/js/components/sidebar.js b/testgen/ui/static/js/sidebar.js
similarity index 98%
rename from testgen/ui/components/frontend/js/components/sidebar.js
rename to testgen/ui/static/js/sidebar.js
index 9c6e9329..9382b40f 100644
--- a/testgen/ui/components/frontend/js/components/sidebar.js
+++ b/testgen/ui/static/js/sidebar.js
@@ -54,6 +54,7 @@ const Sidebar = (/** @type {Properties} */ props) => {
     return div(
         {class: 'menu'},
         div(
+            {class: 'fx-flex', style: 'overflow-y: auto;'},
             div(
                 { class: 'menu--project' },
                 div({ class: 'caption' }, 'Project'),
@@ -224,15 +225,12 @@ stylesheet.replace(`
     flex-direction: column;
     justify-content: space-between;
     height: calc(100% - 68px);
+    font-size: 15px;
 }
 
 .menu .menu--project {
     padding: 0 20px;
-    margin-bottom: 16px;
-}
-
-.project-select {
-
position: relative; + margin-bottom: 12px; } .project-select--label { @@ -295,7 +293,7 @@ stylesheet.replace(` } .menu .menu--item { - height: 40px; + height: 36px; display: flex; align-items: center; padding: 0 16px; @@ -364,8 +362,7 @@ button.tg-button:hover { background: rgba(0, 0, 0, 0.04); } -button.tg-button > i { - font-size: 18px; +button.tg-button > i:has(+ span:not(.tg-tooltip)) { margin-right: 8px; } /* ... */ diff --git a/testgen/ui/static/js/streamlit.js b/testgen/ui/static/js/streamlit.js new file mode 100644 index 00000000..a30ace8c --- /dev/null +++ b/testgen/ui/static/js/streamlit.js @@ -0,0 +1,33 @@ +const Streamlit = { + _v2: false, + _customSendDataHandler: undefined, + init() { + sendMessageToStreamlit('streamlit:componentReady', { apiVersion: 1 }); + }, + enableV2(handler) { + this._v2 = true; + this._customSendDataHandler = handler; + }, + setFrameHeight(height) { + if (!this._v2) { + sendMessageToStreamlit('streamlit:setFrameHeight', { height: height }); + } + }, + sendData(data) { + if (this._v2) { + const event = data.event; + const triggerData = Object.fromEntries(Object.entries(data).filter(([k, v]) => k !== 'event')); + this._customSendDataHandler(event, triggerData); + } else { + sendMessageToStreamlit('streamlit:setComponentValue', { value: data, dataType: 'json' }); + } + }, +}; + +function sendMessageToStreamlit(type, data) { + if (window.top) { + window.top.postMessage(Object.assign({ type: type, isStreamlitMessage: true }, data), '*'); + } +} + +export { Streamlit }; diff --git a/testgen/ui/static/js/utils.js b/testgen/ui/static/js/utils.js new file mode 100644 index 00000000..5dc5560f --- /dev/null +++ b/testgen/ui/static/js/utils.js @@ -0,0 +1,242 @@ +import van from './van.min.js'; +import { Streamlit } from './streamlit.js'; + +function enforceElementWidth( + /** @type Element */element, + /** @type number */width, +) { + const observer = new ResizeObserver(() => { + element.width = width; + }); + + observer.observe(element); 
+} + +function resizeFrameHeightToElement(/** @type string */elementId) { + const observer = new ResizeObserver(() => { + const element = document.getElementById(elementId); + if (element) { + const height = element.offsetHeight; + if (height) { + Streamlit.setFrameHeight(height); + } + } + }); + observer.observe(window.frameElement); +} + +function resizeFrameHeightOnDOMChange(/** @type string */elementId) { + const observer = new MutationObserver(() => { + const element = document.getElementById(elementId); + if (element) { + const height = element.offsetHeight; + if (height) { + Streamlit.setFrameHeight(height); + } + } + }); + observer.observe(window.frameElement.contentDocument.body, {subtree: true, childList: true}); +} + +/** + * @param {string} elementId + * @param {((rect: DOMRect, element: HTMLElement) => void)} callback + * @returns {ResizeObserver} + */ +function onFrameResized(elementId, callback) { + const observer = new ResizeObserver(() => { + const element = document.getElementById(elementId); + if (element) { + callback(element.getBoundingClientRect(), element); + } + }); + observer.observe(window.frameElement); + + return observer; +} + +function loadStylesheet( + /** @type string */key, + /** @type CSSStyleSheet */stylesheet, +) { + if (!window.testgen.loadedStylesheets[key]) { + document.adoptedStyleSheets.push(stylesheet); + window.testgen.loadedStylesheets[key] = true; + } +} + +function emitEvent( + /** @type string */event, + /** @type object */data = {}, +) { + Streamlit.sendData({ event, ...data, _id: Math.random() }) // Identify the event so its handler is called once +} + +// Replacement for van.val() +// https://github.com/vanjs-org/van/discussions/280 +const stateProto = Object.getPrototypeOf(van.state()); +/** + * Get value from van.state + * @template T + * @param {T} prop + * @returns {T} + */ +function getValue(prop) { // van state or static value + const proto = Object.getPrototypeOf(prop ?? 
0);
    if (proto === stateProto) {
        return prop.val;
    }
    if (proto === Function.prototype) {
        return prop();
    }
    return prop;
}

+function isState(/** @type object */ value) {
+    return Object.getPrototypeOf(value ?? 0) == stateProto;
+}
+
+function getRandomId() {
+    return Math.random().toString(36).substring(2);
+}
+
+// https://stackoverflow.com/a/75988895
+function debounce(
+    /** @type function */ callback,
+    /** @type number */ wait,
+) {
+    let timeoutId = null;
+    return (...args) => {
+        window.clearTimeout(timeoutId);
+        timeoutId = window.setTimeout(() => callback(...args), wait);
+    };
+}
+
+function getParents(/** @type HTMLElement */ element) {
+    const parents = [];
+
+    let currentParent = element.parentElement;
+    do {
+        if (currentParent !== null) {
+            parents.push(currentParent);
+            currentParent = currentParent.parentElement;
+        }
+    }
+    while (currentParent !== null && currentParent.tagName !== 'IFRAME'); // tagName is uppercase for HTML elements
+
+    return parents;
+}
+
+function friendlyPercent(/** @type number */ value) {
+    if (Number.isNaN(value)) {
+        return 0;
+    }
+    const rounded = Math.round(value);
+    if (rounded === 0 && value > 0) {
+        return '< 1';
+    }
+    if (rounded === 100 && value < 100) {
+        return '> 99';
+    }
+    return rounded;
+}
+
+function isEqual(value, other) {
+    if (typeof value !== 'object' && typeof other !== 'object') {
+        return Object.is(value, other);
+    }
+
+    if (value === null && other === null) {
+        return true;
+    }
+
+    if (value === null || other === null) { // exactly one side is null at this point
+        return false;
+    }
+
+    if (typeof value !== typeof other) {
+        return false;
+    }
+
+    if (value === other) {
+        return true;
+    }
+
+    if (Array.isArray(value) && Array.isArray(other)) {
+        if (value.length !== other.length) {
+            return false;
+        }
+
+        for (let i = 0; i < value.length; i++) {
+            if (!isEqual(value[i], other[i])) {
+                return false;
+            }
+        }
+
+        return true;
+    }
+
+    if (Array.isArray(value) || Array.isArray(other)) {
+        return false;
+    }
+
+    if (Object.keys(value).length
!== Object.keys(other).length) {
+        return false;
+    }
+
+    for (const [k, v] of Object.entries(value)) {
+        if (!(k in other)) {
+            return false;
+        }
+
+        if (!isEqual(v, other[k])) {
+            return false;
+        }
+    }
+
+    return true;
+}
+
+function afterMount(/** @type Function */ callback) {
+    const trigger = van.state(false);
+    van.derive(() => trigger.val && callback());
+    trigger.val = true;
+}
+
+function slugify(/** @type string */ str) {
+    return str
+        .toLowerCase()
+        .replace(/[^a-z0-9]+/g, '-')
+        .replace(/^-|-$/g, '');
+}
+
+function isDataURL(/** @type string */ url) {
+    return url.startsWith('data:');
+}
+
+function checkIsRequired(validators) {
+    let isRequired = validators.some(v => v.name === 'required');
+    if (!isRequired) {
+        isRequired = validators
+            .filter((v) => v.args?.name === 'requiredIf')
+            .some((v) => v.args?.condition?.());
+    }
+    return isRequired;
+}
+
+/**
+ *
+ * @param {(string|number)} value
+ * @returns {number}
+ */
+function parseDate(value) {
+    if (typeof value === 'string') {
+        return Date.parse(value);
+    } else if (typeof value === 'number') {
+        return value * 1000;
+    }
+
+    return value;
+}
+
+export { afterMount, debounce, emitEvent, enforceElementWidth, getRandomId, getValue, getParents, isEqual, isState, loadStylesheet, resizeFrameHeightToElement, resizeFrameHeightOnDOMChange, friendlyPercent, slugify, isDataURL, checkIsRequired, onFrameResized, parseDate }; diff --git a/testgen/ui/static/js/values.js b/testgen/ui/static/js/values.js new file mode 100644 index 00000000..725ba2ff --- /dev/null +++ b/testgen/ui/static/js/values.js @@ -0,0 +1,266 @@ +// Chrome does not include UTC: https://github.com/mdn/browser-compat-data/issues/25828 +const timezones = [ 'UTC', ...Intl.supportedValuesOf('timeZone').filter(tz => tz !== 'UTC') ]; + +const holidayCodes = [ + 'USA', + 'NYSE', + 'ECB', + 'BombayStockExchange', + 'EuropeanCentralBank', + 'IceFuturesEurope', + 'NationalStockExchangeOfIndia', + 'NewYorkStockExchange', + 'BrasilBolsaBalcao', +
'Afghanistan', + 'AlandIslands', + 'Albania', + 'Algeria', + 'AmericanSamoa', + 'Andorra', + 'Angola', + 'Anguilla', + 'Antarctica', + 'AntiguaAndBarbuda', + 'Argentina', + 'Armenia', + 'Aruba', + 'Australia', + 'Austria', + 'Azerbaijan', + 'Bahamas', + 'Bahrain', + 'Bangladesh', + 'Barbados', + 'Belarus', + 'Belgium', + 'Belize', + 'Benin', + 'Bermuda', + 'Bhutan', + 'Bolivia', + 'BonaireSintEustatiusAndSaba', + 'BosniaAndHerzegovina', + 'Botswana', + 'BouvetIsland', + 'Brazil', + 'BritishIndianOceanTerritory', + 'BritishVirginIslands', + 'Brunei', + 'Bulgaria', + 'BurkinaFaso', + 'Burundi', + 'CaboVerde', + 'Cambodia', + 'Cameroon', + 'Canada', + 'CaymanIslands', + 'CentralAfricanRepublic', + 'Chad', + 'Chile', + 'China', + 'ChristmasIsland', + 'CocosIslands', + 'Colombia', + 'Comoros', + 'Congo', + 'CookIslands', + 'CostaRica', + 'Croatia', + 'Cuba', + 'Curacao', + 'Cyprus', + 'Czechia', + 'Denmark', + 'Djibouti', + 'Dominica', + 'DominicanRepublic', + 'DRCongo', + 'Ecuador', + 'Egypt', + 'ElSalvador', + 'EquatorialGuinea', + 'Eritrea', + 'Estonia', + 'Eswatini', + 'Ethiopia', + 'FalklandIslands', + 'FaroeIslands', + 'Fiji', + 'Finland', + 'France', + 'FrenchGuiana', + 'FrenchPolynesia', + 'FrenchSouthernTerritories', + 'Gabon', + 'Gambia', + 'Georgia', + 'Germany', + 'Ghana', + 'Gibraltar', + 'Greece', + 'Greenland', + 'Grenada', + 'Guadeloupe', + 'Guam', + 'Guatemala', + 'Guernsey', + 'Guinea', + 'GuineaBissau', + 'Guyana', + 'Haiti', + 'HeardIslandAndMcDonaldIslands', + 'Honduras', + 'HongKong', + 'Hungary', + 'Iceland', + 'India', + 'Indonesia', + 'Iran', + 'Iraq', + 'Ireland', + 'IsleOfMan', + 'Israel', + 'Italy', + 'IvoryCoast', + 'Jamaica', + 'Japan', + 'Jersey', + 'Jordan', + 'Kazakhstan', + 'Kenya', + 'Kiribati', + 'Kuwait', + 'Kyrgyzstan', + 'Laos', + 'Latvia', + 'Lebanon', + 'Lesotho', + 'Liberia', + 'Libya', + 'Liechtenstein', + 'Lithuania', + 'Luxembourg', + 'Macau', + 'Madagascar', + 'Malawi', + 'Malaysia', + 'Maldives', + 'Mali', +
'Malta', + 'MarshallIslands', + 'Martinique', + 'Mauritania', + 'Mauritius', + 'Mayotte', + 'Mexico', + 'Micronesia', + 'Moldova', + 'Monaco', + 'Mongolia', + 'Montenegro', + 'Montserrat', + 'Morocco', + 'Mozambique', + 'Myanmar', + 'Namibia', + 'Nauru', + 'Nepal', + 'Netherlands', + 'NewCaledonia', + 'NewZealand', + 'Nicaragua', + 'Niger', + 'Nigeria', + 'Niue', + 'NorfolkIsland', + 'NorthKorea', + 'NorthMacedonia', + 'NorthernMarianaIslands', + 'Norway', + 'Oman', + 'Pakistan', + 'Palau', + 'Palestine', + 'Panama', + 'PapuaNewGuinea', + 'Paraguay', + 'Peru', + 'Philippines', + 'PitcairnIslands', + 'Poland', + 'Portugal', + 'PuertoRico', + 'Qatar', + 'Reunion', + 'Romania', + 'Russia', + 'Rwanda', + 'SaintBarthelemy', + 'SaintHelenaAscensionAndTristanDaCunha', + 'SaintKittsAndNevis', + 'SaintLucia', + 'SaintMartin', + 'SaintPierreAndMiquelon', + 'SaintVincentAndTheGrenadines', + 'Samoa', + 'SanMarino', + 'SaoTomeAndPrincipe', + 'SaudiArabia', + 'Senegal', + 'Serbia', + 'Seychelles', + 'SierraLeone', + 'Singapore', + 'SintMaarten', + 'Slovakia', + 'Slovenia', + 'SolomonIslands', + 'Somalia', + 'SouthAfrica', + 'SouthGeorgiaAndTheSouthSandwichIslands', + 'SouthKorea', + 'SouthSudan', + 'Spain', + 'SriLanka', + 'Sudan', + 'Suriname', + 'SvalbardAndJanMayen', + 'Sweden', + 'Switzerland', + 'SyrianArabRepublic', + 'Taiwan', + 'Tajikistan', + 'Tanzania', + 'Thailand', + 'TimorLeste', + 'Togo', + 'Tokelau', + 'Tonga', + 'TrinidadAndTobago', + 'Tunisia', + 'Turkey', + 'Turkmenistan', + 'TurksAndCaicosIslands', + 'Tuvalu', + 'Uganda', + 'Ukraine', + 'UnitedArabEmirates', + 'UnitedKingdom', + 'UnitedStates', + 'UnitedStatesMinorOutlyingIslands', + 'UnitedStatesVirginIslands', + 'Uruguay', + 'Uzbekistan', + 'Vanuatu', + 'VaticanCity', + 'Venezuela', + 'Vietnam', + 'WallisAndFutuna', + 'WesternSahara', + 'Yemen', + 'Zambia', + 'Zimbabwe', +]; + +export { timezones, holidayCodes }; diff --git a/testgen/ui/static/js/van.min.js b/testgen/ui/static/js/van.min.js new file mode 
100644 index 00000000..57c6b792 --- /dev/null +++ b/testgen/ui/static/js/van.min.js @@ -0,0 +1,10 @@ +/** + * @template T + * @typedef VanState + * @type {object} + * @property {T?} rawVal + * @property {T?} oldVal + * @property {T?} val + */ +// https://vanjs.org/code/van-1.5.2.min.js +let e,t,r,o,l,n,s=Object.getPrototypeOf,f={isConnected:1},i={},h=s(f),a=s(s),d=(e,t,r,o)=>(e??(setTimeout(r,o),new Set)).add(t),u=(e,t,o)=>{let l=r;r=t;try{return e(o)}catch(e){return console.error(e),o}finally{r=l}},w=e=>e.filter(e=>e.t?.isConnected),_=e=>l=d(l,e,()=>{for(let e of l)e.o=w(e.o),e.l=w(e.l);l=n},1e3),c={get val(){return r?.i?.add(this),this.rawVal},get oldVal(){return r?.i?.add(this),this.h},set val(o){r?.u?.add(this),o!==this.rawVal&&(this.rawVal=o,this.o.length+this.l.length?(t?.add(this),e=d(e,this,v)):this.h=o)}},S=e=>({__proto__:c,rawVal:e,h:e,o:[],l:[]}),g=(e,t)=>{let r={i:new Set,u:new Set},l={f:e},n=o;o=[];let s=u(e,r,t);s=(s??document).nodeType?s:new Text(s);for(let e of r.i)r.u.has(e)||(_(e),e.o.push(l));for(let e of o)e.t=s;return o=n,l.t=s},y=(e,t=S(),r)=>{let l={i:new Set,u:new Set},n={f:e,s:t};n.t=r??o?.push(n)??f,t.val=u(e,l,t.rawVal);for(let e of l.i)l.u.has(e)||(_(e),e.l.push(n));return t},b=(e,...t)=>{for(let r of t.flat(1/0)){let t=s(r??0),o=t===c?g(()=>r.val):t===a?g(r):r;o!=n&&e.append(o)}return e},m=(e,t,...r)=>{let[o,...l]=s(r[0]??0)===h?r:[{},...r],f=e?document.createElementNS(e,t):document.createElement(t);for(let[e,r]of Object.entries(o)){let o=t=>t?Object.getOwnPropertyDescriptor(t,e)??o(s(t)):n,l=t+","+e,h=i[l]??=o(s(f))?.set??0,d=e.startsWith("on")?(t,r)=>{let o=e.slice(2);f.removeEventListener(o,r),f.addEventListener(o,t)}:h?h.bind(f):f.setAttribute.bind(f,e),u=s(r??0);e.startsWith("on")||u===a&&(r=y(r),u=c),u===c?g(()=>(d(r.val,r.h),f)):d(r)}return b(f,l)},x=e=>({get:(t,r)=>m.bind(n,e,r)}),j=(e,t)=>t?t!==e&&e.replaceWith(t):e.remove(),v=()=>{let r=0,o=[...e].filter(e=>e.rawVal!==e.h);do{t=new Set;for(let e of new 
Set(o.flatMap(e=>e.l=w(e.l))))y(e.f,e.s,e.t),e.t=n}while(++r<100&&(o=[...t]).length);let l=[...e].filter(e=>e.rawVal!==e.h);e=n;for(let e of new Set(l.flatMap(e=>e.o=w(e.o))))j(e.t,g(e.f,e.t)),e.t=n;for(let e of l)e.h=e.rawVal};export default{tags:new Proxy(e=>new Proxy(m,x(e)),x()),hydrate:(e,t)=>j(e,g(t,e)),add:b,state:S,derive:y}; \ No newline at end of file diff --git a/testgen/ui/utils.py b/testgen/ui/utils.py new file mode 100644 index 00000000..13032513 --- /dev/null +++ b/testgen/ui/utils.py @@ -0,0 +1,84 @@ +import zoneinfo +from collections.abc import Callable +from datetime import datetime +from typing import TypedDict

+import cron_converter
+import cron_descriptor

+from testgen.ui.session import temp_value


+class CronSample(TypedDict, total=False):
+    id: str | None
+    error: str | None
+    samples: list[str] | list[int] | None
+    readable_expr: str | None

+class CronSampleHandlerPayload(TypedDict):
+    tz: str
+    cron_expr: str


+CronSampleCallback = Callable[[CronSampleHandlerPayload], None]


+def get_cron_sample(
+    cron_expr: str,
+    cron_tz: str,
+    sample_count: int,
+    *,
+    reference_time: datetime | None = None,
+    formatted: bool = False,
+) -> CronSample:
+    try:
+        cron_obj = cron_converter.Cron(cron_expr)
+        cron_schedule = cron_obj.schedule(reference_time or datetime.now(zoneinfo.ZoneInfo(cron_tz)))
+        readable_cron_schedule = cron_descriptor.get_description(cron_expr)
+        if formatted:
+            samples = [cron_schedule.next().strftime("%a %b %-d, %-I:%M %p") for _ in range(sample_count)]
+        else:
+            samples = [int(cron_schedule.next().timestamp()) for _ in range(sample_count)]
+    except ValueError as e:
+        return {"error": str(e)}
+    except Exception:
+        return {"error": "Error validating the Cron expression"}
+    else:
+        return {
+            "samples": samples,
+            "readable_expr": readable_cron_schedule,
+        }


+def get_cron_sample_handler(key: str, *, sample_count: int = 3) -> tuple[dict | None, CronSampleCallback]:
+    cron_sample_result, set_cron_sample = temp_value(key,
default={})

+    def on_cron_sample(payload: CronSampleHandlerPayload):
+        cron_expr = payload["cron_expr"]
+        cron_tz = payload.get("tz", "America/New_York")
+        cron_sample = get_cron_sample(cron_expr, cron_tz, sample_count, formatted=True)
+        set_cron_sample(cron_sample)

+    return cron_sample_result, on_cron_sample


+def parse_fuzzy_date(value: str | int | float) -> datetime | None:
+    if isinstance(value, str):
+        return datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
+    elif isinstance(value, (int, float)):
+        ts = int(value)
+        if ts >= 1e11:
+            ts /= 1000
+        return datetime.fromtimestamp(ts)
+    return value


+def dict_from_kv(value: str | None, pairs_separator: str = ";", kv_separator: str = "=") -> dict:
+    if not value:
+        return {}
+    pairs = [ pair.split(kv_separator) for raw_pair in value.split(pairs_separator) if (pair := raw_pair.strip()) ]
+    return {
+        pair_key: pair_value
+        for pair in pairs
+        if (pair_key := pair[0].strip()) and (pair_value := pair[1].strip())
+    } diff --git a/testgen/ui/views/connections.py b/testgen/ui/views/connections.py index fc7938f2..0c69e992 100644 --- a/testgen/ui/views/connections.py +++ b/testgen/ui/views/connections.py @@ -6,6 +6,7 @@ import streamlit as st +from testgen.commands.test_generation import run_monitor_generation from testgen.ui.queries import table_group_queries try: @@ -19,12 +20,15 @@ from testgen.common.database.database_service import empty_cache, get_flavor_service from testgen.common.models import with_database_session from testgen.common.models.connection import Connection, ConnectionMinimal +from testgen.common.models.scheduler import RUN_MONITORS_JOB_KEY, RUN_TESTS_JOB_KEY, JobSchedule from testgen.common.models.table_group import TableGroup +from testgen.common.models.test_suite import TestSuite from testgen.ui.assets import get_asset_data_url from testgen.ui.components import widgets as testgen from testgen.ui.navigation.menu import MenuItem from testgen.ui.navigation.page import Page from testgen.ui.session
import session, temp_value +from testgen.ui.utils import get_cron_sample_handler LOG = logging.getLogger("testgen") PAGE_TITLE = "Connection" @@ -61,7 +65,7 @@ class ConnectionsPage(Page): def render(self, project_code: str, **_kwargs) -> None: testgen.page_header( PAGE_TITLE, - "connect-your-database", + "manage-connections", ) connections = Connection.select_where(Connection.project_code == project_code) @@ -276,16 +280,17 @@ def on_save_table_group_clicked(payload: dict) -> None: table_group: dict = payload["table_group"] table_group_verified: bool = payload.get("table_group_verified", False) run_profiling: bool = payload.get("run_profiling", False) + standard_test_suite: dict | None = payload.get("standard_test_suite", None) + monitor_test_suite: dict | None = payload.get("monitor_test_suite", None) set_new_table_group(table_group) mark_for_preview(True) set_table_group_verified(table_group_verified) set_run_profiling(run_profiling) + set_standard_test_suite_data(standard_test_suite) + set_monitor_test_suite_data(monitor_test_suite) mark_for_save(True) - def on_go_to_profiling_runs(params: dict) -> None: - set_navigation_params({ **params, "project_code": project_code }) - def on_preview_table_group(payload: dict) -> None: table_group = payload["table_group"] verify_table_access = payload.get("verify_access") or False @@ -294,12 +299,12 @@ def on_preview_table_group(payload: dict) -> None: mark_for_preview(True) mark_for_access_preview(verify_table_access) - get_navigation_params, set_navigation_params = temp_value( - "connections:new_table_group:go_to_profiling_run", - default=None, - ) - if (params := get_navigation_params()): - self.router.navigate(to="profiling-runs", with_args=params) + def on_close_clicked(_params: dict) -> None: + set_close_dialog(True) + + get_close_dialog, set_close_dialog = temp_value(f"connections:{connection_id}:close", default=False) + if (get_close_dialog()): + st.rerun() get_new_table_group, set_new_table_group = temp_value( 
f"connections:{connection_id}:table_group", @@ -329,8 +334,31 @@ def on_preview_table_group(payload: dict) -> None: f"connections:{connection_id}:tg_save", default=False, ) + standard_cron_sample_result, on_get_standard_cron_sample = get_cron_sample_handler(f"connections:{connection_id}:standard_cron_expr_validation") + monitor_cron_sample_result, on_get_monitor_cron_sample = get_cron_sample_handler(f"connections:{connection_id}:monitor_cron_expr_validation") + get_standard_test_suite_data, set_standard_test_suite_data = temp_value( + f"connections:{connection_id}:test_suite_data", + default={ + "generate": False, + "name": "", + "schedule": "", + "timezone": "", + }, + ) + get_monitor_test_suite_data, set_monitor_test_suite_data = temp_value( + f"connections:{connection_id}:monitor_suite_data", + default={ + "generate": False, + "monitor_lookback": 0, + "schedule": "", + "timezone": "", + "predict_sensitivity": 0, + "predict_min_lookback": 0, + "predict_exclude_weekends": False, + "predict_holiday_codes": None, + }, + ) - add_monitor_test_suite = table_group_data.pop("add_monitor_test_suite", False) add_scorecard_definition = table_group_data.pop("add_scorecard_definition", False) table_group = TableGroup( project_code=project_code, @@ -348,17 +376,18 @@ def on_preview_table_group(payload: dict) -> None: verify_table_access=should_verify_access(), ) + run_profiling = False + generate_test_suite = False + generate_monitor_suite = False + standard_test_suite = None + monitor_test_suite = None if should_save(): success = True message = None if is_table_group_verified(): try: - table_group.save( - add_scorecard_definition=add_scorecard_definition, - add_monitor_test_suite=add_monitor_test_suite, - monitor_schedule_timezone=st.session_state["browser_timezone"] or "UTC", - ) + table_group.save(add_scorecard_definition) if save_data_chars: try: @@ -366,8 +395,69 @@ def on_preview_table_group(payload: dict) -> None: except Exception: LOG.exception("Data characteristics 
refresh encountered errors") + standard_test_suite_data = get_standard_test_suite_data() or {} + if standard_test_suite_data.get("generate"): + generate_test_suite = True + standard_test_suite = TestSuite( + project_code=project_code, + test_suite=standard_test_suite_data["name"], + connection_id=table_group.connection_id, + table_groups_id=table_group.id, + export_to_observability=False, + dq_score_exclude=False, + is_monitor=False, + monitor_lookback=0, + predict_min_lookback=0, + ) + standard_test_suite.save() + + JobSchedule( + project_code=project_code, + key=RUN_TESTS_JOB_KEY, + cron_expr=standard_test_suite_data["schedule"], + cron_tz=standard_test_suite_data["timezone"], + args=[], + kwargs={"test_suite_id": str(standard_test_suite.id)}, + ).save() + + monitor_test_suite_data = get_monitor_test_suite_data() or {} + if monitor_test_suite_data.get("generate"): + generate_monitor_suite = True + monitor_test_suite = TestSuite( + project_code=project_code, + test_suite=f"{table_group.table_groups_name} Monitors", + connection_id=table_group.connection_id, + table_groups_id=table_group.id, + export_to_observability=False, + dq_score_exclude=True, + is_monitor=True, + monitor_lookback=monitor_test_suite_data.get("monitor_lookback") or 14, + monitor_regenerate_freshness=monitor_test_suite_data.get("monitor_regenerate_freshness", True), + predict_min_lookback=monitor_test_suite_data.get("predict_min_lookback") or 30, + predict_sensitivity=monitor_test_suite_data.get("predict_sensitivity") or "medium", + predict_exclude_weekends=monitor_test_suite_data.get("predict_exclude_weekends") or False, + predict_holiday_codes=monitor_test_suite_data.get("predict_holiday_codes"), + ) + monitor_test_suite.save() + run_monitor_generation(monitor_test_suite.id, ["Volume_Trend", "Schema_Drift"]) + + JobSchedule( + project_code=project_code, + key=RUN_MONITORS_JOB_KEY, + cron_expr=monitor_test_suite_data.get("schedule"), + cron_tz=monitor_test_suite_data.get("timezone"),
+ args=[], + kwargs={"test_suite_id": str(monitor_test_suite.id)}, + ).save() + + if standard_test_suite or monitor_test_suite: + table_group.default_test_suite_id = standard_test_suite.id if standard_test_suite else None + table_group.monitor_test_suite_id = monitor_test_suite.id if monitor_test_suite else None + table_group.save() + if should_run_profiling: try: + run_profiling = True run_profiling_in_background(table_group.id) message = f"Profiling run started for table group {table_group.table_groups_name}." except Exception as error: @@ -385,35 +475,43 @@ def on_preview_table_group(payload: dict) -> None: results = { "success": success, "message": message, - "table_group_id": str(table_group.id), + "test_suite_name": standard_test_suite.test_suite if standard_test_suite else None, + "run_profiling": run_profiling, + "generate_test_suite": generate_test_suite, + "generate_monitor_suite": generate_monitor_suite, } else: results = { "success": False, "message": "Verify the table group before saving", - "connection_id": None, - "table_group_id": None, + "run_profiling": False, + "generate_test_suite": False, + "generate_monitor_suite": False, + "test_suite_name": None, } - testgen.testgen_component( - "table_group_wizard", - props={ + return testgen.table_group_wizard( + key="setup_data_configuration", + data={ "project_code": project_code, - "connection_id": connection_id, "table_group": table_group.to_dict(json_safe=True), "table_group_preview": table_group_preview, "steps": [ "tableGroup", "testTableGroup", "runProfiling", + "testSuite", + "monitorSuite", ], "results": results, + "standard_cron_sample": standard_cron_sample_result(), + "monitor_cron_sample": monitor_cron_sample_result(), }, - on_change_handlers={ - "SaveTableGroupClicked": on_save_table_group_clicked, - "GoToProfilingRunsClicked": on_go_to_profiling_runs, - "PreviewTableGroupClicked": on_preview_table_group, - }, + on_SaveTableGroupClicked_change=on_save_table_group_clicked, + 
on_PreviewTableGroupClicked_change=on_preview_table_group, + on_CloseClicked_change=on_close_clicked, + on_GetCronSample_change=on_get_monitor_cron_sample, + on_GetCronSampleAux_change=on_get_standard_cron_sample, ) diff --git a/testgen/ui/views/data_catalog.py b/testgen/ui/views/data_catalog.py index 3c8d2fea..c06c8b96 100644 --- a/testgen/ui/views/data_catalog.py +++ b/testgen/ui/views/data_catalog.py @@ -57,6 +57,7 @@ class DataCatalogPage(Page): def render(self, project_code: str, table_group_id: str | None = None, selected: str | None = None, **_kwargs) -> None: testgen.page_header( PAGE_TITLE, + "data-catalog", ) _, loading_column = st.columns([.4, .6]) @@ -110,11 +111,10 @@ def render(self, project_code: str, table_group_id: str | None = None, selected: }, }, on_change_handlers={ - "RunProfilingClicked": partial( - run_profiling_dialog, - project_code, - selected_table_group.id, - ), + "RunProfilingClicked": lambda _: run_profiling_dialog( + project_code=project_code, + table_group_id=selected_table_group.id, + ) if selected_table_group else None, "TableGroupSelected": on_table_group_selected, "ItemSelected": on_item_selected, "ExportClicked": lambda items: download_dialog( @@ -401,7 +401,9 @@ def get_table_group_columns(table_group_id: str) -> list[dict]: profile_results.datatype_suggestion, table_chars.record_ct, profile_results.value_ct, + column_chars.add_date, column_chars.drop_date, + table_chars.add_date AS table_add_date, table_chars.drop_date AS table_drop_date, column_chars.critical_data_element, table_chars.critical_data_element AS table_critical_data_element, diff --git a/testgen/ui/views/dialogs/data_preview_dialog.py b/testgen/ui/views/dialogs/data_preview_dialog.py index d2837e3e..8a65b006 100644 --- a/testgen/ui/views/dialogs/data_preview_dialog.py +++ b/testgen/ui/views/dialogs/data_preview_dialog.py @@ -31,7 +31,7 @@ def data_preview_dialog( else: st.dataframe( data, - width=520 if column_name else None, + width=520 if column_name else 
"content", height=700, ) diff --git a/testgen/ui/views/dialogs/generate_tests_dialog.py b/testgen/ui/views/dialogs/generate_tests_dialog.py index 7dc25500..0da5e623 100644 --- a/testgen/ui/views/dialogs/generate_tests_dialog.py +++ b/testgen/ui/views/dialogs/generate_tests_dialog.py @@ -2,32 +2,29 @@ import streamlit as st -from testgen.commands.run_generate_tests import run_test_gen_queries +from testgen.commands.test_generation import run_test_generation from testgen.common.models import with_database_session from testgen.common.models.test_suite import TestSuiteMinimal from testgen.ui.components import widgets as testgen from testgen.ui.services.database_service import execute_db_query, fetch_all_from_db, fetch_one_from_db -ALL_TYPES_LABEL = "All Test Types" - @st.dialog(title="Generate Tests") @with_database_session def generate_tests_dialog(test_suite: TestSuiteMinimal) -> None: test_suite_id = test_suite.id test_suite_name = test_suite.test_suite - table_group_id = test_suite.table_groups_id selected_set = "" generation_sets = get_generation_set_choices() if generation_sets: - generation_sets.insert(0, ALL_TYPES_LABEL) - + try: + default_generation_set = generation_sets.index("Standard") + except ValueError: + default_generation_set = 0 with st.container(): - selected_set = st.selectbox("Generation Set", generation_sets) - if selected_set == ALL_TYPES_LABEL: - selected_set = "" + selected_set = st.selectbox("Generation Set", generation_sets, index=default_generation_set) test_ct, unlocked_test_ct, unlocked_edits_ct = get_test_suite_refresh_warning(test_suite_id) if test_ct: @@ -55,7 +52,7 @@ def generate_tests_dialog(test_suite: TestSuiteMinimal) -> None: if testgen.expander_toggle(expand_label="Show CLI command", key="test_suite:keys:generate-tests-show-cli"): st.code( - f"testgen run-test-generation --table-group-id {table_group_id} --test-suite-key '{test_suite_name}'", + f"testgen run-test-generation --test-suite-id {test_suite_id} --generation-set 
'{selected_set}'", language="shellSession", ) @@ -73,7 +70,7 @@ def generate_tests_dialog(test_suite: TestSuiteMinimal) -> None: status_container.info("Generating tests ...") try: - run_test_gen_queries(table_group_id, test_suite_name, selected_set) + run_test_generation(test_suite_id, selected_set) except Exception as e: status_container.error(f"Test generation encountered errors: {e!s}.") @@ -86,7 +83,7 @@ def generate_tests_dialog(test_suite: TestSuiteMinimal) -> None: def get_test_suite_refresh_warning(test_suite_id: str) -> tuple[int, int, int]: result = fetch_one_from_db( """ - SELECT + SELECT COUNT(*) AS test_ct, SUM(CASE WHEN COALESCE(td.lock_refresh, 'N') = 'N' THEN 1 ELSE 0 END) AS unlocked_test_ct, SUM(CASE WHEN COALESCE(td.lock_refresh, 'N') = 'N' AND td.last_manual_update IS NOT NULL THEN 1 ELSE 0 END) AS unlocked_edits_ct diff --git a/testgen/ui/views/dialogs/manage_notifications.py b/testgen/ui/views/dialogs/manage_notifications.py index d830a7c3..c1037d4a 100644 --- a/testgen/ui/views/dialogs/manage_notifications.py +++ b/testgen/ui/views/dialogs/manage_notifications.py @@ -98,6 +98,22 @@ def on_resume_item(self, item): def _get_component_props(self) -> dict[str, Any]: raise NotImplementedError

+    def _mark_duplicates(self, ns_json_list: list[dict[str, Any]]) -> list[dict[str, Any]]:
+        """Mark items whose rule (recipient + trigger + scope) combination occurs more than once, and return the full annotated list."""
+        rule_counts: dict[tuple, int] = {}
+        rule_items: dict[tuple, list[dict]] = {}
+        for item in ns_json_list:
+            for recipient in item["recipients"]:
+                rule = (recipient, item.get("trigger"), item.get("scope"))
+                rule_counts[rule] = rule_counts.get(rule, 0) + 1
+                rule_items.setdefault(rule, []).append(item)
+        for rule, occurrence_count in rule_counts.items():
+            if occurrence_count > 1:
+                items = rule_items[rule]
+                for item in items:
+                    item.setdefault("duplicates", []).append(rule[0])
+        return ns_json_list
+
 @with_database_session def render(self) -> None: user_can_edit =
session.auth.user_has_permission("edit") @@ -118,6 +134,15 @@ def render(self) -> None: } ns_json_list.append(ns_json) + component_props = { + **self.component_props, + **(self._get_component_props()), + } + scope_options_labels = dict(component_props.get("scope_options", [])) + ns_json_list = sorted( + self._mark_duplicates(ns_json_list), + key=lambda item: "0" if not item["scope"] else scope_options_labels.get(item["scope"], "ZZZ"), + ) widgets.css_class("m-dialog") widgets.testgen_component( "notification_settings", @@ -129,8 +154,7 @@ def render(self) -> None: "result": result, "scope_options": [], "scope_label": None, - **self.component_props, - **self._get_component_props(), + **component_props, }, event_handlers={ "AddNotification": self.on_add_item, diff --git a/testgen/ui/views/dialogs/manage_schedules.py b/testgen/ui/views/dialogs/manage_schedules.py index 85292565..82ff0551 100644 --- a/testgen/ui/views/dialogs/manage_schedules.py +++ b/testgen/ui/views/dialogs/manage_schedules.py @@ -1,6 +1,4 @@ import json -import zoneinfo -from datetime import datetime from typing import Any import cron_converter @@ -12,6 +10,7 @@ from testgen.common.models.scheduler import JobSchedule from testgen.ui.components import widgets as testgen from testgen.ui.session import session, temp_value +from testgen.ui.utils import get_cron_sample_handler CRON_SAMPLE_COUNT = 3 class ScheduleDialog: @@ -57,26 +56,6 @@ def on_resume_sched(item): JobSchedule.update_active(item["id"], True) st.rerun(scope="fragment") - def on_cron_sample(payload: dict[str, str]): - try: - cron_expr = payload["cron_expr"] - cron_tz = payload.get("tz", "America/New_York") - - cron_obj = cron_converter.Cron(cron_expr) - cron_schedule = cron_obj.schedule(datetime.now(zoneinfo.ZoneInfo(cron_tz))) - readble_cron_schedule = cron_descriptor.get_description( - cron_expr, - ) - - set_cron_sample({ - "samples": [cron_schedule.next().strftime("%a %b %-d, %-I:%M %p") for _ in range(CRON_SAMPLE_COUNT)], - 
"readable_expr": readble_cron_schedule, - }) - except ValueError as e: - set_cron_sample({"error": str(e)}) - except Exception as e: - set_cron_sample({"error": "Error validating the Cron expression"}) - def on_add_schedule(payload: dict[str, str]): set_arg_value(payload["arg_value"]) set_timezone(payload["cron_tz"]) @@ -85,7 +64,7 @@ def on_add_schedule(payload: dict[str, str]): set_should_save(True) user_can_edit = session.auth.user_has_permission("edit") - cron_sample_result, set_cron_sample = temp_value("schedule_dialog:cron_expr_validation", default={}) + cron_sample_result, on_cron_sample = get_cron_sample_handler("schedule_dialog:cron_expr_validation", sample_count=CRON_SAMPLE_COUNT) get_arg_value, set_arg_value = temp_value("schedule_dialog:new:arg_value", default=None) get_timezone, set_timezone = temp_value("schedule_dialog:new:timezone", default=None) get_cron_expr, set_cron_expr = temp_value("schedule_dialog:new:cron_expr", default=None) @@ -110,18 +89,16 @@ def on_add_schedule(payload: dict[str, str]): if is_form_valid: cron_obj = cron_converter.Cron(cron_expr) args, kwargs = self.get_job_arguments(arg_value) - with Session() as db_session: - sched_model = JobSchedule( - project_code=self.project_code, - key=self.job_key, - cron_expr=cron_obj.to_string(), - cron_tz=cron_tz, - active=True, - args=args, - kwargs=kwargs, - ) - db_session.add(sched_model) - db_session.commit() + sched_model = JobSchedule( + project_code=self.project_code, + key=self.job_key, + cron_expr=cron_obj.to_string(), + cron_tz=cron_tz, + active=True, + args=args, + kwargs=kwargs, + ) + with_database_session(sched_model.save)() else: success = False message = "Complete all the fields before adding the schedule" diff --git a/testgen/ui/views/dialogs/run_tests_dialog.py b/testgen/ui/views/dialogs/run_tests_dialog.py index c01bd049..1350a230 100644 --- a/testgen/ui/views/dialogs/run_tests_dialog.py +++ b/testgen/ui/views/dialogs/run_tests_dialog.py @@ -20,7 +20,10 @@ def 
run_tests_dialog(project_code: str, test_suite: TestSuiteMinimal | None = No test_suite_id: str = str(test_suite.id) test_suite_name: str = test_suite.test_suite else: - test_suites = TestSuite.select_minimal_where(TestSuite.project_code == project_code) + test_suites = TestSuite.select_minimal_where( + TestSuite.project_code == project_code, + TestSuite.is_monitor.isnot(True), + ) test_suites_df = to_dataframe(test_suites, TestSuiteMinimal.columns()) test_suite_id: str = testgen.select( label="Test Suite", diff --git a/testgen/ui/views/hygiene_issues.py b/testgen/ui/views/hygiene_issues.py index a452be7a..15e04614 100644 --- a/testgen/ui/views/hygiene_issues.py +++ b/testgen/ui/views/hygiene_issues.py @@ -23,6 +23,7 @@ from testgen.ui.components.widgets.page import css_class, flex_row_end from testgen.ui.navigation.page import Page from testgen.ui.pdf.hygiene_issue_report import create_report +from testgen.ui.queries.profiling_queries import get_profiling_anomalies from testgen.ui.queries.source_data_queries import get_hygiene_issue_source_data, get_hygiene_issue_source_query from testgen.ui.services.database_service import ( execute_db_query, @@ -63,7 +64,7 @@ def render( testgen.page_header( "Hygiene Issues", - "view-hygiene-issues", + "data-hygiene-issues", breadcrumbs=[ { "label": "Profiling Runs", "path": "profiling-runs", "params": { "project_code": run.project_code } }, { "label": f"{run.table_groups_name} | {run_date}" }, @@ -335,10 +336,6 @@ def open_download_dialog(data: pd.DataFrame | None = None) -> None: with score_column: render_score(run.project_code, run_id) - # Help Links - st.markdown( - "[Help on Hygiene Issues](https://docs.datakitchen.io/article/dataops-testgen-help/data-hygiene-issues)" - ) @st.fragment @with_database_session @@ -386,94 +383,6 @@ def get_profiling_run_columns(profiling_run_id: str) -> pd.DataFrame: return fetch_df_from_db(query, {"profiling_run_id": profiling_run_id}) -@st.cache_data(show_spinner=False) -def 
get_profiling_anomalies( - profile_run_id: str, - likelihood: str | None = None, - issue_type_id: str | None = None, - table_name: str | None = None, - column_name: str | None = None, - action: typing.Literal["Confirmed", "Dismissed", "Muted", "No Action"] | None = None, - sorting_columns: list[str] | None = None, -) -> pd.DataFrame: - query = f""" - SELECT - r.table_name, - r.column_name, - r.schema_name, - r.db_data_type, - t.anomaly_name, - t.issue_likelihood, - r.disposition, - null as action, - CASE - WHEN t.issue_likelihood = 'Possible' THEN 'Possible: speculative test that often identifies problems' - WHEN t.issue_likelihood = 'Likely' THEN 'Likely: typically indicates a data problem' - WHEN t.issue_likelihood = 'Definite' THEN 'Definite: indicates a highly-likely data problem' - WHEN t.issue_likelihood = 'Potential PII' - THEN 'Potential PII: may require privacy policies, standards and procedures for access, storage and transmission.' - END AS likelihood_explanation, - CASE - WHEN t.issue_likelihood = 'Potential PII' THEN 4 - WHEN t.issue_likelihood = 'Possible' THEN 3 - WHEN t.issue_likelihood = 'Likely' THEN 2 - WHEN t.issue_likelihood = 'Definite' THEN 1 - END AS likelihood_order, - t.anomaly_description, r.detail, t.suggested_action, - r.anomaly_id, r.table_groups_id::VARCHAR, r.id::VARCHAR, p.profiling_starttime, r.profile_run_id::VARCHAR, - tg.table_groups_name, - - -- These are used in the PDF report - dcc.functional_data_type, - dcc.description as column_description, - COALESCE(dcc.critical_data_element, dtc.critical_data_element) as critical_data_element, - COALESCE(dcc.data_source, dtc.data_source, tg.data_source) as data_source, - COALESCE(dcc.source_system, dtc.source_system, tg.source_system) as source_system, - COALESCE(dcc.source_process, dtc.source_process, tg.source_process) as source_process, - COALESCE(dcc.business_domain, dtc.business_domain, tg.business_domain) as business_domain, - COALESCE(dcc.stakeholder_group, dtc.stakeholder_group, 
tg.stakeholder_group) as stakeholder_group, - COALESCE(dcc.transform_level, dtc.transform_level, tg.transform_level) as transform_level, - COALESCE(dcc.aggregation_level, dtc.aggregation_level) as aggregation_level, - COALESCE(dcc.data_product, dtc.data_product, tg.data_product) as data_product - FROM profile_anomaly_results r - INNER JOIN profile_anomaly_types t - ON r.anomaly_id = t.id - INNER JOIN profiling_runs p - ON r.profile_run_id = p.id - INNER JOIN table_groups tg - ON r.table_groups_id = tg.id - LEFT JOIN data_column_chars dcc - ON (tg.id = dcc.table_groups_id - AND r.schema_name = dcc.schema_name - AND r.table_name = dcc.table_name - AND r.column_name = dcc.column_name) - LEFT JOIN data_table_chars dtc - ON dcc.table_id = dtc.table_id - WHERE r.profile_run_id = :profile_run_id - {"AND t.issue_likelihood = :likelihood" if likelihood else ""} - {"AND t.id = :issue_type_id" if issue_type_id else ""} - {"AND r.table_name = :table_name" if table_name else ""} - {"AND r.column_name ILIKE :column_name" if column_name else ""} - {"AND r.disposition IS NULL" if action == "No Action" else "AND r.disposition = :disposition" if action else ""} - {f"ORDER BY {', '.join(' '.join(col) for col in sorting_columns)}" if sorting_columns else ""} - """ - params = { - "profile_run_id": profile_run_id, - "likelihood": likelihood, - "issue_type_id": issue_type_id, - "table_name": table_name, - "column_name": column_name, - "disposition": { - "Muted": "Inactive", - }.get(action, action), - } - df = fetch_df_from_db(query, params) - dct_replace = {"Confirmed": "βœ“", "Dismissed": "✘", "Inactive": "πŸ”‡"} - df["action"] = df["disposition"].replace(dct_replace) - - return df - - @st.cache_data(show_spinner=False) def get_anomaly_disposition(profile_run_id: str) -> pd.DataFrame: query = """ @@ -555,11 +464,6 @@ def source_data_dialog(selected_row): st.markdown("#### Hygiene Issue Detail") st.caption(selected_row["detail"]) - st.markdown("#### SQL Query") - query = 
get_hygiene_issue_source_query(selected_row) - if query: - st.code(query, language="sql", wrap_lines=True, height=100) - with st.spinner("Retrieving source data..."): bad_data_status, bad_data_msg, _, df_bad = get_hygiene_issue_source_data(selected_row, limit=500) if bad_data_status in {"ND", "NA"}: @@ -579,6 +483,11 @@ def source_data_dialog(selected_row): # Display the dataframe st.dataframe(df_bad, width=1050, hide_index=True) + st.markdown("#### SQL Query") + query = get_hygiene_issue_source_query(selected_row) + if query: + st.code(query, language="sql", wrap_lines=True, height=100) + def do_disposition_update(selected, str_new_status): str_result = None diff --git a/testgen/ui/views/monitors_dashboard.py b/testgen/ui/views/monitors_dashboard.py new file mode 100644 index 00000000..491789be --- /dev/null +++ b/testgen/ui/views/monitors_dashboard.py @@ -0,0 +1,1052 @@ +import logging +from datetime import UTC, date, datetime +from math import ceil +from typing import Any, ClassVar, Literal + +import pandas as pd +import streamlit as st + +from testgen.commands.test_generation import run_monitor_generation +from testgen.common.freshness_service import add_business_minutes, get_schedule_params, resolve_holiday_dates +from testgen.common.models import with_database_session +from testgen.common.models.notification_settings import ( + MonitorNotificationSettings, + MonitorNotificationTrigger, + NotificationEvent, +) +from testgen.common.models.project import Project +from testgen.common.models.scheduler import RUN_MONITORS_JOB_KEY, JobSchedule +from testgen.common.models.table_group import TableGroup, TableGroupMinimal +from testgen.common.models.test_definition import TestDefinition, TestDefinitionSummary, TestType +from testgen.common.models.test_suite import PredictSensitivity, TestSuite +from testgen.ui.components import widgets as testgen +from testgen.ui.navigation.menu import MenuItem +from testgen.ui.navigation.page import Page +from 
testgen.ui.navigation.router import Router +from testgen.ui.queries.profiling_queries import get_tables_by_table_group +from testgen.ui.services.database_service import execute_db_query, fetch_all_from_db, fetch_one_from_db +from testgen.ui.session import session, temp_value +from testgen.ui.utils import dict_from_kv, get_cron_sample, get_cron_sample_handler +from testgen.ui.views.dialogs.manage_notifications import NotificationSettingsDialogBase +from testgen.utils import make_json_safe + +PAGE_ICON = "apps_outage" +PAGE_TITLE = "Monitors" +LOG = logging.getLogger("testgen") + +ALLOWED_SORT_FIELDS = { + "table_name", "freshness_anomalies", "volume_anomalies", "schema_anomalies", + "metric_anomalies", "latest_update", "row_count", +} +ANOMALY_TYPE_FILTERS = { + "freshness": "freshness_anomalies", + "volume": "volume_anomalies", + "schema": "schema_anomalies", + "metrics": "metric_anomalies", +} +DIALOG_AUTO_OPENED_KEY = "monitors:dialog_auto_opened" + + +class MonitorsDashboardPage(Page): + path = "monitors" + can_activate: ClassVar = [ + lambda: session.auth.is_logged_in, + lambda: "project_code" in st.query_params, + ] + menu_item = MenuItem( + icon=PAGE_ICON, + label=PAGE_TITLE, + section="Data Quality Testing", + order=0, + ) + + def render( + self, + project_code: str, + table_group_id: str | None = None, + table_name_filter: str | None = None, + anomaly_type_filter: str | None = None, + sort_field: str | None = None, + sort_order: str | None = None, + items_per_page: str = "20", + current_page: str = "0", + table_name: str | None = None, + **_kwargs, + ) -> None: + testgen.page_header( + PAGE_TITLE, + "monitor-tables", + ) + + project_summary = Project.get_summary(project_code) + table_groups = TableGroup.select_minimal_where(TableGroup.project_code == project_code) + + if not table_group_id or table_group_id not in [ str(item.id) for item in table_groups ]: + table_group_id = str(table_groups[0].id) if table_groups else None + + selected_table_group = None + 
monitor_schedule = None + monitored_tables_page = [] + all_monitored_tables_count = 0 + monitor_changes_summary = None + auto_open_table = None + + current_page = int(current_page) + items_per_page = int(items_per_page) + page_start = current_page * items_per_page + + if table_group_id: + selected_table_group = next(item for item in table_groups if str(item.id) == table_group_id) + monitor_suite_id = selected_table_group.monitor_test_suite_id + + if monitor_suite_id: + with st.spinner(text="Loading data ..."): + monitor_schedule = JobSchedule.get( + JobSchedule.key == RUN_MONITORS_JOB_KEY, + JobSchedule.kwargs["test_suite_id"].astext == str(monitor_suite_id), + ) + + anomaly_type_filter = [t for t in anomaly_type_filter.split(",") if t in ANOMALY_TYPE_FILTERS] if anomaly_type_filter else None + if sort_field and sort_field not in ALLOWED_SORT_FIELDS: + sort_field = None + + monitored_tables_page = get_monitor_changes_by_tables( + table_group_id, + table_name_filter=table_name_filter, + anomaly_type_filter=anomaly_type_filter, + sort_field=sort_field, + sort_order=sort_order, + limit=int(items_per_page), + offset=page_start, + ) + all_monitored_tables_count = count_monitor_changes_by_tables( + table_group_id, + table_name_filter=table_name_filter, + anomaly_type_filter=anomaly_type_filter, + ) + monitor_changes_summary = summarize_monitor_changes(table_group_id) + + monitored_table_names = {table["table_name"] for table in monitored_tables_page} + if table_name: + if st.session_state.get(DIALOG_AUTO_OPENED_KEY) != table_name: + if table_name in monitored_table_names: + auto_open_table = table_name + else: + Router().set_query_params({"table_name": None}) + else: + st.session_state.pop(DIALOG_AUTO_OPENED_KEY, None) + + return testgen.testgen_component( + "monitors_dashboard", + props={ + "project_summary": project_summary.to_dict(json_safe=True), + "summary": make_json_safe(monitor_changes_summary), + "schedule": { + "active": monitor_schedule.active, + "cron_tz": 
monitor_schedule.cron_tz, + "cron_sample": get_cron_sample(monitor_schedule.cron_expr, monitor_schedule.cron_tz, 1) + } if monitor_schedule else None, + "table_group_filter_options": [ + { + "value": str(table_group.id), + "label": table_group.table_groups_name, + "selected": str(table_group_id) == str(table_group.id), + "has_monitors": bool(table_group.monitor_test_suite_id), + } for table_group in table_groups + ], + "monitors": { + "items": make_json_safe(monitored_tables_page), + "current_page": current_page, + "items_per_page": items_per_page, + "total_count": all_monitored_tables_count, + }, + "filters": { + "table_group_id": table_group_id, + "table_name_filter": table_name_filter, + "anomaly_type_filter": list(anomaly_type_filter) if anomaly_type_filter else None, + }, + "sort": { + "sort_field": sort_field, + "sort_order": sort_order, + } if sort_field and sort_order else None, + "has_monitor_test_suite": bool(selected_table_group and monitor_suite_id), + "auto_open_table": auto_open_table, + "permissions": { + "can_edit": session.auth.user_has_permission("edit"), + }, + }, + on_change_handlers={ + "OpenSchemaChanges": lambda payload: open_schema_changes(selected_table_group, payload), + "OpenMonitoringTrends": lambda payload: open_table_trends(selected_table_group, payload), + "SetParamValues": lambda payload: set_param_values(payload), + "EditNotifications": manage_notifications(project_code, selected_table_group), + "EditMonitorSettings": lambda *_: edit_monitor_settings(selected_table_group, monitor_schedule), + "DeleteMonitorSuite": lambda *_: delete_monitor_suite(selected_table_group), + "EditTableMonitors": lambda payload: edit_table_monitors(selected_table_group, payload), + }, + ) + + +def manage_notifications(project_code: str, selected_table_group: TableGroupMinimal): + def open_dialog(*_): + MonitorNotificationSettingsDialog( + MonitorNotificationSettings, + ns_attrs={ + "project_code": project_code, + "table_group_id": 
str(selected_table_group.id), + "test_suite_id": str(selected_table_group.monitor_test_suite_id), + }, + component_props={ + "subtitle": { + "label": "Table Group", + "value": selected_table_group.table_groups_name, + }, + }, + ).open(), + return open_dialog + + +class MonitorNotificationSettingsDialog(NotificationSettingsDialogBase): + title = "Monitor Notifications" + + def _item_to_model_attrs(self, item: dict[str, Any]) -> dict[str, Any]: + return { + "trigger": MonitorNotificationTrigger.on_anomalies, + "table_name": item["scope"], + } + + def _model_to_item_attrs(self, model: MonitorNotificationSettings) -> dict[str, Any]: + return { + "trigger": model.trigger.value if model.trigger else None, + "scope": table_name + if model.settings and (table_name := model.settings.get("table_name")) else None, + } + + def _get_component_props(self) -> dict[str, Any]: + tables = get_tables_by_table_group(self.ns_attrs["table_group_id"]) + table_options = [ + (table["table_name"], table["table_name"]) for table in tables + ] + table_options.insert(0, (None, "All Tables")) + trigger_labels = { + MonitorNotificationTrigger.on_anomalies.value: "On Anomalies", + } + trigger_options = [(t.value, trigger_labels[t.value]) for t in MonitorNotificationTrigger] + return { + "event": NotificationEvent.monitor_run.value, + "scope_label": "Table", + "scope_options": table_options, + "trigger_options": trigger_options, + } + + +@st.cache_data(show_spinner=False) +def get_monitor_changes_by_tables( + table_group_id: str, + table_name_filter: str | None = None, + anomaly_type_filter: list[str] | None = None, + sort_field: str | None = None, + sort_order: Literal["asc"] | Literal["desc"] | None = None, + limit: int | None = None, + offset: int | None = None, +) -> list[dict]: + query, params = _monitor_changes_by_tables_query( + table_group_id, + table_name_filter=table_name_filter, + anomaly_type_filter=anomaly_type_filter, + sort_field=sort_field, + sort_order=sort_order, + limit=limit, + 
offset=offset, + ) + + results = fetch_all_from_db(query, params) + return [ dict(row) for row in results ] + + +@st.cache_data(show_spinner=False) +def count_monitor_changes_by_tables( + table_group_id: str, + table_name_filter: str | None = None, + anomaly_type_filter: list[str] | None = None, +) -> int: + query, params = _monitor_changes_by_tables_query( + table_group_id, + table_name_filter=table_name_filter, + anomaly_type_filter=anomaly_type_filter, + ) + count_query = f"SELECT COUNT(*) AS count FROM ({query}) AS subquery" + result = execute_db_query(count_query, params) + return result or 0 + + +@st.cache_data(show_spinner=False) +def summarize_monitor_changes(table_group_id: str) -> dict: + query, params = _monitor_changes_by_tables_query(table_group_id) + count_query = f""" + SELECT + lookback, + MIN(lookback_start) AS lookback_start, + MAX(lookback_end) AS lookback_end, + SUM(freshness_anomalies)::INTEGER AS freshness_anomalies, + SUM(volume_anomalies)::INTEGER AS volume_anomalies, + SUM(schema_anomalies)::INTEGER AS schema_anomalies, + SUM(metric_anomalies)::INTEGER AS metric_anomalies, + BOOL_OR(freshness_error_message IS NOT NULL) AS freshness_has_errors, + BOOL_OR(volume_error_message IS NOT NULL) AS volume_has_errors, + BOOL_OR(schema_error_message IS NOT NULL) AS schema_has_errors, + BOOL_OR(metric_error_message IS NOT NULL) AS metric_has_errors, + BOOL_OR(freshness_is_training) AND BOOL_AND(freshness_is_training OR freshness_is_pending) AS freshness_is_training, + BOOL_OR(volume_is_training) AND BOOL_AND(volume_is_training OR volume_is_pending) AS volume_is_training, + BOOL_OR(metric_is_training) AND BOOL_AND(metric_is_training OR metric_is_pending) AS metric_is_training, + BOOL_AND(freshness_is_pending) AS freshness_is_pending, + BOOL_AND(volume_is_pending) AS volume_is_pending, + BOOL_AND(schema_is_pending) AS schema_is_pending, + BOOL_AND(metric_is_pending) AS metric_is_pending + FROM ({query}) AS subquery + GROUP BY lookback + """ + + result = 
fetch_one_from_db(count_query, params) + return {**result} if result else { + "lookback": 0, + "freshness_anomalies": 0, + "volume_anomalies": 0, + "schema_anomalies": 0, + "metric_anomalies": 0, + "freshness_is_training": False, + "volume_is_training": False, + "metric_is_training": False, + "freshness_is_pending": False, + "volume_is_pending": False, + "schema_is_pending": False, + "metric_is_pending": False, + "freshness_has_errors": False, + "volume_has_errors": False, + "schema_has_errors": False, + "metric_has_errors": False, + } + + +def _monitor_changes_by_tables_query( + table_group_id: str, + table_name_filter: str | None = None, + anomaly_type_filter: list[str] | None = None, + sort_field: str | None = None, + sort_order: Literal["asc"] | Literal["desc"] | None = None, + limit: int | None = None, + offset: int | None = None, +) -> tuple[str, dict]: + query = f""" + WITH ranked_test_runs AS ( + SELECT + test_runs.id, + test_runs.test_starttime, + COALESCE(test_suites.monitor_lookback, 1) AS lookback, + ROW_NUMBER() OVER (PARTITION BY test_runs.test_suite_id ORDER BY test_runs.test_starttime DESC) AS position + FROM table_groups + INNER JOIN test_runs + ON (test_runs.test_suite_id = table_groups.monitor_test_suite_id) + INNER JOIN test_suites + ON (table_groups.monitor_test_suite_id = test_suites.id) + WHERE table_groups.id = :table_group_id + ), + lookback_window AS ( + SELECT MIN(test_starttime) AS lookback_start + FROM ranked_test_runs + WHERE position <= lookback + ), + latest_tables AS ( + SELECT DISTINCT + table_chars.schema_name, + table_chars.table_name + FROM data_table_chars table_chars + CROSS JOIN lookback_window + WHERE table_chars.table_groups_id = :table_group_id + -- Include current tables and tables dropped within lookback window + AND (table_chars.drop_date IS NULL OR table_chars.drop_date >= lookback_window.lookback_start) + {"AND table_chars.table_name ILIKE :table_name_filter" if table_name_filter else ''} + ), + monitor_results AS ( + 
SELECT + latest_tables.table_name, + results.test_time, + results.test_type, + results.result_code, + ranked_test_runs.lookback, + ranked_test_runs.position, + ranked_test_runs.test_starttime, + -- result_code = -1 indicates training mode + CASE WHEN results.result_code = -1 THEN 1 ELSE 0 END AS is_training, + CASE WHEN results.test_type = 'Freshness_Trend' AND results.result_code = 0 THEN 1 ELSE 0 END AS freshness_anomaly, + CASE WHEN results.test_type = 'Volume_Trend' AND results.result_code = 0 THEN 1 ELSE 0 END AS volume_anomaly, + CASE WHEN results.test_type = 'Schema_Drift' AND results.result_code = 0 THEN 1 ELSE 0 END AS schema_anomaly, + CASE WHEN results.test_type = 'Metric_Trend' AND results.result_code = 0 THEN 1 ELSE 0 END AS metric_anomaly, + CASE WHEN results.test_type = 'Freshness_Trend' THEN results.result_signal ELSE NULL END AS freshness_interval, + CASE WHEN results.test_type = 'Volume_Trend' THEN results.result_signal::BIGINT ELSE NULL END AS row_count, + CASE WHEN results.test_type = 'Schema_Drift' THEN SPLIT_PART(results.result_signal, '|', 1) ELSE NULL END AS table_change, + CASE WHEN results.test_type = 'Schema_Drift' THEN NULLIF(SPLIT_PART(results.result_signal, '|', 2), '')::INT ELSE 0 END AS col_adds, + CASE WHEN results.test_type = 'Schema_Drift' THEN NULLIF(SPLIT_PART(results.result_signal, '|', 3), '')::INT ELSE 0 END AS col_drops, + CASE WHEN results.test_type = 'Schema_Drift' THEN NULLIF(SPLIT_PART(results.result_signal, '|', 4), '')::INT ELSE 0 END AS col_mods, + CASE WHEN results.result_status = 'Error' THEN results.result_message ELSE NULL END AS error_message + FROM latest_tables + LEFT JOIN ranked_test_runs ON TRUE + LEFT JOIN test_results AS results + ON results.test_run_id = ranked_test_runs.id + AND results.table_name = latest_tables.table_name + WHERE ranked_test_runs.position IS NULL + -- Also capture 1 run before the lookback to get baseline results + OR ranked_test_runs.position <= ranked_test_runs.lookback + 1 + ), + 
monitor_tables AS ( + SELECT + :table_group_id AS table_group_id, + table_name, + MAX(lookback) AS lookback, + SUM(freshness_anomaly) AS freshness_anomalies, + SUM(volume_anomaly) AS volume_anomalies, + SUM(schema_anomaly) AS schema_anomalies, + SUM(metric_anomaly) AS metric_anomalies, + MAX(test_time - (COALESCE(NULLIF(freshness_interval, 'Unknown')::INTEGER, 0) * INTERVAL '1 minute')) + FILTER (WHERE test_type = 'Freshness_Trend' AND position = 1) AS latest_update, + MAX(row_count) FILTER (WHERE position = 1) AS row_count, + SUM(col_adds) AS column_adds, + SUM(col_drops) AS column_drops, + SUM(col_mods) AS column_mods, + MAX(error_message) FILTER (WHERE test_type = 'Freshness_Trend' AND position = 1) AS freshness_error_message, + MAX(error_message) FILTER (WHERE test_type = 'Volume_Trend' AND position = 1) AS volume_error_message, + MAX(error_message) FILTER (WHERE test_type = 'Schema_Drift' AND position = 1) AS schema_error_message, + MAX(error_message) FILTER (WHERE test_type = 'Metric_Trend' AND position = 1) AS metric_error_message, + BOOL_OR(is_training = 1) FILTER (WHERE test_type = 'Freshness_Trend' AND position = 1) AS freshness_is_training, + BOOL_OR(is_training = 1) FILTER (WHERE test_type = 'Volume_Trend' AND position = 1) AS volume_is_training, + BOOL_OR(is_training = 1) FILTER (WHERE test_type = 'Metric_Trend' AND position = 1) AS metric_is_training, + BOOL_OR(test_type = 'Freshness_Trend') IS NOT TRUE AS freshness_is_pending, + BOOL_OR(test_type = 'Volume_Trend') IS NOT TRUE AS volume_is_pending, + -- Schema monitor only creates results on schema changes (Failed) + -- Mark it as pending only if there are no results of any test type + BOOL_OR(test_time IS NOT NULL) IS NOT TRUE AS schema_is_pending, + BOOL_OR(test_type = 'Metric_Trend') IS NOT TRUE AS metric_is_pending, + CASE + -- Mark as Dropped if latest Schema Drift result for the table indicates it was dropped + WHEN (ARRAY_AGG(table_change ORDER BY test_time DESC) FILTER (WHERE table_change IS 
NOT NULL))[1] = 'D' + THEN 'dropped' + -- Only mark as Added if latest change does not indicate a drop + WHEN MAX(CASE WHEN table_change = 'A' THEN 1 ELSE 0 END) = 1 + THEN 'added' + WHEN SUM(schema_anomaly) > 0 + THEN 'modified' + ELSE NULL + END AS table_state + FROM monitor_results + -- Only aggregate within lookback runs + WHERE position IS NULL OR position <= COALESCE(lookback, 1) + GROUP BY table_name + ), + table_bounds AS ( + SELECT + table_name, + MIN(position) AS min_position, + MAX(position) AS max_position + FROM monitor_results + WHERE position IS NOT NULL + GROUP BY table_name + ), + baseline_tables AS ( + SELECT + monitor_results.table_name, + MIN(monitor_results.test_starttime) FILTER ( + WHERE monitor_results.position = LEAST(monitor_results.lookback + 1, table_bounds.max_position) + ) AS lookback_start, + MAX(monitor_results.test_starttime) FILTER ( + WHERE monitor_results.position = GREATEST(1, table_bounds.min_position) + ) AS lookback_end, + MAX(monitor_results.row_count) FILTER ( + WHERE monitor_results.test_type = 'Volume_Trend' + AND monitor_results.position = LEAST(monitor_results.lookback + 1, table_bounds.max_position) + ) AS previous_row_count + FROM monitor_results + JOIN table_bounds ON monitor_results.table_name = table_bounds.table_name + GROUP BY monitor_results.table_name + ) + SELECT + monitor_tables.*, + baseline_tables.lookback_start, + baseline_tables.lookback_end, + baseline_tables.previous_row_count + FROM monitor_tables + LEFT JOIN baseline_tables ON monitor_tables.table_name = baseline_tables.table_name + {f"WHERE ({' OR '.join(f'{ANOMALY_TYPE_FILTERS[t]} > 0' for t in anomaly_type_filter)})" if anomaly_type_filter else ""} + ORDER BY {"LOWER(monitor_tables.table_name)" if not sort_field or sort_field == "table_name" else f"monitor_tables.{sort_field}"} + {"DESC" if sort_order == "desc" else "ASC"} NULLS LAST + {"LIMIT :limit" if limit else ""} + {"OFFSET :offset" if offset else ""} + """ + + params = { + "table_group_id": 
table_group_id, + "table_name_filter": f"%{table_name_filter.replace('_', '\\_')}%" if table_name_filter else None, + "sort_field": sort_field, + "limit": limit, + "offset": offset, + } + + return query, params + + +def set_param_values(payload: dict) -> None: + Router().set_query_params(payload) + + +def edit_monitor_settings(table_group: TableGroupMinimal, schedule: JobSchedule | None): + monitor_suite_id = table_group.monitor_test_suite_id + + @with_database_session + def show_dialog(): + if monitor_suite_id: + monitor_suite = TestSuite.get(monitor_suite_id) + else: + monitor_suite = TestSuite( + project_code=table_group.project_code, + test_suite=f"{table_group.table_groups_name} Monitors", + connection_id=table_group.connection_id, + table_groups_id=table_group.id, + export_to_observability=False, + dq_score_exclude=True, + is_monitor=True, + ) + + def on_save_settings_clicked(payload: dict) -> None: + set_save(True) + set_schedule(payload["schedule"]) + set_monitor_suite(payload["monitor_suite"]) + + cron_sample_result, on_cron_sample = get_cron_sample_handler("monitors:cron_expr_validation", sample_count=2) + should_save, set_save = temp_value(f"monitors:save:{monitor_suite_id}", default=False) + get_schedule, set_schedule = temp_value(f"monitors:updated_schedule:{monitor_suite_id}", default={}) + get_monitor_suite, set_monitor_suite = temp_value(f"monitors:updated_suite:{monitor_suite_id}", default={}) + + if should_save(): + for key, value in get_monitor_suite().items(): + setattr(monitor_suite, key, value) + + is_new = not monitor_suite.id + monitor_suite.save() + + new_schedule_config = get_schedule() + if ( # Check if schedule has to be created/recreated + not schedule + or schedule.cron_tz != new_schedule_config["cron_tz"] + or schedule.cron_expr != new_schedule_config["cron_expr"] + ): + if schedule: + JobSchedule.delete(schedule.id) + + new_schedule = JobSchedule( + project_code=table_group.project_code, + key=RUN_MONITORS_JOB_KEY, + args=[], + 
kwargs={"test_suite_id": str(monitor_suite.id)}, + **new_schedule_config, + ) + new_schedule.save() + + elif schedule.active != new_schedule_config["active"]: # Only active status changed + JobSchedule.update_active(schedule.id, new_schedule_config["active"]) + + if is_new: + updated_table_group = TableGroup.get(table_group.id) + updated_table_group.monitor_test_suite_id = monitor_suite.id + updated_table_group.save() + run_monitor_generation(monitor_suite.id, ["Volume_Trend", "Schema_Drift"]) + + st.rerun() + + testgen.edit_monitor_settings( + key="edit_monitor_settings", + data={ + "table_group": table_group.to_dict(json_safe=True), + "monitor_suite": monitor_suite.to_dict(json_safe=True), + "schedule": { + "cron_tz": schedule.cron_tz, + "cron_expr": schedule.cron_expr, + "active": schedule.active, + } if schedule else None, + "cron_sample": cron_sample_result(), + }, + on_SaveSettingsClicked_change=on_save_settings_clicked, + on_GetCronSample_change=on_cron_sample, + ) + + return st.dialog(title="Edit Monitor Settings" if monitor_suite_id else "Configure Monitors")(show_dialog)() + + +@st.dialog(title="Delete Monitors") +@with_database_session +def delete_monitor_suite(table_group: TableGroupMinimal) -> None: + def on_delete_confirmed(*_args) -> None: + set_delete_confirmed(True) + + message = f"Are you sure you want to delete all monitors for the table group '{table_group.table_groups_name}'?" 
+ constraint = { + "warning": "All monitor configuration and historical results will be deleted.", + "confirmation": "Yes, delete all monitors and historical results.", + } + + result, set_result = temp_value(f"monitors:result-value:{table_group.id}", default=None) + delete_confirmed, set_delete_confirmed = temp_value(f"monitors:confirm-delete:{table_group.id}", default=False) + + testgen.testgen_component( + "confirm_dialog", + props={ + "message": message, + "constraint": constraint, + "button_label": "Delete", + "button_color": "warn", + "result": result(), + }, + on_change_handlers={ + "ActionConfirmed": on_delete_confirmed, + }, + ) + + if delete_confirmed(): + try: + with st.spinner("Deleting monitors ..."): + monitor_suite = TestSuite.get(table_group.monitor_test_suite_id) + TestSuite.cascade_delete([monitor_suite.id]) + st.cache_data.clear() + st.rerun() + except Exception: + LOG.exception("Failed to delete monitor suite") + set_result({ + "success": False, + "message": "Unable to delete monitors for the table group, try again.", + }) + st.rerun(scope="fragment") + + +def open_schema_changes(table_group: TableGroupMinimal, payload: dict): + table_name = payload.get("table_name") + start_time = payload.get("start_time") + end_time = payload.get("end_time") + + @with_database_session + def show_dialog(): + testgen.css_class("s-dialog") + + data_structure_logs = get_data_structure_logs( + table_group.id, table_name, start_time, end_time, + ) + + testgen.testgen_component( + "schema_changes_list", + props={ + "window_start": start_time, + "window_end": end_time, + "data_structure_logs": make_json_safe(data_structure_logs), + }, + ) + + return st.dialog(title=f"Table: {table_name}")(show_dialog)() + + +def _resolve_holiday_dates(test_suite: TestSuite) -> set[date] | None: + if not test_suite.holiday_codes_list: + return None + now = pd.Timestamp.now("UTC") + idx = pd.DatetimeIndex([now - pd.Timedelta(days=7), now + pd.Timedelta(days=30)]) + return 
resolve_holiday_dates(test_suite.holiday_codes_list, idx) + + +def open_table_trends(table_group: TableGroupMinimal, payload: dict): + table_name = payload.get("table_name") + st.session_state[DIALOG_AUTO_OPENED_KEY] = table_name + Router().set_query_params({"table_name": table_name}) + + get_selected_data_point, set_selected_data_point = temp_value("table_monitoring_trends:dsl_time", default=None) + extended_history_key = f"table_monitoring_trends:extended:{table_group.monitor_test_suite_id}:{table_name}" + + @with_database_session + def show_dialog(): + testgen.css_class("l-dialog") + + extended_history = st.session_state.get(extended_history_key, False) + + selected_data_point = get_selected_data_point() + data_structure_logs = None + if selected_data_point: + data_structure_logs = get_data_structure_logs( + table_group.id, table_name, *selected_data_point, + ) + + lookback_multiplier = 3 if extended_history else 1 + events = get_monitor_events_for_table(table_group.monitor_test_suite_id, table_name, lookback_multiplier) + definitions = TestDefinition.select_where( + TestDefinition.test_suite_id == table_group.monitor_test_suite_id, + TestDefinition.table_name == table_name, + TestDefinition.test_type.in_(["Freshness_Trend", "Volume_Trend", "Metric_Trend"]), + ) + + predictions = {} + if len(definitions) > 0: + test_suite = TestSuite.get(table_group.monitor_test_suite_id) + monitor_schedule = JobSchedule.get( + JobSchedule.key == RUN_MONITORS_JOB_KEY, + JobSchedule.kwargs["test_suite_id"].astext == str(table_group.monitor_test_suite_id), + ) + monitor_lookback = test_suite.monitor_lookback + predict_sensitivity = test_suite.predict_sensitivity or PredictSensitivity.medium + + last_run_time_per_test_key: dict[str, datetime] = { + "volume_trend": max(e["time"] for e in events["volume_events"]), + } + for metric_group in events["metric_events"]: + metric_definition_id = metric_group["test_definition_id"] + 
+                last_run_time_per_test_key[f"metric:{metric_definition_id}"] = max(e["time"] for e in metric_group["events"])
+
+        for definition in definitions:
+            test_key = f"metric:{definition.id}" if definition.test_type == "Metric_Trend" else definition.test_type.lower()
+            if definition.history_calculation == "PREDICT" and definition.prediction and (base_mean_predictions := definition.prediction.get("mean")):
+                predicted_times = sorted([datetime.fromtimestamp(int(timestamp) / 1000.0, UTC) for timestamp in base_mean_predictions.keys()])
+                # Limit predictions to 1/3 of the lookback, with minimum 3 points
+                predicted_times = [str(int(t.timestamp() * 1000)) for idx, t in enumerate(predicted_times) if idx < 3 or idx < monitor_lookback / 3]
+
+                mean_predictions: dict = {}
+                lower_tolerance_predictions: dict = {}
+                upper_tolerance_predictions: dict = {}
+                for timestamp in predicted_times:
+                    mean_predictions[timestamp] = base_mean_predictions[timestamp]
+                    lower_tolerance_predictions[timestamp] = definition.prediction[f"lower_tolerance|{predict_sensitivity.value}"][timestamp]
+                    upper_tolerance_predictions[timestamp] = definition.prediction[f"upper_tolerance|{predict_sensitivity.value}"][timestamp]
+
+                predictions[test_key] = {
+                    "method": "predict",
+                    "mean": mean_predictions,
+                    "lower_tolerance": lower_tolerance_predictions,
+                    "upper_tolerance": upper_tolerance_predictions,
+                }
+            elif definition.history_calculation is None and (definition.lower_tolerance is not None or definition.upper_tolerance is not None):
+                cron_sample = get_cron_sample(
+                    monitor_schedule.cron_expr,
+                    monitor_schedule.cron_tz,
+                    sample_count=ceil(min(max(3, monitor_lookback / 3), 10)),
+                    reference_time=last_run_time_per_test_key.get(test_key),
+                )
+                mean_predictions: dict = {}
+                lower_tolerance_predictions: dict = {}
+                upper_tolerance_predictions: dict = {}
+                sample_next_runs = [timestamp * 1000 for timestamp in (cron_sample.get("samples") or [])]
+                for timestamp in sample_next_runs:
+                    mean_predictions[timestamp] = None
+                    lower_tolerance_predictions[timestamp] = definition.lower_tolerance
+                    upper_tolerance_predictions[timestamp] = definition.upper_tolerance
+
+                predictions[test_key] = {
+                    "method": "static",
+                    "mean": mean_predictions,
+                    "lower_tolerance": lower_tolerance_predictions,
+                    "upper_tolerance": upper_tolerance_predictions,
+                }
+            elif (
+                definition.test_type == "Freshness_Trend"
+                and definition.history_calculation == "PREDICT"
+                and (not definition.prediction or definition.prediction.get("schedule_stage"))
+                and definition.upper_tolerance is not None
+            ):
+                last_update_events = [
+                    e for e in events["freshness_events"]
+                    if e["changed"] and not e["is_training"] and not e["is_pending"]
+                ]
+                if last_update_events:
+                    last_detection_time = max(e["time"] for e in last_update_events)
+                    holiday_dates = _resolve_holiday_dates(test_suite)
+                    tz = monitor_schedule.cron_tz or "UTC" if monitor_schedule else None
+                    sched = get_schedule_params(definition.prediction)
+
+                    window_end = add_business_minutes(
+                        pd.Timestamp(last_detection_time),
+                        float(definition.upper_tolerance),
+                        test_suite.predict_exclude_weekends,
+                        holiday_dates, tz,
+                        excluded_days=sched.excluded_days,
+                    )
+                    window_start = None
+                    if lower_minutes := float(definition.lower_tolerance) if definition.lower_tolerance else None:
+                        window_start = add_business_minutes(
+                            pd.Timestamp(last_detection_time),
+                            lower_minutes,
+                            test_suite.predict_exclude_weekends,
+                            holiday_dates, tz,
+                            excluded_days=sched.excluded_days,
+                        )
+
+                    predictions["freshness_trend"] = {
+                        "method": "freshness_window",
+                        "window": {
+                            "start": int(window_start.timestamp() * 1000) if window_start else None,
+                            "end": int(window_end.timestamp() * 1000),
+                        },
+                    }
+
+        testgen.table_monitoring_trends(
+            "table_monitoring_trends",
+            data={
+                **make_json_safe(events),
+                "data_structure_logs": make_json_safe(data_structure_logs),
+                "predictions": predictions,
+                "extended_history": extended_history,
+            },
+            on_ShowDataStructureLogs_change=on_show_data_structure_logs,
+            on_ToggleExtendedHistory_change=on_toggle_extended_history,
+        )
+
+    def on_show_data_structure_logs(payload):
+        try:
+            set_selected_data_point(
+                (float(payload.get("start_time")) / 1000, float(payload.get("end_time")) / 1000)
+            )
+        except: pass  # noqa: S110
+
+    def on_toggle_extended_history(_payload):
+        st.session_state[extended_history_key] = not st.session_state.get(extended_history_key, False)
+
+    def on_dismiss():
+        st.session_state.pop(DIALOG_AUTO_OPENED_KEY, None)
+        Router().set_query_params({"table_name": None})
+
+    return st.dialog(title=f"Table: {table_name}", on_dismiss=on_dismiss)(show_dialog)()
+
+
+@st.cache_data(show_spinner=False)
+def get_monitor_events_for_table(test_suite_id: str, table_name: str, lookback_multiplier: int = 1) -> dict:
+    query = """
+        WITH ranked_test_runs AS (
+            SELECT
+                test_runs.id,
+                test_runs.test_starttime,
+                COALESCE(test_suites.monitor_lookback, 1) * :lookback_multiplier AS lookback,
+                ROW_NUMBER() OVER (PARTITION BY test_runs.test_suite_id ORDER BY test_runs.test_starttime DESC) AS position
+            FROM test_suites
+                INNER JOIN test_runs
+                    ON (test_suites.id = test_runs.test_suite_id)
+            WHERE test_suites.id = :test_suite_id
+        ),
+        active_runs AS (
+            SELECT id, test_starttime FROM ranked_test_runs
+            WHERE position <= lookback
+        ),
+        target_tests AS (
+            SELECT 'Freshness_Trend' AS test_type
+            UNION ALL SELECT 'Volume_Trend'
+            UNION ALL SELECT 'Schema_Drift'
+            UNION ALL SELECT 'Metric_Trend'
+        )
+        SELECT
+            COALESCE(results.test_time, active_runs.test_starttime) AS test_time,
+            tt.test_type,
+            results.id AS result_id,
+            results.result_code,
+            COALESCE(results.result_status, 'Log') AS result_status,
+            results.result_signal,
+            results.result_message,
+            results.test_definition_id::TEXT,
+            COALESCE(results.input_parameters, '') AS input_parameters,
+            results.column_names
+        FROM active_runs
+            CROSS JOIN target_tests tt
+            LEFT JOIN test_results AS results
+                ON (
+                    results.test_run_id = active_runs.id
+                    AND results.test_type = tt.test_type
+                    AND results.table_name = :table_name
+                )
+            LEFT JOIN test_definitions AS definition
+                ON (definition.id = results.test_definition_id)
+        ORDER BY active_runs.id, tt.test_type;
+    """
+
+    params = {
+        "table_name": table_name,
+        "test_suite_id": test_suite_id,
+        "lookback_multiplier": lookback_multiplier,
+    }
+
+    results = fetch_all_from_db(query, params)
+    results = [ dict(row) for row in results ]
+
+    metric_events: dict[str, dict] = {}
+    for event in results:
+        if event["test_type"] == "Metric_Trend" and event["result_status"] != "Error" and (definition_id := event["test_definition_id"]):
+            if definition_id not in metric_events:
+                metric_events[definition_id] = {
+                    "test_definition_id": definition_id,
+                    "column_name": event["column_names"],
+                    "events": [],
+                }
+            params = dict_from_kv(event.get("input_parameters") or "")
+            metric_events[definition_id]["events"].append({
+                "value": float(event["result_signal"]) if event["result_signal"] else None,
+                "time": event["test_time"],
+                "is_anomaly": int(event["result_code"]) == 0 if event["result_code"] is not None else None,
+                "is_training": int(event["result_code"]) == -1 if event["result_code"] is not None else None,
+                "is_pending": not bool(event["result_id"]),
+                "lower_tolerance": params.get("lower_tolerance") if params.get("lower_tolerance") else None,
+                "upper_tolerance": params.get("upper_tolerance") if params.get("upper_tolerance") else None,
+            })
+
+    return {
+        "freshness_events": [
+            {
+                "changed": "detected: Yes" in (result_message := event["result_message"] or ""),
+                "message": parts[1].rstrip(".") if len(parts := result_message.split(". ", 1)) > 1 else None,
+                "status": event["result_status"],
+                "is_training": event["result_code"] == -1,
+                "is_pending": not bool(event["result_id"]),
+                "time": event["test_time"],
+            }
+            for event in results if event["test_type"] == "Freshness_Trend" and event["result_status"] != "Error"
+        ],
+        "volume_events": [
+            {
+                "record_count": int(event["result_signal"] or 0),
+                "time": event["test_time"],
+                "is_anomaly": int(event["result_code"]) == 0 if event["result_code"] is not None else None,
+                "is_training": int(event["result_code"]) == -1 if event["result_code"] is not None else None,
+                "is_pending": not bool(event["result_id"]),
+                **params,
+            }
+            for event in results if event["test_type"] == "Volume_Trend" and event["result_status"] != "Error" and (
+                params := dict_from_kv(event.get("input_parameters"))
+                or {"lower_tolerance": None, "upper_tolerance": None}
+            )
+        ],
+        "schema_events": [
+            {
+                "table_change": signals[0] or None,
+                "additions": signals[1],
+                "deletions": signals[2],
+                "modifications": signals[3],
+                "time": event["test_time"],
+                "window_start": datetime.fromisoformat(signals[4]) if signals[4] else None,
+            }
+            for event in results if event["test_type"] == "Schema_Drift" and event["result_status"] != "Error"
+            and (signals := (event["result_signal"] or "|0|0|0|").split("|") or True)
+        ],
+        "metric_events": list(metric_events.values()),
+    }
+
+
+@st.cache_data(show_spinner=False)
+def get_data_structure_logs(table_group_id: str, table_name: str, start_time: str, end_time: str):
+    query = """
+        SELECT
+            change_date,
+            change,
+            old_data_type,
+            new_data_type,
+            column_name
+        FROM data_structure_log
+        WHERE table_groups_id = :table_group_id
+            AND table_name = :table_name
+            AND change_date > :start_time ::TIMESTAMP
+            AND change_date <= :end_time ::TIMESTAMP;
+    """
+    params = {
+        "table_group_id": str(table_group_id),
+        "table_name": table_name,
+        "start_time": datetime.fromtimestamp(start_time, UTC),
+        "end_time": datetime.fromtimestamp(end_time, UTC),
+    }
+
+    results = fetch_all_from_db(query, params)
+    return [ dict(row) for row in results ]
+
+
+def edit_table_monitors(table_group: TableGroupMinimal, payload: dict):
+    table_name = payload.get("table_name")
+
+    @with_database_session
+    def show_dialog():
+        definitions = TestDefinition.select_where(
+            TestDefinition.test_suite_id == table_group.monitor_test_suite_id,
+            TestDefinition.table_name == table_name,
+            TestDefinition.test_type.in_(["Freshness_Trend", "Volume_Trend", "Metric_Trend"]),
+        )
+
+        def on_save_test_definition(payload: dict) -> None:
+            set_save(True)
+            set_close(payload.get("close", False))
+            set_updated_definitions(payload.get("updated_definitions", []))
+            set_new_metrics(payload.get("new_metrics", []))
+            set_deleted_metric_ids(payload.get("deleted_metric_ids", []))
+
+        should_save, set_save = temp_value(f"edit_table_monitors:save:{table_name}", default=False)
+        should_close, set_close = temp_value(f"edit_table_monitors:close:{table_name}", default=False)
+        get_updated_definitions, set_updated_definitions = temp_value(f"edit_table_monitors:updated_definitions:{table_name}", default=[])
+        get_new_metrics, set_new_metrics = temp_value(f"edit_table_monitors:new_metrics:{table_name}", default=[])
+        get_deleted_metric_ids, set_deleted_metric_ids = temp_value(f"edit_table_monitors:deleted_metric_ids:{table_name}", default=[])
+        get_result, set_result = temp_value(f"edit_table_monitors:result:{table_name}", default=None)
+
+        if should_save():
+            valid_columns = {col.name for col in TestDefinition.__table__.columns}
+
+            for updated_def in get_updated_definitions():
+                current_def: TestDefinitionSummary = TestDefinition.get(updated_def.get("id"))
+                if current_def:
+                    merged = {key: getattr(current_def, key, None) for key in valid_columns}
+                    merged.update({key: value for key, value in updated_def.items() if key in valid_columns})
+                    merged["lock_refresh"] = True
+
+                    # For Freshness static mode: set threshold_value and lower_tolerance
+                    # so the SQL template's staleness and BETWEEN checks work correctly.
+                    # Also clear prediction JSON to avoid stale schedule-based exclusions.
+                    if merged.get("test_type") == "Freshness_Trend" and merged.get("history_calculation") != "PREDICT":
+                        merged["threshold_value"] = merged.get("upper_tolerance")
+                        merged["lower_tolerance"] = 0
+                        merged["prediction"] = None
+
+                    TestDefinition(**merged).save()
+
+            for new_metric in get_new_metrics():
+                new_def = TestDefinition(
+                    table_groups_id=table_group.id,
+                    test_type="Metric_Trend",
+                    test_suite_id=table_group.monitor_test_suite_id,
+                    schema_name=table_group.table_group_schema,
+                    table_name=table_name,
+                    test_active=True,
+                    lock_refresh=True,
+                )
+                for key, value in new_metric.items():
+                    if key in valid_columns:
+                        setattr(new_def, key, value)
+                new_def.save()
+
+            deleted_ids = get_deleted_metric_ids()
+            if deleted_ids:
+                TestDefinition.delete_where(
+                    TestDefinition.id.in_(deleted_ids),
+                    TestDefinition.test_type == "Metric_Trend",
+                )
+
+            if should_close():
+                st.rerun()
+
+            set_result({"success": True, "timestamp": datetime.now(UTC).isoformat()})
+            st.rerun(scope="fragment")
+
+        metric_test_types = TestType.select_summary_where(TestType.test_type == "Metric_Trend")
+        metric_test_type = metric_test_types[0] if metric_test_types else None
+
+        testgen.edit_table_monitors(
+            key="edit_table_monitors",
+            data={
+                "table_name": table_name,
+                "definitions": [td.to_dict(json_safe=True) for td in definitions],
+                "metric_test_type": metric_test_type.to_dict(json_safe=True) if metric_test_type else {},
+                "result": get_result(),
+            },
+            on_SaveTestDefinition_change=on_save_test_definition,
+        )
+
+    return st.dialog(title=f"Table Monitors: {table_name}")(show_dialog)()
diff --git a/testgen/ui/views/profiling_results.py b/testgen/ui/views/profiling_results.py
index 0c5deff8..5a31fa3f 100644
--- a/testgen/ui/views/profiling_results.py
+++ b/testgen/ui/views/profiling_results.py
@@ -1,6 +1,5 @@
 import json
 import typing
-from datetime import datetime
 from functools import partial
 
 import pandas as pd
@@ -23,6 +22,7 @@
 from testgen.ui.navigation.page import Page
 from testgen.ui.services.database_service import fetch_df_from_db
 from testgen.ui.session import session
+from testgen.ui.utils import parse_fuzzy_date
 from testgen.ui.views.dialogs.data_preview_dialog import data_preview_dialog
 
 FORM_DATA_WIDTH = 400
@@ -49,7 +49,7 @@ def render(self, run_id: str, table_name: str | None = None, column_name: str |
         testgen.page_header(
             "Data Profiling Results",
-            "view-data-profiling-results",
+            "investigate-profiling-results",
             breadcrumbs=[
                 { "label": "Profiling Runs", "path": "profiling-runs", "params": { "project_code": run.project_code } },
                 { "label": f"{run.table_groups_name} | {run_date}" },
@@ -198,7 +198,7 @@ def get_excel_report_data(
     for key in ["min_date", "max_date"]:
         data[key] = data[key].apply(
-            lambda val: datetime.strptime(val, "%Y-%m-%d %H:%M:%S").strftime("%b %-d %Y, %-I:%M %p") if not pd.isna(val) and val != "NaT" else None
+            lambda val: parse_fuzzy_date(val) if not pd.isna(val) and val != "NaT" else None
        )
 
     data["hygiene_issues"] = data["hygiene_issues"].apply(lambda val: "Yes" if val else None)
diff --git a/testgen/ui/views/profiling_runs.py b/testgen/ui/views/profiling_runs.py
index 583ddb4c..40ee4487 100644
--- a/testgen/ui/views/profiling_runs.py
+++ b/testgen/ui/views/profiling_runs.py
@@ -49,7 +49,7 @@ class DataProfilingPage(Page):
     def render(self, project_code: str, table_group_id: str | None = None, **_kwargs) -> None:
         testgen.page_header(
             PAGE_TITLE,
-            "investigate-profiling",
+            "data-profiling",
         )
 
         with st.spinner("Loading data ..."):
@@ -201,7 +201,6 @@ def on_delete_confirmed(*_args) -> None:
     testgen.testgen_component(
         "confirm_dialog",
         props={
-            "project_code": project_code,
             "message": message,
             "constraint": constraint,
             "button_label": "Delete",
diff --git a/testgen/ui/views/project_dashboard.py b/testgen/ui/views/project_dashboard.py
index 6f7fe37b..0fef708e 100644
--- a/testgen/ui/views/project_dashboard.py
+++ b/testgen/ui/views/project_dashboard.py
@@ -30,6 +30,7 @@ class ProjectDashboardPage(Page):
     def render(self, project_code: str, **_kwargs):
         testgen.page_header(
             PAGE_TITLE,
+            "project-dashboard",
         )
 
         with st.spinner("Loading data ..."):
@@ -65,6 +66,28 @@ def render(self, project_code: str, **_kwargs):
                 "dq_score": friendly_score(score(table_group.dq_score_profiling, table_group.dq_score_testing)),
                 "dq_score_profiling": friendly_score(table_group.dq_score_profiling),
                 "dq_score_testing": friendly_score(table_group.dq_score_testing),
+                "monitoring_summary": {
+                    "project_code": project_code,
+                    "table_group_id": str(table_group.id),
+                    "lookback": table_group.monitor_lookback,
+                    "lookback_start": make_json_safe(table_group.monitor_lookback_start),
+                    "lookback_end": make_json_safe(table_group.monitor_lookback_end),
+                    "freshness_anomalies": table_group.monitor_freshness_anomalies or 0,
+                    "schema_anomalies": table_group.monitor_schema_anomalies or 0,
+                    "volume_anomalies": table_group.monitor_volume_anomalies or 0,
+                    "metric_anomalies": table_group.monitor_metric_anomalies or 0,
+                    "freshness_has_errors": table_group.monitor_freshness_has_errors or False,
+                    "volume_has_errors": table_group.monitor_volume_has_errors or False,
+                    "schema_has_errors": table_group.monitor_schema_has_errors or False,
+                    "metric_has_errors": table_group.monitor_metric_has_errors or False,
+                    "freshness_is_training": table_group.monitor_freshness_is_training or False,
+                    "volume_is_training": table_group.monitor_volume_is_training or False,
+                    "metric_is_training": table_group.monitor_metric_is_training or False,
+                    "freshness_is_pending": table_group.monitor_freshness_is_pending or False,
+                    "volume_is_pending": table_group.monitor_volume_is_pending or False,
+                    "schema_is_pending": table_group.monitor_schema_is_pending or False,
+                    "metric_is_pending": table_group.monitor_metric_is_pending or False,
+                } if table_group.monitor_test_suite_id else None,
             }
             for table_group in table_groups
         ],
diff --git a/testgen/ui/views/project_settings.py b/testgen/ui/views/project_settings.py
index 5a05b3f8..08f1af13 100644
--- a/testgen/ui/views/project_settings.py
+++ b/testgen/ui/views/project_settings.py
@@ -38,7 +38,7 @@ def render(self, project_code: str | None = None, **_kwargs) -> None:
         testgen.page_header(
             PAGE_TITLE,
-            "tg-project-settings",
+            "manage-projects",
         )
 
         testgen.whitespace(1)
diff --git a/testgen/ui/views/quality_dashboard.py b/testgen/ui/views/quality_dashboard.py
index e2b7e0b9..4391b6d7 100644
--- a/testgen/ui/views/quality_dashboard.py
+++ b/testgen/ui/views/quality_dashboard.py
@@ -28,7 +28,7 @@ class QualityDashboardPage(Page):
     def render(self, *, project_code: str, **_kwargs) -> None:
         project_summary = Project.get_summary(project_code)
 
-        testgen.page_header(PAGE_TITLE)
+        testgen.page_header(PAGE_TITLE, "quality-scores")
 
         testgen.testgen_component(
             "quality_dashboard",
             props={
diff --git a/testgen/ui/views/score_details.py b/testgen/ui/views/score_details.py
index 6362082d..fad8403f 100644
--- a/testgen/ui/views/score_details.py
+++ b/testgen/ui/views/score_details.py
@@ -67,6 +67,7 @@ def render(
         testgen.page_header(
             "Score Details",
+            "view-score-details",
             breadcrumbs=[
                 {"path": "quality-dashboard", "label": "Quality Dashboard", "params": {"project_code": score_definition.project_code}},
                 {"label": score_definition.name},
diff --git a/testgen/ui/views/score_explorer.py b/testgen/ui/views/score_explorer.py
index 8846629e..48c3385a 100644
--- a/testgen/ui/views/score_explorer.py
+++ b/testgen/ui/views/score_explorer.py
@@ -77,7 +77,7 @@ def render(
             page_title = "Edit Scorecard"
             last_breadcrumb = original_score_definition.name
 
-        testgen.page_header(page_title, breadcrumbs=[
+        testgen.page_header(page_title, "explore-and-create-scorecards", breadcrumbs=[
             {"path": "quality-dashboard", "label": "Quality Dashboard", "params": {"project_code": project_code}},
             {"label": last_breadcrumb},
         ])
diff --git a/testgen/ui/views/table_groups.py b/testgen/ui/views/table_groups.py
index 409a65a4..9f81a2f5 100644
--- a/testgen/ui/views/table_groups.py
+++ b/testgen/ui/views/table_groups.py
@@ -8,15 +8,19 @@
 from sqlalchemy.exc import IntegrityError
 
 from testgen.commands.run_profiling import run_profiling_in_background
+from testgen.commands.test_generation import run_monitor_generation
 from testgen.common.models import with_database_session
 from testgen.common.models.connection import Connection
 from testgen.common.models.project import Project
+from testgen.common.models.scheduler import RUN_MONITORS_JOB_KEY, RUN_TESTS_JOB_KEY, JobSchedule
 from testgen.common.models.table_group import TableGroup, TableGroupMinimal
+from testgen.common.models.test_suite import TestSuite
 from testgen.ui.components import widgets as testgen
 from testgen.ui.navigation.menu import MenuItem
 from testgen.ui.navigation.page import Page
 from testgen.ui.queries import table_group_queries
 from testgen.ui.session import session, temp_value
+from testgen.ui.utils import get_cron_sample_handler
 from testgen.ui.views.connections import FLAVOR_OPTIONS, format_connection
 from testgen.ui.views.dialogs.run_profiling_dialog import run_profiling_dialog
 from testgen.ui.views.profiling_runs import ProfilingScheduleDialog, manage_notifications
@@ -45,7 +49,7 @@ def render(
         table_group_name: str | None = None,
         **_kwargs,
     ) -> None:
-        testgen.page_header(PAGE_TITLE, "create-a-table-group")
+        testgen.page_header(PAGE_TITLE, "manage-table-groups")
 
         user_can_edit = session.auth.user_has_permission("edit")
         project_summary = Project.get_summary(project_code)
@@ -108,6 +112,8 @@ def add_table_group_dialog(self, project_code: str, connection_id: str | None):
                 "tableGroup",
                 "testTableGroup",
                 "runProfiling",
+                "testSuite",
+                "monitorSuite",
             ],
         )
@@ -143,27 +149,50 @@ def on_save_table_group_clicked(payload: dict):
             table_group: dict = payload["table_group"]
             table_group_verified: bool = payload.get("table_group_verified", False)
             run_profiling: bool = payload.get("run_profiling", False)
+            standard_test_suite: dict | None = payload.get("standard_test_suite", None)
+            monitor_test_suite: dict | None = payload.get("monitor_test_suite", None)
 
             mark_for_preview(True)
             set_save(True)
             set_table_group(table_group)
+            set_standard_test_suite_data(standard_test_suite)
+            set_monitor_test_suite_data(monitor_test_suite)
             set_table_group_verified(table_group_verified)
             set_run_profiling(run_profiling)
 
-        def on_go_to_profiling_runs(params: dict) -> None:
-            set_navigation_params({ **params, "project_code": project_code })
+        def on_close_clicked(_params: dict) -> None:
+            set_close_dialog(True)
 
-        get_navigation_params, set_navigation_params = temp_value(
-            "connections:new_table_group:go_to_profiling_run",
-            default=None,
-        )
-        if (params := get_navigation_params()):
-            self.router.navigate(to="profiling-runs", with_args=params)
+        get_close_dialog, set_close_dialog = temp_value("table_groups:close:new", default=False)
+        if (get_close_dialog()):
+            st.rerun()
 
         should_preview, mark_for_preview = temp_value("table_groups:preview:new", default=False)
         should_verify_access, mark_for_access_preview = temp_value("table_groups:preview_access:new", default=False)
         should_save, set_save = temp_value("table_groups:save:new", default=False)
         get_table_group, set_table_group = temp_value("table_groups:updated:new", default={})
+        get_standard_test_suite_data, set_standard_test_suite_data = temp_value(
+            "table_groups:test_suite_data:new",
+            default={
+                "generate": False,
+                "name": "",
+                "schedule": "",
+                "timezone": "",
+            },
+        )
+        get_monitor_test_suite_data, set_monitor_test_suite_data = temp_value(
+            "table_groups:monitor_suite_data:new",
+            default={
+                "generate": False,
+                "monitor_lookback": 0,
+                "schedule": "",
+                "timezone": "",
+                "predict_sensitivity": 0,
+                "predict_min_lookback": 0,
+                "predict_exclude_weekends": False,
+                "predict_holiday_codes": None,
+            },
+        )
         is_table_group_verified, set_table_group_verified = temp_value(
             "table_groups:new:verified",
             default=False,
@@ -172,6 +201,8 @@ def on_go_to_profiling_runs(params: dict) -> None:
             "table_groups:new:run_profiling",
             default=False,
         )
+        standard_cron_sample_result, on_get_standard_cron_sample = get_cron_sample_handler("table_groups:new:standard_cron_expr_validation")
+        monitor_cron_sample_result, on_get_monitor_cron_sample = get_cron_sample_handler("table_groups:new:monitor_cron_expr_validation")
 
         is_table_group_used = False
         connections = self._get_connections(project_code)
@@ -182,13 +213,10 @@ def on_go_to_profiling_runs(params: dict) -> None:
             original_table_group_schema = table_group.table_group_schema
             is_table_group_used = TableGroup.is_in_use([table_group_id])
 
-        add_monitor_test_suite = False
         add_scorecard_definition = False
         for key, value in get_table_group().items():
             if key == "add_scorecard_definition":
                 add_scorecard_definition = value
-            elif key == "add_monitor_test_suite":
-                add_monitor_test_suite = value
             else:
                 setattr(table_group, key, value)
@@ -220,23 +248,84 @@ def on_go_to_profiling_runs(params: dict) -> None:
         success = None
         message = ""
+        run_profiling = False
+        generate_test_suite = False
+        generate_monitor_suite = False
+        standard_test_suite = None
+        monitor_test_suite = None
         if should_save():
             success = True
             if is_table_group_verified():
                 try:
-                    table_group.save(
-                        add_scorecard_definition,
-                        add_monitor_test_suite=add_monitor_test_suite,
-                        monitor_schedule_timezone=st.session_state["browser_timezone"] or "UTC",
-                    )
-
+                    table_group.save(add_scorecard_definition)
                     if save_data_chars:
                         try:
                             save_data_chars(table_group.id)
                         except Exception:
                             LOG.exception("Data characteristics refresh encountered errors")
 
+                    standard_test_suite_data = get_standard_test_suite_data() or {}
+                    if standard_test_suite_data.get("generate"):
+                        generate_test_suite = True
+                        standard_test_suite = TestSuite(
+                            project_code=project_code,
+                            test_suite=standard_test_suite_data["name"],
+                            connection_id=table_group.connection_id,
+                            table_groups_id=table_group.id,
+                            export_to_observability=False,
+                            dq_score_exclude=False,
+                            is_monitor=False,
+                            monitor_lookback=0,
+                            predict_min_lookback=0,
+                        )
+                        standard_test_suite.save()
+
+                        JobSchedule(
+                            project_code=project_code,
+                            key=RUN_TESTS_JOB_KEY,
+                            cron_expr=standard_test_suite_data["schedule"],
+                            cron_tz=standard_test_suite_data["timezone"],
+                            args=[],
+                            kwargs={"test_suite_id": str(standard_test_suite.id)},
+                        ).save()
+
+                    monitor_test_suite_data = get_monitor_test_suite_data() or {}
+                    if monitor_test_suite_data.get("generate"):
+                        generate_monitor_suite = True
+                        monitor_test_suite = TestSuite(
+                            project_code=project_code,
+                            test_suite=f"{table_group.table_groups_name} Monitors",
+                            connection_id=table_group.connection_id,
+                            table_groups_id=table_group.id,
+                            export_to_observability=False,
+                            dq_score_exclude=True,
+                            is_monitor=True,
+                            monitor_lookback=monitor_test_suite_data.get("monitor_lookback") or 14,
+                            monitor_regenerate_freshness=monitor_test_suite_data.get("monitor_regenerate_freshness") or True,
+                            predict_min_lookback=monitor_test_suite_data.get("predict_min_lookback") or 30,
+                            predict_sensitivity=monitor_test_suite_data.get("predict_sensitivity") or "medium",
+                            predict_exclude_weekends=monitor_test_suite_data.get("predict_exclude_weekends") or False,
+                            predict_holiday_codes=monitor_test_suite_data.get("predict_holiday_codes") or None,
+                        )
+                        monitor_test_suite.save()
+                        run_monitor_generation(monitor_test_suite.id, ["Volume_Trend", "Schema_Drift"])
+
+                        JobSchedule(
+                            project_code=project_code,
+                            key=RUN_MONITORS_JOB_KEY,
+                            cron_expr=monitor_test_suite_data.get("schedule"),
+                            cron_tz=monitor_test_suite_data.get("timezone"),
+                            args=[],
+                            kwargs={"test_suite_id": str(monitor_test_suite.id)},
+                        ).save()
+
+                    if standard_test_suite or monitor_test_suite:
+                        table_group.default_test_suite_id = standard_test_suite.id if standard_test_suite else None
+                        table_group.monitor_test_suite_id = monitor_test_suite.id if monitor_test_suite else None
+                        table_group.save()
 
                     if should_run_profiling():
+                        run_profiling = True
                         try:
                             run_profiling_in_background(table_group.id)
                             message = f"Profiling run started for table group {table_group.table_groups_name}."
@@ -244,8 +333,7 @@ def on_go_to_profiling_runs(params: dict) -> None:
                             success = False
                             message = "Profiling run encountered errors"
                             LOG.exception(message)
-                    else:
-                        st.rerun()
+
                 except IntegrityError:
                     success = False
                     message = "A Table Group with the same name already exists."
@@ -253,9 +341,9 @@ def on_go_to_profiling_runs(params: dict) -> None:
                 success = False
                 message = "Verify the table group before saving"
 
-        return testgen.testgen_component(
-            "table_group_wizard",
-            props={
+        return testgen.table_group_wizard(
+            key="add_tg_wizard",
+            data={
                 "project_code": project_code,
                 "connections": connections,
                 "table_group": table_group.to_dict(json_safe=True),
@@ -265,14 +353,19 @@ def on_go_to_profiling_runs(params: dict) -> None:
                 "results": {
                     "success": success,
                     "message": message,
-                    "table_group_id": str(table_group.id),
+                    "run_profiling": run_profiling,
+                    "generate_test_suite": generate_test_suite,
+                    "generate_monitor_suite": generate_monitor_suite,
+                    "test_suite_name": standard_test_suite.test_suite if standard_test_suite else None,
                 } if success is not None else None,
+                "standard_cron_sample": standard_cron_sample_result(),
+                "monitor_cron_sample": monitor_cron_sample_result(),
             },
-            on_change_handlers={
-                "PreviewTableGroupClicked": on_preview_table_group_clicked,
-                "SaveTableGroupClicked": on_save_table_group_clicked,
-                "GoToProfilingRunsClicked": on_go_to_profiling_runs,
-            },
+            on_PreviewTableGroupClicked_change=on_preview_table_group_clicked,
+            on_GetCronSample_change=on_get_monitor_cron_sample,
+            on_GetCronSampleAux_change=on_get_standard_cron_sample,
+            on_SaveTableGroupClicked_change=on_save_table_group_clicked,
+            on_CloseClicked_change=on_close_clicked,
         )
 
     def _get_connections(self, project_code: str, connection_id: str | None = None) -> list[dict]:
diff --git a/testgen/ui/views/test_definitions.py b/testgen/ui/views/test_definitions.py
index ce35d955..c494deed 100644
--- a/testgen/ui/views/test_definitions.py
+++ b/testgen/ui/views/test_definitions.py
@@ -1,4 +1,5 @@
 import logging
+import re
 import time
 import typing
 from datetime import datetime
@@ -67,7 +68,7 @@ def render(
         testgen.page_header(
             "Test Definitions",
-            "testgen-test-types",
+            "test-definitions",
             breadcrumbs=[
                 { "label": "Test Suites", "path": "test-suites", "params": { "project_code": project_code } },
                 { "label": test_suite.test_suite },
@@ -438,12 +439,12 @@ def show_test_form(
     baseline_unique_ct = empty_if_null(selected_test_def["baseline_unique_ct"]) if mode == "edit" else ""
     baseline_value = empty_if_null(selected_test_def["baseline_value"]) if mode == "edit" else ""
     baseline_value_ct = empty_if_null(selected_test_def["baseline_value_ct"]) if mode == "edit" else ""
-    threshold_value = selected_test_def["threshold_value"] or 0 if mode == "edit" else 0
+    threshold_value = empty_if_null(selected_test_def["threshold_value"]) if mode == "edit" else ""
     baseline_sum = empty_if_null(selected_test_def["baseline_sum"]) if mode == "edit" else ""
     baseline_avg = empty_if_null(selected_test_def["baseline_avg"]) if mode == "edit" else ""
     baseline_sd = empty_if_null(selected_test_def["baseline_sd"]) if mode == "edit" else ""
-    lower_tolerance = selected_test_def["lower_tolerance"] or 0 if mode == "edit" else 0
-    upper_tolerance = selected_test_def["upper_tolerance"] or 0 if mode == "edit" else 0
+    lower_tolerance = empty_if_null(selected_test_def["lower_tolerance"]) if mode == "edit" else ""
+    upper_tolerance = empty_if_null(selected_test_def["upper_tolerance"]) if mode == "edit" else ""
     subset_condition = empty_if_null(selected_test_def["subset_condition"]) if mode == "edit" else ""
     groupby_names = empty_if_null(selected_test_def["groupby_names"]) if mode == "edit" else ""
     having_condition = empty_if_null(selected_test_def["having_condition"]) if mode == "edit" else ""
@@ -454,8 +455,9 @@ def show_test_form(
     match_subset_condition = empty_if_null(selected_test_def["match_subset_condition"]) if mode == "edit" else ""
     match_groupby_names = empty_if_null(selected_test_def["match_groupby_names"]) if mode == "edit" else ""
     match_having_condition = empty_if_null(selected_test_def["match_having_condition"]) if mode == "edit" else ""
-    window_days = selected_test_def["window_days"] or 0 if mode == "edit" else 0
+    window_days = empty_if_null(selected_test_def["window_days"]) if mode == "edit" else ""
     history_calculation = empty_if_null(selected_test_def["history_calculation"]) if mode == "edit" else ""
+    history_calculation_upper = empty_if_null(selected_test_def["history_calculation_upper"]) if mode == "edit" else ""
     history_lookback = empty_if_null(selected_test_def["history_lookback"]) if mode == "edit" else ""
 
     # export_to_observability
@@ -553,6 +555,7 @@ def show_test_form(
         "match_having_condition": match_having_condition,
         "window_days": window_days,
         "history_calculation": history_calculation,
+        "history_calculation_upper": history_calculation_upper,
         "history_lookback": history_lookback,
     }
@@ -675,15 +678,14 @@ def render_dynamic_attribute(attribute: str, container: DeltaGenerator):
         if not attribute in dynamic_attributes or not attribute:
             return
 
-        choice_fields = {
-            "history_calculation": ["Value", "Minimum", "Maximum", "Sum", "Average"],
-        }
         float_numeric_attributes = ["lower_tolerance", "upper_tolerance"]
         if test_type != "LOV_All":
             float_numeric_attributes.append("threshold_value")
         int_numeric_attributes = ["history_lookback"]
 
         default_value = 0 if attribute in [*float_numeric_attributes, *int_numeric_attributes] else ""
+        if attribute == "history_lookback":
+            default_value = 10
         value = (
             selected_test_def[attribute]
             if mode == "edit" and selected_test_def[attribute] is not None
@@ -705,19 +707,22 @@ def render_dynamic_attribute(attribute: str, container: DeltaGenerator):
         )
 
         if attribute == "custom_query":
-            custom_query_placeholder = None
-            if test_type == "Condition_Flag":
-                custom_query_placeholder = "EXAMPLE: status = 'SHIPPED' and qty_shipped = 0"
-            elif test_type == "CUSTOM":
-                custom_query_placeholder = "EXAMPLE: SELECT product, SUM(qty_sold) as sum_sold, SUM(qty_shipped) as qty_shipped \n FROM {DATA_SCHEMA}.sales_history \n GROUP BY product \n HAVING SUM(qty_shipped) > SUM(qty_sold)"
-
-            test_definition[attribute] = container.text_area(
-                label=label_text,
-                value=custom_query,
-                placeholder=custom_query_placeholder,
-                height=150 if test_type == "CUSTOM" else 75,
-                help=help_text,
-            )
+            if test_type == "Volume_Trend":
+                test_definition[attribute] = "COUNT(CASE WHEN {SUBSET_CONDITION} THEN 1 END)"
+            else:
+                custom_query_placeholder = None
+                if test_type == "Condition_Flag":
+                    custom_query_placeholder = "EXAMPLE: status = 'SHIPPED' and qty_shipped = 0"
+                elif test_type == "CUSTOM":
+                    custom_query_placeholder = "EXAMPLE: SELECT product, SUM(qty_sold) as sum_sold, SUM(qty_shipped) as qty_shipped \n FROM {DATA_SCHEMA}.sales_history \n GROUP BY product \n HAVING SUM(qty_shipped) > SUM(qty_sold)"
+
+                test_definition[attribute] = container.text_area(
+                    label=label_text,
+                    value=custom_query,
+                    placeholder=custom_query_placeholder,
+                    height=150 if test_type == "CUSTOM" else 75,
+                    help=help_text,
+                )
         elif attribute in float_numeric_attributes:
             test_definition[attribute] = container.number_input(
                 label=label_text,
@@ -726,32 +731,73 @@ def render_dynamic_attribute(attribute: str, container: DeltaGenerator):
                 help=help_text,
             )
         elif attribute in int_numeric_attributes:
-            max_value = None
-            if (
-                attribute == "history_lookback"
-                and int(value) <= 1
-                and (
-                    not test_definition.get("history_calculation")
-                    or test_definition.get("history_calculation") == "Value"
-                )
-            ):
-                max_value = 1
+            min_value = 0
+            placeholder = None
+            disabled = False
+            if attribute == "history_lookback":
+                min_value = 1
+                if test_definition.get("history_calculation") == "PREDICT":
+                    value = None
+                    placeholder = "Max"
+                    disabled = True
+
+                if test_definition.get("history_calculation") == "Value" and (
+                    "history_calculation_upper" not in dynamic_attributes
+                    or test_definition.get("history_calculation_upper") == "Value"
+                ):
+                    value = 1
+                    disabled = True
+
             test_definition[attribute] = container.number_input(
                 label=label_text,
                 step=1,
-                value=int(value),
-                max_value=max_value,
-                min_value=0,
+                value=int(value) if value is not None else None,
+                min_value=min_value,
+                placeholder=placeholder,
                 help=help_text,
+                disabled=disabled,
             )
-        elif attribute in choice_fields:
+        elif attribute in ["history_calculation", "history_calculation_upper"]:
+            predict_label = "Use Prediction Model"
+            options = ["Value", "Minimum", "Maximum", "Sum", "Average", "Expression"]
+            if attribute == "history_calculation":
+                options.append(predict_label)
+
+            default = value
+            disabled = False
+            match = re.search(r"^EXPR:\[(.+)\]$", value)
+            expression = None
+            if value and match:
+                default = "Expression"
+                expression = match.group(1)
+            elif value == "PREDICT":
+                default = predict_label
+
+            if attribute == "history_calculation_upper" and test_definition["history_calculation"] == "PREDICT":
+                default = None
+                disabled = True
+
             with container:
-                test_definition[attribute] = testgen.select(
+                selection = testgen.select(
                     label_text,
-                    choice_fields[attribute],
+                    options=options,
                     required=True,
-                    default_value=value,
+                    default_value=default,
+                    disabled=disabled,
                 )
+
+                if selection == "Expression":
+                    expression = st.text_input(
+                        label=f"{label_text} Expression",
+                        max_chars=900,
+                        value=expression,
+                        # help="",  // TODO
+                    )
+                    test_definition[attribute] = f"EXPR:[{expression}]"
+                elif selection == predict_label:
+                    test_definition[attribute] = "PREDICT"
+                else:
+                    test_definition[attribute] = selection
         else:
             test_definition[attribute] = container.text_input(
                 label=label_text,
@@ -867,7 +913,10 @@ def copy_move_test_dialog(
     )
 
     with suite_filter_column:
-        test_suites = TestSuite.select_minimal_where(TestSuite.table_groups_id == target_table_group_id)
+        test_suites = TestSuite.select_minimal_where(
+            TestSuite.table_groups_id == target_table_group_id,
+            TestSuite.is_monitor.isnot(True),
+        )
         test_suites_df = to_dataframe(test_suites, TestSuiteMinimal.columns())
         target_test_suite_id = testgen.select(
             options=test_suites_df,
diff --git a/testgen/ui/views/test_results.py b/testgen/ui/views/test_results.py
index b181ac62..f6065121 100644
--- a/testgen/ui/views/test_results.py
+++ b/testgen/ui/views/test_results.py
@@ -43,7 +43,7 @@
 from testgen.ui.session import session
 from testgen.ui.views.dialogs.profiling_results_dialog import view_profiling_button
 from testgen.ui.views.test_definitions import show_test_form_by_id
-from testgen.utils import friendly_score
+from testgen.utils import friendly_score, str_to_timestamp
 
 PAGE_PATH = "test-runs:results"
@@ -78,7 +78,7 @@ def render(
         testgen.page_header(
             "Test Results",
-            "view-testgen-test-results",
+            "investigate-test-results",
             breadcrumbs=[
                 { "label": "Test Runs", "path": "test-runs", "params": { "project_code": run.project_code } },
                 { "label": f"{run.test_suite} | {run_date}" },
@@ -312,9 +312,6 @@ def open_download_dialog(data: pd.DataFrame | None = None) -> None:
         multi_select,
     )
 
-    # Help Links
-    st.markdown("[Help on Test Types](https://docs.datakitchen.io/article/dataops-testgen-help/testgen-test-types)")
-
 
 @st.fragment
 @with_database_session
@@ -531,7 +528,7 @@ def render_selected_details(
     date_service.accommodate_dataframe_to_timezone(dfh, st.session_state, time_columns)
 
     if user_can_edit:
-        view_edit_test(v_col1, selected_item["test_definition_id_current"])
+        view_edit_test(v_col1, selected_item["test_definition_id"])
 
     if selected_item["test_scope"] == "column":
         with v_col2:
@@ -608,9 +605,10 @@ def render_selected_details(
     if dfh.empty:
         st.write("Test history not available.")
     else:
-        write_history_graph(dfh)
+        # write_history_graph(dfh)
+        write_history_chart_v2(dfh)
 
     with ut_tab2:
-
show_test_def_detail(selected_item["test_definition_id_current"], test_suite) + show_test_def_detail(selected_item["test_definition_id"], test_suite) @with_database_session @@ -659,6 +657,17 @@ def write_history_graph(data: pd.DataFrame): case _: render_line_chart(data, **chart_params) +def write_history_chart_v2(data: pd.DataFrame): + data["test_date"] = data["test_date"].apply(str_to_timestamp) + return testgen.testgen_component( + "test_results_chart", + props={ + # Fix NaN values + "data": json.loads(data.to_json(orient="records")), + }, + ) + + def render_line_chart(dfh: pd.DataFrame, **_params: dict) -> None: str_uom = dfh.at[0, "measure_uom"] @@ -813,14 +822,6 @@ def source_data_dialog(selected_row): st.markdown("#### Result Detail") st.caption(selected_row["result_message"].replace("*", "\\*")) - st.markdown("#### SQL Query") - if selected_row["test_type"] == "CUSTOM": - query = get_test_issue_source_query_custom(selected_row) - else: - query = get_test_issue_source_query(selected_row) - if query: - st.code(query, language="sql", wrap_lines=True, height=100) - with st.spinner("Retrieving source data..."): if selected_row["test_type"] == "CUSTOM": bad_data_status, bad_data_msg, _, df_bad = get_test_issue_source_data_custom(selected_row, limit=500) @@ -843,6 +844,14 @@ def source_data_dialog(selected_row): # Display the dataframe st.dataframe(df_bad, width=1050, hide_index=True) + st.markdown("#### SQL Query") + if selected_row["test_type"] == "CUSTOM": + query = get_test_issue_source_query_custom(selected_row) + else: + query = get_test_issue_source_query(selected_row) + if query: + st.code(query, language="sql", wrap_lines=True, height=100) + def view_edit_test(button_container, test_definition_id): if test_definition_id: diff --git a/testgen/ui/views/test_runs.py b/testgen/ui/views/test_runs.py index 950384a4..ea30c3fd 100644 --- a/testgen/ui/views/test_runs.py +++ b/testgen/ui/views/test_runs.py @@ -45,20 +45,20 @@ class TestRunsPage(Page): 
icon=PAGE_ICON, label=PAGE_TITLE, section="Data Quality Testing", - order=0, + order=1, ) def render(self, project_code: str, table_group_id: str | None = None, test_suite_id: str | None = None, **_kwargs) -> None: testgen.page_header( PAGE_TITLE, - "test-results", + "data-quality-testing", ) with st.spinner("Loading data ..."): project_summary = Project.get_summary(project_code) test_runs = TestRun.select_summary(project_code, table_group_id, test_suite_id) table_groups = TableGroup.select_minimal_where(TableGroup.project_code == project_code) - test_suites = TestSuite.select_minimal_where(TestSuite.project_code == project_code) + test_suites = TestSuite.select_minimal_where(TestSuite.project_code == project_code, TestSuite.is_monitor.isnot(True)) testgen_component( "test_runs", @@ -142,7 +142,10 @@ def _model_to_item_attrs(self, model: TestRunNotificationSettings) -> dict[str, def _get_component_props(self) -> dict[str, Any]: test_suite_options = [ (str(ts.id), ts.test_suite) - for ts in TestSuite.select_minimal_where(TestSuite.project_code == self.ns_attrs["project_code"]) + for ts in TestSuite.select_minimal_where( + TestSuite.project_code == self.ns_attrs["project_code"], + TestSuite.is_monitor.isnot(True), + ) ] test_suite_options.insert(0, (None, "All Test Suites")) trigger_labels = { @@ -167,7 +170,10 @@ class TestRunScheduleDialog(ScheduleDialog): test_suites: Iterable[TestSuiteMinimal] | None = None def init(self) -> None: - self.test_suites = TestSuite.select_minimal_where(TestSuite.project_code == self.project_code) + self.test_suites = TestSuite.select_minimal_where( + TestSuite.project_code == self.project_code, + TestSuite.is_monitor.isnot(True), + ) def get_arg_value(self, job): return next(item.test_suite for item in self.test_suites if str(item.id) == job.kwargs["test_suite_id"]) @@ -214,7 +220,6 @@ def on_delete_confirmed(*_args) -> None: testgen.testgen_component( "confirm_dialog", props={ - "project_code": project_code, "message": message, 
"constraint": constraint, "button_label": "Delete", diff --git a/testgen/ui/views/test_suites.py b/testgen/ui/views/test_suites.py index 9a8109ea..033972b5 100644 --- a/testgen/ui/views/test_suites.py +++ b/testgen/ui/views/test_suites.py @@ -34,13 +34,13 @@ class TestSuitesPage(Page): icon=PAGE_ICON, label=PAGE_TITLE, section="Data Quality Testing", - order=1, + order=2, ) def render(self, project_code: str, table_group_id: str | None = None, **_kwargs) -> None: testgen.page_header( PAGE_TITLE, - "create-a-test-suite", + "manage-test-suites", ) table_groups = TableGroup.select_minimal_where(TableGroup.project_code == project_code) diff --git a/testgen/utils/__init__.py b/testgen/utils/__init__.py index 4dfc9d39..fe40803d 100644 --- a/testgen/utils/__init__.py +++ b/testgen/utils/__init__.py @@ -4,6 +4,7 @@ from collections.abc import Iterable from datetime import UTC, datetime from decimal import Decimal +from enum import Enum from functools import wraps from typing import TYPE_CHECKING @@ -27,9 +28,25 @@ def to_int(value: float | int) -> int: return 0 +def to_sql_timestamp(value: datetime): + return value.strftime("%Y-%m-%d %H:%M:%S") + + +def str_to_timestamp(value: str) -> int: + try: + return int(datetime.strptime(value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=UTC).timestamp()) + except: + ... + try: + return int(datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=UTC).timestamp()) + except: + ... 
+ + def to_dataframe( data: Iterable[Any], columns: list[str] | None = None, + coerce_float: bool = False, ) -> pd.DataFrame: records = [] for item in data: @@ -40,7 +57,7 @@ def to_dataframe( else: row = dict(item) records.append(row) - return pd.DataFrame.from_records(records, columns=columns) + return pd.DataFrame.from_records(records, columns=columns, coerce_float=coerce_float) def is_uuid4(value: str) -> bool: @@ -73,6 +90,8 @@ def make_json_safe(value: Any) -> str | bool | int | float | None: return int(value.replace(tzinfo=UTC).timestamp()) elif isinstance(value, Decimal): return float(value) + elif isinstance(value, Enum): + return value.value elif isinstance(value, list): return [ make_json_safe(item) for item in value ] elif isinstance(value, dict): diff --git a/testgen/utils/plugins.py b/testgen/utils/plugins.py index 6d5596a0..15bb024d 100644 --- a/testgen/utils/plugins.py +++ b/testgen/utils/plugins.py @@ -3,6 +3,7 @@ import inspect import json import os +import shutil from collections.abc import Generator from pathlib import Path from typing import ClassVar @@ -66,7 +67,19 @@ def provide(self) -> None: target = ui_plugins_components_directory / self.name try: - os.symlink(self.root, target) + if target.exists(): + if target.is_symlink(): + target.unlink() + else: + shutil.rmtree(target) + + try: + if self.root.is_dir(): + shutil.copytree(self.root, target) + else: + shutil.copy2(self.root, target) + except Exception: + os.symlink(self.root, target) except FileExistsError: ... 
except OSError as e: diff --git a/tests/unit/test_refresh_data_chars_query.py b/tests/unit/commands/queries/test_refresh_data_chars_query.py similarity index 60% rename from tests/unit/test_refresh_data_chars_query.py rename to tests/unit/commands/queries/test_refresh_data_chars_query.py index a84bc139..9118d586 100644 --- a/tests/unit/test_refresh_data_chars_query.py +++ b/tests/unit/commands/queries/test_refresh_data_chars_query.py @@ -4,8 +4,9 @@ from testgen.common.models.connection import Connection from testgen.common.models.table_group import TableGroup +pytestmark = pytest.mark.unit + -@pytest.mark.unit def test_include_exclude_mask_basic(): connection = Connection(sql_flavor="postgresql") table_group = TableGroup( @@ -26,7 +27,6 @@ def test_include_exclude_mask_basic(): )""" in query -@pytest.mark.unit @pytest.mark.parametrize("mask", ("", None)) def test_include_empty_exclude_mask(mask): connection = Connection(sql_flavor="snowflake") @@ -44,7 +44,6 @@ def test_include_empty_exclude_mask(mask): )""" in query -@pytest.mark.unit @pytest.mark.parametrize("mask", ("", None)) def test_include_empty_include_mask(mask): connection = Connection(sql_flavor="mssql") @@ -60,3 +59,51 @@ def test_include_empty_include_mask(mask): assert r"""AND ( (c.table_name LIKE 'important%' ) OR (c.table_name LIKE '%useful[_]%' ) )""" in query + + +def test_table_set_only(): + connection = Connection(sql_flavor="postgresql") + table_group = TableGroup( + table_group_schema="test_schema", + profiling_table_set="users, orders, products", + profiling_include_mask="", + profiling_exclude_mask="", + ) + sql_generator = RefreshDataCharsSQL(connection, table_group) + criteria = sql_generator._get_table_criteria() + + assert "IN ('users','orders','products')" in criteria + assert "LIKE" not in criteria + + +@pytest.mark.parametrize("include", ("", None)) +@pytest.mark.parametrize("exclude", ("", None)) +def test_no_filters(include, exclude): + connection = 
Connection(sql_flavor="postgresql") + table_group = TableGroup( + table_group_schema="test_schema", + profiling_table_set="", + profiling_include_mask=include, + profiling_exclude_mask=exclude, + ) + sql_generator = RefreshDataCharsSQL(connection, table_group) + criteria = sql_generator._get_table_criteria() + + assert criteria == "" + + +def test_table_set_with_include_exclude(): + connection = Connection(sql_flavor="postgresql") + table_group = TableGroup( + table_group_schema="test_schema", + profiling_table_set="users, orders", + profiling_include_mask="important%", + profiling_exclude_mask="temp%", + ) + sql_generator = RefreshDataCharsSQL(connection, table_group) + criteria = sql_generator._get_table_criteria() + + assert "IN ('users','orders')" in criteria + assert "LIKE 'important%'" in criteria + assert "AND NOT" in criteria + assert "LIKE 'temp%'" in criteria diff --git a/tests/unit/test_run_observability_exporter.py b/tests/unit/commands/test_run_observability_exporter.py similarity index 98% rename from tests/unit/test_run_observability_exporter.py rename to tests/unit/commands/test_run_observability_exporter.py index cbcdc483..fa713b63 100644 --- a/tests/unit/test_run_observability_exporter.py +++ b/tests/unit/commands/test_run_observability_exporter.py @@ -6,6 +6,8 @@ calculate_chunk_size, ) +pytestmark = pytest.mark.unit + @pytest.fixture() def test_outcome(): @@ -22,7 +24,6 @@ def test_outcome(): } -@pytest.mark.unit @pytest.mark.parametrize( "test_outcomes_length", [1, 100, 10000], @@ -40,7 +41,6 @@ def test_calculate_chunk_size(test_outcome, test_outcomes_length): assert 100 < chunk_size < 500 -@pytest.mark.unit @pytest.mark.parametrize( "profiling_table_set, expected_outcome", ( @@ -54,7 +54,6 @@ def test_get_processed_profiling_table_set(profiling_table_set, expected_outcome assert expected_outcome == actual_outcome -@pytest.mark.unit @pytest.mark.parametrize( "input_parameters, expected_outcome", ( diff --git a/tests/unit/common/__init__.py 
b/tests/unit/common/__init__.py new file mode 100644 index 00000000..e69de29b diff --git a/tests/unit/common/conftest.py b/tests/unit/common/conftest.py new file mode 100644 index 00000000..8646de5c --- /dev/null +++ b/tests/unit/common/conftest.py @@ -0,0 +1,343 @@ +from datetime import datetime, timedelta +from typing import NamedTuple + +import pandas as pd + +from testgen.commands.test_thresholds_prediction import compute_freshness_threshold +from testgen.common.freshness_service import count_excluded_minutes, get_schedule_params, is_excluded_day +from testgen.common.models.test_suite import PredictSensitivity + + +def _make_freshness_history( + update_timestamps: list[str], + check_interval_minutes: int = 120, +) -> pd.DataFrame: + """Build a sawtooth freshness history from a list of update timestamps. + + Between updates, the signal grows by check_interval_minutes each step. + At each update, the signal resets to 0. + """ + updates = sorted(pd.Timestamp(ts) for ts in update_timestamps) + rows: list[tuple[pd.Timestamp, float]] = [] + for i in range(len(updates) - 1): + start = updates[i] + end = updates[i + 1] + # First segment starts at the exact update time with signal=0 (the update event). + # Later segments start one check_interval after the update, with signal equal to + # that interval β€” simulating the first monitoring check after the update landed. 
+ t = start if i == 0 else start + pd.Timedelta(minutes=check_interval_minutes) + signal = 0.0 if i == 0 else float(check_interval_minutes) + while t < end: + rows.append((t, signal)) + t += pd.Timedelta(minutes=check_interval_minutes) + signal += check_interval_minutes + rows.append((end, 0.0)) + + df = pd.DataFrame(rows, columns=["timestamp", "result_signal"]) + df = df.set_index("timestamp") + return df + + +# ─── Scenario test infrastructure ──────────────────────────────────── + + +class ScenarioPoint(NamedTuple): + timestamp: pd.Timestamp + value: float + lower: float | None + upper: float | None + staleness: float | None + prediction_json: str | None + result_code: int # -1 = training, 1 = passed, 0 = failed + result_status: str # "Log", "Passed", "Failed" + + +def _to_csv_rows(raw: list[tuple[str, str]]) -> list[tuple[pd.Timestamp, float]]: + """Convert (str, str) tuples from generate_test_data to (Timestamp, float).""" + return [(pd.Timestamp(ts), float(val)) for ts, val in raw] + + +def _to_history_df(rows: list[tuple[pd.Timestamp, float]]) -> pd.DataFrame: + """Convert a list of (timestamp, value) tuples to a DataFrame with DatetimeIndex.""" + df = pd.DataFrame(rows, columns=["timestamp", "value"]) + df["timestamp"] = pd.to_datetime(df["timestamp"]) + return df.set_index("timestamp") + + +def _evaluate_freshness_point( + timestamp: pd.Timestamp, + value: float, + lower: float | None, + upper: float | None, + staleness: float | None, + prediction_json: str | None, + freshness_last_update: pd.Timestamp | None, + exclude_weekends: bool, + tz: str | None, +) -> tuple[int, str]: + """Evaluate a single freshness observation against thresholds. + + Mirrors the 3-branch decision in simulate_monitor.py (lines 421-476) + and the SQL template logic. Returns (result_code, result_status). 
+ """ + effective_staleness = staleness if staleness is not None else upper + sched = get_schedule_params(prediction_json) if prediction_json else None + inferred_excluded = sched.excluded_days if sched else None + win_s = sched.window_start if sched else None + win_e = sched.window_end if sched else None + + # Training: thresholds not yet available + if upper is None: + return -1, "Log" + + # Update point: check completed gap against [lower, upper] + if value == 0 and freshness_last_update is not None: + completed_gap = (timestamp - freshness_last_update).total_seconds() / 60 + has_exclusions = exclude_weekends or inferred_excluded or win_s is not None + if has_exclusions: + excluded = count_excluded_minutes( + freshness_last_update, timestamp, exclude_weekends, holiday_dates=None, + tz=tz, excluded_days=inferred_excluded, + window_start=win_s, window_end=win_e, + ) + completed_gap = max(completed_gap - excluded, 0) + if (lower is not None and completed_gap < lower) or completed_gap > upper: + return 0, "Failed" + return 1, "Passed" + + # Between updates: check growing interval against staleness + if value > 0: + has_exclusions = exclude_weekends or inferred_excluded or win_s is not None + is_excl = has_exclusions and is_excluded_day( + timestamp, exclude_weekends, holiday_dates=None, tz=tz, + excluded_days=inferred_excluded, window_start=win_s, window_end=win_e, + ) + if is_excl: + return 1, "Passed" + + excluded = count_excluded_minutes( + freshness_last_update, timestamp, exclude_weekends, holiday_dates=None, + tz=tz, excluded_days=inferred_excluded, + window_start=win_s, window_end=win_e, + ) if has_exclusions and freshness_last_update else 0 + business_interval = value - excluded + if business_interval > effective_staleness: + return 0, "Failed" + return 1, "Passed" + + # First update point (value == 0, no prior update) + return 1, "Passed" + + +def _run_scenario( + csv_rows: list[tuple[pd.Timestamp, float]], + sensitivity: PredictSensitivity, + 
exclude_weekends: bool = False, + tz: str | None = None, +) -> list[ScenarioPoint]: + """Iterate through csv_rows calling compute_freshness_threshold at each step.""" + results: list[ScenarioPoint] = [] + freshness_last_update: pd.Timestamp | None = None + + for i, (timestamp, value) in enumerate(csv_rows): + history_df = _to_history_df(csv_rows[:i]) + + lower, upper, staleness, prediction_json = compute_freshness_threshold( + history_df, sensitivity, min_lookback=30, + exclude_weekends=exclude_weekends, schedule_tz=tz, + ) + + result_code, result_status = _evaluate_freshness_point( + timestamp, value, lower, upper, staleness, prediction_json, + freshness_last_update, exclude_weekends, tz, + ) + + results.append(ScenarioPoint( + timestamp=timestamp, + value=value, + lower=lower, + upper=upper, + staleness=staleness, + prediction_json=prediction_json, + result_code=result_code, + result_status=result_status, + )) + + if value == 0: + freshness_last_update = timestamp + + return results + + +# ─── Scenario data generators (from generate_test_data.py) ─────────── + + +def _ts(dt: datetime) -> str: + return dt.strftime("%Y-%m-%d %H:%M:%S") + + +def _make_observations( + start: datetime, + end: datetime, + interval_hours: int | float, + update_times: set[datetime], +) -> list[tuple[str, str]]: + rows: list[tuple[str, str]] = [] + last_update: datetime | None = None + current = start + while current <= end: + if current in update_times: + rows.append((_ts(current), "0")) + last_update = current + elif last_update is not None: + minutes = int((current - last_update).total_seconds() / 60) + rows.append((_ts(current), str(minutes))) + current += timedelta(hours=interval_hours) + return rows + + +def _weekday_updates( + hour: int, + start: datetime, + end: datetime, + skip_dates: set | None = None, +) -> set[datetime]: + updates: set[datetime] = set() + d = start.replace(hour=0, minute=0, second=0) + while d <= end: + if d.weekday() < 5 and (skip_dates is None or d.date() 
not in skip_dates): + updates.add(d.replace(hour=hour, minute=0, second=0)) + d += timedelta(days=1) + return updates + + +def _gen_daily_regular() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 7, 0) + end = datetime(2025, 11, 9, 19, 0) + updates = _weekday_updates(7, start, end) + return _to_csv_rows(_make_observations(start, end, 12, updates)) + + +def _gen_daily_late_gap_phase() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 7, 0) + end = datetime(2025, 11, 16, 19, 0) + skip = { + datetime(2025, 10, 29).date(), + datetime(2025, 10, 30).date(), + datetime(2025, 10, 31).date(), + } + updates = _weekday_updates(7, start, end, skip_dates=skip) + return _to_csv_rows(_make_observations(start, end, 12, updates)) + + +def _gen_daily_late_schedule_phase() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 7, 0) + end = datetime(2025, 11, 30, 19, 0) + skip = { + datetime(2025, 11, 12).date(), + datetime(2025, 11, 13).date(), + datetime(2025, 11, 14).date(), + } + updates = _weekday_updates(7, start, end, skip_dates=skip) + return _to_csv_rows(_make_observations(start, end, 12, updates)) + + +def _gen_subdaily_regular() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 0, 0) + end = datetime(2025, 11, 2, 23, 0) + updates: set[datetime] = set() + d = start.replace(hour=0) + while d <= end: + if d.weekday() < 5: + for h in range(8, 19, 2): + updates.add(d.replace(hour=h)) + d += timedelta(days=1) + return _to_csv_rows(_make_observations(start, end, 2, updates)) + + +def _gen_subdaily_gap_phase() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 0, 0) + end = datetime(2025, 11, 2, 23, 0) + gap_date = datetime(2025, 10, 22).date() + updates: set[datetime] = set() + d = start.replace(hour=0) + while d <= end: + if d.weekday() < 5: + for h in range(8, 19, 2): + dt = d.replace(hour=h) + if dt.date() == gap_date and h >= 12: + continue + updates.add(dt) + d += timedelta(days=1) + 
return _to_csv_rows(_make_observations(start, end, 2, updates)) + + +def _gen_subdaily_gap_schedule_phase() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 0, 0) + end = datetime(2025, 11, 9, 23, 0) + gap_date = datetime(2025, 10, 29).date() + updates: set[datetime] = set() + d = start.replace(hour=0) + while d <= end: + if d.weekday() < 5: + for h in range(8, 19, 2): + dt = d.replace(hour=h) + if dt.date() == gap_date and h >= 12: + continue + updates.add(dt) + d += timedelta(days=1) + return _to_csv_rows(_make_observations(start, end, 2, updates)) + + +def _gen_weekly_early() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 8, 7, 10, 0) + end = datetime(2025, 11, 6, 22, 0) + updates: set[datetime] = set() + d = start.replace(hour=0) + while d <= end: + if d.weekday() == 3: + updates.add(d.replace(hour=10, minute=0)) + d += timedelta(days=1) + updates.add(datetime(2025, 10, 21, 10, 0)) + updates.discard(datetime(2025, 10, 23, 10, 0)) + return _to_csv_rows(_make_observations(start, end, 12, updates)) + + +def _gen_training_only() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 7, 0) + end = datetime(2025, 11, 2, 19, 0) + updates = { + datetime(2025, 10, 6, 7, 0), + datetime(2025, 10, 13, 7, 0), + datetime(2025, 10, 20, 7, 0), + datetime(2025, 10, 27, 7, 0), + } + return _to_csv_rows(_make_observations(start, end, 12, updates)) + + +def _gen_mwf_regular() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 7, 0) + end = datetime(2025, 12, 1, 19, 0) + updates: set[datetime] = set() + d = start.replace(hour=0) + while d <= end: + if d.weekday() in {0, 2, 4}: + updates.add(d.replace(hour=7, minute=0, second=0)) + d += timedelta(days=1) + return _to_csv_rows(_make_observations(start, end, 12, updates)) + + +def _gen_mwf_late() -> list[tuple[pd.Timestamp, float]]: + start = datetime(2025, 10, 6, 7, 0) + end = datetime(2025, 12, 15, 19, 0) + skip = { + datetime(2025, 11, 26).date(), + datetime(2025, 
11, 28).date(), + } + updates: set[datetime] = set() + d = start.replace(hour=0) + while d <= end: + if d.weekday() in {0, 2, 4} and d.date() not in skip: + updates.add(d.replace(hour=7, minute=0, second=0)) + d += timedelta(days=1) + return _to_csv_rows(_make_observations(start, end, 12, updates)) diff --git a/tests/unit/common/models/test_custom_types.py b/tests/unit/common/models/test_custom_types.py new file mode 100644 index 00000000..2bb56ce6 --- /dev/null +++ b/tests/unit/common/models/test_custom_types.py @@ -0,0 +1,132 @@ +from datetime import UTC, datetime +from unittest.mock import patch + +import pytest + +from testgen.common.models.custom_types import ( + EncryptedBytea, + EncryptedJson, + NullIfEmptyString, + UpdateTimestamp, + YNString, + ZeroIfEmptyInteger, +) + +pytestmark = pytest.mark.unit + + +# --- NullIfEmptyString --- + +@pytest.mark.parametrize( + "value, expected", + [ + ("", None), + ("hello", "hello"), + (None, None), + ], +) +def test_null_if_empty_string(value, expected): + t = NullIfEmptyString() + assert t.process_bind_param(value, None) == expected + + +# --- YNString --- + +@pytest.mark.parametrize( + "value, expected", + [ + (True, "Y"), + (False, "N"), + ("Y", "Y"), + ("N", "N"), + (None, None), + ], +) +def test_yn_string_bind(value, expected): + t = YNString() + assert t.process_bind_param(value, None) == expected + + +@pytest.mark.parametrize( + "value, expected", + [ + ("Y", True), + ("N", False), + (None, None), + ], +) +def test_yn_string_result(value, expected): + t = YNString() + assert t.process_result_value(value, None) == expected + + +# --- ZeroIfEmptyInteger --- + +@pytest.mark.parametrize( + "value, expected", + [ + (5, 5), + (0, 0), + ("", 0), + (None, 0), + ], +) +def test_zero_if_empty_integer(value, expected): + t = ZeroIfEmptyInteger() + assert t.process_bind_param(value, None) == expected + + +# --- UpdateTimestamp --- + +def test_update_timestamp(): + t = UpdateTimestamp() + before = datetime.now(UTC) + result 
= t.process_bind_param("ignored", None) + after = datetime.now(UTC) + assert before <= result <= after + + +# --- EncryptedBytea roundtrip --- + +@patch("testgen.common.encrypt.settings") +def test_encrypted_bytea_roundtrip(mock_settings): + mock_settings.APP_ENCRYPTION_SALT = "testsalt12345678" + mock_settings.APP_ENCRYPTION_SECRET = "testsecret123456" # noqa: S105 + + t = EncryptedBytea() + original = "sensitive data" + + encrypted = t.process_bind_param(original, None) + assert encrypted != original.encode() + + decrypted = t.process_result_value(encrypted, None) + assert decrypted == original + + +@patch("testgen.common.encrypt.settings") +def test_encrypted_bytea_none(mock_settings): + t = EncryptedBytea() + assert t.process_bind_param(None, None) is None + assert t.process_result_value(None, None) is None + + +# --- EncryptedJson roundtrip --- + +@patch("testgen.common.encrypt.settings") +def test_encrypted_json_roundtrip(mock_settings): + mock_settings.APP_ENCRYPTION_SALT = "testsalt12345678" + mock_settings.APP_ENCRYPTION_SECRET = "testsecret123456" # noqa: S105 + + t = EncryptedJson() + original = {"key": "value", "num": 42, "list": [1, 2, 3]} + + encrypted = t.process_bind_param(original, None) + decrypted = t.process_result_value(encrypted, None) + assert decrypted == original + + +@patch("testgen.common.encrypt.settings") +def test_encrypted_json_none(mock_settings): + t = EncryptedJson() + assert t.process_bind_param(None, None) is None + assert t.process_result_value(None, None) is None diff --git a/tests/unit/test_common_email.py b/tests/unit/common/notifications/test_common_email.py similarity index 98% rename from tests/unit/test_common_email.py rename to tests/unit/common/notifications/test_common_email.py index 907a4ba2..c109f328 100644 --- a/tests/unit/test_common_email.py +++ b/tests/unit/common/notifications/test_common_email.py @@ -4,6 +4,8 @@ from testgen.common.notifications.base import BaseEmailTemplate, EmailTemplateException +pytestmark 
= pytest.mark.unit + class TestEmailTemplate(BaseEmailTemplate): diff --git a/tests/unit/common/notifications/test_monitor_run_notifications.py b/tests/unit/common/notifications/test_monitor_run_notifications.py new file mode 100644 index 00000000..cd48b6db --- /dev/null +++ b/tests/unit/common/notifications/test_monitor_run_notifications.py @@ -0,0 +1,322 @@ +from unittest.mock import Mock, patch + +import pytest + +from testgen.common.models.notification_settings import ( + MonitorNotificationSettings, + MonitorNotificationTrigger, +) +from testgen.common.models.project import Project +from testgen.common.models.table_group import TableGroup +from testgen.common.models.test_result import TestResult +from testgen.common.models.test_run import TestRun +from testgen.common.notifications.monitor_run import send_monitor_notifications + +pytestmark = pytest.mark.unit + + +def create_monitor_ns(**kwargs): + with patch("testgen.common.notifications.monitor_run.MonitorNotificationSettings.save"): + return MonitorNotificationSettings.create("proj", "tg-id", "ts-id", **kwargs) + + +def create_test_result(table_name, test_type, message, result_code=0): + mock = Mock(spec=TestResult) + mock.table_name = table_name + mock.test_type = test_type + mock.message = message + mock.result_code = result_code + return mock + + +@pytest.fixture +def ns_select_result(): + return [ + create_monitor_ns( + recipients=["always@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + ), + create_monitor_ns( + recipients=["filtered@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + table_name="users", + ), + ] + + +@pytest.fixture +def ns_select_patched(ns_select_result): + with patch("testgen.common.notifications.monitor_run.MonitorNotificationSettings.select") as mock: + mock.return_value = ns_select_result + yield mock + + +@pytest.fixture +def send_mock(): + with patch("testgen.common.notifications.monitor_run.MonitorEmailTemplate.send") as mock: + yield mock + 
+ +@pytest.fixture +def select_where_mock(): + with patch("testgen.common.notifications.monitor_run.TableGroup.select_where") as mock: + yield mock + + +@pytest.fixture +def project_get_mock(): + with patch("testgen.common.notifications.monitor_run.Project.get") as mock: + yield mock + + +@pytest.fixture +def test_result_select_where_mock(): + with patch("testgen.common.notifications.monitor_run.TestResult.select_where") as mock: + yield mock + + +@pytest.fixture +def persisted_setting_mock(): + with patch("testgen.common.notifications.monitor_run.PersistedSetting.get") as mock: + mock.return_value = "http://tg-base-url" + yield mock + + +@pytest.mark.parametrize( + ( + "freshness_count", "schema_count", "volume_count", "table_name_filter", + "expected_send_calls", "expected_anomalies_count" + ), + [ + (0, 0, 0, None, 0, 0), + (5, 0, 0, None, 1, 5), + (0, 3, 0, None, 1, 3), + (0, 0, 2, None, 1, 2), + (5, 3, 2, None, 1, 10), + (5, 3, 2, "users", 1, 10), + (10, 5, 3, None, 1, 18), + ] +) +def test_send_monitor_notifications( + freshness_count, + schema_count, + volume_count, + table_name_filter, + expected_send_calls, + expected_anomalies_count, + ns_select_patched, + select_where_mock, + project_get_mock, + test_result_select_where_mock, + send_mock, + persisted_setting_mock, +): + test_run = TestRun( + id="monitor-run-id", + test_suite_id="monitor-suite-id", + test_endtime="2024-01-15T10:30:00Z", + ) + + table_group = Mock(spec=TableGroup) + table_group.id = "tg-id" + table_group.project_code = "proj-code" + table_group.table_groups_name = "production_tables" + select_where_mock.return_value = [table_group] + + project = Mock(spec=Project) + project.project_name = "Data Platform" + project_get_mock.return_value = project + + test_results = [] + for _ in range(freshness_count): + test_results.append(create_test_result("orders", "Freshness_Trend", "Data is 2 hours old")) + for _ in range(schema_count): + test_results.append(create_test_result("customers", 
"Schema_Drift", "Column 'status' was removed")) + for _ in range(volume_count): + test_results.append(create_test_result("products", "Volume_Trend", "Volume decreased by 25%")) + + test_result_select_where_mock.return_value = test_results + + if table_name_filter: + ns_select_patched.return_value = [ + create_monitor_ns( + recipients=["filtered@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + table_name=table_name_filter, + ), + ] + else: + ns_select_patched.return_value = [ + create_monitor_ns( + recipients=["always@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + ), + ] + + send_monitor_notifications(test_run) + + ns_select_patched.assert_called_once_with( + enabled=True, + test_suite_id="monitor-suite-id", + ) + + if expected_send_calls > 0: + assert send_mock.call_count == expected_send_calls + + for call_args in send_mock.call_args_list: + context = call_args[0][1] + assert context["summary"]["test_endtime"] == "2024-01-15T10:30:00Z" + assert context["summary"]["table_groups_name"] == "production_tables" + assert context["summary"]["project_name"] == "Data Platform" + assert context["total_anomalies"] == expected_anomalies_count + assert "summary_tags" in context + assert "anomalies" in context + assert "view_in_testgen_url" in context + assert len(context["anomalies"]) == expected_anomalies_count + else: + send_mock.assert_not_called() + + +@pytest.mark.parametrize( + ("has_notifications", "has_table_group", "has_results"), + [ + (False, True, True), + (True, False, True), + (True, True, False), + ] +) +def test_send_monitor_notifications_early_exit( + has_notifications, + has_table_group, + has_results, + ns_select_patched, + select_where_mock, + test_result_select_where_mock, + send_mock, +): + test_run = TestRun( + id="monitor-run-id", + test_suite_id="monitor-suite-id", + test_endtime="2024-01-15T10:30:00Z", + ) + + if not has_notifications: + ns_select_patched.return_value = [] + if not has_table_group: + 
select_where_mock.return_value = [] + if not has_results: + test_result_select_where_mock.return_value = [] + + send_monitor_notifications(test_run) + + send_mock.assert_not_called() + + +def test_send_monitor_notifications_anomaly_counts( + ns_select_patched, + select_where_mock, + project_get_mock, + test_result_select_where_mock, + send_mock, + persisted_setting_mock, +): + test_run = TestRun( + id="monitor-run-id", + test_suite_id="monitor-suite-id", + test_endtime="2024-01-15T10:30:00Z", + ) + + table_group = Mock(spec=TableGroup) + table_group.id = "tg-id" + table_group.project_code = "proj-code" + table_group.table_groups_name = "prod" + select_where_mock.return_value = [table_group] + + project = Mock(spec=Project) + project.project_name = "Analytics" + project_get_mock.return_value = project + + test_results = [ + create_test_result("t1", "Freshness_Trend", "msg1"), + create_test_result("t2", "Freshness_Trend", "msg2"), + create_test_result("t3", "Schema_Drift", "msg3"), + create_test_result("t4", "Volume_Trend", "msg4"), + create_test_result("t5", "Volume_Trend", "msg5"), + ] + test_result_select_where_mock.return_value = test_results + + ns_select_patched.return_value = [ + create_monitor_ns( + recipients=["always@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + ), + ] + + send_monitor_notifications(test_run) + + assert send_mock.call_count == 1 + context = send_mock.call_args[0][1] + + summary_tags = {item["type"]: item["badge_content"] for item in context["summary_tags"]} + assert summary_tags["Freshness"] == "2" + assert summary_tags["Schema"] == "1" + assert summary_tags["Volume"] == "2" + + +def test_send_monitor_notifications_url_construction( + ns_select_patched, + select_where_mock, + project_get_mock, + test_result_select_where_mock, + send_mock, + persisted_setting_mock, +): + test_run = TestRun( + id="monitor-run-id", + test_suite_id="monitor-suite-id", + test_endtime="2024-01-15T10:30:00Z", + ) + + table_group = 
Mock(spec=TableGroup) + table_group.id = "tg-123" + table_group.project_code = "proj-abc" + table_group.table_groups_name = "prod" + select_where_mock.return_value = [table_group] + + project = Mock(spec=Project) + project.project_name = "Analytics" + project_get_mock.return_value = project + + test_results = [create_test_result("orders", "Freshness_Trend", "stale")] + test_result_select_where_mock.return_value = test_results + + # Test without table_name filter + ns_select_patched.return_value = [ + create_monitor_ns( + recipients=["always@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + ), + ] + send_monitor_notifications(test_run) + + context = send_mock.call_args[0][1] + assert context["view_in_testgen_url"] == ( + "http://tg-base-url/monitors?project_code=proj-abc&table_group_id=tg-123&source=email" + ) + + send_mock.reset_mock() + ns_select_patched.return_value = [ + create_monitor_ns( + recipients=["filtered@example.com"], + trigger=MonitorNotificationTrigger.on_anomalies, + table_name="users", + ), + ] + + send_monitor_notifications(test_run) + + context = send_mock.call_args[0][1] + assert context["view_in_testgen_url"] == ( + "http://tg-base-url/monitors?project_code=proj-abc&table_group_id=tg-123&table_name_filter=users&source=email" + ) + assert context["summary"]["table_name"] == "users" diff --git a/tests/unit/test_profiling_run_notifications.py b/tests/unit/common/notifications/test_profiling_run_notifications.py similarity index 86% rename from tests/unit/test_profiling_run_notifications.py rename to tests/unit/common/notifications/test_profiling_run_notifications.py index b1fc911d..c9e7ca38 100644 --- a/tests/unit/test_profiling_run_notifications.py +++ b/tests/unit/common/notifications/test_profiling_run_notifications.py @@ -1,5 +1,6 @@ from itertools import count from unittest.mock import ANY, Mock, call, patch +from urllib.parse import quote import pytest @@ -11,6 +12,8 @@ from testgen.common.models.profiling_run import 
ProfilingRun from testgen.common.notifications.profiling_run import send_profiling_run_notifications +pytestmark = pytest.mark.unit + def create_ns(**kwargs): with patch("testgen.common.notifications.profiling_run.ProfilingRunNotificationSettings.save"): @@ -130,8 +133,8 @@ def test_send_profiling_run_notification( { "profiling_run": { "id": "pr-id", - "issues_url": "http://tg-base-url/profiling-runs:hygiene?run_id=pr-id", - "results_url": "http://tg-base-url/profiling-runs:results?run_id=pr-id", + "issues_url": "http://tg-base-url/profiling-runs:hygiene?run_id=pr-id&source=email", + "results_url": "http://tg-base-url/profiling-runs:results?run_id=pr-id&source=email", "start_time": None, "end_time": None, "status": profiling_run_status, @@ -139,7 +142,7 @@ def test_send_profiling_run_notification( "table_ct": None, "column_ct": None, }, - "new_issue_count": new_issue_count, + "issue_count": issue_count, "hygiene_issues_summary": ANY, "notification_trigger": trigger, "project_name": "proj-name", @@ -162,3 +165,22 @@ def test_send_profiling_run_notification( assert all(s.get("label") is not None for s in summary) assert all(s.get("priority") in priorities for s in summary) assert all(s.get("url") is not None for s in summary) + + # Verify priority-to-likelihood URL mapping and URL encoding + expected_likelihoods = { + "Definite": "Definite", + "Likely": "Likely", + "Possible": "Possible", + "High": "Potential PII", + "Moderate": "Potential PII", + } + for s in summary: + expected_likelihood = expected_likelihoods[s["priority"]] + assert f"likelihood={quote(expected_likelihood)}" in s["url"] + + # Verify is_new flags are passed through + all_issues = [issue for s in summary for issue in s["issues"]] + if not has_prev_run: + assert all(issue["is_new"] is True for issue in all_issues) + elif new_issue_count == 0: + assert all(issue["is_new"] is False for issue in all_issues) diff --git a/tests/unit/test_score_drop_notifications.py 
b/tests/unit/common/notifications/test_score_drop_notifications.py similarity index 97% rename from tests/unit/test_score_drop_notifications.py rename to tests/unit/common/notifications/test_score_drop_notifications.py index 796ecee6..26267578 100644 --- a/tests/unit/test_score_drop_notifications.py +++ b/tests/unit/common/notifications/test_score_drop_notifications.py @@ -7,6 +7,8 @@ from testgen.common.models.scores import ScoreDefinition, ScoreDefinitionResult from testgen.common.notifications.score_drop import collect_score_notification_data, send_score_drop_notifications +pytestmark = pytest.mark.unit + def create_ns(**kwargs): with patch("testgen.common.notifications.score_drop.ScoreDropNotificationSettings.save"): @@ -144,8 +146,8 @@ def test_send_score_drop_notifications_no_match( ) ) def test_send_score_drop_notifications( - total_prev, total_fresh, cde_prev, cde_fresh, triggers, score_definition, db_session_mock, ns_select_result, - send_mock, + total_prev, total_fresh, cde_prev, cde_fresh, triggers, score_definition, db_session_mock, + ns_select_result, send_mock, ): data = [ (score_definition, "score", total_prev, total_fresh), @@ -169,7 +171,7 @@ def test_send_score_drop_notifications( { "project_name": "Test Project", "definition": score_definition, - "scorecard_url": "http://tg-base-url/quality-dashboard:score-details?definition_id=sd-1", + "scorecard_url": "http://tg-base-url/quality-dashboard:score-details?definition_id=sd-1&source=email", "diff": [ {**expected_total_diff, "notify": total_triggers}, {**expected_cde_diff, "notify": cde_triggers}, diff --git a/tests/unit/test_test_run_notifications.py b/tests/unit/common/notifications/test_test_run_notifications.py similarity index 98% rename from tests/unit/test_test_run_notifications.py rename to tests/unit/common/notifications/test_test_run_notifications.py index e1817eb4..bde2d6ff 100644 --- a/tests/unit/test_test_run_notifications.py +++ 
b/tests/unit/common/notifications/test_test_run_notifications.py @@ -8,6 +8,8 @@ from testgen.common.models.test_run import TestRun from testgen.common.notifications.test_run import send_test_run_notifications +pytestmark = pytest.mark.unit + def create_ns(**kwargs): with patch("testgen.common.notifications.test_run.TestRunNotificationSettings.save"): @@ -97,6 +99,7 @@ def select_summary_mock(): ), [ ("Complete", 0, 0, 0, {}, 0, 0, 0, ["always"]), + ("Complete", 0, 5, 0, {}, 0, 5, 0, ["always", "on_warnings"]), ("Complete", 1, 1, 1, {}, 1, 1, 1, ["always", "on_failures", "on_warnings"]), ("Complete", 50, 50, 50, {"failed": 2, "warning": 3}, 10, 5, 5, [ "always", "on_failures", "on_warnings", "on_changes", @@ -171,7 +174,7 @@ def test_send_test_run_notification( expected_context = { "test_run": summary, - "test_run_url": "http://tg-base-url/test-runs:results?run_id=tr-id", + "test_run_url": "http://tg-base-url/test-runs:results?run_id=tr-id&source=email", "test_run_id": "tr-id", "test_result_summary": ANY, } diff --git a/tests/unit/common/test_clean_sql.py b/tests/unit/common/test_clean_sql.py new file mode 100644 index 00000000..89b4c456 --- /dev/null +++ b/tests/unit/common/test_clean_sql.py @@ -0,0 +1,74 @@ +import pytest + +from testgen.common.clean_sql import CleanSQL, concat_columns + +pytestmark = pytest.mark.unit + + +# --- CleanSQL --- + +def test_clean_sql_block_comments(): + assert CleanSQL("SELECT /* comment */ 1") == "SELECT 1" + + +def test_clean_sql_multiline_block_comments(): + sql = """SELECT /* + multi-line + comment + */ 1""" + assert CleanSQL(sql) == "SELECT 1" + + +def test_clean_sql_line_comments(): + sql = "SELECT 1 -- this is a comment\nFROM t" + assert CleanSQL(sql) == "SELECT 1 FROM t" + + +def test_clean_sql_tabs_and_extra_spaces(): + sql = "SELECT\t 1\t\tFROM t" + assert CleanSQL(sql) == "SELECT 1 FROM t" + + +def test_clean_sql_preserves_quoted_strings(): + sql = "SELECT ' spaces ' FROM t" + result = CleanSQL(sql) + assert "' spaces '" 
in result + + +def test_clean_sql_preserves_double_quoted_strings(): + sql = 'SELECT " col name " FROM t' + result = CleanSQL(sql) + assert '" col name "' in result + + +def test_clean_sql_combined(): + sql = """ + SELECT /* get all */ + col1, col2 + FROM table1 -- main table + WHERE col1 = 'hello world' + """ + result = CleanSQL(sql) + assert "/* get all */" not in result + assert "-- main table" not in result + assert "'hello world'" in result + assert "col1, col2" in result + + +# --- concat_columns --- + +def test_concat_columns_multiple(): + result = concat_columns("col1, col2, col3", "NULL") + assert result == "CONCAT(COALESCE(col1, 'NULL'), COALESCE(col2, 'NULL'), COALESCE(col3, 'NULL'))" + + +def test_concat_columns_single(): + assert concat_columns("col1", "NULL") == "col1" + + +def test_concat_columns_empty(): + assert concat_columns("", "NULL") == "" + + +def test_concat_columns_none(): + assert concat_columns(None, "NULL") == "" diff --git a/tests/unit/common/test_date_service.py b/tests/unit/common/test_date_service.py new file mode 100644 index 00000000..3e12e37b --- /dev/null +++ b/tests/unit/common/test_date_service.py @@ -0,0 +1,29 @@ +from datetime import UTC, datetime +from unittest.mock import patch + +import pytest + +from testgen.common.date_service import as_iso_timestamp, get_now_as_iso_timestamp + +pytestmark = pytest.mark.unit + + +@pytest.mark.parametrize( + "value, expected", + [ + (datetime(2024, 3, 15, 10, 30, 45), "2024-03-15T10:30:45Z"), + (datetime(2024, 1, 1, 0, 0, 0), "2024-01-01T00:00:00Z"), + (None, None), + ], +) +def test_as_iso_timestamp(value, expected): + assert as_iso_timestamp(value) == expected + + +def test_get_now_as_iso_timestamp(): + with patch("testgen.common.date_service.datetime") as mock_dt: + mock_dt.now.return_value = datetime(2024, 6, 15, 12, 0, 0, tzinfo=UTC) + mock_dt.strftime = datetime.strftime + result = get_now_as_iso_timestamp() + + assert result == "2024-06-15T12:00:00Z" diff --git 
a/tests/unit/common/test_freshness_scenarios.py b/tests/unit/common/test_freshness_scenarios.py new file mode 100644 index 00000000..86111b1a --- /dev/null +++ b/tests/unit/common/test_freshness_scenarios.py @@ -0,0 +1,474 @@ +"""Freshness monitor scenario tests. + +Pure Python tests that iterate through time series data, calling +compute_freshness_threshold() at each step with growing history, +and asserting expected outcomes at key checkpoints. + +See scripts/test_data/SCENARIOS.md for scenario descriptions. +""" + +import json + +import pandas as pd +import pytest + +from testgen.common.models.test_suite import PredictSensitivity + +from .conftest import ( + ScenarioPoint, + _gen_daily_late_gap_phase, + _gen_daily_late_schedule_phase, + _gen_daily_regular, + _gen_mwf_late, + _gen_mwf_regular, + _gen_subdaily_gap_phase, + _gen_subdaily_gap_schedule_phase, + _gen_subdaily_regular, + _gen_training_only, + _gen_weekly_early, + _run_scenario, +) + + +def _updates(results: list[ScenarioPoint]) -> list[ScenarioPoint]: + """Filter to update points only (value == 0).""" + return [p for p in results if p.value == 0] + + +def _anomalies(results: list[ScenarioPoint]) -> list[ScenarioPoint]: + """Filter to anomaly points only (result_code == 0).""" + return [p for p in results if p.result_code == 0] + + + +def _schedule(point: ScenarioPoint) -> dict | None: + if not point.prediction_json: + return None + data = json.loads(point.prediction_json) + return data if data else None + + +# ─── Scenario 1: Daily Regular ────────────────────────────────────── + + +class Test_DailyRegular: + """Happy path: daily weekday updates at 07:00 UTC, 5 weeks.""" + + @pytest.fixture(scope="class") + def results_excl(self) -> list[ScenarioPoint]: + rows = _gen_daily_regular() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + @pytest.fixture(scope="class") + def results_no_excl(self) -> list[ScenarioPoint]: + rows = _gen_daily_regular() + 
return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=False, tz=None) + + def test_training_exits(self, results_excl: list[ScenarioPoint]) -> None: + """Training should end. First non-training update needs 5 gaps + min_lookback=30 rows.""" + updates = _updates(results_excl) + first_non_training = next((i for i, p in enumerate(updates) if p.upper is not None), None) + assert first_non_training is not None + # 5 weekday updates = 5 gaps, but min_lookback=30 means ~30 rows needed first + # With 12h obs interval and daily updates, training exits around update 10-14 + assert 6 <= first_non_training <= 16 + + def test_zero_anomalies_excl(self, results_excl: list[ScenarioPoint]) -> None: + assert len(_anomalies(results_excl)) == 0 + + def test_zero_anomalies_no_excl(self, results_no_excl: list[ScenarioPoint]) -> None: + assert len(_anomalies(results_no_excl)) == 0 + + def test_thresholds_present_after_training(self, results_excl: list[ScenarioPoint]) -> None: + post_training = [p for p in results_excl if p.upper is not None] + assert len(post_training) > 0 + for p in post_training: + assert p.upper > 0 + + +# ─── Scenario 2a: Daily Late (Gap Phase) ──────────────────────────── + + +class Test_DailyLateGapPhase: + """3-day outage during gap-duration phase (~16 completed gaps).""" + + @pytest.fixture(scope="class") + def results_excl(self) -> list[ScenarioPoint]: + rows = _gen_daily_late_gap_phase() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + @pytest.fixture(scope="class") + def results_no_excl(self) -> list[ScenarioPoint]: + rows = _gen_daily_late_gap_phase() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=False, tz=None) + + def test_schedule_tentative_excl(self, results_excl: list[ScenarioPoint]) -> None: + """At ~16 gaps, schedule should be tentative (not active).""" + outage_start = pd.Timestamp("2025-10-29") + pre_outage = [p for p in results_excl if p.timestamp < 
outage_start and p.prediction_json] + last_sched = _schedule(pre_outage[-1]) if pre_outage else None + if last_sched and last_sched.get("schedule_stage"): + assert last_sched["schedule_stage"] in ("tentative", "training") + + def test_anomaly_detected_during_outage_excl(self, results_excl: list[ScenarioPoint]) -> None: + """Anomaly should be detected during the Wed-Fri outage.""" + outage_start = pd.Timestamp("2025-10-29") + recovery = pd.Timestamp("2025-11-03 07:00") # Mon + outage_anomalies = [p for p in _anomalies(results_excl) if outage_start <= p.timestamp < recovery] + assert len(outage_anomalies) > 0 + + def test_anomaly_detected_during_outage_no_excl(self, results_no_excl: list[ScenarioPoint]) -> None: + """Anomaly should be detected during outage (possibly delayed).""" + outage_start = pd.Timestamp("2025-10-29") + recovery = pd.Timestamp("2025-11-03 19:00") + outage_anomalies = [p for p in _anomalies(results_no_excl) if outage_start <= p.timestamp <= recovery] + assert len(outage_anomalies) > 0 + + def test_recovery_passes_excl(self, results_excl: list[ScenarioPoint]) -> None: + """After recovery on Monday, subsequent updates should pass. + + The first recovery update (Mon 07:00) marks the completion of the + anomalous outage gap, so it legitimately fails. The SECOND update + after recovery should pass. 
+ """ + recovery = pd.Timestamp("2025-11-03 07:00") + post_recovery_updates = [p for p in _updates(results_excl) if p.timestamp >= recovery] + assert len(post_recovery_updates) >= 2 + # First recovery update completes the outage gap β€” expected to fail + assert post_recovery_updates[0].result_code == 0 + # Second and subsequent updates should pass + for p in post_recovery_updates[1:3]: + assert p.result_code == 1, f"Expected pass at {p.timestamp}, got code={p.result_code}" + + +# ─── Scenario 2b: Daily Late (Schedule Phase) ─────────────────────── + + +class Test_DailyLateSchedulePhase: + """3-day outage during schedule inference phase (~26 completed gaps).""" + + @pytest.fixture(scope="class") + def results_excl(self) -> list[ScenarioPoint]: + rows = _gen_daily_late_schedule_phase() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + @pytest.fixture(scope="class") + def results_no_excl(self) -> list[ScenarioPoint]: + rows = _gen_daily_late_schedule_phase() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=False, tz="America/New_York") + + def test_schedule_active_before_outage(self, results_excl: list[ScenarioPoint]) -> None: + """By ~26 gaps, schedule should reach 'active' stage.""" + outage_start = pd.Timestamp("2025-11-12") + pre_outage = [p for p in results_excl if p.timestamp < outage_start and p.prediction_json] + last_sched = _schedule(pre_outage[-1]) if pre_outage else None + assert last_sched is not None + assert last_sched.get("schedule_stage") == "active" + + def test_anomaly_detected_during_outage_excl(self, results_excl: list[ScenarioPoint]) -> None: + outage_start = pd.Timestamp("2025-11-12") + recovery = pd.Timestamp("2025-11-17 07:00") + outage_anomalies = [p for p in _anomalies(results_excl) if outage_start <= p.timestamp < recovery] + assert len(outage_anomalies) > 0 + + def test_anomaly_detected_during_outage_no_excl(self, results_no_excl: list[ScenarioPoint]) -> None: 
+ outage_start = pd.Timestamp("2025-11-12") + recovery = pd.Timestamp("2025-11-17 19:00") + outage_anomalies = [p for p in _anomalies(results_no_excl) if outage_start <= p.timestamp <= recovery] + assert len(outage_anomalies) > 0 + + def test_detection_no_later_than_gap_phase(self, results_excl: list[ScenarioPoint]) -> None: + """Schedule-phase detection should be no later than gap-phase.""" + gap_rows = _gen_daily_late_gap_phase() + gap_results = _run_scenario(gap_rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + # Find first anomaly relative to outage start in each scenario + gap_outage_start = pd.Timestamp("2025-10-29") + sched_outage_start = pd.Timestamp("2025-11-12") + + gap_first = next( + ((p.timestamp - gap_outage_start).total_seconds() for p in _anomalies(gap_results) + if p.timestamp >= gap_outage_start), + None, + ) + sched_first = next( + ((p.timestamp - sched_outage_start).total_seconds() for p in _anomalies(results_excl) + if p.timestamp >= sched_outage_start), + None, + ) + + assert gap_first is not None and sched_first is not None + assert sched_first <= gap_first, ( + f"Schedule-phase detected at +{sched_first/3600:.1f}h but gap-phase at +{gap_first/3600:.1f}h" + ) + + +# ─── Scenario 3: Sub-daily Regular ────────────────────────────────── + + +class Test_SubdailyRegular: + """Sub-daily happy path: updates every 2h from 08:00-18:00 on weekdays.""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_subdaily_regular() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + def test_zero_anomalies(self, results: list[ScenarioPoint]) -> None: + assert len(_anomalies(results)) == 0 + + def test_schedule_active_with_subdaily(self, results: list[ScenarioPoint]) -> None: + """Schedule should reach active with sub_daily frequency.""" + last_with_sched = None + for p in reversed(results): + sched = _schedule(p) + if sched and 
sched.get("schedule_stage"): + last_with_sched = sched + break + assert last_with_sched is not None + assert last_with_sched["schedule_stage"] == "active" + assert last_with_sched["frequency"] == "sub_daily" + + def test_window_set(self, results: list[ScenarioPoint]) -> None: + """Active sub-daily schedule should have a time window.""" + last_with_sched = None + for p in reversed(results): + sched = _schedule(p) + if sched and sched.get("schedule_stage") == "active": + last_with_sched = sched + break + assert last_with_sched is not None + assert last_with_sched.get("window_start") is not None + assert last_with_sched.get("window_end") is not None + + +# ─── Scenario 4a: Sub-daily Gap (Gap Phase) ───────────────────────── + + +class Test_SubdailyGapPhase: + """Within-window gap during gap-duration phase (schedule NOT active).""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_subdaily_gap_phase() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + def test_schedule_not_active(self, results: list[ScenarioPoint]) -> None: + """Schedule should NOT be active at ~16 days of history.""" + gap_date = pd.Timestamp("2025-10-22") + pre_gap = [p for p in results if p.timestamp < gap_date and p.prediction_json] + if pre_gap: + sched = _schedule(pre_gap[-1]) + if sched and sched.get("schedule_stage"): + assert sched["schedule_stage"] != "active" + + def test_anomaly_detected_late(self, results: list[ScenarioPoint]) -> None: + """Without schedule, anomaly triggers late (fallback to upper).""" + gap_start = pd.Timestamp("2025-10-22 10:00") + gap_end = pd.Timestamp("2025-10-23 08:00") + gap_anomalies = [p for p in _anomalies(results) if gap_start <= p.timestamp <= gap_end] + assert len(gap_anomalies) > 0 + + def test_recovery_passes(self, results: list[ScenarioPoint]) -> None: + """Recovery at Thu 10:00 should pass.""" + recovery = pd.Timestamp("2025-10-23 10:00") + post = [p for p in 
_updates(results) if p.timestamp >= recovery] + assert len(post) > 0 + assert post[0].result_code == 1 + + +# ─── Scenario 4b: Sub-daily Gap (Schedule Phase) ──────────────────── + + +class Test_SubdailyGapSchedulePhase: + """Within-window gap during schedule inference phase (schedule active).""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_subdaily_gap_schedule_phase() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York") + + def test_schedule_active(self, results: list[ScenarioPoint]) -> None: + """Schedule should be active by the gap date.""" + gap_date = pd.Timestamp("2025-10-29") + pre_gap = [p for p in results if p.timestamp < gap_date and p.prediction_json] + sched = _schedule(pre_gap[-1]) if pre_gap else None + assert sched is not None + assert sched.get("schedule_stage") == "active" + + def test_anomaly_detected_earlier_than_gap_phase(self, results: list[ScenarioPoint]) -> None: + """Schedule-aware detection should catch the gap earlier than 4a.""" + gap_phase_rows = _gen_subdaily_gap_phase() + gap_phase_results = _run_scenario( + gap_phase_rows, PredictSensitivity.medium, exclude_weekends=True, tz="America/New_York", + ) + + # Time from gap start to first anomaly + gap_4a_start = pd.Timestamp("2025-10-22 10:00") + gap_4b_start = pd.Timestamp("2025-10-29 10:00") + + first_4a = next( + ((p.timestamp - gap_4a_start).total_seconds() for p in _anomalies(gap_phase_results) + if p.timestamp >= gap_4a_start), + None, + ) + first_4b = next( + ((p.timestamp - gap_4b_start).total_seconds() for p in _anomalies(results) + if p.timestamp >= gap_4b_start), + None, + ) + + assert first_4a is not None and first_4b is not None + assert first_4b < first_4a, ( + f"4b detected at +{first_4b/3600:.1f}h but 4a at +{first_4a/3600:.1f}h" + ) + + def test_off_window_suppressed(self, results: list[ScenarioPoint]) -> None: + """Overnight/off-window observations should be suppressed 
(passed) when schedule is active.""" + gap_date = pd.Timestamp("2025-10-29") + # After the gap, overnight obs between 0:00-6:00 should not be anomalies + overnight_after_gap = [ + p for p in results + if p.timestamp.date() == gap_date.date() + and p.timestamp.hour < 6 + and p.value > 0 + and p.upper is not None # post-training + ] + for p in overnight_after_gap: + assert p.result_code != 0, f"Off-window anomaly at {p.timestamp}" + + +# ─── Scenario 5: Weekly Early ─────────────────────────────────────── + + +class Test_WeeklyEarly: + """Weekly Thursday updates, early Tuesday update in week 11.""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_weekly_early() + return _run_scenario(rows, PredictSensitivity.low, exclude_weekends=False, tz=None) + + def test_early_update_detected(self, results: list[ScenarioPoint]) -> None: + """Lower bound should trigger on the early Tuesday update.""" + early_ts = pd.Timestamp("2025-10-21 10:00") + early_point = next((p for p in results if p.timestamp == early_ts), None) + assert early_point is not None + assert early_point.result_code == 0, f"Expected anomaly at early update, got code={early_point.result_code}" + + def test_lower_bound_present(self, results: list[ScenarioPoint]) -> None: + """Lower bound should be non-None by the time of the early update.""" + early_ts = pd.Timestamp("2025-10-21 10:00") + early_point = next((p for p in results if p.timestamp == early_ts), None) + assert early_point is not None + assert early_point.lower is not None + assert early_point.lower > 0 + + +# ─── Scenario 6: Training Only ────────────────────────────────────── + + +class Test_TrainingOnly: + """Insufficient data: only 4 updates (3 completed gaps).""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_training_only() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=False, tz=None) + + def test_all_training(self, results: 
list[ScenarioPoint]) -> None: + """ALL observations should be training (result_code == -1).""" + for p in results: + assert p.result_code == -1, f"Expected training at {p.timestamp}, got code={p.result_code}" + + def test_upper_never_set(self, results: list[ScenarioPoint]) -> None: + """Upper threshold should never be non-None.""" + for p in results: + assert p.upper is None, f"Expected upper=None at {p.timestamp}, got {p.upper}" + + +# ─── Scenario 7: MWF Regular ──────────────────────────────────────── + + +class Test_MWFRegular: + """Mon/Wed/Fri updates at 07:00 UTC, 8 weeks. No anomalies.""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_mwf_regular() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=False, tz="America/New_York") + + def test_zero_anomalies(self, results: list[ScenarioPoint]) -> None: + assert len(_anomalies(results)) == 0 + + def test_schedule_active(self, results: list[ScenarioPoint]) -> None: + last_sched = None + for p in reversed(results): + sched = _schedule(p) + if sched and sched.get("schedule_stage"): + last_sched = sched + break + assert last_sched is not None + assert last_sched["schedule_stage"] == "active" + + def test_active_days_mwf(self, results: list[ScenarioPoint]) -> None: + """Active days should be Mon(0), Wed(2), Fri(4).""" + last_sched = None + for p in reversed(results): + sched = _schedule(p) + if sched and sched.get("schedule_stage") == "active": + last_sched = sched + break + assert last_sched is not None + assert set(last_sched["active_days"]) == {0, 2, 4} + + def test_frequency_irregular(self, results: list[ScenarioPoint]) -> None: + """MWF cadence (median ~48h gap) should classify as 'irregular'.""" + last_sched = None + for p in reversed(results): + sched = _schedule(p) + if sched and sched.get("schedule_stage") == "active": + last_sched = sched + break + assert last_sched is not None + assert last_sched["frequency"] == "irregular" + + +# ─── 
Scenario 8: MWF Late ─────────────────────────────────────────── + + +class Test_MWFLate: + """Mon/Wed/Fri updates, skip Wed+Fri of week 8 (outage).""" + + @pytest.fixture(scope="class") + def results(self) -> list[ScenarioPoint]: + rows = _gen_mwf_late() + return _run_scenario(rows, PredictSensitivity.medium, exclude_weekends=False, tz="America/New_York") + + def test_schedule_active_before_outage(self, results: list[ScenarioPoint]) -> None: + outage_start = pd.Timestamp("2025-11-26") + pre_outage = [p for p in results if p.timestamp < outage_start and p.prediction_json] + sched = _schedule(pre_outage[-1]) if pre_outage else None + assert sched is not None + assert sched.get("schedule_stage") == "active" + + def test_anomaly_on_missed_wed(self, results: list[ScenarioPoint]) -> None: + """Anomaly should be detected around the missed Wed update.""" + missed_wed = pd.Timestamp("2025-11-26") + next_update = pd.Timestamp("2025-12-01 07:00") # Mon recovery + outage_anomalies = [p for p in _anomalies(results) if missed_wed <= p.timestamp < next_update] + assert len(outage_anomalies) > 0 + + def test_recovery_passes(self, results: list[ScenarioPoint]) -> None: + """After recovery on Mon week 9, subsequent updates should pass. + + The first recovery update completes the outage gap and fails. + The second update after recovery should pass. 
+ """ + recovery = pd.Timestamp("2025-12-01 07:00") + post_recovery = [p for p in _updates(results) if p.timestamp >= recovery] + assert len(post_recovery) >= 2 + # First recovery update completes the outage gap β€” expected to fail + assert post_recovery[0].result_code == 0 + # Second update after recovery should pass + assert post_recovery[1].result_code == 1 diff --git a/tests/unit/common/test_freshness_service.py b/tests/unit/common/test_freshness_service.py new file mode 100644 index 00000000..6021e8e2 --- /dev/null +++ b/tests/unit/common/test_freshness_service.py @@ -0,0 +1,1141 @@ +import json +import zoneinfo + +import numpy as np +import pandas as pd + +from testgen.commands.test_thresholds_prediction import compute_freshness_threshold +from testgen.common.freshness_service import ( + MAX_FRESHNESS_GAPS, + InferredSchedule, + add_business_minutes, + classify_frequency, + compute_schedule_confidence, + count_excluded_minutes, + detect_active_days, + detect_update_window, + get_freshness_gap_threshold, + get_schedule_params, + infer_schedule, + is_excluded_day, + minutes_to_next_deadline, +) +from testgen.common.models.test_suite import PredictSensitivity + +from .conftest import _make_freshness_history + +TZ = "America/New_York" + + +def _make_schedule(**kwargs) -> InferredSchedule: + """Build an InferredSchedule with sensible defaults, overridable via kwargs.""" + defaults = { + "frequency": "daily", + "active_days": frozenset(range(5)), + "window_start": 9.0, + "window_end": 13.0, + "confidence": 0.0, + "num_events": 20, + "stage": "active", + } + defaults.update(kwargs) + return InferredSchedule(**defaults) + + +def _utc_timestamps(local_strings: list[str], tz: str = TZ) -> list[pd.Timestamp]: + """Convert local time strings to naive UTC timestamps (as stored in DB).""" + zi = zoneinfo.ZoneInfo(tz) + result = [] + for s in local_strings: + local_ts = pd.Timestamp(s, tz=zi) + utc_ts = local_ts.tz_convert("UTC").tz_localize(None) + result.append(utc_ts) + 
return result + + +# --------------------------------------------------------------------------- +# Sliding Window Tests +# --------------------------------------------------------------------------- + +class Test_SlidingWindow: + def test_outlier_ages_out(self): + # Build history: 1 big outlier gap followed by many normal gaps + updates = ["2026-01-01T00:00"] + # Outlier gap: 72h + updates.append("2026-01-04T00:00") + # Then 50 normal gaps of ~10h each (well beyond MAX_FRESHNESS_GAPS) + for i in range(50): + base = pd.Timestamp("2026-01-04T00:00") + pd.Timedelta(hours=10 * (i + 1)) + updates.append(str(base)) + + history = _make_freshness_history(updates) + + result = get_freshness_gap_threshold( + history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10, + ) + + # Sliding window drops the 72h outlier — threshold should be near 10h (600 min) + assert result.upper < 1000 + + def test_window_size_respected(self): + # Create exactly MAX_FRESHNESS_GAPS + 5 gaps, first 5 are big outliers + updates = ["2026-01-01T00:00"] + for idx in range(5): + # 5 outlier gaps of 100h + updates.append(str(pd.Timestamp("2026-01-01T00:00") + pd.Timedelta(hours=100 * (idx + 1)))) + base = pd.Timestamp(updates[-1]) + for _ in range(MAX_FRESHNESS_GAPS): + base += pd.Timedelta(hours=10) + updates.append(str(base)) + + history = _make_freshness_history(updates) + + result = get_freshness_gap_threshold( + history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=10, + ) + # With window, P95 should be close to 600 min (10h gaps) + # Without window, P95 would be inflated by 100h gaps + assert result.upper < 1000 # Well below the 6000-min outlier gaps + + +# --------------------------------------------------------------------------- +# classify_frequency Tests +# --------------------------------------------------------------------------- + +class Test_ClassifyFrequency: + def test_sub_daily(self): + gaps = np.array([1.0, 2.0, 1.5, 2.5, 1.0]) + assert
classify_frequency(gaps) == "sub_daily" + + def test_daily(self): + gaps = np.array([24.0, 23.0, 25.0, 24.0, 22.0]) + assert classify_frequency(gaps) == "daily" + + def test_weekly(self): + gaps = np.array([168.0, 167.0, 169.0, 168.0]) + assert classify_frequency(gaps) == "weekly" + + def test_irregular_empty(self): + assert classify_frequency(np.array([])) == "irregular" + + def test_irregular_mixed(self): + # Median around 50h — doesn't fit daily or weekly + gaps = np.array([40.0, 50.0, 60.0, 45.0, 55.0]) + assert classify_frequency(gaps) == "irregular" + + def test_boundary_36h_daily_to_irregular(self): + # Median exactly at 36h — boundary of daily band (< 36 → daily) + gaps = np.array([35.0, 36.0, 37.0, 35.5, 36.5]) + assert classify_frequency(gaps) == "irregular" + + def test_boundary_just_under_36h(self): + # Median just under 36h — still daily + gaps = np.array([34.0, 35.0, 35.5, 34.5, 35.0]) + assert classify_frequency(gaps) == "daily" + + def test_every_other_day_48h(self): + # Median ~48h (every other day, e.g.
MWF cadence) → irregular + gaps = np.array([48.0, 47.0, 49.0, 48.0, 47.5]) + assert classify_frequency(gaps) == "irregular" + + def test_boundary_120h(self): + # Median at 120h — still in the irregular band (not weekly) + gaps = np.array([118.0, 120.0, 122.0, 119.0, 121.0]) + assert classify_frequency(gaps) == "irregular" + + def test_boundary_240h_and_above(self): + # Median at 240h — boundary of weekly band (weekly < 240) + gaps = np.array([238.0, 240.0, 242.0, 239.0, 241.0]) + assert classify_frequency(gaps) == "irregular" + + +# --------------------------------------------------------------------------- +# detect_active_days Tests +# --------------------------------------------------------------------------- + +class Test_DetectActiveDays: + def test_weekday_only(self): + # 4 weeks of weekday-only updates (Mon-Fri) + timestamps = [] + for week in range(4): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) # Monday + for day in range(5): # Mon-Fri + ts = base + pd.Timedelta(days=day, hours=10) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + result = detect_active_days(utc_times, TZ) + + assert result is not None + assert result == frozenset({0, 1, 2, 3, 4}) + + def test_mon_wed_fri(self): + # 4 weeks of Mon/Wed/Fri updates + timestamps = [] + for week in range(4): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day_offset in [0, 2, 4]: # Mon, Wed, Fri + ts = base + pd.Timedelta(days=day_offset, hours=10) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + result = detect_active_days(utc_times, TZ) + + assert result is not None + assert 0 in result # Monday + assert 2 in result # Wednesday + assert 4 in result # Friday + + def test_all_days(self): + # 4 weeks of daily updates (7 days/week) + timestamps = [] + for day in range(28): + ts = pd.Timestamp("2026-01-05") +
pd.Timedelta(days=day, hours=10) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + result = detect_active_days(utc_times, TZ) + + assert result is not None + assert len(result) == 7 + + def test_insufficient_data(self): + # Only 2 weeks of data (below min_weeks=3) + timestamps = [] + for day in range(14): + ts = pd.Timestamp("2026-01-05") + pd.Timedelta(days=day, hours=10) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + result = detect_active_days(utc_times, TZ) + + assert result is None + + +# --------------------------------------------------------------------------- +# detect_update_window Tests +# --------------------------------------------------------------------------- + +class Test_DetectUpdateWindow: + def test_morning_cluster(self): + # 15 consecutive daily updates clustered between 10:00 and 12:00 + timestamps = [] + for day in range(15): + hour = 10 + (day % 3) # 10, 11, 12 cycling + ts = pd.Timestamp("2026-01-05") + pd.Timedelta(days=day, hours=hour) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + active_days = frozenset(range(7)) + result = detect_update_window(utc_times, active_days, TZ) + + assert result is not None + window_start, window_end = result + assert 9.0 <= window_start <= 11.0 + assert 11.0 <= window_end <= 13.0 + + def test_insufficient_data(self): + # Only 5 updates — below threshold of 10 + timestamps = [ + pd.Timestamp("2026-01-05T10:00"), + pd.Timestamp("2026-01-06T10:00"), + pd.Timestamp("2026-01-07T10:00"), + pd.Timestamp("2026-01-08T10:00"), + pd.Timestamp("2026-01-09T10:00"), + ] + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + result = detect_update_window(utc_times, frozenset(range(7)), TZ) + assert result is None + + def test_midnight_wrap(self): + # Updates around midnight (23:00-01:00) +
timestamps = [] + for day in range(15): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(days=day) + if day % 3 == 0: + ts = base + pd.Timedelta(hours=23) + elif day % 3 == 1: + ts = base + pd.Timedelta(hours=23, minutes=30) + else: + ts = base + pd.Timedelta(hours=0, minutes=30) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + result = detect_update_window(utc_times, frozenset(range(7)), TZ) + + assert result is not None + # Window should wrap around midnight + + + +# --------------------------------------------------------------------------- +# compute_schedule_confidence Tests +# --------------------------------------------------------------------------- + +class Test_ComputeScheduleConfidence: + def test_high_confidence(self): + # All updates on weekdays 10-12 AM + schedule = _make_schedule() + timestamps = [] + for day in range(20): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(days=day) + if base.weekday() < 5: + ts = base + pd.Timedelta(hours=10 + (day % 3)) + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + confidence = compute_schedule_confidence(utc_times, schedule, TZ) + assert confidence >= 0.7 + + def test_low_confidence(self): + # Updates scattered across all hours and days + schedule = _make_schedule() + timestamps = [] + for i in range(20): + ts = pd.Timestamp("2026-01-05") + pd.Timedelta(hours=i * 17) # irregular spacing + timestamps.append(ts) + + utc_times = [ts.tz_localize(TZ).tz_convert("UTC").tz_localize(None) for ts in timestamps] + confidence = compute_schedule_confidence(utc_times, schedule, TZ) + assert confidence < 0.7 + + +# --------------------------------------------------------------------------- +# infer_schedule Tests +# --------------------------------------------------------------------------- + +class Test_InferSchedule: + def test_daily_weekday_pattern(self): + # 4 weeks of weekday 
updates at ~10 AM ET + updates = [] + for week in range(4): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10, minutes=(day * 10) % 60) + updates.append(str(ts)) + + # Convert timestamps to UTC (history uses naive timestamps treated as UTC) + zi = zoneinfo.ZoneInfo(TZ) + utc_updates = [] + for s in updates: + local_ts = pd.Timestamp(s, tz=zi) + utc_ts = local_ts.tz_convert("UTC").tz_localize(None) + utc_updates.append(str(utc_ts)) + history = _make_freshness_history(utc_updates, check_interval_minutes=120) + + schedule = infer_schedule(history, TZ) + + assert schedule is not None + assert schedule.frequency == "daily" + assert schedule.num_events == 20 + + def test_insufficient_data_returns_none(self): + # Only 5 updates + updates = [f"2026-02-{d:02d}T10:00" for d in range(1, 6)] + history = _make_freshness_history(updates) + result = infer_schedule(history, TZ) + assert result is None + + def test_mon_wed_fri_pattern(self): + """MWF updates at ~10 AM ET for 7 weeks (21 events) → stage should be 'active', not forced to 'irregular'.""" + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(7): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) # Monday + for day_offset in [0, 2, 4]: # Mon, Wed, Fri + ts = base + pd.Timedelta(days=day_offset, hours=10, minutes=(day_offset * 7) % 30) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + schedule = infer_schedule(history, TZ) + + assert schedule is not None + assert schedule.frequency == "irregular" # median gap ~48h falls in irregular band + assert schedule.stage == "active" # confidence-based, NOT forced to "irregular" + assert schedule.num_events == 21 + assert 0 in schedule.active_days # Monday + assert 2 in
schedule.active_days # Wednesday + assert 4 in schedule.active_days # Friday + + def test_tue_thu_pattern(self): + """Tue/Thu updates at ~10 AM ET for 10 weeks (20 events) β†’ stage should be 'active'.""" + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(10): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) # Monday + for day_offset in [1, 3]: # Tue, Thu + ts = base + pd.Timedelta(days=day_offset, hours=10, minutes=(day_offset * 5) % 20) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + schedule = infer_schedule(history, TZ) + + assert schedule is not None + assert schedule.frequency == "irregular" # median gap ~72-84h + assert schedule.stage == "active" # high confidence from consistent day+time pattern + assert 1 in schedule.active_days # Tuesday + assert 3 in schedule.active_days # Thursday + + def test_sub_daily_pattern(self): + # 4 weeks of hourly updates during business hours + updates = [] + for week in range(4): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + for hour in range(9, 17): + ts = base + pd.Timedelta(days=day, hours=hour) + updates.append(str(ts)) + + zi = zoneinfo.ZoneInfo(TZ) + utc_updates = [] + for s in updates: + local_ts = pd.Timestamp(s, tz=zi) + utc_ts = local_ts.tz_convert("UTC").tz_localize(None) + utc_updates.append(str(utc_ts)) + history = _make_freshness_history(utc_updates, check_interval_minutes=30) + + schedule = infer_schedule(history, TZ) + + assert schedule is not None + assert schedule.frequency == "sub_daily" + assert schedule.num_events > 50 + + +# --------------------------------------------------------------------------- +# compute_freshness_threshold with schedule inference Tests +# --------------------------------------------------------------------------- + +class Test_ComputeFreshnessThresholdWithSchedule: + def 
test_returns_prediction_json_with_tz(self): + """When tz is provided and enough data exists, prediction JSON should contain schedule info.""" + # 4 weeks of daily weekday updates + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(4): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + lower, upper, staleness, prediction_json = compute_freshness_threshold( + history, PredictSensitivity.medium, schedule_tz=TZ, + ) + + assert upper is not None + assert prediction_json is not None + data = json.loads(prediction_json) + if "schedule_stage" in data: + assert data["schedule_stage"] in {"training", "tentative", "active", "irregular"} + assert "frequency" in data + assert "confidence" in data + # staleness is non-None only when schedule is active + if data.get("schedule_stage") == "active": + assert staleness is not None + + def test_schedule_overrides_threshold_when_active(self): + """When schedule inference reaches 'active' stage, staleness and upper should be set.""" + zi = zoneinfo.ZoneInfo(TZ) + # 5 weeks of daily weekday updates at 10 AM ET β€” 25 events, highly regular + updates = [] + for week in range(5): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + + + + lower, upper, staleness, prediction_json = compute_freshness_threshold( + history, PredictSensitivity.medium, schedule_tz=TZ, + ) + + assert upper is not None + assert upper > 0 + assert prediction_json is not None + assert staleness is not None # Active schedule β†’ staleness returned + + data = 
json.loads(prediction_json) + assert data["schedule_stage"] == "active" + assert "active_days" in data + + def test_prediction_json_includes_sensitivity_metadata(self): + """Prediction JSON should include sensitivity-related fields.""" + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(5): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + + + _, _, _, prediction_json = compute_freshness_threshold( + history, PredictSensitivity.high, schedule_tz=TZ, + ) + + assert prediction_json is not None + data = json.loads(prediction_json) + assert data["sensitivity"] == "high" + assert data["deadline_buffer_hours"] == 1.5 + + def test_high_sensitivity_tighter_than_low_end_to_end(self): + """Via compute_freshness_threshold: high sensitivity yields a tighter upper than low.""" + zi = zoneinfo.ZoneInfo(TZ) + # 5 weeks of daily weekday updates at 10 AM ET β€” reaches active schedule + updates = [] + for week in range(5): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + + + _, upper_high, _, json_high = compute_freshness_threshold( + history, PredictSensitivity.high, schedule_tz=TZ, + ) + _, upper_low, _, json_low = compute_freshness_threshold( + history, PredictSensitivity.low, schedule_tz=TZ, + ) + + assert upper_high is not None and upper_low is not None + assert json_high is not None and json_low is not None + + data_high = json.loads(json_high) + data_low = json.loads(json_low) + assert data_high["schedule_stage"] == "active" + assert data_low["schedule_stage"] 
== "active" + assert upper_high < upper_low + + def test_no_schedule_without_tz(self): + """Without tz, schedule inference is skipped and staleness is None.""" + updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]] + history = _make_freshness_history(updates) + _, _, staleness, prediction = compute_freshness_threshold(history, PredictSensitivity.medium) + assert staleness is None # No tz → no active schedule → staleness is None + assert prediction is not None + data = json.loads(prediction) + assert "schedule_stage" not in data # No schedule inference without tz + + def test_staleness_returned_with_active_schedule(self): + """When schedule inference reaches active stage, staleness is returned as 4th element.""" + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(5): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + + _, _, staleness, prediction_json = compute_freshness_threshold( + history, PredictSensitivity.medium, schedule_tz=TZ, + ) + + assert staleness is not None + assert staleness > 0 + assert prediction_json is not None + data = json.loads(prediction_json) + assert data["schedule_stage"] == "active" + + def test_excluded_days_in_prediction_json_when_active(self): + """When schedule reaches active with a weekday-only pattern, prediction JSON reports active_days=[0-4], i.e. weekends excluded.""" + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(5): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120)
+ + + _, _, _, prediction_json = compute_freshness_threshold( + history, PredictSensitivity.medium, schedule_tz=TZ, + ) + + assert prediction_json is not None + data = json.loads(prediction_json) + assert data["schedule_stage"] == "active" + assert "active_days" in data + assert sorted(data["active_days"]) == [0, 1, 2, 3, 4] + + def test_staleness_recomputed_with_excluded_days(self): + """Active schedule with weekday-only pattern: staleness is returned with tz, None without.""" + zi = zoneinfo.ZoneInfo(TZ) + updates = [] + for week in range(5): + base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week) + for day in range(5): + ts = base + pd.Timedelta(days=day, hours=10) + utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None) + updates.append(str(utc)) + + history = _make_freshness_history(updates, check_interval_minutes=120) + + + # With tz (triggers schedule inference + excluded_days recomputation) + _, _, staleness_with_tz, pred_json_with_tz = compute_freshness_threshold( + history, PredictSensitivity.medium, schedule_tz=TZ, + ) + # Without tz (no schedule inference, no staleness) + _, _, staleness_no_tz, pred_json_no_tz = compute_freshness_threshold( + history, PredictSensitivity.medium, + ) + + assert pred_json_with_tz is not None and pred_json_no_tz is not None + data_with = json.loads(pred_json_with_tz) + + # With active schedule, staleness is returned + assert data_with["schedule_stage"] == "active" + assert staleness_with_tz is not None + assert staleness_with_tz > 0 + + # Without tz, staleness is None (no active schedule) + assert staleness_no_tz is None + + def test_staleness_catches_daily_miss_that_upper_misses(self): + """Staleness threshold detects a missed daily update at gap=1440 min where upper doesn't.""" + # Daily weekday pattern: all gaps ~1440 min (24h) + updates = [f"2026-02-{d:02d}T08:00" for d in range(2, 9) if pd.Timestamp(f"2026-02-{d:02d}").weekday() < 5] + # Ensure we have enough gaps + while len(updates) < 8: + 
updates.append(f"2026-02-{9 + len(updates) - 7:02d}T08:00") + history = _make_freshness_history(updates) + + from testgen.commands.test_thresholds_prediction import FRESHNESS_THRESHOLD_MAP, STALENESS_FACTOR_MAP + upper_pct, floor_mult, lower_pct = FRESHNESS_THRESHOLD_MAP[PredictSensitivity.medium] + staleness_factor = STALENESS_FACTOR_MAP[PredictSensitivity.medium] + + result = get_freshness_gap_threshold( + history, upper_percentile=upper_pct, floor_multiplier=floor_mult, lower_percentile=lower_pct, + staleness_factor=staleness_factor, + ) + + # The typical gap is ~1440 min. After a missed update, the observed gap grows past 1440. + # Upper (P95 with floor) should be >= 1440 — so upper alone wouldn't catch it + assert result.upper >= 1440 + # Staleness (median x 0.85) should be < 1440 — catches the miss + assert result.staleness < 1440 + + +# --------------------------------------------------------------------------- +# is_excluded_day with excluded_days Tests +# --------------------------------------------------------------------------- + +class Test_IsExcludedDayWithExcludedDays: + def test_monday_excluded(self): + """excluded_days={0} should exclude Monday.""" + monday = pd.Timestamp("2026-02-09") # Monday + assert is_excluded_day(monday, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({0})) + + def test_weekend_via_excluded_days(self): + """excluded_days={5,6} without exclude_weekends=True still excludes weekends.""" + saturday = pd.Timestamp("2026-02-07") # Saturday + sunday = pd.Timestamp("2026-02-08") # Sunday + assert is_excluded_day(saturday, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({5, 6})) + assert is_excluded_day(sunday, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({5, 6})) + + def test_both_exclude_weekends_and_excluded_days(self): + """Both flags combined: Mon+Sat+Sun excluded.""" + monday = pd.Timestamp("2026-02-09") + saturday = pd.Timestamp("2026-02-07") + tuesday =
pd.Timestamp("2026-02-10") + assert is_excluded_day(monday, exclude_weekends=True, holiday_dates=None, excluded_days=frozenset({0})) + assert is_excluded_day(saturday, exclude_weekends=True, holiday_dates=None, excluded_days=frozenset({0})) + assert not is_excluded_day(tuesday, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({0})) + + def test_weekday_not_excluded(self): + """Wednesday not in excluded_days={5,6} should not be excluded.""" + wednesday = pd.Timestamp("2026-02-11") # Wednesday + assert not is_excluded_day(wednesday, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({5, 6})) + + def test_with_timezone(self): + """excluded_days with timezone conversion.""" + # 2026-02-07 Saturday 23:00 ET = 2026-02-08 Sunday 04:00 UTC + # In ET this is Saturday (weekday=5), so excluded_days={5} should match + saturday_utc = pd.Timestamp("2026-02-08T04:00") + assert is_excluded_day(saturday_utc, exclude_weekends=False, holiday_dates=None, tz=TZ, excluded_days=frozenset({5})) + + +# --------------------------------------------------------------------------- +# count_excluded_minutes with excluded_days Tests +# --------------------------------------------------------------------------- + +class Test_CountExcludedMinutesWithExcludedDays: + def test_wednesday_excluded(self): + """excluded_days={2} should subtract Wednesday minutes only.""" + # Tue 2026-02-10 00:00 β†’ Thu 2026-02-12 00:00 (includes a full Wednesday) + start = pd.Timestamp("2026-02-10T00:00") + end = pd.Timestamp("2026-02-12T00:00") + result = count_excluded_minutes(start, end, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({2})) + # Full Wednesday = 24h = 1440 min + assert result == 1440.0 + + def test_no_excluded_days_returns_zero(self): + """excluded_days=None should return 0.""" + start = pd.Timestamp("2026-02-10T00:00") + end = pd.Timestamp("2026-02-12T00:00") + result = count_excluded_minutes(start, end, exclude_weekends=False, 
holiday_dates=None, excluded_days=None) + assert result == 0.0 + + +# --------------------------------------------------------------------------- +# get_schedule_params Tests +# --------------------------------------------------------------------------- + +class Test_GetScheduleParams: + def test_returns_empty_for_none(self): + result = get_schedule_params(None) + assert result.excluded_days is None + assert result.window_start is None + assert result.window_end is None + + def test_returns_empty_for_empty_string(self): + result = get_schedule_params("") + assert result.excluded_days is None + + def test_returns_none_excluded_days_when_no_active_days(self): + result = get_schedule_params({"schedule_stage": "active"}) + assert result.excluded_days is None + + def test_inverts_active_days_to_excluded_days(self): + pred = {"active_days": [0, 1, 2, 3, 4], "schedule_stage": "active"} + result = get_schedule_params(pred) + assert result.excluded_days == frozenset({5, 6}) + + def test_inverts_active_days_from_json_string(self): + pred = json.dumps({"active_days": [0, 1, 2, 3, 4], "schedule_stage": "active"}) + result = get_schedule_params(pred) + assert result.excluded_days == frozenset({5, 6}) + + def test_all_days_active_returns_empty_excluded(self): + pred = {"active_days": [0, 1, 2, 3, 4, 5, 6], "schedule_stage": "active"} + result = get_schedule_params(pred) + assert not result.excluded_days + + def test_returns_window_for_sub_daily_active(self): + pred = {"frequency": "sub_daily", "schedule_stage": "active", "window_start": 9.0, "window_end": 17.0} + result = get_schedule_params(pred) + assert result.window_start == 9.0 + assert result.window_end == 17.0 + + def test_no_window_for_daily(self): + pred = {"frequency": "daily", "schedule_stage": "active", "window_start": 9.0, "window_end": 17.0} + result = get_schedule_params(pred) + assert result.window_start is None + assert result.window_end is None + + def test_no_exclusions_for_tentative(self): + pred = { + 
"active_days": [0, 2, 4], + "frequency": "sub_daily", + "schedule_stage": "tentative", + "window_start": 9.0, + "window_end": 17.0, + } + result = get_schedule_params(pred) + assert result.excluded_days is None + assert result.window_start is None + assert result.window_end is None + + def test_no_window_when_missing(self): + pred = {"frequency": "sub_daily", "schedule_stage": "active"} + result = get_schedule_params(pred) + assert result.window_start is None + + def test_combined_days_and_window(self): + pred = { + "active_days": [0, 1, 2, 3, 4], + "frequency": "sub_daily", + "schedule_stage": "active", + "window_start": 8.0, + "window_end": 18.0, + } + result = get_schedule_params(pred) + assert result.excluded_days == frozenset({5, 6}) + assert result.window_start == 8.0 + assert result.window_end == 18.0 + + +# --------------------------------------------------------------------------- +# add_business_minutes with excluded_days Tests +# --------------------------------------------------------------------------- + +class Test_AddBusinessMinutesWithExcludedDays: + def test_skips_excluded_day(self): + """excluded_days={2} (Wednesday) should skip Wednesday entirely.""" + # Tuesday 22:00, add 180 min β†’ skip Wed, land on Thursday 01:00 + start = pd.Timestamp("2026-02-10T22:00") # Tuesday + result = add_business_minutes(start, 180, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({2})) + assert result == pd.Timestamp("2026-02-12T01:00") # Thursday + + def test_starts_on_excluded_day(self): + """Starting on an excluded day should fast-forward past it.""" + start = pd.Timestamp("2026-02-11T10:00") # Wednesday + result = add_business_minutes(start, 60, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({2})) + assert result == pd.Timestamp("2026-02-12T01:00") # Thursday 01:00 + + def test_consecutive_excluded_days(self): + """Multiple consecutive excluded days are all skipped.""" + # Exclude Wed+Thu+Fri (2,3,4), start Tuesday 23:00, 
add 120 min + start = pd.Timestamp("2026-02-10T23:00") # Tuesday + result = add_business_minutes(start, 120, exclude_weekends=False, holiday_dates=None, excluded_days=frozenset({2, 3, 4})) + # 1h left on Tuesday (23:00β†’midnight), skip Wed/Thu/Fri, 1h into Saturday + assert result == pd.Timestamp("2026-02-14T01:00") # Saturday + + def test_excluded_days_combined_with_weekends(self): + """excluded_days + exclude_weekends together: skip excluded + Sat+Sun.""" + # Exclude Friday (4), start Thursday 23:00, add 120 min + start = pd.Timestamp("2026-02-05T23:00") # Thursday + result = add_business_minutes(start, 120, exclude_weekends=True, holiday_dates=None, excluded_days=frozenset({4})) + # 1h left on Thursday, skip Fri+Sat+Sun, 1h into Monday + assert result == pd.Timestamp("2026-02-09T01:00") # Monday + + def test_inverse_property_with_excluded_days(self): + """add_business_minutes and count_excluded_minutes should be inverses.""" + start = pd.Timestamp("2026-02-06T14:00") # Friday + excluded_days = frozenset({5, 6}) # weekend via excluded_days + business_minutes = 600.0 + end = add_business_minutes(start, business_minutes, exclude_weekends=False, holiday_dates=None, excluded_days=excluded_days) + wall_minutes = (end - start).total_seconds() / 60 + excluded = count_excluded_minutes(start, end, exclude_weekends=False, holiday_dates=None, excluded_days=excluded_days) + assert wall_minutes - excluded == business_minutes + + def test_with_timezone(self): + """excluded_days with timezone: day boundaries use local time.""" + # UTC Friday 23:00 = ET Friday 6PM β†’ Friday is not excluded + # excluded_days={5} = Saturday only + start = pd.Timestamp("2026-02-06T23:00") # UTC Friday 11PM = ET Friday 6PM + result = add_business_minutes(start, 120, exclude_weekends=False, holiday_dates=None, tz=TZ, excluded_days=frozenset({5})) + # ET: Fri 6PM + 2h = Fri 8PM ET. Skip Saturday. 
+        # So 2h consumed on Friday → Fri 8PM ET = Sat 01:00 UTC
+        assert result == pd.Timestamp("2026-02-07T01:00")
+
+
+# ---------------------------------------------------------------------------
+# minutes_to_next_deadline Tests
+# ---------------------------------------------------------------------------
+
+class Test_MinutesToNextDeadline:
+    def test_basic_daily(self):
+        """Mon 11AM → Tue deadline (window_end 13 + buffer 3 = 16h) = 29h = 1740 min."""
+        schedule = _make_schedule()
+        zi = zoneinfo.ZoneInfo(TZ)
+        last_update = pd.Timestamp("2026-02-09T11:00", tz=zi).tz_convert("UTC").tz_localize(None)
+        result = minutes_to_next_deadline(last_update, schedule, exclude_weekends=False, holiday_dates=None, tz=TZ, buffer_hours=3.0)
+        assert result is not None
+        assert 1700 <= result <= 1800
+
+    def test_friday_crosses_weekend(self):
+        """Friday last update → next active day is Monday, weekend excluded."""
+        schedule = _make_schedule()  # active_days = Mon-Fri
+        zi = zoneinfo.ZoneInfo(TZ)
+        last_update = pd.Timestamp("2026-02-06T11:00", tz=zi).tz_convert("UTC").tz_localize(None)
+        result = minutes_to_next_deadline(
+            last_update, schedule,
+            exclude_weekends=True, holiday_dates=None, tz=TZ, buffer_hours=3.0,
+        )
+        assert result is not None
+        # Fri 11AM → Mon 4PM = 77h wall = 4620 min, minus 2880 weekend = 1740 business min
+        assert 1700 <= result <= 1800
+
+    def test_friday_crosses_weekend_plus_holiday(self):
+        """Friday → Monday is holiday → deadline still targets Monday (next active day),
+        but most of Monday's minutes are subtracted as excluded holiday time."""
+        from datetime import date
+        schedule = _make_schedule()
+        zi = zoneinfo.ZoneInfo(TZ)
+        last_update = pd.Timestamp("2026-02-06T11:00", tz=zi).tz_convert("UTC").tz_localize(None)
+        holiday_dates = {date(2026, 2, 9)}  # Monday
+        result = minutes_to_next_deadline(
+            last_update, schedule,
+            exclude_weekends=True, holiday_dates=holiday_dates, tz=TZ, buffer_hours=3.0,
+        )
+        assert result is not None
+        # Fri 11AM → Mon 4PM (deadline) = 77h wall = 4620 min
+        # Minus Sat+Sun (2880) + Mon midnight-to-4PM holiday (960) = 780 business min
+        assert 750 <= result <= 810
+
+    def test_no_window_end_returns_none(self):
+        schedule = _make_schedule(window_end=None)
+        last_update = pd.Timestamp("2026-02-09T11:00")
+        result = minutes_to_next_deadline(last_update, schedule, exclude_weekends=False, holiday_dates=None, tz=TZ, buffer_hours=3.0)
+        assert result is None
+
+    def test_buffer_hours_affects_result(self):
+        """Larger buffer → later deadline → more minutes."""
+        schedule = _make_schedule()
+        zi = zoneinfo.ZoneInfo(TZ)
+        last_update = pd.Timestamp("2026-02-09T11:00", tz=zi).tz_convert("UTC").tz_localize(None)
+        small = minutes_to_next_deadline(last_update, schedule, exclude_weekends=False, holiday_dates=None, tz=TZ, buffer_hours=1.0)
+        large = minutes_to_next_deadline(last_update, schedule, exclude_weekends=False, holiday_dates=None, tz=TZ, buffer_hours=5.0)
+        assert small is not None and large is not None
+        assert small < large
+
+    def test_with_excluded_days(self):
+        """excluded_days={5,6} should subtract weekend minutes like exclude_weekends."""
+        schedule = _make_schedule()
+        zi = zoneinfo.ZoneInfo(TZ)
+        last_update = pd.Timestamp("2026-02-06T11:00", tz=zi).tz_convert("UTC").tz_localize(None)
+        result = minutes_to_next_deadline(
+            last_update, schedule,
+            exclude_weekends=False, holiday_dates=None, tz=TZ, buffer_hours=3.0,
+            excluded_days=frozenset({5, 6}),
+        )
+        assert result is not None
+        assert 1700 <= result <= 1800
+
+
+# ---------------------------------------------------------------------------
+# is_excluded_day with window_start/window_end Tests
+# ---------------------------------------------------------------------------
+
+class Test_IsExcludedDayWithWindow:
+    """Test active-hours exclusion for sub-daily schedules."""
+
+    def test_inside_window_not_excluded(self):
+        """10:00 ET inside [4, 14] window → not excluded."""
+        # 10:00 ET = 15:00 UTC (EST = UTC-5)
+        ts = pd.Timestamp("2026-02-09T15:00")  # Monday
+        assert not is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+
+    def test_outside_window_excluded(self):
+        """20:00 ET outside [4, 14] window → excluded."""
+        # 20:00 ET = 01:00+1 UTC
+        ts = pd.Timestamp("2026-02-10T01:00")  # Tuesday 01:00 UTC = Monday 20:00 ET
+        assert is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+
+    def test_at_window_start_not_excluded(self):
+        """Exactly at window_start → inside window → not excluded."""
+        # 4:00 ET = 9:00 UTC
+        ts = pd.Timestamp("2026-02-09T09:00")  # Monday
+        assert not is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+
+    def test_at_window_end_not_excluded(self):
+        """Exactly at window_end → inside window → not excluded."""
+        # 14:00 ET = 19:00 UTC
+        ts = pd.Timestamp("2026-02-09T19:00")  # Monday
+        assert not is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+
+    def test_weekend_still_excluded_with_window(self):
+        """Weekend exclusion takes priority over window check."""
+        # Saturday 10:00 ET (inside window) but weekend
+        ts = pd.Timestamp("2026-02-07T15:00")  # Saturday 10:00 ET
+        assert is_excluded_day(
+            ts, exclude_weekends=True, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+
+    def test_excluded_day_takes_priority(self):
+        """excluded_days exclusion takes priority over window check."""
+        # Monday 10:00 ET (inside window) but Monday excluded
+        ts = pd.Timestamp("2026-02-09T15:00")  # Monday 10:00 ET
+        assert is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, excluded_days=frozenset({0}), window_start=4.0, window_end=14.0,
+        )
+
+    def test_no_window_no_effect(self):
+        """Without window params, daytime on active day → not excluded."""
+        ts = pd.Timestamp("2026-02-10T01:00")  # Monday 20:00 ET
+        assert not is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None, tz=TZ,
+        )
+
+    def test_window_without_tz_uses_raw_hour(self):
+        """Window params without tz still check the raw timestamp's hour."""
+        ts = pd.Timestamp("2026-02-09T20:00")
+        # 20:00 is outside [4, 14] → excluded
+        assert is_excluded_day(
+            ts, exclude_weekends=False, holiday_dates=None,
+            window_start=4.0, window_end=14.0,
+        )
+        # 10:00 is inside [4, 14] → not excluded
+        ts2 = pd.Timestamp("2026-02-09T10:00")
+        assert not is_excluded_day(
+            ts2, exclude_weekends=False, holiday_dates=None,
+            window_start=4.0, window_end=14.0,
+        )
+
+
+# ---------------------------------------------------------------------------
+# count_excluded_minutes with window_start/window_end Tests
+# ---------------------------------------------------------------------------
+
+class Test_CountExcludedMinutesWithWindow:
+    """Test active-hours exclusion in gap duration computation."""
+
+    def test_overnight_gap_excludes_outside_window(self):
+        """Gap from 14:00 ET to 04:00 ET next day: all 14h overnight are outside [4, 14]."""
+        # 14:00 ET = 19:00 UTC, 04:00 ET next day = 09:00 UTC next day
+        start = pd.Timestamp("2026-02-09T19:00")  # Monday 14:00 ET
+        end = pd.Timestamp("2026-02-10T09:00")  # Tuesday 04:00 ET
+        result = count_excluded_minutes(
+            start, end, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+        # Total gap = 14h = 840 min
+        # Monday 14:00-midnight = 10h, outside window = 10h (14:00 is window_end, so 14:00-midnight)
+        # Tuesday midnight-04:00 = 4h, outside window = 4h (before window_start)
+        # Total excluded = 14h = 840 min (the entire overnight gap is outside the window)
+        assert result == 840.0
+
+    def test_within_window_gap_excludes_nothing(self):
+        """Gap entirely within [4, 14] window → 0 excluded minutes."""
+        # 06:00 ET = 11:00 UTC, 08:00 ET = 13:00 UTC
+        start = pd.Timestamp("2026-02-09T11:00")  # Monday 06:00 ET
+        end = pd.Timestamp("2026-02-09T13:00")  # Monday 08:00 ET
+        result = count_excluded_minutes(
+            start, end, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+        assert result == 0.0
+
+    def test_gap_spanning_weekend_and_outside_window(self):
+        """Gap from Friday 14:00 ET to Monday 04:00 ET: weekend + outside-window hours."""
+        # Fri 14:00 ET = 19:00 UTC, Mon 04:00 ET = 09:00 UTC
+        start = pd.Timestamp("2026-02-06T19:00")  # Friday 14:00 ET
+        end = pd.Timestamp("2026-02-09T09:00")  # Monday 04:00 ET
+        result = count_excluded_minutes(
+            start, end, exclude_weekends=True, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+        # Total wall = 62h = 3720 min
+        # Friday 14:00 ET → midnight = 10h outside window
+        # Saturday (full day excluded as weekend) = 1440 min
+        # Sunday (full day excluded as weekend) = 1440 min
+        # Monday midnight → 04:00 = 4h outside window (before window_start)
+        # Total = 600 + 1440 + 1440 + 240 = 3720 min (everything excluded)
+        assert result == 3720.0
+
+    def test_partial_day_with_window(self):
+        """Gap starting inside window, ending outside → only outside portion excluded."""
+        # 12:00 ET = 17:00 UTC, 16:00 ET = 21:00 UTC (same day)
+        start = pd.Timestamp("2026-02-09T17:00")  # Monday 12:00 ET
+        end = pd.Timestamp("2026-02-09T21:00")  # Monday 16:00 ET
+        result = count_excluded_minutes(
+            start, end, exclude_weekends=False, holiday_dates=None,
+            tz=TZ, window_start=4.0, window_end=14.0,
+        )
+        # Total gap = 4h = 240 min
+        # 12:00-14:00 ET = inside window = 0 excluded
+        # 14:00-16:00 ET = outside window = 120 min excluded
+        assert result == 120.0
+
+    def test_no_window_backward_compat(self):
+        """Without window params, behaves exactly as before."""
+        start = pd.Timestamp("2026-02-09T19:00")  # Monday
+        end = pd.Timestamp("2026-02-10T09:00")  # Tuesday
+        result = count_excluded_minutes(
+            start, end, exclude_weekends=False, holiday_dates=None, tz=TZ,
+        )
+        assert result == 0.0
+
+
+# ---------------------------------------------------------------------------
+# get_freshness_gap_threshold with window_start/window_end Tests
+# ---------------------------------------------------------------------------
+
+class Test_GetFreshnessGapThresholdWithWindow:
+    """Test that window exclusion in gap computation normalizes overnight gaps to 0."""
+
+    def _make_sub_daily_history(self):
+        """Build 4 weeks of 2-hourly updates, 04:00-14:00 ET (09:00-19:00 UTC) on weekdays.
+
+        This matches the subdaily_regular scenario: updates every 2h during
+        business hours, with overnight and weekend gaps.
+        """
+        zi = zoneinfo.ZoneInfo(TZ)
+        updates = []
+        for week in range(4):
+            base = pd.Timestamp("2026-01-05") + pd.Timedelta(weeks=week)  # Monday
+            for day in range(5):  # Mon-Fri
+                for hour in range(4, 15, 2):  # 04:00-14:00 ET = 09:00-19:00 UTC
+                    ts = base + pd.Timedelta(days=day, hours=hour)
+                    utc = pd.Timestamp(ts, tz=zi).tz_convert("UTC").tz_localize(None)
+                    updates.append(str(utc))
+        return _make_freshness_history(updates, check_interval_minutes=30)
+
+    def test_overnight_gaps_become_zero(self):
+        """With window exclusion, overnight gaps are normalized to 0 business minutes."""
+        history = self._make_sub_daily_history()
+        excluded_days = frozenset({5, 6})
+
+        # Without window exclusion: overnight gaps (~840 min) inflate the distribution
+        result_no_window = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10,
+            exclude_weekends=True, tz=TZ, excluded_days=excluded_days,
+        )
+
+        # With window exclusion: overnight gaps become 0
+        result_with_window = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10,
+            exclude_weekends=True, tz=TZ, excluded_days=excluded_days,
+            window_start=4.0, window_end=14.0,
+        )
+
+        # The upper should be much tighter with window exclusion
+        assert result_with_window.upper < result_no_window.upper
+        # Upper should be around 150 (1.25 * 120) not 1050 (1.25 * 840)
+        assert result_with_window.upper <= 200
+
+    def test_within_window_gaps_unchanged(self):
+        """Gaps entirely within the activity window are not affected by window exclusion."""
+        history = self._make_sub_daily_history()
+        excluded_days = frozenset({5, 6})
+
+        result = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10,
+            exclude_weekends=True, tz=TZ, excluded_days=excluded_days,
+            window_start=4.0, window_end=14.0,
+        )
+
+        # Within-window gaps are 120 min (2h), so median should be 120
+        # (overnight 0s pull median down, but most gaps are 120)
+        assert 100 <= result.staleness / 0.85 <= 130  # median ~120
+
+    def test_lower_disabled_when_overnight_zeros(self):
+        """With overnight gaps normalized to 0, P10 is 0 → lower set to None."""
+        history = self._make_sub_daily_history()
+        excluded_days = frozenset({5, 6})
+
+        result = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.25,
+            lower_percentile=10,
+            exclude_weekends=True, tz=TZ, excluded_days=excluded_days,
+            window_start=4.0, window_end=14.0,
+        )
+
+        # P10 should be 0 (or very close) → lower should be None
+        assert result.lower is None
diff --git a/tests/unit/test_read_file.py b/tests/unit/common/test_read_file.py
similarity index 83%
rename from tests/unit/test_read_file.py
rename to tests/unit/common/test_read_file.py
index cd7ac008..d6b62ba5 100644
--- a/tests/unit/test_read_file.py
+++ b/tests/unit/common/test_read_file.py
@@ -4,6 +4,8 @@
 
 from testgen.common.read_file import replace_templated_functions
 
+pytestmark = pytest.mark.unit
+
 
 @pytest.fixture
 def query():
@@ -14,7 +16,6 @@ def query():
     """)
 
 
-@pytest.mark.unit
 def test_replace_templated_functions(query):
     fn = replace_templated_functions(query, "postgresql")
 
@@ -27,7 +28,6 @@ def test_replace_templated_functions(query):
     assert fn == expected
 
 
-@pytest.mark.unit
 def test_replace_templated_missing_arg(query):
     query = query.replace(";'1970-01-01'", "")
     with pytest.raises(
@@ -35,3 +35,8 @@ def test_replace_templated_missing_arg(query):
         match="Templated function call missing required arguments: <%DATEDIFF_YEAR;'{COL_NAME}'::DATE%>",
     ):
         replace_templated_functions(query, "postgresql")
+
+
+def test_replace_templated_functions_no_templates():
+    plain_query = "SELECT col1, col2 FROM my_table WHERE id = 1"
+    assert replace_templated_functions(plain_query, "postgresql") == plain_query
diff --git a/tests/unit/common/test_time_series_service.py b/tests/unit/common/test_time_series_service.py
new file mode 100644
index 00000000..86e2e8b3
--- /dev/null
+++ b/tests/unit/common/test_time_series_service.py
@@ -0,0 +1,636 @@
+from datetime import date
+
+import numpy as np
+import pandas as pd
+import pytest
+
+from testgen.commands.test_thresholds_prediction import compute_freshness_threshold
+from testgen.common.freshness_service import (
+    MIN_FRESHNESS_GAPS,
+    FreshnessThreshold,
+    add_business_minutes,
+    count_excluded_minutes,
+    get_freshness_gap_threshold,
+    is_excluded_day,
+    next_business_day_start,
+)
+from testgen.common.models.test_suite import PredictSensitivity
+from testgen.common.time_series_service import NotEnoughData, get_sarimax_forecast
+
+from .conftest import _make_freshness_history
+
+
+class Test_GetFreshnessGapThreshold:
+    def test_basic_threshold(self):
+        # 6 updates spaced 10h apart = 5 gaps of 600 minutes each
+        updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]]
+        history = _make_freshness_history(updates)
+
+        result = get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10)
+        # All gaps are 600 min, so P95 = 600, floor = 600 * 1.25 = 750
+        assert isinstance(result, FreshnessThreshold)
+        assert result.upper == pytest.approx(750.0)
+        # staleness = median(600) * 0.85 = 510
+        assert result.staleness == pytest.approx(600.0 * 0.85)
+
+    def test_not_enough_data_few_gaps(self):
+        # 4 updates = 3 gaps, below MIN_FRESHNESS_GAPS
+        updates = ["2026-02-01T00:00", "2026-02-01T10:00", "2026-02-01T20:00", "2026-02-02T06:00"]
+        history = _make_freshness_history(updates)
+
+        with pytest.raises(NotEnoughData, match=f"{MIN_FRESHNESS_GAPS}"):
+            get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10)
+
+    def test_not_enough_data_no_updates(self):
+        # History with no zero values = no detected updates
+        timestamps = pd.date_range("2026-02-01", periods=30, freq="2h")
+        signal = np.arange(1, 31, dtype=float) * 120  # never hits 0
+        history = pd.DataFrame({"result_signal": signal}, index=timestamps)
+
+        with pytest.raises(NotEnoughData):
+            get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10)
+
+    def test_floor_multiplier_dominates(self):
+        # 6 identical gaps — percentile ≈ max, so floor_multiplier > 1 dominates
+        updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]]
+        history = _make_freshness_history(updates)
+
+        result_low = get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=10)
+        result_high = get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.5, lower_percentile=10)
+
+        assert result_high.upper > result_low.upper
+
+    def test_sensitivity_ordering(self):
+        # Varied gaps so percentiles differentiate
+        updates = [
+            "2026-02-01T00:00",
+            "2026-02-01T04:00",  # 4h
+            "2026-02-02T14:00",  # 34h
+            "2026-02-03T14:00",  # 24h
+            "2026-02-04T06:00",  # 16h
+            "2026-02-04T08:00",  # 2h
+            "2026-02-04T16:00",  # 8h
+        ]
+        history = _make_freshness_history(updates)
+
+        high = get_freshness_gap_threshold(history, upper_percentile=80, floor_multiplier=1.0, lower_percentile=10)
+        medium = get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10)
+        low = get_freshness_gap_threshold(history, upper_percentile=99, floor_multiplier=1.5, lower_percentile=10)
+
+        assert high.upper <= medium.upper <= low.upper
+
+    def test_single_update_raises(self):
+        # Only one zero = zero gaps
+        timestamps = pd.date_range("2026-02-01", periods=10, freq="2h")
+        signal = [0.0] + [120.0 * i for i in range(1, 10)]
+        history = pd.DataFrame({"result_signal": signal}, index=timestamps)
+
+        with pytest.raises(NotEnoughData):
+            get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10)
+
+    def test_returns_last_update_timestamp(self):
+        updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]]
+        history = _make_freshness_history(updates)
+
+        result = get_freshness_gap_threshold(history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10)
+        assert result.last_update == pd.Timestamp("2026-02-03T02:00")
+
+    def test_lower_threshold(self):
+        # Varied gaps: 4h, 34h, 24h, 16h, 2h, 8h
+        updates = [
+            "2026-02-01T00:00",
+            "2026-02-01T04:00",  # 4h = 240 min
+            "2026-02-02T14:00",  # 34h = 2040 min
+            "2026-02-03T14:00",  # 24h = 1440 min
+            "2026-02-04T06:00",  # 16h = 960 min
+            "2026-02-04T08:00",  # 2h = 120 min
+            "2026-02-04T16:00",  # 8h = 480 min
+        ]
+        history = _make_freshness_history(updates)
+
+        result = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.25, lower_percentile=10,
+        )
+        assert result.lower is not None
+        assert result.lower > 0
+        assert result.lower < result.upper
+
+    def test_lower_threshold_set_when_small_but_nonzero(self):
+        # All identical 1-min gaps → the lower percentile is 1.0, which is > 0,
+        # so lower is set (lower is None only when the percentile comes out at 0)
+        updates = [
+            "2026-02-01T00:00:00",
+            "2026-02-01T00:01:00",  # 1 min gap
+            "2026-02-01T00:02:00",  # 1 min gap
+            "2026-02-01T00:03:00",  # 1 min gap
+            "2026-02-01T00:04:00",  # 1 min gap
+            "2026-02-01T00:05:00",  # 1 min gap
+            "2026-02-01T00:06:00",  # 1 min gap
+        ]
+        history = _make_freshness_history(updates, check_interval_minutes=1)
+        result = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=5,
+        )
+        # All gaps are 1 min, P5 = 1.0 which is > 0, so lower should be set
+        assert result.lower == pytest.approx(1.0)
+
+
+class Test_GetFreshnessGapThreshold_WeekendExclusion:
+    def test_weekend_gaps_normalized(self):
+        # Table updates daily on weekdays, 72h gap over weekend
+        # Mon Feb 2 through Mon Feb 9 (2026-02-02 is a Monday)
+        updates = [
+            "2026-02-02T08:00",  # Mon
+            "2026-02-03T08:00",  # Tue (24h gap)
+            "2026-02-04T08:00",  # Wed (24h gap)
+            "2026-02-05T08:00",  # Thu (24h gap)
+            "2026-02-06T08:00",  # Fri (24h gap)
+            "2026-02-09T08:00",  # Mon (72h raw, but 24h after subtracting Sat+Sun)
+            "2026-02-10T08:00",  # Tue (24h gap)
+        ]
+        history = _make_freshness_history(updates)
+
+        # Without exclusion: the 72h gap inflates the threshold
+        result_raw = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=10,
+        )
+
+        # With exclusion: all gaps normalize to ~24h
+        result_normalized = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=10, exclude_weekends=True,
+        )
+
+        # Normalized threshold should be lower (all gaps ≈ 24h vs max raw = 72h)
+        assert result_normalized.upper < result_raw.upper
+
+    def test_partial_weekend_day_subtracted(self):
+        # All gaps are 4h except the last one, which falls entirely on Saturday (10h raw).
+        # Partial-day exclusion subtracts the 10h Saturday portion, bringing the
+        # max normalized gap below the raw max.
+        updates = [
+            "2026-02-06T04:00",  # Fri
+            "2026-02-06T08:00",  # Fri (4h gap)
+            "2026-02-06T12:00",  # Fri (4h gap)
+            "2026-02-06T16:00",  # Fri (4h gap)
+            "2026-02-06T20:00",  # Fri (4h gap)
+            "2026-02-07T00:00",  # Sat midnight (4h gap)
+            "2026-02-07T10:00",  # Sat 10AM (10h raw gap, 0h business — entirely on Saturday)
+        ]
+        history = _make_freshness_history(updates)
+
+        result_raw = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=10,
+        )
+        result_normalized = get_freshness_gap_threshold(
+            history, upper_percentile=95, floor_multiplier=1.0, lower_percentile=10, exclude_weekends=True,
+        )
+
+        # Raw: max gap = Sat midnight → Sat 10AM = 10h = 600 min
+        # Normalized: 600 - 10h Saturday excluded = 0 min
+        # So normalized max = 4h (the weekday gaps), while raw max = 10h
+        assert result_normalized.upper < result_raw.upper
+
+
+class Test_CountExcludedMinutes:
+    def test_no_exclusions(self):
+        start = pd.Timestamp("2026-02-06T17:00")  # Friday
+        end = pd.Timestamp("2026-02-09T08:00")  # Monday
+        result = count_excluded_minutes(start, end, exclude_weekends=False, holiday_dates=None)
+        assert result == 0.0
+
+    def test_full_weekend(self):
+        # Friday 5PM to Monday 8AM — Saturday and Sunday are full days in between
+        start = pd.Timestamp("2026-02-06T17:00")  # Friday
+        end = pd.Timestamp("2026-02-09T08:00")  # Monday
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 2 * 24 * 60  # 2 full weekend days
+
+    def test_partial_weekend_day(self):
+        # Saturday 1AM to Saturday 11PM — 22 hours of excluded Saturday
+        start = pd.Timestamp("2026-02-07T01:00")  # Saturday
+        end = pd.Timestamp("2026-02-07T23:00")  # Saturday
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 22 * 60
+
+    def test_weekday_only(self):
+        # Monday to Wednesday — no weekends
+        start = pd.Timestamp("2026-02-02T08:00")  # Monday
+        end = pd.Timestamp("2026-02-04T08:00")  # Wednesday
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 0.0
+
+    def test_holiday(self):
+        start = pd.Timestamp("2026-02-02T08:00")  # Monday
+        end = pd.Timestamp("2026-02-05T08:00")  # Thursday
+        # Wednesday is a holiday
+        holiday_dates = {date(2026, 2, 4)}
+        result = count_excluded_minutes(start, end, exclude_weekends=False, holiday_dates=holiday_dates)
+        assert result == 1 * 24 * 60  # 1 holiday
+
+    def test_weekend_and_holiday(self):
+        # Friday to Tuesday, with Monday as holiday
+        start = pd.Timestamp("2026-02-06T08:00")  # Friday
+        end = pd.Timestamp("2026-02-10T08:00")  # Tuesday
+        holiday_dates = {date(2026, 2, 9)}  # Monday
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=holiday_dates)
+        # Saturday + Sunday + Monday(holiday) = 3 days
+        assert result == 3 * 24 * 60
+
+    def test_holiday_on_weekend_not_double_counted(self):
+        # Holiday falls on Saturday — should only count once
+        start = pd.Timestamp("2026-02-06T08:00")  # Friday
+        end = pd.Timestamp("2026-02-09T08:00")  # Monday
+        holiday_dates = {date(2026, 2, 7)}  # Saturday
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=holiday_dates)
+        # Saturday (weekend) + Sunday (weekend) = 2 days, not 3
+        assert result == 2 * 24 * 60
+
+    def test_same_excluded_day(self):
+        # Saturday 8AM to 8PM — 12 hours of excluded time
+        start = pd.Timestamp("2026-02-07T08:00")
+        end = pd.Timestamp("2026-02-07T20:00")
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 12 * 60
+
+    def test_same_weekday(self):
+        # Monday 8AM to 8PM — no excluded time
+        start = pd.Timestamp("2026-02-09T08:00")
+        end = pd.Timestamp("2026-02-09T20:00")
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 0.0
+
+    def test_accepts_datetime(self):
+        from datetime import datetime
+        start = datetime(2026, 2, 6, 17, 0)  # Friday
+        end = datetime(2026, 2, 9, 8, 0)  # Monday
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 2 * 24 * 60
+
+    def test_partial_start_on_excluded_day(self):
+        # Last update Saturday 1AM, end Monday midnight
+        # Saturday has 23h excluded (1AM to midnight), Sunday has 24h
+        start = pd.Timestamp("2026-02-07T01:00")  # Saturday 1AM
+        end = pd.Timestamp("2026-02-09T00:00")  # Monday midnight
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == (23 + 24) * 60  # 23h Saturday + 24h Sunday
+
+    def test_start_equals_end(self):
+        ts = pd.Timestamp("2026-02-07T08:00")
+        result = count_excluded_minutes(ts, ts, exclude_weekends=True, holiday_dates=None)
+        assert result == 0.0
+
+    def test_start_after_end(self):
+        start = pd.Timestamp("2026-02-08T08:00")
+        end = pd.Timestamp("2026-02-07T08:00")
+        result = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+        assert result == 0.0
+
+    def test_timezone_shifts_weekend_boundaries(self):
+        # Without timezone: UTC Fri 23:00 to UTC Mon 01:00
+        # UTC Saturday and Sunday are full weekend days → 2 * 24h = 2880 min
+        start = pd.Timestamp("2026-02-06T23:00")  # UTC Friday 11PM
+        end = pd.Timestamp("2026-02-09T01:00")  # UTC Monday 1AM
+        result_utc = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None)
+
+        # With ET timezone (UTC-5): start = Fri 6PM ET, end = Sun 8PM ET
+        # ET Saturday = UTC Sat 05:00 to UTC Sun 05:00
+        # ET Sunday = UTC Sun 05:00 to UTC Mon 05:00
+        # The interval Fri 6PM ET → Sun 8PM ET contains:
+        # Full ET Saturday (24h) + partial ET Sunday (midnight to 8PM = 20h) = 44h = 2640 min
+        result_et = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None, tz="America/New_York")
+
+        assert result_et != result_utc
+        assert result_et == pytest.approx(44 * 60)
+
+
+class Test_IsExcludedDay:
+    def test_weekend_saturday(self):
+        assert is_excluded_day(pd.Timestamp("2026-02-07"), exclude_weekends=True, holiday_dates=None) is True
+
+    def test_weekend_sunday(self):
+        assert is_excluded_day(pd.Timestamp("2026-02-08"), exclude_weekends=True, holiday_dates=None) is True
+
+    def test_weekday(self):
+        assert is_excluded_day(pd.Timestamp("2026-02-09"), exclude_weekends=True, holiday_dates=None) is False
+
+    def test_holiday(self):
+        holidays = {date(2026, 2, 9)}  # Monday
+        assert is_excluded_day(pd.Timestamp("2026-02-09"), exclude_weekends=False, holiday_dates=holidays) is True
+
+    def test_timestamp(self):
+        assert is_excluded_day(pd.Timestamp("2026-02-07T14:00"), exclude_weekends=True, holiday_dates=None) is True
+
+    def test_no_exclusions(self):
+        assert is_excluded_day(pd.Timestamp("2026-02-07"), exclude_weekends=False, holiday_dates=None) is False
+
+    def test_timezone_converts_utc_to_local(self):
+        # UTC Saturday 03:00 = Friday 10PM in New York → NOT a weekend day in ET
+        assert is_excluded_day(
+            pd.Timestamp("2026-02-07T03:00"), exclude_weekends=True, holiday_dates=None, tz="America/New_York",
+        ) is False
+
+    def test_timezone_saturday_in_local(self):
+        # UTC Saturday 15:00 = Saturday 10AM in New York → IS a weekend day in ET
+        assert is_excluded_day(
+            pd.Timestamp("2026-02-07T15:00"), exclude_weekends=True, holiday_dates=None, tz="America/New_York",
+        ) is True
+
+    def test_timezone_sunday_to_monday_boundary(self):
+        # UTC Monday 03:00 = Sunday 10PM in New York → IS a weekend day in ET
+        assert is_excluded_day(
+            pd.Timestamp("2026-02-09T03:00"), exclude_weekends=True, holiday_dates=None, tz="America/New_York",
+        ) is True
+
+
+class Test_NextBusinessDayStart:
+    def test_friday_to_monday(self):
+        result = next_business_day_start(pd.Timestamp("2026-02-06T17:00"), exclude_weekends=True, holiday_dates=None)
+        assert result == pd.Timestamp("2026-02-09")  # Monday midnight
+
+    def test_saturday_to_monday(self):
+        result = next_business_day_start(pd.Timestamp("2026-02-07T10:00"), exclude_weekends=True, holiday_dates=None)
+        assert result == pd.Timestamp("2026-02-09")  # Monday midnight
+
+    def test_sunday_to_monday(self):
+        result = next_business_day_start(pd.Timestamp("2026-02-08T10:00"), exclude_weekends=True, holiday_dates=None)
+        assert result == pd.Timestamp("2026-02-09")  # Monday midnight
+
+    def test_weekday_to_next_day(self):
+        result = next_business_day_start(pd.Timestamp("2026-02-09T17:00"), exclude_weekends=True, holiday_dates=None)
+        assert result == pd.Timestamp("2026-02-10")  # Tuesday midnight
+
+    def test_weekend_plus_holiday(self):
+        # Friday → Saturday (weekend) → Sunday (weekend) → Monday (holiday) → Tuesday
+        holidays = {date(2026, 2, 9)}  # Monday
+        result = next_business_day_start(pd.Timestamp("2026-02-06T17:00"), exclude_weekends=True, holiday_dates=holidays)
+        assert result == pd.Timestamp("2026-02-10")  # Tuesday midnight
+
+
+class Test_ComputeFreshnessThreshold:
+    def test_returns_business_minute_thresholds(self):
+        # 6 updates spaced 10h apart = 5 gaps of 600 minutes each
+        updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]]
+        history = _make_freshness_history(updates)
+
+        lower, upper, staleness, prediction = compute_freshness_threshold(history, PredictSensitivity.medium)
+        assert upper is not None
+        assert upper > 0
+        # Without exclusions, thresholds are raw business minutes from gap analysis
+        assert upper == pytest.approx(750.0)  # P95 of uniform 600-min gaps = 600, floor 1.25x = 750
+        # No tz → no schedule → staleness is None
+        assert staleness is None
+        # prediction JSON is returned (staleness only when schedule is active)
+        assert prediction is not None
+
+    def test_not_enough_data_returns_none(self):
+        # 4 updates = 3 gaps, below MIN_FRESHNESS_GAPS
+        updates = ["2026-02-01T00:00", "2026-02-01T10:00", "2026-02-01T20:00", "2026-02-02T06:00"]
+        history = _make_freshness_history(updates)
lower, upper, staleness, prediction = compute_freshness_threshold(history, PredictSensitivity.medium) + assert lower is None + assert upper is None + assert staleness is None + assert prediction is None + + def test_returns_four_tuple(self): + """Verify compute_freshness_threshold returns a 4-tuple (lower, upper, staleness, prediction).""" + updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]] + history = _make_freshness_history(updates) + result = compute_freshness_threshold(history, PredictSensitivity.medium) + assert len(result) == 4 + + def test_prediction_json_without_tz_has_no_staleness(self): + """Without tz (no active schedule), staleness_upper is absent from prediction JSON.""" + updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]] + history = _make_freshness_history(updates) + _, upper, staleness, prediction = compute_freshness_threshold(history, PredictSensitivity.medium) + # No tz → staleness is None + assert staleness is None + assert prediction is not None + + def test_with_weekend_exclusion_returns_business_thresholds(self): + # Table updates daily on weekdays, 72h gap over weekend + updates = [ + "2026-02-02T08:00", # Mon + "2026-02-03T08:00", # Tue + "2026-02-04T08:00", # Wed + "2026-02-05T08:00", # Thu + "2026-02-06T08:00", # Fri + "2026-02-09T08:00", # Mon (72h raw, 24h business) + "2026-02-10T08:00", # Tue + ] + history = _make_freshness_history(updates) + + _, upper_raw, _, _ = compute_freshness_threshold(history, PredictSensitivity.medium) + _, upper_biz, _, _ = compute_freshness_threshold( + history, PredictSensitivity.medium, exclude_weekends=True, + ) + + # With exclusion, the 72h weekend gap normalizes to ~24h, so threshold is lower + assert upper_biz < upper_raw + + def test_sensitivity_ordering(self): + updates = [ + "2026-02-01T00:00", + "2026-02-01T04:00", + "2026-02-02T14:00", + "2026-02-03T14:00", + "2026-02-04T06:00", + 
"2026-02-04T08:00", + "2026-02-04T16:00", + ] + history = _make_freshness_history(updates) + + _, upper_high, _, _ = compute_freshness_threshold(history, PredictSensitivity.high) + _, upper_med, _, _ = compute_freshness_threshold(history, PredictSensitivity.medium) + _, upper_low, _, _ = compute_freshness_threshold(history, PredictSensitivity.low) + + assert upper_high <= upper_med <= upper_low + + def test_min_lookback_respected(self): + # 6 updates with sawtooth rows in between β€” the helper generates many rows + updates = [f"2026-02-{d:02d}T{h:02d}:00" for d, h in [(1, 0), (1, 10), (1, 20), (2, 6), (2, 16), (3, 2)]] + history = _make_freshness_history(updates) + row_count = len(history) + + # With min_lookback at exactly the row count β†’ should produce thresholds + _, upper, _, _ = compute_freshness_threshold(history, PredictSensitivity.medium, min_lookback=row_count) + assert upper is not None + + # With min_lookback above the row count β†’ training mode + lower, upper, staleness, prediction = compute_freshness_threshold(history, PredictSensitivity.medium, min_lookback=row_count + 1) + assert lower is None + assert upper is None + assert staleness is None + assert prediction is None + +class Test_AddBusinessMinutes: + def test_no_exclusions(self): + start = pd.Timestamp("2026-02-09T08:00") # Monday + result = add_business_minutes(start, 120, exclude_weekends=False, holiday_dates=None) + assert result == pd.Timestamp("2026-02-09T10:00") + + def test_zero_minutes(self): + start = pd.Timestamp("2026-02-09T08:00") + result = add_business_minutes(start, 0, exclude_weekends=True, holiday_dates=None) + assert result == start + + def test_negative_minutes(self): + start = pd.Timestamp("2026-02-09T08:00") + result = add_business_minutes(start, -10, exclude_weekends=True, holiday_dates=None) + assert result == start + + def test_within_same_business_day(self): + start = pd.Timestamp("2026-02-09T08:00") # Monday + result = add_business_minutes(start, 60, 
exclude_weekends=True, holiday_dates=None) + assert result == pd.Timestamp("2026-02-09T09:00") + + def test_crosses_to_next_weekday(self): + # Monday 23:00, add 120 min β†’ Tuesday 01:00 + start = pd.Timestamp("2026-02-09T23:00") # Monday + result = add_business_minutes(start, 120, exclude_weekends=True, holiday_dates=None) + assert result == pd.Timestamp("2026-02-10T01:00") # Tuesday + + def test_crosses_weekend(self): + # Friday 22:00, add 180 min (3h) β†’ should skip Sat+Sun, land on Monday 01:00 + start = pd.Timestamp("2026-02-06T22:00") # Friday + result = add_business_minutes(start, 180, exclude_weekends=True, holiday_dates=None) + # 2h left on Friday (22:00β†’midnight), then skip Sat+Sun, 1h into Monday + assert result == pd.Timestamp("2026-02-09T01:00") # Monday + + def test_starts_on_excluded_day(self): + # Starting on Saturday β€” should fast-forward to Monday midnight before consuming + start = pd.Timestamp("2026-02-07T10:00") # Saturday + result = add_business_minutes(start, 60, exclude_weekends=True, holiday_dates=None) + assert result == pd.Timestamp("2026-02-09T01:00") # Monday 01:00 + + def test_starts_on_sunday(self): + start = pd.Timestamp("2026-02-08T14:00") # Sunday + result = add_business_minutes(start, 120, exclude_weekends=True, holiday_dates=None) + assert result == pd.Timestamp("2026-02-09T02:00") # Monday 02:00 + + def test_holiday_skipped(self): + # Wednesday is a holiday + start = pd.Timestamp("2026-02-03T22:00") # Tuesday + holiday_dates = {date(2026, 2, 4)} + result = add_business_minutes(start, 180, exclude_weekends=False, holiday_dates=holiday_dates) + # 2h left on Tuesday (22:00β†’midnight), skip Wed holiday, 1h into Thursday + assert result == pd.Timestamp("2026-02-05T01:00") + + def test_weekend_plus_adjacent_holiday(self): + # Friday 23:00, Monday is holiday β†’ skip Sat, Sun, Mon + start = pd.Timestamp("2026-02-06T23:00") # Friday + holiday_dates = {date(2026, 2, 9)} # Monday + result = add_business_minutes(start, 120, 
exclude_weekends=True, holiday_dates=holiday_dates) + # 1h left on Friday (23:00β†’midnight), skip Sat+Sun+Mon, 1h into Tuesday + assert result == pd.Timestamp("2026-02-10T01:00") # Tuesday + + def test_multi_day_span(self): + # Monday 08:00, add 3 business days (4320 min) with weekends excluded + start = pd.Timestamp("2026-02-09T08:00") # Monday + result = add_business_minutes(start, 3 * 24 * 60, exclude_weekends=True, holiday_dates=None) + # Monβ†’Tueβ†’Wedβ†’Thu 08:00 (no weekends in the way) + assert result == pd.Timestamp("2026-02-12T08:00") + + def test_multi_day_span_crossing_weekend(self): + # Thursday 08:00, add 3 business days β†’ Fri, skip Sat+Sun, Mon 08:00 + start = pd.Timestamp("2026-02-05T08:00") # Thursday + result = add_business_minutes(start, 3 * 24 * 60, exclude_weekends=True, holiday_dates=None) + assert result == pd.Timestamp("2026-02-10T08:00") # Monday (skipped Sat+Sun) + + def test_inverse_property(self): + # add_business_minutes(start, N) β†’ end, then wall_minutes - excluded β‰ˆ N + start = pd.Timestamp("2026-02-06T14:00") # Friday + business_minutes = 600.0 # 10 hours + end = add_business_minutes(start, business_minutes, exclude_weekends=True, holiday_dates=None) + + wall_minutes = (end - start).total_seconds() / 60 + excluded = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=None) + assert wall_minutes - excluded == pytest.approx(business_minutes) + + def test_inverse_property_with_holidays(self): + start = pd.Timestamp("2026-02-06T14:00") # Friday + holiday_dates = {date(2026, 2, 9)} # Monday + business_minutes = 600.0 + end = add_business_minutes(start, business_minutes, exclude_weekends=True, holiday_dates=holiday_dates) + + wall_minutes = (end - start).total_seconds() / 60 + excluded = count_excluded_minutes(start, end, exclude_weekends=True, holiday_dates=holiday_dates) + assert wall_minutes - excluded == pytest.approx(business_minutes) + + def test_timezone_friday_night_utc_vs_et(self): + # UTC Friday 23:00 
= ET Friday 6PM → still a business day in ET + # Without tz: the weekend starts at naive (UTC) midnight, only 1h of Friday remains + # With ET tz: still Friday 6PM, six business hours remain before the weekend + start = pd.Timestamp("2026-02-06T23:00") # UTC Friday 11PM + + result_no_tz = add_business_minutes(start, 120, exclude_weekends=True, holiday_dates=None) + result_et = add_business_minutes(start, 120, exclude_weekends=True, holiday_dates=None, tz="America/New_York") + + # Without tz: 1h left on Friday (23:00→midnight), skip Sat+Sun, 1h into Monday + assert result_no_tz == pd.Timestamp("2026-02-09T01:00") + # With ET: Friday 6PM ET, 2h → Friday 8PM ET = Sat 01:00 UTC + assert result_et == pd.Timestamp("2026-02-07T01:00") + + def test_timezone_result_is_naive_when_input_is_naive(self): + start = pd.Timestamp("2026-02-06T22:00") + result = add_business_minutes(start, 60, exclude_weekends=True, holiday_dates=None, tz="America/New_York") + assert result.tzinfo is None + + def test_no_exclusions_ignores_tz(self): + start = pd.Timestamp("2026-02-07T10:00") # Saturday + result = add_business_minutes(start, 120, exclude_weekends=False, holiday_dates=None, tz="America/New_York") + assert result == pd.Timestamp("2026-02-07T12:00") + + def test_accepts_datetime(self): + from datetime import datetime + start = datetime(2026, 2, 9, 8, 0) # Monday + result = add_business_minutes(start, 60, exclude_weekends=True, holiday_dates=None) + assert result == pd.Timestamp("2026-02-09T09:00") + + +class Test_GetSarimaxForecast_TimezoneExog: + """Verify that get_sarimax_forecast uses the schedule timezone for weekend/holiday exog flags.""" + + @staticmethod + def _make_daily_history(n_days: int = 30, hour_utc: int = 3) -> pd.DataFrame: + """Create a simple daily history at a fixed UTC hour. + + With hour_utc=3, the timestamps are 3 AM UTC = 10 PM ET (previous day). + This means UTC Saturday 3 AM = ET Friday 10 PM — a weekday in ET but weekend in UTC. 
+ """ + dates = pd.date_range("2026-01-05", periods=n_days, freq="1D") + pd.Timedelta(hours=hour_utc) + values = np.arange(100, 100 + n_days, dtype=float) + np.random.default_rng(42).normal(0, 5, n_days) + return pd.DataFrame({"value": values}, index=dates) + + def test_timezone_changes_weekend_flags(self): + # History at 3 AM UTC daily β€” in ET that's 10 PM the previous day + history = self._make_daily_history(n_days=40, hour_utc=3) + + # Without timezone: UTC Saturday/Sunday get is_excluded=1 + forecast_utc = get_sarimax_forecast(history, num_forecast=3, exclude_weekends=True) + # With ET timezone: ET Saturday/Sunday get is_excluded=1 (shifted by ~5 hours) + forecast_et = get_sarimax_forecast(history, num_forecast=3, exclude_weekends=True, tz="America/New_York") + + # The forecasts should differ because the exog flags apply to different days + # (UTC Sat 3AM = ET Fri 10PM β†’ not excluded in ET, excluded in UTC) + assert not forecast_utc["mean"].equals(forecast_et["mean"]) + + def test_no_timezone_preserves_original_behavior(self): + history = self._make_daily_history(n_days=40) + + forecast_no_tz = get_sarimax_forecast(history, num_forecast=3, exclude_weekends=True) + forecast_none_tz = get_sarimax_forecast(history, num_forecast=3, exclude_weekends=True, tz=None) + + pd.testing.assert_frame_equal(forecast_no_tz, forecast_none_tz) + + def test_without_exclusions_timezone_has_no_effect(self): + history = self._make_daily_history(n_days=40) + + forecast_no_tz = get_sarimax_forecast(history, num_forecast=3, exclude_weekends=False) + forecast_with_tz = get_sarimax_forecast(history, num_forecast=3, exclude_weekends=False, tz="America/New_York") + + pd.testing.assert_frame_equal(forecast_no_tz, forecast_with_tz) diff --git a/tests/unit/conftest.py b/tests/unit/conftest.py index 4f88336a..48865d49 100644 --- a/tests/unit/conftest.py +++ b/tests/unit/conftest.py @@ -13,7 +13,7 @@ def patched_settings(): yield mock -@pytest.fixture(autouse=True) +@pytest.fixture def 
db_session_mock(): with patch("testgen.common.models.Session") as factory_mock: yield factory_mock().__enter__() diff --git a/tests/unit/test_scheduler_base.py b/tests/unit/scheduler/test_scheduler_base.py similarity index 70% rename from tests/unit/test_scheduler_base.py rename to tests/unit/scheduler/test_scheduler_base.py index e92037c2..ccac8374 100644 --- a/tests/unit/test_scheduler_base.py +++ b/tests/unit/scheduler/test_scheduler_base.py @@ -8,6 +8,8 @@ from testgen.scheduler.base import DelayedPolicy, Job, Scheduler +pytestmark = pytest.mark.unit + @contextmanager def assert_finishes_within(**kwargs): @@ -22,7 +24,12 @@ class TestScheduler(Scheduler): get_jobs = Mock() start_job = Mock() - yield TestScheduler() + instance = TestScheduler() + yield instance + # Cleanup: ensure scheduler thread is stopped even if test fails + if instance.thread and instance.thread.is_alive(): + instance.shutdown() + instance.wait(timeout=1.0) @pytest.fixture @@ -48,7 +55,34 @@ def now_func(): yield now_func -@pytest.mark.unit +def test_get_triggering_times_every_5_min(): + job = Job(cron_expr="*/5 * * * *", cron_tz="UTC", delayed_policy=DelayedPolicy.ALL) + base = datetime(2025, 4, 15, 9, 0, 0, tzinfo=UTC) + times = list(islice(job.get_triggering_times(base), 5)) + minutes = [t.minute for t in times] + # cron_converter yields starting at base time, then increments + assert minutes == [0, 5, 10, 15, 20] + assert all(t.hour == 9 for t in times) + + +def test_get_triggering_times_hourly(): + job = Job(cron_expr="0 * * * *", cron_tz="UTC", delayed_policy=DelayedPolicy.ALL) + base = datetime(2025, 4, 15, 9, 30, 0, tzinfo=UTC) + times = list(islice(job.get_triggering_times(base), 3)) + hours = [t.hour for t in times] + assert hours == [10, 11, 12] + assert all(t.minute == 0 for t in times) + + +def test_get_triggering_times_timezone(): + job = Job(cron_expr="0 9 * * *", cron_tz="America/New_York", delayed_policy=DelayedPolicy.ALL) + base = datetime(2025, 4, 15, 12, 0, 0, 
tzinfo=UTC) # 8 AM ET (EDT) + times = list(islice(job.get_triggering_times(base), 2)) + # 9 AM ET = 13:00 UTC (during EDT) + assert times[0].astimezone(UTC).hour == 13 + assert times[1].astimezone(UTC).hour == 13 + + def test_getting_jobs_wont_crash(scheduler_instance, base_time): scheduler_instance.get_jobs.side_effect = Exception scheduler_instance.start(base_time) @@ -61,7 +95,6 @@ def test_getting_jobs_wont_crash(scheduler_instance, base_time): scheduler_instance.wait() -@pytest.mark.unit @pytest.mark.parametrize( ("expr", "dpol", "expected_minutes"), [ @@ -76,7 +109,6 @@ def test_delayed_jobs_policies(expr, dpol, expected_minutes, scheduler_instance, assert triggering_times == expected_triggering_times -@pytest.mark.unit def test_jobs_start_in_order(scheduler_instance, base_time): jobs = { 3: Job(cron_expr="*/3 * * * *", cron_tz="UTC", delayed_policy=DelayedPolicy.ALL), @@ -94,24 +126,31 @@ def test_jobs_start_in_order(scheduler_instance, base_time): assert job in triggred_jobs or triggering_time.minute % divisor != 0 -@pytest.mark.unit +def wait_for_call_count(mock, expected_count, timeout=0.5): + """Wait for a mock's call_count to reach the expected value.""" + start = time.monotonic() + while mock.call_count < expected_count: + if time.monotonic() - start > timeout: + return False + time.sleep(0.01) + return True + + @pytest.mark.parametrize("with_job", (True, False)) def test_reloads_and_shutdowns_immediately(with_job, scheduler_instance, base_time): jobs = [Job(cron_expr="0 0 * * *", cron_tz="UTC", delayed_policy=DelayedPolicy.ALL)] if with_job else [] scheduler_instance.get_jobs.return_value = jobs scheduler_instance.start(base_time) - time.sleep(0.05) - assert scheduler_instance.get_jobs.call_count == 1 - with assert_finishes_within(milliseconds=100): + assert wait_for_call_count(scheduler_instance.get_jobs, 1), "get_jobs should be called once on start" + + with assert_finishes_within(milliseconds=500): scheduler_instance.reload_jobs() - 
time.sleep(0.05) - assert scheduler_instance.get_jobs.call_count == 2 + assert wait_for_call_count(scheduler_instance.get_jobs, 2), "get_jobs should be called again after reload" scheduler_instance.shutdown() scheduler_instance.wait() -@pytest.mark.unit @pytest.mark.parametrize("start_side_effect", (lambda *_: None, Exception)) def test_job_start_is_called(start_side_effect, scheduler_instance, base_time, no_wait): jobs = [ diff --git a/tests/unit/test_scheduler_cli.py b/tests/unit/scheduler/test_scheduler_cli.py similarity index 98% rename from tests/unit/test_scheduler_cli.py rename to tests/unit/scheduler/test_scheduler_cli.py index 7a7e0854..bf6cc1b2 100644 --- a/tests/unit/test_scheduler_cli.py +++ b/tests/unit/scheduler/test_scheduler_cli.py @@ -10,6 +10,8 @@ from testgen.scheduler.base import DelayedPolicy from testgen.scheduler.cli_scheduler import CliJob, CliScheduler +pytestmark = pytest.mark.unit + @pytest.fixture def scheduler_instance() -> CliScheduler: @@ -77,7 +79,6 @@ def cli_job(job_data): yield CliJob(**job_data, delayed_policy=DelayedPolicy.SKIP) -@pytest.mark.unit def test_get_jobs(scheduler_instance, db_jobs, job_sched): db_jobs.return_value = iter([job_sched]) @@ -89,7 +90,6 @@ def test_get_jobs(scheduler_instance, db_jobs, job_sched): assert getattr(jobs[0], attr) == getattr(job_sched, attr), f"Attribute '{attr}' does not match" -@pytest.mark.unit def test_job_start(scheduler_instance, cli_job, cmd_mock, popen_mock, popen_proc_mock): with patch("testgen.scheduler.cli_scheduler.threading.Thread") as thread_mock: scheduler_instance.start_job(cli_job, datetime.now(UTC)) @@ -102,7 +102,6 @@ def test_job_start(scheduler_instance, cli_job, cmd_mock, popen_mock, popen_proc thread_mock.assert_called_once_with(target=scheduler_instance._proc_wrapper, args=(popen_proc_mock,)) -@pytest.mark.unit @pytest.mark.parametrize("proc_side_effect", (lambda: None, RuntimeError)) def test_proc_wrapper(proc_side_effect, scheduler_instance): with ( @@ -121,7 +120,6 
@@ def test_proc_wrapper(proc_side_effect, scheduler_instance): cond_mock.notify.assert_called_once() -@pytest.mark.unit def test_shutdown_no_jobs(scheduler_instance): with ( patch.object(scheduler_instance, "start") as start_mock, @@ -148,7 +146,6 @@ def test_shutdown_no_jobs(scheduler_instance): assert not scheduler_instance._running_jobs -@pytest.mark.unit @pytest.mark.parametrize("sig", [signal.SIGINT, signal.SIGTERM]) def test_shutdown(scheduler_instance, sig): with ( diff --git a/tests/unit/test_utils.py b/tests/unit/test_utils.py new file mode 100644 index 00000000..aea93451 --- /dev/null +++ b/tests/unit/test_utils.py @@ -0,0 +1,288 @@ +import logging +from datetime import UTC, datetime +from decimal import Decimal +from enum import Enum +from uuid import UUID + +import pytest + +from testgen.utils import ( + chunk_queries, + friendly_score, + friendly_score_impact, + get_exception_message, + is_uuid4, + log_and_swallow_exception, + make_json_safe, + score, + str_to_timestamp, + to_dataframe, + to_int, + to_sql_timestamp, + try_json, +) + +pytestmark = pytest.mark.unit + + +# --- to_int --- + +@pytest.mark.parametrize( + "value, expected", + [ + (5, 5), + (3.7, 3), + (0, 0), + (0.0, 0), + (float("nan"), 0), + (None, 0), + ], +) +def test_to_int(value, expected): + assert to_int(value) == expected + + +# --- to_sql_timestamp --- + +def test_to_sql_timestamp(): + dt = datetime(2024, 3, 15, 10, 30, 45) + assert to_sql_timestamp(dt) == "2024-03-15 10:30:45" + + +# --- str_to_timestamp --- + +@pytest.mark.parametrize( + "value, expected", + [ + ("2024-03-15 10:30:45", int(datetime(2024, 3, 15, 10, 30, 45, tzinfo=UTC).timestamp())), + ("2024-03-15T10:30:45Z", int(datetime(2024, 3, 15, 10, 30, 45, tzinfo=UTC).timestamp())), + ("not-a-date", None), + ], +) +def test_str_to_timestamp(value, expected): + assert str_to_timestamp(value) == expected + + +# --- is_uuid4 --- + +@pytest.mark.parametrize( + "value, expected", + [ + ("550e8400-e29b-41d4-a716-446655440000", 
True), + (UUID("550e8400-e29b-41d4-a716-446655440000"), True), + ("not-a-uuid", False), + ("", False), + ("550e8400-e29b-41d4-a716-44665544000", False), # too short + ], +) +def test_is_uuid4(value, expected): + assert is_uuid4(value) == expected + + +# --- try_json --- + +@pytest.mark.parametrize( + "value, default, expected", + [ + ('{"a": 1}', None, {"a": 1}), + ("[1, 2, 3]", None, [1, 2, 3]), + ("invalid", "fallback", "fallback"), + (None, "default", "default"), + ("null", None, None), + ], +) +def test_try_json(value, default, expected): + assert try_json(value, default) == expected + + +# --- get_exception_message --- + +def test_get_exception_message_string_arg(): + exc = ValueError("something went wrong ") + assert get_exception_message(exc) == "something went wrong" + + +def test_get_exception_message_non_string_arg(): + exc = ValueError(42) + assert get_exception_message(exc) == "42" + + +def test_get_exception_message_no_args(): + exc = ValueError() + assert get_exception_message(exc) == "" + + +# --- make_json_safe --- + +def test_make_json_safe_uuid(): + uid = UUID("550e8400-e29b-41d4-a716-446655440000") + assert make_json_safe(uid) == "550e8400-e29b-41d4-a716-446655440000" + + +def test_make_json_safe_datetime(): + dt = datetime(2024, 1, 1, 0, 0, 0, tzinfo=UTC) + assert make_json_safe(dt) == int(dt.timestamp()) + + +def test_make_json_safe_decimal(): + assert make_json_safe(Decimal("3.14")) == 3.14 + + +def test_make_json_safe_enum(): + class Color(Enum): + RED = "red" + assert make_json_safe(Color.RED) == "red" + + +def test_make_json_safe_list(): + uid = UUID("550e8400-e29b-41d4-a716-446655440000") + result = make_json_safe([uid, 42]) + assert result == ["550e8400-e29b-41d4-a716-446655440000", 42] + + +def test_make_json_safe_dict(): + uid = UUID("550e8400-e29b-41d4-a716-446655440000") + result = make_json_safe({"id": uid, "name": "test"}) + assert result == {"id": "550e8400-e29b-41d4-a716-446655440000", "name": "test"} + + +def 
test_make_json_safe_passthrough(): + assert make_json_safe("hello") == "hello" + assert make_json_safe(42) == 42 + assert make_json_safe(None) is None + + +# --- chunk_queries --- + +def test_chunk_queries_fits_in_one(): + queries = ["SELECT 1", "SELECT 2"] + result = chunk_queries(queries, " UNION ALL ", 100) + assert result == ["SELECT 1 UNION ALL SELECT 2"] + + +def test_chunk_queries_needs_splitting(): + queries = ["SELECT 1", "SELECT 2", "SELECT 3"] + result = chunk_queries(queries, " UNION ALL ", 30) + assert len(result) > 1 + for chunk in result: + assert len(chunk) <= 30 + + +def test_chunk_queries_single_query(): + result = chunk_queries(["SELECT 1"], ";", 100) + assert result == ["SELECT 1"] + + +def test_chunk_queries_each_at_limit(): + queries = ["A" * 10, "B" * 10, "C" * 10] + result = chunk_queries(queries, ";", 11) + assert result == ["A" * 10, "B" * 10, "C" * 10] + + +# --- score --- + +@pytest.mark.parametrize( + "profiling, tests, expected", + [ + (0.9, 0.8, 0.9 * 0.8), + (0.9, 0.0, 0.9), + (0.0, 0.8, 0.8), + (0.0, 0.0, 0.0), + (float("nan"), 0.8, 0.8), + (0.9, float("nan"), 0.9), + (float("nan"), float("nan"), 0.0), + ], +) +def test_score(profiling, tests, expected): + assert score(profiling, tests) == pytest.approx(expected) + + +# --- friendly_score --- + +@pytest.mark.parametrize( + "value, expected", + [ + (1.0, "100"), + (0.956, "95.6"), + (0.0001, "< 0.1"), + (0.99999, "> 99.9"), + (0.5, "50.0"), + (None, None), + (0, None), + (float("nan"), None), + ], +) +def test_friendly_score(value, expected): + assert friendly_score(value) == expected + + +# --- friendly_score_impact --- + +@pytest.mark.parametrize( + "value, expected", + [ + (100, "100"), + (50.123, "50.12"), + (0.001, "< 0.01"), + (99.999, "> 99.99"), + (None, "-"), + (0, "-"), + (float("nan"), "-"), + ], +) +def test_friendly_score_impact(value, expected): + assert friendly_score_impact(value) == expected + + +# --- to_dataframe --- + +def test_to_dataframe_with_to_dict(): + class 
Item: + def to_dict(self): + return {"a": 1, "b": 2} + + df = to_dataframe([Item(), Item()]) + assert list(df.columns) == ["a", "b"] + assert len(df) == 2 + + +def test_to_dataframe_with_dict_attr(): + class Item: + def __init__(self): + self.x = 10 + self.y = 20 + + df = to_dataframe([Item()]) + assert df.iloc[0]["x"] == 10 + assert df.iloc[0]["y"] == 20 + + +def test_to_dataframe_with_plain_dict(): + df = to_dataframe([{"k": "v"}]) + assert df.iloc[0]["k"] == "v" + + +def test_to_dataframe_empty(): + df = to_dataframe([]) + assert len(df) == 0 + + +# --- log_and_swallow_exception --- + +def test_log_and_swallow_exception_no_error(): + @log_and_swallow_exception + def good_func(): + return 42 + + good_func() # should not raise + + +def test_log_and_swallow_exception_swallows(caplog): + @log_and_swallow_exception + def bad_func(): + raise RuntimeError("boom") + + with caplog.at_level(logging.ERROR, logger="testgen"): + bad_func() # should not raise + + assert "boom" in caplog.text