feat(telemetry): introduce telemetry system for usage data collection #1109

Draft
davidberenstein1957 wants to merge 1 commit into master from feat/telemetry

Conversation

@davidberenstein1957
Collaborator

  • Added a new telemetry module to collect and export usage data.
  • Implemented three telemetry tiers: Off, Internal, and Public.
  • Integrated OpenTelemetry for data export.
  • Created user prompts for telemetry consent on first run.
  • Updated documentation to explain telemetry features and configuration.

This enhancement aims to improve CodeCarbon by gathering anonymous usage data while ensuring user privacy and consent.
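
As a rough illustration of how the three tiers could be resolved from the environment, here is a minimal sketch. Only the variable name `CODECARBON_TELEMETRY` comes from the diff; the `TelemetryTier` enum, the `resolve_tier` helper, and the accepted string values are assumptions made for this example and may not match the module in this PR.

```python
import os
from enum import Enum

# Variable name taken from the diff in this PR.
TELEMETRY_ENV_VAR = "CODECARBON_TELEMETRY"


class TelemetryTier(Enum):
    """Hypothetical enum for the three tiers named in the description."""

    OFF = "off"
    INTERNAL = "internal"
    PUBLIC = "public"


def resolve_tier(environ=None):
    """Return the tier set in the environment, or None if unset or invalid.

    None stands for "no decision yet", so a caller could show the consent prompt.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get(TELEMETRY_ENV_VAR, "").strip().lower()
    try:
        return TelemetryTier(raw)
    except ValueError:
        # Covers both an unset variable (empty string) and an unknown value.
        return None
```

Returning `None` rather than a default tier keeps "the user has not chosen" distinct from "the user chose Off", which matters if the first-run prompt should only appear once.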

Description

Please explain the changes you made here.

Related Issue

Please link to the issue this PR resolves: [issue #1106]

Motivation and Context

Why is this change required? What problem does it solve?

How Has This Been Tested?

Please describe in detail how you tested your changes.

Screenshots (if appropriate):

Types of changes

What types of changes does your code introduce? Put an x in all the boxes that apply:

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

Go over all the following points, and put an x in all the boxes that apply.

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING.md document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.

from codecarbon.external.logger import logger

# Environment variable name for telemetry setting
TELEMETRY_ENV_VAR = "CODECARBON_TELEMETRY"

Collaborator Author

needs to be updated

# Environment variable for OTEL endpoint
OTEL_ENDPOINT_ENV_VAR = "CODECARBON_OTEL_ENDPOINT"

# Default OTEL endpoint (can be configured by CodeCarbon team)

Collaborator Author

needs to be updated

hardware_tracked: list = field(default_factory=list)
measure_power_interval_secs: float = 15.0

# ML Ecosystem (Tier 1: Internal)

Collaborator Author

are these enough? perhaps we also want to add some traditional ML libraries like sklearn etc?

container_runtime: str = ""
in_container: bool = False

# Emissions Data (Tier 2: Public only)

Collaborator Author

I think it would be cool to see if people are interested in sharing this when prompted. We could use it to share approximations of the carbon measured with CodeCarbon.


return self

def collect_ml_ecosystem(self) -> "TelemetryCollector":

Collaborator Author

perhaps this can be done a bit more efficiently.
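
One possible way to make the detection cheaper, sketched under assumptions: check whether each package is importable with `importlib.util.find_spec` instead of importing it, which avoids loading heavy frameworks just to record their presence. The package list and the `detect_installed` helper are hypothetical and are not taken from the diff; `sklearn`, `xgboost`, and `lightgbm` are included only to illustrate the "traditional ML" suggestion above.

```python
from importlib.util import find_spec

# Hypothetical candidate list; not the list used in this PR.
ML_PACKAGES = ("torch", "tensorflow", "transformers", "sklearn", "xgboost", "lightgbm")


def detect_installed(packages=ML_PACKAGES):
    """Return the top-level packages that are importable, without importing them."""
    found = []
    for name in packages:
        try:
            if find_spec(name) is not None:
                found.append(name)
        except (ImportError, ValueError):
            # find_spec can raise for broken or partially initialised modules.
            continue
    return found
```

This only reports presence. Version strings would need a separate lookup, for example through `importlib.metadata.version`, which takes the distribution name rather than the import name.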

Comment on lines +14 to +24
try:
    from opentelemetry import trace
    from opentelemetry.sdk.trace import TracerProvider
    from opentelemetry.sdk.trace.export import BatchSpanProcessor
    from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
    from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter as OTLPSpanExporterHTTP

    OTEL_AVAILABLE = True
except ImportError:
    OTEL_AVAILABLE = False
    logger.debug("OpenTelemetry not available, telemetry will not be exported")

Collaborator Author

otel should be installed by default.

self._tracer = None
self._initialized = False

if not OTEL_AVAILABLE:

Collaborator Author

otel should be enabled by default

if not self._config.is_enabled:
    return False

try:

Collaborator Author

enabled by default

Returns:
TelemetryExporter instance or None if not available
"""
if not OTEL_AVAILABLE:

Collaborator Author

enabled by default

Comment on lines +18 to +31
try:
    from rich.console import Console
    from rich.prompt import Prompt

    RICH_AVAILABLE = True
except ImportError:
    RICH_AVAILABLE = False

try:
    import questionary

    QUESTIONARY_AVAILABLE = True
except ImportError:
    QUESTIONARY_AVAILABLE = False

Collaborator Author

remove this
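
If the optional `rich` and `questionary` imports are dropped, the first-run consent prompt could rely on the standard library alone. The sketch below is one way that might look; the `prompt_for_consent` function, the menu wording, the choice mapping, and the fallback to "off" when no one can answer are all assumptions for illustration, not behaviour taken from this PR.

```python
import sys

# Hypothetical mapping from menu choice to tier name (tier names from the PR description).
CHOICES = {"0": "off", "1": "internal", "2": "public"}


def prompt_for_consent(ask=input, interactive=None):
    """Ask once for a telemetry tier; fall back to "off" when nobody can answer."""
    if interactive is None:
        interactive = sys.stdin is not None and sys.stdin.isatty()
    if not interactive:
        # Assumed policy: never block CI jobs or batch runs waiting for input.
        return "off"
    try:
        answer = ask("Share usage data? [0] off  [1] internal  [2] public: ")
    except (EOFError, KeyboardInterrupt):
        return "off"
    # Unrecognised answers are treated as "off" in this sketch.
    return CHOICES.get(answer.strip(), "off")
```

Passing `ask` and `interactive` as parameters keeps the function testable without a real terminal, for example `prompt_for_consent(ask=lambda _: "1", interactive=True)`.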
