feat(telemetry): introduce telemetry system for usage data collection #1109
davidberenstein1957 wants to merge 1 commit into master
- Added a new telemetry module to collect and export usage data.
- Implemented three telemetry tiers: Off, Internal, and Public.
- Integrated OpenTelemetry for data export.
- Created user prompts for telemetry consent on first run.
- Updated documentation to explain telemetry features and configuration.

This enhancement aims to improve CodeCarbon by gathering anonymous usage data while ensuring user privacy and consent.
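To make the three tiers concrete, here is a minimal sketch of how a tier could be resolved from the `CODECARBON_TELEMETRY` environment variable named in this PR. The `TelemetryTier` enum, the `resolve_tier` helper, the accepted string values, and the fallback to Off on an unrecognised value are all assumptions for illustration, not the code in this PR.

```python
import os
from enum import Enum
from typing import Mapping, Optional


class TelemetryTier(Enum):
    """Hypothetical enum for the three tiers listed in the description."""

    OFF = "off"
    INTERNAL = "internal"
    PUBLIC = "public"


def resolve_tier(env: Optional[Mapping[str, str]] = None) -> Optional[TelemetryTier]:
    """Return the configured tier, or None if the user has not chosen yet.

    The accepted values ("off", "internal", "public") are an assumption.
    """
    env = os.environ if env is None else env
    raw = env.get("CODECARBON_TELEMETRY", "").strip().lower()
    if not raw:
        # Nothing configured: the caller may decide to show the consent prompt.
        return None
    try:
        return TelemetryTier(raw)
    except ValueError:
        # Unrecognised value: fall back to the most private tier (assumption).
        return TelemetryTier.OFF
```

Returning `None` for an unset variable keeps "no choice yet" distinct from an explicit opt-out, which is what a first-run prompt would need.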
| from codecarbon.external.logger import logger | ||
|
|
||
| # Environment variable name for telemetry setting | ||
| TELEMETRY_ENV_VAR = "CODECARBON_TELEMETRY" |
needs to be updated
| # Environment variable for OTEL endpoint | ||
| OTEL_ENDPOINT_ENV_VAR = "CODECARBON_OTEL_ENDPOINT" | ||
|
|
||
| # Default OTEL endpoint (can be configured by CodeCarbon team) |
needs to be updated
| hardware_tracked: list = field(default_factory=list) | ||
| measure_power_interval_secs: float = 15.0 | ||
|
|
||
| # ML Ecosystem (Tier 1: Internal) |
Are these enough? Perhaps we should also add some traditional ML libraries such as sklearn.
| container_runtime: str = "" | ||
| in_container: bool = False | ||
|
|
||
| # Emissions Data (Tier 2: Public only) |
I think it would be cool to see whether people are interested in sharing this when prompted. We could use it to share an approximation of the carbon measured with CodeCarbon.
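One way to keep emissions data restricted to the Public tier is to filter the payload just before export. The sketch below assumes the payload is a flat dict and that tiers are plain strings. `filter_payload`, `PUBLIC_ONLY_FIELDS`, and the `emissions_kg` field name are hypothetical. `in_container` and `hardware_tracked` are taken from the diff above.

```python
# Hypothetical set of fields that only the Public tier may export.
PUBLIC_ONLY_FIELDS = frozenset({"emissions_kg"})


def filter_payload(payload: dict, tier: str) -> dict:
    """Return only the fields that the given tier is allowed to export."""
    if tier == "public":
        return dict(payload)
    if tier == "internal":
        return {k: v for k, v in payload.items() if k not in PUBLIC_ONLY_FIELDS}
    # "off" or any unknown tier: export nothing.
    return {}


sample = {"in_container": False, "hardware_tracked": ["cpu"], "emissions_kg": 0.12}
internal_view = filter_payload(sample, "internal")
```

Filtering at a single choke point means a new emissions field only needs to be added to one set to stay out of the Internal tier.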
|
|
||
| return self | ||
|
|
||
| def collect_ml_ecosystem(self) -> "TelemetryCollector": |
perhaps this can be done a bit more efficiently.
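One possible way to make this cheaper is to check whether a library is installed without importing it, using `importlib.util.find_spec` from the standard library. This is only a sketch: the `detect_installed` helper and the candidate list are assumptions, and the body of `collect_ml_ecosystem` is not shown in this diff, so it may already work differently.

```python
from importlib.util import find_spec
from typing import Iterable, List

# Hypothetical candidate list; "sklearn" reflects the suggestion to cover
# traditional ML libraries as well.
ML_LIBRARIES = ("torch", "tensorflow", "jax", "transformers", "sklearn", "xgboost")


def detect_installed(candidates: Iterable[str] = ML_LIBRARIES) -> List[str]:
    """Return the candidate top-level modules that can be found.

    For top-level names, find_spec locates the module without executing it,
    so heavy frameworks are not loaded just to be counted.
    """
    found = []
    for name in candidates:
        try:
            if find_spec(name) is not None:
                found.append(name)
        except (ImportError, ValueError):
            # Broken or half-initialised modules are treated as absent.
            continue
    return found
```

For example, `detect_installed(("json", "no_such_module_xyz"))` returns only the names that resolve on the current interpreter.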
| try: | ||
| from opentelemetry import trace | ||
| from opentelemetry.sdk.trace import TracerProvider | ||
| from opentelemetry.sdk.trace.export import BatchSpanProcessor | ||
| from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter | ||
| from opentelemetry.exporter.otlp.proto.http.trace_exporter import OTLPSpanExporter as OTLPSpanExporterHTTP | ||
|
|
||
| OTEL_AVAILABLE = True | ||
| except ImportError: | ||
| OTEL_AVAILABLE = False | ||
| logger.debug("OpenTelemetry not available, telemetry will not be exported") |
otel should be installed by default.
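If OpenTelemetry becomes a required dependency, the module-level guard is no longer needed and the choice between the two imported exporters can be made per endpoint. The sketch below is an assumption about how that selection might look. `pick_protocol` and `build_span_exporter` are hypothetical helpers, and the heuristic relies only on the conventional OTLP defaults (port 4318 and the `/v1/traces` path for HTTP, gRPC otherwise).

```python
from urllib.parse import urlparse


def pick_protocol(endpoint: str) -> str:
    """Guess the OTLP transport from the endpoint (heuristic, assumption)."""
    parsed = urlparse(endpoint)
    if parsed.port == 4318 or parsed.path.rstrip("/").endswith("/v1/traces"):
        return "http"
    return "grpc"


def build_span_exporter(endpoint: str):
    """Create the matching OTLP span exporter.

    Imports are done lazily so that only the selected transport is loaded.
    Requires the corresponding opentelemetry exporter package.
    """
    if pick_protocol(endpoint) == "http":
        from opentelemetry.exporter.otlp.proto.http.trace_exporter import (
            OTLPSpanExporter,
        )
    else:
        from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import (
            OTLPSpanExporter,
        )
    return OTLPSpanExporter(endpoint=endpoint)
```

An explicit protocol setting would be more robust than guessing from the URL. The heuristic is shown only to illustrate where the two exporter imports from the diff would be used.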
| self._tracer = None | ||
| self._initialized = False | ||
|
|
||
| if not OTEL_AVAILABLE: |
otel should be enabled by default
| if not self._config.is_enabled: | ||
| return False | ||
|
|
||
| try: |
enabled by default
| Returns: | ||
| TelemetryExporter instance or None if not available | ||
| """ | ||
| if not OTEL_AVAILABLE: |
enabled by default
| try: | ||
| from rich.console import Console | ||
| from rich.prompt import Prompt | ||
|
|
||
| RICH_AVAILABLE = True | ||
| except ImportError: | ||
| RICH_AVAILABLE = False | ||
|
|
||
| try: | ||
| import questionary | ||
|
|
||
| QUESTIONARY_AVAILABLE = True | ||
| except ImportError: | ||
| QUESTIONARY_AVAILABLE = False |
remove this
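If the optional-import guards are removed, one fallback that needs no extra dependency is a plain standard-library prompt. The following is a sketch under assumptions: `ask_telemetry_consent`, the menu wording, and the choice to return `None` (no decision recorded) in non-interactive sessions are all hypothetical and not taken from this PR.

```python
import sys
from typing import Callable, Optional

# Hypothetical menu mapping for the three tiers.
CHOICES = {"1": "off", "2": "internal", "3": "public"}


def ask_telemetry_consent(
    input_fn: Callable[[str], str] = input,
    interactive: Optional[bool] = None,
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for a telemetry tier; return None if no valid choice was made."""
    if interactive is None:
        interactive = sys.stdin is not None and sys.stdin.isatty()
    if not interactive:
        # Never block CI jobs or notebooks waiting for input.
        return None
    prompt = "Telemetry: [1] off  [2] internal  [3] public > "
    for _ in range(max_attempts):
        answer = input_fn(prompt).strip().lower()
        if answer in CHOICES:
            return CHOICES[answer]
        if answer in CHOICES.values():
            return answer
    return None
```

Passing `input_fn` and `interactive` explicitly keeps the prompt testable without a real terminal.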
Description
Please explain the changes you made here.
Related Issue
Please link to the issue this PR resolves: [issue #1106]
Motivation and Context
Why is this change required? What problem does it solve?
How Has This Been Tested?
Please describe in detail how you tested your changes.
Screenshots (if appropriate):
Types of changes
What types of changes does your code introduce? Put an x in all the boxes that apply:
Checklist:
Go over all the following points, and put an x in all the boxes that apply.