# Code Coverage Module

Exports test coverage data to ClickHouse, enriched with:

- **Feature category ownership** - group, stage, and section for each covered file
- **Responsibility classification** - whether coverage comes from unit or integration tests
- **Test-to-file mappings** - which source files each test covers

## How It Works

### Feature Category Attribution

Coverage data is enriched with feature category ownership by joining three data sources:

1. **Coverage Report** (LCOV) - which source files have coverage and their percentages
2. **Test Map** - which test files cover each source file
3. **Test Reports** (JSON) - which feature category each test file belongs to

```mermaid
flowchart LR
    subgraph Inputs
        A["<b>Coverage Report</b><br/>user.rb: 85%"]
        B["<b>Test Map</b><br/>user.rb → user_spec.rb"]
        C["<b>Test Reports</b><br/>user_spec.rb → user_profile"]
    end

    subgraph Output
        E["<b>ClickHouse Record</b><br/>file: user.rb<br/>feature_category: user_profile<br/>coverage: 85%"]
    end

    A --> D((Join))
    B --> D
    C --> D
    D --> E
```

This enables **multi-category attribution**: if a source file is covered by tests from
multiple feature categories, the exporter writes a separate ClickHouse record for each category.
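
The join and the multi-category fan-out can be sketched in plain Ruby. This is a minimal illustration, not the gem's implementation: the hash shapes, the `build_records` helper, and the record keys are assumptions made for the example.

```ruby
# Hypothetical in-memory shapes; the gem's real parsers and schema may differ.
coverage = { "app/models/user.rb" => 85.0 } # from the coverage report
test_map = {                                # source file => covering test files
  "app/models/user.rb" => ["spec/models/user_spec.rb", "spec/requests/users_spec.rb"]
}
categories = {                              # test file => feature category
  "spec/models/user_spec.rb" => "user_profile",
  "spec/requests/users_spec.rb" => "authentication"
}

# Join the three inputs and emit one record per (file, feature category) pair.
def build_records(coverage, test_map, categories)
  coverage.flat_map do |file, percent|
    found = test_map.fetch(file, []).map { |test| categories[test] }.compact.uniq
    found = [nil] if found.empty? # no attribution possible: keep the file, category stays empty
    found.map { |category| { file: file, feature_category: category, coverage: percent } }
  end
end

records = build_records(coverage, test_map, categories)
# records holds two entries for app/models/user.rb, one per feature category
```

In this sketch a file without any mapped test still produces a record, with an empty category.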

### Why All Three Inputs Are Required

| Input | Provides | Without it |
|-------|----------|------------|
| Coverage Report | Line/branch coverage percentages | No coverage metrics |
| Test Map | Source file → test file relationships | No feature category attribution (all records have `category=NULL`) |
| Test Reports | Test file → feature category metadata | No feature category attribution (all records have `category=NULL`) |
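
As an illustration of the first row, a per-file line percentage can be derived from the `SF`, `LF` (lines found), and `LH` (lines hit) records of an LCOV file. The sketch below is an assumption about how such a percentage could be computed; the `line_coverage` helper is hypothetical and is not the gem's parser.

```ruby
# Tiny LCOV excerpt; per-line DA: records are omitted for brevity.
lcov = <<~LCOV
  SF:app/models/user.rb
  LF:20
  LH:17
  end_of_record
LCOV

# Returns { source file => line coverage percentage }.
def line_coverage(lcov_text)
  result = {}
  current = nil
  found = 0
  hit = 0
  lcov_text.each_line(chomp: true) do |line|
    case line
    when /\ASF:(.+)\z/
      current = Regexp.last_match(1)
      found = 0
      hit = 0
    when /\ALF:(\d+)\z/
      found = Regexp.last_match(1).to_i
    when /\ALH:(\d+)\z/
      hit = Regexp.last_match(1).to_i
    when "end_of_record"
      result[current] = found.zero? ? 0.0 : (100.0 * hit / found).round(2) if current
      current = nil
    end
  end
  result
end

line_coverage(lcov) # => { "app/models/user.rb" => 85.0 }
```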

## Responsibility Classification

Tests are classified as either **responsible** or **dependent**:

- **Responsible**: Unit tests that directly test a component in isolation
- **Dependent**: Integration/E2E tests that exercise a component through other layers

This classification is tracked per `(source_file, feature_category)` combination using two boolean columns:

| is_responsible | is_dependent | Meaning |
|----------------|--------------|---------|
| `true` | `true` | Source file has both unit AND integration test coverage from this feature category |
| `true` | `false` | Source file has only unit test coverage from this feature category |
| `false` | `true` | Source file has only integration test coverage from this feature category |
| `nil` | `nil` | No test mapping exists for this source file |
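
A rough sketch of how the two flags could be derived, assuming each covering test has already been classified. The row shape and the `responsibility_flags` helper are hypothetical and only illustrate the table above.

```ruby
# Hypothetical input: one row per (source file, feature category, test classification).
rows = [
  { file: "app/models/user.rb", category: "user_profile",   kind: :responsible },
  { file: "app/models/user.rb", category: "user_profile",   kind: :dependent },
  { file: "app/models/user.rb", category: "authentication", kind: :dependent }
]

# Collapse the rows into one pair of flags per (file, category) combination.
def responsibility_flags(rows)
  rows.group_by { |row| [row[:file], row[:category]] }.transform_values do |group|
    kinds = group.map { |row| row[:kind] }
    { is_responsible: kinds.include?(:responsible), is_dependent: kinds.include?(:dependent) }
  end
end

flags = responsibility_flags(rows)
flags[["app/models/user.rb", "user_profile"]]
# => { is_responsible: true, is_dependent: true }
flags[["app/models/user.rb", "authentication"]]
# => { is_responsible: false, is_dependent: true }
```

Source files with no test mapping never appear in `rows`, which corresponds to the `nil` / `nil` row of the table.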

### Configuration

This gem is designed to be reusable across different projects. Classification patterns
are project-specific and must be provided via a YAML config file, since different
codebases have different test directory structures.

> **Note:** The table above describes the *semantic meaning* of the flags. The patterns
> you configure determine *which tests* produce those flags for your project.

The config file defines regex patterns for matching test file paths:

```yaml
# responsibility_patterns.yml
responsible:
  - "^spec/(models|controllers|services)/"   # Backend unit tests
  - "^spec/frontend/"                        # Frontend unit tests
  - "_test\\.go$"                            # Go unit tests

dependent:
  - "^spec/(requests|features|integration)/" # Backend integration tests
  - "^spec/frontend_integration/"            # Frontend integration tests
  - "^qa/"                                   # E2E tests
  - "_integration_test\\.go$"                # Go integration tests
```

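**Pattern matching rules:**
1. Dependent patterns are checked first, so a test that matches both lists is classified as "dependent"
2. If no pattern matches, the test defaults to "dependent"
3. Patterns are Ruby regexes; inside double-quoted YAML strings, write `\\` for a single regex backslash (for example, `\\.` matches a literal `.`)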

**Why dependent has priority:** This is a deliberately conservative choice. `is_responsible: true`
makes a stronger claim ("this file has unit test coverage") than `is_dependent: true`.
When a test matches both pattern lists or neither, classifying it as "dependent" avoids
incorrectly inflating unit test coverage metrics. It's safer to under-claim than over-claim.
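
The matching rules can be sketched as a small classifier. This is an illustrative reading of the rules above, not the gem's code: the `compile` and `classify` helpers are hypothetical, and the config is inlined here instead of being read from a file.

```ruby
require "yaml"

# Inline excerpt of the sample config; the single-quoted heredoc keeps `\\` as written.
config = YAML.safe_load(<<~'CONFIG')
  responsible:
    - "^spec/(models|controllers|services)/"
    - "_test\\.go$"
  dependent:
    - "^spec/(requests|features|integration)/"
    - "_integration_test\\.go$"
CONFIG

# Turn the YAML strings into Ruby regexes.
def compile(patterns)
  Array(patterns).map { |source| Regexp.new(source) }
end

def classify(test_path, responsible:, dependent:)
  return :dependent if dependent.any? { |re| re.match?(test_path) }     # rule 1: dependent wins
  return :responsible if responsible.any? { |re| re.match?(test_path) }

  :dependent                                                            # rule 2: conservative default
end

responsible = compile(config["responsible"])
dependent = compile(config["dependent"])

classify("spec/models/user_spec.rb", responsible: responsible, dependent: dependent)
# => :responsible
classify("internal/api/client_integration_test.go", responsible: responsible, dependent: dependent)
# => :dependent (also matches "_test\.go$", but dependent patterns are checked first)
```

The second call shows why the ordering matters: `_integration_test.go` files also match the broader `_test\.go$` pattern.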

### Example: GitLab Configuration

```yaml
# .gitlab/coverage/responsibility_patterns.yml
responsible:
  # Backend unit test directories
  - "^spec/(models|controllers|services|workers|helpers|mailers|policies|presenters|uploaders|validators|lib|graphql|serializers|components)/"
  - "^ee/spec/(models|controllers|services|workers|helpers|mailers|policies|presenters|uploaders|validators|lib|graphql|serializers|components)/"
  # Frontend unit tests
  - "^spec/frontend/"
  - "^ee/spec/frontend/"
  # Go unit tests
  - "_test\\.go$"

dependent:
  # Backend integration tests
  - "^spec/(requests|features|system|integration)/"
  - "^ee/spec/(requests|features|system|integration)/"
  # Frontend integration tests
  - "^spec/frontend_integration/"
  - "^ee/spec/frontend_integration/"
  # E2E tests
  - "^qa/"
  # Go integration tests
  - "_integration_test\\.go$"
```

### Example: Standard Rails Project

```yaml
# config/responsibility_patterns.yml
responsible:
  - "^test/(models|controllers|services|helpers|mailers)/"
  - "^test/unit/"

dependent:
  - "^test/(integration|system)/"
  - "^spec/features/"
```

## Test-to-File Mappings

When a test map is provided, the module also exports test-to-source-file relationships
to a separate `test_file_mappings` table. This enables:

- **Coverage context for tests** - see which source files a specific test covers
- **Impact analysis** - understand which files would lose coverage if a test is quarantined
- **Flaky test triage** - correlate flaky tests with the source files they cover
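
As a sketch of the idea, the test map can be inverted into one row per test and source file pair, and the same data can answer the impact-analysis question. The map shape, row keys, and both helpers below are assumptions for illustration; the actual `test_file_mappings` schema may differ.

```ruby
# Hypothetical test map: source file => test files that cover it.
test_map = {
  "app/models/user.rb" => ["spec/models/user_spec.rb", "spec/requests/users_spec.rb"],
  "app/models/project.rb" => ["spec/requests/users_spec.rb"]
}

# One row per (test file, source file) pair.
def test_file_mappings(test_map)
  test_map.flat_map do |source_file, tests|
    tests.uniq.map { |test_file| { test_file: test_file, source_file: source_file } }
  end
end

# Source files whose only covering test is the quarantined one.
def files_losing_coverage(test_map, quarantined_test)
  test_map.select { |_source, tests| tests.uniq == [quarantined_test] }.keys
end

test_file_mappings(test_map).size
# => 3
files_losing_coverage(test_map, "spec/requests/users_spec.rb")
# => ["app/models/project.rb"]
```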

## CLI

Example usage:

```bash
test-coverage \
  --test-reports 'rspec/*.json' \
  --coverage-report 'coverage/lcov.info' \
  --test-map 'mapping.json' \
  --responsibility-patterns 'config/responsibility_patterns.yml' \
  --clickhouse-url 'https://clickhouse.example.com' \
  --clickhouse-database 'coverage' \
  --clickhouse-username 'user'
```

See `exe/test-coverage --help` for full usage.
