Methodology
Dataset
This dashboard uses the Kaggle Agency Performance Dataset, a real P&C insurance dataset containing 213,328 rows of annual data from 2005 to 2015. Each row represents one Agency × Product × Product Line × State × Year observation.
Metric Calculations
- Loss Ratio is computed as
SUM(Incurred Losses) / SUM(Earned Premium) aggregated per year. The dataset's raw LOSS_RATIO column is not used directly because it contains row-level sentinel values and outliers that distort aggregated averages.
- Retention Ratio, 3-Year Loss Ratio, and 3-Year Growth Rate are averaged from the dataset's raw ratio columns after excluding sentinel values 99997, 99998, and 99999, which indicate missing or invalid data.
- All other financial metrics (premiums, losses) use SUM aggregation per year. All count metrics (policies, producers) also use SUM.
Record Counts
Matching records: total rows satisfying the active filters. Valid records (per metric): rows usable for that metric's calculation — excludes sentinel values for ratio metrics; requires non-null, non-zero earned premium for Loss Ratio.
CAGR
Compound Annual Growth Rate is shown for non-ratio metrics where both the first-year and last-year values are positive.