Outlier Detection Using IQR Method
This guide explains how to create a Sentinel to detect salary outliers using the Interquartile Range (IQR) method.
How the IQR Method Works
The IQR method identifies outliers as values that fall below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR, where:
- Q1: First quartile (25th percentile)
- Q3: Third quartile (75th percentile)
- IQR: Interquartile range (Q3 - Q1)
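The rule above can be sketched in a few lines of Python. This is a minimal illustration, not the code the sentinel generates: the function names `iqr_bounds` and `find_outliers` and the salary list are hypothetical, and the quartiles use the standard library's "inclusive" interpolation, which is an assumption. Other quartile conventions can give slightly different bounds on the same data.

```python
from statistics import quantiles


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) bounds for the k x IQR rule.

    Assumption: quartiles use the "inclusive" interpolation method;
    the sentinel may use a different convention.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, k=1.5):
    """Return the values that fall outside the k x IQR bounds."""
    lower, upper = iqr_bounds(values, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical salaries, not taken from outlier-dataset.csv.
salaries = [0, 58000, 60000, 62000, 64000, 66000, 68000, 70000, 72000]

print(iqr_bounds(salaries))
print(find_outliers(salaries))
```

For this hypothetical list, Q1 is 60000 and Q3 is 68000, so the bounds are 48000 and 80000 and only the salary of 0 is flagged.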
Example Dataset
Download the sample dataset as a CSV file: outlier-dataset.csv
Example Alert Condition
In the Alert Condition field, enter:
Detect salary outliers using the IQR method (1.5 × IQR rule)
This will flag any records where the salary is outside the calculated lower and upper bounds.
Example Alert Response
When an outlier is detected, the alert validation will show a response like this:
Alert Condition: Detect salary outliers using the IQR method (1.5 × IQR rule)
Alert Triggered! Your alert condition would trigger based on the sample data.
Execution Details:
- Alert Condition Evaluation: True (Alert would trigger)
- Code Execution Status: Successful
Matching Records (1 row):
| department | emp_id | joining_date | name | salary |
|---|---|---|---|---|
| Engineering | 21 | 2025-05-22T00:00:00 | Jhonny English | 0 |
This means the sentinel has detected an employee whose salary is an outlier according to the IQR method.
To confirm the calculation, you can ask for the statistical columns in the alert condition. The validation output then also shows the computed values used by the rule. For example, with the alert condition:
Detect salary outliers using the IQR method (1.5 × IQR rule). Include the statistical columns in the output for confirmation.
The validation would include extra columns like this:
| department | emp_id | iqr | joining_date | lower_bound | name | q1 | q3 | salary | upper_bound |
|---|---|---|---|---|---|---|---|---|---|
| Engineering | 21 | 11000 | 2025-05-22T00:00:00 | 42500.0 | Jhonny English | 59000 | 70000 | 0 | 86500.0 |
This shows the statistical values (q1, q3, iqr, lower/upper bounds) used to decide that the record is an outlier.
Why this record is flagged:
- The lower bound is calculated as Q1 - 1.5 × IQR = 59000 - 1.5 × 11000 = 59000 - 16500 = 42500.
- Jhonny English's salary is 0, which is below the lower bound (42500), so it falls outside the acceptable range on the low side.
- Therefore the IQR rule correctly identifies this row as an outlier.
About the multiplier (k × IQR)
The multiplier (commonly 1.5) determines how far from Q1/Q3 a value must be to be considered an outlier. The bounds are:
- Lower bound = Q1 - k × IQR
- Upper bound = Q3 + k × IQR
Using the dataset values shown above (Q1 = 59000, Q3 = 70000, IQR = 11000):
- With k = 1.5 (standard):
  - Lower bound = 59000 - 1.5 × 11000 = 42500
  - Upper bound = 70000 + 1.5 × 11000 = 86500
- With k = 2 (less sensitive, fewer outliers):
  - Lower bound = 59000 - 2 × 11000 = 37000
  - Upper bound = 70000 + 2 × 11000 = 92000
Increasing k makes the bounds wider and reduces false positives (only extreme values are flagged). Decreasing k makes the bounds narrower and increases sensitivity (more values flagged). Choose k based on how strict you want the sentinel to be.
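To see the effect of k on individual records, the following sketch recomputes the bounds from the quartiles reported above (Q1 = 59000, Q3 = 70000) and tests two salaries against them. The helper names `bounds_for` and `is_outlier` are hypothetical, and the salary of 40000 is an invented value chosen to sit between the two lower bounds; only the salary of 0 comes from the example record.

```python
def bounds_for(q1, q3, k=1.5):
    """Return (lower, upper) bounds for the k x IQR rule from given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def is_outlier(value, q1, q3, k=1.5):
    """True if value lies strictly outside the k x IQR bounds."""
    lower, upper = bounds_for(q1, q3, k)
    return value < lower or value > upper


# Quartiles reported in the validation output above.
Q1, Q3 = 59000, 70000

for k in (1.5, 2):
    lower, upper = bounds_for(Q1, Q3, k)
    print(f"k={k}: lower={lower}, upper={upper}")
    # 0 is the salary from the flagged record; 40000 is a hypothetical value.
    for salary in (0, 40000):
        print(f"  salary {salary} flagged: {is_outlier(salary, Q1, Q3, k)}")
```

A salary of 0 is flagged under both multipliers, while the hypothetical 40000 is flagged only with k = 1.5, because it lies between the two lower bounds (37000 and 42500).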
Tips
- You can change the multiplier in the alert condition (e.g., 2 × IQR). A larger multiplier is more lenient and flags fewer records; a smaller multiplier is stricter and flags more.
- Use the validation feature to preview which records will be flagged as outliers.