FDE: Fault Detection and Exclusion

This tutorial illustrates a few of the fault detection and exclusion capabilities from algorithms/fde.py.

[1]:

import numpy as np
import gnss_lib_py as glp

# load Android Google Challenge data
glp.make_dir("../data")
!wget https://raw.githubusercontent.com/Stanford-NavLab/gnss_lib_py/main/data/unit_test/google_decimeter_2022/device_gnss.csv --quiet -nc -O "../data/device_gnss.csv"
navdata = glp.AndroidDerived2022("../data/device_gnss.csv")

For this demonstration, we limit ourselves to the first time instance. This better shows when more faults are detected and how changing different hyperparameters changes each method’s behaviour.

[2]:

navdata = navdata.where('gps_millis', navdata['gps_millis', 0], 'eq')

Several pre-built methods exist for performing fault detection and exclusion and can be accessed through the solve_fde() function.

Greedy Euclidean Distance Matrix FDE

The first method is based on “Detection and Exclusion of Multiple Faults using Euclidean Distance Matrices” by Derek Knowles and Grace Gao from the ION GNSS+ 2023 conference.

[3]:

result = glp.solve_fde(navdata, method="edm")

After this method runs, a new row is added called “fault_edm” which has a 0 if no fault is predicted, 1 if a fault is predicted, and 2 for an unknown fault status (usually due to lack of necessary columns or information).

[4]:

result["fault_edm"]

[4]:

array([0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0,
       0, 0, 0, 2, 0, 0, 0, 2, 0, 0, 2, 0, 0, 2, 2, 2, 0])

You can change the threshold variable to be more or less sensitive based on false alarm or missed detection system requirements. Greedy EDM FDE has a range for the threshold between 0 and 1 since the detection statistic is normalized to 1. Note that if the threshold is set to the lower limit of zero, faults are detected until only four measurements remain at each timestep.

[5]:

result = glp.solve_fde(navdata, method="edm",
                       threshold=0)
result["fault_edm"]

[5]:

array([1, 1, 1, 1, 1, 2, 1, 1, 2, 2, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1,
       1, 1, 1, 2, 1, 1, 1, 2, 0, 1, 2, 0, 0, 2, 2, 2, 0])

The max_faults variable can be changed to apply a maximum number of faults detected at each timestep. In this example, the threshold is still set to zero, but we limit the faults removed with the max_faults variable.

[6]:

result = glp.solve_fde(navdata, method="edm",
                       threshold=0, max_faults=4)
result["fault_edm"]

[6]:

array([0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 0, 0, 1, 2, 2, 2, 2, 2, 1,
       0, 0, 1, 2, 0, 1, 0, 2, 0, 0, 2, 0, 0, 2, 2, 2, 0])

If the remove_outliers variable is set, then the outliers and unknown statuses will automatically be removed from the returned NavData object.

[7]:

result = glp.solve_fde(navdata, method="edm",
                       threshold=0, remove_outliers=True)
result["fault_edm"]

[7]:

array([0, 0, 0, 0])

Greedy Residual FDE

This FDE method is based on “Fast multiple fault exclusion with a large number of measurements.” by Juan Blanch, Todd Walter, and Per Enge from the ION GNSS+ 2015 conference.

[8]:

result = glp.solve_fde(navdata, method="residual")

After this method runs, a new row is added called “fault_residual” which has a 0 if no fault is predicted, 1 if a fault is predicted, and 2 for an unknown fault status (usually due to lack of necessary columns or information).

[9]:

result["fault_residual"]

[9]:

array([0, 0, 0, 0, 0, 2, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 0,
       0, 0, 1, 2, 0, 1, 0, 2, 0, 0, 2, 0, 0, 2, 2, 2, 0])

Evaluate FDE

The evaluate_fde() function can be used to create overall metrics on accuracy and timing based on a ground truth fault status row. The below example shows a comparison between the default parameters for greedy EDM and greedy residual FDE.

[10]:

edm_metrics, _ = glp.evaluate_fde(navdata, method="edm",
                                  fault_truth_row="MultipathIndicator",
                                  time_fde=True)
for key, value in edm_metrics.items():
    print(key,":",value)

dataset_timesteps : 1
measurement_counts_min : 39
measurement_counts_mean : 39.0
measurement_counts_median : 39.0
measurement_counts_max : 39
fault_counts_min : 9
fault_counts_mean : 9.0
fault_counts_median : 9.0
fault_counts_max : 9
faults_per_timestemp_min : 0.23076923076923078
faults_per_timestemp_mean : 0.23076923076923078
faults_per_timestemp_median : 0.23076923076923078
faults_per_timestemp_max : 0.23076923076923078
method : edm
total_measurements : 39
true_positives_count : 0
tpr : 0.0
true_negatives_count : 22
tnr : 1.0
missed_detection_count : 3
mdr : 1.0
false_alarm_count : 0
far : 0.0
unknown_count : 14
precision : 0.0
recall : 0.0
accuracy : 0.88
balanced_accuracy : 0.5
timestep_min_ms : 0.8757114410400391
timestep_mean_ms : 0.8757114410400391
timestep_median_ms : 0.8757114410400391
timestep_max_ms : 0.8757114410400391

[11]:

residual_metrics, _ = glp.evaluate_fde(navdata, method="residual",
                                       fault_truth_row="MultipathIndicator",
                                       time_fde=True)
for key, value in residual_metrics.items():
    print(key,":",value)

dataset_timesteps : 1
measurement_counts_min : 39
measurement_counts_mean : 39.0
measurement_counts_median : 39.0
measurement_counts_max : 39
fault_counts_min : 9
fault_counts_mean : 9.0
fault_counts_median : 9.0
fault_counts_max : 9
faults_per_timestemp_min : 0.23076923076923078
faults_per_timestemp_mean : 0.23076923076923078
faults_per_timestemp_median : 0.23076923076923078
faults_per_timestemp_max : 0.23076923076923078
method : residual
total_measurements : 39
true_positives_count : 1
tpr : 0.3333333333333333
true_negatives_count : 21
tnr : 0.9545454545454546
missed_detection_count : 2
mdr : 0.6666666666666666
false_alarm_count : 1
far : 0.045454545454545456
unknown_count : 14
precision : 0.5
recall : 0.3333333333333333
accuracy : 0.88
balanced_accuracy : 0.6439393939393939
timestep_min_ms : 47.2569465637207
timestep_mean_ms : 47.2569465637207
timestep_median_ms : 47.2569465637207
timestep_max_ms : 47.2569465637207

You can also compare the two methods based on the time they take.

[12]:

speed_increase = residual_metrics["timestep_mean_ms"] - edm_metrics["timestep_mean_ms"]
print(f"Greedy EDM is {np.round(speed_increase,1)} milliseconds faster than Greedy Residual on average!")

Greedy EDM is 46.4 milliseconds faster than Greedy Residual on average!