Fairness metrics, such as equalized odds and positive predictive parity, are meant to evaluate predictive models on a given dataset for discrimination relative to a particular demographic category. In real-world deployment contexts, datasets are often subject to one or more varieties of measurement bias. These include everyday situations such as a college having academic performance data only for admitted applicants, employee performance data being systematically skewed against female software engineers, or banks affecting borrowers' repayment ability by setting interest rates differentially. In practice, fairness metrics turn out to be highly sensitive to the presence of such biases: even a relatively small magnitude of bias can render a fairness evaluation effectively meaningless. Adapting methods from causal sensitivity analysis, we have created a tool designed to statistically test for such impacts.
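For context on the metrics themselves (this is an illustrative sketch, not the tool's own implementation), equalized odds compares true and false positive rates across groups, while positive predictive parity compares precision. A minimal Python example on toy binary A, Y, and P data:

```python
import numpy as np

def group_rates(a, y, p, group):
    """TPR, FPR, and PPV of predictions p against outcomes y within one group of a."""
    mask = a == group
    y_g, p_g = y[mask], p[mask]
    tpr = p_g[y_g == 1].mean()   # P(P=1 | Y=1, A=group)
    fpr = p_g[y_g == 0].mean()   # P(P=1 | Y=0, A=group)
    ppv = y_g[p_g == 1].mean()   # P(Y=1 | P=1, A=group)
    return tpr, fpr, ppv

# toy data: sensitive attribute A, observed outcome Y, prediction P (all binary)
rng = np.random.default_rng(0)
a = rng.integers(0, 2, 1000)
y = rng.integers(0, 2, 1000)
p = rng.integers(0, 2, 1000)

tpr0, fpr0, ppv0 = group_rates(a, y, p, 0)
tpr1, fpr1, ppv1 = group_rates(a, y, p, 1)

# equalized odds looks at the TPR and FPR gaps; positive predictive parity at the PPV gap
print("TPR gap:", abs(tpr0 - tpr1))
print("FPR gap:", abs(fpr0 - fpr1))
print("PPV gap:", abs(ppv0 - ppv1))
```

The point of the sensitivity analysis is that gaps like these, computed on biased measurements of Y (or A or P), can differ substantially from the gaps that would be computed on the true quantities.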
The tool below provides a user interface for constructing biases and running sensitivity analyses. It is a wrapper around the codebase here. You can specify a bias by designing a DAG on the canvas and adding constraints to define your sensitivity parameter. The easiest way to get a sense of how this works is to look at the example biases. You can upload your dataset as a CSV file containing binary A, Y, and P columns, corresponding to the sensitive attribute, the observed outcome, and the prediction; a minimal sketch of this format is shown below. The readme in the codebase is a good place to further understand all of the options available here; the codebase also offers more advanced options. Both the codebase and this tool operate on JSON files that can be downloaded and uploaded to save and load custom configs between the two.
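As a quick illustration of the expected input (the column names A, Y, and P come from the description above; the file name and pandas usage are just an assumption for the example), a dataset could be written out like this:

```python
import pandas as pd

# hypothetical example: the three binary columns the tool expects
df = pd.DataFrame({
    "A": [0, 1, 0, 1],   # sensitive attribute
    "Y": [1, 0, 0, 1],   # observed outcome
    "P": [1, 1, 0, 1],   # model prediction
})
df.to_csv("my_dataset.csv", index=False)
```

The resulting CSV, with one row per individual and binary values in each column, can then be uploaded through the interface.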
Note: this tool is in beta testing; it is likely to have bugs and may not work as expected. Please report any issues to the authors, including the config that generated the error. The codebase above is more stable.