
A Distributed Fair Random Forest

thesis
posted on 02.05.2020, 00:00 by James Fantin
Machine learning algorithms are increasingly responsible for making critical decisions that have broad societal impact, and questions are arising about the fairness of the algorithms that make these decisions. Although fair models have been proposed, many require direct access to private data, which may be impossible under new privacy regulations. We propose a distributed fair random forest algorithm that does not require direct access to private demographic data. Our approach generates decision trees at random, adds a tree to the forest only if it is fair, and uses a weighted voting mechanism to maintain accuracy. Building on existing literature, we assume that a third party holds the private demographic data and can communicate with a data center, which builds the model without compromising the privacy of individuals' demographic data. We compare our algorithm against existing fair random forest and decision tree algorithms and show that our method can outperform existing methods.
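The general idea in the abstract (random trees, a fairness check answered by a third party, and weighted voting) can be illustrated with a minimal Python sketch. It is not the thesis implementation. Everything specific in it is an assumption made for illustration: trees are reduced to one-split stumps, fairness is measured as a statistical parity gap, the acceptance threshold `max_gap` is arbitrary, vote weights are training accuracies, and all function names are hypothetical.

```python
import random


def make_random_stump(n_features, rng):
    """Draw a random one-split tree: (feature index, threshold, polarity).

    Assumption: features are scaled to [0, 1], so thresholds are drawn there.
    """
    return rng.randrange(n_features), rng.random(), rng.choice([0, 1])


def stump_predict(stump, x):
    """Predict 0 or 1 for a single feature vector x."""
    feature, threshold, polarity = stump
    above = 1 if x[feature] > threshold else 0
    return above if polarity == 1 else 1 - above


def make_fairness_oracle(sensitive):
    """Hypothetical third party that holds the sensitive attribute.

    The data center only receives an aggregate gap, never the attribute itself.
    Assumption: fairness is the absolute difference in positive-prediction
    rates between the two groups (statistical parity).
    """
    def parity_gap(predictions):
        rates = []
        for group in (0, 1):
            members = [p for p, s in zip(predictions, sensitive) if s == group]
            rates.append(sum(members) / len(members) if members else 0.0)
        return abs(rates[0] - rates[1])
    return parity_gap


def build_fair_forest(X, y, parity_gap, n_candidates=200, max_gap=0.1, seed=0):
    """Keep only random stumps whose parity gap is at most max_gap.

    Each kept stump is stored with its training accuracy as its vote weight.
    """
    rng = random.Random(seed)
    forest = []
    for _ in range(n_candidates):
        stump = make_random_stump(len(X[0]), rng)
        preds = [stump_predict(stump, x) for x in X]
        if parity_gap(preds) <= max_gap:
            accuracy = sum(p == t for p, t in zip(preds, y)) / len(y)
            forest.append((stump, accuracy))
    return forest


def forest_predict(forest, x):
    """Weighted vote: each stump votes +weight for class 1, -weight for class 0."""
    score = sum(w if stump_predict(s, x) == 1 else -w for s, w in forest)
    return 1 if score > 0 else 0
```

In this sketch the data center calls `build_fair_forest` with the oracle returned by `make_fairness_oracle`, so the sensitive attribute stays inside the oracle's closure and only the aggregate gap crosses the boundary. A real deployment would need an actual communication channel and a privacy analysis of what the aggregate reveals.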


Advisor

Lan, Chao

Collection

Honors Theses AY 19/20


Honors Capstones AY 19/20
