RATE-Analytics: Next Generation Predictive Analytics for Data-Driven Banking and Insurance

Authors: Dennis Collaris, Mykola Pechenizkiy, Jarke J. van Wijk



Author Details

Dennis Collaris
  • Eindhoven University of Technology, The Netherlands
Mykola Pechenizkiy
  • Eindhoven University of Technology, The Netherlands
Jarke J. van Wijk
  • Eindhoven University of Technology, The Netherlands

Acknowledgements

We would like to thank the many colleagues on the RATE project team at Tilburg University, Rabobank, Achmea, and TU/e who provided support and facilitated collaboration over the years. We thank NWO, Rabobank, and Achmea for the funding they provided. This work is part of the Commit2Data research programme, specifically the RATE Analytics project (project number 628.003.001). Last but not least, we would like to thank the reviewers for their constructive feedback.

Cite As

Dennis Collaris, Mykola Pechenizkiy, and Jarke J. van Wijk. RATE-Analytics: Next Generation Predictive Analytics for Data-Driven Banking and Insurance. In Commit2Data. Open Access Series in Informatics (OASIcs), Volume 124, pp. 8:1-8:11, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2024)
https://doi.org/10.4230/OASIcs.Commit2Data.8

Abstract

We conducted the RATE-Analytics project: a unique collaboration between Rabobank, Achmea, Tilburg University, and Eindhoven University of Technology. We aimed to develop foundations and techniques for next-generation big data analytics. The main challenge of existing approaches is a lack of reliability and trustworthiness: if experts do not trust a model or its predictions, they are much less likely to use and rely on that model. Hence, we focused on solutions that bring the human in the loop, enabling the diagnosis and refinement of models and supporting decision making and justification. This chapter zooms in on the part of the project focused on developing explainable and trustworthy machine learning techniques.

Subject Classification

ACM Subject Classification
  • Computing methodologies → Philosophical/theoretical foundations of artificial intelligence
  • Computing methodologies → Machine learning
  • Human-centered computing
Keywords
  • Visualization
  • Visual Analytics
  • Machine Learning
  • Interpretability
  • Explainability
  • XAI
