Responsible and trustworthy AI concerns the development and use of AI that adheres to ethical standards, ensuring reliable, fair, transparent and privacy-friendly technologies. This implies AI systems that are designed to be secure, include mechanisms to prevent bias, and let users understand and control how they operate. The aim is to ensure that AI makes a positive contribution to society while minimizing potential risks.
DataLab Group initiatives for safe and transparent AI
Several tools have been implemented to ensure data and model control at every stage of an artificial intelligence project. As soon as data is received, whether for production or for simple experimentation, the DataLab Group checks its quality (drift detection, detection and handling of outliers, presence of the expected elements and their consistency, etc.). These checks are carried out systematically on receipt of the data, and proof of them is archived according to a precise documentation format, which will be detailed later in this article.
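To make this concrete, here is a minimal sketch of the kind of incoming-data checks described above: presence of expected columns, outlier detection and a drift test against a reference sample. The column names, thresholds and libraries are illustrative assumptions, not the DataLab Group's actual pipeline.

```python
# Minimal sketch of incoming-data checks: expected columns, outliers, drift.
# Column names and thresholds are hypothetical, not the actual DataLab rules.
import pandas as pd
from scipy.stats import ks_2samp

EXPECTED_COLUMNS = {"customer_id", "amount", "document_type"}  # hypothetical schema


def check_batch(batch: pd.DataFrame, reference: pd.DataFrame) -> dict:
    report = {}
    # 1. Presence of the expected elements and their consistency
    report["missing_columns"] = sorted(EXPECTED_COLUMNS - set(batch.columns))
    report["null_ratio"] = batch.isna().mean().to_dict()
    # 2. Detection of outliers on a numeric field (IQR rule)
    q1, q3 = batch["amount"].quantile([0.25, 0.75])
    iqr = q3 - q1
    mask = (batch["amount"] < q1 - 1.5 * iqr) | (batch["amount"] > q3 + 1.5 * iqr)
    report["outlier_count"] = int(mask.sum())
    # 3. Drift check against a reference sample (two-sample KS test)
    _, p_value = ks_2samp(reference["amount"], batch["amount"])
    report["drift_suspected"] = bool(p_value < 0.05)
    return report  # archived as part of the documented proof of checks
```

A report of this kind, produced on every data delivery, is the sort of artefact that can be archived as proof that the checks were run.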
Regarding the modeling part, a customizable end-to-end AutoML solution has been designed not only to facilitate the deployment of ML solutions, but also to control and supervise every modeling stage. The tool provides several features to ensure end-to-end control of the models it delivers, such as:
- Meta-Learning: methodically compare models to strike a balance between performance and complexity
- Bias: detect the presence of bias in data
- Interpretability: explain the predictions generated by the tool
- Code Carbon: trace CO2 emissions at each stage of the pipeline (see the sketch after this list)
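As an illustration of the Code Carbon item above, here is a minimal sketch of how CO2 tracking can be wrapped around one pipeline stage using the codecarbon library. The model, data and project name are placeholders, not the AutoML tool's actual code.

```python
# Minimal sketch of CO2 tracking around one pipeline stage with CodeCarbon.
# The model and data are placeholders, not the AutoML tool's actual code.
from codecarbon import EmissionsTracker
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

tracker = EmissionsTracker(project_name="automl_training_stage")
tracker.start()
model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
emissions_kg = tracker.stop()  # kg CO2eq emitted by this stage

print(f"Training stage emitted ~{emissions_kg:.6f} kg CO2eq")
```

The figure returned by `tracker.stop()` can be logged alongside the model's performance metrics, which is what keeps the carbon cost of each stage traceable.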
Finally, once the AI solutions have been deployed, the DataLab Group has built a monitoring platform to continuously track and control the performance of the models in production.
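One example of a metric such a platform can track is the Population Stability Index (PSI) between the score distribution observed at training time and the live distribution. The sketch below is illustrative; the data and the alert threshold are assumptions, not the platform's actual implementation.

```python
# Minimal sketch of one monitoring metric: the Population Stability Index (PSI)
# between the reference score distribution and the live one. Illustrative only.
import numpy as np


def psi(reference: np.ndarray, live: np.ndarray, bins: int = 10) -> float:
    edges = np.quantile(reference, np.linspace(0, 1, bins + 1))
    live = np.clip(live, edges[0], edges[-1])  # keep live scores inside the reference range
    ref_pct = np.histogram(reference, bins=edges)[0] / len(reference)
    live_pct = np.histogram(live, bins=edges)[0] / len(live)
    ref_pct = np.clip(ref_pct, 1e-6, None)   # avoid division by zero
    live_pct = np.clip(live_pct, 1e-6, None)
    return float(np.sum((live_pct - ref_pct) * np.log(live_pct / ref_pct)))


rng = np.random.default_rng(0)
value = psi(rng.beta(2, 5, 10_000), rng.beta(2.5, 5, 10_000))
print(f"PSI = {value:.3f}")  # above ~0.2 is a common, purely indicative alert level
```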
Our Work on Adversarial Attacks
Our work revolves heavily around documentary data, a focus shared by several other companies. This entails processing documents for various purposes, such as classification and information extraction.
Despite the prevalence of models processing documentary data, we’ve observed a lack of focus on this type of data in the field of trustworthy and responsible AI. Numerous attacks target natural images, yet few focus on documents. To address this gap, we conducted research on this specific subject and presented a paper at ICDAR 2023 on attacking document models. Our findings revealed that these models are highly susceptible to such attacks, but that adversarial training is an effective countermeasure.
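For readers unfamiliar with the idea, the sketch below shows the classic Fast Gradient Sign Method (FGSM) applied to a document image in PyTorch. It illustrates the general principle of an adversarial perturbation only; it is not the specific attack studied in the ICDAR 2023 paper, and `model` is a placeholder classifier.

```python
# Minimal FGSM sketch in PyTorch: a gradient-sign perturbation of a document
# image. This is the classic baseline, not the specific attack from our paper;
# `model` is a placeholder classifier and pixels are assumed to lie in [0, 1].
import torch
import torch.nn.functional as F


def fgsm_attack(model: torch.nn.Module, image: torch.Tensor, label: torch.Tensor,
                epsilon: float = 8 / 255) -> torch.Tensor:
    image = image.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to valid pixels.
    adversarial = image + epsilon * image.grad.sign()
    return adversarial.clamp(0, 1).detach()
```

Adversarial training, the countermeasure mentioned above, essentially mixes such perturbed examples back into the training set so that the model learns to resist them.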
Our Work on Reconstruction Attacks
Similar to the situation with adversarial attacks, we’ve noticed a scarcity of work regarding reconstruction attacks on documentary data. These attacks aim to recover sensitive training data from a trained model. While there’s a wealth of literature on this topic concerning images or textual generative models, there are fewer papers for other types of data. As a result, we launched attacks on LayoutLM and BROS, which are information extraction models. Our results demonstrated that perfect reconstruction, although difficult, is achievable. These findings will be published in the near future.
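As a rough illustration of what a reconstruction attack looks like, the sketch below performs a simple model inversion: it optimizes a synthetic input so that a trained classifier becomes highly confident in a chosen class, in the hope that the result leaks training content. It conveys the principle only; it is not the attack we ran against LayoutLM and BROS, and `model` and `input_shape` are placeholders.

```python
# Minimal model-inversion sketch: optimize a synthetic input so that a trained
# classifier assigns high confidence to a target class, in the hope of recovering
# training content. Illustrative only; not the attack used on LayoutLM and BROS.
import torch
import torch.nn.functional as F


def invert_class(model: torch.nn.Module, input_shape: tuple, target_class: int,
                 steps: int = 500, lr: float = 0.1) -> torch.Tensor:
    x = torch.zeros(1, *input_shape, requires_grad=True)  # start from a blank input
    optimizer = torch.optim.Adam([x], lr=lr)
    target = torch.tensor([target_class])
    for _ in range(steps):
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x.clamp(0, 1)), target)
        loss.backward()
        optimizer.step()
    return x.clamp(0, 1).detach()  # candidate reconstruction of training content
```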
Our Work on Explainability
The ability to trace the reason behind a model’s specific decision is invaluable. To this end, the DataLab Group uses explainability methods such as saliency maps on text and images. These maps help us understand why a document may be misclassified and where the model focuses: on the text, the layout, a logo, and so on. Obtained in a straightforward and inexpensive way, this insight strengthens trust in our AI models.
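A gradient-based saliency map is one of the simplest ways to obtain this kind of insight. The sketch below computes one for an image input in PyTorch; the same idea applies to token embeddings for text. The model is a placeholder and this is an illustrative baseline, not necessarily the exact method used internally.

```python
# Minimal gradient-saliency sketch: the absolute gradient of the top class score
# with respect to the input shows which pixels the decision relies on. The same
# idea applies to token embeddings for text. `model` is a placeholder classifier.
import torch


def saliency_map(model: torch.nn.Module, image: torch.Tensor) -> torch.Tensor:
    image = image.clone().detach().requires_grad_(True)
    logits = model(image)                     # shape: [1, num_classes]
    logits[0, logits[0].argmax()].backward()  # gradient of the predicted class score
    # Aggregate over channels to get one importance value per pixel.
    return image.grad.abs().amax(dim=1).squeeze(0)
```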
Certification of the DataLab Group AI systems
Innovative, Industrial and Trusted. To demonstrate that it is possible to be all three at once, we set out to have the Crédit Agricole Group DataLab’s method for building AI solutions certified. After a year’s work, our teams were awarded the “Artificial Intelligence” certification by the Laboratoire National de Métrologie et d’Essais (LNE) in February 2023.
Following on from this LNE certification of industrial and trusted Artificial Intelligence systems, the DataLab Group has been awarded the Labelia label for responsible AI at the “advanced” level. It is the fifth organization in France, and the first in its sector, to reach this level, awarded by Labelia Labs. The label reinforces and complements the requirements of the LNE certification, particularly in terms of CSR (Corporate Social Responsibility). It reflects the DataLab Group’s determination to align its actions with the commitments of the Crédit Agricole Group’s CSR Project, both in the choice of projects and in the way we carry them out.