Context
In many GWS environments, data quality checks either ran outside the platform or relied on ad hoc scripts, which made validation hard to standardize and largely invisible to non-technical teams. This project was built to make quality control a native part of platform workflows.
Project objective
The goal was to bring Great Expectations into GWS in a production-ready form: reusable validation tasks, clear deliverables for data and quality teams, and stable execution in the runtime that projects actually use.
Main contributions
- Enabled Great Expectations directly inside GWS pipelines for tabular resources and CSV folders.
- Integrated automatic publication of Data Docs as a native GWS resource (GxDataDocsResource).
- Defined configurable validation behaviors to support both generic and business-specific quality rules.
- Delivered specialized task variants for domain contexts, including CDISC and NGS workflows.
- Secured runtime compatibility by aligning dependencies with the pinned version great-expectations==0.18.21.
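The dependency pin mentioned in the last item can be verified at startup. The following is a minimal sketch, not the project's actual code: the helper name check_pin and the constant PINNED_GX_VERSION are hypothetical, and only the standard library's importlib.metadata is used.

```python
from importlib.metadata import PackageNotFoundError, version

# Version taken from the pin stated above; the constant name is hypothetical.
PINNED_GX_VERSION = "0.18.21"


def check_pin(package: str, expected: str) -> str:
    """Return 'ok', 'missing', or 'mismatch: <installed>' for a pinned package."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # The distribution is not installed in this environment.
        return "missing"
    return "ok" if installed == expected else f"mismatch: {installed}"


# Report the state of the Great Expectations pin in the current runtime.
status = check_pin("great-expectations", PINNED_GX_VERSION)
print(f"great-expectations pin status: {status}")
```

A check like this could run before any validation task starts, so that a drifted environment fails early with a readable message rather than partway through a pipeline.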
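One way to picture configurable validation behaviors that cover both generic and business-specific rules is sketched below. It is an illustration only: it does not call Great Expectations or GWS, and every name in it (Rule, ValidationConfig, validate, the column names, the age bounds) is a hypothetical stand-in rather than the project's API.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

Row = dict[str, Any]


@dataclass(frozen=True)
class Rule:
    """A named row-level check; non-blocking rules only produce warnings."""
    name: str
    check: Callable[[Row], bool]
    blocking: bool = True


@dataclass
class ValidationConfig:
    """Hypothetical task configuration: which rules run and how strict to be."""
    rules: list[Rule] = field(default_factory=list)
    fail_on_warning: bool = False


def validate(rows: list[Row], config: ValidationConfig) -> dict[str, Any]:
    """Count failing rows per rule and derive an overall success flag."""
    failures = {rule.name: 0 for rule in config.rules}
    for row in rows:
        for rule in config.rules:
            if not rule.check(row):
                failures[rule.name] += 1
    blocking_failed = any(failures[r.name] for r in config.rules if r.blocking)
    warning_failed = any(failures[r.name] for r in config.rules if not r.blocking)
    success = not blocking_failed and not (config.fail_on_warning and warning_failed)
    return {"success": success, "failures": failures}


# A generic rule (identifier must be present) and a business rule (plausible age).
generic = Rule("customer_id_not_null", lambda r: r.get("customer_id") not in (None, ""))
business = Rule(
    "age_between_18_and_120",
    lambda r: isinstance(r.get("age"), int) and 18 <= r["age"] <= 120,
    blocking=False,
)

rows = [
    {"customer_id": "C001", "age": 34},
    {"customer_id": "C002", "age": 15},
]

report = validate(rows, ValidationConfig(rules=[generic, business]))
strict_report = validate(
    rows, ValidationConfig(rules=[generic, business], fail_on_warning=True)
)
```

With the default configuration the business rule only warns, so the run still succeeds; setting fail_on_warning turns the same finding into a failure. Keeping strictness in configuration rather than in the rules lets one task definition serve both exploratory and gating use.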
Project architecture
- src/gws_great_expectations/gx_data_docs_demo.py: core implementation and GWS task logic.
- src/gws_great_expectations/__init__.py: public exports for task registration.
- example_input_gx_customer_quality.csv: representative input dataset used for demonstration.
Project impact
Great Expectations becomes a first-class capability in GWS: validation, reporting, and visibility are unified in one platform experience for engineering and quality teams.