BigQuery Lead/Architect
- Utilize GCP services such as BigQuery, Dataflow, Datastream, and Composer to build scalable, high-performance data processing solutions.
- Implement data quality checks, data validation, and monitoring mechanisms to ensure the accuracy and integrity of the data.
- Design and implement robust Extract, Transform, Load (ETL) or Extract, Load, Transform (ELT) pipelines to ingest data into BigQuery from various sources.
- Optimize and fine-tune data pipelines for performance, scalability, and cost efficiency, applying GCP best practices.
- Automate repetitive data ingestion, transformation, and quality checks using tools such as Apache Airflow, Cloud Dataflow, or other orchestration tools.
- Troubleshoot and resolve issues related to data pipelines and queries.
- Set up monitoring for data pipelines and BigQuery resources using Cloud Monitoring (formerly Stackdriver) or custom dashboards.
- Document processes, data models, and workflows for maintainability and knowledge sharing.
- Configure Identity and Access Management (IAM) roles and permissions for data security. Implement data audit trails to monitor access and changes.
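To make the data quality and validation responsibilities above concrete, here is a minimal, stand-alone sketch in Python (one of the languages listed below). In practice a check like this would run as a pipeline task, for example an Airflow operator validating staging data before it lands in BigQuery; the field names and rules shown are illustrative assumptions, not part of any specific stack.

```python
def validate_rows(rows, required_fields, min_row_count=1):
    """Run basic batch-level and row-level quality checks.

    Returns a list of human-readable issues; an empty list means the
    batch passes and can proceed to the load step.
    """
    issues = []

    # Batch-level check: guard against empty or truncated extracts.
    if len(rows) < min_row_count:
        issues.append(
            f"expected at least {min_row_count} rows, got {len(rows)}"
        )

    # Row-level check: required fields must be present and non-empty.
    for i, row in enumerate(rows):
        for field in required_fields:
            if row.get(field) in (None, ""):
                issues.append(f"row {i}: missing required field '{field}'")

    return issues


# Example usage (hypothetical event batch):
batch = [
    {"id": 1, "event_ts": "2024-01-01T00:00:00Z"},
    {"id": 2, "event_ts": None},
]
print(validate_rows(batch, ["id", "event_ts"]))
```

A pipeline would typically fail or quarantine the batch when the returned list is non-empty, and export the issue count to a monitoring dashboard.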
Tools and Skills Often Used
- Languages: SQL, Python
- GCP Services: BigQuery, Cloud Dataflow, Cloud Composer, Cloud Storage, Cloud Pub/Sub.
- Visualization: Looker, Tableau, Power BI, or Looker Studio (formerly Google Data Studio).
- Version Control: Git or similar.
This role is crucial for ensuring that an organization’s data infrastructure is scalable, reliable, and optimized for analytics.