AI Customer Data Platform Specialist
An AI Customer Data Platform Specialist architects, deploys, and optimizes AI-powered customer data ecosystems that unify behavior…
Skill Guide
The engineering discipline of using SQL for set-based data manipulation and Python for procedural logic, orchestration, and integration to design, build, and maintain automated systems that extract, transform, and load data from disparate sources into target systems for analysis.
Scenario
You have two CSVs: `orders.csv` (order_id, customer_id, product_id, amount, order_date) and `customers.csv` (customer_id, name, region). Create a pipeline that loads these, joins them, calculates daily sales by region, and loads the result into a new CSV or SQLite database.
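One possible shape for this pipeline is sketched below, using only the Python standard library and SQLite. The function names (`load_csv`, `run_pipeline`), the output table name `daily_sales_by_region`, and the choice of an inner join are assumptions made for illustration. It also assumes `order_date` holds one ISO date string per order.

```python
import csv
import sqlite3


def load_csv(conn, table, path, columns):
    """Extract and load: copy one CSV into a freshly created staging table."""
    conn.execute(f"DROP TABLE IF EXISTS {table}")
    conn.execute(f"CREATE TABLE {table} ({', '.join(columns)})")
    with open(path, newline="", encoding="utf-8") as f:
        rows = [tuple(record[c] for c in columns) for record in csv.DictReader(f)]
    placeholders = ", ".join("?" for _ in columns)
    conn.executemany(f"INSERT INTO {table} VALUES ({placeholders})", rows)


def run_pipeline(orders_csv, customers_csv, db_path):
    """Load both CSVs, join them, and store daily sales by region in SQLite."""
    conn = sqlite3.connect(db_path)
    with conn:  # commit on success, roll back on error
        load_csv(conn, "orders", orders_csv,
                 ["order_id", "customer_id", "product_id", "amount", "order_date"])
        load_csv(conn, "customers", customers_csv,
                 ["customer_id", "name", "region"])
        # Transform: set-based join and aggregation in SQL.
        # The inner join drops orders whose customer_id has no match (an assumption).
        conn.execute("DROP TABLE IF EXISTS daily_sales_by_region")
        conn.execute("""
            CREATE TABLE daily_sales_by_region AS
            SELECT o.order_date,
                   c.region,
                   SUM(CAST(o.amount AS REAL)) AS total_sales
            FROM orders AS o
            JOIN customers AS c ON c.customer_id = o.customer_id
            GROUP BY o.order_date, c.region
        """)
    rows = conn.execute(
        "SELECT order_date, region, total_sales "
        "FROM daily_sales_by_region ORDER BY order_date, region"
    ).fetchall()
    conn.close()
    return rows
```

Calling `run_pipeline("orders.csv", "customers.csv", "sales.db")` rebuilds every table from scratch, so running it twice on the same inputs leaves the database in the same state.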
Scenario
Marketing needs a dashboard showing campaign performance. Sources are a Google Analytics 4 API (JSON) and a Google Sheets export. Design an ELT pipeline that extracts daily data, loads raw data into a cloud data warehouse (e.g., BigQuery), and then uses SQL models (dbt) to transform it into a clean `fct_campaign_performance` fact table.
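The load-then-transform order that distinguishes ELT from ETL can be illustrated locally, as in the rough sketch below. SQLite stands in for the cloud warehouse and a plain SQL string stands in for a dbt model. The raw table names, column names, and sample rows are hypothetical and do not reflect the actual GA4 API or any real Sheets export.

```python
import sqlite3

# Hypothetical extracted rows, kept as untyped text exactly as they arrived.
ga4_export = [
    ("2024-01-01", "spring", "120", "6"),
    ("2024-01-01", "spring", "80", "4"),
    ("2024-01-01", "summer", "50", "1"),
]
sheets_export = [("2024-01-01", "spring", "50.0")]

conn = sqlite3.connect(":memory:")

# Load: raw data lands first, with no cleaning or typing.
conn.execute(
    "CREATE TABLE raw_ga4 (event_date TEXT, campaign TEXT, sessions TEXT, conversions TEXT)"
)
conn.execute("CREATE TABLE raw_sheets (spend_date TEXT, campaign TEXT, spend TEXT)")
conn.executemany("INSERT INTO raw_ga4 VALUES (?, ?, ?, ?)", ga4_export)
conn.executemany("INSERT INTO raw_sheets VALUES (?, ?, ?)", sheets_export)

# Transform: typing, aggregation, and joining happen in SQL inside the warehouse.
TRANSFORM_SQL = """
CREATE TABLE fct_campaign_performance AS
WITH ga4 AS (
    SELECT event_date,
           campaign,
           SUM(CAST(sessions AS INTEGER)) AS sessions,
           SUM(CAST(conversions AS INTEGER)) AS conversions
    FROM raw_ga4
    GROUP BY event_date, campaign
),
sheet_spend AS (
    SELECT spend_date,
           campaign,
           SUM(CAST(spend AS REAL)) AS spend
    FROM raw_sheets
    GROUP BY spend_date, campaign
)
SELECT ga4.event_date, ga4.campaign, ga4.sessions, ga4.conversions, sheet_spend.spend
FROM ga4
LEFT JOIN sheet_spend
  ON sheet_spend.spend_date = ga4.event_date
 AND sheet_spend.campaign = ga4.campaign
"""
conn.execute(TRANSFORM_SQL)

fct = conn.execute(
    "SELECT * FROM fct_campaign_performance ORDER BY event_date, campaign"
).fetchall()
```

The left join keeps campaigns that have traffic but no recorded spend, leaving `spend` as NULL for them. Whether that is the right behavior for the dashboard is a modeling decision.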
Scenario
The company is migrating to a data lakehouse. You need to ingest high-volume, semi-structured clickstream events from Kafka and legacy transactional data from PostgreSQL, handling late-arriving data and schema changes while providing exactly-once semantics.
Airflow/Dagster orchestrate complex DAGs of tasks. dbt manages the 'T' in ELT with version-controlled SQL models. Python libraries handle API integrations, complex transformations, and non-SQL data. Cloud warehouses provide scalable compute/storage for the ELT paradigm.
Idempotency ensures pipelines can be safely re-run. Data quality frameworks validate data contracts proactively. Dimensional modeling provides the logical blueprint for structuring transformed data for analytics consumption.
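As a minimal sketch of the first two ideas, the example below validates a batch against a small data contract and then replaces a whole date partition inside one transaction, so a re-run cannot create duplicates. The table `daily_sales`, the helper names, and the contract rules are hypothetical. Delete-then-insert is only one of several ways to make a load idempotent.

```python
import sqlite3


def validate(rows):
    """Data contract check (hypothetical rules): region present, total non-negative."""
    for region, total in rows:
        if not region:
            raise ValueError("region must not be empty")
        if total < 0:
            raise ValueError(f"negative total for region {region!r}")


def load_partition(conn, run_date, rows):
    """Idempotent load: replace the partition for run_date in one transaction."""
    validate(rows)  # reject bad data before touching the target table
    with conn:  # commit on success, roll back on error
        conn.execute("DELETE FROM daily_sales WHERE order_date = ?", (run_date,))
        conn.executemany(
            "INSERT INTO daily_sales (order_date, region, total_sales) VALUES (?, ?, ?)",
            [(run_date, region, total) for region, total in rows],
        )


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_sales (order_date TEXT, region TEXT, total_sales REAL)")

batch = [("EU", 7.5), ("US", 7.5)]
load_partition(conn, "2024-01-01", batch)
load_partition(conn, "2024-01-01", batch)  # re-run leaves the same final state

row_count = conn.execute("SELECT COUNT(*) FROM daily_sales").fetchone()[0]
```

After both calls `row_count` is 2, not 4, because the second run deletes the partition before inserting it again.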