1. Web/API Services: Clean/Hexagonal Architecture

```
domain (business logic)
↑
interfaces (interface definitions; Dependency Inversion and Inversion of Control)
↑
adapters (adapter implementations, e.g. DB, Redis)
↑
entrypoints (entry points, e.g. FastAPI)
```
```
project/
│
├── domain/        # core business logic (entities + services)
├── interfaces/    # abstract interfaces
├── adapters/      # adapter implementations (DB, cache, etc.)
├── entrypoints/   # FastAPI or CLI entry points
└── main.py        # application startup
```
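As a rough illustration of how the four layers could fit together, here is a minimal single-file sketch. All names (`User`, `UserRepository`, `UserService`, `InMemoryUserRepository`, `handle_rename_request`, `build_service`) are hypothetical and not taken from any real project. The in-memory adapter stands in for a DB or Redis adapter, and a plain function stands in for a FastAPI route so the example runs with the standard library only.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


# domain/: pure business data, no framework or storage imports
@dataclass(frozen=True)
class User:
    user_id: int
    name: str


# interfaces/: the abstraction the domain service depends on (Dependency Inversion)
class UserRepository(Protocol):
    def get(self, user_id: int) -> Optional[User]: ...
    def save(self, user: User) -> None: ...


# domain/: the service receives its repository instead of creating it (Inversion of Control)
class UserService:
    def __init__(self, repo: UserRepository) -> None:
        self._repo = repo

    def rename(self, user_id: int, new_name: str) -> User:
        user = self._repo.get(user_id)
        if user is None:
            raise KeyError(f"user {user_id} not found")
        if not new_name.strip():
            raise ValueError("name must not be empty")
        updated = User(user_id=user.user_id, name=new_name.strip())
        self._repo.save(updated)
        return updated


# adapters/: a concrete implementation; a DB or Redis adapter would expose the same methods
class InMemoryUserRepository:
    def __init__(self) -> None:
        self._rows = {}

    def get(self, user_id: int) -> Optional[User]:
        return self._rows.get(user_id)

    def save(self, user: User) -> None:
        self._rows[user.user_id] = user


# entrypoints/: translates an external request into a service call
def handle_rename_request(service: UserService, payload: dict) -> dict:
    user = service.rename(int(payload["user_id"]), str(payload["name"]))
    return {"user_id": user.user_id, "name": user.name}


# main.py: the composition root, the only place that knows which adapter is used
def build_service() -> UserService:
    repo = InMemoryUserRepository()
    repo.save(User(user_id=1, name="alice"))
    return UserService(repo)


if __name__ == "__main__":
    print(handle_rename_request(build_service(), {"user_id": 1, "name": " Alice "}))
```

Only `build_service` names a concrete adapter, so swapping the storage backend should not require touching `UserService` or the entry point.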
2. Data Pipeline

```
Data Source
  -> Data Collection (Kafka/CDC)
  -> Bronze: raw data
  -> Silver: structured, cleaned data
  -> Gold: aggregated and analyzed, BI-ready
  -> BI
```

```
data_pipeline_project/
├── dags/                              # Airflow DAG definitions: task dependencies and scheduling
│   └── user_behavior_dag.py
│
├── pipelines/                         # processing logic for each layer (modular, layered)
│   ├── bronze/
│   │   └── ingest_from_kafka.py       # consume data from Kafka and write it to the raw layer
│   ├── silver/
│   │   └── clean_user_data.py         # data cleaning, transformation, validation
│   └── gold/
│       └── generate_user_metrics.py   # aggregate metric generation, dimension table construction
│
├── common/                            # shared components: connectors, utility functions, schema validators
│   ├── kafka_consumer.py
│   └── delta_writer.py
│
├── config/                            # YAML / JSON config, e.g. Kafka addresses, schema definitions
│   └── kafka_topics.yaml
│
├── scripts/                           # standalone runnable scripts, e.g. initialization, historical backfill
│   └── bootstrap_topic.py
│
├── tests/                             # unit tests (Pytest)
│   └── test_data_cleaning.py
│
├── requirements.txt
└── README.md
```
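To make the Bronze, Silver and Gold responsibilities more concrete, here is a minimal sketch of the three stages as plain Python functions. It assumes each message is a JSON string with `user_id` and `action` fields. That schema and the function names (`ingest_bronze`, `clean_silver`, `aggregate_gold`, `run_pipeline`) are hypothetical, and in-memory lists stand in for Kafka and for the storage layer.

```python
import json
from collections import Counter
from typing import Iterable, List


def ingest_bronze(messages: Iterable[str]) -> List[dict]:
    """Bronze: store every message exactly as received, tagged with its arrival order."""
    return [{"offset": i, "raw": message} for i, message in enumerate(messages)]


def clean_silver(bronze_rows: Iterable[dict]) -> List[dict]:
    """Silver: parse, validate and normalize; rows that fail validation are dropped."""
    cleaned = []
    for row in bronze_rows:
        try:
            event = json.loads(row["raw"])
        except json.JSONDecodeError:
            continue  # not valid JSON
        if not isinstance(event, dict):
            continue  # valid JSON, but not an object
        user_id = event.get("user_id")
        action = event.get("action")
        if not isinstance(user_id, int) or not isinstance(action, str) or not action.strip():
            continue  # missing or malformed fields
        cleaned.append({"user_id": user_id, "action": action.strip().lower()})
    return cleaned


def aggregate_gold(silver_rows: Iterable[dict]) -> List[dict]:
    """Gold: BI-ready aggregate, here the event count per (user_id, action)."""
    counts = Counter((row["user_id"], row["action"]) for row in silver_rows)
    return [
        {"user_id": user_id, "action": action, "event_count": count}
        for (user_id, action), count in sorted(counts.items())
    ]


def run_pipeline(messages: Iterable[str]) -> List[dict]:
    """Chain the three layers in the same order as the flow above."""
    return aggregate_gold(clean_silver(ingest_bronze(messages)))


if __name__ == "__main__":
    sample = [
        '{"user_id": 1, "action": "Click"}',
        '{"user_id": 1, "action": "click "}',
        '{"user_id": 2, "action": "view"}',
        "not json",
        '{"action": "click"}',
    ]
    for metric in run_pipeline(sample):
        print(metric)
```

Keeping each layer as a separate function mirrors the `pipelines/bronze`, `pipelines/silver` and `pipelines/gold` split, so each stage could be scheduled as its own task and unit-tested in isolation.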