This project is part of lazy-stock-screener, a full-stack micro-service prototype product.
The full product is still under construction and is committed to GitLab. I share only part of the tasks/DAGs here
in order to demonstrate how I built the data pipeline.
- Abstract Factory pattern to create the parameter catalog for each stock (see the sketch after this list).
- Builder pattern to build the profit/safety report (sketch below).
- Command pattern to construct the pipeline (sketch below).
- Template pattern that keeps pandas out of the output layer, so a different data-processing engine (e.g. Spark) can be plugged in later (sketch below).
- You are free to decide how to run this on Airflow: each main_as_dag.py was designed to execute as a DAG in Airflow, or you can run main.py directly from a cron job (sketch below).
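A minimal sketch of the Abstract Factory idea, assuming hypothetical class and parameter names (the real code lives under construct_stock_catalog/collection_factory, whose API may differ):

```python
from abc import ABC, abstractmethod


class ParameterCatalogFactory(ABC):
    """Abstract factory: each concrete factory knows how to build the
    parameter collections for one family of stocks."""

    @abstractmethod
    def create_profit_parameters(self) -> dict: ...

    @abstractmethod
    def create_safety_parameters(self) -> dict: ...


class TechStockCatalogFactory(ParameterCatalogFactory):
    """Concrete factory for one stock family (hypothetical names/values)."""

    def create_profit_parameters(self) -> dict:
        return {"roe_weight": 0.6, "eps_growth_weight": 0.4}

    def create_safety_parameters(self) -> dict:
        return {"debt_ratio_max": 0.5, "current_ratio_min": 1.5}


def build_catalog(factory: ParameterCatalogFactory) -> dict:
    # Client code depends only on the abstract interface, so swapping
    # in another stock family means passing a different factory.
    return {
        "profit": factory.create_profit_parameters(),
        "safety": factory.create_safety_parameters(),
    }
```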
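The Builder point could look like the following sketch; the section names and values are illustrative, not the project's actual report schema:

```python
class ReportBuilder:
    """Builder: assemble a profit/safety report step by step."""

    def __init__(self, ticker: str):
        self._report = {"ticker": ticker}

    def with_profit_section(self, roe: float, eps_growth: float) -> "ReportBuilder":
        self._report["profit"] = {"roe": roe, "eps_growth": eps_growth}
        return self  # fluent interface so the steps chain

    def with_safety_section(self, debt_ratio: float) -> "ReportBuilder":
        self._report["safety"] = {"debt_ratio": debt_ratio}
        return self

    def build(self) -> dict:
        return self._report


# Usage (illustrative values only):
report = (
    ReportBuilder("AAPL")
    .with_profit_section(roe=0.25, eps_growth=0.09)
    .with_safety_section(debt_ratio=0.4)
    .build()
)
```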
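For the Command pattern, one way to picture the pipeline is each stage as a command object sharing a context dict; the stage names here are hypothetical stand-ins for the state_* tasks listed below:

```python
from abc import ABC, abstractmethod


class PipelineCommand(ABC):
    """Command: each pipeline stage is an object with one execute() method."""

    @abstractmethod
    def execute(self, context: dict) -> dict: ...


class ScrapeCommand(PipelineCommand):
    def execute(self, context: dict) -> dict:
        context["raw"] = "  scraped financial statements  "
        return context


class CleanCommand(PipelineCommand):
    def execute(self, context: dict) -> dict:
        context["clean"] = context["raw"].strip()
        return context


def run_pipeline(commands: list[PipelineCommand]) -> dict:
    context: dict = {}
    for command in commands:  # stages run in order, sharing one context
        context = command.execute(context)
    return context


result = run_pipeline([ScrapeCommand(), CleanCommand()])
```

Because each stage is just an object, the same command list can be executed by a plain loop, a cron-driven main.py, or wrapped task-by-task in an Airflow DAG.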
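One reading of "separate pandas from the output template" is a Template Method base class that fixes the read → transform → write skeleton, while engine-specific subclasses (pandas today, Spark later) fill in the steps. Class and method names are assumptions:

```python
from abc import ABC, abstractmethod

import pandas as pd


class OutputTemplate(ABC):
    """Template Method: run() fixes the skeleton; subclasses supply the
    engine-specific steps."""

    def run(self, source: str) -> None:
        data = self.read(source)
        data = self.transform(data)
        self.write(data)

    @abstractmethod
    def read(self, source: str): ...

    @abstractmethod
    def transform(self, data): ...

    @abstractmethod
    def write(self, data) -> None: ...


class PandasOutput(OutputTemplate):
    """pandas engine; a SparkOutput subclass could implement the same
    three hooks against a SparkSession without touching the template."""

    def read(self, source: str) -> pd.DataFrame:
        return pd.read_csv(source)

    def transform(self, data: pd.DataFrame) -> pd.DataFrame:
        return data.dropna()

    def write(self, data: pd.DataFrame) -> None:
        data.to_csv("output.csv", index=False)
```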
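A sketch of what a main_as_dag.py might contain, assuming Airflow 1.x-style imports and hypothetical run() entry points inside the state_* task packages listed below:

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python_operator import PythonOperator  # Airflow 1.x path

# Hypothetical: each state_* package exposes a run() entry point.
from tasks.state_1_scrape_fs_file_from_s3 import run as scrape
from tasks.state_2_integration_and_cleaning import run as clean
from tasks.state_3_transformation import run as transform
from tasks.state_4_loading import run as load

dag = DAG(
    dag_id="lazy_stock_screener",
    start_date=datetime(2020, 1, 1),
    schedule_interval="@daily",
)

scrape_task = PythonOperator(task_id="scrape", python_callable=scrape, dag=dag)
clean_task = PythonOperator(task_id="clean", python_callable=clean, dag=dag)
transform_task = PythonOperator(task_id="transform", python_callable=transform, dag=dag)
load_task = PythonOperator(task_id="load", python_callable=load, dag=dag)

# Linear dependency chain mirrors the four pipeline states.
scrape_task >> clean_task >> transform_task >> load_task
```

Without Airflow, the same entry points can be driven by cron, e.g. `0 6 * * * python main.py`.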
Inspired by https://github.com/rjurney/Agile_Data_Code_2
- shared
  - calculate_score
  - dump_financial_reports
  - construct_stock_catalog
    - catalog_builder
    - collection_factory
- tasks
  - state_1_scrape_fs_file_from_s3
  - state_2_integration_and_cleaning
  - state_3_transformation
  - state_4_loading
- Python
- Pandas
- https://github.com/rjurney/Agile_Data_Code_2
- https://www2.slideshare.net/rjurney/predictive-analytics-with-airflow-and-pyspark
- https://blog.usejournal.com/testing-in-airflow-part-1-dag-validation-tests-dag-definition-tests-and-unit-tests-2aa94970570c
- https://github.com/chandulal/airflow-testing/tree/master/src