Enrichment pipeline for CUR / FOCUS reports which adds energy and carbon data, allowing you to report on and reduce the impact of your cloud usage.

Spruce

Spruce helps estimate the environmental impact of your cloud usage. By leveraging open source models and data, it enriches usage reports generated by cloud providers and allows you to build reports and visualisations. Having the GreenOps and FinOps data in the same place makes it easier to expose your costs and impacts side by side.

Spruce uses Apache Spark to read and write the usage reports (typically in Parquet format) in a scalable way and, thanks to its modular approach, splits the enrichment of the data into configurable stages.

A typical sequence of stages would be:

  • estimation of embedded emissions from resources used
  • estimation of energy used
  • application of PUE and other overheads
  • application of carbon intensity factors
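The last three stages amount to a simple chain of multiplications. The sketch below illustrates the arithmetic with invented numbers; the formula and figures are illustrative assumptions, not Spruce's actual model or coefficients:

```python
# Illustrative sketch of the energy/carbon stages (all numbers are assumed).
energy_kwh = 0.5          # estimated energy used by the resources (assumed)
pue = 1.2                 # power usage effectiveness overhead (assumed)
carbon_intensity = 400.0  # grid carbon intensity in gCO2eq per kWh (assumed)

# Apply PUE and other overheads to the raw energy estimate.
total_energy_kwh = energy_kwh * pue

# Apply the carbon intensity factor to get operational emissions.
operational_co2_g = total_energy_kwh * carbon_intensity

print(round(total_energy_kwh, 3))   # 0.6
print(round(operational_co2_g, 1))  # 240.0
```

Embedded (embodied) emissions are estimated separately in the first stage and reported alongside the operational figures.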

Please note that this is currently a prototype which handles only CUR reports from AWS. Not all AWS services are covered.

One of the benefits of using Apache Spark is that you can use EMR on AWS to enrich the CURs at scale without having to export or expose any of your data.

Prerequisites

You will need CUR reports as inputs. These are generated via AWS Data Exports and stored on S3 as Parquet files.

Local install

With Apache Maven, Java and Apache Spark installed locally and added to the $PATH:

mvn clean package
spark-submit --class com.digitalpebble.spruce.SparkJob --driver-memory 8g ./target/spruce-1.0.jar -i ./curs -o ./output

The -c option lets you specify a JSON configuration file that overrides the default settings.
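The configuration schema is not documented here; purely as a hypothetical sketch, such a file might select which enrichment stages run and tune their parameters. Every key name below is invented for illustration and does not reflect Spruce's actual schema:

```json
{
  "stages": ["embedded-emissions", "energy", "pue", "carbon-intensity"],
  "pue": 1.2
}
```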

Docker

Build the Docker image with docker build -t digitalpebble/spruce:1.0 .

The command below processes the data locally by mounting the directories containing the CURs and output as volumes:

docker run -it -v ./curs:/curs -v ./output:/output  digitalpebble/spruce:1.0 \
/opt/spark/bin/spark-submit  \
--class com.digitalpebble.spruce.SparkJob \
--driver-memory 4g \
--master 'local[*]' \
/usr/local/lib/spruce-1.0.jar \
-i /curs -o /output/enriched

Explore the output

Using DuckDB locally or Athena on AWS:

create table enriched_curs as select * from 'output/*/*.parquet';

select line_item_product_code, product_servicecode, 
       round(sum(operational_emissions_co2eq_g),2) as co2_usage_g, 
       round(sum(energy_usage_kwh),2) as energy_usage_kwh 
       from enriched_curs where operational_emissions_co2eq_g > 0.01 
       group by line_item_product_code, product_servicecode order by co2_usage_g desc;

This should give an output similar to:

line_item_product_code  product_servicecode  co2_usage_g  energy_usage_kwh
AmazonS3                AWSDataTransfer           659.2               3.31
AmazonRDS               AWSDataTransfer           361.59              1.09
AmazonEC2               AWSDataTransfer           162.59              1.43
AmazonECR               AWSDataTransfer            88.75              0.8
AmazonVPC               AWSDataTransfer            40.55              0.38
AWSELB                  AWSDataTransfer             6.3               0.06

To measure the proportion of the costs for which emissions were calculated:

select
  round(covered * 100 / "total costs", 2) as percentage_costs_covered
from (
  select
    sum(line_item_unblended_cost) as "total costs",
    sum(line_item_unblended_cost) filter (where operational_emissions_co2eq_g is not null) as covered
  from
    enriched_curs
  where
    line_item_line_item_type like '%Usage'
);
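The coverage calculation above can be sketched in plain Python: sum the costs of usage line items, sum the subset that received an emissions estimate, and take the ratio. The rows below are invented sample data:

```python
# Invented sample rows: (line_item_line_item_type, cost, operational_emissions_co2eq_g)
rows = [
    ("Usage", 10.0, 5.2),
    ("Usage", 4.0, None),        # cost not covered by an emissions estimate
    ("DiscountedUsage", 6.0, 1.1),
    ("Tax", 3.0, None),          # excluded: not a '%Usage' line item type
]

# Keep only '%Usage' line items, mirroring the SQL LIKE filter.
usage = [r for r in rows if r[0].endswith("Usage")]

total = sum(cost for _, cost, _ in usage)
covered = sum(cost for _, cost, co2 in usage if co2 is not None)

print(round(covered * 100 / total, 2))  # 80.0
```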

License

Licensed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0
