engdan77/energylens

Energy Lens

Purpose, thoughts and lessons learned

For quite some time I have received invoices from our energy provider, Jönköping Energi (for electricity and fjärrvärme, i.e. district heating), and manually transferred the amounts and costs into a spreadsheet to make it easier to follow up on deviations or price changes. At the time of writing there was neither an official API nor an easy way to export all the data in a usable form. To spare myself this manual process, I wrote some code using the Playwright (Python) framework that logs in with 2FA (BankID). Because the underlying backend mechanics are quite sophisticated (for example, several token exchanges), driving the dynamic frontend turned out to be the more efficient way to download all invoices as PDF files, which is the first step.

The second stage is to automatically parse the tables within those PDFs into a Parquet dataframe. For this I first gave Docling a spin. It did a decent job on the most recent files, but for some reason it left out important tables and text in others. The few other popular packages I evaluated performed even worse. As a fallback I therefore went for a more programmatic approach: text extraction with the PyPDF package followed by plain pattern/regex matching.
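To illustrate the fallback approach, here is a minimal sketch of regex matching on text that has already been extracted from a PDF. The invoice layout, the labels, the `InvoiceLine` class and the `parse_invoice_text` function are all assumptions made up for this example and are not taken from the actual code in this repository. In practice the text would come from something like `pypdf.PdfReader(path).pages[0].extract_text()`.

```python
import re
from dataclasses import dataclass

# Hypothetical sample of extracted text; real invoices will look different.
SAMPLE_TEXT = """Faktura 2024-03
El 1 234,5 kWh 987,65 kr
Fjärrvärme 2 010 kWh 1 530,00 kr
Att betala 2 517,65 kr
"""

# Swedish number format: space as thousands separator, comma as decimal mark.
NUM = r"\d{1,3}(?: \d{3})*(?:,\d+)?"

# Assumed line layout: "<label> <amount> kWh <cost> kr"
LINE_RE = re.compile(
    rf"^(?P<label>[^\d\n]+?)\s+(?P<kwh>{NUM})\s*kWh\s+(?P<cost>{NUM})\s*kr\s*$",
    re.MULTILINE,
)


@dataclass
class InvoiceLine:
    label: str
    kwh: float
    cost_sek: float


def to_float(text: str) -> float:
    """Convert a Swedish-formatted number such as '1 234,5' to a float."""
    return float(text.replace(" ", "").replace(",", "."))


def parse_invoice_text(text: str) -> list[InvoiceLine]:
    """Return one InvoiceLine per line that matches the assumed layout."""
    # PDF extraction often yields non-breaking spaces; normalize them first.
    text = text.replace("\u00a0", " ")
    return [
        InvoiceLine(
            label=m["label"].strip(),
            kwh=to_float(m["kwh"]),
            cost_sek=to_float(m["cost"]),
        )
        for m in LINE_RE.finditer(text)
    ]


for line in parse_invoice_text(SAMPLE_TEXT):
    print(line)
```

Lines that do not fit the pattern, such as the header and the total, are simply skipped. That makes this kind of matching tolerant of layout noise, but a changed invoice layout will silently produce fewer rows.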

With this in place I have the raw data I need for data visualization using, for example, Polars and Altair.

This project may be valuable to others with the same energy provider, and parts of the code may be useful for other purposes.

Usage

Ensure you have the uv package manager installed, then simply run, for example:

$ uv run --with https://github.com/engdan77/energylens.git energylens --help

Usage: energylens COMMAND

Application for accessing, parse and convert invoices from Jonkoping Energi

╭─ Commands ────────────────────────────────────────────────────────────────────────────╮
│ download-invoices  Downloads invoices from a given source and saves them to a         │
│                    specified path using a web scraper.                                │
│ parse-invoices     Parses and processes invoices from PDF files into structured data, │
│                    and outputs the parsed data to the specified location in the       │
│                    desired format.                                                    │
│ --help -h          Display this message and exit.                                     │
│ --version          Display application version.                                       │
╰───────────────────────────────────────────────────────────────────────────────────────╯

Is there a notebook using this?

If you click here you'll get to a notebook with charts and analysis that runs entirely within your browser thanks to WebAssembly, and that can easily be kept up to date.

About

🔌 A project for downloading energy provider invoices and turning them into a dataframe
