Classification methods used on new data sources

Content WIP

NOTE: This repository is still in the works

This repository is a companion to the online handbook developed by the UN Task Team on Scanner Data to guide National Statistical Offices (NSOs) in implementing new data sources, such as transaction data and web-scraped data, in the production of consumer price statistics. It focuses specifically on the classification methods applicable to this data. The repository aims to host standardized code, public demonstration datasets (where applicable), and notebooks that NSOs can review alongside the Classification section of the online handbook. It also stores some key artifacts, such as diagrams, that the workstream developing the classification guidance used throughout that guidance.

Folder structure

The repository is organized as follows:

├── data                    # Placeholder folders for demo data used in the repository
│   ├── raw                 # For raw data
│   └── final               # Cleaned data for use in the example notebooks
├── notebooks               # Exploration notebooks demonstrating the various classification methods
└── docs                    # Documentation supporting this repository
    └── images              # Images and diagrams
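
The layout above can be checked from Python before running the notebooks. The following is a minimal sketch, assuming the folder names shown in the tree; the helper names `repo_paths` and `missing_folders` are hypothetical and are not part of this repository.

```python
from pathlib import Path


def repo_paths(root):
    """Map short names to the folders shown in the tree above.

    `root` is assumed to be the top-level folder of a local clone.
    """
    root = Path(root)
    return {
        "raw": root / "data" / "raw",          # raw demo data
        "final": root / "data" / "final",      # cleaned data used by the notebooks
        "notebooks": root / "notebooks",       # exploration notebooks
        "images": root / "docs" / "images",    # images and diagrams
    }


def missing_folders(root):
    """Return the sorted short names of expected folders that do not exist."""
    return sorted(
        name for name, path in repo_paths(root).items() if not path.is_dir()
    )


if __name__ == "__main__":
    # Report which expected folders are absent from the current directory.
    print(missing_folders("."))
```

A notebook could then read its inputs from `repo_paths(".")["final"]` instead of hard-coding a path string, which keeps the examples portable across operating systems.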

Note on Programming Language use

Note: while the notebooks and code in this repository are written primarily in Python, there is no 'correct' programming language for classification or for other work NSOs may do. Classification workstream members chose Python to demonstrate classification methods because it is a very popular language for data science and especially machine learning (Method 4), its syntax is easy to read, and, in workstream members' experience, many NSOs use Python for their classification work. NSOs should nevertheless choose the programming language that suits them, for example based on the preferences of their officers and their ability to maintain the code base.

Contributors who want to contribute in another language are welcome to do so! Notebooks in other formats (such as R Markdown) can also be saved in the /notebooks/ folder.
