Distributed-Deep-Learning

This library is the result of Robert K.L. Kennedy's summer 2018 internship.

This library is designed to leverage HPCC Systems and Google TensorFlow to provide HPCC a distributed deep learning runtime. Current implementation is based off of the Parallel Stochastic Gradient Descent research paper by Zinkevich et al., a synchronous data parallel design. It uses HPCC/ECL to provide the data parallelism and TensorFlow for the localized neural network optimization.

I used Keras on top of TensorFlow, for two main reasons. First, Keras reduces the required lines of code. Secondly, and perhaps more importantly, Keras can use different deep learning libraries as a 'backend' (Theano, CNTK, TensorFlow) removing this work's dependency on TensorFlow.

Performance

A sufficiently large data set is required to see a benefit from training a neural network using more than one Thor node. What constitutes "sufficient" is highly dependent on the type data, neural network model size, the underlying system hardware, and the model architecture. However, in general as the dataset sizes increase the increase in Thor nodes will significantly reduce training time.

General trends can be observed below in Figure 1. Note that the X and Y scale is logarithmic and in the dataset used in this test, each row in the dataset is 850 bytes long. The sytem had 1 CPU (1 core) and 3.5 gigs of memory.

System Requirements

The examples in this project require the pyembed plugin and Python to be installed on every machine in the system. Python version 2.7 has been tested, but Python 3.x should work with little to no modifications. In addition, there are several python packages that are required for use, the main libraries (and any of their included dependencies) are:

TensorFlow (only cpu installation has been tested)
Keras
numpy
pandas
h5py

A full list of python packages of the system the examples were tested on can be found in python_modules.txt

Name		Name	Last commit message	Last commit date
Latest commit History 22 Commits
Examples		Examples
Tech Talk August 2018		Tech Talk August 2018
imgs		imgs
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
python_modules.txt		python_modules.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Distributed-Deep-Learning

Performance

System Requirements

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 3

Uh oh!

hpcc-systems/Distributed-Deep-Learning

Folders and files

Latest commit

History

Repository files navigation

Distributed-Deep-Learning

Performance

System Requirements

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 3

Uh oh!

Packages