Skip to content

Download and cache plot schema (or always download if no file permissions). #284

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Sep 19, 2015
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions make_instructions.txt
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ Available targets (i.e. $ make target):

readme -> `less` this file

update_default_schema -> download the latest plot schema from plot.ly

setup_subs -> delete old submodule locations, init, update, and sync new

install -> 1. sync submodules 2. install plotly with setup.py
Expand Down
9 changes: 9 additions & 0 deletions makefile
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,15 @@ setup_subs :
git submodule update
make sync_subs

update_default_schema :
@echo "Making sure the default-schema.json file is up to date"
python -c "import json;\
from plotly.graph_reference import GRAPH_REFERENCE;\
f = open('plotly/graph_reference/default-schema.json', 'w');\
json.dump(GRAPH_REFERENCE, f, indent=4, sort_keys=True,\
separators=(',', ': '));\
f.close()"

install : sync_subs
@echo ""
@echo "Installing Python API with make"
Expand Down
39 changes: 39 additions & 0 deletions plotly/files.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
import os

# file structure
PLOTLY_DIR = os.path.join(os.path.expanduser("~"), ".plotly")
CREDENTIALS_FILE = os.path.join(PLOTLY_DIR, ".credentials")
CONFIG_FILE = os.path.join(PLOTLY_DIR, ".config")
GRAPH_REFERENCE_FILE = os.path.join(PLOTLY_DIR, ".graph_reference")
TEST_DIR = os.path.join(os.path.expanduser("~"), ".test")
TEST_FILE = os.path.join(PLOTLY_DIR, ".permission_test")

# this sets both the DEFAULTS and the TYPES for these files
FILE_CONTENT = {CREDENTIALS_FILE: {'username': '',
'api_key': '',
'proxy_username': '',
'proxy_password': '',
'stream_ids': []},
CONFIG_FILE: {'plotly_domain': 'https://plot.ly',
'plotly_streaming_domain': 'stream.plot.ly',
'plotly_api_domain': 'https://api.plot.ly',
'plotly_ssl_verification': True,
'plotly_proxy_authorization': False,
'world_readable': True}}

try:
os.mkdir(TEST_DIR)
os.rmdir(TEST_DIR)
if not os.path.exists(PLOTLY_DIR):
os.mkdir(PLOTLY_DIR)
f = open(TEST_FILE, 'w')
f.write('testing\n')
f.close()
os.remove(TEST_FILE)
_file_permissions = True
except:
_file_permissions = False


def check_file_permissions():
return _file_permissions
64 changes: 64 additions & 0 deletions plotly/graph_reference.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,64 @@
"""
This module handles accessing, storing, and managing the graph reference.

"""
from __future__ import absolute_import

import hashlib
import json
import os
import requests
from pkg_resources import resource_string

import six

from plotly import exceptions, files, utils

GRAPH_REFERENCE_PATH = '/v2/plot-schema'


def get_graph_reference():
"""
Attempts to load local copy of graph reference or makes GET request if DNE.

:return: (dict) The graph reference.
:raises: (PlotlyError) When graph reference DNE and GET request fails.

"""
default_config = files.FILE_CONTENT[files.CONFIG_FILE]
if files.check_file_permissions():
graph_reference = utils.load_json_dict(files.GRAPH_REFERENCE_FILE)
config = utils.load_json_dict(files.CONFIG_FILE)

# TODO: https://github.com/plotly/python-api/issues/293
plotly_api_domain = config.get('plotly_api_domain',
default_config['plotly_api_domain'])
else:
graph_reference = {}
plotly_api_domain = default_config['plotly_api_domain']

sha1 = hashlib.sha1(six.b(str(graph_reference))).hexdigest()

graph_reference_url = '{}{}?sha1={}'.format(plotly_api_domain,
GRAPH_REFERENCE_PATH, sha1)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes, I didn't think about this. A few things to consider:

  • Is this going to break for the plotly enterprises out there that don't have v2 endpoints?
  • If we keep seperate versions of plot-schema for whichever plotly.js is being used, we'll need another version for plotly offline and whatever version of plot-schema it uses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It won't break enterprise solutions (more than they're already broken). They have a default schema to fall back on. This also has the deprecated stuff worked in, so it'll be allowed. A funny situation would be if the plotlyjs they are using hasn't deprecated a key and we throw a warning suggesting they use a key that actually doesn't exist yet...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@chriddyp you're gonna need to refresh me on the plotly offline here 🐱. Do we have a plot schema for the plotly-offline that we can get at somewhere?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably, when plotly offline downloads (the latest) plotly.js, it can also download that version's plot-schema. This schema won't need updating, since it is locked in with the local plotly.js.


try:
response = requests.get(graph_reference_url)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we should place a client-side timeout on this. I'm imagining the spotty internet connections (conferences), where the request doesn't fail right away, but might take eg 30 seconds. Restarting the ipython kernal and waiting for a >5 sec import would be a bummer!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related: #297

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bueno, timeout sounds great.

Alternatively, we could run it in the background and raise a runtime warning that the user needs to manually update the graph reference with some command? That'd be the fastest approach and seems like it wouldn't be too much to ask our users? Dunno how I feel about that...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Timeout: 3 seconds? I like the idea of the background process, but I'm afraid of the complexity that it might introduce and I think that it's hard to communicate to the user what's going on and the implications.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely agreed about the complexity-factor. Yeah 3 seconds seems reasonable I guess, I'll try it out a bunch and see how often I hit that threshold.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool :) This should pass the "I think I'll work out on my back patio, even though the internet sucks" test

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this study done ~20 seconds ago : P

https://plot.ly/~theengineear/2413/stats-on-wait-times-for-loading-plot-schema-50-times/

I think we should have a timeout around 5s.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!
How did you do this test btw? was this all in the same location or did you move around? (14 sec is crazy!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All from the same location. I used a crappy network I have access to. And yeah, 14 is definitely crazy... which is exactly having some sane timeout is a great call ;)

response.raise_for_status()
except requests.exceptions.RequestException:
if not graph_reference:
path = os.path.join('graph_reference', 'default-schema.json')
s = resource_string('plotly', path).decode('utf-8')
graph_reference = json.loads(s)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I see. Old enterprise will be using the local, shipped version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, exactly. Note that this doesn't mean they'll be in sync though :(

else:
if six.PY3:
content = str(response.content, encoding='utf-8')
else:
content = response.content
data = json.loads(content)
if data['modified']:
graph_reference = data['schema']

return utils.decode_unicode(graph_reference)

GRAPH_REFERENCE = get_graph_reference()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forget how imports work, but if I do something like:

import plotly.plotly as py
from plotly.offline import iplot
from plotly.tools import FF

is that going to call this file 3 times (and perform 3 downloads?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nop, this runs once only.

Loading