Releases: clearml/clearml-agent

PyPI v0.16.3

22 Dec 18:34

Features and Bug Fixes

  • Update PyJWT requirement (v2.0.0 breaks interface)
  • Update other requirements constraints
  • Change k8s pod naming scheme in k8s glue to include queue name, conform queue name to k8s standard
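
The queue-name conformance in the last item can be pictured with a short Python sketch. This is not the k8s glue's actual code: the helper names `k8s_safe_name` and `pod_name`, the `agent` prefix, the `default` fallback, and the 63-character cap (the RFC 1123 label limit that applies to many Kubernetes object names) are all assumptions made for illustration.

```python
import re


def k8s_safe_name(name: str, max_len: int = 63) -> str:
    """Hypothetical helper: coerce a free-form name into an RFC 1123 style label."""
    name = name.lower()
    # Replace every run of characters outside [a-z0-9-] with a single dash.
    name = re.sub(r"[^a-z0-9-]+", "-", name)
    # Truncate, then strip dashes so the result starts and ends with an alphanumeric.
    name = name[:max_len].strip("-")
    # Assumed fallback for names that contain no usable characters at all.
    return name or "default"


def pod_name(queue_name: str, task_id: str, prefix: str = "agent") -> str:
    """Hypothetical naming scheme: <prefix>-<queue>-<task id>, sanitized as a whole."""
    return k8s_safe_name(f"{prefix}-{queue_name}-{task_id}")
```

With these assumptions, a queue called `GPU Queue #1` yields a pod name such as `agent-gpu-queue-1-abc123`. A real implementation would also need to decide what to do when truncation cuts into the task ID.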

PyPI v0.16.2

10 Dec 11:03

Features

  • conda
    • Add agent.package_manager.conda_env_as_base_docker, allowing "docker_cmd" to contain a link to a full pre-packaged conda environment (a tar.gz created by conda-pack). Use the TRAINS_CONDA_ENV_PACKAGE environment variable to specify the conda tar.gz file.
    • Add conda support for read-only pre-built environment (pass conda folder as docker_cmd on Task)
    • Improve detection of the conda executable
  • k8s glue
    • Add support for limited number of services exposing ports
    • Add support for k8s pod custom user properties
    • Allow selecting external trains.conf file for the pod itself
    • Allow providing pod template, extra bash init script, alternate SSH server port, gateway address (k8s ingress/ELB)
  • Allow specifying cudatoolkit version in the "installed packages" section when using conda as package manager clearml/clearml#229
  • Add agent.package_manager.force_repo_requirements_txt. If True, "Installed Packages" on Task are ignored, and only repository requirements.txt is used
  • Pass TRAINS_DOCKER_IMAGE into docker for interactive sessions
  • Add torchcsprng and torchtext to PyTorch resolving

Bug Fixes

  • Suppress "\r" in logs when reading the current chunk of a file/stream. Add agent.suppress_carriage_return (default True); set it to False to restore the previous behavior
  • Make sure TRAINS_AGENT_K8S_HOST_MOUNT is used only once per mount
  • Fix k8s glue script to use the trains-agent default docker script
  • Fix applying a git diff that comes from a submodule only
  • conda
    • Fix conda pip freeze to be consistent with trains 0.16.3
    • Fix conda environment support for trains 0.16.3 full env. Add agent.package_manager.conda_full_env_update to allow conda to update back the requirements (default False, to preserve previous behavior)
    • Fix running from conda environment - conda.sh not found in first conda PATH match
  • Fix docker mode ubuntu/debian support by making sure not to ask for input (fix tzdata install)
  • Fix repository detection - ignore environment SSH_AUTH_SOCK, only check if git user/pass are configured
  • git diff
    • Fix support for non-ascii diff
    • Fix corrupt diff apply message caused by a diff that ends with an empty line
    • Allow zero context diffs (useful when blind patching repository)
  • Fix daemon --stop when agent UID cannot be located
  • Fix nvidia docker support on some linux distros (SUSE)
  • Fix nvidia pytorch dockers support
  • Fix torch CUDA 11.1 support
  • Fix requirements dict with a null pip entry: treat it as None and install from the repository's requirements.txt
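
The carriage-return suppression listed first among these fixes can be approximated as below. This is a minimal sketch and not the agent's logging code: the function name `suppress_carriage_return` merely echoes the configuration option, and the rule "keep only what follows the last "\r" on each line" is an assumption about the intended effect, namely dropping progress-bar rewrites from stored logs.

```python
def suppress_carriage_return(chunk: str) -> str:
    """Keep, for each line, only the text written after the last carriage return."""
    cleaned = []
    for line in chunk.split("\n"):
        # A trailing "\r" is just a Windows-style line ending; drop it first.
        line = line.rstrip("\r")
        # Anything before the last remaining "\r" was overwritten on screen.
        cleaned.append(line.rsplit("\r", 1)[-1])
    return "\n".join(cleaned)
```

A real terminal overwrites character by character, so a shorter rewrite leaves the tail of the longer text visible; the sketch ignores that detail.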

PyPI v0.16.1

05 Oct 15:47

Features

  • Add sdk.metrics.plot_max_num_digits configuration option to reduce plot storage size
  • Add agent.package_manager.post_packages and agent.package_manager.post_optional_packages configuration options to control packages install order (e.g. horovod)
  • Add agent.git_host configuration option for limiting git credential usage for a specific host (overridable using TRAINS_AGENT_GIT_HOST environment variable)
  • Add agent.force_git_ssh_port configuration option to control https to ssh link conversion for non standard ssh ports
  • Add requirements detection features
    • Improve support for detecting new pip version (20+) supporting package @ scheme://link
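
The "package @ scheme://link" form mentioned above is the direct-reference syntax that newer pip versions print in pip freeze. A minimal Python sketch of recognizing such a line follows; the regular expression, the `DirectReference` type and the `parse_direct_reference` name are illustrative assumptions, not the agent's requirement parser, and extras or environment markers are deliberately not handled.

```python
import re
from typing import NamedTuple, Optional


class DirectReference(NamedTuple):
    name: str
    url: str


# Hypothetical pattern: "<name> @ <scheme>://<rest>" with optional spaces around "@".
_DIRECT_REF = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)\s*@\s*(\S+://\S+)\s*$")


def parse_direct_reference(line: str) -> Optional[DirectReference]:
    """Return (name, url) for a direct-reference line, or None for any other line."""
    match = _DIRECT_REF.match(line)
    if match is None:
        return None
    return DirectReference(match.group(1), match.group(2))
```

Ordinary pinned lines such as `numpy==1.19.4` return None, so a caller can fall back to its regular version-specifier handling.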

Bug Fixes

  • Fix pre-installed packages being ignored when installing a git package wheel. Reinstalling a git+http link is enough to make sure all requirements are met/installed clearml/clearml#196
  • Fix incorrect check for spaces in current execution folder
  • Fix requirements detection
    • Update torch version after using downloaded / system pre-installed version
    • Do not install git packages twice when a new pip version is used (pip freeze will detect the correct git link version)

PyPI v0.16.0

11 Aug 14:57

Features

  • Add agent.docker_init_bash_script configuration section to allow finer control over docker startup script
  • Change default docker image from nvidia/cuda to nvidia/cuda:10.1-runtime-ubuntu18.04 to support cudnn frameworks (e.g. TF)
  • Improve support for dockers with preinstalled conda environment
  • Improve trains-agent-docker spinning
  • Add daemon --order-fairness for round-robin queue pulling
  • Add daemon --stop to terminate a running agent (assuming other arguments are the same)
    • If no additional arguments, Agents are terminated in lexicographical order
  • Support cleanup of all log files on termination unless executed with --debug
  • Add error message when Trains API Server is not accessible on startup
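
To illustrate what round-robin pulling with daemon --order-fairness changes, the sketch below contrasts it with strict priority order, which is assumed here to be the default. The in-memory deques stand in for server-side queues, and `pull_priority` and `RoundRobinPuller` are hypothetical names, not agent APIs.

```python
from collections import deque
from typing import Deque, Dict, Iterable, Optional


def pull_priority(queues: Dict[str, Deque[str]]) -> Optional[str]:
    """Assumed default: always take from the first non-empty queue, in listed order."""
    for pending in queues.values():
        if pending:
            return pending.popleft()
    return None


class RoundRobinPuller:
    """Visit queues in turn, so a busy first queue cannot starve the others."""

    def __init__(self, queue_names: Iterable[str]) -> None:
        self._order = deque(queue_names)

    def pull(self, queues: Dict[str, Deque[str]]) -> Optional[str]:
        for _ in range(len(self._order)):
            name = self._order[0]
            # Rotate so the next attempt starts from the following queue.
            self._order.rotate(-1)
            if queues[name]:
                return queues[name].popleft()
        return None
```

With a `high` queue holding three tasks and a `low` queue holding one, priority order serves all three `high` tasks before the `low` one, while the round-robin puller serves the `low` task second.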

Bug Fixes

  • Fix GPU Windows monitoring support clearml/clearml#177
  • Fix .git-credentials and .gitconfig mapping into docker
  • Fix non-root docker image usage
  • Fix docker to use UTF-8 encoding, so prints won't break it
  • Fix --debug to set all loggers to DEBUG
  • Fix task status changing to queued during Task runtime (this should never happen)
  • Fix requirement_parser to support package @ git+http lines
  • Fix GIT user/password in requirements and support for -e git+http lines
  • Fix configuration wizard to generate trains.conf matching latest Trains definitions

PyPI v0.15.1

21 Jun 20:43

Features

  • Add Trains Agent Daemon and Services docker files

Bug Fixes

  • Fix initialization wizard (allow at most two verification retries, then print error)
  • Add warning on --gpus with no detected CUDA version #24
  • Add agent.force_git_ssh_protocol configuration option to force all git links to ssh:// #16
  • Add git user/pass permission into pip package installation from Git repository #22
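
The link rewriting implied by agent.force_git_ssh_protocol can be sketched as follows. This is an assumption about the general shape of such a conversion, not the agent's code: the `to_ssh_url` name, the default `git` user and the optional `ssh_port` parameter are illustrative, and any credentials embedded in the original link are dropped.

```python
from typing import Optional
from urllib.parse import urlsplit


def to_ssh_url(url: str, ssh_port: Optional[int] = None, user: str = "git") -> str:
    """Rewrite an http(s) git link as ssh://; leave every other link untouched."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        return url
    # Only add a port when one is given explicitly (hypothetical parameter).
    port = f":{ssh_port}" if ssh_port else ""
    return f"ssh://{user}@{parts.hostname}{port}{parts.path}"
```

For example, `https://github.com/allegroai/trains-agent.git` becomes `ssh://git@github.com/allegroai/trains-agent.git`, and passing `ssh_port=2222` inserts `:2222` after the host.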

PyPI v0.15.0

01 Jun 16:59

Features

  • Add daemon Services Mode (daemon --services-mode) where the daemon spins a task in its own docker and verifies start-up and shut-down. This allows multiple tasks to be launched simultaneously on the same machine (currently in CPU mode only), where each task service will register itself as a worker for the lifetime of the task
  • Enhance build --docker mode
    • Add --install-globally option to install required packages in the docker's system python
    • Add --entry-point option to allow automatic task cloning when running the docker
  • Support PyTorch Nightly builds using the agent.torch_nightly configuration flag. If true, the agent looks for a nightly build when a stable torch wheel is not found
  • Add environment variables support for git user/password
    • Using TRAINS_AGENT_GIT_USER/TRAINS_AGENT_GIT_PASS
    • Pass git credentials to dockerized experiment execution
  • Support running code from module (i.e. -m in execution entry point)
  • Add daemon --create-queue to automatically create a queue and use it if queue name doesn't exist in the server
  • Move --gpus and --cpu-only to worker args (used by daemon, execute and build)

Bug Fixes

  • Fix init wizard, correctly display the input servers #19
  • Fix version control links in requirements when using conda
  • Fix build --docker mode standalone docker execution
  • Improve docker host-mount support, use TRAINS_AGENT_DOCKER_HOST_MOUNT environment variable
  • Support pip v20.1 local/http package reference in pip freeze
  • Fix detached mode to correctly use cache folder slots
  • Fix CUDA_VISIBLE_DEVICES being set to "all", which should never happen (Trains Slack channel thread)
  • Do not monitor GPU when running with --cpu-only

PyPI v0.14.1

24 Mar 19:07

Features and Bug Fixes

  • Add daemon detached mode (--detached, -d) that runs the agent as daemon in the background and returns immediately
  • Auto mount ~/.git-credentials into docker container (if file exists)
  • Add TRAINS_AGENT_EXTRA_PYTHON_PATH environment variable to allow adding additional python path during experiment execution (helpful when using extra un-tracked modules)
  • Fix "run as user" feature (using TRAINS_AGENT_EXEC_USER environment variable)
  • Fix PyTorch support to ignore minor versions when looking for package to install/download
  • Fix experiment execution output handling

PyPI v0.14.0

12 Mar 17:43

Features and Bug Fixes

  • Add support for trains-agent execute --id <experiment-id> --docker that allows executing a specific experiment inside a docker container
  • Add support for trains-agent execute --id <template-experiment-id> --clone that clones the provided experiment and executes the cloned experiment
  • Add support for APIClient.models.delete() to allow programmatically deleting a model clearml/clearml-server#32
  • Add daemon support for passing storage-related OS environment variables to experiments executed inside a docker container (supported by trains>=0.13.3):
    • AWS: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION
    • Azure: AZURE_STORAGE_ACCOUNT and AZURE_STORAGE_KEY
    • Google: GOOGLE_APPLICATION_CREDENTIALS
  • Fix git checkout with submodules clearml/clearml#112
  • Prefer docker image from command line over the one specified in experiment
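
Forwarding the storage-related variables listed above into a container could look roughly like this. The sketch builds `-e NAME=value` arguments for a docker run command line; the `docker_env_args` helper and the `STORAGE_ENV_VARS` tuple are hypothetical names, and how the daemon actually assembles its docker command is not shown in these notes.

```python
import os
from typing import List, Mapping

# Variable names taken from the release note above.
STORAGE_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_DEFAULT_REGION",
    "AZURE_STORAGE_ACCOUNT",
    "AZURE_STORAGE_KEY",
    "GOOGLE_APPLICATION_CREDENTIALS",
)


def docker_env_args(environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return docker run '-e' arguments for every storage variable that is set."""
    args: List[str] = []
    for name in STORAGE_ENV_VARS:
        value = environ.get(name)
        if value:
            args += ["-e", f"{name}={value}"]
    return args
```

Unset variables are skipped rather than passed as empty strings. GOOGLE_APPLICATION_CREDENTIALS holds a file path, so the file itself would also have to be reachable inside the container; the sketch does not cover that.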

PyPI v0.13.3

09 Mar 14:08

Features and Bug Fixes

  • Allow providing queue names instead of queue IDs in daemon mode
  • Docker mode improvements
    • Support running as a specific user inside a docker using the TRAINS_AGENT_EXEC_USER environment flag
    • Pass correct GPU limit when skipping gpus flag
    • Add --force-current-version daemon command-line flag
  • Add K8s/trains glue service example
  • Add K8s support in daemon mode
    • Running inside a K8s pod
    • Mounting dockerized experiment folders to host
    • Allow a specific network for the docker
  • Add default storage environment vars (for AWS, GS and Azure) to generated agent configuration
  • Improve Unicode/UTF stdout handling

PyPI v0.13.2

23 Feb 14:02

Features and Bug Fixes

  • Pre-install numpy if it exists in the requirements
  • Add experiment archiving example
  • Add .bashrc reloading before running trains-agent in the AWS dynamic cluster management service
  • Add support for pulling recursive git modules as well as the main project
  • Limit virtualenv version to <20 due to an import issue in v20.0.0
  • Fix pip install/upgrade with limit in conda
  • Fix daemon monitor to not stop experiments if network is down