Anaconda Virtual Environments

Managing Virtual Environments In Anaconda

Introduction

Creating virtual environments gives you the ability to have one or more custom environments for your projects.

Why create virtual environments?

Assume that you have two separate programs, each using different versions of Python and/or other libraries. Rather than repeatedly modifying your setup to meet the version requirements of each program, you can create multiple virtual environments (one for each program) and keep each set of dependencies isolated and clean.

Creating a Virtual environment

To get started, first load the Anaconda module. Anaconda is a tool that aims to simplify package management and deployment of data science and machine learning tools.

  1. Log in to Discovery and run

    module load anaconda
  2. Before creating Anaconda virtual environments on Discovery, note that several ready-made virtual environments are available out of the box, each tailored to specific project needs. These include TensorFlow (with GPU support), PyTorch, and QIIME. If none of them meets your needs, you can go ahead and create your own.

    Show the list of all existing environments

    conda env list

    Output

    # conda environments:
    #
    base                  *  /software/anaconda/anaconda3
    alfalfa_gbs              /software/anaconda/anaconda3/envs/alfalfa_gbs
    amptk-1.4.2              /software/anaconda/anaconda3/envs/amptk-1.4.2
    hsc_prediction           /software/anaconda/anaconda3/envs/hsc_prediction
    pytorch                  /software/anaconda/anaconda3/envs/pytorch
    qiime2-2019.10           /software/anaconda/anaconda3/envs/qiime2-2019.10
    redbiom                  /software/anaconda/anaconda3/envs/redbiom
    soilsystems              /software/anaconda/anaconda3/envs/soilsystems
    tensorflow-1.15.0        /software/anaconda/anaconda3/envs/tensorflow-1.15.0
    tensorflow-2.0.0         /software/anaconda/anaconda3/envs/tensorflow-2.0.0
    tensorflow-gpu-1.15.0    /software/anaconda/anaconda3/envs/tensorflow-gpu-1.15.0
    tensorflow-gpu-2.0.0     /software/anaconda/anaconda3/envs/tensorflow-gpu-2.0.0

    The output above lists the virtual environments on Discovery and their respective locations. The asterisk (*) marks the currently active virtual environment, which here is base, the default environment.

  3. Create the environment

    syntax: conda create -n <your_environment> --no-channel-priority -c <channel_name> <package_name>

    This command only creates the environment without installing any packages.

    $ conda create -n my_env

    OR

    R

    This command creates the environment and also installs some essential packages for working with R, all on one line.

    $ conda create -n my_env -c r r-essentials

    Here, my_env is the name of the virtual environment and r is the channel from which the r-essentials package is installed.

    Python

    This command creates the environment and also installs some essential packages for working with Python, all on one line. You can also specify the version of Python you would like to install.

    $ conda create -n my_env python=3.6

    Here, my_env is the name of the virtual environment and python=3.6 is the package and version you want to install in this new environment.

    Flags Explained

    Flag               Description
    -n or --name       The name of the virtual environment
    -c or --channel    Anaconda packages are downloaded from remote channels, which are URLs to directories containing the conda packages.

    Conda now takes a little while to find the package(s) you specified, then downloads and installs them into your home directory under /home/yourusername/.conda/envs/my_env. When you see the prompt Proceed ([y]/n)?, press y and then Enter to continue with the installation.

    Once this phase is done, the end of the output printed on your console should look like the one below.

    ...
    
    Preparing transaction: done
    Verifying transaction: done
    Executing transaction: done
    #
    # To activate this environment, use
    #
    #     $ conda activate my_env
    #
    # To deactivate an active environment, use
    #
    #     $ conda deactivate
  4. Activate your newly created environment

    To start using the packages installed in your environment, activate the environment you just created using the command below.

    syntax: conda activate <your_environment>

    $ conda activate my_env

    After the environment has been activated, your shell prompt changes from:

    [yourusername@discovery-l1 ~]$

    to

    (my_env) [yourusername@discovery-l1 ~]$

    This shows that you are currently in the my_env environment. Also, when you run the command conda env list, you should see the asterisk * symbol on the my_env line.

  5. Show the list of installed packages

    Use the conda list command to show the list of packages installed.

    (my_env) [yourusername@discovery-l1 ~]$ conda list

    You should get output like the listings below. The first listing is from the R example and the second is from the Python example.

    ...
    
    r-xml2                    1.2.0             r36h29659fb_0
    r-xtable                  1.8_4             r36h6115d3f_0
    r-xts                     0.11_2            r36h96ca727_0
    r-yaml                    2.2.0             r36h96ca727_0
    r-zoo                     1.8_6             r36h96ca727_0
    
    ...
    ...
    
    pip                       20.1.1                   py36_1
    python                    3.6.10               h7579374_2
    readline                  8.0                  h7b6447c_0
    setuptools                49.2.0                   py36_0
    sqlite                    3.32.3               h62c20be_0
    tk                        8.6.10               hbc83047_0
    
    ...

    Instead of showing the entire list, you can use the grep command to quickly search and verify if a given package is installed.

    (my_env) [yourusername@discovery-l1 ~]$ conda list | grep -i wheel

    If, after going through the installed packages, you find that the package you want is not among them, you can search for it on the conda channels and then install it.

  6. Installing additional packages to your environment

    For the R programming example, search for and install the r-devtools package. devtools makes your life as a package developer easier by providing R functions that simplify many common tasks.

    For the Python programming example, search for and install the scipy package. SciPy is a free and open-source Python library used for scientific and technical computing.

    • Search for the package(s):

      syntax: conda search <package_name(s)>

      R

      (my_env) [yourusername@discovery-l1 ~]$ conda search r-devtools

      Python

      (my_env) [yourusername@discovery-l1 ~]$ conda search scipy

      The R example above should print a list of r-devtools versions alongside their channels, and the Python example should print a list of scipy versions with their channels. Whether or not you require the latest version of the package, choose the version that suits your purpose and also specify its channel.

    • Search for the package(s) by channel:

      syntax: conda search -c <channel> <package_name(s)>

      R

      (my_env) [yourusername@discovery-l1 ~]$ conda search -c r -c conda-forge r-devtools

      Python

      (my_env) [yourusername@discovery-l1 ~]$ conda search -c conda-forge scipy
    • In the R example, notice that two channels, r and conda-forge, are specified, so conda looks for the devtools package on both channels. This is because either channel could have the most recent version of the package you are looking for.

      The first -c argument has higher priority than the second; priority decreases from left to right. conda-forge is a community channel made up of thousands of contributors.

      Although specifying the channel is optional, it remains good practice to pass the --channel or -c flag, because the search then also lists the versions of the package available on that channel.

    • Install the package(s):

      syntax: conda install <package(s)>

      R

      (my_env) [yourusername@discovery-l1 ~]$ conda install r-devtools

      Python

      (my_env) [yourusername@discovery-l1 ~]$ conda install scipy

      Note that when installing a package, conda also installs all the dependencies required by that package.

  7. Run the installed program

    • For the R example, launch the R CLI by typing the letter R on your console. Notice the uppercase.

      (my_env) [yourusername@discovery-l1 ~]$ R

      Now you should get an output like the one below.

      R version 3.6.1 (2019-07-05) -- "Action of the Toes"
      Copyright (C) 2019 The R Foundation for Statistical Computing
      Platform: x86_64-conda_cos6-linux-gnu (64-bit)
      
      R is free software and comes with ABSOLUTELY NO WARRANTY.
      You are welcome to redistribute it under certain conditions.
      Type 'license()' or 'licence()' for distribution details.
      
      Natural language support but running in an English locale
      
      R is a collaborative project with many contributors.
      Type 'contributors()' for more information and
      'citation()' on how to cite R or R packages in publications.
      
      Type 'demo()' for some demos, 'help()' for on-line help, or
      'help.start()' for an HTML browser interface to help.
      Type 'q()' to quit R.
      
      >
    • Next, update to the latest version of devtools.

      > devtools::install_github("hadley/devtools")

      If everything works correctly, you should be presented with a list of packages that have more recent versions available. Enter one or more numbers to select the packages you intend to update, or simply enter 1 to update all packages.

    • For the Python example, launch the Python CLI by typing python on your console.

      (my_env) [yourusername@discovery-l1 ~]$ python

      Now you should get an output like the one below.

      Python 3.6.10 |Anaconda, Inc.| (default, May  8 2020, 02:54:21)
      [GCC 7.3.0] on linux
      Type "help", "copyright", "credits" or "license" for more information.
      >>>

      Notice that the version of Python installed is 3.6.

    • Next, import and use the scipy module.

      >>> from scipy.special import cbrt
      >>> cb = cbrt([27, 64])
      >>> print(cb)
      
      [3. 4.]

      The output is [3. 4.].
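The checks in steps 4 and 5 above can also be made from inside the Python interpreter. The sketch below is a minimal example that uses only the standard library. The helper names active_conda_env and is_importable are hypothetical, and it assumes that conda activate exports the CONDA_DEFAULT_ENV variable in your shell.

```python
import importlib.util
import os


def active_conda_env(environ=os.environ):
    """Return the name of the active conda environment, or None if unset."""
    # Assumption: `conda activate` exports CONDA_DEFAULT_ENV.
    return environ.get("CONDA_DEFAULT_ENV")


def is_importable(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Active environment:", active_conda_env())
    for name in ("scipy", "camelcase"):
        print(name, "importable:", is_importable(name))
```

If a module is reported as not importable, the environment you expected is probably not active, or the package still needs to be installed into it.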

Exiting a conda environment

$ conda deactivate

Searching for a conda package

Syntax: conda search <package_name>

Example command to search for r-devtools

$ conda search r-devtools

Cloning an existing conda virtual environment

There are occasions where you want to install extra packages into an existing virtual environment but do not have the necessary permissions to do so. In that case, your best bet is to create a new environment.

Rather than recreating the entire virtual environment, which includes re-downloading the required dependencies, you can simply clone the existing virtual environment and activate the clone. You can then install your own packages into it, saving time and extra work.

Syntax:

conda create --name <your_custom_environment_name> --clone <existing_environment_name>

$ conda create --name myproject --clone my_env

Here, my_env is the name of the existing environment you intend to clone and myproject is the name of the new environment you are cloning into.
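Before cloning, it can help to confirm that the source environment actually exists. The sketch below parses text in the format printed by conda env list (as shown earlier) into a name-to-path mapping. The function name parse_env_list is hypothetical, and the parser assumes environment paths contain no spaces.

```python
def parse_env_list(text):
    """Parse `conda env list` text output into (envs, active_name).

    envs maps environment name -> path; active_name is the entry marked
    with an asterisk, or None if no entry is marked.
    """
    envs = {}
    active = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the comment header
        fields = line.split()
        if len(fields) < 2:
            continue  # skip rows that lack a name or a path
        name, path = fields[0], fields[-1]
        envs[name] = path
        if "*" in fields[1:-1]:
            active = name  # the asterisk sits between name and path
    return envs, active


sample = """\
# conda environments:
#
base                  *  /software/anaconda/anaconda3
my_env                   /home/yourusername/.conda/envs/my_env
pytorch                  /software/anaconda/anaconda3/envs/pytorch
"""

envs, active = parse_env_list(sample)
print(active)            # base
print("my_env" in envs)  # True
```

In practice, the text could come from running conda env list and capturing its output; the sample string here only mirrors the listing shown in this guide.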

Removing a Package From a conda Environment

Syntax:

conda remove --name <your_custom_environment_name> <package_name>

$ conda remove --name my_env scipy

Here, my_env is the name of the existing environment and scipy is the name of the package to remove from it. Conda also removes any packages in the environment that depend on the removed package.

Delete an environment and everything in it

If you want to destroy a given virtual environment that you created, you can use the command below which removes the environment and all the packages in it.

Syntax: conda env remove --name <your_custom_environment_name>

$ conda env remove --name my_env

Installing local packages in R

Local packages are packages that aren't included in Anaconda's package repository but are available elsewhere to be downloaded and installed manually.

In the example below, an R package called GWASpoly, which is used for genome-wide association studies in autopolyploids (and diploids), is downloaded and installed.

  1. Log in to Discovery.

  2. Load the anaconda module

    module load anaconda
  3. Download the package to your home directory (for example, /home/yourusername)

    $ wget https://potatobreeding.webhosting.cals.wisc.edu/wp-content/uploads/sites/161/2016/08/GWASpoly_download.zip
  4. Unzip the downloaded file and copy the file with the .tar.gz extension to your home directory

    Notice the trailing period at the end of the second command.

    $  unzip GWASpoly_download.zip
    $  cp GWASpoly_download/GWASpoly_1.3.tar.gz .
  5. Install the package

    R --slave -e "install.packages('/home/yourusername/GWASpoly_1.3.tar.gz', dependencies = TRUE, repos=NULL, method='libcurl')"

    If you get a warning message about a missing dependency, search for and install that dependency, then run step 5 again.

Managing Python Packages with pip

In the example below, the Python camelcase package, which capitalizes the first letter of each word, is installed using the pip command.

  1. Log in to Discovery.

  2. Load the anaconda module

    module load anaconda
  3. Activate the virtual environment

    conda activate my_env
  4. Use the pip command to install the camelcase package

    pip install camelcase
  5. After the package has been installed successfully, launch the Python CLI and enter the code below line by line.

    >>> import camelcase
    >>> c = camelcase.CamelCase()
    >>> txt = "hello world"
    >>> print(c.hump(txt))
    
    Hello World

    Thus, the text hello world is converted to a Camel case format.
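To make the transformation concrete, here is a plain-Python sketch of capitalizing the first letter of each word. It is an illustration only and is not the camelcase package's implementation, which may treat some words differently. capitalize_words is a hypothetical helper.

```python
def capitalize_words(text):
    """Upper-case the first character of each space-separated word."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


print(capitalize_words("hello world"))  # Hello World
```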

Using Anaconda virtual environments in your Slurm script

The example in this section assumes that you have carried out the steps in Managing Python Packages with pip.

After creating your virtual environment, you can use it in your Slurm script so that your program has access to the packages contained in the environment. To do so, add two lines right after the Slurm directives: module load anaconda and conda activate my_env, where my_env is the name of the virtual environment created earlier.

  1. Log in to Discovery.

  2. Create a file called script.sh and then copy and paste the code below and save afterward.

    #!/bin/bash
    
    #SBATCH --job-name=CamelCase   		## Name of the job
    #SBATCH --output=CamelCase.out  	 ## Output file
    #SBATCH --time=10:00           		## Job Duration
    #SBATCH --ntasks=1             		## Number of tasks (analyses) to run
    #SBATCH --cpus-per-task=1      		## The number of threads the code will use
    #SBATCH --mem-per-cpu=100      		## Real memory(MB) per CPU required by the job.
    
    ## Load the python interpreter
    module load anaconda
    conda activate my_env
    
    ## Execute the python script
    srun -n 1 python program.py

    On line 11, the anaconda module is loaded. On line 12, the custom Anaconda environment my_env, which contains the packages and dependencies that the project requires to run, is activated.

  3. Create a file called program.py and then copy and paste the code below and save afterward.

    import camelcase
    
    c = camelcase.CamelCase()
    
    txt = "hello world"
    
    print(c.hump(txt))
  4. Make the batch script executable

    chmod +x script.sh
  5. Submit the batch script

    sbatch script.sh

    Output

    If you display the content of the output file CamelCase.out, you should see a result like the one below on your console.

    Hello World

    Thus, the text hello world is converted to a Camel Case format.
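If you submit many similar jobs, the batch script can be generated instead of edited by hand. The sketch below is one possible approach, not part of Slurm or conda. render_batch_script and its parameters are hypothetical names, and the directives mirror the example script above, with the two environment lines placed right after the Slurm directives.

```python
def render_batch_script(job_name, env_name, command, minutes=10, mem_mb=100):
    """Return the text of a Slurm batch script that runs `command` in a conda environment."""
    lines = [
        "#!/bin/bash",
        "",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --output={job_name}.out",
        f"#SBATCH --time={minutes}:00",
        "#SBATCH --ntasks=1",
        "#SBATCH --cpus-per-task=1",
        f"#SBATCH --mem-per-cpu={mem_mb}",
        "",
        "## Load Anaconda and activate the environment",
        "module load anaconda",
        f"conda activate {env_name}",
        "",
        "## Run the program",
        f"srun -n 1 {command}",
    ]
    return "\n".join(lines) + "\n"


script = render_batch_script("CamelCase", "my_env", "python program.py")
print(script)
```

The returned text can be written to a file such as script.sh and submitted with sbatch as in step 5.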