Running your workflow in the clouds

Keiichiro Ono edited this page May 20, 2016 · 4 revisions

Running your workflows on servers

For some use cases, your laptop is not powerful enough to analyze large data sets. If you want to use Cytoscape to visualize the results of a complex data analysis pipeline, this is the setup you need. You can run the analysis on your private servers or on commercial cloud services and use Cytoscape as a frontend for those workflows.

In this document, I will show you how to set up your workspace using Cytoscape and a Jupyter Notebook server running in the cloud.

How it works

The figure above shows an overview of the setup. On your laptop, you need to run two client applications:

  1. Cytoscape Desktop (3.3+) - For visualizing network data
  2. Web Browser - For writing your own workflows in Jupyter Notebook

And on the server(s), you need to install the following:

  1. Data analysis software packages, such as numpy/scipy/pandas/igraph/R/Bioconductor/etc.
  2. Jupyter Notebook Server

In this setup, the Jupyter Notebook server running on a cloud service works as the hub for the entire workflow. You write your workflows in Jupyter Notebook and run all CPU- and I/O-intensive tasks on the servers. The results of the analysis can then be sent via cyREST to Cytoscape on your laptop, where you view them.
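To make the last step a bit more concrete, here is a minimal sketch of how an analysis result could be packaged on the server before it is sent to Cytoscape. It assumes the result is a simple list of (source, target) pairs and that the payload is Cytoscape.js-style JSON, which to my understanding is what cyREST accepts when a network is POSTed to `/v1/networks`. The function name `edges_to_cyjs` and the sample data are made up for illustration.

```python
import json


def edges_to_cyjs(edges, name="analysis result"):
    """Convert (source, target) pairs into a Cytoscape.js-style network dict.

    Hypothetical helper: the structure follows the Cytoscape.js JSON layout
    (a "data" block plus "elements" with "nodes" and "edges").
    """
    # Collect every node that appears in at least one edge
    node_ids = sorted({node for edge in edges for node in edge})
    return {
        "data": {"name": name},
        "elements": {
            "nodes": [{"data": {"id": node_id}} for node_id in node_ids],
            "edges": [
                {"data": {"source": source, "target": target}}
                for source, target in edges
            ],
        },
    }


# Example: a tiny result computed on the server (made-up data)
payload = edges_to_cyjs([("a", "b"), ("b", "c")], name="toy network")
body = json.dumps(payload)  # this string is what would travel to the laptop
print(len(payload["elements"]["nodes"]), "nodes,",
      len(payload["elements"]["edges"]), "edges")
```

The heavy computation stays on the server; only a compact JSON description of the network has to cross the connection to your laptop.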

How to set up

Essentially, there is no difference between running your workflow on your laptop and running it on remote servers, because Jupyter Notebook is a server application in both cases. The only difference is the IP address parameter for cyREST.

I assume that you are familiar with your firewall settings. You may need to open the ports for Jupyter Notebook and cyREST.
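As a rough illustration of that point, the sketch below builds the cyREST base URL for both cases. The helper name `cyrest_base_url` is hypothetical, and `192.0.2.10` is only a placeholder for your laptop's address; the `/v1` path prefix and port 1234 are the cyREST defaults.

```python
def cyrest_base_url(ip="localhost", port=1234):
    """Build the base URL of the cyREST API for a given Cytoscape host."""
    return "http://{}:{}/v1".format(ip, port)


# Notebook and Cytoscape on the same machine
local_url = cyrest_base_url()

# Notebook on a server, Cytoscape on your laptop
# (192.0.2.10 is a placeholder, not a real laptop address)
remote_url = cyrest_base_url(ip="192.0.2.10")

print(local_url)
print(remote_url)
```

Everything else in the workflow code stays the same between the two cases.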

For Python Users

  1. Install Cytoscape on your laptop (client machine)
  2. Check the IP address of your laptop and write it down.
  3. Check your laptop's firewall settings and make sure port 1234 is accessible from remote machines. Port 1234 is the default port number of cyREST; yours may be different if you have already configured another port number.
  4. Set up your server instances.
  5. Install all software packages required for your data analysis on the remote server(s).
  6. Install Jupyter Notebook on the remote server
  7. Check the Jupyter Notebook server's IP address and port. The default port number for Jupyter Notebook is 8888.
  8. Make sure your server opens the port for Jupyter Notebook
  9. Start the Jupyter Notebook server
  10. Access the notebook server from your web browser
  11. Start Cytoscape
  12. Start a new Jupyter Notebook
  13. Create a new code cell
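  14. Try the following sample code to make sure your notebook can access Cytoscape
from py2cytoscape.data.cyrest_client import CyRestClient
# Connect to cyREST running inside Cytoscape on your laptop
cy = CyRestClient(ip='YOUR_LAPTOP_IP_ADDRESS', port=1234)
# Create an empty network as a connectivity test
empty1 = cy.network.create()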

When you run this notebook cell, you should see an empty network appear in Cytoscape Desktop on your laptop.

Even if you use other programming languages like R or Java, you can still use this setup. You need a REST client library for your language, but as long as your servers can reach cyREST, you can drive Cytoscape from them. You just need to pass your laptop's IP address and port number to the REST client.
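To show that nothing here depends on a particular client library, the following sketch talks to cyREST with plain HTTP, using only the Python standard library. Any language with an HTTP client can make the same request. It assumes cyREST exposes a `/v1/version` resource that returns a small JSON object; the function name `get_cyrest_version` is hypothetical.

```python
import json
from urllib.request import urlopen


def get_cyrest_version(ip, port=1234, timeout=5):
    """Fetch version information from cyREST over plain HTTP.

    Assumes GET /v1/version returns a JSON object
    (for example with apiVersion and cytoscapeVersion keys).
    """
    url = "http://{}:{}/v1/version".format(ip, port)
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage from a notebook cell on the server (needs Cytoscape running on the laptop):
# info = get_cyrest_version("YOUR_LAPTOP_IP_ADDRESS")
# print(info)
```

If this call succeeds from the server, a REST client in R, Java, or any other language should be able to reach Cytoscape with the same address and port.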

Troubleshooting

If you cannot use Cytoscape from your server, it may be a firewall problem. Make sure:

  • The laptop IP address passed to the REST client is correct
  • Port 1234 (or the port number you specified in the Cytoscape properties) is open on your laptop
  • Your server is allowed to make outbound connections to remote machines, including your laptop
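A quick way to narrow down which item in the checklist is failing is to try a raw TCP connection from the server to your laptop. The sketch below is only a rough diagnostic, and the helper name `check_port` is made up. As a rule of thumb, a timeout often points to a firewall silently dropping packets, while a refused connection often means nothing is listening on that port (for example, Cytoscape is not running), but neither is guaranteed.

```python
import socket


def check_port(host, port=1234, timeout=3.0):
    """Try to open a TCP connection and report what happened.

    Returns "open", "refused", "timeout", or "error: <details>".
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on this port
        return "refused"
    except socket.timeout:
        # No answer at all within the time limit
        return "timeout"
    except OSError as exc:
        # Anything else, e.g. an unresolvable host name or no route to host
        return "error: {}".format(exc)


# Usage from a notebook cell on the server:
# print(check_port("YOUR_LAPTOP_IP_ADDRESS", 1234))
```

Run it from the server, not from your laptop, because the firewall rules that matter are the ones between the two machines.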