Inside the Collaboration That Built the Open Source JupyterLab Project

Researchers and programmers know the Jupyter Notebook as a web-based environment where they can do computational research with native support for code, math and data visualization.

Today at SciPy 2016, Bloomberg joined Continuum Analytics and Project Jupyter to reveal the new JupyterLab platform so that early adopters can help test the alpha release. JupyterLab creates a more desktop-like experience on the Web, one that rivals expensive software suites by letting programmers use familiar tricks like keyboard shortcuts, tabs and configurable editor layouts.

By creating a more developer-friendly interface, the platform will enable interactive and exploratory computing in new and exciting forms, going beyond what was previously possible with the popular Jupyter Notebook. Updated tools will allow researchers to more easily share their work, see how code works, make changes to it and publish the code and data they used to reach a scientific conclusion.

“What we’re seeing is that by lowering that barrier and by producing a tool that is very close to the natural workflow, it makes it sort of easy to do the right thing,” says Fernando Perez, a research scientist at UC Berkeley and one of the founders of the Berkeley Institute for Data Science. Perez created the original IPython in 2001 in grad school and stayed on as a lead developer as the project evolved into what is now called Jupyter. In the old way of doing things and sharing computational results, says Perez, “it was difficult, you had a lot of work to do.”

The JupyterLab project is also a powerful demonstration of collaboration among an open source project, expert developers in the project’s community, and an established corporation that contributes support and applied expertise. In this case, open source developers partnered with consulting firm Continuum Analytics and Bloomberg to hammer out the latest iteration of the tool. Bloomberg has been working with the IPython and Jupyter community since early 2014 and has made it a priority to support open source projects, encouraging its engineers to continue their work in the community and take active roles in projects.

“It’s a tri-party organization: we have the Jupyter developers, and we have Bloomberg, and we have Continuum,” says Sylvain Corlay, a quantitative researcher at Bloomberg and a contributor to Project Jupyter. He says it’s not atypical for Bloomberg developers to spend a lot of time contributing to open source; BQPlot, an open source plotting framework for Jupyter, is another Bloomberg-created project.

Project Jupyter itself isn’t limited to just Python; in fact it makes the building blocks of science reproducible across more than 40 programming languages. The versatility of the project also opens its usability beyond programmers to data journalists.

The “computational narratives” that Project Jupyter can create put source code, mathematical equations, text, graphs and visualizations, and other media into a live research paper that allows authors to show their work. The old way: publish a paper as an inert PDF, perhaps with a pointer to a repository somewhere with bundles of scripts.

As for all that manual work: gone. Today, GitHub automatically renders any Jupyter notebook files in a repository as static web pages, and the technical publisher O’Reilly accepts Jupyter notebook files in their online publishing workflow.

Replicating desktop software in a web browser isn’t easy. For that, Bloomberg invested in an open source library, PhosphorJS, to build a solid foundation for deploying onto the web. Chris Colbert, the PhosphorJS team lead and software architect at Continuum Analytics, states, “even outside of the technical computing space, the web application that we built was complex.”

As a major contributor to the JupyterLab interface, Colbert believes the new UI will make Project Jupyter a bellwether for browser-based applications. “The new UI delivers a richer user experience that is more dynamic and easier to use for data scientists creating Jupyter notebooks,” continues Colbert. Other major Continuum Analytics contributors to PhosphorJS include Steven Silvester, Afshin Darian and Dave Willmer.

The PhosphorJS library that powers the new JupyterLab widgets is an extensible environment for building web applications. It contains a collection of various components, data structures, and algorithms which are useful when building desktop-like applications on the Web. It also includes a widget UI framework which enables rich and responsive layouts that cannot be achieved using CSS alone.

PhosphorJS also enables JupyterLab to expand, allowing users to customize and create widgets. This, Bloomberg’s Jason Grout says, is a defining feature of JupyterLab: “It brings us all back together into an extensible UI that lets you build a framework for other users,” he says. “[JupyterLab] unifies and realizes a large part of the project’s vision that’s been evolving separately in different pieces. It builds a platform for people to experiment with taking the UI to the next level.”

While the Jupyter Notebook is extremely popular, the underlying file format and messaging protocols defined by the Jupyter project mean that other notebook interfaces, such as the Beaker Notebook and the Nteract application, can also read documents and share kernels.
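
To make the shared file format concrete, here is a minimal sketch in Python of what a notebook document looks like on disk. A notebook file is plain JSON, so any front end can parse it with ordinary tools. The top-level keys (nbformat, nbformat_minor, metadata, cells) follow version 4 of the notebook format; the cell contents and the helper name code_sources are invented for illustration and do not come from any Jupyter library.

```python
import json

# A minimal notebook document in the version 4 JSON layout.
# The cell contents are made up for this illustration.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 0,
    "metadata": {
        "kernelspec": {
            "name": "python3",
            "display_name": "Python 3",
            "language": "python",
        }
    },
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Falling body\n", "Distance is $d = \\frac{1}{2} g t^2$."],
        },
        {
            "cell_type": "code",
            "metadata": {},
            "execution_count": None,
            "outputs": [],
            "source": ["g = 9.81\n", "d = 0.5 * g * 2 ** 2"],
        },
    ],
}


def code_sources(nb):
    """Return the source text of each code cell, in document order.

    Hypothetical helper for this sketch, not part of any Jupyter package.
    """
    sources = []
    for cell in nb["cells"]:
        if cell["cell_type"] == "code":
            src = cell["source"]
            # The format allows source as one string or as a list of lines.
            sources.append(src if isinstance(src, str) else "".join(src))
    return sources


# Round-trip through JSON text, as a front end reading an .ipynb file would.
serialized = json.dumps(notebook, indent=1)
loaded = json.loads(serialized)
print(code_sources(loaded))
```

Because prose, math and code all live in one ordinary JSON document, a different interface only needs a JSON parser and knowledge of the cell layout to read the same file. Running the code against a live kernel is a separate concern handled by the messaging protocol, which this sketch does not cover.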

The new UI and workflow improvements will further the appeal of Jupyter Notebook to all researchers, but the newest users might come from the humanities. “Originally IPython was used primarily by academic researchers, a lot of people in the hard sciences,” says Brian Granger, a Physics Professor at Cal Poly San Luis Obispo, and a cofounder of the project along with Perez. Both Granger and Perez have separate academic grants supporting their work.

According to Perez, it wasn’t until 2011 that usage of IPython notebooks started to grow beyond the scientific community. Today, it’s spreading quickly. “I know people in the humanities using Jupyter, and people using it in bioinformatics and finance. And areas of commercial data science, [like] journalism. BuzzFeed’s data journalists actually release sets of Jupyter Notebooks with articles that allows people to reproduce their work.”

With the expanding abilities and usage of JupyterLab, the broader community now has more opportunity ahead. “We are motivated to build the most powerful tools possible, not only for our own projects, but for our partners and clients,” says Zach Haehn, Head of Software Engineering San Francisco at Bloomberg. “By enabling our clients to combine Bloomberg data with their own to expand research and analysis, we bring more value to the technology community at large.”

To learn more about JupyterLab, watch the video presentation from SciPy 2016.