# Creating Interactive Dashboards

## [Creating Interactive Dashboards from Jupyter Notebooks](https://pbpython.com/interactive-dashboards.html)

![](https://3650666907-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-M16IoaANYRvc0I-U9uw%2F-MCbdk_x5xCn0wyF5QhM%2F-MCbe2rZ80j-IPfWO5Es%2Fimage.png?alt=media\&token=a147b59a-63fa-4bc1-b6b1-f9fb03506d85)

### Introduction

I am pleased to have another guest post from [Duarte O.Carmo](https://duarteocarmo.com/). He wrote [series](https://pbpython.com/papermil-rclone-report-1.html) of [posts](https://pbpython.com/papermil-rclone-report-2.html) in July on report generation with Papermill that were very well received. In this article, he will explore how to use [Voilà](https://github.com/voila-dashboards/voila) and [Plotly Express](https://plot.ly/python/plotly-express/) to convert a Jupyter notebook into a standalone interactive web site. In addition, this article will show examples of collecting data through an API endpoint, performing sentiment analysis on that data and show multiple approaches to deploying the dashboard.

### About Duarte

Hey everyone! My name is [Duarte O.Carmo](https://duarteocarmo.com/) and I’m a Consultant working at [Jabra](https://jabra.com/) that loves working with python and data. Make sure to visit [my website](https://duarteocarmo.com/) if you want to find more about me ![😀](https://pbpython.com/images/smiley.png)

Since this is a long article, here are the table of contents for easier navigation:

* [The Goal](https://pbpython.com/interactive-dashboards.html#id1)
* [Getting Live Reddit Data](https://pbpython.com/interactive-dashboards.html#id2)
* [Analyzing Reddit Data with Plotly Express](https://pbpython.com/interactive-dashboards.html#analyzing-the-data-with-plotly-express)
  * [In What Subreddits does the work python appear more](https://pbpython.com/interactive-dashboards.html#in-what-subreddits-does-the-word-python-appear-more)
  * [What are the most up voted comments with the word python](https://pbpython.com/interactive-dashboards.html#id3)
  * [What is the sentiment in r/python across time](https://pbpython.com/interactive-dashboards.html#what-is-the-sentiment-in-r-python-across-time-introducing-textblob)
* [Creating a Live Dashboard with Voilà](https://pbpython.com/interactive-dashboards.html#id4)
* [Deploying your notebook to the web](https://pbpython.com/interactive-dashboards.html#id5)
  * [First Option - Use Binder](https://pbpython.com/interactive-dashboards.html#first-option-using-binder)
  * [Second Option - Using an Ubuntu Server in a Hacky Way with Tmux](https://pbpython.com/interactive-dashboards.html#second-option-using-an-ubuntu-server-in-a-hacky-way-with-tmux-see-no-evil)
  * [Using Heroku or Google Cloud Platform](https://pbpython.com/interactive-dashboards.html#id6)
* [Conclusion](https://pbpython.com/interactive-dashboards.html#id7)

### The Goal

Jupyter notebooks are one of my favorite tools to work with data, they are simple to use, fast to set up, and flexible. However, they do have their disadvantages: source control, collaboration, and reproducibility are just some of them. As I illustrated in my prior post, I tend to enjoy [seeing what I can accomplish with them](https://pbpython.com/papermil-rclone-report-1.html).

An increasing need is the sharing of our notebooks. Sure, you can export your notebooks to html, pdf, or even use something like [nbviewer](https://nbviewer.jupyter.org/) to share them. But what if your data changes constantly? What if every time you run your notebook, you expect to see something different? How can you go about sharing something like that?

> But what if your data changes constantly? What if every time you run your notebook, you expect to see something different? How can you go about sharing something like that?

In this article, I’ll show you how to create a Jupyter Notebook that fetches live data, builds an interactive plot and then how to deploy it as a live dashboard. When you want to share the dashboard, all you need to share with someone is a link.

Let’s have some fun with the data first.

### Getting live Reddit data

We will use Reddit as the source of data for our dashboard. Reddit is a tremendous source of information, and there are a million ways to get access to it. One of my favorite ways to access the data is through a small API called pushshift. The [documentation is right here](https://github.com/pushshift/api).

Let’s say you wanted the most recent comments mentioning the word “python”. In python, you could use [requests](https://requests.readthedocs.io/en/master/) to get a json version of the data:

```python
import requests
url = "https://api.pushshift.io/reddit/search/comment/?q=python"
request = requests.get(url)
json_response = request.json()
```

You can add a [multitude of parameters](https://github.com/pushshift/api#search-parameters-for-comments) to this request, such as:

* in a certain subreddit
* after a certain day
* sorted by up votes
* many more

To make my life easier, I built a function that allows me to call this API as a function:

```python
def get_pushshift_data(data_type, **kwargs):
    """
    Gets data from the pushshift api.

    data_type can be 'comment' or 'submission'
    The rest of the args are interpreted as payload.

    Read more: https://github.com/pushshift/api
    """

    base_url = f"https://api.pushshift.io/reddit/search/{data_type}/"
    payload = kwargs
    request = requests.get(base_url, params=payload)
    return request.json()
```

Using the `payload` parameter and `kwargs` I can then add any payload I wish as a function. For example,

```python
get_pushshift_data(data_type="comment",     # give me comments
                   q="python",              # that mention 'python'
                   after="48h",             # in the last 48 hours
                   size=1000,               # maximum 1000 comments
                   sort_type="score",       # sort them by score
                   sort="desc")             # sort descending
```

returns the json response. Pretty sweet right?

### Analyzing the data with Plotly Express

#### In what subreddits does the word ‘python’ appear more?

To answer the above question, we start by getting the data with our function:

```python
data = get_pushshift_data(data_type="comment",
                          q="python",
                          after="48h",
                          size=1000,
                          aggs="subreddit")
```

The `aggs` keyword asks pushshift to return an aggregation into subreddits, which basically means, group the results by subreddit. ([read about it in the documentation](https://github.com/pushshift/api#using-the-subreddit-aggregation))

Since the json response is pretty nested, we’ll need to navigate a bit inside of the dictionary.

```python
data = data.get("aggs").get("subreddit")
```

And we transform the list of dictionaries returned into a pandas DataFrame, and get the top 10.

```python
df = pandas.DataFrame.from_records(data)[0:10]
```

Here’s what our DataFrame looks like:

|   | doc\_count | key              |
| - | ---------- | ---------------- |
| 0 | 352        | learnpython      |
| 1 | 220        | AskReddit        |
| 2 | 177        | Python           |
| 3 | 139        | learnprogramming |

These are the names of the subreddits where the word `python` appears most frequently in their comments ![🎉](https://pbpython.com/images/party_popper.png) ![🎉](https://pbpython.com/images/party_popper.png) !

Let’s plot our results with the [Ploty Express](https://nbviewer.jupyter.org/github/plotly/plotly_express/blob/gh-pages/walkthrough.ipynb) library. Plotly Express is great, in my opinion, if you want to:

* create figures quickly. ![🚅](https://pbpython.com/images/bullet_train.png)
* create figures that are a bit more interactive than [matplotlib](https://matplotlib.org/). ![👋](https://pbpython.com/images/hand_wave.png)
* don’t mind a bit more installation and (imo) a bit less documentation. ![😰](https://pbpython.com/images/anxious-face.png)

Here’s all the code you need:

```python
import plotly.express as px

px.bar(df,              # our dataframe
       x="key",         # x will be the 'key' column of the dataframe
       y="doc_count",   # y will be the 'doc_count' column of the dataframe
       title=f"Subreddits with most activity - comments with 'python' in the last 48h",
       labels={"doc_count": "# comments","key": "Subreddits"}, # the axis names
       color_discrete_sequence=["blueviolet"], # the colors used
       height=500,
       width=800)
```

Yes, perhaps a bit more verbose than matplotlib, but you get an interactive chart!

![First example of dashboard image](https://pbpython.com/images/dashboard_figure_1.png)

All of the details are included in [the notebook](https://github.com/duarteocarmo/interactive-dashboard-post/blob/master/notebooks/Dashboard.ipynb) for this article.

#### What are the most up-voted comments with the word ‘python’?

To answer this question, our function will again come in handy. Let’s aggregate things a bit.

Don’t get scared, this is a one liner that will produce similar results to above:

```python
# get the data we need using the function
data = get_pushshift_data(data_type="comment", q="python", after="7d", size=10, sort_type="score", sort="desc").get("data")

# we only care about certain columns
df = pandas.DataFrame.from_records(data)[["author", "subreddit", "score", "body", "permalink"]]

# we only keep the first X characters of the body of the comment (sometimes they are too big)
df['body'] = df['body'].str[0:400] + "..."

# we append the string to all the permalink entries so that we have a link to the comment
df['permalink'] = "https://reddit.com" + df['permalink'].astype(str)

# style the last column to be clickable and print
df.style.format({'permalink': make_clickable})
```

To make a DataFrame column clickable you can can apply the following function to it:

```python
def make_clickable(val):
    """ Makes a pandas column clickable by wrapping it in some html.
    """
    return '<a href="{}">Link</a>'.format(val,val)
```

The above code will return the top 10 most upvoted comments of the last 7 days:

|   | author               | subreddit              | sc or e | body                                                | permalink |
| - | -------------------- | ---------------------- | ------- | --------------------------------------------------- | --------- |
| 0 | Saiboo               | learnpyth on           | 11 1    | Suppose you create the following python file calle… | Link      |
| 1 | Kompakt              | Programme rHumor       | 92      | Some languages don’t have switch statements…look…   | Link      |
| 2 | clown\_ world\_ 2020 | MrRobot                | 47      | Just goes to show that Esmail isn’t the only brill… | Link      |
| 3 | Leebert ysauce       | AnimalsBe ingBros      | 28      | They wont even be mad when the python decide to ta… | Link      |
| 4 | Kompakt              | Programme rHumor       | 23      | Yep it’s true, and depending on the design of the … | Link      |
| 5 | niceboy 4431         | Cringetop ia           | 23      | I have a theory (someone prove me wrong if you kno… | Link      |
| 6 | kinggur u            | Denmark                | 22      | Brug af Python: **+1** Brug af Python 3: **+2** …   | Link      |
| 7 | MintyAr oma          | totalwar               | 20      | We really need Bretonnian Men-At-Arms shouting Mon… | Link      |
| 8 | aspirin gtobeme      | gifsthatk eepongivi ng | 19      | Amazing. Brought \[this Monty Python clip]\(…       | Link      |
| 9 | Crimson Spooker      | TwoBestFr iendsPlay    | 19      | “Why can’t Three Houses be gritty and “realistic” … | Link      |

In the notebook, you can click the link column to be taken right into the comment. Hooray! ![🎉](https://pbpython.com/images/party_popper.png)

#### What is the sentiment in /r/python across time? Introducing [TextBlob](https://textblob.readthedocs.io/en/dev/)

Alright, the final analysis is a bit more complicated. We want to see the sentiment in the [/r/python](https://reddit.com/r/python) subreddit in some sort of time line.

First, we already now how to retrieve the most up voted comments of the past 2 days:

```python
# get the data with our function
data = get_pushshift_data(data_type="comment",
                          after="2d",
                          size=1000,
                          sort_type="score",
                          sort="desc",
                          subreddit="python").get("data")

# define a list of columns we want to keep
columns_of_interest = ["author", "body", "created_utc", "score", "permalink"]

# transform the response into a dataframe
df = pandas.DataFrame.from_records(data)[columns_of_interest]
```

This gives us a pandas DataFrame with the columns specified in `columns_of_interest` . But how do we get the sentiment of every comment?

Enter [TextBlob](https://textblob.readthedocs.io/en/dev/). A simple library that makes it ridiculously easy to get the sentiment of a sentence. Textblob returns two values, the sentiment **polarity** (-1 is negative; 0 is neutral; and 1 is positive) and the sentiment **subjectivity** (0 is objective and 1 is subjective)

Here’s an example:

```python
import textblob

sentence1 = "Portugal is a horrible country. People drive like crazy animals."
print(textblob.TextBlob(sentence1).sentiment)
# -> Sentiment(polarity=-0.8, subjectivity=0.95)
# negative and subjective

sentence2 = "Portugal is the most beautiful country in the world because beaches face west."
print(textblob.TextBlob(sentence2).sentiment)
# -> Sentiment(polarity=0.675, subjectivity=0.75)
# positive and less subjective
```

Read more about the library [here](https://textblob.readthedocs.io/en/dev/).

Now that we know how to extract sentiment from a piece of text, we can easily create some other columns for our DataFrame of comments:

```python
# create a column with sentiment polarity
df["sentiment_polarity"] = df.apply(lambda row: textblob.TextBlob(row["body"]).sentiment.polarity, axis=1)

# create a column with sentiment subjectivity
df["sentiment_subjectivity"] = df.apply(lambda row: textblob.TextBlob(row["body"]).sentiment.subjectivity, axis=1)

# create a column with 'positive' or 'negative' depending on sentiment_polarity
df["sentiment"] = df.apply(lambda row: "positive" if row["sentiment_polarity"] >= 0 else "negative", axis=1)

# create a column with a text preview that shows the first 50 characters
df["preview"] = df["body"].str[0:50]

# take the created_utc parameter and tranform it into a datetime column
df["date"] = pandas.to_datetime(df['created_utc'],unit='s')
```

Finally, it’s time to plot our figure with the help of Plotly Express:

```python
px.scatter(df, x="date", # date on the x axis
               y="sentiment_polarity", # sentiment on the y axis
               hover_data=["author", "permalink", "preview"], # data to show on hover
               color_discrete_sequence=["lightseagreen", "indianred"], # colors to use
               color="sentiment", # what should the color depend on?
               size="score", # the more votes, the bigger the circle
               size_max=10, # not too big
               labels={"sentiment_polarity": "Comment positivity", "date": "Date comment was posted"}, # axis names
               title=f"Comment sentiment in /r/python for the past 48h", # title of figure
          )
```

And here’s the output!

![Dashboard sentiment analysis](https://pbpython.com/images/dashboard_sentiment.png)

In this view, we can see the comments made in /r/python in the last 48 hours. We can see that most comments are rather on the positive side, but some are also negative. In your own notebook you’ll notice that you can hover over the comments and read the preview to see why they were classified as negative or positive.

The cool thing here is that if you run the same script tomorrow, you’ll get a different output.

So how can we have this in some place that “automatically” is updated whenever we see it?

### Creating a live dashboard with [Voilà](https://github.com/voila-dashboards/voila)

Voilà has a simple premise: “*Voilà turns Jupyter notebooks into standalone web applications.*”

Let’s back up a bit, and get everything you need running in your system. First step is to have a working setup with everything above, for that, follow these [instructions](https://github.com/duarteocarmo/interactive-dashboard-post) .

Once that is done, you should be able to launch the dashboard with:

```python
(env) $ voila notebooks/Dashboard.ipynb
```

Now, you should be able to see a web like application in a new tab in your browser from the notebook we created!

Feel free to modify this notebook according to your interests. You’ll notice that I have created some general variables in the first notebook cell, so you can fire up Jupyter Lab, and modify them and see what comes out!

Here are the general modifiable cells:

```python
COMMENT_COLOR         = "blueviolet"    # color for your comment graph
SUBMISSION_COLOR      = "darkorange"    # color for your submission graph
TEXT_PREVIEW_SIZE     = 240             # how long should the preview be?
TERM_OF_INTEREST      = "python"        # maybe you are interested in some other term?
SUBREDDIT_OF_INTEREST = "python"        # maybe you are interested in some other subreddit?
TIMEFRAME             = "48h"           # you can define another timeline
```

Once you have modified your dashboard, you can launch Voilà again to see the results.

The most important thing about Voilà is that every time it runs, it actually re-runs your whole code, which yes, makes things a bit slow, but also means that the results get updated every time the page is refreshed! :tada:

### Deploying your notebook to the web

#### First option: Using [binder](https://mybinder.org/)

Binder helps you turn a simple GitHub repo into an interactive notebook environment. They do this by using docker images to reproduce your GitHub repo’s setup.

We don’t really care about all that. We just want to publish our Voilà dashboard. To do that, follow these steps:

* Create a public [GitHub repo](https://help.github.com/en/github/getting-started-with-github/create-a-repo)
* Add the notebooks you want to publish as dashboards to it
* Add a `requirements.txt` file just as I have in the [example repo](https://github.com/duarteocarmo/interactive-dashboard-post/blob/master/requirements.txt) with all of your dependencies.
* Go to [mybinder.org](https://mybinder.org/)
* In the `GitHub` field add your repo’s URL.
* In the `GitHub branch, tag, or commit` field, add “master”, otherwise, you probably know what you are doing.
* In the `Path to a notebook field` add `/voila/render/path/to/notebook.ipynb` the `path/to/render` should be the location of your notebook in your repo. In the [example](https://github.com/duarteocarmo/interactive-dashboard-post), this results in `voila/render/notebooks/Dashboard.ipynb`
* In the `Path to a notebook field` toggle `URL` (instead of the default `file` option)
* Hit `launch`
* Your dashboard will automatically launch :open\_mouth: :tada:
* You can share the link with others and they will have access to the dashboard as well.

[Here is the running example of our reddit dashboard.](https://mybinder.org/v2/gh/duarteocarmo/interactive-dashboard-post/master?urlpath=%2Fvoila%2Frender%2Fnotebooks%2FDashboard.ipynb) (Takes a bit to build for the first time..)

#### Second Option: Using an ubuntu server in a hacky way with tmux ![🙈](https://pbpython.com/images/see_no_evil.png)

WARNING: This option is not 100% safe, so make sure to only use it for testing or proof of concepts, particularly if you are dealing with sensitive data!

If you want to have your dashboard running on a typical URL (such as mycooldash.com for example), you probably want to deploy it on a Linux server.

Here are the steps I used to accomplish that:

* Set up your virtual private server - [this](https://www.linode.com/docs/getting-started/) Linode guide is a good start.
* Make sure port 80 (the regular http port) is open

```python
$ sudo iptables -A INPUT -p tcp --dport 80 -j ACCEPT
```

* Once you have your repo in GitHub or somewhere else, clone it to your server.

```python
$ git clone https://github.com/your_username/your_awesome_repo.git
```

* You should already have python 3 installed. Try typing `python3` in your console. If that fails then [these instructions will help you](https://www.digitalocean.com/community/tutorials/how-to-install-python-3-and-set-up-a-local-programming-environment-on-ubuntu-16-04).
* Make sure you can run your dashboard, by [creating a virtual environment and installing the dependencies](https://github.com/duarteocarmo/interactive-dashboard-post#running-the-notebook).
* Now, if you type in your console the Voilà command, and specify the port:

```python
(env) $ voila YourNoteBook.ipynb --port=80
```

You can probably navigate to your server’s IP and see the dashboard. However, as soon as you exit your server, your dashboard will stop working. We are going to use a nifty trick with a tool called [tmux](https://en.wikipedia.org/wiki/Tmux).

Tmux is a “terminal multiplexer” (wow, that’s a big word). It basically allows us to create multiple terminal sessions at the same time, and then (yes you guessed it), keep them running indefinitely. If this sounds confusing, let’s just get to it.

* Install tmux:

```python
$ sudo apt-get install tmux
```

* Once installed we create a new terminal session:

```python
$ tmux new voila
```

* You are now inside a new terminal session. Let’s get Voilà running there.

```python
$ cd my_repo            # navigate to the repo
$ . env/bin/activate    # activate the environment
(env) $ voila MyNotebook.ipynb --port=80 # start the dashboard on port 80
```

* You should see the dashboard in your browser
* And now, for the magic, in your terminal hit `ctrl` + `b` and then `d` on your keyboard. This will “[detach](https://www.linode.com/docs/networking/ssh/persistent-terminal-sessions-with-tmux/#attach-and-detach-from-a-tmux-session)” you from that terminal where Voilà is running.
* You are now back to your original terminal session. Notice that your dashboard is still running. This is because your `voila` terminal session is still running.
* You can see it by listing the terminal sessions with:

```python
$ tmux ls
```

* And then attach to it via:

```
$ tmux attach voila
```

* And you’ll see your Voilà logs outputting.

This is arguably a bit of a hack to have things running, but it works - so no complaints there.

Tmux is an awesome tool, and you should definitely [learn more about it here](https://www.linode.com/docs/networking/ssh/persistent-terminal-sessions-with-tmux/#attach-and-detach-from-a-tmux-session).

#### Using Heroku or Google Cloud Platform

There are a million ways of deploying, and [Voilà also has good documentation on these](https://voila.readthedocs.io/en/stable/deploy.html).

### Conclusion

That was a long post! But we are finally done! Let’s summarize everything we learned:

* We learned how to transform an API endpoint into a function with `*kwargs`
* We learned how to analyze reddit data with python and [Plotly Express](https://plot.ly/python/plotly-express/)
* We learned how to analyze sentiment from sentences with [TextBlob](https://textblob.readthedocs.io/en/dev/)
* We learned how to transform a jupyter notebook into a dashboard using [Voilà](https://github.com/voila-dashboards/voila).
* We learned how to deploy those dashboards with [Binder.org](https://mybinder.readthedocs.io/en/latest/index.html)
* We learned how to use [tmux](https://en.wikipedia.org/wiki/Tmux) to deploy these kinds of tools in a server.

That was a lot of stuff, and probably there are a lot of bugs in my notebook, or explanation so make sure to:

* Visit the [GitHub repo](https://github.com/duarteocarmo/interactive-dashboard-post#running-the-notebook) were both the code and post are stored.
* If there is something wrong in the code please feel free to [submit an issue](https://github.com/duarteocarmo/interactive-dashboard-post/issues) or a [pull request](https://github.com/duarteocarmo/interactive-dashboard-post/pulls).
* [Tweet at me if you have questions](https://twitter.com/duarteocarmo)!
* Visit [my website](https://duarteocarmo.com/) if you want to learn more about my work ![😀](https://pbpython.com/images/smiley.png)

Reference :   <https://pbpython.com/interactive-dashboards.html>
