Overview

By default, Jekyll uses Markdown, which is easy to learn even for novices like me. For more data science and development work, I decided to experiment with Jupyter Notebooks. In this document, we will test the functionality of Jupyter Notebook by trying out various visualization tools available in Python. We will then use a shell command to convert this .ipynb file into .md format so that it can be displayed on our static website.
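As a rough sketch of the conversion step, the snippet below wraps the `jupyter nbconvert --to markdown` command in Python. The helper names and the file and folder names are hypothetical and chosen only for illustration; the post itself does not say which exact command or paths were used.

```python
import subprocess


def build_nbconvert_command(notebook_path, output_dir=None):
    """Return the argument list for converting a notebook to Markdown."""
    command = ["jupyter", "nbconvert", "--to", "markdown", notebook_path]
    if output_dir is not None:
        # --output-dir tells nbconvert where to write the .md file and its images.
        command += ["--output-dir", output_dir]
    return command


def convert_notebook(notebook_path, output_dir=None):
    """Run the conversion; raises CalledProcessError if nbconvert fails."""
    subprocess.run(build_nbconvert_command(notebook_path, output_dir), check=True)


# Hypothetical file and folder names, for illustration only.
print(" ".join(build_nbconvert_command("jupyter-test.ipynb", "_posts")))
```

Keeping the command in a list rather than a single string avoids shell quoting problems if the notebook name contains spaces.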

print("Hello World")
Hello World

We can also use libraries to create visualizations. Shown below is a simple representation of a sine graph created using numpy and matplotlib.

import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
%config InlineBackend.figure_format = 'retina'

x = np.linspace(0, 5, 10)
y = np.sin(x)
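# The "seaborn" style was renamed "seaborn-v0_8" in Matplotlib 3.6; use "seaborn" on older versions.
plt.style.use("seaborn-v0_8")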
plt.plot(x, y, color="skyblue", label="Sine")
plt.legend()
plt.show()

[Figure: sine curve]
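The sine plot uses only 10 sample points, so the curve is drawn as a handful of straight segments. The following sketch, which is an addition and not part of the original notebook, estimates how far those segments stray from the true curve by comparing against a finer grid. The choice of 200 points for the fine grid is arbitrary.

```python
import numpy as np

# Same range as the plot above, sampled coarsely and finely.
coarse = np.linspace(0, 5, 10)
fine = np.linspace(0, 5, 200)

# Spacing between neighbouring sample points.
coarse_step = coarse[1] - coarse[0]
fine_step = fine[1] - fine[0]

# Worst-case gap between the true sine curve and the straight
# segments that plt.plot would draw through the coarse samples.
interpolated = np.interp(fine, coarse, np.sin(coarse))
max_gap = np.max(np.abs(interpolated - np.sin(fine)))

print(f"coarse step: {coarse_step:.3f}, fine step: {fine_step:.3f}")
print(f"largest gap with 10 points: {max_gap:.3f}")
```

Increasing the third argument of `np.linspace` in the plotting code is the simplest way to get a smoother curve.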

Let’s take it a step further. Here is a simple experiment with subplots, again using matplotlib.

from scipy.stats import beta

fig, ax = plt.subplots(2, 2)  

ax[0][0].plot([1,2,3,4], color="skyblue")
ax[0][1].plot(np.random.randn(5, 10), np.random.randn(5,10), "mo--")
ax[1][0].plot(np.linspace(0, 5), np.cos(2 * np.pi * np.linspace(0, 5)), color="lime")
ax[1][1].plot(np.linspace(0, 1, 100), beta.pdf(np.linspace(0, 1, 100), 2, 5), color="gold")

plt.show()

[Figure: 2x2 grid of subplots]
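With several panels in one figure, titles help tell them apart. Here is a minimal sketch of one way to label each subplot by iterating over `ax.flat`. The panel titles and the plotted curves are made up for this example and differ from the ones in the notebook.

```python
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(2, 2, figsize=(8, 6))

x = np.linspace(0, 5, 100)
# Hypothetical panels: title mapped to the y-values to draw.
panels = {"Line": x, "Square": x**2, "Sine": np.sin(x), "Cosine": np.cos(x)}

# ax is a 2x2 array of Axes; ax.flat walks through it row by row.
for axis, (title, y) in zip(ax.flat, panels.items()):
    axis.plot(x, y, color="skyblue")
    axis.set_title(title)

# Adjust spacing so titles do not overlap the panels above them.
fig.tight_layout()
```

In a notebook the figure is displayed at the end of the cell; in a script, call `plt.show()` afterwards.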

Next, we use pandas to check whether basic spreadsheet-style tables are displayed properly in Jupyter.

import pandas as pd

df = pd.DataFrame(np.random.randn(5, 5))

print(df.head())
          0         1         2         3         4
0  0.123505  2.768920  0.274593 -1.038880  0.796700
1  0.145169  0.214949 -0.821644 -0.172914  0.020527
2 -0.269373  0.454270  0.231315 -1.258400 -0.430058
3 -1.608318  0.299697 -1.137735 -0.870107 -0.292700
4  0.829528  1.392714 -0.155681 -0.462716 -0.777012
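Because the table above is filled with random numbers, it changes on every run. As a small sketch of a reproducible variant, the snippet below fixes the random seed, names the columns, and prints summary statistics. The column letters, the seed value, and the name `df_named` are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# A fixed seed makes reruns produce the same table.
rng = np.random.default_rng(seed=0)

# Same 5x5 shape as above, but with named columns instead of 0-4.
df_named = pd.DataFrame(rng.standard_normal((5, 5)), columns=list("ABCDE"))

# describe() reports count, mean, std, min, quartiles, and max per column.
summary = df_named.describe()
print(summary.loc[["mean", "std"]].round(2))
```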

We can visualize this toy data using the DataFrame.plot method of pandas, much like we created visualizations with matplotlib above.

df.plot(kind='barh')
<AxesSubplot:>

[Figure: horizontal bar chart of the DataFrame]

seaborn is another powerful data visualization library. It is built on top of matplotlib, but it also provides higher-level plotting functions that its parent lacks, such as heatmap.

import seaborn as sns

flights = sns.load_dataset("flights")
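# Keyword arguments are required by pandas 2.0 and later.
flights = flights.pivot(index="month", columns="year", values="passengers")
ax = sns.heatmap(flights, annot=False, fmt="d")
# Columns (years) run along the x-axis and rows (months) along the y-axis.
plt.xlabel("Year")
plt.ylabel("Month")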
plt.show()

[Figure: heatmap of monthly passengers by year]
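The pivot call is what turns the long table (one row per month and year) into the wide grid that the heatmap draws. The sketch below shows the same reshaping on a tiny hand-written table in the same layout. The four rows are illustrative only, and the names `long_table` and `wide_table` are hypothetical.

```python
import pandas as pd

# Long format: one row per (year, month) observation.
long_table = pd.DataFrame({
    "year": [1949, 1949, 1950, 1950],
    "month": ["Jan", "Feb", "Jan", "Feb"],
    "passengers": [112, 118, 115, 126],
})

# Wide format: months become rows, years become columns.
wide_table = long_table.pivot(index="month", columns="year", values="passengers")
print(wide_table)
```

In this toy table the months sort alphabetically because they are plain strings. Each row of the wide table becomes one row of heatmap cells and each column becomes one column of cells.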

Pair plots are used to visualize multidimensional data by plotting every pair of variables against each other. We can see pair plots in action by using the seaborn.pairplot function. Below are visualizations created using one of seaborn’s built-in data sets.

iris = sns.load_dataset("iris")
sns.pairplot(iris, hue="species", markers=["o", "s", "D"], palette="husl")
plt.show()

[Figure: pair plot of the iris data set, colored by species]
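To get a feel for how quickly a pair plot grows, here is a short sketch that counts the panels for the four numeric columns of the iris data set. It only does the arithmetic and does not draw anything. The variable names are chosen for this example.

```python
from itertools import combinations

# The four numeric columns of the iris data set.
numeric_columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# A pair plot draws an n-by-n grid: one panel per ordered pair of columns,
# with the diagonal showing each column's own distribution.
n = len(numeric_columns)
grid_panels = n * n

# Off-diagonal panels mirror each other, so the distinct pairs are fewer.
unique_pairs = list(combinations(numeric_columns, 2))

print(grid_panels, len(unique_pairs))  # 16 6
```

With many more columns the grid becomes hard to read, so it is common to pass only a subset of columns to the plot.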

sns.jointplot(x="sepal_length", y="sepal_width", data=iris, kind="kde", space=0, color="skyblue")
plt.show()

[Figure: KDE joint plot of sepal length against sepal width]

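x = np.linspace(0, 10, 50)
# Five noisy copies of a sine wave, one per row (shape 5 x 50).
data = np.sin(x) + np.random.rand(5, 50)
# melt() stacks the columns: "variable" is the column index (0-49), "value" is the sample.
df = pd.DataFrame(data).melt()
# lineplot averages the five values at each "variable" and shades a confidence band.
sns.lineplot(x="variable", y="value", data=df, color="skyblue")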
plt.show()

[Figure: line plot of the noisy sine data]
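The melt step in the line plot code may not be obvious at first glance. As a minimal sketch, the snippet below applies the same call to a small 2x3 array so the reshaping is easy to follow, then computes the per-column mean, which corresponds to the central line that seaborn draws by default. The array values and variable names are made up for illustration.

```python
import numpy as np
import pandas as pd

# Two "runs" (rows) measured at three positions (columns).
small = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

# melt() turns each column into rows: "variable" holds the column index,
# "value" holds the entries of that column, one row per entry.
long_form = pd.DataFrame(small).melt()
print(long_form)

# Averaging the values that share a "variable" gives one point per position.
means = long_form.groupby("variable")["value"].mean()
print(means)
```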

This is a quick check to see whether equations are rendered properly in Jupyter using MathJax.

\[ P(E) = \binom{n}{k} \, p^k (1-p)^{n-k} \]

\[ \mathbf{V}_1 \times \mathbf{V}_2 = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ \frac{\partial X}{\partial u} & \frac{\partial Y}{\partial u} & 0 \\ \frac{\partial X}{\partial v} & \frac{\partial Y}{\partial v} & 0 \end{vmatrix} \]
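As a quick numerical companion to the first formula, the sketch below evaluates it in Python. It assumes the usual binomial reading, which the post does not spell out: n independent trials, success probability p, and E the event of exactly k successes. The function name `binomial_pmf` is chosen for this example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Exactly 2 heads in 5 fair coin flips: 10 * 0.5**5 = 0.3125
print(binomial_pmf(2, 5, 0.5))
```

Summing the function over k = 0, ..., n should give 1, which is a handy sanity check for any chosen n and p.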
