Module 5: Design
The aims of this lab are:
- Learn about matplotlib's colormaps, including the awesome viridis.
- Learn how to adjust the design elements of a basic plot in matplotlib.
- Understand the differences between bitmap and vector graphics.
- Learn what SVG is and how to create simple shapes in SVG.
First, import the numpy and matplotlib libraries (don't forget the %matplotlib inline magic command if you are using Jupyter Notebook).
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
Colors
We discussed colors for categorical and quantitative data. The quantitative case can be further divided into sequential and diverging. "Sequential" means that the underlying values have a natural ordering, so the color only needs to change sequentially and monotonically.
In the "diverging" case, there should be a meaningful anchor point. For instance, correlation values may be positive or negative. Both large positive and large negative correlations are important, and the sign of the correlation carries meaning. Therefore, we would like to stitch two sequential colormaps together at zero: one running from zero to +1, the other from zero to -1.
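The diverging case can be sketched with a small correlation matrix. This is only an illustration under assumptions: the data is randomly generated here (it is not from the lab), and RdBu_r is one of several diverging colormaps that matplotlib ships. The key step is setting vmin=-1 and vmax=1 so that the neutral middle color lands exactly on zero.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic data (an assumption for illustration): 5 variables, 200 samples.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5))
data[:, 1] += data[:, 0]   # make column 1 positively correlated with column 0
data[:, 2] -= data[:, 0]   # make column 2 negatively correlated with column 0

# Correlation matrix between the columns; values lie in [-1, 1].
corr = np.corrcoef(data, rowvar=False)

fig, ax = plt.subplots()
# Symmetric limits keep the anchor point (zero) at the center of the colormap.
im = ax.imshow(corr, cmap="RdBu_r", vmin=-1, vmax=1)
fig.colorbar(im, ax=ax, label="correlation")
```

Without the symmetric limits, matplotlib would scale the colormap to the data's own minimum and maximum, and the neutral color would no longer mean "zero correlation".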
Categorical (qualitative) colormaps
To experiment with colormaps, let's create some data first. We will use numpy's random module to create some random data.
numpy
numpy is one of the most important packages in Python. As the name suggests (num + py), it handles all kinds of numerical manipulations and is the basis of pretty much all scientific packages. In fact, a pandas "series" is essentially a numpy array, and a dataframe is essentially a bunch of numpy arrays grouped together. If you use it wisely, it can easily give you a 10x, 100x, or even 1000x speed-up over plain Python loops!
If you use pandas or other packages, they may do all of these numpy optimizations under the hood for you. However, it is still good to know some basic numpy operations. If you want to study numpy more, check out the official tutorial and the "From Python to Numpy" book.
Plotting some trigonometric functions
Let's plot a sine and a cosine function. By the way, a common trick for plotting a function is to first create a list of x coordinate values (evenly spaced numbers over an interval). numpy has a function called linspace for that ("LINear SPACE"). By default, it creates 50 numbers that fill the interval that you pass.
np.linspace(0, 3)
array([0. , 0.06122449, 0.12244898, 0.18367347, 0.24489796, 0.30612245, 0.36734694, 0.42857143, 0.48979592, 0.55102041, 0.6122449 , 0.67346939, 0.73469388, 0.79591837, 0.85714286, 0.91836735, 0.97959184, 1.04081633, 1.10204082, 1.16326531, 1.2244898 , 1.28571429, 1.34693878, 1.40816327, 1.46938776, 1.53061224, 1.59183673, 1.65306122, 1.71428571, 1.7755102 , 1.83673469, 1.89795918, 1.95918367, 2.02040816, 2.08163265, 2.14285714, 2.20408163, 2.26530612, 2.32653061, 2.3877551 , 2.44897959, 2.51020408, 2.57142857, 2.63265306, 2.69387755, 2.75510204, 2.81632653, 2.87755102, 2.93877551, 3. ])
Let's just work with 10 numbers to make it easier to see.
a = np.linspace(0, 3, 10) # 10 numbers instead of 50
a
array([0. , 0.33333333, 0.66666667, 1. , 1.33333333, 1.66666667, 2. , 2.33333333, 2.66666667, 3. ])
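The spacing that linspace picks can be checked directly. This is a small sketch, assuming the default endpoint=True so that both ends of the interval are included. Passing retstep=True asks linspace to return the step size along with the array.

```python
import numpy as np

# With the endpoint included, 10 points span 9 equal gaps,
# so the step is (stop - start) / (num - 1).
points, step = np.linspace(0, 3, 10, retstep=True)
expected_step = (3 - 0) / (10 - 1)

print(step)             # spacing between consecutive points
print(np.diff(points))  # every gap should match the step
```

This explains why the 10-point array above advances by one third rather than by 0.3.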
A nice thing about numpy is that you can apply many mathematical operations to an array as if you were dealing with a single number.
# add 1 to each element of the array
a_plus_1 = a + 1
print(a_plus_1)
# multiply each element of the array by 3
a_times_3 = a * 3
print(a_times_3)
# raise each element of the array to the power of 2
a_squared = a ** 2
print(a_squared)
# take the square root of each element of the array
a_sqrt = np.sqrt(a)
print(a_sqrt)
[1. 1.33333333 1.66666667 2. 2.33333333 2.66666667 3. 3.33333333 3.66666667 4. ]
[0. 1. 2. 3. 4. 5. 6. 7. 8. 9.]
[0. 0.11111111 0.44444444 1. 1.77777778 2.77777778 4. 5.44444444 7.11111111 9. ]
[0. 0.57735027 0.81649658 1. 1.15470054 1.29099445 1.41421356 1.52752523 1.63299316 1.73205081]
These are called "vectorized" operations. Whenever you can, use vectorized operations instead of looping over the elements, because they are much faster and more efficient.
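To get a feel for the difference, the same computation can be timed both ways. This is a rough sketch rather than a careful benchmark: the array size and the helper names (squares_with_loop, squares_vectorized) are made up for illustration, and the measured times will vary by machine.

```python
import timeit
import numpy as np

values = np.linspace(0, 3, 100_000)

def squares_with_loop(arr):
    """Square each element one at a time in a Python loop."""
    result = np.empty_like(arr)
    for i in range(len(arr)):
        result[i] = arr[i] ** 2
    return result

def squares_vectorized(arr):
    """Square the whole array in a single vectorized operation."""
    return arr ** 2

loop_result = squares_with_loop(values)
vec_result = squares_vectorized(values)

# Time three runs of each version.
loop_time = timeit.timeit(lambda: squares_with_loop(values), number=3)
vec_time = timeit.timeit(lambda: squares_vectorized(values), number=3)

print("same result:", np.allclose(loop_result, vec_result))
print(f"loop: {loop_time:.4f} s, vectorized: {vec_time:.4f} s")
```

Both versions produce the same array; only the time spent differs.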
Q: Let's plot some sin and cos functions.
Use numpy's sin and cos functions with matplotlib's plot function to plot them.
x = np.linspace(0, 3*np.pi)
# YOUR SOLUTION HERE
[<matplotlib.lines.Line2D at 0x118dffbf0>]
matplotlib picks a pretty good color pair by default! The orange-blue pair is colorblind-safe, and it is like the color pair of every movie.
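The default pair is not magic: matplotlib takes line colors, in order, from its color cycle. Here is a minimal sketch for inspecting that cycle, assuming the style settings have not been customized (a custom style or matplotlibrc may define a different cycle).

```python
import matplotlib.pyplot as plt

# The default line colors come from the "axes.prop_cycle" setting.
default_colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]

# The first two entries are used for the first two lines drawn on an axes.
first_two = default_colors[:2]
print(first_two)
```

With the stock settings, the first entry is a blue and the second an orange, which is the pair seen in the plot above.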
matplotlib has many qualitative (categorical) color schemes: https://matplotlib.org/users/colormaps.html
You can access them in the following ways:
plt.cm.Pastel1
or
pastel1 = plt.get_cmap('Pastel1')
pastel1
You can also see the colors in the colormap in RGB (remember what each number means?).
pastel1.colors
((0.984313725490196, 0.7058823529411765, 0.6823529411764706), (0.7019607843137254, 0.803921568627451, 0.8901960784313725), (0.8, 0.9215686274509803, 0.7725490196078432), (0.8705882352941177, 0.796078431372549, 0.8941176470588236), (0.996078431372549, 0.8509803921568627, 0.6509803921568628), (1.0, 1.0, 0.8), (0.8980392156862745, 0.8470588235294118, 0.7411764705882353), (0.9921568627450981, 0.8549019607843137, 0.9254901960784314), (0.9490196078431372, 0.9490196078431372, 0.9490196078431372))
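As a reminder, each tuple holds the red, green, and blue components as fractions between 0 and 1. The sketch below converts the first Pastel1 color into the more familiar 0-255 and hex notations, using to_hex from matplotlib.colors. The variable names are only for illustration.

```python
import matplotlib.pyplot as plt
from matplotlib.colors import to_hex

pastel1 = plt.get_cmap("Pastel1")
first_color = pastel1.colors[0]  # (red, green, blue), each in [0, 1]

# Scale each channel to the 0-255 range used in many design tools.
rgb_255 = tuple(round(channel * 255) for channel in first_color)

# The same color as a hex string, as used in HTML and CSS.
hex_code = to_hex(first_color)

print(rgb_255)   # (251, 180, 174)
print(hex_code)  # #fbb4ae
```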
To get the first and second colors, you can use either way:
plt.plot(x, np.sin(x), color=plt.cm.Pastel1(0))
plt.plot(x, np.cos(x), color=pastel1(1))
[<matplotlib.lines.Line2D at 0x118ee5d30>]
Q: Pick a qualitative colormap and then draw four different curves with four different colors from the colormap.
Note that the color schemes are not necessarily colorblind-safe or varied in lightness! Think about whether the colormap you chose is a good one based on the criteria that we discussed.
# TODO: put your code here
# YOUR SOLUTION HERE
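One rough way to examine the lightness criterion is to estimate how light each color in a scheme is. The sketch below is an approximation under assumptions: it applies the Rec. 709 luminance weights directly to the stored RGB values, which is not a perceptually exact lightness measure, and the helper name approximate_lightness is made up for this illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

def approximate_lightness(cmap_name):
    """Weighted sum of the RGB channels for each color in a listed colormap."""
    colors = np.array(plt.get_cmap(cmap_name).colors)[:, :3]
    weights = np.array([0.2126, 0.7152, 0.0722])  # Rec. 709 luminance weights
    return colors @ weights

for name in ["Pastel1", "Dark2"]:
    lightness = approximate_lightness(name)
    spread = lightness.max() - lightness.min()
    print(name, np.round(lightness, 2), "spread:", round(float(spread), 2))
```

A small spread suggests the colors would be hard to tell apart when printed in grayscale.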
Quantitative colormaps
Take a look at the tutorial about image processing in matplotlib: https://matplotlib.org/stable/tutorials/introductory/images.html#sphx-glr-tutorials-introductory-images-py
We can also display an image using quantitative (sequential) colormaps. Download the image of a snake: https://github.com/yy/dviz-course/blob/master/docs/m05-design/sneakySnake.png or use another image of your liking.
Check out the imread() function, which returns a numpy array.
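import matplotlib.image as mpimg
import PIL.Image  # a bare "import PIL" does not guarantee that PIL.Image is loaded
import urllib
# Read the image file and convert it to a numpy array
img = np.array(PIL.Image.open('sneakySnake.png'))
plt.imshow(img)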
<matplotlib.image.AxesImage at 0x118f93fb0>
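A colormap only comes into play when the image has a single value per pixel, because matplotlib ignores cmap for RGB(A) arrays. The sketch below uses a synthetic 2D array as a stand-in for one channel of an image, so it runs without the snake file. The array and variable names are assumptions for illustration.

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic single-channel "image": a smooth bump, one value per pixel.
x = np.linspace(-3, 3, 200)
y = np.linspace(-3, 3, 200)
xx, yy = np.meshgrid(x, y)
brightness = np.exp(-(xx**2 + yy**2) / 2)

# Show the same data with two sequential colormaps side by side.
fig, axes = plt.subplots(1, 2, figsize=(8, 4))
gray_image = axes[0].imshow(brightness, cmap="gray")
axes[0].set_title("gray")
viridis_image = axes[1].imshow(brightness, cmap="viridis")
axes[1].set_title("viridis")
fig.colorbar(viridis_image, ax=axes[1])
```

To try this on a real picture, a single channel of the loaded image array (for example img[:, :, 0]) could be passed instead of the synthetic array.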