A visualization grammar can be useful for several reasons when creating visualizations. Here are four reasons why you might choose to use one, along with examples that illustrate each reason. For these examples, we use the altair and json Python libraries, which can be imported as follows:
import altair as alt
import json
Altair gives a consistent interface to the Vega-Lite grammar. You can install it with:
pip install altair vega vega_datasets
Here are four reasons to use a graphical grammar:
REASON #1: Concise and expressive visualization specification: A visualization grammar lets you describe a chart declaratively, by mapping data fields to visual channels, instead of issuing drawing commands. For example, with Altair, which generates Vega-Lite, you can create a scatter plot with a single line of code, like this:
alt.Chart(data).mark_point().encode(x='wt', y='mpg')
This one-liner plots weight (wt) against miles per gallon (mpg). It assumes that data is a table, for example a pandas DataFrame, with those two columns, such as the mtcars dataset.
REASON #2: Interactive visualizations: A visualization grammar such as Vega-Lite allows you to create interactive visualizations that can be easily shared and explored by others. For example, with Vega-Lite, you can create an interactive scatter plot with a tooltip that shows the name of the car when hovering over a point, like this:
alt.Chart(data).mark_point().encode(x='wt', y='mpg', tooltip=['name'])
This script creates the same scatter plot as before, but now it has an interactive feature that allows the user to see the name of the car by hovering over the point.
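To make the declarative nature of this feature concrete, here is a minimal sketch that works on the spec as a plain Python dictionary rather than through Altair. The helper with_tooltip and the variable names are hypothetical, and the "nominal" type for the tooltip field is an assumption about the name column:

```python
import copy
import json

# Scatter-plot spec in Vega-Lite form, with the data omitted for brevity.
base_spec = {
    "mark": "point",
    "encoding": {
        "x": {"field": "wt", "type": "quantitative"},
        "y": {"field": "mpg", "type": "quantitative"},
    },
}


def with_tooltip(spec, fields):
    """Return a copy of spec with a tooltip channel listing the given fields.

    Hypothetical helper: each field is assumed to be nominal (a label).
    """
    new_spec = copy.deepcopy(spec)
    new_spec["encoding"]["tooltip"] = [
        {"field": name, "type": "nominal"} for name in fields
    ]
    return new_spec


interactive_spec = with_tooltip(base_spec, ["name"])
print(json.dumps(interactive_spec["encoding"], indent=2))
```

The point of the sketch is that the tooltip is one more entry in the encoding, not extra event-handling code, and the original spec is left untouched.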
REASON #3: Built-in support for data transformations: A visualization grammar such as Vega-Lite provides built-in support for data transformations, including aggregations, filtering and sorting, which can be useful for data exploration and analysis. For example, with Vega-Lite, you can create a bar chart that shows the average miles per gallon by car manufacturer, with a filter that only shows manufacturers with more than 5 cars in the dataset, like this:
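# Assumes data has 'manufacturer' and 'mpg' columns.
alt.Chart(data).transform_joinaggregate(
    count_manufacturer='count()',  # number of cars per manufacturer
    groupby=['manufacturer']
).transform_filter(
    'datum.count_manufacturer > 5'  # keep manufacturers with more than 5 cars
).mark_bar().encode(
    x='manufacturer:N',
    y='average(mpg):Q'
)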
This script creates a bar chart that shows the average mpg by manufacturer, but only for manufacturers that have more than 5 cars in the dataset. Because the filter is part of the chart specification, the user can focus on the relevant information without preprocessing the data separately.
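To make clear what this chart computes, here is a rough plain-Python equivalent of the three steps: count the cars per manufacturer, keep the manufacturers above the threshold, then average mpg. It is only a sketch; the function name, the min_count parameter, and the toy rows are hypothetical and not part of any library:

```python
from collections import Counter, defaultdict
from statistics import mean


def average_mpg_by_manufacturer(rows, min_count=5):
    """Average mpg per manufacturer, keeping those with more than min_count rows."""
    # Step 1: count rows per manufacturer.
    counts = Counter(row["manufacturer"] for row in rows)
    # Step 2: keep only rows whose manufacturer passes the threshold.
    grouped = defaultdict(list)
    for row in rows:
        if counts[row["manufacturer"]] > min_count:
            grouped[row["manufacturer"]].append(row["mpg"])
    # Step 3: aggregate mpg per remaining manufacturer.
    return {name: mean(values) for name, values in grouped.items()}


# Hypothetical toy data: manufacturer A has 6 cars, B has only 2.
rows = (
    [{"manufacturer": "A", "mpg": 20.0 + i} for i in range(6)]
    + [{"manufacturer": "B", "mpg": 30.0} for _ in range(2)]
)

result = average_mpg_by_manufacturer(rows)
print(result)  # only manufacturer A remains
```

With a grammar, these steps are declared inside the chart specification instead of being written as a separate preprocessing function.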
REASON #4: Data can be separated from the logic that is used to render it. Furthermore, the rendering rules can be serialized into a specification that is defined on the data server and shared with the client for reuse on subsequent datasets of the same type.
# Convert the chart to a JSON string (chart is an Altair chart such as the scatter plot above).
chart_json = chart.to_json(indent=4)
chart_json will contain both the data and the spec in a single JSON object, which looks like this:
{
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "data": { ... },
    "mark": "point",
    "encoding": {
        "x": { "field": "wt", "type": "quantitative" },
        "y": { "field": "mpg", "type": "quantitative" }
    }
}
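The reuse described in this reason can be sketched with the json module alone. In the sketch below, the spec is stored once without any data, and each incoming dataset is attached to a fresh copy. The names SCATTER_TEMPLATE and render_spec are hypothetical, and the server/client split is only simulated by passing a JSON string around:

```python
import json

# Rendering rules only, as they might be stored on the data server.
# Field names follow the spec shown above; no data is included.
SCATTER_TEMPLATE = json.dumps({
    "$schema": "https://vega.github.io/schema/vega-lite/v4.json",
    "mark": "point",
    "encoding": {
        "x": {"field": "wt", "type": "quantitative"},
        "y": {"field": "mpg", "type": "quantitative"},
    },
})


def render_spec(template_json, rows):
    """Combine a data-free spec (JSON string) with a list of row dicts."""
    spec = json.loads(template_json)  # fresh copy on every call
    spec["data"] = {"values": rows}   # inline the rows as Vega-Lite data
    return spec


# Two batches of the same type of data reuse the same rendering rules.
first = render_spec(SCATTER_TEMPLATE, [{"wt": 2.62, "mpg": 21.0}])
second = render_spec(SCATTER_TEMPLATE, [{"wt": 2.32, "mpg": 22.8}])

print(json.dumps(first, indent=4))
```

Because the template is parsed anew on each call, attaching one dataset never leaks into the next.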
For comparison, the same plot can be written with a different visualization grammar, using the plotnine library (which implements the Grammar of Graphics framework), as follows:
from plotnine import ggplot, aes, geom_point
from plotnine.data import mtcars

plot = (
    ggplot(mtcars)           # defining what data to use
    + aes(x='wt', y='mpg')   # defining the variables on the x and y axes
    + geom_point()           # defining the type of plot to use
)
Unlike Altair, plotnine renders through Matplotlib and, as far as we know, does not offer a to_json() method, so this plot object cannot be serialized into a spec in the same way. For reference, the Vega-Lite spec for the equivalent Altair scatter plot, with the mtcars data inlined, would look like:
{
    "data": {
        "values": [
            {"wt": 2.62, "mpg": 21.0, "name": "Mazda RX4"},
            {"wt": 2.875, "mpg": 21.0, "name": "Mazda RX4 Wag"},
            {"wt": 2.32, "mpg": 22.8, "name": "Datsun 710"},
            ...
        ]
    },
    "mark": "point",
    "encoding": {
        "x": {"field": "wt", "type": "quantitative"},
        "y": {"field": "mpg", "type": "quantitative"}
    }
}
In summary, a visualization grammar lets you create plots with a consistent and expressive syntax, build interactive visualizations, apply data transformations in a simple and intuitive way, and serialize data and rendering logic separately.