Data visualization is essential to data science and machine learning. It aids in understanding the data and improving model performance, which makes it useful at every step of the analytical process. It can help the scientist understand patterns in the data, like trends and outliers, which can drive data transformation and model hyperparameter tuning. Data visualization can also help communicate data shortcomings and model interpretation to stakeholders.
Data visualization is not necessarily limited to graphical devices such as plots, charts, and animations. Visualization for natural language processing tasks, for instance, is just as important and may be more text-oriented. Rendering randomly sampled input text strings alongside their labels can help with understanding sample characteristics, such as the frequency of mislabeled text. Rendering the feature and target shapes is also text-based and an important first step in understanding the data. Rendering the class names and the number of observations per class can help reveal data imbalances and inform class weights. Some of these renderings could be represented in chart form, but they need not be. The essence of visualization is to help understand patterns and trends.
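The text-oriented renderings described above can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the toy dataset, the helper names (`feature_target_shapes`, `class_counts`, `class_weights`, `render_random_sample`), and the inverse-frequency weighting formula are all hypothetical choices made for this example.

```python
import random
from collections import Counter

# Hypothetical toy dataset of (text, label) pairs, invented for illustration.
samples = [
    ("The battery lasts all day", "positive"),
    ("Screen cracked within a week", "negative"),
    ("Fast shipping and great price", "positive"),
    ("Works exactly as described", "positive"),
    ("Stopped charging after a month", "negative"),
    ("Would happily buy this again", "positive"),
]


def feature_target_shapes(data):
    """Return the shapes of the feature and target collections as tuples."""
    texts = [text for text, _ in data]
    labels = [label for _, label in data]
    return (len(texts),), (len(labels),)


def class_counts(data):
    """Count the number of observations per class name."""
    return Counter(label for _, label in data)


def class_weights(counts):
    """Inverse-frequency weights: total / (n_classes * class_count).

    This formula is one common convention, assumed here for illustration.
    """
    total = sum(counts.values())
    n_classes = len(counts)
    return {name: total / (n_classes * n) for name, n in counts.items()}


def render_random_sample(data, k, seed=0):
    """Render k randomly sampled texts next to their labels, one per line."""
    rng = random.Random(seed)  # seeded so the rendering is reproducible
    picked = rng.sample(data, k)
    return "\n".join(f"[{label:>8}] {text}" for text, label in picked)


# Text-based "visualizations" of the data.
print("shapes:", feature_target_shapes(samples))
print("counts:", dict(class_counts(samples)))
print("weights:", class_weights(class_counts(samples)))
print(render_random_sample(samples, k=3))
```

Reading the sampled rows by eye is how mislabeled text tends to surface, while the class counts make an imbalance visible before any chart is drawn. The same counts could be passed to a bar chart, but the printed form is often enough for a first look.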
In the context of machine learning, data visualization presents the source data, the model, and the predicted data in a form that people can readily understand and interpret.