Visualize

datatoolkit.visualize.dash_line(x: str, y: str, data: pandas.core.frame.DataFrame, figsize: tuple = (1000, 400), title: Optional[str] = None, x_axis_label: Optional[str] = None, y_axis_label: Optional[str] = None)

Create a dashboard line plot.

Parameters
  • x (str) – Column plotted on the x axis

  • y (str) – Column plotted on the y axis

  • data (pd.DataFrame) – Data frame containing the columns above. It must also contain columns holding the max, min and mean of the y-axis column

  • figsize (tuple, optional) – Width and height of figure. Defaults to (1000, 400).

  • title (str, optional) – Figure title. Defaults to None.

  • x_axis_label (str, optional) – X-axis label. Defaults to None.

  • y_axis_label (str, optional) – Y-axis label. Defaults to None.

Returns

Plotted figure

Return type

bokeh object

References

[1] https://www.reddit.com/r/learnpython/comments/8ythxo/how_to_use_checkbox_in_bokeh_with_pandas/

Example

>>> fig = dash_line(x="x", y="y", data=data)
>>> show(fig)

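The data parameter above requires max, min and mean columns for the y-axis column. One way to prepare such a frame with pandas is sketched below. The names the function expects for these statistics columns are not documented here, so value_min, value_max and value_mean are hypothetical, as are the day and value columns.

```python
import numpy as np
import pandas as pd

# Hypothetical raw data: 10 observations of "value" for each of 5 days.
rng = np.random.default_rng(0)
raw = pd.DataFrame({
    "day": np.repeat(np.arange(5), 10),
    "value": rng.normal(size=50),
})

# Attach per-day min, max and mean of the y column to every row.
# The statistic column names are assumptions, not the library's contract.
grouped = raw.groupby("day")["value"]
data = raw.assign(
    value_min=grouped.transform("min"),
    value_max=grouped.transform("max"),
    value_mean=grouped.transform("mean"),
)
```

Using transform keeps the original x and y columns next to the statistics, so a single frame can be passed as data.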
datatoolkit.visualize.graphplot(G: networkx.classes.digraph.DiGraph, M: numpy.ndarray, min_weight_threshold: float = 0.0, bins: int = 4, graph_layout: str = 'spring_layout', figsize: tuple = (20, 10), cmap=<matplotlib.colors.LinearSegmentedColormap object>, edge_kwargs=None, node_label_kwargs=None, node_kwargs=None)

Plot a graph with weights on edges

Parameters
  • G (nx.classes.digraph.DiGraph) – Weighted graph

  • M (np.ndarray) – Weight matrix

  • min_weight_threshold (float, optional) – Minimum edge weight to be plotted. Defaults to 0.0.

  • bins (int, optional) – Number of bins into which the weights are divided. Defaults to 4.

  • graph_layout (str, optional) – Graph layout. Defaults to “spring_layout”.

  • figsize (tuple, optional) – Figure size. Defaults to (20, 10).

  • cmap (matplotlib colormap, optional) – Matplotlib colormap. Defaults to plt.cm.coolwarm.

  • edge_kwargs (dict, optional) – Kwargs passed to the edge plot. Defaults to None.

Returns

Plotted graph

Return type

ax

Example

>>> n_nodes = 4
>>> M = np.random.rand(n_nodes, n_nodes)
>>> nodes = range(M.shape[0])
>>> G = make_graph(nodes, M)
>>> graphplot(G, M)

References

[1] https://networkx.org/documentation/stable/auto_examples/drawing/plot_directed.html
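The graphplot example relies on make_graph, which is not documented in this section. As a rough stand-in, the sketch below builds a weighted directed graph from a weight matrix using networkx directly, then filters edges by a threshold in the spirit of min_weight_threshold. This is an illustration under those assumptions, not the library's implementation.

```python
import networkx as nx
import numpy as np

# Random weight matrix, as in the graphplot example.
rng = np.random.default_rng(0)
n_nodes = 4
M = rng.random((n_nodes, n_nodes))

# Each non-zero entry M[i, j] becomes an edge i -> j with attribute "weight".
G = nx.from_numpy_array(M, create_using=nx.DiGraph)

# Illustrative filter: keep only edges at or above a minimum weight.
min_weight_threshold = 0.5
kept_edges = [
    (u, v, w)
    for u, v, w in G.edges(data="weight")
    if w >= min_weight_threshold
]
```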

datatoolkit.visualize.heatmap_4d(volume: pandas.core.frame.DataFrame, probabilities: pandas.core.frame.DataFrame, xlabel: str = 'xlabel', ylabel: str = 'ylabel', figsize: tuple = (20, 30))

Plots a 4-dimensional heatmap, where the colorbar varies in the interval [0, 1] and the circle sizes are integers

Parameters
  • volume (pd.DataFrame) – Pivoted data frame containing integer values

  • probabilities (pd.DataFrame) – Pivoted data frame containing values in the interval [0, 1]

  • xlabel (str, optional) – Name of x label. Defaults to “xlabel”.

  • ylabel (str, optional) – Name of y label. Defaults to “ylabel”.

  • figsize (tuple, optional) – Figure size. Defaults to (20, 30).

Returns

matplotlib figure objects

Return type

tuple

Example

>>> nrows = 25
>>> ncols = 50
>>> volume = pd.DataFrame(np.random.randint(0, 1000, size=(nrows, ncols)), columns=[f"col_{i}" for i in range(ncols)])
>>> probabilities = pd.DataFrame(np.random.rand(nrows, ncols), columns=[f"col_{i}" for i in range(ncols)])
>>> _, _ = heatmap_4d(volume, probabilities, xlabel="Category_1", ylabel="Category_2")

References

[1] https://blogs.oii.ox.ac.uk/bright/2014/08/12/point-size-legends-in-matplotlib-and-basemap-plots/

[2] https://stackoverflow.com/questions/54545758/create-equal-aspect-square-plot-with-multiple-axes-when-data-limits-are-differ/54555334#54555334
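heatmap_4d expects two pivoted frames of the same shape: integer volumes and values in [0, 1]. One possible way to derive both from a long-format table with pandas is sketched below. The column names category_1, category_2 and converted are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical long-format records: two categories and a 0/1 outcome.
# tile/repeat guarantee that every (category_2, category_1) cell has 10 rows.
rng = np.random.default_rng(42)
records = pd.DataFrame({
    "category_1": np.tile(list("ABC"), 40),
    "category_2": np.repeat(list("WXYZ"), 30),
    "converted": rng.integers(0, 2, size=120),
})

# Integer counts per cell (a candidate for `volume`).
volume = records.pivot_table(
    index="category_2", columns="category_1",
    values="converted", aggfunc="count", fill_value=0,
)

# Mean of a 0/1 outcome per cell lies in [0, 1] (a candidate for `probabilities`).
probabilities = records.pivot_table(
    index="category_2", columns="category_1",
    values="converted", aggfunc="mean",
)
```

Both frames share the same index and columns, so each cell of the heatmap has a matching size and color value.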

datatoolkit.visualize.hist_box(feature: str, data: pandas.core.frame.DataFrame, figsize: tuple = (20, 10))

Plots a histogram and a box plot of a feature

Parameters
  • feature (str) – Feature to be plotted

  • data (pd.DataFrame) – Dataframe containing the feature

  • figsize (tuple, optional) – Size of figure. Defaults to (20, 10).

Returns

histogram and box plots

Return type

matplotlib axis objects
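No example is given for hist_box. To illustrate the kind of figure described, the sketch below draws a box plot above a histogram on a shared x axis using matplotlib directly. It is an approximation under that assumption, not the library's code, and the column name feature is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data frame with a single numeric feature.
data = pd.DataFrame({"feature": np.random.default_rng(0).normal(size=500)})

# Two stacked axes sharing x: a thin box plot on top, a histogram below.
fig, (ax_box, ax_hist) = plt.subplots(
    2, 1, sharex=True, figsize=(20, 10),
    gridspec_kw={"height_ratios": [1, 4]},
)
ax_box.boxplot(data["feature"], vert=False)
ax_hist.hist(data["feature"], bins=30)
ax_hist.set_xlabel("feature")
```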

datatoolkit.visualize.line_bar_plot(x: str, y_line: str, y_bar: str, data: pandas.core.frame.DataFrame, figsize: tuple = (20, 10), proportions: bool = True)

Plot a line and bars on a shared x axis

Parameters
  • x (str) – The shared x axis

  • y_line (str) – Values to be plotted in line

  • y_bar (str) – Values to be plotted in bars

  • data (pd.DataFrame) – Dataframe containing the above features

  • figsize (tuple, optional) – Size of the figure. Defaults to (20, 10).

  • proportions (bool, optional) – Add the proportion of each bar to the plot. Defaults to True.

Returns

line and bar plots

Return type

matplotlib axis objects
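No example is given for line_bar_plot. The sketch below shows a comparable figure built with matplotlib directly: bars on one y axis, a line on a twin y axis, and each bar labelled with its share of the total. Reading proportions as "share of the bar total" is an assumption, and the column names month, orders and conversion_rate are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data: one shared x column, one bar column, one line column.
data = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "orders": [120, 80, 150, 50],
    "conversion_rate": [0.12, 0.10, 0.15, 0.08],
})

fig, ax_bar = plt.subplots(figsize=(20, 10))
bars = ax_bar.bar(data["month"], data["orders"], color="lightgray")
ax_bar.set_ylabel("orders")

# Assumed meaning of "proportions": each bar's share of the total.
share = data["orders"] / data["orders"].sum()
ax_bar.bar_label(bars, labels=[f"{s:.0%}" for s in share])

# The line uses a second y axis that shares the same x axis.
ax_line = ax_bar.twinx()
ax_line.plot(data["month"], data["conversion_rate"], marker="o")
ax_line.set_ylabel("conversion_rate")
```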