Range Selector in Python with Matplotlib Widgets

Sometimes, when working with data structures such as NumPy arrays or pandas DataFrames, one wants to slice the data in order to perform specific operations on certain ranges of it. Both data structures provide excellent methods and functions to do that, and they only require a few lines of code. However, especially in the field of Digital Signal Processing (DSP), one often needs to try several windowing operations before knowing which selection range is optimal. Running the same lines of code several times and modifying the selected data points between tries until one finds the optimal selection is an inefficient way of performing this operation. A better way to approach this kind of operation is with a small graphical user interface that allows the user to select the points with the mouse and, at the same time, to observe the results of this selection. That way, several tries can be done in a few seconds, and the windowing process becomes much simpler.

While working on my Bachelor's thesis I ran into this problem: I needed to select specific data points of a two-dimensional electronic spectroscopy (2DES) map in order to perform an integrated Fast Fourier transform (FFT) over specific areas. At the beginning it was very time-consuming, but after figuring out how to do it graphically, I could adjust the ranges by eye and get the best out of the analysis. In this article, I present two different ways to perform range selection on NumPy arrays with Matplotlib and its widgets, as well as one of the functions that I implemented for the data analysis of my thesis.

First of all, you can access my code on my GitHub profile (here), and you should definitely check out the documentation of these functions (here).

Span Selector: for only vertical or only horizontal functions

SpanSelector is a class from Matplotlib Widgets that can perform specific operations over the graphical selection in a plot. In order to do that, you need to be working in an interactive backend. As I usually work with Jupyter Notebooks, I have to call the magic command

%matplotlib qt

before calling any of the Matplotlib functions. It is really easy to implement: you initialize and plot your data-set and, before calling the plot, you initialize the SpanSelector class in the same axis:

self.fig, (self.ax1, self.ax2) = plt.subplots(2, 1, figsize=(7, 7))
self.ax1.plot(self.X, self.Y)
self.ax2.plot(self.X, self.Y)
# Some customization
self.ax1.set_xlabel("Frequency / Hz")
self.ax1.set_ylabel("Intensity")
self.ax2.set_xlabel("Frequency / Hz")
self.ax2.set_ylabel("Intensity")
# Selector class, recall arguments of direction, interactive, button and props
# (Matplotlib 3.5 renamed span_stays to interactive and rectprops to props)
self.span = SpanSelector(self.ax1, onselect=onselect_x,
                        direction="horizontal", useblit=True, interactive=True, button=1,
                        props={"facecolor":"red", "alpha":0.3})
plt.show()

Here, I am using the syntax "self." before every variable because I have implemented this inside a class, but that does not matter now. Let's focus on how the function works: you create the graph in the normal way and, before calling "plt.show()", you create an instance of the SpanSelector class. All the arguments of this class are important, and to understand them you can check out the documentation. Here, I am going to focus on the onselect argument. This argument takes a previously defined function, which the selector calls with the minimum and maximum of the selection. For example, let's say that we have defined "direction = horizontal". Then, after doing the selection, it will call "onselect(xmin, xmax)", where xmin and xmax are the minimum and maximum values of the selection. Because of this, we need to define beforehand the function that we want to apply to the two values that we will choose with the mouse. Let's see the function that I have created:
def onselect_x(xmin, xmax):
    # Finding index from xmin xmax values
    indmin, indmax = np.searchsorted(self.X, (xmin, xmax))
    indmax = min(len(self.X) - 1, indmax)
    # Ignore selections that contain fewer than two points
    if indmax - indmin < 2:
        return
    # Creating selection of the array
    self.newx = self.X[indmin:indmax]
    self.newy = self.Y[indmin:indmax]
    # Plotting the new selection, probably a method of self.line, = self.ax is more effective
    self.ax2.cla()
    self.ax2.plot(self.newx, self.newy)
    # Some customization
    self.ax2.set_xlabel("Frequency / Hz")
    self.ax2.set_ylabel("Intensity")
    self.ax2.set_xlim(self.newx[0], self.newx[-1])
    self.ax2.set_ylim(self.newy.min()-self.newy.max()*0.05, self.newy.max()+self.newy.max()*0.05)
    self.fig.canvas.draw()
This function takes the input values of xmin and xmax and finds the indices of those points in the NumPy array that we were plotting (first two lines). Then, it creates two new attributes: self.newx and self.newy, which are the "graphical" slices of our NumPy array. Finally, as we want to perform some kind of operation with that selection, I decided to plot it in a bottom subfigure. As an example, I decided to apply this to the NMR spectrum of an organic compound. In this context, the function might be used to find the number of peaks inside our selection, or to perform non-linear fits of the spectral line shape of our peaks.
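To make the index lookup concrete, here is a minimal sketch of the same np.searchsorted step on a small, made-up axis. The helper name select_range and the sample values are assumptions for illustration only; like the callback above, it assumes that the x array is sorted in ascending order.

```python
import numpy as np

def select_range(x, y, xmin, xmax):
    """Return the parts of x and y whose x values lie in [xmin, xmax).

    Hypothetical helper: x must be sorted in ascending order,
    which np.searchsorted requires.
    """
    # First index with x >= xmin, and first index with x >= xmax
    indmin, indmax = np.searchsorted(x, (xmin, xmax))
    # Same clamp as in the callback: never go past the last index
    indmax = min(len(x) - 1, indmax)
    return x[indmin:indmax], y[indmin:indmax]

x = np.linspace(0.0, 10.0, 11)   # 0, 1, ..., 10
y = x ** 2
newx, newy = select_range(x, y, 2.5, 6.5)
print(newx)  # [3. 4. 5. 6.]
```

The mouse rarely lands exactly on a data point, so searchsorted is what turns the continuous limits of the selection into valid array indices.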


I integrated the previous concepts inside a class:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import SpanSelector

class slice_fun():
    def __init__(self, X, Y):
        self.X = X
        self.Y = Y
        def onselect_x(xmin, xmax):
            # Finding index from xmin xmax values
            indmin, indmax = np.searchsorted(self.X, (xmin, xmax))
            indmax = min(len(self.X) - 1, indmax)
            # Ignore selections that contain fewer than two points
            if indmax - indmin < 2:
                return
            # Creating selection of the array
            self.newx = self.X[indmin:indmax]
            self.newy = self.Y[indmin:indmax]
            # Plotting the new selection, probably a method of self.line, = self.ax is more effective            
            self.ax2.cla()
            self.ax2.plot(self.newx, self.newy)
            # Some customization
            self.ax2.set_xlabel("Frequency / Hz")
            self.ax2.set_ylabel("Intensity")
            self.ax2.set_xlim(self.newx[0], self.newx[-1])
            self.ax2.set_ylim(self.newy.min()-self.newy.max()*0.05, self.newy.max()+self.newy.max()*0.05)
            self.fig.canvas.draw()

        # Plotting the array without selection
        self.fig, (self.ax1, self.ax2) = plt.subplots(2, 1, figsize=(7, 7))
        self.ax1.plot(self.X, self.Y)
        self.ax2.plot(self.X, self.Y)
        # Some customization
        self.ax1.set_xlabel("Frequency / Hz")
        self.ax1.set_ylabel("Intensity")
        self.ax2.set_xlabel("Frequency / Hz")
        self.ax2.set_ylabel("Intensity")
        # Selector class, recall arguments of direction, interactive, button and props
        # (Matplotlib 3.5 renamed span_stays to interactive and rectprops to props)
        self.span = SpanSelector(self.ax1, onselect=onselect_x,
                            direction="horizontal", useblit=True, interactive=True, button=1,
                            props={"facecolor":"red", "alpha":0.3})
        plt.show()
Afterwards, I can call:
%matplotlib qt
selection = slice_fun(X, y)
And my slicing-GUI would be something like this:

After a selection, I have access to the attributes "selection.newx" and "selection.newy", which contain only the range that you can observe in the bottom subfigure. The key to the functionality, as I said, resides in the "onselect" function. There, you can define as many operations as you want, however complex, as well as plot the results or print them in the console. You can therefore select different data points as many times as you want and, each time, the onselect function will give you the results that you were looking for.
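As an illustration of putting an operation inside the callback, the sketch below counts the peaks in the selected range instead of plotting it. Everything here is an assumption made for the example: the names count_peaks and PeakCounter, the threshold, and the sine test signal. The selection is simulated by calling onselect directly, which is what SpanSelector would do after a drag.

```python
import numpy as np

def count_peaks(y, threshold):
    """Count strict local maxima of y that rise above threshold."""
    y = np.asarray(y, dtype=float)
    inner = y[1:-1]
    is_peak = (inner > y[:-2]) & (inner > y[2:]) & (inner > threshold)
    return int(np.count_nonzero(is_peak))

class PeakCounter:
    """Hypothetical callback holder: keeps the result of the latest selection."""
    def __init__(self, x, y, threshold=0.5):
        self.x, self.y, self.threshold = x, y, threshold
        self.n_peaks = None

    def onselect(self, xmin, xmax):
        # Same index lookup as before, then the operation of interest
        indmin, indmax = np.searchsorted(self.x, (xmin, xmax))
        self.n_peaks = count_peaks(self.y[indmin:indmax], self.threshold)

x = np.linspace(0.0, 4.0 * np.pi, 400)
y = np.sin(x)
counter = PeakCounter(x, y)
counter.onselect(0.0, 4.0 * np.pi)   # simulated drag over the whole axis
print(counter.n_peaks)  # 2
```

A bound method such as counter.onselect can be passed as the onselect argument of SpanSelector, so the result stays available as an attribute after every new selection.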
RectangleSelector: the perfect selector for two-dimensional data

You already know the basics, and with this selector the functionality is similar. You need an interactive backend, you need to create an instance of the RectangleSelector class (check out the documentation) before calling "plt.show()", and you need a previously defined function that will be applied to your selection. For simplicity, I integrated this in one class, and you can see the code below:
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import RectangleSelector

class slice_2d():
    def __init__(self, X, Y, Z):
        self.X = X
        self.Y = Y
        self.Z = Z
        
        def line_select_callback(eclick, erelease):
            # Finding new index
            x1, y1 = eclick.xdata, eclick.ydata
            x2, y2 = erelease.xdata, erelease.ydata
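            yindmin, yindmax = np.searchsorted(self.Y, (y1, y2))
            yindmax = min(len(self.Y) - 1, yindmax)
            xindmin, xindmax = np.searchsorted(self.X, (x1, x2))
            xindmax = min(len(self.X) - 1, xindmax)
            # contourf needs at least a 2x2 block of points
            if xindmax - xindmin < 2 or yindmax - yindmin < 2:
                return
            # Creating new arrays (Z is indexed as [row = Y, column = X])
            self.newx = self.X[xindmin:xindmax]
            self.newy = self.Y[yindmin:yindmax]
            self.newz = self.Z[yindmin:yindmax, xindmin:xindmax]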
            # Plotting new results
            self.ax2.cla()
            self.ax2.contourf(self.newx, self.newy, self.newz, cmap="gist_heat_r")
            self.ax2.set_xlabel("Excitation Frequency / $cm^{-1}$")
            self.ax2.set_ylabel("Detection Frequency / $cm^{-1}$")
            self.fig.canvas.draw()
        
        # Plotting first arrays
        self.fig, (self.ax1, self.ax2) = plt.subplots(1, 2, figsize=(14, 6))
        self.ax1.contourf(self.X, self.Y, self.Z, cmap="Greys")
        self.ax2.contourf(self.X, self.Y, self.Z, cmap="gist_heat_r")
        # Some customization
        self.ax1.set_xlabel("Excitation Frequency / $cm^{-1}$")
        self.ax1.set_ylabel("Detection Frequency / $cm^{-1}$")
        self.ax2.set_xlabel("Excitation Frequency / $cm^{-1}$")
        self.ax2.set_ylabel("Detection Frequency / $cm^{-1}$")
        self.ax1.set(xlim=(self.X.min(), self.X.max()), ylim=(self.Y.min(), self.Y.max()))
        
        # Selector class, recall arguments button, spancoords and interactive
        # (drawtype was deprecated in Matplotlib 3.5 and later removed; a box is drawn by default)
        self.span = RectangleSelector(self.ax1, line_select_callback,
                                       useblit=True,
                                       button=[1, 3], spancoords="pixels",
                                       minspanx=0.1, minspany=0.1,
                                       interactive=True)
        plt.show()
As you can see, everything is very similar. The main difference is that, this time, you have a bit more information, as your selection is made in two dimensions, and the way to access the selected points is a bit different (through the "eclick" and "erelease" events). However, the RectangleSelector class is superior to the SpanSelector one: it gives you more control over the properties of the selector and, with the argument "interactive=True", you gain access to handles that let you resize the rectangle with the mouse without having to draw it again. I therefore find this class more flexible than the previous one, but both work nicely for windowing analysis.
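The index bookkeeping of the two-dimensional case can be tried without opening a window. The sketch below is only an illustration under stated assumptions: rectangle_to_slices is a hypothetical helper, the arrays are made up, and the two events are faked with SimpleNamespace objects that carry the same xdata and ydata attributes the callback reads. Sorting the corners is a defensive step, so the helper does not depend on the order in which the two corners arrive.

```python
import numpy as np
from types import SimpleNamespace

def rectangle_to_slices(x, y, eclick, erelease):
    """Convert press/release events into slices for Z[yslice, xslice].

    Hypothetical helper: x and y must be sorted in ascending order.
    """
    # Order the corners so that the lower limit always comes first
    x1, x2 = sorted((eclick.xdata, erelease.xdata))
    y1, y2 = sorted((eclick.ydata, erelease.ydata))
    xindmin, xindmax = np.searchsorted(x, (x1, x2))
    yindmin, yindmax = np.searchsorted(y, (y1, y2))
    # Rows of Z follow y and columns follow x, as contourf expects
    return slice(yindmin, yindmax), slice(xindmin, xindmax)

x = np.arange(10.0)
y = np.arange(5.0)
z = np.add.outer(y, x)   # z[i, j] = y[i] + x[j], shape (len(y), len(x))

# Fake events standing in for the ones Matplotlib passes to the callback
press = SimpleNamespace(xdata=6.5, ydata=3.5)
release = SimpleNamespace(xdata=2.5, ydata=0.5)
ys, xs = rectangle_to_slices(x, y, press, release)
print(z[ys, xs].shape)  # (3, 4)
```

The point to remember is the index order: the first axis of Z corresponds to Y and the second one to X, which is why the callback slices Z as [y range, x range].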
In my example, I decided to draw one of the beating maps of my 2D experiments:

And calling this class gives me access to a small-GUI, as before:
%matplotlib qt
selection_2d = slice_2d(X, Y, z)

In this case, I just decided to plot the selected range in a new subfigure, employing a different colormap for the visualization. Notice how my RectangleSelector now has a few points around its border and one in the center. With the border points I can modify the size, and with the central one I can move the rectangle to other areas! Again, the possibilities for what to do with your selection are endless, and all of them can be included inside the callback function that you have to define beforehand.
Real-case application: integrated FFT of 2DES spectra

Here, I am not going to go into the details of the physics or the mathematical operations that I have applied: I just want to highlight the potential of this tool. In my experiments, I have a series of many two-dimensional maps. Every map corresponds to one time frame, and after performing a complex Fourier transform, I get a spectrum as a function of frequency. I am only interested in certain areas of these spectra, over which I have to integrate. In order to explore all the areas of interest quickly, and to tune the selected area as much as possible, I implemented a small piece of code that performs these operations inside the onselect function. The final result is something like this:


My first subfigure contains an example 2D map. On it, I can explore the different areas, selecting several sections and making my rectangle broader or narrower. My second subfigure shows the sum over the x-axis, together with two blue lines that mark the vertical section selected by my rectangle. The bottom subfigure contains my integrated FFT spectrum. This is the most important one for my experiments, as it needs to have well-defined and resolved peaks. On it, I implemented a few more features: the rectangular limits of my selection are annotated, and the most important peaks are automatically detected, with their coordinates being stored in the class. These features are beyond the scope of the article, but all of them can be implemented inside the "callback" function of the class.
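The core of such a callback can be sketched in a few lines. The following is not the code used for the thesis, only a minimal illustration under stated assumptions: the maps are stored in an array of shape (time frames, Y, X), the selected region is summed first and transformed afterwards, a real FFT (np.fft.rfft) stands in for the complex transform, and the function name integrated_fft and the synthetic data are made up for the example.

```python
import numpy as np

def integrated_fft(maps, dt, yslice, xslice):
    """Sum each time frame over the selected region, then FFT along time.

    Hypothetical sketch: maps has shape (n_times, n_y, n_x) and dt is the
    time step between frames. Returns the frequencies and the amplitude
    spectrum of the integrated signal.
    """
    trace = maps[:, yslice, xslice].sum(axis=(1, 2))
    trace = trace - trace.mean()             # remove the constant offset
    spectrum = np.abs(np.fft.rfft(trace))
    freqs = np.fft.rfftfreq(trace.size, d=dt)
    return freqs, spectrum

# Synthetic data: only one block of the map oscillates, at 5 cycles per time unit
n_t, dt = 200, 0.01
t = np.arange(n_t) * dt
maps = np.ones((n_t, 20, 30))
maps[:, 5:10, 10:20] += np.cos(2 * np.pi * 5.0 * t)[:, None, None]

# The slices would normally come from the rectangle drawn with the mouse
freqs, spectrum = integrated_fft(maps, dt, slice(5, 10), slice(10, 20))
peak_freq = freqs[np.argmax(spectrum)]
print(peak_freq)  # 5.0
```

Inside a RectangleSelector callback, the two slices would be built from the "eclick" and "erelease" events, and the resulting spectrum would be redrawn in its own subfigure after every selection.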

Finally, I just wanted to show the implementation of this feature inside a Graphical User Interface (GUI) that I have developed to work with the 2D data. The basics are the same, but it has a bit more interaction through buttons and text boxes! However, how this software works will be explained in future articles.


Altogether, this tool has saved me a significant amount of time by giving me a way to perform windowing operations graphically! I hope this article was useful for you, as I definitely think that you can find a spot for these features in your data analysis or digital signal processing applications!