Data Labeling with Jupyter ipywidgets
This article demonstrates how to use Jupyter for quick data labeling for machine learning model training and, of course, shows off how cute my dogs are :)
Labeling is one of the most frequently encountered tasks for ML Engineers and Data Scientists. With many advanced applications now capable of automating this process, I asked myself: can I write a simple script to do that without paying for those apps? As a DS, the first tool that comes to my mind is Jupyter Notebook. Yeah, I use it daily, so why not build an interactive labeling tool there? It should be easy and quick. Most importantly, it's FREE!!!
To do that, you need ipywidgets. The following example shows how to label photos of my dogs with just a couple of simple functions. The GitHub link is at the end.
First of all, these are the libraries you will use:
import ipywidgets as widgets
from IPython.display import display, Image
import os
widgets provides the interactive buttons and checkboxes. display works together with the Image class to show the photos. os, well… you should know what it is :)
ipywidgets offers many cool ways to interact with your application, such as the Play widget:
play = widgets.Play(
    value=50,
    min=0,
    max=100,
    step=1,
    interval=500,
    description="Press play",
    disabled=False
)
slider = widgets.IntSlider()
widgets.jslink((play, 'value'), (slider, 'value'))
widgets.HBox([play, slider])
You can find more of them here: https://ipywidgets.readthedocs.io/en/latest/examples/Widget%20List.html#Button
In my example, I use radio buttons: each photo gets its own group with one option per dog.
Here are all the functions:
def radio_button():
    # create one radio button group with the dog names as options
    return widgets.RadioButtons(
        options=['Luka', 'Olla'],
        description='Who is she ?',
        disabled=False)

def create_radio_button_list(length):
    # create a list of radio buttons to match the sample size
    rbs = []
    for _ in range(length):
        rbs.append(radio_button())
    return rbs

def get_radio_button_values(rbs):
    # extract the selected values from the radio button list
    return [x.value for x in rbs]
You need to create one radio button group per photo and collect them in a list, so that you can loop over them together with the photos. That's what radio_button and create_radio_button_list do. After you finish labeling, call get_radio_button_values to extract the labels. The workflow looks like this:
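file_list = os.listdir('images/')
rbs = create_radio_button_list(len(file_list))
for i, f in enumerate(file_list):
    # show each photo with its radio buttons underneath
    display(Image(filename='images/{}'.format(f), width=300, height=150))
    display(rbs[i])

# run this in a separate cell after making your selections;
# otherwise every label is still the default (first) option
label_list = get_radio_button_values(rbs)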
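Once label_list is filled in, you will probably want to keep the labels next to the file names. Below is a minimal sketch of one way to do that with only the standard library. The helper names list_image_files and save_labels, the extension list, and the labels.csv file name are my own assumptions for illustration; they are not part of the notebook linked at the end. The filter is there because os.listdir returns every entry in the folder, including hidden files such as .DS_Store, which Image may not be able to display.

```python
import csv
import os

# Assumption: these extensions cover the photos in the folder; adjust as needed.
IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.gif')


def list_image_files(folder):
    """Return the image file names in `folder`, sorted for a stable order."""
    return sorted(
        name for name in os.listdir(folder)
        if name.lower().endswith(IMAGE_EXTENSIONS)
    )


def save_labels(file_names, labels, csv_path):
    """Write one (file name, label) row per photo to a CSV file."""
    if len(file_names) != len(labels):
        raise ValueError('file_names and labels must have the same length')
    with open(csv_path, 'w', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(['file', 'label'])
        writer.writerows(zip(file_names, labels))
```

In the notebook, file_list = list_image_files('images/') could stand in for the os.listdir call, and save_labels(file_list, label_list, 'labels.csv') could run in a cell after the labeling is done.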
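Running the loop displays each photo with its radio buttons underneath. After you make your selections, label_list holds one name per photo.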
Here is the link to the notebook. Thanks for reading!
https://github.com/tsjohnnychan/jupyter_labeling/blob/master/Labeling.ipynb