After "Export data", what is next? :)

Hello All,

I’m new to the ML domain, YOLOv3, OpenCV, and Python.

I’m curious about the available methods to train YOLOv3 from Hasty JSON data.
Are you aware of a YouTube video or a quick how-to page that demonstrates how to train YOLOv3 using a Hasty JSON export?

Thanks in advance for your attention.
Before writing this, I took the time to look for an answer in this forum.

1 Like

Hey Thierry,

Since you haven’t mentioned a specific YOLO repository, I’ll try to answer the question based on a cursory reading from here: Yolo

It appears you will need one text file per image, containing the class label and bounding-box coordinates for each object.
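For context, the darknet-style YOLO annotation format is one `.txt` file per image, with one line per object: a class index followed by the box centre and size, all normalised to the range 0–1. The values below are made-up placeholders, just to show the shape of a label file:

```
0 0.156 0.260 0.156 0.104
2 0.700 0.500 0.200 0.300
```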

You will need to write a parser for the Hasty JSON that produces these text files. You can do this in Python roughly like so:

import json

hasty_json = '/path/to/hasty/json'

with open(hasty_json) as file:
    data = json.load(file)

# Sketch only -- the exact key names ('images', 'image_name', 'labels',
# 'bbox', 'class_name', 'width', 'height') depend on the Hasty export.
class_names = sorted({label['class_name']
                      for image in data['images']
                      for label in image['labels']})

for image in data['images']:
    lines = []
    for label in image['labels']:
        # Convert from x1, y1, x2, y2 to normalised x_center, y_center, w, h
        x1, y1, x2, y2 = label['bbox']
        xc = (x1 + x2) / 2 / image['width']
        yc = (y1 + y2) / 2 / image['height']
        w = (x2 - x1) / image['width']
        h = (y2 - y1) / image['height']
        lines.append(f"{class_names.index(label['class_name'])} {xc} {yc} {w} {h}")
    # One text file per image, named after the image
    with open(image['image_name'].rsplit('.', 1)[0] + '.txt', 'w') as out:
        out.write('\n'.join(lines))

You should then be able to run the training much as explained in the link above.
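If it helps, the corner-to-centre conversion can be isolated into a small, self-contained helper; the numbers in the usage line are made up purely for illustration:

```python
def to_yolo(x1, y1, x2, y2, img_w, img_h):
    """Convert corner coordinates (in pixels) to YOLO's normalised
    centre-x, centre-y, width, height."""
    xc = (x1 + x2) / 2 / img_w
    yc = (y1 + y2) / 2 / img_h
    w = (x2 - x1) / img_w
    h = (y2 - y1) / img_h
    return xc, yc, w, h

# A 100x50 px box with top-left corner at (50, 100) in a 640x480 image
print(to_yolo(50, 100, 150, 150, 640, 480))
```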

Alternatively, you can try finding YOLOv3 repositories that use the COCO format, in which case you can download the COCO JSON from Hasty and just use that directly.
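For reference, a COCO detection JSON is structured roughly like this (heavily simplified; note that COCO bounding boxes are `[x, y, width, height]` in pixels, not corner coordinates):

```json
{
  "images":      [{"id": 1, "file_name": "img_001.jpg", "width": 640, "height": 480}],
  "annotations": [{"id": 1, "image_id": 1, "category_id": 1, "bbox": [50, 100, 100, 50]}],
  "categories":  [{"id": 1, "name": "clock"}]
}
```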

Hope this helps!

Cheers,
Hasnain

2 Likes

Good day Hasnain,

Many thanks for your reply. I really appreciate the details you provide and the two routes you suggest. At present I’m in the process of annotating the pictures; once I’m done with this part, I’ll let you know the outcome.

Cheers,
TM

Hi Thierry,

I’m a community manager here at Hasty. Please share the outcome; it would be really interesting to see. Also, if you create your annotations with Hasty, make sure to try out the AI-Assistants. They can automate up to 70% of the labeling work for you. If you have any other questions, please don’t hesitate to reach out.

:v: Tobias

Hi Tobias,

I’m learning a lot using Hasty. I chose Hasty precisely because of its AI-Assistants; at the moment, “Instance segmentation” is the one available to me.
My first finding is that using instance segmentation too early in the annotation work won’t help improve the dataset toward quality object detection.

Now I use the instance segmentation AI to check my progress in producing a good-enough dataset. It also helps me improve the way I manually draw the polygons, with a focus on making ML training easier later.

The stats for my current dataset are:

  • Number of images: 232
  • Number of images marked as ‘Done’: 126 (54%)
  • Number of images marked as ‘Skipped’: 11 (5%)
  • Number of label classes: 6
  • Number of labels created: 2310

Cheers,
TM

Hi Thierry,

It sounds like you’re using Hasty exactly as it should be used! Kudos! Have you already seen our manual review feature as well?

The only thing that might get tricky is that your dataset seems a bit small (but it might work perfectly for your use case).

Hello Tobias,

If “manual review” refers to the document here:
Introduction - Hasty.ai documentation (gitbook.io)

then yes, I have explored its content. :ok_hand:

Again, I’m a newcomer, and there is a question I would like to ask. Feel free to tell me if I should open a new topic:
In terms of annotation best practices, with the aim of optimising the ML training, should I create as many label classes as there are variations of one kind of object (e.g. analogue clock, digital clock, grandfather clock…), or is it fine for the ML training to have just one label covering all variations of the object (e.g. clock)?

Thanks in advance for your attention!

Hi Thierry,

Good question! What we often see is that this is where attributes are very useful.
You want to be careful not to unfairly penalise the neural network (NN), and also to make sure there is good representation for the different cases.

For example:
If you make a different label class for every brand of sedan, the NN will get penalised for guessing BMW vs. Mercedes and waste a lot of capacity trying to resolve that.
There it makes more sense to have a “sedan” class and assign the brand as an attribute, so that a separate model can focus entirely on making that distinction.

For your example, it might make sense to have a “clock” class and then attributes such as analog, digital, etc.

More on that here: https://hasty.gitbook.io/documentation/annotation-environment/label-attributes

1 Like

Many thanks for this information! I’m eager to try this feature now :slight_smile:

Hello all,

Tobias, you asked me to share feedback once my work was done. Please find it below.

1st: Kudos to the Hasty team for such an effective service for labelling work. The power of your AI assistants reveals itself after about 250 processed images; they really accelerated the labelling.
I annotated a total of about 1,070 images and produced about 26,500 labels. Doing this without the AI assistants would have been a true nightmare :slight_smile: (This is not fake feedback!)

I also thank the Hasty community.
Kudos to Hasnain and Treebeard: you provided guidance that helped me understand aspects of ML training I wasn’t aware of. Attributes are important. They need to be thought through at the very beginning of the project, have obvious meanings, and be added to the right classes. Your messages were decisive for my work.

2nd: To perform the ML training, I explored YouTube for information.
Because I wanted the best technology to deliver the best results, I decided to move my work environment from YOLOv3 to YOLOv5 (remember, I’m a beginner). The online tutorials about ML training were very well designed, so I used Roboflow + Google Colab to do the work, and it did the trick: in a few hours, I managed to get an effective object detector based on the Ultralytics tech demonstrator (mAP of 76%).

When it was time to integrate YOLOv5 into my work environment, however, I was clueless: no tutorial, example, or YouTube video covered this part of the software development. I searched every website I could think of (GitHub, Ultralytics, etc.) for hours, with no satisfying outcome. The only documented solution needs an online link to some external web service (which is probably free). Because my work environment is offline, I had to adapt the Ultralytics offline demonstrator and rebuild my work environment around it to get what I wanted. I could have purchased some Ultralytics support hours, but I gave up on the idea. It was a pity I didn’t realise from the beginning that paid support was strongly recommended to integrate YOLOv5 the way I needed.

3rd: After the YOLOv5 trial, I did the ML training with Roboflow, Google Colab, and YOLOv4. Preparing the ML training environment was less easy than for YOLOv5, but in the end I got an mAP of 87%, and even 90% on some classes. The cherry on the cake was that I could integrate the result of the ML training into my work environment, built initially around YOLOv3, with just three changes.

Now I have an effective object detector able to answer my needs. \o/

Regarding things to improve in Hasty, I propose ONE feature that I think will help avoid repetitive moments of pain:

  • Ability to change the attributes of a group of labels from the label editor (with the picture being annotated) and from the Manual Review tool.

During my work, I had to add new attributes 5 or 6 times in order to help the AI work. The consequence was that I had to change the attributes of every label from the very beginning of the dataset. When you have 500 images to review 5 or 6 times, this is painful. A feature to change attributes from the Manual Review tool, or for a group of labels in the label/picture editor, would be REALLY fantastic.

What will happen next: I plan to enlarge my dataset (potentially 500 additional pictures) to improve the object detector’s ability to capture small objects in pictures.

3 Likes

Other feature ideas:

  • (like in Roboflow) Ability to export the data in YOLO darknet or another format.
  • (like in Roboflow) Ability to upload the dataset, in a chosen format, to a Google Colab repository, Google Drive, or a similar service.
  • (like in Roboflow) A collection of Google Colab notebooks to start an ML training in a few clicks.

Basically, avoiding the need for Roboflow would make life easier.

2 Likes

Hi Thierry,

Thanks for the detailed and kind feedback! We are very happy that we could help you. I’m excited to hear what you’ll be building in the future.

I also shared your feedback with the product team, and they’ll look at your suggestions.

Reading your feedback, one question remained open, though: In what environment are you working?

Have a great weekend,
Tobias

You’re welcome Tobias.

Windows 10, Miniconda, Python 3.7, OpenCV, YOLOv4, Google Colab, and the YOLO darknet format for the annotations.
The software runs on an offline computer and processes several GB of pictures for my own use.

Have a great weekend too!

1 Like