Animal Recognition

This is a group project developed for the Digital Signal & Image Processing exam. Its goal is to handle and analyse data from one-dimensional signals (audio) and two-dimensional signals (images). The chosen topic is animals, and the project consists of three components:

  • Audio Recognition
  • Image Recognition
  • Image Retrieval

Project Details

The project is written in Python. In particular, Jupyter Notebook was used as the environment for writing and testing the code. The datasets were downloaded from the Museum für Naturkunde Berlin through the GBIF website. The three parts were developed independently:

Audio Recognition

Audio recordings of animal calls were collected for a set of classes. After some preprocessing, each signal was mapped into a 2D representation by computing its mel-spectrogram. A ResNet101 pretrained on ImageNet was then used as the neural network for transfer learning. The accuracy achieved is about 75%.
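To illustrate how a one-dimensional audio signal becomes a 2D input for an image network, here is a minimal mel-spectrogram sketch using only NumPy. It is not the project's actual code: the function names, the frame size (`n_fft=1024`), the hop (`hop=256`), the number of mel bands (`n_mels=64`) and the synthetic 440 Hz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(f):
    # Common HTK-style mel scale formula
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(sr, n_fft, n_mels):
    """Triangular filters whose centres are evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    bank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(n_mels):
        lo, centre, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (centre - lo)
        falling = (hi - fft_freqs) / (hi - centre)
        bank[i] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def mel_spectrogram(signal, sr, n_fft=1024, hop=256, n_mels=64):
    """Return a log-power mel-spectrogram of shape (n_mels, n_frames)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    # Cut the signal into overlapping windowed frames
    frames = np.stack(
        [signal[t * hop : t * hop + n_fft] * window for t in range(n_frames)]
    )
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2  # (n_frames, n_fft//2 + 1)
    mel = mel_filterbank(sr, n_fft, n_mels) @ power.T  # (n_mels, n_frames)
    return 10.0 * np.log10(mel + 1e-10)  # decibel scale, small offset avoids log(0)

# Synthetic stand-in for an animal call: one second of a 440 Hz tone
sr = 16000
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 440.0 * t)
spec = mel_spectrogram(tone, sr)
```

The resulting 2D array can be treated like a single-channel image. Feeding it to an ImageNet-pretrained network typically requires resizing it and replicating it across three channels, a step not shown here.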

Image Recognition

For this task, images of 10 classes of animals were collected. After some preprocessing, a ResNet101 was again used for image classification. Transfer learning was applied, and an accuracy of 95% was achieved.

Image Retrieval

The objective of this task is to return the 10 images most similar to a given input image. The same animal images are used. A ResNet101 is used for feature extraction, and a KDTree structure was implemented to speed up the nearest-neighbour search. The results are quite convincing, even when cartoon images are used as queries.
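The retrieval step can be sketched as a nearest-neighbour query over feature vectors. The example below assumes SciPy's `KDTree` (the page does not say which implementation was used) and replaces the real ResNet101 features with random 64-dimensional placeholder vectors. The names `build_index` and `retrieve` are hypothetical.

```python
import numpy as np
from scipy.spatial import KDTree

def build_index(features):
    """Build a KD-tree over feature vectors, one row per gallery image."""
    return KDTree(np.asarray(features, dtype=np.float64))

def retrieve(index, query_feature, k=10):
    """Return the indices and distances of the k closest gallery images."""
    distances, indices = index.query(
        np.asarray(query_feature, dtype=np.float64), k=k
    )
    return indices, distances

# Placeholder data: in the real pipeline each row would be the feature
# vector produced by the network for one animal image.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(500, 64))
index = build_index(gallery)

# A query that is a slightly perturbed copy of gallery image 42
query = gallery[42] + 0.01 * rng.normal(size=64)
top_indices, top_distances = retrieve(index, query, k=10)
```

The tree is built once over the whole gallery, so each new query only needs a tree lookup rather than a comparison against every stored image. Results are returned in order of increasing Euclidean distance.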