Sara Al-Emadi

Drone Audio Dataset

While working on my latest research, which focuses on using deep learning algorithms to secure physical infrastructure from drone violations, I found that there were no open-source datasets of drone audio clips, mainly due to privacy restrictions. Therefore, one of my main objectives was to prepare an open-source drone audio dataset to aid the research community in related areas.

In my view, people in the Machine Learning (ML)/Deep Learning (DL) community focus heavily on the algorithmic side of a solution and not as much on the datasets. Throughout my research, I have learnt that one of the most important factors affecting the entire ML/DL experiment and the results obtained is, in fact, the dataset being used.

To ensure that you end up with the dataset you need, consider the following points when preparing it:

The environment in which the data is acquired.
The presence of background noise. This applies to both audio-based and image-based problems.
The correct format of the acquired files. Most of the time, reformatting and cleaning the data is needed to ensure consistency throughout the dataset.
Never forget to include silence in your dataset if you are trying to solve an audio-based problem.
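The format and silence points can be checked with a few lines of Python. What follows is only a minimal sketch using the standard-library wave module, not the tooling used to build the dataset: the function names write_silence and summarize_formats are hypothetical, and the 16 kHz mono 16-bit format is an assumed example rather than a documented property of the clips.

```python
import wave
from collections import Counter
from pathlib import Path


def write_silence(path, seconds=1.0, sample_rate=16000):
    """Write a mono 16-bit PCM WAV file containing only zeros (silence)."""
    n_frames = int(seconds * sample_rate)
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * n_frames)


def summarize_formats(folder):
    """Count each (channels, sample width, sample rate) combination in a folder."""
    formats = Counter()
    for path in sorted(Path(folder).rglob("*.wav")):
        with wave.open(str(path), "rb") as wav:
            key = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
            formats[key] += 1
    return formats
```

If summarize_formats returns more than one combination for a folder, the clips are not in a consistent format and probably need converting before training.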

The repo https://github.com/saraalemadi/DroneAudioDataset contains the drone audio dataset. The audio clips of drone propeller noise were recorded in an indoor environment and artificially augmented with random noise clips. The dataset is part of our conference paper ‘Audio Based Drone Detection and Identification using Deep Learning’.

The noise clips categorised as ‘Unknown’ in both the binary and multiclass folders are obtained from the open-source project ESC: Dataset for Environmental Sound Classification by Karol J. Piczak (https://github.com/karoldvl/ESC-50), and the white noise is taken from Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition by Pete Warden (https://arxiv.org/pdf/1804.03209.pdf and https://www.tensorflow.org/tutorials/sequences/audio_recognition). In addition, I have created silence audio clips to balance the dataset.
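For readers who want to try a similar augmentation on their own recordings, here is one common way to overlay a noise clip on a drone clip. This is an illustrative sketch only and not necessarily the procedure used for this dataset: it assumes simple additive mixing in the time domain at a chosen signal-to-noise ratio, and the function name mix_at_snr and its parameters are hypothetical.

```python
import numpy as np


def mix_at_snr(signal, noise, snr_db, rng=None):
    """Add a random segment of `noise` to `signal` at the requested SNR (in dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat the noise if it is shorter than the signal, then pick a random offset.
    if len(noise) < len(signal):
        noise = np.tile(noise, int(np.ceil(len(signal) / len(noise))))
    start = rng.integers(0, len(noise) - len(signal) + 1)
    segment = noise[start:start + len(signal)]

    # Scale the noise so that signal_power / noise_power == 10 ** (snr_db / 10).
    signal_power = np.mean(signal ** 2)
    noise_power = np.mean(segment ** 2)
    if signal_power == 0 or noise_power == 0:
        return signal.copy()  # nothing to scale against; return the clip unchanged
    scale = np.sqrt(signal_power / (noise_power * 10 ** (snr_db / 10)))
    return signal + scale * segment
```

Both inputs are expected to be one-dimensional sample arrays at the same sample rate. A lower snr_db gives a noisier and therefore harder training example.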

I hope the tips above were helpful and the introduction to our dataset was interesting. Please leave your questions or suggestions in the comment section below; I would be happy to help!

saraalemadi

—

January 12, 2019

Dataset, Deep Learning, Drone, Drones

2 responses to “Drone Audio Dataset”

st.

April 14, 2022

hello.
I want to know how to get your data using smartphone.
ex) distance of drone and microphone!
and how to mix your drone sound and random noise.
what kind of random noise and mix in time domain? or frequency domain?
I want to use your datasets! plz let me know this things!
Thanks!

ammar abdulrasool

October 22, 2022

Thanks for your contribution.
I have a couple of questions regarding your paper on RF-signal-based drone detection.
I’d like to discuss those questions with you.
Thanks again
Ammar,
