Development of Long-Range, Low-Powered and Smart IoT Device for Detecting Illegal Logging in Forests

Abstract: Forests promote the conservation of biodiversity and play a crucial role in safeguarding the environment against erosion, landslides, and climate change. However, illegal logging remains a significant threat worldwide, necessitating the development of automatic logging detection systems in forests. This paper proposes the use of long-range, low-powered, and smart Internet of Things (IoT) nodes to enhance forest monitoring capabilities. The research framework involves developing IoT devices for forest sound classification and transmitting each node's status to a gateway at the forest base station, which further sends the obtained data through cellular connectivity to a cloud server. The key issues addressed in this work include sensor and board selection, Machine Learning (ML) model development for audio classification, TinyML implementation on a microcontroller, choice of communication protocol, gateway selection, and power consumption optimization. Unlike existing solutions, the developed node prototype uses an array of two microphone sensors for redundancy, and an ensemble network consisting of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN) models for improved classification accuracy. The ensemble model outperforms the LSTM and CNN models used independently and achieved 88% accuracy after quantization. Notably, this solution demonstrates cost efficiency and high potential for scalability.


I. Introduction
Forests are essential for both the environment and human society. They are home to various plants and animals, contributing to biodiversity conservation and the protection of the ecosystem against erosion, landslides, global warming, and climate change [1]. However, in many parts of the world, trees are illegally logged in high volumes each year, which contributes significantly to global CO2 emissions [2], [3]. To save our planet, advanced monitoring systems need to be developed and appropriate policies need to be made to prevent unauthorized logging.
Various measures have been taken in the past to prevent illegal logging. These measures include the establishment of forest patrols and the deployment of forest guards to monitor and deter unauthorized activities. In many countries, governments have implemented stricter regulations and logging bans, while also increasing enforcement efforts to prosecute offenders [1], [4]. These measures have not worked well due to the lack of human resources [4]; hence, there is a need to leverage advanced technologies, such as satellite imaging, Geographic Information Systems (GIS), drones, IoT, Wireless Sensor Networks (WSN), and others. Of the available technologies, sensor nodes with acoustic and a few other sensors tend to be more reliable and are used in many works [1] because acoustic sensors, for example, can detect sound classes in a way similar to the human auditory system and brain.
In this paper, the development of long-range, low-powered, and smart Internet of Things (IoT) nodes is presented from a design perspective. Each node consists of low-cost sensors, a microcontroller, a scaled ML model, and a wireless communication system. Several sensors can be explored for forest monitoring, such as acoustic, vibration, and motion sensors, among many others. However, this work is limited to the use of an array of acoustic sensors due to their effectiveness compared to vibration sensors. The next important component of an IoT sensor node is ML model development, which involves training an ML model to predict the condition of the forest (i.e., logging or no logging event). For this study, low-powered microcontrollers are utilized, which typically have memory constraints, including static and flash memory limitations.
These constraints pose challenges to deploying trained ML models. As such, Tiny Machine Learning (TinyML) is a suitable solution for scaling down the computational size of ML models. Wireless communication protocols such as LoRa and ZigBee are used in many indoor and outdoor applications, and they are common in applications where traditional wired networks are impractical or costly, such as environmental monitoring. Both LoRa and ZigBee are energy efficient, but LoRa is well-suited for this research project due to its long-distance functionality [5].
The overall framework of this research is illustrated in Fig. 1.

II. Related Work
There are a few related works on the detection of logging activities using audio-acoustic signals; however, the designs vary slightly from one another in terms of the operational flowchart, hardware components, communication protocol, and inference system used.
A WSN with an architecture consisting of an Arduino Nano board, sound and vibration sensors, a ZigBee communication device, and a power supply was presented in [6]. The authors claimed that the analog values from both the vibration and sound sensors were used to detect illegal tree cutting when a set limit is exceeded by each respective sensor. This method of detection is not reliable. Besides, each vibration sensor is placed around the base of a specific tree where a slave node is located, while the sound sensor is used only at the master node. This means the detection of logging events through vibration signals is limited since the slave nodes cannot be installed on all the trees in the forest. In [7], a more practical framework was presented, which also integrates vibration and sound sensors with the Arduino Nano board, and a GSM module was used to provide information directly to the patrol officers in the forest. The drawback of this work is that advanced detection models, such as an Artificial Neural Network (ANN), were not used as an inferencing system for the sensors. A different sensor node architecture was presented in [5], with hardware components comprising two boards (Arduino Uno and Raspberry Pi 3), a LoRa GPS module, an accelerometer, microphone and vibration sensors, and a LoRa gateway. The downside of this work is that the authors used a predefined threshold to detect logging events from sensor values. Nevertheless, they provided good insight into how data can be transferred through the LoRa gateway to the LoRa network server and then to an application server.
In [4], a framework was presented for the automatic detection of logging in forests using audio data. Monitoring stations, each with one or many microphone sensors, were proposed to capture sound events, and the audio data were afterward transmitted wirelessly using either Wi-Fi or ZigBee protocols. However, in scenarios involving dense vegetation or significant distances between stations, a 3G, 4G, or 5G network was recommended for the audio transmission. At the base station/server, the signal is pre-processed and 18 features are extracted, comprising Mel Frequency Cepstral Coefficients (MFCCs), the harmonics-to-noise ratio obtained via the autocorrelation function, voicing probability, and the dominant frequency. These features are then used to predict the class of the sound. For training the classifier, the Machine Learning (ML) models examined include Support Vector Machine (SVM), Multilayer Perceptron (MLP), decision tree, K-nearest neighbours, and Bayes network.
The training dataset consists of 5 minutes of 16 different chainsaw sounds and other forest sounds from existing data repositories that were not named. Besides, the data was downsampled to 8 kHz and blended randomly with background acoustic noise recordings at different signal-to-noise ratios (SNRs) in order to have a dataset matching the real conditions in forests. The models were all trained, and the results showed that SVM gave the best performance across all assessed SNR levels, ranging from 6 dB to 20 dB. The framework appears feasible; however, a prototype was not developed in this work.
A different framework with a prototype design was presented in [1], where the classification was done at the node through a

III. Proposed IoT Node Architecture
The IoT node proposed in this paper combines the effective audio classification capabilities of ANN with the long-range and low-power data transmission of LoRa technology. The node is composed of acoustic and GPS sensors, a LoRa module, and a power system. Each node monitors the forest environmental sound to differentiate illegal tree-cutting sounds from the sounds of other activities such as rain, birds, wind, and fire. After the sound has been acquired through the omnidirectional microphone sensors, its MFCCs are computed, and the resulting features are passed through an ensemble network which is a combination of Long Short-Term Memory (LSTM) and 1D CNN. The ensemble network predicts the class of the sound based on the labels considered in the training dataset.
The nodes send the decision and GPS data (longitude and latitude of the node) to the gateway in the form of LoRa packets. Subsequently, if logging is detected, the LoRa gateway forwards the data received from the end nodes through GSM (cellular connectivity) to a cloud server. Thereafter, a text message is sent from the cloud to the designated forest inspector to act. A flowchart describing the point-to-point communication of the node is illustrated in Fig. 2.
In this section, different aspects of the node are discussed, including the gateway and cloud solution, and the other design factors considered for component selection. The schematic of the node, consisting of a Raspberry Pi Pico, two microphones, a LoRa module, a GPS module, an OLED display, a battery, a charging controller, and a solar panel, is shown in Fig. 3.
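The packet contents described above (a decision plus the node's latitude and longitude) can be illustrated with a compact binary payload. The following is a minimal sketch only: the 9-byte layout, the label codes, the fixed-point scale, and the example coordinates are all assumptions made for illustration and are not taken from the prototype.

```python
import struct

# Hypothetical payload layout (an assumption, not the prototype's format):
#   uint8  label      0 = no_logging, 1 = logging
#   int32  latitude   in units of 1e-5 degrees
#   int32  longitude  in units of 1e-5 degrees
PAYLOAD_FORMAT = "<Bii"  # little-endian, no padding: 1 + 4 + 4 = 9 bytes
SCALE = 100_000          # 1e-5 degree resolution (about 1 m of latitude)


def encode_payload(label, lat_deg, lon_deg):
    """Pack the decision and GPS position into a short byte string."""
    return struct.pack(
        PAYLOAD_FORMAT, label, round(lat_deg * SCALE), round(lon_deg * SCALE)
    )


def decode_payload(payload):
    """Inverse of encode_payload, as a gateway-side script might use it."""
    label, lat, lon = struct.unpack(PAYLOAD_FORMAT, payload)
    return label, lat / SCALE, lon / SCALE


# Example with made-up coordinates.
packet = encode_payload(1, -1.36250, 36.65000)
label, lat, lon = decode_payload(packet)
```

Keeping the payload to a few bytes shortens the time on air of each LoRa packet, which is in line with the low-power goal of the node.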

A. Sensors
Based on the review of different designs in Section 2, the sensors selected for this research project are a GPS module and two acoustic sensors that are used separately at regular intervals to capture sounds. Two acoustic sensors are used to ensure the node still functions if one of the sensors stops working or is faulty. The design considerations in choosing an acoustic sensor for this study include high SNR and sensitivity, the ability to capture sound from different directions (omnidirectional), and low power consumption. Some existing low-powered digital Micro-Electro-Mechanical Systems (MEMS) microphones on the market are the INMP441 microphone module and the MSM261S4030H0 microphone module. These microphones work well, but they require the I2S (Inter-IC Sound) protocol to transmit data to integrated circuits (i.e., microcontrollers). Based on our research, there is also an analogue acoustic sensor, the electret microphone, which can equally be used for this research project. To test the three microphones, logging audio was played from YouTube, and a 5-second part of the audio was recorded with each microphone. The time-domain performance of the three microphone sensors is plotted in Fig. 4, which shows that the electret analogue microphone sensor produced a waveform that closely matches the original signal. It is worth noting that there was a slight mismatch in the played audio timeframe, and that the sampling frequency of the original audio was 48 kHz while 16 kHz was used for the recording.
In the process of acquiring raw sound data using microphones, it is important to filter the sound signal to improve the signal quality and reduce noise. A simple high-pass filter was used in this work, and it is mathematically expressed in Equation 1, where y(n) is the current output of the filter at sample n, x(n) is the current sample, and x(n − 1) is the previous sample.
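Since the filter is described only in terms of the current output, the current sample, and the previous sample, one form consistent with that description is a first-order difference filter. The sketch below assumes y(n) = x(n) − alpha · x(n − 1); the coefficient alpha and its default value are assumptions for illustration, and the exact expression used in the prototype may differ.

```python
def high_pass(samples, alpha=1.0):
    """First-order difference filter: y(n) = x(n) - alpha * x(n - 1).

    alpha is a hypothetical coefficient; alpha = 1.0 gives a plain
    first difference, which removes any constant (DC) offset.
    """
    output = []
    previous = 0.0  # x(-1) is assumed to be zero
    for current in samples:
        output.append(current - alpha * previous)
        previous = current
    return output


# A constant offset is removed after the first sample, while a signal
# that alternates from sample to sample passes through (and is amplified).
dc = high_pass([5.0, 5.0, 5.0, 5.0])
alternating = high_pass([1.0, -1.0, 1.0, -1.0])
```

A filter of this kind needs only one stored sample and one subtraction per input, which suits a memory-constrained microcontroller.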
The GPS module selected for this project is the NEO-7M GPS, which has an improved update rate and sensitivity in challenging environments like forests compared to the NEO-6M GPS used in the previous works reported in Section 2.
For this study, an ML model is first developed in Google Colab using an existing Environmental Sound Classification (ESC) dataset called the FSC22 dataset [10]. The dataset comprises audio recordings of 34 different sound subclasses, ranging from natural sounds such as rain, thunderstorm, and wind to mechanical sounds such as generators, axes, and chainsaws [11]. Each class contains a variety of recordings captured in different environments and conditions, making it suitable for training and evaluating machine learning models for sound classification tasks in diverse settings. However, additional logging data of different chainsaws can be added from other ESC datasets like the SONYC-UST-V2 dataset [12] and the ESC-50 dataset [13] to ensure there is no data imbalance. A detailed survey of the available public datasets for ESC is given in [14]. Ideally, data recorded directly from the forest environment where the nodes will be deployed should also be added. It is a common practice to perform binary classification, where the ML model predicts whether there is logging or not based on the available sound; however, multiclass classification provides greater insights.
After the FSC22 dataset was extracted, it was downsampled to 16 kHz and all the audio data were converted from stereo to mono by computing the mean of the two channels. MFCC features were then extracted from each 1 second of the mono input data through a process involving FFT calculation, Mel-scale computation, and Discrete Cosine Transform (details can be found in Chapters 5 and 6 of [8]).
To ensure an optimal data length is used for the classification, the original length of each audio clip (5 seconds) is also used to extract another set of MFCC features. The two feature sets are used separately in training an ensemble network consisting of 1D CNN and LSTM networks, which are concatenated to obtain a more robust network and better accuracy. This proposed model differs from the CNN and LSTM models investigated independently in [15] using the UrbanSound8k dataset and different features including MFCC, Mel spectrogram, chroma Short-Time Fourier Transform (STFT), spectral contrast, and tonnetz. In addition, the size of the proposed model is around a few kilobytes, which is smaller than pretrained CNN-based models such as AlexNet, ResNet-50, DenseNet-121, Inception-v3, MobileNet-v3-small, and EfficientNet-v2-B0 investigated in [16], which gave an average accuracy of 85 % with the FSC22 dataset without data augmentation.
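The stereo-to-mono conversion and the 1-second framing described above can be sketched as follows. This is an illustrative outline only: the function names are hypothetical, the input is a synthetic 5-second clip, and the MFCC computation itself (FFT, Mel filter bank, DCT) is left out.

```python
SAMPLE_RATE = 16_000  # Hz, the rate used after downsampling


def to_mono(left, right):
    """Average the two channels sample by sample."""
    return [(l + r) / 2 for l, r in zip(left, right)]


def split_segments(samples, seconds=1.0, sample_rate=SAMPLE_RATE):
    """Cut a signal into consecutive, non-overlapping windows.

    A trailing remainder shorter than one window is dropped.
    """
    size = int(seconds * sample_rate)
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


# Synthetic 5-second stereo clip standing in for one dataset recording.
clip_left = [1.0] * (5 * SAMPLE_RATE)
clip_right = [3.0] * (5 * SAMPLE_RATE)

mono = to_mono(clip_left, clip_right)
segments = split_segments(mono)                    # Case I: 1 s inputs
whole_clip = split_segments(mono, seconds=5.0)     # Case II: 5 s inputs
```

Each 1-second window then yields one feature set for Case I, while the full 5-second clip yields one feature set for Case II.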
The ensemble network proposed in this work is shown in Fig. 5.

D. LoRa Communication and Gateway
LoRa communication technology is preferred in illegal logging detection research due to its low power consumption and long-range transmission compared to alternatives like ZigBee and GSM modules. Illegal logging monitoring systems often operate in remote areas without access to reliable power sources. LoRa's low-power characteristics enable nodes to transmit data over long distances while conserving battery life. For this research project, the SX1276 LoRa module operating in the EU 868 MHz band was selected, which can transfer data over a range of up to 5 km.
At the base station in the forest, a gateway device receives ML model decisions from nodes through LoRa connections and transmits those insights to a cloud server for real-time monitoring and intervention against illegal logging activities. The factors considered in selecting a gateway are the position of the device (indoor or outdoor), number of channels (8, 16, or 32), network server (The Things Network (TTN), ChirpStack), data rate and bandwidth, and power efficiency. The 8-channel RAK7439 LoRaWAN gateway was selected because it supports cellular connectivity with the cloud server, as the GSM (2G) network is the most reliable internet network in the forest environment.

E. Power
The power requirements of a Raspberry Pi Pico can vary depending on factors such as the peripherals connected, the code running on it, and whether it is in an active or low-power state. The operating voltage of the microcontroller is 3.3 V, while the operating current without any peripherals attached is 50-100 mA. In our prototype for this research project, a 3.7 V lithium battery with a capacity of 2000 mAh was used, which will last for roughly 10 hours since the operating current is not more than 200 mA after adding the low-power microphone sensors (3-5 mA), the GPS (50 mA), and the LoRa module (15 mA). The formula for calculating the battery life is shown in Equation 2. For the field implementation of the sensor node, a 6 V solar panel is recommended with a TP4056 charging controller to ensure the node is constantly powered.
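The 10-hour figure quoted above follows from dividing the battery capacity by the load current. The sketch below assumes this simple ratio and ignores regulator losses and battery derating; the per-component breakdown uses the upper ends of the quoted ranges and assumes the microphone figure applies to each of the two microphones.

```python
def battery_life_hours(capacity_mah, load_current_ma):
    """Idealized battery life: capacity divided by a constant load current."""
    return capacity_mah / load_current_ma


# Figures quoted in the text: 2000 mAh battery, at most 200 mA load.
worst_case_hours = battery_life_hours(2000, 200)

# Upper-bound currents per component in mA (the grouping is an assumption).
component_currents_ma = {
    "raspberry_pi_pico": 100,
    "microphones": 2 * 5,
    "gps": 50,
    "lora": 15,
}
total_ma = sum(component_currents_ma.values())      # 175 mA
estimated_hours = battery_life_hours(2000, total_ma)  # about 11.4 h
```

The summed component currents stay below the 200 mA bound, so the 10-hour estimate is slightly conservative under these assumptions.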

F. Other Design Factors
The other design factors that were considered and are critical to the successful implementation of this project are as follows:
1. Node speed: Currently, the node's data acquisition and processing takes from 40 seconds to 1 minute, which covers inferring the captured sound event from each microphone and waiting to transmit the detected event label, including GPS coordinates, via LoRa. Based on our setup, three LoRa packets are sent at intervals of roughly 1 minute. Regardless of this transmission frequency, the energy consumed is relatively low, as highlighted earlier in part E of Section 3.
The Fresnel zone is an ellipsoidal region around the line of sight between the end device and the gateway. The nodes could be placed at a height around the tree trunk or on posts (although the latter may involve additional costs). Besides, the nodes are exposed to environmental risks such as rain, fire, and theft, and this necessitates robust protection measures. The use of a multiclass ML inference system can also enable the early detection and diagnosis of such risks.

3. Cost of node production: The approximate cost per node stands at £50, with the potential for reduction to around £30 through economies of scale. Managing production costs is crucial for scalability and widespread deployment of monitoring infrastructure.
The actual prototype of the node with a casing in unassembled and assembled forms is shown in Fig. 6.
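To give a sense of scale for the Fresnel zone mentioned among the design factors, the radius of the first Fresnel zone can be estimated from the link distance and the carrier frequency using the standard expression r = sqrt(λ · d1 · d2 / (d1 + d2)). The sketch below is illustrative: the function name is hypothetical, the 868 MHz carrier matches the selected LoRa module, and the 1 km and 200 m distances are the link lengths reported in the field tests.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def fresnel_radius_m(distance_m, frequency_hz, position_m=None):
    """Radius of the first Fresnel zone at position_m from one end.

    If position_m is omitted, the midpoint is used, where the zone is widest.
    """
    if position_m is None:
        position_m = distance_m / 2
    wavelength = SPEED_OF_LIGHT / frequency_hz
    d1, d2 = position_m, distance_m - position_m
    return math.sqrt(wavelength * d1 * d2 / distance_m)


radius_1km = fresnel_radius_m(1000, 868e6)   # roughly 9.3 m at the midpoint
radius_200m = fresnel_radius_m(200, 868e6)   # roughly 4.2 m at the midpoint
```

Even for a 200 m link, the widest part of the zone spans several metres, which is why mounting height and vegetation density matter for node placement.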

IV. Results and Discussion
The ML and node field test results are presented in this section.

A. ML Training and Quantization
The ensemble network model was trained using the TensorFlow deep learning framework with the following hyperparameters: 100 epochs, a batch size of 50, a learning rate of 0.001, and sparse categorical cross-entropy as the training loss. The validation accuracies of the two cases of features examined and three distinct models are listed in Tables 2 and 3. For each of the models, the accuracy of the first case was about 5 % better than that of the second case. Also, comparing the performance of LSTM and CNN with the ensemble model shows that the latter gave about 1 % improved accuracy. This improvement appears small, but the ensemble model has other advantages such as stability, which is important for models that need to be deployed in real-world applications. In addition, the confusion matrices for the two cases in Fig. 7 and Fig. 8 show that the model is biased towards the no_logging label due to the use of an incomplete dataset. This explains why there are very high true negatives and fewer true positives. The classification accuracies of the ensemble model after quantizing its parameters to 8-bit for deployment on the Raspberry Pi Pico microcontroller are 88 % and 84.69 % for the two cases, respectively.
Moreover, the classification accuracy reported is for binary classification (i.e., logging, and no logging detection).The model can be enhanced by using balanced datasets that include field data and by employing a grid search approach to optimize the parameters for both feature extraction and the model itself.
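The idea behind quantizing model parameters to 8-bit can be illustrated with a generic affine mapping from floating-point values to the int8 range. This is a conceptual sketch only, not the conversion procedure used for the deployed model; the function names and the example weights are made up for illustration.

```python
def quantize_int8(values):
    """Affine quantization of floats to integers in [-128, 127].

    Returns the quantized values together with the scale and zero point
    needed to map them back to floats.
    """
    lo = min(min(values), 0.0)  # extend the range so 0.0 is representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255 if hi > lo else 1.0
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point):
    """Approximate reconstruction of the original floats."""
    return [(q - zero_point) * scale for q in quantized]


# Made-up example weights.
weights = [-0.62, -0.1, 0.0, 0.27, 0.9]
q_weights, scale, zero_point = quantize_int8(weights)
restored = dequantize(q_weights, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each parameter is stored in one byte instead of four, at the cost of a rounding error bounded by the step size `scale`, which is one reason a small accuracy change can appear after quantization.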

B. Node Test
The developed node has been tested at several locations around Ngong Hill and the neighbouring forest called Oloolua in Kenya.
For the hill test, where we had a line of sight, point-to-point communication of more than 1 km was achieved. The forest environment, on the other hand, presented some challenges we did not initially anticipate. One primary issue was the density of the forest, even around the edges, which affected the range of our LoRa module, limiting point-to-point communication to about 200 m. Enhancing the LoRa module performance requires employing superior antennas. Additionally, there was not enough sunlight for the solar panel to charge the battery quickly due to obstruction by the trees. To conserve the battery, the node microcontroller can be programmed to operate only during the day and enter sleep mode at night (8-10 hours) when there is usually a low chance of logging.

V. Conclusion
As part of our future work, we will be investigating the following:
1) A cheaper and better LoRa module that supports a 3 or 12 dBi antenna for more reliable and longer-range data transfer by the nodes.
2) An improved design of the casing such that the GPS antenna has a clear view of the sky.
3) Selection of a battery with a lower charging current for compatibility with the minimum supply current of the solar panel, and the use of a power management module with Maximum Power Point Tracking (MPPT) support.
4) Acquisition of different field sound data and retraining the existing ensemble ML model with balanced datasets (i.e., equal logging and no-logging labelled datasets) from the field and existing ESC datasets.
CNN) model, and the decision was sent through LoRa. During the model training process, a subset of the ESC-50 dataset closely related to logging detection was selected and downsampled from 44.1 kHz to 16 kHz to reduce the size of the audio signals needed during online implementation. The various audio classes from the selection were pre-processed into the linear spectrogram, Mel spectrogram, and MFCC and quantized into 32-bit and 8-bit formats for efficient classification. The classification performance was compared in terms of inference time, peak Random Access Memory (RAM) utilization, Read-Only Memory (ROM) consumption, and accuracy with reference to the employed Nano BLE microcontroller, which comes with an inbuilt omnidirectional microphone and a 64 MHz ARM Cortex-M4F processor. The results show that MFCC-8 and MFCC-32 gave the best accuracy of approximately 85 % and the lowest memory and time usage.

2. Node installation and protection: It is imperative to position the nodes optimally because of the Fresnel zone.

Fig. 7. Ensemble model confusion matrix (Case I)

The remainder of the paper is structured as follows: the related works are presented in Section 2, while Section 3 covers the proposed IoT node architecture and a detailed discussion of the prototype. Section 4 focuses on the results of the ML model and node tests.
In the framework of Fig. 1, each IoT node acquires sound from the forest, classifies the sound, and transmits the detected event and the node GPS location over a wireless network to the gateway. The gateway, which is located at the forest base station, then sends the sound class and the node's location data to a cloud server, from where an alert is sent to the forest inspector if logging is detected. The major issues considered in the IoT node development include but are not limited to the selection

Table 1. Key features of T-Beam, Nano 33 BLE, and Raspberry Pi Pico.

Table 2. The validation accuracy of different models (Case I: features from 1 s audio inputs)

Table 3. The validation accuracy of different models (Case II: features from 5 s audio inputs)

In conclusion, this paper has presented the development of IoT sensor nodes coupled with ML to improve forest monitoring capabilities. The proposed research framework entails the development of a LoRa-based IoT device tailored for forest sound classification and real-time data transmission for prompt action by forest officers in case of a logging event. Key design components and considerations are discussed, such as node sensors and microcontroller selection, ML model training and testing, and TinyML implementation on an ARM Cortex microcontroller (Raspberry Pi Pico).