# Use Case 1: Heat Exposure! 🥵

This notebook guides you through analyzing temperature and humidity data from the Octopus.

## Initial Setup

The first thing you need to do is check that the data from your Octopus deployment looks fine and is ready to be analyzed.


In [None]:
#Block 1
#this block is going to allow the notebook to connect to your google drive so you can interact with it
from google.colab import drive
drive.mount('/content/drive')

In [None]:
#Block 2
#this block of code is where we get the infrastructure for the notebook set up, by calling libraries
import csv
import numpy as np

#these libraries will help us read in and format the data correctly
import pytz
import time
import pandas as pd
from datetime import datetime
import os

#these libraries will help us with our time series analysis
from matplotlib import pyplot as plt
from matplotlib import ticker as mticker
import matplotlib.dates as mdates
from statsmodels.tsa.seasonal import seasonal_decompose

In [None]:
#Block 3
#adjust the path below to match your google drive setup! please use the format below:
os.chdir('/content/drive/My Drive/path to correct drive/')

In [None]:
# Block 4
# Read the raw data from a CSV file into a DataFrame
# adjust name to correct file
filename = "octopus_data_test.txt"
df = pd.read_csv(filename, names=["timestamp", "temperature", "humidity"])

# Display the DataFrame
print(df)
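Beyond printing the DataFrame, a few quick checks can confirm that the file parsed as expected. The sketch below runs them on a small made-up DataFrame with the same three columns. `summarize_readings` is a hypothetical helper, not part of the notebook, and the sample values are invented for illustration.

```python
import numpy as np
import pandas as pd

def summarize_readings(df):
    """Return basic quality indicators for a readings DataFrame (hypothetical helper)."""
    timestamps = pd.to_datetime(df["timestamp"])
    return {
        "rows": len(df),
        "missing": df[["temperature", "humidity"]].isna().sum().to_dict(),
        "duplicate_timestamps": int(timestamps.duplicated().sum()),
        "chronological": bool(timestamps.is_monotonic_increasing),
    }

# Invented sample data, for illustration only
sample = pd.DataFrame({
    "timestamp": ["2024-01-01 00:00:00", "2024-01-01 00:00:01",
                  "2024-01-01 00:00:01", "2024-01-01 00:00:03"],
    "temperature": [21.5, np.nan, 21.7, 21.6],
    "humidity": [40.0, 41.0, np.nan, np.nan],
})

report = summarize_readings(sample)
print(report)
```

On your own data, call `summarize_readings(df)` right after Block 4, while `timestamp` is still a regular column.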


# Data preprocessing steps

Depending on your use case and the questions you want to answer with your data, different preprocessing steps are needed to get the best possible results. In this case, we will handle missing values, detect and remove outliers, resample, normalize, smooth, and perform a seasonal decomposition.

In [None]:
# Block 5
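# Handling missing values: forward fill copies the last valid reading into each gap
# (assigning the result back avoids the deprecated fillna(method=..., inplace=True) pattern)
df["temperature"] = df["temperature"].ffill()
df["humidity"] = df["humidity"].ffill()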
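To see what forward filling does and where it falls short, here is a minimal sketch on an invented series. The `limit=1` variant is an assumption about what might suit long sensor dropouts, not something the notebook prescribes.

```python
import numpy as np
import pandas as pd

# Invented readings with gaps, for illustration only
temps = pd.Series([np.nan, 21.0, np.nan, np.nan, 22.5])

# Each gap takes the last valid reading before it
filled = temps.ffill()
print(filled.tolist())

# A gap at the very start has nothing to copy from, so it stays NaN
print(filled.isna().sum())

# Optional: carry a value forward by at most one row, leaving longer gaps visible
capped = temps.ffill(limit=1)
print(capped.tolist())
```

Any values that are still missing after this step are dropped by the outlier filter in the next block, because a comparison with NaN evaluates to False.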

In [None]:
# Block 6
# Outlier detection and removal (using Z-score)
z_scores_temp = (df["temperature"] - df["temperature"].mean()) / df["temperature"].std()
z_scores_humidity = (df["humidity"] - df["humidity"].mean()) / df["humidity"].std()
# Keep rows within ±3 standard deviations; .copy() avoids a SettingWithCopyWarning in later blocks
df = df[(z_scores_temp.abs() < 3) & (z_scores_humidity.abs() < 3)].copy()
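The Z-score filter can be sketched on invented data as follows: 49 plausible readings plus one obvious sensor glitch. The values are made up for illustration.

```python
import pandas as pd

# Invented data: 49 readings near 25 °C plus one glitch at 80 °C
temps = pd.Series([25.0 + 0.1 * (i % 5) for i in range(49)] + [80.0])

# Z-score: distance from the mean, measured in standard deviations
z = (temps - temps.mean()) / temps.std()
cleaned = temps[z.abs() < 3]

print(len(temps), len(cleaned))
print(cleaned.max())
```

Two caveats apply. The mean and standard deviation are themselves pulled toward the outliers they are meant to detect, and with the sample standard deviation used here, |z| can never exceed (n - 1) / sqrt(n), so in a dataset of 10 or fewer rows no point can reach the threshold of 3.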

In [None]:
# Block 7
# Resampling
# This part depends on the frequency of your collected data and on what you want to analyze.
# In this example, the data is collected every second, but we want to analyze the mean values every 15 minutes.
# '15min' specifies the new frequency (the older alias '15T' is deprecated in recent pandas versions).
# You can change it to '1h' if you want every hour ('1H' on older pandas versions).
# .mean() specifies the aggregation method: within each 15-minute interval, the mean of the values is calculated.

df["timestamp"] = pd.to_datetime(df["timestamp"])
df.set_index("timestamp", inplace=True)
df = df.resample('15min').mean()
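A minimal sketch of the same resampling on one hour of invented per-second readings shows what comes out: one row per 15-minute interval, labeled with the start of the interval.

```python
import numpy as np
import pandas as pd

# One hour of invented per-second readings rising from 20 °C to 21 °C
index = pd.date_range("2024-01-01 00:00:00", periods=3600, freq="s")
df_demo = pd.DataFrame({"temperature": np.linspace(20.0, 21.0, 3600)}, index=index)

# "15min" is the current spelling of the 15-minute alias ("15T" is deprecated in recent pandas)
resampled = df_demo.resample("15min").mean()
print(resampled)
```

Intervals that contain no readings appear as NaN rows after `.mean()`, so it is worth checking `df.isna().sum()` again after resampling.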

In [None]:
# Block 8
# Normalization
df_normalized = (df - df.mean()) / df.std()
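This is Z-score normalization: each column is shifted to mean 0 and scaled to standard deviation 1, which puts temperature and humidity on a comparable scale. A small sketch with invented values:

```python
import pandas as pd

# Invented values, for illustration only
df_demo = pd.DataFrame({
    "temperature": [20.0, 22.0, 24.0, 26.0],
    "humidity": [40.0, 50.0, 60.0, 70.0],
})

df_norm = (df_demo - df_demo.mean()) / df_demo.std()
print(df_norm.mean().round(6))  # approximately 0 for every column
print(df_norm.std().round(6))   # approximately 1 for every column

# The original units can be recovered from the stored mean and standard deviation
restored = df_norm * df_demo.std() + df_demo.mean()
```

The normalized values are unitless, so keep `df` around for anything that has to be reported in °C or %.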

In [None]:
# Block 9
# Smoothing (using 5-point moving average)
df_smoothed = df.rolling(window=5).mean()
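With `window=5`, the moving average is trailing: each value is the mean of the current row and the four rows before it, which on 15-minute data spans 75 minutes. The sketch below, on an invented series, shows the NaN values this leaves at the start and two alternatives. `min_periods` and `center` are real `rolling` parameters, but whether they suit your analysis is a judgment call.

```python
import pandas as pd

values = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# Trailing window: the first four rows lack five values and become NaN
trailing = values.rolling(window=5).mean()
print(trailing.tolist())

# Allow shorter windows at the start instead of NaN
partial = values.rolling(window=5, min_periods=1).mean()
print(partial.tolist())

# Centered window: no time lag, but NaN at both ends
centered = values.rolling(window=5, center=True).mean()
print(centered.tolist())
```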

In [None]:
# Block 10
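# Seasonal decomposition
# period is counted in samples: after resampling to 15 minutes, one day is 24 * 4 = 96 samples
# (seasonal_decompose does not handle NaN, so empty 15-minute intervals must be filled first)
decomposition = seasonal_decompose(df["temperature"], period=96)  # Assuming daily seasonality (24 h)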
trend = decomposition.trend
seasonal = decomposition.seasonal
residual = decomposition.resid
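The `period` argument of `seasonal_decompose` is a number of samples, not a number of hours, so it has to match the resampling frequency chosen earlier. A minimal sketch of deriving it from the index spacing follows. `samples_per_cycle` is a hypothetical helper name, and it assumes a regularly spaced `DatetimeIndex`.

```python
import pandas as pd

def samples_per_cycle(index, cycle="1D"):
    """Samples in one seasonal cycle, assuming constant spacing (hypothetical helper)."""
    step = index[1] - index[0]
    return int(pd.Timedelta(cycle) / step)

quarter_hourly = pd.date_range("2024-01-01", periods=8, freq="15min")
hourly = pd.date_range("2024-01-01", periods=8, freq="60min")

print(samples_per_cycle(quarter_hourly))  # 96
print(samples_per_cycle(hourly))          # 24
```

`seasonal_decompose` also needs at least two complete cycles of data, so a daily decomposition requires a deployment of two days or more.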

Now you can use **df, df_normalized, df_smoothed, trend, seasonal, and residual for analysis!**



# Time Series
Now we will plot the processed data to see how it looks! We will start with two graphs: one for temperature and one for humidity.

In [None]:
# Block 11
# Plot Temperature

plt.figure(figsize=(12, 6))
plt.plot(df.index, df['temperature'], label='Temperature', color='red', marker='o')
plt.xlabel('Time')
plt.ylabel('°C')
plt.title('Temperature Over Time')
plt.legend()
plt.grid(True)
plt.show()

In [None]:
# Block 12
# Plot Humidity

plt.figure(figsize=(12, 6))
plt.plot(df.index, df['humidity'], label='Humidity', color='blue', marker='o')
plt.xlabel('Time')
plt.ylabel('%')
plt.title('Humidity Over Time')
plt.legend()
plt.grid(True)
plt.show()
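As an optional variation, both series can share one time axis, which makes it easier to see how humidity moves as temperature rises. The sketch below uses two days of invented 15-minute data, and the date format string is just one possible choice.

```python
import numpy as np
import pandas as pd
import matplotlib.dates as mdates
from matplotlib import pyplot as plt

# Two days of invented 15-minute readings with a daily cycle
index = pd.date_range("2024-01-01", periods=192, freq="15min")
hours = np.arange(192) / 4.0
df_demo = pd.DataFrame({
    "temperature": 25 + 5 * np.sin(2 * np.pi * hours / 24),
    "humidity": 60 - 10 * np.sin(2 * np.pi * hours / 24),
}, index=index)

# Two stacked panels that share the x-axis
fig, (ax_temp, ax_hum) = plt.subplots(2, 1, figsize=(12, 6), sharex=True)
ax_temp.plot(df_demo.index, df_demo["temperature"], color="red")
ax_temp.set_ylabel("°C")
ax_temp.grid(True)
ax_hum.plot(df_demo.index, df_demo["humidity"], color="blue")
ax_hum.set_ylabel("%")
ax_hum.set_xlabel("Time")
ax_hum.grid(True)

# Readable tick labels for timestamps
ax_hum.xaxis.set_major_formatter(mdates.DateFormatter("%d %b %H:%M"))
fig.autofmt_xdate()
# In a notebook the figure renders when the cell finishes; call plt.show() elsewhere
```

To use it on your own data, replace `df_demo` with `df` (or `df_smoothed` for a less noisy curve).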