
Weather Insights and Predictive Modeling Pipeline

About this project

Weather Insights and Predictive Modeling Pipeline Report for Selected Nigerian States

This report analyzes weather data collected for ten Nigerian locations: Lagos, Abuja, Kano, Ogun, Enugu, Rivers, Kaduna, Oyo, Edo, and Delta (nine states plus Abuja, the federal capital; referred to below as cities). The data was obtained from the OpenWeatherMap API and covers temperature, humidity, wind speed, pressure, visibility, cloud cover, dew point, and a text weather description. The report includes exploratory data analysis (EDA), visualization of weather patterns, geospatial insights, and predictive modeling using linear regression.

Data Collection and Overview

Weather data was retrieved from the OpenWeatherMap API, using an API key to request current weather conditions for each selected city. The cities were chosen to represent diverse geographical and climatic regions within Nigeria. Latitude and longitude coordinates were recorded alongside the meteorological measurements to support spatial and temporal analyses. Because each request returns current conditions only, the dataset appears to hold one observation per city, which limits what can be said about trends over time.
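
The collection step described above might look like the following minimal sketch. It assumes the current-weather endpoint of the OpenWeatherMap API with metric units; the function names, the output column names other than City, Temperature (°C), and Description, and the ",NG" country suffix on the query are illustrative assumptions rather than the project's actual code. The sketch maps only a subset of the parameters listed in the report.

```python
import json
import urllib.parse
import urllib.request

# Current-weather endpoint; the API key is passed as the `appid` parameter.
API_URL = "https://api.openweathermap.org/data/2.5/weather"


def parse_current_weather(payload: dict) -> dict:
    """Flatten one current-weather JSON response into a single record."""
    return {
        "City": payload["name"],
        "Latitude": payload["coord"]["lat"],
        "Longitude": payload["coord"]["lon"],
        "Temperature (°C)": payload["main"]["temp"],
        "Humidity (%)": payload["main"]["humidity"],
        "Pressure (hPa)": payload["main"]["pressure"],
        "Wind Speed (m/s)": payload["wind"]["speed"],
        "Cloud Cover (%)": payload["clouds"]["all"],
        "Visibility (m)": payload.get("visibility"),  # not always present
        "Description": payload["weather"][0]["description"],
    }


def fetch_current_weather(city: str, api_key: str) -> dict:
    """Request current conditions for one city in metric units."""
    # Assumption: appending the country code narrows ambiguous place names.
    query = urllib.parse.urlencode(
        {"q": f"{city},NG", "appid": api_key, "units": "metric"}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=10) as response:
        return parse_current_weather(json.load(response))
```

Calling fetch_current_weather once per city and collecting the returned records in a list gives the rows for a pandas DataFrame. Keeping the parsing separate from the network call makes it possible to check the field mapping against a saved response without an API key.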

Exploratory Data Analysis (EDA)

Summary Statistics Interpretation:

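  • Temperature: The mean temperature across all cities is 30.87°C, with wide variation between cities. Kaduna recorded the highest temperature at 39.17°C, whereas Rivers recorded the lowest at -11.47°C. A sub-zero reading is implausible for a coastal state in the tropics, so this value most likely reflects a data anomaly (for example, the query resolving to a different location named Rivers) rather than a real weather event, and it should be verified before it is used in the statistics or the model.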
  • Humidity: Average humidity stands at 38.9%, ranging from a minimum of 5% in Kaduna to a maximum of 89% in Rivers. Higher humidity levels are typically associated with coastal cities like Rivers, whereas northern cities like Kaduna experience drier conditions.
  • Wind Speed: The mean wind speed is 2.675 m/s, with notable variations ranging from 0.82 m/s in Ogun to 4.72 m/s in Rivers. Cities closer to coastal areas tend to experience higher wind speeds, influencing local weather patterns.
  • Cloud Cover: On average, cities exhibit 58.4% cloud cover, with extremes observed in Lagos and Delta at 100%. Cloud cover impacts local temperatures by trapping heat during the night and influencing precipitation patterns.
  • Pressure: Mean atmospheric pressure measures 1011 hPa, varying between 1008 hPa and 1026 hPa across cities. Understanding pressure trends is crucial for predicting weather changes and atmospheric stability.
  • Dew Point: The mean dew point is 18.65°C, indicating the temperature at which air becomes saturated and condensation forms. Higher dew points correlate with higher humidity levels, affecting perceived comfort and weather conditions.
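
A simple range check can surface readings such as the Rivers temperature before they distort the summary statistics. The sketch below is one possible approach, not the project's method: the helper name flag_implausible and the 5°C to 50°C bounds are assumptions chosen for illustration, and the two rows are the Kaduna and Rivers readings quoted above.

```python
import pandas as pd


def flag_implausible(df: pd.DataFrame, column: str, low: float, high: float) -> pd.DataFrame:
    """Return the rows whose value in `column` falls outside [low, high].

    Missing values are flagged as well, because `between` is False for NaN.
    """
    outside = ~df[column].between(low, high)
    return df.loc[outside]


# The two readings quoted in the summary; a real run would pass the full DataFrame.
readings = pd.DataFrame(
    {"City": ["Kaduna", "Rivers"], "Temperature (°C)": [39.17, -11.47]}
)

# Assumed plausibility bounds for near-surface air temperature in Nigeria.
suspect = flag_implausible(readings, "Temperature (°C)", low=5.0, high=50.0)
print(suspect)
```

Flagged rows are better re-queried or inspected than silently dropped, since with only ten cities each row carries a lot of weight.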

Correlation Analysis Insight:

  • Temperature vs. Humidity: A strong negative correlation of -0.79 suggests that higher temperatures are associated with lower humidity levels across the cities analyzed. This relationship is crucial for understanding local climate dynamics and comfort levels.
  • Geographical Insights: Latitude influences temperature, but not simply through distance from the equator: in this dataset the hottest reading comes from Kaduna in the north, while the southern coastal cities, which lie closer to the equator, are more humid. Coastal cities tend to have milder temperatures due to oceanic influences, while inland cities may exhibit more extreme temperature fluctuations.
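
A correlation such as the one reported above can be computed with pandas. In this sketch the column name Humidity (%) is an assumption, and the four rows are illustrative placeholder values constructed to be perfectly inversely related; they are not the report's data.

```python
import pandas as pd


def pairwise_correlation(df: pd.DataFrame, x: str, y: str) -> float:
    """Pearson correlation coefficient between two numeric columns."""
    return df[x].corr(df[y])


# Placeholder values only (humidity = 180 - 4 * temperature), not real readings.
toy = pd.DataFrame(
    {
        "Temperature (°C)": [25.0, 30.0, 35.0, 40.0],
        "Humidity (%)": [80.0, 60.0, 40.0, 20.0],
    }
)

r = pairwise_correlation(toy, "Temperature (°C)", "Humidity (%)")
print(f"Pearson r = {r:.2f}")

# Full matrix across every numeric column, useful as input to a heatmap.
matrix = toy.select_dtypes("number").corr()
```

With only ten observations a coefficient like -0.79 carries a wide uncertainty, so it is best read as suggestive rather than conclusive.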

Feature Engineering:

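import pandas as pd

# Create temporal features from a single collection timestamp.
# Every row receives the same values, because all readings share one snapshot.
now = pd.Timestamp.now()
df['Month'] = now.month
df['DayOfMonth'] = now.day
df['HourOfDay'] = now.hour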
  • Temporal Features: Incorporating timestamps to analyze diurnal and seasonal weather patterns.
  • Lagged Features: Introducing lagged variables (e.g., previous day’s temperature) to capture temporal dependencies.
  • Interaction Features: Creating combined variables (e.g., temperature-humidity index) to reveal non-linear relationships.
  • Geographical Features: Utilizing city coordinates and elevation data to account for local climate influences.
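
The lagged and interaction features listed above could be built roughly as follows. This is a sketch under assumptions: it presumes repeated observations per city with a Timestamp column, which a single snapshot does not provide, and the names Timestamp, Humidity (%), Temp_Lag1, and Temp_x_Humidity are hypothetical. The interaction term is a plain product, not a standardized temperature-humidity index, and the sample values are placeholders.

```python
import pandas as pd


def add_lag_and_interaction(df: pd.DataFrame) -> pd.DataFrame:
    """Add a one-step temperature lag per city and a temperature-humidity product."""
    out = df.sort_values(["City", "Timestamp"]).copy()
    # Previous observation for the same city; the first row per city becomes NaN.
    out["Temp_Lag1"] = out.groupby("City")["Temperature (°C)"].shift(1)
    # Simple multiplicative interaction between temperature and humidity.
    out["Temp_x_Humidity"] = out["Temperature (°C)"] * out["Humidity (%)"]
    return out


# Placeholder observations: two cities, two consecutive days each.
frame = pd.DataFrame(
    {
        "City": ["Lagos", "Lagos", "Kano", "Kano"],
        "Timestamp": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-01", "2024-01-02"]
        ),
        "Temperature (°C)": [28.0, 29.0, 33.0, 34.0],
        "Humidity (%)": [80.0, 78.0, 20.0, 22.0],
    }
)

result = add_lag_and_interaction(frame)
print(result)
```

Rows with a missing lag need to be dropped or imputed before they are passed to a model that does not accept NaN values.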

Visualizations and Interpretations:

  • Histograms and Box Plots: Visual representations illustrate temperature distributions and variability across cities, highlighting outliers and common trends. For example, cities in the northern region generally experience hotter temperatures compared to southern coastal towns.

  • Scatter Plots: Relationships between temperature, humidity, and other meteorological parameters were visualized to identify potential correlations and anomalies. These plots assist in understanding how different variables interact to influence local weather conditions.
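
A histogram and box plot of the kind described above can be drawn with Matplotlib. The sketch below is illustrative only: the function name is hypothetical, the bin count is arbitrary, and the example frame holds just the two temperature readings quoted in the summary statistics.

```python
import matplotlib

matplotlib.use("Agg")  # render off-screen so the sketch runs without a display

import matplotlib.pyplot as plt
import pandas as pd


def plot_temperature_overview(df: pd.DataFrame, column: str = "Temperature (°C)"):
    """Draw a histogram and a box plot of one numeric column side by side."""
    values = df[column].dropna()
    fig, (ax_hist, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

    ax_hist.hist(values, bins=10)
    ax_hist.set_xlabel(column)
    ax_hist.set_ylabel("Number of cities")
    ax_hist.set_title("Distribution")

    # Points beyond the whiskers are drawn individually as outliers.
    ax_box.boxplot(values)
    ax_box.set_xticks([])
    ax_box.set_ylabel(column)
    ax_box.set_title("Spread and outliers")

    fig.tight_layout()
    return fig


# Two readings from the summary; a real run would pass the full DataFrame.
example = pd.DataFrame(
    {"City": ["Kaduna", "Rivers"], "Temperature (°C)": [39.17, -11.47]}
)
figure = plot_temperature_overview(example)
figure.savefig("temperature_overview.png")
```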

Geospatial Insights

Mapping and Spatial Analysis:

  • City Visualization: Leveraging tools such as Folium and Plotly, the geographical locations of each city were mapped to visualize their spatial distribution across Nigeria. This aids in understanding regional climate patterns and variations.
  • Temperature Distribution: Spatial analysis highlighted temperature gradients influenced by latitude, altitude, and proximity to water bodies. Coastal cities generally exhibit more stable temperatures due to maritime influences, whereas inland cities experience greater temperature variability.

Predictive Modeling

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler

# Select features: drop the non-numeric columns (City, Description) and the
# target itself, so the model cannot read the answer from its own inputs
X = df.drop(['City', 'Description', 'Temperature (°C)'], axis=1)
y = df['Temperature (°C)']

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Initialize and train the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test)

# Evaluate the model using RMSE (square root of the mean squared error)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print("Root Mean Squared Error (RMSE):", rmse)

# Compare predictions with actual values
comparison = pd.DataFrame({'Actual': y_test, 'Predicted': y_pred})
print(comparison)

# Scale the features (fit the scaler on the training data only)
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Initialize and train the linear regression model on the scaled features
model = LinearRegression()
model.fit(X_train_scaled, y_train)

# Make predictions on the test set
y_pred = model.predict(X_test_scaled)

# Evaluate the model using cross-validation, MAE, and R2
cv_rmse = cross_val_score(model, X_train_scaled, y_train, cv=5, scoring='neg_root_mean_squared_error')
mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Print the results
print("Cross-Validation RMSE:", -cv_rmse.mean())  # Convert back to positive
print("MAE:", mae)
print("R-squared (R2):", r2)

Linear Regression Model:

  • Feature Selection and Model Training: Numerical features such as humidity, wind speed, pressure, and cloud cover were selected to train a linear regression model that predicts temperature from these variables.
  • Model Performance: The model was evaluated with R-squared, Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE). The reported R-squared of 1.0 should be read with caution: a perfect score on a dataset of ten cities usually signals target leakage (for example, the temperature column remaining among the input features) or overfitting rather than genuine predictive skill, so it does not by itself validate the model.
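
With so few rows, one way to get a more honest error estimate is leave-one-out cross-validation over a pipeline that keeps scaling inside each fold and refuses to accept the target as a feature. The sketch below is an alternative evaluation under stated assumptions, not the procedure used in the report: the function name, the feature column names, and the synthetic data (generated from a made-up linear relation plus noise) are all hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import LeaveOneOut, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

TARGET = "Temperature (°C)"


def evaluate_without_leakage(df: pd.DataFrame, feature_columns: list) -> float:
    """Leave-one-out MAE of a scaled linear regression on the given features."""
    if TARGET in feature_columns:
        raise ValueError("The target column must not be used as a feature.")
    X = df[feature_columns]
    y = df[TARGET]
    # The scaler is refit inside every fold, so held-out rows never influence it.
    pipeline = make_pipeline(StandardScaler(), LinearRegression())
    predictions = cross_val_predict(pipeline, X, y, cv=LeaveOneOut())
    return mean_absolute_error(y, predictions)


# Synthetic stand-in for the ten-city dataset (placeholder values, not real data).
rng = np.random.default_rng(0)
humidity = rng.uniform(5, 90, size=10)
synthetic = pd.DataFrame(
    {
        "Humidity (%)": humidity,
        "Wind Speed (m/s)": rng.uniform(0.8, 4.8, size=10),
        TARGET: 40.0 - 0.15 * humidity + rng.normal(0.0, 1.0, size=10),
    }
)

mae = evaluate_without_leakage(synthetic, ["Humidity (%)", "Wind Speed (m/s)"])
print("Leave-one-out MAE:", mae)
```

MAE is used here instead of R-squared because R-squared is undefined on a single held-out row; the pooled leave-one-out predictions can still be compared with the actual values afterwards.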

This comprehensive analysis of weather data in selected Nigerian cities provides valuable insights into local climate patterns, variability, and influencing factors. Findings from the report are instrumental for stakeholders and researchers in sectors like agriculture, urban planning, and disaster preparedness. The analysis underscores the importance of considering geographical and meteorological factors when assessing weather conditions and their impacts on local communities.

Recommendations

  1. Expanded Data Collection: Extend data collection efforts to include additional cities and longer periods to capture seasonal and annual climate variations effectively.
  2. Enhanced Modeling Techniques: Explore advanced machine learning algorithms and ensemble methods to further enhance predictive accuracy and robustness.
  3. Impact Assessment Studies: Conduct studies to evaluate the socio-economic implications of weather variability on sectors such as agriculture productivity, tourism, and energy consumption in Nigeria.

References

  1. OpenWeatherMap API Documentation
  2. Pandas, Matplotlib, Seaborn, Folium, Plotly, and Scikit-learn Python libraries were utilized for data analysis, visualization, and modeling.