Uncovering Global Deforestation and Soy Bean Consumption
Author
Affiliation
The Plotting Pandas - Megan, Shakir, Maria, Eshaan, Bharath
School of Information, University of Arizona
Abstract
This project delves into two crucial aspects of global environmental dynamics using the Global Deforestation dataset, a comprehensive resource published by Hannah Ritchie and Max Roser in the Our World in Data journal in 2021. The dataset encompasses a wide array of attributes related to global forest cover, deforestation rates, and associated factors. Two distinct questions guide our exploration.
With a focus on identifying the patterns in forest area conversion, the first question seeks to comprehend how the world’s forest cover has changed over the previous three decades. With the help of a choropleth map, we meticulously prepare, clean, and visualize the data in order to clearly depict net forest conversion across the globe. We use the strengths of ggplot and gganimate to build an interesting, dynamic map that sheds light on the dynamics of forests around the world.
The second question investigates the trajectory of soybean consumption in Brazil and its potential impact on afforestation and deforestation rates. Through data manipulation, we calculate total soybean consumption, revealing the crop’s evolution over time. Employing ggplot, we construct time series plots to showcase soybean consumption trends and assess their correlation of soybean consumption with afforestation and deforestation rates. The period from 1990 to 2013 becomes our focal point.
Introduction
The global environment is undergoing profound changes, driven by factors ranging from climate shifts to land use transformations. Within this complex web of interconnected challenges, the fate of the world’s forests and the dynamics of soybean consumption stand as two pivotal and interrelated aspects of environmental change. These subjects are the focus of our investigation, guided by the rich and detailed “Global Deforestation” dataset, a comprehensive resource provided by Hannah Ritchie and Max Roser in the “Our World in Data” journal in 2021.
The dataset offers an extensive repository of data, comprising variables such as net forest conversion, year, entity (providing country and continent information), and soybean consumption statistics. This dataset proves invaluable for analyzing the complex interplay between land use changes, soybean consumption, and broader conservation efforts. By leveraging advanced data analysis and visualization techniques, this project aims to provide critical insights into the ever-evolving dynamics of global forests and the influence of soybean consumption, contributing to a better understanding of essential environmental conservation and sustainable land management practices.
Question 1: What does the global forest area look like over past decades, highlighting the trends of forest area conversion?
Introduction
In the face of global climate change, It is important to understand the dynamics of forest cover conversion. The first question delves into the net forest conversion data from 1990 to 2015, examining the changes decade-wise for various countries across the world. Utilizing data from tidytuesday which was sourced from Our World in Data, this study explores the dynamics of forest cover alteration over four decades. By visualizing and analyzing the net forest conversion rates from 1990 to 2015, this study aims to shed light on the evolving state of the world’s forests and highlight countries with notable trends in forest expansion and deforestation.
For this visualization we focused on the forest dataset provided by Our World in Data. We utilized the net_forest_converstion variable as it gave us a significant information on how forest conversion has been over the years. In the main plot, our attention was directed towards the specific periods of 1990, 2000 and 2010. However, in the animated compilation, we incorporated visualizations from all decades.
Approach
To address this question, we implemented a choropleth map visualization highlighting the forest conversion for specific decades and spotlighting few countries with extensive forest conversion. We began to scout the data for getting relevant information and noteworthy details. The whole approach can be categorized into Data Preparation and Pre-processing, Visualizing the data and Animating the plots.
Data Preparation and Pre-processing
To generate a map plot we needed geographical information which is not present in the dataset and was achieved by utilizing the maps package in R. From this world data we retrieved all the unique countries for further processing and also filtered out Antarctica from the data.
A custom function, processForest(), is developed to handle the pre-processing of the forest conversion dataset. This function is used filter countries, ensuring all entities present in the map data are included. It also used to categorize countries based on their net forest conversion rates, grouping them into distinct categories. The processed forest conversion data is split into subsets for each decade (1990, 2000, 2010, and 2015). The split() function in R is employed to divide the data based on the year column.
Glimpse of processForest Function
# function to pre process the forest dataset# input : dataset - tibble# unique_countries - tibble# output : filtered_data - tibbleprocessForest <-function (dataset, unique_countries) { filtered_data <- dataset |># filtering only entity, year and net_forest_conversion columnsselect(entity, year, net_forest_conversion) |># getting all the countires which are not present in forest dataset for a specific years# bind_rows() is used combine combine rows of two data framesbind_rows(# anti_join() is used to return only the rows from the first dataset that isn't having matching rows in the second dataset based on specified key columnsanti_join(unique_countries, dataset, by =c("region"="entity")) |># adding year and net_forest_conversion for that specific year as NAmutate(year = dataset[1, "year"], net_forest_conversion =NA) ) |># renaming USA and UK so that both these countries are matching in world dataset and forest datasetmutate(entity =case_when( entity =="United States"~"USA", entity =="United Kingdom"~"UK",TRUE~ entity ) ) |># creating a categorical variable forest_converstion to group countries based on their forest conversionmutate(entity =coalesce(entity, region),forest_converstion =case_when( net_forest_conversion <-400000~"<-400k", net_forest_conversion <-200000~"-400k to -200k", net_forest_conversion <-100000~"-200k to -100k", net_forest_conversion <0~"-100k to 0", net_forest_conversion <100000~"0 to 100k", net_forest_conversion <200000~"100k to 200k", net_forest_conversion <400000~"200k to 400k",is.na(net_forest_conversion) ~NA_character_,TRUE~">400k" ) ) |># ordering forest_converstion column using factors based on the created categoriesmutate(forest_converstion =as_factor(forest_converstion) |>fct_relevel("<-400k","-400k to -200k","-200k to -100k","-100k to 0","0 to 100k","100k to 200k","200k to 400k",">400k" ) )return(filtered_data)}
A custom function, filterCountries(), is created to identify and extract specific countries of interest. These countries are singled out due to their significant forest conversion.
Visualization the data and Animating the plots
The ggplot2 package in R is utilized to create detailed visualizations for each decade. For each decade, a map is generated where countries are color-coded according to their forest conversion categories. The geom_map() function is employed to plot the world map, and additional layers are added to highlight the noteworthy countries, ensuring they stand out in the visual representation.We’ve encapsulated the common plotting logic for all decades within a function called generateForestConversionPlot(). This function streamlines the process of generating plots for each specific decade.
We also developed a function called generatePlotforAnimation(), designed specifically to adjust text sizes in the original plot to enhance clarity during animation and save the resulting plots. The individual plots for each decade are compiled into an animated GIF using the gganimate package. The resulting GIF provides a dynamic overview of the global forest conversion patterns, emphasizing the transformations occurring over the specified decades.
Analysis
We concentrated our visual analysis on the decades of 1990, 2000, 2010. As some countries like USA, Russia and Australia are not having any data to visualize for 2015. It is important to note that the animated plots incorporates data from all decades, providing a comprehensive overview of the entire timeline and the evolving patterns in forest cover across different years.
Function used to generate the plot
# Function for creating the ggplot map plot# Using the filtered_forests$`2000` dataset created earlier as a data source# using entity as map_id for first layer# using forest_convestion as fill aesthetic and word as map for second layer# using highlight_filtered_data$`2000` as another dataset for creating another map layer# using entity as map_id,forest_convestion as fill aesthetic and highlight_world as map for third layer# input : year - integer# output : world_plot - plot objectgenerateForestConversionPlot <-function(year) { world_plot <-ggplot(filtered_forests[[as.character(year)]], aes(map_id = entity)) +geom_map(aes(fill = forest_converstion),map = world,color ="#B2BEB5",linewidth =0.25,linetype ="blank" ) +geom_map(data = highlight_filtered_data[[as.character(year)]],aes(map_id = entity, fill = forest_converstion),map = highlight_world,color ="#71797E",show.legend = F ) +expand_limits(x = world$long, y = world$lat) +scale_fill_manual(values = color_mapping, na.value ="#F2F3F4") +coord_fixed(ratio =1) +labs(title =paste("Net Forest Conversion by Country in", year),subtitle ="Net change in forest area measures forest expansion minus deforestation",caption ="Data source: Our World in Data",fill ="Net Forest Conversion (hectares)" ) +theme_void() +theme(legend.position ="bottom",legend.direction ="horizontal",plot.title =element_text(size =19, face ="bold", hjust =0.5),plot.subtitle =element_text(size =15, color ="azure4", hjust =0.5),plot.caption =element_text(size =12, color ="azure4", hjust =0.95) ) +guides(fill =guide_legend(nrow =1,direction ="horizontal",title.position ="top",title.hjust =0.5,label.position ="bottom",label.hjust =1,label.vjust =1,label.theme =element_text(lineheight =0.25, size =9),keywidth =1,keyheight =0.5 ) )return(world_plot)}
Change in forest cover for the decade 1990
In the 1990 map, the initial phase of global forest conversion is illustrated. Countries are visually distinguished through color codes representing their net forest conversion rates. Particularly, Brazil stands out with significant deforestation during this period, in contrast to countries such as China, India and USA which showcase notable forest expansion.
Change in forest cover for the decade 2000
Moving to the year 2000, the map reveals how forest conversion patterns have evolved. Several countries have altered their trajectory. Countries such as Russia, China and USA has shown increase in their forestation. While Brazil still continues to be in the state of deforestation. Compared to previous decade India’s forest conversion rate reduced even though they are moving towards reforestation.
Change in forest cover for the decade 2010
By 2010, global efforts to combat deforestation are visible. The forest conversion has shifted its tides in some countries like Russia and Australia. But, Brazil still continues to be in the state of deforestation.
Dynamic display of forest conversion across the countries over the past decades
Animation of the plots
# making gif using gganimate packageforest_plots <-list.files(path ="images/q1/", full.names =TRUE)forest_plot_list <-lapply(forest_plots, image_read)# joining all the saved imagesjoined_plots <-image_join(forest_plot_list)# animating the images using image_animate() and restting the resolution# setting fps = 1forest_animation <-image_animate(image_scale(joined_plots, "6000x4000"), fps =1)# saving image to gitimage_write(image = forest_animation, path ="images/world_forestaion.gif")forest_animation
Discussion
The visualizations offer a vivid depiction of the evolving global forest conversion patterns between 1990 and 2015. Through color-coded maps, complex net forest conversion rates are presented in an easily understandable manner, facilitating the immediate recognition of trends. Darker shades representing significant deforestation or expansion immediately draw attention, guiding viewers to focus on countries undergoing substantial environmental changes. The animated visualizations highlight the dynamic nature of global forest conversion over the years across different countries.
The plot reveals significant changes in forest cover over recent decades, with unexpected rises observed in specific countries. Notable positive shifts occurred in the 2000s and 2010s, particularly in countries like the USA, Russia, Australia, China, and India. These improvements might be attributed to global deforestation awareness campaigns, leading to increased forest coverage in these nations. However, a significant portion of South America and Africa continues to bear the brunt of deforestation, with no signs of slowing down, particularly in countries like Brazil and Tanzania. Economically developed continents seem to prioritize reforestation initiatives, contrasting with less economically advantaged regions where such efforts receive comparatively less focus.
In conclusion, this visualization offers a compelling narrative of global environmental change, highlighting both challenges and successes in forest conservation efforts. It serves as a valuable tool for policymakers and environmentalists providing actionable insights into regional and global trends. Such visualizations can aid in targeted policy formulation, focusing resources where they are most needed, and fostering international collaborations to combat deforestation.
Question 2: How has the consumption of Soybean in Brazil changed over time, and how does it impact the afforestation or deforestation rates?
Introduction
The consumption of soybeans, a versatile and globally significant crop, is intricately linked with land use changes, often impacting regions far beyond agricultural fields. In the second question, we focus on the dynamic story of soybean consumption in Brazil. Our central question revolves around the historical evolution of soybean consumption and its potential implications for afforestation and deforestation rates in this vital agricultural region. This analysis aims to unveil the intricate relationship between soybean consumption and environmental changes in Brazil. The findings will contribute to a deeper understanding of how agricultural practices in this key region influence land use, afforestation, and deforestation. This knowledge is invaluable for making informed decisions regarding sustainable land management and conservation practices in this dynamic agricultural landscape.
For this visualization, we used soybean_use data which was sourced from Our World in Data and performed data manipulation to obtain a new column showing the total consumption of soybean. The soyabean_use data, comprises of the columns (variables) human_food ,animal_feed and processed.
Approach
Cleaning and processing of the dataset includes the following steps:
Soybean consumption in Brazil:
Created a new column for calculating the total soybean consumption. Removing totals of countries whose total consumption is 0 using the subset function.
Filtering for Brazil under the entity column, and year between 1990 and 2013.
Forest coverage in Brazil:
Filtering for year between 1990 and 2013 in the forest_brazil dataset.
There is a parameter ‘World’ which shows the overall forest coverage data. Filtering out entity as ‘World’ and year between 1990 and 2013, and grouping by year.
Using left_join we merge the two tables based on the year column.
Since we now have percentage data and total data per year, we can calculate the change in forest coverage for Brazil by doing forest_area.x * forest_area.y.
Pre-processing of soybean and forest data
#Function to pre-process the total_forest, soybean_use and forest_area datasets#Input : total_forest- tibble# soybean_use- tibble# forest_area- tibble#Output: soybean_brazil- tibble# forest_brazil- tibble#Cleaning total_forest tabletotal_forest_cleaned <-clean_names(total_forest)#Making a new column to calculate the total soybean consumptionsoybean <- soybean_use |>mutate(total = human_food + animal_feed + processed)#Some countries do not have consumption, and shows as 0. #Removing the rows if total=0soybean <-subset(soybean, total !=0)# Filter data for Brazilsoybean_brazil <- soybean |>filter(entity =="Brazil", year>=1990&year<=2013)# Filter data for Brazil forest: forest_brazil <- forest_area |>filter(entity =="Brazil",year>=1990&year<=2013)#Finding total forest coverage per yeartotal_forest_world <- total_forest_cleaned |>filter(year >=1990, year <=2013, entity =="World") |>group_by(year)# Left join to add total world forest coverage to the forest_brazil datasetforest_brazil <- forest_brazil |>left_join(total_forest_world, by ="year")#Finding actual total coverage for Brazil (percentage * total)forest_brazil <- forest_brazil|>mutate(forest_area_brazil = forest_area.x * forest_area.y /100)
Analysis
Plotting the soybean data
Plotting of soybean consumption in Brazil
#Code for creating the time series plot#Used the soybean_brazil dataset created earlier as a data source#using year and total as x and y axis for first layer#Plotting points over line to increase visibility as second layer#Manual fill to show trend as positive#Input: year and total- numeric#Output: plot_soybean_brazil- plot object# Create a line plot for Brazil soybean consumptionplot_soybean_brazil <-ggplot(soybean_brazil, aes(x = year, y = total, color ="Brazil")) +geom_line(linewidth =2) +#Plotting line plot of seriesgeom_point(color ="#6E8B3D") +#Plotting points for claritylabs(x ="\nYear", y ="Total (in lb)\n", title ="Soybean consumption in Brazil\n", caption ="Jon Harmon | TidyTuesday") +theme_minimal() +theme(legend.position ="none", plot.title =element_text(size =15)) +scale_y_continuous(labels = scales::label_number(scale =1e-06, suffix ="M")) +#Cleaning long numbersscale_color_manual(values =c("Brazil"="#a6d96a")) +scale_x_continuous(limits =c(1990, 2013), breaks =seq(1990, 2013, by =2)) #Defining year range#Saving plot to location, and defining custom widthggsave(plot_soybean_brazil, filename ="images/q2/plot_soybean_brazil.jpg", height =8, width =15, unit ="in", dpi =120)plot_soybean_brazil
Animating the soybean graph to show the trends:
Animation of soybean usage
#Code to animate the plot using gganimate package# Animate the plotanim_plot_soybean <- plot_soybean_brazil +transition_reveal(year)# Save as an animated GIFanim_save("soybean_brazil_animation.gif", anim_plot_soybean, renderer =gifski_renderer())# Load the animated GIFbrazil_soybean_animation <-image_read("soybean_brazil_animation.gif")# Display the animationbrazil_soybean_animation |>image_animate(fps =25)
Plotting the forest coverage in Brazil
Plotting of forest coverage in Brazil
#Code for creating the time series plot#Used the forest_brazil dataset created earlier as a data source#using year and total as x and y axis for first layer for line plot#Plotting points over line to increase visibility as second layer#Manual fill to show trend as negative#Input: year and forest_area_brazil- numeric#Output: plot_soybean_brazil- plot object# Create a line plot for Brazil with pointsplot_forest_brazil <-ggplot(forest_brazil, aes(x = year, y = forest_area_brazil, color ="Brazil")) +geom_line(linewidth =2) +#Plotting line plot of seriesgeom_point(color="#fdae61") +#Plotting points for claritylabs(x ="\nYear", y ="Forest coverage (in hectares)\n", title ="Forest coverage in Brazil\n", caption="Jon Harmon | TidyTuesday") +theme_minimal() +theme(legend.position ="none", plot.title =element_text(size =15)) +scale_color_manual(values =c("Brazil"="#fee08b"))+scale_x_continuous(limits =c(1990, 2013), breaks =seq(1990, 2013, by =2))+#Defining year rangescale_y_continuous(labels = scales::label_number(scale =1e-6, suffix ="M")) #Cleaning long numbersggsave(plot_forest_brazil, filename ="images/q2/plot_forest_brazil.jpg", height =8, width =15, unit ="in", dpi =120)plot_forest_brazil
Animation of forest coverage
#Animation of plot using gganimate package# Animate the plotanim_plot_forest <- plot_forest_brazil +transition_reveal(year)# Save as an animated GIFanim_save("forest_brazil_animation.gif", anim_plot_forest, renderer =gifski_renderer())# Load the animated GIFbrazil_forest_animation <-image_read("forest_brazil_animation.gif")# Display the animationbrazil_forest_animation |>image_animate(fps =25)
Discussion
The discussion on the relationship between soybean consumption and environmental changes in Brazil is of paramount importance, given the global significance of soybeans as a versatile crop (Pagano & Miransari). The intricate link between land use changes and soybean consumption highlights the need for a comprehensive understanding of the dynamics at play. By examining the historical evolution of soybean consumption in Brazil, this study sheds light on the potential implications for afforestation and deforestation rates in this key agricultural region. It becomes evident that this had a notable impact on land use in Brazil.
The utilization of time series visualization techniques, including geom_line and geom_point, has allowed for a comprehensive overview of the trends over time. These visualizations provide a clear depiction of the steady increase in soybean consumption in Brazil. The data shows a remarkable increase, from approximately 16.4 million pounds of soybeans in 1990 to a staggering 36.87 million pounds in 2013. This indicates a substantial growth over this period (Pagano & Miransari).
Moreover, the decrease in forest coverage during this period is stark. The forest coverage in Brazil dropped from 588 million hectares in 1990 to 507 million hectares in 2013, representing a significant loss of 81 million hectares of forest land during this time. This reduction in forest area is indicative of the environmental impact in Brazil (Song et al.).
The correlation between rising soybean consumption and decreasing forest coverage in Brazil underscores the need for sustainable agricultural practices and conservation efforts. As the global demand for soybeans continues to grow, this study’s findings, supported by the data, serve as a valuable resource for policymakers and stakeholders in the region (Valdez). It underscores the importance of considering the environmental consequences of agricultural expansion and the need for measures to mitigate deforestation. This research contributes to the broader understanding of the complex interplay between agriculture and environmental changes, making it a pivotal step towards more informed and sustainable land management practices in Brazil.
Valdez, C. (2022). Brazil’s Momentum as a Global Agricultural Supplier Faces Headwinds. Published online at ers.usda.gov. Retrieved from: Economic Research Service
Pagano, M. & Miransari, M. (2016). “The importance of soybean production worldwide.” Retrieved from: Science Direct
Song et al., (2021). “Massive soybean expansion in South America since 2000 and implications for conservation.” Retrieved from: National Library of Medicine
Quarto, For documentation and presentation - Quarto
ggplot, For understanding of different plot - ggplot
---title: "Forests in Transition: Visualizing Global Deforestation"subtitle: "INFO 526 - Project 1"author: - name: "The Plotting Pandas - Megan, Shakir, Maria, Eshaan, Bharath" affiliations: - name: "School of Information, University of Arizona"description: "Uncovering Global Deforestation and Soy Bean Consumption"format: html: code-tools: true code-overflow: wrap embed-resources: trueeditor: visualexecute: warning: false echo: false---```{r load_packages, message=FALSE, include=FALSE}# GETTING THE LIBRARIESif (!require(pacman))install.packages(pacman)pacman::p_load(tidyverse, gridExtra, tidytuesdayR, dplyr, janitor, dlookr, # Exploratory data analysis here, # Standardizes paths to data formattable, ggpubr, maps, plotly, gganimate, imager, magick, gifski)``````{r ggplot_setup, message=FALSE, include=FALSE}# setting theme for ggplot2ggplot2::theme_set(ggplot2::theme_minimal(base_size =14, base_family ="sans"))# setting width of code outputoptions(width =65)# setting figure parameters for knitrknitr::opts_chunk$set(fig.width =8, # 8" widthfig.asp =0.618, # the golden ratiofig.retina =1, # dpi multiplier for displaying HTML output on retinafig.align ="center", # center align figuresdpi =140# higher dpi, sharper image)``````{r load_dataset, message=FALSE, include=FALSE}# Getting all the underlying data in the datasetforest <-read.csv(here("data", "forest.csv"))forest_area <-read.csv(here("data", "forest_area.csv"))brazil_loss <-read.csv(here("data", "brazil_loss.csv"))soybean_use <-read.csv(here("data", "soybean_use.csv"))vegetable_oil <-read.csv(here("data", "vegetable_oil.csv"))total_forest <-read.csv(here("data", "forest-area-km.csv"))deforestation_by_source <-read.csv(here("data", "deforestation_by_source.csv"))```## AbstractThis project delves into two crucial aspects of global environmental dynamics using the **Global Deforestation** dataset, a comprehensive resource published by Hannah Ritchie and Max Roser in the **Our World in Data** journal in 2021. The dataset encompasses a wide array of attributes related to global forest cover, deforestation rates, and associated factors. Two distinct questions guide our exploration.With a focus on identifying the patterns in forest area conversion, the first question seeks to comprehend how the world's forest cover has changed over the previous three decades. With the help of a choropleth map, we meticulously prepare, clean, and visualize the data in order to clearly depict net forest conversion across the globe. We use the strengths of `ggplot` and `gganimate` to build an interesting, dynamic map that sheds light on the dynamics of forests around the world.The second question investigates the trajectory of soybean consumption in Brazil and its potential impact on afforestation and deforestation rates. Through data manipulation, we calculate total soybean consumption, revealing the crop's evolution over time. Employing ggplot, we construct time series plots to showcase soybean consumption trends and assess their correlation of soybean consumption with afforestation and deforestation rates. The period from 1990 to 2013 becomes our focal point.## IntroductionThe global environment is undergoing profound changes, driven by factors ranging from climate shifts to land use transformations. Within this complex web of interconnected challenges, the fate of the world's forests and the dynamics of soybean consumption stand as two pivotal and interrelated aspects of environmental change. These subjects are the focus of our investigation, guided by the rich and detailed "Global Deforestation" dataset, a comprehensive resource provided by Hannah Ritchie and Max Roser in the "Our World in Data" journal in 2021.The dataset offers an extensive repository of data, comprising variables such as net forest conversion, year, entity (providing country and continent information), and soybean consumption statistics. This dataset proves invaluable for analyzing the complex interplay between land use changes, soybean consumption, and broader conservation efforts. By leveraging advanced data analysis and visualization techniques, this project aims to provide critical insights into the ever-evolving dynamics of global forests and the influence of soybean consumption, contributing to a better understanding of essential environmental conservation and sustainable land management practices.## Question 1: What does the global forest area look like over past decades, highlighting the trends of forest area conversion?### IntroductionIn the face of global climate change, It is important to understand the dynamics of forest cover conversion. The first question delves into the net forest conversion data from 1990 to 2015, examining the changes decade-wise for various countries across the world. Utilizing data from [tidytuesday](https://github.com/rfordatascience/tidytuesday/tree/master/data/2021/2021-04-06) which was sourced from *Our World in Data*, this study explores the dynamics of forest cover alteration over four decades. By visualizing and analyzing the net forest conversion rates from 1990 to 2015, this study aims to shed light on the evolving state of the world's forests and highlight countries with notable trends in forest expansion and deforestation.For this visualization we focused on the `forest` dataset provided by Our World in Data. We utilized the `net_forest_converstion` variable as it gave us a significant information on how forest conversion has been over the years. In the main plot, our attention was directed towards the specific periods of 1990, 2000 and 2010. However, in the animated compilation, we incorporated visualizations from all decades.### ApproachTo address this question, we implemented a choropleth map visualization highlighting the forest conversion for specific decades and spotlighting few countries with extensive forest conversion. We began to scout the data for getting relevant information and noteworthy details. The whole approach can be categorized into Data Preparation and Pre-processing, Visualizing the data and Animating the plots.**Data Preparation and Pre-processing**To generate a map plot we needed geographical information which is not present in the dataset and was achieved by utilizing the `maps` package in R. From this `world` data we retrieved all the unique countries for further processing and also filtered out Antarctica from the data.A custom function, `processForest()`, is developed to handle the pre-processing of the forest conversion dataset. This function is used filter countries, ensuring all entities present in the map data are included. It also used to categorize countries based on their `net forest conversion` rates, grouping them into distinct categories. The processed forest conversion data is split into subsets for each decade (1990, 2000, 2010, and 2015). The `split()` function in R is employed to divide the data based on the `year` column.```{r world_data, message=FALSE, include=FALSE}# World data from mapsworld <-map_data("world")# Extracting unique countries from the world dataset and storing them in a tibbleunique_countries <- world |>select(region) |># region is the column which is having country names infounique() |># getting unique country namesas_tibble()# Removing Antarctica from the world dataworld <-subset(world, region !="Antarctica")``````{r process_forest_function, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Glimpse of processForest Function"# function to pre process the forest dataset# input : dataset - tibble# unique_countries - tibble# output : filtered_data - tibbleprocessForest <-function (dataset, unique_countries) { filtered_data <- dataset |># filtering only entity, year and net_forest_conversion columnsselect(entity, year, net_forest_conversion) |># getting all the countires which are not present in forest dataset for a specific years# bind_rows() is used combine combine rows of two data framesbind_rows(# anti_join() is used to return only the rows from the first dataset that isn't having matching rows in the second dataset based on specified key columnsanti_join(unique_countries, dataset, by =c("region"="entity")) |># adding year and net_forest_conversion for that specific year as NAmutate(year = dataset[1, "year"], net_forest_conversion =NA) ) |># renaming USA and UK so that both these countries are matching in world dataset and forest datasetmutate(entity =case_when( entity =="United States"~"USA", entity =="United Kingdom"~"UK",TRUE~ entity ) ) |># creating a categorical variable forest_converstion to group countries based on their forest conversionmutate(entity =coalesce(entity, region),forest_converstion =case_when( net_forest_conversion <-400000~"<-400k", net_forest_conversion <-200000~"-400k to -200k", net_forest_conversion <-100000~"-200k to -100k", net_forest_conversion <0~"-100k to 0", net_forest_conversion <100000~"0 to 100k", net_forest_conversion <200000~"100k to 200k", net_forest_conversion <400000~"200k to 400k",is.na(net_forest_conversion) ~NA_character_,TRUE~">400k" ) ) |># ordering forest_converstion column using factors based on the created categoriesmutate(forest_converstion =as_factor(forest_converstion) |>fct_relevel("<-400k","-400k to -200k","-200k to -100k","-100k to 0","0 to 100k","100k to 200k","200k to 400k",">400k" ) )return(filtered_data)}``````{r split_forest_data, message=FALSE, include=FALSE}# Dividing data into tibbles based on year and it creates a list of tibblesforest_decades <-split(forest, f = forest$year)# Implementing the pre processing function on top of the created split tibbles# lapply() is used for applying a function on top of any listfiltered_forests <-lapply(forest_decades, function(forest_ds) {processForest(forest_ds, unique_countries)})```A custom function, `filterCountries()`, is created to identify and extract specific countries of interest. These countries are singled out due to their significant forest conversion.```{r highlightingCountriesWithmajorChange, message=FALSE, include=FALSE}# function to filter countries which are having noteworthy forest conversion# input : dataset - tibble# variable - column character# output : highlight_data - tibblefilterCountries <-function (dataset, variable){ highlight_data <-subset(dataset, variable %in%c("Brazil", "Tanzania", "China", "India", "Russia", "USA", "Australia"))return(highlight_data)}# filtering countries from world dataset using filter_countrieshighlight_world <-filterCountries(world, world$region)# filtering countries from filtered_forests dataset using filter_countrieshighlight_filtered_data <-lapply(filtered_forests, function(forest_ds) {filterCountries(forest_ds, forest_ds$entity)})```**Visualization the data and Animating the plots**The `ggplot2` package in R is utilized to create detailed visualizations for each decade. For each decade, a map is generated where countries are color-coded according to their forest conversion categories. The `geom_map()` function is employed to plot the world map, and additional layers are added to highlight the noteworthy countries, ensuring they stand out in the visual representation.We've encapsulated the common plotting logic for all decades within a function called `generateForestConversionPlot()`. This function streamlines the process of generating plots for each specific decade.```{r colourMapping, message=FALSE, include=FALSE}# color mapping for different forest conversion categoriescolor_mapping <-c("<-400k"="#d73027","-400k to -200k"="#f46d43","-200k to -100k"="#fdae61","-100k to 0"="#fee08b","0 to 100k"="#d9ef8b","100k to 200k"="#a6d96a","200k to 400k"="#66bd63",">400k"="#1a9850" )```We also developed a function called `generatePlotforAnimation()`, designed specifically to adjust text sizes in the original plot to enhance clarity during animation and save the resulting plots. The individual plots for each decade are compiled into an animated GIF using the `gganimate` package. The resulting GIF provides a dynamic overview of the global forest conversion patterns, emphasizing the transformations occurring over the specified decades.### AnalysisWe concentrated our visual analysis on the decades of `1990`, `2000`, `2010`. As some countries like `USA`, `Russia` and `Australia` are not having any data to visualize for `2015`. It is important to note that the animated plots incorporates data from all decades, providing a comprehensive overview of the entire timeline and the evolving patterns in forest cover across different years.```{r world_plot_function, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Function used to generate the plot"# Function for creating the ggplot map plot# Using the filtered_forests$`2000` dataset created earlier as a data source# using entity as map_id for first layer# using forest_convestion as fill aesthetic and word as map for second layer# using highlight_filtered_data$`2000` as another dataset for creating another map layer# using entity as map_id,forest_convestion as fill aesthetic and highlight_world as map for third layer# input : year - integer# output : world_plot - plot objectgenerateForestConversionPlot <-function(year) { world_plot <-ggplot(filtered_forests[[as.character(year)]], aes(map_id = entity)) +geom_map(aes(fill = forest_converstion),map = world,color ="#B2BEB5",linewidth =0.25,linetype ="blank" ) +geom_map(data = highlight_filtered_data[[as.character(year)]],aes(map_id = entity, fill = forest_converstion),map = highlight_world,color ="#71797E",show.legend = F ) +expand_limits(x = world$long, y = world$lat) +scale_fill_manual(values = color_mapping, na.value ="#F2F3F4") +coord_fixed(ratio =1) +labs(title =paste("Net Forest Conversion by Country in", year),subtitle ="Net change in forest area measures forest expansion minus deforestation",caption ="Data source: Our World in Data",fill ="Net Forest Conversion (hectares)" ) +theme_void() +theme(legend.position ="bottom",legend.direction ="horizontal",plot.title =element_text(size =19, face ="bold", hjust =0.5),plot.subtitle =element_text(size =15, color ="azure4", hjust =0.5),plot.caption =element_text(size =12, color ="azure4", hjust =0.95) ) +guides(fill =guide_legend(nrow =1,direction ="horizontal",title.position ="top",title.hjust =0.5,label.position ="bottom",label.hjust =1,label.vjust =1,label.theme =element_text(lineheight =0.25, size =9),keywidth =1,keyheight =0.5 ) )return(world_plot)}``````{r allDecades_plot, message=FALSE}# Generating plots for different decadesplot_1990 <-generateForestConversionPlot(1990)plot_2000 <-generateForestConversionPlot(2000)plot_2010 <-generateForestConversionPlot(2010)plot_2015 <-generateForestConversionPlot(2015)```**Change in forest cover for the decade `1990`**In the 1990 map, the initial phase of global forest conversion is illustrated. Countries are visually distinguished through color codes representing their net forest conversion rates. Particularly, `Brazil` stands out with significant deforestation during this period, in contrast to countries such as `China`, `India` and `USA` which showcase notable forest expansion.```{r plot_1990, message=FALSE}plot_1990```**Change in forest cover for the decade `2000`**Moving to the year 2000, the map reveals how forest conversion patterns have evolved. Several countries have altered their trajectory. Countries such as `Russia`, `China` and `USA` has shown increase in their forestation. While `Brazil` still continues to be in the state of deforestation. Compared to previous decade `India`'s forest conversion rate reduced even though they are moving towards reforestation.```{r plot_2000, message=FALSE}plot_2000```**Change in forest cover for the decade `2010`**By 2010, global efforts to combat deforestation are visible. The forest conversion has shifted its tides in some countries like `Russia` and `Australia`. But, `Brazil` still continues to be in the state of deforestation.```{r plot_2010, message=FALSE}plot_2010``````{r plot_2015, message=FALSE, eval=FALSE}plot_2015``````{r saving_plots, message=FALSE, include=FALSE}# Function for saving plot for animation# input : world_forest_plot - plot_object,# file_path - file path to savegeneratePlotforAnimation <-function(world_forest_plot, file_path) {# generating the plot updated_world_plot <- world_forest_plot +theme(plot.title =element_text(size =24),plot.subtitle =element_text(size =18),plot.caption =element_text(size =15),legend.key.size =unit(2, "lines"),legend.text =element_text(size =14),legend.title =element_text(size =16, face ="bold") )# saving the plot as an image fileggsave(updated_world_plot, filename = file_path,height =8, width =15, unit ="in", dpi =120)}generatePlotforAnimation(plot_1990, "images/q1/forest_plot_1990.jpg")generatePlotforAnimation(plot_2000, "images/q1/forest_plot_2000.jpg")generatePlotforAnimation(plot_2010, "images/q1/forest_plot_2010.jpg")generatePlotforAnimation(plot_2015, "images/q1/forest_plot_2015.jpg")```**Dynamic display of forest conversion across the countries over the past decades**```{r savingas_GIF, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Animation of the plots"# making gif using gganimate packageforest_plots <-list.files(path ="images/q1/", full.names =TRUE)forest_plot_list <-lapply(forest_plots, image_read)# joining all the saved imagesjoined_plots <-image_join(forest_plot_list)# animating the images using image_animate() and restting the resolution# setting fps = 1forest_animation <-image_animate(image_scale(joined_plots, "6000x4000"), fps =1)# saving image to gitimage_write(image = forest_animation, path ="images/world_forestaion.gif")forest_animation```### DiscussionThe visualizations offer a vivid depiction of the evolving global forest conversion patterns between `1990 and 2015`. Through color-coded maps, complex net forest conversion rates are presented in an easily understandable manner, facilitating the immediate recognition of trends. Darker shades representing significant deforestation or expansion immediately draw attention, guiding viewers to focus on countries undergoing substantial environmental changes. The animated visualizations highlight the dynamic nature of global forest conversion over the years across different countries.The plot reveals significant changes in forest cover over recent decades, with unexpected rises observed in specific countries. Notable positive shifts occurred in the `2000s` and `2010s`, particularly in countries like the `USA`, `Russia`, `Australia`, `China`, and `India`. These improvements might be attributed to global deforestation awareness campaigns, leading to increased forest coverage in these nations. However, a significant portion of *South America* and *Africa* continues to bear the brunt of deforestation, with no signs of slowing down, particularly in countries like `Brazil` and `Tanzania`. Economically developed continents seem to prioritize reforestation initiatives, contrasting with less economically advantaged regions where such efforts receive comparatively less focus.In conclusion, this visualization offers a compelling narrative of global environmental change, highlighting both challenges and successes in forest conservation efforts. It serves as a valuable tool for policymakers and environmentalists providing actionable insights into regional and global trends. Such visualizations can aid in targeted policy formulation, focusing resources where they are most needed, and fostering international collaborations to combat deforestation.## Question 2: How has the consumption of Soybean in Brazil changed over time, and how does it impact the afforestation or deforestation rates?### IntroductionThe consumption of soybeans, a versatile and globally significant crop, is intricately linked with land use changes, often impacting regions far beyond agricultural fields. In the second question, we focus on the dynamic story of soybean consumption in Brazil. Our central question revolves around the historical evolution of soybean consumption and its potential implications for afforestation and deforestation rates in this vital agricultural region. This analysis aims to unveil the intricate relationship between soybean consumption and environmental changes in Brazil. The findings will contribute to a deeper understanding of how agricultural practices in this key region influence land use, afforestation, and deforestation. This knowledge is invaluable for making informed decisions regarding sustainable land management and conservation practices in this dynamic agricultural landscape.For this visualization, we used `soybean_use` data which was sourced from Our World in Data and performed data manipulation to obtain a new column showing the total consumption of soybean. The soyabean_use data, comprises of the columns (variables) `human_food` ,`animal_feed` and `processed`.### ApproachCleaning and processing of the dataset includes the following steps:Soybean consumption in Brazil:1. Created a new column for calculating the total soybean consumption. Removing totals of countries whose total consumption is 0 using the `subset` function.2. Filtering for Brazil under the `entity` column, and `year` between 1990 and 2013.Forest coverage in Brazil:1. Filtering for `year` between 1990 and 2013 in the `forest_brazil` dataset.2. There is a parameter 'World' which shows the overall forest coverage data. Filtering out `entity` as 'World' and `year` between 1990 and 2013, and grouping by year.3. Using `left_join` we merge the two tables based on the `year` column.4. Since we now have percentage data and total data per year, we can calculate the change in forest coverage for Brazil by doing `forest_area.x`\*`forest_area.y`.```{r q2_brazil_process, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Pre-processing of soybean and forest data"#Function to pre-process the total_forest, soybean_use and forest_area datasets#Input : total_forest- tibble# soybean_use- tibble# forest_area- tibble#Output: soybean_brazil- tibble# forest_brazil- tibble#Cleaning total_forest tabletotal_forest_cleaned <-clean_names(total_forest)#Making a new column to calculate the total soybean consumptionsoybean <- soybean_use |>mutate(total = human_food + animal_feed + processed)#Some countries do not have consumption, and shows as 0. #Removing the rows if total=0soybean <-subset(soybean, total !=0)# Filter data for Brazilsoybean_brazil <- soybean |>filter(entity =="Brazil", year>=1990&year<=2013)# Filter data for Brazil forest: forest_brazil <- forest_area |>filter(entity =="Brazil",year>=1990&year<=2013)#Finding total forest coverage per yeartotal_forest_world <- total_forest_cleaned |>filter(year >=1990, year <=2013, entity =="World") |>group_by(year)# Left join to add total world forest coverage to the forest_brazil datasetforest_brazil <- forest_brazil |>left_join(total_forest_world, by ="year")#Finding actual total coverage for Brazil (percentage * total)forest_brazil <- forest_brazil|>mutate(forest_area_brazil = forest_area.x * forest_area.y /100)```### AnalysisPlotting the soybean data```{r q2_plot_soybean, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Plotting of soybean consumption in Brazil"#Code for creating the time series plot#Used the soybean_brazil dataset created earlier as a data source#using year and total as x and y axis for first layer#Plotting points over line to increase visibility as second layer#Manual fill to show trend as positive#Input: year and total- numeric#Output: plot_soybean_brazil- plot object# Create a line plot for Brazil soybean consumptionplot_soybean_brazil <-ggplot(soybean_brazil, aes(x = year, y = total, color ="Brazil")) +geom_line(linewidth =2) +#Plotting line plot of seriesgeom_point(color ="#6E8B3D") +#Plotting points for claritylabs(x ="\nYear", y ="Total (in lb)\n", title ="Soybean consumption in Brazil\n", caption ="Jon Harmon | TidyTuesday") +theme_minimal() +theme(legend.position ="none", plot.title =element_text(size =15)) +scale_y_continuous(labels = scales::label_number(scale =1e-06, suffix ="M")) +#Cleaning long numbersscale_color_manual(values =c("Brazil"="#a6d96a")) +scale_x_continuous(limits =c(1990, 2013), breaks =seq(1990, 2013, by =2)) #Defining year range#Saving plot to location, and defining custom widthggsave(plot_soybean_brazil, filename ="images/q2/plot_soybean_brazil.jpg", height =8, width =15, unit ="in", dpi =120)plot_soybean_brazil```Animating the soybean graph to show the trends:```{r q2_plot_soybean_anim, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Animation of soybean usage"#Code to animate the plot using gganimate package# Animate the plotanim_plot_soybean <- plot_soybean_brazil +transition_reveal(year)# Save as an animated GIFanim_save("soybean_brazil_animation.gif", anim_plot_soybean, renderer =gifski_renderer())# Load the animated GIFbrazil_soybean_animation <-image_read("soybean_brazil_animation.gif")# Display the animationbrazil_soybean_animation |>image_animate(fps =25)```Plotting the forest coverage in Brazil```{r q2_plot_forest, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Plotting of forest coverage in Brazil"#Code for creating the time series plot#Used the forest_brazil dataset created earlier as a data source#using year and total as x and y axis for first layer for line plot#Plotting points over line to increase visibility as second layer#Manual fill to show trend as negative#Input: year and forest_area_brazil- numeric#Output: plot_soybean_brazil- plot object# Create a line plot for Brazil with pointsplot_forest_brazil <-ggplot(forest_brazil, aes(x = year, y = forest_area_brazil, color ="Brazil")) +geom_line(linewidth =2) +#Plotting line plot of seriesgeom_point(color="#fdae61") +#Plotting points for claritylabs(x ="\nYear", y ="Forest coverage (in hectares)\n", title ="Forest coverage in Brazil\n", caption="Jon Harmon | TidyTuesday") +theme_minimal() +theme(legend.position ="none", plot.title =element_text(size =15)) +scale_color_manual(values =c("Brazil"="#fee08b"))+scale_x_continuous(limits =c(1990, 2013), breaks =seq(1990, 2013, by =2))+#Defining year rangescale_y_continuous(labels = scales::label_number(scale =1e-6, suffix ="M")) #Cleaning long numbersggsave(plot_forest_brazil, filename ="images/q2/plot_forest_brazil.jpg", height =8, width =15, unit ="in", dpi =120)plot_forest_brazil``````{r q2_plot_forest_anim, message=FALSE, echo=TRUE}#| code-fold: true#| code-summary: "Animation of forest coverage"#Animation of plot using gganimate package# Animate the plotanim_plot_forest <- plot_forest_brazil +transition_reveal(year)# Save as an animated GIFanim_save("forest_brazil_animation.gif", anim_plot_forest, renderer =gifski_renderer())# Load the animated GIFbrazil_forest_animation <-image_read("forest_brazil_animation.gif")# Display the animationbrazil_forest_animation |>image_animate(fps =25)```### DiscussionThe discussion on the relationship between soybean consumption and environmental changes in Brazil is of paramount importance, given the global significance of soybeans as a versatile crop ([Pagano & Miransari](https://www.sciencedirect.com/science/article/pii/B9780128015360000013)). The intricate link between land use changes and soybean consumption highlights the need for a comprehensive understanding of the dynamics at play. By examining the historical evolution of soybean consumption in Brazil, this study sheds light on the potential implications for afforestation and deforestation rates in this key agricultural region. It becomes evident that this had a notable impact on land use in Brazil.The utilization of time series visualization techniques, including geom_line and geom_point, has allowed for a comprehensive overview of the trends over time. These visualizations provide a clear depiction of the steady increase in soybean consumption in Brazil. The data shows a remarkable increase, from approximately 16.4 million pounds of soybeans in 1990 to a staggering 36.87 million pounds in 2013. This indicates a substantial growth over this period ([Pagano & Miransari](https://www.sciencedirect.com/science/article/pii/B9780128015360000013)).Moreover, the decrease in forest coverage during this period is stark. The forest coverage in Brazil dropped from 588 million hectares in 1990 to 507 million hectares in 2013, representing a significant loss of 81 million hectares of forest land during this time. This reduction in forest area is indicative of the environmental impact in Brazil ([Song et al.)](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8350977/#:~:text=Direct%20soybean%2Ddriven%20deforestation%20reached,forest%20loss%20during%20this%20period.).The correlation between rising soybean consumption and decreasing forest coverage in Brazil underscores the need for sustainable agricultural practices and conservation efforts. As the global demand for soybeans continues to grow, this study's findings, supported by the data, serve as a valuable resource for policymakers and stakeholders in the region ([Valdez](https://www.ers.usda.gov/amber-waves/2022/september/brazil-s-momentum-as-a-global-agricultural-supplier-faces-headwinds/)). It underscores the importance of considering the environmental consequences of agricultural expansion and the need for measures to mitigate deforestation. This research contributes to the broader understanding of the complex interplay between agriculture and environmental changes, making it a pivotal step towards more informed and sustainable land management practices in Brazil.## ReferencesHannah Ritchie (2021) - "Forest area". Published online at OurWorldInData.org. Retrieved from: '[https://ourworldindata.org/forest-area'](https://ourworldindata.org/forest-area')\[Online Resource\]Valdez, C. (2022). Brazil's Momentum as a Global Agricultural Supplier Faces Headwinds. Published online at ers.usda.gov. Retrieved from: [Economic Research Service](https://www.ers.usda.gov/amber-waves/2022/september/brazil-s-momentum-as-a-global-agricultural-supplier-faces-headwinds/)Pagano, M. & Miransari, M. (2016). "The importance of soybean production worldwide." Retrieved from: [Science Direct](https://www.sciencedirect.com/science/article/pii/B9780128015360000013)Song et al., (2021). "Massive soybean expansion in South America since 2000 and implications for conservation." Retrieved from: [National Library of Medicine](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8350977/#:~:text=Direct%20soybean%2Ddriven%20deforestation%20reached,forest%20loss%20during%20this%20period.)Quarto, For documentation and presentation - [Quarto](https://quarto.org/docs/reference/formats/html.html)ggplot, For understanding of different plot - [ggplot](https://ggplot2.tidyverse.org/reference/index.html)Logo, The logo used in the webiste - [Online](https://pngtree.com/so/pandas)