Simple Exploratory Data Analysis of Marcus Rashford’s Performance Since 2015 Using R

Sujan Shahi
3 min read · Oct 31, 2023

Source: https://www.premierleague.com/players/13565/player/overview

A brief introduction

I am a passionate football (soccer) fan and have followed Manchester United since I was a boy. Manchester United play in the Premier League in England, one of the biggest football leagues in the world. Marcus Rashford is one of our academy players who burst onto the scene in 2015/16, his senior debut season. Since then, he has been a player to watch for Manchester United.

Why This Analysis?

Mainly for two reasons:

1) I am very passionate about football (soccer).

2) I wanted to learn how to use APIs to download data through R, convert the downloaded data from JSON format to an R data frame, and use that data to perform a simple analysis.

To keep this analysis concise, I have presented only parts of the R code. To access the full R code, please visit my GitHub page here.

Accessing General FPL Data using APIs in R

We will use the general Fantasy Premier League (FPL) API, https://fantasy.premierleague.com/api/bootstrap-static/, to access general information on the teams and players in the Premier League. For more information on FPL APIs, click here.

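# Load libraries for calling the API and handling the response
library(httr)      # HTTP requests
library(jsonlite)  # JSON parsing
library(dplyr)     # data manipulation

base_url <- 'https://fantasy.premierleague.com/api/bootstrap-static/'
res <- GET(base_url)
stop_for_status(res)  # stop early if the request failed
print(res)

# Convert the raw response bytes to text, then parse the JSON into an R list
json_data <- fromJSON(rawToChar(res$content))
class(json_data)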
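
To pull a single player out of the general data, one option is to filter the table of players by name. The sketch below is only an illustration: it assumes the players sit in a data frame like json_data$elements with id, first_name and second_name columns, and it uses a small mock table with made-up ids so that it runs without calling the API. find_player is a hypothetical helper, not part of any library.

```r
# Mock stand-in for json_data$elements.
# Assumption: the real table has id, first_name and second_name columns.
# The ids below are made up for illustration.
elements <- data.frame(
  id = c(101L, 202L, 303L),
  first_name = c("Marcus", "Bruno", "Luke"),
  second_name = c("Rashford", "Borges Fernandes", "Shaw"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the row(s) whose second name matches exactly.
find_player <- function(elements, second_name) {
  elements[elements$second_name == second_name, , drop = FALSE]
}

rashford_info <- find_player(elements, "Rashford")
rashford_info$id
```

With the real response, the same helper would be called on json_data$elements instead of the mock table. Matching on second name alone can return more than one row, so it is worth checking nrow(rashford_info) before using the id.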

Accessing a Player’s Historic Performance Data

We will use the following API to access player-specific data:

https://fantasy.premierleague.com/api/element-summary/{player_id}

To make this reproducible for other players, I wrote a function that takes a player_id and returns the performance history of that player. We can look up a player_id in the data returned by the general API above.

# Function to get the performance data of a player
# Input: player_id, an integer that denotes a player's id
# Output: history_data, a data frame containing the historic information of the player with player_id
get_history_data <- function(player_id) {
  url <- paste0('https://fantasy.premierleague.com/api/element-summary/', player_id, '/')
  res <- GET(url)
  stop_for_status(res)  # stop early if the request failed
  # Convert the raw response bytes to text, then parse the JSON
  history_json_data <- fromJSON(rawToChar(res$content))
  # history_past holds the player's records from previous seasons
  history_data <- history_json_data$history_past
  history_data
}

# rashford_info is Marcus Rashford's row from the general data fetched above
get_history_data(rashford_info$id)
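
Once the history is in a data frame, a per-season bar chart takes only a few lines. The sketch below is a rough illustration rather than the code behind the figures in this post: it assumes the data frame has season_name and goals_scored columns, uses placeholder numbers instead of Rashford's real figures, and uses base R's barplot() to stay dependency-free (ggplot2 from the tidyverse would work just as well).

```r
# Placeholder data with the assumed column names; the numbers are made up
# and are NOT Marcus Rashford's real statistics.
history_data <- data.frame(
  season_name = c("2020/21", "2021/22", "2022/23"),
  goals_scored = c(8L, 3L, 12L),
  assists = c(6L, 1L, 4L),
  stringsAsFactors = FALSE
)

# Named vector: the names become the bar labels, one bar per season.
goals_by_season <- setNames(history_data$goals_scored, history_data$season_name)

barplot(
  goals_by_season,
  main = "Goals scored per season (placeholder data)",
  xlab = "Season",
  ylab = "Goals"
)
```

Swapping goals_scored for assists gives the corresponding assists chart.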

Marcus Rashford's Performance Data

Goals Scored per Season

Fig: Number of Goals Scored by Marcus Rashford in Premier League Since 2015

Based on the graph, Marcus Rashford’s goal tally dropped by 76% from the 2019/20 season to 2021/22, then went back up again in 2022/23.
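
For reference, a percentage change like this is the difference between the new and old values divided by the old value. A minimal sketch follows; pct_change is a hypothetical helper and the numbers are placeholders, not Rashford's actual totals.

```r
# Hypothetical helper: percentage change from an old value to a new value.
# A negative result means a decrease.
pct_change <- function(old, new) {
  (new - old) / old * 100
}

# Placeholder numbers: going from 20 to 5 is a 75% drop.
pct_change(20, 5)  # -75
```

The function already returns a negative number for a decrease, so the result reads as "a drop of 75%" rather than "a dip of -75%".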

Number of Assists Per Season

Fig: Number of Assists for Marcus Rashford in Premier League Since 2015

Based on the graph, Marcus Rashford’s assist count dropped by 81% from the 2020/21 season to 2021/22, then went back up again in 2022/23.

Goals Scored Vs Number of Assists

Fig: Number of Goals vs Number of Assists for Marcus Rashford in Premier League since 2015

Summary

Marcus Rashford’s goal tally fell by 76% from 2019/20 to 2021/22, whereas his number of assists fell by 81% from 2020/21 to 2021/22.

It was a great experience using R to download data through APIs, and using libraries such as tidyverse and dplyr to perform data analysis and visualization. I will continue to build on it to expand my knowledge in the field of Data Science.
