Exploratory Analysis of FIFA 20 dataset using R

fifa_20 image

With the FIFA World Cup 2022 around the corner, I combined my knowledge of football and data analytics to put together a short exploratory analysis of the FIFA 20 dataset using R. I used non-physical player attributes such as short_name, wage_eur, value_eur, age, club, potential, player_positions, nationality and overall.

I used the dataset available on Kaggle, which contains 17,000+ players featured in FIFA 20, each with more than 70 attributes. It was scraped from the website <SoFIFA>.

SQL part

Before the analysis, I first cleaned the data with a SQL query in Google Cloud BigQuery ["now you will all say: BigQuery, and for free at that? And I will say: that is just our style"] and kept only the 2,000 highest-paid players, because ["I had to do something in the cleaning step too, right?"].

SELECT *
FROM `fifa`
ORDER BY wage_eur DESC
LIMIT 2000
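For readers who would rather stay in R, the same "sort by wage, keep the top N" step can be sketched with dplyr. This is only an illustration, not the query I ran: `players` is a hypothetical stand-in for the full Kaggle table (the four rows are copied from the output shown later in this post), and `n_keep` plays the role of LIMIT.

```r
library(dplyr)

# Hypothetical stand-in for the full table; values copied from the head() output in this post
players <- data.frame(
  short_name = c("K. De Bruyne", "L. Messi", "Cristiano Ronaldo", "E. Hazard"),
  wage_eur   = c(370000, 565000, 405000, 470000),
  stringsAsFactors = FALSE
)

n_keep <- 2  # the post keeps 2000 rows; 2 keeps the toy example readable

# Equivalent of ORDER BY wage_eur DESC LIMIT n_keep
top_paid <- players %>%
  arrange(desc(wage_eur)) %>%
  head(n_keep)

print(top_paid)
```

On the real data, `n_keep` would be 2000 and `players` would be the full table read from the Kaggle CSV.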

Importing packages

Before we start the analysis, let’s import the libraries.

#### type this in console

library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6      ✔ purrr   0.3.4 
## ✔ tibble  3.1.8      ✔ dplyr   1.0.10
## ✔ tidyr   1.2.1      ✔ stringr 1.4.1 
## ✔ readr   2.1.2      ✔ forcats 0.5.2 
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
library(skimr)
library(Tmisc)
library(ggplot2)
library(dplyr)
library(gridExtra)
## 
## Attaching package: 'gridExtra'
## 
## The following object is masked from 'package:dplyr':
## 
##     combine
library(tidyselect)

Reading the file

# read the 2,000-row extract exported from BigQuery; read.csv() already returns a data frame
df <- read.csv(file = "table1.csv", stringsAsFactors = FALSE)
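One thing worth knowing about read.csv(): empty text fields are read as empty strings, not as NA, which is why columns such as nation_position show up as "empty" rather than "missing" in the summary further down. A minimal sketch of the difference, assuming you would prefer NA; the inline `csv_text` is a hypothetical two-row stand-in for table1.csv, with values taken from this post.

```r
# Hypothetical inline CSV standing in for table1.csv (values from this post)
csv_text <- "short_name,nation_position\nL. Messi,\nE. Hazard,LF\n"

# Default behaviour: the blank field becomes an empty string
raw <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# With na.strings, blank fields are read as NA instead
with_na <- read.csv(text = csv_text, stringsAsFactors = FALSE,
                    na.strings = c("", "NA"))

print(raw$nation_position)
print(with_na$nation_position)
```

Whether to convert blanks to NA is a design choice; I did not do it for the analysis in this post.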

A glimpse of the data set

head(df)
##   sofifa_id
## 1    158023
## 2    183277
## 3     20801
## 4    194765
## 5    192985
## 6    176580
##                                                              player_url
## 1               https://sofifa.com/player/158023/lionel-messi/20/159586
## 2                https://sofifa.com/player/183277/eden-hazard/20/159586
## 3 https://sofifa.com/player/20801/c-ronaldo-dos-santos-aveiro/20/159586
## 4          https://sofifa.com/player/194765/antoine-griezmann/20/159586
## 5            https://sofifa.com/player/192985/kevin-de-bruyne/20/159586
## 6                https://sofifa.com/player/176580/luis-suarez/20/159586
##          short_name                           long_name age height_cm weight_kg
## 1          L. Messi      Lionel Andrés Messi Cuccittini  32       170        72
## 2         E. Hazard                         Eden Hazard  28       175        74
## 3 Cristiano Ronaldo Cristiano Ronaldo dos Santos Aveiro  34       187        83
## 4      A. Griezmann                   Antoine Griezmann  28       176        73
## 5      K. De Bruyne                     Kevin De Bruyne  28       181        70
## 6         L. Suárez            Luis Alberto Suárez Díaz  32       182        86
##   nationality            club overall potential value_eur wage_eur
## 1   Argentina    FC Barcelona      94        94  95500000   565000
## 2     Belgium     Real Madrid      91        91  90000000   470000
## 3    Portugal        Juventus      93        93  58500000   405000
## 4      France    FC Barcelona      89        89  69000000   370000
## 5     Belgium Manchester City      91        91  90000000   370000
## 6     Uruguay    FC Barcelona      89        89  53000000   355000
##   player_positions preferred_foot international_reputation weak_foot
## 1       RW, CF, ST           Left                        5         4
## 2           LW, CF          Right                        4         4
## 3           ST, LW          Right                        5         4
## 4       CF, ST, LW           Left                        4         3
## 5          CAM, CM          Right                        4         5
## 6               ST          Right                        5         4
##   skill_moves   work_rate  body_type real_face release_clause_eur
## 1           4  Medium/Low      Messi      TRUE          195800000
## 2           4 High/Medium     Normal      TRUE          184500000
## 3           5    High/Low C. Ronaldo      TRUE           96500000
## 4           4   High/High     Normal      TRUE          141500000
## 5           4   High/High     Normal      TRUE          166500000
## 6           3 High/Medium     Normal      TRUE          108700000
##                                                                                               player_tags
## 1 #Dribbler, #Distance Shooter, #Crosser, #FK Specialist, #Acrobat, #Clinical Finisher, #Complete Forward
## 2                                                                         #Speedster, #Dribbler, #Acrobat
## 3               #Speedster, #Dribbler, #Distance Shooter, #Acrobat, #Clinical Finisher, #Complete Forward
## 4                                     #Dribbler, #Engine, #Acrobat, #Clinical Finisher, #Complete Forward
## 5                     #Dribbler, #Playmaker  , #Engine, #Distance Shooter, #Crosser, #Complete Midfielder
## 6                                     #Distance Shooter, #Strength, #Clinical Finisher, #Complete Forward
##   team_position team_jersey_number loaned_from     joined contract_valid_until
## 1            RW                 10             01-07-2004                 2021
## 2            LW                  7             01-07-2019                 2024
## 3            LW                  7             10-07-2018                 2022
## 4            LW                 17             12-07-2019                 2024
## 5           RCM                 17             30-08-2015                 2023
## 6            ST                  9             11-07-2014                 2021
##   nation_position nation_jersey_number pace shooting passing dribbling
## 1                                   NA   87       92      92        96
## 2              LF                   10   91       83      86        94
## 3              LS                    7   90       93      82        89
## 4             CAM                    7   81       86      84        89
## 5             RCM                    7   76       86      92        86
## 6                                   NA   73       89      80        84
##   defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed
## 1        39     66        NA          NA         NA          NA       NA
## 2        35     66        NA          NA         NA          NA       NA
## 3        35     78        NA          NA         NA          NA       NA
## 4        57     72        NA          NA         NA          NA       NA
## 5        61     78        NA          NA         NA          NA       NA
## 6        51     84        NA          NA         NA          NA       NA
##   gk_positioning
## 1             NA
## 2             NA
## 3             NA
## 4             NA
## 5             NA
## 6             NA
##                                                                                                                                         player_traits
## 1 Beat Offside Trap, Argues with Officials, Early Crosser, Finesse Shot, Speed Dribbler (CPU AI Only), 1-on-1 Rush, Giant Throw-in, Outside Foot Shot
## 2                                                             Beat Offside Trap, Selfish, Finesse Shot, Speed Dribbler (CPU AI Only), Crowd Favourite
## 3                                       Long Throw-in, Selfish, Argues with Officials, Early Crosser, Speed Dribbler (CPU AI Only), Skilled Dribbling
## 4                   Beat Offside Trap, Selfish, Argues with Officials, Finesse Shot, Speed Dribbler (CPU AI Only), Outside Foot Shot, Crowd Favourite
## 5                        Power Free-Kick, Avoids Using Weaker Foot, Dives Into Tackles (CPU AI Only), Leadership, Argues with Officials, Finesse Shot
## 6                                                                                                                 Diver, Speed Dribbler (CPU AI Only)
##   attacking_crossing attacking_finishing attacking_heading_accuracy
## 1                 88                  95                         70
## 2                 81                  84                         61
## 3                 84                  94                         89
## 4                 83                  89                         84
## 5                 93                  82                         55
## 6                 78                  91                         83
##   attacking_short_passing attacking_volleys skill_dribbling skill_curve
## 1                      92                88              97          93
## 2                      89                83              95          83
## 3                      83                87              89          81
## 4                      85                87              88          86
## 5                      92                82              86          85
## 6                      82                90              85          86
##   skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration
## 1                94                 92                 96                    91
## 2                79                 83                 94                    94
## 3                76                 77                 92                    89
## 4                85                 82                 90                    82
## 5                83                 91                 91                    77
## 6                82                 72                 84                    76
##   movement_sprint_speed movement_agility movement_reactions movement_balance
## 1                    84               93                 95               95
## 2                    88               95                 90               94
## 3                    91               87                 96               71
## 4                    81               90                 92               83
## 5                    76               78                 91               76
## 6                    70               79                 92               79
##   power_shot_power power_jumping power_stamina power_strength power_long_shots
## 1               86            68            75             68               94
## 2               82            56            84             63               80
## 3               95            95            85             78               93
## 4               82            89            87             63               83
## 5               91            63            89             74               90
## 6               88            69            82             86               86
##   mentality_aggression mentality_interceptions mentality_positioning
## 1                   48                      40                    94
## 2                   54                      41                    87
## 3                   63                      29                    95
## 4                   73                      49                    90
## 5                   76                      61                    88
## 6                   87                      41                    92
##   mentality_vision mentality_penalties mentality_composure defending_marking
## 1               94                  75                  96                33
## 2               89                  88                  91                34
## 3               82                  85                  95                28
## 4               86                  86                  89                59
## 5               94                  79                  91                68
## 6               82                  83                  85                57
##   defending_standing_tackle defending_sliding_tackle goalkeeping_diving
## 1                        37                       26                  6
## 2                        27                       22                 11
## 3                        32                       24                  7
## 4                        54                       49                 14
## 5                        58                       51                 15
## 6                        45                       38                 27
##   goalkeeping_handling goalkeeping_kicking goalkeeping_positioning
## 1                   11                  15                      14
## 2                   12                   6                       8
## 3                   11                  15                      14
## 4                    8                  14                      13
## 5                   13                   5                      10
## 6                   25                  31                      33
##   goalkeeping_reflexes
## 1                    8
## 2                    8
## 3                   11
## 4                   14
## 5                   13
## 6                   37
skim_without_charts(df)
Data summary
Name df
Number of rows 2000
Number of columns 77
_______________________
Column type frequency:
character 15
logical 1
numeric 61
________________________
Group variables None

Variable type: character

skim_variable n_missing complete_rate min max empty n_unique whitespace
player_url 0 1 49 78 0 2000 0
short_name 0 1 4 21 0 1983 0
long_name 0 1 2 42 0 1999 0
nationality 0 1 4 20 0 94 0
club 0 1 3 30 0 166 0
player_positions 0 1 2 12 0 276 0
preferred_foot 0 1 4 5 0 2 0
work_rate 0 1 8 13 0 8 0
body_type 0 1 4 19 0 9 0
player_tags 0 1 0 132 1526 81 0
team_position 0 1 2 3 0 29 0
loaned_from 0 1 0 30 1913 52 0
joined 0 1 0 10 87 575 0
nation_position 0 1 0 3 1523 25 0
player_traits 0 1 0 147 421 531 0

Variable type: logical

skim_variable n_missing complete_rate mean count
real_face 0 1 0.55 TRU: 1095, FAL: 905

Variable type: numeric

skim_variable n_missing complete_rate mean sd p0 p25 p50 p75 p100
sofifa_id 0 1.00 201672.42 25503.67 1179 189168.00 203333 220044.2 251573
age 0 1.00 27.25 3.70 18 25.00 27 30.0 41
height_cm 0 1.00 182.13 6.71 163 177.00 182 187.0 203
weight_kg 0 1.00 76.85 7.24 56 72.00 77 82.0 101
overall 0 1.00 77.18 4.12 66 74.00 77 80.0 94
potential 0 1.00 79.46 4.75 68 76.00 79 83.0 95
value_eur 0 1.00 12309625.00 12005200.19 525000 5500000.00 8500000 14000000.0 105500000
wage_eur 0 1.00 50909.50 45447.01 22000 27000.00 36000 53000.0 565000
international_reputation 0 1.00 1.66 0.79 1 1.00 1 2.0 5
weak_foot 0 1.00 3.20 0.72 1 3.00 3 4.0 5
skill_moves 0 1.00 2.92 0.88 1 2.00 3 3.0 5
release_clause_eur 87 0.96 23669470.99 23545344.26 998000 9900000.00 16200000 27700000.0 195800000
team_jersey_number 0 1.00 16.76 13.89 1 8.00 14 22.0 99
contract_valid_until 0 1.00 2021.65 1.29 2019 2021.00 2022 2023.0 2026
nation_jersey_number 1523 0.24 11.79 6.54 1 6.00 11 18.0 24
pace 137 0.93 70.47 11.94 29 64.00 71 78.0 96
shooting 137 0.93 63.57 13.63 15 56.00 67 73.0 93
passing 137 0.93 68.47 8.86 34 63.00 70 75.0 92
dribbling 137 0.93 72.65 8.82 34 68.00 74 78.0 96
defending 137 0.93 59.85 18.24 18 41.50 68 75.0 90
physic 137 0.93 71.06 8.07 41 67.00 72 77.0 89
gk_diving 1863 0.07 79.07 4.68 69 76.00 79 82.0 90
gk_handling 1863 0.07 76.39 4.99 64 73.00 76 80.0 92
gk_kicking 1863 0.07 72.72 7.42 43 69.00 73 78.0 93
gk_reflexes 1863 0.07 80.83 4.84 70 78.00 81 84.0 92
gk_speed 1863 0.07 46.91 8.17 27 42.00 48 53.0 64
gk_positioning 1863 0.07 77.84 4.88 66 74.00 78 81.0 91
attacking_crossing 0 1.00 61.61 18.39 7 54.00 67 74.0 93
attacking_finishing 0 1.00 57.29 20.02 5 44.00 63 73.0 95
attacking_heading_accuracy 0 1.00 62.08 17.42 7 55.00 66 74.0 93
attacking_short_passing 0 1.00 71.18 12.22 11 68.00 74 78.0 92
attacking_volleys 0 1.00 55.44 19.15 7 43.00 60 70.0 90
skill_dribbling 0 1.00 68.06 17.66 7 65.00 73 78.0 97
skill_curve 0 1.00 60.70 19.03 9 51.75 66 74.0 94
skill_fk_accuracy 0 1.00 53.73 19.00 8 40.00 57 69.0 94
skill_long_passing 0 1.00 64.91 13.34 12 59.00 68 74.0 92
skill_ball_control 0 1.00 71.36 14.83 9 70.00 75 79.0 96
movement_acceleration 0 1.00 68.64 13.64 23 61.00 70 78.0 97
movement_sprint_speed 0 1.00 69.01 13.30 24 62.00 71 78.0 96
movement_agility 0 1.00 68.99 13.33 23 62.00 71 78.0 96
movement_reactions 0 1.00 74.43 5.82 55 71.00 74 78.0 96
movement_balance 0 1.00 66.68 14.18 22 59.00 69 77.0 96
power_shot_power 0 1.00 70.42 11.40 15 65.00 73 78.0 95
power_jumping 0 1.00 69.24 11.65 30 64.00 71 77.0 95
power_stamina 0 1.00 70.59 13.56 17 66.00 73 79.0 97
power_strength 0 1.00 70.56 11.15 29 64.00 72 78.0 95
power_long_shots 0 1.00 60.34 19.11 6 54.00 67 74.0 94
mentality_aggression 0 1.00 66.68 15.94 11 58.00 71 78.0 94
mentality_interceptions 0 1.00 56.75 22.11 9 36.00 67 76.0 92
mentality_positioning 0 1.00 62.49 19.83 5 55.00 69 76.0 95
mentality_vision 0 1.00 65.81 13.19 13 59.00 69 75.0 94
mentality_penalties 0 1.00 57.65 16.18 9 48.00 60 70.0 92
mentality_composure 0 1.00 72.28 7.94 25 68.00 73 78.0 96
defending_marking 0 1.00 56.51 21.42 8 38.00 65 74.0 94
defending_standing_tackle 0 1.00 56.67 23.43 9 35.00 68 77.0 92
defending_sliding_tackle 0 1.00 53.41 24.00 8 29.00 64 74.0 90
goalkeeping_diving 0 1.00 15.31 17.61 1 8.00 11 14.0 90
goalkeeping_handling 0 1.00 15.13 16.95 1 8.00 11 14.0 92
goalkeeping_kicking 0 1.00 14.88 16.12 1 8.00 11 14.0 93
goalkeeping_positioning 0 1.00 15.22 17.32 1 8.00 11 14.0 91
goalkeeping_reflexes 0 1.00 15.35 18.09 1 8.00 11 14.0 92
glimpse(df)
## Rows: 2,000
## Columns: 77
## $ sofifa_id                  <int> 158023, 183277, 20801, 194765, 192985, 1765…
## $ player_url                 <chr> "https://sofifa.com/player/158023/lionel-me…
## $ short_name                 <chr> "L. Messi", "E. Hazard", "Cristiano Ronaldo…
## $ long_name                  <chr> "Lionel Andrés Messi Cuccittini", "Eden Haz…
## $ age                        <int> 32, 28, 34, 28, 28, 32, 33, 29, 31, 30, 33,…
## $ height_cm                  <int> 170, 175, 187, 176, 181, 182, 172, 183, 173…
## $ weight_kg                  <int> 72, 74, 83, 73, 70, 86, 66, 76, 70, 76, 82,…
## $ nationality                <chr> "Argentina", "Belgium", "Portugal", "France…
## $ club                       <chr> "FC Barcelona", "Real Madrid", "Juventus", …
## $ overall                    <int> 94, 91, 93, 89, 91, 89, 90, 88, 89, 89, 89,…
## $ potential                  <int> 94, 91, 93, 89, 91, 89, 90, 88, 89, 89, 89,…
## $ value_eur                  <int> 95500000, 90000000, 58500000, 69000000, 900…
## $ wage_eur                   <int> 565000, 470000, 405000, 370000, 370000, 355…
## $ player_positions           <chr> "RW, CF, ST", "LW, CF", "ST, LW", "CF, ST, …
## $ preferred_foot             <chr> "Left", "Right", "Right", "Left", "Right", …
## $ international_reputation   <int> 5, 4, 5, 4, 4, 5, 4, 4, 4, 4, 4, 5, 4, 4, 4…
## $ weak_foot                  <int> 4, 4, 4, 3, 5, 4, 4, 5, 4, 3, 3, 5, 3, 4, 2…
## $ skill_moves                <int> 4, 4, 5, 4, 4, 3, 4, 3, 4, 3, 3, 5, 2, 4, 4…
## $ work_rate                  <chr> "Medium/Low", "High/Medium", "High/Low", "H…
## $ body_type                  <chr> "Messi", "Normal", "C. Ronaldo", "Normal", …
## $ real_face                  <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, T…
## $ release_clause_eur         <int> 195800000, 184500000, 96500000, 141500000, …
## $ player_tags                <chr> "#Dribbler, #Distance Shooter, #Crosser, #F…
## $ team_position              <chr> "RW", "LW", "LW", "LW", "RCM", "ST", "RCM",…
## $ team_jersey_number         <int> 10, 7, 7, 17, 17, 9, 10, 8, 10, 5, 4, 10, 3…
## $ loaned_from                <chr> "", "", "", "", "", "", "", "", "", "", "",…
## $ joined                     <chr> "01-07-2004", "01-07-2019", "10-07-2018", "…
## $ contract_valid_until       <int> 2021, 2024, 2022, 2024, 2023, 2021, 2020, 2…
## $ nation_position            <chr> "", "LF", "LS", "CAM", "RCM", "", "", "SUB"…
## $ nation_jersey_number       <int> NA, 10, 7, 7, 7, NA, NA, 8, 9, 5, 15, 10, N…
## $ pace                       <int> 87, 91, 90, 81, 76, 73, 74, 45, 80, 42, 72,…
## $ shooting                   <int> 92, 83, 93, 86, 86, 89, 76, 80, 90, 62, 68,…
## $ passing                    <int> 92, 86, 82, 84, 92, 80, 89, 90, 77, 80, 75,…
## $ dribbling                  <int> 96, 94, 89, 89, 86, 84, 89, 81, 88, 80, 73,…
## $ defending                  <int> 39, 35, 35, 57, 61, 51, 72, 70, 33, 85, 87,…
## $ physic                     <int> 66, 66, 78, 72, 78, 84, 66, 69, 74, 80, 85,…
## $ gk_diving                  <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_handling                <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_kicking                 <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_reflexes                <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_speed                   <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_positioning             <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ player_traits              <chr> "Beat Offside Trap, Argues with Officials, …
## $ attacking_crossing         <int> 88, 81, 84, 83, 93, 78, 86, 88, 70, 62, 66,…
## $ attacking_finishing        <int> 95, 84, 94, 89, 82, 91, 72, 75, 93, 67, 63,…
## $ attacking_heading_accuracy <int> 70, 61, 89, 84, 55, 83, 55, 58, 78, 68, 92,…
## $ attacking_short_passing    <int> 92, 89, 83, 85, 92, 82, 92, 91, 83, 89, 80,…
## $ attacking_volleys          <int> 88, 83, 87, 87, 82, 90, 76, 82, 85, 44, 69,…
## $ skill_dribbling            <int> 97, 95, 89, 88, 86, 85, 87, 80, 88, 80, 65,…
## $ skill_curve                <int> 93, 83, 81, 86, 85, 86, 85, 86, 83, 66, 74,…
## $ skill_fk_accuracy          <int> 94, 79, 76, 85, 83, 82, 78, 84, 73, 68, 72,…
## $ skill_long_passing         <int> 92, 83, 77, 82, 91, 72, 88, 92, 64, 82, 83,…
## $ skill_ball_control         <int> 96, 94, 92, 90, 91, 84, 92, 89, 89, 88, 83,…
## $ movement_acceleration      <int> 91, 94, 89, 82, 77, 76, 77, 50, 82, 40, 74,…
## $ movement_sprint_speed      <int> 84, 88, 91, 81, 76, 70, 71, 41, 78, 43, 71,…
## $ movement_agility           <int> 93, 95, 87, 90, 78, 79, 92, 60, 84, 67, 78,…
## $ movement_reactions         <int> 95, 90, 96, 92, 91, 92, 89, 87, 92, 87, 87,…
## $ movement_balance           <int> 95, 94, 71, 83, 76, 79, 93, 71, 91, 49, 66,…
## $ power_shot_power           <int> 86, 82, 95, 82, 91, 88, 79, 87, 89, 61, 79,…
## $ power_jumping              <int> 68, 56, 95, 89, 63, 69, 68, 30, 81, 66, 93,…
## $ power_stamina              <int> 75, 84, 85, 87, 89, 82, 85, 74, 79, 86, 80,…
## $ power_strength             <int> 68, 63, 78, 63, 74, 86, 58, 73, 74, 77, 85,…
## $ power_long_shots           <int> 94, 80, 93, 83, 90, 86, 82, 86, 84, 54, 62,…
## $ mentality_aggression       <int> 48, 54, 63, 73, 76, 87, 62, 60, 65, 85, 90,…
## $ mentality_interceptions    <int> 40, 41, 29, 49, 61, 41, 82, 76, 24, 89, 88,…
## $ mentality_positioning      <int> 94, 87, 95, 90, 88, 92, 79, 75, 93, 77, 67,…
## $ mentality_vision           <int> 94, 89, 82, 86, 94, 82, 91, 89, 83, 86, 71,…
## $ mentality_penalties        <int> 75, 88, 85, 86, 79, 83, 82, 73, 83, 60, 86,…
## $ mentality_composure        <int> 96, 91, 95, 89, 91, 85, 92, 88, 90, 93, 84,…
## $ defending_marking          <int> 33, 34, 28, 59, 68, 57, 68, 72, 30, 90, 85,…
## $ defending_standing_tackle  <int> 37, 27, 32, 54, 58, 45, 76, 70, 29, 86, 87,…
## $ defending_sliding_tackle   <int> 26, 22, 24, 49, 51, 38, 71, 62, 24, 80, 90,…
## $ goalkeeping_diving         <int> 6, 11, 7, 14, 15, 27, 13, 10, 13, 5, 11, 9,…
## $ goalkeeping_handling       <int> 11, 12, 11, 8, 13, 25, 9, 11, 15, 8, 8, 9, …
## $ goalkeeping_kicking        <int> 15, 6, 15, 14, 5, 31, 7, 13, 6, 13, 9, 15, …
## $ goalkeeping_positioning    <int> 14, 8, 14, 13, 10, 33, 14, 7, 11, 9, 7, 15,…
## $ goalkeeping_reflexes       <int> 8, 8, 11, 14, 13, 37, 9, 10, 14, 13, 11, 11…

Plotting the graphs

With the data loaded, I started plotting with the ggplot() function from ggplot2, which is part of the tidyverse.

First, I drew a bar chart of potential. It shows how many players have each potential rating, and therefore which rating is the most common.

# the title must be added with `+`; called on its own, ggtitle() only prints a labels object
ggplot(data = df) +
  geom_bar(mapping = aes(x = potential, fill = after_stat(count))) +
  ggtitle("Potential distribution chart")
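The bar heights can also be read off as numbers. A minimal sketch using dplyr's count(); `toy` is a hypothetical six-row stand-in for df, filled with the potential values of the first six players shown above.

```r
library(dplyr)

# Hypothetical stand-in for df: potential of the first six players in this post
toy <- data.frame(potential = c(94, 91, 93, 89, 91, 89))

# One row per potential rating, most frequent first
potential_counts <- toy %>%
  count(potential, sort = TRUE)

print(potential_counts)
```

Running the same count() on df would give the most common potential rating in its first row.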

Then I built a table, tt, of the ten highest-paid players along with their potential. These are simply the first 10 rows of the data frame: the SQL query had already sorted the data by wage in descending order, which puts the top earners first.

tt <- head(df,10) %>% select(short_name, wage_eur, value_eur, age,club, potential, player_positions)
head(tt)
##          short_name wage_eur value_eur age            club potential
## 1          L. Messi   565000  95500000  32    FC Barcelona        94
## 2         E. Hazard   470000  90000000  28     Real Madrid        91
## 3 Cristiano Ronaldo   405000  58500000  34        Juventus        93
## 4      A. Griezmann   370000  69000000  28    FC Barcelona        89
## 5      K. De Bruyne   370000  90000000  28 Manchester City        91
## 6         L. Suárez   355000  53000000  32    FC Barcelona        89
##   player_positions
## 1       RW, CF, ST
## 2           LW, CF
## 3           ST, LW
## 4       CF, ST, LW
## 5          CAM, CM
## 6               ST
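Taking the first rows only works if the file really is sorted by wage. A quick way to check that assumption is sketched below; the `wage_eur` vector is a stand-in holding the six wages printed above, and on the real data you would pass df$wage_eur instead.

```r
# Wages of the first six players, in file order (copied from the output above)
wage_eur <- c(565000, 470000, 405000, 370000, 370000, 355000)

# is.unsorted() tests for non-decreasing order, so negate the values
# to test for non-increasing (descending) order; ties are allowed
sorted_desc <- !is.unsorted(-wage_eur)
print(sorted_desc)

# A shuffled vector fails the same check
shuffled <- c(370000, 565000)
print(!is.unsorted(-shuffled))
```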

Then I plotted the same kind of bar chart for the age attribute.

# the title must be added with `+`; called on its own, ggtitle() only prints a labels object
ggplot(data = df) +
  geom_bar(mapping = aes(x = age, fill = after_stat(count))) +
  ggtitle("Age distribution chart")

Then I did the same for the weight attribute. Together, these three charts give a quick overview of how the players are distributed by potential, age and weight. Skipping the remaining attributes, let us move on.

# the title must be added with `+`; called on its own, ggtitle() only prints a labels object
ggplot(data = df) +
  geom_bar(mapping = aes(x = weight_kg, fill = after_stat(count))) +
  ggtitle("Weight distribution chart")

Number of players by playing position. Based on the above graph, we'd expect some specific midfield position to have the highest count, but here the number of centre-backs is the highest, followed by the number of strikers.

ggplot(data = df) +
  geom_bar(mapping = aes(y = team_position, fill = after_stat(count))) +
  ggtitle("Distribution of players based on preferred position")
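Note that the chart uses team_position, which holds one slot per player, whereas player_positions lists every position a player prefers, separated by commas. If you want counts by preferred position instead, one possible approach (an assumption on my part, not what the chart above does) is to split that column first with tidyr's separate_rows(). `toy` is a hypothetical three-row stand-in for df, with values from the output above.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for df: first three players from this post
toy <- data.frame(
  short_name       = c("L. Messi", "E. Hazard", "Cristiano Ronaldo"),
  player_positions = c("RW, CF, ST", "LW, CF", "ST, LW"),
  stringsAsFactors = FALSE
)

# One row per (player, position), then count players per position
position_counts <- toy %>%
  separate_rows(player_positions, sep = ", ") %>%
  count(player_positions, sort = TRUE)

print(position_counts)
```

With this approach a player who lists three positions is counted three times, so the totals are position mentions rather than players.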

Then I looked at which foot the majority of players prefer, and clearly found that most players are right-footed. I also used the potential attribute to see which foot performs better.

ggplot(data = df ,aes(x = preferred_foot, y = potential, color = preferred_foot)) +
  geom_jitter() +ggtitle("Distribution of players based on preferred foot")
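A jitter plot shows the spread, but a small summary table makes the comparison between feet easier to state. A minimal sketch with group_by() and summarise(); `toy` is a hypothetical stand-in for df holding the first six players from the output above, so the numbers it prints say nothing about the full data set.

```r
library(dplyr)

# Hypothetical stand-in for df: preferred foot and potential of the first six players
toy <- data.frame(
  preferred_foot = c("Left", "Right", "Right", "Left", "Right", "Right"),
  potential      = c(94, 91, 93, 89, 91, 89),
  stringsAsFactors = FALSE
)

# Number of players and average potential per preferred foot
foot_summary <- toy %>%
  group_by(preferred_foot) %>%
  summarise(
    players        = n(),
    mean_potential = mean(potential)
  )

print(foot_summary)
```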

Then I compared wages across clubs, using grid.arrange() from the gridExtra package to stack two wage plots: one for the hundred highest-paid players and one for the ten highest-paid.

First, install gridExtra (and ggplot2, if it is not already installed).

Type this in the console:

install.packages("ggplot2")
library(ggplot2)

install.packages("gridExtra")
library(gridExtra)

Now the code:

# the data are already sorted by wage, so head() returns the highest-paid players
mm <- head(df, 100) %>% select(short_name, wage_eur, value_eur, age, club, potential, player_positions)
pp <- head(df, 10) %>% select(short_name, wage_eur, value_eur, age, club, potential, player_positions)

# a single jittered layer coloured by club; geom_point() plus geom_jitter() would draw every player twice
g1 <- ggplot(data = mm, aes(y = club, x = wage_eur)) +
  geom_jitter(aes(color = club)) +
  ggtitle("Clubs of the hundred highest-paid players")
g2 <- ggplot(data = pp, aes(y = club, x = wage_eur)) +
  geom_jitter(aes(color = club)) +
  ggtitle("Clubs of the ten highest-paid players")
grid.arrange(g1, g2, ncol = 1)
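The plots above show one point per player. To get an actual wage figure per club, the wages can be summed by club. A minimal sketch, assuming a total is what you want; `toy` is a hypothetical stand-in for df holding the first six players from the output above.

```r
library(dplyr)

# Hypothetical stand-in for df: club and wage of the first six players in this post
toy <- data.frame(
  club     = c("FC Barcelona", "Real Madrid", "Juventus",
               "FC Barcelona", "Manchester City", "FC Barcelona"),
  wage_eur = c(565000, 470000, 405000, 370000, 370000, 355000),
  stringsAsFactors = FALSE
)

# Total wage and number of players per club, highest wage bill first
club_wages <- toy %>%
  group_by(club) %>%
  summarise(
    total_wage_eur = sum(wage_eur),
    players        = n()
  ) %>%
  arrange(desc(total_wage_eur))

print(club_wages)
```

Keep in mind that on the 2,000-row extract this only sums the players who made the cut, not each club's full squad.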

Then I looked at the potential of the ten highest-paid players of 2020, filled by age, as a polar bar chart.

tal <- head(df, 10) %>% select(short_name, wage_eur, value_eur, age, club, potential, player_positions)


ggplot(data = tal, aes(x = factor(potential), fill = factor(age))) +
  geom_bar(width = 1) + coord_polar(theta = "y")

Age vs Overall of players divided amongst wage brackets. The highest wages are commanded by players of overall 85+ and age around 30 years. Cristiano Ronaldo is one of the three purple dots up there. Guess the other two in comments section below. :P

g_age_overall <- ggplot(df, aes(age, overall))
g_age_overall + geom_point(aes(color = age)) + geom_smooth(color="darkblue") + 
  ggtitle("Distribution between Age and Overall of players based on Value bracket")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
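The plot above colours the points by age. To colour by wage bracket instead, wage_eur first has to be binned, for example with cut(). The sketch below is only an illustration: the break points and labels are my own assumptions, not values from the dataset, and `wage_eur` is a small stand-in vector built from wages that appear in this post.

```r
# Stand-in wages taken from this post (five top earners plus the median wage)
wage_eur <- c(565000, 470000, 405000, 370000, 355000, 36000)

# Hypothetical bracket boundaries in EUR; adjust to taste
bracket_breaks <- c(0, 100000, 400000, Inf)
bracket_labels <- c("0-100k", "100k-400k", "400k+")

# cut() assigns each wage to the interval (lower, upper] it falls in
wage_bracket <- cut(wage_eur, breaks = bracket_breaks, labels = bracket_labels)

print(table(wage_bracket))
```

On the real data, the result could be stored as a new column of df and mapped to colour with aes(color = wage_bracket).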

I also brought some of the data into Tableau, so that the charts are easy for a viewer to grasp, following the five-second rule.

Tableau-generated maps and graphs

Average age of footballers by country, worldwide

Sum of footballers' wages by country, worldwide

Thank you for reading. You can find the source code (R notebook) on RPubs and GitHub. You can find me on <LinkedIn>, <Twitter>, <gitlab> and <Github>.
