With the FIFA World Cup 2022 around the corner, I combined my knowledge of football and data analytics to whip up a short exploratory analysis of the FIFA 20 dataset using R. I used the non-physical player attributes such as short_name, wage_eur, value_eur, age, club, potential, player_positions, nationality, and overall.
I used the dataset available on Kaggle, which contains 17,000+ players featured in FIFA 20, each with more than 70 attributes. It is scraped from the website SoFIFA.
Before moving on to the analysis, I first cleaned the data with a SQL query in Google Cloud BigQuery ["now you'll say, 'BigQuery, and for free too?', and I'll say, 'that's just our style'"] and kept only the 2,000 highest-paid players because ["I had to do something in the cleaning step too, right?"].
SELECT *
FROM `fifa`
ORDER BY wage_eur DESC
LIMIT 2000
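For readers without BigQuery access, the same "sort by wage, keep the top rows" step can be sketched in R with dplyr. This is only an illustration on a made-up five-row data frame; `players` and `top_n_by_wage` are hypothetical names and not part of the original workflow.

```r
library(dplyr)

# Toy stand-in for the full FIFA 20 table (values are made up).
players <- data.frame(
  short_name = c("A", "B", "C", "D", "E"),
  wage_eur   = c(30000, 565000, 22000, 470000, 36000),
  stringsAsFactors = FALSE
)

# Rough equivalent of: SELECT * FROM fifa ORDER BY wage_eur DESC LIMIT n
top_n_by_wage <- function(data, n) {
  data %>%
    arrange(desc(wage_eur)) %>%
    head(n)
}

top3 <- top_n_by_wage(players, 3)
```

With the real table, calling `top_n_by_wage(full_table, 2000)` should give the same kind of extract as the query above, assuming `full_table` holds the complete Kaggle file.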
Before we start the analysis, let’s import the libraries.
#### type this in console
library(tidyverse)
## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.2 ──
## ✔ ggplot2 3.3.6 ✔ purrr 0.3.4
## ✔ tibble 3.1.8 ✔ dplyr 1.0.10
## ✔ tidyr 1.2.1 ✔ stringr 1.4.1
## ✔ readr 2.1.2 ✔ forcats 0.5.2
## ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
library(skimr)
library(Tmisc)
library(ggplot2)
library(dplyr)
library(gridExtra)
##
## Attaching package: 'gridExtra'
##
## The following object is masked from 'package:dplyr':
##
## combine
library(tidyselect)
# Load the cleaned 2,000-player extract; read.csv() already returns a data frame
df <- read.csv(file = "table1.csv", stringsAsFactors = FALSE)
head(df)
## sofifa_id
## 1 158023
## 2 183277
## 3 20801
## 4 194765
## 5 192985
## 6 176580
## player_url
## 1 https://sofifa.com/player/158023/lionel-messi/20/159586
## 2 https://sofifa.com/player/183277/eden-hazard/20/159586
## 3 https://sofifa.com/player/20801/c-ronaldo-dos-santos-aveiro/20/159586
## 4 https://sofifa.com/player/194765/antoine-griezmann/20/159586
## 5 https://sofifa.com/player/192985/kevin-de-bruyne/20/159586
## 6 https://sofifa.com/player/176580/luis-suarez/20/159586
## short_name long_name age height_cm weight_kg
## 1 L. Messi Lionel Andrés Messi Cuccittini 32 170 72
## 2 E. Hazard Eden Hazard 28 175 74
## 3 Cristiano Ronaldo Cristiano Ronaldo dos Santos Aveiro 34 187 83
## 4 A. Griezmann Antoine Griezmann 28 176 73
## 5 K. De Bruyne Kevin De Bruyne 28 181 70
## 6 L. Suárez Luis Alberto Suárez Díaz 32 182 86
## nationality club overall potential value_eur wage_eur
## 1 Argentina FC Barcelona 94 94 95500000 565000
## 2 Belgium Real Madrid 91 91 90000000 470000
## 3 Portugal Juventus 93 93 58500000 405000
## 4 France FC Barcelona 89 89 69000000 370000
## 5 Belgium Manchester City 91 91 90000000 370000
## 6 Uruguay FC Barcelona 89 89 53000000 355000
## player_positions preferred_foot international_reputation weak_foot
## 1 RW, CF, ST Left 5 4
## 2 LW, CF Right 4 4
## 3 ST, LW Right 5 4
## 4 CF, ST, LW Left 4 3
## 5 CAM, CM Right 4 5
## 6 ST Right 5 4
## skill_moves work_rate body_type real_face release_clause_eur
## 1 4 Medium/Low Messi TRUE 195800000
## 2 4 High/Medium Normal TRUE 184500000
## 3 5 High/Low C. Ronaldo TRUE 96500000
## 4 4 High/High Normal TRUE 141500000
## 5 4 High/High Normal TRUE 166500000
## 6 3 High/Medium Normal TRUE 108700000
## player_tags
## 1 #Dribbler, #Distance Shooter, #Crosser, #FK Specialist, #Acrobat, #Clinical Finisher, #Complete Forward
## 2 #Speedster, #Dribbler, #Acrobat
## 3 #Speedster, #Dribbler, #Distance Shooter, #Acrobat, #Clinical Finisher, #Complete Forward
## 4 #Dribbler, #Engine, #Acrobat, #Clinical Finisher, #Complete Forward
## 5 #Dribbler, #Playmaker , #Engine, #Distance Shooter, #Crosser, #Complete Midfielder
## 6 #Distance Shooter, #Strength, #Clinical Finisher, #Complete Forward
## team_position team_jersey_number loaned_from joined contract_valid_until
## 1 RW 10 01-07-2004 2021
## 2 LW 7 01-07-2019 2024
## 3 LW 7 10-07-2018 2022
## 4 LW 17 12-07-2019 2024
## 5 RCM 17 30-08-2015 2023
## 6 ST 9 11-07-2014 2021
## nation_position nation_jersey_number pace shooting passing dribbling
## 1 NA 87 92 92 96
## 2 LF 10 91 83 86 94
## 3 LS 7 90 93 82 89
## 4 CAM 7 81 86 84 89
## 5 RCM 7 76 86 92 86
## 6 NA 73 89 80 84
## defending physic gk_diving gk_handling gk_kicking gk_reflexes gk_speed
## 1 39 66 NA NA NA NA NA
## 2 35 66 NA NA NA NA NA
## 3 35 78 NA NA NA NA NA
## 4 57 72 NA NA NA NA NA
## 5 61 78 NA NA NA NA NA
## 6 51 84 NA NA NA NA NA
## gk_positioning
## 1 NA
## 2 NA
## 3 NA
## 4 NA
## 5 NA
## 6 NA
## player_traits
## 1 Beat Offside Trap, Argues with Officials, Early Crosser, Finesse Shot, Speed Dribbler (CPU AI Only), 1-on-1 Rush, Giant Throw-in, Outside Foot Shot
## 2 Beat Offside Trap, Selfish, Finesse Shot, Speed Dribbler (CPU AI Only), Crowd Favourite
## 3 Long Throw-in, Selfish, Argues with Officials, Early Crosser, Speed Dribbler (CPU AI Only), Skilled Dribbling
## 4 Beat Offside Trap, Selfish, Argues with Officials, Finesse Shot, Speed Dribbler (CPU AI Only), Outside Foot Shot, Crowd Favourite
## 5 Power Free-Kick, Avoids Using Weaker Foot, Dives Into Tackles (CPU AI Only), Leadership, Argues with Officials, Finesse Shot
## 6 Diver, Speed Dribbler (CPU AI Only)
## attacking_crossing attacking_finishing attacking_heading_accuracy
## 1 88 95 70
## 2 81 84 61
## 3 84 94 89
## 4 83 89 84
## 5 93 82 55
## 6 78 91 83
## attacking_short_passing attacking_volleys skill_dribbling skill_curve
## 1 92 88 97 93
## 2 89 83 95 83
## 3 83 87 89 81
## 4 85 87 88 86
## 5 92 82 86 85
## 6 82 90 85 86
## skill_fk_accuracy skill_long_passing skill_ball_control movement_acceleration
## 1 94 92 96 91
## 2 79 83 94 94
## 3 76 77 92 89
## 4 85 82 90 82
## 5 83 91 91 77
## 6 82 72 84 76
## movement_sprint_speed movement_agility movement_reactions movement_balance
## 1 84 93 95 95
## 2 88 95 90 94
## 3 91 87 96 71
## 4 81 90 92 83
## 5 76 78 91 76
## 6 70 79 92 79
## power_shot_power power_jumping power_stamina power_strength power_long_shots
## 1 86 68 75 68 94
## 2 82 56 84 63 80
## 3 95 95 85 78 93
## 4 82 89 87 63 83
## 5 91 63 89 74 90
## 6 88 69 82 86 86
## mentality_aggression mentality_interceptions mentality_positioning
## 1 48 40 94
## 2 54 41 87
## 3 63 29 95
## 4 73 49 90
## 5 76 61 88
## 6 87 41 92
## mentality_vision mentality_penalties mentality_composure defending_marking
## 1 94 75 96 33
## 2 89 88 91 34
## 3 82 85 95 28
## 4 86 86 89 59
## 5 94 79 91 68
## 6 82 83 85 57
## defending_standing_tackle defending_sliding_tackle goalkeeping_diving
## 1 37 26 6
## 2 27 22 11
## 3 32 24 7
## 4 54 49 14
## 5 58 51 15
## 6 45 38 27
## goalkeeping_handling goalkeeping_kicking goalkeeping_positioning
## 1 11 15 14
## 2 12 6 8
## 3 11 15 14
## 4 8 14 13
## 5 13 5 10
## 6 25 31 33
## goalkeeping_reflexes
## 1 8
## 2 8
## 3 11
## 4 14
## 5 13
## 6 37
skim_without_charts(df)
Name | df |
Number of rows | 2000 |
Number of columns | 77 |
_______________________ | |
Column type frequency: | |
character | 15 |
logical | 1 |
numeric | 61 |
________________________ | |
Group variables | None |
Variable type: character
skim_variable | n_missing | complete_rate | min | max | empty | n_unique | whitespace |
---|---|---|---|---|---|---|---|
player_url | 0 | 1 | 49 | 78 | 0 | 2000 | 0 |
short_name | 0 | 1 | 4 | 21 | 0 | 1983 | 0 |
long_name | 0 | 1 | 2 | 42 | 0 | 1999 | 0 |
nationality | 0 | 1 | 4 | 20 | 0 | 94 | 0 |
club | 0 | 1 | 3 | 30 | 0 | 166 | 0 |
player_positions | 0 | 1 | 2 | 12 | 0 | 276 | 0 |
preferred_foot | 0 | 1 | 4 | 5 | 0 | 2 | 0 |
work_rate | 0 | 1 | 8 | 13 | 0 | 8 | 0 |
body_type | 0 | 1 | 4 | 19 | 0 | 9 | 0 |
player_tags | 0 | 1 | 0 | 132 | 1526 | 81 | 0 |
team_position | 0 | 1 | 2 | 3 | 0 | 29 | 0 |
loaned_from | 0 | 1 | 0 | 30 | 1913 | 52 | 0 |
joined | 0 | 1 | 0 | 10 | 87 | 575 | 0 |
nation_position | 0 | 1 | 0 | 3 | 1523 | 25 | 0 |
player_traits | 0 | 1 | 0 | 147 | 421 | 531 | 0 |
Variable type: logical
skim_variable | n_missing | complete_rate | mean | count |
---|---|---|---|---|
real_face | 0 | 1 | 0.55 | TRU: 1095, FAL: 905 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 |
---|---|---|---|---|---|---|---|---|---|
sofifa_id | 0 | 1.00 | 201672.42 | 25503.67 | 1179 | 189168.00 | 203333 | 220044.2 | 251573 |
age | 0 | 1.00 | 27.25 | 3.70 | 18 | 25.00 | 27 | 30.0 | 41 |
height_cm | 0 | 1.00 | 182.13 | 6.71 | 163 | 177.00 | 182 | 187.0 | 203 |
weight_kg | 0 | 1.00 | 76.85 | 7.24 | 56 | 72.00 | 77 | 82.0 | 101 |
overall | 0 | 1.00 | 77.18 | 4.12 | 66 | 74.00 | 77 | 80.0 | 94 |
potential | 0 | 1.00 | 79.46 | 4.75 | 68 | 76.00 | 79 | 83.0 | 95 |
value_eur | 0 | 1.00 | 12309625.00 | 12005200.19 | 525000 | 5500000.00 | 8500000 | 14000000.0 | 105500000 |
wage_eur | 0 | 1.00 | 50909.50 | 45447.01 | 22000 | 27000.00 | 36000 | 53000.0 | 565000 |
international_reputation | 0 | 1.00 | 1.66 | 0.79 | 1 | 1.00 | 1 | 2.0 | 5 |
weak_foot | 0 | 1.00 | 3.20 | 0.72 | 1 | 3.00 | 3 | 4.0 | 5 |
skill_moves | 0 | 1.00 | 2.92 | 0.88 | 1 | 2.00 | 3 | 3.0 | 5 |
release_clause_eur | 87 | 0.96 | 23669470.99 | 23545344.26 | 998000 | 9900000.00 | 16200000 | 27700000.0 | 195800000 |
team_jersey_number | 0 | 1.00 | 16.76 | 13.89 | 1 | 8.00 | 14 | 22.0 | 99 |
contract_valid_until | 0 | 1.00 | 2021.65 | 1.29 | 2019 | 2021.00 | 2022 | 2023.0 | 2026 |
nation_jersey_number | 1523 | 0.24 | 11.79 | 6.54 | 1 | 6.00 | 11 | 18.0 | 24 |
pace | 137 | 0.93 | 70.47 | 11.94 | 29 | 64.00 | 71 | 78.0 | 96 |
shooting | 137 | 0.93 | 63.57 | 13.63 | 15 | 56.00 | 67 | 73.0 | 93 |
passing | 137 | 0.93 | 68.47 | 8.86 | 34 | 63.00 | 70 | 75.0 | 92 |
dribbling | 137 | 0.93 | 72.65 | 8.82 | 34 | 68.00 | 74 | 78.0 | 96 |
defending | 137 | 0.93 | 59.85 | 18.24 | 18 | 41.50 | 68 | 75.0 | 90 |
physic | 137 | 0.93 | 71.06 | 8.07 | 41 | 67.00 | 72 | 77.0 | 89 |
gk_diving | 1863 | 0.07 | 79.07 | 4.68 | 69 | 76.00 | 79 | 82.0 | 90 |
gk_handling | 1863 | 0.07 | 76.39 | 4.99 | 64 | 73.00 | 76 | 80.0 | 92 |
gk_kicking | 1863 | 0.07 | 72.72 | 7.42 | 43 | 69.00 | 73 | 78.0 | 93 |
gk_reflexes | 1863 | 0.07 | 80.83 | 4.84 | 70 | 78.00 | 81 | 84.0 | 92 |
gk_speed | 1863 | 0.07 | 46.91 | 8.17 | 27 | 42.00 | 48 | 53.0 | 64 |
gk_positioning | 1863 | 0.07 | 77.84 | 4.88 | 66 | 74.00 | 78 | 81.0 | 91 |
attacking_crossing | 0 | 1.00 | 61.61 | 18.39 | 7 | 54.00 | 67 | 74.0 | 93 |
attacking_finishing | 0 | 1.00 | 57.29 | 20.02 | 5 | 44.00 | 63 | 73.0 | 95 |
attacking_heading_accuracy | 0 | 1.00 | 62.08 | 17.42 | 7 | 55.00 | 66 | 74.0 | 93 |
attacking_short_passing | 0 | 1.00 | 71.18 | 12.22 | 11 | 68.00 | 74 | 78.0 | 92 |
attacking_volleys | 0 | 1.00 | 55.44 | 19.15 | 7 | 43.00 | 60 | 70.0 | 90 |
skill_dribbling | 0 | 1.00 | 68.06 | 17.66 | 7 | 65.00 | 73 | 78.0 | 97 |
skill_curve | 0 | 1.00 | 60.70 | 19.03 | 9 | 51.75 | 66 | 74.0 | 94 |
skill_fk_accuracy | 0 | 1.00 | 53.73 | 19.00 | 8 | 40.00 | 57 | 69.0 | 94 |
skill_long_passing | 0 | 1.00 | 64.91 | 13.34 | 12 | 59.00 | 68 | 74.0 | 92 |
skill_ball_control | 0 | 1.00 | 71.36 | 14.83 | 9 | 70.00 | 75 | 79.0 | 96 |
movement_acceleration | 0 | 1.00 | 68.64 | 13.64 | 23 | 61.00 | 70 | 78.0 | 97 |
movement_sprint_speed | 0 | 1.00 | 69.01 | 13.30 | 24 | 62.00 | 71 | 78.0 | 96 |
movement_agility | 0 | 1.00 | 68.99 | 13.33 | 23 | 62.00 | 71 | 78.0 | 96 |
movement_reactions | 0 | 1.00 | 74.43 | 5.82 | 55 | 71.00 | 74 | 78.0 | 96 |
movement_balance | 0 | 1.00 | 66.68 | 14.18 | 22 | 59.00 | 69 | 77.0 | 96 |
power_shot_power | 0 | 1.00 | 70.42 | 11.40 | 15 | 65.00 | 73 | 78.0 | 95 |
power_jumping | 0 | 1.00 | 69.24 | 11.65 | 30 | 64.00 | 71 | 77.0 | 95 |
power_stamina | 0 | 1.00 | 70.59 | 13.56 | 17 | 66.00 | 73 | 79.0 | 97 |
power_strength | 0 | 1.00 | 70.56 | 11.15 | 29 | 64.00 | 72 | 78.0 | 95 |
power_long_shots | 0 | 1.00 | 60.34 | 19.11 | 6 | 54.00 | 67 | 74.0 | 94 |
mentality_aggression | 0 | 1.00 | 66.68 | 15.94 | 11 | 58.00 | 71 | 78.0 | 94 |
mentality_interceptions | 0 | 1.00 | 56.75 | 22.11 | 9 | 36.00 | 67 | 76.0 | 92 |
mentality_positioning | 0 | 1.00 | 62.49 | 19.83 | 5 | 55.00 | 69 | 76.0 | 95 |
mentality_vision | 0 | 1.00 | 65.81 | 13.19 | 13 | 59.00 | 69 | 75.0 | 94 |
mentality_penalties | 0 | 1.00 | 57.65 | 16.18 | 9 | 48.00 | 60 | 70.0 | 92 |
mentality_composure | 0 | 1.00 | 72.28 | 7.94 | 25 | 68.00 | 73 | 78.0 | 96 |
defending_marking | 0 | 1.00 | 56.51 | 21.42 | 8 | 38.00 | 65 | 74.0 | 94 |
defending_standing_tackle | 0 | 1.00 | 56.67 | 23.43 | 9 | 35.00 | 68 | 77.0 | 92 |
defending_sliding_tackle | 0 | 1.00 | 53.41 | 24.00 | 8 | 29.00 | 64 | 74.0 | 90 |
goalkeeping_diving | 0 | 1.00 | 15.31 | 17.61 | 1 | 8.00 | 11 | 14.0 | 90 |
goalkeeping_handling | 0 | 1.00 | 15.13 | 16.95 | 1 | 8.00 | 11 | 14.0 | 92 |
goalkeeping_kicking | 0 | 1.00 | 14.88 | 16.12 | 1 | 8.00 | 11 | 14.0 | 93 |
goalkeeping_positioning | 0 | 1.00 | 15.22 | 17.32 | 1 | 8.00 | 11 | 14.0 | 91 |
goalkeeping_reflexes | 0 | 1.00 | 15.35 | 18.09 | 1 | 8.00 | 11 | 14.0 | 92 |
glimpse(df)
## Rows: 2,000
## Columns: 77
## $ sofifa_id <int> 158023, 183277, 20801, 194765, 192985, 1765…
## $ player_url <chr> "https://sofifa.com/player/158023/lionel-me…
## $ short_name <chr> "L. Messi", "E. Hazard", "Cristiano Ronaldo…
## $ long_name <chr> "Lionel Andrés Messi Cuccittini", "Eden Haz…
## $ age <int> 32, 28, 34, 28, 28, 32, 33, 29, 31, 30, 33,…
## $ height_cm <int> 170, 175, 187, 176, 181, 182, 172, 183, 173…
## $ weight_kg <int> 72, 74, 83, 73, 70, 86, 66, 76, 70, 76, 82,…
## $ nationality <chr> "Argentina", "Belgium", "Portugal", "France…
## $ club <chr> "FC Barcelona", "Real Madrid", "Juventus", …
## $ overall <int> 94, 91, 93, 89, 91, 89, 90, 88, 89, 89, 89,…
## $ potential <int> 94, 91, 93, 89, 91, 89, 90, 88, 89, 89, 89,…
## $ value_eur <int> 95500000, 90000000, 58500000, 69000000, 900…
## $ wage_eur <int> 565000, 470000, 405000, 370000, 370000, 355…
## $ player_positions <chr> "RW, CF, ST", "LW, CF", "ST, LW", "CF, ST, …
## $ preferred_foot <chr> "Left", "Right", "Right", "Left", "Right", …
## $ international_reputation <int> 5, 4, 5, 4, 4, 5, 4, 4, 4, 4, 4, 5, 4, 4, 4…
## $ weak_foot <int> 4, 4, 4, 3, 5, 4, 4, 5, 4, 3, 3, 5, 3, 4, 2…
## $ skill_moves <int> 4, 4, 5, 4, 4, 3, 4, 3, 4, 3, 3, 5, 2, 4, 4…
## $ work_rate <chr> "Medium/Low", "High/Medium", "High/Low", "H…
## $ body_type <chr> "Messi", "Normal", "C. Ronaldo", "Normal", …
## $ real_face <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, T…
## $ release_clause_eur <int> 195800000, 184500000, 96500000, 141500000, …
## $ player_tags <chr> "#Dribbler, #Distance Shooter, #Crosser, #F…
## $ team_position <chr> "RW", "LW", "LW", "LW", "RCM", "ST", "RCM",…
## $ team_jersey_number <int> 10, 7, 7, 17, 17, 9, 10, 8, 10, 5, 4, 10, 3…
## $ loaned_from <chr> "", "", "", "", "", "", "", "", "", "", "",…
## $ joined <chr> "01-07-2004", "01-07-2019", "10-07-2018", "…
## $ contract_valid_until <int> 2021, 2024, 2022, 2024, 2023, 2021, 2020, 2…
## $ nation_position <chr> "", "LF", "LS", "CAM", "RCM", "", "", "SUB"…
## $ nation_jersey_number <int> NA, 10, 7, 7, 7, NA, NA, 8, 9, 5, 15, 10, N…
## $ pace <int> 87, 91, 90, 81, 76, 73, 74, 45, 80, 42, 72,…
## $ shooting <int> 92, 83, 93, 86, 86, 89, 76, 80, 90, 62, 68,…
## $ passing <int> 92, 86, 82, 84, 92, 80, 89, 90, 77, 80, 75,…
## $ dribbling <int> 96, 94, 89, 89, 86, 84, 89, 81, 88, 80, 73,…
## $ defending <int> 39, 35, 35, 57, 61, 51, 72, 70, 33, 85, 87,…
## $ physic <int> 66, 66, 78, 72, 78, 84, 66, 69, 74, 80, 85,…
## $ gk_diving <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_handling <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_kicking <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_reflexes <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_speed <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ gk_positioning <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA,…
## $ player_traits <chr> "Beat Offside Trap, Argues with Officials, …
## $ attacking_crossing <int> 88, 81, 84, 83, 93, 78, 86, 88, 70, 62, 66,…
## $ attacking_finishing <int> 95, 84, 94, 89, 82, 91, 72, 75, 93, 67, 63,…
## $ attacking_heading_accuracy <int> 70, 61, 89, 84, 55, 83, 55, 58, 78, 68, 92,…
## $ attacking_short_passing <int> 92, 89, 83, 85, 92, 82, 92, 91, 83, 89, 80,…
## $ attacking_volleys <int> 88, 83, 87, 87, 82, 90, 76, 82, 85, 44, 69,…
## $ skill_dribbling <int> 97, 95, 89, 88, 86, 85, 87, 80, 88, 80, 65,…
## $ skill_curve <int> 93, 83, 81, 86, 85, 86, 85, 86, 83, 66, 74,…
## $ skill_fk_accuracy <int> 94, 79, 76, 85, 83, 82, 78, 84, 73, 68, 72,…
## $ skill_long_passing <int> 92, 83, 77, 82, 91, 72, 88, 92, 64, 82, 83,…
## $ skill_ball_control <int> 96, 94, 92, 90, 91, 84, 92, 89, 89, 88, 83,…
## $ movement_acceleration <int> 91, 94, 89, 82, 77, 76, 77, 50, 82, 40, 74,…
## $ movement_sprint_speed <int> 84, 88, 91, 81, 76, 70, 71, 41, 78, 43, 71,…
## $ movement_agility <int> 93, 95, 87, 90, 78, 79, 92, 60, 84, 67, 78,…
## $ movement_reactions <int> 95, 90, 96, 92, 91, 92, 89, 87, 92, 87, 87,…
## $ movement_balance <int> 95, 94, 71, 83, 76, 79, 93, 71, 91, 49, 66,…
## $ power_shot_power <int> 86, 82, 95, 82, 91, 88, 79, 87, 89, 61, 79,…
## $ power_jumping <int> 68, 56, 95, 89, 63, 69, 68, 30, 81, 66, 93,…
## $ power_stamina <int> 75, 84, 85, 87, 89, 82, 85, 74, 79, 86, 80,…
## $ power_strength <int> 68, 63, 78, 63, 74, 86, 58, 73, 74, 77, 85,…
## $ power_long_shots <int> 94, 80, 93, 83, 90, 86, 82, 86, 84, 54, 62,…
## $ mentality_aggression <int> 48, 54, 63, 73, 76, 87, 62, 60, 65, 85, 90,…
## $ mentality_interceptions <int> 40, 41, 29, 49, 61, 41, 82, 76, 24, 89, 88,…
## $ mentality_positioning <int> 94, 87, 95, 90, 88, 92, 79, 75, 93, 77, 67,…
## $ mentality_vision <int> 94, 89, 82, 86, 94, 82, 91, 89, 83, 86, 71,…
## $ mentality_penalties <int> 75, 88, 85, 86, 79, 83, 82, 73, 83, 60, 86,…
## $ mentality_composure <int> 96, 91, 95, 89, 91, 85, 92, 88, 90, 93, 84,…
## $ defending_marking <int> 33, 34, 28, 59, 68, 57, 68, 72, 30, 90, 85,…
## $ defending_standing_tackle <int> 37, 27, 32, 54, 58, 45, 76, 70, 29, 86, 87,…
## $ defending_sliding_tackle <int> 26, 22, 24, 49, 51, 38, 71, 62, 24, 80, 90,…
## $ goalkeeping_diving <int> 6, 11, 7, 14, 15, 27, 13, 10, 13, 5, 11, 9,…
## $ goalkeeping_handling <int> 11, 12, 11, 8, 13, 25, 9, 11, 15, 8, 8, 9, …
## $ goalkeeping_kicking <int> 15, 6, 15, 14, 5, 31, 7, 13, 6, 13, 9, 15, …
## $ goalkeeping_positioning <int> 14, 8, 14, 13, 10, 33, 14, 7, 11, 9, 7, 15,…
## $ goalkeeping_reflexes <int> 8, 8, 11, 14, 13, 37, 9, 10, 14, 13, 11, 11…
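The skim summary reports no missing values for the character columns, yet columns such as loaned_from and nation_position consist mostly of empty strings. One possible way to make that visible, sketched here on a made-up data frame (`toy` and `toy_clean` are hypothetical names), is to turn empty strings into NA before summarising.

```r
library(dplyr)

# Toy data: empty strings stand in for "no value", as in loaned_from.
toy <- data.frame(
  short_name  = c("A", "B", "C"),
  loaned_from = c("", "Some Club", ""),
  stringsAsFactors = FALSE
)

# Replace "" with NA in every character column.
toy_clean <- toy %>%
  mutate(across(where(is.character), ~ na_if(.x, "")))

# Count the missing values per column.
missing_per_column <- colSums(is.na(toy_clean))
```

Whether an empty loaned_from should count as missing or simply as "not on loan" is a judgment call; this sketch only shows the mechanics.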
After loading the data, I started plotting with the ggplot() function from ggplot2, which is part of the tidyverse.
First, I drew a bar chart of the number of players at each value of potential, which shows which potential rating is the most common.
ggplot(data = df) +
geom_bar(mapping = aes(x = potential, fill = ..count..)) +
ggtitle("Potential distribution chart")
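To read the most common potential value off as a number rather than from the chart, a small dplyr sketch like the following may help. It runs on a made-up vector of ratings; `toy`, `potential_counts`, and `most_common` are hypothetical names.

```r
library(dplyr)

# Toy ratings (made up) standing in for df$potential.
toy <- data.frame(potential = c(79, 80, 79, 83, 79, 80))

# Count players per potential value, most frequent first.
potential_counts <- toy %>%
  count(potential, sort = TRUE)

# The first row holds the most common value.
most_common <- potential_counts$potential[1]
```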
Next, I stored the ten highest-paid players in a table, tt. Because the SQL query had already sorted the data by wage in descending order, the highest-paid players sit at the top, so these are simply the first 10 rows of the data frame.
tt <- head(df,10) %>% select(short_name, wage_eur, value_eur, age,club, potential, player_positions)
head(tt)
## short_name wage_eur value_eur age club potential
## 1 L. Messi 565000 95500000 32 FC Barcelona 94
## 2 E. Hazard 470000 90000000 28 Real Madrid 91
## 3 Cristiano Ronaldo 405000 58500000 34 Juventus 93
## 4 A. Griezmann 370000 69000000 28 FC Barcelona 89
## 5 K. De Bruyne 370000 90000000 28 Manchester City 91
## 6 L. Suárez 355000 53000000 32 FC Barcelona 89
## player_positions
## 1 RW, CF, ST
## 2 LW, CF
## 3 ST, LW
## 4 CF, ST, LW
## 5 CAM, CM
## 6 ST
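Taking the first 10 rows works only because the data is already sorted. A variant that does not depend on row order could use dplyr's slice_max(), sketched here on made-up values (`toy` and `top2` are hypothetical names).

```r
library(dplyr)

# Toy data in arbitrary order (values are made up).
toy <- data.frame(
  short_name = c("A", "B", "C", "D"),
  wage_eur   = c(36000, 565000, 22000, 470000),
  stringsAsFactors = FALSE
)

# Pick the n highest-paid rows regardless of the original order.
# with_ties = FALSE guarantees exactly n rows even when wages are equal.
top2 <- toy %>%
  slice_max(wage_eur, n = 2, with_ties = FALSE) %>%
  select(short_name, wage_eur)
```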
Then I plotted the same kind of bar chart for the age attribute.
ggplot(data = df) +
geom_bar(mapping = aes(x = age, fill = ..count..)) +
ggtitle("Age distribution chart")
I then did the same for the weight attribute. Together, these three charts give a quick picture of how the players are distributed by potential, age, and weight. I will skip the other attributes of this kind.
ggplot(data = df) +
geom_bar(mapping = aes(x = weight_kg, fill = ..count..)) +
ggtitle("Weight distribution chart")
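Since the three distribution charts differ only in the column and the title, they could be produced by one small helper. This is a sketch under that assumption; `plot_distribution` is a hypothetical function, and the toy data is made up.

```r
library(ggplot2)

# Hypothetical helper: bar chart of how many rows share each value of `column`.
# `column` is passed as a string and looked up with the .data pronoun.
plot_distribution <- function(data, column, title) {
  ggplot(data) +
    geom_bar(aes(x = .data[[column]])) +
    labs(x = column, title = title)
}

# Toy data (made up) with two of the columns used above.
toy <- data.frame(age = c(25, 27, 27, 30), weight_kg = c(70, 72, 72, 80))

p_age    <- plot_distribution(toy, "age", "Age distribution chart")
p_weight <- plot_distribution(toy, "weight_kg", "Weight distribution chart")
```

Printing `p_age` or `p_weight` draws the chart; the colour fill used in the charts above is left out here to keep the helper short.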
Number of players by playing position (the team_position column). Based on the above graph, we’d expect some specific midfield position to have the highest count, but here the number of centre-backs is the highest, followed by the number of strikers.
ggplot(data = df) +
geom_bar(mapping = aes(y = team_position , fill = ..count..)) +
ggtitle("Distribution of players based on preferred position")
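The chart above counts team_position, while a player's preferred positions are stored in player_positions as a comma-separated string such as "RW, CF, ST". A possible way to count those instead, sketched on made-up rows (`toy` and `position_counts` are hypothetical names), is to split the string into one row per position with tidyr's separate_rows().

```r
library(dplyr)
library(tidyr)

# Toy data in the same format as the player_positions column.
toy <- data.frame(
  short_name       = c("A", "B", "C"),
  player_positions = c("RW, CF, ST", "ST, LW", "CB"),
  stringsAsFactors = FALSE
)

# One row per (player, position), then count each position.
position_counts <- toy %>%
  separate_rows(player_positions, sep = ",\\s*") %>%
  count(player_positions, sort = TRUE)
```

Note that a player with three listed positions is counted three times in this approach.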
Then I looked at which foot the majority of players prefer, and clearly found that most players are right-footed. I also plotted potential on the y-axis to compare how each group performs.
ggplot(data = df ,aes(x = preferred_foot, y = potential, color = preferred_foot)) +
geom_jitter() +ggtitle("Distribution of players based on preferred foot")
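A jitter plot shows the spread, but the comparison between feet is easier to state with numbers. The following sketch summarises player count and mean potential per foot on made-up values; `toy` and `foot_summary` are hypothetical names.

```r
library(dplyr)

# Toy data (made up) with the two columns used in the plot above.
toy <- data.frame(
  preferred_foot = c("Left", "Right", "Right", "Right"),
  potential      = c(90, 80, 82, 84),
  stringsAsFactors = FALSE
)

# Number of players and average potential for each foot.
foot_summary <- toy %>%
  group_by(preferred_foot) %>%
  summarise(
    players        = n(),
    mean_potential = mean(potential),
    .groups        = "drop"
  )
```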
Then I looked at wages by club and used grid.arrange() from the gridExtra package to stack two graphs: one for the 100 highest-paid players and one for the 10 highest-paid.
First, install and load the packages (skip this if they are already loaded):
install.packages("ggplot2")
library(ggplot2)
install.packages("gridExtra")
library(gridExtra)
Now the code:
# Top 100 and top 10 rows (the data is already sorted by wage, highest first)
mm <- head(df, 100) %>% select(short_name, wage_eur, value_eur, age, club, potential, player_positions)
pp <- head(df, 10) %>% select(short_name, wage_eur, value_eur, age, club, potential, player_positions)
# One jittered point per player, coloured by club
# (geom_point() plus geom_jitter() would draw every player twice)
g1 <- ggplot(data = mm, aes(y = club, x = wage_eur)) +
geom_jitter(aes(color = club)) + ggtitle("Clubs of the hundred highest-paid players")
g2 <- ggplot(data = pp, aes(y = club, x = wage_eur)) +
geom_jitter(aes(color = club)) + ggtitle("Clubs of the ten highest-paid players")
# Stack the two plots vertically
grid.arrange(g1, g2, ncol = 1)
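The plots above show individual player wages grouped by club. To get a single wage figure per club, one option is to sum the wages, as in this sketch on made-up values (`toy` and `club_wages` are hypothetical names). Keep in mind that with only the 2,000 highest-paid players in the data, such a total covers just the players who made the cut.

```r
library(dplyr)

# Toy data (made up): several players per club.
toy <- data.frame(
  club     = c("X", "Y", "X", "Z"),
  wage_eur = c(100, 200, 300, 50),
  stringsAsFactors = FALSE
)

# Total wage and player count per club, highest total first.
club_wages <- toy %>%
  group_by(club) %>%
  summarise(
    players        = n(),
    total_wage_eur = sum(wage_eur),
    .groups        = "drop"
  ) %>%
  arrange(desc(total_wage_eur))
```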
Then I looked at the best of 2020 based on potential, using a polar bar chart of potential for the ten highest-paid players, filled by age.
tal <- head(df, 10) %>% select(short_name, wage_eur, value_eur, age, club, potential, player_positions)
ggplot(data = tal, aes(x = factor(potential), fill = factor(age))) +
geom_bar(width = 1) + coord_polar(theta = "y")
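The chart above is built from ten individual players, so it says little about clubs as a whole. If the goal is to rank clubs by potential, one possible approach is to average potential per club and ignore clubs with very few players. This is a sketch on made-up values; `toy`, `min_players`, and `club_potential` are hypothetical names, and the threshold of 2 is arbitrary.

```r
library(dplyr)

# Toy data (made up).
toy <- data.frame(
  club      = c("X", "X", "Y", "Y", "Z"),
  potential = c(90, 80, 88, 86, 95),
  stringsAsFactors = FALSE
)

# Arbitrary cut-off so that a single star player does not decide the ranking.
min_players <- 2

club_potential <- toy %>%
  group_by(club) %>%
  summarise(
    players        = n(),
    mean_potential = mean(potential),
    .groups        = "drop"
  ) %>%
  filter(players >= min_players) %>%
  arrange(desc(mean_potential))
```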
Age vs Overall of players divided amongst wage brackets. The highest wages are commanded by players of overall 85+ and age around 30 years. Cristiano Ronaldo is one of the three purple dots up there. Guess the other two in comments section below. :P
g_age_overall <- ggplot(df, aes(age, overall))
g_age_overall + geom_point(aes(color = age)) + geom_smooth(color="darkblue") +
ggtitle("Distribution between Age and Overall of players based on Value bracket")
## `geom_smooth()` using method = 'gam' and formula 'y ~ s(x, bs = "cs")'
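The code above colours the points by age. To colour them by wage bracket instead, the wages first need to be binned. Here is a minimal sketch using cut(); the bracket boundaries, the labels, and the toy values are all assumptions made for illustration.

```r
library(ggplot2)

# Toy data (made up) with the three columns needed.
toy <- data.frame(
  age      = c(20, 25, 30, 34),
  overall  = c(70, 78, 88, 93),
  wage_eur = c(22000, 60000, 150000, 405000)
)

# Hypothetical bracket boundaries, chosen only for illustration.
# Each interval includes its upper bound, e.g. (0, 50000].
toy$wage_bracket <- cut(
  toy$wage_eur,
  breaks = c(0, 50000, 100000, 200000, Inf),
  labels = c("0-50k", "50k-100k", "100k-200k", "200k+")
)

# Same age vs overall scatter plot, coloured by wage bracket.
p <- ggplot(toy, aes(age, overall)) +
  geom_point(aes(color = wage_bracket)) +
  ggtitle("Age vs overall, coloured by wage bracket")
```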
I also brought some of the data into Tableau so that I could make it more understandable to the viewer, following the 5-second rule.