What is ‘Vibe Coding’?
You may have heard about Vibe Coding by now. If not, here’s how Andrej Karpathy, a leading figure in the AI space, defines it:
“There's a new kind of coding I call "vibe coding", where you fully give in to the vibes, embrace exponentials, and forget that the code even exists” (X, 2025)
- Andrej Karpathy
This approach transforms the developer’s role from writing code line by line to guiding, testing, and refining code generated by an AI model. Put simply, you’re having a conversation with a tool like ChatGPT: asking questions or giving prompts that trigger code execution in the background.
The key is clarity: the more specific your prompts, the better the results. And like any good collaboration, it often requires iteration and refinement to reach the output you’re aiming for.
Building on the philosophy of vibe coding, I thought it would be cool to adapt it to football and see how it could help me identify standout players.
Hence, I present to you: Vibe Scouting 😎
In the following section, I will show you the prompt I gave ChatGPT and the strategy we developed to uncover outliers from the 24/25 season in the Top 5 leagues. Enjoy being a fly on the wall!
Writing the Perfect Prompt
To support a data-driven scouting piece, I collaborated with ChatGPT to identify standout players under the age of 27 using Z-scores: a statistical method that highlights outliers.
👨‍💻 I started with the following prompt:
I want you to scout for me based on weighted Z-scores to find outliers in the data.
Some requirements are:
- Players must have played equal or more than 700 minutes
- Age filter < 27 to find relatively young and promising players with high market value
- For each position, you must make an evaluation of what metrics are suitable to measure high performance in that position (for instance, you should make different profiles for a deep lying midfielder, often called #6 and an attacking midfielder, often referred to as a #10). Pass completion is more important for a #6 for instance, while attacking players should take more risk. The same goes for differentiating between wingers and strikers although they both go under the same category of "FW".
Output:
The output I want is the top 20 players for each position. The positions I want are:
- Full backs (DF)
- Center Backs (DF)
- Deep Lying midfielder #6 (MF)
- Box to Box midfielder #8 (MF)
- Creative attacking midfielder #10 (MF)
- Advanced Striker #9 (FW)
- Playmaking Winger (FW)
- Inverted Winger (FW)
Before you start the analysis, ask me a question to clarify potential questions or suggestions.
ChatGPT followed up with clarifying questions on how roles should be assigned, how to normalize stats, and how I wanted the output structured. I clarified:
- Roles should be inferred using clustering
- Z-scores should be calculated and compared within each role group (e.g. full backs with full backs)
- Missing data should be handled carefully — no artificial values
- Output should include rankings together with the metrics used
*Important: you need to give ChatGPT a dataset (an Excel or CSV file) together with the prompt. My data was event data from FBref: Top 5 Leagues 24/25.*
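For readers who want to see what the requirements above boil down to, here is a minimal Python sketch of the filtering and the within-role z-scores. It is not the code ChatGPT ran: the rows, the column names (`role`, `minutes`, `prog_carries_p90`) and the role labels are hypothetical, the clustering step is skipped (roles are assumed to be assigned already), and applying the filters before computing the averages is my assumption.

```python
from collections import defaultdict
from statistics import mean, pstdev

# Hypothetical rows; a real FBref export uses different column names.
players = [
    {"name": "Player A", "role": "FB", "age": 22, "minutes": 1800, "prog_carries_p90": 3.0},
    {"name": "Player B", "role": "FB", "age": 25, "minutes": 2400, "prog_carries_p90": 1.0},
    {"name": "Player C", "role": "FB", "age": 24, "minutes": 900, "prog_carries_p90": 2.0},
    {"name": "Player D", "role": "FB", "age": 29, "minutes": 2000, "prog_carries_p90": 4.0},  # fails age filter
    {"name": "Player E", "role": "CB", "age": 21, "minutes": 500, "prog_carries_p90": 0.5},   # fails minutes filter
    {"name": "Player F", "role": "CB", "age": 23, "minutes": 1500, "prog_carries_p90": 0.4},
    {"name": "Player G", "role": "CB", "age": 26, "minutes": 2100, "prog_carries_p90": 1.2},
]


def eligible(player, min_minutes=700, max_age=27):
    """Prompt requirements: at least 700 minutes played and younger than 27."""
    return player["minutes"] >= min_minutes and player["age"] < max_age


def z_scores_within_role(rows, metric):
    """Compare each player only with players in the same role group."""
    by_role = defaultdict(list)
    for row in rows:
        by_role[row["role"]].append(row)

    scores = {}
    for group in by_role.values():
        values = [row[metric] for row in group]
        mu, sigma = mean(values), pstdev(values)
        for row in group:
            # If every player in the group has the same value, nobody is an outlier.
            scores[row["name"]] = (row[metric] - mu) / sigma if sigma > 0 else 0.0
    return scores


pool = [p for p in players if eligible(p)]
scores = z_scores_within_role(pool, "prog_carries_p90")

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score:+.2f}")
```

Grouping before scoring is what keeps full backs compared with full backs: a center back with modest carrying numbers can still stand out within his own group.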
Results
In the following section, I will go through the results for three different player roles. Let’s begin with strikers.

Output 1: Strikers Ranked by Z-Score
Our first bar chart ranks strikers using z-scores, a statistical measure that shows how far a player deviates from the average 💡
A z-score of 0 means average, while anything above 2 signals a clear outlier: someone performing significantly above their peers.
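To make the definition concrete, here is a small Python sketch: a z-score is the player's value minus the group average, divided by the group's standard deviation. The striker names and goals-per-90 values are made up for illustration, and using the population standard deviation (`pstdev`) rather than the sample version is an assumption on my part.

```python
from statistics import mean, pstdev

# Hypothetical goals-per-90 values for ten strikers (not real data).
goals_p90 = {
    "Striker A": 0.30,
    "Striker B": 0.35,
    "Striker C": 0.40,
    "Striker D": 0.40,
    "Striker E": 0.45,
    "Striker F": 0.45,
    "Striker G": 0.50,
    "Striker H": 0.50,
    "Striker I": 0.55,
    "Striker J": 1.10,
}

values = list(goals_p90.values())
mu = mean(values)       # group average
sigma = pstdev(values)  # population standard deviation of the group

# z = (value - average) / standard deviation
z = {name: (value - mu) / sigma for name, value in goals_p90.items()}

# Anything above 2 is treated as a clear outlier.
outliers = [name for name, score in z.items() if score > 2]
print(outliers)
```

In this toy group only Striker J clears the threshold; everyone else sits within about one standard deviation of the average.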
There is no need to discuss Kylian Mbappé in first place, but it’s worth noticing Mika Biereth in second place, as his output for Monaco this season has been nothing short of extraordinary. What’s especially impressive is that he only broke into the team in October, yet still managed to score 13 goals and provide 2 assists in just 16 league games.
Notable mentions for scouts and analysts include the other players highlighted in red. While not quite at Biereth’s level, these relatively under-the-radar talents also score above average across key metrics.
🚀 For example, Lucas Stassin and Emmanuel Emegha have quietly put together strong seasons in Ligue 1, consistently contributing with goals and promising underlying numbers.
Further up the table, we have Rodrigo Muniz and Myron Boadu, who score well above average because of their high output (goals) in the relatively few games they have played.
Now, we will look at the top “Box-to-Box” midfielders according to ChatGPT.

Output 2: Ranking Box-to-Box Midfielders
For midfielders in Europe, Pedri is the clear outlier, averaging 11.2 progressive passes per 90 and 10.3 passes into the final third per 90 (ranking above the rest in both of these metrics).
On top of that, Pedri has the highest rate of Shot Creating Actions (5.2 per 90), and he also ranked #1 when I looked at the deep-playmaker role, which tells us what a complete player he is and how he excels between both boxes!
📊 For reference, here are the metrics and weights that ChatGPT used to evaluate box-to-box midfielders:
- Progressive Carries (15 %)
- Progressive Passes (15 %)
- Shot Creating Actions (15 %)
- Tackles (10 %)
- Interceptions (10 %)
- Assist P90 (15 %)
- Goals P90 (10 %)
- Passes into Final Third (15 %)
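One plausible way to turn these weights into a single ranking score is a weighted average of the per-metric z-scores, sketched below in Python. How ChatGPT actually combined them is not something I can confirm, so treat the formula as an assumption. Note that the weights as listed add up to 105 %, so the sketch divides by their total instead of assuming they sum to 100. The metric keys and the example z-scores are hypothetical.

```python
# Weights as listed above, in percent. They sum to 105, not 100,
# so the score below is normalized by the actual total.
WEIGHTS = {
    "progressive_carries": 15,
    "progressive_passes": 15,
    "shot_creating_actions": 15,
    "tackles": 10,
    "interceptions": 10,
    "assists_p90": 15,
    "goals_p90": 10,
    "passes_into_final_third": 15,
}


def weighted_z_score(z_by_metric, weights=WEIGHTS):
    """Weighted average of a player's per-metric z-scores."""
    total_weight = sum(weights.values())
    return sum(weights[m] * z_by_metric[m] for m in weights) / total_weight


# Hypothetical player: average everywhere except progressive passing.
example = {metric: 0.0 for metric in WEIGHTS}
example["progressive_passes"] = 2.1

print(round(weighted_z_score(example), 2))
```

Because the score is an average, a player who is one standard deviation above the mean on every metric gets a composite of exactly 1, whatever the weights are.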
There are some other standouts here as well: Tijjani Reijnders, who has been linked to Manchester City lately, Angelo Stiller, who is playing fantastically for Stuttgart, and Lamine Camara, whom I talked about earlier this season when I was scouting for elite midfielders in Europe 🔎
Fun fact: Ryan Gravenberch is the king of Interceptions (in this group) with 1.7 int per 90, while Alexis Mac Allister tops the chart for tackles at 3.3 tackles per 90 💥
🔗 Together, they form a midfield pivot duo whose members complement each other defensively: one breaking up passing lanes (Gravenberch), the other disrupting play directly (Mac Allister). It’s a dynamic pairing that’s particularly exciting from a Moneyball and squad-building perspective.
Apparently, Slot made Mac Allister rest for the last couple of games this season because of his “Argentinian mentality”, translated into tackles by the data.
The last role we will analyze is Defenders.

Output 3: Ranking Attacking Defenders
Because both full-backs and center-backs are grouped under “Defenders” in FBref data, I’ve chosen to label this player cluster as Attacking Defenders.
While the majority of the chart is dominated by full-backs, we also see center-backs who excel going forward, a profile I like to call Hybrid Defenders. The standout here is Joško Gvardiol, who ranks an incredible 3 standard deviations above the mean. His ability to operate in multiple roles makes him a uniquely versatile defender.
In fact, Gvardiol has the highest number of total carries into the final third, and only Nuno Tavares beats him on a per-90 basis in that metric. Let’s not forget that Gvardiol has scored five goals this season as a defender, and last season he had four!
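The gap between season totals and per-90 numbers comes down to playing time, as this short Python sketch shows. The two defenders and their figures are invented for illustration and do not correspond to the players above.

```python
def per_90(total, minutes):
    """Scale a season total to a rate per 90 minutes played."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 90


# Hypothetical season lines: (carries into the final third, minutes played).
season = {
    "Defender A": (90, 3000),
    "Defender B": (60, 1500),
}

rates = {name: per_90(total, minutes) for name, (total, minutes) in season.items()}

leader_total = max(season, key=lambda name: season[name][0])
leader_per_90 = max(rates, key=rates.get)

print(leader_total, leader_per_90)
```

Defender A leads on volume because he played twice as many minutes, while Defender B leads on rate, which is why it helps to look at both.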
Other center-backs like Jan Paul van Hecke, Alessandro Bastoni, and Nico Schlotterbeck also show up well, reflecting their progressive passing styles.
On the full-back side, we find several familiar names: Trent Alexander-Arnold, Milos Kerkez, and Achraf Hakimi. But there are also some pleasant surprises — particularly Dodô (Fiorentina) and Andrei Rațiu (Rayo Vallecano), with the highest totals of progressive carries over the season.
Also, a shout-out to Nuno Tavares, Juan Miranda, and Pedro Porro for leading the way in crosses into the penalty area per 90, offering real value in final-third delivery.

Output 4: Promising Center Backs
Finally, let’s look at the center back cluster. I think this is a really interesting list that the AI has given us, since we can see the likes of Dean Huijsen and Nathan Collins in the middle of the chart, both ranking above average for this role.
Both of them had great seasons: Huijsen is on his way to Madrid, and Collins is the only player this season who has played every minute in the Premier League! Someone who deserves a proper vacation 🌴
However, the profile that stands out for me is 21-year-old Diego Coppola, who plays for Hellas Verona.

Coppola stands tall at 192 cm and boasts an impressive aerial duel win rate of 64.9%, averaging 4.1 aerials won per 90 minutes. He also leads his peers in interceptions with 2.2 int per 90, placing him in the 96th percentile for that metric❗️
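As a rough illustration of what a percentile means here, the Python sketch below counts the share of a peer group with a lower value. The peer values are invented, and the exact percentile definition behind the 96th-percentile figure is not something I can confirm, so this "strictly below" version is only one common convention.

```python
def percentile_rank(value, peer_values):
    """Share of the peer group (in percent) with a strictly lower value."""
    below = sum(1 for v in peer_values if v < value)
    return 100 * below / len(peer_values)


# Hypothetical interceptions-per-90 values for a 25-player peer group,
# with the player being scored (2.2) included in the group.
peers = [0.5 + 0.05 * i for i in range(24)] + [2.2]

print(percentile_rank(2.2, peers))
```

Under this convention, the leader of a 25-player group lands on the 96th percentile, because he is counted in the group but is not below himself.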
This is somewhat reminiscent of Virgil van Dijk’s data profile: dominating duels and reading the game well enough to make a lot of interceptions. The only minus I can find is that he seems to love a tackle (2 tackles per 90) 😅 If you’ve read my Van Dijk series, you know what I mean.
That said, context matters. Coppola plays for Hellas Verona, a mid-table team where defenders are naturally more “active” due to their team setup. At first glance, Coppola’s possession numbers make him look like a bad player. But after digging deeper, it turns out Hellas Verona averages just 38.8% possession, the lowest in Serie A, and attempts the fewest passes in the league. So the low passing and on-ball stats say more about the system than the player.
The Power of AI
In this piece, I’ve shown how AI agents — specifically ChatGPT — can act as a powerful assistant for data-driven scouting. Even if you don’t have a coding background, this approach is highly accessible. Remember, all I did was upload the data and have a nice back-and-forth conversation with Chat 🤝
By understanding a few core statistical concepts like clustering and z-scores, you can guide the AI to perform complex analysis through simple, conversational prompts. And if you still don’t quite grasp these statistical concepts, don’t worry, I think I know someone you could ask 🤖
Of course, it’s not completely effortless. The results aren’t always perfect, and getting high-quality output requires iteration, domain knowledge, and a critical eye. The AI is powerful, but not flawless, and still depends on how clearly and specifically you define the task.
That said, it’s an incredibly useful tool for saving time, generating player shortlists, and surfacing overlooked talent, especially if you already have structured data at hand and an understanding of the metrics. The key lies in crafting thoughtful prompts and knowing what you’re looking for.
So what do you think: could vibe scouting be a viable tool in your scouting workflow?
I’d love to hear your thoughts.
Oh I almost forgot, don’t forget to say thank you.

📧 If you enjoyed this post, I’d appreciate it if you would share it with a friend 🌟
🔗 I post regularly on my LinkedIn as well, so feel free to connect with me there 🤝