Where can I actually find football/soccer data?
There are three main ways to get data. You can parse/scrape it from a hobbyist project/website, you can pay for it or you can try to collect it yourself.
Jump to a specific source:
- Newsletter
- Open source data on github
- Free APIs
- football-data.org (RESTful API)
- openfooty API (hard to get an API Key)
- Commercial APIs
- Other Websites
- footballsquads.co.uk
- Rec.Sport.Soccer Statistics Foundation (RSSSF)
- football-data.co.uk
- european-football-statistics.co.uk
- openligadb.db
- Wikipedia
- Post War English & Scottish Football League A – Z Player’s Database
- 2015 Women’s World Cup Data
- World Cup 2014 APIs
- Deprecated/Retired – “The Graveyard of APIs”
- More Reading / Resources / Links
- Fantasy Data
Football Data Report Newsletter
If you would like more frequent updates on football APIs, football data and open source football projects, sign up for the newsletter. Emails will be sent about once or twice per month. I will try to highlight free data and open source projects. Please email any links or newsletter suggestions.
openfootball – aka football.db
openfootball (aka football.db) has started a free, open source, public domain football database. The data is historical, meaning no live scores, but it does include the schedule, teams and players for the 2014 World Cup along with global league data. This is a very promising project and has the potential to be the definitive public source for historical data. See the opensport Google Group for discussion and questions. The data is stored in various repos on github. Consider contributing any data you have yourself and be sure to thank Gerald Bauer. All the various repos can be intimidating. A good place to start is at github.com/openfootball.
An example of the plain text (custom format):
[Sat Aug/16]
12.45 Manchester United 1-2 Swansea City
15.00 Leicester City 2-2 Everton FC
15.00 Queens Park Rangers 0-1 Hull City
15.00 Stoke City 0-1 Aston Villa
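To get a feel for how little code this format needs, here is a minimal parsing sketch. It only handles the two line shapes visible in the excerpt above (a bracketed date header and a "time home score away" row); the real football.db format has more features, so treat the regular expressions and the `parse_matches` name as my own assumptions rather than the project's parser.

```python
import re

# Sample copied from the excerpt above.
SAMPLE = """\
[Sat Aug/16]
12.45 Manchester United 1-2 Swansea City
15.00 Leicester City 2-2 Everton FC
15.00 Queens Park Rangers 0-1 Hull City
15.00 Stoke City 0-1 Aston Villa
"""

# Assumed line shapes: "[<date>]" headers and "<time> <home> <h>-<a> <away>" rows.
DATE_RE = re.compile(r"^\[(?P<date>[^\]]+)\]$")
MATCH_RE = re.compile(
    r"^(?P<time>\d{1,2}\.\d{2})\s+(?P<home>.+?)\s+"
    r"(?P<home_goals>\d+)-(?P<away_goals>\d+)\s+(?P<away>.+)$"
)


def parse_matches(text):
    """Return one dict per match row, tagged with the latest date header seen."""
    matches = []
    current_date = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        header = DATE_RE.match(line)
        if header:
            current_date = header.group("date")
            continue
        row = MATCH_RE.match(line)
        if row:
            matches.append({
                "date": current_date,
                "time": row.group("time"),
                "home": row.group("home"),
                "away": row.group("away"),
                "score": (int(row.group("home_goals")), int(row.group("away_goals"))),
            })
    return matches


matches = parse_matches(SAMPLE)
for match in matches:
    print(match["date"], match["home"], match["score"], match["away"])
```

Lines that match neither pattern are silently skipped, which is convenient for a quick look but would hide format changes in anything more serious.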
Check out the following github organizations for more listings:
- https://github.com/footballcsv
- https://github.com/openfootball
- https://github.com/rsssf
- https://github.com/footballdata
There is also an open source football.db HTTP JSON(P) API for demo purposes. Example Endpoints:
http://footballdb.herokuapp.com/api/v1/event/world.2014/teams
http://footballdb.herokuapp.com/api/v1/event/world.2014/round/1
jokecamp/FootballData
jokecamp/FootballData – my own hodgepodge of JSON and CSV Football/Soccer data on GitHub with a focus on the EPL
soccerstats.us
soccerstats.us is a github organization with multiple repositories for sets of data (with a focus on North American data). The parser is written in Python and looks like it was designed to parse the rsssf.com text data.
Other smaller Projects/Repos
- github.com/jalapic/engsoccerdata includes a csv file of English league soccer games from the top 4 tiers, 1888 to 2014.
- planetopendata/awesome-football includes another list of repos
- architv/soccer-cli – A command line interface for retrieving football scores.
Free APIs
football-data.org (beta)
football-data.org is a RESTful API in beta with regularly updated data. If you register for a free API key you will get CORS support. I recommend registering for a key to show your support and help the service track usage. However, a key is not required yet, so you can try out the endpoints right now. I am excited to see this API grow and mature!
Available endpoints
/soccerseasons/
/soccerseasons/{id}/ranking
/soccerseasons/{id}/fixtures
/fixtures
/soccerseasons/{id}/teams
/teams/{id}
/teams/{id}/fixtures/
Some example calls:
- http://api.football-data.org/v1/soccerseasons
- http://api.football-data.org/v1/soccerseasons/351/teams
- http://api.football-data.org/v1/fixtures
- http://api.football-data.org/v1/teams/5
Example JSON output for a team:
{
"_links":{
"self":{
"href":"http://api.football-data.org/v1/teams/5"
},
"fixtures":{
"href":"http://api.football-data.org/v1/teams/5/fixtures"
},
"players":{
"href":"http://api.football-data.org/v1/teams/5/players"
}
},
"name":"FC Bayern München",
"code":"FCB",
"shortName":"Bayern",
"squadMarketValue":"559,100,000 €",
"crestUrl":"http://upload.wikimedia.org/wikipedia/commons/c/c5/Logo_FC_Bayern_München.svg"
}
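Once you have a response like the one above, most of the work is following the `_links` and cleaning up string fields. The sketch below works offline on a trimmed copy of that JSON; the helper names (`link`, `market_value_eur`) are mine, and the assumption that `squadMarketValue` is always a euro amount with comma separators is based only on this one example.

```python
import json

# Trimmed copy of the team response shown above.
TEAM_JSON = """
{
  "_links": {
    "self": {"href": "http://api.football-data.org/v1/teams/5"},
    "fixtures": {"href": "http://api.football-data.org/v1/teams/5/fixtures"},
    "players": {"href": "http://api.football-data.org/v1/teams/5/players"}
  },
  "name": "FC Bayern München",
  "code": "FCB",
  "shortName": "Bayern",
  "squadMarketValue": "559,100,000 €"
}
"""


def link(team, rel):
    """Return the URL stored under _links for a relation such as 'fixtures'."""
    return team["_links"][rel]["href"]


def market_value_eur(team):
    """Turn a string like '559,100,000 €' into an integer (assumed to be euros)."""
    digits = "".join(ch for ch in team["squadMarketValue"] if ch.isdigit())
    return int(digits)


team = json.loads(TEAM_JSON)
print(team["name"], market_value_eur(team))
print("fixtures at:", link(team, "fixtures"))
```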
openfooty API
openfooty API had promising API documentation but a quick look at the developer forums shows a stale community and questions about why no one seems to actually be able to get a developer key.
Subscription Services/APIs
football-api
football-api.com is a paid API service but does offer the English Premier League endpoints for free (demo use). The API restricts access by IP address and limits calls based on your package. It includes endpoints for competitions, teams, standings, live scores, fixtures and commentaries. See the pricing page. Prices range from $15 to $200 per month.
Example endpoints:
api/?Action=competitions&APIKey=####
api/?Action=standings&comp_id=1204&APIKey=####
api/?Action=today&comp_id=1204&APIKey=####
api/?Action=fixtures&comp_id=1024&match_date=[DATE_IN_d.m.Y_FORMAT]&APIKey=####
api/?Action=commentaries&APIKey=###&match_id=[MATCH_ID]
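The endpoints above are all one URL with different query parameters, so a small builder covers them. This is a rough sketch: the `BASE_URL` is my guess from the site name and the relative paths shown, and the function names are hypothetical. The one detail worth getting right is the `d.m.Y` date format, which corresponds to `%d.%m.%Y` in Python.

```python
from datetime import date
from urllib.parse import urlencode

# Assumption: host and path are inferred from the examples, not taken from docs.
BASE_URL = "http://football-api.com/api/"


def build_url(action, api_key, **params):
    """Build a query URL in the Action/params/APIKey shape shown above."""
    query = {"Action": action, **params, "APIKey": api_key}
    return BASE_URL + "?" + urlencode(query)


def fixtures_url(comp_id, match_date, api_key):
    """Fixtures for one competition on one day, with the date as d.m.Y."""
    return build_url(
        "fixtures",
        api_key,
        comp_id=comp_id,
        match_date=match_date.strftime("%d.%m.%Y"),
    )


print(build_url("standings", "YOUR_KEY", comp_id=1204))
print(fixtures_url(1204, date(2014, 8, 16), "YOUR_KEY"))
```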
CrowdScores and FastestLiveScores API
CrowdScores is a UK company that uses a crowd-sourcing football data collection process. You sign up for an account and report game events to their servers. They have iPhone and Android apps for reporting. The collected data is then available as an API on FastestLiveScores.com. They currently offer three different API tiers: Free trial ($0), Basic ($100 per month) and Pro (price unlisted). View the API documentation.
Example endpoints and parameters:
/teams{?round_ids,competition_ids}
/matches{?team_id,round_ids,competition_id,from,to}
/competitions
/rounds{?competition_ids}
/seasons
/league-table/{round_id}
/league-tables?{team_id,round_id,competition_id}
/football_states
/events
/playerstats?{team_ids,round_ids,competition_ids,season_ids}
SPAPI
SPAPI offers subscriptions for a RESTful Sports API including live scores, player statistics, betting odds, pre-game data and match event data. The data responses look pretty comprehensive.
Pricing:
Starter Plan is $299 per month
Growth Plan is $499 per month
Pro Plan is $899 per month
Look at the example of one of the action data objects. It looks like data for a defensive clearance including x,y coordinates.
{
"minute": 53,
"second": 10,
"team_id": 32,
"start_x": 20.8,
"start_y": 69.5,
"expanded_minute": 57,
"period": {
"name": "SecondHalf",
"value": 2
},
"type": {
"name": "Clearance",
"value": 12
},
"outcome_type": {
"name": "Successful",
"value": 1
},
"qualifiers": [{
"type": "Head"
}, {
"type": "Zone",
"value": "Back"
}],
"is_touch": true
}
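If you wanted to work with action objects like this one, a small typed wrapper keeps the nested name/value pairs manageable. The sketch below is based only on the single example above; the `Action` class, its field names and the `parse_action` helper are my own, and I have not checked which fields are optional in real responses.

```python
import json
from dataclasses import dataclass

# Copy of the action object shown above.
ACTION_JSON = """
{
  "minute": 53, "second": 10, "team_id": 32,
  "start_x": 20.8, "start_y": 69.5, "expanded_minute": 57,
  "period": {"name": "SecondHalf", "value": 2},
  "type": {"name": "Clearance", "value": 12},
  "outcome_type": {"name": "Successful", "value": 1},
  "qualifiers": [{"type": "Head"}, {"type": "Zone", "value": "Back"}],
  "is_touch": true
}
"""


@dataclass
class Action:
    clock: str        # "minute:second" as reported
    team_id: int
    period: str
    kind: str         # e.g. "Clearance"
    outcome: str      # e.g. "Successful"
    x: float
    y: float
    qualifiers: dict  # qualifier type -> value (None when no value is given)


def parse_action(raw):
    """Flatten one action object into an Action instance."""
    data = json.loads(raw)
    qualifiers = {q["type"]: q.get("value") for q in data["qualifiers"]}
    return Action(
        clock=f'{data["minute"]}:{data["second"]:02d}',
        team_id=data["team_id"],
        period=data["period"]["name"],
        kind=data["type"]["name"],
        outcome=data["outcome_type"]["name"],
        x=data["start_x"],
        y=data["start_y"],
        qualifiers=qualifiers,
    )


action = parse_action(ACTION_JSON)
print(action)
```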
Example endpoints:
https://spapi.pw/api/v1/competitions/5/upcoming_matches
https://spapi.pw/api/v1/competitions/5/finished_matches
https://spapi.pw/api/v1/competitions/5/standings?standing_type=standings
https://spapi.pw/api/v1/competitions/5/player_rankings?ranking_type=all
https://spapi.pw/api/v1/teams
https://spapi.pw/api/v1/matches/17588/scores
https://spapi.pw/api/v1/livescores
https://spapi.pw/api/v1/players/1/statistics
Pretty impressive, but this level of detail comes at a price. The authentication method is an API key in the query string.
XMLSoccer.com
xmlsoccer.com is another subscription service. You can demo the service for free with the Scottish Premier League. Pricing is 10 € for 1 month, 25 € for 3 months and 90 € for 12 months. You can browse the demo web methods at http://www.xmlsoccer.com/FootballDataDemo.asmx to see the types of calls available, along with the WSDL. The data is only returned in XML.
Each call must provide an API Key. You can get a free demo API Key by registering.
Example results for a team
<XMLSOCCER.COM>
<Team>
<Team_Id>45</Team_Id>
<Name>Aberdeen</Name>
<Country>Scotland</Country>
<Stadium>Pittodrie Stadium</Stadium>
<HomePageURL>http://www.afc.co.uk</HomePageURL>
<WIKILink>http://en.wikipedia.org/wiki/Aberdeen_F.C.</WIKILink>
</Team>
</XMLSOCCER.COM>
Player data
<Player>
<Id>2523</Id>
<Name>David Goodwillie</Name>
<Height>1.7</Height>
<Weight>70.29</Weight>
<Nationality>Scotland</Nationality>
<Position>Forward</Position>
<Team_Id>45</Team_Id>
<PlayerNumber>17</PlayerNumber>
<DateOfBirth>1989-03-28T00:00:00-08:00</DateOfBirth>
<DateOfSigning>2014-07-07T00:00:00-07:00</DateOfSigning>
<Signing>Free</Signing>
</Player>
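Since the service only returns XML, here is a minimal sketch of reading the player record above with the standard library. The `parse_player` name and the output keys are my own, and I am assuming every element shown is always present; a real client (such as the open source one linked below) would need to handle missing fields.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Copy of the player record shown above.
PLAYER_XML = """
<Player>
  <Id>2523</Id>
  <Name>David Goodwillie</Name>
  <Height>1.7</Height>
  <Weight>70.29</Weight>
  <Nationality>Scotland</Nationality>
  <Position>Forward</Position>
  <Team_Id>45</Team_Id>
  <PlayerNumber>17</PlayerNumber>
  <DateOfBirth>1989-03-28T00:00:00-08:00</DateOfBirth>
  <DateOfSigning>2014-07-07T00:00:00-07:00</DateOfSigning>
  <Signing>Free</Signing>
</Player>
"""


def parse_player(xml_text):
    """Convert one <Player> element into a dict with typed values."""
    node = ET.fromstring(xml_text.strip())
    return {
        "id": int(node.findtext("Id")),
        "name": node.findtext("Name"),
        "height": float(node.findtext("Height")),
        "weight": float(node.findtext("Weight")),
        "position": node.findtext("Position"),
        "team_id": int(node.findtext("Team_Id")),
        "number": int(node.findtext("PlayerNumber")),
        # The timestamps carry a UTC offset; only the calendar date is kept here.
        "born": datetime.fromisoformat(node.findtext("DateOfBirth")).date(),
    }


player = parse_player(PLAYER_XML)
print(player)
```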
Open source Python client for XmlSoccer API
opta
opta is one of the industry leaders. This is what the TV networks use and likely what the actual football clubs use for scouting. If only this data were public! However, opta Playground has a developer program that provides very limited access to historical data. The site reads “Opta can provide data for programmers wishing to develop a mobile app or website with selected historical data available to download.” You have to request permission by email. I applied and they sent me the XML data set for 10 rounds of games from the start of the 2007/2008 Bundesliga 2. The more detailed game data included x,y coordinates of game events. It is a very impressive dataset but it felt more like an advertisement. I had no interest in the data provided, and I’m not sure why an indie developer would spend time working on a data set they could never afford. They even track this data point: “Spectator on pitch.” Read this FiveThirtyEight behind-the-scenes look at how opta tracks data (spoiler: young male gamers).
An example of an “event” in xml
<Event id="1115853439" event_id="9" type_id="3"
period_id="1" min="0" sec="19" player_id="21202"
team_id="1744" outcome="1" x="64.9" y="11.6"
timestamp="2007-08-19T13:02:08.482" last_modified="2007-08-19T13:02:13">
<Q id="152113216" qualifier_id="56" value="Right"/>
</Event>
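Reading an event like this is mostly attribute access. The sketch below parses the example above with the standard library; the `parse_event` name and output keys are mine. I have left `type_id`, `outcome` and `qualifier_id` as raw integers because the sample alone does not say what the codes mean.

```python
import xml.etree.ElementTree as ET

# Copy of the event shown above.
EVENT_XML = """
<Event id="1115853439" event_id="9" type_id="3"
       period_id="1" min="0" sec="19" player_id="21202"
       team_id="1744" outcome="1" x="64.9" y="11.6"
       timestamp="2007-08-19T13:02:08.482" last_modified="2007-08-19T13:02:13">
  <Q id="152113216" qualifier_id="56" value="Right"/>
</Event>
"""


def parse_event(xml_text):
    """Pull the attributes of one <Event> and its <Q> qualifiers into a dict."""
    node = ET.fromstring(xml_text.strip())
    attrs = node.attrib
    return {
        "type_id": int(attrs["type_id"]),
        "period_id": int(attrs["period_id"]),
        "clock": (int(attrs["min"]), int(attrs["sec"])),
        "player_id": int(attrs["player_id"]),
        "team_id": int(attrs["team_id"]),
        "outcome": int(attrs["outcome"]),
        "x": float(attrs["x"]),
        "y": float(attrs["y"]),
        # qualifier_id -> value; codes are kept numeric, meanings unknown here.
        "qualifiers": {
            int(q.get("qualifier_id")): q.get("value") for q in node.findall("Q")
        },
    }


event = parse_event(EVENT_XML)
print(event)
```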
prozone
prozone is another large commercial data provider.
Match Analysis
Match Analysis is another large commercial data provider that lists Fox Soccer Channel, US National Team and the MLS among their clients.
Other Websites
FootballSquads
footballsquads.co.uk has current and historical squad details for clubs (rosters) and national teams from all across the world for many leagues and competitions, including the 2014 World Cup squads.
An example of the squad/roster data:
Num  Name          Nat  Pos  Height  Weight  Date of Birth  Birth Place  Previous Club
1    David De Gea  ESP  G    1.92    82      07-11-90       Madrid       Atlético Madrid
2    Rafael        BRA  D    1.72    65      09-07-90       Petrópolis   Fluminense
Rec.Sport.Soccer Statistics Foundation (RSSSF)
Rec.Sport.Soccer Statistics Foundation (RSSSF) has a massive collection of formatted plain text statistics. Here is an example of English Premier League results.
Example of the data for table results:
1.Chelsea 8 7 1 0 23- 8 22
2.Manchester City 8 5 2 1 18- 8 17
3.Southampton 8 5 1 2 19- 5 16
4.West Ham United 8 4 1 3 15-11 13
and scores:
Round 1
[Aug 16]
Arsenal 2-1 Crystal P
Leicester 2-2 Everton
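The standings lines are regular enough to parse with one regular expression. In this sketch I am assuming the columns are position, team, played, won, drawn, lost, goals for-against and points; that reading is my inference (it fits the four rows above under three points for a win), not something RSSSF states here. Note the optional space after the hyphen in entries like "23- 8".

```python
import re

# Standings rows copied from the example above.
TABLE = """\
1.Chelsea 8 7 1 0 23- 8 22
2.Manchester City 8 5 2 1 18- 8 17
3.Southampton 8 5 1 2 19- 5 16
4.West Ham United 8 4 1 3 15-11 13
"""

# Assumed column order: pos.team played won drawn lost for-against points
ROW_RE = re.compile(
    r"^(?P<pos>\d+)\.(?P<team>.+?)\s+(?P<played>\d+)\s+(?P<won>\d+)\s+"
    r"(?P<drawn>\d+)\s+(?P<lost>\d+)\s+(?P<gf>\d+)-\s*(?P<ga>\d+)\s+(?P<pts>\d+)$"
)


def parse_standings(text):
    """Return one dict per standings row; non-matching lines are ignored."""
    rows = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if not m:
            continue
        row = {key: int(value) for key, value in m.groupdict().items() if key != "team"}
        row["team"] = m.group("team")
        rows.append(row)
    return rows


standings = parse_standings(TABLE)
for row in standings:
    print(row["pos"], row["team"], row["pts"], row["gf"] - row["ga"])
```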
football-data.co.uk
football-data.co.uk is a betting and odds website that has made a lot of historical league data available as csv files. The data includes results and a lot of betting/odds related data. I have tried to aggregate and clean up the data in the following repo: github.com/jokecamp/FootballData.
Leagues and divisions included:
England Football Results Premiership & Divs 1,2,3 & Conference
Scotland Football Results Premiership & Divs 1,2 & 3
Germany Football Results Bundesligas 1 & 2
Italy Football Results Serie A & B
Spain Football Results La Liga (Primera & Segunda)
France Football Results Le Championnat & Division 2
Netherlands Football Results KPN Eredivisie
Belgium Football Results Jupiler League
Portugal Football Results Liga I
Turkey Football Results Ligi 1
Greece Football Results Ethniki Katigoria
The key/legend of all the field abbreviations gives you an idea of what is available in the CSV files:
Div = League Division
Date = Match Date (dd/mm/yy)
HomeTeam = Home Team
AwayTeam = Away Team
FTHG = Full Time Home Team Goals
FTAG = Full Time Away Team Goals
FTR = Full Time Result (H=Home Win, D=Draw, A=Away Win)
HTHG = Half Time Home Team Goals
HTAG = Half Time Away Team Goals
HTR = Half Time Result (H=Home Win, D=Draw, A=Away Win)
Match Statistics (where available)
Attendance = Crowd Attendance
Referee = Match Referee
HS = Home Team Shots
AS = Away Team Shots
HST = Home Team Shots on Target
AST = Away Team Shots on Target
HHW = Home Team Hit Woodwork
AHW = Away Team Hit Woodwork
HC = Home Team Corners
AC = Away Team Corners
HF = Home Team Fouls Committed
AF = Away Team Fouls Committed
HO = Home Team Offsides
AO = Away Team Offsides
HY = Home Team Yellow Cards
AY = Away Team Yellow Cards
HR = Home Team Red Cards
AR = Away Team Red Cards
HBP = Home Team Bookings Points (10 = yellow, 25 = red)
ABP = Away Team Bookings Points (10 = yellow, 25 = red)
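With the legend in hand, the CSV files are easy to work with using the standard `csv` module. The sketch below builds a points table from the `FTR` column. The sample rows are illustrative only: I assembled them from results quoted earlier in this post using the legend's column names, and the `E0` division code, the dates and the three-points-for-a-win rule are my assumptions.

```python
import csv
import io

# Illustrative rows only, laid out with the column names from the legend.
SAMPLE_CSV = """\
Div,Date,HomeTeam,AwayTeam,FTHG,FTAG,FTR
E0,16/08/14,Manchester United,Swansea City,1,2,A
E0,16/08/14,Leicester City,Everton FC,2,2,D
E0,16/08/14,Queens Park Rangers,Hull City,0,1,A
E0,16/08/14,Stoke City,Aston Villa,0,1,A
"""

# FTR value -> (home points, away points), assuming 3 points for a win.
POINTS = {"H": (3, 0), "D": (1, 1), "A": (0, 3)}


def points_table(csv_text):
    """Sum league points per team from the full time result column."""
    table = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        home_pts, away_pts = POINTS[row["FTR"]]
        table[row["HomeTeam"]] = table.get(row["HomeTeam"], 0) + home_pts
        table[row["AwayTeam"]] = table.get(row["AwayTeam"], 0) + away_pts
    return table


table = points_table(SAMPLE_CSV)
for team, points in sorted(table.items(), key=lambda item: -item[1]):
    print(f"{team}: {points}")
```

For a real file you would pass `open(path, newline="")` to `csv.DictReader` instead of the in-memory string.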
european-football-statistics.co.uk
www.european-football-statistics.co.uk is a visually dated website but has a lot of historical football data (mostly an overview of league/tournament results) displayed in nice clean HTML tables. Looks like they already have 2014 EPL stats. The site claims “The target of this site is to collect european football statistics which are not easily found on internet.”
openligadb.db
openligadb.db has an old-school windows asmx web service with methods such as “GetGoalsByMatch()”
Wikipedia
Wikipedia – has a lot of structured data and is also crowd/public sourced. You can use their API to query and then parse the data. The data is fragmented across individual pages, which makes this a good source if you are looking for very specific team/player data. For example, here is a table of Manchester United season results: http://en.wikipedia.org/wiki/List_of_Manchester_United_F.C._seasons.
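As a rough sketch of the "query" half, the snippet below turns an article URL into a MediaWiki Action API request that asks for the page's wikitext (`action=parse` with `prop=wikitext`). The helper names are mine, and the sketch stops at building the URL: fetching it and parsing the tables out of the wikitext is left out.

```python
from urllib.parse import unquote, urlencode, urlparse

# Standard MediaWiki Action API endpoint for English Wikipedia.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def title_from_url(page_url):
    """Extract the page title from a /wiki/<title> article URL."""
    return unquote(urlparse(page_url).path.rsplit("/", 1)[-1])


def wikitext_query_url(page_url):
    """Build an API URL that returns the article's wikitext as JSON."""
    params = {
        "action": "parse",
        "page": title_from_url(page_url),
        "prop": "wikitext",
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


article = "http://en.wikipedia.org/wiki/List_of_Manchester_United_F.C._seasons"
print(wikitext_query_url(article))
```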
Post War English & Scottish Football League A – Z Player’s Database
Post War English & Scottish Football League A – Z Player’s Database contains a lot of HTML tables of “players who appeared for their clubs between 1946/47 and the end of the 2013/14 season and who have now left their clubs.” Here is a list of ex-Manchester United players.
Stats included are: NAME, POS, SEASONS, SOURCE, TRANSFERRED TO, APPS, GOALS
2015 Women’s World Cup Data
FIFA PDF files – includes unformatted data on participating teams, schedules and random statistics
world-cup-women on github – plain text file list of teams and schedule
2015 Women’s World Cup Wikipedia page – includes a great visual bracket view
2014 World Cup APIs
kimono labs 2014 World Cup API – has a very nice RESTful API available. Free registration is required to access the API. The API has player, team, club, matches and playerseasonstats endpoints. See the documentation and start making calls with the API explorer.
2014 World Cup JSON API – Soccer for Good – The API is available now. The author explains that the data is from a scraper so the availability is not guaranteed but should be available throughout the tournament. There are endpoints for teams, matches, today, tomorrow and current. The Ruby on Rails source code is available on Github
World Cup in JSON – an open source ruby project available at github.com/estiens/worldcupjson that scrapes a few sources and combines into an API. API is available at http://worldcup.sfg.io/matches
Unofficial FIFA.com JSON API for Mobile Apps – This is unofficial and I wouldn’t be surprised if it is protected/unavailable soon. Until then it’s nice to see data straight from the source. Known endpoints: matches, teams or detailed match info
Deprecated/Retired – “The Graveyard of APIs”
ESPN API is available to registered users (free), but the data is very limited. You can get lists of players and teams, including all the players in the EPL, but all fixtures and scores are restricted to “strategic partners.” The Public API is being retired on Monday, December 8, 2014. Read the announcement.
StatsFC used to have a RESTful JSON API of all EPL scores and fixtures. It was about $8 (US) a month but was recently shut down. See their official statement. They still offer widgets and they plan on reviving their services. See their comments at the bottom of the page.
Other Reading / Resources
opendata.stackexchange forum
Are there any open datasets for Soccer statistics? – keep your eye on this open data forum for more answers.
Linked Soccer Data
Linked Soccer Data (pdf) is a white paper on one group’s attempt to “create a dataset including reliable information about soccer events covering as many historical data as available including recent competition results.” Some dead links but worth a skim.
Fantasy Data
- EPL Fantasy Geek – Current season EPL Fantasy stats for every player.
- Fantasy Soccer Data Downloader on github – A simple command line tool to download English Premier League (fantasy) soccer data
Even more links to explore
You’ve made it this far. Why stop now?
- MLS Player Salary information – 2007 to current salary amounts in pdf from the MLS Players union.
- See this 2012 opisthokonta.net blog post.
- Football Club Elo Ratings Data & Source
- sportscruncher links blog post and soccer ratings post
- Football kit colors
- Historical kits
- ENFA – English National Football Archive
- Association of Football Statisticians (AFS)
- European football squads since 1999
- FIFA Video Game Data at leereilly/fifa-soccer-12-ultimate-team-data
- Football Stats and Betting data
This list was compiled by:
http://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/