Friday 15 July 2011

What data is valid?

How should I decide what data to include in my analysis?

Where is the similarity between games greatest: within a particular season? Within a particular league? Among the matches of a particular team?

Is there more similarity between a 2009-2010 Premier League match and a 2010-2011 Premier League match, or between a 2010-2011 Ligue 1 match and a 2010-2011 Premier League match, or between a 2010-2011 Premier League Chelsea match and a 2010-2011 FA Cup Chelsea match?

So far I have been analyzing data by season. I reset my analysis at the beginning of each season. But within each season, all teams contribute to the model. Is it really reasonable for a Blackburn-Bolton match to affect the model I use to predict a Spurs-Fulham match in the same season?
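One way to make the question concrete is to treat the choice of valid data as a grouping key: matches that share a key are pooled and feed the same model. The sketch below is illustrative only. The `Match` record, the `pool_matches` helper, the scores, and the Ligue 1 fixture are all made up for the example, and the "model" is just mean goals per match, standing in for whatever is actually fitted.

```python
from collections import defaultdict
from typing import NamedTuple


class Match(NamedTuple):
    # Hypothetical record layout, assumed for illustration.
    season: str
    league: str
    home: str
    away: str
    home_goals: int
    away_goals: int


def pool_matches(matches, key):
    """Group matches by key(match); each pool would feed one model."""
    pools = defaultdict(list)
    for m in matches:
        pools[key(m)].append(m)
    return dict(pools)


def mean_goals(pool):
    """Placeholder 'model': average total goals per match in the pool."""
    return sum(m.home_goals + m.away_goals for m in pool) / len(pool)


# Invented scores, not real results.
matches = [
    Match("2010-2011", "Premier League", "Blackburn", "Bolton", 1, 0),
    Match("2010-2011", "Premier League", "Spurs", "Fulham", 2, 1),
    Match("2009-2010", "Premier League", "Spurs", "Fulham", 2, 0),
    Match("2010-2011", "Ligue 1", "Lille", "Lyon", 1, 1),
]

# Pooling by season: every match in a season shares one model.
by_season = pool_matches(matches, key=lambda m: m.season)

# Alternative: one model per league per season.
by_league_season = pool_matches(matches, key=lambda m: (m.league, m.season))

# Alternative: one model per team, across seasons and competitions.
spurs_only = [m for m in matches if "Spurs" in (m.home, m.away)]

for name, pool in by_season.items():
    print(name, len(pool), mean_goals(pool))
```

Swapping the `key` function is all it takes to compare pooling schemes, so each scheme could be judged on how well the resulting model predicts held-out matches.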
