Frankenstat Rankings - Version 1.0
Frankenstats are stats calculated from multiple existing LPGA stats. Frankenstat Rankings - Version 1.0 was meant to be a proof-of-concept ranking system. The basic concept was that Frankenstats can be calculated from the existing LPGA statistics, and that the combination of LPGA stats and Frankenstats would allow a better understanding of the strengths and weaknesses of LPGA players' golfing skills relative to each other. The system should rate the players from best to worst for a series of skills and for an overall ranking. The LPGA website has data from 2004 to 2009 (six years), upon which the system must be built.
Most people believe that longer hitters have an advantage over shorter hitters. Therefore, another purpose was to show how that advantage manifests, and what shorter hitters can do to minimize it.
I will use individual players to illustrate the points I wish to make. However, the individual players represent the different types and Tiers of players. Annika Sorenstam (pre-injury) and Lorena Ochoa have been the only dominant players in the time span for which data is available. Discussion of either will be about what it takes to be a dominant player, which I define here as more than 5 victories in more than one season: Tier 1 players. One step down the pyramid are those with 3 to 5 victories in a season, Tier 2 players (Paula Creamer, Cristie Kerr, Meg Mallon, Suzann Pettersen, Jiyai Shin, Karrie Webb). Tier 3 players are those with 2 victories in a season or multiple seasons with a victory (too many to name all of them). Tier 4 players are those with a single victory. Tier 5 players are those who have not won.
Both Tier 1 players are long hitters, averaging over 260 yards in driving distance. Tier 2 players are a combination of long hitters (Pettersen, with Kerr and Webb just over or just under 260 depending on the year) and mid-range hitters at just under 250 yards (Creamer and Shin). Tier 3, Tier 4 and Tier 5 players go from long hitters to short hitters and everything in between.
The first Frankenstat I use, I borrowed (stole) from Hound Dog. Driving Efficiency is (Driving Distance * Driving Accuracy + Driving Distance) / 1.8. The calculation gives added weight to distance, but allows the mid-range hitters to score well if they hit a very high percentage of fairways. Longer hitters have an advantage because they have shorter irons into the greens, which should make it easier to hit a higher Percent Greens In Regulation. However, it is easier to hit the green in regulation from the fairway than from the rough. So I believe the calculation to be a good compromise between distance and accuracy.
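The Driving Efficiency formula can be sketched in a few lines of Python. This is only an illustration: the function name is made up here, and it assumes Driving Accuracy is entered as a fraction between 0 and 1 (the post does not say whether a fraction or a percentage is used). The two sample players are hypothetical.

```python
def driving_efficiency(distance_yards: float, accuracy: float) -> float:
    """(Driving Distance * Driving Accuracy + Driving Distance) / 1.8.

    Assumption: accuracy is a fraction of fairways hit (0.0 to 1.0).
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    return (distance_yards * accuracy + distance_yards) / 1.8


# Hypothetical players: a long hitter who finds 9 of 14 fairways
# and a mid-range hitter who finds 11 of 14.
long_hitter = driving_efficiency(269.0, 9 / 14)
mid_hitter = driving_efficiency(249.0, 11 / 14)

print(f"long hitter: {long_hitter:.1f}")
print(f"mid hitter:  {mid_hitter:.1f}")
```

With these inputs the mid-range hitter comes out slightly ahead, which is the trade-off between distance and accuracy that the formula is meant to allow.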
The first LPGA stat I use is Percent Greens In Regulation. I think that it is a good proxy for how well a player hits her irons. The better iron players will hit a higher %GIR than a lesser iron player. Those players who score well year after year on %GIR must be the better iron players.
The second Frankenstat I use is Strokes Tee to Green. I calculate the Scoring Average for the tournaments for which the LPGA keeps putting stats. Strokes Tee to Green is then the Scoring Average minus the Average Putts Per Round. There are three things in play with respect to Strokes Tee to Green: %GIR is a big factor, then a player's ability to get up and down from the sand or from around the green, and finally how many times a player is able to reach a Par 5 in two.
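As a small worked example, the subtraction looks like this in Python. The function name and the sample numbers are hypothetical, chosen only to show the arithmetic.

```python
def strokes_tee_to_green(scoring_average: float, putts_per_round: float) -> float:
    """Scoring Average minus Average Putts Per Round."""
    return scoring_average - putts_per_round


# Hypothetical player: 70.5 scoring average, 29.5 putts per round.
example = strokes_tee_to_green(70.5, 29.5)
print(example)
```

Both inputs should come from the same set of tournaments, since the post computes the Scoring Average only over events with putting stats.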
The second LPGA stat used is the Putts Per Green In Regulation, which is the best pure putting statistic available.
The third Frankenstat used is the Adjusted Total Putting. Using the Percent Greens In Regulation, the Putts Per Green In Regulation, and the Average Putts Per Round, I calculate the Adjusted Total Putting: the number of putts per round as if each player hit 12 greens in regulation and missed 6 greens. That calculation removes the %GIR from the putting stat to give a better comparison of total putting, using an intermediate calculation of Putts Per Greens Missed.
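The post does not spell out the exact formula, so the following Python sketch is one reading of the description, not necessarily the calculation actually used. It assumes %GIR is a fraction, that putts on missed greens equal total putts minus putts taken on greens hit in regulation, and that a round is 18 holes. All names are illustrative.

```python
HOLES = 18


def putts_per_green_missed(gir: float, putts_per_gir: float,
                           putts_per_round: float) -> float:
    """Intermediate stat: average putts on greens NOT hit in regulation.

    Assumption: putts on missed greens = total putts - putts on GIR greens.
    """
    greens_hit = HOLES * gir
    greens_missed = HOLES - greens_hit
    if greens_missed <= 0:
        raise ValueError("player missed no greens; stat is undefined")
    putts_on_greens_hit = greens_hit * putts_per_gir
    return (putts_per_round - putts_on_greens_hit) / greens_missed


def adjusted_total_putting(gir: float, putts_per_gir: float,
                           putts_per_round: float,
                           std_hit: int = 12, std_missed: int = 6) -> float:
    """Putts per round as if the player hit 12 greens and missed 6."""
    missed_rate = putts_per_green_missed(gir, putts_per_gir, putts_per_round)
    return std_hit * putts_per_gir + std_missed * missed_rate


# Hypothetical player: 72% GIR, 1.80 putts per GIR, 30.0 putts per round.
print(round(adjusted_total_putting(0.72, 1.80, 30.0), 2))
```

A useful sanity check on this reading: a player who already hits exactly 12 of 18 greens gets an Adjusted Total Putting equal to her actual putts per round.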
The Frankenstat Ranking Version 1.0 ranks the players from 1 to however many players are included in the stats for each year (146 to 169 depending on the year). Then, using each player's place on the list for each stat, it calculates an average place to give an average rating. Ordering the averages from lowest to highest then ranks the players from best to worst. To be honest, that is a terrible way to determine an overall ranking. A number of players may have very nearly the same value in a stat but be fairly far apart on the list. Also, there is no way to compare a ranking from one year to another. I am now working on Frankenstat Ranking Version 2.0. The idea is to calculate a value for each of the five stats versus a set standard, weight each stat at 20% of the total, and calculate a numeric value for each player. Then it is possible to compare an individual player's rating from year to year as well as compare any player to another for any year (is Sorenstam's 8 win year better or worse than Ochoa's 8 win year?).
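The Version 1.0 averaging of list places can be sketched as follows. The players and values are hypothetical, only two of the five stats are shown, and ties are not handled; the function names are illustrative.

```python
def rank_positions(values, lower_is_better):
    """Return {player: place}, where place 1 is the best value.

    Assumption: no ties; tied players simply get consecutive places.
    """
    ordered = sorted(values, key=values.get, reverse=not lower_is_better)
    return {player: place for place, player in enumerate(ordered, start=1)}


def average_place(stats, lower_is_better):
    """Average each player's place on the list across all stats."""
    places = {name: rank_positions(values, lower_is_better[name])
              for name, values in stats.items()}
    players = next(iter(stats.values())).keys()
    return {p: sum(places[name][p] for name in stats) / len(stats)
            for p in players}


# Hypothetical data for three players and two stats.
stats = {
    "driving_efficiency": {"Player A": 250.0, "Player B": 245.0, "Player C": 240.0},
    "putts_per_gir": {"Player A": 1.80, "Player B": 1.75, "Player C": 1.78},
}
lower_is_better = {"driving_efficiency": False, "putts_per_gir": True}

averages = average_place(stats, lower_is_better)
overall = sorted(averages, key=averages.get)  # lowest average place first
print(averages)
print(overall)
```

The weakness described above shows up even here: Player B and Player C differ by only 0.03 putts per green, yet that gap counts as a full place, the same as the 5-unit gap in Driving Efficiency.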
Observations below the fold.
This method of calculating a ranking will not correlate to who actually wins tournaments. This ranking is more about how a player plays on average for the whole year. Winning is about doing well in a particular week. Often winning is about having a hot putter. A hot putting round in the last round of a close tournament can result in a win. Sometimes it is about having a hot putter for the whole tournament or maybe for just 2 rounds. Sometimes the winner is determined not by who makes the most birdies, but by who makes the fewest bogeys. In that case the hot putter does not win, but the best play from tee to green does.
In 2001, before the LPGA website's recorded statistics begin, Annika Sorenstam shot a 59 in a tournament. From what I can find, she had 13 birdies, hit all 18 greens, and had 25 putts. She had 11 one-putt greens and 2 two-putt birdies on Par 5's. I do not know how many feet of birdie putts she holed. Sorenstam shot 261, which was 27 under par, and won by two strokes (hot putter yields 27 under par). In 2008 Paula Creamer shot 60 in a tournament. She had 11 birdies, hit all 18 greens, and had 25 putts. Creamer had 11 one-putt greens and holed 125.5 feet of birdie putts (average of 11.4 feet per birdie putt). The next day Creamer had 7 birdies and holed 116 feet of birdie putts (average of 16.6 feet per birdie putt). Creamer won the tournament by two shots after final rounds of 70 and 73. In 2008 Hee-Won Han shot 61 in a tournament. She had 11 birdies and made 123 feet of birdie putts (average of 11.2 feet per birdie putt). Her next best round was a 69 and she lost by 2 shots. A hot putter can win a lot of tournaments, but don't count on winning with one hot round.
Second point: I mentioned above that long hitters can do well on Strokes Tee to Green by reaching Par 5's in two shots, and that I wanted to understand how long hitters gain their advantage. The long hitters make more eagles, but I believe a bigger advantage for the longest hitters is the number of two-putt birdies they make. I would like to have a Frankenstat for two-putt birdies, but with the available data it is not possible to calculate an accurate number. It is possible to calculate the number of birdies a player makes based on Percent Greens In Regulation and Putts Per Green In Regulation, but the number does not account for birdies made from off the greens or for 3-putt bogeys and worse, which lower the number of birdies calculated. Subtracting the calculated birdies from the number of birdies actually made gives only an approximate number for Two Putt Birdies. As a general rule, those numbers are higher for longer hitters when I compare players by driving distance who have about the same Putts Per Green In Regulation. It seems logical that players with higher Putts Per Green In Regulation would tend to have more three-putt bogeys than the better putters, which is why I compare players with approximately the same Putts Per Green In Regulation.
The player with the lowest ranking by Frankenstats who actually won a tournament was Silvia Cavalleri in 2007. Hound Dog has referred to victories like hers as a fluke (defined as a stroke of luck). Some people object to the term fluke, since those players played well enough to actually earn their playing privileges. Whatever you call the victories singled out by Hound Dog as flukes, they were totally unexpected based on the performance of those players over a significant period of time. Cavalleri won the tournament with 4 rounds in the 60's (69-68-69-66). Out of the 67 rounds she played in 2007, she had 9 rounds in the 60's and 20 rounds under par. Somehow she put together four such rounds for her victory. I like the term Fluke Victories under those circumstances.
This has gotten to be long, so I will add more observations with the next post, when I roll out Frankenstat Ranking Version 2.0.
Comments
Tatkins strikes again!
Good stuff on Cavalleri, especially the fact that four of her nine 2007 rounds in the 60s came on the same weekend!
While I generally agree with the statement about GIR (that it shows good iron play), there is evidence that the longer players have a much easier time reaching the green in regulation. I’ve tried adjusting for that with Approach Factor (GIR “minus” Total Driving, so to speak) but am not wholly satisfied with it. Any ideas?
More Frankenstats to come
With six years of data to work with, I am still working on other calculations. I do not yet have all of the available LPGA stats in my database. I add to the database as I decide what I want to calculate next. As I add the rest of the stats I will consider Approach Factor and other things.
An observation?
Tatkins says “it is easier to hit the GIR from the fairway than the rough” and Hound Dog says “there is evidence that the longer players have a much easier time reaching the green in regulation.”
Evidence from the PGA Tour is that the long hitters do have it easier, even from the rough, because they’re using a shorter club. But the grooves issue also makes it clear that rough is a major factor. You might say that GIR should be better (in this order) from:
1) close to green in fairway
2) close to green in rough
3) far from green in fairway
4) far from green in rough
The question here is whether 2 is better than 3, or vice versa. The groove change later this year for the ladies may change that, but for now let’s use the order in the list. Perhaps you could assign each a multiplier and call it your “Approach Factor,” then use it to adjust the GIR stat. This multiplier would be 1 for #1 on the list, then increase as you went down.
For example, if a long player and a short player have identical GIR stats, and both hit from the fairway, the long player would use their GIR directly (GIR * 1) while the shorter player gets extra credit for being farther away (maybe GIR * 1.25).
Obviously this is a judgment call: how much better is #3 on the list than #1? How much worse than #4? The best I can think of (if you have the stats) is to compare how good an individual player’s GIR is on short par-4s vs. long par-3s, then relate these to their overall GIR. But while distance plays a part, I think driving accuracy is the bigger deal because rough makes it harder to accurately judge the distance you must hit the ball to land it on the green. I might be proved wrong by the par-4/par-3 comparison, but I think any multiplier has to give more weight to accuracy off the tee than to distance.
Mike Southern
www.ruthlessgolf.com
Available Stats
That is the problem. As a dirty old man I prefer to watch the ladies, and the LPGA keeps far fewer statistics than the PGA. So it is more difficult to calculate a realistic evaluation of skill levels for the LPGA players with the available data. Actually, my interest in women’s sports came from helping a relative coach a women’s fast pitch softball team. I learned to appreciate the athletic ability and dedication of women athletes. Right now I am looking for trends in the data, and further analysis will be possible when I have all the available data in the database. It takes a lot of time to enter all of the data, as there are almost 1000 data points for each stat entered when you consider all six years that are available.
In addition to being better-looking...
I think most people play better by imitating the women rather than the men. The ladies focus on using what they have, while the men seem to try every fad that comes up! Since my blog is more instructional, I would find the ladies’ game appealing for that reason alone.
Of course, looks don’t hurt! 8-)
Mike Southern
www.ruthlessgolf.com
by Ruthless Mike on Feb 3, 2010 2:02 PM PST
Obviously there are a lot of situations where 3 is preferable to 2. “Rough” is a huge area which includes trees and water, while “Fairway” is relatively well defined. At the same time – whether a player slices one 40 yards offline or barely reaches the first cut, it still counts as a missed fairway. For that reason (along with a few others), Accuracy is less reliable in predicting a player’s overall success than Distance. So I have to disagree with you that Accuracy is the bigger deal and deserves more weight – unless I’m interpreting your point backwards because you’re relating it in some way to GIR?
Let me try it again then....
The point of the Frankenstats is to discover who is more skilled in different areas, right? I’m saying GIR from the rough is a better reflection of a player’s approach skills than GIR from the fairway.
I understand that “rough” could include a lot of things when you miss the fairway, but it’s also unlikely you’d hit the green in regulation if you put it in the drink, agreed? So if you’re missing a lot of fairways and STILL hitting a lot of greens, that means you’re hitting great shots the rest of the time. I interpret that to mean you’re more skilled at hitting approach shots than the person who’s in the fairway but has the same GIR stat. (The putting stats should reflect if the fairway player is hitting them close enough to make a difference.)
Again, if a player hits it 40 yards offline and STILL hits a high GIR, then that player is hitting better approach shots than another player with the same GIR stat but consistently hitting from the fairway.
That’s why I think accuracy is a greater factor in adjusting GIRs, and why it should carry a greater weight in your Approach Factor. You expect a longer shot to be harder than a short shot, but it’s still a fairly straightforward shot if both are in the fairway. Unless the green is ridiculously small for the expected approach shot, any decently-hit shot from the fairway — even a long shot — can be expected to hit and stay on the green. By comparison, the variety of lies you can get in the rough means any shot that stays on the green probably took more skill, so it also means a good GIR rating from the rough demonstrates more skill than a comparable GIR from the fairway. (And presumably more potential to go low on days when your accuracy is better.)
Or, to put it another way, it takes less skill to have a high GIR number when you’re always in the fairway than if you’re always in the rough. Distance is important and it does have an effect, but not as much unless (1) your bomb always puts it in the fairway, in which case you’re deadly accurate anyway, or (2) you’re much longer than your competition. (Off-hand, I would define “much” as at least 20 yards or two full clubs longer.) Even then, unless your bombed misses are in the short rough, the extra length probably left you a more difficult shot than the shorter player has because, even if she’s 20 yards behind you, she’s hitting from a much easier lie in the fairway. In that case, you still need more skill to hit and hold the green than the player in the fairway.
Does that explain my logic better?
Mike Southern
www.ruthlessgolf.com
by Ruthless Mike on Feb 3, 2010 5:14 PM PST
got it
You were still talking about Accuracy as it relates to GIR – my mistake. And I agree with your points.
Like I told Tatkins...
I don’t see how you guys avoid migraines — I had trouble just trying to explain what I was thinking. I can’t imagine what’s involved in turning it into a number!
Maybe you could turn it into a game, like Madden NFL… Predict the winners of the tournaments & get rich! ;-D
Mike Southern
www.ruthlessgolf.com
by Ruthless Mike on Feb 4, 2010 8:30 AM PST
As an engineer I think of the overall picture (design) as well as the details
There are 18 holes, normally 4 Par 3’s, 4 Par 5’s and 10 Par 4’s. I have heard for years that players at the highest level get progressively better the shorter the club. So with a 9 iron they would hit an area around the hole that is smaller in diameter than with an 8 iron, and an 8 iron diameter would be smaller than the diameter for a 7 iron, etc. The commentators have said that when playing from the same tee on a Par 3, the longer players usually use a shorter club than the shorter hitters and have an advantage. So let’s give a minor advantage to the longer hitters on the 4 Par 3’s.
The 4 Par 5’s are a much bigger advantage for the long hitters. Short hitters can normally reach one maybe two of the Par 5’s in two, the long hitters can reach 2 and probably 3 of the Par 5’s in two. At 20 yards longer they are hitting at least 2 and probably 3 clubs shorter, a much bigger advantage indeed. Also if the longer hitter misses the fairway on their drive, they can use their second shot to get back in the fairway and the third shot to the green gives a GIR as does reaching the green in 2.
The same advantage exists for the long hitters on the 10 Par 4’s, as they are hitting shorter irons into the greens. But because many Par 4’s also require placement in the fairway, they often use a shorter club off the tee, so they do not have as big an advantage as on Par 5’s.
Now consider some actual stats. Pettersen, Tseng, and Wie all had driving averages right at 269 yards, while Creamer had a driving average at 249 yards. Creamer’s accuracy has her hitting 11 of 14 fairways, Pettersen and Tseng would hit 9 of 14 fairways and Wie would hit 8 of 14. So Creamer has an advantage on two holes compared to Pettersen and Tseng and three holes compared to Wie. So Pettersen and Tseng have the advantage on 16 holes and Wie 15 holes compared to Creamer.
Creamer hits GIR at a 75% rate, Pettersen at a 72% rate, Tseng at a 71.5% rate, and Wie at a 70% rate. In my book Creamer is the much better iron player and would still be a better iron player if she only hit the 72% GIR the same as Pettersen, because of the advantages on the majority of holes that Pettersen has. Pettersen is a slightly better iron player than Tseng. As yet I am not sure about where I would place Wie. She hits one less fairway and a lower %GIR results as would be expected if she were about the same quality of Iron player as Pettersen and Tseng.
hmm...
You just gave me an idea on how to revisit Approach Factor. We know how often a player reaches the green with GIR and we can assume we know how good her tee shots are by using Total Driving/Efficiency. What formula would give us a raw number to express Approach Factor (rather than the “by rank” formula I settled on before)? Let’s try:
x/TD=GIR or its equivalent, TD*GIR=x
where x is Approach Factor.
Creamer, TD is 75.04 and GIR is .747 so x=56.05
Pettersen, TD is 74.17 and GIR is .721 so x=53.48
Tseng, TD is 74.21 and GIR is .717 so x=53.21
Wie, TD is 70.95 and GIR is .702 so x=49.81
Can I say “Eureka”?
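The four Approach Factor values above can be reproduced with a short Python check. The Total Driving (TD) and GIR figures are the ones quoted in this comment; the function and variable names are illustrative.

```python
def approach_factor(total_driving: float, gir: float) -> float:
    """First Approach Factor formula from the thread: x = TD * GIR."""
    return total_driving * gir


# (Total Driving, GIR as a fraction), as quoted in the comment.
players = {
    "Creamer": (75.04, 0.747),
    "Pettersen": (74.17, 0.721),
    "Tseng": (74.21, 0.717),
    "Wie": (70.95, 0.702),
}

factors = {name: round(approach_factor(td, gir), 2)
           for name, (td, gir) in players.items()}

for name, x in factors.items():
    print(f"{name}: x = {x}")
```

Rounded to two decimals, the results match the figures quoted: 56.05, 53.48, 53.21 and 49.81.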
You may well be on to something
I will have to run the calculation on the complete list of players and try my logic progression given above on a number of players to see how well the calculation fits. I chose the players in the above example based on Ruthless Mike’s example of 20 yards difference in driving distance.
So far I think it looks good
When I compare multiple variables it is looking good. The majority of the players at the top of the list (Ochoa, Kerr, Stanford, Pettersen, Tseng) are the longer hitters, and they are also among the best players. Creamer, Shin, Ai Miyazato, and In-Kyung Kim are the shorter hitters that are toward the top of the list and are also among the best players. Players like Brittany Lang, Maria Hjorth, and Sun Young Yoo who are highly rated in the Approach Factor but don’t rate as highly overall tend not to be rated as good putters. Shin ranks 8th in Approach Factor, but based on my putting stats is the number one ranked putter. Creamer is number one in Approach Factor but does not rate as well in putting (I heard Creamer now wears contact lenses, which might explain why last year was her worst putting year).
try this one instead
GIR*(100-TD)=x
I’m using the TD number as it relates to its theoretical ceiling (100 is 300 yds per drive with a 100% accuracy rating). This appears to yield the results I’ve been looking for – that is, a player who is middle-of-the-pack in Total Driving but great in GIR (Wendy Ward is a great example) winds up with one of the highest Approach Factor numbers.
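To see how the second formula behaves, here is a Python sketch that applies it to the TD and GIR figures quoted earlier in the thread for Creamer, Pettersen, Tseng and Wie. Running those four players through this formula is an illustration added here, not a result posted by anyone in the thread, and the names in the code are illustrative.

```python
TD_CEILING = 100.0  # 300 yds per drive with 100% accuracy, per the comment


def approach_factor_v2(total_driving: float, gir: float) -> float:
    """Second formula from the thread: x = GIR * (100 - TD)."""
    return gir * (TD_CEILING - total_driving)


# (Total Driving, GIR as a fraction), as quoted earlier in the thread.
players = {
    "Creamer": (75.04, 0.747),
    "Pettersen": (74.17, 0.721),
    "Tseng": (74.21, 0.717),
    "Wie": (70.95, 0.702),
}

v2 = {name: approach_factor_v2(td, gir) for name, (td, gir) in players.items()}

for name in sorted(v2, key=v2.get, reverse=True):
    print(f"{name}: x = {v2[name]:.2f}")
```

For these four players, Wie comes out highest because her Total Driving is the lowest, even though her GIR is also the lowest. The formula rewards a player for the distance between her driving and its ceiling, so it credits hitting greens despite weaker driving.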
So, has the purpose of the Approach Factor changed?
Forgive the noob here if I’m lost in the stats, but I thought you intended to use Approach Factor as an adjustment to other stats. If I’m following the exchange here, it sounds like you two have come up with an entirely new “absolute” stat that compares iron play by adjusting for length differences. In other words, this is a stat that, if several players hit the same iron, allows you to determine which one hits the iron best?
Mike Southern
www.ruthlessgolf.com
my purpose
I want it to measure how much a player improved her position from her average tee shot (her Total Driving number) to her GIR rating. Using the Wendy Ward example, she was 96th in Total Driving but managed to finish 3rd in GIR. I can’t imagine anybody improving her position more than that so Wendy’s Approach Factor should be very high.
It seems that Tatkins might be looking for something different but I’ll let him speak to that.
I must admit to some confusion myself
The second equation, GIR*(100-TD)=x, appears to me to determine who were the wildest in driving accuracy and/or the shortest in driving distance that do the best job in hitting Greens In Regulation. More like a Recovery Factor for GIR than an Approach Factor, which is maybe what Hound Dog wanted all the time, based on his example of Wendy Ward.
My goal is to find an equation to determine who the best players are and what part of their game is the best and how it compares to the other players. I will run lots of calculations that will never see the light of this blog, because they do not tell me anything. How I use any stat may not be what I thought when I first made the calculation. The key for me is to find an equation that compares the games of the players and tells me the best players BASED ON THE NUMBERS. I prefer that, as opposed to listening to the so called experts tell me who is best in their BIASED OPINION.
So this really is a new stat...
It measures a player’s skill at controlling an iron. It’s sort of like describing Total Driving by combining Driving Distance and Driving Accuracy, only the “lie” is always the same. This new stat is “Total Approaching” and it combines distance (indirectly, from Driving Distance) and accuracy (GIR) from a variety of lies (indirectly, from Driving Accuracy). Correct?
Same stat, but each of you plans a different use for it. Hound Dog is looking for a way to track a single player’s improvement, while Tatkins is looking for a way to compare the skills of various players.
Mike Southern
www.ruthlessgolf.com
by Ruthless Mike on Feb 4, 2010 5:40 PM PST
You are correct, except
Hound Dog is more interested in the second equation, GIR*(100-TD) = x. I am more interested in the first equation TD*GIR = x. However, once the calculations are done and that takes very little time with all of the data in the spreadsheet, we must decide if the results are meaningful. We could decide that both stats, just one, or neither have any real meaning. We do that by comparing the results to other stats and what we know from watching hundreds of hours of LPGA play. If the results do not make sense, we throw out the calculation and try something else. Normally, we do this out of sight and the things that aren’t right never show up in public, unless we are looking for ideas from other people.
Ok, I follow that. Thanks.
Mike Southern
www.ruthlessgolf.com
by Ruthless Mike on Feb 5, 2010 9:31 AM PST
