Renault Analysis Questioned


14 replies to this topic

#1 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 01 January 2011 - 12:34

I’d like to draw your attention, if I may, to the cover story which appears on pp10 & 11 of Autosport 30/12/10 written by F1 Editor Edd Straw. For those who have not seen the issue, on the cover is a banner declaring: “F1’s Untold Story. Revealed: How Renault [F1] won the development war”. However, inside the headline reads “Renault and Red Bull top the F1 2010 development race”, which isn’t quite the same.

The main thrust of this story, as far as I can tell, is that Renault out-developed their car compared to the other teams (except RBR, one assumes). I say “as far as I can tell”, because the level of scientific rigour used to support this claim isn’t very good. The data source is obscure, the handling of the data is obscure, the data measure is missing completely, there may be double-counting, and the main conclusion of the results could be interpreted as selective. As a long-term reader of Autosport, I’m rather concerned by the publication of this piece, and its ramifications.

Having seemingly kicked 13 black cats and broken as many mirrors, for my sins I have had to study statistics twice (for professional exams and later towards my degree), and while I’m not a statistician, there appear to be a number of quite elementary things not exactly right with the presentation of the data used to support the conclusion.

In the magazine, this comes in the form of a graph which takes up half of p10. The unlabelled horizontal x-axis lists the 2010 grands prix combined into five blocks: Bahrain-China, Spain-Canada, Europe-Hungary, Belgium-Japan and Japan-Abu Dhabi. Note that Japan appears in both the 4th and 5th blocks. The vertical y-axis, which is inverted, begins at 100.000 at the top and meets the x-axis at 103.000. This axis is also unlabelled and therefore there is no indication as to what these values represent. My guess is that they are percentages, with the top performing team, Red Bull, representing 100% of something at the start, with the value scores of the other teams (which reasonably excludes the 2010 newbies) being related to them. The text is no help as to what this ‘something’ might be, because the article dives in with the analysis without describing the methodology. Again I hazard a guess that these values are related to time, and that by using Red Bull as a benchmark, the other teams can be assessed relative to them. And assuming it is percentages of time which are being compared, it is not specified whether this is low fuel qualifying (for pure speed), individual fastest laps per GP (speed in race conditions) or the overall time taken to complete races. It also isn’t specified whether both drivers’ times for each GP have been averaged or combined to create the percentage, and if it is race data being used there is no indication as to what happens where drivers fail to set any time in a race due to a Lap 1 crash (e.g. for Petrov, Hulkenberg, Massa and Liuzzi in Japan). Zero scores will naturally affect the data.
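To make that guess concrete, here is a minimal sketch of how a "percentage of the benchmark time" figure might be computed. This is only an assumption about the method, since the article does not state one; the function name `pace_percentages` and all lap times are hypothetical and do not come from Autosport. It also shows a property of any relative measure: a team's figure can move when only the benchmark's time changes.

```python
def pace_percentages(times):
    """Express each team's time as a percentage of the fastest time given.

    `times` maps a team name to a lap time in seconds (hypothetical data).
    The fastest team scores 100.0; slower teams score above 100.
    """
    benchmark = min(times.values())
    return {team: round(100.0 * t / benchmark, 3) for team, t in times.items()}


# Hypothetical figures: Renault sets the same time in both cases,
# but the benchmark (Red Bull) is slower in the second case.
early = pace_percentages({"Red Bull": 90.0, "Renault": 90.9})
late = pace_percentages({"Red Bull": 90.45, "Renault": 90.9})

# Renault's percentage falls (looks "better") although its lap time is unchanged.
print(early["Renault"], late["Renault"])
```

Whether the published values were built from qualifying laps, fastest race laps or race times, and from one driver or both, would change the inputs but not this basic arithmetic.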

My next problem with the data is the way it has been treated. As noted, the 19 races of 2010 are blocked together into five sets of four (if Japan has been double-counted), or into four sets of four plus one set of three if the 5th block has only been accidentally mislabelled by the graphics department. With 19 being a prime number, it would not be possible to have equal sized blocks (assuming again that Japan has not been double-counted), which is fair enough. But since one can’t have equal sized blocks to make fair like-for-like comparisons, why was the data clustered in the first place? It would have been preferable to have presented data points for all 19 races individually so that all the data was visible for scrutiny. Secondly, there is no information as to the mechanism used to cluster the data together. This is rather important, as any unusual spikes in the raw figures might distort (or ‘skew’) the final composite value found, and that would obviously affect the validity of the presentation and the subsequent interpretation. The effect of clustering like this is akin to what the US banks did to kick off the recession. By bundling up mortgages of various quality, the few very good ones obscured the utter awfulness of the many sub-prime lendings within each ‘Securitised’ bundle, which were then sold on to other banks. Now while a little good can mask a lot of bad, the reverse is also true, but either way the data’s statistical worth can be compromised.
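The two points above can be illustrated with a short sketch. It assumes the blocks are plain arithmetic means of per-race values, which is a guess, as the article gives no clustering mechanism; the helper names and the per-race figures are hypothetical. It checks that 19 races cannot be split into equal blocks, and shows how a block average can hide a single spike.

```python
from statistics import mean


def equal_block_sizes(n):
    """Return the block sizes (other than 1 and n) that split n races evenly."""
    return [size for size in range(2, n) if n % size == 0]


def block_means(values, size):
    """Average consecutive values in blocks of `size`; the last block may be shorter."""
    return [mean(values[i:i + size]) for i in range(0, len(values), size)]


# 19 is prime, so no equal-sized grouping exists.
print(equal_block_sizes(19))

# Hypothetical per-race percentages for one team: the first four races
# contain one poor result, the next four are perfectly steady.
per_race = [101.0, 101.0, 101.0, 103.0, 101.5, 101.5, 101.5, 101.5]
blocks = block_means(per_race, 4)

# Both blocks average out to the same value, so the spike is invisible.
print(blocks)
```

Plotting all 19 races individually, as suggested above, would avoid both the unequal block sizes and the masking effect.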

So in short, then, we’re presented with a graph with no labels, containing unspecified data values, unevenly grouped by an unknown criterion, gleaned from an undisclosed prime source. The data might actually be perfectly good, but we have no way of telling. This obscurity also means that the study cannot be replicated, which is a scientific no-no.

Now let’s look at the analysis of this uncertain data, for which we have to take the graph as read because we can’t make our own to test the validity of the published one.

The interpretation in the text admits that Renault at the end of the season were below the level they were at the beginning of the season. Assuming my earlier assumptions are correct (we have to make assumptions on assumptions, here), this is acceptable, because it is relative to the performance of Red Bull, though as noted already we’re not told what these values are. So, if Red Bull does very well, Renault might appear worse by comparison to that baseline, but may have in fact improved in real terms. Perhaps. The interpretation of this decline given in the text is that “[Renault F1] switched its focus to developing the 2011 car”, the implication being that the fall is due to Renault stopping while others carried on. Well hold on a minute. The thrust of the piece is supposed to be that Renault out-developed the other teams, but in fact Renault stopped developing due to money, while other teams continued to develop their 2010 cars and some their 2011 cars as well. How is that ‘out-developing’ the others or winning the “war” if some of their adversaries could work on both? Usually war winners tend to still be on the field of battle at dusk, with the enemy either dead or fled. The interpretation goes on: “If it [Renault F1] can combine a good base car with strong development pace next year, the team are SURE TO WIN RACES [my emphasis]…” Yes, and if I had a million pounds, I’d be a millionaire. The use of the word “if” here is key, and not worth much unless the numbers can be demonstrated satisfactorily to support the conclusion.

The interpretation relating to Renault F1 is based on the notion that at the end of the 4th block, ending at Belgium-Japan(1), Renault significantly closed the gap to RBR, and according to the graph so they did. What the analysis doesn’t mention, but the graph shows, is that part of the reason is that RBR themselves had a slump off the 100.000 mark (I’m not sure how that works if they are a benchmark, but that’s of course due to the unknown nature of the values used), so part of Renault’s ‘improvement’ in narrowing the gap is actually down to RBR being a bit off-song. At the end of the 5th block, Japan(2)-Abu Dhabi, according to the graph, Red Bull regains its former status, but Renault falls back again. The others do too, as one might expect, except for Mercedes GP and Sauber, who continue upward. At the end of block 5, Mercedes finishes above Renault after having had a plunge throughout blocks 1-3. So why isn’t the story about Mercedes’ development comeback from a poor start, given they not only overtook Renault, but repassed Williams too after a mid-season slump? Or for that matter, why isn’t the story about the cash-strapped Sauber team’s continued upward trend throughout blocks 2-5, overtaking STR and Force India? It might be perceived that there has been some selective data extraction, with the text seeming to look for good things to say about Renault at the front end of the piece, at the cost of what seem to be the more supportable stories of Mercedes’ fight-back, or Sauber’s return from the grave on a shoestring. It’s true, according to the graph anyway, that Renault also gained on RBR at the end of the 2nd block (Spain-Canada), but then so did McLaren (if to a slightly lesser extent) as RBR had had a little wobble during that period as well. Both Renault and McLaren then fell back at seemingly the same rate at the end of block 3, so one can’t read too much into that as being a purely ‘Renault thing’ either.

No doubt Genii, Renault F1’s owners, will be very pleased with this piece, because they can take it along to their sponsors and say: “Look - give us more money, because we’re going to be great in 2011. Autosport has proven it!” They can also take it on fishing expeditions to the sponsors of other teams, and say the same. Well, actually, Autosport has done nothing of the sort. Without considerably more detail as to how the results were derived, what we do have is a bunch of coloured lines, about whose source we’re told very little, except that the text tells us at the outset that the pretty lines say Renault could be better than the current top teams in 2011. And due to the obscurity of the data source, and how it was clustered into blocks, or whether Japan was counted twice or not, we can’t test that assertion for ourselves.

This Renault F1 development piece follows a previous Autosport cover and photo exclusive about the Lotus Group tie-in with Renault F1, where the phrase “the Real Lotus” was used on a front cover splash, implying by extension that the Lotus F1/Team Lotus outfit was somehow illegitimate. This earlier article was also penned by Edd Straw, and between the two issues, Renault Sport (not the F1 team) gave away a free calendar within the magazine. Now I’m not saying these events are evidence of a kind of conspiracy between Renault F1 and Autosport, and I am quite prepared to consider the possibility that Edd’s original piece (including the methodology) got sub-edded to death - pardon the pun - in order to make room for an enormous graph with missing axis labels.

Nevertheless, it doesn’t look good. I’ve read (and still possess) every issue of Autosport from January 1986 to date, and I can’t remember a time when the magazine has sailed so close to the wind when it comes to journalistic impartiality. First a snub to Hingham which would favour Renault F1, and now some questionable stats used to support a claim that in 2011 “…[Renault F1] are sure to win races”, which has the potential to advantage one team’s attractiveness above others to acquire sponsorship.

That doesn’t look good at all.

#2 korzeniow

korzeniow
  • Member

  • 5,671 posts
  • Joined: January 09

Posted 01 January 2011 - 13:56

Geez man :stoned: I came back to this post several times and spread the reading over instalments :stoned:

In brief: we don't know the methodology for computing this ultimate pace.

Because the graph presents ultimate pace and the teams' pace relative to it.

I don't know why they separated the races into those particular groups, but maybe they focused on the tendency. If they presented all the races, the teams' paces would look like zigzags, which makes it difficult for regular people to understand. But the tendency is most important. Just look at F1Fanatic's statistic zigzags: http://www.f1fanatic...ar-performance/ In Autosport's case you look and understand straight away what's going on.







#3 boldhakka

boldhakka
  • Member

  • 2,802 posts
  • Joined: September 10

Posted 01 January 2011 - 13:58

Awesome well-reasoned post. I don't have anything to say except that this is quite common and only the tip of the iceberg. There's a lot of pseudo-technical verbiage that passes for technical analysis in F1. The articles that do the best are the ones that spin a plausible sounding narrative around random or technically complex data that seem to support the thesis of the article. James Allen and many others do it very well and got/get paid to do it. There isn't much truth in any of these, but they can be very entertaining.

It is mostly harmless though.

#4 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 01 January 2011 - 14:17

Geez man :stoned: I came back to this post several times and spread the reading over instalments :stoned:

In brief: we don't know the methodology for computing this ultimate pace.

Because the graph presents ultimate pace and the teams' pace relative to it.

I don't know why they separated the races into those particular groups, but maybe they focused on the tendency. If they presented all the races, the teams' paces would look like zigzags, which makes it difficult for regular people to understand. But the tendency is most important. Just look at F1Fanatic's statistic zigzags: http://www.f1fanatic...ar-performance/ In Autosport's case you look and understand straight away what's going on.


Sorry to have bent your mind, but it needed time to explain!

One of the questions about the data is which 'ultimate pace' is being measured, and how? This isn't explained.

I've visited the F1Fanatics site as you recommended, and the top graph is how it should be done. They have explained where the data comes from, how they used it, and each race is clearly marked without duplication. The analysis is a description of the data without value judgements.

The Autosport chart seems to be the simpler, but actually it doesn't mean anything without knowing what is being measured, yet value judgements are drawn from it.

#5 Slartibartfast

Slartibartfast
  • Paddock Club Host

  • 9,582 posts
  • Joined: March 08

Posted 01 January 2011 - 14:31

I second boldhakka, that was an excellent post. I don't think the abuse/misunderstanding of statistical analysis is restricted to F1, it seems to be the journalistic norm. I prefer to assume that the failings in this case are due to over-abbreviation and accidental omissions rather than an attempt to deliberately mislead as part of some pro-Renault agenda.
It's a shame that Autosport didn't see fit to inform us of their methodology but appear to believe that the validity of their analysis should be taken on trust. In which case, why bother with a meaningless graph at all? Wouldn't it be better to fill the space with words, even if they are one journalist's opinion?

It is mostly harmless though.

That's what they said about the Earth!

#6 korzeniow

korzeniow
  • Member

  • 5,671 posts
  • Joined: January 09

Posted 01 January 2011 - 14:38

I don't know what your problem is. You can draw the same tendency from F1Fanatic's graph too.

#7 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 01 January 2011 - 14:40

Awesome well-reasoned post. I don't have anything to say except that this is quite common and only the tip of the iceberg. There's a lot of pseudo-technical verbiage that passes for technical analysis in F1. The articles that do the best are the ones that spin a plausible sounding narrative around random or technically complex data that seem to support the thesis of the article. James Allen and many others do it very well and got/get paid to do it. There isn't much truth in any of these, but they can be very entertaining.

It is mostly harmless though.


Many thanks for the praise!

I agree a lot of this stuff is mostly harmless, but in this instance I fear it's gone a bit beyond that. This isn't some fan-boy website we're talking about - it's Autosport, and it's come up with some uncertain numbers to tell anyone who reads it that Renault "are sure to win races" next year (though this statement is qualified).

The potential for "pseudo-technical verbiage" to be used to validate claims of success is much greater the more impeccable the source appears to be. Take the quotes extracted from reviews appended to movie posters, for instance: a "Terrific!" from Empire magazine carries a lot more weight than a "Terrific!" from the Piddling-under-Bridge Shopper. The F1Fanatics charts (see above) are superior, but whose would Gerard Lopez most likely show to sponsors? Because of the way the piece has been constructed, the article comes across less like reporting, and more like promoting.

Edited by Kif, 01 January 2011 - 15:05.


#8 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 01 January 2011 - 14:47

It's a shame that Autosport didn't see fit to inform us of their methodology but appear to believe that the validity of their analysis should be taken on trust. In which case, why bother with a meaningless graph at all?



Hoopy!

Well, yes. What in a sense we have here is opinion being presented as reporting by the use of an uncertain chart from uncertain data.

#9 techspeed

techspeed
  • Member

  • 373 posts
  • Joined: April 09

Posted 01 January 2011 - 15:48

I must admit to not having seen the Autosport article, not gone near the magazine after the "Real Lotus" debacle trussed up as news.

If the reason for the whole Renault puff piece is that the graphs show how Renault caught up to Red Bull more than the others, looking at the F1fanatic graph you can clearly argue that Sauber, Williams, Ferrari, McLaren did a better job. Even the three new teams did a better job of closing the gap to Red Bull than Renault did, although they did start somewhat further back and had more to gain.

It sounds much like Edd Straw for whatever reason chose to write a story saying how great Renault have been, then found the statistics needed to prove his story by rather selective analysis of the data he had. Nothing new there for modern journalism and politics in general, but more should be expected from Autosport which used to have a reputation for impartiality and accuracy of its reporting.





#10 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 01 January 2011 - 15:49

I don't know what your problem is. You can draw the same tendency from F1Fanatic's graph too.


The problems with this piece of statery I've kind of already explained, and subsequent posts may cover other points.

With regard to the F1Fanatics chart (and again I thank you for directing me there), what sets it apart from the Autosport chart is the degree of transparency, and how the data has been used in a purer form: no clustering to smooth out any bumps. Further, we cannot tell whether these two charts are actually using the same data or the same measurement of that data; we only know F1Fanatics' methods.

For argument's sake, let's assume the two graphs are indeed measuring the same thing. F1Fanatics state that a number of trends from their data are unclear, but Autosport has made a very specific prediction about Renault. If you overlay the Renault and Mercedes charts on F1Fanatics, it's evident that both had similarly erratic seasons, with Mercedes narrowly having the final word. In the Autosport chart, in the 4th and 5th blocks, Mercedes continues to advance, whereas Renault slips back in the 5th block, and again Mercedes comes out just ahead. If this is the same data, then that's as expected. But I don't see any predictions about 2011 wins for Mercedes in the Autosport article. In fact, Mercedes GP aren't mentioned in the analysis at all.

(And before someone asks, I'm not a fan of any one team or driver.)

#11 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 01 January 2011 - 16:01

If the reason for the whole Renault puff piece is that the graphs show how Renault caught up to Red Bull more than the others, looking at the F1fanatic graph you can clearly argue that Sauber, Williams, Ferrari, McLaren did a better job. Even the three new teams did a better job of closing the gap to Red Bull than Renault did, although they did start somewhat further back and had more to gain.


....but more should be expected from Autosport which used to have a reputation for impartiality and accuracy of its reporting.


I appreciate that you've illustrated my points quite nicely.

For the first part, I can't see how the Renault "sure to win races" claim (and the lesser claim made about being WDC contenders, which also appears) can be supported any more than it might for other teams, at least not without more information about the Autosport data.

For the second part, I'd have been a bit happier if the story wasn't accompanied by a front page splash and a monster headline within, where Renault is named before the 2010 Constructors' champions, when the evidence provided is untestable.

Edited by Kif, 01 January 2011 - 16:38.


#12 CaptainJackSparrow

CaptainJackSparrow
  • Member

  • 2,368 posts
  • Joined: July 09

Posted 01 January 2011 - 20:44

Is that you Tony?



#13 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 03 January 2011 - 22:28

Is that you Tony?


Only if Tony is very white, lives in the Home Counties, and doesn't own an airline.

I refer you back to the closing parenthesis on Post No.10 above.

Check your flies - your cynicism is showing! :)

Edited by Kif, 03 January 2011 - 22:39.


#14 midgrid

midgrid
  • RC Forum Host

  • 10,132 posts
  • Joined: April 09

Posted 04 January 2011 - 15:12

Excellent analysis, Kif! I too thought the conclusions reached using the data were misleading (McLaren, Ferrari and Williams all seemed to improve more than Renault through the course of the season looking at the graph), but your post has exposed many more problems with the method.

The main thrust of Straw's argument is that Renault improved the most from Hungary to Japan (although McLaren and Williams improved at the same rate on the graph), but this is clearly exaggerated by Red Bull's overwhelming superiority at the Hungaroring, and Kubica's excellent qualifying performance at Suzuka (I imagine that Straw used the fastest lap time set by both/the faster Renault driver from each GP weekend, as was the case with the drivers' speed ranking published in each GP preview issue). But as you say, this is guesswork because it's not clear what data was used: both drivers? Faster driver? FP times? Qualifying times? Race times? A combination of the above? Was there any tinkering with the results to remove anomalies (e.g. no time set in a particular qualifying session)? It's strange, because the magazine has printed its method before when publishing such data.

I think you should e-mail your first post to the magazine, as it deserves a response (or perhaps send a shorter message in the hope that it is printed on the letters page).

#15 Kif

Kif
  • Member

  • 66 posts
  • Joined: October 08

Posted 04 January 2011 - 17:39

Excellent analysis, Kif! I too thought the conclusions reached using the data were misleading (McLaren, Ferrari and Williams all seemed to improve more than Renault through the course of the season looking at the graph), but your post has exposed many more problems with the method.

The main thrust of Straw's argument is that Renault improved the most from Hungary to Japan.. But as you say, this is guesswork because it's not clear what data was used... It's strange, because the magazine has printed its method before when publishing such data.

I think you should e-mail your first post to the magazine, as it deserves a response (or perhaps send a shorter message in the hope that it is printed on the letters page).


Hi Midgrid.

Thanks for your kind comments.

Yes, Edd Straw does seem to base his conclusions from the analysis on just block 4 - though (as you say) this has to be guesswork. The uneven blocks don't help, either, because each is of the same size on the graph, but Block 5 represents fewer races (providing Japan has not been counted in both Blocks 4 and 5, which is a separate problem again). Had the data been handled better, Mr. Straw may have come to a different conclusion, assuming (of course) that the tail wasn't used to wag the dog. And yes, in the past Autosport has published its methodology with this kind of mag padding, but lack of space can hardly be the excuse to have removed the method on this occasion, given the inflated size of the graph.

My sending this analysis to A/sport would probably not result in publication, since (by some irony) I have already appeared in the letters page in recent months, and therefore I suspect I would be disbarred anyway. It is a particular irony, since the piece was about Group Lotus! Although I have been following F1 for so long now that I've long since stopped supporting any one team or driver, events involving Renault and/or GL just so happen to have piqued my interest at the moment. Had the data been about another team, I'd be just as miffed by the application of bad science, though the earlier "Real Lotus are Back" article didn't help.

Nevertheless, while it is unlikely that I would be successful, that does not mean that you (or others) couldn't try yourself, providing you don't claim the work as your own! A reference to the post or the salient points would be sufficient. It's not my intention to promote myself unduly - I'm not a message boarder who has to be at the centre of every thread that's going; I pick and choose depending on my level of outrage! :)

As it is, I think it would be as much a concern to the other teams given the impression the piece leaves behind. I received a PM (I'll name no names) which advised me that Renault F1 has since decided to launch its new car (or the livery at least) at Autosport Int'l, and that this could be seen as more evidence of an abnormal relationship between A/sport and Renault F1. For myself I'm antiquated enough to give conspiracy theories a wide berth, but if people are coming to these conclusions (and that was not the first I've seen), that does point to something - however innocent - being not right in the press room.

Edited by Kif, 04 January 2011 - 17:55.