
OPERATOR: The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.

PROFESSOR: I want to pick up exactly where I left off last time, when I was talking about various sins one can commit with statistics. And I had been talking about the sin of data enhancement, where the basic idea there is, you take a piece of data, and you read much more into it than it implies. In particular, a very common thing people do with data is they extrapolate. I'd given you a couple of examples. In the real world, it's often not reasonable to say that I have a point here, and a point here, therefore the next point will surely be here, and we can just extrapolate in a straight line. We saw some examples before where I had an algorithm to generate points, and we fit a curve to it, used the curve to predict future points, and discovered it was nowhere close.

Unfortunately, we often see people do this sort of thing. One of my favorite stories is William Ruckelshaus, who was head of the Environmental Protection Agency in the early 1970s. And he had a press conference, spoke about the increased use of cars, and the decreased amount of carpooling. He was trying to get people to carpool, since at the time carpooling was on the way down, and I now quote, "each car entering the central city, sorry, in 1960," he said, "each car entering the central city had 1.7 people in it. By 1970, this had dropped to less than 1.2. If present trends continue, by 1980, more than 1 out of every 10 cars entering the city will have no driver." Amazingly enough, the press reported this as a straight story, and talked about how carpooling would be dramatically dropping. Of course, as it happened, it didn't occur. But it's just an example of how much trouble you can get into by extrapolating.

The final sin I want to talk about is probably the most common, and it's called the Texas sharpshooter fallacy. Now before I get into that, are any of you here from Texas? All right, you're going to be offended. Let me think, OK, anybody here from Oklahoma? You'll like it. I'll dump on Oklahoma, it will be much better then. We'll talk about the Oklahoma sharpshooter fallacy. We won't talk about the BCS rankings, though.

So the idea here is a pretty simple one. This is a famous marksman who fires his gun randomly at the side of a barn, has a bunch of holes in it, then goes and takes a can of paint and draws bullseyes around all the places his bullets happened to hit. And people walk by the barn and say, God, he is good. So obviously, not a good thing, but amazingly easy to fall into this trap.

So here's another example. In August of 2001, a paper which people took seriously appeared in a moderately serious journal called The New Scientist. And it announced that researchers in Scotland had proven that anorexics are likely to have been born in June. I'm sure you all knew that. How did they prove this? Or demonstrate this? They studied 446 women, each of whom had been diagnosed anorexic. And they observed that about 30 percent more than average were born in June. Now, since the monthly average of births, if you divide 446 by 12, is about 37, that tells us that about 48 were born in June.
