erikars: (Default)
Finished The Drunkard's Walk: How Randomness Rules Our Lives by Leonard Mlodinow. I'm giving this one a 3/5. Short summary: a must-read if you are not familiar with the basic ideas of probability and statistics, and still a good read if you are familiar with the math but enjoy "history of mathematics" books (and I do!).

If I had to summarize this book in one sentence, I would quote page 11, "We habitually underestimate the effects of randomness." We assume, for example, that the hugely successful must have some secret or superior knowledge or talent. However, Mlodinow shows how, given two people with the same skill, one may have a string of successes that makes them look like a superstar while the other just does okay. This is most clearly demonstrated in sports, where it is easier to assess someone's skill level (e.g., batting average), but the concept generalizes.
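To make the "same skill, different results" point concrete, here is a minimal simulation sketch. The numbers are my own assumptions for illustration, not figures from the book: ten hypothetical batters who each have a true .300 chance of a hit, over 500 at-bats.

```python
import random

def simulate_seasons(skill=0.300, at_bats=500, players=10, seed=0):
    """Return season batting averages for players who share one true skill.

    skill, at_bats, players and seed are illustrative assumptions.
    """
    rng = random.Random(seed)  # fixed seed so a run can be repeated
    averages = []
    for _ in range(players):
        # Each at-bat is an independent hit-or-miss trial with the same odds.
        hits = sum(rng.random() < skill for _ in range(at_bats))
        averages.append(hits / at_bats)
    return averages

averages = simulate_seasons()
print("best season:  %.3f" % max(averages))
print("worst season: %.3f" % min(averages))
```

Any gap between the best and worst season here comes from chance alone, since every simulated player has identical skill.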

Each chapter focuses on a different mathematical concept. We get some history of the concept, amusing stories about the people involved, a high-level explanation, and examples. The concepts the book introduces are:

  • "A and B" is never more likely than "A" or "B" alone, and is usually less likely.

  • Sample spaces. If all outcomes are equally likely, you can figure out the probability of "winning" by comparing the number of outcomes considered wins with the total number of outcomes.

  • If the outcomes are not all equally likely, you can still apply the idea of a sample space, but you have to weight the different outcomes.

  • A large number of samples is required before what you observe can be expected to match the predicted probability.

  • What you know changes what you know about the probability of an event (the gist of Bayesian reasoning without the math).

  • Measurements have errors. Differences within the bounds of these errors are meaningless.

  • Random variations over large populations tend to have discernible patterns (e.g., life expectancy), and there will always be some members at the extremes.

  • People are really bad at telling whether or not data is random. They will perceive random data as non-random and non-random data as random.
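A few of the ideas in this list (sample spaces, the "A and B" rule, and updating on what you know) can be checked by brute-force counting. This is a small sketch using two fair dice; the dice and the two events are my own choice of example, not ones taken from the book.

```python
from fractions import Fraction
from itertools import product

# Sample space: every ordered outcome of rolling two fair dice.
# All 36 outcomes are equally likely, so counting is enough.
space = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability = number of 'winning' outcomes / total outcomes."""
    wins = sum(1 for outcome in space if event(outcome))
    return Fraction(wins, len(space))

def first_is_six(o):
    return o[0] == 6

def total_at_least_ten(o):
    return o[0] + o[1] >= 10

def both(o):
    return first_is_six(o) and total_at_least_ten(o)

p_a = prob(first_is_six)
p_b = prob(total_at_least_ten)
p_ab = prob(both)

# "A and B" can never beat either event on its own.
print(p_ab <= min(p_a, p_b))

# What you know changes the odds: P(A given B) = P(A and B) / P(B).
p_a_given_b = p_ab / p_b
print(p_a, "->", p_a_given_b)
```

Learning that the total is at least ten moves the chance that the first die is a six from 1/6 to 1/2, which is the Bayesian idea in miniature.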

The level of mathematical detail decreases as the book progresses, but the chapters build upon each other. Although explained in the least mathematical detail, the last two concepts are the most important. I think that understanding these concepts is required for a basic level of mathematical literacy. I think pseudoscience would do less well if we made sure that our education system achieved this level of mathematical literacy.

Actually, on that note, I think that given the importance of probabilistic and statistical literacy, we should be teaching that in high school, maybe instead of calculus. (And, of course, I am influenced by Prof. Benjamin from Mudd. Watch the talk. It's only 3 minutes!)
erikars: (Default)
The AIG fiasco makes it clear how most people don't grok the difference between millions and billions and trillions. Yes, it is annoying, terribly annoying, that AIG is giving out bonuses with bailout money. Even more annoying is that they are giving the money to people who helped cause the company's problems.

But let's get a little perspective here. I keep reading news articles saying that the company is giving out millions in bonuses out of the billions they received as if this were the primary use of the bailout money. $165 million on $30 billion is like $5.50 on $1000. This is like buying a latte and a lotto ticket when you have to borrow to pay the rent. This behavior is symptomatic of larger problems, but it is certainly not what we should waste our time focusing on.

ETA: Just to make sure it is absolutely clear: I do not think that AIG should be excused for these actions. However, I do think that the media is treating this irresponsibly by making it front page, top headline news instead of page A3 news. There are more important things to report on. Think about it this way: "AIG uses 0.55% of its bailout money for bonuses" would not be a front page headline; still worth investigating, but not front page.
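For anyone who wants to check the proportions, the arithmetic is short. The two dollar amounts are the ones quoted in this post; the variable names are just mine.

```python
bonuses = 165_000_000      # $165 million in bonuses
bailout = 30_000_000_000   # $30 billion in bailout money

share = bonuses / bailout

print(f"{share:.2%} of the bailout")         # 0.55% of the bailout
print(f"${share * 1000:.2f} out of $1000")   # $5.50 out of $1000
```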
erikars: (Default)
In today's edition of bad statistics, I recently came across this:
"The study consisted of occupied homes that were previously on the market an average of 57 days as un-staged properties that had not sold. Those same homes were taken off the market and staged and re-listed. Those properties on average were sold 6 days on market after they were professionally staged, which is 89% less time on market."
Umm, I don't think you can treat a home that has been relisted as independent of the original listing.
erikars: (Default)
Over on the Google Research blog there is a cute little post about how most binary search implementations are broken. The gist of the problem is that formal proofs generally use what Bob Harper called at PL camp "God's integers" rather than n-bit integers, and some things are true of one that are not true of the other. This is not a new problem, but it is interesting seeing someone run up against it in practice. I do not know about the algorithm types, but I know that the problem of having a finite number of integers is something that comes up in PL, and it can make analysis more difficult.
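Here is a small sketch of the kind of failure involved, assuming the bug in question is the familiar midpoint overflow in `(low + high) / 2`. Python integers are unbounded (closer to "God's integers"), so the hypothetical helper `to_int32` below imitates 32-bit signed wraparound by hand.

```python
def to_int32(n):
    """Wrap an unbounded Python int to a signed 32-bit two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

def naive_mid(low, high):
    # The sum is computed in 32 bits, so it can wrap around before the division.
    return to_int32(low + high) // 2

def safe_mid(low, high):
    # high - low cannot overflow when 0 <= low <= high, so this stays in range.
    return low + (high - low) // 2

# Both indices fit comfortably in a signed 32-bit integer (max 2_147_483_647),
# but their sum does not.
low, high = 1_500_000_000, 2_000_000_000

print(naive_mid(low, high))  # negative: not a usable array index
print(safe_mid(low, high))   # lies between low and high
```

With unbounded integers the two midpoint formulas agree, so a proof over those integers cannot see the difference.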

For example (and here I go into PL-speak, for those of you who want to run away), it pretty much breaks analysis with the sign lattice (top, -, 0, +, bottom) because you can no longer conclude that '+ U +' is '+'. We can do better with the constant lattice (top, ...,-1,0,1,..., bottom), but it introduces all sorts of fairly-easy-but-niggly details (e.g., is the semantics of your language such that you should respect the overflowed value or that you can treat it as an arbitrary value).
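A rough sketch of the sign example, reading the claim as being about combining two positive values by addition (that reading of the notation is my assumption). The functions `abstract_add`, `sign`, and `to_int32` are hypothetical names made up for this illustration, and bottom is left out to keep it short.

```python
def to_int32(n):
    """Wrap an unbounded Python int to a signed 32-bit two's-complement value."""
    n &= 0xFFFFFFFF
    return n - 0x100000000 if n >= 0x80000000 else n

def sign(n):
    """Abstract a concrete integer to its sign."""
    return "+" if n > 0 else "-" if n < 0 else "0"

def abstract_add(a, b):
    """Sign of a + b as it would be over unbounded integers.

    Returns "top" when the sign cannot be determined.
    """
    if a == "0":
        return b
    if b == "0":
        return a
    return a if a == b else "top"

x, y = 2_000_000_000, 2_000_000_000

predicted = abstract_add(sign(x), sign(y))  # what the analysis concludes
actual = sign(to_int32(x + y))              # what 32-bit arithmetic produces

print(predicted, actual)
```

The analysis predicts a positive result while the wrapped 32-bit sum is negative, so the rule "positive plus positive is positive" is only sound if overflow is ruled out or modeled.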

In practice, we PL people tend to ignore such issues unless they are important (I sure hope the ASTRÉE people did not ignore them). Life is much simpler that way. However, you should always be aware of what assumptions your sound analyses and correct proofs are making.
erikars: (Default)
I shall start using the phrase "only if and if"


Erika RS

Page generated Jul. 27th, 2017 12:45 pm