Wednesday, November 11, 2009

My thoughts on Go (Google's new language)

In the last couple of days I've been looking at Google's new language. Here is kind of a brain dump of my thinking on it.

First of all anything new produced by people of this caliber needs to be taken seriously. It is not done and may fizzle, but it shouldn't be dismissed out of hand. In that spirit here is a high level overview.

That said, it seems to be a mix of good things, interesting things, and things that are oddly missing. Looking at code examples it is in the C family. Since they want to appeal to systems programmers they are focused on simplicity, compilation time, and run-time performance. It has good stories to tell on all three, but I'm someone who finds scripting languages acceptable for the latter two, and that means I have a pretty high tolerance for complexity.

At a higher level they have garbage collection, first class functions and closures. Given the audience that they are aiming for they have downplayed the latter two (I spent the better part of an hour trying to verify that before finding it buried in the spec. Those are cool. They have goto, but looking through the spec I see that they have labeled loop control. There is an old theorem that any flow of control that can be achieved with goto can be achieved with equal efficiency with labeled loop control. Therefore I suspect that goto won't get used much. (Though it does make sense for state machines.)

Those are some basics, but they have three key features that they think set their language apart from most offerings out there.

The first is goroutines. You can take any function foo and type in go foo(arguments);. This basically means, "You go away and do foo while I do other stuff. When you return you exit." This is a very simple yet powerful concurrency model. The programmer does not know or care whether the goroutine executes in the current thread or another thread. In fact if the language had proper support for it then you'd be fine with it executing in another process on another machine.

Of course telling goroutines to do work in the background isn't very useful unless you can communicate between them. Their solution to that is their second key idea, channels. A channel is something like a Unix pipe. It is just a line of communication you can pass messages along, and data synchronizes by blocking until it is read. Channels may be one way or two way, and a given channel may only send specific types of data. They provide abstraction because you don't need to know the details of what is on the other end of the channel. This makes simple services very easy to write. For instance they offer this example of a simple RPC server:
  func startServer(op binOp) (service chan *request, quit chan bool) {
service = make(chan *request);
quit = make(chan bool);
go server(op, service, quit);
return service, quit;

The first two ideas are not that surprising given the history of the people who came up with this. But the third is a little more surprising. The third idea they call interfaces. I actually dislike the name, I'd prefer to call it mixins instead. The idea is that an interface is defined to be any type that supports a given set of methods. You can then pass that object in anywhere where that interface is expected. You can also add methods to the interface, that are automatically available to objects that support that interface.

In short it is very similar to a Ruby mixin, except that you don't have to declare that you're importing the mixin, it autodetects that you fit the rules and does it for you.

OK, if that is what it has, what doesn't it have?

Well, libraries. Given that it was released yesterday, that is understandable. :-)

A bigger lack is exceptions. I understand this decision. The problem is that they envision writing servers with a micro-kernel design. But what happens when one of the services breaks down? Who can catch that error? The code that launched the service? The code that is communicating with it through channels? If there is no good answer, then what sense does it make to let a service just go down? If they do add exceptions they have hard design problems to solve which arise from the fact that you don't have a simple stack-based flow of control.

The much bigger omission is generics. The FAQ says that generics are missing because they introduce complexity in the type system and run time. Obviously so, but they are well worth it.

People who come from dynamic languages (eg Perl, JavaScript, Ruby, Python...) might well wonder what "generics" means. Here is a quick explanation. We have higher order functions. What if we wanted to implement grep? Well we could fairly easily write a function that takes a function mapping int to bool and returns a pair of channels where when you stick something in the one channel, it pops out the other if and only if the function returned true. We could write a similar one for Strings. And so on and so forth.

But you have to write one function per type you want to grep on! This is highly annoying. The functions all have identical form except different types. But you have to write the same function over and over again. As a result there is no way to avoid repetition of certain kinds of boring code. Which is why, when asked whether Go has libraries allowing functional constructs like map, reduce, scan, filter, etc the response was, The type system rather preludes them. They would either have to use reflection (slow) or be specialised for each type of slice. (See this thread for that exchange.)

If you wish to solve this you need to do one of three things. The first is to have a type system that is rich enough to handle meta-functions like this. (For instance Haskell.) The second is to have types be figured out and matched at run-time, which is what dynamic languages do. And the third is to implement generics.


OK, enough of an overview. What are my thoughts?

Well first I'll watch it. It is young and nobody knows what it will do.

Moving on I hope the Go community that springs up start thinking about designing around the idea of capability based systems. That is a strategy of handling security/access by handing out opaque tokens that you need to make specific kinds of requests. If you have the token then you have access. If you don't, then you don't have permission and have no way to make the request. See this introduction for an explanation of how powerful this idea is. And in Go you'd realize it by having the "opaque tokens" be channels to services that do whatever you need done.

On a somewhat negative note I have mixed feelings about interfaces. My comment on mixins is, "It is bad unless you do it a lot, and then it is good." That is because a mixin involves magic at a distance that adds stuff to your class. So using a mixin requires learning exactly how it works. That mental overhead pays off if you get to reuse it over and over again. But if you don't use it heavily, it is a source of misunderstanding and bugs. The interface idea has the same issues, with more magic, but without the huge bug of overwriting methods you implemented in a class. (That said, I'm sure that more than a few people will be wondering why they didn't get the method they wanted from interface A when it conflicts with interface B that you also match.)

On a negative note I predict that there will be a lot of bugs in real Go systems around issues like deadlocks between services. Concurrency always opens up the potential for hard to diagnose bugs. And knowing that, I hope they develop good news for detecting/debugging those issues.

And finally I think that the omission of generics is a mistake. If they want the language to grow up, they'll need to fix that. Exceptions, if done well, would be a good addition, but there are reasons to leave it out. But generics are just too big an issue.

Monday, November 9, 2009

A proposal on patent reform

In theory the purpose of patents is to increase the public domain by providing temporary incentives to spur innovation. In that spirit patents are supposed to be limited to things that would not be obvious to one versed in the art.

However patents have lead to many problems.

First is the question of which ideas are obvious, and which are not. This is not a simple question at all. The patent office has resolved this question by being fairly lenient about which patents are granted, thinking that the issue can always be resolved in litigation. However people are reluctant to litigate, so invalid patents are acquired in great number. The theory goes that any individual one is unlikely to stand up, but if I threaten you with a dozen patents at once, my odds of having at least one stand up are pretty good. This set of incentives leads to a lot of overly broad patents that are pretty obvious, which causes uncertainty for others.

Second you have the issue of how long patents are good for. When it comes to drugs, a multi-year search and FDA approval can only be recouped from a long patent. When it comes to software a 17 year patent warps the market for many generations of technologies. Thus there can be no simple balance between encouraging innovation and unnecessarily hobbling further advancement. (Many people in software think patents shouldn't apply there. Certainly patents do a lot of damage in the software world, but as Paul Graham points out, if you're against software patents then you're against patents in general.)

Third we have the challenge that areas that change rapidly have a lot of opportunity for being the first to face a particular problem. If you're first to face a problem then it is easy to get a patent. Given that the patent office is lenient on granting them, you're likely to get it. So we tend to see very intense patent activity in areas that change rapidly. However those are exactly the areas where patents do the most to inhibit further progress!

My proposal addresses these issues.

I propose that no patent shall be accepted unless the technologies that make the discovery feasible and commercially viable have been broadly available for at least a decade.

If the problem has been solvable and a solution desirable for at least a decade but nobody has done it, that is evidence that the solution is not obvious to one versed in the art. Conversely if the solution quickly occurs to someone fairly shortly after the problem has become both solvable and commercially interesting, that is evidence that it wasn't really that hard. Furthermore it is exactly advancements in fields that are rapidly changing where patent protection does the most economic harm.

Why did I choose a decade? I chose it as a round number that is fairly close to half the length that a patent lasts. My theory being that if nobody has discovered it in that length of time, then the odds are reasonably high that nobody would have discovered it in the time the patent restricted the use of the invention, and therefore we have evidence that the net economic result caused by granting the patent is positive.

Of course, like any attempt at compromise on a controversial issue, I am sure it will satisfy nobody. But I don't think it is that bad either.

Saturday, November 7, 2009

What various scholars have changed their minds about

The 2008 Edge questionnaire is one of my favorite collections of essays ever. What they did is asked many well-known thinkers the question, What have you changed your mind about? 165 of them answered. There is an index, but I find it better to just dive in.

Admittedly it is a lot of reading. What would I recommend? It depends on your interest. Personally I liked Freemon Dyson's explanation of how we know that dropping nuclear bombs did not cause the Japanese surrender, one of the founders of the inflation theory for how the early universe evolved saying why he thinks that the theory is wrong, a feminist explaining that the preponderance of men at the top echelons is because men vary more, a biologist discussing why experts often should be doubted, a psychologist discussing statistical illiteracy in medicine, and an information scientist talking about why the sample mean is a poor statistic to use. (The sample mean is where you add up all of the observations and divide by the count.)

Those are some of the ones that I liked, and can keep people occupied for a long time. But I'm sure everyone will have their personal favorites. And yours could easily not intersect mine.

Thursday, November 5, 2009

Did males start as parasites?

Many years ago I was curious about why sexual reproduction is so prevalent. This is a summary of what I found when I investigated.

There is a standard theory to explain sexual reproduction, but it only answers half the question. The standard theory is that a species with sexual reproduction has a pool of genetic diversity. Should the environment change suddenly (temperature changes, a new disease, more effective predators arrive, etc), any gene that helps will have an improved chance of survival and will quickly spread through the population. Without sexual reproduction any particular gene line would eventually go extinct unless it was lucky enough to have the beneficial mutation happen within it. Therefore sexual reproduction allows for much faster adaptation, and benefits the species that does it.

This is all well and good, and convinces me that sexual reproduction has advantages asexual reproduction (as well as being more fun), but it doesn't explain why we have distinct sexes. For instance look at the lowly earthworm. They are hermaphrodites. When they meet they impregnate each other, and both go off and have babies. Hermaphrodites therefore reproduce twice as quickly as a male/female version would. Why, then, aren't we all hermaphrodites?

It can't be a simple quirk of genetics. The genetics of sex reproduction is much more plastic than most people realize. Even if we stick to vertebrates you can find strange things such as hermaphrodites (mangrove killifish) and animals that change gender (African reed frogs, several kinds of fish). So gender is more complicated than the familiar XX vs XY rule for mammals that we learn in biology. If hermaphroditic reproduction was evolutionarily favored, there are enough ways to get there that you'd expect it to be more common than it is.

As it happens I have never encountered a biology text that attempts to tackle this part of the question. However many years ago I came up with a theory that seems reasonable to me. I've discussed this theory with a number of biologists who agree that it seems probable to them as well. So this is my theory, but it is at least plausible. And for all I know it is the standard theory in evolutionary biology, and I just never encountered any source that presented it. (I'm very much not a biologist.)

My theory is that males started as parasites. More precisely, cheaters in a sea of hermaphrodites. And once men were effective enough, the ability of hermaphrodites to impregnate each other atrophied and you got strict sexes.

Let me break this down in more detail.

Imagine a population of hermaphrodites. Now add individuals who can impregnate others, but can't themselves be impregnated. These are males. By refusing the be impregnated the males avoid the hardest part of reproduction. This gives them more energy to devote to impregnating others. As long as the cheater impregnates hermaphrodites twice as fast as hermaphrodites do, this strategy works out.

Evolutionary theory has studied cheating strategies like this. Theory says that in any given environment there will be a natural fraction of the population that is male. With more hermaphrodites/male than the ideal, the male strategy becomes more effective, so the male population will rise. With too few hermaphrodites/male the male strategy becomes less effective so the male population falls. The ideal mix is called the evolutionarily stable strategy. (For another example of an ESS, some salmon fight for the right to spawn, and some try to sneak in while spawning is happening. We can calculate the ideal ratio between how many males should try to fight vs cheat, and it is very close to the actual measured ratios in real salmon.)

If your imagination fails you, don't worry. This isn't a hypothetical possibility, we can find this exact kind of mix of hermaphrodites and males (with no females) in c. elegans. That is a nematode that a lot of biologists study. (Biologists often like to work with organisms that are heavily studied for the simple reason that when they need to know something tangential to their research about how that organism works, there is a good chance that someone else has already studied it.)

In this situation males are cheaters. Biologically in some ways they parallel a virus - they are unable to reproduce on their own but can reproduce by hijacking the reproductive system of their hosts. However biologists have found that symbiosis arises easily out of parasitic relationships. And that is what I believe happened with the genders.

Take our previous population. Imagine that the males are numerous and effective at their job. So effective that a hermaphrodite who attempts to find another hermaphrodite to impregnate will expend so much energy that she would have more children if she just let herself be impregnated more frequently by the very convenient males. In this situation a true female strategy makes sense. And once hermaphrodites take that step, it is only a question of time before the ability for a reproductive individual to play both the male and female parts atrophies, and strict genders emerge.

For more interesting thoughts on sexual differences in humans, here is a psychologist discussing the consequences of men being expendable. I don't agree with all of his conclusions, but he is starting with solid facts that most people aren't aware of. For instance here is confirmation that men are significantly over-represented both at the top and bottom of a lot of distributions.

And in a very different vein, I have long liked David Brin's Neoteny and Two-Way Sexual Selection in Human Evolution. A summary can't do it justice. It comes up with a plausible theory explaining topics as diverse as why women have breasts to why so many men are pedophiles. Even if, no especially if you think that it has to be BS, it is worth the read.

Is the financial crisis really over?

I recently asked a friend who works on Wall St about whether these warnings about the municipal bond market were accurate. He gave me a detailed response, and gave me permission to repeat the comments but without his name or company attached. This is very reasonable given his current position, so I won't say anything more detailed than that he is in a good position to know what is happening in the credit markets, and I trust his opinion far more than any of the talking heads you see on TV.

Here is what he said:

Well, more than remotely accurate....

We have 3+1/2 more disasters to go in this crisis:
- Resi: this is the 1/2 disaster, as we're a little more than half way through the resi correction; unfortunately, we still have a long ways to go!

- Commercial real estate lending / CMBS: 4th Q retail will likely be a disaster, and commercial real estate has had as much or more price explosion as resi; already we are seeing soaring bankruptcies, etc. Many shopping center tenants have 'go dark' provisions (no rent due if >x% of the mall is dark), which are coming close to exercise. Banks and insurance companies are hugely exposed here; could make subprime look like a walk in the park.

- Credit card debt: even though the pace of job destruction has slowed, we are still in job destruction mode; consumers are increasingly falling behind on all debt payments. People are (wisely) looking to save rather than pay down debt (treating the debt as a lost cause)....

- Muni. As an example, tax revenue in your golden state is down 40%, and this story is repeated all over the country at all levels of government. So far, the Fed government has kicked a huge amount of money down the muni chain, which has kept the problems largely at bay; however, this process is nearing an end. The fact is that the entire country is massively leveraged: consumers, local government, state government, and the federal government. In the boom years, governments piled on debt and hugely increased their services, and most also hugely grew employment as well as entitlements (health care and pensions for muni employees). Now, they are facing revenue shortfalls but have great difficulties cutting services (often mandated by law), cutting staff (administration is opposed), cutting capital expenses (again, mandated by law). Default is in the cards for a lot of them, as federal money runs out.

Given the way the administration handled the automakers, I expect the muni bond holders to get hurt while the pension plans are made whole. I don't know what government does to avoid massive service cuts, though! It has seemed to me that one of the motivations for doing federal health care now is actually to ease up on state medicare spending (that's why your Governator has been making pro-health care reform statements).

So far, managing the crisis has relied on using the remaining credit of the borrower of last resort (the US). And while lots of reports have shown that govie debt is low relative to GDP on a historic max basis, these reports have completely overlooked the net debt position of both government and consumers. It is of course difficult to tease it all out (lots of munis fund housing projects), but personally I feel that we really can't have much debt ceiling left. Particularly if you consider that current/recent levels of GDP require a level of corp credit, but we now have vastly less corp credit available: without more corp credit availability, GDP must continue to slide, which implies debt-to-GDP continues to rise even without more borrowing.

An economist friend of mine (who lived through Argentina) and I both believe that the only real outcome of the whole crisis is that the Fed will manage a graceful and slow dollar dilution, so that we see a huge inflation but over a long enough stretch of time so as to not be unduly destabilizing. This process would basically inflate away our debt as it revives the domestic economy. Unfortunately, so far as I know, no country has ever achieved economic success via a weakening currency!

Anyway, yes, this is all really bad, and still has a ways to go!

Reading this I am strongly reminded of what I saw predicted in Wealth and Democracy several years ago. The book is a long read, but buried in it were a list of parallels between the USA circa 2000 and the last 3 great world empires shortly before their collapse. In all 3 cases shortly after a boom caused by financial speculation there was a financial crisis, which they recovered from fairly fast at the same time that they launched a military adventure that was expected to be a quick, successful war. The war dragged on and cost far more than expected. Then public mood turned against the war around the time that a more serious financial collapse hit. After a series of subsequent financial collapses there was political unrest leading to civil war 15 years after the peak in 2 out of 3 cases. (The exception was England, which got involved in WW I before the civil unrest could become worse.)

When I first read this I thought, "Interesting, and Kevin Phillips does have a track record of making apparently absurd predictions that came true, but I'm not overly concerned." I still believe that the prospect of expressing ourselves through democracy can head off the possibility of civil war, but it has been scary watching the timeline unfold like clockwork.

Monday, November 2, 2009

Why I left math

Reading how shocked Doron Zeilberger is at the state of modern mathematics reminded me of why I left the subject.

Math departments regularly have visiting mathematicians come and give talks. Or at least the one I was at did. For the visiting professors these talks were a confirmation of success, all of these people came to hear about their research. So they would talk about their research and get quite excited about what they were describing.

As a grad student I attended. I quickly noticed that most of the professors in the math department went out of politeness. However they knew they wouldn't understand the talk, so they brought other things to do. If I looked around about 15 minutes into the talk, I'd see people reading books, grading homework, and otherwise not paying attention. At the end of the talk the speaker would ask whether there were questions. Inevitably the mathematician who invited the speaker would have some. Occasionally a second mathematician would have some. But the rest of the room wouldn't.

This was supposed to be the high point of the life of a mathematician? That's when I decided that, no matter how much I loved mathematics, I wanted a different career. Unfortunately my wife was in grad school as well, and we were in such a small town that I didn't have any immediate employment options. Therefore I remained a rather unmotivated grad student. In the end my wife switched to medical school just before I would have finished the PhD. I'm mildly disappointed that I didn't finish, but it really has been no loss.

Why do mathematicians put up with this? I'll need to describe a mathematical culture a little first. These days mathematicians are divided into little cliques of perhaps a dozen people who work on the same stuff. All of the papers you write get peer reviewed by your clique. You then make a point of reading what your clique produces and writing papers that cite theirs. Nobody outside the clique is likely to pay much attention to, or be able to easily understand, work done within the clique. Over time people do move between cliques, but this social structure is ubiquitous. Anyone who can't accept it doesn't remain in mathematics.

It is important for budding academics to understand this and get into a good clique. This is because your future career and possible tenure is based on your research. But the mathematicians making those decisions are unable to read your papers to judge your work. Therefore they base their decisions on the quality of journals you get your papers into, and the quality of people you get writing recommendations for your work. But both of those come down to getting into a group that includes some influential mathematicians who can get your papers accepted in good journals, and that can write strong letters of recommendation.

In fact if, like me, you are someone who likes to dabble in lots of things, you will be warned (as I was by multiple professors) about the dangers of not focusing on one small group. You will be told plenty of cautionary tales of mathematicians who published a number of good papers, but who didn't publish enough in any specific area to get good mathematicians to stand behind them. And therefore the unlucky generalist was unable to get tenure despite doing good work.

For a complete contrast, look at the situation in biology. A motivated advanced biology undergrad is both capable of, and expected to read current research papers. When biologists go to a talk they both expect to understand the talk. And biologists have no trouble making tenure decisions about colleagues based on reading their papers.

I subscribe to the belief that the difference is mainly cultural. Biology is fully as diverse and complex as mathematics. Furthermore what I have read about the history of mathematics suggests that the structure of the mathematical community was substantially different before WW II. For example David Hilbert was known for stopping speakers and forcing them to define anything he found unclear. (Amusingly he once had to ask Von Neumann what a "Hilbert Space" was.) But after WW II an explosion of university math departments and a focus on solving concrete problems lead to a fragmentation of mathematics. And once mathematicians came to accept that they couldn't be expected to understand each other, there was nothing to prevent mathematics from splintering into fairly small cliques. Which has happened, and this is unlikely to ever be reversed.

PS I'm amused at the fact that a number of comments at Y-combinator thought that the situation with programming was worse than mathematics. Yes, there are divisions within programming. But they are nothing compared to the fragmentation in mathematics. I've done both and there is simply no comparison.