Archive for the ‘Robotics’ Category

Software Development

Computer programming, “software development,” has been a large enough field to qualify as an “industry” for some time now. Most programmers spend their time figuring out and writing out, in excruciating detail, procedures for machines to follow. To do this effectively, they have to know, in fairly excruciating detail, what procedures other programmers have coded so that their procedures can interact correctly. Can this field continue into the future along the same lines? I don’t believe so.

The reason is actually fairly simple: piling detailed procedures on top of each other is not a scalable activity. This is so even if we use the latest and greatest techniques for organizing these procedures using objects, aspects, patterns, and the like. As the industry has found out, large piles of code become unmanageable, or at least very difficult to manage, whatever organizational techniques are used to create them. In fact, the most successful software projects have managed to contain their most important procedures within a small “kernel.”

Cube Farm awaiting a "Bank of Programmers"

Unfortunately, the current bank-of-programmers approach to software development does not encourage this kind of small-footprint development. There is a deep reason for this. We have been missing an important class of computational kernel: machine learning. As long as the systems we create have to be programmed for every procedural variation, we’re stuck with hordes of programmers and brittle software.

The usual story goes something like this: a beautiful small system is developed; users interact with it and generate wish lists of what changes they’d like, sometimes very loudly; programmers are put to work piling code on top of the original system; the system gains more fans but gradually becomes more and more difficult to maintain; at some point, someone decides that the system needs to be “re-architected” and the cycle begins all over.

Some have argued that the solution is to embrace this cycle, speed it up, and monitor it to make sure the system evolves in “the right direction.” That’s an improvement over the traditional rudderless approach, but it misses the main point: if systems could learn, then most of that piled-up code would be unnecessary. In the next few decades, either we take machine learning more seriously, or the software crisis will get worse and worse. And no, “the cloud” isn’t going to help with this in and of itself.




The Reverend and the Russian

The value of a good guess can hardly be overestimated. Many have tried to make a science of guessing. The fields of probability and statistics are the results of these attempts. Two men’s ideas in these areas have gained currency in recent years. The two are the Reverend Thomas Bayes and Andrey Andreyevich Markov. As with all ideas that gain popularity, there is a danger that these will become dogma, or at least second-nature, and replace critical thinking. Their approaches are similar in that they both take an axiomatic view of likelihood and its measurement. It’s important to know their assumptions to avoid mis-applications.

The British Reverend’s axiom is also known as “Bayes’ Theorem” or “Bayes’ Rule”. It can be considered a theorem because it can be derived from other “laws” or rules. It says that new observations should affect our probability estimates for guesses in a simple multiplicative way: the probability that a guess is true after a new observation is the simple product of the likelihood of the observation, with the guess taken for granted, and the ratio of the prior probabilities of the guess and the observation. This provides a very simple way to “update our state of knowledge” based on new observations. As you can guess, it is extremely difficult, maybe even impossible, to verify this general rule empirically in any reasonable way. Is the universe really such that uncertainties propagate in this simple way? There is evidence that it may be, but that’s a far cry from taking the rule for granted.
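The multiplicative update described above can be sketched in a few lines of Python. This is only an illustration: the function names and all of the numbers are hypothetical, chosen to show how a rare guess stays fairly unlikely even after a supporting observation.

```python
def bayes_update(prior: float, likelihood: float, evidence: float) -> float:
    """Return P(guess | observation) = P(observation | guess) * P(guess) / P(observation)."""
    if not 0.0 < evidence <= 1.0:
        raise ValueError("evidence must be a probability greater than zero")
    return likelihood * prior / evidence


def total_evidence(prior: float, likelihood: float, likelihood_if_false: float) -> float:
    """P(observation), summed over the guess being true and being false."""
    return likelihood * prior + likelihood_if_false * (1.0 - prior)


# Hypothetical numbers, for illustration only.
prior = 0.01                # P(guess) before the observation
likelihood = 0.90           # P(observation | guess is true)
likelihood_if_false = 0.05  # P(observation | guess is false)

evidence = total_evidence(prior, likelihood, likelihood_if_false)
posterior = bayes_update(prior, likelihood, evidence)
print(f"P(observation) = {evidence:.4f}, P(guess | observation) = {posterior:.4f}")
```

With these made-up inputs the posterior comes out near 0.15: the observation raises the estimate considerably, but the low prior still dominates. Whether real uncertainties combine this way is exactly the assumption the rule asks us to accept.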

A Markov Model of Congestive Heart Failure Treatment (National Library of Medicine)

The Russian mathematician’s axiom is easier to state. It simply says that the probability of one set of observations immediately following another set depends only on that preceding set, not on the history before it. At first blush this seems counterintuitive. After all, how can we ignore the past and hope to make a good guess? Markov, of course, never claimed that we could. His axiom just generalizes properties of simple chance events: the probability of winning the lottery does not depend on what the winning numbers have been in the past (in the absence of cheating). His generalization simply says that the right set of observations gives a snapshot that summarizes the history that led to it. In the case of the honest lottery numbers, observations of previous drawings are irrelevant. Markov’s theory then builds on this axiom to consider chains and networks of these history-independent observations (or “states”); systems like these are said to have the “Markov Property.” One can then find interesting conclusions about these Markov chains and networks in the aggregate. It is important to keep in mind that the Markov property is a hypothesis and should not be adopted indiscriminately just because it simplifies the math, or, worse yet, because it sounds sophisticated.
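A small Python sketch may make the Markov property concrete. The two weather states and their transition probabilities are hypothetical; the point is that `next_state` looks only at the current state, and that aggregate conclusions (here, the long-run share of sunny days) follow from that assumption alone.

```python
import random

# Hypothetical transition table: P(next state | current state).
TRANSITIONS = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}


def next_state(current, rng):
    """Draw the next state using only the current state (the Markov property)."""
    options = TRANSITIONS[current]
    return rng.choices(list(options), weights=list(options.values()))[0]


def simulate(start, steps, rng):
    """Generate one path through the chain; earlier states are never consulted."""
    path = [start]
    for _ in range(steps):
        path.append(next_state(path[-1], rng))
    return path


def step_distribution(dist):
    """Push a probability distribution over states one step forward."""
    out = {state: 0.0 for state in TRANSITIONS}
    for current, p in dist.items():
        for nxt, q in TRANSITIONS[current].items():
            out[nxt] += p * q
    return out


rng = random.Random(0)
path = simulate("sunny", 10, rng)
print(" -> ".join(path))

# Aggregate behavior: iterate the distribution until it settles.
dist = {"sunny": 1.0, "rainy": 0.0}
for _ in range(50):
    dist = step_distribution(dist)
print(dist)
```

For this made-up table the distribution settles at about two thirds sunny, whatever the starting state. If real weather depended on more than yesterday, the model would be wrong, however tidy the arithmetic.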

Again, what does this all have to do with you and me? Calculations based on Bayes’ and Markov’s theories are fairly common now. Some of these calculations are in very critical policy and engineering areas. Just intoning their names gives a claim or a theory a certain level of respectability. We should guard against accepting such derivative claims and theories uncritically and consider whether the underlying assumptions are really applicable, at least in our own work.




Mysteries of the Brain

We know very little about how our brains work. Only recently have we begun to learn what goes on inside the cells and connections that make up the brain. These are good first steps, but the real challenge is to understand the network effects that actually lead to behaviors and how these network arrangements are formed. One group made a splash recently by claiming to have simulated the brain of a cat on a computer. This heroic computational effort needed one of the largest and fastest multi-processing computers on the planet. Even so, the simulation was limited and abstracted away from the real cat brain.

The RoBoard RB-100, a nano-itx board

Given this state of relative ignorance, can one take a practical engineering approach to construct an “artificial brain”? My contention is that this is possible. We need to combine insights from what little we know about the brain and complex computation in general. To these insights we should add something like the Nike tag line: “Just do it.” Fortunately, off-the-shelf processors are now powerful enough and networks are fast enough that we are less constrained by computational power than ever before. The key, as has been pointed out here before, is to avoid direct simulation of biological brains and think creatively about computational kernels and network arrangements that can learn.

Engineering efforts of this kind will not only have direct benefits, but they will also enhance our understanding of biological systems. We should keep in mind though that these are only first steps. Ray Kurzweil’s singularity will not suddenly materialize tomorrow.




Connections and Complexity

Network effects were neglected for quite a long time. The classical “reductionist” view of the world was simply this: if we understand the rules that govern the pieces of the universe, then we can easily deduce large-scale behaviors. The past century has shown that this is wishful thinking at best.

Much has now been said about the level of complexity that interconnections among simple components introduce. This appreciation for network effects has not yet been fully exploited for engineering purposes. The old reductionist views still dominate in most engineering disciplines. The same is true of other practical arts; we do not yet have good ways of purposefully exploiting network-introduced complexity.

The reasons for this neglect are deep. To exploit the kind of complexity that network connections bring about, we need to look at design differently. We need to view design as guidance instead of control. Complex systems do not respond well to direct control; they can, however, be guided and steered. We know very little about techniques for doing this kind of guidance and steering effectively.

Japanese Zen Garden (National Geographic)

There is one group, though, that has been struggling with these methods for some time: teachers and trainers. Living systems, especially people and groups of people, present exactly the kind of complexity that requires guidance and steering. Unfortunately, the track record for teaching and training is not stellar.

Perhaps the most effective of trainers have been spiritual teachers, for example Zen masters. Zen spiritual leaders have instinctively understood that guiding a complex mind requires meticulous attention to the environment. These environments are arranged to evoke the proper frames of mind that encourage the right mental changes.

We should start thinking of the machines we build as potential trainees and students. Our task is to design into them the kind of complexity that can be guided as we need. If a machine is simple enough to be controlled directly (“programmed”), then it’s probably too simple to do anything really interesting.




Analysis and Synthesis

Software-controlled machines have been around for some time now. But we’re nowhere near a world filled with robotic helpers. There are deep reasons for this. One important reason is this: we’ve been using the wrong approach to program the machines.

Techniques for programming robots have an odd quality about them: the software contains numerical methods combined with symbolic heuristic methods. The numerical methods are essentially those used in computational physics. The symbolic methods are essentially those used to analyze symbolic phenomena like language and “mental processes”. Both sets of methods are analytical in nature. They are invaluable in understanding the world, and that understanding is very useful in creating robotic software. But I believe it is a mistake to use the methods directly.

A Neural Network

Let’s see what a different approach may look like. There is one set of relevant methods that is synthetic in nature: these are “computational kernel” methods. The set includes cellular automata and fractals. Ironically, these synthetic techniques are often proposed for analyzing physical phenomena. They may be useful for analysis, or even paradigm-changing as some claim. But they are much more promising as a basis for robotic software.
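As a rough illustration of what a small kernel can produce, here is a one-dimensional cellular automaton in Python. The choice of rule 110, the grid width, and the wrap-around edges are all assumptions made for this sketch; it is not a robotic controller, only an example of a few lines of local rule yielding intricate aggregate behavior.

```python
def step(cells, rule=110):
    """Apply an elementary cellular automaton rule once, with wrap-around edges."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighborhood as a 3-bit number
        out.append((rule >> index) & 1)              # look up that bit of the rule
    return out


def run(width=31, steps=15, rule=110):
    """Start from a single live cell and record every generation."""
    cells = [0] * width
    cells[width // 2] = 1
    history = [cells]
    for _ in range(steps):
        cells = step(cells, rule)
        history.append(cells)
    return history


for row in run():
    print("".join("#" if cell else "." for cell in row))
```

The entire “program” is the eight-bit rule number; everything else is bookkeeping. Changing `rule` to another value between 0 and 255 gives a different behavior without any new procedural code.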

To control a machine, we do not need to program into it sophisticated techniques for understanding how machines work. We just need to provide the right computational kernels that, in aggregate, lead to sophisticated behavior. This kind of approach has been seriously attempted only by researchers and practitioners working on “neural networks” (and their close cousins, “probabilistic graphs,” “Bayes nets,” etc.). Unfortunately, neural networks have suffered from the analysis-synthesis confusion as well; they have been constrained by the analytical need to model biological brains.

It’s time to get beyond this confusion and design synthetic techniques that learn the behaviors we want. We can then use analytical techniques to understand why the systems we create behave the way they do. But that analysis will be separate from the computational kernels we use to program them.



