# An Utilitarian View on Intelligence

The recent victories of AlphaGo has been but an interesting demonstration that machines are finally able to outplay masters at a highly complex game. Some may consider this to be the beginning of a real takeover of machines ala The Terminator, which to me is simply a fear of that which we do not understand.

As one of the many traits we can assign to people, intelligence is just one of them. However it seems to be a very valuable and desirable trait to many, and people often seek to measure this “intelligence” to order people by usefulness. Intelligence is measured in many different ways, from highly formal methods such as IQ tests to empirical rules judging by their personalities or the company they keep.

However let us first understand that intelligence is indeed valuable and while measuring it accurately may not be easy, measuring its value may be easier. How do we do this? We can try a very simple form of intelligence that we have developed ourselves called “artificial intelligence”.

Artificial intelligence comes in many different forms such as a computer player in games, sudoku solvers, self driving cars, the ability to read and parse written text and so on. However even with this many forms of artificial intelligence, their value is always a measure of the quality of their decisions in the domain. For example, an AI that is capable of beating most players at a game is considered better than an AI that does not. A self driving car that has less accidents on the road has a higher value than those that have more accidents. In essence the final value of a machine driven by some form of AI is always measured by its track record of decisions and results.

Thus the real value of an AI is the quality of the decisions it forms.

So what has AI really got to do with human intelligence? Well, I believe that AI and human intelligence actually are similar and almost equivalent in their playing field.

In its most degenerate form, being able to count is a quality humans learn and find useful, and which humans will practice in many fields. However, under fatigue, a human may count wrongly and this may result in bad decisions. Machines have mostly replaced humans in this regard due to their ability to more consistently count without errors (notwithstanding their ability to do it faster). This explains the great drop in secretaries and clerical jobs.

As time goes by the number of machines that can more consistently make better decisions about the actions they have to take will increase, and humans are forced to compete with this. After the industrial revolution, the value of labor has actually steadily dwindled as energy becomes more available and machines allow the conversion of energy sources into more kinds of value. Our value is more and more tied to the quality and weight of our decisions.

High level managers in companies and leaders are often held at a higher value than those who perform labor intensive tasks such as transporting due to the low value of labor and the high value of decisions. Certain labor intensive jobs do not have high value but are still performed by humans because humans are still capable of making decisions on the job that are better than machines, such as janitors and construction work, although cleaning is slowly being taken over by machines as exemplified by robots such as the Roomba.

In many ways we are already competing with machines to show that human decisions are still better than machine decisions in many fields. This means that if we were to assign a dollar value to machines that can do a particular work as well as a human being, that dollar value will be assigned to a human being in that field of work as well, essentially proving the equivalence of an AI and human intelligence in their ability to make decisions.

We are now in constant competition with machines to demonstrate our value through our ability to make better decisions on the actions we make, so the real value of human intelligence and thus the human who has that intelligence is also the quality of the decisions it makes him make, equivalent to the value of artificial intelligence itself.

IQ tests and apitude examinations have seemed to miss the point of intelligence and tested for a more abstract and less substantial quality of humans.

So in many ways, our old view of hard work as a good quality of a person falls in the face of scrutiny when we consider the modern realities we live in. Leaving decisions to another has always been tolerated, and blame and responsibility assigned to another, when the true value of our place in society actually comes from the decisions we make. The application of hard work is also a decision we often have to make but allow others to make for us. Perhaps we should begin to teach children less about working hard and more about learning to make good decisions.

# Rotating Integrals

This post assumes you know some trigonometry and can do some integral calculus on it.

My little brother in college complained about sines and cosines in an indefinite integral today. Specifically, an integration that he was annoyed with was:

$\int{x^3\,sin(x)\,dx}$

It sounded like a whine at first but upon closer inspection, his complaints are not unwarranted. This is indeed a troublesome kind of indefinite integral. There is a trick to doing it quickly, but let us try and perform this particular integration normally first:

Since this is integration of the product of two functions, we perform integration by parts.

$\int{u\,dv} = uv\, – \int{v\,du}$

I choose $u = x^3$ and $dv = sin(x)\,dx$ mostly to avoid having to integrate the $x^3$ and have fractions mess up my working.

This results in:

\begin{aligned} u & = x^3 \cr du & = 3x^2\,dx \cr dv & = sin(x)\,dx \cr v & = -cos(x) \end{aligned}

Hence:

\begin{aligned} \int{u\,dv} & = uv\, – \int{v\,du} \cr & = -x^3\,cos(x)\, – \int{–3x^2 cos(x)\,dx} \cr & = -x^3\,cos(x) + \int{3x^2 cos(x)\,dx} \end{aligned}

We seem to have ended up with another integral to solve: $\int{cos(x)(3x^2)\,dx}$, so we shall proceed to solve it.

\begin{array}{cc} \begin{aligned} \int{u\,dv} & = uv\, – \int{v\,du} \cr \int{3x^2 cos(x)\,dx} & = 3x^2 sin(x)\, – \int{6x\,sin(x)\,dx} \end{aligned} & \begin{aligned} u & = 3x^2 \cr du & = 6x\,dx \cr dv & = cos(x)\,dx \cr v & = sin(x) \end{aligned} \end{array}

How vexing. We have yet another integral to solve. Let us proceed anyway.

\begin{array}{cc} \begin{aligned} \int{u\,dv} & = uv\, – \int{v\,du} \cr \int{6x\,sin(x)\,dx} & = -6x\,cos(x)\, – \int{-6\,cos(x)\,dx} \cr \int{6x\,sin(x)\,dx} & = -6x\,cos(x) + \int{6\,cos(x)\,dx} \end{aligned} & \begin{aligned} u & = 6x \cr du & = 6\,dx \cr dv & = sin(x)\,dx \cr v & = -cos(x) \end{aligned} \end{array}

Once again, we have to solve another integral. This one seems to be the last though, because the x term has withered away into oblivion, it becomes a trivial integral.

\begin{aligned} \int{6x\,sin(x)\,dx} & = -6x\,cos(x) + \int{6\,cos(x)\,dx} \cr & = -6x\,cos(x) + 6\int{cos(x)\,dx} \cr & = -6x\,cos(x) + 6\,sin(x) \end{aligned}

Bringing it all together, we have:

\begin{aligned} \int{x^3\,sin(x)\,dx} & = -x^3\,cos(x) + \int{3x^2 cos(x)\,dx} \cr & = -x^3\,cos(x) + 3x^2 sin(x)\, – \int{6x\,sin(x)\,dx} \cr & = -x^3\,cos(x) + 3x^2 sin(x)\, – [-6x\,cos(x) + \int{6\,cos(x)\,dx}] \cr & = -x^3\,cos(x) + 3x^2 sin(x)\, – [-6x\,cos(x) + 6\,sin(x)] \cr & = -x^3\,cos(x) + 3x^2 sin(x)\, + 6x\,cos(x) – 6\,sin(x) \end{aligned}

Phew! Basically due to the fact that you have to integrate by parts as many times as the power of x, the amount of integration you have to do increases proportionally to x’s power, making it troublesome.

## A Closer Look

However, if you have been following the working closely, you will notice that all we are really doing when we integrate by parts is performing a derivative of both parts over and over again. Notice how the $sin(x)$ term keeps flipping between $sin(x)$ and $cos(x)$.

Better yet, if you differentiate $sin(x)$ 4 times, the entire thing rotates back to $sin(x)$ again. Since we integrated $sin(x)$ 4 times in this example, that is exactly what happened. I call this kind of integral a Rotating Integral because the trigonometric term constantly rotates between the four combinations of negative and positive, $sin(x)$ and $cos(x)$.

Notice also that the $u$ term we chose right at the beginning was differentiated at every step until it became a constant:

$x^3\,\rightarrow\,3x^2\,\rightarrow\,6x\,\rightarrow\,6$

If we factorize the result, we get the following:

$-x^3\,cos(x) + 3x^2 sin(x)\, + 6x\,cos(x) – 6\,sin(x) = (-x^3 + 6x)\,cos(x) + (3x^2 – 6)\,sin(x)$

Notice how there is a negative term in both the $sin(x)$ and $cos(x)$ groups. Since the starting $u$ we chose was positive, we can see the $sin(x)$ go through

$-cos(x)\,\rightarrow\,-sin(x)\,\rightarrow\,cos(x)\,\rightarrow\,sin(x)$

We can see a pattern of repeating derivatives distributed among the $sin(x)$ and $cos(x)$ groups. If we recognize that

$\begin{array}{cc} x^3 &\rightarrow\,3x^2 &\rightarrow\,6x &\rightarrow\,6 \cr f &\rightarrow\,f^\prime &\rightarrow\,f^{\prime \prime} &\rightarrow\,f^{\prime \prime \prime} \cr \end{array}$

And from our rearranged result,

$(-f + f^{\prime \prime})\,cos(x) + (f^\prime – f^{\prime \prime \prime})\,sin(x)$

This peaked my interest, and I decided to check the results of other powers of x:

$\begin{array}{cc} 3 & (-x^3 + 6x)\,cos(x) & + (3x^2 – 6)\,sin(x) \cr 4 & (-x^4 + 12x^2 – 24)\,cos(x) & + (4x^3 – 24x)\,sin(x) \cr 5 & (-x^5 + 20x^3 – 120x)\,cos(x) & + (5x^4 – 60x^2 + 120)\,sin(x) \cr 6 & (-x^6 + 30x^4 – 360x^2 + 720)\,cos(x) & + (6x^5 – 120x^3 + 720x)\,sin(x) \cr 7 & (-x^7 + 42x^5 – 840x^3 + 5040x)\,cos(x) & + (7x^6 – 210x^4 + 2520x^2 – 5040)\,sin(x) \end{array}$

## The Solution!

After doing some more wolframming, I found that we have a general solution of:

$\int{f\,sin(x)\,dx} = (– f + f^{\prime \prime} – f^{\prime \prime \prime \prime} + \cdots – f^{\prime2n})\,cos(x) + (f^{\prime} – f^{\prime \prime \prime} + f^{\prime \prime \prime \prime \prime} – \cdots + f^{\prime2n+1})\,sin(x)$

Where odd numbered differentials are on the $sin(x)$ group and even numbered differentials are on the $cos(x)$ group, and each differential alternates between negative and positive signs. The $sin(x)$ group starts with a positive sign while the $cos(x)$ starts with a negative sign.

Of course, the same rule slightly modified works for the $cos(x)$ form:

$\int{f\,cos(x)\,dx} = (f – f^{\prime \prime} + f^{\prime \prime \prime \prime} – \cdots + f^{\prime2n})\,sin(x) + (f^{\prime} – f^{\prime \prime \prime} + f^{\prime \prime \prime \prime \prime} – \cdots + f^{\prime2n+1})\,cos(x)$

Note that this only works if $f$ eventually derives into 0, and if the trigonometric function is $sin(x)$ or $cos(x)$. Otherwise you will need another method of finding the solution.

## Weird Application

With this we can try and verify that what I said was true by trying to use it on a very special function that does not change even when under differentiation: $e^x$. Even though I said this only works if it eventually derives to zero, let us try anyway:

So for this let us try the following function:

$\int{e^x\,sin(x)\,dx}$

This should give us

\begin{align} \int{e^x\,sin(x)\,dx} & = (– e^x + e^x – e^x + \cdots)\,cos(x) + (e^x – e^x + e^x – \cdots)\,sin(x) & \cr & = (– e^x + e^x – e^x + \cdots)\,cos(x) – (-e^x + e^x – e^x + \cdots)\,sin(x) & \text{If we flip the sign}\cr & = (– e^x + e^x – e^x + \cdots)(cos(x) – sin(x)) & \text{Hence} \cr & = (-1+1-1+\cdots)(e^x)(cos(x) – sin(x)) & \end{align}

We are left with a strange, almost nonsensical looking sum that looks like it should evaluate to zero, but does it?

\begin{align} S & = -1+1-1+1-1+1\cdots \cr S & = -1-S \cr 2S & = -1 \cr S & = –\frac{1}{2} \cr \end{align}

Oh so that infinite sum actually evaluates to $–\frac{1}{2}$. This may seem dubious but proving that is outside the scope of this post. (Lots of people have done it anyway.) This just means that our thing above should look like this:

\begin{align} \int{e^x\,sin(x)\,dx}& = (-1+1-1+\cdots)(e^x)(cos(x) – sin(x)) \cr & = –\frac{1}{2}(e^x)(cos(x) – sin(x)) \cr & = \frac{1}{2}(e^x)(sin(x)-cos(x)) \cr \end{align}

But is this correct? We must try and check again that we did not just stumble on nonsense.

## Verifying again

To verify, let us perform it again with good ol’ integration by parts.

$\int{e^x\,sin(x)\,dx}$

\begin{array}{cc} \begin{aligned} \int{u\,dv} & = uv\, – \int{v\,du} \cr \int{sin(x)(e^x)\,dx} & = sin(x)(e^x)\, – \int{e^x\,cos(x)\,dx} \end{aligned} & \begin{aligned} u & = sin(x) \cr du & = cos(x)\,dx \cr dv & = e^x\,dx \cr v & = e^x \end{aligned} \end{array}

We have to do this integration again for $cos(x)$, so let us do so.

\begin{array}{cc} \begin{aligned} \int{u\,dv} & = uv\, – \int{v\,du} \cr \int{e^x\,cos(x)\,dx} & = cos(x)(e^x)\, + \int{e^x\,sin(x)\,dx} \end{aligned} & \begin{aligned} u & = cos(x) \cr du & = -sin(x)\,dx \cr dv & = e^x\,dx \cr v & = e^x \end{aligned} \end{array}

Substituting back, we get:

\begin{aligned} \int{sin(x)(e^x)\,dx} & = sin(x)(e^x)\, – \int{e^x\,cos(x)\,dx} \cr & = sin(x)(e^x)\, – cos(x)(e^x)\, – \int{e^x\,sin(x)\,dx} \cr 2\int{sin(x)(e^x)\,dx} & = sin(x)(e^x)\, – cos(x)(e^x) \cr \int{sin(x)(e^x)\,dx} & = \frac{1}{2}(e^x)(sin(x)-cos(x)) \end{aligned}

Which is the same result we got before, proving that we didn’t screw up.

With this, you can find the indefinite integral of expressions of this form by simply differentiating the non-trigonometric side. You can also use identities, factorizations and other ways to rearrange an expression into this form to use this technique to integrate this otherwise troublesome class of Rotating Integrals.

# Finding Square Roots

Most of us has at some point in our lives used the Math.sqrt() function. We would even know that $\sqrt{2} = 1.414\ldots$. However, we never really give how it is implemented a second thought. Thus is the power of a tight abstraction. For those of us who lived in the age of calculators, finding square roots has always been a source of mystery. For me during my teenage years, calculating a square root has really just been about looking up a numeric table.

Obviously this very simple problem would have been solved close to the beginnings of civilization, and blazingly fast methods must already exist. But for sake of exploration, let us examine how we would implement a square root function should we need to.

$\sqrt{-n} = \sqrt{-1} \sqrt{n}$

Since square roots of negative numbers are really just square roots of positive numbers times the square root of a negative number, let us focus our efforts on finding just the square roots of positive real numbers.

One way to find a square root is by looking at this graph of $y=x^2$. Simply drawing a horizontal line from the y-axis from whose square root we desire to the graph line, and then find the intersect on the x-axis. So say we want $\sqrt{30}$.

We simply draw a line from the y-axis at 30 to the graph and find that it is located somewhere between 5.4 and 5.6 along the x-axis, but closer to 5.4. This makes sense. A quick glance at google) shows $\sqrt{30} = 5.4772255$

This in-betweenness tells us something about square roots. Besides the nice square roots, the ugly ones don’t seem to have any end to their decimals. Hence, they are always in between two numbers that we know. This means that we can keep making educated guesses about where the square root is until we get close enough.

$0 < \sqrt{30} < 30$

Let’s apply this knowledge first and do the calculation by hand for the number 30 since we know its answer so we can verify that our method is correct.

First of all, we must figure out where the root is. Obviously, it would be less than 30, since you need to multiply it by itself to get 30. Its also more than zero, because as mentioned earlier, you will always end up with a positive number.

$\sqrt{30} \approx 15?$

The number most in between of 0 and 30 is 15. But is this the number? The only way to check is by squaring 15.

$15^2 = 225$

Nope. Not even close. 225 is waaay too big. But this tells us something: the root cannot be bigger than 15 because if it was, we would get an even bigger square than 30, so we must look to the left of 15. This means our upper bound is now 15 instead of 30.

$0 < \sqrt{30} < 15$

The number in the middle this time is 7.5. Is this our number?

$7.5^2 = 56.25$

Still too big. This means our root must be smaller than 7.5, but still bigger than zero.

$0 < \sqrt{30} < 7.5$

The number in the middle of 7.5 is 3.75. Perhaps this is our number?

$3.75^2 = 14.0625$

It seems the square of 3.75 is smaller than 30! This means that that the root must be bigger than 3.75 since if the number is smaller, we get an even smaller square, we must be getting close.

$3.75 < \sqrt{30} < 7.5$

Now we must find out what number is in the middle. Well, from our geometry class we know that the average of two numbers is the middle number, so $\frac{3.75 + 7.5}{2} = 5.625$. We simply need to square this number to check now.

$5.625^2 = 31.640625$

It seems we have gotten a lot closer now that we moved the lower bound up. Our guess of 5.625 seems really close to the answer now. But because its square is bigger than 30, we know the root must be smaller than 5.625, so:

$3.75 < \sqrt{30} < 5.625$

The middle number of this is then 4.6875.

$4.6875^2 = 21.972656$

That’s really far away. This can’t be the answer, but lets keep looking. Since its square is smaller than 30, we change the lower bound to this.

$4.6875 < \sqrt{30} < 5.625$

This time, the middle number is 5.15625.

$5.15625^2 = 26.586914$

We are getting closer again, but this is smaller than 30, so our lower bound should be changed.

$5.15625 < \sqrt{30} < 5.625$

I know, it starts to get frustrating at this point since we seem to be crawling, but lets stay on for another two rounds. The number in the middle is 5.390625

$5.390625^2 = 29.058838$

Our result now is very very close to 30. It is smaller, so the lower bound should be changed.

$5.390625 < \sqrt{30} < 5.625$

The middle number is now 5.507812. That means the square is…

$5.507812^2 = 30.336$

Okay! We are pretty close, but the important thing is we know it will eventually arrive at the answer because we kept getting closer to the answer as we went. Now, we know computers do things faster than we do so let us write some code!

So first of all, we want to have our initial guess. The number to be rooted is 30.

Then, we write down what we did in every iteration.

Next, we tell the computer to do it over and over again until the square of our guess is the number itself, so we wrap that in a loop.

Great, that seems simple enough. Let’s put it all together.

Notice I put in a printf to check the bounds as it runs. This makes it more interesting and actually shows us what’s going on. Lets run it now.

After all that spam, now we can see that the result is the square root of 30, since we know 5.477226 is the square root of 30.

This method of calculating square roots is called the Bisection method, also known as a binary search. Its not very efficient as you can see, and takes a long time to converge.

Now we know what it takes to find a square root! Now you can simply try it with a bunch of numbers and get the right answer. You can also compare the answer with the actual sqrt() function provided by the standard library.

It is also interesting to note that you can use this method to get cube roots, quartic roots and quintic roots too.

# Script for Setting Up FTP to a Folder on a Mac

This script came in handy many times when I had to share things with my other laptops or windows users.

On a Mac you should have Ruby installed. Macs normally come with an ftpd whose frontend has been ripped out, so you can only do this on the command line. Basically, write this to a script file (lets call it setftp)

and then use it by typing:

./setftp /directoryIWantToShare

Tested on Mavericks

With this you will have an ftpd turn on whose root folder is the folder you named. in this case it is /directoryIWantToShare

# Hexlife Part 2

Yesterday while fooling around, I wanted to build a tri-star, but I made a mistake and made this instead:

But to my surprise I did not find a tri-star, instead this grew out of it:

Since I had made changes to tint different generations with different colors, I wanted to test it. Wondering if I had introduced bugs, I went through my git history to make sure everything was fine. After reverting, it turned out I hadn’t. It was just a variation I had never tried.

In my earlier post, I said that still lifes could not exist under the rules because cells could not live long enough, but these structures persisted. In other words, they were non-transient cells.

It turns out that most of the cells here had 3 neighbours, and that gave them a long lease on life. Also, all the way along the columns, there were no dead cells that had 2 cell neighbours, meaning that the cells would never die because of their spawn.

However, the ends are kept alive by an interesting “flower” oscillator that resembles the twinkling star oscillator, and this keeps the entire structure alive. Basically, you can make an infinitely long structure, but you must place the oscilators at the ends.

Armed with this knowledge, I set out to create a simpler column.

I call this kind of life form a Semi-still life, since a large proportion of it can be unchanging, but must be supported by oscillation.

Perhaps there are more of these under these rules…

# Hexlife

Last week I wrote a simple Game of Life variation that runs on a hexagonal grid after viewing an example on Wikipedia. The cells considered are the immediate neighbours. The rules are:

• if a dead cell is surrounded by 2 live cells, the dead cell becomes alive
• if a live cell is surrounded by 3 or 4 live cells, it stays alive
• in all other cases, the cell dies

One of the interesting things about this set of rules is the absence of still life. The reason for this absence is that a cell can only be born under conditions where a live cell would otherwise die. This means that whenever a cell is born it is likely to kill its parents.

Hence the only known stable life forms in this set of rules at the moment are oscillators. No gliders have been found yet.

You can play with my implementation here: http://davidsiaw.github.io/hexlife/

In my search for a glider in this set of rules, I found a bunch of oscillators and decided to record them:

Picture Description
The 2cell is the simplest and most common oscillator. It is left behind by almost any unstable life.
The Spinner is another common oscillator that has a period of 2. It simply looks like it is spinning.
The Mouth looks like a spinner but instead of 2nd level adjacent, they are 3rd level adjacent, so it looks like it is always opening and closing
The Needle has a period of 2 and flips back and forth.
The Dancer has a period of 2 and looks like its swinging back and forth.
The Star looks like a twinkling star. It is quite peculiar in the sense that it has a period of 3.
The Rotator has a period of 4 and looks like it is spinning in a weird way.
The Bat is perhaps the most common 4-period oscillator you get from random starts.
The Snake is a period-4 oscillator that looks like a snake that wiggles around
The Morpher is really simple but really interesting-looking oscillator. It has got a period of 12 and transforms into all its possible orientations. This means that even though it has no symmetry, it does not matter which way you orient it, it will achieve the same configurations. I call this temporal homogeneity.
The Tristar is a period-12 oscillator that twinkles in a more elaborate way than the star.
The Swimmer is an oscillator with a period of 48. You can actually find it on the Wikipedia page I linked. It seems like a lost fish swimming back and forth.

# Using Travis to Deploy My Blog

I’ve been using Travis-CI more and more as a platform from which I can deploy things, due to the fact that we can run any code on it. Today I made it so that this blog is deployed to gh-pages when pushed. I have also set up my personal blog to be pushed this way as well.

Why did I use Travis-CI instead of Shippable or other CI systems for this? Well, its mainly due to the fact that I was already using Travis, and the tools (specifically the Travis gem) are quite mature. Many of the things that are quite troublesome, like generating a key and placing decrypt commands into the .travis.yml, are now covered in simple command line instructions.

My blog uses Jekyll + Octopress, but I don’t like the limitations imposed by github on the templates I can use. So I decided it was better to simply upload the finished product. First of all, I push all my blog sources up to a public repository at https://github.com/davidsiaw/davidsiaw.github.io.source

While the setup is easy, its not obvious that you can do this. Hopefully this will go some way to helping others who want to circumvent the github limitations on their gh-pages content as well.

In this post I will show you how to set it up. First of all, I create a key that will give push access to my blog’s repository at https://github.com/davidsiaw/davidsiaw.github.io by calling up ssh-keygen

I then place the deploy_key.pub in my github repository.

Next, I make use of the Travis gem to encrypt my private key. I add the --add parameter to make it write to my .travis.yml (I am in the directory.)

This gives me a deploy_key.enc that is my encrypted private key.

In order to use this key, I need to add some more lines to .travis.yml to enable it to push to github. First of all, I need to install the key into the .ssh folder so git can use it. I also chmod it so ssh will not complain.

With this, I can now tell Travis to push the generated files. All I do is tell it to generate the site (since this is just Jekyll), and then call my deploy script which simply pushes the right stuff up to github.

With this, my website gets updated everytime I push my changes to https://github.com/davidsiaw/davidsiaw.github.io.source, Travis will automatically update my blog.

# New Blog

As all of you may have noticed the theme has changed and the URL seems to have been redirected to Github. I decided to try out this github pages thing and it seems to work pretty well. The static content idea is really attractive to me and really gives me much more power over the pages with a small tradeoff for convenience. But convenience is a non-issue for most programmers who can write tools to make the inconvenient convenient.

What I want to talk about here is actually how I started doing this and why I ended up moving my blog across, along with what I learned along the way about Jekyll-Bootstrap, Octopress, Jekyll, the annoying problems and the process of migrating from a WordPress blog to a Octopress/Disqus duo.

# Online vs Batch Learning

While debugging neuron, my new neural network simulation application, I found some (visually) interesting differences between online and batch learning. While batch learning is usually touted as a better form of learning, I found that the two don’t seem to make much difference, except for a steppy curve from online learning, as I would expect as the gradient changes differently if you keep calculating the values in a cycle instead of calculating the combined gradient of all the training data.

Here are the results from training a 2-2-1 network with biases with XOR as training data:

with the weights initialized to:

double[] wx0 = { 0.1, 0.2 }; double[] wx1 = { 0.3, 0.4 }; double[] wx2 = { 0.5, 0.6 };

double wh0 = 0.7; double wh1 = 0.9; double wh2 = 1.1;

Where wx are the values between the input and hidden layer and wh are the values between the hidden and output layer.

Here is the network topography:

Here is the training error: