Tribbles, Turing & TPU2

Before reading, note that the topic of solipsism is discussed somewhat in earnest herein. A bit of rambling should therefore be expected.

I suffer from aspirational polymathism. It is a disease that has plagued me since my earliest memories. As a child I was not content to study just one or two subjects. I craved constant stimulation from a wide array of new subjects, new ideas, and new challenges, and aspired to master them all. Had we had more of the prescriptive remedies so popular today for such childish thinking, I might have been diagnosed as hyperactive or ADD and given meds to dissuade such ambitions. Then perhaps I would have had less interest in such a wide and crippling array of fields of study. And I might have learned to appreciate golf. But those were less sophisticated times. Alas, I accept my handicap and do not allow it to disturb my otherwise placid nature. In fact, I am fortunate to have found an avocation that sometimes even lauds this general-scholar disease.

One of the attributes of aspirational polymathism is a constant need for stimulation from new and deeper subjects to study. And it can’t be just one at a time. At any given time, I don’t have just one book open in the Kindle app on my iPad, but generally three or four active in my queue, processing in parallel. I read, switch off to another, read more, and so on. Plus I have Netflix or Amazon Prime or HBO running in the background, sometimes with simultaneous video streams from one course or another, sometimes while working on code or email or PowerPoint, all at the same time. It can be maddening, but also rewarding. Sometimes I fool myself into believing that I actually get more accomplished by giving full vent to my disease. I fool myself into thinking I learn more, and faster, when I indulge my manic mode.
Limitless
I confess that I also enjoy an over-the-counter nootropic stack of my own concoction, sort of a Limitless fantasy I have pursued for the past couple of years, and frankly it does seem to help my focus considerably. But aspirational polymathism does come with its own set of drawbacks. One of these is a tendency to jump very quickly to myriad probable downstream outcomes, with bifurcating branches of complex scenarios, most of which are well outside mainstream thinking, as it were. My initial take on Google’s first TPU announcement, for example, was one of those moments. The more recent emergence of TPU2 and all it implies is another. But first a little more background.

So I have been reading SuperIntelligence and at the same time reading a text on Quantum Mechanics, and at the same time digesting Alan Turing’s 1950 paper for an AI course I was auditing online. Although I had read about Turing’s paper many times and read reviews and critiques, I had never actually read the original.

Alan Turing

Turing’s paper is probably the cornerstone of all Artificial Intelligence R&D today, if not also the backbone of Theoretical Computer Science. I remember something Scott Aaronson wrote about Turing’s paper: how 70% of AI today can be traced back to that work. But it’s not the Turing Machine, nor the Church-Turing Thesis, nor the Imitation Game itself which struck me as especially germane in Turing’s paper. All those innovations I have read about, considered, appreciated, and more or less learned over the past few decades. No, it wasn’t any of that from Turing’s seminal paper. It was something very different. It was the statements he made about solipsism. The fact that he used the term three times in the paper is probably not all that surprising given the nature of the question: Can Machines Think? When presenting his counter-arguments to those who, at the time, would dismiss the very question out of hand, Turing outlines a series of arguments to refute the detractors. He categorized objections by perspective: theological objections, mathematical objections, and so forth. One in particular, from the perspective of consciousness, caught my fancy.

Can machines think? First we need to stipulate what we mean by ‘think,’ which is not as straightforward as one might….think. In denying the validity of the Imitation Game (i.e., the Turing Test), the objector in question, Sir Geoffrey Jefferson, a British neurologist and pioneering neurosurgeon, expressed the view that writing a sonnet or composing a concerto from thoughts and emotions actually felt was the proper basis for judgment; a machine was incapable of such feeling, and therefore could not think. Per Turing:

“According to the most extreme form of this view the only way by which one could be sure that a machine thinks is to be the machine and to feel oneself thinking. One could then describe these feelings to the world, but of course no one would be justified in taking any notice. Likewise according to this view the only way to know that a man thinks is to be that particular man. It is in fact the solipsist point of view. It may be the most logical view to hold but it makes communication of ideas difficult.” (emphasis mine)

With humor and brilliance Turing refuted Jefferson’s argument; taken to its logical conclusion, Jefferson’s view necessarily leads to hard-core solipsism. It may be the most logical view indeed. But, as Turing observed, the motivation to communicate just about anything is a bit hampered when sentience is not presumed to exist in the ‘other.’ Therefore, let us stipulate that other human beings, despite frequent evidence to the contrary, do actually think. Further, let us accept communication itself as the evidence of thinking. I believe Wittgenstein would agree. Without communication in some manner, evidence cannot be gathered.

I believe the philosopher Wittgenstein’s influence on Turing is in evidence here. Turing did attend lectures by Ludwig Wittgenstein at Cambridge in 1939. Per notes from students attending those lectures (another book I read in tandem), Turing and Wittgenstein enjoyed many robust exchanges. Given the philosopher’s views on solipsism and the limits of understanding, it follows that Turing was influenced in some way by the assertion that the limits of my language mean the limits of my world.
Ludwig Wittgenstein
If thinking is manifested by the process of communication, indicating will and purpose, then we might agree that human life at all levels implies some modicum of thought. We ought not conclude otherwise. So to Turing’s question: can machines think? The question necessarily follows: can machines communicate? The two, per Wittgenstein, are linked.
The TPU
So now let’s consider TPU2 and all it implies. In the ASIC v. GPU battle, even with awesome new GPU options, the ASIC, at least for TensorFlow, will always win. Google’s TPU is TensorFlow burned into a chip. That’s great. Cool stuff. So where can I buy one? Where might I get a rack of TPUs to power my next AI startup? Answer: you can’t. You have to rent them from the Google Cloud. Maybe that’s not a big deal. But maybe it is.

As the market for deep learning grows, if it grows at the levels predicted, the differentiation provided by the TPU will likely suffice to give Google an edge akin to the one it already enjoys in search. No need to create and sell hardware: just rent out intelligence, and take in mass user processing and data to sweeten the deal. I’m not sure when the “Don’t be evil” line gets crossed, but something tells me we may be getting close.

Make no mistake: I’m a big fan of TensorFlow. It’s awesome. Which brings up the Tribble metaphor.
The Trouble with Tribbles
You remember Tribbles from Star Trek, of course. Cool, happiness-inducing, soft and sweet little pets. Everybody wanted one. One arrived on the Enterprise and the entire crew was enchanted by the sweet Tribble. In relatively short order, the little thing made a baby Tribble. Now more of the crew could pet one and love on one. Then the two Tribbles gave rise to four….and so on. You get the moral of the story, yes? Exponential growth, even of really cool stuff, might have extremely deleterious unintended consequences. Our TPU enchantment may give rise to some really sweet and productivity-increasing applications, stuff we can’t yet even imagine in this early chapter of the innovation-galore Network Age. But beware the Tribbles. They may be hiding in our closets or under our beds.

Closing this entry, it was Wittgenstein who said, “Nothing is so difficult as not deceiving oneself.” The same is true of the aggregate of ourselves. I am quite sure I deceive myself with my aspirational polymathism, and I am not much more than borderline trainable and merely opinionated. And I am also quite sure, besides me, other people actually think. If there’s hope for humanity, it’s in software, solipsists notwithstanding. Can machines think? Perhaps we will know soon enough. But beware the Tribbles. They too come with progress.

Posted in Big Data

Sentiment and Sentimentality

With a wink at Jane Austen, the world seems to have grown rather enamored with Artificial Intelligence recently. That’s not to say that AI hasn’t been a pop-culture archetype for as long as I can remember. In terms of actual software and solutions, however, the ebb and flow of AI is something like a Jane Austen novel played out over the past 50 years. And per Ms. Austen, “If things are going untowardly one month, they are sure to mend the next.” Recently, though, things seem to have jumped up a notch, and a discernible phase shift has occurred. Probably due to a Moore’s Law tipping point coupled with unicorn-hunting packs of investors, the widespread commercialization of good old fashioned AI (GOFAI), turbocharged with GPUs and ASICs, is now actually very real. But a neural network is not always the best, nor the only, choice.

Take, for example, a recent blog entry from OpenAI, the Unsupervised Sentiment Neuron. The neuron in question, though, is not your standard neural network variety, but something far simpler to read off. Detecting sentiment in Amazon reviews reduced, unsupervised, to a single ‘neuron’: a model trained only to predict the next character in a sequence learned, along the way, an interpretable feature. Simply predicting the next character in Amazon reviews resulted in discovering the concept of sentiment. Per the researchers:

“The sentiment neuron within our model can classify reviews as negative or positive, even though the model is trained only to predict the next character in the text.”

Classifying sentiment accurately from short bursts of text using a next-character prediction engine is pretty awesome. The thing is, the sentiment use case was never a target, never pursued, and never predicted with the next-character work. It was one of those sweet side effects that often happen when we pursue the unknown. Like the microwave oven, the technology for which was discovered while researching radar gear. Or penicillin, discovered quite accidentally thanks to a contaminated culture dish. Metaphors notwithstanding, the emergence of an unsupervised classification model from the training and execution of a supervised learning model is very cool. And beyond the cool factor, new research into linguistics and information theory is clearly implied.
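The striking part of the finding is how little machinery the final classification needs: once the representation is learned, the classifier is a threshold on one unit. Here is a toy sketch in Python, with synthetic activations standing in for the real model’s hidden state; none of this is OpenAI’s actual model or code, just an illustration of the idea.

```python
import numpy as np

# Hypothetical stand-in: imagine h(text) is the hidden state of a trained
# character-level language model after reading a review, and one coordinate
# of h happens to track sentiment. We fake that coordinate's activations.
rng = np.random.default_rng(0)
pos = rng.normal(loc=+1.0, scale=0.5, size=200)  # "sentiment unit" on positive reviews
neg = rng.normal(loc=-1.0, scale=0.5, size=200)  # "sentiment unit" on negative reviews

activations = np.concatenate([pos, neg])
labels = np.concatenate([np.ones(200), np.zeros(200)])

# A single threshold on a single neuron is the entire classifier.
preds = (activations > 0.0).astype(float)
accuracy = (preds == labels).mean()
print(f"accuracy from one thresholded unit: {accuracy:.2f}")
```

The point of the sketch is only that, once such a unit exists, sentiment classification collapses to a one-number decision rule.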

The epic fail of prediction systems the world witnessed on Election Day in the USA in 2016 may have discredited data science to some degree, or at least given us pause when it comes to machine learning, prediction engines, and the stock we place in such innovations. But old-school methods, like the next-character predictor, may yet hold surprises for us. I am a huge fan of software and especially of AI systems. And as it turns out, our sentimentality for more mature pursuits like NLP is not misplaced.

Posted in Big Data

Meet me on the Corner of State and Non-Ergodic

Do you know the Edge? It’s one of those web sites I read periodically…a place to seek out new ideas, emerging memes, and interesting discussions. Founded in 1995, they got traction a couple of years later, so this year (2017) they celebrated their 20th anniversary. The Edge is one of those sites that has survived entirely due to well-considered, intelligent, quality content. Their motto: To arrive at the edge of the world’s knowledge, seek out the most complex and sophisticated minds, put them in a room together, and have them ask each other the questions they are asking themselves.

So I occasionally read postings on the Edge. They ask an annual question of a small group of intellectuals each year and provide the essays in a loosely coupled collection of ideas. Last year, for example, their question was, What do you consider the most interesting recent [scientific] news? What makes it important? The responses, as you might imagine, were wide and varied. Have a look here at the table of contents for 2016. Each short essay makes for excellent reading. The collection for 2017 is, as one might expect, also quite salient.

The Edge question for 2017: What scientific term or concept ought to be more widely known?

Two essays from the 2017 collection I now juxtapose here. The two, by coincidence, hail from two men whose work I have cited before: Stuart A. Kauffman and Scott Aaronson. Both are apparently acolytes of a humanist view of reality, akin to Richard Dawkins, in outlook if not tone. From what I’ve read by Kauffman, whom I do admire, he does leave a little wiggle room for a modicum of awe. But Aaronson appears to exhibit far less humility. They are both highly accomplished men in their fields, and have a right to be unabashed in their views. But, as my father used to say, we all put our pants on one leg at a time. It is from that leveling and loving thought this entry finds inspiration.

As you might guess, the terms suggested by Aaronson and Kauffman map to elements of the title of this blog: State and Non-Ergodic. These are the terms the two gentlemen in question assert should be of greater concern to the scientific community.

Let’s take them separately before comparing. First the concept of state, which comes from Aaronson, a quantum computing expert. In computer science we understand and appreciate finite-state machines — essentially a mathematical model of computation, with emphasis on finite. From Aaronson’s essay, a state also contains a system’s hidden reality. Per the essay, state “…determines its [the system’s] behavior beneath surface appearances. But in another sense, there is nothing hidden about a state — for any part of the state that never mattered for observations could be sliced off with Occam’s Razor, to yield a similar and better description.”
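The computer-science sense of state can be made concrete with a minimal sketch (my own toy example, not Aaronson’s): a two-state parity machine whose entire future behavior is determined by which state it currently occupies. Nothing about its history matters beyond the state it left behind.

```python
# A minimal finite-state machine: a parity checker over a binary string.
# Its whole "hidden reality" is one of two states; everything the machine
# will ever do is determined by which state it is in right now.
TRANSITIONS = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd",  "0"): "odd",  ("odd",  "1"): "even",
}

def run_fsm(bits, state="even"):
    for b in bits:
        state = TRANSITIONS[(state, b)]
    return state

print(run_fsm("1101"))  # three 1s seen, so the machine ends in state "odd"
```

Any part of the machine that never influences a transition or an output could, per the Occam’s Razor point in the quote, be sliced away without changing the description.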

Semantic objections aside, Aaronson is not arguing for a soft deterministic model of the universe, but rather for a strongly predictable computational model. All we’re missing is an understanding of the hidden variables, to the extent we even need them. This is especially interesting given that Aaronson clearly has a world-class understanding of quantum physics. His reference to, and dismissal of, Bohmian mechanics is notable. But the point is, from Aaronson’s perspective, state is actual, or at least computationally highly probabilistic; it is as real as real can get, and it even transcends the otherwise indeterminate nature of quantum physics proffered by other interpretations. Per Aaronson, what is needed is a fuller understanding of the ‘hidden variables’ of reality. That appears to be Scott Aaronson’s view of the state of objective reality in our discussion.

By contrast, Kauffman (with far fewer words) suggests the term non-ergodic should be the hashtag science trending meme in 2017. To better get our heads around non-ergodic, we first need to understand something of the ergodic hypothesis, which states that “…over long periods of time, the time spent by a system in some region of the phase space of microstates with the same energy is proportional to the volume of this region, i.e., that all accessible microstates are equiprobable over a long period of time.” (Emphasis mine)

So if that is ergodic, then non-ergodic must mean that not all accessible microstates are equiprobable over a long period of time.
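A toy simulation can make the distinction concrete (my own illustrative example, not from Kauffman’s essay). Compare a process that keeps re-sampling, whose long-run time average matches the ensemble average, with a process whose bias is frozen once at the start: a single long trajectory of the frozen process never explores the other branch’s microstates at all.

```python
import random

random.seed(1)

def time_average_fair(n):
    # Ergodic flavor: every flip re-samples the same distribution, so the
    # time average along any one trajectory approaches the ensemble mean (0.5).
    return sum(random.random() < 0.5 for _ in range(n)) / n

def time_average_frozen(n):
    # Non-ergodic flavor: the bias is chosen once, then frozen forever.
    # One trajectory settles near 0.1 or near 0.9 and never visits the
    # microstates of the branch it did not take.
    bias = random.choice([0.1, 0.9])
    return sum(random.random() < bias for _ in range(n)) / n

print(round(time_average_fair(100_000), 2))    # close to 0.5 on any run
print(round(time_average_frozen(100_000), 2))  # close to 0.1 or 0.9, per branch
```

In the frozen case the ensemble average is still 0.5, but no single history ever exhibits it; that gap between time averages and ensemble averages is the non-ergodic signature.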

Billiard dynamics. Note the difference between (a) the circular billiard and (b) the chaotic dynamics in the cardioid billiard.

Which raises a question: if not all states are equiprobable, then what determines the missing states? Why are not all states likely? Ergodicity is often assumed in the statistical analysis of computational physics. Yet Kauffman argues the non-ergodic deserves significantly more attention….why?

Kauffman is a biologist, an expert in quantum physics, and probably a bit of a philosopher, though I’m not certain he would agree with all those. He has long been a proponent of the view that the infinite adjacent possible is perhaps entailed by something beyond Darwin’s blind fitness rubric, at least insofar as biology is concerned.

Though an infinite number of combinations of DNA are possible, per Kauffman, only a tiny subset of those have been or will be explored in the entire life span of the known universe. Why? Morphological constraints? Something else? We simply don’t know. But a revolution in thinking implied by Kauffman may very well be the key to unlocking the next door of human understanding. Hence, non-ergodic properties of evolutionary biology ought to be main stream thinking fodder. Because we don’t know why the biosphere, and by implication the universe that contains it, is as it is.

So, on the one hand we have Aaronson, with the view that the Mind of God can be discerned in all cases by simply determining the essential hidden variables (via a highly probabilistic computational model). And on the other, Kauffman, who sees confounding patterns, but implies this too may simply require better understanding of the hidden variables of nature to yield a non-ergodic model — which too would rival the Mind of God in ambition.

See how I introduced the concept of “the Mind of God” there? I thought that was clever of me. By this I do not necessarily mean the book by that name, though I am confident Paul Davies did a fine job of framing the argument for a creative deity as root cause as may be possible from a scientific perspective. But rather something simpler — the Mind of God as the designer of (human) consciousness itself, which is something we really don’t understand from a scientific perspective.

On a Bohmian reading, the quantum slit experiment would harbor no randomness at all. And maybe that reading is correct. See this article on Bohmian pilot wave theory. But if the universe is essentially a predictive computational engine with no room for quantum uncertainty, then it must also follow that the human brain, which is comprised entirely of the stuff of the universe, is similarly entailed.

I turn to my friend Dr. Kauffman, smiling wryly on the corner of State and Non-Ergodic. I think he was recalling Dean Radin’s talk in Tucson in 2016. Yeah….human consciousness. There’s the rub. Regardless of what pilot wave theorists may think, there is something special, measurable, and repeatable about the observer. And statistically, we can prove it.

So let’s agree to meet at the corner of State and Non-Ergodic. We can decide which way to turn after we’re there.

Happy 2017!

UPDATE: Per a comment from Scott Aaronson, he was not and is not advocating a Bohmian interpretation of Quantum Mechanics. He explicitly states, “…a state might only determine the probabilities of various observations, not the observations themselves.” He goes on to further state quite clearly that he personally “…regard[s] the indeterminism of quantum measurement outcomes as a settled fact, to pretty much the extent anything in physics is a settled fact.” I am appreciative of the clarification. Hence, the updates to my original post with much gratitude. Please adjust digestion of above blog entry accordingly.

Posted in Big Data

Finding e in R

Though I have loved math since childhood, I am not a mathematician. But I love studying math. And sometimes I find myself exploring numbers just for the fun of it, especially when I’ve been using RStudio or Octave for a project.

Finding Euler’s number e is neither improbable nor terribly difficult. The number e is everywhere. There is more than one YouTube video demonstrating the emergence of e using various techniques. Euler’s number is one of those really beautiful, transcendent properties of mathematics that puzzles and mystifies. And I know of nothing closer to inherent truth in this world than the beauty of mathematics.

But I wasn’t thinking about e when I found it where I did. I was thinking about something else. I was pondering how an infinite series built from the integers might relate to the rational numbers. The question that occurred to me was simple — is there a limit to the following:

f(n) = \sum_{x=1}^{n} \frac{1}{x} \quad \text{as } n \to \infty

Note that I wasn’t considering x = 0, because dividing by zero is not a good idea. Since the values from the series appear in a denominator rather than as an exponent, zero is simply problematic. So for the sake of cleanliness I stated the problem as listed above and wondered if a limit would emerge.

As a software developer, my approach to exploring the problem was naturally to write some code. The R code for this little project can be downloaded from GitHub if you’re interested. As x gets larger, the fraction gets smaller. Clearly, the higher the value we assign to x, the smaller each new term becomes compared to earlier iterations. But is there a limit?

As I played with the code, I realized that even though each new term grew increasingly small, the running total would nevertheless continue to grow and grow, even as x approached infinity and 1/x approached zero. The harmonic series, as it turns out, has no limit; it diverges.

But then my thoughts turned to the ever increasing number of fractions needed to get to the next highest integer value. For example, when x = 1 , the value returned from the function is 1. So to get the return value to the next highest full integer value, in this case 2, the function must add:

x = 1/1 + 1/2 + 1/3 + 1/4

Which returns the value 2.08333. For the next highest value, the function must add:

x = 1+1/2+1/3+1/4+1/5+1/6+1/7+1/8+1/9+1/10+1/11

Which yields 3.019877. And so on. The number of fractions required to get to the next highest whole integer value increases.

Okay. Then it dawned on me that there might be a pattern to the increase in the number of fractions. So I tried a few experiments. What I discovered is e emerges from a simple ratio created by the number of fractions required to attain the next highest integer value in the series.

So, for example, the total number of iterations required to attain 2.xxx is 4. The total number of iterations to attain 3.xxx is 11. Divide 11 by 4 and the result is 2.75. To attain the next highest integer part, 4.xxx, requires 31 iterations of the function. Take the current value for iterations and divide by the previous value, and the result is 2.818182. Continuing, to attain the next highest integer part, 5.xxx, requires 83 iterations of the function. Take the current value for iterations and divide by the previous value, and the result is 2.6774194.

If we continue, eventually the number of iterations of the functions at the current integer value divided by the number of iterations from the previous integer point appears to hover around e plus or minus a very very small fraction.

integer portion | value of j | prev j    | j / prevj   | value of i | prev i    | i / previ   | i / j
2               | 3          | 1         | 3           | 4          | 1         | 4           | 1.333333333
3               | 7          | 3         | 2.333333333 | 11         | 4         | 2.75        | 1.571428571
4               | 20         | 7         | 2.857142857 | 31         | 11        | 2.818181818 | 1.55
5               | 52         | 20        | 2.6         | 83         | 31        | 2.677419355 | 1.596153846
6               | 144        | 52        | 2.769230769 | 227        | 83        | 2.734939759 | 1.576388889
7               | 389        | 144       | 2.701388889 | 616        | 227       | 2.713656388 | 1.583547558
8               | 1058       | 389       | 2.719794344 | 1674       | 616       | 2.717532468 | 1.582230624
9               | 2876       | 1058      | 2.718336484 | 4550       | 1674      | 2.718040621 | 1.582058414
10              | 7817       | 2876      | 2.718011127 | 12367      | 4550      | 2.718021978 | 1.582064731
11              | 21250      | 7817      | 2.718434182 | 33617      | 12367     | 2.718282526 | 1.581976471
12              | 57763      | 21250     | 2.718258824 | 91380      | 33617     | 2.718267543 | 1.581981545
13              | 157017     | 57763     | 2.71829718  | 248397     | 91380     | 2.718286277 | 1.5819752
14              | 426817     | 157017    | 2.718285281 | 675214     | 248397    | 2.718285648 | 1.581975413
15              | 1160207    | 426817    | 2.718277388 | 1835421    | 675214    | 2.718280427 | 1.581977182
16              | 3153770    | 1160207   | 2.718282169 | 4989191    | 1835421   | 2.718281528 | 1.581976809
17              | 8572836    | 3153770   | 2.718281929 | 13562027   | 4989191   | 2.718281782 | 1.581976723
18              | 23303385   | 8572836   | 2.718281908 | 36865412   | 13562027  | 2.718281862 | 1.581976696
19              | 63345169   | 23303385  | 2.718281872 | 100210581  | 36865412  | 2.718281868 | 1.581976693
20              | 172190019  | 63345169  | 2.718281784 | 272400600  | 100210581 | 2.718281815 | 1.581976711
21              | 468061001  | 172190019 | 2.718281836 | 740461601  | 272400600 | 2.718281828 | 1.581976707
22              | 1272321714 | 468061001 | 2.718281829 | 2012783315 | 740461601 | 2.718281829 | 1.581976707
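The crossing counts (the i column above) can be reproduced with a short loop. The original project was in R; here is a Python sketch of the same counting. The behavior is consistent with the known asymptotic H(n) ≈ ln n + γ, which implies that successive crossing counts grow by a factor of about e.

```python
def crossing_counts(up_to):
    """Number of terms of 1 + 1/2 + 1/3 + ... needed for the running
    total to first reach each whole-number value 2, 3, ..., up_to."""
    total, n, target, counts = 0.0, 0, 2, []
    while target <= up_to:
        n += 1
        total += 1.0 / n
        if total >= target:      # the sum just crossed the next integer
            counts.append(n)
            target += 1
    return counts

counts = crossing_counts(6)
print(counts)  # [4, 11, 31, 83, 227], matching the i column
ratios = [b / a for a, b in zip(counts, counts[1:])]
print([round(r, 6) for r in ratios])  # heading toward e = 2.718281...
```

Pushing `up_to` higher reproduces more of the table, at the cost of the loop length roughly multiplying by e for each additional integer.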

One common method to derive e is:

e = \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^{x}
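A quick numeric sanity check of that limit (a Python sketch; the convergence is slow, with the error shrinking roughly like e/(2x)):

```python
import math

for x in (10, 1_000, 1_000_000):
    approx = (1 + 1 / x) ** x
    print(x, approx, "error:", abs(approx - math.e))
# the error drops by roughly the same factor x grows by
```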

But with the methodology I discovered using RStudio, no exponential function is required. It is a simple ratio. Using the initial approach I used while experimenting:

f(n) = \sum_{x=1}^{n} \frac{1}{x}

Count the number of iterations of the function as the value of x increases. Drop the fractional part of the returned value, leaving only the whole integer part. When the integer part exceeds the previous integer part, divide the current iteration count by the previous count; the result appears to get increasingly close to e.

What holds for the current iteration count (i) divided by the previous iteration count (previ) also holds for the differences between counts. When the function tracks a difference value j = i − previ (with prevj being the previous such difference), the ratio j / prevj also appears to hover around e after a number of iterations.

I created a github repo for the code here.

Another interesting thing which emerged is the consistent and somewhat puzzling number that appears when we divide i by j, or previ by prevj. The number appears to be close to 1.581977. I couldn’t place it in the math literature at first, but there is a simple explanation: since i / previ approaches e and j = i − previ, the ratio i / j = 1 / (1 − previ / i) should approach e / (e − 1) = 1.5819767…, which matches the table. I had been hoping for something more exotic; using pi as the hypotenuse of a right triangle with e as one side gives a missing side of 1.574976…, which is close but not that close. The actual relationship is simpler than that.

The emergence of e from a simple ratio of iterations over a simple sequence of 1 divided by consecutive integers, however, does appear to be an actual phenomenon. While it is likely others have found this magic before me, I have not found the work. Then again, as I stated at the outset, I am not a mathematician. I just like math.

Posted in Big Data

Quadratic Bowling

In November of last year I conducted a simple experiment, wondering whether activity level on GitHub could be a predictive factor in open source framework adoption. So I grabbed data from the GitHub APIs for several large organizations and played with it in RStudio. Projects from AWS, Google, Apache, Facebook, LinkedIn, Microsoft, Oracle, and Twitter were all included in the analysis. My intent was to find the most active projects and determine whether those projects were increasing or decreasing in activity over the previous 12 months.
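The data pull was of this general shape. This is a hedged Python sketch, not the original R: the GitHub `/repos/{owner}/{repo}` endpoint and its count fields are real, but the scoring function is my own crude stand-in for the actual analysis.

```python
import json
import urllib.request

def fetch_repo(owner, name):
    # GitHub's public REST API; unauthenticated calls are rate-limited.
    url = f"https://api.github.com/repos/{owner}/{name}"
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

def rank_by_activity(repos):
    # Crude proxy for activity: stars + forks + open issues. The real
    # experiment tracked activity over time; this is only illustrative.
    def score(r):
        return r["stargazers_count"] + r["forks_count"] + r["open_issues_count"]
    return sorted(repos, key=score, reverse=True)

# e.g. projects = [fetch_repo("apache", "spark"), fetch_repo("apache", "hadoop")]
# for r in rank_by_activity(projects): print(r["full_name"])
```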

A Quadratic Bowl


In the end, nothing terribly insightful emerged. If anything, the results underscored my own intuition as to where things would go in 2016. Hadoop, Cassandra, and Ambari all exhibited strong, steady interest. Spark was growing rapidly. As it turns out, Spark adoption in 2016 appears to be strongly correlated with GitHub activity increases in 2015, as Hadoop continues to find harbor in organizations of all sizes. But it didn’t take a data scientist to figure that out. Any reasonably aware industry professional would have reached a similar conclusion.

I may or may not repeat a similar experiment this coming November. My instincts tell me that Machine Learning, including Neural Networks, will exhibit rapid adoption in 2017/2018, like that of Spark in 2016. Indeed, one of the motives for Spark adoption is to enable Machine Learning commercialization.

But just as adoption of Network Age technologies gave rise to major economic shifts at the turn of the century, the coming next phase (which includes pervasive adoption of IoT, good old fashioned AI (GOFAI), VR/AR, and a slew of sensors) will be an economic tsunami.


Evidence of GOFAI adoption is pretty much weekly news at this point in history. One notable story in particular from the headlines of this past week is the Google AI that invented its own crypto for messaging. Sure, it smacks of Terminator/Matrix darkness. But in reality, such innovation might very well be the foundation upon which the cyber-security of 2020 will rely.

As you know, the rapid rate of change we witness today is based on exponential accelerations of capabilities in Information Technology, of which Moore’s Law is an exemplar. With each doubling of CPU cycles, storage capability, and data itself, we brave frontiers increasingly riddled with attractors of both hope and terror. As prudent technologists, we must expect more Network Age asteroids. Dinosaurs beware. Mammals prepare. That’s what my instincts tell me now.

With the commercialization of GOFAI, the quadratic bowl, as a metaphor for a Machine Learning artifact, might be just the thing for geek holiday gifts. The error surface of a linear neuron with squared error is, both visually and mathematically, a quadratic bowl. Too bad the Etsy shop in question seems to be taking a break.
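The bowl is easy to verify: for a linear neuron with squared error, the loss is quadratic in the weights, so its curvature (the Hessian) is the same everywhere. A small NumPy sketch of my own, with made-up data:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 2))                 # inputs to the neuron
true_w = np.array([2.0, -1.0])
y = X @ true_w + 0.1 * rng.normal(size=50)   # noisy targets

def error(w):
    # Squared error of a linear neuron y_hat = X @ w
    r = X @ w - y
    return r @ r

# E(w) = w'X'Xw - 2 y'Xw + y'y is quadratic in w, so the Hessian is
# 2 X'X at every point: a bowl, not a rugged landscape.
H = 2 * X.T @ X
w_min = np.linalg.solve(X.T @ X, X.T @ y)    # the bottom of the bowl
print("bottom of the bowl:", np.round(w_min, 2))
```

Because the Hessian is constant and positive definite, gradient descent on this surface has a single global minimum, which is part of why the linear neuron makes such a friendly teaching example.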

Beyond Machine Learning uptake in 2017, other predictions based on GOFAI commercial adoption naturally follow. For example, the next 10,000 startup business plans will add Machine Learning to something extant: some process, some flow, some problem set. Lower costs, increase speed, provide more for less by applying AI. That sort of incremental approach to innovation is probably the most common and likely. There is another set too: the disruptor-flavored startups. Those innovations we cannot predict, given the infinite adjacent possible. But we can prepare for asteroids, if mammals we would be. I’ll write more on that later.

Happy quadratic bowling.

Posted in Big Data, Borg, Data Science

The Temerity of a TPU

I attended a live broadcast of a recent Google I/O Conference here in Utah — akin to a Super Bowl party for geeks — and I came to a minor epiphany. My hope in going was to learn a bit about Google’s Cloud offerings in order to better compare Google’s Cloud choices with AWS. But not much in that vein was covered, at least not at the sessions televised to the Xactware meeting rooms in Lehi. Evidently the crowd gathered there had more of an Android developer penchant, and there’s nothing wrong with that. But my choices for sessions of interest were in the minority. And with limited rooms to assemble, most of the sessions I had hoped to hear were not live in Lehi. Luckily, most of it was captured by our friends at Google and posted within hours of the conference.

One thing, however, that did grab my attention at one of the live sessions and has had me ruminating for a couple of weeks since is the mention of their TPU in one of the keynotes: The Tensor Processing Unit (TPU).

The TPU is a TensorFlow machine learning stack burned into a chip. Faster than a bank of CPUs, more powerful than a bevy of GPUs, able to leap over a collection of FPGAs, the TPU ASIC (Application-Specific Integrated Circuit) promises to accomplish exactly what Google says it will in terms of performance: an order of magnitude better Machine Learning performance, for far less energy cost.

In a recent discussion with my friend Wil Bown, AI/VP developer extraordinaire, he mentioned the huge performance boost Bitcoin mining efforts have achieved by burning the essential code into ASICs, boosting performance well beyond FPGAs, which had previously trumped GPU banks, which themselves were the cutting edge for coin miners only a year or so ago.

So ASICs. Yeah, that does make sense. In some instances, as layers of software infrastructure stratify, burning the full application stack into a chip will yield that final incremental competitive advantage. In fact, we could assert in general terms that entire industries have been built on that principle. For example, before there was Cisco, there was software running on vanilla UNIX systems that functioned as switches and routers. Most often the higher costs of ASIC production limit their adoption to military applications, or aerospace, or manufacturing processes, though sometimes a more common use case emerges — like Cisco — that can utterly change everything.

But to learn that Google assembled a raised-floor environment consisting of racks of TPU-laden servers, which was a critical factor in their Go Champion win earlier this year, gave me pause. TensorFlow became an open source endeavor just a few months ago. In my mind, open source is the beginning of the journey, and certainly not the end. To therefore commit the application as such in silico seems quite premature, if not remarkably misguided. It’s too early.

Machine Pareidolia

Pareidolia is a psychological phenomenon involving a stimulus (an image or a sound) wherein the human mind perceives a familiar pattern where none actually exists. The face on Mars, or the Messiah in a piece of toast, for example. We humans see all sorts of patterns in the random distributions around us. Every named constellation is a pareidoliac construct. We seem to be hard-wired to see faces — in clouds, trees, rocks, potatoes…it’s rather common. But we’re also aware that the faces we see in clouds are not really faces. They’re clouds. Machines, on the other hand, might not be tuned as well.

Consider the school bus and the ham sandwich.

A school bus or a ham sandwich?

Work by a group of researchers at the University of Wyoming exposes what may be an inherent and dangerous vulnerability in state-of-the-art neural networks: Deep Neural Networks are easily fooled. In their study, a well-trained DNN concluded with high (99%) confidence that the layered stack of colors depicted here was indeed a school bus. And it’s not just one image. The school bus is but one of a bevy of images specifically designed to fool the DNN. To my human eye, the image is more easily identified as a sandwich — cheese on dark bread, perhaps. A school bus? Probably not. Maybe that’s just me — but I am confident that, given the choice, most humans would pick the sandwich over the school bus to best explain the image here.

Clearly we cannot place deep confidence in Deep Neural Network implementations like TensorFlow just yet. But the fact is, we already have: Google has already baked it in and raised the floor, as it were. Therein lies the temerity.

The hacking possibilities are legion. If I know I can fool the image processing systems with cleverly crafted images, any security system relying on a DNN component is vulnerable. Any digital visual processing function — from facial recognition to autonomous vehicle piloting — can therefore be rather easily compromised. Any visual processing application can be hacked.

A Different Kind of Turing Test

In addition to my misgivings about the Google announcement, another question comes to mind as I ponder the TPU and the adjacent possible it may now entail: Will DNN systems, trained independently from different data sets, draw similar pareidoliac conclusions?

In other words, just as two people might see a face in the same picture of a cloud, might two systems see the same school bus in the image of a ham sandwich? If so, is that perhaps another test for ‘intelligence’, albeit not human? Might a different kind of Turing Test then emerge to measure the consciousness of our digital offspring?

The Temerity

I applaud bold advancements in technology. I admire competitive juices. But I’m just a little concerned we may now be at a very critical juncture in spacetime; a time in which the rate of innovation finally and fully exceeds the ability of the fitscape to adequately test it.

When the rate of innovation exceeds the ability of the fitness landscape to adequately test the innovation for fitness, bad stuff will result — that is my fear. The temerity of the TPU may be a step too soon, or too far, or both.

Posted in Big Data

Kauffman, Consciousness, and the Trans-Turing Machine

I met one of my heroes.

Perhaps ‘hero’ isn’t the perfect word to best describe my view of Stuart A. Kauffman. But it comes close. Suffice it to say that I’ve been a fanboy of Kauffman for well over a decade, and the chance of meeting him was one of the top three reasons for my attending the Science of Consciousness Conference (TSC) recently held in Tucson, AZ.

Maybe you are not familiar with the Santa Fe Institute (SFI), where Kauffman was faculty in residence. The SFI, founded in 1984, has been instrumental in goading the cross-disciplinary study of complex adaptive systems. Kauffman led those efforts during the first important decade of that work. Maybe you don’t know what a MacArthur Fellowship is — it’s the ‘Genius Grant’ — a substantial prize awarded annually to a few dozen exceptional individuals in a variety of fields. Kauffman was a MacArthur Fellow from ’87 through ’92. But surely you’ve heard of Darwin. You know what economics is. You probably have some appreciation for computer design, genetics, and even quantum mechanics — at least on a science fiction level.

Well, then you should know that as a theoretical biologist, medical doctor, complexity studies pioneer, and profound influencer of thought, Stuart A. Kauffman is one of the giants of our age, having influenced how we now think about evolution, economics, and even computer programming. And I got to meet him.

I first came across his work in the early part of this century. Liz and I were living in Manhattan at the time, and my work at Sun Microsystems required copious trips on international flights, all of which gave me ample time to read. Kauffman’s book Investigations was one of many that drew my attention. But I was so taken with his ideas that I had to read it twice. And then, giving Kauffman full credit for his ideas, I refactored many of his concepts in my own investigations in distributed computing. Ideas like autonomous agents, emergence, and the adjacent possible all found their way into my own text on distributed computing, which was published a few years afterward. In fact, looking back, I believe reading Kauffman helped me formulate many of my own ideas and gave me the impetus and the drive to dare to write an ambitious book of my own.

Here we are 15 years later, and I’m a bit of an international traveler again, with flights and miles of time to read. Instead of evangelizing Java, I’m teaching Data Science and Big Data technologies for Think Big Analytics, a Teradata Company. But I am again prone to read more with opportunities to do so. My old friend came to mind, so I shopped for any new works by Kauffman. And there it was — something very new. In fact, it was about to be published just as I prepared for a 6-week trip to Prague and Mumbai earlier this year: Kauffman’s latest, Humanity in a Creative Universe. So I grabbed a Kindle copy and consumed with glee this new work from my old friend.

Regarding his new book, first, you absolutely must buy it. Then read it. Read it again. Kauffman suggests ideas and experiments that must give us pause. Quantum physics allows brief glimpses through a glass darkly into a baffling, non-intuitive universe. We sort of understand the mathematics of that quantum view, but cannot agree on meaning. There are dozens of interpretations of quantum mechanics. We are still just beginning. The foundation of everything we think we know was fundamentally rocked by much of 20th century science. Quantum physics is perhaps the most essential part of that, but its best gifts are yet to come.

All of the drivers of modern technology, and therefore civilization, were enabled by 20th century discoveries. Every endeavor of human life now reflects that work — communications, energy, education, entertainment, finance, manufacturing, medicine, transportation, technology…even religion. Our world (universe) view has metaphorically and quite literally exploded, expanding at an accelerating rate as is the cosmos itself. It was Maxwell, and Einstein and Hubble and Gödel that did it. Schrödinger and Shannon and von Neumann and Turing, and Watson and Crick, and a host of others. Giants upon whose shoulders we stand. That tectonic shift of mind was due to those 20th century giants and the gifts they left us. But even with all those changes, the sum of which ultimately gave rise to the Network Age, many 20th century discoveries are only now starting to yield engineered fruits. Quantum physics chief among them.

I cannot do Kauffman’s latest tome justice here. But there is something he does posit with fresh eyes that deserves a viewing in the theatre of human discovery: the Poised Realm.

The Poised Realm

From a deeper exploration of the adjacent possible, Kauffman mixes a cocktail of theoretical biology, complexity studies, and quantum mechanics to suggest an ontologically real realm that is both quantum and classical, existing in-between. I will leave it to Kauffman to provide the evidence, for which there are compelling arguments. But if we stipulate that Kauffman’s Poised Realm exists, in a neo-Cartesian sense, cleverly fitting between Res Potentia (that which is possible) and Res Extensa (that which is actual), then perhaps consciousness is the bridge; the Mind and life itself, poised critically as quantum coherent/de-coherent/re-coherent engine, providing the qualia; the magic. Human consciousness.

Yeah. That’s a lot to think about: consciousness.

Consciousness

The thing is, each of us knows we have it, but sometimes we are not so sure about everybody else. Who among us hasn’t bellied up to the bar at the Solipsism Saloon at some point and lapped up a few? The truth is, Descartes put us all on that path, cogito ergo summing us all into a post-modernity built on the foundations of mathematics, science, and exponential technological growth — heavily infused with deep alienation and a throbbing ipseity hangover.

But that particular elephant in the room — the hard problem of human consciousness — is today something we can quietly discuss and share in our cloistered safe spaces. It is now something okay to perhaps even study, when once it was akin to sporting a voodoo facial tattoo at a Mormon picnic, risking marginalization and intellectual banishment.

Stuart Hameroff and Sir Roger Penrose

For the past 20 years Stuart Hameroff (of OrchOR fame) has been one of the leaders on the daring trek to a land which heretofore only artists and undergraduates would dare roam in the spooky underground of absinthe soaked nights. Hameroff is an anesthesiologist by profession, and another (albeit more recent) hero of mine. What better place to stand than anesthesiology when it comes to studying the on/off switch of awareness?

Meandering through this post, I am coming to the one salient question I attended the TSC conference to ask. In addition to meeting Kauffman and Hameroff — remember I said there were three top reasons for me heading for Tucson — I had a burning question. In his most recent work Kauffman discussed Trans-Turing Systems, the thought of which exploded in my head like the genetic memory of some Big Bang.

Trans-Turing Systems

From the Poised Realm, the embodiment of Trans-Turing Systems, as a real invention, doth flow. Of course you are familiar with the Turing Machine: the theoretical paper-tape compute engine which all modern processors are obliged to worship every Sunday…you are, of course, familiar with the Turing Machine.

The Turing Machine

The work of Alan Turing is the rock from which the quest for a congruent theoretical computer science was launched. Totally awesome quantum computing heavyweight Scott Aaronson has written that when it comes to AI, we can divide everything that’s been said about it into two categories: the 70% that was covered in Turing’s 1950 paper Computing Machinery and Intelligence, and the remaining 30% that has followed in the decades since then.

Turing Machine = Foundation of Computer Science

So whoa — a Trans-Turing System? What?? I must know more!

That was the third thing that drove me to Tucson. I had to ask Kauffman what he meant — what he saw, what he imagined. I found a description in the patent that Kauffman et al. filed in 2014:

"Further disclosed herein is a Trans-Turing machine that includes a plurality of nodes, each node comprising at least one quantum degree of freedom that is coupled to at least one quantum degree of freedom in another node and at least one classical degree of freedom that is coupled to at least one classical degree of freedom in another node, wherein the nodes are configured such that the quantum degrees of freedom decohere to classicity and thereby alter the classical degrees of freedom, which then alter the decoherence rate of remaining quantum degrees of freedom; at least one input signal generator configured to produce an input signal that recoheres classical degrees of freedom to quantum degrees of freedom; and a detector configured to receive quantum or classical output signals from the nodes."

Sweet. I got it. Quantum computing nodes working in tandem with classical compute (Turing Machine) systems and what emerges is a Trans-Turing Machine, not constrained nor otherwise entailed by a bothersome set of NP-complete limits. Polynomial hierarchy collapse ensues, at long last P = NP, and we are full throttle to ride warp drive engines to the stars! Maybe? Maybe. Maybe not.

I had to ask Kauffman.

After I spotted him at the outdoor mixer on Thursday night, after I got over my fanboy flutters, after I introduced myself, chatted with him for a bit about his new book and how much I liked it, after I explained my own thoughts from my field in computer science, and how his book from a decade and a half earlier had so deeply influenced me, I did finally ask.

“So how do we build the Trans-Turing Machine?”

A wry smile crossed his face. His eyes lit. For a moment he stopped being the intellectual giant I had come to revere, and revealed the mischievous, inquisitive, childlike spirit that must have driven him his entire life.

“I have no idea,” he replied with a grin.

I was satisfied. I knew he did not mean that he could not conceive of one, nor did he mean that he could not describe one, define the attributes it might require, or imagine how it might function. What he meant was that we still don’t know enough about quantum computing to imbue an instrument of our own creation with something akin to consciousness — whatever that means.

Today we all harvest the ample fruits from the first baby steps into the Network Age. We are still painting a digital patina over the planet. More stuff soon will think. We are clearly well into the age of pervasive computing, but computing is not yet ubiquitous, though soon it will be. Soon — within a decade — everything will be engineered to connect with everything, and almost all those systems are and will be awesome Turing Machines, programmable systems all, that will link us together in a transcendent, fine-grained, meshed digital fabric of increasing value. Yet on the fringes, there is quantum computing, playfully peeking through from behind the classical physics curtain. And therein lies the unpredictable. It could be that Here There Be Monsters. Or not. That’s the beauty and the bizarre of where we are. Both terror and elation are on the rise, though neither is as appropriate nor as compelling as the raw, robust curiosity that drives us ever forward.

Is the ineffable thing to come a D-Wave progeny? Maybe. Will Scott Aaronson explain and extend the exploding adjacent possible? Probably. Did Kauffman and Hameroff lead us to the brink? Absolutely. And from the wily Trans-Turing Machine, will Machine Consciousness one day emerge … whatever that means?

I have no idea.

Posted in Big Data, Data Science

And so come Robo-Burger

From 2005 through 2014 I taught a number of online courses for the University of Phoenix, including MBA Capstone projects, Business Strategy for graduate students, and undergraduate Java programming. One of the discussion questions I used during that phase was that of a hypothetical Robo-Burger. The discussion question was generally worded like this (circa 2006):

“We live in a remarkable age in which productivity is increasing at an ever increasing rate, computers are getting ever smaller, faster and cheaper, and automation is the norm. Today robots provide much in the way of manufactured goods. As these trends continue, and advances in artificial intelligence slowly bring more human-like qualities to robotic interactions, it is just a question of time before almost all vocations that are staffed by humans today will be automated.

For the sake of discussion, consider Robo-Burger — a hypothetical franchise in which the entire supply chain (including production of agricultural goods and transportation) is automated. From raw input (cattle, grains, milk, sugars, fats, wood) to refined inputs (beef, buns, pies, frozen fries, wrapping paper) to supplies in the retail store (packaged, shipped and unloaded) to preparation (cooked, assembled, wrapped for the customer) to one-on-one customer interaction, to the financial transaction itself — the entire chain is automated.

Imagine Robo-Burger. Will it happen in your lifetime? Would you eat there? What are the technological barriers? What are the sociological barriers? Which is greater?”

Over that ten year period, I saw students go from mostly skeptical (it could never happen) to mostly accepting the possibility, at least as far as technology is concerned. The sociological impact remained the most difficult to digest. Here we are now, ten years after my first asking that question, and the age of Robo-Burger is upon us.

The reality of Robo-Burger is goaded in part by advancing automation technologies. But growing governmental increases in the minimum wage, to $15 per hour and beyond in the United States, are also a driving factor. Robo-Burger, while a very probable innovation coming soon to a store near you, is not the end of the story, but only the beginning. Robo-Burger, as the students who grappled with the idea realized, is a metaphor for the Network Age redefinition of the Industrial Age artifact we have come to know as “jobs.”

Try this simple experiment. Open a browser window to search using Google. Type in “jobs lost automation” (no quotes), and wait half a second for the 12+ million results to be listed, 10 listings per page, ranked. Then click the “News” link at the top. No matter what time or day you search for the next (guessing) two years, you will find recent articles bemoaning, celebrating, recognizing or denying the loss of human jobs to automated systems. You will find at least a dozen published in the last 60 days. Or you can just CLICK HERE.

The Robo-Burger transition phase is now upon us. All human jobs that can be automated will be automated. Clearly, if the only change was technological elimination of Industrial Age jobs, the outcome would not be sustainable. It is therefore obvious that other major shifts must also occur.

If not jobs, then what? Whatever shall we do?

The Future of Fast Food

The Future of Fast Food

Some have argued that new kinds of jobs for humans will emerge, which is probably true. Data Scientist, for example, is a job category that did not exist ten years ago. The Robo-Burger onslaught in every industry will reduce demand for humans in more traditional roles, but increase demand for experts to care for and feed the automating technologies that replace humans.

Imagine a McDonald’s or KFC or Burger King or Taco Bell … pick your poison. Imagine a bevy of highly skilled systems producing the wares for each order, placed through a touch screen or smartphone app. Each order is produced, packaged, bagged, and delivered all without human hands. This automated store, our Robo-Burger, will very likely need two or three humans on the premises at all times, trained to keep the automated systems happy. So there are the jobs. And sure, paying those workers $15 is a bargain. Especially when you consider that 20 or more human beings per store per shift are required to keep the same store running smoothly today.

Bear in mind, in the aggregate, advances in technology yield significant increases in productivity. We do more with less. Incremental increases in productivity have been the hallmark of human endeavors for centuries. This is good news. It means less stress on finite resources, lower costs, more stuff. Add to that the exponential increases with Moore’s Law-related technologies, and the idea of effectively free stuff is now reasonable to consider. So Robo-Burger should be a $1 menu dream.

But we are still stymied … even with our Robo-Burger machine repair gigs, we haven’t grown jobs. In fact, we’ve done just the opposite.

So the question remains: whatever shall we do?

I’ll leave that interesting question for another day.

Today, we celebrate the emergence of Robo-Burger. Would you like fries with that?

Posted in Big Data, Borg, Data Science

Neural Interface as Stent

As a followup to a previous blog entry, DARPA does it again.

A practical neural interface is one of the critical-path components for human beings to accelerate the journey toward the full Trans-Human experience. DARPA gets it, and is about ready to start human trials of a neural interface brain stent.

The Experimental Neural Interface Brain Stent

Adapted from off-the-shelf stents, implanted via a relatively simple injection versus brain surgery, the stentrode experiments could set the stage for an unprecedented phase shift in technology.

The potential use cases for an easy-to-use, safe digital neural interface are infinite. Although the same could be said for most technology, the leverage in this case far surpasses any invention to date. By comparison, everything else is on par with, say, the screwdriver. Yes, the screwdriver is awesome. The use cases for the trusty screwdriver are probably infinite — well beyond driving in screws or opening paint can lids. But I cannot command drones, read minds, leap over tall buildings in a single bound, nor levitate 10 feet off the ground using only a screwdriver. With a stentrode implant and a network-connected assembly of the right gear, all those super powers (and many more) are but a fraction of the new possibilities in the adjacent possible enabled by said invention.

Posted in Big Data

Data Science Now

I did it! I graduated!

After nearly a year of online courses, homework, assignments, and after-work projects, culminating in a final Capstone project, I completed my Data Science Certification from Johns Hopkins University. And the experience was awesome.

Johns Hopkins University (Coursera delivered) Data Science Specialization

I highly recommend the Johns Hopkins program for anyone out there interested in getting their Data Science mojo down. It helps to have a strong background in programming and statistics; either graduate-level courses or a lot of industry experience. But even if you have one but not the other, I still recommend it. One of the great things about Coursera is the option to take a course for no credit at no cost to learn the material, and then go back and take the course for credit.

So that’s it. I’ve finished the program. Next comes … Hadoop Developer Certification, most likely. That should be a pretty tame set of studies compared to the last year of R, Data Science fundamentals, and applied predictive models. With any luck I’ll have completed that goal too before the end of 2015.

My intent for 2016 is to master the tools that come with Kali Linux, help to start a satellite of Hackers and Founders in Utah, learn Spark and enough Scala to get by, and hopefully explore a Hololens or something like it.

But in the end, Data Science is the foundation upon which all the looming innovation coming our way necessarily depends.

If there’s hope for humanity, it’s in software.

Everybody say ‘Amen.’

Posted in Big Data