This Machine is Learning You: How We All Started Working for the Machines

During the last fifteen years, a strange parallel economy has covertly developed to the point where it envelops almost all internet users, including you. No money changes hands in this immense network, but it produces enormous transactional benefits nonetheless.

In this economy, you labor daily as a trainer, teaching software robots how to perform tasks. In return, the bots take over many of those tasks for you. You trade your daily labor for the value produced by your powerful and ubiquitous robot apprentices.

The most successful products of this epoch – applications like Gmail, YouTube, Amazon’s store, Facebook, Google Maps, and Spotify – have learned a great deal about their users’ likes, dislikes, and similarities. Applying Machine Learning technology, they use that data to better present what consumers will want to see and hide what they do not.

These applications did not get so powerful through traditional programming, but through self-learning. Instead of a team of coders defining the steps to be followed using a computing language, Machine Learning starts with a set of observable data – such as items that users bought – and learns to infer the patterns within the data.
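To make the contrast with hand-written rules concrete, here is a minimal sketch of learning a pattern from purchase data. It assumes a toy "bought together" counter, which is far simpler than what a production recommender does; the baskets and the function names `learn_co_purchases` and `recommend` are invented for illustration.

```python
from collections import Counter, defaultdict
from itertools import permutations


def learn_co_purchases(baskets):
    """Count how often each ordered pair of items shows up in the same basket."""
    co_counts = defaultdict(Counter)
    for basket in baskets:
        # set() ignores duplicate items within a single basket
        for item, other in permutations(set(basket), 2):
            co_counts[item][other] += 1
    return co_counts


def recommend(co_counts, item, k=2):
    """Return up to k items most often bought together with `item`."""
    return [other for other, _ in co_counts[item].most_common(k)]


# Hypothetical purchase histories, made up for this example.
baskets = [
    ["tent", "sleeping bag", "lantern"],
    ["tent", "sleeping bag"],
    ["tent", "lantern"],
    ["novel", "bookmark"],
    ["tent", "sleeping bag", "novel"],
]

model = learn_co_purchases(baskets)
print(recommend(model, "tent", k=1))
```

Nobody wrote a rule saying that tent buyers want sleeping bags. The association comes entirely from the observed purchases, and it shifts as more baskets are added.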

So how does one get the data set to train a Machine Learning system?

Sometimes a Data Scientist can mine the data out of existing records collected for some other purpose. This is one of the reasons that companies now like to collect every bit of data they can about you: it is much easier to re-use existing data than to collect it from scratch. But frequently, to obtain a well-organized set of questions and users’ responses, the team must gather new data.

One way to collect orderly data is to pay humans to answer questions. Using a system like the Amazon Mechanical Turk, you can define a question and get many thousands of answers from workers, paying a few cents for an answer. This approach is often used for problems that people solve well, such as image recognition. A Mechanical Turk worker might, for example, classify images as landscapes or indoor photos, or draw a circle around faces in photos. This well-organized set of images and identified areas is ideal for accelerating the training of a Machine Learning program.
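Because individual paid workers make mistakes, a team would typically ask several people the same question and combine their answers before training on them. The following is a rough sketch of one common way to do that, a simple majority vote. It does not use the Mechanical Turk API; the image ids, labels, and the function name `aggregate_labels` are assumptions made for the example.

```python
from collections import Counter


def aggregate_labels(answers):
    """Majority vote per image.

    `answers` is a list of (image_id, label) pairs, one per worker response.
    Returns {image_id: (winning_label, share_of_workers_who_agreed)}.
    """
    by_image = {}
    for image_id, label in answers:
        by_image.setdefault(image_id, []).append(label)

    consensus = {}
    for image_id, labels in by_image.items():
        label, votes = Counter(labels).most_common(1)[0]
        consensus[image_id] = (label, votes / len(labels))
    return consensus


# Hypothetical worker responses: three workers per image.
answers = [
    ("img1", "landscape"), ("img1", "landscape"), ("img1", "indoor"),
    ("img2", "indoor"), ("img2", "indoor"), ("img2", "indoor"),
]

consensus = aggregate_labels(answers)
for image_id, (label, agreement) in consensus.items():
    print(image_id, label, round(agreement, 2))
```

The agreement share gives a crude quality signal: images where the workers split could be sent out for more answers or dropped from the training set.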

It is even cheaper to get the humans to answer the questions for no money at all, by providing them some utility value. This is where you come in.

You have probably used a CAPTCHA, an application that requires that you identify a number or a small piece of text in order to prove that you are human. In doing so, you are doing useful work for somebody. Google initially trained its street number recognizers for Google Street View on data sets it built by putting photos of doorway areas into its CAPTCHA system.

Another way to get free data sets is by turning data collection into a game. Development teams are great at those kinds of problems, as software and UX designers often love to make (and play) games. They can quickly turn a Data Science problem into a slick quiz with a polished user experience.

You have probably taken a quiz like this on Facebook or another website. Applications that allow users to learn about themselves – or purport to do so – are very popular. However, the real goal of the quiz may have nothing to do with its ostensible purpose. For example, a quiz that purports to give you personality insight might well be measuring the subtle difference in response speed when you reply to questions containing one group of words vs. another. It might also correlate that information with metadata you allow it to access in your social profile, like your gender, age, or political affiliation. When companies speak about “converting clicks to value,” this is what they are talking about.
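As a purely illustrative sketch of the response-speed measurement described above, the snippet below compares the average time taken on questions containing one word group against the other. The questions, timings, word groups, and the function name `response_time_gap` are all hypothetical; a real quiz would need many more responses and a proper statistical test before reading anything into the gap.

```python
from statistics import mean


def response_time_gap(responses, group_a, group_b):
    """Mean response time (ms) on group_a questions minus group_b questions.

    `responses` is a list of (question_text, milliseconds) pairs.
    Returns None if either group has no matching questions.
    """
    def times_for(words):
        return [
            ms for text, ms in responses
            if any(word in text.lower().split() for word in words)
        ]

    a_times, b_times = times_for(group_a), times_for(group_b)
    if not a_times or not b_times:
        return None
    return mean(a_times) - mean(b_times)


# Hypothetical quiz log: question text and time taken to answer.
responses = [
    ("Do you enjoy a quiet evening", 900),
    ("Do you enjoy a loud party", 1300),
    ("Is a quiet library relaxing", 1000),
    ("Is a loud concert fun", 1200),
]

gap = response_time_gap(responses, {"quiet"}, {"loud"})
print(gap)  # negative means faster answers on the first word group
```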

In a stealthy economic transition, most of us have acquired a new secondary role as a machine trainer. But while this work produces useful value, you can’t use it to pay for groceries. And here we come to the cusp of a looming economic crisis.

Up until now, the internet economy of smart agents has been subsidized by the traditional economy, in which employers pay workers paychecks in a structured manner. But as the role of machine training grows in importance, automation technologies are, through efficiencies, eliminating jobs. Automation creates new “traditional” wage-paying jobs, but not as many as it eliminates.

In the last such transition – the industrial revolution – farmers moved into factory jobs. Now, the industrial workers are moving stealthily into knowledge jobs such as machine training. But unlike during the industrial revolution, these jobs are not a direct replacement for the old ones. At present, working as a machine trainer is a second, usually unpaid job. It does provide value, such as more efficient email processing and better autonomous agents, but you can’t feed your family by helping to train a recommender system.

Discussions of the AI revolution often focus on the permanent elimination of entire job classes, such as drivers being replaced by self-driving cars. Proponents of the “abundance” view believe that new jobs will arise in previously unforeseen areas to replace the old ones. The problem we have at this moment is that the new jobs, like machine trainer, are arriving – but they are not replacing the earnings of the old ones.

As the internet economy continues to subsume the “real” economy, this automation crisis is coming to a head. In a follow-on article, we will discuss the evolving AI Economy and directions in which it might develop.


About the Author:

David Rostcheck is a consulting data scientist helping companies tackle challenging problems and develop advanced technology. He can be reached at drostcheck [at] leopardllc.com.

  • Steve Morris

    Interesting article. But even if new jobs are not created directly, the result of AI is that services that previously might have been subscription services are now free. Your own examples of Gmail, YouTube, Amazon’s store, Facebook, Google Maps, and Spotify are all free. Automation is progressively stripping costs out of ordinary people’s lives. They can now spend that money on other things, creating new jobs in turn.

  • Yes, that’s a true and good point; the costs of many aspects of life are dropping. We’ve seen it with information products and, as the information economy expands to more and more areas of economic life, costs will drop in other ways too (robot drivers are arriving, robot farms and doctors are coming behind them). These are the observations and projections of the “abundance” school. But as costs are driven towards zero, we have seen some interesting effects in information-dominated fields that we can assume will also be repeated in these other disrupted fields:

    1. Certain areas become “walled gardens” where, although technology is dropping costs, cartels use captive regulation to create protected spaces where they can charge increasing amounts. To use the U.S. as an example: automobiles are cheaper, arbitration has slashed court costs – but insurance costs are rising. Medical costs are out of control, even though technology opens the possibility for cheaper, earlier, more effective treatments. Military expenditures are huge, although wars are fought with smarter weapons and less heavy iron than ever before. What those things show is that technological progress is one factor, but there are other social actors (guilds, cartels, government policies) that can also drive the landscape.

    2. Inequity has increased as content production has become “winner take all.” Most writers and filmmakers see their revenues go to zero, as they compete with near-free content, but a few blockbuster books and films reap huge windfalls. AI has the potential to further this inequality because it is inherently very winner-take-all – the team with the best algorithms (and the biggest data) wins. The gap between Google and Bing, or Facebook and MySpace, is enormous. And while AI companies create jobs, they currently destroy more than they create.

    3. At this point, the traditional full-time-job-based economy is the main driver of economic consumption, so as these jobs are disrupted, the consumer economy is stressed. There are many new jobs being created – such as TaskRabbits or Mechanical Turk workers – but they pay much less than traditional jobs. At some point things may settle down, so that the low pay is, given the low costs, enough to sustain a meaningful middle-class life, but right now there are a few great jobs and many (currently) marginal ones being produced. Governments are experimenting with fiscal policies to try to soften this effect, ranging from creating more government jobs (e.g. transportation security) and pushing money directly into the economy (e.g. extended unemployment benefits) to more direct subsidies (e.g. discussion of a guaranteed basic income).

    As these trends play out and more costs go to zero, the gap between the growing set of excellent near-free services and the steep costs for others is becoming difficult for a worker with a new low-paid job – say a TaskRabbit – to bridge. They can get a first-rate near-free education (albeit without academic credit) – but health care and rent might consume almost all their income. This situation will only worsen as more costs fall to zero – when almost everything is almost free, the things that aren’t free are, by comparison, really expensive. So it’s an open question as to what the future economic system will look like as AI (and other disruptive technologies) become entrenched.