Thursday, August 28, 2014

Entrepreneurs See a Web Guided by Common Sense by John Markoff


By JOHN MARKOFF
Published: November 12, 2006
The New York Times

SAN FRANCISCO, Nov. 11 — From the billions of documents that form the World Wide Web and the links that weave them together, computer scientists and a growing collection of start-up companies are finding new ways to mine human intelligence.

Their goal is to add a layer of meaning on top of the existing Web that would make it less of a catalog and more of a guide — and even provide the foundation for systems that can reason in a human fashion. That level of artificial intelligence, with machines doing the thinking instead of simply following commands, has eluded researchers for more than half a century.

Referred to as Web 3.0, the effort is in its infancy, and the very idea has given rise to skeptics who have called it an unobtainable vision. But the underlying technologies are rapidly gaining adherents, at big companies like I.B.M. and Google as well as small ones. Their projects often center on simple, practical uses, from producing vacation recommendations to predicting the next hit song.

But in the future, more powerful systems could act as personal advisers in areas as diverse as financial planning, with an intelligent system mapping out a retirement plan for a couple, for instance, or educational consulting, with the Web helping a high school student identify the right college.

The projects aimed at creating Web 3.0 all take advantage of increasingly powerful computers that can quickly and completely scour the Web.

“I call it the World Wide Database,” said Nova Spivack, the founder of a start-up firm whose technology detects relationships between nuggets of information by mining the World Wide Web. “We are going from a Web of connected documents to a Web of connected data.”

Web 2.0, which describes the ability to seamlessly connect applications (like geographic mapping) and services (like photo-sharing) over the Internet, has in recent months become the focus of dot-com-style hype in Silicon Valley. But commercial interest in Web 3.0 — or the “semantic Web,” for the idea of adding meaning — is only now emerging.

The classic example of the Web 2.0 era is the “mash-up” — for example, connecting a rental-housing Web site with Google Maps to create a new, more useful service that automatically shows the location of each rental listing.

In contrast, the Holy Grail for developers of the semantic Web is to build a system that can give a reasonable and complete response to a simple question like: “I’m looking for a warm place to vacation and I have a budget of $3,000. Oh, and I have an 11-year-old child.”

Under today’s system, such a query can lead to hours of sifting — through lists of flights, hotels and car rentals — and the options are often at odds with one another. Under Web 3.0, the same search would ideally call up a complete vacation package that was planned as meticulously as if it had been assembled by a human travel agent.
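To see what a machine would actually have to work with, here is a minimal sketch of that vacation query reduced to machine-readable constraints, the step today's keyword engines never take. All names, types and data below are invented for illustration:

```python
# Hypothetical sketch: the natural-language vacation query above,
# expressed as the structured constraints a semantic-Web agent
# would need before it could reason over travel data.
from dataclasses import dataclass

@dataclass
class VacationPackage:
    destination: str
    avg_temp_f: int        # average daytime temperature
    total_cost: int        # flights + hotel + car, in dollars
    child_friendly: bool   # suitable for an 11-year-old

query = {"min_temp_f": 75, "max_budget": 3000, "needs_child_friendly": True}

candidates = [
    VacationPackage("Cancun", 85, 2800, True),
    VacationPackage("Reykjavik", 48, 2500, True),
    VacationPackage("Las Vegas", 90, 2200, False),
]

matches = [
    p for p in candidates
    if p.avg_temp_f >= query["min_temp_f"]
    and p.total_cost <= query["max_budget"]
    and p.child_friendly == query["needs_child_friendly"]
]
print(matches)  # only the Cancun package satisfies all three constraints
```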

How such systems will be built, and how soon they will begin providing meaningful answers, is now a matter of vigorous debate both among academic researchers and commercial technologists. Some are focused on creating a vast new structure to supplant the existing Web; others are developing pragmatic tools that extract meaning from the existing Web.

But all agree that if such systems emerge, they will instantly become more commercially valuable than today’s search engines, which return thousands or even millions of documents but as a rule do not answer questions directly.

Underscoring the potential of mining human knowledge is an extraordinarily profitable example: the basic technology that made Google possible, known as PageRank, systematically exploits human knowledge and decisions about what is significant to order search results. (It interprets a link from one page to another as a “vote,” but votes cast by pages considered popular are weighted more heavily.)
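The voting idea can be captured in a few lines of code. The following is an illustrative power-iteration sketch, not Google's production implementation; the link graph and parameter values are invented:

```python
# Illustrative sketch of the voting idea behind PageRank: each page's
# score is split among the pages it links to, so a link from a popular
# page carries more weight.
def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    rank = {p: 1.0 / len(pages) for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1 - damping) / len(pages) for p in pages}
        for page, outlinks in links.items():
            share = rank[page] / len(outlinks) if outlinks else 0
            for target in outlinks:
                new_rank[target] += damping * share
        rank = new_rank
    return rank

links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
print(pagerank(links))  # "c" ranks highest: it collects votes from both "a" and "b"
```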

Today researchers are pushing further. Mr. Spivack’s company, Radar Networks, for example, is one of several working to exploit the content of social computing sites, which allow users to collaborate in gathering and adding their thoughts to a wide array of content, from travel to movies.

Radar’s technology is based on a next-generation database system that stores associations, such as one person’s relationship to another (colleague, friend, brother), rather than specific items like text or numbers.
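In spirit, such an associations database stores subject-relation-object triples and answers queries by pattern matching. The toy sketch below illustrates the idea with invented names; it is not Radar's actual technology:

```python
# A minimal sketch of a "database of associations": instead of rows of
# text and numbers, store (subject, relation, object) triples and query
# them by pattern.
triples = [
    ("alice", "colleague_of", "bob"),
    ("alice", "friend_of", "carol"),
    ("bob", "brother_of", "dave"),
]

def query(subject=None, relation=None, obj=None):
    """Return every triple matching the non-None fields."""
    return [
        t for t in triples
        if (subject is None or t[0] == subject)
        and (relation is None or t[1] == relation)
        and (obj is None or t[2] == obj)
    ]

print(query(subject="alice"))          # everything we know about alice
print(query(relation="colleague_of"))  # all colleague relationships
```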

One example that hints at the potential of such systems is KnowItAll, a project by a group of University of Washington faculty members and students that has been financed by Google. One sample system created using the technology is Opine, which is designed to extract and aggregate user-posted information from product and review sites.

One demonstration project focusing on hotels “understands” concepts like room temperature, bed comfort and hotel price, and can distinguish between concepts like “great,” “almost great” and “mostly O.K.” to provide useful direct answers. Whereas today’s travel recommendation sites force people to weed through long lists of comments and observations left by others, the Web 3.0 system would weigh and rank all of the comments and find, by cognitive deduction, just the right hotel for a particular user.

“The system will know that spotless is better than clean,” said Oren Etzioni, an artificial-intelligence researcher at the University of Washington who is a leader of the project. “There is the growing realization that text on the Web is a tremendous resource.”
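A toy version of that graded-opinion scale might look like the following sketch; the lexicon scores and hotel data are invented for illustration and are not Opine's actual model:

```python
# Toy sketch of a graded opinion scale: map review phrases to strengths,
# then rank hotels by the average score of their comments.
opinion_strength = {
    "spotless": 1.0, "great": 0.9, "almost great": 0.7,
    "clean": 0.6, "mostly o.k.": 0.4, "dirty": -0.8,
}

reviews = {
    "Hotel A": ["spotless", "great"],
    "Hotel B": ["clean", "mostly o.k."],
    "Hotel C": ["almost great", "dirty"],
}

ranking = sorted(
    ((sum(opinion_strength[w] for w in words) / len(words), hotel)
     for hotel, words in reviews.items()),
    reverse=True,
)
for score, hotel in ranking:
    print(f"{hotel}: {score:+.2f}")  # Hotel A ranks first: "spotless" beats "clean"
```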

In its current state, the Web is often described as being in the Lego phase, with all of its different parts capable of connecting to one another. Those who envision the next phase, Web 3.0, see it as an era when machines will start to do seemingly intelligent things.

Researchers and entrepreneurs say that while it is unlikely that there will be complete artificial-intelligence systems any time soon, if ever, the content of the Web is already growing more intelligent. Smart Webcams watch for intruders, while Web-based e-mail programs recognize dates and locations. Such programs, the researchers say, may signal the impending birth of Web 3.0.

“It’s a hot topic, and people haven’t realized this spooky thing about how much they are depending on A.I.,” said W. Daniel Hillis, a veteran artificial-intelligence researcher who founded Metaweb Technologies here last year.

Like Radar Networks, Metaweb is still not publicly describing what its service or product will be, though the company’s Web site states that Metaweb intends to “build a better infrastructure for the Web.”

“It is pretty clear that human knowledge is out there and more exposed to machines than it ever was before,” Mr. Hillis said.

Both Radar Networks and Metaweb have their roots in part in technology development done originally for the military and intelligence agencies. Early research financed by the National Security Agency, the Central Intelligence Agency and the Defense Advanced Research Projects Agency predated a pioneering call for a semantic Web made in 1999 by Tim Berners-Lee, the creator of the World Wide Web a decade earlier.

Intelligence agencies also helped underwrite the work of Doug Lenat, a computer scientist whose company, Cycorp of Austin, Tex., sells systems and services to the government and large corporations. For the last quarter-century Mr. Lenat has labored on an artificial-intelligence system named Cyc that he claimed would some day be able to answer questions posed in spoken or written language — and to reason.

Cyc was originally built by entering millions of common-sense facts that the computer system would “learn.” But in a lecture given at Google earlier this year, Mr. Lenat said, Cyc is now learning by mining the World Wide Web — a process that is part of how Web 3.0 is being built.

During his talk, he implied that Cyc is now capable of answering a sophisticated natural-language query like: “Which American city would be most vulnerable to an anthrax attack during summer?”

Separately, I.B.M. researchers say they are now routinely using a digital snapshot of the six billion documents that make up the non-pornographic World Wide Web to do survey research and answer questions for corporate customers on diverse topics, such as market research and corporate branding.

Daniel Gruhl, a staff scientist at I.B.M.’s Almaden Research Center in San Jose, Calif., said the data mining system, known as Web Fountain, has been used to determine the attitudes of young people on death for an insurance company, and to choose between the terms “utility computing” and “grid computing” for an I.B.M. branding effort.

“It turned out that only geeks liked the term ‘grid computing,’ ” he said.

I.B.M. has used the system to do market research for television networks on the popularity of shows by mining a popular online community site, he said. Additionally, by mining the “buzz” on college music Web sites, the researchers were able to predict songs that would hit the top of the pop charts in the next two weeks — a capability more impressive than today’s market research predictions.

There is debate over whether systems like Cyc will be the driving force behind Web 3.0 or whether intelligence will emerge in a more organic fashion, from technologies that systematically extract meaning from the existing Web. Those in the latter camp say they see early examples in services like del.icio.us and Flickr, the bookmarking and photo-sharing systems acquired by Yahoo, and Digg, a news service that relies on aggregating the opinions of readers to find stories of interest.

In Flickr, for example, users “tag” photos, making it simple to identify images in ways that have eluded scientists in the past.

“With Flickr you can find images that a computer could never find,” said Prabhakar Raghavan, head of research at Yahoo. “Something that defied us for 50 years suddenly became trivial. It wouldn’t have become trivial without the Web.”
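The mechanism behind that shift is simple: once humans supply the tags, image search reduces to an index lookup rather than a vision problem. A minimal sketch, with invented data:

```python
# Sketch of why tagging makes image search trivial: an inverted index
# from tag to photo IDs turns "find pictures of sunsets" into a lookup,
# with no computer vision required.
from collections import defaultdict

photos = {
    "img_001": ["sunset", "beach"],
    "img_002": ["sunset", "city"],
    "img_003": ["dog", "beach"],
}

index = defaultdict(set)
for photo_id, tags in photos.items():
    for tag in tags:
        index[tag].add(photo_id)

print(index["sunset"])                   # {'img_001', 'img_002'}
print(index["beach"] & index["sunset"])  # photos tagged with both: {'img_001'}
```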

Thinking about Clean Clouds

Cloud computing involves a trade-off between efficiency and resource utilization on the one hand and reliability and convenience on the other. Although the term 'cloud computing' conjures images of data, apps and programs residing in some nebulous non-place, in reality everything has simply been moved to a data center somewhere. To prevent service outages or downtime, most providers maintain excess capacity, which means server utilization rates often run at "6% to 15%, with 75% of servers using less than 10%" (Glanz, 2012). Many are 'comatose' servers, burning electricity while doing "little if any computational work" (Glanz, 2012). The studies found that up to 75% of the servers in a given 'farm' were essentially idle.

Servers are not the only energy consumers in data centers, however. Industrial cooling systems are needed to keep the massive spaces usable, and backup battery installations and chargers are required to ride out disruptions in the local electric grid. In fact, most data centers maintain a stable of diesel generators that kick in whenever the local electrical service goes down. The explosion in consumer data creation, transmission and storage, driven by the relative cheapness of centralized cloud computing, is producing an energy usage profile that is simply not sustainable.

One possible avenue to redress this mismatch is virtualization, which effectively merges multiple servers into one large, flexible computing platform that can host a variety of applications or data and scale much more rapidly. While this may reduce the need to maintain excess capacity in individual machines and help improve server utilization rates, it will not by itself reduce the total energy footprint. "Nationwide, data centers used about 76 billion kilowatt-hours in 2010, or roughly 2 percent of all electricity used in the country that year," noted The New York Times (Glanz, 2012).
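The consolidation arithmetic behind improved utilization is straightforward, as the back-of-the-envelope sketch below shows; the average figure echoes the Glanz numbers cited above, while the target utilization is an assumption:

```python
# Back-of-the-envelope sketch of server consolidation via virtualization.
# All figures are illustrative.
physical_servers = 1000
avg_utilization = 0.10     # ~10% average utilization, per the Glanz figures
target_utilization = 0.60  # a plausible target for a virtualized host

work = physical_servers * avg_utilization          # total useful compute demand
hosts_needed = int(work / target_utilization) + 1  # hosts after consolidation

print(f"Useful work equals {work:.0f} fully loaded servers")
print(f"~{hosts_needed} virtualized hosts could absorb it instead of {physical_servers}")
```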

Reference
Glanz, J. (2012, September 22). Power, pollution and the Internet. The New York Times. Retrieved from http://www.nytimes.com/2012/09/23/technology/data-centers-waste-vast-amounts-of-energy-belying-industry-image.html

Which is better -- real or artificial intelligence?

The question of which is better, artificial or real intelligence, contains several concepts that need unpacking before a reasonable answer can be given.

First what exactly do we mean by intelligence?

There are a number of qualitative characteristics that seem to constitute intelligence. The ability to make sense of various inputs from the environment, as well as to remember and learn from past experience, is considered a sign of intelligence. So are the abilities to make sense of ambiguity or seemingly contradictory inputs and to deal with perplexity through rational inference. Intelligence means using a reasoned or rational thought process to respond and react successfully to new situations or scenarios. Intelligence is complex and adaptive. It means recognizing and making judgments about the relative importance of different elements within a situation in order to arrive at reasonable and effective responses.

Second, what do we mean by artificial intelligence? (I am assuming that 'real' intelligence is that exhibited by human agents).

Artificial intelligence, also called machine intelligence, is behavior performed by a computer system that, if done by a human, would be considered intelligent. AI differs from the typical computer system in that it focuses on symbolic rather than numeric manipulation and relies on heuristic rather than algorithmic processing. Heuristics are basically a form of intuitive knowledge, often called rules of thumb. Rather than a specific and rigid algorithm, heuristics are flexible and adaptive. AI systems represent knowledge symbolically and manipulate that knowledge using heuristics, thereby becoming capable of responses that appear, in context, to replicate real intelligence. Perhaps the best example is Deep Blue, the chess program developed by IBM that defeated world champion Garry Kasparov in 1997. (Chess has long been considered a yardstick for measuring intelligent behavior and lends itself naturally to symbolic, heuristic manipulation.)
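To make the heuristic-versus-algorithmic distinction concrete, the sketch below scores a chess position with a rule-of-thumb material count instead of exhaustively searching every line of play (the algorithmic approach, which is intractable). The piece values are the standard textbook ones; the positions are invented:

```python
# Heuristic processing in the chess-program sense: rather than search
# every line to a forced result, score a position with a rule of thumb.
PIECE_VALUES = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

def material_heuristic(white_pieces, black_pieces):
    """Positive favors White, negative favors Black: a rule of thumb, not a proof."""
    white = sum(PIECE_VALUES[p] for p in white_pieces)
    black = sum(PIECE_VALUES[p] for p in black_pieces)
    return white - black

# White has an extra rook; the heuristic says White is winning.
print(material_heuristic("ppppppppnnbbrrq", "ppppppppnnbbrq"))  # +5
```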

Lastly, what do we mean by better? Better in what sense?

If it is simply a matter of preference, we humans would, not surprisingly, appear to prefer real (or human) intelligence. But there are a number of areas in which AI is arguably superior. Because artificial intelligence operates within a computer platform, it is well documented in a way that real intelligence is not. It is also internally consistent and thorough, and it does not suffer from fatigue or have bad days. AI tends to execute tasks much faster than a human could and to perform them with a higher degree of accuracy. It also has a permanence that real intelligence unfortunately lacks, and it is easier to duplicate and disseminate. For certain tasks, AI is considerably cheaper to develop and deploy. But even with all of these advantages, there are still areas in which it is inferior to real intelligence. AI systems are not yet naturally creative and lack inspiration; there are no AI Mozarts or Picassos. They are also inferior to human intelligence at sensing the environment directly and adapting quickly, which is why, for example, AI systems remain inferior to human jet fighter pilots.

So it would probably be correct to say, each is good at certain things and not so good at other things.

Reference
Turban, E., Sharda, R., & Delen, D. (2010). Decision Support and Business Intelligence Systems (9th ed.). Upper Saddle River, NJ: Prentice Hall.

Sunday, March 9, 2014

Sam's Club - Business Analytics Project

This report system is based on a real dataset from Sam’s Club, made available through the Enterprise System University Alliance and currently hosted by the Walton College at the University of Arkansas. We will utilize this dataset to answer at least the following decision-making questions:

1. What are the most popular items in a particular sales period?
2. Are there items popular in one store but not in another?
3. What percentage of members visit more than two stores in the selected period?
4. Does the type of membership relate to purchase amount or number of visits?

Of course, it would be an ideal report system if it also provided ad hoc query capability. The questions above are just a few examples of the recommendations you can offer decision makers by studying the data. Try to build a web reporting system so that ad hoc queries are possible.
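As a hypothetical sketch of the first question, the snippet below assumes a flat extract of the transaction data with invented column names (item_nbr, visit_date, quantity); the actual Walton College schema will differ:

```python
# Sketch of question 1: most popular items in a sales period.
# File name and column names are assumptions for illustration.
import pandas as pd

sales = pd.read_csv("samsclub_item_scan.csv", parse_dates=["visit_date"])

# Restrict to the sales period of interest.
period = sales[(sales["visit_date"] >= "2006-01-01") &
               (sales["visit_date"] <= "2006-03-31")]

# Total quantity sold per item, largest first.
top_items = (period.groupby("item_nbr")["quantity"]
                   .sum()
                   .sort_values(ascending=False)
                   .head(10))
print(top_items)  # ten best-selling items in the period
```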

Wednesday, September 25, 2013

Faster, Cheaper, Better by Michael Hammer and Lisa Hershman

Faster, Cheaper, Better
The 9 Levers for Transforming How Work Gets Done
by Michael Hammer & Lisa W. Hershman
A Review

A REENGINEERING HOW-TO FOR EXECUTIVES

One of the co-founders of the reengineering movement, Michael Hammer has always focused on how companies get things done more than what they need to do. In Faster, Cheaper, Better: The 9 Levers for Transforming How Work Gets Done, he and co-author Lisa Hershman, CEO of Hammer and Company, offer a detailed framework for improving the key processes in a company — the five to ten end-to-end processes, from product development to order fulfillment, that bring all the value to the company's customer. "End-to-end" are the key words in the authors' approach.

Processes in most companies are fragmented: different people or departments doing different tasks along the process with no real thought given to the efficiency of the entire process. "Most people want to do a good job," write Hammer and Hershman. "They are given goals and they strive to meet them. They focus intently on doing their job correctly and well, and they are rewarded for their efforts. But few understand how their narrowly defined jobs fit into the overall picture of what the company is trying to accomplish."

Value for the customer is not created by a job, it is created by a series of jobs that together form the end-to-end process, the authors argue. This focus on end-to-end processes may remind readers of Hammer's classic reengineering approach in Reengineering the Corporation and with good reason. That book, the authors write, explained why the end-to-end process was the better way to organize the operations of a company. Faster, Cheaper, Better explains how to "harness" the concept to make the company more profitable.

Using the Nine Levers

There are, according to the authors, nine levers that companies can use to improve the performance of their end-to-end processes. The first five involve the design and execution of the processes. The authors call these levers "process enablers."

The five process enablers identified by the authors are:

The process design.
Companies need to design new end-to-end processes focused on the needs of their customers and eliminate redundancy in the processes.

Metrics.
Instead of letting each function select its own measures, use customer-focused process measures.

Process owners.
Designate formal process owners — with the responsibility and the authority to improve the entire process — to work with the traditional functional leaders.

Performers.
Redesigning processes changes the way people work. Performers must see their job as part of the whole value-creating process.

Infrastructure.
A new approach to work by the performers requires a new infrastructure to support them — including new compensation plans, new training and development opportunities, a new reporting structure and the necessary tools.

Through their experience with numerous companies, the authors found that implementing these five process enablers could only succeed if four enterprise capabilities were in place:

Leadership.
The leadership of the company needs to think in process terms, not functional terms, and to align their efforts to improve end-to-end processes.

Culture.
The best companies will have a process-based culture that is relentlessly focused on the customer.

Expertise.
Process management and redesign need to be core competencies within the organization.

Governance.
A formal governing infrastructure needs to be in place to implement end-to-end processes, such as a program management office that includes a chief process officer or a process council at the top of the organization.

Faster, Cheaper, Better combines a detailed, extensively researched, yet practical methodology with continuous examples of the methodology's application to the real world — both within the discussion of the methodology and in the second part of the book, which offers in-depth case studies. A summary chart at the end of the book, which shows the nine levers across four stages of maturation, provides yet another guide for companies seeking to improve their processes. This is a business book that delivers on its promise to transform how work gets done in your company.

Monday, September 16, 2013

Why Agile isn't working

Why Agile Isn't Working: Bringing Common Sense to Agile Principles

Agile promises many things, but the reality in the field often falls far short of expectations. Is it agile we need, or an agile way of thinking?

By Lajos Moczar
June 4, 2013
 
Changing Agile Thinking With 3 Common Sense Principles
One thing that's seductive about agile is the name. We like the idea of being agile in our thinking. Agile as a methodology cannot deliver agile thinking, however, and inevitably ends up preventing it.

Think of "agile" as the ability to take the input of all the variable elements of the project—budget, time, design patterns, reusability, customer needs, corporate needs, precedents, standards, technology innovations and limitations—and come up with a pragmatic approach that solves the problem at hand in such a way that the product is delivered properly.

This sounds like a tall order, but really comes down to three common sense principles. If there's one key quality a good project manager needs, it's prioritization: The ability to take the pressures of all project elements and determine which path to follow based on what's most important to achieve.

Remember, the goal is to deliver a quality product on time and to budget; as a rule, there are always some elements that have to be sacrificed to fulfil the needs of the others. It's the role of the project manager to define and maintain the project priorities so they can function as a decision framework for team members as they carry out their tasks.

One of the hardest things for many developers is pragmatism. Rather than think practically, they inevitably fall into abstract approaches to problems. I was on one very expensive project based on a purist design in which everything was reusable and could be dynamically coupled together. It was a marvel of abstraction. The only problem was that it didn't work. It had the worst performance I have ever seen in a Java system and was incapable of supporting the required load. The core problem was that the team attempted to solve real-world problems with an abstract approach.

No matter how virtual technology gets, it's still ultimately based on the physical world that we humans operate in. Physical problems cannot be solved abstractly. Look at any factory machine and see how many oddly shaped parts it contains, each good for just one very vital physical function. Software is no different. Sometimes things are meant for one use only. That's not a bad thing if it gets the job done and functions properly for many years.

Finally, dynamism means the ability to switch strategies when the current one isn't working. Many times, despite our best planning, what seems like a good project, design or development strategy hits a brick wall. The most important thing in this situation is to know when to call it off. If the brick wall is only a brick wall, a bit of brute force—i.e. one or two late nights—will get through it. But if it's really a mountain of rock, it's best not to risk the entire project but, instead, find another way around.

It sounds like a simple concept, but I've watched projects go on for weeks, team members beating their collective heads against the mountain, only to conclude too late that another approach was needed.

Agile promises solutions it cannot deliver. It promotes sloppy requirements, hides the true cost of development and prevents effective management. Contrary to what we're told to expect, this leads to long-running projects, dissatisfied customers and an overall IT ineffectiveness. What we hope to find in the methodology, however, is achievable through agile thinking. With this thinking, we can in fact solve the problems of IT project management and learn how to deliver stable products on time and to budget. If the project team knows how to think and work effectively, then we don't need to look to a methodology to save us from project failures. We can do it ourselves.

Retrieved from
http://www.cio.com/article/734338/Why_Agile_Isn_t_Working_Bringing_Common_Sense_to_Agile_Principles

Lajos Moczar is a senior technology strategist and consultant, operating as The Consultant CTO. When not consulting, he writes on various IT topics.