Understanding Search
Learning Objectives
After studying this section you should be able to do the following:
- Understand the mechanics of search, including how Google indexes the Web and ranks its organic search results.
- Examine the infrastructure that powers Google and how its scale and complexity offer key competitive advantages.
Before diving into how the firm makes money, let’s first understand how Google’s core service, search, works.
Perform a search (or query) on Google or another search engine, and the results you’ll see are referred to by industry professionals as organic or natural search: search engine results returned and ranked according to relevance. Search engines use different algorithms for determining the order of organic search results, but at Google the method is called PageRank (a bit of a play on words: it ranks Web pages, and it was initially developed by Google cofounder Larry Page). Google does not accept money for placement of links in organic search results. Instead, PageRank results are a kind of popularity contest. Web pages that have more pages linking to them are ranked higher.
Figure 8.5.

The query for “Toyota Prius” triggers organic search results, flanked top and right by sponsored link advertisements.
The process of improving a page’s ranking in organic search results is often referred to as search engine optimization (SEO). SEO has become a critical function for many marketing organizations because if a firm’s pages aren’t near the top of search results, customers may never discover its site.
Google is a bit vague about the specifics of precisely how PageRank has been refined, in part because many have tried to game the system. The less scrupulous have tried creating a series of bogus Web sites, all linking back to the pages they’re trying to promote. This is called link fraud (also known as “spamdexing” or “link farming”), and Google actively works to uncover and shut down such efforts. We do know that links from some Web sites carry more weight than others. For example, links from Web sites that Google deems “influential,” and links from most “.edu” Web sites, have greater weight in PageRank calculations than links from run-of-the-mill “.com” sites.
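To make the “popularity contest” idea concrete, here is a minimal Python sketch of the textbook form of PageRank, computed by repeated passes over a link graph. It is not Google’s production ranking, which the firm does not disclose. The four-page graph, the page names, the damping value of 0.85, and the function name are all illustrative assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration (an illustrative sketch).

    links maps each page to the list of pages it links to.
    """
    pages = list(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outlinks in links.items():
            if not outlinks:
                # A page with no outgoing links spreads its score evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
            else:
                # Otherwise its score is split among the pages it links to.
                share = damping * rank[page] / len(outlinks)
                for target in outlinks:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical Web: three pages link to "hub"; only "hub" links to "a".
web = {
    "hub": ["a"],
    "a": ["hub"],
    "b": ["hub"],
    "c": ["hub"],
}
ranks = pagerank(web)
top_page = max(ranks, key=ranks.get)
```

In this toy graph, "hub" ends up on top because the most pages link to it. Page "a" outranks "b" and "c" even though each has no more than one inbound link, because the single link pointing at "a" comes from the highly ranked "hub". That is the sense in which links from influential sites carry more weight.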
Spiders and Bots and Crawlers—Oh My!
When performing a search via Google or another search engine, you’re not actually searching the Web. What really happens is that the major search engines make what amounts to a copy of the Web, storing and indexing the text of online documents on their own computers. Google’s index considers over one trillion URLs.[298] The upper right-hand corner of a Google query shows you just how fast a search can take place (in the example above, rankings from over eight million results containing the term “Toyota Prius” were delivered in less than two tenths of a second).
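The sketch below illustrates, in a very reduced form, why searching a prebuilt index can be so fast: the documents are read ahead of time, and answering a query becomes a lookup. The three documents, their example.com URLs, and the function names are made up for illustration; a real search index is far more elaborate.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in documents.items():
        for word in text.lower().split():
            index[word].add(url)
    return index

def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results

# Hypothetical "copy of the Web" held on the search engine's own computers.
documents = {
    "http://example.com/prius": "toyota prius hybrid review",
    "http://example.com/camry": "toyota camry sedan review",
    "http://example.com/bikes": "mountain bikes review",
}
index = build_index(documents)
hits = search(index, "Toyota Prius")
```

Here `hits` contains only the Prius page: the query never touches the original sites, only the stored index.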
To create these massive indexes, search firms use software to crawl the Web and uncover as much information as they can find. This software is referred to by several different names: software robots, spiders, or Web crawlers. They all pretty much work the same way: they traverse available Web links to discover documents for indexing and retrieval. In order to make its Web sites visible, every online firm provides a list of all of the public, named servers on its network, known as Domain Name System (DNS) listings. The DNS is the Internet directory service that allows devices and services to be named and discoverable; it helps your browser locate the appropriate computers when you enter an address like http://finance.google.com. Yahoo!, for example, has different servers that can be found at http://www.yahoo.com, sports.yahoo.com, weather.yahoo.com, finance.yahoo.com, et cetera. Spiders start at the first page on every public server and follow every available link, traversing a Web site until all pages are uncovered.
Google will crawl frequently updated sites, like those run by news organizations, as often as several times an hour. Rarely updated, less popular sites might only be reindexed every few days. The method used to crawl the Web also means that if a Web page isn’t the first page on a public server, or isn’t linked to from another public page, then it’ll never be found.[299] Note that each search engine also offers a page where you can submit your Web site for indexing.
Search engines show you what they’ve found on their copy of the Web’s contents, but clicking a search result will direct you to the actual Web site, not the copy. Sometimes you’ll click a result only to find that the Web site doesn’t match what the search engine found. This happens if a Web site was updated before your search engine had a chance to reindex the changes. In most cases you can still pull up the search engine’s copy of the page. Just click the “Cached” link below the result (the term cache refers to a temporary storage space used to speed computing tasks).
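The link-following behavior described above can be sketched as a breadth-first traversal. In this Python sketch the Web site is a hypothetical in-memory dictionary of pages and the links on each page (no network access is involved), and the page names are invented for illustration.

```python
from collections import deque

def crawl(site_links, start_page):
    """Follow every link reachable from start_page, visiting each page once."""
    discovered = {start_page}
    queue = deque([start_page])
    while queue:
        page = queue.popleft()
        for linked in site_links.get(page, []):
            if linked not in discovered:
                discovered.add(linked)
                queue.append(linked)
    return discovered

# Hypothetical site: each page maps to the links that appear on it.
site_links = {
    "/index.html": ["/products.html", "/about.html"],
    "/products.html": ["/prius.html"],
    "/about.html": ["/index.html"],
    "/prius.html": [],
    "/secret-sale.html": ["/index.html"],  # no other page links TO this one
}
found = crawl(site_links, "/index.html")
```

The spider finds four pages but never reaches `/secret-sale.html`, because no discovered page links to it. That is the situation in which a page stays invisible unless its owner submits it for indexing.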
But what if you want the content on your Web site to remain off limits to search engine indexing and caching? Organizations have created a set of standards to stop the spider crawl, and all commercial search engines have agreed to respect these standards. One way is to embed a line of HTML code in a Web page that tells all software robots to stop indexing the page, stop following links on the page, or stop offering old page archives in a cache. Users don’t see this code, but commercial Web crawlers do. For those familiar with HTML code (the language used to describe a Web page), the command to stop Web crawlers from indexing a page, following links, and listing archives of cached pages looks like this:
<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW, NOARCHIVE">
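As a rough illustration of how a well-behaved crawler might read that tag, the following Python sketch uses the standard library’s `html.parser` module to collect the robots directives from a page. The class name and the sample page are assumptions made for this example; commercial crawlers use their own, undisclosed code.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives from any <meta name="robots"> tag on a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

# Hypothetical page carrying the tag shown above.
page = (
    '<html><head>'
    '<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW, NOARCHIVE">'
    '</head><body>Members only</body></html>'
)
parser = RobotsMetaParser()
parser.feed(page)
may_index = "noindex" not in parser.directives
may_follow_links = "nofollow" not in parser.directives
```

A crawler that honors the standard would check flags like `may_index` and `may_follow_links` before storing the page or queuing its links.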
There are other techniques to keep the spiders out, too. Web site administrators can add a special file (called robots.txt) that provides similar instructions on how indexing software should treat the Web site. And a lot of content lies inside the “deep Web,” either behind corporate firewalls or inaccessible to those without a user account. Think of private Facebook updates that no one can see unless they’re your friend: all of that is out of Google’s reach.
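For a sense of what a robots.txt file looks like and how software interprets it, here is a short Python sketch using the standard library’s `urllib.robotparser`. The file contents, the crawler name “ExampleBot,” and the example.com URLs are assumptions for illustration only.

```python
from urllib.robotparser import RobotFileParser

# A hypothetical robots.txt: all crawlers are asked to stay out of /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

rules = RobotFileParser()
rules.parse(robots_txt.splitlines())

allowed_public = rules.can_fetch("ExampleBot", "http://example.com/index.html")
allowed_private = rules.can_fetch("ExampleBot", "http://example.com/private/report.html")
```

With these rules, `allowed_public` is true and `allowed_private` is false. Note that robots.txt is a request, not a lock: it works only because reputable crawlers choose to obey it.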
What’s It Take to Run This Thing?
Sergey Brin and Larry Page started Google with just four scavenged computers.[300] But in a decade, the infrastructure used to power the search sovereign has ballooned to the point where it is now the largest of its kind in the world.[301] Google doesn’t disclose the number of servers it uses, but by some estimates, it runs over 1.4 million servers in over a dozen so-called server farms worldwide.[302] A server farm is a massive network of computer servers running software to coordinate their collective use; server farms provide the infrastructure backbone to SaaS and hardware cloud efforts, as well as many large-scale Internet services. In 2008, the firm spent $2.18 billion on capital expenditures, with data centers, servers, and networking equipment eating up the bulk of this cost.[303] Building massive server farms to index the ever-growing Web is now the cost of admission for any firm wanting to compete in the search market. This is clearly no longer a game for two graduate students working out of a garage.
Video Clip
Google’s Container Data Center
Take a virtual tour of one of Google’s data centers.
The size of this investment not only creates a barrier to entry, it influences industry profitability, with market-leader Google enjoying huge economies of scale. Firms may spend the same amount to build server farms, but if Google has two-thirds of this market (and growing) while Microsoft’s search draws only about one-tenth the traffic, which do you think enjoys the better return on investment?
The hardware components that power Google aren’t particularly special. In most cases the firm uses the kind of Intel or AMD processors, low-end hard drives, and RAM chips that you’d find in a desktop PC. These components are housed in rack-mounted servers about 3.5 inches thick, with each server containing two processors, eight memory slots, and two hard drives.
In some cases, Google mounts racks of these servers inside standard-sized shipping containers, each with as many as 1,160 servers per box.[304] A given data center may have dozens of these server-filled containers all linked together. Redundancy is the name of the game. Google assumes individual components will regularly fail, but no single failure should interrupt the firm’s operations (making the setup what geeks call fault-tolerant: capable of continuing operation even if a component fails). If something breaks, a technician can easily swap it out with a replacement.
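The idea of fault tolerance through redundancy can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of Google’s software: the `Server` class, the server names, and the failover rule (try the next copy) are all hypothetical.

```python
class ServerDown(Exception):
    """Raised when a (simulated) server cannot answer."""

class Server:
    def __init__(self, name, data, healthy=True):
        self.name = name
        self.data = data
        self.healthy = healthy

    def lookup(self, key):
        if not self.healthy:
            raise ServerDown(self.name)
        return self.data[key]

def fault_tolerant_lookup(replicas, key):
    """Try each replica in turn; one failed server does not fail the request."""
    for server in replicas:
        try:
            return server.lookup(key), server.name
        except ServerDown:
            continue  # skip the broken machine and try the next copy
    raise ServerDown("all replicas failed")

# Three servers hold the same data; the first has (simulated) failed hardware.
shared_data = {"toyota prius": ["http://example.com/prius"]}
replicas = [
    Server("server-1", shared_data, healthy=False),
    Server("server-2", shared_data),
    Server("server-3", shared_data),
]
result, answered_by = fault_tolerant_lookup(replicas, "toyota prius")
```

The request succeeds even though `server-1` is down, because `server-2` holds the same data. The user never notices the failure, and the broken machine can be replaced later.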
Each server farm layout has also been carefully designed with an emphasis on lowering power consumption and cooling requirements. And the firm’s custom software (much of it built upon open-source products) allows all this equipment to operate as the world’s largest grid computer.
Web search is a task particularly well suited for the massively parallel architecture used by Google and its rivals. For an analogy of how this works, imagine that working alone, you need to find a particular phrase in a hundred-page document (that’s a one-server effort). Next, imagine that you can distribute the task across five thousand people, giving each of them a separate sentence to scan (that’s the multi-server grid). This difference gives you a sense of how search firms use massive numbers of servers and the divide-and-conquer approach of grid computing to quickly find the needles you’re searching for within the Web’s haystack (for more on grid computing, see Chapter 4, Moore’s Law and More: Fast, Cheap Computing and What It Means for the Manager, and for more on server farms, see Chapter 10, Software in Flux: Partly Cloudy and Sometimes Free).
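The divide-and-conquer analogy can be mimicked on a single machine. In the Python sketch below, a pool of worker threads stands in for the many servers of a grid, and each worker scans one sentence at a time. The four-sentence document, the function names, and the worker count are assumptions for illustration; a real search grid spreads work across separate computers.

```python
from concurrent.futures import ThreadPoolExecutor

def contains_phrase(sentence, phrase):
    """Case-insensitive check of a single sentence (one worker's task)."""
    return phrase.lower() in sentence.lower()

def parallel_search(sentences, phrase, workers=4):
    """Scan sentences concurrently; return the positions of those that match."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        flags = list(pool.map(lambda s: contains_phrase(s, phrase), sentences))
    return [i for i, matched in enumerate(flags) if matched]

# Hypothetical document, already split into sentences.
document = [
    "Search engines copy the Web.",
    "Spiders follow every available link.",
    "The needle is hidden in this haystack.",
    "Server farms are expensive to build.",
]
matches = parallel_search(document, "needle")
```

Each sentence is checked independently, so adding workers (or servers) shortens the wait without changing the answer. That independence is what makes search a good fit for parallel hardware.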
Figure 8.6.

The Google Search Appliance is a hardware product that firms can purchase in order to run Google search technology within the privacy and security of an organization’s firewall.
Google will even sell you a bit of its technology so that you can run your own little Google in-house without sharing documents with the rest of the world. Google’s search appliances are rack-mounted servers that can index documents within a corporation’s Web site, even specifying password and security access on a per-document basis. Selling hardware isn’t a large business for Google, and other vendors offer similar solutions, but search appliances can be vital tools for law firms, investment banks, and other document-rich organizations.
Trendspotting with Google
Google not only gives you search results, it lets you see aggregate trends in what its users are searching for, and this can yield powerful insights. For example, by tracking search trends for flu symptoms, Google’s Flu Trends Web site can pinpoint outbreaks one to two weeks faster than the Centers for Disease Control and Prevention.[305] Want to go beyond the flu? Google’s Trends and Insights for Search services allow anyone to explore search trends, breaking out the analysis by region, category (image, news, product), date, and other criteria. Savvy managers can leverage these and similar tools for competitive analysis, comparing a firm, its brands, and its rivals.
Figure 8.7.

Google Insights for Search can be a useful tool for competitive analysis and trend discovery. The chart shows a comparison (over a twelve-month period, and geographically) of search interest in the terms Wii, Playstation, and Xbox.
Key Takeaways
- Ranked search results are often referred to as organic or natural search. PageRank is Google’s algorithm for ranking search results. PageRank orders organic search results based largely on the number of Web sites linking to them, and the “weight” of each page as measured by its “influence.”
- Search engine optimization (SEO) is the process of using natural or organic search to increase a Web site’s traffic volume and visitor quality. The scope and influence of search has made SEO an increasingly vital marketing function.
- Users don’t really search the Web; they search an archived copy built by crawling and indexing discoverable documents.
- Google operates from a massive network of server farms containing hundreds of thousands of servers built from standard, off-the-shelf items. The cost of the operation is a significant barrier to entry for competitors. Google’s share of search suggests the firm can realize economies of scale over rivals required to make similar investments while delivering fewer results (and hence ads).
- Web site owners can hide pages from popular search-engine Web crawlers using a number of methods, including HTML tags, a robots.txt file, or ensuring that pages aren’t linked to from other pages and haven’t been submitted to search engines for indexing.
Questions and Exercises
- How do search engines discover pages on the Internet? What kind of capital commitment is necessary to go about doing this? How does this impact competitive dynamics in the industry?
- How does Google rank search results? Investigate and list some methods that an organization might use to improve its rank in Google’s organic search results. Are there techniques Google might not approve of? What risk does a firm run if Google or another search firm determines that it has used unscrupulous SEO techniques to try to game ranking algorithms?
- Sometimes Web sites returned by major search engines don’t contain the words or phrases that initially brought you to the site. Why might this happen?
- What’s a cache? What other products or services have a cache?
- What can be done if you want the content on your Web site to remain off limits to search engine indexing and caching?
- What is a “search appliance”? Why might an organization choose such a product?
- Become a better searcher: Look at the advanced options for your favorite search engine. Are there options you hadn’t used previously? Be prepared to share what you learn during class discussion.
- Visit Google Trends and Google Insights for Search. Explore the tools as if you were comparing a firm with its competitors. What sorts of useful insights can you uncover? How might businesses use these tools?
[298] A. Wright, “Exploring a ‘Deep Web’ That Google Can’t Grasp,” New York Times, February 23, 2009.
[299] Most search engines do have a link where you can submit a Web site for indexing, and doing so can help promote the discovery of your content.
[300] M. Liedtke, “Google Reigns as World’s Most Powerful 10-Year-Old,” Associated Press, September 5, 2008.
[301] D. F. Carr, “How Google Works,” Baseline, July 6, 2006.
[302] R. Katz, “Tech Titans Building Boom,” IEEE Spectrum 46, no. 2 (February 1, 2009).
[303] Google, “Google Announces Fourth Quarter and Fiscal Year 2008 Results,” press release, January 22, 2009.
[304] S. Shankland, “Google Unlocks Once-Secret Server,” CNET, April 1, 2009.
[305] S. Bruce, “Google Says User Data Aids Flu Detection,” eHealthInsider, May 25, 2009.

Citation Information
APA Format: Gallaugher, John. Information Systems: A Manager's Guide To Harnessing Technology. Retrieved Sep 2, 2010 from http://www.flatworldknowledge.com/node/41126.
MLA Format: Gallaugher, John. Information Systems: A Manager's Guide To Harnessing Technology. 1969. Flat World Knowledge. 2 Sep, 2010. <http://www.flatworldknowledge.com/node/41126>.