Business Intelligence, 4e (Sharda/Delen/Turban)
Chapter 7 Big Data Concepts and Tools
Copyright © 2018 Pearson Education, Inc.

1) In the opening vignette, Access Telecom (AT) built a system to better visualize customers who were unhappy before they canceled their service.
Answer: TRUE
Diff: 2 Page Ref: 372

2) The term "Big Data" is relative as it depends on the size of the using organization.
Answer: TRUE
Diff: 2 Page Ref: 373

3) Satellite data can be used to evaluate the activity at retail locations as a source of alternative data.
Answer: TRUE
Diff: 2 Page Ref: 377

4) Big Data is being driven by the exponential growth, availability, and use of information.
Answer: TRUE
Diff: 2 Page Ref: 373

5) The quality and objectivity of information disseminated by influential users of Twitter is higher than that disseminated by noninfluential users.
Answer: TRUE
Diff: 2 Page Ref: 392

6) Big Data uses commodity hardware, which is expensive, specialized hardware that is custom built for a client or application.
Answer: FALSE
Diff: 2 Page Ref: 375

7) MapReduce can be easily understood by skilled programmers due to its procedural nature.
Answer: TRUE
Diff: 2 Page Ref: 385

8) Hadoop was designed to handle petabytes and exabytes of data distributed over multiple nodes in parallel.
Answer: TRUE
Diff: 2 Page Ref: 385

9) Hadoop and MapReduce require each other to work.
Answer: FALSE
Diff: 2 Page Ref: 386

10) In most cases, Hadoop is used to replace data warehouses.
Answer: FALSE
Diff: 2 Page Ref: 389

11) Despite their potential, many current NoSQL tools lack mature management and monitoring tools.
Answer: TRUE
Diff: 2 Page Ref: 389

12) There is a clear difference between the type of information support provided by influential users versus the others on Twitter.
Answer: TRUE
Diff: 2 Page Ref: 392

13) Social media mentions can be used to chart and predict flu outbreaks.
Answer: TRUE
Diff: 2 Page Ref: 400

14) In Application Case 7.6, Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse, it was found that urban individuals have a higher number of diagnosed disease conditions.
Answer: TRUE
Diff: 2 Page Ref: 403

15) For low latency, interactive reports, a data warehouse is preferable to Hadoop.
Answer: TRUE
Diff: 2 Page Ref: 396

16) If you have many flexible programming languages running in parallel, Hadoop is preferable to a data warehouse.
Answer: TRUE
Diff: 2 Page Ref: 396

17) In the Salesforce case study, streaming data is used to identify services that customers use most.
Answer: FALSE
Diff: 2 Page Ref: 410

18) It is important for Big Data and self-service business intelligence to go hand in hand to get maximum value from analytics.
Answer: TRUE
Diff: 1 Page Ref: 395

19) Big Data simplifies data governance issues, especially for global firms.
Answer: FALSE
Diff: 2 Page Ref: 406

20) Current total storage capacity lags behind the digital information being generated in the world.
Answer: TRUE
Diff: 2 Page Ref: 406

21) Using data to understand customers/clients and business operations to sustain and foster growth and profitability is
A) easier with the advent of BI and Big Data.
B) essentially the same now as it has always been.
C) an increasingly challenging task for today's enterprises.
D) now completely automated with no human intervention required.
Answer: C
Diff: 2 Page Ref: 373

22) A newly popular unit of data in the Big Data era is the petabyte (PB), which is
A) 10^9 bytes.
B) 10^12 bytes.
C) 10^15 bytes.
D) 10^18 bytes.
Answer: C
Diff: 2 Page Ref: 375

23) Which of the following sources is likely to produce Big Data the fastest?
A) order entry clerks
B) cashiers
C) RFID tags
D) online customers
Answer: C
Diff: 2 Page Ref: 374

24) Data flows can be highly inconsistent, with periodic peaks, making data loads hard to manage. What is this feature of Big Data called?
A) volatility
B) periodicity
C) inconsistency
D) variability
Answer: D
Diff: 2 Page Ref: 376

25) In the Twitter case study, how did influential users support their tweets?
A) opinion
B) objective data
C) multiple posts
D) references to other users
Answer: B
Diff: 2 Page Ref: 392
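
The four answer choices in question 22 are successive decimal storage units, each 1,000 times the previous one. The short sketch below is only an illustration of that arithmetic; the dictionary `UNITS` and the helper `to_bytes` are names invented here, not anything from the textbook.

```python
# Decimal (SI) storage units expressed as powers of ten, in bytes.
# UNITS and to_bytes are hypothetical names used only for this illustration.
UNITS = {
    "GB": 10**9,   # gigabyte
    "TB": 10**12,  # terabyte
    "PB": 10**15,  # petabyte
    "EB": 10**18,  # exabyte
}

def to_bytes(amount, unit):
    """Convert an amount expressed in the given unit to a number of bytes."""
    return amount * UNITS[unit]

# How many terabytes fit in one petabyte?
terabytes_per_petabyte = to_bytes(1, "PB") // to_bytes(1, "TB")
```

Under these definitions, one petabyte equals 1,000 terabytes, which is why choice C (10^15 bytes) is the petabyte.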

26) Allowing Big Data to be processed in memory and distributed across a dedicated set of nodes can solve complex problems in near-real time with highly accurate insights. What is this process called?
A) in-memory analytics
B) in-database analytics
C) grid computing
D) appliances
Answer: A
Diff: 2 Page Ref: 380

27) Which Big Data approach promotes efficiency, lower cost, and better performance by processing jobs in a shared, centrally managed pool of IT resources?
A) in-memory analytics
B) in-database analytics
C) grid computing
D) appliances
Answer: C
Diff: 2 Page Ref: 380

28) How does Hadoop work?
A) It integrates Big Data into a whole so large data elements can be processed as a whole on one computer.
B) It integrates Big Data into a whole so large data elements can be processed as a whole on multiple computers.
C) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on one computer.
D) It breaks up Big Data into multiple parts so each part can be processed and analyzed at the same time on multiple computers.
Answer: D
Diff: 3 Page Ref: 386

29) What is the Hadoop Distributed File System (HDFS) designed to handle?
A) unstructured and semistructured relational data
B) unstructured and semistructured non-relational data
C) structured and semistructured relational data
D) structured and semistructured non-relational data
Answer: B
Diff: 2 Page Ref: 385

30) In a Hadoop "stack," what is a slave node?
A) a node where bits of programs are stored
B) a node where metadata is stored and used to organize data processing
C) a node where data is stored and processed
D) a node responsible for holding all the source programs
Answer: C
Diff: 2 Page Ref: 386
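
The idea behind question 28 (break the data into parts, process each part independently, then combine the results) can be sketched in a few lines. This is a single-process toy word count written in plain Python, not Hadoop's actual API. All function names are hypothetical, the sample lines are made up, and the chunks are processed one after another here, whereas a real cluster would hand each chunk to a different node.

```python
from collections import defaultdict

def split_into_chunks(records, n_chunks):
    """Break the input into parts, one per (hypothetical) worker node."""
    return [records[i::n_chunks] for i in range(n_chunks)]

def map_phase(chunk):
    """Emit a (word, 1) pair for every word found in one chunk."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]

def shuffle(mapped_chunks):
    """Group the emitted values by key across all chunks."""
    grouped = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Sum the grouped values to get one count per word."""
    return {key: sum(values) for key, values in grouped.items()}

def word_count(records, n_chunks=3):
    chunks = split_into_chunks(records, n_chunks)
    # Sequential here; on a cluster each chunk would be mapped on its own node.
    mapped = [map_phase(chunk) for chunk in chunks]
    return reduce_phase(shuffle(mapped))

# Made-up sample input, for illustration only.
lines = ["big data big tools", "data tools", "big ideas"]
counts = word_count(lines)
```

Because each chunk is mapped without reference to the others, the final counts do not depend on how many chunks the input is split into. That independence is the property that lets the work be spread over multiple computers.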

31) In a Hadoop "stack," what node periodically replicates and stores data from the Name Node should it fail?
A) backup node
B) secondary node
C) substitute node
D) slave node
Answer: B
Diff: 2 Page Ref: 386

32) All of the following statements about MapReduce are true EXCEPT
A) MapReduce is a general-purpose execution engine.
B) MapReduce handles the complexities of network communication.
C) MapReduce handles parallel programming.
D) MapReduce runs without fault tolerance.
Answer: D
Diff: 2 Page Ref: 389

33) In a network analysis, what connects nodes?
A) edges
B) metrics
C) paths
D) visualizations
Answer: A
Diff: 2 Page Ref: 403

34) In the Analyzing Disease Patterns from an Electronic Medical Records Data Warehouse case study, what was the analytic goal?
A) determine if diseases are accurately diagnosed
B) determine probabilities of diseases that are comorbid
C) determine differences in rates of disease in urban and rural populations
D) determine differences in rates of disease in males v. females
Answer: C
Diff: 2 Page Ref: 402

35) Traditional data warehouses have not been able to keep up with
A) the evolution of the SQL language.
B) the variety and complexity of data.
C) expert systems that run on them.
D) OLAP.
Answer: B
Diff: 2 Page Ref: 393
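
For question 33, a network can be represented as a list of edges, where each edge is a pair of nodes it connects. The sketch below is a minimal illustration with made-up node labels (A through D); `build_adjacency` is a hypothetical helper, and the edges are treated as undirected by assumption.

```python
from collections import defaultdict

# Each edge connects two nodes; the labels are placeholders, not real data.
edges = [("A", "B"), ("A", "C"), ("C", "D")]

def build_adjacency(edge_list):
    """Map every node to the set of nodes it shares an edge with."""
    adjacency = defaultdict(set)
    for u, v in edge_list:
        adjacency[u].add(v)  # undirected: record the edge in both directions
        adjacency[v].add(u)
    return adjacency

adjacency = build_adjacency(edges)

# A node's degree is the number of edges attached to it.
degree = {node: len(neighbors) for node, neighbors in adjacency.items()}
```

Here nodes A and C each have two edges, while B and D have one each; nodes with no listed edge would simply not appear in the network.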