Chapter 1 What Is Statistics?

problem of making decisions in the face of uncertainty.” And Mood, Graybill, and Boes (1974) define statistics as “the technology of the scientific method” and add that statistics is concerned with “(1) the design of experiments and investigations, (2) statistical inference.” A superficial examination of these definitions suggests a substantial lack of agreement, but all possess common elements. Each description implies that data are collected, with inference as the objective. Each requires selecting a subset of a large collection of data, either existent or conceptual, in order to infer the characteristics of the complete set. All the authors imply that statistics is a theory of information, with inference making as its objective.

The large body of data that is the target of our interest is called the population, and the subset selected from it is a sample. The preferences of voters for a gubernatorial candidate, Jones, expressed in quantitative form (1 for “prefer” and 0 for “do not prefer”) provide a real, finite, and existing population of great interest to Jones. To determine the true fraction who favor his election, Jones would need to interview all eligible voters, a task that is practically impossible. The voltage at a particular point in the guidance system for a spacecraft may be tested in the only three systems that have been built. The resulting data could be used to estimate the voltage characteristics for other systems that might be manufactured some time in the future. In this case, the population is conceptual. We think of the sample of three as being representative of a large population of guidance systems that could be built using the same method. Presumably, this population would possess characteristics similar to the three systems in the sample.
Analogously, measurements on patients in a medical experiment represent a sample from a conceptual population consisting of all patients similarly afflicted today, as well as those who will be afflicted in the near future. You will find it useful to clearly define the populations of interest for each of the scenarios described earlier in this section and to clarify the inferential objective for each. It is interesting to note that billions of dollars are spent each year by U.S. industry and government for data from experimentation, sample surveys, and other data collection procedures. This money is expended solely to obtain information about phenomena susceptible to measurement in areas of business, science, or the arts. The implications of this statement provide keys to the nature of the very valuable contribution that the discipline of statistics makes to research and development in all areas of society. Information useful in inferring some characteristic of a population (either existing or conceptual) is purchased in a specified quantity and results in an inference (estimation or decision) with an associated degree of goodness. For example, if Jones arranges for a sample of voters to be interviewed, the information in the sample can be used to estimate the true fraction of all voters who favor Jones’s election. In addition to the estimate itself, Jones should also be concerned with the likelihood (chance) that the estimate provided is close to the true fraction of eligible voters who favor his election. Intuitively, the larger the number of eligible voters in the sample, the higher will be the likelihood of an accurate estimate. Similarly, if a decision is made regarding the relative merits of two manufacturing processes based on examination of samples of products from both processes, we should be interested in the decision regarding which is better and the likelihood that the decision is correct. 
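The intuition that larger samples give more reliable estimates can be checked numerically. The sketch below is only an illustration under stated assumptions: it supposes that the true fraction of voters favoring Jones is 1/2, that interviews behave like independent draws (a binomial model, which this section does not itself develop), and that "close" means within .05 of the true fraction. The function name and all three settings are hypothetical choices, not values from the text.

```python
from fractions import Fraction
from math import comb


def prob_estimate_within(n, p, tol):
    """Exact chance that the sample fraction from n independent interviews
    lies within tol of the true fraction p (binomial model, an assumption)."""
    total = Fraction(0)
    for k in range(n + 1):
        # Exact rational arithmetic avoids rounding trouble at the boundary.
        if abs(Fraction(k, n) - p) <= tol:
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return float(total)


TRUE_FRACTION = Fraction(1, 2)  # hypothetical true fraction favoring Jones
TOLERANCE = Fraction(1, 20)     # hypothetical meaning of "close": within .05

closeness = {
    n: prob_estimate_within(n, TRUE_FRACTION, TOLERANCE) for n in (20, 100, 500)
}
for n, prob in closeness.items():
    print(f"sample size {n:3d}: chance the estimate is within .05 = {prob:.3f}")
```

Under these assumptions the computed chance rises steadily as the sample size grows from 20 to 100 to 500, which is the pattern the paragraph above describes.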
Copyright 2011 Cengage Learning. All Rights Reserved. May not be copied, scanned, or duplicated, in whole or in part. Due to electronic rights, some third party content may be suppressed from the eBook and/or eChapter(s). Editorial review has deemed that any suppressed content does not materially affect the overall learning experience. Cengage Learning reserves the right to remove additional content at any time if subsequent rights restrictions require it.

In general, the study of statistics is concerned with the design of experiments or sample surveys to obtain a specified quantity of information at minimum cost and the optimum use of this information in making an inference about a population. The objective of statistics is to make
an inference about a population based on information contained in a sample from that population and to provide an associated measure of goodness for the inference.

Exercises

1.1 For each of the following situations, identify the population of interest, the inferential objective, and how you might go about collecting a sample.
a A university researcher wants to estimate the proportion of U.S. citizens from “Generation X” who are interested in starting their own businesses.
b For more than a century, normal body temperature for humans has been accepted to be 98.6° Fahrenheit. Is it really? Researchers want to estimate the average temperature of healthy adults in the United States.
c A city engineer wants to estimate the average weekly water consumption for single-family dwelling units in the city.
d The National Highway Safety Council wants to estimate the proportion of automobile tires with unsafe tread among all tires manufactured by a specific company during the current production year.
e A political scientist wants to determine whether a majority of adult residents of a state favor a unicameral legislature.
f A medical scientist wants to estimate the average length of time until the recurrence of a certain disease.
g An electrical engineer wants to determine whether the average length of life of transistors of a certain type is greater than 500 hours.

1.2 Characterizing a Set of Measurements: Graphical Methods

In the broadest sense, making an inference implies partially or completely describing a phenomenon or physical object. Little difficulty is encountered when appropriate and meaningful descriptive measures are available, but this is not always the case. For example, we might characterize a person by using height, weight, color of hair and eyes, and other descriptive measures of the person’s physiognomy.
Identifying a set of descriptive measures to characterize an oil painting would be a comparatively more difficult task. Characterizing a population that consists of a set of measurements is equally challenging. Consequently, a necessary prelude to a discussion of inference making is the acquisition of a method for characterizing a set of numbers. The characterizations must be meaningful so that knowledge of the descriptive measures enables us to clearly visualize the set of numbers. In addition, we require that the characterizations possess practical significance so that knowledge of the descriptive measures for a population can be used to solve a practical, nonstatistical problem. We will develop our ideas on this subject by examining a process that generates a population.

Consider a study to determine important variables affecting profit in a business that manufactures custom-made machined devices. Some of these variables might be the dollar size of the contract, the type of industry with which the contract is negotiated, the degree of competition in acquiring contracts, the salesperson who estimates the
contract, fixed dollar costs, and the supervisor who is assigned the task of organizing and conducting the manufacturing operation. The statistician will wish to measure the response or dependent variable, profit per contract, for several jobs (the sample). Along with recording the profit, the statistician will obtain measurements on the variables that might be related to profit, the independent variables. His or her objective is to use information in the sample to infer the approximate relationship of the independent variables just described to the dependent variable, profit, and to measure the strength of this relationship. The manufacturer’s objective is to determine optimum conditions for maximizing profit.

The population of interest in the manufacturing problem is conceptual and consists of all measurements of profit (per unit of capital and labor invested) that might be made on contracts, now and in the future, for fixed values of the independent variables (size of the contract, measure of competition, etc.). The profit measurements will vary from contract to contract in a seemingly random manner as a result of variations in materials, time needed to complete individual segments of the work, and other uncontrollable variables affecting the job. Consequently, we view the population as being represented by a distribution of profit measurements, with the form of the distribution depending on specific values of the independent variables. Our wish to determine the relationship between the dependent variable, profit, and a set of independent variables is therefore translated into a desire to determine the effect of the independent variables on the conceptual distribution of population measurements.

An individual population (or any set of measurements) can be characterized by a relative frequency distribution, which can be represented by a relative frequency histogram.
A graph is constructed by subdividing the axis of measurement into intervals of equal width. A rectangle is constructed over each interval, such that the height of the rectangle is proportional to the fraction of the total number of measurements falling in that interval. For example, to characterize the ten measurements 2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, and 2.9, we could divide the axis of measurement into intervals of equal width (say, .2 unit), commencing with 2.05. The relative frequencies (fraction of total number of measurements), calculated for each interval, are shown in Figure 1.1. Notice that the figure gives a clear pictorial description of the entire set of ten measurements.

[FIGURE 1.1 Relative frequency histogram. Horizontal axis: Axis of Measurement, with interval boundaries at 2.05, 2.25, 2.45, 2.65, 2.85, and 3.05. Vertical axis: Relative Frequency, marked at 0, .1, .2, and .3.]

Observe that we have not given precise rules for selecting the number, widths, or locations of the intervals used in constructing a histogram. This is because the
selection of these items is somewhat at the discretion of the person who is involved in the construction.

Although these choices are arbitrary, a few guidelines can be very helpful in selecting the intervals. Points of subdivision of the axis of measurement should be chosen so that it is impossible for a measurement to fall on a point of division. This eliminates a source of confusion and is easily accomplished, as indicated in Figure 1.1. The second guideline involves the width of each interval and, consequently, the minimum number of intervals needed to describe the data. Generally speaking, we wish to obtain information on the form of the distribution of the data. Many times the form will be mound-shaped, as illustrated in Figure 1.2. (Others prefer to refer to distributions such as these as bell-shaped, or normal.) Using many intervals with a small amount of data results in little summarization and presents a picture very similar to the data in their original form. The larger the amount of data, the greater the number of intervals that can be used while still presenting a satisfactory picture of the data. We suggest spanning the range of the data with from 5 to 20 intervals and using the larger number of intervals for larger quantities of data. In most real-life applications, computer software (Minitab, SAS, R, S+, JMP, etc.) is used to obtain any desired histograms. These computer packages all produce histograms satisfying widely agreed-upon constraints on scaling, number of intervals used, widths of intervals, and the like.

Some people feel that the description of data is an end in itself. Histograms are often used for this purpose, but there are many other graphical methods that provide meaningful summaries of the information contained in a set of data. Some excellent references for the general topic of graphical descriptive methods are given in the references at the end of this chapter.
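As a small illustration of the construction described for Figure 1.1, the following sketch tabulates the relative frequencies of the ten example measurements using the interval boundaries 2.05, 2.25, ..., 3.05. It is written in plain Python rather than in any of the packages named above; the function name and the text-bar display are our own hypothetical choices, and no claim is made that statistical software computes histograms this way.

```python
# The ten measurements and interval boundaries from the Figure 1.1 example.
measurements = [2.1, 2.4, 2.2, 2.3, 2.7, 2.5, 2.4, 2.6, 2.6, 2.9]
boundaries = [2.05, 2.25, 2.45, 2.65, 2.85, 3.05]


def relative_frequencies(data, edges):
    """Return the fraction of data falling in each interval
    from edges[i] up to (but not including) edges[i + 1]."""
    counts = [0] * (len(edges) - 1)
    for x in data:
        for i in range(len(counts)):
            if edges[i] <= x < edges[i + 1]:
                counts[i] += 1
                break
    return [c / len(data) for c in counts]


rel_freq = relative_frequencies(measurements, boundaries)

# Print a rough text version of the histogram, one '#' per measurement.
for i, f in enumerate(rel_freq):
    bar = "#" * round(f * len(measurements))
    print(f"{boundaries[i]:.2f} to {boundaries[i + 1]:.2f}: {f:.1f} {bar}")
```

Because the boundaries end in 5 in the second decimal place while the data are recorded to one decimal place, no measurement can fall on a point of division, so the choice between open and closed interval ends never matters here. The five relative frequencies come out as .2, .3, .3, .1, and .1.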
[FIGURE 1.2 Relative frequency distribution. Horizontal axis marked at 2.05, 2.25, 2.45, 2.65, 2.85, and 3.05. Vertical axis: Relative Frequency.]

Keep in mind, however, that the usual objective of statistics is to make inferences. The relative frequency distribution associated with a data set and the accompanying histogram are sufficient for our objectives in developing the material in this text. This is primarily due to the probabilistic interpretation that can be derived from the frequency histogram, Figure 1.1. We have already stated that the area of a rectangle over a given interval is proportional to the fraction of the total number of measurements falling in that interval. Let’s extend this idea one step further. If a measurement is selected at random from the original data set, the probability that it will fall in a given interval is proportional to the area under the histogram lying over that interval. (At this point, we rely on the layperson’s concept of probability. This term is discussed in greater detail in Chapter 2.) For example, for the data used to construct Figure 1.1, the probability that a randomly selected measurement falls in the interval from 2.05 to 2.45 is .5 because half the measurements fall in this interval. Correspondingly, the area under the histogram in Figure 1.1 over the interval from
2.05 to 2.45 is half of the total area under the histogram. It is clear that this interpretation applies to the distribution of any set of measurements, whether a population or a sample.

Suppose that Figure 1.2 gives the relative frequency distribution of profit (in millions of dollars) for a conceptual population of profit responses for contracts at specified settings of the independent variables (size of contract, measure of competition, etc.). The probability that the next contract (at the same settings of the independent variables) yields a profit that falls in the interval from 2.05 to 2.45 million is given by the proportion of the area under the distribution curve that is shaded in Figure 1.2.

Exercises

1.2 Are some cities more windy than others? Does Chicago deserve to be nicknamed “The Windy City”? Given below are the average wind speeds (in miles per hour) for 45 selected U.S. cities:

 8.9  12.4   8.6  11.3   9.2   8.8  35.1   6.2   7.0
 7.1  11.8  10.7   7.6   9.1   9.2   8.2   9.0   8.7
 9.1  10.9  10.3   9.6   7.8  11.5   9.3   7.9   8.8
 8.8  12.7   8.4   7.8   5.7  10.5  10.5   9.6   8.9
10.2  10.3   7.7  10.6   8.3   8.8   9.5   8.8   9.4

Source: The World Almanac and Book of Facts, 2004.
a Construct a relative frequency histogram for these data. (Choose the class boundaries without including the value 35.1 in the range of values.)
b The value 35.1 was recorded at Mt. Washington, New Hampshire. Does the geography of that city explain the magnitude of its average wind speed?
c The average wind speed for Chicago is 10.3 miles per hour. What percentage of the cities have average wind speeds in excess of Chicago’s?
d Do you think that Chicago is unusually windy?

1.3 Of great importance to residents of central Florida is the amount of radioactive material present in the soil of reclaimed phosphate mining areas.
Measurements of the amount of 238U in 25 soil samples were as follows (measurements in picocuries per gram):

 .74  6.47  1.90  2.69    .75
 .32  9.99  1.77  2.41   1.96
1.66   .70  2.42   .54   3.36
3.59   .37  1.09  8.32   4.06
4.55   .76  2.03  5.70  12.48

Construct a relative frequency histogram for these data.

1.4 The top 40 stocks on the over-the-counter (OTC) market, ranked by percentage of outstanding shares traded on one day last year, are as follows:

11.88  6.27  5.49  4.81  4.40  3.78  3.44  3.11  2.88  2.68
 7.99  6.07  5.26  4.79  4.05  3.69  3.36  3.03  2.74  2.63
 7.15  5.98  5.07  4.55  3.94  3.62  3.26  2.99  2.74  2.62
 7.13  5.91  4.94  4.43  3.93  3.48  3.20  2.89  2.69  2.61

a Construct a relative frequency histogram to describe these data.
b What proportion of these top 40 stocks traded more than 4% of the outstanding shares?