that are not available to the general audience.8 There are more sources in Factiva towards the end of our sample period. However, this difference should not be crucial to our results, as our sample period is relatively short and all the econometric analyses are benchmarked against the non-internet sample over the same period.

For each IPO in our sample, we “search by name” in Factiva for the period between 90 days prior to the public offer and the end of December 2000. Since the book-building period of most IPOs in our sample lasts 13 weeks (roughly 91 days), this window captures the majority of media coverage during the pre-IPO stage for each firm. Occasionally, we collect news reports from Factiva using “search by keyword” instead of “search by name.”9 This occurs mainly for firms involved in mergers during or after our sample period (Factiva drops all indexing for the target firm after a merger, even if the merger occurred recently).10

We hand-collect all the news articles in which the IPO firm is mentioned. In particular, we do not limit our sample to news items in which the firm is mentioned in the headline or the lead paragraph, because doing so could exclude a large volume of news reports that actually cover the firm.

There are a total of 171,488 news items. Two coauthors began coding the news in the fall of 2002, and the entire sample was completed by the end of 2004. We classify each news item into one of three categories: “good”, “bad”, or “neutral”. Good (bad) news items are defined as news items that carry positive (negative) statements or implications about the firm. Neutral news items are those that cannot be classified as either good or bad. Unlike Chan (2003), we do not classify news based on previous returns, because doing so automatically assumes that causality runs from returns to news.

8 As a result, we also could not include the following: Factiva Aviation Insurance Digest, Factiva Marine Insurance Digest, Dow Jones Emerging Market Reports, Dow Jones Commodities Service, Dow Jones Money Management Alert, and Dow Jones Professional Investor Report.
9 There is a subtle distinction between “search by keyword” and “search by name” for an IPO firm in Factiva. When “search by keyword” is used, Factiva returns virtually all news articles that mention the name of the firm at least once, which could be noisy. By contrast, news articles generated by “search by name” are more closely related to the firm, and therefore more focused.
10 Prior to late March of 2004, Factiva re-indexed all previous news reports about a target firm involved in a merger to the acquirer. A search by a firm's name in this case returns only news items in which both the target and the acquirer are reported. After late March 2004, this issue was resolved when Factiva introduced an updated version of its database.
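To make the coding scheme concrete, the following is a minimal sketch of how the hand-collected items and the three-way classification could be represented and tallied. The record fields, function names, and window logic are illustrative assumptions of ours, not the actual coding procedure.

```python
from dataclasses import dataclass
from datetime import date, timedelta

@dataclass
class NewsItem:
    """One hand-coded Factiva news item (illustrative representation only)."""
    firm: str        # IPO firm identifier
    published: date  # publication date of the news item
    label: str       # "good", "bad", or "neutral"

def collection_window(offer_date: date) -> tuple[date, date]:
    """Search window used for each IPO: 90 days before the public offer
    through the end of December 2000."""
    return offer_date - timedelta(days=90), date(2000, 12, 31)

def tally(items: list[NewsItem]) -> dict[str, int]:
    """Count good, bad, and neutral items within a given set of news items."""
    counts = {"good": 0, "bad": 0, "neutral": 0}
    for item in items:
        counts[item.label] += 1
    return counts
```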
There are two ways to classify news items: mechanically, using content analysis software, or using human judgment. The advantage of the former over the latter is that it is less expensive, faster, and consistent. Its disadvantage is that it is prone to serious mistakes. If the software is programmed to classify a news item as good whenever it detects a sufficient number of positive words, it is bound to misclassify as good an article that contains many positive words about a competitor and few negative words about the firm itself (the sketch below illustrates this failure mode). The software may also misclassify a news item as neutral when the article contains no obviously positive or negative words, whereas a human reader can judge correctly from context that the article is good or bad news for the firm. We therefore chose human judgment for classification.

We read each of the 171,488 news items individually and classify each as “good”, “bad”, or “neutral” using our judgment. Our judgment is based on the content of each individual news item, without forming a new expectation after each piece of news. This method of human judgment has obvious drawbacks, the most important of which is lack of consistency. To reduce possible time-varying judgment errors, we have one author start from the last firm that went public in the internet IPO sample and read its news in reverse chronological order, while the second author starts from the first firm that went public in the non-internet IPO sample and reads its news in chronological order. Later, when we conduct the regression analysis, we difference the news variables to remove the firm effect. This also removes the author-bias effect, if we assume that each author's bias is firm-specific.

However, even with these two approaches, we could still face judgment error, as the same piece of news may be categorized differently by different people over time. We therefore conducted an experiment to verify the consistency of our judgments. Although we defer the discussion of this experiment to the “robustness tests” section of this paper, it should be pointed out here that our judgment was consistent: correlations between our classification of news items and the classification of the same news items made by other participants in the experiment were strongly positive.
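To illustrate the failure mode described above, here is a minimal sketch of the kind of dictionary-based word-count classifier we decided against. The word lists, the tie rule, and the example article are purely illustrative assumptions and are not part of our coding procedure.

```python
# Minimal sketch of a dictionary-based classifier (NOT the method used in this
# paper, which relies on human judgment). Word lists and the tie rule are
# illustrative assumptions only.
POSITIVE = {"growth", "profit", "surge", "record", "upgrade"}
NEGATIVE = {"loss", "lawsuit", "decline", "downgrade", "fraud"}

def naive_classify(text: str) -> str:
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "good"
    if neg > pos:
        return "bad"
    return "neutral"

# The pitfall: positive words about a competitor still count toward "good".
article = ("Rival firm posts record profit and strong growth, "
           "putting pressure on the newly listed company.")
print(naive_classify(article))  # -> "good", even though the news is bad for the firm
```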
We define the degree of media coverage as the number of news items about a sample IPO firm during a specific period. For the pre-IPO period (up to the offer day), news items are counted and classified for the whole period, as there is no price information during this period. For the post-IPO period, news items are classified and counted on a daily basis. For any given day, we aggregate news items about the same firm from multiple media sources and do not distinguish between “real news” and “spin-news”. This research design reflects our intent to investigate the impact of the intensity of media coverage, as well as the fact that different types of media may reach different types of investors. In addition, any criterion for estimating the influence of an individual media outlet is ambiguous, and the same content is very often covered by several media sources.11

11 Surprisingly, when we disaggregated the media into the top ten outlets by circulation (Associated Press News Wire, Chicago Tribune, Daily News (NY), Dow Jones News Service, Houston Chronicle, LA Times, Reuters News, New York Times, Wall Street Journal, and USA Today) and the wire services only (Associated Press News Wire, Dow Jones News Service, and Reuters News), we found that the former covered internet firms with statistically significantly greater intensity (the top ten media sources account for 58.74% of the media coverage for internet firms but only 55.66% for non-internet firms), while the latter showed no statistically significant differential preference (53.28% for internet firms vs. 52.17% for non-internet firms).

C. Summary IPO Statistics

Table 1 reports summary statistics of offer characteristics obtained from SDC and CRSP, broken down by internet and non-internet IPOs. As the internet industry was new, the internet firms, not surprisingly, are younger: the average IPO firm is over 9 years old at the time of its offering in the non-internet sample, but less than 5 years old in the internet sample. However, the difference in age between the two samples drops substantially when we examine the median instead of the mean firm age. During the book-building period (from the registration date to the offer date), the average expected offer price, reflected in the mean of the indicative price range included in the issuer's S-1 filing, is significantly higher for the non-internet sample ($13.24, compared to $12.14 for the internet sample). This is in sharp contrast with the final offer price, which is on average set higher for the internet issues ($14.76 versus $13.67 for the non-internet sample). Accordingly, in terms of price revisions, measured as the percentage change between the final offer price and the expected offer price, the average internet firm revises its price much more than the average non-internet firm, 23% versus 4.15%, and this difference is highly significant.
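For concreteness, the price revision just defined is a simple percentage change. The snippet below is only an illustrative calculation using the sample means quoted above; the reported averages of 23% and 4.15% are averages of firm-level revisions, which need not equal the revision implied by the average prices.

```python
def price_revision(final_offer_price: float, expected_offer_price: float) -> float:
    """Percentage change between the final offer price and the expected offer
    price (the mean of the indicative filing range)."""
    return (final_offer_price - expected_offer_price) / expected_offer_price

# Illustration using the internet-sample means reported in Table 1. This is not
# how the 23% average is computed (that figure averages firm-level revisions);
# it only shows the direction and rough magnitude.
print(f"{price_revision(14.76, 12.14):.1%}")  # roughly +21.6%
```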
The most distinguishing feature of the sample is the first-day returns, calculated as the percentage change between the final offer price and the first-day closing price, which we take from the CRSP tapes if available within seven days of the offer date (as in Lowry and Schwert, 2002). The internet firms averaged a stunning 83.72% first-day return during our sample period, which is similar to the 89% first-day return documented by Ljungqvist and Wilhelm (2003) for internet IPOs during 1999 and 2000. This first-day return is more than twice that of the non-internet sample (41%).

Surprisingly, the two samples do share many similarities. Because of our method of construction, the average gross proceeds are around $88 million for both samples. The width of the filing price range, defined as the difference between the high and low prices suggested in the preliminary prospectus and often viewed in the IPO literature as a proxy for ex ante uncertainty about a firm's value, is virtually the same for the two samples.

About sixty-seven percent of internet firms operate in what we refer to as “high-tech” industries (three-digit SIC codes 283, 357, 366, 367, 381, 382, 383, 384, 737, 873, and 874). This definition follows Benveniste, Ljungqvist, Wilhelm, and Yu (2003) and the “Hi Tech Industry Group” defined by SDC, and covers industries such as pharmaceuticals, computing, computer equipment, electronics, medical and measurement equipment, software, and biotech. Interestingly, sixty-three percent of non-internet firms belong to the “high-tech” category as well, and the difference between the two samples is not significant, either economically or statistically. This reflects the “high-tech” industry clustering in the 1996-2000 sample period. Consistent with this feature, most of the firms in both samples trade on Nasdaq.

IV. MEDIA COVERAGE

A necessary condition for the media to have played an economic role in the internet IPO bubble is that the overall media coverage of internet IPOs differs from that of non-internet IPOs. We investigate this issue in the current section.

We explore this question for the entire sample period of 1996-2000, as well as for two sub-periods: before and after the price peaked. The first peak definition is market-wide: on March 24, 2000, the Nasdaq 100 index reached its highest point in our 1996 to 2000 sample period, and we follow the traditional literature in taking this date as the market peak. The second peak definition is intended to capture shifts in individual stocks, and is calculated as the date on which the firm's market capitalization reaches its highest point in the sample period. The sub-periods defined using the first definition of a peak (March 24, 2000) are therefore in calendar time, whereas the sub-periods defined using the second definition of a peak (the firm peak) are in event time. Throughout this paper, we use “before the peak” and “bubble period” interchangeably, and “after the peak” and “post-bubble period” interchangeably.
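As an illustration of the firm-level peak definition, the sketch below picks the peak date as the day with the highest market capitalization over the sample period. The daily series shown is a made-up example, not actual CRSP data.

```python
from datetime import date

# Hypothetical daily market-capitalization series for one firm; in practice
# this would come from CRSP (price times shares outstanding).
market_cap = {
    date(2000, 3, 20): 4.1e9,
    date(2000, 3, 24): 5.3e9,
    date(2000, 4, 3): 3.8e9,
}

# Firm-level peak: the date with the highest market capitalization
# within the 1996-2000 sample period.
firm_peak = max(market_cap, key=market_cap.get)
print(firm_peak)  # 2000-03-24 in this illustrative series

# The market-wide peak is simply the fixed calendar date March 24, 2000
# (the high of the Nasdaq 100 index over the sample period).
MARKET_PEAK = date(2000, 3, 24)
```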
A. Unconditional Media Coverage

First, we examine the unconditional media coverage of the internet sample and the non-internet sample, without taking into account the impact of previous price movements on this coverage. Figures 1-a through 1-e and Figures 2-a through 2-e provide a visual presentation of these patterns over various periods of time. In Figures 1-a through 1-e, news items for the two samples are aggregated over time and across firms, while in Figures 2-a through 2-e, the news items are per day per firm. Figures 1-a and 2-a cover the entire sample period. Figures 1-b (1-c) and Figures 2-b (2-c) report the degree of media coverage before (after) the peak, where the peak is the day on which the firm's stock price peaked. Figures 1-d (1-e) and Figures 2-d (2-e) report the degree of media coverage before (after) the peak, where the peak is March 24, 2000.

Compared to the non-internet sample, the internet sample had significantly higher media coverage by all three measures (total number of news items, good news, and bad news) and during all three time periods (the entire sample period, and before and after the peak, under either peak definition). This was true in the aggregate as well as on a per-firm, per-day basis.

Next, we use net news, defined as the difference between the number of good news items and the number of bad news items, to proxy for media sentiment. Media sentiment, regardless of whether it reflects public opinion or differs from it, is considered optimistic if net news is positive and pessimistic if net news is negative. Interestingly, as shown in Figures 1-b, 1-d, 2-b, and 2-d, internet IPO firms have more positive net news than their matching sample during the bubble period. This indicates that the media reported relatively more good news than bad news for the internet firms than they did for the non-internet