1294 IEEE COMMUNICATIONS SURVEYS TUTORIALS.VOL.16.NO.3.THIRD QUARTER 2014 Managing RFID Data: Challenges,Opportunities and Solutions Lei Xie,Member.IEEE,Yafeng Yin,Student Member,IEEE,Athanasios V.Vasilakos,Senior Member,IEEE,and Sanglu Lu,Member IEEE Abstract-The advances of Radio-Frequency Identification In conventional RFID applications,the items are attached (RFID)technology have significantly enhanced the capability of with RFID tags and densely placed in a large scale.The RFID capturing data from pervasive space.It becomes a great challenge tag is a small mircochip attached with an antenna in a package in the information era to effectively understand human behavior, mobility and activity through the perceived RFID data.Focusing During the scanning process,the RFID reader powers up and on RFID data management,this article provides an overview of transmits a continuous wave to energize the tags.The tag then current challenges,emerging opportunities and recent progresses responds to the reader with tag-carried information by mod- in RFID.In particular,this article has described and analyzed ulating the backscattered signals.The reader further decodes the research work on three aspects:algorithm,protocol and performance evaluation.We investigate the research progress in the signal and obtains the corresponding information 678 RFID with anti-collision algorithms,authentication and privacy In this way,the RFID system can not only "identify"but protection protocols,localization and activity sensing,as well as also "locate"the labeled item,by providing the information performance tuning in realistic settings.We emphasize the basic including the "identity"as well as the "location".Here,the principles of RFID data management to understand the state- "identity"is precise information extracted from the tag,while of-the-art and to address directions of future research in RFID. the "location"is usually inaccurate information estimated according to environmental parameters.Furthermore,these Index Terms-RFID,data management,anti-collision algo- RFID applications often involve lots of human activities,in rithms,authentication and privacy protection,localization and order to precisely understand human behavior,mobility and activity sensing,performance evaluation. activity through the perceived RFID data,it is essential to ef- fectively manage RFID data to extract the useful information, I.INTRODUCTION while ensuring the overall performance.How to effectively manage RFID data?As a matter of fact,several properties are S A TECHNOLOGY for automated identification of ob- jects,RFID(Radio-Frequency IDentification)has gained essentially required for effective RFID data management.We elaborate on these properties as follows: significant momentum in the past few years.Recently RFID has been more and more frequently introduced in applications Efficiency:while interrogating a number of tags,two met- like retail and distribution,target monitoring and tracking,etc, rics are highly pertinent to efficiency,i.e.,time efficiency as elaborated in literatures [1],[2],and [3].The most impor- to reduce the total scanning time,and energy efficiency tant reason of such phenomenon is that,by leveraging the to reduce the total power consumption. RFID technology,we are able to embed the intelligence into Trustworthy:the communication between the reader each physical object in a low-cost approach.In this way,the and tags should be both privacy-preserving and anti- "passive intelligence"is effectively realized in the ubiquitous counterfeiting. space,which can be regarded as a right candidate for green Localizability:the objects or people should be accurately communications [4][5].As such a novel technology,RFID located within the specified range of time delay. has provided us with 1)the capability to uniquely identify Reliability:in realistic settings,the RFID system should the objects in the physical world and 2)a passive,battery- be able to tackle with issues like signal interference, free interconnection mechanism.Furthermore,the dropping multi-path effect and energy absorption. tag costs and vigorous RFID standardization have accelerated These properties basically belong to the nonfunctional the wide-spread usage of the RFID technology. requirements for RFID systems.According to the above descriptions,it is essential to provide some mechanisms Manuscript received May 8.2013:revised November 15,2013.L.Xie,Y. to effectively satisfy the above nonfunctional requirements Yin,and S.Lu are supported in part by National Natural Science Foundation of China under Grant No.61100196,61321491.91218302:JiangSu Natural Therefore,several new research topics are necessarily to Science Foundation under Grant No.BK2011559:Key Project of Jiangsu be addressed for RFID data management.Specifically,they Research Program under Grant No.BE2013116:EU FP7 IRSES MobileCloud include anti-collision algorithms,authentication and privacy Project under Grant No.612212.Lei Xie is the corresponding author. L.Xie,Y.Yin,and S.Lu are with the State Key Laboratory for Novel protection protocols,localization and activity sensing,as well Software Technology.Nanjing University,China (e-mail:Ixie@nju.edu.cn as performance tuning in realistic settings.Fig.I illustrates yyf@dislab.nju.edu.cn,sanglu@nju.edu.cn). A.Vasilakos is with University of Western Macedonia,Greece (e-mail: these research topics corresponding to the above properties.In vasilako@ath.forthnet.gr). fact,these research topics are not independent of each other, Digital Object Identifier 10.1109/SURV.2014.022614.00143 but are correlated with each other.Besides,since these topics 1553-877X/14/$31.00©20141EEE
1294 IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 16, NO. 3, THIRD QUARTER 2014 Managing RFID Data: Challenges, Opportunities and Solutions Lei Xie, Member, IEEE, Yafeng Yin, Student Member, IEEE, Athanasios V. Vasilakos, Senior Member, IEEE, and Sanglu Lu, Member, IEEE Abstract—The advances of Radio-Frequency Identification (RFID) technology have significantly enhanced the capability of capturing data from pervasive space. It becomes a great challenge in the information era to effectively understand human behavior, mobility and activity through the perceived RFID data. Focusing on RFID data management, this article provides an overview of current challenges, emerging opportunities and recent progresses in RFID. In particular, this article has described and analyzed the research work on three aspects: algorithm, protocol and performance evaluation. We investigate the research progress in RFID with anti-collision algorithms, authentication and privacy protection protocols, localization and activity sensing, as well as performance tuning in realistic settings. We emphasize the basic principles of RFID data management to understand the stateof-the-art and to address directions of future research in RFID. Index Terms—RFID, data management, anti-collision algorithms, authentication and privacy protection, localization and activity sensing, performance evaluation. I. INTRODUCTION AS A TECHNOLOGY for automated identification of objects, RFID (Radio-Frequency IDentification) has gained significant momentum in the past few years. Recently RFID has been more and more frequently introduced in applications like retail and distribution, target monitoring and tracking, etc, as elaborated in literatures [1], [2], and [3]. The most important reason of such phenomenon is that, by leveraging the RFID technology, we are able to embed the intelligence into each physical object in a low-cost approach. In this way, the “passive intelligence” is effectively realized in the ubiquitous space, which can be regarded as a right candidate for green communications [4][5]. As such a novel technology, RFID has provided us with 1) the capability to uniquely identify the objects in the physical world and 2) a passive, batteryfree interconnection mechanism. Furthermore, the dropping tag costs and vigorous RFID standardization have accelerated the wide-spread usage of the RFID technology. Manuscript received May 8, 2013; revised November 15, 2013. L. Xie, Y. Yin, and S. Lu are supported in part by National Natural Science Foundation of China under Grant No. 61100196, 61321491, 91218302; JiangSu Natural Science Foundation under Grant No. BK2011559; Key Project of Jiangsu Research Program under Grant No. BE2013116; EU FP7 IRSES MobileCloud Project under Grant No. 612212. Lei Xie is the corresponding author. L. Xie, Y. Yin, and S. Lu are with the State Key Laboratory for Novel Software Technology, Nanjing University, China (e-mail: lxie@nju.edu.cn, yyf@dislab.nju.edu.cn, sanglu@nju.edu.cn). A. Vasilakos is with University of Western Macedonia, Greece (e-mail: vasilako@ath.forthnet.gr). Digital Object Identifier 10.1109/SURV.2014.022614.00143 In conventional RFID applications, the items are attached with RFID tags and densely placed in a large scale. The RFID tag is a small mircochip attached with an antenna in a package. During the scanning process, the RFID reader powers up and transmits a continuous wave to energize the tags. The tag then responds to the reader with tag-carried information by modulating the backscattered signals. The reader further decodes the signal and obtains the corresponding information [6][7][8]. In this way, the RFID system can not only “identify” but also “locate” the labeled item, by providing the information including the “identity” as well as the “location”. Here, the “identity” is precise information extracted from the tag, while the “location” is usually inaccurate information estimated according to environmental parameters. Furthermore, these RFID applications often involve lots of human activities, in order to precisely understand human behavior, mobility and activity through the perceived RFID data, it is essential to effectively manage RFID data to extract the useful information, while ensuring the overall performance. How to effectively manage RFID data? As a matter of fact, several properties are essentially required for effective RFID data management. We elaborate on these properties as follows: • Efficiency: while interrogating a number of tags, two metrics are highly pertinent to efficiency, i.e., time efficiency to reduce the total scanning time, and energy efficiency to reduce the total power consumption. • Trustworthy: the communication between the reader and tags should be both privacy-preserving and anticounterfeiting. • Localizability: the objects or people should be accurately located within the specified range of time delay. • Reliability: in realistic settings, the RFID system should be able to tackle with issues like signal interference, multi-path effect and energy absorption. These properties basically belong to the nonfunctional requirements for RFID systems. According to the above descriptions, it is essential to provide some mechanisms to effectively satisfy the above nonfunctional requirements. Therefore, several new research topics are necessarily to be addressed for RFID data management. Specifically, they include anti-collision algorithms, authentication and privacy protection protocols, localization and activity sensing, as well as performance tuning in realistic settings. Fig. 1 illustrates these research topics corresponding to the above properties. In fact, these research topics are not independent of each other, but are correlated with each other. Besides, since these topics 1553-877X/14/$31.00 c 2014 IEEE
XIE er al:MANAGING RFID DATA:CHALLENGES.OPPORTUNITIES AND SOLUTIONS 1295 mainly focus on different layers of the protocol stack,the TABLE I relationship between these topics and the protocol stack is also CHALLENGES AND OPPORTUNITIES FOR RFID DATA MANAGEMENT shown in the figure.This figure provides us a clear structured (C DENOTES CHALLENGES,O DENOTES OPPORTUNITIES) framework of these research topics. PHY Layer MAC Layer APP Layer The main contributions of this article are summarized as Anti-collision Al- C(2).(3) C(2) follows: gorithm 0:(2) 0:(2) Authentication C:2).3) We investigate the current challenges and emerging op- Privacy Protection 0:(1).(4) 0:2).(3) portunities in managing RFID data,and provide some Localization Ac- C(1) guidance and substantial lessons learned in the research tivity Sensing 0:(4) 0:(1),5 Performance C:(1) C(3) of RFID data management. Tuning in Realistic We emphasize the basic principles of RFID data manage- Settings ment to understand the state-of-the-art research achieve- ments.Furthermore,we present comparative studies over unstable communication is really a challenging problem existing approaches to analyze their features specifically for passive RFID systems. and concretely. 2) Due to the scarce resource in the tag side,the logical We provide a long term vision for future research di- processing in the tag is required to be simple enough, rections,with more emphasis on insights into future challenges and opportunities in RFID research. the storage requirement in the tag should be limited to several kilobytes.Therefore,it is actually very difficult The rest of this article is organized as follows.We first to implement comprehensive and powerful functions in present the challenges and opportunities for RFID data man- agement in Section II.We then respectively introduce the the RFID systems while satisfying the scarce resource constraint. research progresses of anti-collision algorithms,authentication 3)There already exists an industry standard ISO 18000- and privacy protection protocols,as well as localization and 6C (also known as EPC Class 1 Generation 2)which activity sensing in Section III,IV,and V.After that,in specifies how RFID readers identify and read RFID Section VI we further provide a summative discussion from a tags,the design of protocols and algorithms over RFID more realistic perspective,i.e.,performance tuning in realistic systems should conform to the standard,or at least be settings.We envision the future research directions in Section subject to minor changes.In contrast with the other VⅡand conclude in Section VⅢ wireless devices like sensor nodes,smart phones,etc, since the chips in the tags conventionally cannot be II.CHALLENGES AND OPPORTUNITIES FOR RFID DATA reprogrammed,the only part we can freely reprogram is MANAGEMENT the RFID reader,still the standard should be conformed. Compared to the other intelligent systems like two- Any solutions which requires to substantially change the dimension code,sensor network,etc.,the current design of tag's execution logics are considered to be impractical RFID system has provided us the following opportunities for for RFID systems. effective RFID data management [9]: Therefore,in order to effectively investigate into the tech- 1)RFID tags can be automatically identified in a non- niques of managing RFID data,one intuitive thinking is to contact approach. sufficiently leverage the potential opportunities to overcome 2)RFID tag contains a certain number of logical gates, the emerging challenges.In order to distinguish these opportu- which support simple logical processing in the tag side. nities and challenges more closely,Table I has categorized the 3)RFID tag is able to store some data permanently in its above challenges and opportunities according to the research tiny on-board memory. topics and involved layers in the protocol stack.In this way, 4)The backscattered signal from RFID tag has properties we can get a better understanding of how to deal with these like any other wireless devices,e.g..the received signal challenges and opportunities. strength (RSS). 5)RFID tag is very cheap,which makes the large scale- III.ANTI-COLLISION ALGORITHMS IN RFID SYSTEMS deployment become possible. In conventional RFID applications,a large number of RFID Nevertheless,there exist some significant challenges that tags are widely deployed in the specified regions.In order to must be overcome before the benefits are realized.The major identify these tags in a fast approach,it is essential for the challenges of managing RFID data are summarized as follows: RFID reader to utilize an anti-collision algorithm to effectively 1)The communication link between the RFID reader and read these tags one by one.In wireless environment,conven- the tag is not stable,which is susceptible to issues tional wireless devices mainly leverage CSMA/CA to realize like signal interference,multi-path effect and energy the communication among multiple devices,e.g..the 802.11 absorption.Unlike those battery-powered systems,the protocol series.Being different from the conventional wireless passive RFID system is especially vulnerable to the devices,the RFID tag is a rather simple wireless device with ambient noises and interferences due to the backscatter scarce resource.The tags are unable to self-regulate their radio property.For example,the missed reading phenomenon transmissions to avoid collisions.Specifically,the tag does is very common for RFID systems even in a relatively not have enough processing capacity and energy to realize ideal situation close to free space.How to deal with the the above competition mechanisms to avoid transmission
XIE et al.: MANAGING RFID DATA: CHALLENGES, OPPORTUNITIES AND SOLUTIONS 1295 mainly focus on different layers of the protocol stack, the relationship between these topics and the protocol stack is also shown in the figure. This figure provides us a clear structured framework of these research topics. The main contributions of this article are summarized as follows: • We investigate the current challenges and emerging opportunities in managing RFID data, and provide some guidance and substantial lessons learned in the research of RFID data management. • We emphasize the basic principles of RFID data management to understand the state-of-the-art research achievements. Furthermore, we present comparative studies over existing approaches to analyze their features specifically and concretely. • We provide a long term vision for future research directions, with more emphasis on insights into future challenges and opportunities in RFID research. The rest of this article is organized as follows. We first present the challenges and opportunities for RFID data management in Section II. We then respectively introduce the research progresses of anti-collision algorithms, authentication and privacy protection protocols, as well as localization and activity sensing in Section III, IV, and V. After that, in Section VI we further provide a summative discussion from a more realistic perspective, i.e., performance tuning in realistic settings. We envision the future research directions in Section VII and conclude in Section VIII. II. CHALLENGES AND OPPORTUNITIES FOR RFID DATA MANAGEMENT Compared to the other intelligent systems like twodimension code, sensor network, etc., the current design of RFID system has provided us the following opportunities for effective RFID data management [9]: 1) RFID tags can be automatically identified in a noncontact approach. 2) RFID tag contains a certain number of logical gates, which support simple logical processing in the tag side. 3) RFID tag is able to store some data permanently in its tiny on-board memory. 4) The backscattered signal from RFID tag has properties like any other wireless devices, e.g., the received signal strength (RSS). 5) RFID tag is very cheap, which makes the large scaledeployment become possible. Nevertheless, there exist some significant challenges that must be overcome before the benefits are realized. The major challenges of managing RFID data are summarized as follows: 1) The communication link between the RFID reader and the tag is not stable, which is susceptible to issues like signal interference, multi-path effect and energy absorption. Unlike those battery-powered systems, the passive RFID system is especially vulnerable to the ambient noises and interferences due to the backscatter property. For example, the missed reading phenomenon is very common for RFID systems even in a relatively ideal situation close to free space. How to deal with the TABLE I CHALLENGES AND OPPORTUNITIES FOR RFID DATA MANAGEMENT (C DENOTES CHALLENGES, O DENOTES OPPORTUNITIES) PHY Layer MAC Layer APP Layer Anti-collision Algorithm C:(2),(3) O:(2) C:(2) O:(2) Authentication & Privacy Protection O:(1),(4) C:(2),(3) O:(2),(3) Localization & Activity Sensing C:(1) O:(4) O:(1),(5) Performance Tuning in Realistic Settings C:(1) C:(3) unstable communication is really a challenging problem for passive RFID systems. 2) Due to the scarce resource in the tag side, the logical processing in the tag is required to be simple enough, the storage requirement in the tag should be limited to several kilobytes. Therefore, it is actually very difficult to implement comprehensive and powerful functions in the RFID systems while satisfying the scarce resource constraint. 3) There already exists an industry standard ISO 18000- 6C (also known as EPC Class 1 Generation 2) which specifies how RFID readers identify and read RFID tags, the design of protocols and algorithms over RFID systems should conform to the standard, or at least be subject to minor changes. In contrast with the other wireless devices like sensor nodes, smart phones, etc, since the chips in the tags conventionally cannot be reprogrammed, the only part we can freely reprogram is the RFID reader, still the standard should be conformed. Any solutions which requires to substantially change the tag’s execution logics are considered to be impractical for RFID systems. Therefore, in order to effectively investigate into the techniques of managing RFID data, one intuitive thinking is to sufficiently leverage the potential opportunities to overcome the emerging challenges. In order to distinguish these opportunities and challenges more closely, Table I has categorized the above challenges and opportunities according to the research topics and involved layers in the protocol stack. In this way, we can get a better understanding of how to deal with these challenges and opportunities. III. ANTI-COLLISION ALGORITHMS IN RFID SYSTEMS In conventional RFID applications, a large number of RFID tags are widely deployed in the specified regions. In order to identify these tags in a fast approach, it is essential for the RFID reader to utilize an anti-collision algorithm to effectively read these tags one by one. In wireless environment, conventional wireless devices mainly leverage CSMA/CA to realize the communication among multiple devices, e.g., the 802.11 protocol series. Being different from the conventional wireless devices, the RFID tag is a rather simple wireless device with scarce resource. The tags are unable to self-regulate their radio transmissions to avoid collisions. Specifically, the tag does not have enough processing capacity and energy to realize the above competition mechanisms to avoid transmission
1296 IEEE COMMUNICATIONS SURVEYS TUTORIALS.VOL.16.NO.3.THIRD QUARTER 2014 Nonfunctional Research Topics in Protocol Stack Requirements RFID Data Management Efficiency Anti-collision Algorithms PHY Layer Authentication and Privacy Trustworthy Protection Protocols MAC Layer Localization and Activity Localizability Sensing APP Layer Reliability Performance Tuning in Realistic Settings Fig.1.New Research Topics in RFID Data Management collisions.Therefore,effective anti-collision algorithms should TABLE II be customized for RFID systems.Klair et al.provided an COMPARATIVE STUDY OF THE TWO KINDS OF ANTI-COLLISION ALGORITHMS overview of RFID anti-collision protocols [10].Here,to be more specifically,we further classify and elaborate the princi- Binary tree-based anti- Slotted ALOHA-based ples of the most recent research progresses.In the following collision algorithms anti-collision algorithms Advantages Normally deterministic al- The randomized algorithm subsections,we first introduce the anti-collision algorithms in gorithm is used. is used with a good perfor- tag identification,then,two kinds of ingenious techniques are The query tree-based al- mance on average. respectively introduced,i.e.,estimating the tag size and polling gorithms do not require The randomization makes to store the intermediate the results over slots meet the tags based on the anti-collision algorithms. state variable. a certain probability distri- bution,facilitating various statistical analysis A.Tag Identification Disadvantages For the query tree-based The randomness of the al- In regard to the features of RFID system,an effective algorithms,the delay for gorithm makes the "star- tag identification is sub. vation problem"possible. tag identification protocol is expected to have the following ject to the length and dis- some tags can never se properties:1)Simple:the processing logic of the protocol tribution of tag IDs. lect a singleton slot to re- (including the execution flow and the state transition)should spond.In the worst case. be as simple as possible,due to the scarce resource in the tag; the delay approaches+. 2)Efficient:when interrogating a large number of tags,the protocol should provide a light-weight communication mech- anism to avoid unnecessary transmissions of control messages. range.At the beginning of each frame,the reader broadcasts In this way,the high throughput and low transmission delay the frame size f to the surrounding tags,i.e.,the number can be guaranteed. of slots in the following frame.After each tag receives the The current anti-collision algorithms can be divided into frame size f,it randomly selects a slot from the Ist slot to two categories:the tree based anti-collision algorithms [11,12] the fth slot in this frame and responds to the reader.If the tag and the slotted ALOHA based anti-collision algorithms [13- successfully responds,i.e.,no collision happens in the selected 18].The former leverages the binary search tree to divide the slot,then the specified tag is kept silent in the following query collided tag set into two subsets in a recursive approach.In rounds;otherwise,the tag will continue to select another slot regard to those tag sets with opportunities of collisions,the in the next frame to resend the ID.Therefore,there exists one "keeping silent mechanism is used to resolve the collision of the following situations for each slot within the frame:1) problems.The method to divide the sets is comprised of empty slot:no tag responds in this slot;2)singleton slot:only the random binary tree algorithms and the query binary tree one unique tag responds in this slot;3)collision slot:multiple algorithms.Myung et al.[11]proposed an adaptive tree-based tags respond in this slot.Each kind of slot gives different anti-collision algorithm to efficiently identify the tags.Pan et information.Currently,the slotted ALOHA protocol has been al.[12]proposed a smart tree traversal mechanism based on adopted in the EPC C1G2 standard. the query tree,which conducts the tag identification with low In Table II,we respectively compare the two kinds of anti- delay. collision algorithms in terms of advantages and disadvantages. The ALOHA protocol is first used for random access in Generally speaking,there are pros and cons to both kinds of packet radio network.In order to improve the efficiency of tag algorithms. identification in RFID systems,the slotted ALOHA protocol In regard to the slotted ALOHA protocol,the frame size [13,14]is proposed to effectively resolve the collisions.The for each query round is of crucial importance to the overall slotted ALOHA protocol combines a certain number of time performance for identification.For a given tag set,if the frame slots into a"frame".During the interrogation,the reader sends size is too large,then most of the slots in this frame will a continuous wave to energize all tags in the effective scanning be empty slots,causing the waste of slots;if the frame size
1296 IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 16, NO. 3, THIRD QUARTER 2014 Nonfunctional Requirements Efficiency Trustworthy Localizability Reliability Research Topics in RFID Data Management Anti-collision Algorithms Authentication and Privacy Protection Protocols Localization and Activity Sensing Performance Tuning in Realistic Settings Protocol Stack PHY Layer MAC Layer APP Layer Fig. 1. New Research Topics in RFID Data Management collisions. Therefore, effective anti-collision algorithms should be customized for RFID systems. Klair et al. provided an overview of RFID anti-collision protocols [10]. Here, to be more specifically, we further classify and elaborate the principles of the most recent research progresses. In the following subsections, we first introduce the anti-collision algorithms in tag identification, then, two kinds of ingenious techniques are respectively introduced, i.e., estimating the tag size and polling the tags based on the anti-collision algorithms. A. Tag Identification In regard to the features of RFID system, an effective tag identification protocol is expected to have the following properties: 1) Simple: the processing logic of the protocol (including the execution flow and the state transition) should be as simple as possible, due to the scarce resource in the tag; 2) Efficient: when interrogating a large number of tags, the protocol should provide a light-weight communication mechanism to avoid unnecessary transmissions of control messages. In this way, the high throughput and low transmission delay can be guaranteed. The current anti-collision algorithms can be divided into two categories: the tree based anti-collision algorithms [11, 12] and the slotted ALOHA based anti-collision algorithms [13– 18]. The former leverages the binary search tree to divide the collided tag set into two subsets in a recursive approach. In regard to those tag sets with opportunities of collisions, the “keeping silent” mechanism is used to resolve the collision problems. The method to divide the sets is comprised of the random binary tree algorithms and the query binary tree algorithms. Myung et al. [11] proposed an adaptive tree-based anti-collision algorithm to efficiently identify the tags. Pan et al. [12] proposed a smart tree traversal mechanism based on the query tree, which conducts the tag identification with low delay. The ALOHA protocol is first used for random access in packet radio network. In order to improve the efficiency of tag identification in RFID systems, the slotted ALOHA protocol [13, 14] is proposed to effectively resolve the collisions. The slotted ALOHA protocol combines a certain number of time slots into a “frame”. During the interrogation, the reader sends a continuous wave to energize all tags in the effective scanning TABLE II COMPARATIVE STUDY OF THE TWO KINDS OF ANTI-COLLISION ALGORITHMS Binary tree-based anticollision algorithms Slotted ALOHA-based anti-collision algorithms Advantages Normally deterministic algorithm is used. The query tree-based algorithms do not require to store the intermediate state variable. The randomized algorithm is used with a good performance on average. The randomization makes the results over slots meet a certain probability distribution, facilitating various statistical analysis. Disadvantages For the query tree-based algorithms, the delay for tag identification is subject to the length and distribution of tag IDs. The randomness of the algorithm makes the “starvation problem” possible, some tags can never select a singleton slot to respond. In the worst case, the delay approaches +∞. range. At the beginning of each frame, the reader broadcasts the frame size f to the surrounding tags, i.e., the number of slots in the following frame. After each tag receives the frame size f, it randomly selects a slot from the 1st slot to the fth slot in this frame and responds to the reader. If the tag successfully responds, i.e., no collision happens in the selected slot, then the specified tag is kept silent in the following query rounds; otherwise, the tag will continue to select another slot in the next frame to resend the ID. Therefore, there exists one of the following situations for each slot within the frame: 1) empty slot: no tag responds in this slot; 2) singleton slot: only one unique tag responds in this slot; 3) collision slot: multiple tags respond in this slot. Each kind of slot gives different information. Currently, the slotted ALOHA protocol has been adopted in the EPC C1G2 standard. In Table II, we respectively compare the two kinds of anticollision algorithms in terms of advantages and disadvantages. Generally speaking, there are pros and cons to both kinds of algorithms. In regard to the slotted ALOHA protocol, the frame size for each query round is of crucial importance to the overall performance for identification. For a given tag set, if the frame size is too large, then most of the slots in this frame will be empty slots, causing the waste of slots; if the frame size
XIE er al:MANAGING RFID DATA:CHALLENGES.OPPORTUNITIES AND SOLUTIONS 1297 0.4 0.8 0.6 0.2 0.4 0.2 100 200 300 400 20 40 60 80100 Frame Size f Frame size f (a) (b) Fig.2.(a)Given the tag size n100.the relationship between the read efficiencyand the frame size f.(b)As the tag size n varies,the maximum read efficiency obtained when f*=n. is too small,then there will exist transmission collisions for efficiency in a query round approaches to I when n>5,if most of the slots,causing most tags to retransmit in the next the optimal frame size f*=n is used.If this strategy is used query round.Hence,the frame size should be dynamically for each query round,then the overall efficiency is close to 1 adjusted according to the number of tags in the current Since I is the upper bound of the overall efficiency,then the query round.Therefore,many researchers have investigated global optimal efficiency is also achieved with this strategy. into this problem.Schoute et al.analyze the impact of the The above anti-collision algorithms are conventionally de- dynamic frame size selection on the reading performance in vised towards fairly idealized settings,without sufficiently ad- slotted ALOHA protocol [15].Floerkemeier et al.propose the dressing the difficulties in real applications,e.g.,the frequent strategy of optimizing the dynamic frame size,according to the movement of tags.the signal interference among multiple Bayesian probability model [16].Vogt et al.leverage Markov RFID readers,the signal attenuation and multi-path effect, process to model the identification process,and calculate a etc.Therefore,a few research work start to focus on and series of the optimized frame sizes during the interrogation try to solve the above problems.Due to the limited scanning process[17刀]. range of a single RFID reader.multiple readers are con- Lee et al.further derive the optimal dynamic frame sizes ventionally deployed in the application scenarios.Since the to maximize the channel efficiency [18].Their research con- former research work mainly focus on resolving the collisions clusions indicate that,if the frame size used in each query among multiple tag transmissions,without considering the round is equivalent to the number of tags to be identified, signal interference problem among multiple readers,literature then the maximum efficiency can be achieved for the channel.[19,20]propose optimized activating and scheduling schemes The detail principle is described as follows:assume the current for multiple readers.In this way,multiple readers are able number of tags within the effective scanning range is n,the to collaboratively avoid the signal transmission collisions. frame size is f.As the number of tags in each slot conforms Furthermore,by utilizing the mobile RFID reader,Sheng et al. to the Binomial distribution,then,the expected number of sin-develop efficient schemes for continuous scanning operations gleton slots in this frame is E[m]=nx(1-1/f)"-1.In order defined in both spatial and temporal domains [21].Their to maximize the channel efficiency i.e.,the ratio of the basic idea is to fully utilize the information gathered in the number of singleton slots to the frame size,we get the value previous scanning operations to reduce the scanning time of of f through the extremum of':aE並=0→f*=n, the succeeding ones.The former research work mainly devise then the maximum channel efficiency is n/f*1/e.In optimized schemes to identify statically deployed tags in fairly Fig.2(a)and Fig.2(b),we provide more detail descriptions for idealized settings,without considering the impact of some the above properties.Fig.2(a)depicts the impact of the frame ubiquitous issues like path loss in the mobile environment.In size f on the channel efficiency,when the tag size n =100. regard to this problem,Xie et al.propose a probabilistic model Note that as the frame size gradually increases from 1 to 400, for RFID tag identification [22],according to the continuous the read efficiency gradually increases to the maximum value changing properties of signal attenuation.Based on this model, and then decreases.The maximum efficiency is achieved they devise optimal parameters for the tag identification, when f=100.Fig.2(b)depicts the impact of the tag size n which conforms to the EPC Gen2 standard.Moreover,in on the channel efficiency,when the optimal frame size f*=n order to execute the continuous scanning with mobile reader is used.Note that when the value of n is small,the optimal for tag identification in realistic settings,Xie et al.conduct efficiency is larger than 1,e.g.,when the tag size n =1,comprehensive experimental study on mobile RFID reading, the optimal efficiency is 100%if the frame size is set to 1.and design very efficient algorithms to maximize the time- As the value of n gradually increases,the optimal efficiency efficiency and energy-efficiency by skillfully adjusting the quickly converges to It implies that,the local optimal read readers power and moving speed [23].In order to identify
XIE et al.: MANAGING RFID DATA: CHALLENGES, OPPORTUNITIES AND SOLUTIONS 1297 0 100 200 300 400 0 0.1 0.2 0.3 0.4 Frame Size f Read Efficiency n1/f (a) 20 40 60 80 100 0 0.2 0.4 0.6 0.8 1 The maximum of read efficiency n1/f Frame size f (b) Fig. 2. (a) Given the tag size n = 100, the relationship between the read efficiency n1 f and the frame size f. (b) As the tag size n varies, the maximum read efficiency obtained when f∗ = n. is too small, then there will exist transmission collisions for most of the slots, causing most tags to retransmit in the next query round. Hence, the frame size should be dynamically adjusted according to the number of tags in the current query round. Therefore, many researchers have investigated into this problem. Schoute et al. analyze the impact of the dynamic frame size selection on the reading performance in slotted ALOHA protocol [15]. Floerkemeier et al. propose the strategy of optimizing the dynamic frame size, according to the Bayesian probability model [16]. Vogt et al. leverage Markov process to model the identification process, and calculate a series of the optimized frame sizes during the interrogation process [17]. Lee et al. further derive the optimal dynamic frame sizes to maximize the channel efficiency [18]. Their research conclusions indicate that, if the frame size used in each query round is equivalent to the number of tags to be identified, then the maximum efficiency can be achieved for the channel. The detail principle is described as follows: assume the current number of tags within the effective scanning range is n, the frame size is f. As the number of tags in each slot conforms to the Binomial distribution, then, the expected number of singleton slots in this frame is E[n1] = n×(1−1/f)n−1. In order to maximize the channel efficiency n1 f , i.e., the ratio of the number of singleton slots to the frame size, we get the value of f through the extremum of n1 f : ∂E[n1]/f ∂f = 0 → f ∗ = n, then the maximum channel efficiency is n1/f ∗ → 1/e. In Fig.2(a) and Fig.2(b), we provide more detail descriptions for the above properties. Fig.2(a) depicts the impact of the frame size f on the channel efficiency, when the tag size n = 100. Note that as the frame size gradually increases from 1 to 400, the read efficiency gradually increases to the maximum value 1 e and then decreases. The maximum efficiency is achieved when f = 100. Fig.2(b) depicts the impact of the tag size n on the channel efficiency, when the optimal frame size f ∗ = n is used. Note that when the value of n is small, the optimal efficiency is larger than 1 e , e.g., when the tag size n = 1, the optimal efficiency is 100% if the frame size is set to 1. As the value of n gradually increases, the optimal efficiency quickly converges to 1 e . It implies that, the local optimal read efficiency in a query round approaches to 1 e when n > 5, if the optimal frame size f ∗ = n is used. If this strategy is used for each query round, then the overall efficiency is close to 1 e . Since 1 e is the upper bound of the overall efficiency, then the global optimal efficiency is also achieved with this strategy. The above anti-collision algorithms are conventionally devised towards fairly idealized settings, without sufficiently addressing the difficulties in real applications, e.g., the frequent movement of tags, the signal interference among multiple RFID readers, the signal attenuation and multi-path effect, etc. Therefore, a few research work start to focus on and try to solve the above problems. Due to the limited scanning range of a single RFID reader, multiple readers are conventionally deployed in the application scenarios. Since the former research work mainly focus on resolving the collisions among multiple tag transmissions, without considering the signal interference problem among multiple readers, literature [19, 20] propose optimized activating and scheduling schemes for multiple readers. In this way, multiple readers are able to collaboratively avoid the signal transmission collisions. Furthermore, by utilizing the mobile RFID reader, Sheng et al. develop efficient schemes for continuous scanning operations defined in both spatial and temporal domains [21]. Their basic idea is to fully utilize the information gathered in the previous scanning operations to reduce the scanning time of the succeeding ones. The former research work mainly devise optimized schemes to identify statically deployed tags in fairly idealized settings, without considering the impact of some ubiquitous issues like path loss in the mobile environment. In regard to this problem, Xie et al. propose a probabilistic model for RFID tag identification [22], according to the continuous changing properties of signal attenuation. Based on this model, they devise optimal parameters for the tag identification, which conforms to the EPC Gen2 standard. Moreover, in order to execute the continuous scanning with mobile reader for tag identification in realistic settings, Xie et al. conduct comprehensive experimental study on mobile RFID reading, and design very efficient algorithms to maximize the timeefficiency and energy-efficiency by skillfully adjusting the readers power and moving speed [23]. In order to identify
1298 IEEE COMMUNICATIONS SURVEYS TUTORIALS.VOL.16.NO.3.THIRD QUARTER 2014 TABLE III COMPARISON OF ALGORITHMS FOR TAG SIZE ESTIMATION Empty slots Estimation Compatibility Indicator for Estima- Probability Model 0.8 Singleton slots Algorithm with Slotted tion -Collision slots ALOHA 25 Compatible The number of Binomial 0.6 empty and collision distribution slots 0.4 26 Compatible The number of Posterior probability empty,singleton and model based on bi- collision slots nomial distribution 0.2 127 Compatible The first slot with tag Binomial response distribution T2829 Not The fringe of colli- Geometric distribu- 0 Compatible sion slots tion 2 6 8 10 [30 Compatible The average run- Binomial The ratio of tag size to frame size n/f length of ones in the distribution bit string received Fig.3.The normalized expected number of empty/singleton/collision slots as the ratio of子varies Binomial distribution,the authors present unified estimation algorithms according to the empty slots and collision slots, the RFID tags towards the specified area,Yin et al.propose a and they further reduce the estimation errors through repeated "focus and shoot"method for efficient tag identification,based sampling on extensive empirical study on RFID systems [24]. Although the number of singleton slots is not monotonic to the overall tag size,the tag size cannot be estimated purely B.Estimating the Tag Size from the number of singleton slots,nevertheless,the number With the further extension of RFID applications,a number of three kinds of slots together can help to estimate the tag of applications only require some statistical information for size in a more accurate approach.Hence,Chen et al.present a data analysis and mining.In this situation,the RFID system posterior probability model based on the observed number of only requires to quickly estimate the statistics,instead of empty/singleton/collision slots [26].They propose an accurate identifying the tags one by one.One of the most important tag estimate method by maximizing the posterior probability. statistics is the tag size,i.e.,the overall number of tags. The above mechanisms all require a large number of sam- Moreover,the slotted ALOHA protocol based on dynamic plings over the three kinds of slots to improve the estimation frame sizes also requires to estimate the current tag size for accuracy.In fact,some other properties can be leveraged to determining the frame size.Therefore,recently there are many accurately estimate the tag size in a faster approach.Han research work which focus on how to fast and accurately et al.propose an efficient and anonymous scheme for tag estimate the overall tag size.The core idea is as follows:since size estimation [27],which leverages the position of the first the slotted ALOHA protocol requires the tags to randomly reply from a group of tags in a frame.Moreover,in order select the slots to respond inside the frame,then it is possible to tackle with the multiple reading problem,i.e.,one single to leverage the statistics regularities in the randomized algo- tag can be interrogated multiple times with multiple readers, rithm to estimate the tag size.Specifically,according to the Chen et al.present a replicate-insensitive estimation scheme current protocol in EPC Gen 2 standard,the number of empty [28],which eliminates the multiple reading in multi-reader slots,singleton slots and collision slots actually conform to scenarios.The authors further propose an adaptively splitting- the Binomial distribution in a statistical approach.When the based arbitration protocol [29].This protocol requires a single number of samplings for slots is large enough,the number of tag to select multiple slots in a frame to respond,according tags can be effectively estimated according to the Binomial to the Geometric distribution.Specifically,each tag have a distribution. probability of to respond in the t-1th slot.Then,the Based on the above idea,Kodialam et al.propose a fast tag size can be effectively estimated according to the fringe and reliable estimation mechanisms for tag size in a practical of collision slots in the frame.Shahzad et al.propose a new approach [25].The major idea is as follows:assume in a scheme for estimating tag population size called average run certain query round with fixed frame size f,as the overall based tag estimation [30].The technique is based on the number of tags n increases,the number of empty slots is average run-length of ones in the bit string received using the expected to decrease,the number of collision slots is expected standardized framed slotted ALOHA protocol.It is easy to to increase,while the number of singleton slots is expected deploy because it neither requires modification to tags nor to to first increase and then decrease.Therefore,the number the communication protocol between tags and readers.Table of empty slots (collision slots)is monotonically decreasing III provides a comparison of the above algorithms for tag size (increasing)with the overall number of tags.Fig.3 depicts estimation. the normalized expected number of empty/singleton/collision Instead of simply estimating the tag size,a number of slots as the ratio of4 varies,here the number of slots is researchers start to investigate how to realize more compli- normalized to a real number in the range 0 through I by cated data analysis and mining over RFID systems.Sheng et dividing the frame size f.Based on the probabilistic model of al.consider the problem of identifying popular categories of
1298 IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 16, NO. 3, THIRD QUARTER 2014 0 2 4 6 8 10 0 0.2 0.4 0.6 0.8 1 The ratio of tag size to frame size n/f Normalized expected number of slots Empty slots Singleton slots Collision slots Fig. 3. The normalized expected number of empty/singleton/collision slots as the ratio of n f varies the RFID tags towards the specified area, Yin et al. propose a “focus and shoot” method for efficient tag identification, based on extensive empirical study on RFID systems [24]. B. Estimating the Tag Size With the further extension of RFID applications, a number of applications only require some statistical information for data analysis and mining. In this situation, the RFID system only requires to quickly estimate the statistics, instead of identifying the tags one by one. One of the most important statistics is the tag size, i.e., the overall number of tags. Moreover, the slotted ALOHA protocol based on dynamic frame sizes also requires to estimate the current tag size for determining the frame size. Therefore, recently there are many research work which focus on how to fast and accurately estimate the overall tag size. The core idea is as follows: since the slotted ALOHA protocol requires the tags to randomly select the slots to respond inside the frame, then it is possible to leverage the statistics regularities in the randomized algorithm to estimate the tag size. Specifically, according to the current protocol in EPC Gen 2 standard, the number of empty slots, singleton slots and collision slots actually conform to the Binomial distribution in a statistical approach. When the number of samplings for slots is large enough, the number of tags can be effectively estimated according to the Binomial distribution. Based on the above idea, Kodialam et al. propose a fast and reliable estimation mechanisms for tag size in a practical approach [25]. The major idea is as follows: assume in a certain query round with fixed frame size f, as the overall number of tags n increases, the number of empty slots is expected to decrease, the number of collision slots is expected to increase, while the number of singleton slots is expected to first increase and then decrease. Therefore, the number of empty slots (collision slots) is monotonically decreasing (increasing) with the overall number of tags. Fig.3 depicts the normalized expected number of empty/singleton/collision slots as the ratio of n f varies, here the number of slots is normalized to a real number in the range 0 through 1 by dividing the frame size f. Based on the probabilistic model of TABLE III COMPARISON OF ALGORITHMS FOR TAG S IZE ESTIMATION Estimation Algorithm Compatibility with Slotted ALOHA Indicator for Estimation Probability Model [25] Compatible The number of empty and collision slots Binomial distribution [26] Compatible The number of empty, singleton and collision slots Posterior probability model based on binomial distribution [27] Compatible The first slot with tag response Binomial distribution [28][29] Not Compatible The fringe of collision slots Geometric distribution [30] Compatible The average runlength of ones in the bit string received Binomial distribution Binomial distribution, the authors present unified estimation algorithms according to the empty slots and collision slots, and they further reduce the estimation errors through repeated sampling. Although the number of singleton slots is not monotonic to the overall tag size, the tag size cannot be estimated purely from the number of singleton slots, nevertheless, the number of three kinds of slots together can help to estimate the tag size in a more accurate approach. Hence, Chen et al. present a posterior probability model based on the observed number of empty/singleton/collision slots [26]. They propose an accurate tag estimate method by maximizing the posterior probability. The above mechanisms all require a large number of samplings over the three kinds of slots to improve the estimation accuracy. In fact, some other properties can be leveraged to accurately estimate the tag size in a faster approach. Han et al. propose an efficient and anonymous scheme for tag size estimation [27], which leverages the position of the first reply from a group of tags in a frame. Moreover, in order to tackle with the multiple reading problem, i.e., one single tag can be interrogated multiple times with multiple readers, Chen et al. present a replicate-insensitive estimation scheme [28], which eliminates the multiple reading in multi-reader scenarios. The authors further propose an adaptively splittingbased arbitration protocol [29]. This protocol requires a single tag to select multiple slots in a frame to respond, according to the Geometric distribution. Specifically, each tag have a probability of 1 2t to respond in the t − 1th slot. Then, the tag size can be effectively estimated according to the fringe of collision slots in the frame. Shahzad et al. propose a new scheme for estimating tag population size called average run based tag estimation [30]. The technique is based on the average run-length of ones in the bit string received using the standardized framed slotted ALOHA protocol. It is easy to deploy because it neither requires modification to tags nor to the communication protocol between tags and readers. Table III provides a comparison of the above algorithms for tag size estimation. Instead of simply estimating the tag size, a number of researchers start to investigate how to realize more complicated data analysis and mining over RFID systems. Sheng et al. consider the problem of identifying popular categories of