XIE er al:MANAGING RFID DATA:CHALLENGES.OPPORTUNITIES AND SOLUTIONS 1299 RFID tags out of a large collection of tags,without reading to be singleton/collision slot but turns out to be empty,then all the tag data [31].They propose effective algorithms based it implies that the tag(s)mapping to this slot is(are)missing on the idea of group testing.which can efficiently derive Fig.4 gives an example of the polling mechanism.As the figure popular categories of tags.Kodialam et al.propose a privacy- shows,the dotted box denotes the missing tag,while the solid preserving estimation mechanism over dynamically moving box denotes the existing tag.Assume that the tag E,F and H tags in applications like object tracking [32].This mechanism are missing,if the corresponding slots turn out to be empty, can fast count the number of tags moving from location A then the tags can be determined missing.However,if the tag C to B during the time interval(t1,t2).In this way,the traffic is missing,the corresponding slot still turns out to be collision can be effectively tracked without exactly reading the tag IDs. slot due to the existence of the tag B and D.In this situation, In general,research work in this area are relatively rare.With the tag C cannot be identified as missing,resulting in the the further extension of RFID applications,efficient techniques false positive error.The above polling mechanism provides a for data analysis and mining over RFID systems will attract theoretical foundation to recent research work. widespread attentions and in-depth studies Based on the above understanding,Li et al.study a practically important problem of monitoring a large set of C.Polling the Tags RFID tags,i.e.,efficiently identifying the missing tags in a large RFID system [33].Based on the"pseudo randomness", Instead of obtaining all tags'IDs in the effective scanning they design a series of missing-tag identification protocols range,some applications only require to poll the tags in a that employ novel techniques to reduce the execution time. specified tag set,verifying whether these tags are missing. In order to deal with the problem of collecting real-time For example,in regard to the applications like warehouse information from a set of battery-powered active tags,Qiao management,assume that each of the goods is attached with et al.propose a tag-ordering polling protocol [34]that can an RFID tag with unique ID,the administrator should check reduce per-tag energy consumption by more than an order the specified inventories according to the list of goods.In this of magnitude.Moreover,they apply partitioned Bloom filters situation,it becomes an essential problem to devise efficient to further enhance the performance.such that it can achieve polling protocols over the RFID tags,such that both the much better energy efficiency without degradation in protocol time efficiency and energy efficiency can be achieved.A execution time.Chen et al.respectively propose a single- straightforward solution is to realize the following"roll call" hash and a multi-hash based information collection protocol to mechanism:the reader continuously broadcasts the tags'IDs address the above problem [35],which dramatically reduces according to the check list.After each tag ID is broadcasted. the expected execution time.In order to meet the requirement the reader will wait for the specified tag to return a short of prompt and reliable batch authentications in large scale response.If a short response is obtained,then it implies that RFID applications,Yang et al.propose an identification-free this tag exists in the scanning range;otherwise,this tag is batch authentication protocol based on the polling mechanism expected to be missing.However,there are a number of [36].Conventionally,in order to authenticate a large number disadvantages for the above mechanism:1)it is not compatible of tags,the reader should identify and authenticate the tags with the slotted ALOHA:2)the transmission delay of the one by one,which requires a large amount of scanning tag ID with 96 bits is fairly large,the efficiency is greatly time.Based on the "pseudo randomness",this batch authen- reduced.Therefore,many researchers have investigated into tication protocol greatly reduces the overall scanning time, this problem,and tried to design efficient polling protocols while guaranteeing the probability of false positive error to based on the slotted ALOHA,aiming to sufficiently reduce be below the specified threshold.Being different from the the overall scanning time. above research work,which mainly poll over all tags in Note that during each query round of the interrogation,each the scanning area,Zheng et al.address the problem of fast of the activated tags will "randomly"select a slot to respond searching a particular subset in a large number of RFID tags inside the frame.However,due to the simplicity of the tag's [37].They utilize Bloom filter-based compact approximators design,it is rather difficult to realize the "pure randomness". to efficiently aggregate and exchange the tag information Instead,the"pseudo randomness"is used as follows:For each with a two-phase approximation protocol,which significantly query round the reader first broadcasts a random number r to reduces the searching time.Generally speaking,these polling the tags.After that,each of the activated tags will compute mechanisms mainly leverage the pseudo randomness in the a pseudo random number s as the selected slot number.here slotted ALOHA protocol to effectively reduce the additional s hash(ID,r)mod f,ID and f respectively denote the tag cost of uniquely identifying the tags.Nevertheless,since there ID and frame size.The above "pseudo randomness"implies exist the false positive errors for the polling mechanisms. that,in regard to a specified tag,once the random number r optimized algorithms are proposed to reduce the probability and the frame size f is set,the corresponding slot selected by of false positive errors. this tag inside the frame is determined.This property makes it possible for efficient polling over the tags.Since the tag IDs in the specified tag set can be known in advance,then D.Summing Up Challenges and Opportunities before interrogating the tags,the reader can precompute the According to the above recent research progress,we sum- expected state for each slot accordingly.The expected state marize the challenges and opportunities for anti-collision for each slot can be empty slot,singleton slot or collision slot. algorithms in Table IV,respectively for the tag identification, Therefore,in regard to a specified slot,if this slot is expected estimating the tag size,as well as polling the tags
XIE et al.: MANAGING RFID DATA: CHALLENGES, OPPORTUNITIES AND SOLUTIONS 1299 RFID tags out of a large collection of tags, without reading all the tag data [31]. They propose effective algorithms based on the idea of group testing, which can efficiently derive popular categories of tags. Kodialam et al. propose a privacypreserving estimation mechanism over dynamically moving tags in applications like object tracking [32]. This mechanism can fast count the number of tags moving from location A to B during the time interval (t1, t2). In this way, the traffic can be effectively tracked without exactly reading the tag IDs. In general, research work in this area are relatively rare. With the further extension of RFID applications, efficient techniques for data analysis and mining over RFID systems will attract widespread attentions and in-depth studies. C. Polling the Tags Instead of obtaining all tags’ IDs in the effective scanning range, some applications only require to poll the tags in a specified tag set, verifying whether these tags are missing. For example, in regard to the applications like warehouse management, assume that each of the goods is attached with an RFID tag with unique ID, the administrator should check the specified inventories according to the list of goods. In this situation, it becomes an essential problem to devise efficient polling protocols over the RFID tags, such that both the time efficiency and energy efficiency can be achieved. A straightforward solution is to realize the following “roll call” mechanism: the reader continuously broadcasts the tags’ IDs according to the check list. After each tag ID is broadcasted, the reader will wait for the specified tag to return a short response. If a short response is obtained, then it implies that this tag exists in the scanning range; otherwise, this tag is expected to be missing. However, there are a number of disadvantages for the above mechanism: 1) it is not compatible with the slotted ALOHA; 2) the transmission delay of the tag ID with 96 bits is fairly large, the efficiency is greatly reduced. Therefore, many researchers have investigated into this problem, and tried to design efficient polling protocols based on the slotted ALOHA, aiming to sufficiently reduce the overall scanning time. Note that during each query round of the interrogation, each of the activated tags will “randomly” select a slot to respond inside the frame. However, due to the simplicity of the tag’s design, it is rather difficult to realize the “pure randomness”. Instead, the “pseudo randomness” is used as follows: For each query round the reader first broadcasts a random number r to the tags. After that, each of the activated tags will compute a pseudo random number s as the selected slot number, here s = hash(ID, r)mod f, ID and f respectively denote the tag ID and frame size. The above “pseudo randomness” implies that, in regard to a specified tag, once the random number r and the frame size f is set, the corresponding slot selected by this tag inside the frame is determined. This property makes it possible for efficient polling over the tags. Since the tag IDs in the specified tag set can be known in advance, then before interrogating the tags, the reader can precompute the expected state for each slot accordingly. The expected state for each slot can be empty slot, singleton slot or collision slot. Therefore, in regard to a specified slot, if this slot is expected to be singleton/collision slot but turns out to be empty, then it implies that the tag(s) mapping to this slot is(are) missing. Fig.4 gives an example of the polling mechanism. As the figure shows, the dotted box denotes the missing tag, while the solid box denotes the existing tag. Assume that the tag E,F and H are missing, if the corresponding slots turn out to be empty, then the tags can be determined missing. However, if the tag C is missing, the corresponding slot still turns out to be collision slot due to the existence of the tag B and D. In this situation, the tag C cannot be identified as missing, resulting in the false positive error. The above polling mechanism provides a theoretical foundation to recent research work. Based on the above understanding, Li et al. study a practically important problem of monitoring a large set of RFID tags, i.e., efficiently identifying the missing tags in a large RFID system [33]. Based on the “pseudo randomness”, they design a series of missing-tag identification protocols that employ novel techniques to reduce the execution time. In order to deal with the problem of collecting real-time information from a set of battery-powered active tags, Qiao et al. propose a tag-ordering polling protocol [34] that can reduce per-tag energy consumption by more than an order of magnitude. Moreover, they apply partitioned Bloom filters to further enhance the performance, such that it can achieve much better energy efficiency without degradation in protocol execution time. Chen et al. respectively propose a singlehash and a multi-hash based information collection protocol to address the above problem [35], which dramatically reduces the expected execution time. In order to meet the requirement of prompt and reliable batch authentications in large scale RFID applications, Yang et al. propose an identification-free batch authentication protocol based on the polling mechanism [36]. Conventionally, in order to authenticate a large number of tags, the reader should identify and authenticate the tags one by one, which requires a large amount of scanning time. Based on the “pseudo randomness”, this batch authentication protocol greatly reduces the overall scanning time, while guaranteeing the probability of false positive error to be below the specified threshold. Being different from the above research work, which mainly poll over all tags in the scanning area, Zheng et al. address the problem of fast searching a particular subset in a large number of RFID tags [37]. They utilize Bloom filter-based compact approximators to efficiently aggregate and exchange the tag information with a two-phase approximation protocol, which significantly reduces the searching time. Generally speaking, these polling mechanisms mainly leverage the pseudo randomness in the slotted ALOHA protocol to effectively reduce the additional cost of uniquely identifying the tags. Nevertheless, since there exist the false positive errors for the polling mechanisms, optimized algorithms are proposed to reduce the probability of false positive errors. D. Summing Up Challenges and Opportunities According to the above recent research progress, we summarize the challenges and opportunities for anti-collision algorithms in Table IV, respectively for the tag identification, estimating the tag size, as well as polling the tags
1300 IEEE COMMUNICATIONS SURVEYS TUTORIALS.VOL.16.NO.3.THIRD QUARTER 2014 gAggGg可gggg Expected State Observed State Polling Cannot decide if tag Tag E,F are Tag His C is missing,false Results missing positive error. missing. Fig.4.An example of the polling mechanism in slotted ALOHA-based protocol TABLE IV bits,while the user memory can only store 512 bits.Hence, CHALLENGES AND OPPORTUNITIES FOR ANTI-COLLISION ALGORITHMS the tag's scarce resource cannot support the implementation Mechanism Challenges Opportunities of the above encryption and decryption operations.Therefore, Tag Iden- Ensuring time-efficiency Dynamically adjusting the greatest challenge to the security and privacy protection tification in realistic situations. the parameters like in RFID systems lies in how to implement the authentication the reader's power,the scanning angle,and the and privacy protection protocols in a lightweight approach.In frame size can effectively the following subsections,we elaborate more on the related improve the actual research work respectively in physical mechanism-based so- performance. Estimating Ensuring both the The regular probability lutions,symmetric-key encryption-based solutions,and hash the Tag accuracy and the time- distribution in Monte function-based solutions. Size efficiency for estimation. Carlo method contributes to accurate estimation. Polling Ensuring time-efficiency Pseudo randomness can the Tags while reducing false be leveraged for effective A.Physical Mechanism-based Solutions positive/negative errors. polling. The privacy problem of RFID system is caused by the lack of authentication when the RFID reader interrogates the tags.Without the privacy protection,any RFID reader can IV.AUTHENTICATION AND PRIVACY PROTECTION privately interrogate the surrounding tags to obtain their ID. PROTOCOLS In regard to those encrypted tags which cannot be directly The premise of efficient RFID data management is to identified,they can still be tracked by the illegal readers guarantee the security and privacy of RFID data.The threats according to the backscattered encrypted messages.In order to of RFID systems mainly come from the unauthorized access protect the users'privacy in RFID systems,a straightforward to the tags and the existence of counterfeit tags.Specifically. approach is to utilize the physical mechanism-based solutions. the security problem is how to effectively authenticate the tags which mainly include tag killing,electrostatic screening,active when there exist counterfeit tags;the privacy problem is how jamming and blocking. to prevent the illegal access from the readers to preserve the The tag killing is actually a brute-force operation.In order users'privacy. to prevent a tag from illegal eavesdropping,the reader simply For conventional solutions in network security,there are deactivates a tag by sending a"kill"command with a tag- already some encryption and decryption algorithms like DES,specific PIN.When a tag receives a "kill"command from AES,RSA and ECC.They can implement functions like the reader,it renders itself permanently inoperative.In this encryption and authentication,such that the illegal access,respect,the killing operation leads to loss of primary functions spoofing,eavesdropping and replay attack can be effec-for RFID tags,while preventing arbitrary interrogation and tively resisted.However,they require a large number of tracking from illegal users,since "dead tags tell no tales". logic processing units on the chip,e.g.,AES requires about Apparently,it is not a very reasonable solution to completely 20000~30000 logic gates,whereas RSA and ECC require deactivate a tag.Sarma et al.propose a solution to partially more logic gates to implement their functions. erase the unique part of identification code while keeping the Due to the low cost limitation of RFID tags,conventionally other part including the category ID [38].In this way,the one RFID tag can only have 5000~10000 logic gates.More-tag can be prevented from being tracked,while sacrificing the over,these logic gates are mainly used to implement the basic unique identification.Inoue et al.propose to use a new identifi- functions of the tag,leaving very few gates for the security cation code to replace the former code for unique identification functions.Besides,the tag's on-board memory is also rather [39].The former identification code can be reactivated when limited,conventionally the EPC memory can only store 96 the tag is recycled.In consideration of reusing the tags,the
1300 IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 16, NO. 3, THIRD QUARTER 2014 0 0 1 0 C 0 0 C 1 1 0 0 Tag A Tag B Tag C Tag D Tag E Tag F Tag G Tag H Ă 0 0 1 0 C 0 0 Ă 0 1 0 0 0 Expected State Observed State Polling Results Tag E,F are missing Tag H is missing. Cannot decide if tag C is missing, false positive error. Fig. 4. An example of the polling mechanism in slotted ALOHA-based protocol TABLE IV CHALLENGES AND OPPORTUNITIES FOR ANTI-COLLISION ALGORITHMS Mechanism Challenges Opportunities Tag Identification Ensuring time-efficiency in realistic situations. Dynamically adjusting the parameters like the reader’s power, the scanning angle, and the frame size can effectively improve the actual performance. Estimating the Tag Size Ensuring both the accuracy and the timeefficiency for estimation. The regular probability distribution in Monte Carlo method contributes to accurate estimation. Polling the Tags Ensuring time-efficiency while reducing falsepositive/negative errors. Pseudo randomness can be leveraged for effective polling. IV. AUTHENTICATION AND PRIVACY PROTECTION PROTOCOLS The premise of efficient RFID data management is to guarantee the security and privacy of RFID data. The threats of RFID systems mainly come from the unauthorized access to the tags and the existence of counterfeit tags. Specifically, the security problem is how to effectively authenticate the tags when there exist counterfeit tags; the privacy problem is how to prevent the illegal access from the readers to preserve the users’ privacy. For conventional solutions in network security, there are already some encryption and decryption algorithms like DES, AES, RSA and ECC. They can implement functions like encryption and authentication, such that the illegal access, spoofing, eavesdropping and replay attack can be effectively resisted. However, they require a large number of logic processing units on the chip, e.g., AES requires about 20000∼30000 logic gates, whereas RSA and ECC require more logic gates to implement their functions. Due to the low cost limitation of RFID tags, conventionally one RFID tag can only have 5000∼10000 logic gates. Moreover, these logic gates are mainly used to implement the basic functions of the tag, leaving very few gates for the security functions. Besides, the tag’s on-board memory is also rather limited, conventionally the EPC memory can only store 96 bits, while the user memory can only store 512 bits. Hence, the tag’s scarce resource cannot support the implementation of the above encryption and decryption operations. Therefore, the greatest challenge to the security and privacy protection in RFID systems lies in how to implement the authentication and privacy protection protocols in a lightweight approach. In the following subsections, we elaborate more on the related research work respectively in physical mechanism-based solutions, symmetric-key encryption-based solutions, and hash function-based solutions. A. Physical Mechanism-based Solutions The privacy problem of RFID system is caused by the lack of authentication when the RFID reader interrogates the tags. Without the privacy protection, any RFID reader can privately interrogate the surrounding tags to obtain their ID. In regard to those encrypted tags which cannot be directly identified, they can still be tracked by the illegal readers according to the backscattered encrypted messages. In order to protect the users’ privacy in RFID systems, a straightforward approach is to utilize the physical mechanism-based solutions, which mainly include tag killing, electrostatic screening, active jamming and blocking. The tag killing is actually a brute-force operation. In order to prevent a tag from illegal eavesdropping, the reader simply deactivates a tag by sending a “kill” command with a tagspecific PIN. When a tag receives a “kill” command from the reader, it renders itself permanently inoperative. In this respect, the killing operation leads to loss of primary functions for RFID tags, while preventing arbitrary interrogation and tracking from illegal users, since “dead tags tell no tales”. Apparently, it is not a very reasonable solution to completely deactivate a tag. Sarma et al. propose a solution to partially erase the unique part of identification code while keeping the other part including the category ID [38]. In this way, the tag can be prevented from being tracked, while sacrificing the unique identification. Inoue et al. propose to use a new identifi- cation code to replace the former code for unique identification [39]. The former identification code can be reactivated when the tag is recycled. In consideration of reusing the tags, the