Counting rFld Tags efficiently and Anonymously Hao Hants,Bo Shengt,Chiu C.Tant,Qun Lit,Weizhen Maof,Sanglu Lus fCollege of William and Mary,Williamsburg,VA,USA Northeastern University,Boston,MA,USA SState Key Laboratory of Novel Software Technology,Nanjing University,China Email:fhhan,cct,liqun,wm}@cs.wm.edu,shengbo@ccs.neu.edu,sanglu@nju.edu.cn Abstract-Radio Frequency IDentification (RFID)technology Prior work in [12]and [13]considers this problem by using has attracted much attention due to its variety of applications,e.g. probabilistic estimation based on the framed-slotted ALOHA inventory control and object tracking.One important problem in model.Unfortunately,the scanning time can be considerably RFID systems is how to quickly estimate the number of distinct tags without reading each tag individually.This problem plays a long due to the large frame size required.The performance crucial role in many real-time monitoring and privacy-preserving becomes worse when the mobile tags appear dynamically so applications.In this paper,we present an efficient and anonymous that counting them at a fixed time instant is not possible.That scheme for tag population estimation.This scheme leverages the is because the tags have to be scanned independently with each position of the first reply from a group of tags in a frame.Results counting consuming a long time from mathematical analysis and extensive simulation demonstrate that our scheme outperforms other protocols proposed in the In this paper,we propose a novel scheme for the reader to previous work. quickly estimate the number of distinct tags within a required accuracy.Our scheme is based on a new distinct element I.INTRODUCTION counting method [14],without reading either the actual or Radio Frequency IDentification(RFID)technology is widely pseudo IDs.The main idea of our algorithm is to utilize used in monitoring applications such as inventory control and the position of the first reply from a group of tags in a object tracking [1]-[7].Small RFID tags,each with a unique frame to infer the number of tags.Theoretical analysis and ID,are attached to items under monitoring.An RFID reader extensive simulation show that our scheme outperforms earlier can remotely collect these IDs later for verification.Due to RFID tag estimation schemes.Moreover,our scheme tries to the large number of deployed RFID tags,collecting all tag optimize incremental counting in a mobile environment.Note IDs for verification is inefficient.Some real-time applications. that our approach has a general purpose of counting RFID tags. such as counting the number of tags in a shipping portal,need Combined with other commands,it can be flexibly adopted in more efficient techniques to manage tag data.In this paper,we various applications. consider the problem of efficiently and anonymously estimating Our contributions are summarized as follows the cardinality of a large set of RFID tags with a desired We propose a novel anonymous estimating scheme which accuracy. does not collect the ID from each RFID tag,but is still Efficient techniques for estimating the number of RFID able to estimate the number of tags accurately. tags are important for applications when the time window for We present estimators for both static and dynamic sets of collecting tag data is small.These applications include real- tags.The static set specifies a snapshot of a set of tags, time monitoring or managing a large quantity of products and the dynamic set considers that tags can join or leave For example,a warehouse operator may need to perform a the set with time.Both our estimators are more efficient quick estimation of the number of products left in stock.Such than the existing protocols,even when the cardinality of applications demand efficient estimating schemes instead of the the tag set varies across many orders of magnitude. slow and unnecessary process of reading every tag ID. We propose a novel send-and-reply protocol among the Anonymity is another important issue when dealing with reader and tags to improve performance RFID tags attached to uniquely identifiable items such as The rest of our paper is as follows.Section II contains passports [8]or driver's licenses [9].Either broadcasting tag the related work.Section III presents our problem definition IDs in the open,or revealing IDs to the RFID reader may leak and system model.Section IV outlines the main idea of our personal information.For instance,an adversary could capture schemes.Section V details the algorithms.Our schemes are the communication between the reader and tags or compromise evaluated in Section VI,and Section VII concludes. the reader to track users'activities.Identifying each tag ID increases individual security and privacy risks.An alternative II.RELATED WORK way of providing anonymity is to use cryptographic protocols For a reader to successfully identify every tag in proxim- to mask the actual ID [10],[11].However,the cryptographic ity,collision arbitration protocols must be considered so that techniques require additional modification to the tag hardware, replies from multiple tags will not be garbled due to collision. as well as increase the computational complexity on both tags Collision arbitration protocols are divided into two approaches: and readers. ALOHA-based [15]-[17]and tree-based [18]-[20].In the first
Counting RFID Tags Efficiently and Anonymously Hao Han†§, Bo Sheng‡ , Chiu C. Tan† , Qun Li† , Weizhen Mao† , Sanglu Lu§ †College of William and Mary, Williamsburg, VA, USA ‡Northeastern University, Boston, MA, USA §State Key Laboratory of Novel Software Technology, Nanjing University, China Email: †{hhan, cct, liqun, wm}@cs.wm.edu, ‡ shengbo@ccs.neu.edu, § sanglu@nju.edu.cn Abstract—Radio Frequency IDentification (RFID) technology has attracted much attention due to its variety of applications, e.g., inventory control and object tracking. One important problem in RFID systems is how to quickly estimate the number of distinct tags without reading each tag individually. This problem plays a crucial role in many real-time monitoring and privacy-preserving applications. In this paper, we present an efficient and anonymous scheme for tag population estimation. This scheme leverages the position of the first reply from a group of tags in a frame. Results from mathematical analysis and extensive simulation demonstrate that our scheme outperforms other protocols proposed in the previous work. I. INTRODUCTION Radio Frequency IDentification (RFID) technology is widely used in monitoring applications such as inventory control and object tracking [1]–[7]. Small RFID tags, each with a unique ID, are attached to items under monitoring. An RFID reader can remotely collect these IDs later for verification. Due to the large number of deployed RFID tags, collecting all tag IDs for verification is inefficient. Some real-time applications, such as counting the number of tags in a shipping portal, need more efficient techniques to manage tag data. In this paper, we consider the problem of efficiently and anonymously estimating the cardinality of a large set of RFID tags with a desired accuracy. Efficient techniques for estimating the number of RFID tags are important for applications when the time window for collecting tag data is small. These applications include realtime monitoring or managing a large quantity of products. For example, a warehouse operator may need to perform a quick estimation of the number of products left in stock. Such applications demand efficient estimating schemes instead of the slow and unnecessary process of reading every tag ID. Anonymity is another important issue when dealing with RFID tags attached to uniquely identifiable items such as passports [8] or driver’s licenses [9]. Either broadcasting tag IDs in the open, or revealing IDs to the RFID reader may leak personal information. For instance, an adversary could capture the communication between the reader and tags or compromise the reader to track users’ activities. Identifying each tag ID increases individual security and privacy risks. An alternative way of providing anonymity is to use cryptographic protocols to mask the actual ID [10], [11]. However, the cryptographic techniques require additional modification to the tag hardware, as well as increase the computational complexity on both tags and readers. Prior work in [12] and [13] considers this problem by using probabilistic estimation based on the framed-slotted ALOHA model. Unfortunately, the scanning time can be considerably long due to the large frame size required. The performance becomes worse when the mobile tags appear dynamically so that counting them at a fixed time instant is not possible. That is because the tags have to be scanned independently with each counting consuming a long time. In this paper, we propose a novel scheme for the reader to quickly estimate the number of distinct tags within a required accuracy. Our scheme is based on a new distinct element counting method [14], without reading either the actual or pseudo IDs. The main idea of our algorithm is to utilize the position of the first reply from a group of tags in a frame to infer the number of tags. Theoretical analysis and extensive simulation show that our scheme outperforms earlier RFID tag estimation schemes. Moreover, our scheme tries to optimize incremental counting in a mobile environment. Note that our approach has a general purpose of counting RFID tags. Combined with other commands, it can be flexibly adopted in various applications. Our contributions are summarized as follows. • We propose a novel anonymous estimating scheme which does not collect the ID from each RFID tag, but is still able to estimate the number of tags accurately. • We present estimators for both static and dynamic sets of tags. The static set specifies a snapshot of a set of tags, and the dynamic set considers that tags can join or leave the set with time. Both our estimators are more efficient than the existing protocols, even when the cardinality of the tag set varies across many orders of magnitude. • We propose a novel send-and-reply protocol among the reader and tags to improve performance. The rest of our paper is as follows. Section II contains the related work. Section III presents our problem definition and system model. Section IV outlines the main idea of our schemes. Section V details the algorithms. Our schemes are evaluated in Section VI, and Section VII concludes. II. RELATED WORK For a reader to successfully identify every tag in proximity, collision arbitration protocols must be considered so that replies from multiple tags will not be garbled due to collision. Collision arbitration protocols are divided into two approaches: ALOHA-based [15]–[17] and tree-based [18]–[20]. In the first
approach,the framed-slotted ALOHA (FSA)protocol,which B.System Model is an extension of the pure ALOHA protocol [21],is widely used in RFID standards.Built on that,adaptive FSA protocols, The MAC protocol for our RFID system is based on the where frame size is adaptively adjusted,are explored in [15], adaptive framed-slotted ALOHA model.To read a set of tags,a [22-[241. reader first powers up and transmits continuous wave (CW)to Recent research work [12],[13],[17],[25]is the closest to energize tags.Each tag waits for the reader's command before this paper.A probabilistic analytical model for anonymously replying.This is known as the Reader Talks First mode. estimating tag population is first proposed in [12].The main The communication between the reader and tags is composed idea is to use the framed-slotted ALOHA protocol and monitor of multiple frames.Each frame is partitioned into slots.Here, the number of empty and collision slots to count tags.However, we refer to an individual frame as a round.The reader will first the drawbacks of the estimators in [12]are that all the tags broadcast a begin round command containing the frame size f must be readable by the reader in a single probe and that the in the forthcoming round,and a random seed R.The frame size reader must know approximately the magnitude of the number is the number of slots available for tags to choose in a round of tags to be estimated.Due to these constraints,an Enhanced Each tag picks a slot,and this slot determines when a tag will Zero-Based (EZB)estimator is presented in [13].By tuning reply.An RFID tag uses a hash function h(),f,R,and its ID the parameters for multiple iterations,the number of tags can to pick a slot in the current round,ie.,h(f,R,id)[0,f-1]. be estimated with high accuracy,even when the tag population We assume that the outputs of the hash function have a uniform varies a lot.The key improvement in our work over [12]and random distribution such that the tag has the equal probability [13]is that our scheme does not scan the entire frame,which to select any slot within the round given a seed and ID. drastically reduces time cost.Finally,another novel estimator Each RFID tag has a slot counter which will decrease each for the same problem is proposed in [25]with more focus on time the reader indicates that the current slot has ended.The tag the multiple-reader scenario.However,the scheme requires a will only reply when its slot counter reaches zero.When all the special geometric distribution hash function,which might not slots in the frame have been accounted for,the reader sends an be available in the off-the-shelf RFID systems. end round command to terminate this round.We assume that the reader can issue an end round command to terminate a III.PROBLEM DEFINITION AND SYSTEM MODEL round at any time without waiting for the frame to end.The A.Problem Definition procedure is illustrated in Fig.1.We call this the original send- and-reply protocol. TABLE I NOTATIONS Begin round End slot End round command ommand Symbols Descriptions command Confidence interval Reader <f,R> Error probability t Number of distinct tags Tag#1 #1 Upper bound of the number of tags Tag#2 #2 t Estimation of the number of tags Random variable for the number of continuous empty Tag#3 #3 slots before the first non-empty slot in a frame Frame size (the number of slots in a frame) R Random seed Tag#t e Load factor t/f Empty Slot 无 Number of waiting slots Collision Slot m Number of rounds (frames) -round a】 Hash function TO Theoretical time cost (in number of slots)in a round Fig.1.Collection sequence of passive RFID systems using the adaptive FSA m Number of sets of tags Since every RFID tag chooses its own slot individually,there Given an RFID reader and a set of tags,we want to quickly will be instances where no tag picks a particular slot.We term and accurately estimate the number of distinct RFID tags in this as an empty slot.A slot that has only been chosen by the set without identifying each tag individually.Our algorithms one tag is known as a singleton slot.A slot that is chosen by allow a user to specify his desired accuracy using two variables, more than one tag is called a collision slot.We refer to both a confidence interval e and an error probability 6.Lower values singleton slot and collision slot as non-empty slot in this paper. of e and o result in a more accurate estimation.Our algorithms After collecting all replies,the reader can generate a bitstring. return an estimation t of the actual number of tags t,such such as that Pr[lf-t<et]>1-6.For example,if the set has 5000 RFID tags,and given e=5%and 6=1%,the desired {·1110|11110|11·}, estimator should output the number within [4750,5250]with probability greater than 99%.Table I summarizes the notations where 0 indicates an empty slot,and 1 represents a non-empty used. slot
approach, the framed-slotted ALOHA (FSA) protocol, which is an extension of the pure ALOHA protocol [21], is widely used in RFID standards. Built on that, adaptive FSA protocols, where frame size is adaptively adjusted, are explored in [15], [22]–[24]. Recent research work [12], [13], [17], [25] is the closest to this paper. A probabilistic analytical model for anonymously estimating tag population is first proposed in [12]. The main idea is to use the framed-slotted ALOHA protocol and monitor the number of empty and collision slots to count tags. However, the drawbacks of the estimators in [12] are that all the tags must be readable by the reader in a single probe and that the reader must know approximately the magnitude of the number of tags to be estimated. Due to these constraints, an Enhanced Zero-Based (EZB) estimator is presented in [13]. By tuning the parameters for multiple iterations, the number of tags can be estimated with high accuracy, even when the tag population varies a lot. The key improvement in our work over [12] and [13] is that our scheme does not scan the entire frame, which drastically reduces time cost. Finally, another novel estimator for the same problem is proposed in [25] with more focus on the multiple-reader scenario. However, the scheme requires a special geometric distribution hash function, which might not be available in the off-the-shelf RFID systems. III. PROBLEM DEFINITION AND SYSTEM MODEL A. Problem Definition TABLE I NOTATIONS Symbols Descriptions ǫ Confidence interval δ Error probability t Number of distinct tags tmax Upper bound of the number of tags t˜ Estimation of the number of tags X Random variable for the number of continuous empty slots before the first non-empty slot in a frame f Frame size (the number of slots in a frame) R Random seed ρ Load factor t/f k Number of waiting slots n Number of rounds (frames) h(·) Hash function T(·) Theoretical time cost (in number of slots) in a round m Number of sets of tags Given an RFID reader and a set of tags, we want to quickly and accurately estimate the number of distinct RFID tags in the set without identifying each tag individually. Our algorithms allow a user to specify his desired accuracy using two variables, a confidence interval ǫ and an error probability δ. Lower values of ǫ and δ result in a more accurate estimation. Our algorithms return an estimation t˜ of the actual number of tags t, such that P r[|t˜ − t| ≤ ǫt] ≥ 1 − δ. For example, if the set has 5000 RFID tags, and given ǫ = 5% and δ = 1%, the desired estimator should output the number within [4750, 5250] with probability greater than 99%. Table I summarizes the notations used. B. System Model The MAC protocol for our RFID system is based on the adaptive framed-slotted ALOHA model. To read a set of tags, a reader first powers up and transmits continuous wave (CW) to energize tags. Each tag waits for the reader’s command before replying. This is known as the Reader Talks First mode. The communication between the reader and tags is composed of multiple frames. Each frame is partitioned into slots. Here, we refer to an individual frame as a round. The reader will first broadcast a begin round command containing the frame size f in the forthcoming round, and a random seed R. The frame size is the number of slots available for tags to choose in a round. Each tag picks a slot, and this slot determines when a tag will reply. An RFID tag uses a hash function h(·), f, R, and its ID to pick a slot in the current round, i.e., h(f, R, id) → [0, f −1]. We assume that the outputs of the hash function have a uniform random distribution such that the tag has the equal probability to select any slot within the round given a seed and ID. Each RFID tag has a slot counter which will decrease each time the reader indicates that the current slot has ended. The tag will only reply when its slot counter reaches zero. When all the slots in the frame have been accounted for, the reader sends an end round command to terminate this round. We assume that the reader can issue an end round command to terminate a round at any time without waiting for the frame to end. The procedure is illustrated in Fig. 1. We call this the original sendand-reply protocol. #1 #2 #3 #t Begin round command End slot command End round command Reader ... <f, R> Tag#1 Tag#2 Tag#3 Tag#t Singleton Slot Collision Slot Empty Slot ... st 1 round ... Fig. 1. Collection sequence of passive RFID systems using the adaptive FSA Since every RFID tag chooses its own slot individually, there will be instances where no tag picks a particular slot. We term this as an empty slot. A slot that has only been chosen by one tag is known as a singleton slot. A slot that is chosen by more than one tag is called a collision slot. We refer to both singleton slot and collision slot as non-empty slot in this paper. After collecting all replies, the reader can generate a bitstring, such as { · · · | 1 | 0 | 1 | 1 | 0 | 1 | · · · }, where 0 indicates an empty slot, and 1 represents a non-empty slot
IV.INTUITION A.Basic Algorithm The previous research [12],[13],[17]takes advantage of Again,our algorithm is based on the idea of making obser- the framed-slotted ALOHA protocol to estimate the number vation on the first non-empty slot.However,if the number of of tags.The basic idea is based on the probability model we tags t is small,the position of the first reply may be located have described previously.The reader scans all the slots and at the end of the frame.Apparently,it is not efficient to use records the status of each slot:empty,singleton,or collision. the original send-and-reply protocol described in Section IIl. By examining the number of empty slots,collision slots,the In that protocol,a reader broadcasts f and R at the beginning reader can then estimate the number of tags. of a round,and waits for the first reply from tags.Therefore, This estimation method while powerful,has some limitations. when the first reply is toward the end of the round,the reader The main limitation is the large frame size,which translates to has to wait for the period of time almost equal to the frame a long protocol running time,when there exist a huge amount size. of tags.Suppose for a large tag population but the frame size is To resolve this issue to improve the query efficiency,we considerably small.All the tags'responses will be packed in a propose a new send-and-reply communication protocol among small number of slots,which means that the number of empty reader and tags.Compared to the original protocol,our new slots will become zero and number of collision slots will be protocol can identify the first non-empty slot in O(log2f)time equal to the frame size.To make estimation accurate,the frame slots instead of (f). size should be in proportion to the number of tags.Therefore, The new send-and-reply protocols for reader and tags are scanning the whole frame is inefficient when tag population is shown in Algorithm 1 and 2 respectively.In the protocols,the large.Furthermore,the performance is even worse in mobile reader sends an extra frame range r to all tags.Initially,the environment,where either RFID tag or reader can move.To reader splits the whole frame into two,and sets the first half count tags over a period of time,we have to use a very large frame as the candidate range,the second half frame as the frame size at the beginning,such that we can superimpose all alternative range.The reader always sends out its candidate the frames and guarantee that the number of empty slots is not range to the tags.Each tag evaluates h(f,R,id)and replies zero in the end [13]. immediately if the result is inside the range r.Otherwise,it To overcome the large frame size problem in the previous keeps silent without doing anything.Then the reader checks protocols,we propose a new idea based on a randomized the forthcoming slot.If the slot is empty,which indicates algorithm for counting.Suppose we have n random numbers there is no tag within the candidate range,the reader splits the uniformly and randomly chosen from (0.1).By examining the alternative range into two and picks the first half as the new smallest number,say x,we can estimate n.Intuitively,the candidate range,and the second half as the new alternative smaller z is,the larger n would be.If all the numbers are range.If the slot is not empty,which indicates there is at uniformly laid out,n should be approximated by 1/x.Of least one tag in the candidate range,the reader then splits the course,this estimation is very crude with a very large variance. candidate range into two,and sets the first half as the the new Fortunately,we can run the same process for a sufficiently large candidate range,and the second half as the new alternative number of times,the estimation will become more accurate. range.The above procedure is like a binary search tree as shown More details will be described later. in Fig.2.The reader keeps traversing from the root to the leaves Our scheme does not require the reader to scan the whole and records the path in each iteration.Finally,the reader can frame.Instead,the reader only needs to identify the first non- identify the first non-empty slot using the equation in line 16 empty slot,and uses the number of consecutive empty slots of Algorithm 1,where z;is a 0/1 bit indicating the state of the before that to estimate the number of tags.Again.the fewer ithiteration. the empty slots appear before the first non-empty slot,the more Fig.2 illustrates a simple example with frame size of 16. tags there are.In practice,certain number of iterations of such In the first iteration,the reader sends the frame size 16,search operations are performed,and the mean value is used to achieve an accurate estimation.For example,given, range [0,7],and a random seed to all tags.No tag replies,so the first slot is empty.Then the reader starts the second iteration {0|0|1} →X1=2 with a new range r=[8,11].At this time,at least one tag {1} +X2=0 replies,so the slot is "1".Repeating the same process twice, {0101010101010101011} →X3=9 the reader identifies the first non-empty slot to be 10. where Xi denotes the number of empty slots before the It is not difficult to find that if the number of tags is relatively first reply position in round i.From theoretical analysis and small to the frame size,our new send-and-reply protocol is extensive simulation,we find even though multiple iterations more efficient than the original protocol.Otherwise,the original are required for accuracy,the total time is still much shorter protocol is better.Therefore,we combine both of them to than the schemes in prior work. determine X.In the combined send-and-reply protocol,we define the number of waiting slots k.At every round,the V.ANONYMOUS ESTIMATING ALGORITHMS original protocol is tried first.Only when there is no reply In this section,we describe our novel RFID tag estimating within k slots,we turn to use our new protocol.So in the scheme,First Non-Empty slots Based (FNEB)estimator. worst case,only k+log2 f slots are required
IV. INTUITION The previous research [12], [13], [17] takes advantage of the framed-slotted ALOHA protocol to estimate the number of tags. The basic idea is based on the probability model we have described previously. The reader scans all the slots and records the status of each slot: empty, singleton, or collision. By examining the number of empty slots, collision slots, the reader can then estimate the number of tags. This estimation method while powerful, has some limitations. The main limitation is the large frame size, which translates to a long protocol running time, when there exist a huge amount of tags. Suppose for a large tag population but the frame size is considerably small. All the tags’ responses will be packed in a small number of slots, which means that the number of empty slots will become zero and number of collision slots will be equal to the frame size. To make estimation accurate, the frame size should be in proportion to the number of tags. Therefore, scanning the whole frame is inefficient when tag population is large. Furthermore, the performance is even worse in mobile environment, where either RFID tag or reader can move. To count tags over a period of time, we have to use a very large frame size at the beginning, such that we can superimpose all the frames and guarantee that the number of empty slots is not zero in the end [13]. To overcome the large frame size problem in the previous protocols, we propose a new idea based on a randomized algorithm for counting. Suppose we have n random numbers uniformly and randomly chosen from (0,1). By examining the smallest number, say x, we can estimate n. Intuitively, the smaller x is, the larger n would be. If all the numbers are uniformly laid out, n should be approximated by 1/x. Of course, this estimation is very crude with a very large variance. Fortunately, we can run the same process for a sufficiently large number of times, the estimation will become more accurate. More details will be described later. Our scheme does not require the reader to scan the whole frame. Instead, the reader only needs to identify the first nonempty slot, and uses the number of consecutive empty slots before that to estimate the number of tags. Again, the fewer the empty slots appear before the first non-empty slot, the more tags there are. In practice, certain number of iterations of such operations are performed, and the mean value is used to achieve an accurate estimation. For example, given, { 0 | 0 | 1 } → X1 = 2 { 1 } → X2 = 0 { 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 } → X3 = 9 where Xi denotes the number of empty slots before the first reply position in round i. From theoretical analysis and extensive simulation, we find even though multiple iterations are required for accuracy, the total time is still much shorter than the schemes in prior work. V. ANONYMOUS ESTIMATING ALGORITHMS In this section, we describe our novel RFID tag estimating scheme, First Non-Empty slots Based (FNEB) estimator. A. Basic Algorithm Again, our algorithm is based on the idea of making observation on the first non-empty slot. However, if the number of tags t is small, the position of the first reply may be located at the end of the frame. Apparently, it is not efficient to use the original send-and-reply protocol described in Section III. In that protocol, a reader broadcasts f and R at the beginning of a round, and waits for the first reply from tags. Therefore, when the first reply is toward the end of the round, the reader has to wait for the period of time almost equal to the frame size. To resolve this issue to improve the query efficiency, we propose a new send-and-reply communication protocol among reader and tags. Compared to the original protocol, our new protocol can identify the first non-empty slot in O(log2f) time slots instead of O(f). The new send-and-reply protocols for reader and tags are shown in Algorithm 1 and 2 respectively. In the protocols, the reader sends an extra frame range r to all tags. Initially, the reader splits the whole frame into two, and sets the first half frame as the candidate range, the second half frame as the alternative range. The reader always sends out its candidate range to the tags. Each tag evaluates h(f, R, id) and replies immediately if the result is inside the range r. Otherwise, it keeps silent without doing anything. Then the reader checks the forthcoming slot. If the slot is empty, which indicates there is no tag within the candidate range, the reader splits the alternative range into two and picks the first half as the new candidate range, and the second half as the new alternative range. If the slot is not empty, which indicates there is at least one tag in the candidate range, the reader then splits the candidate range into two, and sets the first half as the the new candidate range, and the second half as the new alternative range. The above procedure is like a binary search tree as shown in Fig. 2. The reader keeps traversing from the root to the leaves and records the path in each iteration. Finally, the reader can identify the first non-empty slot using the equation in line 16 of Algorithm 1, where zi is a 0/1 bit indicating the state of the i th iteration. Fig. 2 illustrates a simple example with frame size of 16. In the first iteration, the reader sends the frame size 16, search range [0, 7], and a random seed to all tags. No tag replies, so the first slot is empty. Then the reader starts the second iteration with a new range r = [8, 11]. At this time, at least one tag replies, so the slot is “1”. Repeating the same process twice, the reader identifies the first non-empty slot to be 10. It is not difficult to find that if the number of tags is relatively small to the frame size, our new send-and-reply protocol is more efficient than the original protocol. Otherwise, the original protocol is better. Therefore, we combine both of them to determine X. In the combined send-and-reply protocol, we define the number of waiting slots k. At every round, the original protocol is tried first. Only when there is no reply within k slots, we turn to use our new protocol. So in the worst case, only k + log2 f slots are required
Algorithm 1 New send-and-reply protocol for the reader of the FNEB estimator is shown in Algorithm 3.The algorithm 1:if f is not a power of 2 then takes tmar,6 and e as inputs,where tmaz denotes the upper 2: f =2flog2f bound of tag population t.Initially,the reader computes pa- 3:end if rameters f,k,and n by inputs,and then applies the combined 4:a=0,b=f/2-1 send-and-reply protocol n rounds to obtain the average value of 5:Set the search range r [a,b]and random seed R X.denoted by Y.At last,the estimation t is calculated below: 6:for i=1 to log2 f do 7: Reader broadcasts r,f,and R,and listens in the forth- 1+Y coming slot for reply (only one slot) i=f.mn-y (1) 8: if the slot is EMPTY then 9 21=0 10: a=b+1,b=b+rl/2,and updates r 1 else Algorithm 3 FNEB estimator for static tag set 12: 21=1 b=(b-1)/2,and updates r INPUT:tmaz,6,and e 13: OUTPUT:t 14 end if 1:Compute the frame size f and waiting slots k 15:end for i6:ReturnX=∑ef(1-2)·2lo8ef-i 2:Compute the number of rounds n 3: for i =1 to n do 4: Generate a new random seed Ri Algorithm 2 New send-and-reply protocol for each tag 5: Broadcast (f,Ri)to all tags and wait their replies 1:Receive range r,f,and R from reader 6: Run the original send-and-reply protocol 2:Compute slot number sn=h(f.R,id) if receive reply before kth slot then 3:if sn is inside r then 8 Xi slot number of first reply -1 4: Reply immediately 9 else 5:else 10: Run the new send-and-reply protocol 6: Keep silent X;value returned by Algorithm 1 7:end if 12: end if 13:end for Frame Size=16 Empty Slots 14:Add all Xi and get the average Y=Xi/n L-0 15:Return fIn 1tY 3 4 In the next two subsections,we will explain why this algo- ④。5 rithm can achieve the desired accurate estimation and how to 16 0(6 07 compute parameters f,k,and n(lines 1 and 2 in Algorithm 3). 8 To ease understanding,we first present the mathematics behind 0 -10 the algorithm and how to pick parameter n.We then describe 11 12 how to determine f and k. 013 。15 14 B.Pick n 0101☐ Z3 The value of n directly determines the performance of our Fig.2.Illustration of our new send-and-reply protocol scheme.If n is too small,the estimated f cannot meet the de- sired accuracy.However,a large n will increase the estimation time.Next,we first present the theoretical underpinnings for Our combined send-and-reply protocol requires a slight mod- the FNEB algorithm,followed by the bounds for n that can ification to existing RFID tags.We add an optional bit mask satisfy the accuracy requirement. to indicate the search range r in each end slot command sent Given the frame size f,each tag has the probability to by the reader.If the parameter is set to a valid range,those select a specific slot in the frame.For t tags in total,the tags who pick a response slot inside the range will reply in probability of a certain slot to be empty (denoted as Po)is the forthcoming slot,no matter what value their slot counters Po=(1-).Since f is normally large,Po can be simplified are.If the parameter is set to null,the original send-and-reply to Poe-,where p=t.We call p the load factor.Let protocol is then used. the random variable X be the number of consecutive empty With the basic idea described above,the complete algorithm slots before the first non-empty slot in a frame.We then have
Algorithm 1 New send-and-reply protocol for the reader 1: if f is not a power of 2 then 2: f = 2⌈log2 f⌉ 3: end if 4: a = 0, b = f /2 − 1 5: Set the search range r = [a, b] and random seed R 6: for i = 1 to log2 f do 7: Reader broadcasts r, f, and R, and listens in the forthcoming slot for reply (only one slot) 8: if the slot is EMPTY then 9: zi = 0 10: a = b + 1, b = b + |r|/2, and updates r 11: else 12: zi = 1 13: b = (b − 1)/2, and updates r 14: end if 15: end for 16: Return X = Plog2 f i=1 (1 − zi) · 2 log2 f−i Algorithm 2 New send-and-reply protocol for each tag 1: Receive range r, f, and R from reader 2: Compute slot number sn = h(f, R, id) 3: if sn is inside r then 4: Reply immediately 5: else 6: Keep silent 7: end if z1 z2 z3 z4 [0~7] [0~3] 0 2 4 6 8 2 3 4 1 0 1 1 0 0 1 0 1 0 1 0 1 0 1 0 5 6 7 8 9 11 12 13 15 14 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 Empty Slots 0 1 0 1 10 Frame Size = 16 1 10 12 14 [12~13] [8~11] [8~9] [4~5] [0~1] Fig. 2. Illustration of our new send-and-reply protocol Our combined send-and-reply protocol requires a slight modification to existing RFID tags. We add an optional bit mask to indicate the search range r in each end slot command sent by the reader. If the parameter is set to a valid range, those tags who pick a response slot inside the range will reply in the forthcoming slot, no matter what value their slot counters are. If the parameter is set to null, the original send-and-reply protocol is then used. With the basic idea described above, the complete algorithm of the FNEB estimator is shown in Algorithm 3. The algorithm takes tmax, δ and ǫ as inputs, where tmax denotes the upper bound of tag population t. Initially, the reader computes parameters f, k, and n by inputs, and then applies the combined send-and-reply protocol n rounds to obtain the average value of X, denoted by Y . At last, the estimation t˜ is calculated below: t˜= f · ln 1 + Y Y (1) Algorithm 3 FNEB estimator for static tag set INPUT: tmax, δ, and ǫ OUTPUT: t˜ 1: Compute the frame size f and waiting slots k 2: Compute the number of rounds n 3: for i = 1 to n do 4: Generate a new random seed Ri 5: Broadcast (f, Ri) to all tags and wait their replies 6: Run the original send-and-reply protocol 7: if receive reply before kth slot then 8: Xi = slot number of first reply - 1 9: else 10: Run the new send-and-reply protocol 11: Xi = value returned by Algorithm 1 12: end if 13: end for 14: Add all Xi and get the average Y = Pn i=1 Xi/n 15: Return t˜= f ln 1+Y Y In the next two subsections, we will explain why this algorithm can achieve the desired accurate estimation and how to compute parameters f, k, and n (lines 1 and 2 in Algorithm 3). To ease understanding, we first present the mathematics behind the algorithm and how to pick parameter n. We then describe how to determine f and k. B. Pick n The value of n directly determines the performance of our scheme. If n is too small, the estimated t˜ cannot meet the desired accuracy. However, a large n will increase the estimation time. Next, we first present the theoretical underpinnings for the FNEB algorithm, followed by the bounds for n that can satisfy the accuracy requirement. Given the frame size f, each tag has the probability 1 f to select a specific slot in the frame. For t tags in total, the probability of a certain slot to be empty (denoted as P0) is P0 = (1− 1 f ) t . Since f is normally large, P0 can be simplified to P0 ≈ e −ρ , where ρ = t f . We call ρ the load factor. Let the random variable X be the number of consecutive empty slots before the first non-empty slot in a frame. We then have
Pr[X u]=Po(1-Po).The expectation of X is is asymptotically normal with mean 0 and variance 1;that is, f-1 f-1 Z satisfies the standard normal distribution and its cumulative E(X)=∑uPr(X=)=∑uP1-R) distribution function is u=0 2=0 1 f-1)P时+1-fPH+B Φ(x)= V2-心 e-号du. 1-Po We can find a constant c which makes Po(1-P6)-fF8 1-P% Pr[-c≤Z≤d=Φ(c)-Φ(-c) Since that0<A<l,then Po→0 and fPo→0 when f is =erf(c/V②)=1-6, large.So E(X)can be further simplified to where erf is the error function [27].By solving the formulation P 1 E(X)≈1-R=eP-I (2) above,we get the value of c.For example,if 6 =1%,then c=2.576.Thus,the desired accuracy can be rewritten as Correspondingly,the variance of X is Pr[t-t≤et=Pr[(1-e)t≤t≤(1+e) f-1 Var(X) (u-E(X)2Pr(X=u) =Pr[(1-e)t≤fn 1+Y ≤(1+e) u=0 P e-(1+e)p e-(1-e)p 中 (1-P)P (3) =Prl2e-+y≤1e--l According to the intuitive relation between E(X)and t,the Therefore,if we have observation of X can be used to estimate t.However,there e-(1+c)p -e-+9F-μ e-(1-c)p exists variance between the observed value of X and E(X). ≤-cand-e--o-4 ≥c By the law of large number [26],the estimation becomes more 0 accurate when the number ofobservations gets larger.We define arandom procesY=∑-,÷sthe man ofobervations. then we can guarantee Pr[lt-t et]>1-6.Combining o and Eq.3 to solve the inequalities,we get where Xi is the random variable X for the ith observation. Note that E(Xi)=E(X)and Var(X;)=Var(X).Since n≥ c2e-o(ep-e-ce)2 the reader gives a different random seed in each broadcast,Xi (1-e-p)2 (1 <i<n)is independent with each other.Therefore,we have ■ EY)=∑EX=nEX =E(X) In practice,the number of tags t is not known a priori. making it difficult to predict the exact number of rounds. and However,the minimum number of rounds n is a monotonically Var(Y)=Var(i)=nVar(X)Var(x) increasing function against the load factor p;that is,the number of rounds calculated by t=tma is large enough for the actual n2 n2 t.Therefore,n in line 2 of Algorithm 3 is computed by Since that E(Y)=E(X),by solving Eq.2 for t,we get n=etma (ctmal-cem t=f.In 1+E(Y) (4) E(Y) (1-e-etmaz/f)2 Then,according to Eq.1,by substituting Y for E(Y),we have C.Determine Optimal Parameters f and k i=f.mnl+y The estimating time of our algorithms is affected by two y factors:the number of rounds and the time cost in each round. Next,we will show how to use Var(Y)to compute the tight Here,the time cost is measured by the number of slots.From the discussion above.we find that the number of rounds n is bound of parameter n. dependent on the frame size f.The time cost in a round is either Theorem 1.Given 6.e,and p.if the number of rounds n is x+1*(if the number of empty slots observed in that round not less than the agorithm described above is smaller than korThat relies on both fand can guarantee the accuracy requirement,that is,Pr-t Hence,if we select inappropriate f and k,the performance of e≥1-6. our scheme will be adversely affected.Our remaining problem is to determine the "best"value for parameter f and k on a Proof:We use u and a to denote the expectation and given upper bound tmaz. standard variance of Y,i.e.,u=E(Y)and o =VVar(Y)= Remember that the probability of the random variable X VVar(X)/n.By the central limit theorem,we know equals to u is Po(1-Po),where Po e-t/f.We use the z=Y-4 *Note that one additional slot is needed for the first non-empty slot
P r[X = u] = P u 0 (1 − P0). The expectation of X is E(X) = f X−1 u=0 uP r(X = u) = f X−1 u=0 uP u 0 (1 − P0) = (f − 1)P f+1 0 − fPf 0 + P0 1 − P0 = P0 1 − P0 (1 − P f 0 ) − fPf 0 . Since that 0 < P0 < 1, then P f 0 → 0 and fPf 0 → 0 when f is large. So E(X) can be further simplified to E(X) ≈ P0 1 − P0 = 1 e ρ − 1 . (2) Correspondingly, the variance of X is V ar(X) = f X−1 u=0 (u − E(X))2P r(X = u) ≈ P0 (1 − P0) 2 . (3) According to the intuitive relation between E(X) and t, the observation of X can be used to estimate t. However, there exists variance between the observed value of X and E(X). By the law of large number [26], the estimation becomes more accurate when the number of observations gets larger. We define a random process Y = Pn i=1 Xi n as the mean of n observations, where Xi is the random variable X for the i th observation. Note that E(Xi) = E(X) and V ar(Xi) = V ar(X). Since the reader gives a different random seed in each broadcast, Xi (1 ≤ i ≤ n) is independent with each other. Therefore, we have E(Y ) = Pn i=1 E(Xi) n = nE(X) n = E(X) and V ar(Y ) = V ar( Pn i=1 X i) n2 = nV ar(X) n2 = V ar(X) n . Since that E(Y ) = E(X), by solving Eq. 2 for t, we get t = f · ln1 + E(Y ) E(Y ) . (4) Then, according to Eq. 1, by substituting Y for E(Y ), we have t˜= f · ln1 + Y Y . Next, we will show how to use V ar(Y ) to compute the tight bound of parameter n. Theorem 1. Given δ, ǫ, and ρ, if the number of rounds n is not less than c 2 e −ρ (e ρ−e −ǫρ) 2 (1−e−ǫρ) 2 , the algorithm described above can guarantee the accuracy requirement, that is, P r[|t˜− t| ≤ ǫt] ≥ 1 − δ. Proof: We use µ and σ to denote the expectation and standard variance of Y , i.e., µ = E(Y ) and σ = p p V ar(Y ) = V ar(X)/n. By the central limit theorem, we know Z = Y − µ σ is asymptotically normal with mean 0 and variance 1; that is, Z satisfies the standard normal distribution and its cumulative distribution function is Φ(x) = 1 √ 2π Z x −∞ e − u2 2 du. We can find a constant c which makes P r[−c 6 Z 6 c] = Φ(c) − Φ(−c) = erf(c/√ 2) = 1 − δ, where erf is the error function [27]. By solving the formulation above, we get the value of c. For example, if δ = 1%, then c = 2.576. Thus, the desired accuracy can be rewritten as P r[|t˜− t| 6 ǫt] = P r[(1 − ǫ)t 6 t˜6 (1 + ǫ)t] = P r[(1 − ǫ)t 6 f ln 1 + Y Y 6 (1 + ǫ)t] = P r[ e −(1+ǫ)ρ 1 − e−(1+ǫ)ρ 6 Y 6 e −(1−ǫ)ρ 1 − e−(1−ǫ)ρ ]. Therefore, if we have e −(1+ǫ)ρ 1−e−(1+ǫ)ρ − µ σ 6 −c and e −(1−ǫ)ρ 1−e−(1−ǫ)ρ − µ σ > c, then we can guarantee P r[|t˜− t| 6 ǫt] > 1 − δ. Combining σ and Eq. 3 to solve the inequalities, we get n > c 2 e −ρ (e ρ − e −ǫρ) 2 (1 − e−ǫρ) 2 . In practice, the number of tags t is not known a priori, making it difficult to predict the exact number of rounds. However, the minimum number of rounds n is a monotonically increasing function against the load factor ρ; that is, the number of rounds calculated by t = tmax is large enough for the actual t. Therefore, n in line 2 of Algorithm 3 is computed by n = c 2 · e −tmax/f · (e tmax/f − e −ǫtmax/f ) 2 (1 − e−ǫtmax/f ) 2 C. Determine Optimal Parameters f and k The estimating time of our algorithms is affected by two factors: the number of rounds and the time cost in each round. Here, the time cost is measured by the number of slots. From the discussion above, we find that the number of rounds n is dependent on the frame size f. The time cost in a round is either x + 1∗ (if the number of empty slots observed in that round is smaller than k) or k + log2f. That relies on both f and k. Hence, if we select inappropriate f and k, the performance of our scheme will be adversely affected. Our remaining problem is to determine the “best” value for parameter f and k on a given upper bound tmax. Remember that the probability of the random variable X equals to u is P u 0 (1 − P0), where P0 = e −t/f . We use the ∗Note that one additional slot is needed for the first non-empty slot