IEEE RFID 2010 Privacy Protection for RFID-based Tracking Systems Chiu C.Tan Qun Li Lei Xie Department of Computer Science Department of Computer Science State Key Laboratory of College of William and Mary College of William and Mary Novel Software Technology cct@cs.wm.edu liqun@cs.wm.edu Nanjing University xielei@dislab.nju.edu.cn Abstract-RFID technology is increasingly being deployed in to the database.A user wanting to know for instance who was ubiquitous computing environments for object tracking and at a meeting in Office 1 at 1:00 pm can query the database localization.Existing tracking architecture usually assumes the to determine the IDs of the tags,and thus the identity of the use of a trusted server which is invulnerable to compromise by internal and external adversaries.However,maintaining such meeting participants. a trusted server is unlikely in the real world.In this paper, The potential for abuse of such a powerful tracking system we consider the problem of adding privacy protection to object is apparent.The straightforward approach is to limit the access tracking systems built upon passive RFID tags,without relying to the database data using passwords.Each user is associated on a trusted server assumption.Our protocol continues to protect user privacy in the event of partial compromise of a server. with a list of RFID tags they are authorized to access,and must supply the correct password when querying the database. I.INTRODUCTION For example,a user with ID=101 running a query "Select location From database Where ID=101 and Time=time"must The low cost and rugged design of RFID tags make them first supply the appropriate permissions for his tag before suitable to be attached to everyday objects such as key chains the location information is released.More advance techniques and coffee mugs.With every object embedded with its own such as role-based access control [3],[4].and Hippocratic RFID tag,ubiquitous computing concepts like object tracking databases [5],can also be used to protect the database. and localization or location based services can be realized [1]. This type of solutions have a common requirement.A 2],allowing many new applications to be developed upon this trusted database is needed to maintain and enforce the infrastructure.At the same time,widespread RFID deployment privacy policy.Without a database that is invulnerable to creates privacy risks for everyone,not just a few willing early compromise by criminal hackers or disgruntled employees, adopters,thus the need for privacy protection becomes critical. and always managed by competent system administrators, Wired connection we cannot protect user privacy.We believe that maintaining such a trusted server is difficult in practice.Even three letter government agencies with strong emphasis on security have Database Comidor office r been vulnerable to database breaches or mistakes [6].[7].[8]. 0 In this paper,we consider the problem of how to build Radio range Office 2 1 a tracking system that does not rely on a trusted database Comridor 2 to protect user privacy.Our solution divides up the database 8- operations into separate servers such that we can tolerate ΣRFID reader Otfice 3 the compromise of any one of the databases.Our protocols O RFID tag 8- are designed to take into account the computationally weak (attached toa person or thing Office 4 capabilities of an RFID tag,and allow a user to efficiently query the database for answers. Fig.1. Typical RFID tracking system We make the following contributions in this paper.We propose a technique to divide a database query,originally Fig.I shows an environment where RFID tags are attached intended to a single database,into multiple databases such to people and objects.Each RFID tag has an associated unique that if only one of the database is compromised,the user's ID which represents that tag in the system.A network of privacy is protected.We also consider how to detect,but not RFID readers are deployed in the building,with each reader prevent,adversary disruption attacks such as the data deletion associated with a specific location,such as "Office 1"or and manipulation. "Corridor 2".The readers are connected to a database server The rest of the paper is as follows.We formulate our prob- via a wired network.A reader will periodically broadcast a lem and adversary model in Section II.Section III introduces query,and all the RFID tags within the vicinity will respond two strawman protocols to motivate our solutions.Sections IV to that reader.The reader will then forward these tag responses presents our protocol against different types of adversaries. 978-1-4244-5743-4/10/$26.00©20101EEE 53
Privacy Protection for RFID-based Tracking Systems Chiu C. Tan Department of Computer Science College of William and Mary cct@cs.wm.edu Qun Li Department of Computer Science College of William and Mary liqun@cs.wm.edu Lei Xie State Key Laboratory of Novel Software Technology Nanjing University xielei@dislab.nju.edu.cn Abstract—RFID technology is increasingly being deployed in ubiquitous computing environments for object tracking and localization. Existing tracking architecture usually assumes the use of a trusted server which is invulnerable to compromise by internal and external adversaries. However, maintaining such a trusted server is unlikely in the real world. In this paper, we consider the problem of adding privacy protection to object tracking systems built upon passive RFID tags, without relying on a trusted server assumption. Our protocol continues to protect user privacy in the event of partial compromise of a server. I. INTRODUCTION The low cost and rugged design of RFID tags make them suitable to be attached to everyday objects such as key chains and coffee mugs. With every object embedded with its own RFID tag, ubiquitous computing concepts like object tracking and localization or location based services can be realized [1], [2], allowing many new applications to be developed upon this infrastructure. At the same time, widespread RFID deployment creates privacy risks for everyone, not just a few willing early adopters, thus the need for privacy protection becomes critical. Database RFID reader RFID tag (attached to a person or thing) Office 1 Office 2 Office 3 Office 4 Corridor 1 Corridor 2 Wired connection Radio range Fig. 1. Typical RFID tracking system Fig. 1 shows an environment where RFID tags are attached to people and objects. Each RFID tag has an associated unique ID which represents that tag in the system. A network of RFID readers are deployed in the building, with each reader associated with a specific location, such as “Office 1” or “Corridor 2”. The readers are connected to a database server via a wired network. A reader will periodically broadcast a query, and all the RFID tags within the vicinity will respond to that reader. The reader will then forward these tag responses to the database. A user wanting to know for instance who was at a meeting in Office 1 at 1:00 pm can query the database to determine the IDs of the tags, and thus the identity of the meeting participants. The potential for abuse of such a powerful tracking system is apparent. The straightforward approach is to limit the access to the database data using passwords. Each user is associated with a list of RFID tags they are authorized to access, and must supply the correct password when querying the database. For example, a user with ID=101 running a query “Select location From database Where ID=101 and Time=time” must first supply the appropriate permissions for his tag before the location information is released. More advance techniques such as role-based access control [3], [4], and Hippocratic databases [5], can also be used to protect the database. This type of solutions have a common requirement. A trusted database is needed to maintain and enforce the privacy policy. Without a database that is invulnerable to compromise by criminal hackers or disgruntled employees, and always managed by competent system administrators, we cannot protect user privacy. We believe that maintaining such a trusted server is difficult in practice. Even three letter government agencies with strong emphasis on security have been vulnerable to database breaches or mistakes [6], [7], [8]. In this paper, we consider the problem of how to build a tracking system that does not rely on a trusted database to protect user privacy. Our solution divides up the database operations into separate servers such that we can tolerate the compromise of any one of the databases. Our protocols are designed to take into account the computationally weak capabilities of an RFID tag, and allow a user to efficiently query the database for answers. We make the following contributions in this paper. We propose a technique to divide a database query, originally intended to a single database, into multiple databases such that if only one of the database is compromised, the user’s privacy is protected. We also consider how to detect, but not prevent, adversary disruption attacks such as the data deletion and manipulation. The rest of the paper is as follows. We formulate our problem and adversary model in Section II. Section III introduces two strawman protocols to motivate our solutions. Sections IV presents our protocol against different types of adversaries. IEEE RFID 2010 978-1-4244-5743-4/10/$26.00 ©2010 IEEE 53
TABLE I Section V considers additional security issues not directly UNENCRYPTED TABLE IN DATABASE related to user privacy.Finally we review the related work in Section VI and conclude in Section VII D Time Location 101 10.00am Office 1■ II.RELATED WORK 10210.00am Office 3 101 10:5am Office 2 Privacy protection is an important component in ubiquitous computing environments [9],[10].The technique of using mix zones was proposed by [11],[12]where by users could specify certain areas where nobody could trace their movements.Other time is the time that tag was read,and location is the physical researchers [13],[14],[15]proposed techniques to help users location corresponding to that particular reader.We assume define privacy policies.This approach is more flexible than that the database knows the locations of all the RFID readers. mix zones at the cost of additional complexity in specifying and can associate the data from a reader to a location. the policies.The idea of allowing users to specify "virtual The database can represent the data from all the RFID walls"was suggested by [16]to simplify the creation of a readers as a table with 3 attributes,ID,Time,and Location. privacy policy.Physical access control [17],[18]addresses Table I illustrates this database.A user who own tag 101,can the problem of specifying a privacy policy.Users can only query this database later by issuing a query, obtain the location information of people that are were present together at the same time.The intuition is that users that were Select from DB where ID=101 and Time=10:00 am at the same location at the same time already know each and determine he was at Office 1 at 10 am. other's presence,and thus there is no privacy issues when releasing that information later.Our work differs from these Such a setup provides no privacy protections,since anyone proposed techniques in that we do not rely on trusted servers to can query the database to determine anybody's movements. protect user privacy.Our idea of separating location,time,and Under the trusted server assumption,we can protect privacy identity is similar to that proposed by [19].but our solutions by associating a password with each ID.The user running are designed to work with RFID tags. the above query for instance,will have to supply the correct RFID security is an active area of research with many password associated with ID=101,before the database will different protocols being proposed [20],[21].[22].While our release the information.While more complex schemes can paper also proposes a simple security protocol,our focus is be designed to provide better access control,a fundamental less on the security and privacy between RFID reader and tag, requirement is that the database cannot be compromised.An but oriented more towards data already collected and archived. adversary with access to Table I learns everything. Closely related to our paper is research on searching en- In this paper,we consider the problem of how to protect crypted data.In this problem,a user encrypts his data and privacy in the event of such a server is compromised by an stores it at an untrusted server.The user wants to be able to adversary. search of part of his data in an efficient manner.Since the A.Adversary model server is untrustworthy,the user cannot send over his secret key.The user also cannot request the server to transmit all the We assume that the adversary seeks only to track the encrypted data back since it is inefficient.An search system movements of a user.The adversary succeeds if he is able using symmetric key to encrypt data was proposed by [23],to extract the identity of the user from the database,or if he while [24]suggested a public key based scheme.Practical is able to link two entries in the database to the same user. encrypted database query retrieval systems were proposed We assume that the adversary can have free access to the by [25],[26].However,unlike our paper,prior research in this database data such as in Table I,as well as observe the area do not consider the privacy implications of ubiquitous database interactions between a user and the database.The environments such as malicious tracking of users.This was adversary is however,unable to determine the identity of shown in the second strawman approach by using [26]as someone querying the database.For instance,the adversary an example.Furthermore,these prior techniques assume that cannot deduce a user identity through the MAC address of more advance hardware such as laptops are used,rather than the user's device when the user is querying the database. computational weak RFID tags. Techniques such as an anonymizing network [27]can be deployed to achieve this.Also,the adversary cannot reprogram III.PROBLEM FORMULATION the database to execute functions it otherwise will not perform. We consider a network of RFID readers,R1,...,Rn,are In our paper,we assume that the RFID tags are able to per- deployed throughout a facility.Each reader is programmed form simple operations such as generating random numbers, to periodically broadcast a query to read all the RFID tags perform a hash function,and XOR two bitstrings together. within its vicinity.The captured data is then forwarded to a These are common assumptions made in RFID security lit- backend database for storage.In the unencrypted scenario,the erature [28].[29],[30].While our solution in the paper is database stores this information as a ID:time location presented using hash functions,symmetric key encryption can tuple,where ID is a number that identifies that RFID tag be substituted instead with minor modifications. 54
Section V considers additional security issues not directly related to user privacy. Finally we review the related work in Section VI and conclude in Section VII. II. RELATED WORK Privacy protection is an important component in ubiquitous computing environments [9], [10]. The technique of using mix zones was proposed by [11], [12] where by users could specify certain areas where nobody could trace their movements. Other researchers [13], [14], [15] proposed techniques to help users define privacy policies. This approach is more flexible than mix zones at the cost of additional complexity in specifying the policies. The idea of allowing users to specify “virtual walls” was suggested by [16] to simplify the creation of a privacy policy. Physical access control [17], [18] addresses the problem of specifying a privacy policy. Users can only obtain the location information of people that are were present together at the same time. The intuition is that users that were at the same location at the same time already know each other’s presence, and thus there is no privacy issues when releasing that information later. Our work differs from these proposed techniques in that we do not rely on trusted servers to protect user privacy. Our idea of separating location, time, and identity is similar to that proposed by [19], but our solutions are designed to work with RFID tags. RFID security is an active area of research with many different protocols being proposed [20], [21], [22]. While our paper also proposes a simple security protocol, our focus is less on the security and privacy between RFID reader and tag, but oriented more towards data already collected and archived. Closely related to our paper is research on searching encrypted data. In this problem, a user encrypts his data and stores it at an untrusted server. The user wants to be able to search of part of his data in an efficient manner. Since the server is untrustworthy, the user cannot send over his secret key. The user also cannot request the server to transmit all the encrypted data back since it is inefficient. An search system using symmetric key to encrypt data was proposed by [23], while [24] suggested a public key based scheme. Practical encrypted database query retrieval systems were proposed by [25], [26]. However, unlike our paper, prior research in this area do not consider the privacy implications of ubiquitous environments such as malicious tracking of users. This was shown in the second strawman approach by using [26] as an example. Furthermore, these prior techniques assume that more advance hardware such as laptops are used, rather than computational weak RFID tags. III. PROBLEM FORMULATION We consider a network of RFID readers, R1, · · · , Rn, are deployed throughout a facility. Each reader is programmed to periodically broadcast a query to read all the RFID tags within its vicinity. The captured data is then forwarded to a backend database for storage. In the unencrypted scenario, the database stores this information as a ID : time : location tuple, where ID is a number that identifies that RFID tag, TABLE I UNENCRYPTED TABLE IN DATABASE ID Time Location 1. 101 10:00am Office 1 2. 102 10:00am Office 3 3. 101 10:15am Office 2 · · · · · · · · · · · · time is the time that tag was read, and location is the physical location corresponding to that particular reader. We assume that the database knows the locations of all the RFID readers, and can associate the data from a reader to a location. The database can represent the data from all the RFID readers as a table with 3 attributes,ID, Time, and Location. Table I illustrates this database. A user who own tag 101, can query this database later by issuing a query, Select * from DB where ID=101 and Time=10:00 am and determine he was at Office 1 at 10 am. Such a setup provides no privacy protections, since anyone can query the database to determine anybody’s movements. Under the trusted server assumption, we can protect privacy by associating a password with each ID. The user running the above query for instance, will have to supply the correct password associated with ID=101, before the database will release the information. While more complex schemes can be designed to provide better access control, a fundamental requirement is that the database cannot be compromised. An adversary with access to Table I learns everything. In this paper, we consider the problem of how to protect privacy in the event of such a server is compromised by an adversary. A. Adversary model We assume that the adversary seeks only to track the movements of a user. The adversary succeeds if he is able to extract the identity of the user from the database, or if he is able to link two entries in the database to the same user. We assume that the adversary can have free access to the database data such as in Table I, as well as observe the database interactions between a user and the database. The adversary is however, unable to determine the identity of someone querying the database. For instance, the adversary cannot deduce a user identity through the MAC address of the user’s device when the user is querying the database. Techniques such as an anonymizing network [27] can be deployed to achieve this. Also, the adversary cannot reprogram the database to execute functions it otherwise will not perform. In our paper, we assume that the RFID tags are able to perform simple operations such as generating random numbers, perform a hash function, and XOR two bitstrings together. These are common assumptions made in RFID security literature [28], [29], [30]. While our solution in the paper is presented using hash functions, symmetric key encryption can be substituted instead with minor modifications. 54
Database Reader Tag Finally,we assume that having a rational adversary pre- cludes attacks such as deleting or shuffling entries in the Request database since such actions do not help the adversary identify a (a)a=(101,n)a (b)B=(101.ID]e user.This is reasonable since such disruption attacks increases (c)a)B the risk of detection.We will discuss more about defending ((10LnB(a)) against such attacks in the Section IV. ((101.n).(a]8) Store ((101.n],(a)). IV.STRAWMAN SOLUTIONS 山le. We motivate our solutions by considering the limitations of seemingly possible solutions.For all these strawman solutions. we assume a more powerful RFID tag that can do symmetric Fig.3.Strawman protocol 2.The tag ID is 101 key encryption is used.We always assume the user is the owner of RFID tag 101. B.Strawman 2 A.Strawman I We can modify a technique from [26]to improve the query performance.Now,each RFID tag maintains two keys that it Database Reader Tag keeps secret,i.e.tag 101 will have k1101,k2102.We remove Request the superscript and use k1 and k2 to denote keys to simplify (a (101n the presentation.The protocol is shown in Fig.3,and after 101,n2 10L,n the data is collected,the table resembles Table III. Store (101,n].time,loc TABLE IⅢ DATABASE TABLE BY STRAWMAN 2 D Time Location Fig.2.Strawman protocol 1.The tag ID is 101 101,n1k1,{a1B 10:00am Office 1 102,n2k1:a2B, 10.00am In this simple protocol,we let the RFID tag return its 10L,n3k1,a3B1 10:15am Office 2 , encrypted ID in the form of [ID,nk where n is a random number,and k is the tag's secret key.For completeness,we show this interaction in Fig.2.Now,in the database.we have This approach also provides the same protections against an adversary as the earlier strawman protocol,since knowing TABLE II (101,n1}k1,[o}3 does not lead to knowing 101.Fur- DATABASE TABLE FROM STRAWMAN I thermore,the adversary cannot link {101,n,[1}8,and ID Time Location (101,n3k1,[o3)8 together since a different random n is 101,n1k 10.00am Office 1 used. 102,n2k2 10.00am Office 3 When a user queries the database,he will issue a"Select 101,n3k1 10:15am Office 2 from DB where ID=3 and Time=10:00am",where 1 t T1 3={101,IDk2 where ki and k2 are the secret keys of tags 101 and 102 respectively.We see that this approach protects privacy since The database will encrypt the first portion of each ID field the adversary observing this table cannot learn the actual IDs, with B,and check whether it matches the second portion.If 101 and 102,of the tags.There is also no linkability,since it does,the database will return that entry to the user.For (101,nk and {101,n3hk,use different random numbers example,encrypting {101,n}k with B will match {o)s, n and n3 while encrypting the same ID,thus resulting in and hence this entry belongs to the user.Since {101,nhk is different ciphertexts. protected via k1 which is only know by the user,the database The problem with this solution arrises when the user wants does not have to verify the user.It is clear that strawman 2 is to query the database.He can no longer simply do a"Select more efficient than strawman 1. from DB where ID=101",since all the ID attributes now However,once an adversary observes B.the adversary can have a random number component.The user cannot recall the determine future time and locations that B have visited.For nI and n3 values used since the limited storage capacity of the instance,using B,the adversary know that the same tag visited RFID tag means these random numbers have to be generated Office 2 at time 10:15 am.The reason is that while at time on-the-fly and never archived.The only option is for the user 10:15 am we have {101,n3}k1 which uses a different n3,the to retrieve the entire table,and attempt to decrypt each entry's same B is used,since B ={101,ATTR)k2.The ATTR is ID field until he finds the all the entries with IDs equal to a fixed value in the database and cannot be changed by the his own.This is clearly inefficient given the large size of the user.Thus,strawman 2 only prevents linkability so long as table. the adversary never observes a user querying the database. 55
Finally, we assume that having a rational adversary precludes attacks such as deleting or shuffling entries in the database since such actions do not help the adversary identify a user. This is reasonable since such disruption attacks increases the risk of detection. We will discuss more about defending against such attacks in the Section IV. IV. STRAWMAN SOLUTIONS We motivate our solutions by considering the limitations of seemingly possible solutions. For all these strawman solutions, we assume a more powerful RFID tag that can do symmetric key encryption is used. We always assume the user is the owner of RFID tag 101. A. Strawman 1 Database Reader Tag Request (a){101,n}k Store{101,n}k , time, loc {101,n}k {101,n}k Fig. 2. Strawman protocol 1. The tag ID is 101. In this simple protocol, we let the RFID tag return its encrypted ID in the form of {ID, n}k where n is a random number, and k is the tag’s secret key. For completeness, we show this interaction in Fig.2. Now, in the database, we have TABLE II DATABASE TABLE FROM STRAWMAN 1 ID Time Location 1. {101, n1}k1 10:00am Office 1 2. {102, n2}k2 10:00am Office 3 3. {101, n3}k1 10:15am Office 2 · · · · · · · · · · · · where k1 and k2 are the secret keys of tags 101 and 102 respectively. We see that this approach protects privacy since the adversary observing this table cannot learn the actual IDs, 101 and 102, of the tags. There is also no linkability, since {101, n1}k1 and {101, n3}k1 use different random numbers n1 and n3 while encrypting the same ID, thus resulting in different ciphertexts. The problem with this solution arrises when the user wants to query the database. He can no longer simply do a “Select * from DB where ID=101”, since all the ID attributes now have a random number component. The user cannot recall the n1 and n3 values used since the limited storage capacity of the RFID tag means these random numbers have to be generated on-the-fly and never archived. The only option is for the user to retrieve the entire table, and attempt to decrypt each entry’s ID field until he finds the all the entries with IDs equal to his own. This is clearly inefficient given the large size of the table. Database Reader Tag Request α β β α (c){ } (b) {101,ID} (a) {101,n} k2 k1 = = ({101,n}k1,{α}β ) time , loc Store ({101,n}k1,{α}β ), ({101,n}k1,{α}β ) Fig. 3. Strawman protocol 2. The tag ID is 101. B. Strawman 2 We can modify a technique from [26] to improve the query performance. Now, each RFID tag maintains two keys that it keeps secret, i.e. tag 101 will have k1 101, k2 102. We remove the superscript and use k1 and k2 to denote keys to simplify the presentation. The protocol is shown in Fig. 3, and after the data is collected, the table resembles Table III. TABLE III DATABASE TABLE BY STRAWMAN 2 ID Time Location 1. {101, n1}k1, {α1}β1 10:00am Office 1 2. {102, n2}k1, {α2}β2 10:00am Office 3 3. {101, n3}k1, {α3}β1 10:15am Office 2 · · · · · · · · · · · · This approach also provides the same protections against an adversary as the earlier strawman protocol, since knowing {101, n1}k1, {α1}β1 does not lead to knowing 101. Furthermore, the adversary cannot link {101, n1}k1, {α1}β1 and {101, n3}k1, {α3}β1 together since a different random n is used. When a user queries the database, he will issue a “Select * from DB where ID=βˆ and Time=10:00am”, where βˆ = {101, ID}k2 The database will encrypt the first portion of each ID field with βˆ, and check whether it matches the second portion. If it does, the database will return that entry to the user. For example, encrypting {101, n1}k1 with βˆ will match {α1}β1 , and hence this entry belongs to the user. Since {101, n1}k1 is protected via k1 which is only know by the user, the database does not have to verify the user. It is clear that strawman 2 is more efficient than strawman 1. However, once an adversary observes βˆ, the adversary can determine future time and locations that βˆ have visited. For instance, using βˆ, the adversary know that the same tag visited Office 2 at time 10:15 am. The reason is that while at time 10:15 am we have {101, n3}k1 which uses a different n3, the same β is used, since β = {101, AT T R}k2. The AT T R is a fixed value in the database and cannot be changed by the user. Thus, strawman 2 only prevents linkability so long as the adversary never observes a user querying the database. 55
This vulnerability does not occur in [26]because in their a simple incremental counter ct.We use subscripts i when encrypted database,all fields are encrypted by the user and denoting more than one reader or tag. stored into the table.In other words,“10:l5am”and“Office We denote the RFID tag identifier as w,and w will be stored 2"are all encrypted separately.The "B"used to encrypted under the"ID"attribute in the LS.For instance,when there is “10:00am'and“10:l5am”will be{10:00am,Time}k2and no security at all,the w value is just ID,in strawman protocol {10:15 am,Time}k2,so knowing the B value of 10:00 am 1,the w value will be {ID,nhk,and in strawman protocol 2 does not reveal anything about 10:15 am.We cannot do the w=({101,n}k1,[a}B).A summary of the notation used in same in an RFID based tracking system because the RFID tag this paper is given in Table IV. cannot determine the time and location independently TABLE IV C.Discussion SUMMARY OF NOTATIONS From the strawman protocols,we can make the following RFID reader observations.First,a direct encryption of data by the RFID Tag tag (strawman 1)protects user privacy even if the database Timestamp Server Location Server is compromised.However,this approach has slow query Tag's secret,known only to tag owner performance since the database cannot return only the user's Tag's ID,known only to tag owner data to him.Second,the solution cannot merely focus on Tag's counter value,known only to tag owner building an encrypted database,but must also consider the Tag identifier timestamp,stored in TS user query process.The protocol has to ensure that the user's location,stored in LS query does not reveal useful information that an adversary Nonce generated by tag can use to violate user's privacy (strawman 2).Finally,unlike Nonce generated by LS more powerful laptop devices [31].the RFID tag cannot maintain an internal clock to determine time by itself,not capture IP address or GPS coordinates to determine its location V.RFID PROTOCOL independently.While [32]gives a solution for the RFID tag The intuition behind our protocol is for the RFID tag to to determine time,it is unclear that the same techniques can generate two pieces of data each time it responds to an RFID be applied to location information since the tag movements is reader.The reader will store one piece of data associated with unpredictable. the time the tag was read in the TS,and the other piece of The intuition behind our approach is to utilize separate data associated with the location of the tag in the LS.An database servers to perform different roles in the RFID track- adversary compromising either LS or TS will be unable to ing system,as well as store different types of data in each violate the privacy of the RFID tag.When a legitimate user server.There are two roles in the RFID tracking system,where wishes to query the databases,he will query both TS and LS the tag was read (location),and at what time the tag was separately and combine the result to satisfy his query.The read (time).Our approach uses two database servers,one to overview of an RFID reader collecting data and user querying control and store time data and the other to control and store the servers is shown in Fig.4. location data.In this paper,we assume that the adversary can only compromise one of the following,(a)the database Reading RFID tag Querying database storing time information,(b)the database storing location LS information,(c)a small number of RFID readers. TS We justify this assumption because the two servers can be housed at different physical locations,running different operating systems,and managed by a separate group of system administrators.This raises the bar for an adversary User to compromise both servers.Also,given the large number of 、 (1)Query TS to obtain time RFID readers deployed,it is reasonable to assume that the (1)Read data from tag. data.then query LS to get adversary cannot control a majority of them. (2)Return time data location data. to TS,location data to DS (2)Combine both to satisfy The following notations are used to describe our protocols. query. The server controlling the timestamps is the timestamp server TS,and the server controlling the location is the location Fig.4.(L)Obtaining data from tags.(R)Querying databases for data. server LS.We use terms server and database interchangeably in this paper.While both TS and LS can communicate with the RFID readers,only the LS knows the location of these A.Collecting data from tag RFID readers.We denote an RFID reader as R,and an RFID Fig.5 shows the protocol for collecting data from an RFID tag as T.Each T maintain its own secret s,ID and a simple tag.When a reader queries the RFID tag in Step (1),the tag incremental counter ct.The values of s,ID,and ct are kept will first generate a random number n by hashing its secret s secret and only known to the tag owner.The tag also maintains with the current counter value,ct.The tag will then increment 56
This vulnerability does not occur in [26] because in their encrypted database, all fields are encrypted by the user and stored into the table. In other words, “10:15 am” and “Office 2” are all encrypted separately. The “β” used to encrypted “10:00 am” and “10:15 am” will be {10:00 am, T ime}k2 and {10:15 am, T ime}k2, so knowing the β value of 10:00 am does not reveal anything about 10:15 am. We cannot do the same in an RFID based tracking system because the RFID tag cannot determine the time and location independently. C. Discussion From the strawman protocols, we can make the following observations. First, a direct encryption of data by the RFID tag (strawman 1) protects user privacy even if the database is compromised. However, this approach has slow query performance since the database cannot return only the user’s data to him. Second, the solution cannot merely focus on building an encrypted database, but must also consider the user query process. The protocol has to ensure that the user’s query does not reveal useful information that an adversary can use to violate user’s privacy (strawman 2). Finally, unlike more powerful laptop devices [31], the RFID tag cannot maintain an internal clock to determine time by itself, not capture IP address or GPS coordinates to determine its location independently. While [32] gives a solution for the RFID tag to determine time, it is unclear that the same techniques can be applied to location information since the tag movements is unpredictable. The intuition behind our approach is to utilize separate database servers to perform different roles in the RFID tracking system, as well as store different types of data in each server. There are two roles in the RFID tracking system, where the tag was read (location), and at what time the tag was read (time). Our approach uses two database servers, one to control and store time data and the other to control and store location data. In this paper, we assume that the adversary can only compromise one of the following, (a) the database storing time information, (b) the database storing location information, (c) a small number of RFID readers. We justify this assumption because the two servers can be housed at different physical locations, running different operating systems, and managed by a separate group of system administrators. This raises the bar for an adversary to compromise both servers. Also, given the large number of RFID readers deployed, it is reasonable to assume that the adversary cannot control a majority of them. The following notations are used to describe our protocols. The server controlling the timestamps is the timestamp server T S, and the server controlling the location is the location server LS. We use terms server and database interchangeably in this paper. While both T S and LS can communicate with the RFID readers, only the LS knows the location of these RFID readers. We denote an RFID reader as R, and an RFID tag as T . Each T maintain its own secret s, ID and a simple incremental counter ct. The values of s, ID, and ct are kept secret and only known to the tag owner. The tag also maintains a simple incremental counter ct. We use subscripts i when denoting more than one reader or tag. We denote the RFID tag identifier as ω, and ω will be stored under the “ID” attribute in the LS. For instance, when there is no security at all, the ω value is just ID, in strawman protocol 1, the ω value will be {ID, n}k, and in strawman protocol 2, ω = ({101, n}k1, {α}β). A summary of the notation used in this paper is given in Table IV. TABLE IV SUMMARY OF NOTATIONS R RFID reader T Tag T S Timestamp Server LS Location Server s Tag’s secret, known only to tag owner ID Tag’s ID, known only to tag owner ct Tag’s counter value, known only to tag owner ω Tag identifier t timestamp, stored in TS loc location, stored in LS n Nonce generated by tag Nls Nonce generated by LS V. RFID PROTOCOL The intuition behind our protocol is for the RFID tag to generate two pieces of data each time it responds to an RFID reader. The reader will store one piece of data associated with the time the tag was read in the T S, and the other piece of data associated with the location of the tag in the LS. An adversary compromising either LS or T S will be unable to violate the privacy of the RFID tag. When a legitimate user wishes to query the databases, he will query both T S and LS separately and combine the result to satisfy his query. The overview of an RFID reader collecting data and user querying the servers is shown in Fig. 4. TS LS (1) Query TS to obtain time data, then query LS to get location data. (2) Combine both to satisfy query. User TS LS (1) Read data from tag. (2) Return time data to TS, location data to DS Reading RFID tag Querying database Fig. 4. (L) Obtaining data from tags. (R) Querying databases for data. A. Collecting data from tag Fig. 5 shows the protocol for collecting data from an RFID tag. When a reader queries the RFID tag in Step (1), the tag will first generate a random number n by hashing its secret s with the current counter value, ct. The tag will then increment 56
Timestamp Location Server Server Reader Tag the ID 101 is protected by a random number n.To query Request, LS,the user must first determine the value of n he used at 10:00 am.In our protocol,we let n=h(s.ct)and increment (1)(a)n=h(s,ct) ct immediately afterwards.While the tag cannot recall the (b)ct=ct+ (c)o=h(id,n) random n value used at 10:00 am,the owner of the tag can (3) 0,n determine the current value of ct in the tag.(We assume that --n (2) there is a command to recover this.Knowing the secret s @.R as well as the initial and current value ct,the user can re- (4) generate all the random n values the RFID tag has used so far.For instance,the user can generate a list The user then Fig.5.Reader-tag interaction.The dotted line in step (3)denotes that the TABLE VII reader transmits directly to the timestamp server,bypassing the location server. TABLE MAINTAINED BY USER n time the ct by one (Step 1b),and create the identifier w as h(ID.n). cti_1 h(s,cti_1) Finally,in Step (2),the tag returns w,n to the reader. cti h(s,cti) When the reader receives this tuple,the reader will append ctit1 h(s,cti+1) 。,, , the current timestamp t together with n,and send that to the TS in Step(3).This messages bypasses the LS completely. queries the TS database using meaning the LS never learns n.Then,the reader will append the reader's ID,Ri,and w,and send everything to the LS in Select Time from TS where Random Value ni,(1) step (4).Using Ri,the LS can determine the reader's. where ni=h(s,cti). TABLE V After receiving the time corresponding to ni from Table V, TABLE MAINTAINED BY TS the user determine whether the returned time is larger than or smaller than his target time of 10:00 am.If the returned Time Random value 10.00am ni time is larger,the user will pick another an smaller ct value, 10.00am nj compute n,and query the TS again.If the returned time is 10:5am nk smaller,the user will pick a target ct value.Using a simple binary search,the user only needs to execute O(log n)queries to TS. At the end of the protocol,the TS maintains a table shown With the user now knowing his n corresponding to 10:00 in Table V.The TS table has 2 attributes,time,and a random am,he can now query the LS server using the following query. value.The random values associated with time 10:00 am for instance,are all the n values transmitted by all the RFID Select from LS where ID-w, (2) readers.Each n value represent a different RFID tag.Similarly. where w=h(101,n). TABLE VI C.Security analysis TABLE MAINTAINED BY LS We consider the security of our system after the adversary ID Location compromises the TS,the LS and the reader Ri.Since we 1 Office 1 Office 3 allow anyone to query the databases,the adversary controlling w3 Office 2 the TS for instance,can also query the LS like a regular user. Compromise TS server:The adversary succeeds in attack- ing TS if he can determine which w in LS belongs to a user, the LS maintains a table shown in Table VI.The LS table has or if the adversary can determine that two w values belong to 2 attributes,the tag identifier w and the location that tag was the same person.The reason for focusing on ws is because read.Each entry in the LS table represent a different RFID through w,the adversary determine the whereabouts of a user. tag response.Note that the LS does not update the table in The adversary controlling TS is able to access all the real time,and thus the ordering of entries do not indicate the records such as those in Table V,as well as observe multiple time an RFID tag was read. queries (Query 1)made by a user and the corresponding response.The adversary can also query the LS using infor- B.Querying the database mation from his observations. Let the user wanting to learn his location at 10:00 am.He We begin by examining what the adversary can learn from will need to query LS using the w value h(101,n),where n controlling TS.For a single RFID tag T:with secret s;being is the value he choose at 10:00 am.Notice that the LS table queried twice,the n results from h(s,ct)and h(s,(ct+1))will is similar to strawman protocol I's table.In both instances, be different since the ct value is automatically incremented 57
Timestamp Server Location Server Reader Tag Request (c) h(id, n) (b) ct ct 1 (a) n h(s, ct) = = + = ω ω,n t,n ω,Ri (1) (2) (3) (4) Fig. 5. Reader-tag interaction. The dotted line in step (3) denotes that the reader transmits directly to the timestamp server, bypassing the location server. the ct by one (Step 1b), and create the identifier ω as h(ID, n). Finally, in Step (2), the tag returns ω, n to the reader. When the reader receives this tuple, the reader will append the current timestamp t together with n, and send that to the T S in Step (3). This messages bypasses the LS completely, meaning the LS never learns n. Then, the reader will append the reader’s ID, Ri , and ω, and send everything to the LS in step (4). Using Ri , the LS can determine the reader’s. TABLE V TABLE MAINTAINED BY T S Time Random value 1. 10:00 am ni 2. 10:00 am nj 3. 10:15 am nk · · · · · · · · · At the end of the protocol, the T S maintains a table shown in Table V. The T S table has 2 attributes, time, and a random value. The random values associated with time 10:00 am for instance, are all the n values transmitted by all the RFID readers. Each n value represent a different RFID tag. Similarly, TABLE VI TABLE MAINTAINED BY LS ID Location 1. ω1 Office 1 2. ω2 Office 3 3. ω3 Office 2 · · · · · · · · · the LS maintains a table shown in Table VI. The LS table has 2 attributes, the tag identifier ω and the location that tag was read. Each entry in the LS table represent a different RFID tag response. Note that the LS does not update the table in real time, and thus the ordering of entries do not indicate the time an RFID tag was read. B. Querying the database Let the user wanting to learn his location at 10:00 am. He will need to query LS using the ω value h(101, n), where n is the value he choose at 10:00 am. Notice that the LS table is similar to strawman protocol 1’s table. In both instances, the ID 101 is protected by a random number n. To query LS, the user must first determine the value of n he used at 10:00 am. In our protocol, we let n = h(s, ct) and increment ct immediately afterwards. While the tag cannot recall the random n value used at 10:00 am, the owner of the tag can determine the current value of ct in the tag. (We assume that there is a command to recover this.) Knowing the secret s as well as the initial and current value ct, the user can regenerate all the random n values the RFID tag has used so far. For instance, the user can generate a list The user then TABLE VII TABLE MAINTAINED BY USER ct n time · · · · · · · · · cti−1 h(s, cti−1) ? cti h(s, cti) ? cti+1 h(s, cti+1) ? · · · · · · · · · queries the T S database using Select Time from T S where Random Value = ni , (1) where ni = h(s, cti). After receiving the time corresponding to ni from Table V, the user determine whether the returned time is larger than or smaller than his target time of 10:00 am. If the returned time is larger, the user will pick another an smaller ct value, compute n, and query the T S again. If the returned time is smaller, the user will pick a target ct value. Using a simple binary search, the user only needs to execute O(log n) queries to T S. With the user now knowing his n corresponding to 10:00 am, he can now query the LS server using the following query. Select * from LS where ID=ω, (2) where ω = h(101, n). C. Security analysis We consider the security of our system after the adversary compromises the T S, the LS and the reader Ri . Since we allow anyone to query the databases, the adversary controlling the T S for instance, can also query the LS like a regular user. Compromise T S server: The adversary succeeds in attacking T S if he can determine which ω in LS belongs to a user, or if the adversary can determine that two ω values belong to the same person. The reason for focusing on ωs is because through ω, the adversary determine the whereabouts of a user. The adversary controlling T S is able to access all the records such as those in Table V, as well as observe multiple queries (Query 1) made by a user and the corresponding response. The adversary can also query the LS using information from his observations. We begin by examining what the adversary can learn from controlling T S. For a single RFID tag Ti with secret si being queried twice, the n results from h(s, ct) and h(s,(ct+1)) will be different since the ct value is automatically incremented 57