Keywords: data privacy; inference detection systems; sensor data

MeSH: Algorithms; Telemedicine; Data Collection

Source: DOI:10.3390/s22218140

Abstract:
With the advent of sensors, more and more services are being developed to give customers insights into their health and their appliances' energy consumption at home. To do so, these services use new mining algorithms that create new inference channels. However, the collected sensor data can be diverted to infer personal data that customers have not consented to share. This indirect access to data that are not collected corresponds to inference attacks involving raw sensor data (IASD). Faced with these new kinds of attacks, existing inference detection systems do not meet the representation requirements of these inference channels and of user knowledge. In this paper, we propose RICE-M (Raw sensor data based Inference ChannEl Model), which meets these representation requirements. Based on RICE-M, we propose RICE-Sy, an extensible system able to detect IASDs, and evaluate its performance using the MHEALTH dataset as a case study. As expected, detecting IASDs proves to be quadratic, owing to the large volume of sensor data managed and the quickly growing amount of user knowledge. To overcome this drawback, we first propose a set of conceptual optimizations that reduce the detection complexity. Although detection becomes linear, the online detection time remains greater than a fixed acceptable query response limit, so we propose two approaches to estimate the potential of RICE-Sy. The first is based on partitioning strategies, which partition the knowledge of users. We observe that using the quantity of knowledge gained by a user as the partitioning criterion reduces the median detection time of RICE-Sy by 63%. The second approach is H-RICE-SY, a hybrid detection architecture built on RICE-Sy that limits detection at query time to users who have a high probability of being malicious. We show the limits of processing all malicious users at query time without impacting the query answer time. We observe that when 30% of users are considered malicious, the median online detection time stays under the acceptable time of 80 ms for a total volume of up to 1.2 million user knowledge entities. Based on the observed growth rates, we estimate that when 5% of user knowledge is issued by malicious users, a maximum volume of approximately 8.6 million pieces of user information can be processed online in an acceptable time.
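The hybrid idea described for H-RICE-SY, running detection at query time only for users judged likely to be malicious, can be illustrated with a small routing sketch. This is not the authors' implementation: the class `HybridDetector`, the `threshold` parameter, the per-user probability input, and the offline queue are all hypothetical names assumed for illustration, and the abstract does not say how the malicious probability is estimated or what happens to the remaining users.

```python
from dataclasses import dataclass, field


@dataclass
class HybridDetector:
    # Hypothetical cut-off: users at or above it are checked at query time.
    threshold: float = 0.5
    # Assumption: users below the cut-off are deferred to a later batch check.
    offline_queue: list = field(default_factory=list)

    def route(self, user_id: str, malicious_probability: float) -> str:
        """Return "online" or "offline" for this user's query."""
        if malicious_probability >= self.threshold:
            return "online"
        self.offline_queue.append(user_id)
        return "offline"


def online_share(probabilities: dict, threshold: float) -> float:
    """Fraction of users whose detection would run at query time."""
    if not probabilities:
        return 0.0
    flagged = sum(1 for p in probabilities.values() if p >= threshold)
    return flagged / len(probabilities)
```

In such a design, the threshold would control how much detection work lands on the query path, which is the quantity the abstract relates to the 80 ms limit.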