Hive Security Yongqiang He Software Engineer Facebook Data Infrastructure Team.


2 Hive Security Yongqiang He Software Engineer Facebook Data Infrastructure Team

3 Agenda 1. Hive Security 2. Verification 3. HDFS Permission

4 User/Group/Role and Privilege A user can belong to one or more groups. User and group information is provided by the authenticator. Each user or group can hold privileges and roles. A role can be a member of another role, but role membership must not be circular. Hive manages roles and the mapping between users/groups and roles. Privileges are associated with users, groups, and roles. A privilege can be granted to an individual user; to a group, in which case all users in the group get the privilege; or to a role, in which case all users who have the role get the privilege.
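The model on this slide can be sketched as a small in-memory structure. This is only an illustration, not Hive's implementation: the `RoleStore` class, its method names, and the privilege strings are hypothetical, and it assumes that a role that is a member of another role also carries that role's privileges.

```python
from collections import defaultdict


class RoleStore:
    """Hypothetical in-memory model of users, groups, roles, and privileges."""

    def __init__(self):
        # role -> roles it is a member of
        self.role_parents = defaultdict(set)
        # ("user" | "group", name) -> roles held by that principal
        self.principal_roles = defaultdict(set)
        # ("user" | "group" | "role", name) -> privileges granted directly
        self.grants = defaultdict(set)

    def _reaches(self, start, target):
        # Depth-first search over role membership edges.
        stack, seen = [start], set()
        while stack:
            role = stack.pop()
            if role == target:
                return True
            if role not in seen:
                seen.add(role)
                stack.extend(self.role_parents[role])
        return False

    def add_role_to_role(self, member, parent):
        # Reject the edge if it would make role membership circular.
        if self._reaches(parent, member):
            raise ValueError(f"circular role membership: {member} -> {parent}")
        self.role_parents[member].add(parent)

    def effective_privileges(self, user, groups):
        # Union of privileges granted to the user, to the user's groups,
        # and to every role reachable from either.
        principals = [("user", user)] + [("group", g) for g in groups]
        privileges, roles = set(), set()
        for principal in principals:
            privileges |= self.grants[principal]
            roles |= self.principal_roles[principal]
        stack, seen = list(roles), set()
        while stack:
            role = stack.pop()
            if role in seen:
                continue
            seen.add(role)
            privileges |= self.grants[("role", role)]
            stack.extend(self.role_parents[role])
        return privileges
```

The group list is passed in by the caller because, as the slide says, user and group information comes from the authenticator rather than from Hive's own role store.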

5 No Deny, Only Grant There is no deny statement; access is controlled only by granting and revoking. Grant all to group1 on db_name.tbl_name; Revoke all on db_name.tbl_name from group1; Revoke all on db_name.tbl_name from usr1_in_group1; The last statement should fail because the grant is on the group, not on the user usr1_in_group1.
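A minimal sketch of the grant-only behavior, assuming grants are stored per principal and a revoke must match an existing grant exactly. The `GrantTable` class and its method names are hypothetical and are not Hive APIs.

```python
class GrantTable:
    """Hypothetical grant-only store: there is no deny entry of any kind."""

    def __init__(self):
        # Each entry is (principal_type, principal_name, privilege, object).
        self._grants = set()

    def grant(self, ptype, principal, privilege, obj):
        self._grants.add((ptype, principal, privilege, obj))

    def revoke(self, ptype, principal, privilege, obj):
        key = (ptype, principal, privilege, obj)
        # A revoke only succeeds against the principal that holds the grant.
        if key not in self._grants:
            raise LookupError(f"no grant of {privilege} on {obj} to {ptype} {principal}")
        self._grants.remove(key)

    def is_allowed(self, user, groups, privilege, obj):
        # A user is allowed through a direct grant or through any of its groups.
        principals = [("user", user)] + [("group", g) for g in groups]
        return any((t, n, privilege, obj) in self._grants for t, n in principals)
```

With this design, the only way to take access away from a single member of `group1` is to revoke the group grant and re-grant to the remaining members, which is the consequence of having no deny.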

6 4 levels of privileges User level: applies to all objects in all databases; it is global. DB level: applies to all objects in one database. Table/Partition level: applies to all objects in that table or partition. If the table is partitioned, the partition level is checked and the table level is ignored; otherwise the table level is checked. Partition-level privileges are automatically inherited from the table level at partition creation time. Column level. Rule: First, check the user-level privilege; if it passes, allow. Second, check the DB level; if it passes, allow. Third, check the table/partition level; if it passes, allow. Last, check the column level; if it passes, allow. Otherwise, deny.
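The check order in the rule above can be sketched as follows. This is an illustration under stated assumptions, not Hive's code: the function names and the tuple encoding of scopes are hypothetical, a single principal stands in for the user plus its groups and roles, and the column-level check is assumed to require a grant on every requested column.

```python
def is_authorized(grants, principal, privilege, db, table, partition=None, columns=()):
    """Return True if any level grants the privilege; otherwise deny.

    `grants` is a set of (principal, privilege, scope) tuples, where scope is
    ("user",), ("db", db), ("table", db, table),
    ("partition", db, table, partition) or ("column", db, table, column).
    """
    def has(scope):
        return (principal, privilege, scope) in grants

    if has(("user",)):                      # 1. user (global) level
        return True
    if has(("db", db)):                     # 2. database level
        return True
    if partition is not None:               # 3. partitioned: check the partition,
        if has(("partition", db, table, partition)):  # ignore the table level
            return True
    elif has(("table", db, table)):         # 3. not partitioned: check the table
        return True
    if columns and all(has(("column", db, table, c)) for c in columns):
        return True                         # 4. column level
    return False                            # finally, deny


def create_partition(grants, db, table, partition):
    """Copy table-level grants to a new partition at creation time."""
    inherited = {
        (principal, privilege, ("partition", db, table, partition))
        for (principal, privilege, scope) in grants
        if scope == ("table", db, table)
    }
    grants.update(inherited)
```

Because inheritance happens only at partition creation, a later change to the table-level grant does not affect partitions that already exist in this sketch.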

7 Use case: 1. In a database, most tables are accessible to everyone. New tables are created all the time, and they should be visible to everyone. But there is one table (or a few), secret_tbl, that is accessible only to a small group of people, and only a few columns (c1, c2, c3) in that table become visible to everyone after some amount of time. 1) Partition the table by date; 2) Add the small group of people to s_group; 3) Create 3 roles: everyone_role, s_group_role, and s_column_role; 4) Grant role everyone_role to everyone; grant s_group_role to s_group; grant c1, c2, c3 to s_column_role; 5) Grant all on secret_tbl to s_group_role; grant select(c1, c2, c3) on secret_tbl to s_column_role; 6) Whenever a new table is created, by default grant all on that table to everyone_role; 7) After some amount of time, revoke all on secret_tbl/ds=partition from s_group_role.

8 HDFS Permission Without HDFS support, there is no real security. If a user has direct access to a file, the user can do anything with it. For a highly secured table, set the group permission on the files of that table. Hive should pass the correct Unix group information to HDFS. For column-level privileges, the most secure approach is file-level isolation, that is, a file format that supports column groups, such as Zebra. Another option is to use Hive Server: all queries should be submitted through Hive Server.
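As a rough illustration of why the group permission on a table's files matters, the sketch below evaluates POSIX-style mode bits the way HDFS's permission model does: owner bits if the user is the owner, otherwise group bits if the file's group is one of the user's groups, otherwise the "other" bits. The function name, the user names, and the mode value are hypothetical examples.

```python
import stat


def can_read(mode, owner, group, user, user_groups):
    """Decide read access from POSIX-style mode bits (owner, group, other)."""
    if user == owner:
        return bool(mode & stat.S_IRUSR)
    if group in user_groups:
        return bool(mode & stat.S_IRGRP)
    return bool(mode & stat.S_IROTH)


# Files of a restricted table: readable by the owner and by s_group only.
SECRET_MODE = 0o640

print(can_read(SECRET_MODE, "hive", "s_group", "usr1", ["s_group"]))
print(can_read(SECRET_MODE, "hive", "s_group", "usr2", ["everyone"]))
```

The check depends entirely on the group list presented for the user, which is why the slide stresses that Hive should pass the correct Unix group information to HDFS.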

9 First version of Hive authorization Goal: protect a well-intentioned user from making a mistake. A malicious user can attack the system in different ways. More protection only complicates the attack; a determined attacker can always find a way.

