Network Connectivity Checker

Presentation on theme: "Network Connectivity Checker"— Presentation transcript:

1 Network Connectivity Checker
Team: Manjunath Shettar, Jayashankar Tekkedatha, Samhith Venkatesh

2 Agenda
Overview
Design
Implementation
Demo
1. Network Connectivity Checker intro

3 Ceph Network
Ceph network config: 1 mon & mds
Public (front) and cluster (back) networks
Cluster network: replication and heartbeat are done on the back network
Public network: talks to clients and mons
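To make the front/back split concrete, here is a minimal Python sketch that reads the two subnets from a ceph.conf-style fragment and reports which network an address belongs to. The subnets, the sample fragment, and the helper names `load_networks` and `classify` are assumptions made for illustration; a real ceph.conf may list several comma-separated subnets per option, which this sketch does not handle.

```python
import configparser
import ipaddress

# Hypothetical ceph.conf fragment; the subnets are made up for illustration.
SAMPLE_CONF = """
[global]
public network = 10.0.0.0/24
cluster network = 10.0.1.0/24
"""


def load_networks(conf_text):
    """Return the public (front) and cluster (back) subnets from ceph.conf text."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = parser["global"]
    return {
        "front": ipaddress.ip_network(section["public network"]),
        "back": ipaddress.ip_network(section["cluster network"]),
    }


def classify(address, networks):
    """Name the network ("front" or "back") an address belongs to, else None."""
    ip = ipaddress.ip_address(address)
    for name, subnet in networks.items():
        if ip in subnet:
            return name
    return None


networks = load_networks(SAMPLE_CONF)
print(classify("10.0.0.15", networks))
print(classify("10.0.1.15", networks))
```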

4 Objective
Network Connectivity Checker
Point-to-point OSD connectivity check
  Front network
  Back network
Topology-aware mesh connectivity check

5 Initial Approach
Start a network server external to the OSD process during OSD init
  The physical node's connectivity can still be checked even if the OSD crashes
  Reason: socket infrastructure has to be associated with the OSD; we have to assume that if the OSD is down, the node is unreachable
Other OSD processes ping the above socket to check connectivity
Ceph daemon commands employed to perform the ping test
  Reason: daemon commands are local to the OSD nodes
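The idea of a small network server that other processes ping can be sketched in a few lines of Python. This is only an illustration of the initial approach, not the team's code: the slides do not describe the wire protocol, so the echo exchange, the `nc_ping` payload, and the function names `start_ping_server` and `ping` are all assumptions.

```python
import socket
import threading


def start_ping_server(host="127.0.0.1", port=0):
    """Start a tiny TCP echo server in a background thread.

    Returns the listening socket and the port it was bound to
    (port=0 lets the operating system pick a free one).
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind((host, port))
    server.listen()

    def serve():
        while True:
            try:
                conn, _ = server.accept()
            except OSError:
                return  # listening socket was closed
            with conn:
                data = conn.recv(64)
                if data:
                    conn.sendall(data)  # echo the probe back

    threading.Thread(target=serve, daemon=True).start()
    return server, server.getsockname()[1]


def ping(host, port, timeout=1.0):
    """Return True if the peer echoes our probe within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.sendall(b"nc_ping")
            return conn.recv(64) == b"nc_ping"
    except OSError:
        return False


server, port = start_ping_server()
print(ping("127.0.0.1", port))
server.close()
```

In this sketch an unreachable or refused connection simply yields `False`, which mirrors the slide's point that a peer that does not answer has to be treated as unreachable.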

6 Design
Point-to-point check
  Existing heartbeat mechanism employed
  ceph tell command is used for ping
Mesh check
  Ceph OSD tree topology generated by CRUSH
  Wrapper
  Point-to-point checks are efficiently utilized for the mesh check

7 Design – Point to Point check
Commands introduced
ceph tell osd.<source-osd> nc_check ping <destination-osd>
ceph tell osd.<source-osd> nc_check ping_front <destination-osd>
ceph tell osd.<source-osd> nc_check ping_back <destination-osd>
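A script that drives these commands could wrap them in a small helper. The following is a minimal Python sketch under stated assumptions: `nc_check` is the sub-command introduced by this project, so it only works on a build that includes it; the helper names `nc_check_command` and `run_nc_check` are hypothetical; the destination is assumed to be a bare OSD id; and a zero exit status is assumed to mean the ping succeeded.

```python
import subprocess

# Modes listed on the slide: both networks, front only, back only.
NC_CHECK_MODES = ("ping", "ping_front", "ping_back")


def nc_check_command(source_osd, destination_osd, mode="ping"):
    """Build the argument list for one point-to-point check."""
    if mode not in NC_CHECK_MODES:
        raise ValueError(f"unknown nc_check mode: {mode}")
    return ["ceph", "tell", f"osd.{source_osd}", "nc_check", mode, str(destination_osd)]


def run_nc_check(source_osd, destination_osd, mode="ping", runner=subprocess.run):
    """Run one check; exit status 0 is assumed (not confirmed) to mean success."""
    command = nc_check_command(source_osd, destination_osd, mode)
    result = runner(command, capture_output=True, text=True)
    return result.returncode == 0


# Only print the command here, since running it needs a live cluster.
print(" ".join(nc_check_command(0, 3, "ping_back")))
```

The `runner` parameter exists so the helper can be exercised without a cluster by passing a stand-in for `subprocess.run`.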

8 Design – Point to Point check
Message structure

9 Design – Point to Point check
Piggy back on the heartbeat infrastructure

10 Design – Point to Point check
Check back and front network

11 Design – Point to Point check
Ping Response

12 General Ceph Topology
Root > Datacenter > Room > Row > Rack > Host > OSD
The CRUSH hierarchy is aligned with the physical infrastructure
Ceph allows creation of “buckets” to define the hierarchy
ceph osd tree --format json-pretty
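The JSON form of the OSD tree is a flat list of nodes that refer to their children by id. The sketch below, in Python, rebuilds the hierarchy from such a list. The sample document is hand-written for illustration and only includes the fields the sketch relies on (`id`, `name`, `type`, `children`, `status`); real output carries more fields, and the helper names `load_tree` and `outline` are hypothetical.

```python
import json

# Hand-written sample in the shape of `ceph osd tree --format json-pretty`.
SAMPLE = """
{"nodes": [
  {"id": -1, "name": "default", "type": "root", "children": [-2]},
  {"id": -2, "name": "rack1", "type": "rack", "children": [-3, -4]},
  {"id": -3, "name": "host1", "type": "host", "children": [0, 1]},
  {"id": -4, "name": "host2", "type": "host", "children": [2]},
  {"id": 0, "name": "osd.0", "type": "osd", "status": "up"},
  {"id": 1, "name": "osd.1", "type": "osd", "status": "up"},
  {"id": 2, "name": "osd.2", "type": "osd", "status": "down"}
], "stray": []}
"""


def load_tree(text):
    """Index nodes by id and find the roots (nodes that are nobody's child)."""
    nodes = {node["id"]: node for node in json.loads(text)["nodes"]}
    child_ids = {c for node in nodes.values() for c in node.get("children", [])}
    roots = [node_id for node_id in nodes if node_id not in child_ids]
    return nodes, roots


def outline(nodes, node_id, depth=0):
    """Return the subtree under node_id as indented text lines."""
    node = nodes[node_id]
    lines = ["  " * depth + f'{node["type"]} {node["name"]}']
    for child_id in node.get("children", []):
        lines.extend(outline(nodes, child_id, depth + 1))
    return lines


nodes, roots = load_tree(SAMPLE)
for root_id in roots:
    print("\n".join(outline(nodes, root_id)))
```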

13 Design - Mesh Check
Steps:
1. Obtain the CRUSH hierarchy using ceph osd tree --format json-pretty
2. Parse the JSON output and traverse the entire topology
3. Use the ceph tell osd version command to validate the status of each OSD
4. Cross ping check between entities of the same level (both front and back)
5. Add the active OSD as a representative for the parent and its ancestors
Assumption: OSDs once verified as active during the mesh check will not go down until completion.
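The steps above can be sketched as a recursive traversal. This Python sketch is one possible reading of the slides, not the team's implementation. It assumes the tree is already parsed into nested dictionaries, that the first active OSD of each child bucket serves as its representative, and that each pair is pinged in one direction only. The callables `is_active` and `ping` are hypothetical stand-ins for the OSD status check and the point-to-point check, injected so the logic can run without a cluster.

```python
from itertools import combinations

NETWORKS = ("front", "back")


def mesh_check(node, is_active, ping, failures):
    """Return the active OSD ids under node, appending failures as they are found."""
    if node["type"] == "osd":
        if is_active(node["id"]):
            return [node["id"]]
        failures.append(("osd_down", node["id"]))
        return []

    # Recurse first so every child reports its own active OSDs.
    groups = [mesh_check(child, is_active, ping, failures) for child in node["children"]]
    groups = [group for group in groups if group]

    # One endpoint per child: for a host this is every active OSD,
    # for higher buckets it is the child's representative (assumed: first active OSD).
    endpoints = [group[0] for group in groups]
    for src, dst in combinations(endpoints, 2):
        for network in NETWORKS:
            if not ping(src, dst, network):
                failures.append(("link_down", src, dst, network))

    return [osd for group in groups for osd in group]


def host(name, *osd_ids):
    """Build a host bucket holding the given OSD ids (helper for the example)."""
    return {"type": "host", "name": name,
            "children": [{"type": "osd", "id": i} for i in osd_ids]}


# Topology from the slides: two racks, two hosts each, two OSDs per host.
room = {"type": "room", "name": "room", "children": [
    {"type": "row", "name": "row", "children": [
        {"type": "rack", "name": "rack1", "children": [host("host1", 1, 2), host("host2", 3, 4)]},
        {"type": "rack", "name": "rack2", "children": [host("host3", 5, 6), host("host4", 7, 8)]},
    ]},
]}

failures = []
active = mesh_check(
    room,
    is_active=lambda osd: osd != 3,  # simulate OSD 3 being down
    ping=lambda src, dst, net: not (net == "back" and {src, dst} == {1, 5}),
    failures=failures,
)
print(active)
print(failures)
```

Using representatives keeps the number of pings at each level proportional to the number of sibling buckets rather than to the number of OSDs beneath them, at the cost of not testing every OSD pair across hosts or racks.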

14 Design - Mesh Check
[Diagram: topology tree, Room > Row > Rack - 1 > Host - 1 > OSD - 1, OSD - 2]

15 Design - Mesh Check
Mesh check: traverse to OSD
Issue ceph tell osd version for the OSD process
Room
  Row
    Rack - 1
      Host - 1: OSD - 1, OSD - 2
      Host - 2: OSD - 3, OSD - 4
    Rack - 2
      Host - 3: OSD - 5, OSD - 6
      Host - 4: OSD - 7, OSD - 8

16 Design - Mesh Check
Identified OSD as active
Active children: 1

17 Design - Mesh Check
Active children: 1, 2

18 Design - Mesh Check
Performing cross ping check
Ping check between active children

19 Design - Mesh Check
Ping check successful

20 Design - Mesh Check
OSD failure scenario
Active children: 4

21 Design - Mesh Check
Performing cross ping check across hosts
Ping check between active children

22 Design - Mesh Check
Ping successful
Active children: 1,2,4

23 Design - Mesh Check
Mesh check in other rack
Active children: 5,6,7,8

24 Design - Mesh Check
Performing cross ping check across racks
Active children: 1,2,4
Active children: 5,6,7,8

25 Design - Mesh Check
Ping successful

26 Design - Mesh Check
Mesh check completes
Reports OSD and link failures

27 Mesh check Recursive function call

28 Mesh check Ping check across OSDs in a Host

29 Mesh check Ping check across representative OSDs at a hierarchy level

30 DEMO

31 Thank you

