ISIS2 RUNTIME PARAMETERS
Ken Birman, Cornell University
Parameters
Many features of Isis2 depend on parameters you can modify to “shape” the behavior of the platform. They give you very fine control over the behavior of Isis2.
There are three main categories of parameters:
1. Those that determine how the system will start up
2. Those that determine how it sends messages
3. Those that control limits, timeouts and other bounds
Startup Parameters
What happens when you call IsisSystem.Start()?
How IsisSystem.Start() works
1. The library initializes itself and determines the IP address of “local host.” If the host has several IP addresses, it picks the last of the IPv4 addresses.
2. The system scans the environment variables to read the values of the parameters. These override the default values compiled into Isis2.
   a. In Linux/bash, use “export” to set them, either in .bashrc or in a shell script. Or call setenv(3) from your code.
   b. In Windows, use the “set” command, or call Environment.SetEnvironmentVariable("something", somevalue);
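The environment-variable mechanism in step 2 can be sketched as a small launch wrapper for Linux. This is an illustration under assumptions, not part of Isis2: the function name `run_with_isis_env` is hypothetical, and ISIS_MUTE=true is the logging parameter described on the Logging slide.

```shell
#!/bin/sh
# Hypothetical wrapper: export an Isis2 parameter, then run the application
# in that environment so IsisSystem.Start() can read it at startup.
run_with_isis_env() {
    (
        # The subshell keeps the setting from leaking into the calling shell.
        export ISIS_MUTE=true   # suppress the library's log file
        "$@"                    # the application command line
    )
}
```

A call such as `run_with_isis_env mono MyApp.exe` (where MyApp.exe is a placeholder for your own program) would start the application with the parameter in place.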
How IsisSystem.Start() works (continued)
3. Next, the system decides which network interfaces it should use (all of them, unless you tell it otherwise by setting ISIS_NETWORK_INTERFACES).
   a. Set it if you expect to run on machines that have a “production” network and a “management” network.
   b. Otherwise leave ISIS_NETWORK_INTERFACES alone.
4. Having done this, it attempts to contact the ORACLE.
   a. If the ORACLE isn’t found, it restarts the ORACLE.
   b. Otherwise, it asks the ORACLE to let it join the ISISMEMBERS system group.
Logging
- Normally, upon restart, Isis2 creates a log file for messages printed by the library.
- You can inhibit this by setting ISIS_MUTE=true.
- When calling IsisSystem.Start(), you can also direct that messages be echoed to the Debug stream rather than the Console.
- If you allow logging and want to write to the log, call IsisSystem.Write() or IsisSystem.WriteLine(). Output goes to the log and also to the Console or the Debug stream.
Fast start: But there can only be one…
- For extreme speed, you can tell Isis2 not to hunt for the ORACLE (by specifying an argument to IsisSystem.Start).
- It will restart instantly. But if you launch two instances this way, they won’t communicate with one another.
- So… do this only in the first instance that you launch.
Overwhelming the Membership Oracle
- If processes start one by one, no issue…
- But what if you try to start 50 at once, or 500?
[Figure: joining processes exchange “Hello?” and “Welcome!” messages with the Oracle]
Master/Worker
- If a system will be big, launching hundreds of members can overload the ORACLE.
- Better performance: add many members all at the same time.
- In this case use the Master/Worker pattern:
  - The Master starts first and collects a list of the workers.
  - Workers start after the Master and register with it.
  - Then the Master can add a batch of workers to the system, and to any groups that are desired.
Master: Accumulates workers, tells them what to do

using System.Collections.Generic;  // List<T>
using System.Threading;            // Semaphore

// GOAL (the number of workers to wait for) and myGroup are defined elsewhere.
static void beMaster(string[] args)
{
    IsisSystem.Start();
    Semaphore waitForWorkers = new Semaphore(0, 1);
    bool fullyStaffed = false;
    List<Address> myWorkers = new List<Address>();

    // Accumulate workers; reject any that arrive after the goal is reached
    IsisSystem.RegisterAsMaster((NewWorker)delegate(Address worker)
    {
        lock (myWorkers)
        {
            if (fullyStaffed)
                IsisSystem.RejectWorker(worker);
            else
            {
                myWorkers.Add(worker);
                if (myWorkers.Count == GOAL)
                {
                    fullyStaffed = true;
                    waitForWorkers.Release(1);
                }
            }
        }
    });

    // Main thread waits until enough workers have connected,
    // then starts them all at once…
    waitForWorkers.WaitOne();
    IsisSystem.BatchStart(myWorkers);

    // This delays until they have all finished their batch start
    IsisSystem.WaitForWorkerSetup(myWorkers);

    // … then adds them all to groups we may want to use
    Group.MultiJoin(myWorkers, new Group[] { myGroup });

    // In front of this next line do whatever you want this application to do
    IsisSystem.WaitForever();

    // If the master shuts down, its workers will too
    IsisSystem.Shutdown();
}
RunAsWorker: Let Master run the show

using System.Threading;  // Thread

static void beWorker(string[] args)
{
    // This next line assumes that argument 0 is the master's Address.
    // You can also use new Address(mastersHost, 0) if you know the host IP
    // address of the master but don't know the master's pid.
    IsisSystem.RunAsWorker(args[0]);

    // This line blocks until the master issues the BatchStart() call.
    // Notice that in this one special case we call it AFTER RunAsWorker!
    IsisSystem.Start();

    // Before calling this next line do whatever setup this worker must do:
    // create your group handles and register callbacks, but don't call Join.
    // For example, you might call g = new Group("something"), then call
    // g.ViewHandlers += myViewHandler; … etc: anything needed to have the
    // group ready for a Join. But you call WorkerSetupDone() INSTEAD of g.Join().
    IsisSystem.WorkerSetupDone();

    // Now, for each group the Master created using a multijoin, you wait
    // for its first view to be reported. This is one way to do that
    // (myGroups is this worker's own collection of group handles):
    foreach (Group g in myGroups)
        while (!g.HasFirstView)
            Thread.Sleep(250);

    // WaitForever would freeze the main thread, but if the worker has joined
    // groups (or gets added to groups by the master using MultiJoin()), the
    // worker could be quite active, receiving messages, sending them, etc.
    IsisSystem.WaitForever();

    // If the master shuts down, the worker will throw an
    // IsisException("master termination").
    // If this next line actually executes, this particular worker will exit.
    // In effect, this worker is a normal Isis application by now, except that
    // if the master terminates, it does too. In particular, it can
    // deliberately choose to leave the system if it wishes to do so.
    IsisSystem.Shutdown();
}
Master/Worker Timeline
[Figure: timeline with three columns: Worker, Master, Oracle]
Master:
1. IsisSystem.Start();
2. Group myGroup = new Group("myGroup"); ... attach handlers for myGroup, then myGroup.Join();
3. Accumulate workers until the goal is reached
4. IsisSystem.BatchStart(myWorkers);
5. IsisSystem.WaitForWorkerSetup(myWorkers); (returns once setup is done for all workers)
6. Group.MultiJoin(myWorkers, new Group[] { myGroup });
7. IsisSystem.WaitForever();
Worker:
1. IsisSystem.RunAsWorker(mAddress);
2. IsisSystem.Start(); ...
3. Group g = new Group("myGroup"); ... attach handlers for g, but don’t call Join
4. IsisSystem.WorkerSetupDone();
5. foreach (Group g in myGroups) while (!g.HasFirstView) Thread.Sleep(250); (ends when the new view arrives)
6. IsisSystem.WaitForever();
Why does this help?
- Workers only send one message to the Master.
  - Hence it experiences less load.
- It adds them all at once, first to the system, then to whatever groups the application will use.
  - Hence only one group view needs to be sent, and it can be sent efficiently, using a broadcast.
- Overall load is much reduced.
Messaging Parameters
How to control what internet protocols Isis2 uses
IP multicast / ISIS_UNICAST_ONLY
- Isis2 will broadcast to find the ORACLE unless you tell it not to do so.
- Default: OK to use IP multicast, UDP, broadcast.
- ISIS_UNICAST_ONLY: don’t use IP multicast. Still requires UDP (the older ISIS_TCP_ONLY feature was eliminated starting in Isis v2.1).
- If you put the system in ISIS_UNICAST_ONLY mode, you must list the machines on which the Isis2 ORACLE will run: ISIS_HOSTS=“…”
Normal versus UNICAST_ONLY
- With normal IP multicast, packets are still sent directly.
- With ISIS_UNICAST_ONLY, packets travel on a tree of point-to-point links and must be forwarded, perhaps log2(N) times.
[Figure: two panels, “IP multicast” and “Unicast tree: power of 2 ‘reach’”]
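To get a feel for the forwarding cost in ISIS_UNICAST_ONLY mode, the log2(N) estimate can be worked out numerically. The sketch below is only arithmetic on that estimate (the function name `tree_hops` is made up for illustration) and says nothing about how Isis2 actually builds its tree.

```shell
#!/bin/sh
# Smallest number of doublings needed for a "power of 2 reach" tree to
# cover n members, i.e. the ceiling of log2(n).
tree_hops() {
    n=$1
    hops=0
    reach=1
    while [ "$reach" -lt "$n" ]; do
        reach=$((reach * 2))   # each forwarding step doubles the reach
        hops=$((hops + 1))
    done
    echo "$hops"
}
```

For example, `tree_hops 1000` prints 10, so even a thousand-member system needs only about ten forwarding steps under this estimate.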
ISIS_HOSTS
- Idea is to list the places where the ORACLE can run:
  ISIS_HOSTS=c1.cs.cornell.edu,c2.cs.cornell.edu
  … or ISIS_HOSTS= ,
- Processes running on other machines can join the system but can’t restart it from scratch.
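Put together, a unicast-only setup on Linux might look like the fragment below. It is a sketch: the host names are the ones from the slide, and the value "true" is an assumption based on how the deck writes other boolean parameters (for example ISIS_MUTE=true).

```shell
#!/bin/sh
# Unicast-only mode: no IP multicast, so the ORACLE hosts must be listed.
export ISIS_UNICAST_ONLY=true
export ISIS_HOSTS=c1.cs.cornell.edu,c2.cs.cornell.edu
```

These lines would go in .bashrc or in the script that launches the application, before the process that calls IsisSystem.Start() is started.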
ISIS_HOSTS: numerical is best!
- We have seen bugs in the Linux DNS when accessed from Mono. Sometimes it hangs.
- To avoid this, use fully numerical IP addresses when you set the values in ISIS_HOSTS.
- Use the IPv4 addresses of the machines on which you want the ORACLE to run. In this case DNS never hangs.
- The “ping” and “traceroute” commands are examples of ways you can look these up.
- On Windows, string names are fine. On Linux, they work, but don’t put the DNS under heavy load.
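A launch script can guard against the DNS problem by checking that every ISIS_HOSTS entry is a numerical IPv4 address before the application starts. The following is a minimal sketch; the helper names `is_ipv4` and `hosts_all_numeric` and the sample addresses are assumptions for illustration, not part of Isis2.

```shell
#!/bin/sh
# Succeeds only if $1 is four dot-separated decimal fields, each 0-255.
is_ipv4() {
    case "$1" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS; IFS=.
    set -- $1                    # split the address on dots
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "${#octet}" -le 3 ] && [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Succeeds only if every entry of the comma-separated list $1 is numeric.
hosts_all_numeric() {
    old_ifs=$IFS; IFS=,
    set -- $1                    # split the list on commas
    IFS=$old_ifs
    [ "$#" -gt 0 ] || return 1
    for host in "$@"; do
        is_ipv4 "$host" || return 1
    done
    return 0
}

# Sample addresses only; substitute the machines that should run the ORACLE.
ISIS_HOSTS=192.0.2.10,192.0.2.11
hosts_all_numeric "$ISIS_HOSTS" && export ISIS_HOSTS
```

With a list such as c1.cs.cornell.edu,192.0.2.11 the check fails, which is the cue to look up the address first.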
ISIS_PORTp
- The system uses two standard IP ports:
  - ISIS_PORTp: for p2p messages
  - ISIS_PORTa: set to ISIS_PORTp+1, for acks/nacks
- These ports should not be blocked by your firewall.
- On Linux, also check iptables, which is like a firewall.
- If two instances of Isis2 use non-overlapping port ranges, they will not notice one another.
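The port relationship can be expressed directly in a launch script. In this sketch the base port 9753 is an arbitrary placeholder, and setting ISIS_PORTa explicitly is an assumption: the slide does not say whether the library derives it from ISIS_PORTp on its own.

```shell
#!/bin/sh
# Pick a base port for p2p traffic; the ack/nack port is the next one up.
ISIS_PORTp=9753                      # placeholder value, choose your own
ISIS_PORTa=$((ISIS_PORTp + 1))
export ISIS_PORTp ISIS_PORTa

# On Linux, "iptables -L -n" (run as root) lists the current filter rules,
# which is one way to check that both ports are allowed through.
```

A second, independent Isis2 instance on the same network would use a different base port so that the two port pairs do not overlap.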
ISIS_MAXIPMCADDRS
- When permitted to use IP multicast, Isis2 tries not to overuse that feature:
  - ISIS_MCRANGE_LOW: low end of the IPMC address range Isis2 should use. By default, CLASSD+5000, where CLASSD is /8
  - ISIS_MCRANGE_HIGH: high end of the IPMC range
  - ISIS_MAXIPMCADDRS: limit on how many multicast addresses Isis2 can use, system-wide.
- It is perfectly reasonable to set this to a small number, like 5 or 10. The system should work if ISIS_MAXIPMCADDRS 2.
- If ISIS_UNICAST_ONLY is true, then no IPMC addresses are used at all.
ISIS_TTL
- Broadcast and multicast messages are automatically relayed by routers.
- Each “hop” causes the “time to live” field in the message to be decremented.
- If the TTL reaches zero, the router drops the packet.
- Isis2 initializes the TTL value using ISIS_TTL. You can set this to 0 or 1 to confine the system to a single segment of your network.
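The TTL rule is easy to model. The sketch below is a simplified illustration of the decrement-and-drop behavior described on the slide (the function name `crosses_routers` is hypothetical); it then sets ISIS_TTL=1, one of the two values the slide suggests for a single-segment deployment.

```shell
#!/bin/sh
# Succeeds if a packet sent with TTL $1 survives being relayed by $2 routers.
crosses_routers() {
    ttl=$1
    routers=$2
    while [ "$routers" -gt 0 ]; do
        ttl=$((ttl - 1))               # each hop decrements the TTL
        [ "$ttl" -gt 0 ] || return 1   # TTL hit zero: the router drops it
        routers=$((routers - 1))
    done
    return 0
}

# Confine Isis2 traffic to the local network segment.
export ISIS_TTL=1
```

In this model a packet with TTL 1 is delivered on its own segment (zero routers) but is dropped by the first router it meets.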
ISIS_MAXMSGLEN
- Automatically adjusted, but you can provide a recommended value if you wish.
  - Isis2 will override the value in some situations.
- Normally not something you would need to modify.
- If a message is too large, Isis2 will automatically fragment it and reassemble it prior to delivery.
Other limits and timeouts
These are less often changed
ISIS_DEFAULTTIMEOUT
- Normally 45 seconds. OK to reduce if you wish.
- Failure detection needs twice this long, hence 90 seconds.
  - This applies if you kill a process “suddenly” (e.g. ^C) or if the machine on which it was running crashes.
- 45 seconds is very slow, but on cloud computing systems long delays happen more often than you would expect!
- On lightly loaded clusters, you can set ISIS_DEFAULTTIMEOUT much lower, but not less than 2 seconds.
- If you design a failure sensing solution of your own, call Isis.ProcessFailed(who) to tell us if a process crashes.
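The relationship between the timeout and the failure detection delay is simple arithmetic, sketched below. The function name `failure_detection_seconds` is hypothetical, and the sketch deliberately does not set ISIS_DEFAULTTIMEOUT, because the slide does not state the units the variable expects.

```shell
#!/bin/sh
# Print the failure detection delay (twice the timeout) for a timeout given
# in whole seconds; refuse values below the 2-second floor from the slide.
failure_detection_seconds() {
    timeout_s=$1
    if [ "$timeout_s" -lt 2 ]; then
        echo "timeout must be at least 2 seconds" >&2
        return 1
    fi
    echo $((timeout_s * 2))
}
```

With the default, `failure_detection_seconds 45` prints 90; with a timeout of 5 seconds on a lightly loaded cluster, a crash would be detected after about 10 seconds.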
Help! I’ve been poisoned!
- If a process throws this exception, it means that some other process thought it had failed.
- If a dead process reappears, live members send it a “you have been poisoned” message.
- Prevents system partitioning.
- Rule in Isis2: only allow a single partition to remain alive at one time. If a partition forms, immediately shut one side down (the side lacking a majority).
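The majority rule can be illustrated with a one-line check. This is a sketch of the rule as stated on the slide, not of Isis2's internal logic; the function name `side_survives` is hypothetical, and treating an exact tie as "no majority" is an assumption the slide does not address.

```shell
#!/bin/sh
# Succeeds if a partition holding $1 of the $2 members has a strict majority.
side_survives() {
    side=$1
    total=$2
    [ $((side * 2)) -gt "$total" ]
}
```

If a five-member system splits 3/2, `side_survives 3 5` succeeds and `side_survives 2 5` fails, so the two-member side is the one that shuts down.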
Speeding up failure detection
- If a process will exit (rather than crash), call IsisSystem.Shutdown() first.
  - This rapidly announces the departure, and the process will immediately be removed from the groups it belongs to.
  - Like a fast failure notification, as if it said “bye!”
- You can also eliminate a group rapidly (without killing its members) using g.Terminate().
Hints for EC2 users
- On EC2 we recommend using ISIS_UNICAST_ONLY.
- EC2 gives you a “virtual cluster” with nodes numbered from IP address xxx.xxx.xxx.0. You can use this range to set ISIS_HOSTS even before launching your application.
- If you use the Master/Worker startup mode, you can tell the system the master is at: new Address(xxx.xxx.xxx.0, 0);
- This works because the master will run on node xxx.xxx.xxx.0 (due to ISIS_HOSTS) and the pid is ignored in the BeWorker call, so using 0 is fine.
Debugging Isis2 issues
How can it be done?
Debugging is hard…
… debugging distributed systems even harder
Useful tools:
- Visual Studio. Keep in mind that even an exception thrown inside Isis2 could be caused by a mistake in your code. All those upcalls will be issued from Isis2 stacks!
- You can call IsisSystem.GetState() to obtain a string representing the state of the Isis system itself. But you’ll need help from Cornell experts to understand this data.
- You can call IsisSystem.RunTimeStatsState() to obtain a self-explanatory string with counts of messages sent and received. The data itself is in IsisSystem.RTS, and you can access this at runtime.
Suggestions
- Isis2 is multithreaded. So write thread-safe code.
- Don’t block during upcalls from Isis2 into your code. The library assumes that upcalls will complete quickly and could malfunction otherwise.
- Isis2 has a lot of threads. Don’t let this worry you.
- We gave you the source code. If you notice a bug, post it to isis2.codeplex.com on the “issues” page.
- Post questions on the codeplex “discussions” page.