Distributed mega-scale Agent Management in MASS: diffusion, guarded migration, merger, and termination
Cherie Wasous
CSS_700 Thesis – Winter 2014 (Feb. 6)
Overall MASS Framework
(from: Romanus, css497 summer2013, “Developing and Extending the MASS Library (Java) Places.exchangeBoundary( )”)

Places
    Maintains & manages the Place locations
    Manages exchange between the Place locations
Place
    Maintains Place location data
    Provides a user software interface
Agents
    Maintains & manages the Agent units
    Manages the exchange and migration of Agent units
Agent
    Maintains the Agent data
    Provides a user software interface

Places / Place methods: callAll( ), callSome( ), exchangeAll( ), exchangeBoundary( ), callMethod( ), {User created functions}
Agents / Agent methods: callAll( ), manageAll( ), migrate( ), spawn( ), kill( ), callMethod( ), {User created functions}
MASS execution model
(from: Chuang, MS Thesis, “Design and Qualitative/Quantitative Analysis of Multi-Agent Spatial Simulation Library”)
Current state of “Climate” app for Evaluation

Places: timeSlots x numberOfDays
Each Place element: a 123 (XRANGE) x 162 (YRANGE) grid of the Pacific NW
Place compute: (bogus) moisture_flux and direction for each grid location; find max

    Places climateData = new Places( 1, "ClimateData", (Object)null, nTimeSlots, nDays );

In User Application:

    climateData.callAll( ClimateData.compute_, null );

In ClimateData.compute_:

    // fill in the (bogus) moisture flux for every grid location
    for ( int x = 0; x < XRANGE; x++ ) {
        for ( int y = 0; y < YRANGE; y++ ) {
            moisture_flux[x][y] = ( time + 1) * ( day + 1 ) * ( (double) x + ( (double) y / ) );
        }
    }
    // fill in the (bogus) direction for every grid location
    for ( int x = 0; x < XRANGE; x++ ) {
        for ( int y = 0; y < YRANGE; y++ ) {
            direction[x][y] = (double) x + ( (double) y / );
        }
    }
    // find the maximum flux in this place and remember where it occurred
    for ( int x = 0; x < XRANGE; x++ ) {
        for ( int y = 0; y < YRANGE; y++ ) {
            if ( maxFlux < moisture_flux[x][y] ) {
                maxFlux = moisture_flux[x][y];
                maxX = x;
                maxY = y;
            }
        }
    }
A made-up (bogus) “Climate” app for evaluations
    Place elements hold climate-related data for the Pacific NW.
    Agents search the place elements looking for a maximum value.
    As agents complete their job, they put the max they found into a Vector at the first place element for their node.
    The User App then collects these values and looks for the overall max value.
Current state of “Climate” app for Evaluation

Agents: create one agent at each timeSlot for Day Zero
Agent: if it finds a new max, copies it; migrates to the next Day (same timeSlot); on the last Day, deposits the max found into the Vector maxSeen and prepares to kill itself

    Agents agents = new Agents( 2, "MaxFinder", maxFinderArgs, climateData, nAgents );

    public int map( int maxAgents, int[ ] size, int[ ] coordinates ) {
        int currX = coordinates[0], currY = coordinates[1];
        if ( myRunMode == 1 ) {
            // place an agent at each timeslot for the first day
            if ( currY == 0 )
                return 1;
            else
                return 0;
        }
    }

In User Application:

    Agents agents = new Agents( 2, "MaxFinder", maxFinderArgs, climateData, nAgents );
    // keep going until every agent has killed itself
    while ( agents.totalAgents() > 0 ) {
        agents.callAll( MaxFinder.find_, (Object)null );
        agents.manageAll();
    }

In MaxFinder.find_:

    // copy the place's max if it beats what this agent has seen so far
    if ( this.maxFlux < ((ClimateData)place).maxFlux ) {
        this.maxFlux = ((ClimateData)place).maxFlux;
        this.maxX = ((ClimateData)place).maxX;
        this.maxY = ((ClimateData)place).maxY;
        this.maxDir = ((ClimateData)place).direction[maxX][maxY];
        this.maxTime = ((ClimateData)place).time;
        this.maxDay = ((ClimateData)place).day;
    }
    if ( currY == nDays - 1 ) {
        // dump this agent's value into Vector "maxSeen" for this process
        MaxClimateData temp = new MaxClimateData( this.maxFlux, this.maxDir, this.maxTime, this.maxDay, this.maxX, this.maxY );
        ((ClimateData)place).maxSeen.add( temp );
        kill( ); // set-up to kill this agent on next manageAll
    } else {
        migrate( currX, currY+1 );
    }
Current state of “Climate” app for Evaluation

Place Collect: searches the Vector maxSeen and returns the max value for this node

In User Application:

    Vector<MaxClimateData> results = new Vector<MaxClimateData>();
    Object[ ] tempArgs = new Object[ nDays * nTimeSlots ];
    Object[ ] valuesReturned = climateData.callAll( ClimateData.collect_, (Object[ ])tempArgs );

In ClimateData.collect_:

    if ( this is first place element on this node ) {
        // only the first place element on a node holds deposited values
        MaxClimateData thisNodeMax = new MaxClimateData();
        Iterator<MaxClimateData> iter = maxSeen.iterator();
        while ( iter.hasNext( ) ) {
            MaxClimateData temp = iter.next();
            if ( thisNodeMax.mcdFlux < temp.mcdFlux ) {
                thisNodeMax = temp;
            }
        }
        return (Object)thisNodeMax;
    } else {
        return null;
    }
Current state of “Climate” app for Evaluation

Finally, back in User Application: a final search of each node's returned value finds the overallMax

    // keep only the non-null values returned by the nodes
    for ( int i = 0; i < valuesReturned.length; i++ ) {
        if ( valuesReturned[ i ] != null ) {
            MaxClimateData gotValue = (MaxClimateData)valuesReturned[ i ];
            results.add( gotValue );
        }
    }
    // search the per-node values for the overall max
    Iterator<MaxClimateData> iter = results.iterator();
    MaxClimateData overallMax = new MaxClimateData();
    overallMax.mcdFlux = 0.0;
    MaxClimateData tempCD = new MaxClimateData();
    while ( iter.hasNext( ) ) {
        tempCD = iter.next();
        if ( overallMax.mcdFlux < tempCD.mcdFlux ) {
            overallMax = tempCD;
        }
    }

print overallMax
Summary of “Climate” app for Evaluation
    Places: timeSlots x numberOfDays
    Each Place element: a 123x162 grid of the Pacific NW
    Place compute: (bogus) moisture_flux and direction for each grid location; find max
    Agents: create one agent at each timeSlot for Day Zero
    Agent: if it finds a new max, copies it; migrates to the next Day (same timeSlot)
    Agent: on the last Day, deposits the max found into the Vector at place 0,0 and kills itself
    Place collect: the first place element on each node searches through the Vector and returns the max value for that node
    Finally, in the user app: a final search of the returned values finds the overallMax
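
The data flow summarized above can be sketched in plain Java without any MASS classes. This is only an illustration under stated assumptions: the class ClimateScanSketch, its two methods, and the array placeMax (one precomputed maximum per place, indexed [timeSlot][day]) are hypothetical names that do not appear in the app, and everything runs on a single node. Unlike the slide code, the search starts from negative infinity rather than 0.0.

```java
import java.util.ArrayList;
import java.util.List;

/** Plain-Java stand-in for the Climate data flow; no MASS classes are used. */
final class ClimateScanSketch {

    /**
     * One "agent" per time slot walks day 0 .. nDays-1 and keeps the largest
     * place maximum it has seen. placeMax[t][d] is the max of place (t, d).
     */
    static List<Double> scanByTimeSlot(double[][] placeMax) {
        List<Double> maxSeen = new ArrayList<>();   // stands in for the maxSeen Vector
        for (double[] slot : placeMax) {            // one agent per time slot
            double agentMax = Double.NEGATIVE_INFINITY;
            for (double dayMax : slot) {            // "migrate" to the next day
                if (agentMax < dayMax) {
                    agentMax = dayMax;              // copy the new max
                }
            }
            maxSeen.add(agentMax);                  // deposit on the last day
        }
        return maxSeen;
    }

    /** Final search over the deposited values, as the user application does. */
    static double overallMax(List<Double> maxSeen) {
        double best = Double.NEGATIVE_INFINITY;
        for (double value : maxSeen) {
            if (best < value) {
                best = value;
            }
        }
        return best;
    }
}
```

Calling overallMax( scanByTimeSlot( placeMax ) ) gives the same answer as scanning every place directly; the two-stage shape only mirrors how the agents and the collect step divide the work.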
Snapshot of Performance, Thursday, February 06, 2014

All runs: 1 process, 12 agents, 12 timeSlots x 364 days, runmode=1 (an agent at day zero for each timeslot; migrate through the days, one at a time; at the last day put the max into the place 0,0 maxSeen Vector)
All times: in milliseconds

Machines:
    UW-B 302 corner PC (UW QZLP1.uwb.edu): Xeon QuadCore W3520, 2.7GHz, 4 cores / 8 th, 45nm; L1=256kB, L2=4x256kB, L3=6-MB (1.5M/core); 6-GB ram
    Cherie's Laptop: i7-3615QM, 2.3GHz, 4 cores / 8 th, 22nm, 2Q'13; L1=, L2=IntelSmartCache=6MB; 8-GB ram
    UW-B 302 Hercules (hercules.uwb.edu): Xeon E5520, 2.27GHz, (4 cores / 8 th)*2, 45nm, Q1'09; 8Mi SmartCache (?cpuinfo: 8 siblings, 16 proc.); 6-GB ram

Notes: (Package 2 CPUs) (the 2nd th./core shares ALU/FP)

Table columns (repeated for each machine): # th | mass init | Places create | Places compute | Agents search | Places collect | user main sort | mass finish
Next Steps
    Add to current “Climate” app:
        runMode=0: only use Places (each place sends its max to main, then main sorts)
        runMode=2: spawn more Agents after startup
    Scripts to gather & present performance data for various # proc/th/runModes
    Begin Agent enhancements in MASS code, then re-evaluate with the “Climate” app
“diffusion, guarded migration, merger, and termination”

Modify Agents: overload the constructor

    Agents( byte inject, byte guardedMigration, int handle, String className, Object argument, Places places )

Instantiates a set of agents from the “className” class according to the technique indicated by “inject”:
    0 = chaotic (nThreads*nProc agent elements)
    1 = controlled (nThreads*nProc agent elements)
    2 = one-way scan (size[0] in 1D, size[1] in 2D, size[2] in 3D)

These agent elements migrate per the “guardedMigration” algorithm:
    0 = naive (collisions are not a concern; unlimited agents per place)
    1 = greedy (at most 1 agent per place)
    2 = best-effort at greedy (try not to have more than 1 agent per place, but on occasion there will be more than 1)
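
To make the three guardedMigration values concrete, here is a minimal single-process sketch of one way such a guard could admit or refuse moves. It is not the MASS implementation, which is still to be written. The class GuardedMigrationSketch, the method resolve, and the flat place indices are hypothetical. In particular, modeling "best-effort" as a check against an occupancy snapshot taken at the start of the round is an assumption chosen only to show how more than one agent can occasionally end up in a place.

```java
/** Hypothetical single-process model of the three guardedMigration policies. */
final class GuardedMigrationSketch {
    static final byte NAIVE = 0;        // collisions are not a concern
    static final byte GREEDY = 1;       // at most 1 agent per place
    static final byte BEST_EFFORT = 2;  // assumption: guard uses a stale snapshot

    /**
     * Resolves one round of migration requests, processing agents in index order.
     * current[a] and requested[a] are place indices in [0, nPlaces).
     * Returns the place each agent occupies after the round.
     */
    static int[] resolve(byte policy, int[] current, int[] requested, int nPlaces) {
        int[] live = new int[nPlaces];          // up-to-date occupancy counts
        for (int place : current) {
            live[place]++;
        }
        int[] snapshot = live.clone();          // occupancy as of the round start
        int[] result = current.clone();
        for (int a = 0; a < current.length; a++) {
            int dest = requested[a];
            if (dest == current[a]) {
                continue;                       // agent did not ask to move
            }
            boolean admit;
            if (policy == NAIVE) {
                admit = true;                   // always move
            } else if (policy == GREEDY) {
                admit = live[dest] == 0;        // destination must be empty right now
            } else {
                admit = snapshot[dest] == 0;    // destination was empty at round start
            }
            if (admit) {
                live[current[a]]--;
                live[dest]++;
                result[a] = dest;
            }
        }
        return result;
    }
}
```

With three agents at places 0, 1, 2 all requesting place 3, the greedy guard in this sketch admits only the first agent, while the naive and best-effort guards admit all three. When the destination is already occupied at the start of the round, greedy and best-effort both refuse the move.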
“diffusion, guarded migration, merger, and termination”

Modify Agent abstract class: overloaded version of migrate

    public boolean migrate( byte diffusion )

Initiates an agent migration upon the next call to Agents.manageAll( ).
“diffusion” indicates the migration technique to use:
    0 = chaotic
    1 = controlled
    2 = one-way scan
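
The slides do not define the three diffusion techniques further. As a rough illustration of the one-way scan case only, the sketch below computes the next coordinates for an agent that advances one step along a single axis and stops at the far edge, which is the pattern the Climate agents already follow with migrate( currX, currY+1 ). The class OneWayScanSketch, the method nextStep, and the axis parameter are hypothetical and are not part of MASS.

```java
/** Hypothetical helper showing a one-way scan step along one axis. */
final class OneWayScanSketch {

    /**
     * Returns the coordinates one step further along the given axis, or null
     * when the agent is already at the last index of that axis (the point at
     * which a scanning agent would deposit its result and terminate).
     */
    static int[] nextStep(int[] coordinates, int[] size, int axis) {
        if (coordinates[axis] >= size[axis] - 1) {
            return null;                    // end of the scan
        }
        int[] next = coordinates.clone();   // leave the caller's array untouched
        next[axis]++;
        return next;
    }
}
```

For the 12 x 364 Climate grid, an agent at (3, 0) scanning along axis 1 would move to (3, 1), and an agent at (3, 363) would get null back.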
MASS code base known problems:
    Congestion (seen in SugarScape app): occurs when using multi-threaded, multi-process MASS. Workaround for now: when using multi-process, use single-thread. (1/24/14)
    Hang: occurs with 2 processes, single-thread, when agents try to cross over to the other node. Testbed provided dslab on hercules: /cherie/ClimateNew4/testrun/runClimate.sh (2/4/14)
    Agents Object[] callSome() has no implementation code; it currently just returns null. This function was wanted for the agent code of bionetworks & climate in Fall.
    Minor: change “ERROR” to “WARNING” for the missing DLB config file (so users do not “panic”). This problem is still seen in code base of.new4; the message appears when the DLB.properties file is deleted entirely.
Questions ???