Why does my perfectly working App Crash and Burn in Production?
Matt Kramer
Project Manager, STL Boeing Scalability Test Lab
The Business Need for Load & Stress
- Time, money, and chaos impacts on core businesses
- Can the application support the users it needs to?
- Lost productivity if the product is slow
- Will it slow down when user counts get too high?
- Impacts on customer retention if your product becomes unstable or slows to a crawl
- What kind of response times can we expect in production?
- Life span of products
- How many users can the system support? Do we need more servers?
- Getting product support from 3rd-party solution providers while they are still under contract and before the big checks have been written
- Did the partner deliver on the contracted deliverable, and should we pay them?
- Support costs
- Will it still take 30 seconds to log in if we have 300 people on the system?
Key Challenges
- Getting good end-user information: ask a lot of questions, dig, mine the database for session info.
- Developing a good load profile: how many users, how long will they be on the system, how often will they complete tasks, are there peaks caused by work hours?
- Getting a good, production-like environment: is it networked correctly? If it is not a full production copy, at least have everything to scale.
- Getting workable code in enough time to develop results before a release: the age-old balance between functionality and access.
- Under the hood of the database: getting a good view into what is happening in the database. I3 for Oracle solutions is great.
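As a rough illustration of mining session info for a load profile, the sketch below takes login/logout pairs and finds the peak number of concurrent users, then uses Little's Law (N = arrival rate x time on system) to turn a target user count into a login rate for the load script. The session records, the function names, and the 30-minute average session length are all made-up assumptions; only the 300-user figure echoes the question asked earlier in the deck.

```python
from datetime import datetime

def peak_concurrency(sessions):
    """Return (peak_count, time_of_peak) from (login, logout) datetime pairs."""
    events = []
    for login, logout in sessions:
        events.append((login, 1))    # user arrives
        events.append((logout, -1))  # user leaves
    # At identical timestamps, process logouts (-1) before logins (+1)
    # so a back-to-back handoff is not counted as an overlap.
    events.sort(key=lambda e: (e[0], e[1]))
    current = peak = 0
    peak_time = None
    for timestamp, delta in events:
        current += delta
        if current > peak:
            peak, peak_time = current, timestamp
    return peak, peak_time

def arrival_rate_per_second(concurrent_users, avg_session_seconds):
    """Little's Law: N = rate * W, so rate = N / W (logins per second)."""
    return concurrent_users / avg_session_seconds

# Hypothetical session records, e.g. pulled from an application session table.
DAY = (2024, 1, 8)
SESSIONS = [
    (datetime(*DAY, 8, 0), datetime(*DAY, 9, 0)),
    (datetime(*DAY, 8, 30), datetime(*DAY, 10, 0)),
    (datetime(*DAY, 8, 45), datetime(*DAY, 9, 15)),
    (datetime(*DAY, 9, 0), datetime(*DAY, 11, 0)),
]

peak, when = peak_concurrency(SESSIONS)
print(f"Peak of {peak} concurrent users at {when:%H:%M}")

# Assumed target: 300 concurrent users, 30-minute average session.
rate = arrival_rate_per_second(300, 30 * 60)
print(f"Logins needed to sustain 300 users: {rate * 60:.1f} per minute")
```

Real session data is usually messier (missing logouts, timeouts), so the peak found this way is best treated as a starting estimate to confirm with the end users.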
Product Pre-Screening (if possible)
- Code quality
- Product overall maturity
- Coding standards
- SQL quality
- Architecture
Risk of Outsourced and 3rd Party
- Revenue - Cost = Profit. Capitalism as a system is always looking for ways to reduce cost.
- Even if a company has a product
- Good talent makes good products
- Good talent costs $. Ranked by cost (not by value):
  - 1st world talent
  - 1st world mediocrity
  - 3rd world talent
  - 3rd world mediocrity
- Talent vacuum: industry-wide issue with more demand than talent
- History: India as a traditional supply of individual talent, not products
- Talent market in India: short supply; workers are more likely to switch jobs for promotions; the largest and most aggressive customers tend to get the best resources for as long as they are noisy
- Communication gaps caused by:
  - Language
  - Culture
  - Time zones
  - Distance
- Dirty laundry: technical salespeople don't know, or won't share, the dirty secrets
- The Deal vs. the Real: technologically illiterate executives commonly make purchasing decisions
- Lack of access: getting access to fixes, answers, or details is next to impossible
Load & Performance Risk Factors
- Code maturity
- Overall architecture
- Has the product been migrated from another OS or platform, or from client-server?
- Has there been any re-architecture of the product recently?
- Degree of product customization possible
- Is the product cross-platform?
- Are there implementations in use of the type and size that your company needs (users and transaction size)?
- Are there implementations in use with a similar data set, database size, and data growth curve?
- Stability of the underlying technologies being used
- The number and complexity of integrations with different products
- Degree of change in recent releases
- High turnover of the development and support staff writing and supporting the application
- Quality of the database schema: normalization, upkeep, best practices
- Degree to which the product is being customized for your company
Information Needed (what will Production look like?)
- Architectural understanding
- User types - batch jobs or other system impacts
- Integrations
- Networking
- Any load balancing
Typical Issues/Bottlenecks
- SQL statements without any indexes that quietly increase response times
- Memory not being released by processes: shows up in longer test runs
- De-normalized databases, which cause lots of large multi-table joins and slow response times
- Load balancing or clustering solutions not fit for the volume of data they are supporting: common with master/slave configurations or applications not meant to support clustering. Check the resource usage on the different servers.
- Un-tuned servers or services: memory allocation, buffer sizes
- Code touching technology solution weaknesses: Windows hot-folders with extreme amounts of data
- Poor architecture
- Chatty applications: how large is each round trip, and how many round trips does a transaction take? Lots of round trips for an application act as a multiplier on response times.
- Reports: at large companies, executive reports or reporting applications can have huge impacts on the database.
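The first bottleneck in the list, SQL running without an index, can be spotted by asking the database for its query plan. The sketch below is only an illustration: it uses SQLite (bundled with Python) as a stand-in rather than Oracle, and the `orders` table, its columns, and the index name are hypothetical. Oracle and other databases expose the same idea through their own explain-plan tooling.

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustration only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10000)],
)

QUERY = "SELECT total FROM orders WHERE customer_id = ?"

def query_plan(connection, sql, params):
    """Return the planner's description of how it will run the statement."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)

# Without an index the planner has to scan every row of the table.
plan_before = query_plan(conn, QUERY, (42,))

# With an index on the filtered column it can search the index instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, QUERY, (42,))

print("before:", plan_before)
print("after: ", plan_after)
```

A full scan is cheap on a small test data set and grows with table size, which is why this kind of problem tends to stay hidden until the database reaches production volume.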