Logical Troubleshooting & RCA: Learn What You Need to Know

Chaotic troubleshooting at the first sign of symptoms can cost your organization opportunity, time, and money.
  • Lectures 25
  • Length 1 hour
  • Skill Level Intermediate Level
  • Languages English
  • Includes Lifetime access
    30 day money back guarantee!
    Available on iOS and Android
    Certificate of Completion

About This Course

Published 7/2015 English

Course Description

Based on extensive related hands-on practical experience, this course provides you with the skills and knowledge to effectively and efficiently apply Logical Troubleshooting and Root Cause Analysis (RCA) to provide valuable solutions for business and IT.

The course Logical Troubleshooting & RCA: Learn What You Need to Know is authored by Chuck Morrison, MBA, PMP, who has over 25 years of Program Management and Business Architecture experience in Silicon Valley, California.

Logical troubleshooting is a systematic and well-structured methodology for determining the root cause(s) of complex system problems and issues resulting from related symptoms and malfunctions. Logical troubleshooting is a structured approach used in program/project management, engineering, business and IT administration, diagnostic medicine, telecommunications, and training development to identify and eliminate possible causes, discover the root cause, and then return systems to proper functional operation.

All affected stakeholders, including sponsors, subject matter experts, and other resources, must be involved in collaborative development of viable solutions based on root cause for any executive decisions. This requires the leadership, skills, and knowledge of experienced analysts and architects capable of supporting an effective business solution needed to return business systems to proper operation.

Critical processes emphasized during this course are collaboration, listening, analysis, and modeling techniques needed for effective and efficient system operations solutions. This course focuses on helping you develop the skills and knowledge needed to help you support effective solutions and decisions regardless of your role.

If you find my course useful, please consider leaving a review and rating. Your review is much appreciated. You can go directly to the review page for this course then click and enter your review and rating.

Thank You and Best Regards,

Chuck Morrison, MBA, PMP

What are the requirements?

  • Some technical experience desired.
  • Ability to collaborate and listen for business wants and needs
  • Capability to capture and define business and technical requirements
  • Interest in the fields of business analysis, quality assurance, and information architecture
  • Ability to collect and organize tasks, activities and resources into diagrams, flow charts, and graphical models

What am I going to get from this course?

  • Identify key symptoms related to events causing problems and issues
  • Capture information to determine what's known about and related to the problem and issue
  • Develop a causal factor chart or sequence diagram to map events and causal relationships to symptoms to show investigators why events have occurred and how these can be addressed
  • Identify root cause(s) based on causal factor chart by creating a decision diagram or root cause map to address reasons the event occurred and how it can be addressed effectively
  • Present responsible stakeholders the information they need to make the right decisions for business and operational improvement based on root cause investigation team's recommendations.
  • Understand the difference between Logical Troubleshooting and Root Cause Analysis and why each is used.
  • Generate recommendations for prevention and measurement of causal or root cause event(s) and how to improve operations to prevent recurrence.

What is the target audience?

  • Product Owners and Sponsors
  • Subject Matter Experts (SMEs)
  • Business Process Managers
  • Business Process Users
  • Product, Project, Portfolio, and Program Managers
  • Business Analysts & Architects
  • System & Software Developers


Curriculum

Section 1: Why Are Logical Troubleshooting and Root Cause Analysis Needed?
00:41

Welcome to my course “Learn to Your Work-Breakdown-Structure from Use Cases”

Hello, I'm Chuck Morrison, an MBA and PMP certified Senior Program/Projects Manager and Business Architecture Professional.

My specialties are: Business Process Engineering, Software Systems Development, Cross-Functional Program and Change Management.

My significant skills and accomplishments include:

  • Over 20 years of expansive and diverse experience as a Program, Project and Portfolio Manager, Consultant and Business Architect/Analyst working for companies such as VMware, HP Enterprise Services, Hawaiian Airlines and DIRECTV.
  • Proven success in leading multiple, complex projects, process improvements and system migrations throughout the entire project lifecycle that generate cost savings of over $50M.
  • Managed a total of 27 concurrent, highly visible CPUC Rule 20 projects according to schedule and timeline across multi-locations and sites with a total budget of $40M for Pacific Gas and Electric Company (PG&E).
  • Extensive technology background with recognized business acumen to define and deliver small to large-scale, complex business process and systems infrastructure projects.

My significant accomplishments also include:

  • During my youth, I had the good fortune of calling home the awesome forests near Somersworth, New Hampshire, the exciting salmon runs of Adak, Alaska, and the beautiful mountains and beaches of California – from Eureka to Yosemite to San Francisco, Los Angeles, and San Diego. It was also to my good fortune in my learning experience to see and walk in every state in the United States at least once.
  • Later, it was my good fortune to experience the world on a global scale, from the breathtaking beauty and church bells of Frankfurt and Würzburg, Germany. Next, I found myself experiencing the evening sky of Tokyo, Japan and Mount Fuji from atop Tokyo Tower, followed by the bright red skies of Taipei in Taiwan and Manila Bay in the Philippines, then the busy international harbor of Kowloon near Victoria City, Hong Kong, and the intricate vistas on the Tonkin Gulf near Hai Phong as well as the rugged coastline near Ho Chi Minh City (formerly Saigon), Vietnam, and the exuberant beauty of the Sydney, Australia harbor.

And please, if you have any questions about any part of this training, or any questions related to this course or Udemy, please ask. You have my promise to find you an answer.

Please refer to course goals in 00-LT&RCALearn to Analyze Business Application Issues Root CausesCourseGoals.pdf attached.

02:44

Introduce self to class

Welcome and thank you for joining our course. Please take a moment to introduce yourself to me and the other students in our class using our Udemy Course Discussions to add then post your introduction.

Just include a little information about yourself, including your name and location. You don't have to be specific about location if you prefer … just include your state or city or country. Also, please let us know where you’re coming from.

Are you working full-time or part-time? Is this your first time taking or creating an online course? Is this your first time creating your own online business or making money online? Do you have a website? If so, please include your website address so we can find out a little more information about you and start following you on your own channel. If you’re on Facebook, Twitter, LinkedIn, or other social media, please let us know your contact information if you want to share.

Please contact me with any questions or suggestions you may have about our course.

If during this or any other of my courses, or after you’ve completed any of my courses, you have any questions or related suggestions for improvement; please don’t hesitate to contact me using Udemy’s Instructor Messaging system.

Simply click the Blue “Add Discussion” button then add your information and comments to the dialog box. When finished, click the Green “Post” button. That’s it … it’s that easy to communicate with me and other students on Udemy.

Remember, you have my promise to work with you to find an answer for your questions and suggestions, which may include course enhancements and/or adjustments or reviews and ratings. I look forward to hearing your comments and suggestions.

And if you find this course or any of my courses useful, please consider leaving a review and rating after completing it. Your review is much appreciated. You can go directly to the review page for any course then click and enter your review and rating.

I'm excited to meet you and just as I did in my “Welcome” video giving information about myself, I really am excited to get to know you better. Please take just 30 seconds to introduce yourself to the course; I will highly appreciate it. See you in the next video lecture.

Thank You and Best Regards, Chuck

01:47

Lecture 2 – What's the difference between Logical Troubleshooting and Root Cause Analysis?

Discussion

Logical Troubleshooting is a structured symptom/event-based problem solving method used to determine the source of problems and issues (what, when, where, how, how much) leading to any corrective action(s) needed to render or repair a system, process, or product functionally operational.

As with Logical Troubleshooting, Root Cause Analysis identifies what, where, when, how, as well as why something happened in order to prevent recurrence. Root causes are underlying and reasonably identifiable events which can be controlled by management decisions based on generation of recommendations for preventive actions. The process involves data collection, cause charting, root cause identification and recommendations for preventive management and implementation of measurable controls and processes.

In short, Logical Troubleshooting is designed to ensure effective and efficient corrective action(s) for symptoms, whereas Root Cause Analysis (RCA) is designed to ensure prevention of recurring events.

Note: This course is not intended to provide ALL details involved in Logical Troubleshooting and RCA, however, it does provide you with sufficient guidelines needed to discover then analyze problems and solutions for yourself more effectively and efficiently.

01:15

Lecture 3 – Imagine …

Discussion -

You and your team are responsible for a major, business system and were just notified that the system crashed and everyone is sitting around waiting for your system to become operational again.

You're part of a team that must support the company's production control and logistics delivery operation for several critical customers with symptoms you and your team have never seen nor heard of before.

More precisely, customers are beating down your company's doors for must-have immediate delivery of products and services. There is not a page written about processes or procedures, the people involved are people you've never met who do not know what to do next, and you haven't even a clue about what happened, when, or what the impact is on time or resources.

What do you do, where do you begin?

By completing this course, you will possess the set of tools and guidelines needed to create your action plan and move forward to resolving business and technical problems and issues using logical troubleshooting and root cause analysis. So, are you ready to get started?

01:12

Lecture 4 – Please Allow Me to Share a Few Related Quotes …

Discussion –

The problem with troubleshooting is that trouble shoots back. ~Author Unknown

Continuous improvement is not about the things you do well — that's work. Continuous improvement is about removing the things that get in the way of your work. The headaches, the things that slow you down, that's what continuous improvement is all about. ~Bruce Hamilton

Amateurs work until they get it right. Professionals work until they can't get it wrong. ~Author Unknown

The first rule of any technology used in a business is that automation applied to an efficient operation will magnify the efficiency. The second is that automation applied to an inefficient operation will magnify the inefficiency. ~Bill Gates

What gets measured, gets managed. ~Peter Drucker

01:32

Lecture 5 – Why Are Logical Troubleshooting and Root Cause Analysis Needed …

Discussion –

•Logical Troubleshooting is used in many business, science, and engineering related fields to develop and maintain complex processes and systems in which problems and issues exhibit varied symptoms and related causes.

•Structured problem solving and analysis, or Logical Troubleshooting, is a logical, structured process or method for identifying and diagnosing system failure and malfunction symptoms and rectifying causes and root causes.

•Diagnosis and process of elimination are used to determine causes of failure based on the functional requirements expected of a system. Inputs to the system are expected to produce specific measurable outputs.

•Troubleshooting isolates unexpected system output behavior leading to corrective and/or preventive measures.

•Root Cause Analysis (RCA) is typically a reactive method for identifying event root causes of failure then solving them. However, RCA can be used for preemptive action based on forecasting of probable events.

•RCA can involve Logical Troubleshooting as part of the incident management process.

01:49

Lecture 6 – What's This Course About?

Discussion –

Based on over 45 years of related hands-on practical experience, this course provides you with the skills and knowledge to effectively and efficiently troubleshoot and discover the root cause of opportunities and problems needed to provide valuable solutions for business and IT.

Logical troubleshooting is a systematic and well-structured methodology for determining the root cause(s) of complex system problems and issues resulting from related symptoms and malfunctions. Logical troubleshooting is a structured approach used in program/project management, engineering, business and IT administration, diagnostic medicine, telecommunications, and training development to identify and eliminate possible causes and to discover root cause then return systems to proper functional operation.

All affected stakeholders, including sponsors, subject matter experts, and other resources, must be involved in collaborative development of viable solutions based on root cause for any executive decisions. This requires the leadership, skills, and knowledge of experienced analysts and architects capable of supporting an effective business solution needed to return business systems to proper operation.

Critical processes emphasized during this course are collaboration, listening, analysis, and modeling techniques needed for effective and efficient system operations solutions. This course helps you develop the skills and knowledge needed to help you support effective solutions and decisions regardless of your role.

01:39

Lecture 7 – What Do You Get from This Course?

Discussion –

•Identify key symptoms related to events causing problems and issues

•Select investigation team

•Capture information to determine what's known about and related to the problem and issue

•Develop a causal factor chart or sequence diagram to map events and causal relationships of symptoms to show investigators why events have occurred and how these can be addressed

•Identify root cause(s) based on causal factor chart by creating a decision diagram or root cause map to address reasons the event occurred and how it can be addressed effectively

•Present responsible stakeholders the information they need to make the right decisions for business and operational improvement based on root cause investigation team's recommendations.

•Understand the difference between Logical Troubleshooting and Root Cause Analysis and why each is used.

•Generate recommendations for prevention and measurement of causal or root cause event(s) and how to improve operations to prevent recurrence.

Enables identifying, assigning, tracking, controlling, and managing Logical Troubleshooting and Root Cause Analysis (RCA) activities.

Aids capture & development of program/project scope, effort, budget and schedule.

00:36

Lecture 8 – What are the course requirements?

Discussion –

•Some technical experience desired.

•Ability to collaborate and listen for business wants and needs

•Capability to capture and define business and technical requirements

•Interest in the fields of business analysis and information architecture

•Ability to collect and organize tasks, activities and resources into diagrams and graphical models

00:22

Who's the Target Audience? –

•Subject Matter Experts (SMEs)

•Product Owners

•Business Process Managers

•Business Process Users

•Product, Project, and Program Managers

•Business Analysts & Architects

3 questions

What's the difference between Logical Troubleshooting and Root Cause Analysis?

Section 2: Reducing Chaos Using Logical Troubleshooting & Root Cause Analysis
03:07

Discussion –

Logical Troubleshooting can generally be structured into these basic steps:

  • Determining symptoms … what related problems and issues are present?
  • Capture problem related information. Describe the symptom(s). Are there any related circumstances or issues?
  • Analyze the information. Are symptoms related? List any possible cause(s).
  • Generate possible solution(s). How can cause(s) and/or root cause(s) be addressed?
  • Select the best solution or combination of solutions that can feasibly and effectively eliminate the problem root cause.
  • Plan implementation and testing of solution or combination of solutions.
  • Implement plan then measure results to ensure root cause corrective action is successful.
  • Follow-up to ensure the root cause is corrected then determine possible process improvement(s).

Logical Troubleshooting is a structured method focused on discovery of the cause(s) of a problem or issue then determining the needed steps for corrective action. The major goal of troubleshooting is to return the related system, product, or process to functional operation. In short, Logical Troubleshooting is designed to ensure effective and efficient corrective action(s) for symptoms, whereas Root Cause Analysis (RCA) is designed to ensure prevention of recurring events.

Logical troubleshooting involves organizing people, tools and equipment, policies, processes, products, systems, and procedures around symptoms or effects (what, who, when, where, how much) and around causes (why). These are often structured using Ishikawa, Kepner-Tregoe, and other structured analysis methods.

Troubleshooting is a logical, systematic approach to problem analysis applied to repair of failed processes and products. Troubleshooting employs discovery of the source of problems and issues resulting from symptoms so a process or product can be rendered operational again. Determining the most likely cause(s) is a process of elimination of possible alternate causes then confirming a feasible alternative solution or solutions needed to remedy the cause(s) of malfunction.

A system is described in terms of intended expectations as to functional behavior and structure. Events or inputs to a system are expected to result in specific outputs. Corrective action is taken to correct failures and prevent similar failures from reoccurring.
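The idea that inputs should produce specific, measurable outputs can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `OutputSpec` type, the output names, and the tolerance values are hypothetical and do not come from the course materials.

```python
from dataclasses import dataclass


@dataclass
class OutputSpec:
    """One expected system output (hypothetical example type)."""
    name: str
    expected: float
    tolerance: float  # allowed absolute deviation from the expected value


def find_faults(specs, readings):
    """Return the names of outputs that are missing or out of tolerance."""
    faults = []
    for spec in specs:
        value = readings.get(spec.name)
        if value is None or abs(value - spec.expected) > spec.tolerance:
            faults.append(spec.name)
    return faults


# Hypothetical specification and measured readings.
specs = [
    OutputSpec("supply_voltage", expected=24.0, tolerance=1.0),
    OutputSpec("flow_rate", expected=10.0, tolerance=0.5),
]
readings = {"supply_voltage": 23.6, "flow_rate": 7.2}

print(find_faults(specs, readings))  # ['flow_rate']
```

Comparing what is happening against what should be happening per specification narrows attention to the outputs that actually deviate, before any corrective action is chosen.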

03:17

Root Cause Analysis can generally be structured into these basic steps:

  • Gather preliminary data – determine what happened: how long in existence, impact
  • Select the problem solving team
  • Define the problem – What, Who (Actors), When, Where, How Much (Impact) – Specific Symptoms, event sequence, conditions, significance
  • Contain the problem – Causal Factors and risks … drill baby drill
  • Identify Root Cause – determine why it happened: Physical, Human, Organizational
  • Identify Corrective Action and Metrics
  • Implement Corrective Action
  • Prevent Recurrence Measures – reduce the probability of recurrence.
  • Congratulate team

Root Cause Analysis identifies what, where, when, how, as well as why something happened in order to prevent recurrence. Root causes are underlying and reasonably identifiable events which can be controlled by management decisions based on generation of recommendations for preventive actions. The process involves data collection, cause charting, root cause identification and recommendations for preventive management and implementation of measurable controls and processes.

Root cause analysis (RCA) is a process designed for use in investigating and categorizing the root causes of events with safety, health, environmental, quality, reliability and production impacts. The term “event” is used to generically identify occurrences that produce or have the potential to produce these types of consequences.

Simply stated, RCA is a tool designed to help identify not only what and how an event occurred, but also why it happened. Only when investigators are able to determine why an event or failure occurred will they be able to specify workable corrective measures that prevent future events of the type observed.

Understanding why an event occurred is the key to developing effective recommendations. Imagine an occurrence during which an operator is instructed to close valve A; instead, the operator closes valve B. The typical investigation would probably conclude operator error was the cause.

01:19

Lecture 12 – Problem Perceptions – In the Eye of the Beholder (Team)

This diagram shows that a problem is perceived differently by each stakeholder because of their role as well as their individual knowledge, skills, and experience. The nature of the core team selected allows collaboration from different perspectives on the reported problem.

The perception of each actor (Support Person, Process or Operations Person, Quality Assurance, Information Technology, and other core team members) is focused on the problem based on stakeholder goals, values, roles, skills, knowledge, and experience. Problem perceptions are both an aid and a limiter to team collaboration and consensus efforts.

Team-building is a significant consideration for effective and efficient troubleshooting and RCA. Data collection, organization, critical listening and feedback, functional decomposition, and diagramming skills are essential. Selecting the right solution to the right problem is more than in the eye of the beholder.

01:51

Lecture 13 – Problem Solving Process … Getting to the Solution

This diagram serves as a guideline for the Basic Problem Solving Process. Problem Identification – What prevents you from reaching your goal?

•Problem Identification (Recognition) – Gather data, relationships, context information

•Problem Description (Definition) – Description of a question involving doubt, uncertainty, or difficulty requiring reasoning to resolve.

•Problem Analysis (Elaboration) – Process of collecting and analyzing data to determine the cause of a failure. Is an immediate solution needed? What specifically is the failure? Who knows about the failure and its context?

•Root Cause(s) Identification (Brainstorming/Mapping) – a method of problem solving used for identifying the root causes of faults or problems

•Root Cause Elimination – Preventive action … root cause analysis aids transformation from a reactive culture (one that reacts to problems) into a forward-looking culture (one that solves problems before they occur or escalate).

•Symptom Monitoring - Program for observation, supervision, and control of the events and activities of other programs and systems.

Each of these steps is typically used to get to problem solutions and will be discussed in more detail as you progress through this course …

05:59

Lecture 14 – Logical Troubleshooting Steps & Philosophy/Strategy – Document Findings and actions – prevent “recreating the wheel”

Provide System History with Troubleshooting Charts & Logs – Problems, Symptoms, Corrective & Preventive Actions/Modifications

The Logical Troubleshooting Philosophy/Strategy employs the following guidelines …

Symptom Recognition – System Output to input required results comparison measurements to discover specific faults

Verify a problem actually exists – a change from required/expected performance quality vs a ghost problem, e.g., is the power on, are lights flashing, is there a flow?

Understand what is really a fault based on system requirements – what's happening vs what should be happening per specification

Symptom Elaboration – Looks for other problems/issues … what are the specific symptoms - Isolate possible Cause(s)

Is the problem total system, is the source outside our context? Is it sporadic, how often & where does the failure occur? When did it last occur and where?

Know the system … is it showing symptoms of an impending failure or past failures?

Differentiate the characteristics of each failure symptom & context. Can these symptoms be produced by similar or related symptoms actions and events? Have you observed other indicators that may be related to the failure? Do measurements drift? How much, When? Is there a correlation among the samples?
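One way to answer "Do measurements drift? How much?" is to fit a straight line through a series of readings and look at its slope. The sketch below assumes evenly spaced samples; the function name and the sample values are hypothetical.

```python
def drift_per_sample(readings):
    """Least-squares slope of the readings against their sample index."""
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings to estimate drift")
    mean_x = (n - 1) / 2
    mean_y = sum(readings) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(readings))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical temperature readings taken at equal intervals.
temperatures = [20.0, 20.5, 21.0, 21.5, 22.0]
print(drift_per_sample(temperatures))  # 0.5
```

A slope near zero suggests a stable measurement; a steady positive or negative slope is worth writing down as a possible indicator related to the failure.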

Does manipulation of available controls have any effect on symptoms that can or do eliminate possible causes? What are the readings and specifications?

Write it down … do not leave anything to memory. Note any observed contributing factors

Listing of Probable Faulty Functions – Identify problem … Use common problem and repairs list for reference to past problems and fixes (events occurring more than once)

Use a functional block, activity, and/or context diagram to show events, symptoms, and cause relationships

Include monitoring devices related to system functions to determine if the system and components are working as required and specified

Could or would a failure of function(s) or similar functions cause these symptoms?

Collaborate & Corroborate findings with a known system and function(s) SMEs

Localizing the Faulty Function or Component – locate possible cause(s) vs symptoms – evaluate & test

Determine actual cause based on measurement of time, pressure, speed, sequence, delay, temperature or other variable parameter related to system operating functions to determine an actual/contributing component or functional failure(s).

List all pressures, flows, inputs, outputs and triggering events associated with faulty function(s)

Recheck any abnormal readings – note: first discovery does not necessarily indicate actual fault – evaluate each faulty function to discover most probable system(s) sources

Remove suspected failed components then test separately from the system to ascertain specified functioning of complex components.

Failure Analysis & corrective action

Repair or replace failed components as needed to render system operational as soon as possible

Avoid repair and replacement until exact cause of failure is known

Document failure rates for repairs and replacements visually and through measurement to avoid recurring failures and system deficiencies
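Documented repairs and replacements can be tallied to reveal recurring failures. The following is a minimal sketch, assuming repairs are recorded as (component, action) pairs; the record layout and component names are hypothetical.

```python
from collections import Counter


def recurring_failures(repairs, min_count=2):
    """Return components that were repaired or replaced at least min_count times."""
    counts = Counter(component for component, _action in repairs)
    return {name: n for name, n in counts.items() if n >= min_count}


# Hypothetical repair history taken from a troubleshooting log.
repairs = [
    ("pump seal", "replaced"),
    ("fuse F2", "replaced"),
    ("pump seal", "replaced"),
    ("pump seal", "repaired"),
]

print(recurring_failures(repairs))  # {'pump seal': 3}
```

A component that keeps appearing in this tally is a candidate for root cause analysis rather than another quick repair.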

Retest Requirements – Verify problem is corrected

Once the system is returned to operation, verify operation of any failure related affected functions … perform any verifications needed to ensure required system operation.

Trouble_Log.png, TroubleProblemExists.png, TroubleFlow.png

________________________________________________________

Lecture 14 – Logical Troubleshooting Problem Flow –

This diagram serves as a guideline for the troubleshooting team to verify the problem with the operator involved in the failure, ruling out operator-induced problems before conducting further troubleshooting analysis.

TroubleProblemExists.png

________________________________________________________

Lecture 14 – Logical Troubleshooting Flow –

This Logical Troubleshooting Flow diagram serves as a guideline for more detailed collection, quick problem fixes and analysis of troubleshooting related items during problem analysis and can be used as a source for reference documentation and future troubleshooting including item, symptoms, corrective action, technician, date, and comments. It also serves to remind the troubleshooting team to search for troubleshooting aids availability. When closely followed it serves to ensure the right problem is corrected and serves as an aide to further root cause and preventive measures analysis.

TroubleFlow.png

________________________________________________________

Lecture 14 – Logical Troubleshooting Log –

This log sheet serves as a guideline for collection of troubleshooting related items during problem analysis and can be used as a source for reference documentation and future troubleshooting including item, symptoms, corrective action, technician, date, and comments.

Trouble_Log.png
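The log fields named above (item, symptoms, corrective action, technician, date, and comments) map naturally onto a small record type. The sketch below is an assumption about how such a log could be kept electronically; it is not the layout of Trouble_Log.png, and the sample entry is invented for illustration.

```python
import csv
import datetime
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class LogEntry:
    """One row of a troubleshooting log (hypothetical structure)."""
    item: str
    symptoms: str
    corrective_action: str
    technician: str
    date: datetime.date
    comments: str = ""


def to_csv(entries):
    """Render log entries as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(LogEntry)])
    writer.writeheader()
    for entry in entries:
        writer.writerow(asdict(entry))
    return buffer.getvalue()


# Invented sample entry.
log = [
    LogEntry(
        item="Conveyor motor",
        symptoms="Will not start",
        corrective_action="Reset tripped breaker",
        technician="J. Doe",
        date=datetime.date(2015, 7, 1),
        comments="Operator reported power loss",
    )
]
csv_text = to_csv(log)
print(csv_text)
```

Keeping the log in a plain, searchable format makes it easier to consult during future troubleshooting.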

06:23

Lecture 15 – Using Flowchart & Schematics for Troubleshooting

In this and the previous lessons, we translate logical troubleshooting and root cause finding into flowcharts and schematics as a guide to future system troubleshooting. Regardless of its application, troubleshooting does not substantially differ by WHAT is done, but rather HOW it is done … this strategy or approach is used to diagnose system failure and return the system to specified functional operation. The following tactics or general steps (guidelines) are used in approximate order to implement troubleshooting strategy … refer to the previous diagrams TroubleFlow.png and TroubleProblemExists.png

Talk with the Operator(s) – Operators know the system and when it has a problem and can restate problem indicators or symptoms with subtle nuance as needed to describe symptoms helpfully or sometimes not. Use patience with the right questions so operators feel their contribution is useful. Often the operator knows exactly what the problem is, but has trouble expressing what is seen, heard, or knows in specific terms, so listening attentively is key. Talking with the operator is always a valuable first step in troubleshooting.

Verify Symptoms – Eyewitness testimony in not infallible so the reported malfunction system(s) must be verified. Often the power was turned off or a switch was accidentally switched, cable pulled, or typographical errors, so it's critical to verify the symptoms before wasting valuable time chasing ghosts. However, If the failure is operator induced, please be kind when restoring the system to operation in their presence. Often, the system can be returned to normal operation with a few simple steps by the operator by phone.

Attempt Quick Fixes – Look for simple fixes when possible. Is a fuse blown? is the system plugged in? Is a gasket of plug loose? In a way, quick fixes are a form of preventive maintenance … clearing symptoms before severe failures occur. General service before failures occur, whether needed not, is always a good tactic. It's Murphy's Law. System failures will and do occur at the most inopportune time in the most inopportune place … bulbs burn out in the darkest places … remember the Boy Scout Motto: “Be Prepared”. Learn from your failures.

Review Troubleshooting Aids – Once you've talked to the operator(s), verified symptoms, and tried quick fixes, the actual fault may still not have been found. Now is the time to monitor system sensor readings; note, however, that sensors may also fail. Review related tags, manuals, and logs for indications of failure or potential failure. Manuals can provide specific troubleshooting trees and flowcharts, while procedural troubleshooting aids often provide pre-packaged step-by-step analysis that makes up for a lack of system knowledge. Sophisticated diagnostic programs using video may also be available on the Internet.

Step-by-Step Search – Preferred search methods must yield the most relevant system information effectively and efficiently. One such method is the “Divide and Conquer” or problem-splitting strategy: a point approximately mid-way in the system or process is chosen and tested. If the test is successful, the preceding portion of the system is considered OK and is eliminated from further testing. Then the next approximate mid-point is selected and tested until, by process of elimination, the system failure is isolated. The same procedure can often be performed using component swapping: starting near the mid-point of the system, components are replaced one at a time with known good components until the failure clears. When the failure is found, it is corrected. Other methods include signal and flow tracing based on waveform, statistical correlation, or other similar forms of analysis.
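
The problem-splitting search described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system is a linear chain of stages, there is exactly one fault, and every stage downstream of the fault shows a bad output. The stage names and the probe function are hypothetical and are not part of the course material.

```python
from typing import Callable, List, Sequence


def half_split_search(stages: Sequence[str], output_ok: Callable[[int], bool]) -> str:
    """Return the first stage whose output is bad.

    Assumes a single fault in a linear chain: output_ok(i) is True for every
    stage before the fault and False for the faulty stage and all later ones.
    """
    low, high = 0, len(stages) - 1  # the fault lies somewhere in stages[low..high]
    while low < high:
        mid = (low + high) // 2
        if output_ok(mid):
            low = mid + 1   # everything up to and including mid is cleared
        else:
            high = mid      # the fault is at mid or earlier
    return stages[low]


# Hypothetical system and fault location, for illustration only.
stages = ["power supply", "sensor", "amplifier", "controller", "actuator", "display"]
faulty_index = 3
tests_run: List[str] = []


def probe(i: int) -> bool:
    """Stand-in for a real measurement taken at the output of stage i."""
    tests_run.append(stages[i])
    return i < faulty_index


found = half_split_search(stages, probe)
print("Faulty stage:", found)
print("Points tested:", tests_run)
```

With six stages, the fault is isolated after three test points rather than up to six, which is the practical payoff of testing at the mid-point each time.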

Clear the Trouble – Once the cause of failure is found, it must be repaired or cleared then verified.

Perform Preventive Maintenance – PM is the routine process of clearing problems before they occur. It is usually performed to a schedule and must be documented as performed.

Make Final Checks – Perform final verification of specified functional system operation.

Complete Paperwork – The job is NOT finished until the paperwork is completed, including PM, records, and system history logs … it's a critical part of the troubleshooting strategy and is needed for future troubleshooting.

Inform Area Supervision/Instruct Operator – Once the system is operational again, all stakeholders must be informed or trained and cautioned about any related peculiarities.


00:43

Lecture 16 – Managing Cause and Effect Diagrams

Based on Problem Effect or Event, Identify Categories such as: People, Tools, Equipment, Policies, Procedures, Other … Then do the following:

  • Identify the Trouble or Problem
  • Draw a Main Line Pointing to the Problem
  • Identify the Possible Major Causes of the Problem
  • Identify Each Possible Minor Cause Associated With the Major Causes
  • Identify Each Contributing Factor to the Minor Causes
  • Review the Cause and Effect Diagram
  • CauseAndEffect.png
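
As a rough illustration of the steps above, a cause and effect diagram can be captured as nested data before it is drawn. This is a minimal sketch, not the course's own tooling: the problem statement, causes, and contributing factors below are invented examples, and only the category names come from the lecture text.

```python
from typing import Dict, List

# Hypothetical example data: problem -> major cause category -> minor cause -> factors.
diagram = {
    "problem": "Order confirmation emails arrive late",
    "causes": {
        "People": {"Operator skips queue check": ["No shift handover note"]},
        "Equipment": {"Mail server overloaded": ["Disk nearly full", "Single server"]},
        "Procedures": {"Batch job runs hourly": []},
    },
}


def render(diagram: Dict) -> List[str]:
    """Flatten the diagram into indented text lines for review."""
    lines = [f"PROBLEM: {diagram['problem']}"]
    for major, minors in diagram["causes"].items():
        lines.append(f"  {major}")                   # major cause (category)
        for minor, factors in minors.items():
            lines.append(f"    - {minor}")           # minor cause
            for factor in factors:
                lines.append(f"        * {factor}")  # contributing factor
    return lines


outline = render(diagram)
print("\n".join(outline))
```

Keeping the diagram as data makes the final review step easier, because empty branches (such as a minor cause with no contributing factors yet) stand out.
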
01:17

Lecture 17 – How Can We Test and Measure Information?

Quantitative Measures – involve performing statistical analysis on data that has numerical values based on an objective scale of observations … when something is measured with a numerical value, quantitative data is created.

Qualitative Measures – search for patterns in non-numerical data using categorical, natural-language descriptions of things that cannot be measured objectively but can be observed subjectively, such as smells, tastes, textures, attractiveness, and color. When something is classified using judgment, qualitative data is created.

Translate troubleshooting and root cause findings into control charts for continuous tracking of process problems that result in exceptions to the upper and lower control limits (UCL/LCL), as a guide to future system troubleshooting. Process measurements above the UCL or below the LCL are considered process failures requiring root cause analysis, corrective action, and preventive analysis.
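
The control-limit check can be sketched as follows. This is a simplified example under assumptions that are not stated in the course: limits are set at the baseline mean plus or minus three sample standard deviations (a common convention), and all measurement values are made up for illustration.

```python
from statistics import mean, stdev
from typing import List, Sequence, Tuple


def control_limits(baseline: Sequence[float], sigmas: float = 3.0) -> Tuple[float, float, float]:
    """Return (LCL, center line, UCL) from in-control baseline data.

    Assumption: limits are mean +/- `sigmas` sample standard deviations.
    """
    center = mean(baseline)
    spread = stdev(baseline)
    return center - sigmas * spread, center, center + sigmas * spread


def out_of_control(samples: Sequence[float], lcl: float, ucl: float) -> List[Tuple[int, float]]:
    """Return (index, value) for every measurement outside the limits."""
    return [(i, x) for i, x in enumerate(samples) if x < lcl or x > ucl]


# Hypothetical process measurements.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = control_limits(baseline)

new_samples = [10.0, 10.3, 11.2, 9.9, 8.7]
flagged = out_of_control(new_samples, lcl, ucl)

print(f"LCL={lcl:.2f}  center={center:.2f}  UCL={ucl:.2f}")
print("Needs root cause analysis:", flagged)
```

Each flagged point is a candidate for the root cause analysis, corrective action, and preventive analysis described above; points inside the limits are treated as normal process variation.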

03:34

Lecture 18 – Decision Making … Getting to Results … Kepner-Tregoe

The Kepner-Tregoe (KT) Problem Solving & Decision Making method is a structured process that supports successful problem and issue resolution, prioritization of issues, solid decision-making, and analysis of opportunities and risks. The method promotes clear thinking by cutting through business and technology clutter, and it aids the decisive action needed to address the complex challenges of any organization. The KT processes, combined with the Logical Troubleshooting and RCA methods taught in this course, become the foundation for effective business and technical decision-making leadership.

The Kepner-Tregoe (KT) Problem Solving & Decision Making processes include …

Situation (Context) Appraisal: Complex Situations Context … How to clarify, prioritize, and manage issues and concerns within a complex situation or context. In this process, a plan is developed for effective resolution based on analysis of required stakeholder roles when actions are to be taken. The process clarifies complex situations with priorities to avoid misdirected actions.

Problem (Failure) Analysis: Working through complex system failures … Discover cause(s) through organization and analysis of critical failure-related information. Possible causes are identified and then tested against the related findings. Root cause(s) for the failure are then verified, and any related causes and their impact are considered for corrective and preventive actions. This process ensures that valid cause(s) are known prior to corrective actions.

Decision (Risk) Analysis: Working through complex decisions … Clarify the measurable objectives needed for solid decision-making. Evaluate the appropriate range of alternatives and assess the risk impact prior to decision-making. This process ensures informed decision choices with maximum benefit and minimum risk.

Potential Problem (Impact) Analysis: Risk management of potential failures … Anticipate threats and vulnerabilities through planning. Consider the risks of potential failures and how to prevent them. Consider specific triggers for contingent (preventive) action in the event of failure. This process provides the advance planning needed to prevent unplanned reactive action(s).

Opportunity (Leverage) Analysis: Leveraging opportunities … Plan to anticipate and leverage planned actions to ensure improved benefits. Causes of, and ways to promote, potential opportunities are explored. Consider how to capitalize on related actions and triggers when an opportunity occurs. This process aids advance preparation to gain added advantage when situations result in better opportunities than expected.
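
To make the Decision (Risk) Analysis step more concrete, here is a minimal sketch of a weighted scoring matrix, one common way to compare alternatives against measurable objectives. It is not the official Kepner-Tregoe worksheet; the objectives, weights, alternatives, and scores are all hypothetical.

```python
from typing import Dict

# Hypothetical objectives with importance weights (1 = low, 10 = high).
objectives: Dict[str, int] = {
    "restores service quickly": 10,
    "low cost": 6,
    "low risk of recurrence": 8,
}

# Hypothetical alternatives, each scored 1-10 against every objective.
alternatives: Dict[str, Dict[str, int]] = {
    "restart service": {
        "restores service quickly": 9, "low cost": 10, "low risk of recurrence": 2,
    },
    "patch and redeploy": {
        "restores service quickly": 7, "low cost": 7, "low risk of recurrence": 8,
    },
    "replace hardware": {
        "restores service quickly": 3, "low cost": 2, "low risk of recurrence": 9,
    },
}


def weighted_score(scores: Dict[str, int], weights: Dict[str, int]) -> int:
    """Sum of weight x score over all objectives."""
    return sum(weights[name] * scores[name] for name in weights)


totals = {name: weighted_score(scores, objectives) for name, scores in alternatives.items()}
ranking = sorted(totals, key=totals.get, reverse=True)

for name in ranking:
    print(f"{totals[name]:4d}  {name}")
```

The totals only rank the alternatives; the risk impact of the leading choice still needs to be assessed before the decision is made, as the text above describes.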

2 questions

How do we reduce chaos through Logical Troubleshooting and RCA?

Section 3: When Do We Use Logical Troubleshooting & RCA Methods?
01:18

Lecture 19 – When do we start Logical Troubleshooting and Root Cause Analysis?

Logical Troubleshooting is an approach used to resolve and immediately correct a reported system failure. Problem solving through incident investigation, troubleshooting, and root cause analysis all respond to the same questions: what is the real problem, why did it happen, how can it be corrected, and what will be done to prevent it from recurring?

Root cause analysis is used to get past the symptoms of a failure event by identifying the underlying cause(s) of an incident, in order to discover and implement the most effective solution(s) and provide recommendations for future preventive measures. The three fundamental steps for Cause Mapping are:

Define the issue by impact on overall goals and objectives

Analyze causes using visual mapping of the triggers, symptoms, causes, and stakeholders within related environment

Mitigate, eliminate, or prevent significant probable negative impact to goals by selecting the most effective and right solutions.

01:03

Lecture 20 – When do we stop Logical Troubleshooting and Root Cause Analysis?

Stop troubleshooting when your corrective action solution has resolved the right problem and returned your system to functional operation.

Stop root cause analysis when your cause is discovered and your recommendations are accepted by the responsible stakeholders.

If you pay attention to the details you are more likely to be effective in troubleshooting with the right corrective action.

Learn to observe the details as you work through solving a problem; often the smallest details make all the difference.

After systematic thought, observation, and collaboration, the point is reached for corrective action: it's time to implement our considered root cause solution. Make the repair or adjustment to render your system functionally operational, then stop, look, and listen for any missed or omitted problem symptoms.

01:28

Lecture 21 – What Do Activity Diagrams Look Like? … ActivityDiagram.jpg

As you can see in the activity diagram, each set of activities is contained in a swimlane, such as configure schedule, spider crawl or indexer, protected logon, user search, and other activity swimlanes as needed.

Each swimlane begins with an initial node and may contain several connected activities or tasks. Each swimlane may also require decisions and/or loopbacks, merges, forks, and/or joins to connect activities to other activities within each swimlane scenario. Upon completion of a set of swimlane activities, the sequence terminates with a flow final. Business rules within each decision determine the direction of flow from decisions and activities in the activity diagram.

Activity diagram symbols:

  • Initial Node(s)
  • Activities (Tasks)
  • Decisions/Merge (Loopbacks) – Forks/Joins
  • Swimlanes (Domain/Context/Scope)
  • Flow Final
  • Business Rules
1 question

When do we use Logical Troubleshooting and RCA?

Section 4: Conclusion ...
01:45

Lecture 22 – Logical Troubleshooting & RCA – Conclusion

We made it … we've completed all our Course Goals

  • Identify key symptoms related to events causing problems and issues
  • Capture information to determine what's known about and related to the problem and issue
  • Develop a causal factor chart or sequence diagram to map events and causal relationships to symptoms to show investigators why events have occurred and how these can be addressed
  • Identify root cause(s) based on causal factor chart by creating a decision diagram or root cause map to address reasons the event occurred and how it can be addressed effectively
  • Present responsible stakeholders the information they need to make the right decisions for business and operational improvement based on root cause investigation team's recommendations.
  • Understand the difference between Logical Troubleshooting and Root Cause Analysis and why each is used.
  • Generate recommendations for prevention and measurement of causal or root cause event(s) and how to improve operations to prevent recurrence.
  • Generate problem definition including What, Why, Who, When, Where, Scope, Context, Objectives, Risks, Milestones, Metrics, Conditions, Events

Thank you and congratulations for taking this opportunity for yourself to expand your skills and knowledge. Thank you for your decision to complete this course successfully.

And please, if you have any questions about any part of this training, this course, or Udemy, please ask. You have my promise to find you an answer.

If you found my course useful, please consider leaving a review and rating. Your review is much appreciated. You can go directly to the review page for this course

https://www.udemy.com/learn-to-analyze-business-application-issues-root-causes/learn/#/

then click and enter your review and rating.

Thank You and Best Regards, Chuck Morrison http://bit.ly/1MKcqGi

11 pages

For definitions of terms used in this course, please see 23-LT&RCAglossary.pdf

2 pages

OO UML was developed by “The 3 Amigos”, Grady Booch, Ivar Jacobson, and James Rumbaugh, at Rational Software during 1994–95, with further development led by them through 1996 …

Rational Software transferred to IBM … OO UML accepted by OMG & ISO

Please see other References (attached) ...

Instructor Biography

Chuck Morrison, Program/Project Manager & Business/IT Architect (MBA, PMP)

“A working model using mission-driven measures in a team approach enables focus on profitable customer-driven solutions.”

With extensive Program Management and Business Architecture experience in Silicon Valley, California, it's been my good fortune to work with many Fortune 500 companies. Workflow modeling is my expertise, joy, and passion. As a seasoned professional, I enjoy using and sharing my skills and knowledge with others through teaching and writing. I have also authored and published other Udemy courses, Amazon eBooks, LinkedIn SlideShare presentations, and YouTube videos.

PMI PMP certified: Principal Strategist, Architect, and Leader with MBA and extensive experience in business and technology consulting, planning, designing, mentoring, negotiating, and delivering project, product, program, and process solutions. Successful track record planning, managing, and leading small to multi-site, concurrent, complex cross-functional projects and portfolios requiring business process engineering, Internet and information technology, quality management, instrumentation, and training.

Specialties:

  • Programs/Projects Management (PMI PMP): Program, Product, Project, and Process (SDLC, Agile, PMBOK, DMAIC, RUP, ITIL, InfoSec, NetSec, CISSP)
  • Business/Technical Process/Systems Modeling, Analysis, and Design (UML, OOA/D, BRD, MRD, FRD, HLD, ERD)
  • Web Portal Planning, Design, Documentation, and QA (Web 2.0, HTML, TCP/IP, HTTP, B2B, B2C)
  • Client/Team-Focused Consultant, Mentor, and Communicator
  • Inventory/Supply Chain Modeling/Management (APICS CPIM)

Thank You and Best Regards,
Chuck Morrison, MBA, PMP, CPIM, WWISA
