
The Problem/NASA Relevance



Presentation Transcript


  1. Technology Infusion: Text-Mining and Tagging for Software Change Requests
     Executive Briefing
     Jane T. Malin and David R. Throop
     NASA Johnson Space Center (JSC)
     Project: Technology Infusion of Text-mining for Problem Trending into Software Change Reports at JSC
     Software Assurance Symposium, September 2008

  2. The Problem/NASA Relevance
     The International Space Station generates ~1400 Software Change Requests (SCRs) annually.
     It is difficult to find trends and recurring anomalies within the large set of SCRs.
     • Particularly urgent when trying to find ‘more reports similar to this one’ during flight anomalies
     • Typical “manual” analysis uses database searches
     • Critical information about software changes is captured in natural-language text fields (English sentences)
     • Text is not well behaved, so keyword search or data mining approaches fail
     • Syntactic and semantic variants are used often

  3. Approach
     • Leverage Text-Mining technology used to:
       • Extract model parts for system modeling from requirements
       • Find trends in Discrepancy Reports
     Semantic Text Mining and Tagging
     • Analyzes sets (10,000s) of problem-report records from databases
     • Each record has multiple fields, some of which contain English-language text describing problems, causes, consequences, and equipment
     • Text-mining approach:
       • Performs syntactic parsing of each text field in the data record
       • Uses hierarchical aerospace ontologies of concepts and nomenclature to identify problem-type or equipment-type tags to add to each record
       • Searches for word patterns that match problems or entities of interest
       • Adds additional tag fields to records
       • Uses tags for graphs and other browsing capabilities for analysts
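The tagging step described above can be illustrated with a minimal sketch. It is not the authors' tool: it replaces syntactic parsing with plain phrase matching, the three-entry ontology is a small hypothetical subset built from phrases on the "Current Software Problems" slide, and the parent tag `Software_Problem`, the record ID, and the `tag_record` helper are all assumptions made for illustration.

```python
import re

# Hypothetical mini-ontology: tag -> mapping phrases.
# The phrases come from the slide list; the selection is arbitrary.
ONTOLOGY = {
    "Memory_Error": ["memory leak", "buffer overflow", "corrupted memory"],
    "Software_Resource_Contention": ["deadlock", "race condition"],
    "Data_Error": ["parity error", "divide by zero"],
}

# Hypothetical one-level hierarchy: every tag rolls up to one parent tag.
PARENT = {tag: "Software_Problem" for tag in ONTOLOGY}


def _pattern(phrase):
    """Compile a case-insensitive pattern that tolerates extra whitespace."""
    words = [re.escape(word) for word in phrase.split()]
    return re.compile(r"\b" + r"\s+".join(words) + r"\b", re.IGNORECASE)


PATTERNS = {tag: [_pattern(p) for p in phrases] for tag, phrases in ONTOLOGY.items()}


def tag_record(record, text_fields):
    """Return a copy of the record with an added 'tags' field."""
    tags = set()
    for field in text_fields:
        text = record.get(field, "")
        for tag, patterns in PATTERNS.items():
            if any(p.search(text) for p in patterns):
                tags.add(tag)
                tags.add(PARENT[tag])  # also add the ancestor tag
    tagged = dict(record)
    tagged["tags"] = sorted(tags)
    return tagged


# Hypothetical record; the ID and text are made up for the example.
example = {
    "id": "SCR-0001",
    "description": "Task hung after a race  condition; possible memory leak.",
}
result = tag_record(example, ["description"])
print(result["tags"])
```

Adding the parent tag alongside each matched tag is one simple way to let analysts browse at either level of the hierarchy.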

  4. Current Capability
     • User: ISS Robotics
     • 3200 .html SCR records converted to tab-delimited format
     • Text analysis and hierarchical tagging for problem types
     • Capability to limit tagging scope to only software failures
     • Analysis of multiple fields
     • Improved bar chart formats
     [Figure: Errors co-occurring with ‘Deactivation’ in one year]
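A chart such as "Errors co-occurring with ‘Deactivation’" can be derived from tagged records by counting which other tags appear on records that carry the tag of interest. The sketch below assumes each record already has a `tags` list and that `Deactivation` is itself a tag; the sample records are invented, and the text bar chart only stands in for the tool's real chart output.

```python
from collections import Counter


def cooccurring_tags(records, target):
    """Count tags that appear on the same record as the target tag."""
    counts = Counter()
    for rec in records:
        tags = set(rec["tags"])
        if target in tags:
            counts.update(tags - {target})
    return counts


# Invented sample data for illustration only.
records = [
    {"tags": ["Deactivation", "Data_Error"]},
    {"tags": ["Deactivation", "Data_Error", "Memory_Error"]},
    {"tags": ["Memory_Error"]},
]

counts = cooccurring_tags(records, "Deactivation")

# Print a simple text bar chart, most frequent tag first.
for tag, n in counts.most_common():
    print(f"{tag:<15} {'#' * n} {n}")
```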

  5. Current Software Problems
     Software problem type hierarchy from Aerospace Ontology, with mapping words:
     • Software_Threat: spyware, spam, virus, malware, worm, Trojan horse, Trojan, root kit, exploit, ping, brute force attack, dictionary attack, replay attack, piggybacking, denial of service, sabotage
     • Programmer_Error: programmer error, {Bad} programming practice
     • Software_or_Computer_Error (error, faulty): software error, software problem, BIT error, controller error, computer error, display error, program error, bit count error, check error, not reinitialized, compiler error, bug, phase error, exception, {Programming_Language} exception, page fault, general protection fault, halt failure, crash
     • Software_Security_Anomaly: protocol anomaly, traffic anomaly
     • Software_Sequence_Error: command sequence error, task sequence error, boot sequence error, function sequence error, sequence error
     • Software_Resource_Contention: thrashing, unwanted synchronization, multithread error, deadlock, live lock, lock error, contention, race condition, data race
     • Data_Error: data error, bit error, parity error, missing pointer, i/o error, input error, input/output error, output error, word error, divide by zero
     • Corruption: corrupted packet, corrupt file
     • Memory_Error: corrupted memory, memory write error, memory error, read error, integer overflow, buffer overflow, memory leak, {Insufficient} memory, overwritten memory, overwrite, write over, write on top of
     • Software_Vulnerability: dangling pointer, format string vulnerability, code injection, intrusion, hijack
     • Bad_Software_Structure: {Bad} {Software_Structure}
     • Missing_Software_Structure: {Missing} {Software_Structure}
     • Software_Not_Responding: crash, hang up, lock up, freeze
     Note: Brackets expand. For example, {Software_Structure} expands to: comment, code, dictionary, expression, statement, instruction, computation, algorithm, string, thread, pointer, link, hyperlink, reference, command sequence, error log, DLL, load, software load, dump data, segment, use-define chain, call graph, control flow graph, handler
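The bracket expansion in the note can be sketched as a Cartesian product over the members of each bracketed class. This is only an illustration of the idea, not the tool's implementation: the `expand` helper is hypothetical, `Software_Structure` uses a three-word subset of the list above, and the members of `Bad` are assumed because the slides do not list them.

```python
import itertools
import re

CLASSES = {
    # Subset of the {Software_Structure} expansion given in the note.
    "Software_Structure": ["code", "pointer", "handler"],
    # Hypothetical members; the slides do not say what {Bad} expands to.
    "Bad": ["bad", "incorrect"],
}


def expand(pattern, classes):
    """Expand every {Class} in the pattern into all member combinations."""
    # The capturing group keeps the {Class} tokens in the split result.
    parts = re.split(r"(\{[^{}]+\})", pattern)
    options = []
    for part in parts:
        if part.startswith("{") and part.endswith("}"):
            options.append(classes[part[1:-1]])
        elif part:
            options.append([part])  # literal text has a single option
    return ["".join(combo) for combo in itertools.product(*options)]


phrases = expand("{Bad} {Software_Structure}", CLASSES)
print(phrases)
```

With these assumed classes, `{Bad} {Software_Structure}` yields six phrases, and a pattern with no brackets expands to itself.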

  6. Technical Challenges
     • Software module names identify system failure modes
       • E.g. don’t tag Fire-in-cabin annunciation as ‘FIRE’
       • Handled by tagging only software-related failures
     Usability Challenges
     • Determining what trends are most useful
       • User interviews
       • Repeated prototypes
     • Redesigning user displays to accommodate information overload
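The scope restriction mentioned above can be pictured as filtering the ontology before matching, so that a module name containing a hazard word does not produce a hazard tag. The sketch below is an assumption about how such a filter might look: the `scope` labels, the two-entry ontology, and the sample sentence are invented for the example.

```python
import re

# Hypothetical ontology entries, each labeled with the scope it belongs to.
ONTOLOGY = {
    "FIRE": {"scope": "system", "phrases": ["fire"]},
    "Software_or_Computer_Error": {
        "scope": "software",
        "phrases": ["exception", "page fault"],
    },
}


def tag_text(text, scope=None):
    """Tag text, optionally using only ontology entries in the given scope."""
    tags = []
    for tag, entry in ONTOLOGY.items():
        if scope is not None and entry["scope"] != scope:
            continue  # skip tags outside the requested scope
        if any(
            re.search(r"\b" + re.escape(p) + r"\b", text, re.IGNORECASE)
            for p in entry["phrases"]
        ):
            tags.append(tag)
    return sorted(tags)


# Invented sentence: a software module named after a system failure mode.
text = "Fire-in-cabin annunciation module raised an exception."
all_tags = tag_text(text)
software_tags = tag_text(text, scope="software")
print(all_tags, software_tags)
```

Without the scope filter the module name triggers the `FIRE` tag; with it, only the software error is tagged.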

  7. Planned Capability
     • Additional iteration of suggestions and refinement of requirements
     • More software failure terms and concepts in the tagging ontology
     • Support for identifying and eliminating false positives
     • Documented user requirements and capabilities
     • Proposal for wider use by many JSC organizations that search and analyze SCR database records
       • Including tighter integration with current SCR database, linking back to it
