Scalable Dynamic Analysis of Binary Code

Scalable Dynamic Analysis of Binary Code PDF Author: Ulf Kargén
Publisher: Linköping University Electronic Press
ISBN: 9176850498
Category :
Languages : en
Pages : 73

Get Book Here

Book Description
In recent years, binary code analysis, i.e., applying program analysis directly at the machine code level, has become an increasingly important topic of study. This is driven to a large extent by the information security community, where security auditing of closed-source software and analysis of malware are important applications. Since most of the high-level semantics of the original source code are lost upon compilation to executable code, static analysis is intractable for, e.g., fine-grained information flow analysis of binary code. Dynamic analysis, however, does not suffer in the same way from reduced accuracy in the absence of high-level semantics, and is therefore also more readily applicable to binary code. Since fine-grained dynamic analysis often requires recording detailed information about every instruction execution, scalability can become a significant challenge. In this thesis, we address the scalability challenges of two powerful dynamic analysis methods whose widespread use has, so far, been impeded by their lack of scalability: dynamic slicing and instruction trace alignment. Dynamic slicing provides fine-grained information about dependencies between individual instructions, and can be used both as a powerful debugging aid and as a foundation for other dynamic analysis techniques. Instruction trace alignment provides a means for comparing executions of two similar programs and has important applications in, e.g., malware analysis, security auditing, and plagiarism detection. We also apply our work on scalable dynamic analysis in two novel approaches to improve fuzzing — a popular random testing technique that is widely used in industry to discover security vulnerabilities. To use dynamic slicing, detailed information about a program execution must first be recorded. Since the amount of information is often too large to fit in main memory, existing dynamic slicing methods apply various time-versus-space trade-offs to reduce memory requirements. However, these trade-offs result in very high time overheads, limiting the usefulness of dynamic slicing in practice. In this thesis, we show that the speed of dynamic slicing can be greatly improved by carefully designing data structures and algorithms to exploit temporal locality of programs. This allows avoidance of the expensive trade-offs used in earlier methods by accessing recorded runtime information directly from secondary storage without significant random-access overhead. In addition to being a standalone contribution, scalable dynamic slicing also forms integral parts of our contributions to fuzzing. Our first contribution uses dynamic slicing and binary code mutation to automatically turn an existing executable into a test generator. In our experiments, this new approach to fuzzing achieved about an order of magnitude better code coverage than traditional mutational fuzzing and found several bugs in popular Linux software. The second work on fuzzing presented in this thesis uses dynamic slicing to accelerate the state-of-the-art fuzzer AFL by focusing the fuzzing effort on previously unexplored parts of the input space. For the second dynamic analysis technique whose scalability we sought to improve — instruction trace alignment — we employed techniques used in speech recognition and information retrieval to design what is, to the best of our knowledge, the first general approach to aligning realistically long program traces. We show in our experiments that this method is capable of producing meaningful alignments even in the presence of significant syntactic differences stemming from, for example, the use of different compilers or optimization levels.

Scalable Dynamic Analysis of Binary Code

Scalable Dynamic Analysis of Binary Code PDF Author: Ulf Kargén
Publisher: Linköping University Electronic Press
ISBN: 9176850498
Category :
Languages : en
Pages : 73

Get Book Here

Book Description
In recent years, binary code analysis, i.e., applying program analysis directly at the machine code level, has become an increasingly important topic of study. This is driven to a large extent by the information security community, where security auditing of closed-source software and analysis of malware are important applications. Since most of the high-level semantics of the original source code are lost upon compilation to executable code, static analysis is intractable for, e.g., fine-grained information flow analysis of binary code. Dynamic analysis, however, does not suffer in the same way from reduced accuracy in the absence of high-level semantics, and is therefore also more readily applicable to binary code. Since fine-grained dynamic analysis often requires recording detailed information about every instruction execution, scalability can become a significant challenge. In this thesis, we address the scalability challenges of two powerful dynamic analysis methods whose widespread use has, so far, been impeded by their lack of scalability: dynamic slicing and instruction trace alignment. Dynamic slicing provides fine-grained information about dependencies between individual instructions, and can be used both as a powerful debugging aid and as a foundation for other dynamic analysis techniques. Instruction trace alignment provides a means for comparing executions of two similar programs and has important applications in, e.g., malware analysis, security auditing, and plagiarism detection. We also apply our work on scalable dynamic analysis in two novel approaches to improve fuzzing — a popular random testing technique that is widely used in industry to discover security vulnerabilities. To use dynamic slicing, detailed information about a program execution must first be recorded. Since the amount of information is often too large to fit in main memory, existing dynamic slicing methods apply various time-versus-space trade-offs to reduce memory requirements. However, these trade-offs result in very high time overheads, limiting the usefulness of dynamic slicing in practice. In this thesis, we show that the speed of dynamic slicing can be greatly improved by carefully designing data structures and algorithms to exploit temporal locality of programs. This allows avoidance of the expensive trade-offs used in earlier methods by accessing recorded runtime information directly from secondary storage without significant random-access overhead. In addition to being a standalone contribution, scalable dynamic slicing also forms integral parts of our contributions to fuzzing. Our first contribution uses dynamic slicing and binary code mutation to automatically turn an existing executable into a test generator. In our experiments, this new approach to fuzzing achieved about an order of magnitude better code coverage than traditional mutational fuzzing and found several bugs in popular Linux software. The second work on fuzzing presented in this thesis uses dynamic slicing to accelerate the state-of-the-art fuzzer AFL by focusing the fuzzing effort on previously unexplored parts of the input space. For the second dynamic analysis technique whose scalability we sought to improve — instruction trace alignment — we employed techniques used in speech recognition and information retrieval to design what is, to the best of our knowledge, the first general approach to aligning realistically long program traces. We show in our experiments that this method is capable of producing meaningful alignments even in the presence of significant syntactic differences stemming from, for example, the use of different compilers or optimization levels.

A Scalable Mixed-level Approach to Dynamic Analysis of C and C++ Programs

A Scalable Mixed-level Approach to Dynamic Analysis of C and C++ Programs PDF Author: Philip Jia Guo
Publisher:
ISBN:
Category :
Languages : en
Pages : 112

Get Book Here

Book Description
This thesis addresses the difficult task of constructing robust and scalable dynamic program analysis tools for programs written in memory-unsafe languages such as C and C++, especially those that are interested in observing the contents of data structures at run time. In this thesis, I first introduce my novel mixed-level approach to dynamic analysis, which combines the advantages of both source- and binary-based approaches. Second, I present a tool framework that embodies the mixed-level approach. This framework provides memory safety guarantees, allows tools built upon it to access rich source- and binary-level information simultaneously at run time, and enables tools to scale to large, real-world C and C++ programs on the order of millions of lines of code. Third, I present two dynamic analysis tools built upon my framework - one for performing value profiling and the other for performing dynamic inference of abstract types - and describe how they far surpass previous analyses in terms of scalability, robustness, and applicability. Lastly, I present several case studies demonstrating how these tools aid both humans and automated tools in several program analysis tasks: improving human understanding of unfamiliar code, invariant detection, and data structure repair.

Binary Code Reuse

Binary Code Reuse PDF Author: Junyuan Zeng
Publisher:
ISBN:
Category : Binary system (Mathematics)
Languages : en
Pages : 250

Get Book Here

Book Description
Binary code reuse aims to extract certain pieces of code from application binaries and make it possible to recompile and relink them with other components to produce new software. With the wide existence of binary code, it is useful to reuse the binary code for different security applications, such as malware analysis and virtual machine introspection. For instance, a malware analyst could reuse proprietary decompression and decryption algorithms from malware binary in order to decode their encoded network messages for malware analysis. In this dissertation, we present a systematic dynamic binary analysis based approach for binary code reuse. In particular, to overcome the challenges for static binary analysis, like obfuscation, this dissertation focuses on applying automated dynamic binary analysis to advance the state-of-the-art of binary code reuse techniques in different aspects. Specifically, a novel solution is presented to generate reusable source code from binary execution traces, featuring obfuscation resilience, free point-to/alias analysis and so on. Meanwhile, in order to facilitate function-level code reuse, this dissertation also proposes a new technique to automatically recover function interfaces, which can instruct end users to generate and pass appropriate inputs. Finally, since the dynamic execution of our target programs may compromise our analysis, a new dynamic binary instrumentation framework is introduced for the purpose of secure analysis. Compared with the existing platforms, it holds the following advantages: it can perform out-of-VM instrumentation and introspection, it is PIN-API compatible, and it is platform independent.

Dynamic Binary Modification

Dynamic Binary Modification PDF Author: Kim Hazelwood
Publisher: Morgan & Claypool Publishers
ISBN: 1608454584
Category : Computers
Languages : en
Pages : 83

Get Book Here

Book Description
Dynamic binary modification tools form a software layer between a running application and the underlying operating system, providing the powerful opportunity to inspect and potentially modify every user-level guest application instruction that executes. Toolkits built upon this technology have enabled computer architects to build powerful simulators and emulators for design-space exploration, compiler writers to analyze and debug the code generated by their compilers, software developers to fully explore the features, bottlenecks, and performance of their software, and even end-users to extend the functionality of proprietary software running on their computers. Several dynamic binary modification systems are freely available today that place this power into the hands of the end user. While these systems are quite complex internally, they mask that complexity with an easy-to-learn API that allows a typical user to ramp up fairly quickly and build any of a number of powerful tools. Meanwhile, these tools are robust enough to form the foundation for software products in use today. This book serves as a primer for researchers interested in dynamic binary modification systems, their internal design structure, and the wide range of tools that can be built leveraging these systems. The hands-on examples presented throughout form a solid foundation for designing and constructing more complex tools, with an appreciation for the techniques necessary to make those tools robust and efficient. Meanwhile, the reader will get an appreciation for the internal design of the engines themselves. Table of Contents: Dynamic Binary Modification: Overview / Using a Dynamic Binary Modifier / Program Analysis and Debugging / Active Program Modification / Architectural Exploration / Advanced System Internals / Historical Perspectives / Summary and Observations

Binary Code Fingerprinting for Cybersecurity

Binary Code Fingerprinting for Cybersecurity PDF Author: Saed Alrabaee
Publisher: Springer Nature
ISBN: 3030342387
Category : Computers
Languages : en
Pages : 264

Get Book Here

Book Description
This book addresses automated software fingerprinting in binary code, especially for cybersecurity applications. The reader will gain a thorough understanding of binary code analysis and several software fingerprinting techniques for cybersecurity applications, such as malware detection, vulnerability analysis, and digital forensics. More specifically, it starts with an overview of binary code analysis and its challenges, and then discusses the existing state-of-the-art approaches and their cybersecurity applications. Furthermore, it discusses and details a set of practical techniques for compiler provenance extraction, library function identification, function fingerprinting, code reuse detection, free open-source software identification, vulnerability search, and authorship attribution. It also illustrates several case studies to demonstrate the efficiency, scalability and accuracy of the above-mentioned proposed techniques and tools. This book also introduces several innovative quantitative and qualitative techniques that synergistically leverage machine learning, program analysis, and software engineering methods to solve binary code fingerprinting problems, which are highly relevant to cybersecurity and digital forensics applications. The above-mentioned techniques are cautiously designed to gain satisfactory levels of efficiency and accuracy. Researchers working in academia, industry and governmental agencies focusing on Cybersecurity will want to purchase this book. Software engineers and advanced-level students studying computer science, computer engineering and software engineering will also want to purchase this book.

BinSign

BinSign PDF Author: Lina Nouh
Publisher:
ISBN:
Category :
Languages : en
Pages : 113

Get Book Here

Book Description
Software reverse engineering is a complex process that incorporates different techniques involving static and dynamic analyses of software programs. Numerous tools are available that help reverse engineers in automating the dynamic analysis process. However, the process of static analysis remains a challenging and tedious process for reverse engineers. The static analysis process requires a great amount of manual work. Therefore, it is very demanding and time-consuming. One aspect of reverse engineering that provides reverse engineers with useful information regarding a statically analyzed piece of code is function fingerprinting. Binary code fingerprinting is a challenging problem that requires an in-depth analysis of internal binary code components for deriving identifiable and expressive signatures. Binary code fingerprints are helpful in the reverse engineering process and have various security applications such as malware variant detection, malware clustering, binary auditing, function recognition, and library identification. Moreover, binary code fingerprinting is also useful in automating some reverse engineering tasks such as clone detection, library function identification, code similarity, authorship attribution, etc. In addition, code fingerprints are valuable in cyber forensics as well as the process of patch analysis in order to identify patches or make sure that the patch complies with the security requirements.In this thesis, we propose a binary function fingerprinting and matching approach and implement a tool named BinSign based on the proposed approach that enhances and accelerates the reverse engineering process. The main objective of BinSign is to provide an accurate and scalable solution to binary code fingerprinting by computing and matching structural and syntactic code profiles for disassemblies while outperforming existing techniques. The structural profile of binary code is captured through decomposing the control-flow-graph of a function into tracelets. We describe the underlying methodology and evaluate its performance in several use cases, including function matching, function reuse, library function detection, malware analysis, and function indexing scalability. We also provide some insights into the effects of different optimization levels and obfuscation techniques on our fingerprint matching methodology. Additionally, we emphasize the scalability aspect of BinSign that is achieved through applying locality sensitive hashing, filtering techniques, and distributing the computations across several machines. The min-hashing process is combined with the banding technique of locality sensitive hashing in order to ensure a scalable and efficient fingerprint matching process. We perform our experiments on a database of 6 million functions that includes well-known libraries, malware samples, and some dynamic library files obtained from the Microsoft Windows operating system. The indexing process of fingerprints is distributed across multiple machines and it requires an average time of 0.0072 seconds per function. A comparison is also conducted with relevant existing tools, which shows that BinSign achieves a higher accuracy than these tools.

Tools for High Performance Computing 2015

Tools for High Performance Computing 2015 PDF Author: Andreas Knüpfer
Publisher: Springer
ISBN: 3319395890
Category : Computers
Languages : en
Pages : 184

Get Book Here

Book Description
High Performance Computing (HPC) remains a driver that offers huge potentials and benefits for science and society. However, a profound understanding of the computational matters and specialized software is needed to arrive at effective and efficient simulations. Dedicated software tools are important parts of the HPC software landscape, and support application developers. Even though a tool is by definition not a part of an application, but rather a supplemental piece of software, it can make a fundamental difference during the development of an application. Such tools aid application developers in the context of debugging, performance analysis, and code optimization, and therefore make a major contribution to the development of robust and efficient parallel software. This book introduces a selection of the tools presented and discussed at the 9th International Parallel Tools Workshop held in Dresden, Germany, September 2-3, 2015, which offered an established forum for discussing the latest advances in parallel tools.

Federal Plan for Cyber Security and Information Assurance Research and Development

Federal Plan for Cyber Security and Information Assurance Research and Development PDF Author: National Science and Technology Council (U.S.) Interagency Working Group on Cyber Security and Information Assurance
Publisher:
ISBN:
Category : Computer networks
Languages : en
Pages : 140

Get Book Here

Book Description


Applications and Techniques in Information Security

Applications and Techniques in Information Security PDF Author: Lynn Batten
Publisher: Springer
ISBN: 3662456702
Category : Computers
Languages : en
Pages : 275

Get Book Here

Book Description
This book constitutes the refereed proceedings of the International Conference on Applications and Techniques in Information Security, ATIS 2014, held in Melbourne, Australia, in November 2014. The 16 revised full papers and 8 short papers presented were carefully reviewed and selected from 56 submissions. The papers are organized in topical sections on applications; curbing cyber crimes; data privacy; digital forensics; security implementations.

Handbook of Software Engineering

Handbook of Software Engineering PDF Author: Sungdeok Cha
Publisher: Springer
ISBN: 3030002624
Category : Computers
Languages : en
Pages : 524

Get Book Here

Book Description
This handbook provides a unique and in-depth survey of the current state-of-the-art in software engineering, covering its major topics, the conceptual genealogy of each subfield, and discussing future research directions. Subjects include foundational areas of software engineering (e.g. software processes, requirements engineering, software architecture, software testing, formal methods, software maintenance) as well as emerging areas (e.g., self-adaptive systems, software engineering in the cloud, coordination technology). Each chapter includes an introduction to central concepts and principles, a guided tour of seminal papers and key contributions, and promising future research directions. The authors of the individual chapters are all acknowledged experts in their field and include many who have pioneered the techniques and technologies discussed. Readers will find an authoritative and concise review of each subject, and will also learn how software engineering technologies have evolved and are likely to develop in the years to come. This book will be especially useful for researchers who are new to software engineering, and for practitioners seeking to enhance their skills and knowledge.