Unmasking KorPlug: A Technical Breakdown - Part 2

Executive Summary
This analysis represents the second instalment in a comprehensive examination of the KorPlug malware family. Previous reporting detailed the initial loading vector utilising DLL side-loading techniques against legitimate utilities to achieve code execution.
The second-stage payload executes via a designated entry point function. Static analysis of the binary reveals that the Initialize function, invoked by the preceding loader stage, exhibits an anomalous Control Flow Graph (CFG) structure.
Reverse-engineering efforts targeting this function present significant analytical challenges due to implemented obfuscation mechanisms. Both static disassembly and dynamic analysis methodologies encounter substantial impediments when attempting to enumerate the function's operational logic.
The following technical analysis, with the file-details in Table #1, documents the methodologies employed to circumvent the identified obfuscation techniques and extract actionable intelligence regarding KorPlug's second-stage execution capabilities.
SHA-256 Hash | b6b239fe0974cf09fe8ee9bc5d0502174836a79c53adccdbb1adeb1f15c6845c
File Size | 638976 Bytes (624.00 KB)
File Type | x86 PE
Table #1. Details of malicious DLL
Technical Breakdown
Flashback to Stage-3
Analysis of the terminal phase detailed in Part 1 reveals that despite the decoded payload maintaining standard DLL file structure, execution occurs through non-conventional loading mechanisms.
The sample implements shellcode-style execution via the EnumSystemGeoID API function call. The payload's initial byte sequence contains redirect instructions that divert execution flow directly to the Initialize function, circumventing standard Windows DLL loading procedures and associated security mechanisms.
Static analysis of the binary within disassembly environments reveals that the Initialize function exhibits a complex and extensive Control Flow Graph (CFG) structure. This report documents the methodologies employed to transform the obfuscated CFG representation into a comprehensible format suitable for reverse engineering analysis, while maintaining the integrity of the original execution logic and operational behaviour.
O-LLVM
The observed CFG structure and block segmentation patterns are consistent with O-LLVM [1] implementation characteristics. O-LLVM represents a modified iteration of the LLVM compiler infrastructure [2] that incorporates code obfuscation capabilities. This toolset is commonly deployed to impede reverse engineering and static analysis efforts across various threat vectors, including malware campaigns, digital rights management systems, and software protection schemes.
O-LLVM implements three primary obfuscation methodologies:
- Control Flow Flattening: Transforms function control flow structures into flattened switch-based dispatch mechanisms, obscuring conventional conditional logic and iterative constructs.
- Bogus Control Flow: Introduces spurious conditional branches and unreachable code segments designed to mislead disassembly tools and analytical processes.
- Instruction Substitution: Replaces straightforward instruction sequences with functionally equivalent but syntactically complex alternatives to defeat signature-based detection mechanisms.
O-LLVM operates on LLVM's Intermediate Representation (IR) layer, enabling language-agnostic obfuscation across any codebase compiled through Clang or LLVM-based toolchains. The resulting binary complexity substantially increases reverse-engineering difficulty.
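To make control flow flattening concrete, the sketch below shows a small function in its natural form next to a hand-flattened equivalent driven by a state variable. It is an illustrative Python model, not O-LLVM output: the function names and the state constants are arbitrary values chosen for this example.

```python
def gcd_plain(a: int, b: int) -> int:
    """Euclid's algorithm with ordinary structured control flow."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a: int, b: int) -> int:
    """The same algorithm after a hand-applied flattening.

    Each original basic block becomes one branch of a dispatch loop, and
    each block ends by assigning the next state instead of jumping directly.
    The state constants are arbitrary, illustrative values.
    """
    state = 0x3A1F  # entry state
    while True:  # dispatcher loop
        if state == 0x3A1F:    # loop-condition block
            # Conditional assignment: one of two possible next states.
            state = 0x77C2 if b != 0 else 0x1B90
        elif state == 0x77C2:  # loop-body block
            a, b = b, a % b
            state = 0x3A1F     # simple assignment: a single next state
        elif state == 0x1B90:  # exit block: no successor
            return a
```

Both functions compute the same result, but the second hides the loop behind state comparisons, which is the effect that makes a flattened CFG hard to read in a disassembler.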
Analysis efforts included evaluation of multiple publicly available deobfuscation tools and documented methodologies. Testing revealed that existing solutions did not successfully process the sample without significant modification.
In RevEng.AI’s assessment of publicly available tooling, MODeflattener [3] demonstrated the highest compatibility potential, with associated documentation [4] providing foundational reference material for the techniques detailed in this analysis.
Control Flow Graph Component Analysis
Effective deobfuscation of the target function requires systematic identification and categorisation of individual components within the Control Flow Graph (CFG). O-LLVM obfuscation implementations assign specific operational roles to distinct basic block types. The following analysis examines the functional classification of these block categories to establish the foundation for subsequent deobfuscation procedures.

The pre-dispatcher block represents the most readily identifiable component within the obfuscated CFG structure. This block exhibits a characteristic high predecessor count relative to other basic blocks within the function. Analysis reveals that the pre-dispatcher typically contains minimal operational logic, consisting primarily of an unconditional jump instruction directing execution flow to the initial dispatcher block.
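The predecessor-count heuristic described above can be sketched in a few lines. This is a minimal illustration over a hand-written successor map; the addresses are hypothetical, and in practice the graph would come from a CFG recovery framework rather than a literal dictionary.

```python
from collections import Counter


def find_pre_dispatcher(successors):
    """Return the address of the block with the highest predecessor count."""
    pred_count = Counter()
    for _src, targets in successors.items():
        for dst in set(targets):  # count each edge once
            pred_count[dst] += 1
    addr, _count = pred_count.most_common(1)[0]
    return addr


# Hypothetical flattened CFG: block address -> successor addresses.
cfg = {
    0x1000: [0x1010],          # prologue -> first dispatcher
    0x1010: [0x1020, 0x1100],  # first dispatcher
    0x1020: [0x1030, 0x1200],  # backbone block
    0x1030: [0x1300, 0x1400],  # backbone block
    0x1100: [0x1F00],          # relevant block -> pre-dispatcher
    0x1200: [0x1F00],          # relevant block -> pre-dispatcher
    0x1300: [0x1F00],          # relevant block -> pre-dispatcher
    0x1400: [],                # terminal block (function return)
    0x1F00: [0x1010],          # pre-dispatcher -> first dispatcher
}

pre_dispatcher = find_pre_dispatcher(cfg)
first_dispatcher = cfg[pre_dispatcher][0]
```

In very small functions the first dispatcher can have a similar predecessor count, so the result is worth checking against the expected shape: a block whose only content is a jump.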
The first dispatcher block initiates the computational logic responsible for determining subsequent execution paths within the obfuscated function. This block operates through manipulation of a designated control variable, referenced as the state variable within control flow flattening implementations. Individual basic blocks within the obfuscated structure assign discrete values to this state variable.
The dispatcher and backbone components subsequently utilise these assigned values to perform calculated branch operations, determining the target basic block for continued execution flow.
The dispatcher block initiates execution immediately following initial state variable assignment. The opening instruction within this block performs additional manipulation of the state variable (observed as [esp+9Ch+var_8C] within the analysed sample) to commence the resolution process for determining the current state value and directing execution to the corresponding target block.
The blocks that follow the first dispatcher form the backbone. Rather than performing the sample's actual work, they hold the routing logic that decides which code the sample executes next. These backbone blocks demonstrate consistent instruction patterns utilising JMP, MOV, SUB, and JZ operations.
These instructions function collectively to manipulate and evaluate the state variable contents, facilitating calculated jumps to appropriate successor blocks within the execution sequence. This instruction sequence architecture serves as the foundational mechanism governing execution flow control throughout the obfuscated function.
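The backbone's behaviour can be modelled as a chain of subtract-and-test steps. The sketch below assumes each backbone block subtracts one constant from a copy of the state variable and takes its JZ when the result is zero; the constants and target addresses are hypothetical.

```python
from typing import List, Optional, Tuple


def resolve(state: int, backbone: List[Tuple[int, int]]) -> Optional[int]:
    """Walk the backbone chain and return the block selected for `state`.

    `backbone` is an ordered list of (constant, target) pairs, one per
    backbone block. Returns None when no comparison matches.
    """
    for constant, target in backbone:
        # SUB eax, constant ; JZ target -> taken when the 32-bit result is zero
        if (state - constant) & 0xFFFFFFFF == 0:
            return target
        # Otherwise the block falls through with a JMP to the next backbone block.
    return None


# Hypothetical comparison chain.
backbone = [
    (0x1A2B3C4D, 0x401200),
    (0x5E6F7081, 0x401340),
    (0x0BADF00D, 0x4015A0),
]

selected = resolve(0x5E6F7081, backbone)
```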
The relevant blocks constitute the critical components containing the sample's core operational functionality.
These blocks house the primary behavioural logic and executable actions that fulfil the malware's intended objectives. Irrespective of the specific malicious capabilities implemented by the threat actor, the fundamental operational code responsible for achieving the malware's design goals resides within these relevant block structures.
Not all relevant blocks within the obfuscated structure maintain the pre-dispatcher as a successor block. The terminal block, which denotes function completion, contains no successor relationships. This architectural variation does not compromise execution integrity, as proper resolution of state variable logic and backbone comparison mechanisms ensures natural progression to the terminal block without disrupting established control flow patterns.
Within relevant blocks, state variable manipulation represents the primary mechanism governing execution flow transitions. Analysis of the sample reveals that state variable assignment operations consistently occur at the conclusion of each relevant block. These relevant blocks can be categorised into two distinct operational types based on their state assignment methodology.
Simple Assignment Blocks implement direct state variable assignment through hardcoded value placement utilising MOV instructions. Conditional Assignment Blocks incorporate decision logic where subsequent state values depend on evaluated conditions, enabling selection between two possible state values. This conditional pattern typically manifests through the following instruction sequence: MOV eax, value1; MOV ecx, value2; CMOVZ ecx, eax; MOV [state variable], ecx.
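The two block types can be told apart with a small pattern matcher. The sketch below works on (mnemonic, operand string) pairs of the kind a disassembler produces. The operand text used for the state variable and the sample instruction lists are assumptions for illustration, and the matcher only tracks MOV and CMOVcc, so it is a starting point rather than a complete classifier.

```python
STATE_VAR = "dword ptr [esp + 0x10]"  # assumed operand text of the state variable


def parse_imm(text):
    """Return the integer value of an immediate operand, or None."""
    try:
        return int(text, 0)
    except ValueError:
        return None


def classify_block(insns):
    """Classify a relevant block by how it assigns the state variable.

    Returns ("simple", [value]), ("conditional", [if_true, if_false]),
    or ("none", []) when no state assignment is recognised.
    """
    imm_in_reg = {}  # register -> immediate most recently loaded into it
    selected = {}    # register -> [value if condition holds, value otherwise]
    for mnemonic, operands in insns:
        parts = [p.strip() for p in operands.split(",")]
        if len(parts) != 2:
            continue
        dst, src = parts
        if mnemonic == "mov" and dst == STATE_VAR:
            value = parse_imm(src)
            if value is not None:
                return ("simple", [value])
            if src in selected:
                return ("conditional", selected[src])
            return ("none", [])
        if mnemonic == "mov":
            value = parse_imm(src)
            selected.pop(dst, None)
            if value is not None:
                imm_in_reg[dst] = value
            else:
                imm_in_reg.pop(dst, None)
        elif mnemonic.startswith("cmov") and dst in imm_in_reg and src in imm_in_reg:
            # dst receives src when the condition holds, else keeps its own value
            selected[dst] = [imm_in_reg[src], imm_in_reg[dst]]
    return ("none", [])


simple_block = [
    ("call", "0x402000"),
    ("mov", "dword ptr [esp + 0x10], 0x1a2b3c4d"),
]
conditional_block = [
    ("test", "eax, eax"),
    ("mov", "eax, 0x5e6f7081"),
    ("mov", "ecx, 0xbadf00d"),
    ("cmove", "ecx, eax"),  # some disassemblers print CMOVZ as cmove
    ("mov", "dword ptr [esp + 0x10], ecx"),
]
```

For the conditional pattern, the first returned value is the state chosen when the CMOV condition holds and the second is the state kept otherwise.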
Through systematic manipulation of state variable values at the conclusion of each relevant block that transitions back to the pre-dispatcher, the sample enforces a predetermined execution sequence and ensures processing of functional components in a specified order.
This obfuscation technique, while maintaining logical execution flow integrity, generates a significantly more complex CFG structure compared to conventional compilation output. The resulting architectural complexity substantially impedes manual analysis efforts within standard software reverse engineering (SRE) tooling.
Deobfuscation Implementation Methodology
With comprehensive understanding of the Control Flow Graph components established, analysts can develop automated deobfuscation tools utilising any programming language or framework capable of CFG analysis and structural pattern recognition. The primary objective involves systematic mapping of all potential state variable values assigned within simple and conditional relevant blocks, followed by determination of actual jump targets processed by backbone block logic.
The deobfuscation procedure requires enumeration of state variable assignments across all relevant blocks and correlation of these values with backbone block comparison operations to identify legitimate execution paths.
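The correlation step amounts to joining two tables: the states each relevant block can assign, and the block each state selects. The sketch below assumes both tables have already been collected; the function name, addresses, and state values are hypothetical.

```python
def recover_edges(assignments, state_to_block):
    """Join per-block state assignments with the backbone's state map.

    assignments:    relevant block address -> list of states it may assign
    state_to_block: state value -> relevant block the backbone jumps to
    Returns (edges, unresolved): edges maps each block to its real
    successors; unresolved lists (block, state) pairs with no known target.
    """
    edges = {}
    unresolved = []
    for block, states in assignments.items():
        edges[block] = []
        for state in states:
            if state in state_to_block:
                edges[block].append(state_to_block[state])
            else:
                unresolved.append((block, state))
    return edges, unresolved


# Hypothetical inputs.
assignments = {
    0x401200: [0x5E6F7081],              # simple block
    0x401340: [0x0BADF00D, 0x1A2B3C4D],  # conditional block
    0x4015A0: [],                        # terminal block
}
state_to_block = {
    0x1A2B3C4D: 0x401200,
    0x5E6F7081: 0x401340,
    0x0BADF00D: 0x4015A0,
}

edges, unresolved = recover_edges(assignments, state_to_block)
```

Reporting unresolved pairs separately, rather than silently dropping them, makes it easier to spot basic blocks that the CFG recovery missed.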
Upon completion of this mapping process, the deobfuscation tool can implement binary patching operations to redirect each relevant block's execution flow directly to its intended successor relevant block, effectively bypassing the obfuscated dispatcher mechanism. This process includes removal of operational code within backbone blocks and the first dispatcher, restoring the function to its original, unobfuscated control flow structure.
Following successful mapping of execution paths, the deobfuscation process requires patching each relevant block to implement direct jumps to the subsequent relevant block within the legitimate execution sequence.
The following implementation utilises Python in conjunction with the angr binary analysis framework to automate the deobfuscation process.
The angr framework provides comprehensive capabilities for extracting critical components from individual basic blocks, enabling systematic identification and classification of the structural elements detailed in the preceding analysis.
The implementation iterates through identified relevant blocks to catalogue their operational characteristics, classifying each block as either simple or conditional type based on state variable assignment methodology. The analysis process records the specific values assigned to the state variable at the conclusion of each block's execution sequence.
Analysis must also account for a specialised category designated as tail blocks. While tail blocks technically qualify as relevant blocks due to their direct predecessor relationship with the pre-dispatcher, they contain no substantive operational logic. These blocks typically consist of a single JMP instruction directing execution to the pre-dispatcher component.
During the binary patching phase, which is covered in more detail later in this report, tail blocks are classified and processed as backbone components, requiring removal from the execution logic. This treatment is appropriate as tail blocks neither manipulate the state variable nor contribute meaningful functionality to the CFG structure, serving only as transitional elements within the obfuscated architecture.
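Tail blocks can be separated from genuine relevant blocks with a simple shape check. The sketch below assumes each predecessor of the pre-dispatcher is available as a list of (mnemonic, operand string) pairs; the addresses and instruction lists are hypothetical.

```python
def is_tail_block(insns, pre_dispatcher):
    """A tail block is a lone unconditional JMP to the pre-dispatcher."""
    if len(insns) != 1:
        return False
    mnemonic, operand = insns[0]
    if mnemonic != "jmp":
        return False
    try:
        return int(operand, 0) == pre_dispatcher
    except ValueError:  # indirect jump such as "jmp eax"
        return False


def split_predecessors(blocks, pre_dispatcher):
    """Split pre-dispatcher predecessors into (relevant, tail) address lists."""
    relevant, tails = [], []
    for addr, insns in sorted(blocks.items()):
        if is_tail_block(insns, pre_dispatcher):
            tails.append(addr)
        else:
            relevant.append(addr)
    return relevant, tails


PRE_DISPATCHER = 0x401F00  # hypothetical address
predecessors = {
    0x401200: [
        ("call", "0x402000"),
        ("mov", "dword ptr [esp + 0x10], 0x5e6f7081"),
        ("jmp", "0x401f00"),
    ],
    0x401280: [("jmp", "0x401f00")],  # tail block
    0x401340: [
        ("mov", "dword ptr [esp + 0x10], 0xbadf00d"),
        ("jmp", "0x401f00"),
    ],
}

relevant, tails = split_predecessors(predecessors, PRE_DISPATCHER)
```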
With state variable values enumerated and catalogued for each relevant block, the subsequent analysis phase requires identification of corresponding jump operations within the backbone block structures.
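One way to recover the backbone's jump table is to pair each SUB constant with the JZ that follows it. The sketch below assumes every backbone block compares against a single immediate in that order; the instruction lists and addresses are hypothetical, and other comparison styles would need additional patterns.

```python
def build_state_map(backbone_blocks):
    """Map each compared state constant to the target of the following JZ."""
    state_to_block = {}
    for insns in backbone_blocks:
        constant = None
        for mnemonic, operands in insns:
            if mnemonic == "sub":
                try:
                    constant = int(operands.partition(",")[2].strip(), 0)
                except ValueError:
                    constant = None  # SUB with a register operand: not a state compare
            elif mnemonic in ("jz", "je") and constant is not None:
                state_to_block[constant] = int(operands, 0)
                constant = None
    return state_to_block


# Hypothetical backbone blocks as (mnemonic, operand string) pairs.
backbone_blocks = [
    [
        ("mov", "eax, dword ptr [esp + 0x10]"),
        ("sub", "eax, 0x1a2b3c4d"),
        ("je", "0x401200"),   # some disassemblers print JZ as je
        ("jmp", "0x401050"),
    ],
    [
        ("mov", "eax, dword ptr [esp + 0x10]"),
        ("sub", "eax, 0x5e6f7081"),
        ("je", "0x401340"),
        ("jmp", "0x401070"),
    ],
]

state_map = build_state_map(backbone_blocks)
```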
With comprehensive state variable to jump target correlation established, the final implementation phase involves systematic patching of the binary to eliminate obfuscated control flow mechanisms. This process replaces existing state variable manipulation operations and jumps to the pre-dispatcher with direct jump instructions targeting the appropriate successor relevant blocks. For conditional execution flows, the patching procedure may require insertion of conditional jump instructions followed by unconditional jump operations, depending on the specific branching logic requirements.
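For a conditional block built around CMOVZ, the replacement can be a JZ to the successor selected when the zero flag is set, followed by a JMP to the other successor. The sketch below encodes both instructions with 32-bit relative displacements. The helper names and addresses are hypothetical, and a different CMOVcc would need the matching Jcc opcode (for example 0F 85 for JNZ).

```python
import struct


def jmp_rel32(src, dst):
    """Encode JMP rel32 (opcode E9) placed at address src, targeting dst."""
    return b"\xE9" + struct.pack("<i", dst - (src + 5))


def jz_rel32(src, dst):
    """Encode JZ rel32 (opcode 0F 84) placed at address src, targeting dst."""
    return b"\x0F\x84" + struct.pack("<i", dst - (src + 6))


def conditional_patch(addr, if_zero, otherwise):
    """Build 'JZ if_zero ; JMP otherwise' starting at address addr."""
    first = jz_rel32(addr, if_zero)
    return first + jmp_rel32(addr + len(first), otherwise)


# Hypothetical patch site and successor addresses.
patch = conditional_patch(0x401000, if_zero=0x401100, otherwise=0x400F00)
```

The displacement of each jump is measured from the end of that instruction, which is why the second jump is encoded relative to `addr + 6` rather than `addr`. The pair occupies 11 bytes in total.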
Instruction sizes need care during the patching process. An original instruction sequence such as MOV [state variable], value; JMP pre-dispatcher typically consumes 13 bytes of binary space, for example an 8-byte MOV of a 32-bit immediate to a stack slot followed by a 5-byte near JMP.
Direct JMP instruction replacements may require only 5 bytes, creating a size differential that must be addressed to preserve binary integrity. The remaining byte space must be populated with NOP (no operation) instructions to maintain proper binary alignment and prevent execution of unintended instruction sequences that could compromise deobfuscation effectiveness.
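The size arithmetic can be checked with a short helper that emits the 5-byte jump and fills the remainder of the original sequence with NOPs. This is a sketch under the 13-byte assumption stated above; the function name and addresses are hypothetical.

```python
import struct

NOP = b"\x90"


def patch_simple(addr, original_size, target):
    """Return a JMP rel32 to target, NOP-padded to original_size bytes."""
    jmp = b"\xE9" + struct.pack("<i", target - (addr + 5))
    if len(jmp) > original_size:
        raise ValueError("original sequence is too short for a 5-byte JMP")
    return jmp + NOP * (original_size - len(jmp))


# Replace a 13-byte 'MOV [state], imm32 ; JMP pre-dispatcher' sequence
# at a hypothetical address with a direct jump to the real successor.
simple_patch = patch_simple(0x401000, 13, 0x401100)
```

With a 13-byte original sequence, the patch consists of the 5-byte jump followed by 8 NOP bytes, so every later instruction keeps its original address.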
The concluding phase involves systematic removal of backbone blocks, dispatcher components, and tail blocks from the binary's execution flow. This elimination process produces a simplified binary structure that accurately represents the underlying operational logic of the original, unobfuscated code.
Conclusion
This methodology demonstrates effective analysis and deobfuscation capabilities against heavily flattened control flow structures generated by O-LLVM obfuscation implementations.
Through systematic identification of obfuscation architectural components, classification of relevant block types, and reconstruction of original execution logic via targeted binary patching, analysts can achieve significantly improved visibility into the malware's core operational behaviour.
While this implementation was developed for analysis of the specific KorPlug sample, the underlying methodology provides a framework adaptable to similar obfuscation schemes through appropriate modifications. This approach parallels adaptation strategies employed with existing tools such as MODeflattener, enabling broader application across O-LLVM obfuscated threat samples with suitable technical adjustments to accommodate implementation variations.
As documented in the initial assessment phase, MODeflattener failed to process the analysed sample without modification. Manual intervention involving direct specification of missing basic block addresses and corresponding state variable transition updates enabled successful tool operation for this particular instance. However, these modifications represent sample-specific adaptations that do not provide generalised compatibility across other O-LLVM obfuscated binaries.
The complete Python implementation utilising the angr binary analysis framework [5], including comprehensive reference materials and supporting documentation, is accessible through the resources provided in the footnote citations.
IOCs
SHA-256 | Description
b6b239fe0974cf09fe8ee9bc5d0502174836a79c53adccdbb1adeb1f15c6845c | The content of the analyzed sample, identified by its SHA-256 hash
Footnotes
[1] O-LLVM Wiki - https://github.com/obfuscator-llvm/obfuscator/wiki
[2] The LLVM Compiler Infrastructure - https://llvm.org/
[3] MODeflattener GitHub Repository - https://github.com/mrT4ntr4/MODeflattener
[4] mrT4ntr4's MODeflattener Post - https://mrt4ntr4.github.io/MODeflattener/
[5] GitHub link to the code.