Caesar Cipher, Next Level: How to Tell Whether a Ciphertext Was Randomly Generated
2019-08-28 21:47
'I chose Python as a working title for the project, being
in a slightly irreverent mood (and a big fan of Monty Python's
Flying Circus).' - Guido van Rossum
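Driven by task scenarios
Worked lessons that cover the key knowledge and skills
A 30-lesson course for the kids at Dingdingmao Makers (丁丁猫创客)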
'Don't set out to learn Python. Choose a problem you're
interested in and learn to solve it with Python.'
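Keywords
generated randomly
import matplotlib.pyplot as plt
visualization: letting the data present the conclusion
wrap-around
shifting past 'z' continues again from 'a' (the original post illustrates this with a figure)
caesar: among the oldest known encryption methods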

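Common scenarios
Get the essentials down; this lesson covers:
How to randomly generate characters and paragraphs of text
How to recognize the telltale features of randomly generated text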
import matplotlib.pyplot as plt
alphabet = 'abcdefghijklmnopqrstuvwxyz'
code = '''
swodkdbkfovvobpbywkxkxds
aeovkxngrycksndgyfkcdkxn
dbexuvoccvoqcypcdyx
ocdkxnsxdronocobdxokbdro
wyxdrockxnrkvpcexukcrkdd
obonfsckqovsocgrycop
bygxkxngbsxuvonvszkxncxo
obypmyvnmywwkxndovvdrkds
dccmevzdybgovvdrycoz
kccsyxcbokngrsmriodcebfs
focdkwzonyxdrocovspovocc
drsxqcdrorkxndrkdwym
uondrowkxndrorokbddrkdpo
nkxnyxdrozonocdkvdrocogy
bnckzzokbwixkwoscyji
wkxnskcusxqypusxqcvyyuyx
wigybuciowsqrdikxnnoczks
bxydrsxqlocsnobowksx
cbyexndronomkiypdrkdmyvy
cckvgbomulyexnvocckxnlkb
odrovyxokxnvofovckxn
ccdbodmrpkbkgki
'''
letter_counts = [code.count(l) for l in alphabet]
letter_colors = plt.cm.hsv([0.8*i/max(letter_counts) for i in letter_counts])
plt.bar(range(26), letter_counts, color=letter_colors)
plt.xticks(range(26), alphabet) # letter labels on x-axis
plt.tick_params(axis='x', bottom=False) # no ticks, only labels on x-axis
plt.title('Frequency of each letter')
plt.savefig('output.png')
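In the previous lesson we learned that the Caesar cipher converts each letter of the plaintext to its corresponding number, for example with Python's ord() function, and then adds or subtracts a fixed value to each number. To decrypt, apply the inverse operation.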
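As a reminder of how the shift and its wrap-around fit together, here is a minimal sketch. It is not the code from the previous lesson; the function name caesar_shift is made up for illustration, and it assumes only lowercase a-z are shifted while everything else is left alone.

```python
import string

ALPHABET = string.ascii_lowercase  # 'abcdefghijklmnopqrstuvwxyz'

def caesar_shift(text, shift):
    """Shift each lowercase letter by `shift` places, wrapping around past 'z'."""
    result = []
    for ch in text:
        if ch in ALPHABET:
            # ord() turns the letter into a number; % 26 provides the wrap-around
            offset = (ord(ch) - ord('a') + shift) % 26
            result.append(chr(ord('a') + offset))
        else:
            result.append(ch)  # spaces, digits and punctuation stay as they are
    return ''.join(result)

secret = caesar_shift('xyz attack at dawn', 3)
print(secret)                    # → abc dwwdfn dw gdzq
print(caesar_shift(secret, -3))  # → xyz attack at dawn
```

Decrypting is simply the same function with the opposite shift, which is why a single helper is enough here.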
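Today's challenge: the enemy is deliberately trying to confuse us. Each Dingdingmao student receives two texts of jumbled letters. Both look random, but they are not the same: one of them is meaningful intelligence encrypted with a Caesar cipher.
The first step is to work out which of the two is the Caesar-encrypted one; only then can the original message be recovered. Students who do not feel like reading the quoted English passages can skip them, since the main text is enough to follow the lesson.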
Recall practice task 2 from the previous lesson: which letter appears most often in the ciphertext?
We now need the result of that task.
message = '''
Once upon a midnight dreary, while I pondered, weak and weary,
Over many a quaint and curious volume of forgotten lore—
While I nodded, nearly napping, suddenly there came a tapping,
As of some one gently rapping, rapping at my chamber door—'''
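Now let's find out how often each letter occurs in message.
To make any pattern easier to spot, the frequencies are shown as a chart,
which means writing some code to count them: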
text = code  # swap in message to analyze the poem instead
alphabet = 'abcdefghijklmnopqrstuvwxyz'
def count_most(text):
    # walk through the 26 letters a-z
    bench, res, e_most = 0, sorted(text), None
    for e in alphabet:
        # e_most is the most frequent letter so far, bench is how often it occurs
        if res.count(e) > bench:
            bench = res.count(e)
            e_most = e
    return e_most, bench
print(count_most(text))
('g', 27, 'z', ['a', 'a', 'a', 'a', 'a', 'a', 'a', 'a', 'a',
'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c',
'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'c', 'd', 'd', 'e',
'e', 'e', 'e', 'e', 'f', 'f', 'f', 'f', 'f', 'f', 'f', 'f', 'f',
'f', 'f', 'f', 'f', 'f', 'f', 'g', 'g', 'g', 'g', 'g', 'g', 'g',
'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g', 'g',
'g', 'g', 'g', 'g', 'g', 'g', 'g', 'h', 'h', 'h', 'i', 'i', 'i',
'i', 'i', 'i', 'i', 'i', 'i', 'j', 'j', 'j', 'j', 'j', 'j', 'j',
'j', 'k', 'k', 'k', 'k', 'k', 'k', 'k', 'k', 'k', 'k', 'k', 'k',
'k', 'k', 'k', 'k', 'k', 'k', 'k', 'm', 'n', 'n', 'n', 'n', 'n',
'n', 'n', 'n', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o', 'o',
'o', 'o', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p',
'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p', 'p',
'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q',
'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'q', 'r', 'r', 'r',
'r', 'r', 'r', 'r', 'r', 'r', 'r', 'r', 'r', 's', 't', 't', 't',
't', 't', 't', 't', 't', 't', 't', 't', 't', 't', 't', 't', 't',
't', 't', 't', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'u', 'v', 'v',
'v', 'v', 'v', 'v', 'v', 'v', 'v', 'v', 'v', 'v', 'v', 'v', 'v',
'v', 'w', 'w', 'w', 'w', 'w', 'w', 'w', 'x', 'x', 'x', 'y', 'y',
'y', 'y'])
Conclusion: in the run recorded above, the letter 'g' appears most often, 27 times!
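The same question, which letter is most frequent and how often it occurs, can also be answered with the standard library's collections.Counter. This is only an alternative sketch, not the lesson's own function; the name most_common_letter is made up, and it assumes that upper-case letters should be counted together with lower-case ones.

```python
from collections import Counter
import string

def most_common_letter(text):
    """Return (letter, count) for the most frequent a-z letter, ignoring case."""
    counts = Counter(ch for ch in text.lower() if ch in string.ascii_lowercase)
    if not counts:
        return None, 0  # no letters at all
    return counts.most_common(1)[0]

sample = 'Once upon a midnight dreary, while I pondered, weak and weary,'
print(most_common_letter(sample))  # → ('e', 7)
```

In ordinary English the winner is usually 'e', which is exactly the kind of unevenness the rest of this lesson relies on.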
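Okay! Now we need a little imagination:
Given two texts, one made of randomly generated letters and one a meaningful passage, is there some feature that tells them apart?
Here is the key point: random means that every letter shows up about equally often. The example above supports our guess, because 'g' appears noticeably more often than the other letters!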
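To see what "about equally often" looks like in practice, one can generate random letters and count them. The sketch below is an illustration rather than part of the original lesson; the helper random_letters and its seed parameter are assumptions added so that the experiment can be repeated.

```python
import random
import string
from collections import Counter

def random_letters(n, seed=None):
    """Return n letters drawn uniformly at random from a-z."""
    rng = random.Random(seed)  # a fixed seed makes the result repeatable
    return ''.join(rng.choice(string.ascii_lowercase) for _ in range(n))

text = random_letters(2600, seed=42)
counts = Counter(text)
expected = len(text) / 26  # 100 occurrences per letter if perfectly even

print('expected per letter:', expected)
print('rarest letter count:', min(counts.values()))
print('most common letter count:', max(counts.values()))
```

The rarest and most common counts should both land fairly close to the expected value. Short random texts look less even than long ones, so it helps to compare texts of similar length.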
To test this hypothesis, we plot the letter frequency distribution of each of the two texts.
Text1 = '''
yfdpcpoplhhwdpssbjnsqvtlcpzpxqugtjphvgotuvwxufgoqigxwgkskduooyeuoue
fjlnmsqpgxrmcseeliswdheywseqgcbeothskxdzekgxmmkildjnaqbukprpfaaknsu
qpdwayqaqfxsoapvsgreqydqjnkpjghvrkygtidzibhrqkmocukhcunpjcazzvomtsc
fgycwfltmiegaejwcqrgsnxxcbtcrckufwsdxdhbxgppxcuzapbdhftzmugryfseavv
bssqlxanvmfwwzityziixasivzkmvtfczqmdgkabcnjbyhaoealengfptuedlmvryeb
titbwqkekzdpmbtiphdkwwiduassvbgalxgrfhrjrjplxpujrprqzcpcdqsjorigazt
kwwlnwbjryrzhgcttroyemuwwixwufymnknirzmexyowobvardlqktzajzoijwulomg
ztefdpftjealzapcgipgaaspuzxklvd
'''
Text2 = '''
swodkdbkfovvobpbywkxkxdsaeovkxngrycksndgyfkcdkxndbexuvoccvoqcypcdyx
ocdkxnsxdronocobdxokbdrowyxdrockxnrkvpcexukcrkddobonfsckqovsocgrycop
bygxkxngbsxuvonvszkxncxoobypmyvnmywwkxndovvdrkdsdccmevzdybgovvdrycoz
kccsyxcbokngrsmriodcebfsfocdkwzonyxdrocovspovoccdrsxqcdrorkxndrkdwym
uondrowkxndrorokbddrkdponkxnyxdrozonocdkvdrocogybnckzzokbwixkwoscyji
wkxnskcusxqypusxqcvyyuyxwigybuciowsqrdikxnnoczksbxydrsxqlocsnobowksx
cbyexndronomkiypdrkdmyvycckvgbomulyexnvocckxnlkbodrovyxokxnvofovckxn
ccdbodmrpkbkgki
'''
Above are the two pieces of intelligence. We do not yet know which one is valuable.
Below you see two strings of letters. Both seem random, but
one of them is a meaningful text encoded with a Caesar cipher. One
way of telling coded messages apart from random noise is to look at
the letter frequencies: if a few letters appear
significantly more often than the rest, as is usually the case in
written language, then the text is most likely not randomly
generated.
To help you decide which text is which, here is a program
that can show how often each letter appears as a bar graph. Copy
each text into the indicated line and run the program to see
it.
Which text contains a secret message?
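The two texts turn out to look clearly different:
In Text1, the letters occur with fairly even frequency.
In Text2, letters such as o, k, d and x occur far more often than average!
The Dingdingmao students should therefore pick the second text to crack. Anyone interested can read the English explanation that follows.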
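The visual impression can also be backed up with a single number. The sketch below is not from the original lesson: it uses a chi-squared statistic against a perfectly even distribution, and the names uniformity_score and more_uneven are made up for illustration. A score of 0 means every letter occurs exactly equally often; the more a few letters dominate, the larger the score.

```python
import string

def uniformity_score(text):
    """Chi-squared statistic of the letter counts against an even spread over a-z."""
    letters = [ch for ch in text.lower() if ch in string.ascii_lowercase]
    if not letters:
        return 0.0
    expected = len(letters) / 26  # count each letter would have if perfectly even
    return sum((letters.count(l) - expected) ** 2 / expected
               for l in string.ascii_lowercase)

def more_uneven(text_a, text_b):
    """Return 1 or 2: which text has the less even letter distribution."""
    return 1 if uniformity_score(text_a) > uniformity_score(text_b) else 2

print(uniformity_score('abcdefghijklmnopqrstuvwxyz'))  # → 0.0
print(uniformity_score('e' * 26))                      # → 650.0
print(more_uneven('abcdefghijklmnopqrstuvwxyz', 'ooookkkkddddxxxxccccoooook'))  # → 2
```

Passing Text1 and Text2 to more_uneven is the numerical counterpart of comparing the two bar charts by eye. The score grows with text length, so it is only meaningful for texts of similar size.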
Correct answer: Text 2
Here are the letter distributions of both texts, side by side:
[Figure: bar charts of the letter frequencies in Text 1 and Text 2]
The letters in the first text occur much more uniformly than in the second, where a few letters appear very often and a good portion of the alphabet almost not at all. This sort of uneven letter distribution is characteristic of a natural language text. The uniform distribution of the first text is a strong sign that it has been generated randomly.
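Once the encrypted text has been identified, the uneven distribution also hints at the key. A common heuristic, sketched here under the assumption that the plaintext is English and long enough for 'e' to be its most frequent letter, is to line up the most common ciphertext letter with 'e'. The names guess_shift and decrypt are made up for illustration and are not taken from the lesson.

```python
import string
from collections import Counter

ALPHABET = string.ascii_lowercase

def guess_shift(ciphertext, assumed_plain='e'):
    """Guess the shift by assuming the most common letter stands for `assumed_plain`."""
    counts = Counter(ch for ch in ciphertext if ch in ALPHABET)
    if not counts:
        return 0
    top = counts.most_common(1)[0][0]
    return (ord(top) - ord(assumed_plain)) % 26

def decrypt(ciphertext, shift):
    """Undo a Caesar shift on lowercase letters; other characters are kept."""
    return ''.join(
        ALPHABET[(ALPHABET.index(ch) - shift) % 26] if ch in ALPHABET else ch
        for ch in ciphertext
    )

# Self-made example: encrypt with shift 10 (a negative shift in decrypt), then recover it.
plain = 'meet me where the three trees meet'
cipher = decrypt(plain, -10)
print(guess_shift(cipher))                   # → 10
print(decrypt(cipher, guess_shift(cipher)))  # → meet me where the three trees meet

# The first 31 letters of Text 2, decrypted with a shift of 10:
opening = 'swodkdbkfovvobpbywkxkxdsaeovkxn'
print(decrypt(opening, 10))  # → imetatravellerfromanantiqueland
```

On very short fragments the most common letter may not stand for 'e', so the guess is best made on the full text and checked by reading the result.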
Follow the link
Python beginner practice: the Caesar cipher
Words from the experts
'Don't set out to learn Python. Choose a problem
you're interested in and learn to solve it with Python.' -
@jakevdp
'It’s not at all important to get it right the
first time. It’s vitally important to get it right the last time.'
- The Pragmatic Programmer
'Every great developer you know got there by
solving problems they were unqualified to solve until they actually
did it.' - Patrick McKenzie
Related reading
Quick reference: Python cheat sheet
Python cheat sheet for beginners, part 2
Python cheat sheet for beginners, part 3
Python cheat sheet for beginners, part 4
Python cheat sheet, part 5: readability
Quick-reference keywords
1. Collections: List, Dictionary, Set, Tuple, Range, Enumerate, Iterator, Generator.
2. Types: Type, String, Regular_Exp, Format, Numbers, Combinatorics, Datetime.
3. Syntax: Args, Inline, Closure, Decorator, Class, Duck_Types, Enum, Exceptions.
4. System: Print, Input, Command_Line_Arguments, Open, Path, Command_Execution.
5. Data: CSV, JSON, Pickle, SQLite, Bytes, Struct, Array, MemoryView, Deque.
6. Advanced: Threading, Operator, Introspection, Metaprogramming, Eval, Coroutine.
7. Libraries: Progress_Bar, Plot, Table, Curses, Logging, Scraping, Web, Profile, NumPy, Image, Audio.