Python hashlib MD5 digest of any UNC file always yields same hash
The code below shows that three files on a UNC share hosted on another machine have the same hash. It also shows that local files have different hashes. Why would this be? I feel like there is some UNC consideration that I don't know about.
Python 2.7.5 (default, May 15 2013, 22:44:16) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import hashlib
>>> fn_a = '\\\\some.host.com\\shares\\folder1\\file_a'
>>> fn_b = '\\\\some.host.com\\shares\\folder1\\file_b'
>>> fn_c = '\\\\some.host.com\\shares\\folder2\\file_c'
>>> fn_d = 'e:\\file_d'
>>> fn_e = 'e:\\file_e'
>>> fn_f = 'e:\\folder3\\file_f'
>>> f_a = open(fn_a, 'r')
>>> f_b = open(fn_b, 'r')
>>> f_c = open(fn_c, 'r')
>>> f_d = open(fn_d, 'r')
>>> f_e = open(fn_e, 'r')
>>> f_f = open(fn_f, 'r')
>>> hashlib.md5(f_a.read()).hexdigest()
'54637fdcade4b7fd7cabd45d51ab8311'
>>> hashlib.md5(f_b.read()).hexdigest()
'54637fdcade4b7fd7cabd45d51ab8311'
>>> hashlib.md5(f_c.read()).hexdigest()
'54637fdcade4b7fd7cabd45d51ab8311'
>>> hashlib.md5(f_d.read()).hexdigest()
'd2bf541b1a9d2fc1a985f65590476856'
>>> hashlib.md5(f_e.read()).hexdigest()
'e84be3c598a098f1af9f2a9d6f806ed5'
>>> hashlib.md5(f_f.read()).hexdigest()
'e11f04ed3534cc4784df3875defa0236'
Edit: To investigate the problem further, I tested using a file on a different host. It appears that changing the host changes the result.
>>> fn_h = '\\\\host\\share\\file'
>>> f_h = open(fn_h, 'r')
>>> hashlib.md5(f_h.read()).hexdigest()
'f23ee2dbbb0040bf2586cfab29a03634'
...but then I tried a different file on the new host, and got a new result!
>>> fn_i = '\\\\host\\share\\different_file'
>>> f_i = open(fn_i, 'r')
>>> hashlib.md5(f_i.read()).hexdigest()
'a8ad771db7af8c96f635bcda8fdce961'
So, now I'm confused. Does it have something to do with the fact that the original host is in \\host.com format and the new host is in \\host format?
I did some additional research based on the comments and answers provided. I decided that I needed to study the permutations of these two features of the code:
1. Whether a raw string literal is used for the path name, i.e. whether:
   a. the file path string is raw, with single backslashes in the path, vs.
   b. the file path string is not raw, with double backslashes in the path
   (FYI, for those who don't know, a raw string is one that is preceded by an "r", like this: r'this is a raw string')
2. Whether the open function mode is r or rb.
   (FYI again, for those who don't know, the b in rb mode indicates that the file is read as binary.)
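The raw and non-raw spellings described above can be compared directly. This is a minimal sketch using a made-up UNC path (`\\host\share\file` is a placeholder, not one of the real shares from the question):

```python
# The UNC path below is a hypothetical placeholder used only for illustration.
# A raw literal keeps each backslash as written; a normal literal needs
# every backslash doubled to produce the same characters.
raw_path = r'\\host\share\file'
escaped_path = '\\\\host\\share\\file'

# Both literals produce the same string object contents.
print(raw_path == escaped_path)
print(raw_path)
```

Since open receives an identical string either way, the choice of literal should not be able to affect the resulting hash.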
The results demonstrated:
- The string literal / backslashes make no difference in whether or not the hashes of different files are different.
- My error was not opening the file in binary mode. When using rb mode in open, I got different results.
Yay! And thanks for the help.
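For anyone landing here later, the binary-mode fix might look like the sketch below. The helper names `md5_of_file` and `write_temp_file` are made up for this example, and the two temporary files merely stand in for the real files on the share. Their contents share a header followed by a 0x1A (Ctrl-Z) byte; a likely explanation for the identical hashes, though not something confirmed in this post, is that text mode on Windows treats that byte as end of file, so only the common prefix was being hashed.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of the file at `path`, read as raw bytes."""
    digest = hashlib.md5()
    # 'rb' reads the bytes exactly as stored, with no text-mode translation.
    with open(path, 'rb') as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def write_temp_file(data):
    """Hypothetical helper: write `data` to a new temporary file, return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, 'wb') as handle:
        handle.write(data)
    return path


# Stand-in files: same header, then a 0x1A byte, then different payloads.
path_one = write_temp_file(b'HEADER\x1afirst payload')
path_two = write_temp_file(b'HEADER\x1asecond payload')
try:
    hash_one = md5_of_file(path_one)
    hash_two = md5_of_file(path_two)
    # True when the two files hash differently.
    print(hash_one != hash_two)
finally:
    os.remove(path_one)
    os.remove(path_two)
```

With real files, only the call `md5_of_file(fn_a)` is needed; the temporary files exist just to make the sketch runnable anywhere.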