I want to make sure that nobody has changed a file. To do that, I want to check not only the MD5 sum of the file but also its size, since, as far as I understand, this simple additional check makes falsification harder by several orders of magnitude.
Can I trust the size that stat returns? I don't mean the case where stat itself has been tampered with; I'm not going that deep. But, for instance, could someone falsify the file size that stat returns by hacking the directory file, or by similar means that do not require superuser privileges?
It's Linux.
-
Why do you care about the size of the file? Comparing MD5 sums will tell you with absolute certainty if the file has changed or not. Flipping bits within the file retains the file size, yet the result can be a completely different file.
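The same-size case can be illustrated with a short sketch. It is not from the thread: the helper names `fingerprint`, `is_unchanged` and `demo` are hypothetical, and SHA-256 is used in place of MD5 as an assumption. It records a file's size and digest, overwrites one byte in place, and checks that the size still matches while the digest does not.

```python
import hashlib
import os
import tempfile


def fingerprint(path, chunk_size=65536):
    """Return (size_in_bytes, sha256_hex) for the file at path."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return os.stat(path).st_size, digest.hexdigest()


def is_unchanged(path, baseline):
    """Compare the current fingerprint with a previously recorded one."""
    return fingerprint(path) == baseline


def demo():
    """Record a baseline, then change one byte without changing the size."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.bin")  # hypothetical file name
        with open(path, "wb") as handle:
            handle.write(b"hello world\n")
        baseline = fingerprint(path)
        before = is_unchanged(path, baseline)
        with open(path, "r+b") as handle:
            handle.write(b"J")  # overwrite the first byte in place
        after = is_unchanged(path, baseline)
        same_size = os.stat(path).st_size == baseline[0]
        return before, after, same_size
```

Under these assumptions, `demo()` reports that the file matched its baseline before the edit, no longer matches afterwards, and still has the original size, so in this case the size check adds nothing that the digest does not already catch.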
: "Comparing MD5 sums will tell you with absolute certainty if the file has changed or not." - No. See http://en.wikipedia.org/wiki/Pigeonhole_principle
SvenW : Yes, MD5 is broken, but I don't think it's broken enough to fear an attack by your users(!) and still rule out a rootkit infection (which could alter both the output of stat and md5). Also, just use a better hash, like SHA-2 or something.
: @SvenW Any hash function can return only a finite number of different values. So there are always different arguments for a hash function that return the same value, since the number of different arguments is infinite. However, it's much harder (not to say impossible) to find different arguments of the same length that have the same hash function value.
SvenW : Well, in theory you are right. But practically, creating documents with the same hash and size, or even just the same hash, is impossible, at the very least for hash functions that are much better than MD5, i.e. SHA-2. Again, I wonder what your attack scenario might be?
: You could always store more than one hash for your files. Why not store both SHA and MD5 hashes? Hacking both may be very difficult!
From Brian Tillman -
Here's a demo of sparse files which is one way size can be misleading:
    $ dd if=/dev/zero of=sparse.out bs=512 seek=100000 count=0
    0+0 records in
    0+0 records out
    0 bytes (0 B) copied, 7.5053e-05 s, 0.0 kB/s
    $ echo hi>>sparse.out
    $ ls -l sparse.out
    -rw-r--r-- 1 user group 51200003 2010-04-13 02:09 sparse.out
    $ stat sparse.out
      File: `sparse.out'
      Size: 51200003   Blocks: 24   IO Block: 4096   regular file
    Device: 802h/2050d   Inode: 1111111   Links: 1
    Access: (0644/-rw-r--r--)  Uid: ( 1111/ user)   Gid: ( 1111/ group)
    Access: 2010-04-13 02:09:11.000000000 -0500
    Modify: 2010-04-13 02:09:09.000000000 -0500
    Change: 2010-04-13 02:09:09.000000000 -0500
    $ hexdump -C sparse.out
    00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    *
    030d4000  68 69 0a                                          |hi.|
    030d4003
    $ du sparse.out
    12      sparse.out

As you can see, the byte count in ls and stat shows the apparent size of the file, holes included, but only the block count of stat and the output of du are even close to the actual contents of the file.
: +1 for an interesting fact, but it's not exactly what I was looking for. The file appears to be 51200003 bytes long, whether you read it or check its size with `stat`, so it doesn't matter how it is physically stored in the filesystem. So as far as I can see, it doesn't compromise the file size in any way.
janneb : Well, 'ls -l' does an lstat() syscall, as do 'du' and the 'stat' command-line tool. The difference is just which field(s) they read from the stat struct the syscall returns.
From Dennis Williamson -
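The two numbers compared in this demo can also be read from a single stat call. The following is a minimal sketch, not something posted in the thread: `sparse_report` and `demo` are hypothetical names, and it assumes a Unix system where `os.stat` exposes `st_blocks` in 512-byte units, as it does on Linux.

```python
import os
import tempfile


def sparse_report(path, hole_bytes=51200000, tail=b"hi\n"):
    """Create a file with a leading hole; return (apparent, allocated) bytes."""
    with open(path, "wb") as handle:
        handle.seek(hole_bytes)  # skip ahead without writing any data
        handle.write(tail)       # only these bytes are actually written
    info = os.stat(path)
    # st_size is the apparent length; st_blocks counts 512-byte units on disk.
    return info.st_size, info.st_blocks * 512


def demo():
    """Build the file in a temporary directory and report both sizes."""
    with tempfile.TemporaryDirectory() as tmp:
        return sparse_report(os.path.join(tmp, "sparse.out"))
```

On a filesystem that supports sparse files, the second value should be far smaller than the first; on one that does not, the two can be close. Either way the apparent size is what a reader of the file sees, which matches the point made in the comments.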
You ask whether someone could compromise the file size returned by stat by hacking the directory file. No, that's not possible. The directory is simply a list of file names and inode numbers. All of the other file information (owner, group, mode, size, etc.) is contained in the inode (at least in POSIX-compliant file systems), and that is where stat collects this information from.
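One way to see this split between names and inode data is with a hard link, which adds a second directory entry for the same inode. This is a sketch under assumptions rather than anything from the thread: the file names and the `demo` helper are hypothetical, and it needs a filesystem that allows hard links.

```python
import os
import tempfile


def demo():
    """Give one inode two names and compare what stat reports for each."""
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "data.txt")
        alias = os.path.join(tmp, "alias.txt")
        with open(original, "wb") as handle:
            handle.write(b"12345")
        os.link(original, alias)  # new directory entry, same inode
        first, second = os.stat(original), os.stat(alias)
        # Reading the directory yields names and inode numbers, not sizes.
        listing = {entry.name: entry.inode() for entry in os.scandir(tmp)}
        return {
            "same_inode": first.st_ino == second.st_ino,
            "sizes": (first.st_size, second.st_size),
            "links": first.st_nlink,
            "names": sorted(listing),
        }
```

Both names report the same size because the size is stored once, in the inode, and neither directory entry carries a copy of it.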
From TCampbell