Tuesday, January 25, 2011

Move 53,800+ files into 54 separate folders with ~1000 files each?

Trying to import 53,800+ individual files (messages) using Gmail's POP fetcher.

Gmail understandably refuses, giving the error: "Too many messages to download. There are too many messages on the other server."

The folder in question looks similar to:

/usr/home/customer/Maildir/cur/1203672790.V57I586f04M867101.mail.net:2,S
/usr/home/customer/Maildir/cur/1203676329.V57I586f22M520117.mail.net:2,S
/usr/home/customer/Maildir/cur/1203677194.V57I586f26M688004.mail.net:2,S
/usr/home/customer/Maildir/cur/1203679158.V57I586f2bM182864.mail.net:2,S
/usr/home/customer/Maildir/cur/1203680493.V57I586f33M740378.mail.net:2,S
/usr/home/customer/Maildir/cur/1203685837.V57I586f0bM835200.mail.net:2,S
/usr/home/customer/Maildir/cur/1203687920.V57I586f65M995884.mail.net:2,S
...

Using the shell (tcsh, sh, etc. on FreeBSD), what one-line command can I type to split this directory full of files into separate folders so Gmail only sees 1000 messages at a time? Something with find or ls | xargs mv maybe. Whatever is fastest.

The desired output directory would now look something like:

/usr/home/customer/Maildir/cur/1203672790.V57I586f04M867101.mail.net:2,S
/usr/home/customer/Maildir/cur/1203676329.V57I586f22M520117.mail.net:2,S
...
/usr/home/customer/set1/ (contains messages 1-1000)
/usr/home/customer/set2/ (contains messages 1001-2000)
/usr/home/customer/set3/ (etc.)
  • It's not a one-liner, but you could copy chunks like this (in csh):

    foreach file (`ls | head -n 1000`)
    mv $file /tmp/new/dir
    end
    

    I'm not 100% sure that pipe will work with the number of files you've got, but it's worth a shot. Each run moves one batch, so repeat it with a new target directory for each set. To move 500 at a time instead, change the 1000 to 500.

    ane : Thanks very much Chris - this worked great.
    warren : in bash: `for file in $(ls | head -n 1000); do mv "$file" /tmp/new/dir; done`
    ane : Thanks Warren. That will work too.
    From Chris S
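
The loop above moves a single batch of 1000 files, so it has to be repeated once per target folder. A minimal sketch of doing the whole split in one pass with plain sh follows. The function name split_into_sets and its arguments are made up for illustration, and it assumes the message names do not start with a dot (true of the Maildir names shown in the question). It uses a glob rather than ls and quotes every expansion.

```shell
# split_into_sets SRC DEST_PREFIX [BATCH]   (hypothetical helper, not from the thread)
# Moves every regular file in SRC into DEST_PREFIX1, DEST_PREFIX2, ...
# with at most BATCH files per directory (default 1000).
split_into_sets() {
    src=$1
    prefix=$2
    batch=${3:-1000}
    count=0
    for file in "$src"/*; do
        [ -f "$file" ] || continue             # skip subdirectories and an unmatched glob
        set_no=$((count / batch + 1))          # files 1..BATCH go to set 1, and so on
        mkdir -p "${prefix}${set_no}" || return 1
        mv "$file" "${prefix}${set_no}/" || return 1
        count=$((count + 1))
    done
    echo "moved $count files"
}

# Example call for the layout in the question (left commented out on purpose):
# split_into_sets /usr/home/customer/Maildir/cur /usr/home/customer/set
```

Because the glob is expanded inside the shell rather than passed to an external command, the number of files should not run into argument-length limits, though expanding tens of thousands of names can take a moment.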
  • count=0; target=0;
    
    find srcdir/ -type f |
        while IFS= read -r file; do
    
            count=$((count+1));
            target=$((count/1000)); #about 1000 files per directory, as the question asks
    
            [ -d "$target" ] || mkdir "$target"
    
            echo mv "$file" "$target"; #remove the 'echo' if you like what you see
        done
    

    collapsed to a single line (and with the 'echo' safeguard removed):

    count=0; target=0; find srcdir/ -type f | while IFS= read -r file; do count=$((count+1)); target=$((count/1000)); [ -d "$target" ] || mkdir "$target"; mv "$file" "$target"; done
    

    This isn't the fastest, but it's clean. It avoids parsing the output of 'ls' (see http://mywiki.wooledge.org/ParsingLs) and quotes every reference to "$file". Without the quotes, files with unusual names (as Maildir files often have) would break the code: if a name contains spaces or glob characters such as '*', the shell splits or expands an unquoted $file before mv ever sees it. The target directories (0, 1, 2, ...) are created in the current working directory.

    Read more about quoting: http://mywiki.wooledge.org/Quotes and http://wiki.bash-hackers.org/syntax/words

    ane : Thanks Stuart !
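
To make the quoting point concrete, here is a small demonstration that runs in a scratch directory. The file name is invented for illustration and contains a space, and the sketch assumes the scratch path from mktemp has no spaces of its own.

```shell
# Scratch area so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/src" "$demo/dest"
: > "$demo/src/message one:2,S"          # made-up name containing a space

file="$demo/src/message one:2,S"

# Unquoted: the shell splits $file at the space, so mv is handed two
# source paths that do not exist and fails.
mv $file "$demo/dest/" 2>/dev/null || echo "unquoted mv failed"

# Quoted: mv receives the whole name as a single argument and succeeds.
mv "$file" "$demo/dest/" && echo "quoted mv succeeded"
```

The colons and commas in real Maildir names are harmless to the shell either way; whitespace and glob characters are what make the quotes matter.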
