Tuesday, February 8, 2011

Problem with Bash output redirection

I was trying to remove all the lines of a file except the last one, but the following command left the file empty, even though file.txt was not empty beforehand.

$cat file.txt |tail -1 > file.txt

$cat file.txt

Why is it so?

  • It seems not to like the fact that you're writing back to the same filename. If you do the following it works:

    $cat file.txt | tail -1 > anotherfile.txt
    
  • Redirecting from a file through a pipeline back to the same file is unsafe: if the shell truncates file.txt while setting up the last stage of the pipeline before cat, in the first stage, has read it, you end up with empty output.

    Do the following instead:

    tail -1 file.txt >file.txt.new && mv file.txt.new file.txt
    

    ...well, actually, don't do that in production code; particularly if you're in a security-sensitive environment and running as root, the following is more appropriate:

    TEMPFILE="$(mktemp -t rewrite_file_txt.XXXXXX)"
    chown --reference=file.txt "${TEMPFILE}"
    chmod --reference=file.txt "${TEMPFILE}"
    tail -1 file.txt >"${TEMPFILE}" && mv "${TEMPFILE}" file.txt
    

    Another approach (avoiding temporary files) is the following:

    LASTLINE="$(tail -1 file.txt)"; cat >file.txt <<<"${LASTLINE}"
    

    (The above implementation is bash-specific, but works in cases where echo does not -- such as when the last line contains "--version", for instance).

    Finally, one can use sponge from moreutils:

    tail -1 file.txt | sponge file.txt
    
    Marcel Levy : Note that tail accepts a filename as an argument: "tail -1 file.txt > file.txt.new && mv file.txt.new file.txt"
    Charles Duffy : @Marcel Levy - quite right, and it has the potential to run more efficiently that way; updated.
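
The remark above about echo can be checked with a small sketch. It assumes a scratch file created with mktemp and a last line of "-n", which many echo implementations treat as an option; the variable names are illustrative only. printf is used here as a portable alternative to the bash-only here-string.

```shell
# Scratch file whose last line looks like an echo option (illustrative data).
demo_file="$(mktemp)"
printf 'first\nsecond\n-n\n' > "$demo_file"

# Capture the last line before the file is opened for writing.
last_line="$(tail -n 1 "$demo_file")"

# printf '%s\n' prints its argument literally, whatever it looks like.
printf '%s\n' "$last_line" > "$demo_file"

cat "$demo_file"
```

Because the command substitution finishes before the redirection on the next line is set up, the file is only truncated after its last line has been read.
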
  • As Lewis Baumstark says, it doesn't like it that you're writing to the same filename.

    This is because the shell opens "file.txt" and truncates it to set up the redirection before "cat file.txt" is run. So you have to write to a different file and rename it afterwards:

    tail -1 file.txt > file2.txt; mv file2.txt file.txt
    
    From wnoise
  • tail -1 > file.txt will overwrite your file, causing cat to read an empty file because the re-write will happen before any of the commands in your pipeline are executed.

    From dsm
  • Before 'cat' gets executed, Bash has already opened 'file.txt' for writing, clearing out its contents.

    In general, don't write to files you're reading from in the same statement. This can be worked around by writing to a different file, as above:

    $cat file.txt | tail -1 >anotherfile.txt
    $mv anotherfile.txt file.txt
    or by using a utility like sponge from moreutils:
    $cat file.txt | tail -1 | sponge file.txt
    This works because sponge waits until its input stream has ended before opening its output file.

    From ephemient
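
The buffering behaviour described for sponge above can be approximated in plain shell. This is only a sketch of the idea, not how moreutils implements it; the function name soak and the scratch file are hypothetical.

```shell
# Hypothetical sponge-like helper: read all of stdin first, then write the target.
soak() {
    soak_buffer="$(mktemp)" || return 1
    cat > "$soak_buffer"          # blocks until the upstream command closes the pipe
    cat "$soak_buffer" > "$1"     # the target is opened (and truncated) only now
    rm -f "$soak_buffer"
}

demo_file="$(mktemp)"
printf 'one\ntwo\nthree\n' > "$demo_file"

tail -n 1 "$demo_file" | soak "$demo_file"
cat "$demo_file"
```

The key point is the ordering inside soak: the target name is not touched until the first cat has seen end-of-file on its input.
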
  • When you submit your command string to bash, it does the following:

    1. Creates an I/O pipe.
    2. Starts "/usr/bin/tail -1", reading from the pipe, and writing to file.txt.
    3. Starts "/usr/bin/cat file.txt", writing to the pipe.

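    By the time 'cat' starts reading, 'file.txt' has almost always been truncated already: the file is opened for writing (which empties it) while the redirection for 'tail' is being set up, before 'tail' itself runs.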

    That's all part of the design of Unix and the shell environment, and goes back all the way to the original Bourne shell. 'Tis a feature, not a bug.
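
The ordering described above can be observed without a pipeline at all. In this sketch (the scratch file and variable names are assumptions for illustration), the command on the left of the redirection never touches the file, yet the file is emptied, because the target is opened and truncated before the command runs.

```shell
demo_file="$(mktemp)"
printf 'one\ntwo\nthree\n' > "$demo_file"

# `true` reads and writes nothing; the truncation comes from the redirection setup.
true > "$demo_file"

# Arithmetic expansion strips any padding some wc implementations add.
size_after=$(( $(wc -c < "$demo_file") ))
echo "size after redirection: $size_after"
```
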

  • tmp=$(tail -1 file.txt); echo $tmp > file.txt;

    wnoise : nicely avoids temporary files, but it should be quoted to avoid all the standard issues.
    From Ken
  • You can use sed to delete all lines but the last from a file:

    sed -i '$!d' file
    
    • -i tells sed to edit the file in place; otherwise, the result would be written to standard output.
    • $ is the address that matches the last line of the file.
    • d is the delete command. In this case, it is negated by !, so all lines not matching the address will be deleted.
    From ctstone
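
The sed address and command above can also be tried without -i, whose syntax differs between GNU and BSD sed. The sketch below assumes a scratch file from mktemp and writes through a second temporary file; all names are illustrative.

```shell
demo_file="$(mktemp)"
printf 'one\ntwo\nthree\n' > "$demo_file"

# '$!d' deletes every line that is not the last one; output goes to a new file.
# Note: mktemp creates the new file with restrictive permissions, which mv carries over.
new_file="$(mktemp)"
sed '$!d' "$demo_file" > "$new_file" && mv "$new_file" "$demo_file"

cat "$demo_file"
```

GNU sed's -i likewise works through a temporary file and a rename rather than rewriting the original file's blocks, so the end result is comparable.
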
  • This works nicely in a Linux shell:

    FILTER="tail -1"
    FILE=file.txt
    
    $FILTER < "$FILE" \
      | dd conv=notrunc of="$FILE" 2>&1 \
      | grep bytes \
      | cut -d ' ' -f 1 \
      | xargs -I '{}' dd bs=1 seek='{}' if=/dev/null of="$FILE"
    

    dd's "notrunc" option is used to write the filtered contents back in place, while a second dd invocation (seeking to the byte count reported by the first) is needed to actually truncate the file. If the new file size is greater than or equal to the old file size, the second dd invocation is not necessary.

    The advantages of this over a file copy method are: 1) no additional disk space necessary, 2) faster performance on large files, and 3) pure shell.

    From m104
  • Just for this case it's possible to use

    cat < file.txt | (rm file.txt; tail -1 > file.txt)
    This relies on "file.txt" being opened for "cat" before the subshell in "(...)" gets to run "rm file.txt". "rm" removes the name from the directory before the subshell opens it for writing on behalf of "tail", so that redirection creates a new, empty file. The old contents remain available to "cat" through its already-open descriptor until it closes stdin. Be sure that the command runs to completion, though, or the contents of "file.txt" will be lost.

    From ony
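
The trick above relies on an unlinked file staying readable through a descriptor that is already open. A sketch of the same idea without the pipeline, assuming a scratch file and an arbitrarily chosen descriptor number 3:

```shell
demo_file="$(mktemp)"
printf 'one\ntwo\nthree\n' > "$demo_file"

exec 3< "$demo_file"            # keep the old contents reachable via descriptor 3
rm "$demo_file"                 # remove the name; the data survives while fd 3 is open
tail -n 1 <&3 > "$demo_file"    # creates a brand-new file under the same name
exec 3<&-                       # close the descriptor; the old data can now be freed

cat "$demo_file"
```

As with the original command, an interruption between rm and the end of tail loses the old contents, and the new file gets default ownership and permissions rather than those of the original.
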
  • echo "$(tail -1 file.txt)" > file.txt
    
    From ghostdog74
