BigAdmin System Administration Portal
Community Submitted Article
This content is submitted by a BigAdmin user. It has not been reviewed for technical accuracy by Sun Microsystems, though it may have been lightly edited to improve readability. If you find an error or would like to comment on the article, please contact the submitter or use the comment field at the bottom of the article. Community submissions may not follow Sun trademark guidelines. For information on Sun trademarks, please see http://www.sun.com/suntrademarks/.
 
 

How to Safely Move and Copy Files and Directories

Darek Licznerski, June 2006

Abstract: This article discusses the problems that can occur when you move and copy files and directories, and offers solutions.

Introduction

You could say, "This is not a problem for me because I can copy files with the cp or cp -r commands or move files with the mv command." However, these commands can sometimes break something or cause data loss.

I recommend that you avoid the cp and mv commands when you need to copy or move big files, large numbers of files, or large amounts of data. Also avoid them when you need to keep the same owner, group, permissions, and all the other information about the data.

Caveats

NFS

Problems can occur when the source or destination directory is mounted on the local machine from a remote machine, via NFS or any other network protocol. To avoid such problems, check where the source and destination directories really reside before you copy or move anything. For example:

# df -k <directory>
where <directory> - source or destination directory

Or use this for the Solaris 9 Operating System (OS) and later versions:

# df -h <directory>
where <directory> - source or destination directory

Example:

# df -k  /source_directory
Filesystem      kbytes    used     avail      capacity    Mounted on
127.0.0.1:/source
                34750583  28174655 6228423    82%     /source_directory

Here the directory /source from the host with the IP address 127.0.0.1 (in this case localhost) is mounted on a local machine under the directory /source_directory. To check what type of service is used to mount the directory, run a command such as the following:

# mount -v | grep "<directory>"
where <directory> - source or destination directory
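
The check described above could be scripted roughly as follows. This is only a sketch: classify_source and mount_kind are hypothetical helper names, and the sketch assumes that a file system name of the form host:/path in the df output means a network mount. Other network file systems may be named differently, so treat the result as a hint and confirm it with mount -v.

```shell
# Hypothetical helper: classify a file system name as reported by df.
# Assumption: a name of the form host:/path indicates a network mount.
classify_source() {
    case "$1" in
        *:/*) echo remote ;;
        *)    echo local ;;
    esac
}

# Hypothetical helper: classify the file system that holds a directory.
# df -P prints one line per file system; field 1 is the file system name.
mount_kind() {
    classify_source "$(df -P "$1" | awk 'NR == 2 { print $1 }')"
}

# Usage:
#   mount_kind /source_directory
```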

Symbolic Links

Problems can occur when the source or destination directory, or even worse its subdirectories, contain symbolic links.

When you run the cp -r command on a directory that contains symbolic links, the links are followed, and the destination directory receives copies of the real data to which those links pointed instead of the links themselves. If you use the cp -r command to copy a directory that holds about 1 Gbyte of data and a large number of links, the destination will hold 1 Gbyte plus all the data to which those links pointed. Sometimes the result could be huge and could overload your system.

To check if the source directory contains symbolic links, do the following:

# cd <directory>
where <directory> - source or destination directory

# find . -type l | more

To count the number of links, use this:

# find . -type l | wc

Here is the result of copying directory example01 to the new one, example02, using the command cp -r ./example01 ./example02:

# find ./example01 -type l | wc
    2440    2440  171743

# du -ks ./example01
2772092 ./example01

# find ./example02 -type l | wc
       0       0       0 

# du -ks ./example02
6326172 ./example02
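
As a minimal sketch, the link count can be wrapped in a small function and checked before and after a copy. The names count_symlinks and copy_keeping_links are hypothetical. The second function assumes that your cp supports the POSIX -P option, which copies each symbolic link as a link instead of following it; verify this on your release before relying on it.

```shell
# Hypothetical helper: print the number of symbolic links below a directory.
count_symlinks() {
    find "$1" -type l | wc -l | tr -d ' '
}

# Hypothetical helper: copy a tree without following symbolic links.
# Assumption: cp supports -P (defined by POSIX, but check your release).
copy_keeping_links() {
    cp -RP "$1" "$2"
}

# Usage:
#   count_symlinks ./example01
#   copy_keeping_links ./example01 ./example02
#   count_symlinks ./example02
```

If both counts match, the links were copied as links and the destination did not grow by the size of their targets.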

Unknown Destination

Problems can surface when you do not know the real location of the source or destination directory, because either one may point somewhere else.

For example:

# df -k /source_directory
Filesystem         kbytes    used    avail     capacity    Mounted on
/dev/dsk/c0t0d0s0  34750583  28174655 6228423  82%   /source_directory   

# df -k /destination_directory
Filesystem    kbytes    used    avail     capacity    Mounted on
127.0.0.1:/destination
              34750583  28174655  6228423   82% /destination_directory

Here /source_directory is a directory created on a local disk (on the root partition). /destination_directory is the directory /destination on the host with the IP address 127.0.0.1, mounted on the local machine. In this example the address is localhost. However, in a real scenario it could be an address on a different subnet, and the data would travel across the network along a route that you might not know.

When the source or destination directory is mounted through the network, you need to know the following:

  • The actual network route to the destination or source directory
  • How long it takes to copy data to and from the directory
  • Whether the data will be copied through any other machines or devices (like network accelerators)

The best way to find out is to copy some sample data to the destination and measure how long it takes.
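
One way to take such a measurement is sketched below. The function name time_sample_copy is hypothetical, and the sketch assumes a date command that supports the +%s format, which some older releases do not provide. Where it is missing, running the copy under the time command gives the same information.

```shell
# Hypothetical helper: copy one sample file into a directory, print the
# elapsed time in whole seconds, and remove the copy again.
# Assumption: date supports +%s (seconds since the epoch).
time_sample_copy() {
    sample=$1
    dest_dir=$2
    target="$dest_dir/.copy_sample.$$"   # temporary name, removed below
    start=$(date +%s)
    cp "$sample" "$target" || return 1
    end=$(date +%s)
    rm -f "$target"
    echo $((end - start))
}

# Usage:
#   time_sample_copy /source_directory/some_large_file /destination_directory
```

Choose a sample that is large enough to take several seconds, because a resolution of one second says little about a small file.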

Problems With Access

If you are unfamiliar with how to access a destination or source directory, you may be unable to access the directories or subdirectories.

When you decide to move data using the mv command, make sure that no processes are running in the source directory that will be moved.

Sometimes you can't access the mounted directory because you have the wrong permissions on the remote host for the directory you are trying to mount. Make sure that the destination directory is not mounted as a read-only file system. For example:

# df -k /destination_directory
Filesystem kbytes     used     avail    capacity    Mounted on
127.0.0.1:/destination
          34750583   28174655   6228423  82%     /destination_directory 

# mount -v | grep "<directory>"

where <directory> - destination directory

Also, check that no read-only flags are set. The best way is to try to write a sample file to the destination before copying or moving large amounts of data.
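
The write test suggested above could look like the following sketch. The function name can_write_to and the probe file name are hypothetical. The function returns success only if a small file can actually be created in the directory, and it removes the file afterward.

```shell
# Hypothetical helper: succeed only if a file can be created in the directory.
can_write_to() {
    probe="$1/.write_probe.$$"
    # The subshell keeps a failed redirection from ending the calling shell.
    if ( : > "$probe" ) 2>/dev/null; then
        rm -f "$probe"
        return 0
    fi
    return 1
}

# Usage:
#   can_write_to /destination_directory || echo "cannot write to destination"
```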

Interruptions During the Copying or Moving Process

Various problems are possible. For example, suppose you use the mv command and the source or destination directory is mounted through the network. If the network becomes unavailable while the files are being moved, or you close the terminal or kill the mv process by mistake, the data is moved only partially and you can lose data.

A problem also occurs if you run the cp command from the source to the destination directory, with one of them mounted through the network, and the network becomes unavailable while files are being copied. In this case the data is copied only partially.

When you move data from a remote session, for example one in which you export DISPLAY to your own machine, the process is more likely to be interrupted. Make sure that the session from which you run such commands is stable.

To check the terminal name, do the following:

# tty
/dev/console

# tty
/dev/pts/1

If you are logged in to the console you also have control during the reboot. Consider using the nohup command so that a long-running process keeps running if the terminal session is lost.
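
A minimal sketch of the nohup approach follows. The function name run_detached and the log file path in the usage line are hypothetical. The command is started in the background, immune to hangups, with its output collected in a log file that you can inspect later.

```shell
# Hypothetical helper: run a command in the background so that it survives
# a hangup, and collect its output and error messages in a log file.
run_detached() {
    log=$1
    shift
    nohup "$@" > "$log" 2>&1 &
}

# Usage (hypothetical log path):
#   run_detached /var/tmp/copy.log sh -c \
#       'cd /source && tar cf - . | (cd /destination && tar xpf -)'
```

nohup protects the process only against a hangup signal. It does not help if the network mount itself becomes unavailable.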

Moving Between Different Devices

It takes much longer to move files and directories via the mv command between directories that are on different partitions or disks than it does between directories on the same partition. (On the same partition, the mv command completes almost at once because the data itself does not have to be copied, but between different devices you can be surprised by how long it takes.) The time required depends mostly on how fast the partition or device can be accessed. Make sure that during the moving of files the machine will not be rebooted and your moving process will not be interrupted.

For example:

# df -k /source_directory
Filesystem         kbytes    used      avail    capacity  Mounted on
/dev/dsk/c0t0d0s0  34750583  28174655  6228423  82%  /source_directory   

# df -k /destination_directory
Filesystem        kbytes    used     avail   capacity  Mounted on
/dev/md/dsk/d30   34750583  28174655 6228423 82% /destination_directory

In the previous case the /source_directory is the directory created on the local disk (root partition) and the /destination_directory is the directory created on the metadevice. So the time would not be the same as it would be on the same partition.
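
Whether two directories share a file system can be checked with a sketch like the one below. The function names mount_point_of and same_file_system are hypothetical, and the sketch assumes that mount points do not contain whitespace, because it takes the last field of the df -P output.

```shell
# Hypothetical helper: print the mount point of the file system that
# holds a directory (last field of the df -P data line).
mount_point_of() {
    df -P "$1" | awk 'NR == 2 { print $NF }'
}

# Hypothetical helper: succeed if both directories are on the same file system.
same_file_system() {
    [ "$(mount_point_of "$1")" = "$(mount_point_of "$2")" ]
}

# Usage:
#   same_file_system /source_directory /destination_directory ||
#       echo "different file systems: mv has to copy all the data"
```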

Make sure that the destination and source directory have the same block structure. If not, avoid using cp or mv commands. To check this you can create the same text file on each device and run the du -ks command on each file.

For example:

# du -ks ./example.txt
4       ./example.txt

# du -ks /tmp/example.txt
8       /tmp/example.txt

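Even so, the diff and sdiff -s commands show no differences between the two files: the contents are identical, and only the disk space allocated to them differs.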

Owner, Groups, Permissions

Make sure that owner, group, permissions, and any other file information are preserved. Some users or services might lose access to such data in the new location, and you would not be aware of that. Sometimes the service could be a critical part of the system.
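
One rough way to check this is to list the mode, numeric owner, and numeric group of every entry in both trees and compare the two lists. The function name list_attrs is hypothetical. The sketch assumes file names without whitespace and does not handle symbolic links correctly, because it takes the last field of the ls output as the name.

```shell
# Hypothetical helper: print "mode uid gid name" for every entry in a tree.
# Assumptions: names contain no whitespace; symbolic links are not handled.
list_attrs() {
    ( cd "$1" && find . -exec ls -dn {} \; ) |
        awk '{ print $1, $3, $4, $NF }' | sort
}

# Usage (hypothetical paths):
#   list_attrs /source > /tmp/source.attrs
#   list_attrs /destination/source > /tmp/destination.attrs
#   diff /tmp/source.attrs /tmp/destination.attrs
```

No output from diff means that modes, owners, and groups match for every name.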

Summary and Some Issues

Instead of using mv or cp commands to move or copy large amounts of data, consider the following procedure:

  1. Make sure that none of the caveats mentioned previously would apply.

  2. Verify that enough disk space is available.

    For example:

    # df -k <directory>
    where <directory> - source directory

    # df -k <directory>
    where <directory> - destination directory

  3. Copy data to the destination using commands such as cpio (recommended), tar, rsync, or ufsdump together with ufsrestore.

    Example:

    Let the source directory be /source, and let the destination directory be /destination.

    # cd  /source
    # cd ..
    # find ./source -depth -print | cpio -cvo > /destination/source_data.cpio
    # cd /destination
    # cpio -icvmdI ./source_data.cpio
    # rm -rf ./source_data.cpio
    		

    The -c option is important when the data is copied between different types of machines. This option reads or writes header information in ASCII character form for portability. There are no UID or GID restrictions associated with this header format. Use this option between SVR4-based machines, or use the -H odc option between unknown machines. The -c option implies the use of expanded device numbers, which are only supported on SVR4-based systems. Use the -H odc option when you are transferring files between SunOS 4 or Interactive UNIX systems and the Solaris 2.6 OS or compatible versions.

    The procedure above first creates a file in the /destination directory that contains all the data packed via the cpio command and then unpacks this file there. Because the archive stores paths that begin with ./source, the unpacked copy ends up in /destination/source.

    For more information, see the manual pages for the cpio command and other commands if needed.



  4. Compare data in the source directory and in the destination directory via commands such as sdiff -s, diff, du -ks, find, cmp, and dircmp.

    Example:

    Let the source directory be /source, and let the copy created in step 3 be /destination/source.

    # dircmp -s /source /destination/source
    		
  5. If you intend to move data instead of copying, and you are completely sure that all data in the new location is correct, just remove all data in the source location or back it up somewhere else.
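
Steps 2 through 4 of the procedure can be combined in a small script. The following is only a sketch under stated assumptions, not a replacement for the cpio commands above: it uses tar, one of the alternatives listed in step 3, and diff -r for the comparison. The function name copy_tree_verified is hypothetical. Unlike the cpio example, it places the contents of the source directory directly in the destination directory, which must already exist.

```shell
# Hypothetical helper: check space, copy a tree with tar, then verify it.
copy_tree_verified() {
    src=$1
    dst=$2

    # Step 2: compare the size of the source (in KB) with the free
    # space at the destination (field 4 of the df -Pk data line).
    needed_kb=$(du -ks "$src" | awk '{ print $1 }')
    avail_kb=$(df -Pk "$dst" | awk 'NR == 2 { print $4 }')
    if [ "$needed_kb" -ge "$avail_kb" ]; then
        echo "not enough space in $dst" >&2
        return 1
    fi

    # Step 3: stream the tree through tar. The p option asks tar to
    # restore the recorded permissions on extraction.
    ( cd "$src" && tar cf - . ) | ( cd "$dst" && tar xpf - ) || return 1

    # Step 4: diff -r returns a non-zero status if any file differs
    # or exists on one side only.
    diff -r "$src" "$dst" > /dev/null
}

# Usage:
#   copy_tree_verified /source /destination && echo "copy verified"
```

Only after the function succeeds would you go on to step 5 and remove or archive the source.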

 

