[svlug] Re: [svlug] how to copy a bunch of "." files?

Tin Le tin at le.org
Thu May 4 00:42:55 PDT 2000


On Wed, 3 May 2000 dfox at belvdere.vip.best.com wrote:

> > When using find, I see a lot of people using xargs when most of the time it
> > isn't necessary. For example, the above find sequence can be written as:

> On most Unices, one's command line buffer isn't all that long. It's
> pretty long on Linux, as far as I know. When someone does a find, it

Command line buffer size depends on the shell, e.g. it may be 1K for
bash, 5K for csh, etc.  (the numbers given are just examples, not real
values).

I've misplaced my POSIX books (I haven't had to use them for several years),
but POSIX codified and specified all the various parameters, including the
minimum command line buffer size.  If I am not mistaken, POSIX-compliant
shells must support a command line of at least 10K.
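
If you want to see the actual limit on your own box, most systems that
ship a POSIX getconf will tell you (the exact number varies by OS and
kernel, so treat this as a quick check rather than gospel):

    getconf ARG_MAX    # max combined size of arguments + environment, in bytes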

> generates a list of filenames, which are then piped to the other processes.
> Now, strictly in a case like this, xargs isn't necessary, since the
> process will just take the filenames as they come, but generally, in
> most situations, xargs is a safety net. Its nature is to take a few
> filenames at a time, do the operation(s), and then get some more file-
> names, and so forth.

This is not correct.  The problem with find and its -exec flag is that
_FOR EACH FILE FOUND_ it will _FORK_ off a separate process to handle the
-exec request.

Since forking a process is 'forking' expensive in *NIX (pardon the
expression ;->), it is better to batch them with xargs.

Btw, xargs takes more than "a few" filenames at a time :-)
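
To make the difference concrete, here is a rough sketch -- the 'core'
pattern is just an example, substitute whatever you are cleaning up:

    # one fork/exec of rm per file found -- slow on big trees
    find . -name core -exec rm {} \;

    # batches as many filenames as fit per rm invocation
    find . -name core | xargs rm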

> Rather than using 'find', I typically use constructs involving back-
> quotes, which take what's in between them and substitute their output
> right there in the command line. For instance:

> rm `expr` rather than
> find . <expr> -exec something {} \;

> with or without xargs. In my opinion, the first is cleaner, but one
> does run the risk of buffer overflow, if 'expr' evaluates to a very
> very big list of files, e.g., on a very large subdirectory.

Yes, you will find that one day many of your "old" scripts that used
to work will start failing.  Remember the Y2K fiasco, when people thought
they would never need more than two digits for dates?

Always write your scripts and programs with an eye toward longevity.
You _never_ know how long people will keep using them, and it is better
to expect the unexpected than to run into problems later.

A number of _LARGE_ filesystems such as XFS, Ext3 and Reiserfs are coming
down the pipe.  These filesystems allow huge directories.

When I was working on XFS at SGI, two of the test cases I wrote created
a very deep directory tree (17+ million levels) and a very large one (30+
million files in a single directory).  Guess what will happen to your
rm `expr` line.... :-)))  or even "find . <expr> -exec something".  Both
of them will die, guaranteed.
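
If you really have to survive a directory that size, the piped form has
the best chance, because the file list never has to fit on a single
command line.  Something like this (a sketch -- -print0 and -0 are GNU
find/xargs extensions, there to cope with odd characters in filenames):

    find . -type f -print0 | xargs -0 rm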

I've actually had to go through several iterations of fixing rm, ls and
other IRIX utilities in order to make them XFS- and 64-bit clean.  It was
hard work.

Tin Le

