[svlug] How bigs my home

james@linuxrebel.org james at linuxrebel.org
Thu Oct 22 09:45:27 PDT 2009

On Wed, 21 Oct 2009 23:45:51 -0700, Robert Hajime Lanning
<lanning at lanning.cc> wrote:
> james at linuxrebel.org wrote:
>> All,
>>    The situation is this: 2.5TB of home dirs, most consisting of
>> large numbers of small files, with dirs running between 48k and 20GB
>> of data.  du -sh will print out a very nice accounting of the size of
>> a dir (used in a script of course), but on some of the larger ones,
>> say 20GB, it can take 25-30 minutes to return.  Get 10 in a row and
>> you see what happens.
> That's what happens when you stat() thousands/millions of files. du and 
> pretty much anything else will have to stat() every file to get their 
> size, then add all the sizes up.  You can try running each home 
> directory as a separate du in parallel.  That could shorten the overall
> runtime at the expense of flooding the NAS with stat() requests.
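The parallel-du idea above can be sketched as follows. This is a demonstration against a scratch tree rather than a real /home (swap "$root" for your actual NFS mount point); it assumes GNU xargs for the -P concurrency flag.

```shell
# Build a small scratch tree standing in for /home (demo only).
root=$(mktemp -d)
mkdir -p "$root/alice" "$root/bob"
dd if=/dev/zero of="$root/alice/big" bs=1k count=64 2>/dev/null
dd if=/dev/zero of="$root/bob/small" bs=1k count=8 2>/dev/null

# One du per top-level home directory, up to 4 running at once.
# -print0/-0 keeps odd filenames safe; -P is GNU xargs.
find "$root" -mindepth 1 -maxdepth 1 -type d -print0 \
  | xargs -0 -n1 -P4 du -sk
```

Raising -P shortens wall-clock time but multiplies the stat() load the NAS sees, so tune it to what the filer tolerates.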
>>    Complications. This is done over a NFS mount rather than directly on
>> the NAS itself. Can't change the file system (NAS only does what it
>> does) Can't tweak the file system (3rd party managed by people who only
>> know how to say no).  Can't install software on the NAS itself.
> NFS, ok, even slower...  Though, if you break up each home directory 
> into a separate nfs share, you can df the share.  This would have big
> scaling issues as you get into the nightmare of thousands of shares to
> manage and the maximum concurrent mounts that a system can have.
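The reason the per-share approach is fast: df answers from the server's filesystem accounting rather than stat()ing every file. A minimal illustration, using "." as a stand-in for a hypothetical per-user mount such as /home/alice:

```shell
# df reads usage totals from filesystem metadata -- constant time,
# no per-file traversal.  "." stands in for a mounted per-user share.
df -k .
```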
>>   What I'm looking to do eventually will be to alert space hogs that
>> need to trim back.  I'm trying to  avoid the blank stare one gets when a
>> developer finds out his/her data is lost because it exceeded the
>> available drive space (*grin*)
> Sounds like a weekly batch run over the weekend.  Get your report on 
> Monday morning.
>>    Anyone know of a way to calculate the size of a directory faster than
>> du -sh can do it?  I'm not worried about sorting or other aspects. 
>> Just the fastest way to tell me how big a specific dir is.
> That is about it.

I thought I was going to be limited to this.  However, I'm also not so bold
as to believe I've got all the answers *grin*.  Thanks for confirming what I
thought was true.  Given the number of users, I think I'll have it count the
number of dirs, divide by 7, and then do the corresponding dirs for that
day.  That way everyone is checked once a week, but not all checked on the
same day.  Thanks!
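The divide-by-7 scheme could look something like this sketch: each dir's bucket is its position in a sorted listing mod 7, and a dir is scanned only on its bucket's day, so every dir gets checked exactly once a week. The scratch tree here is a stand-in for the real /home.

```shell
# Demo tree standing in for /home (replace "$root" with the real mount).
root=$(mktemp -d)
mkdir -p "$root/alice" "$root/bob" "$root/carol"

day=$(( $(date +%u) % 7 ))     # today's bucket, 0..6
i=0
for d in "$root"/*/; do
    # Scan this dir only if its position mod 7 matches today's bucket.
    if [ $(( i % 7 )) -eq "$day" ]; then
        du -sk "$d"
    fi
    i=$(( i + 1 ))
done
```

One caveat with position-based bucketing: adding or removing a user shifts everyone after them to a new day, so a dir might be checked twice (or skipped once) the week the list changes. Hashing the username instead would keep buckets stable.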


More information about the svlug mailing list