Notes 0009

Directory Methods

In Perl, we can manipulate directories in much the same way as in the UNIX shell, using nearly the same commands. For example, creating a directory is done with the mkdir function. Here are some directory manipulation functions (directly from the perlfunc man pages):

Creating Directories

mkdir FILENAME,MASK

Creates the directory specified by FILENAME, with permissions specified by MASK (as modified by umask). If it succeeds it returns true, otherwise it returns false and sets $! (errno). If omitted, MASK defaults to 0777.

In general, it is better to create directories with a permissive MASK and let the user modify that with their umask than it is to supply a restrictive MASK and give the user no way to be more permissive. The exceptions to this rule are when the file or directory should be kept private (mail files, for instance). The perlfunc(1) entry on umask discusses the choice of MASK in more detail.

Removing Directories

rmdir FILENAME

Deletes the directory specified by FILENAME if that directory is empty. If it succeeds it returns true, otherwise it returns false and sets $! (errno). If FILENAME is omitted, uses $_.

Be wary of using unlink to remove directories. Here are the unlink specs:

unlink LIST

Deletes a list of files. Returns the number of files successfully deleted.

    $cnt = unlink 'a', 'b', 'c';
    unlink @goners;
    unlink <*.bak>;

Note: unlink will not delete directories unless you are superuser and the -U flag is supplied to Perl. Even if these conditions are met, be warned that unlinking a directory can inflict damage on your filesystem. Use rmdir instead. If LIST is omitted, uses $_.

Changing Directories (current directory)

chdir EXPR

Changes the working directory to EXPR, if possible. If EXPR is omitted, changes to the directory specified by $ENV{HOME}, if set; if not, changes to the directory specified by $ENV{LOGDIR}. If neither is set, chdir does nothing. It returns true upon success, false otherwise.
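To see mkdir, chdir, and rmdir working together, here is a minimal sketch. It assumes a throwaway directory created with File::Temp so it is safe to run anywhere; the name "scratch" and the 0755 mask are only illustrative choices, not anything the notes prescribe.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Cwd qw(getcwd);

# Work under a temporary directory that is removed when the script exits.
my $base  = tempdir(CLEANUP => 1);
my $start = getcwd();
my $dir   = "$base/scratch";    # hypothetical directory name

# Each call returns false on failure and sets $!, so check every one.
mkdir($dir, 0755) or die "can't mkdir $dir: $!";
chdir($dir)       or die "can't chdir $dir: $!";
print "now in ", getcwd(), "\n";

# rmdir only removes empty directories; step back out before removing it.
chdir($start)     or die "can't chdir $start: $!";
rmdir($dir)       or die "can't rmdir $dir: $!";
print "removed $dir\n" unless -d $dir;
```

Checking the return value of every call matters here, because all three functions report failure only through their return value and $!.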
See the example under die in perlfunc(1).

Reading Directories

Before you can read a directory, you need to open it:

opendir DIRHANDLE,EXPR

Opens a directory named EXPR for processing by readdir, telldir, seekdir, rewinddir, and closedir. Returns true if successful. DIRHANDLEs have their own namespace separate from FILEHANDLEs.

Once you open it, you can read the DIRHANDLE to get the contents of the directory:

readdir DIRHANDLE

Returns the next directory entry for a directory opened by opendir. If used in list context, returns all the rest of the entries in the directory. If there are no more entries, returns an undefined value in scalar context or a null list in list context.

If you're planning to filetest the return values out of a readdir, you'd better prepend the directory in question. Otherwise, because we didn't chdir there, it would be testing the wrong file. The perlfunc example reads:

    opendir(DIR, $some_dir) || die "can't opendir $some_dir: $!";
    @dots = grep { /^\./ && -f "$some_dir/$_" } readdir(DIR);

Once you are done reading, close the directory with closedir.

closedir DIRHANDLE

Closes a directory opened by opendir and returns the success of that system call. DIRHANDLE may be an expression whose value can be used as an indirect dirhandle, usually the real dirhandle name.

Alternative Method

There is also the filename globbing operator. Its general format is:

    @files = <*.xml>;

This stores the names of all xml files in the current directory in the @files array. You should avoid this form, however, because it is easy to confuse with reading from a file handle. For example, <$a> reads a line from the file handle referenced by $a, and <a> reads a line from the file handle named a, while <a*> looks for files whose names start with "a" in the current directory. A better way to express the glob is with the glob function:

    @files = glob("*.xml");

You can also use glob directly in a loop, iterating over the names it returns.
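A loop over glob might look like the following sketch. To keep it self-contained it first creates a few sample files in a temporary directory; the file names and the @found array are hypothetical and exist only for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Cwd qw(getcwd);

# Set up a temporary directory with sample files (hypothetical names).
my $start = getcwd();
my $dir   = tempdir(CLEANUP => 1);
chdir($dir) or die "can't chdir $dir: $!";
for my $name (qw(a.xml b.xml notes.txt)) {
    open(my $fh, '>', $name) or die "can't create $name: $!";
    close($fh);
}

# glob returns a list in list context, so foreach can walk it directly.
my @found;
foreach my $file (glob("*.xml")) {
    print "xml file: $file\n";
    push @found, $file;
}

# Return to where we started so the temporary directory can be cleaned up.
chdir($start) or die "can't chdir $start: $!";
```

In your own code only the foreach loop is needed; the rest is scaffolding so the example has something to find.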
One thing about globs, though: if a filename contains a "\n" (newline) character, the glob may return it as two different names (it splits things on newlines). While this is extremely rare, it does happen, and you're a lot better off using the opendir and readdir combination than the glob. (Then again, if you need a quick fix without too much typing, it's fine to use it: There's More Than One Way To Do It.)

Recursion - Traversing Directories

You can traverse the directory structure fairly easily using a recursive subroutine, called trav in these notes, that prints each entry of a directory and calls itself on every subdirectory it finds.
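A minimal sketch of such a recursive subroutine follows. It uses the name trav to match the calls in these notes; the lexical directory handle (so each level of recursion gets its own handle), the sorting, and the symlink check are choices of this sketch rather than requirements.

```perl
use strict;
use warnings;

# Print every entry below $dir, descending into subdirectories.
sub trav {
    my ($dir) = @_;

    my $dh;
    unless (opendir($dh, $dir)) {
        warn "can't opendir $dir: $!";
        return;
    }
    # Read all names first so the handle is closed before we recurse.
    my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
    closedir($dh);

    foreach my $entry (sort @entries) {
        my $path = "$dir/$entry";    # prepend the directory before file tests
        print "$path\n";
        # Recurse into real subdirectories; skip symlinks to avoid cycles.
        trav($path) if -d $path && !-l $path;
    }
}

# Traverse the directory given on the command line, if any.
trav($ARGV[0]) if @ARGV;
```

Skipping "." and ".." is essential: without it the subroutine would recurse into the same directory forever.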
To traverse the current directory, you'd call it with: trav("."); (or with any other directory you want to traverse). Note that you can do other things (subroutine calls, for example) besides the print inside trav(). A good exercise is to modify this subroutine to also accept a reference to a subroutine as a parameter and call that subroutine on every directory. This would make trav a lot more reusable.

Non-Recursive?

As you know (or should know), anything you can do recursively you can also do without recursion by keeping your own list of pending work, such as a stack. The directory traversal is no exception: keep a list of directories still to be read and loop until that list is empty.
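A minimal sketch of a non-recursive trav is shown below. It assumes the pending directories are kept in an array named @todo (a hypothetical name) and taken from the front with shift, with the same lexical handle and symlink check as a precaution.

```perl
use strict;
use warnings;

# Print every entry below $top without using recursion.
sub trav {
    my ($top) = @_;
    my @todo = ($top);                  # directories still to be read

    while (@todo) {
        my $dir = shift @todo;          # take the oldest pending directory

        my $dh;
        unless (opendir($dh, $dir)) {
            warn "can't opendir $dir: $!";
            next;
        }
        my @entries = grep { $_ ne '.' && $_ ne '..' } readdir($dh);
        closedir($dh);

        foreach my $entry (sort @entries) {
            my $path = "$dir/$entry";
            print "$path\n";
            # Remember subdirectories for later instead of recursing now.
            push @todo, $path if -d $path && !-l $path;
        }
    }
}

# Traverse the directory given on the command line, if any.
trav($ARGV[0]) if @ARGV;
```

Because new directories are pushed onto the end of @todo and taken from the front, every entry of one level is printed before anything deeper.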
You call this trav the same way as above. The order of the traversal is different, but otherwise it's the same. The order is different because, in effect, we're not using a "stack" but a "queue", which makes our traversal breadth-first.
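The effect of queue versus stack can be made visible with a small sketch. The helper visit_order and its $take parameter are hypothetical, introduced only for this illustration: it returns the directories in the order they are visited, removing them from the pending list with either shift (queue) or pop (stack).

```perl
use strict;
use warnings;

# Hypothetical helper: return directories in visiting order.
# $take is 'shift' (queue, breadth-first) or 'pop' (stack, depth-first).
sub visit_order {
    my ($top, $take) = @_;
    my @todo = ($top);
    my @seen;

    while (@todo) {
        my $dir = $take eq 'pop' ? pop(@todo) : shift(@todo);
        push @seen, $dir;

        my $dh;
        opendir($dh, $dir) or next;
        # Keep only real subdirectories, skipping ".", "..", and symlinks.
        my @subdirs = grep {
            $_ ne '.' && $_ ne '..' && -d "$dir/$_" && !-l "$dir/$_"
        } readdir($dh);
        closedir($dh);

        push @todo, map { "$dir/$_" } sort @subdirs;
    }
    return @seen;
}
```

For a tree with top-level directories a and b, where a contains deep, the queue visits a and b before a/deep, while the stack visits b first, then a, and goes straight down into a/deep.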