Class DirectoryScanner


  • public class DirectoryScanner
    extends Object

    Class for scanning a directory for files/directories which match certain criteria.

    These criteria consist of selectors and patterns which have been specified. With the selectors you can select which files you want to have included. Files which are not selected are excluded. With patterns you can include or exclude files based on their filename.

    The idea is simple. A given directory is recursively scanned for all files and directories. Each file/directory is matched against a set of selectors, including special support for matching against filenames with include and exclude patterns. Only files/directories which match at least one pattern of the include pattern list or other file selector, and which neither match any pattern of the exclude pattern list nor fail to match a required selector, will be placed in the list of files/directories found.

    When no list of include patterns is supplied, "**" will be used, which means that everything will be matched. When no list of exclude patterns is supplied, an empty list is used, such that nothing will be excluded. When no selectors are supplied, none are applied.

    The filename pattern matching is done as follows: The name to be matched is split up into path segments. A path segment is the name of a directory or file, which is bounded by File.separator ('/' under UNIX, '\' under Windows). For example, "abc/def/ghi/xyz.java" is split up into the segments "abc", "def", "ghi" and "xyz.java". The same is done for the pattern against which the name is to be matched.

    The segments of the name and the pattern are then matched against each other. When '**' is used for a path segment in the pattern, it matches zero or more path segments of the name.

    There is a special case regarding the use of File.separators at the beginning of the pattern and the string to match:
    When a pattern starts with a File.separator, the string to match must also start with a File.separator. When a pattern does not start with a File.separator, the string to match must not start with a File.separator either. When one of these rules is not obeyed, the string will not match.

    When a name path segment is matched against a pattern path segment, the following special characters can be used:
    '*' matches zero or more characters.
    '?' matches exactly one character.

    Examples:
    "**\*.class" matches all .class files/dirs in a directory tree.
    "test\a??.java" matches all files/dirs which start with an 'a', then two more characters and then ".java", in a directory called test.
    "**" matches everything in a directory tree.
    "**\test\**\XYZ*" matches all files/dirs which start with "XYZ" and where there is a parent directory called test (e.g. "abc\test\def\ghi\XYZ123").

    Case sensitivity may be turned off if necessary. By default, it is turned on.
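
    The matching rules above can be sketched as follows. This is a minimal illustration, not the code used by DirectoryScanner: the class PathPatternSketch and its methods are hypothetical names, the separator is passed in explicitly instead of being read from File.separator, and comparison is always case sensitive (the documented default).

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; hypothetical helper, not part of DirectoryScanner. */
final class PathPatternSketch {

    /** Matches a whole name against a pattern, both written with the given separator. */
    static boolean matchPath(String pattern, String name, char sep) {
        String s = String.valueOf(sep);
        // Leading-separator rule: either both start with the separator or neither does.
        if (pattern.startsWith(s) != name.startsWith(s)) {
            return false;
        }
        return matchSegments(split(pattern, sep), 0, split(name, sep), 0);
    }

    /** Splits a path into its segments, ignoring empty segments. */
    static List<String> split(String path, char sep) {
        List<String> out = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (char c : path.toCharArray()) {
            if (c != sep) {
                cur.append(c);
            } else if (cur.length() > 0) {
                out.add(cur.toString());
                cur.setLength(0);
            }
        }
        if (cur.length() > 0) {
            out.add(cur.toString());
        }
        return out;
    }

    /** Matches pattern segments from index p against name segments from index n. */
    static boolean matchSegments(List<String> pat, int p, List<String> name, int n) {
        if (p == pat.size()) {
            return n == name.size();
        }
        if (pat.get(p).equals("**")) {
            // "**" may absorb zero or more whole name segments.
            for (int next = n; next <= name.size(); next++) {
                if (matchSegments(pat, p + 1, name, next)) {
                    return true;
                }
            }
            return false;
        }
        return n < name.size()
                && matchSegment(pat.get(p), 0, name.get(n), 0)
                && matchSegments(pat, p + 1, name, n + 1);
    }

    /** Matches one segment: '*' is zero or more characters, '?' is exactly one. */
    static boolean matchSegment(String pat, int p, String s, int i) {
        if (p == pat.length()) {
            return i == s.length();
        }
        char c = pat.charAt(p);
        if (c == '*') {
            for (int next = i; next <= s.length(); next++) {
                if (matchSegment(pat, p + 1, s, next)) {
                    return true;
                }
            }
            return false;
        }
        return i < s.length()
                && (c == '?' || c == s.charAt(i))
                && matchSegment(pat, p + 1, s, i + 1);
    }
}
```

    With '\' as the separator, PathPatternSketch.matchPath("**\\test\\**\\XYZ*", "abc\\test\\def\\ghi\\XYZ123", '\\') evaluates to true, while a pattern that starts with the separator never matches a name that does not.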

    Example of usage:

       DirectoryScanner ds = new DirectoryScanner();
       String[] includes = {"**\\*.class"};
       String[] excludes = {"modules\\*\\**"};
       ds.setIncludes(includes);
       ds.setExcludes(excludes);
       ds.setBasedir(Paths.get("test"));
       ds.setCaseSensitive(true);
       Collection<Path> files = ds.scan();

       System.out.println("FILES:");
       for (Path file : files) {
         System.out.println(file);
       }
     

    This will scan a directory called test for .class files, but excludes all files in all proper subdirectories of a directory called "modules".

    Author:
    Arnout J. Kuiper ajkuiper@wxs.nl, Magesh Umasankar, Bruce Atherton, Antoine Levy-Lambert
    • Constructor Detail

      • DirectoryScanner

        public DirectoryScanner()
      • DirectoryScanner

        public DirectoryScanner(Path dir)
      • DirectoryScanner

        public DirectoryScanner(Path dir,
                                String... includes)
    • Method Detail

      • setBasedir

        public void setBasedir(Path basedir)
        Sets the base directory to be scanned. This is the directory which is scanned recursively.
        Parameters:
        basedir - The base directory for scanning. Should not be null.
      • getBasedir

        public Path getBasedir()
        Returns the base directory to be scanned. This is the directory which is scanned recursively.
        Returns:
        the base directory to be scanned
      • setIncludes

        public void setIncludes(String... includes)

        Sets the list of include patterns to use. All '/' and '\' characters are replaced by File.separatorChar, so the separator used need not match File.separatorChar.

        When a pattern ends with a '/' or '\', "**" is appended.

        Parameters:
        includes - A list of include patterns. May be null, indicating that all files should be included. If a non-null list is given, all elements must be non-null.
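
        The two normalization rules described above can be sketched as follows. This is a minimal illustration under stated assumptions: IncludePatternSketch and its normalize method are hypothetical names, and the separator is a parameter here so that both platforms can be shown, whereas the documented behavior uses File.separatorChar.

```java
/** Illustrative sketch only; hypothetical helper, not part of DirectoryScanner. */
final class IncludePatternSketch {

    /**
     * Replaces every '/' and '\' with the given separator and appends "**"
     * when the pattern ends with a separator.
     */
    static String normalize(String pattern, char separator) {
        String result = pattern.replace('/', separator).replace('\\', separator);
        if (result.endsWith(String.valueOf(separator))) {
            // A trailing separator means "everything below this directory".
            result += "**";
        }
        return result;
    }
}
```

        For example, normalize("src\\main/", '/') yields "src/main/**", and normalize("a/b", '\\') yields the pattern a\b.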
      • getIncludes

        public List<String> getIncludes()
        Returns:
        Unmodifiable list of the include patterns
      • isCaseSensitive

        public boolean isCaseSensitive()
      • setCaseSensitive

        public void setCaseSensitive(boolean caseSensitive)
      • scan

        public Collection<Path> scan()
                              throws IOException,
                                     IllegalStateException
        Scans the base directory for files which match at least one include pattern and don't match any exclude patterns. If selectors are present, the files must satisfy them as well.
        Returns:
        the matching files
        Throws:
        IllegalStateException - if the base directory was set incorrectly (i.e. if it is null, doesn't exist, or isn't a directory).
        IOException - if scanning the directory fails (e.g., access is denied)
      • scandir

        protected <C extends Collection<Path>> C scandir(Path rootDir,
                                                         Path dir,
                                                         C filesList)
                                                  throws IOException
        Scans the given directory for files and directories. Found files and directories are placed in their respective collections, based on the matching of includes, excludes, and the selectors. When a directory is found, it is scanned recursively.
        Type Parameters:
        C - Target matches collection type
        Parameters:
        rootDir - The directory to scan. Must not be null.
        dir - The path relative to the root directory (needed to prevent problems with an absolute path when using dir). Must not be null.
        filesList - Target Collection to accumulate the relative path matches
        Returns:
        Updated files list
        Throws:
        IOException - if scanning the directory fails
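
        A recursive scan that accumulates relative paths into a caller-supplied collection might look like the sketch below. It is an illustration under stated assumptions, not the actual implementation: ScanDirSketch is a hypothetical class, the include test is passed in as a Predicate<String> (a parameter the real method does not have), dir is taken to be the directory currently being visited, and only files are collected.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collection;
import java.util.function.Predicate;

/** Illustrative sketch only; hypothetical helper, not part of DirectoryScanner. */
final class ScanDirSketch {

    /**
     * Walks dir recursively and adds every file whose path relative to rootDir
     * passes the (assumed) include test to filesList.
     */
    static <C extends Collection<Path>> C scandir(
            Path rootDir, Path dir, C filesList, Predicate<String> included)
            throws IOException {
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                // Match against the path relative to the root, not the absolute path.
                Path relative = rootDir.relativize(entry);
                if (Files.isDirectory(entry)) {
                    scandir(rootDir, entry, filesList, included);
                } else if (included.test(relative.toString())) {
                    filesList.add(relative);
                }
            }
        }
        return filesList;
    }
}
```

        Returning the same collection type C that was passed in lets the caller choose the container (for example an ArrayList or a TreeSet) and keep using it without a cast.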
      • isIncluded

        protected boolean isIncluded(String name)
        Tests whether or not a name matches against at least one include pattern.
        Parameters:
        name - The name to match. Must not be null.
        Returns:
        true when the name matches against at least one include pattern, or false otherwise.
      • couldHoldIncluded

        protected boolean couldHoldIncluded(String name)
        Tests whether or not a name matches the start of at least one include pattern.
        Parameters:
        name - The name to match. Must not be null.
        Returns:
        true when the name matches against the start of at least one include pattern, or false otherwise.
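
        Matching against the start of a pattern is what allows a scanner to skip directories that cannot contain any included file. The sketch below illustrates one way to do this; PatternStartSketch and its methods are hypothetical names, the separator is passed in explicitly, and segment matching is delegated to a regular expression for brevity.

```java
import java.util.regex.Pattern;

/** Illustrative sketch only; hypothetical helper, not part of DirectoryScanner. */
final class PatternStartSketch {

    /** Returns true if files below the directory name could still match the pattern. */
    static boolean matchPatternStart(String pattern, String name, char sep) {
        String separatorRegex = Pattern.quote(String.valueOf(sep));
        String[] pat = pattern.split(separatorRegex);
        String[] dirs = name.split(separatorRegex);
        int common = Math.min(pat.length, dirs.length);
        for (int i = 0; i < common; i++) {
            if (pat[i].equals("**")) {
                return true; // "**" can absorb any remaining segments
            }
            if (!segmentMatches(pat[i], dirs[i])) {
                return false;
            }
        }
        // If the name has more segments than the pattern, the pattern is exhausted
        // and nothing below this directory can match.
        return dirs.length <= pat.length;
    }

    /** Matches one segment, where '*' is zero or more characters and '?' exactly one. */
    static boolean segmentMatches(String glob, String segment) {
        StringBuilder regex = new StringBuilder();
        for (char c : glob.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.matches(regex.toString(), segment);
    }
}
```

        With the pattern "src/**/*.java", the directory "src" could hold included files, whereas "docs" could not and need not be descended into.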
      • normalizePattern

        public static String normalizePattern(String pattern)
        Normalizes the pattern, e.g. converts forward and backward slashes to the platform-specific file separator.
        Parameters:
        pattern - The pattern to normalize, must not be null.
        Returns:
        The normalized pattern, never null.
      • replace

        public static String replace(String text,
                                     String repl,
                                     String with,
                                     int max)

        Replaces a String with another String inside a larger String, for the first max occurrences of the search String.

        A null reference passed to this method is a no-op.

        Parameters:
        text - text to search and replace in
        repl - String to search for
        with - String to replace with
        max - maximum number of occurrences to replace, or -1 if no maximum
        Returns:
        the text with any replacements processed
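
        The documented contract can be sketched as follows. This is a minimal illustration, not the actual implementation: ReplaceSketch is a hypothetical class, and treating an empty search String or a max of 0 as a no-op is an assumption that the description above does not state.

```java
/** Illustrative sketch only; hypothetical helper, not part of DirectoryScanner. */
final class ReplaceSketch {

    /** Replaces up to max occurrences of repl in text; -1 means no limit. */
    static String replace(String text, String repl, String with, int max) {
        // A null argument is a no-op; empty search String and max == 0 are assumed no-ops.
        if (text == null || repl == null || with == null || repl.isEmpty() || max == 0) {
            return text;
        }
        StringBuilder buf = new StringBuilder(text.length());
        int start = 0;
        int end;
        while ((end = text.indexOf(repl, start)) != -1) {
            buf.append(text, start, end).append(with);
            start = end + repl.length();
            if (--max == 0) {
                break; // limit reached; a negative max never hits zero
            }
        }
        buf.append(text, start, text.length());
        return buf.toString();
    }
}
```

        For example, replace("a.b.c", ".", "/", 1) returns "a/b.c", and replace("a.b.c", ".", "/", -1) returns "a/b/c".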