Was the java.util.regex package based on Jakarta-ORO?

Many Java libraries that need regex functionality but must also remain compatible with 1.3 and earlier VMs use the well-known Jakarta ORO library. It is a great, fully featured regex implementation. One example is a project I am a committer on, Jakarta Commons-Net, a very popular library used mainly for its FTP functionality; Ant’s FTP task, for example, uses it as its FTP engine.
However, Commons-Net is designed to be compatible with 1.3 and earlier VMs. One implication is that its regex parsing is done by ORO, which brings extra “baggage” in the form of a separate jar download. As an exercise, I decided to see how easy it would be to convert the existing ORO code to a JDK-based approach. I first created a branch in SVN (here), and started to work on the ORO code.
The changes turned out to be trivial. In fact, the APIs are so similar that I strongly suspect the Sun engineers who implemented the java.util.regex package were heavily influenced by the design of the ORO interface.
First, look at the ORO version:


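    // Imports this fragment relies on (all from Jakarta ORO):
    // import org.apache.oro.text.regex.MalformedPatternException;
    // import org.apache.oro.text.regex.MatchResult;
    // import org.apache.oro.text.regex.Pattern;
    // import org.apache.oro.text.regex.PatternMatcher;
    // import org.apache.oro.text.regex.Perl5Compiler;
    // import org.apache.oro.text.regex.Perl5Matcher;

    private Pattern pattern = null;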
    private MatchResult result = null;
    protected PatternMatcher _matcher_ = null;

    public RegexFTPFileEntryParserImpl(String regex)
    {
        super();
        try
        {
            _matcher_ = new Perl5Matcher();
            pattern   = new Perl5Compiler().compile(regex);
        }
        catch (MalformedPatternException e)
        {
            throw new IllegalArgumentException (
               "Unparseable regex supplied:  " + regex);
        }
    }

    /**
     * Convenience method delegates to the internal MatchResult's matches()
     * method.
     *
     * @param s the String to be matched
     * @return true if s matches this object's regular expression.
     */

    public boolean matches(String s)
    {
        this.result = null;
        if (_matcher_.matches(s.trim(), this.pattern))
        {
            this.result = _matcher_.getMatch();
        }
        return null != this.result;
    }



    /**
     * Convenience method delegates to the internal MatchResult's groups()
     * method.
     *
     * @return the number of groups() in the internal MatchResult.
     */

    public int getGroupCnt()
    {
        if (this.result == null)
        {
            return 0;
        }
        return this.result.groups();
    }

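And now look at the equivalent java.util.regex version. The package itself arrived in Java 1.4, but the MatchResult interface and Matcher.toMatchResult() used here were only added in Java 5: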


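    // Imports this fragment relies on (all from the JDK):
    // import java.util.regex.MatchResult;
    // import java.util.regex.Matcher;
    // import java.util.regex.Pattern;
    // import java.util.regex.PatternSyntaxException;

    private Pattern pattern = null;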
    private MatchResult result = null;
    protected Matcher _matcher_ = null;

    public RegexFTPFileEntryParserImpl(String regex)
    {
        super();
        try
        {
            pattern   = Pattern.compile(regex);
        }
        catch (PatternSyntaxException pse)
        {
            throw new IllegalArgumentException (
               "Unparseable regex supplied:  " + regex);
        }
    }

    /**
     * Convenience method delegates to the internal MatchResult's matches()
     * method.
     *
     * @param s the String to be matched
     * @return true if s matches this object's regular expression.
     */

    public boolean matches(String s)
    {
        this.result = null;
        _matcher_ = pattern.matcher(s);
        if (_matcher_.matches())
        {
            this.result = _matcher_.toMatchResult();
        }
        return null != this.result;
    }



    /**
     * Convenience method delegates to the internal MatchResult's groupCount()
     * method.
     *
     * @return the groupCount() of the internal MatchResult.
     */

    public int getGroupCnt()
    {
        if (this.result == null)
        {
            return 0;
        }
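        // Note: unlike ORO's groups(), groupCount() does not count group 0
        // (the whole match), so the two versions differ by one here.
        return this.result.groupCount();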
    }
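
To try the JDK-based approach without the rest of Commons-Net, here is a minimal stand-alone sketch of the same compile / match / count-groups flow. The class name JdkRegexSketch, the group(int) helper and the sample regex are made up for illustration and are not part of Commons-Net; the code also assumes Java 5 or later, since it uses MatchResult.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical stand-alone class for illustration; not the Commons-Net parser.
class JdkRegexSketch {
    private final Pattern pattern;
    private MatchResult result;

    JdkRegexSketch(String regex) {
        try {
            // Replaces new Perl5Compiler().compile(regex) from the ORO version.
            pattern = Pattern.compile(regex);
        } catch (PatternSyntaxException pse) {
            throw new IllegalArgumentException("Unparseable regex supplied: " + regex);
        }
    }

    // True only if the whole input matches; remembers the match for later queries.
    boolean matches(String s) {
        result = null;
        Matcher m = pattern.matcher(s);
        if (m.matches()) {
            // Snapshot of the match, independent of later Matcher state.
            result = m.toMatchResult();
        }
        return result != null;
    }

    // Number of capturing groups; group 0 (the whole match) is not counted.
    int getGroupCnt() {
        return result == null ? 0 : result.groupCount();
    }

    // Illustrative helper: text of group n from the last successful match.
    String group(int n) {
        return result == null ? null : result.group(n);
    }

    public static void main(String[] args) {
        // Made-up pattern: a numeric size followed by a file name.
        JdkRegexSketch p = new JdkRegexSketch("(\\d+)\\s+(\\S+)");
        if (p.matches("1024 readme.txt")) {
            // prints "2 groups: 1024, readme.txt"
            System.out.println(p.getGroupCnt() + " groups: " + p.group(1) + ", " + p.group(2));
        }
    }
}
```

Because Matcher objects are created per input string, nothing like ORO's shared Perl5Matcher field needs to be kept around; only the compiled Pattern is reused.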