Apache Source License Scanner (Ruby)

The other day, I needed to replace every instance of the old-style Apache license header in the commons-net codebase with the new-style header. My first thought was to write a simple Perl script to find and replace the old headers, but I decided to give Ruby a shot instead. The result is below: LicenseScanner.rb scans the given directory and all of its subdirectories for Java source files and attempts to locate and replace any older-style license headers. If you are used to Perl for this kind of job, you will appreciate some of the Perl-isms built into Ruby, for instance:

  • Here documents ("heredocs") – see the definition of @@new_asl_license;
  • First-class regular expression support, via /.../ literals and the =~ operator.
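
As a small illustration of the two features listed above, here is a minimal sketch that builds a multi-line string with a here document and tests it with a regular expression literal. The names header, pattern and position are made up for this example and are not part of the scanner.

```ruby
# A here document: everything up to the closing EOL marker becomes one string.
header = <<EOL
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.
 */
EOL

# A regexp literal; the parentheses around ASF are escaped so they match literally.
pattern = /Licensed to the Apache Software Foundation \(ASF\)/i

# =~ returns the index of the first match, or nil when nothing matches.
position = header =~ pattern
puts "matched at offset #{position}" if position
```

Note that an unescaped (ASF) would be read as a capture group and would only match the text "ASF" without the surrounding parentheses.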

Here is the source:

require "find"
# Written by Rory Winston <rwinston@apache.org>
# 
class LicenseScanner
  # Older, incorrect license headers - Replace
  # /m makes "." match newlines (what Perl calls /s); /i ignores case.
  # The leading (.*?) is lazy so the match stops at the first header it finds.
  @@asl_patt_10 = /(\/\*(.*?) \* The Apache Software License, Version 1\.1(.*?)\*?(?!\/)\*\/)/mi
  @@asl_patt_20 = /(\/\*(.*?) \* Licensed under the Apache License, Version 2\.0(.*?)\*?(?!\/)\*\/)/mi
  @@asl_patt_other = /(\/\*(.*?) \* Copyright 200[0-9](-200[0-9])? The Apache Software Foundation(.*?)\*?(?!\/)\*\/)/mi

  # New corrected header - Leave as-is
  # The parentheses around ASF are escaped so they match literally.
  @@new_asl_pattern = /(\/\*(.*?) \* Licensed to the Apache Software Foundation \(ASF\)(.*?)\*?(?!\/)\*\/)/mi

  # The new license header 
  @@new_asl_license = <<EOL
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *      http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
EOL

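  # Ruby has no method overloading, so a single constructor with a default
  # argument covers both call styles.
  # dir:    root directory to scan
  # prompt: despite the name, true means "replace without asking";
  #         pass false to be asked about each file (see confirm)
  def initialize(dir, prompt = true)
    @dir = dir
    @prompt = prompt
    @str = ""
  end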

  def scan
    Find.find(@dir) do |path|
      # Only regular files whose names end in .java are processed
      readFile(path) if File.file?(path) && File.basename(path) =~ /\.java\z/
    end
  end

  def readFile(filename)
    @filename = filename
    @str = IO.read(@filename)
    # Strip carriage returns ("\015" is octal for CR) so the patterns see Unix line endings
    @str.gsub!("\015", "")
    scanFile
  end

  def scanFile
    if @str =~ @@asl_patt_10
      puts "Found Apache 1.x License in #{@filename}"
      replaceLicense(@@asl_patt_10)
    elsif @str =~ @@asl_patt_20
      puts "Found other Apache 2.0 License in #{@filename}"
      replaceLicense(@@asl_patt_20)
    elsif @str =~ @@asl_patt_other
      puts "Found Apache 1/2 license in #{@filename}"
      replaceLicense(@@asl_patt_other)
    elsif @str =~ @@new_asl_pattern
      puts "Correct license found in #{@filename}"
    else
      puts "No license found in #{@filename}"
      insertLicense
    end
  end

  # Replace an older license header (matched by pattern) with the new format
  def replaceLicense(pattern)
    resp = confirm "Replace license in #{@filename}"

    case resp
      when "y","Y"
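        puts "Replacing license in #{@filename}..."
        # Opening with "w" truncates the file; the block form closes it automatically
        File.open(@filename, "w") do |srcFile|
          # Block form of sub inserts the header literally; chomp avoids a doubled newline
          srcFile.puts @str.sub(pattern) { @@new_asl_license.chomp }
        end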
      when "n", "N"
        puts "Skipping..."
    end
  end

  # Insert a new license at the beginning of the file
  def insertLicense
    resp = confirm "Insert license in #{@filename}"

    case resp
      when "y","Y"
        puts "Inserting license in #{@filename}..."
        srcFile = File.new(@filename, File::TRUNC | File::RDWR)
        srcFile.puts  @@new_asl_license
        srcFile.puts @str
        srcFile.close
      when "n", "N"
        puts "Skipping..."
    end
  end

  def confirm(msg)
    if @prompt != true
      puts "#{msg} [y/n]?"
      resp = gets
      resp.chomp!

      # Anchored so that only a lone y or n is accepted; "yes" or "maybe" asks again
      while (resp !~ /\A[YyNn]\z/)
        puts "#{msg} [y/n]?"
        resp = gets
        resp.chomp!
      end
    else
      resp = "y"
    end

    resp
  end

end

# Usage: LicenseScanner.new("path/to/src/dir", auto-replace [true/false])
scanner = LicenseScanner.new("c:/sandbox/net/src/main/java", true)
scanner.scan
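
Before pointing the scanner at a real source tree, it can help to try the substitution on an in-memory string. The following is only a sketch under stated assumptions: old_pattern and new_header are shortened, hypothetical stand-ins for the scanner's own pattern and header, and source is a made-up Java file.

```ruby
# Hypothetical, shortened stand-ins for the scanner's pattern and new header.
old_pattern = /\/\*.*?Licensed under the Apache License, Version 2\.0.*?\*\//m
new_header = <<EOL
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.
 */
EOL

# A made-up source file that still carries an old-style header.
source = <<EOL
/*
 * Copyright 2005 The Apache Software Foundation
 *
 * Licensed under the Apache License, Version 2.0 (the "License");
 */
package org.apache.commons.net;
EOL

# The block form of sub inserts the replacement literally, and chomp avoids
# doubling the newline that already follows the old comment.
updated = source.sub(old_pattern) { new_header.chomp }
puts updated
```

Nothing is written to disk here, so the patterns can be adjusted freely until the result looks right.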