Class StringsComparator


  • public class StringsComparator
    extends java.lang.Object

    It is guaranteed that the comparisons will always be done as o1.equals(o2) where o1 belongs to the first sequence and o2 belongs to the second sequence. This can be important if subclassing is used for some elements in the first sequence and the equals method is specialized.

    Comparison can be seen from two points of view: either as giving the smallest modification allowing to transform the first sequence into the second one, or as giving the longest sequence which is a subsequence of both initial sequences. The equals method is used to compare objects, so any object can be put into sequences. Modifications include deleting, inserting or keeping one object, starting from the beginning of the first sequence.

    This class implements the comparison algorithm, which is the very efficient algorithm from Eugene W. Myers An O(ND) Difference Algorithm and Its Variations. This algorithm produces the shortest possible edit script containing all the commands needed to transform the first sequence into the second one.

    This code has been adapted from Apache Commons Collections 4.0.

    Since:
    1.0
    See Also:
    EditScript, EditCommand, CommandVisitor
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      private static class  StringsComparator.Snake
      This class is a simple placeholder to hold the end part of a path under construction in a StringsComparator.
    • Field Summary

      Fields 
      Modifier and Type Field Description
      private java.lang.String left
      First character sequence.
      private java.lang.String right
      Second character sequence.
      private int[] vDown
      Temporary array.
      private int[] vUp
      Temporary array.
    • Constructor Summary

      Constructors 
      Constructor Description
      StringsComparator​(java.lang.String left, java.lang.String right)
      Constructs a new instance of StringsComparator.
    • Field Detail

      • left

        private final java.lang.String left
        First character sequence.
      • right

        private final java.lang.String right
        Second character sequence.
      • vDown

        private final int[] vDown
        Temporary array.
      • vUp

        private final int[] vUp
        Temporary array.
    • Constructor Detail

      • StringsComparator

        public StringsComparator​(java.lang.String left,
                                 java.lang.String right)
        Constructs a new instance of StringsComparator.

        It is guaranteed that the comparisons will always be done as o1.equals(o2) where o1 belongs to the first sequence and o2 belongs to the second sequence. This can be important if subclassing is used for some elements in the first sequence and the equals method is specialized.

        Parameters:
        left - first character sequence to be compared
        right - second character sequence to be compared
    • Method Detail

      • buildScript

        private void buildScript​(int start1,
                                 int end1,
                                 int start2,
                                 int end2,
                                 EditScript<java.lang.Character> script)
        Builds an edit script.
        Parameters:
        start1 - the begin of the first sequence to be compared
        end1 - the end of the first sequence to be compared
        start2 - the begin of the second sequence to be compared
        end2 - the end of the second sequence to be compared
        script - the edited script
      • buildSnake

        private StringsComparator.Snake buildSnake​(int start,
                                                   int diag,
                                                   int end1,
                                                   int end2)
        Builds a snake.
        Parameters:
        start - the value of the start of the snake
        diag - the value of the diagonal of the snake
        end1 - the value of the end of the first sequence to be compared
        end2 - the value of the end of the second sequence to be compared
        Returns:
        The snake built
      • getMiddleSnake

        private StringsComparator.Snake getMiddleSnake​(int start1,
                                                       int end1,
                                                       int start2,
                                                       int end2)
        Gets the middle snake corresponding to two subsequences of the main sequences.

        The snake is found using the MYERS Algorithm (this algorithms has also been implemented in the GNU diff program). This algorithm is explained in Eugene Myers article: An O(ND) Difference Algorithm and Its Variations.

        Parameters:
        start1 - the begin of the first sequence to be compared
        end1 - the end of the first sequence to be compared
        start2 - the begin of the second sequence to be compared
        end2 - the end of the second sequence to be compared
        Returns:
        The middle snake
      • getScript

        public EditScript<java.lang.Character> getScript()
        Gets the EditScript object.

        It is guaranteed that the objects embedded in the insert commands come from the second sequence and that the objects embedded in either the delete commands or keep commands come from the first sequence. This can be important if subclassing is used for some elements in the first sequence and the equals method is specialized.

        Returns:
        The edit script resulting from the comparison of the two sequences