com.oberon.util
Class FileDiff
java.lang.Object
com.oberon.util.FileDiff
public class FileDiff
- extends java.lang.Object
FileDiff Text file difference utility.
Copyright 1987, 1989 by Donald C. Lindsay,
School of Computer Science, Carnegie Mellon University.
Copyright 1982 by Symbionics.
Use without fee is permitted when not for direct commercial
advantage, and when credit to the source is given. Other uses
require specific permission.
Converted from C to Java by Ian F. Darwin, http://www.darwinsys.com/, January, 1997.
Copyright 1997, Ian F. Darwin.
Conversion is NOT FULLY TESTED.
USAGE: diff oldfile newfile
This program assumes that "oldfile" and "newfile" are text files.
The program writes to stdout a description of the changes which would
transform "oldfile" into "newfile".
The printout is in the form of commands, each followed by a block of
text. The text is delimited by the commands, which are:
DELETE AT n
..deleted lines
INSERT BEFORE n
..inserted lines
n MOVED TO BEFORE n
..moved lines
n CHANGED FROM
..old lines
CHANGED TO
..newer lines
The line numbers all refer to the lines of the oldfile, as they are
numbered before any commands are applied.
The text lines are printed as-is, without indentation or prefixing. The
commands are printed in upper case, with a prefix of ">>>>", so that
they will stand out. Other schemes may be preferred.
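The output layout described above can be illustrated with a minimal sketch. The class and method names (DiffCommandFormat, block) are hypothetical, and the single space after the ">>>>" prefix is an assumption; the actual FileDiff code may format commands differently.

```java
import java.util.List;

class DiffCommandFormat {
    // Prefix named in the documentation; the trailing space is an assumption.
    static final String PREFIX = ">>>> ";

    /**
     * Formats one command header followed by its block of text lines.
     * The text lines are emitted as-is, without indentation or prefixing.
     */
    static String block(String command, List<String> lines) {
        StringBuilder sb = new StringBuilder(PREFIX).append(command).append('\n');
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```

Calling block("DELETE AT 3", ...) with two text lines yields the header line followed by those two lines unchanged.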
Files which contain more than MAXLINECOUNT lines cannot be processed.
This can be fixed by changing "symbol" to a Vector.
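As a rough illustration of lifting the line limit, the sketch below reads lines into a growable list instead of a fixed array. It uses ArrayList as a modern stand-in for the Vector mentioned above; GrowableLines and readAll are hypothetical names, not part of this class.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class GrowableLines {
    /**
     * Reads every line from the reader into a list that grows on demand,
     * so no compile-time maximum line count is needed.
     */
    static List<String> readAll(Reader in) {
        List<String> lines = new ArrayList<>();
        try {
            BufferedReader br = new BufferedReader(in);
            for (String s = br.readLine(); s != null; s = br.readLine()) {
                lines.add(s);
            }
        } catch (IOException e) {
            // Wrapped so that callers need not declare a checked exception.
            throw new UncheckedIOException(e);
        }
        return lines;
    }
}
```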
The algorithm is taken from Communications of the ACM, April 1978 (21, 4, 264-),
"A Technique for Isolating Differences Between Files."
Ignoring I/O, and ignoring the symbol table, it should take O(N) time.
This implementation takes fixed space, plus O(U) space for the symbol
table (where U is the number of unique lines). Methods exist to change
the fixed space to O(N) space.
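The matching passes of the cited technique can be sketched roughly as follows. This is a simplified illustration, not the code of this class: it assumes a HashMap in place of the tree-based symbol table, omits the virtual begin and end lines used in the published description, and stops before producing the DELETE/INSERT/MOVED/CHANGED output. The names HeckelSketch and match are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

class HeckelSketch {
    /** Per-unique-line counters, plus the last position seen in the old file. */
    private static final class Entry {
        int oldCount, newCount, oldLine;
    }

    /**
     * Returns an array m where m[i] is the index of the old line matched
     * to new line i, or -1 if new line i has no counterpart.
     */
    static int[] match(String[] oldLines, String[] newLines) {
        Map<String, Entry> table = new HashMap<>();
        // Passes 1 and 2: count occurrences of every distinct line in each file.
        for (String s : newLines) {
            table.computeIfAbsent(s, k -> new Entry()).newCount++;
        }
        for (int j = 0; j < oldLines.length; j++) {
            Entry e = table.computeIfAbsent(oldLines[j], k -> new Entry());
            e.oldCount++;
            e.oldLine = j;
        }
        int[] newToOld = new int[newLines.length];
        int[] oldToNew = new int[oldLines.length];
        Arrays.fill(newToOld, -1);
        Arrays.fill(oldToNew, -1);
        // Pass 3: a line that occurs exactly once in each file is an anchor.
        for (int i = 0; i < newLines.length; i++) {
            Entry e = table.get(newLines[i]);
            if (e.oldCount == 1 && e.newCount == 1) {
                newToOld[i] = e.oldLine;
                oldToNew[e.oldLine] = i;
            }
        }
        // Pass 4: extend each match forward over identical, still-unmatched neighbours.
        for (int i = 0; i + 1 < newLines.length; i++) {
            int j = newToOld[i];
            if (j >= 0 && j + 1 < oldLines.length
                    && newToOld[i + 1] < 0 && oldToNew[j + 1] < 0
                    && newLines[i + 1].equals(oldLines[j + 1])) {
                newToOld[i + 1] = j + 1;
                oldToNew[j + 1] = i + 1;
            }
        }
        // Pass 5: extend each match backward in the same way.
        for (int i = newLines.length - 1; i > 0; i--) {
            int j = newToOld[i];
            if (j > 0 && newToOld[i - 1] < 0 && oldToNew[j - 1] < 0
                    && newLines[i - 1].equals(oldLines[j - 1])) {
                newToOld[i - 1] = j - 1;
                oldToNew[j - 1] = i - 1;
            }
        }
        return newToOld;
    }
}
```

Each pass is a single loop over one of the files, which is where the O(N) estimate comes from once symbol table costs are set aside. Repeated lines are never anchors by themselves; they are only matched when they sit next to a line that is already matched.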
Note that this is not the only interesting file-difference algorithm. In
general, different algorithms draw different conclusions about the
changes that have been made to the oldfile. This algorithm is sometimes
"more right", particularly since it does not consider a block move to be
an insertion and a (separate) deletion. However, on some files it will be
"less right". This is a consequence of the fact that files may contain
many identical lines (particularly if they are program source). Each
algorithm resolves the ambiguity in its own way, and the resolution
is never guaranteed to be "right". However, it is often excellent.
This program is intended to be pedagogic. Specifically, this program was
the basis of the Literate Programming column which appeared in the
Communications of the ACM (CACM), in the June 1989 issue (32, 6,
740-755).
By "pedagogic", I do not mean that the program is gracefully worded, or
that it showcases language features or its algorithm. I also do not mean
that it is highly accessible to beginners, or that it is intended to be
read in full, or in a particular order. Rather, this program is an
example of one professional's style of keeping things organized and
maintainable.
The program would be better if the "print" variables were wrapped into
a struct. In general, grouping related variables in this way improves
documentation, and adds the ability to pass the group in argument lists.
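In Java the equivalent of such a struct is a small class. The sketch below is only an illustration of the grouping idea; the field names and the PrintCursor class are hypothetical and do not reflect the actual "print" variables of this program.

```java
/** Hypothetical grouping of the state used while printing the differences. */
final class PrintCursor {
    int oldLine;             // current position in the old file (assumed 0-based)
    int newLine;             // current position in the new file (assumed 0-based)
    String status = "idle";  // hypothetical label for the command being emitted

    /** Advances both positions past a block of unchanged lines. */
    void skipMatched(int count) {
        oldLine += count;
        newLine += count;
    }
}

class PrintCursorDemo {
    /** The whole group travels as one argument instead of three loose variables. */
    static String describe(PrintCursor c) {
        return c.status + " old=" + c.oldLine + " new=" + c.newLine;
    }
}
```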
This program is a de-engineered version of a program which uses less
memory and less time. The article points out that the "symbol" arrays
can be implemented as arrays of pointers to arrays, with dynamic
allocation of the subarrays. (In C, macros are very useful for hiding
the two-level accesses.) In Java, a Vector would be used. This allows an
extremely large value for MAXLINECOUNT, without dedicating fixed arrays.
(The "other" array can be allocated after the input phase, when the exact
sizes are known.) The only slow piece of code is the "strcmp" in the tree
descent: it can be sped up by keeping a hash in the tree node, and
only using "strcmp" when two hashes happen to be equal.
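The hash-in-the-node idea can be sketched as an unbalanced binary tree that orders nodes by hash first and falls back to a full string comparison only when two hashes are equal. SymbolTree and its members are hypothetical names, String.hashCode() is assumed as the hash, and the stringCompares counter exists only to make the saving visible.

```java
class SymbolTree {
    private static final class Node {
        final String line;
        final int hash;   // cached so that most descents compare ints only
        Node left, right;

        Node(String line, int hash) {
            this.line = line;
            this.hash = hash;
        }
    }

    private Node root;
    private int size;
    int stringCompares;   // instrumentation: number of full string comparisons

    /** Returns true if the line was newly added, false if it was already present. */
    boolean add(String line) {
        int h = line.hashCode();
        if (root == null) {
            root = new Node(line, h);
            size++;
            return true;
        }
        Node n = root;
        while (true) {
            int c = Integer.compare(h, n.hash);
            if (c == 0) {
                // Equal hashes: only now pay for the character-by-character compare.
                stringCompares++;
                c = line.compareTo(n.line);
                if (c == 0) {
                    return false;
                }
            }
            if (c < 0) {
                if (n.left == null) {
                    n.left = new Node(line, h);
                    size++;
                    return true;
                }
                n = n.left;
            } else {
                if (n.right == null) {
                    n.right = new Node(line, h);
                    size++;
                    return true;
                }
                n = n.right;
            }
        }
    }

    int size() {
        return size;
    }
}
```

Lines with different hashes never trigger a string comparison; a duplicate line, or a genuine collision such as "Aa" and "BB" under String.hashCode(), triggers exactly one per colliding node on the path.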
- Version:
- Java version 0.9, 1997
Method Summary
java.lang.String doDiff(com.oberon.util.fileInfo oldFile, com.oberon.util.fileInfo newFile)
- Do comparison
java.lang.String doFileDiff(java.lang.String oldFile, java.lang.String newFile)
java.lang.String doTextDiff(java.lang.String oldText, java.lang.String newText)
static void main(java.lang.String[] argstrings)
- main - entry point when used standalone.
Methods inherited from class java.lang.Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
FileDiff
public FileDiff()
main
public static void main(java.lang.String[] argstrings)
- main - entry point when used standalone.
NOTE: no routines return error codes or throw any local
exceptions. Instead, any routine may complain
to stderr and then exit with error to the system.
doFileDiff
public java.lang.String doFileDiff(java.lang.String oldFile,
java.lang.String newFile)
doTextDiff
public java.lang.String doTextDiff(java.lang.String oldText,
java.lang.String newText)
doDiff
public java.lang.String doDiff(com.oberon.util.fileInfo oldFile,
com.oberon.util.fileInfo newFile)
- Do comparison
Copyright © 2008-2014 Mirko Solazzi. All Rights Reserved.