Finding difference between strings in Javascript -
i'd compare 2 strings (a before , after) , detect , changed between them.
for change, want know:
- starting position of change (inclusive, starting @ 0)
- ending position of change (inclusive, starting @ 0) relative previous text
- the "change"
assume strings change in 1 place @ time (for example, never "bill" -> "kiln").
additionally, need start , end positions reflect type of change:
- if deletion, start , end position should start , end positions of deleted text, respectively
- if replacement, start , end position should start , end positions of "deleted" text, respectively (the change "added" text)
- if insertion, start , end positions should same; entry point of text
- if no change, let start , end positions remain zero, empty change
for example:
"0123456789" -> "03456789" start: 1, end: 2, change: "" (deletion) "03456789" -> "0123456789" start: 1, end: 1, change: "12" (insertion) "hello world!" -> "hello aliens!" start: 6, end: 10, change: "aliens" (replacement) "hi" -> "hi" start: 0, end: 0, change: "" (no change)
i able detect positions of changed text, doesn't work in cases because in order accurately, need know kind of change made.
var oldtext = "my edited string!"; var newtext = "my first string!"; var changestart = 0; var newchangeend = 0; var oldchangeend = 0; console.log("comparing start:"); (var = 0; < newtext.length; i++) { console.log(i + ": " + newtext[i] + " -> " + oldtext[i]); if (newtext[i] != oldtext[i]) { changestart = i; break; } } console.log("comparing end:"); // "addition"? if (newtext.length > oldtext.length) { (var = 1; < newtext.length; i++) { console.log(i + "(n: " + (newtext.length - i) + " o: " + (oldtext.length - i) + ": " + newtext.substring(newtext.length - i, newtext.length - + 1) + " -> " + oldtext.substring(oldtext.length - i, oldtext.length - + 1)); if (newtext.substring(newtext.length - i, newtext.length - + 1) != oldtext.substring(oldtext.length - i, oldtext.length - + 1)) { newchangeend = newtext.length - i; oldchangeend = oldtext.length - i; break; } } // "deletion"? } else if (newtext.length < oldtext.length) { (var = 1; < oldtext.length; i++) { console.log(i + "(n: " + (newtext.length - i) + " o: " + (oldtext.length - i) + ": " + newtext.substring(newtext.length - i, newtext.length - + 1) + " -> " + oldtext.substring(oldtext.length - i, oldtext.length - + 1)); if (newtext.substring(newtext.length - i, newtext.length - + 1) != oldtext.substring(oldtext.length - i, oldtext.length - + 1)) { newchangeend = newtext.length - i; oldchangeend = oldtext.length - i; break; } } // same length... } else { // } console.log("change start: " + changestart); console.log("nchange end : " + newchangeend); console.log("ochange end : " + oldchangeend); console.log("change: " + oldtext.substring(changestart, oldchangeend + 1));
how tell whether or not insertion, deletion, or replacement took place?
i've searched , came few other similar questions, don't seem help.
i have gone through code , logic matching string makes sense me. logs changestart
, newchangeend
, oldchangeend
correctly , algorithm flows alright. want know if insertion, deletion or replacement took place. here's how go it.
first of all, need make sure after have got first point of mis-match i.e. changestart
when traverse strings end, index shouldn't cross changestart
.
i'll give example. consider following strings:
var newtext = "hello worllolds!"; var oldtext = "hello worlds!"; changestart -> 10 //makes sense oldchangeend -> 8 newchangeend -> 11 console.log("change: " + newtext.substring(changestart, newchangeend + 1)); //ouputs "lo"
the problem in case when starts matching back, flow this:
comparing end: 1(n: 12 o: 12: ! -> !) 2(n: 11 o: 11: s -> s) 3(n: 10 o: 10: d -> d) -> need stop here! //although there not mismatch, have reached changestart , //we have established characters 0 -> changestart-1 match //that why outputs "lo" instead of "lol"
assuming, said makes sense, need modify for
loops so:
if (newtext.length > oldtext.length) { (var = 1; < newtext.length && ((oldtext.length-i)>=changestart); i++) { ... newchangeend = newtext.length - -1; oldchangeend = oldtext.length - -1; if(//mismatch condition reached){ //break..that code fine. } }
this condition -> (oldtext.length-i)>=changestart
takes care of anomaly mentioned , therefore for
loop automatically terminates if condition reached. however, mentioned there might situations condition reached before mis-match encountered demonstrated. need update values of newchangeend
, oldchangeend
1 less matched value. in case of mis-match, store values appropriately.
instead of else -if
wrap 2 conditions in situation know newtext.length > oldtext.length
not true i.e. either replacement or deletion. again newtext.length > oldtext.length
means replacement or insertion per examples, makes sense. else
like:
else { (var = 1; < oldtext.length && ((oldtext.length-i)>=changestart); i++) { ... newchangeend = newtext.length - -1; oldchangeend = oldtext.length - -1; if(//mismatch condition reached){ //break..that code fine. } }
if have understood minor changes far, identifying specific cases simple:
- deletion - condition ->
changestart > newchangeend
. deleted stringchangestart -> oldchangeend
.
deleted text -> oldtext.substring(changestart, oldchangeend + 1);
- insertion - condition ->
changestart > oldchangeend
. inserted string @changestart
.
inserted text -> newtext.substring(changestart, newchangeend + 1);
- replacement - if
newtext != oldtext
, above 2 conditions not met, is replacement.
text in old string got replaced -> oldtext.substring(changestart, oldchangeend + 1);
the replacement text -> newtext.substring(changestart, newchangeend + 1);
start , end positions in oldtext
got replaced -> changestart -> oldchangeend
i have created jsfiddle incorporating changes have mentioned in code. might want check out. hope gets started in right direction.
Comments
Post a Comment