Tuesday, 15 September 2015

Why is java.util.regex.Matcher start() and end() returning extra characters in this case?


I am generating some regexes dynamically and matching them against another string post hoc in my application. I take the start and end indices of each match, modify each letter of the matched text, and adjust the offset for the following matches. However, in one match, out of many other cases that were matched and replaced successfully, I saw that my start and end indices included additional characters.

Here is the code that I am using to create regexes:

  Pattern.compile("[^a-zA-Z]+(?<match>" + Pattern.quote(search[i]) + ")[^a-zA-Z]+")

The case where an additional character is added is

  search[i] = "on a daily basis"

which produces the pattern

  [^a-zA-Z]+(?<match>\Qon a daily basis\E)[^a-zA-Z]+

The relevant text it is matching against is

      

Matching on a daily basis with

I get the desired output of

  on a daily basis

This is the output I get from matcher.group("match"). However, when I debug the start() and end() results from the same matcher, I get 356 and 375 respectively (these are indices into the full text). The difference between these two numbers is 19, whereas the string "on a daily basis" is only 16 characters long.

I'm assuming that I have to account for the \Q and \E that Pattern.quote adds? But then where is the third extra character coming from? And why does this happen for this particular pattern and target string?

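Is there some other, unrelated cause of the discrepancy I am seeing?

The full match covers the entire pattern, including the [^a-zA-Z]+ on each side of the group. So although the group's text is 16 characters long, the full match is longer: here 19 = 16 + 3, meaning three surrounding non-letter characters were matched as well. The \Q and \E markers added by Pattern.quote are regex syntax only and never count toward match indices.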
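To make the arithmetic concrete, here is a minimal sketch that reproduces a 19-versus-16 difference. The class name QuoteDemo, the helper methods, and the input text "x, on a daily basis.y" are made up for illustration; the asker's real text is not shown in the question.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class QuoteDemo {
    // Builds a pattern of the same shape as the one in the question.
    static Pattern build(String search) {
        return Pattern.compile("[^a-zA-Z]+(?<match>" + Pattern.quote(search) + ")[^a-zA-Z]+");
    }

    // Returns {length of the full match, length of the named group},
    // or null if the pattern does not match anywhere in the text.
    static int[] lengths(String search, String text) {
        Matcher m = build(search).matcher(text);
        if (!m.find()) {
            return null;
        }
        return new int[] { m.end() - m.start(), m.end("match") - m.start("match") };
    }

    public static void main(String[] args) {
        String search = "on a daily basis";
        // Pattern.quote only wraps the literal in \Q...\E; those markers
        // are regex syntax and consume no input characters.
        System.out.println(Pattern.quote(search)); // \Qon a daily basis\E

        // Hypothetical input: ", " before the phrase and "." after it.
        int[] len = lengths(search, "x, on a daily basis.y");
        System.out.println(len[0]); // 19: two leading + 16 + one trailing
        System.out.println(len[1]); // 16: the named group only
    }
}
```

The three extra characters come from the two delimiter runs, not from the quoting.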

The matched text is returned by the group, but start() returns the start index of the complete match. Similarly, end() returns the index just past the last character of the complete match.

If you want the start and end indices of the matched group, pass the group name to the start(String) and end(String) methods.

Try it with a small string and you will see.

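  import java.util.regex.Matcher;
  import java.util.regex.Pattern;

  String search = "on a daily basis";
  // The surrounding spaces and the period are non-letters, so they end up in the full match.
  String toMatch = "   on a daily basis.   ";
  Pattern pattern = Pattern.compile("[^a-zA-Z]+(?<match>" + Pattern.quote(search) + ")[^a-zA-Z]+");
  Matcher matcher = pattern.matcher(toMatch);
  if (matcher.find()) {
      System.out.println(matcher.group().length());         // length of the full match
      System.out.println(matcher.start());                  // start of the full match
      System.out.println(matcher.end());                    // end of the full match
      System.out.println(matcher.group("match").length());  // length of the named group only
      System.out.println(matcher.start("match"));           // your expected result
      System.out.println(matcher.end("match"));
  }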

So in the above example, the length of the group differs from the length of the full match (which is the result you were getting).
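For the replace-in-place use case described in the question, a rough sketch using the group indices might look like the following. The question does not show the actual replacement code, so the upper-casing step, the class name GroupCaseDemo, and the sample sentence are all assumptions for illustration.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCaseDemo {
    // Upper-cases every occurrence of `search` that is delimited by
    // non-letters, leaving the delimiters themselves untouched.
    static String upperCaseMatches(String text, String search) {
        Pattern p = Pattern.compile("[^a-zA-Z]+(?<match>" + Pattern.quote(search) + ")[^a-zA-Z]+");
        Matcher m = p.matcher(text);
        StringBuilder out = new StringBuilder(text);
        while (m.find()) {
            // Use the group's own indices, not those of the full match.
            int s = m.start("match");
            int e = m.end("match");
            // Assumes upper-casing keeps the length unchanged (true for
            // ASCII letters), so no offset adjustment is needed here.
            out.replace(s, e, text.substring(s, e).toUpperCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical sample sentence.
        System.out.println(upperCaseMatches("We meet on a daily basis, roughly.", "on a daily basis"));
        // prints: We meet ON A DAILY BASIS, roughly.
    }
}
```

If the replacement can change the length of the text, the offsets of later matches still have to be adjusted as described in the question.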

